# Freeleaps PVC Backup Job

This job creates daily snapshots of critical PVCs in the Freeleaps production environment using the Azure Disk CSI snapshot feature.

## Overview

The backup job runs daily at 00:00 PST (Pacific Standard Time) and creates snapshots for the following PVCs:

- `gitea-shared-storage` in namespace `freeleaps-prod`
- `data-freeleaps-prod-gitea-postgresql-ha-postgresql-0` in namespace `freeleaps-prod`

## Components

- **backup_script.py**: Python script that creates snapshots and monitors their status
- **Dockerfile**: Container image definition
- **build.sh**: Script to build the Docker image
- **deploy-argocd.sh**: Script to deploy via ArgoCD
- **helm-pkg/**: Helm Chart for Kubernetes deployment
- **argo-app/**: ArgoCD Application configuration

## Features

- ✅ Creates snapshots with timestamp-based naming (YYYYMMDD format)
- ✅ Uses PST timezone for snapshot naming
- ✅ Monitors snapshot status until ready
- ✅ Comprehensive logging to console
- ✅ Error handling and retry logic
- ✅ RBAC permissions for secure operation
- ✅ Resource limits and security context
- ✅ Concurrency control (prevents overlapping jobs)
- ✅ Helm Chart for flexible configuration
- ✅ ArgoCD integration for GitOps deployment
- ✅ Incremental snapshots for cost efficiency

## Building and Deployment

### Option 1: ArgoCD Deployment (Recommended)

#### 1. Build and Push Docker Image

```bash
# Make build script executable
chmod +x build.sh

# Build the image
./build.sh

# Push to registry
docker push freeleaps-registry.azurecr.io/freeleaps-pvc-backup:latest
```

#### 2. Deploy via ArgoCD

```bash
# Deploy ArgoCD Application
./deploy-argocd.sh
```

#### 3. Monitor in ArgoCD

```bash
# Check ArgoCD application status
kubectl get applications -n freeleaps-devops-system

# Access ArgoCD UI
kubectl port-forward svc/argocd-server -n freeleaps-devops-system 8080:443
```

Then visit `https://localhost:8080` in your browser.

### Option 2: Direct Helm Deployment

#### 1. Build and Push Docker Image

```bash
# Build the image
./build.sh

# Push to registry
docker push freeleaps-registry.azurecr.io/freeleaps-pvc-backup:latest
```

#### 2. Deploy with Helm

```bash
# Deploy using Helm Chart
helm install freeleaps-data-backup ./helm-pkg/freeleaps-data-backup \
  --values helm-pkg/freeleaps-data-backup/values.prod.yaml \
  --namespace freeleaps-prod \
  --create-namespace
```

## Monitoring

### Check CronJob Status

```bash
kubectl get cronjobs -n freeleaps-prod
```

### Check Job History

```bash
kubectl get jobs -n freeleaps-prod
```

### View Job Logs

```bash
# Get the latest job name
kubectl get jobs -n freeleaps-prod --sort-by=.metadata.creationTimestamp

# View logs (replace <job-suffix> with the suffix of the job name listed above)
kubectl logs -n freeleaps-prod job/freeleaps-data-backup-<job-suffix>
```

### Check Snapshots

```bash
kubectl get volumesnapshots -n freeleaps-prod
```

## Configuration

### Schedule

The job runs daily at 00:00 PST. To modify the schedule, edit the `cronjob.schedule` field in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
cronjob:
  schedule: "0 8 * * *"  # UTC 08:00 = PST 00:00
```

### PVCs to Backup

To add or remove PVCs, modify the `backup.pvcs` list in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
backup:
  pvcs:
    - "gitea-shared-storage"
    - "data-freeleaps-prod-gitea-postgresql-ha-postgresql-0"
    # Add more PVCs here
```

### Snapshot Class

The job uses the `csi-azuredisk-vsc` snapshot class with incremental snapshots enabled.
This can be modified in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
backup:
  snapshotClass: "csi-azuredisk-vsc"
```

### Resource Limits

Resource limits can be configured in `helm-pkg/freeleaps-data-backup/values.prod.yaml`:

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "200m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```

## How It Works

### Snapshot Naming

Snapshots are named using the format: `{PVC_NAME}-snapshot-{YYYYMMDD}`

Examples:

- `gitea-shared-storage-snapshot-20250805`
- `data-freeleaps-prod-gitea-postgresql-ha-postgresql-0-snapshot-20250805`

### Processing Flow

1. **PVC Verification**: Each PVC is verified to exist before processing
2. **Snapshot Creation**: Individual snapshots are created for each PVC
3. **Status Monitoring**: Each snapshot is monitored until ready
4. **Independent Processing**: PVCs are processed independently (one failure doesn't affect others)

### Incremental Snapshots

The job uses Azure Disk CSI incremental snapshots, which:

- Save storage costs by only storing changed data blocks
- Create faster than full snapshots
- Maintain full recovery capability

## Troubleshooting

### Common Issues

1. **Permission Denied**: Ensure RBAC is properly configured
2. **PVC Not Found**: Verify PVC names and namespace
3. **Snapshot Creation Failed**: Check Azure Disk CSI driver status
4. **Job Timeout**: Increase timeout in the values file if needed

### Debug Mode

To run the script locally for testing:

```bash
# Install dependencies
pip install -r requirements.txt

# Run with local kubeconfig
python3 backup_script.py
```

## Security

- The job runs with minimal required permissions
- Non-root user execution
- Dropped capabilities
- Resource limits enforced
- No privileged access

## Maintenance

### Cleanup Old Snapshots

Old snapshots can be cleaned up manually:

```bash
# List all snapshots
kubectl get volumesnapshots -n freeleaps-prod

# Delete specific snapshot (replace <snapshot-name>)
kubectl delete volumesnapshot <snapshot-name> -n freeleaps-prod

# Delete snapshots older than 30 days (example; adjust the cutoff timestamp)
kubectl get volumesnapshots -n freeleaps-prod \
  -o jsonpath='{.items[?(@.metadata.creationTimestamp<"2024-07-05T00:00:00Z")].metadata.name}' \
  | xargs kubectl delete volumesnapshot -n freeleaps-prod
```

### Updating Configuration

To update the backup configuration:

1. Modify the appropriate values file in `helm-pkg/freeleaps-data-backup/`
2. Commit and push changes to the repository
3. ArgoCD will automatically sync the changes
4. Alternatively, upgrade manually with Helm: `helm upgrade freeleaps-data-backup ./helm-pkg/freeleaps-data-backup --values helm-pkg/freeleaps-data-backup/values.prod.yaml --namespace freeleaps-prod`

## Backup Data

### What Gets Backed Up

- **gitea-shared-storage**: Gitea repository data, attachments, and configuration
- **data-freeleaps-prod-gitea-postgresql-ha-postgresql-0**: PostgreSQL database data

### Recovery

To restore from a snapshot:

```bash
# Create a PVC from snapshot
# Replace <restored-pvc-name> and <snapshot-name> with real values,
# and size the request to at least the source volume's capacity.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <restored-pvc-name>
  namespace: freeleaps-prod
spec:
  dataSource:
    name: <snapshot-name>
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
EOF
```
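
### Sketch: Snapshot Names and Retention

The naming scheme under "Snapshot Naming" and the 30-day manual cleanup under "Cleanup Old Snapshots" can be illustrated with a small Python sketch. This is not the code in `backup_script.py`, which is not shown here. The function names `snapshot_name`, `snapshot_date`, and `expired_snapshots` are hypothetical. The sketch also assumes that "PST" means a fixed UTC-8 offset, as the schedule comment suggests, and that retention is judged by the date embedded in the snapshot name rather than by `creationTimestamp`.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Iterable, List, Optional

# Assumption: "PST" is a fixed UTC-8 offset, matching the
# "UTC 08:00 = PST 00:00" comment in the schedule configuration.
PST = timezone(timedelta(hours=-8), name="PST")


def snapshot_name(pvc_name: str, now: datetime) -> str:
    """Build '{PVC_NAME}-snapshot-{YYYYMMDD}' from a timezone-aware datetime."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return f"{pvc_name}-snapshot-{now.astimezone(PST):%Y%m%d}"


def snapshot_date(name: str) -> Optional[date]:
    """Return the date encoded at the end of a snapshot name, or None."""
    prefix, sep, stamp = name.rpartition("-snapshot-")
    if not sep or not prefix or len(stamp) != 8 or not stamp.isdigit():
        return None
    try:
        return datetime.strptime(stamp, "%Y%m%d").date()
    except ValueError:  # eight digits that do not form a real calendar date
        return None


def expired_snapshots(
    names: Iterable[str], today: date, keep_days: int = 30
) -> List[str]:
    """Select snapshot names whose embedded date is more than keep_days old."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        taken = snapshot_date(name)
        # Names that do not follow the scheme are never selected.
        if taken is not None and taken < cutoff:
            expired.append(name)
    return expired


if __name__ == "__main__":
    run_time = datetime(2025, 8, 5, 8, 0, tzinfo=timezone.utc)  # 00:00 PST
    print(snapshot_name("gitea-shared-storage", run_time))
    # prints "gitea-shared-storage-snapshot-20250805"
```

The list returned by `expired_snapshots` could be reviewed and then passed to `kubectl delete volumesnapshot -n freeleaps-prod`. Ignoring names that do not match the scheme is a deliberately conservative choice, so that snapshots created by other tools are left alone.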