Customer Backups

Context

Users can create, download, restore, and delete backups of their game server data. Backups are scoped to primary Deployment provisions that have a sidecar (Minecraft, FiveM, etc.) — the sidecar compresses/extracts via gRPC. Archives are written to the sidecar pod's /tmp (not the customer's data PVC) so a crash during backup leaves no stale files on the customer's volume.

Step 1: Sidecar proto — add ExtractTo

message ExtractToReq {
  string source = 1;       // e.g. "/tmp/restore-{id}.tar.gz"
  string destination = 2;  // e.g. "/data"
}

service FileIO {
  // ... existing RPCs ...
  rpc ExtractTo(ExtractToReq) returns (Empty);
}

Sidecar implementation: tar -xzf source -C destination.
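
A minimal sketch of how the sidecar might build that tar invocation, written in Rust purely for illustration; the plan does not say which language the sidecar uses. The names extract_command and args_of and the absolute-path check are assumptions, not part of the existing sidecar.

```rust
use std::process::Command;

/// Builds (but does not run) the `tar` invocation for ExtractTo.
/// Hypothetical helper: rejects relative paths, since the proto examples
/// ("/tmp/restore-{id}.tar.gz", "/data") are both absolute.
pub fn extract_command(source: &str, destination: &str) -> Result<Command, String> {
    for p in [source, destination] {
        if !p.starts_with('/') {
            return Err(format!("path must be absolute: {}", p));
        }
    }
    let mut cmd = Command::new("tar");
    // -x extract, -z gunzip, -f archive file, -C change to this directory first.
    cmd.arg("-xzf").arg(source).arg("-C").arg(destination);
    Ok(cmd)
}

/// Collects the arguments as strings, for logging or tests.
pub fn args_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

A caller would run the returned command with .status() and map a non-zero exit code to a gRPC error. Passing each path as a separate argument avoids shell quoting issues.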

Step 2: Migration

CREATE TABLE backups (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    order_id     UUID NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
    provision_id UUID NOT NULL REFERENCES provisions(id) ON DELETE RESTRICT,
    name         TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'creating',
    asset_id     UUID REFERENCES assets(id),
    size_bytes   BIGINT,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMPTZ
);

ON DELETE RESTRICT on provision_id prevents hard-deleting a provision row while backups exist.

Step 3: Domain

pub enum BackupStatus { Creating, Ready, Failed, Restoring }

pub struct Backup {
    id: BackupId,
    order_id: OrderId,
    provision_id: ProvisionId,
    name: String,
    status: BackupStatus,
    asset_id: Option<AssetId>,
    size_bytes: Option<i64>,
    created_at: DateTime<Utc>,
    completed_at: Option<DateTime<Utc>>,
}

pub trait BackupRepository: Interface {
    async fn save(&self, backup: &Backup) -> exn::Result<Backup, RepositoryError>;
    async fn find_by_id(&self, id: &BackupId) -> exn::Result<Option<Backup>, RepositoryError>;
    async fn find_by_provision(&self, provision_id: &ProvisionId) -> exn::Result<Vec<Backup>, RepositoryError>;
    async fn delete(&self, id: &BackupId) -> exn::Result<(), RepositoryError>;
    async fn count_by_provision(&self, provision_id: &ProvisionId) -> exn::Result<i64, RepositoryError>;
    async fn has_active_operation(&self, order_id: &OrderId) -> exn::Result<bool, RepositoryError>;
}
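
Because the status column in Step 2 is TEXT, the repository has to map BackupStatus to and from strings. The sketch below shows one way to do that, using the lowercase names already used for the column default and the HTTP view. UnknownStatus and is_active are hypothetical names. Treating Creating and Restoring as the "active" states is an inference from the flows in Step 4, not something the plan states directly.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackupStatus {
    Creating,
    Ready,
    Failed,
    Restoring,
}

impl BackupStatus {
    /// Text stored in `backups.status`.
    pub fn as_str(self) -> &'static str {
        match self {
            BackupStatus::Creating => "creating",
            BackupStatus::Ready => "ready",
            BackupStatus::Failed => "failed",
            BackupStatus::Restoring => "restoring",
        }
    }

    /// Assumed meaning of "active operation": a background task owns the row.
    pub fn is_active(self) -> bool {
        matches!(self, BackupStatus::Creating | BackupStatus::Restoring)
    }
}

/// Hypothetical error for a status string the enum does not know.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownStatus(pub String);

impl FromStr for BackupStatus {
    type Err = UnknownStatus;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "creating" => Ok(BackupStatus::Creating),
            "ready" => Ok(BackupStatus::Ready),
            "failed" => Ok(BackupStatus::Failed),
            "restoring" => Ok(BackupStatus::Restoring),
            other => Err(UnknownStatus(other.to_string())),
        }
    }
}
```

Under that reading, has_active_operation(order_id) would reduce to "does any backup for this order have a status where is_active() is true".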

Step 4: Application service

pub enum BackupServiceError {
    OrderNotFound, ProvisionNotFound, BackupNotFound,
    NoSidecar,
    OperationInProgress,
    LimitReached,      // >= 10 backups for this provision
    StatusConflict,
    Unauthorized,
    Storage(String), Cluster(String), Database(RepositoryError),
}

pub trait BackupService: Interface {
    async fn create_backup(&self, order_id, provision_id, user_id, name) -> Result<BackupId, BackupServiceError>;
    async fn list_backups(&self, order_id, provision_id, user_id) -> Result<Vec<Backup>, BackupServiceError>;
    async fn get_backup(&self, backup_id, order_id, user_id) -> Result<Backup, BackupServiceError>;
    async fn delete_backup(&self, backup_id, order_id, user_id) -> Result<(), BackupServiceError>;
    async fn restore_backup(&self, backup_id, order_id, user_id) -> Result<(), BackupServiceError>;
}

create_backup flow

  1. Load order + provision, verify owner or ViewAllOrders
  2. Verify provision: kind=Deployment, is_primary=true
  3. Find sibling sidecar provision — NoSidecar if absent
  4. has_active_operation(order_id) → OperationInProgress
  5. count_by_provision(provision_id) >= 10 → LimitReached
  6. Create + persist Backup { status: Creating }; return BackupId
  7. tokio::spawn:
     a. Connect to sidecar gRPC
     b. client.compress_files(sources: ["/data"], output: "/tmp/backup-{id}.tar.gz", format: TARGZ)
     c. Stream file from sidecar → upload to S3 as backups/{order_id}/{backup_id}.tar.gz
     d. Create Asset record; update backup: status=Ready, asset_id, size_bytes
     e. Delete /tmp/backup-{id}.tar.gz (best-effort)
     f. On error: backup.status = Failed
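
The synchronous checks in the flow above (sidecar present, no active operation, limit not reached) can be sketched as a pure function, together with the two path formats the background task uses. CreateChecks, CreateRejection, check_create, archive_tmp_path, and s3_key are hypothetical names. The inputs stand in for the repository and provision lookups, which are not shown.

```rust
/// Limit from the plan: at most 10 backups per provision.
pub const MAX_BACKUPS_PER_PROVISION: i64 = 10;

#[derive(Debug, PartialEq, Eq)]
pub enum CreateRejection {
    NoSidecar,
    OperationInProgress,
    LimitReached,
}

/// Results of the lookups the service performs before creating a backup.
pub struct CreateChecks {
    pub has_sidecar: bool,
    pub active_operation: bool,
    pub existing_backups: i64,
}

/// Applies the checks in the same order as the flow lists them.
pub fn check_create(c: &CreateChecks) -> Result<(), CreateRejection> {
    if !c.has_sidecar {
        return Err(CreateRejection::NoSidecar);
    }
    if c.active_operation {
        return Err(CreateRejection::OperationInProgress);
    }
    if c.existing_backups >= MAX_BACKUPS_PER_PROVISION {
        return Err(CreateRejection::LimitReached);
    }
    Ok(())
}

/// Archive location in the sidecar pod's /tmp.
pub fn archive_tmp_path(backup_id: &str) -> String {
    format!("/tmp/backup-{}.tar.gz", backup_id)
}

/// Object key used for the S3 upload.
pub fn s3_key(order_id: &str, backup_id: &str) -> String {
    format!("backups/{}/{}.tar.gz", order_id, backup_id)
}
```

Keeping the checks free of I/O makes the ordering easy to test. The count and the insert are separate steps, so two concurrent requests could both pass the limit check unless the repository serializes them; the plan does not say how that is handled.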

restore_backup flow

  1. Verify backup.status == Ready → else StatusConflict
  2. has_active_operation(order_id) → OperationInProgress
  3. Set status = Restoring, persist
  4. tokio::spawn:
     a. Stream from S3 → write to /tmp/restore-{id}.tar.gz on sidecar
     b. client.extract_to(source: "/tmp/restore-{id}.tar.gz", destination: "/data")
     c. Delete /tmp/restore-{id}.tar.gz (best-effort)
     d. Set status = Ready, persist
     e. On error: status = Failed (S3 backup still intact; user can retry)
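
The status transitions in the restore flow can be sketched as two small functions: one for the synchronous guard and one for the outcome of the background task. begin_restore, finish_restore, and RestoreRejection are hypothetical names, and the enum is repeated here only so the block stands alone.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackupStatus {
    Creating,
    Ready,
    Failed,
    Restoring,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RestoreRejection {
    StatusConflict,
    OperationInProgress,
}

/// Guard run before spawning the restore task.
/// Returns the status to persist (Restoring) when the restore may start.
pub fn begin_restore(
    current: BackupStatus,
    order_has_active_operation: bool,
) -> Result<BackupStatus, RestoreRejection> {
    if current != BackupStatus::Ready {
        return Err(RestoreRejection::StatusConflict);
    }
    if order_has_active_operation {
        return Err(RestoreRejection::OperationInProgress);
    }
    Ok(BackupStatus::Restoring)
}

/// Status to persist once the background task finishes.
pub fn finish_restore<E>(outcome: &Result<(), E>) -> BackupStatus {
    match outcome {
        Ok(()) => BackupStatus::Ready,
        Err(_) => BackupStatus::Failed,
    }
}
```

Taken literally, the first check would reject a retry on a backup that a failed restore left in Failed. A retry path may therefore need either a reset to Ready or a relaxed check; the sketch reproduces the rules as written and does not settle that.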

delete_backup flow

  1. Verify ownership; status not in (Creating, Restoring) → else StatusConflict
  2. Delete S3 object; delete asset; delete backup row
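
A rough sketch of the delete flow as a plan of steps, in the order the flow lists them. DeleteStep, StatusConflict, and deletion_plan are hypothetical names. Skipping the storage steps when asset_id is empty is an assumption, based on asset_id being nullable and only set once a backup reaches Ready.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackupStatus {
    Creating,
    Ready,
    Failed,
    Restoring,
}

#[derive(Debug, PartialEq, Eq)]
pub struct StatusConflict;

/// One unit of cleanup work, carrying the key or id it applies to.
#[derive(Debug, PartialEq, Eq)]
pub enum DeleteStep {
    S3Object(String),
    AssetRow(String),
    BackupRow(String),
}

/// Returns the cleanup steps for a backup, or StatusConflict while a
/// background task still owns it.
pub fn deletion_plan(
    status: BackupStatus,
    order_id: &str,
    backup_id: &str,
    asset_id: Option<&str>,
) -> Result<Vec<DeleteStep>, StatusConflict> {
    if matches!(status, BackupStatus::Creating | BackupStatus::Restoring) {
        return Err(StatusConflict);
    }
    let mut steps = Vec::new();
    // Assumption: no asset record means nothing was registered in storage.
    if let Some(asset) = asset_id {
        steps.push(DeleteStep::S3Object(format!(
            "backups/{}/{}.tar.gz",
            order_id, backup_id
        )));
        steps.push(DeleteStep::AssetRow(asset.to_string()));
    }
    steps.push(DeleteStep::BackupRow(backup_id.to_string()));
    Ok(steps)
}
```

Two caveats. Step 2 declares asset_id without an ON DELETE action, so in PostgreSQL the asset row cannot be deleted while the backup row still references it, unless asset_id is cleared first, the backup row is deleted first, or the constraint is made deferrable. A backup that failed after the upload but before the Asset record was created could also leave an S3 object that this plan would not remove.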

Step 5: Reprovision guard

reprovision_order_service.rs calls backup_repo.has_active_operation(order_id) before teardown. If a backup or restore is in progress, it returns 409 BackupInProgress instead of tearing down.

Step 6: HTTP

Routes under /api/v1/orders/:order_id/provisions/:provision_id/backups:

Method  Path                  Description
GET     /                     List backups
POST    /                     Create backup { name?: string }
DELETE  /:backup_id           Delete backup
POST    /:backup_id/restore   Restore backup

Download: reuse GET /api/v1/assets/:id (enforces OwnerOnly ACL, streams from S3).

pub struct BackupView {
    pub id: String,
    pub name: String,
    pub status: String,           // "creating" | "ready" | "failed" | "restoring"
    pub asset_id: Option<String>, // set when ready; use /api/v1/assets/{id} to download
    pub size_bytes: Option<i64>,
    pub created_at: DateTime<Utc>,
    pub completed_at: Option<DateTime<Utc>>,
}
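
A possible mapping from the domain Backup to BackupView, plus a helper that derives the download path. To keep the block free of external crates, the structs here are simplified stand-ins: ids are plain strings and the timestamp fields are omitted. download_path is a hypothetical helper, not an existing function.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackupStatus {
    Creating,
    Ready,
    Failed,
    Restoring,
}

impl BackupStatus {
    fn as_str(self) -> &'static str {
        match self {
            BackupStatus::Creating => "creating",
            BackupStatus::Ready => "ready",
            BackupStatus::Failed => "failed",
            BackupStatus::Restoring => "restoring",
        }
    }
}

/// Stand-in for the domain Backup (timestamps and foreign keys omitted).
pub struct Backup {
    pub id: String,
    pub name: String,
    pub status: BackupStatus,
    pub asset_id: Option<String>,
    pub size_bytes: Option<i64>,
}

/// Stand-in for the HTTP view (timestamps omitted).
#[derive(Debug, PartialEq, Eq)]
pub struct BackupView {
    pub id: String,
    pub name: String,
    pub status: String,
    pub asset_id: Option<String>,
    pub size_bytes: Option<i64>,
}

impl From<&Backup> for BackupView {
    fn from(b: &Backup) -> Self {
        BackupView {
            id: b.id.clone(),
            name: b.name.clone(),
            status: b.status.as_str().to_string(),
            asset_id: b.asset_id.clone(),
            size_bytes: b.size_bytes,
        }
    }
}

/// Path of the existing asset endpoint, available only for ready backups.
pub fn download_path(view: &BackupView) -> Option<String> {
    if view.status != "ready" {
        return None;
    }
    view.asset_id
        .as_ref()
        .map(|id| format!("/api/v1/assets/{}", id))
}
```

The dashboard could use the same rule to decide when to show the Download action.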

Step 7: Dashboard

backups-content.tsx:

  - Fetch list on mount
  - Poll every 3s while any backup is creating or restoring
  - Table: Name | Size | Status badge | Created | Actions
  - Actions: Download (ready only), Restore (ready only), Delete (ready or failed only)
  - "Create Backup" button → optional name modal → POST → refetch
  - Disabled create when >= 10 backups or operation in progress

i18n keys

order_detail.section_backups
backups.heading, backups.create_button
backups.column_name, backups.column_size, backups.column_status, backups.column_created, backups.column_actions
backups.action_download, backups.action_restore, backups.action_delete
backups.status_creating, backups.status_ready, backups.status_failed, backups.status_restoring
backups.confirm_delete_title, backups.confirm_restore_title
backups.empty_state, backups.limit_reached, backups.error_in_progress