
NEMAR Disaster Recovery Documentation

This directory contains comprehensive disaster recovery procedures for NEMAR dataset restoration.

🚨 EMERGENCY RESPONSE GUIDE

Use this first in an emergency!

  • 8-step emergency procedure (< 2 hour recovery)
  • Quick reference cards
  • Essential credentials and contacts
  • Troubleshooting guide
  • Backend fail-safe specifications

Target Audience: nemarRestore operator, Emergency responder


Complete Technical Documentation

Detailed technical guide covering:

  • Restoration architecture
  • Step-by-step procedures with verification
  • Git-annex and DataLad integration
  • End-user verification tests
  • Technical deep-dives

Target Audience: Developers, Technical operators


User Roles and Responsibilities

Defines the NEMAR user account structure.

Target Audience: Administrators, New team members


Located in /scripts/:

nemar-restore-dataset.sh: production-ready restoration script for individual datasets.

Usage:

export AWS_ACCESS_KEY_ID="<key>"
export AWS_SECRET_ACCESS_KEY="<secret>"
./scripts/nemar-restore-dataset.sh \
<dataset_id> \
<version> \
<name> \
<zenodo_doi> \
<datalad_id>

Example:

./scripts/nemar-restore-dataset.sh \
nm000105 \
v1.1.0 \
"discrete_gestures" \
10.5281/zenodo.17613958 \
f9028a54-3d7e-4af0-994f-19dc40de6a0a
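
Because the script takes five positional arguments, a swapped or mistyped value is easy to miss under pressure. The following is a minimal pre-flight sketch, not part of the repository: the function names `matches` and `validate_restore_args` are hypothetical, and the accepted formats are inferred only from the example values above (for instance `nm` plus six digits), so adjust them if real identifiers differ.

```shell
# Hypothetical pre-flight check. The patterns are assumptions inferred from the
# example values in this README, not taken from nemar-restore-dataset.sh.

# matches VALUE REGEX: succeed if VALUE matches the extended regex in full.
matches() {
  printf '%s\n' "$1" | grep -Eq "^($2)\$"
}

# validate_restore_args DATASET_ID VERSION NAME ZENODO_DOI DATALAD_ID
validate_restore_args() {
  if [ "$#" -ne 5 ]; then
    echo "expected 5 arguments, got $#" >&2
    return 1
  fi
  matches "$1" 'nm[0-9]{6}' || { echo "bad dataset_id: $1" >&2; return 1; }
  matches "$2" 'v[0-9]+\.[0-9]+\.[0-9]+' || { echo "bad version: $2" >&2; return 1; }
  [ -n "$3" ] || { echo "name must not be empty" >&2; return 1; }
  matches "$4" '10\.5281/zenodo\.[0-9]+' || { echo "bad zenodo_doi: $4" >&2; return 1; }
  matches "$5" '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' \
    || { echo "bad datalad_id: $5" >&2; return 1; }
  # The script expects both AWS variables to be exported beforehand.
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "AWS credentials are not exported" >&2
    return 1
  fi
}

# Possible use inside a wrapper script:
#   validate_restore_args "$@" && ./scripts/nemar-restore-dataset.sh "$@"
```

Failing early with a named argument is cheaper than discovering a wrong DOI halfway through a restore.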

restore_database_entries.sql: SQL script that restores database entries after the GitHub repositories have been restored.

Usage:

wrangler d1 execute nemar-db --remote --file=scripts/restore_database_entries.sql
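
Since this command writes to the remote database, it may help to guard it with a small wrapper. The sketch below is an assumption rather than an existing script: `restore_db_entries` and the `DRY_RUN` variable are hypothetical names, and the wrapper only checks that the SQL file exists and is non-empty before running the same command shown above.

```shell
# Hypothetical wrapper around the command above.
# restore_db_entries [SQL_FILE]: defaults to scripts/restore_database_entries.sql.
restore_db_entries() {
  sql_file="${1:-scripts/restore_database_entries.sql}"

  # Refuse to run against the remote database with a missing or empty file.
  if [ ! -s "$sql_file" ]; then
    echo "SQL file missing or empty: $sql_file" >&2
    return 1
  fi

  # With DRY_RUN=1, print the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "wrangler d1 execute nemar-db --remote --file=$sql_file"
    return 0
  fi

  wrangler d1 execute nemar-db --remote --file="$sql_file"
}
```

Running it once with `DRY_RUN=1` lets the operator confirm the target database and file before anything is written.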

IF DATASETS ARE ACCIDENTALLY DELETED:

  1. Stay calm - S3 data is likely intact
  2. Open DISASTER_RECOVERY.md
  3. Follow Steps 1-8 in order (don’t read the whole doc first)
  4. Target recovery time: < 2 hours
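
To back up step 1 with evidence, the object count in S3 can be checked before restoring anything. This is a sketch under assumptions: the helper name `s3_object_count` is hypothetical, and the bucket name and per-dataset prefix in the usage comment are placeholders, since this README does not state the S3 layout. The helper only parses the summary that `aws s3 ls --recursive --summarize` prints.

```shell
# Hypothetical helper: reads `aws s3 ls --recursive --summarize` output on stdin
# and prints the number after "Total Objects:". Exits non-zero if no summary
# line is present (for example when the listing failed).
s3_object_count() {
  awk -F': *' '
    /Total Objects:/ { print $2; found = 1 }
    END { if (!found) exit 1 }
  '
}

# Possible use (BUCKET and the dataset prefix are placeholders, not real values):
#   aws s3 ls "s3://$BUCKET/nm000105/" --recursive --summarize | s3_object_count
```

A non-zero count that matches expectations confirms that only GitHub and the database need restoring.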

Emergency Contact: [email protected]


This disaster recovery system was developed in response to a real incident on 2026-01-18, when datasets nm000103 through nm000107 were accidentally deleted during a cleanup of test datasets.

What happened:

  • 5 production datasets accidentally deleted from GitHub and database
  • S3 data remained intact (7,976 files)
  • All datasets had Zenodo preservation archives

Recovery:

  • Retrieved datasets from Zenodo archives
  • Restored GitHub repositories with git-annex configuration
  • Restored database entries
  • Total recovery time: 90 minutes (target: < 2 hours)
  • Data loss: None

Lessons learned:

  1. Zenodo archives are critical for disaster recovery
  2. S3 separation protects data layer
  3. Git-annex configuration requires careful setup
  4. Backend fail-safes needed to prevent deletion
  5. Clear procedures enable fast recovery

Test the recovery procedure every 3 months:

  1. Create test dataset (nm999999)
  2. “Accidentally” delete it
  3. Restore from Zenodo archive
  4. Verify end-to-end functionality
  5. Document timing and issues
  6. Update procedures based on learnings

Last drill: 2026-01-18 (production incident)
Next drill: 2026-04-18
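
Step 5 of the drill asks for timing to be documented. One way to do that consistently is to compare elapsed time against the 2-hour target, as in the sketch below. The names `RTO_SECONDS` and `check_rto` are hypothetical and not part of the existing scripts.

```shell
# Recovery target from this README: under 2 hours.
RTO_SECONDS=$((2 * 60 * 60))

# check_rto START_EPOCH END_EPOCH
# Prints the elapsed minutes and returns non-zero if the target was missed.
check_rto() {
  elapsed=$(( $2 - $1 ))
  minutes=$(( elapsed / 60 ))
  if [ "$elapsed" -lt "$RTO_SECONDS" ]; then
    echo "PASS: recovery took ${minutes} min (target < 120 min)"
  else
    echo "FAIL: recovery took ${minutes} min (target < 120 min)"
    return 1
  fi
}

# Possible use during a drill:
#   start="$(date +%s)"
#   ... run the restoration steps ...
#   check_rto "$start" "$(date +%s)"
```

For the 2026-01-18 incident, a 90-minute recovery corresponds to `check_rto 0 5400`, which reports a pass.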


  • Issue #37 - Dataset restoration incident and procedures
  • Issue #35 - Backend fail-safes for dataset deletion
  • Issue #34 - Add --yes flags for non-interactive mode

| Role | Email | Purpose |
|------|-------|---------|
| Owner | [email protected] | Emergency decisions, S3 data issues |
| nemarAdmin | [email protected] | Day-to-day operations, user management |
| nemarRestore | [email protected] | Service account for git commits |

| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2026-01-18 | Initial disaster recovery system based on real incident |

This documentation may save your datasets. Keep it updated.