Skip to main content
nanograph

Storage Migration to NamespaceLineage

nanograph 1.2.x defaults new graphs to NamespaceLineage.

If you still have a legacy database, the supported upgrade path is:

nanograph storage migrate --db <db>.nano --target lineage-native

This is a one-shot storage migration. nanograph copies the currently committed graph state into a fresh NamespaceLineage database, swaps the database root on success, and keeps the old database as a backup.

What changed in 1.2

New graphs no longer treat the old filesystem manifest as the source of truth.

  • Committed graph state now lives in Lance-backed internal tables, centered around __graph_snapshot.
  • Transaction windows live in __graph_tx.
  • Delete tombstones live in __graph_deletes.
  • Inserts and updates are reconstructed from Lance lineage instead of replaying _wal.jsonl.
  • Managed imported media is stored in __blob_store and exposed as lanceblob://sha256/... URIs.

In simple terms: the old graph.manifest.json / _wal.jsonl mental model is legacy-only now. New graphs commit and replay from Lance-native state.

When you need to migrate

Run migration when either of these is true:

  • nanograph tells you the database is legacy storage and prints nanograph storage migrate --db ... --target lineage-native
  • you want the new default CDC rail, Lance 4 namespace storage, and managed blob storage

You do not need migration for databases that already use NamespaceLineage.

Use lineage-native unless you are doing a very specific intermediate compatibility step.

  • lineage-native is the current default for new graphs and the recommended destination.
  • lance-v4 is an older intermediate namespace rail. It remains available, but it is not the recommended steady state for new work.

What migration preserves

Migration preserves the currently visible graph state:

  • nodes
  • edges
  • property values
  • schema files
  • indexes rebuilt for the migrated tables
  • remote media URIs such as http(s)://, s3://, and other provider-pass-through references

Consequences of migration

Migration is intentionally not history-preserving.

  • Old CDC history does not carry forward. The migrated graph starts a fresh CDC epoch.
  • The migrated graph_version restarts from 1.
  • Old Lance dataset version history is not preserved.
  • Old WAL-era sequencing semantics do not carry forward.
  • Scripts or tools that read or patch graph.manifest.json or _wal.jsonl must stop doing that for migrated graphs.
  • Managed imported media is rewritten into __blob_store; values become lanceblob://sha256/... instead of old filesystem-backed managed paths.
  • Unmigrated legacy databases no longer open for normal read/write in the 1.2 line. The CLI fails fast and prints the exact storage migrate command to run.

That is the main operational tradeoff: you keep the current graph contents, but you do not keep the old storage-era history.

Backup behavior

The migrator moves the original database aside before creating the new one.

  • Active database after success: <db>.nano
  • Automatic backup path: <db>.nano.v3-backup

The backup naming is the same whether you migrate from legacy v3 storage or from the intermediate lance-v4 rail.

Migration refuses to run if the backup path already exists.

Migration procedure

1. Pre-flight checks

nanograph doctor --db <db>.nano --verbose
nanograph describe --db <db>.nano --json > /tmp/pre-migration-describe.json

Record the current row counts and keep a note of any managed media types in the graph.

2. Run the migration

nanograph storage migrate --db <db>.nano --target lineage-native

On success, the command reports:

  • db_path
  • backup_path
  • graph_version
  • node_tables
  • edge_tables
  • media_uris_rewritten

media_uris_rewritten counts imported managed media values that were moved into __blob_store.

3. Verify the migrated database

nanograph version --db <db>.nano
nanograph doctor --db <db>.nano --verbose
nanograph describe --db <db>.nano --json > /tmp/post-migration-describe.json
nanograph changes --db <db>.nano --format json

Validate:

  • row counts per node and edge type match pre-migration
  • the database opens and is writable
  • changes returns lineage-native rows
  • expected managed media values now use lanceblob://sha256/...

Run your normal smoke queries too:

nanograph run --db <db>.nano --query queries.gq --name <query> --param key=value

4. Keep the backup until you trust the new graph

Do not delete <db>.nano.v3-backup immediately. Keep it until:

  • smoke queries pass
  • search and embeddings behave correctly
  • media-backed records resolve correctly
  • downstream tools no longer depend on the old WAL/manifest layout

Rollback

If post-migration verification fails, restore the backup:

mv <db>.nano <db>.nano.failed
mv <db>.nano.v3-backup <db>.nano

If the migration command itself fails, nanograph tries to restore the original database automatically.

Operator notes

These are the behavior changes that matter most after migration:

  • changes is lineage-native for new graphs. Ordering is deterministic, but there is no old seq_in_tx contract.
  • cleanup --retain-tx-versions N is the main CDC retention control. nanograph derives the necessary Lance history automatically.
  • compact and doctor operate against the committed Lance snapshot and retained lineage windows.
  • Managed imported media is now database-managed. NANOGRAPH_MEDIA_ROOT is legacy input for migration, not the active steady-state storage model.

See also