Resume and Recovery

vertex-forager stores resumable state in a single SQLite database and keeps dead-letter queue (DLQ) payloads as Arrow IPC files.

Where state lives

Default cache layout:

~/.cache/vertex-forager/
  state.db
  dlq/<table>/
    batch_*.ipc

When VERTEXFORAGER_ROOT is set, the cache moves under $VERTEXFORAGER_ROOT/cache/.
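
The lookup rule above can be sketched in a few lines of Python. This is only an illustration of the documented layout, not the library's own code: the helper name resolve_cache_dir is hypothetical, and the sketch assumes that state.db and dlq/ sit directly under $VERTEXFORAGER_ROOT/cache/ and that an empty variable counts as unset.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Return the cache directory implied by the documented layout.

    Hypothetical helper: the real library may resolve paths differently.
    """
    env = os.environ if env is None else env
    root = env.get("VERTEXFORAGER_ROOT")
    if root:
        # Override: the cache lives under $VERTEXFORAGER_ROOT/cache/
        return Path(root) / "cache"
    # Default: ~/.cache/vertex-forager/
    return Path.home() / ".cache" / "vertex-forager"


cache_dir = resolve_cache_dir()
state_db = cache_dir / "state.db"  # SQLite state database
dlq_dir = cache_dir / "dlq"        # one subdirectory per table
print(state_db)
print(dlq_dir)
```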

What each table stores

checkpoints
  One row per run ID.
  Stores the provider, dataset, completed symbols, and failed symbols.
  Used by resume=True to skip symbols already completed in a prior run.
run_history
  One row per completed run.
  Stores run timing, per-table row counts, error count, serialized errors, quality violations, and coverage.
  Used by vertex-forager runs list.
dlq_index
  One row per spooled DLQ IPC file.
  Stores the spool path, table, provider, row count, created time, and retry status.
  Used by vertex-forager dlq list, dlq retry, and dlq clear.
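
Because the state lives in an ordinary SQLite file, Python's standard sqlite3 module can confirm which tables exist. The sketch below is a hedged example, not part of vertex-forager: list_tables is a hypothetical helper, and it reads only table names because the column schema is not documented here. The database is opened read-only so inspection cannot alter resumable state.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the user table names in a SQLite database, sorted by name."""
    # Build a file: URI so that mode=ro (read-only) can be requested.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Skip SQLite's internal bookkeeping tables.
    return [name for (name,) in rows if not name.startswith("sqlite_")]
```

Pointing list_tables at the state.db path from the cache layout should list the three tables described above, assuming no other tables are present.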

How checkpoints are created

Checkpoints are written automatically at the end of a pipeline run. A completed run persists:

the run ID
the provider and dataset
the completed symbol set
the failed symbol set

Resume a run

Use the same provider and dataset with resume=True:

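from vertex_forager import create_client

# Create the client with the same provider used by the earlier run.
client = create_client(
    provider="sharadar",
    api_key="...",
    rate_limit=300,
)

# resume=True skips symbols that a matching checkpoint marks as completed.
result = client.get_price_data(
    tickers=["AAPL", "MSFT", "NVDA"],
    connect_db="forager.duckdb",
    resume=True,
)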

When a matching checkpoint exists, completed symbols are skipped automatically.
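
Conceptually, the skip amounts to a set difference between the requested symbols and the checkpoint's completed set. The following is a minimal sketch of that idea, not the library's implementation; symbols_to_fetch and the sample checkpoint contents are hypothetical.

```python
def symbols_to_fetch(requested, completed):
    """Return the requested symbols not yet completed, keeping request order."""
    done = set(completed)
    return [symbol for symbol in requested if symbol not in done]


# Hypothetical completed set from a prior run of the same provider and dataset.
completed = {"AAPL", "MSFT"}
remaining = symbols_to_fetch(["AAPL", "MSFT", "NVDA"], completed)
print(remaining)  # ['NVDA']
```

In this sketch, a symbol that is absent from the completed set, including one recorded as failed, is requested again.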

Inspect run history

List recent runs:

uv run vertex-forager runs list --limit 10

Delete old run history:

uv run vertex-forager runs clear --before 90d

Inspect DLQ state

List pending DLQ entries:

uv run vertex-forager dlq list

Retry all pending entries for one table:

uv run vertex-forager dlq retry --table sharadar_sep --db ./forager.duckdb

Delete old DLQ entries and their IPC files:

uv run vertex-forager dlq clear --before 1d
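
For a quick look at what is spooled on disk, the dlq/<table>/batch_*.ipc layout can be walked with pathlib. This is an illustrative sketch only: count_spooled_batches is a hypothetical helper, and it looks only at files, whereas the CLI commands above work from the dlq_index table, so the two views may differ.

```python
from pathlib import Path


def count_spooled_batches(dlq_dir):
    """Map each table directory under dlq_dir to its count of batch_*.ipc files."""
    dlq_dir = Path(dlq_dir)
    if not dlq_dir.is_dir():
        # Nothing has been spooled yet, or the path is wrong.
        return {}
    counts = {}
    for table_dir in sorted(p for p in dlq_dir.iterdir() if p.is_dir()):
        counts[table_dir.name] = sum(1 for _ in table_dir.glob("batch_*.ipc"))
    return counts
```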

Configure retention

Retention is applied automatically when a pipeline starts.

from vertex_forager import StorageConfig, create_client

client = create_client(
    provider="yfinance",
    rate_limit=60,
    storage=StorageConfig(
        checkpoint_retention_days=7,    # keep checkpoint rows for 7 days
        run_history_retention_days=90,  # keep run history for 90 days
    ),
)

storage.checkpoint_retention_days
  Default: 7.
  Keeps completed checkpoint rows only as long as they are likely to be useful for resume flows.
storage.run_history_retention_days
  Default: 90.
  Keeps operational and audit history longer than checkpoints.
DLQ retention
  Follows the storage.dlq_tmp_retention_s housekeeping window.
  Configurable via StorageConfig(dlq_tmp_retention_s=...).
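
A day-based retention setting boils down to comparing a row's timestamp with a cutoff. The sketch below shows that arithmetic under stated assumptions: retention_cutoff and is_expired are hypothetical names, timestamps are assumed to be timezone-aware UTC, and the library's actual pruning logic is not shown in this guide.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(retention_days, now=None):
    """Return the oldest timestamp that is still inside the retention window."""
    now = datetime.now(timezone.utc) if now is None else now
    return now - timedelta(days=retention_days)


def is_expired(created_at, retention_days, now=None):
    """True when created_at falls before the retention cutoff."""
    return created_at < retention_cutoff(retention_days, now)


# With a 7-day window evaluated on 2024-01-10, the cutoff is 2024-01-03.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
print(is_expired(datetime(2024, 1, 2, tzinfo=timezone.utc), 7, now))  # True
print(is_expired(datetime(2024, 1, 5, tzinfo=timezone.utc), 7, now))  # False
```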

Clear state selectively

Clear only checkpoints:

uv run vertex-forager clear --checkpoints

Clear only run history:

uv run vertex-forager clear --runs

Clear only DLQ state:

uv run vertex-forager clear --dlq

Clear everything:

uv run vertex-forager clear --all