Resume and Recovery¶
vertex-forager stores resumable state in a single SQLite database and keeps dead-letter queue (DLQ) payloads as Arrow IPC files.
Where state lives¶
Default cache layout:
~/.cache/vertex-forager/
    state.db
    dlq/<table>/
        batch_*.ipc
When VERTEXFORAGER_ROOT is set, the cache moves under $VERTEXFORAGER_ROOT/cache/.
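The lookup rule above can be sketched in a few lines of Python. This is only an illustration: `resolve_cache_dir` and `list_dlq_files` are hypothetical helpers, not part of vertex-forager, and the sketch assumes that the layout under `$VERTEXFORAGER_ROOT/cache/` mirrors the default one.

```python
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Return the cache directory (hypothetical helper, not a vertex-forager API)."""
    env = os.environ if env is None else env
    root = env.get("VERTEXFORAGER_ROOT")
    if root:
        # Assumption: the override relocates the whole cache under <root>/cache/.
        return Path(root) / "cache"
    return Path.home() / ".cache" / "vertex-forager"


def list_dlq_files(cache_dir, table):
    """List spooled IPC batches for one table, sorted by file name."""
    return sorted((Path(cache_dir) / "dlq" / table).glob("batch_*.ipc"))
```

For example, `resolve_cache_dir({"VERTEXFORAGER_ROOT": "/data/vf"}) / "state.db"` points at the state database under the overridden root.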
What each table stores¶
checkpoints
- One row per run ID
- Provider, dataset, completed symbols, failed symbols
- Used by resume=True to skip symbols already completed in a prior run
run_history
- One row per completed run
- Run timing, per-table row counts, error count, serialized errors, quality violations, coverage
- Used by vertex-forager runs list
dlq_index
- One row per spooled DLQ IPC file
- Spool path, table, provider, row count, created time, retry status
- Used by vertex-forager dlq list, retry, and clear
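Because the state lives in an ordinary SQLite file, it can be inspected with Python's standard `sqlite3` module. The sketch below counts rows per table without assuming any column names, since the exact schema is not described here. `table_row_counts` is a hypothetical helper, not part of vertex-forager.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in a SQLite database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Names are read from sqlite_master itself, so double-quoting them is sufficient.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }
```

To look at a real cache without risking writes, open the file read-only, for example `sqlite3.connect("file:/path/to/state.db?mode=ro", uri=True)`, and pass the connection to the helper.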
How checkpoints are created¶
Checkpoints are written automatically at the end of a pipeline run. A completed run persists:
- the run ID
- the provider and dataset
- the completed symbol set
- the failed symbol set
Resume a run¶
Use the same provider and dataset with resume=True:
from vertex_forager import create_client

client = create_client(
    provider="sharadar",
    api_key="...",
    rate_limit=300,
)
result = client.get_price_data(
    tickers=["AAPL", "MSFT", "NVDA"],
    connect_db="forager.duckdb",
    resume=True,
)
When a matching checkpoint exists, completed symbols are skipped automatically.
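Conceptually, the skip step is a set difference between the requested symbols and the checkpoint's completed set. The following is a minimal sketch of that idea, not the library's actual implementation. `symbols_to_fetch` is a hypothetical name and the checkpoint contents are made-up illustration data.

```python
def symbols_to_fetch(requested, completed):
    """Drop symbols already completed, preserving the request order."""
    done = set(completed)
    return [symbol for symbol in requested if symbol not in done]


# Illustrative data: pretend a prior run completed two of the three tickers.
checkpoint_completed = {"AAPL", "MSFT"}
remaining = symbols_to_fetch(["AAPL", "MSFT", "NVDA"], checkpoint_completed)
print(remaining)  # ['NVDA']
```

Under this reading, symbols recorded as failed are not in the completed set, so they would be requested again on the resumed run. Treat that as an assumption rather than documented behavior.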
Inspect run history¶
List recent runs:
uv run vertex-forager runs list --limit 10
Delete old run history:
uv run vertex-forager runs clear --before 90d
Inspect DLQ state¶
List pending DLQ entries:
uv run vertex-forager dlq list
Retry all pending entries for one table:
uv run vertex-forager dlq retry --table sharadar_sep --db ./forager.duckdb
Delete old DLQ entries and their IPC files:
uv run vertex-forager dlq clear --before 1d
Configure retention¶
Retention is applied automatically when a pipeline starts.
from vertex_forager import StorageConfig, create_client

client = create_client(
    provider="yfinance",
    rate_limit=60,
    storage=StorageConfig(
        checkpoint_retention_days=7,
        run_history_retention_days=90,
    ),
)
storage.checkpoint_retention_days
- Default 7
- Keeps completed checkpoint rows only as long as they are likely useful for resume flows
storage.run_history_retention_days
- Default 90
- Keeps operational and audit history longer than checkpoints
DLQ retention
- Follows the storage.dlq_tmp_retention_s housekeeping window
- Configurable via StorageConfig(dlq_tmp_retention_s=...)
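A day-based retention window boils down to comparing each row's timestamp against a cutoff. The sketch below shows that arithmetic under stated assumptions: timestamps are timezone-aware UTC datetimes, and `retention_cutoff` and `is_expired` are hypothetical helpers. How vertex-forager stores timestamps and applies the window internally is not specified here.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(retention_days, now=None):
    """Return the oldest timestamp that is still retained."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=retention_days)


def is_expired(created_at, retention_days, now=None):
    """True when a row created at `created_at` falls outside the window."""
    return created_at < retention_cutoff(retention_days, now)
```

With the defaults above, a checkpoint created eight days ago would be expired under a 7-day window, while a run history row of the same age would be kept under a 90-day window.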
Clear state selectively¶
Clear only checkpoints:
uv run vertex-forager clear --checkpoints
Clear only run history:
uv run vertex-forager clear --runs
Clear only DLQ state:
uv run vertex-forager clear --dlq
Clear everything:
uv run vertex-forager clear --all