Vertex Forager Constants Overview
This document summarizes centralized constants across the codebase. The goal is to eliminate magic numbers and provide a single source of truth for operational defaults and provider-specific mappings.
Global (vertex_forager/constants.py)
- HTTP_MAX_CONNECTIONS, HTTP_MAX_KEEPALIVE_CONNECTIONS, HTTP_USER_AGENT
- DEFAULT_RATE_LIMIT
- RESERVED_PIPELINE_KEYS
- DATE_FMT, ISO8601_Z_SUFFIX, DEFAULT_TIME_ZONE
- FLUSH_THRESHOLD_INFINITE
- PRIORITY_PAGINATION, PRIORITY_NEW_JOB, PRIORITY_SENTINEL
- PROGRESS_LOG_CHUNK_ROWS
- TICKERS_UNIT, PAGES_UNIT
- TRADING_DAYS_PER_YEAR
- FlowController defaults: DEFAULT_AVG_LATENCY_S, CONCURRENCY_MIN, CONCURRENCY_MAX, GRADIENT_QUEUE_SIZE_DEFAULT, GRADIENT_SMOOTHING_DEFAULT, GRADIENT_WINDOW_S
- Queue sizing: QUEUE_TARGET_RAM_RATIO, PACKET_SIZE_EST_BYTES, QUEUE_MIN, QUEUE_MAX, QUEUE_DEFAULT
- Writer (DuckDB): WRITER_DUCKDB_MAX_WORKERS, WAL_AUTOCHECKPOINT_LIMIT
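The queue-sizing constants suggest that the queue bound is derived from a memory budget and then clamped. A minimal sketch of how such a calculation could look, assuming illustrative values (the real ones live in vertex_forager/constants.py and are not reproduced here) and a hypothetical helper named `estimate_queue_size`:

```python
from typing import Optional

# Illustrative values only; the real ones are defined in vertex_forager/constants.py.
QUEUE_TARGET_RAM_RATIO = 0.10      # assumed: fraction of RAM the queue may use
PACKET_SIZE_EST_BYTES = 1_000_000  # assumed: estimated size of one queued packet
QUEUE_MIN = 10                     # assumed lower bound
QUEUE_MAX = 1_000                  # assumed upper bound
QUEUE_DEFAULT = 100                # assumed fallback when RAM is unknown


def estimate_queue_size(available_ram_bytes: Optional[int]) -> int:
    """Hypothetical helper: derive a queue bound from a RAM budget."""
    if available_ram_bytes is None or available_ram_bytes <= 0:
        return QUEUE_DEFAULT
    # Number of estimated packets that fit into the RAM budget.
    budget = available_ram_bytes * QUEUE_TARGET_RAM_RATIO
    slots = int(budget // PACKET_SIZE_EST_BYTES)
    # Clamp to the configured bounds.
    return max(QUEUE_MIN, min(QUEUE_MAX, slots))
```

Keeping the ratio, packet estimate, and bounds as named constants means the sizing policy can be tuned in one place without touching the calculation.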
YFinance (providers/yfinance/constants.py)
- SIZE_MAP
- PRICE_BATCH_SIZE, PRICE_BATCH_MAX, THREADS_THRESHOLD
- PRICE_BATCH_SIZE_KEY
- DATASET_ENDPOINT (dataset → yfinance API attribute/endpoint)
- DATE_FILTER_COL
- Price defaults/keys: INTERVAL_KEY, START_KEY, END_KEY, PERIOD_KEY, AUTO_ADJUST_KEY, PREPOST_KEY, DEFAULT_INTERVAL, DEFAULT_PRICE_PERIOD, DEFAULT_AUTO_ADJUST, DEFAULT_PREPOST
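As an illustration of how PRICE_BATCH_SIZE and PRICE_BATCH_MAX might interact, the sketch below splits a ticker list into batches and caps the requested batch size. The values and the helper name `chunk_tickers` are assumptions for this example, not the library's actual behavior:

```python
from typing import Iterator, List, Sequence

PRICE_BATCH_SIZE = 50   # assumed default batch size
PRICE_BATCH_MAX = 200   # assumed upper bound on a batch


def chunk_tickers(
    tickers: Sequence[str], batch_size: int = PRICE_BATCH_SIZE
) -> Iterator[List[str]]:
    """Hypothetical helper: yield ticker batches no larger than PRICE_BATCH_MAX."""
    size = max(1, min(batch_size, PRICE_BATCH_MAX))
    for start in range(0, len(tickers), size):
        yield list(tickers[start:start + size])
```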
Sharadar (providers/sharadar/constants.py)
- MAX_ROWS_PER_REQUEST, DEFAULT_BATCH_SIZE, MIN_BATCH_SIZE
- TRADING_DAYS_RATIO, QUARTERLY_DAYS_RATIO
- Pagination keys: PAGINATION_META_KEY, PAGINATION_CURSOR_PARAM, MAX_PAGES
- DATASET_ENDPOINT (dataset → Nasdaq Data Link endpoint)
- DATE_FILTER_COL
- INTERNAL_COLS
- Request keys: QOPTS_PER_PAGE, QOPTS_COLUMNS, API_KEY_QUERY_PARAM
- Estimates: BYTES_PER_TICKER_METADATA, BYTES_PER_TICKER_FULL, ESTIMATED_TOTAL_TICKERS
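The request keys can be combined into a query-parameter dictionary. The sketch below uses the key strings listed in the comparison table (`qopts.per_page`, `qopts.columns`, `api_key`). The MAX_ROWS_PER_REQUEST value and the helper `build_sharadar_params` are assumptions made for illustration:

```python
from typing import Dict, Optional, Sequence

QOPTS_PER_PAGE = "qopts.per_page"
QOPTS_COLUMNS = "qopts.columns"
API_KEY_QUERY_PARAM = "api_key"
MAX_ROWS_PER_REQUEST = 10_000  # assumed value; check providers/sharadar/constants.py


def build_sharadar_params(
    api_key: str,
    columns: Optional[Sequence[str]] = None,
    per_page: int = MAX_ROWS_PER_REQUEST,
) -> Dict[str, str]:
    """Hypothetical helper: assemble query params from the shared key constants."""
    params = {
        API_KEY_QUERY_PARAM: api_key,
        # Never ask for more rows than the per-request cap.
        QOPTS_PER_PAGE: str(min(per_page, MAX_ROWS_PER_REQUEST)),
    }
    if columns:
        params[QOPTS_COLUMNS] = ",".join(columns)
    return params
```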
Key Naming Rules (Comparison)
| Category | Sharadar | YFinance | Notes |
|---|---|---|---|
| Date Filter Key | DATE_FILTER_COL | DATE_FILTER_COL | Values map to schema date columns |
| Pagination Per-Page | qopts.per_page | N/A | Sharadar Datatables pagination |
| Column Selection | qopts.columns | N/A | Sharadar Datatables requested columns |
| Auth Param | api_key | N/A | Sharadar uses query param auth |
| Price Interval | N/A | interval | Default 1d |
| Range Start | N/A | start | YYYY-MM-DD |
| Range End | N/A | end | YYYY-MM-DD |
| Period | N/A | period | Default max when no start_date |
| Auto Adjust | N/A | auto_adjust | Default False |
| Pre/Post | N/A | prepost | Default False |
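The YFinance column of the table can be read as a rule for building price-request keyword arguments: use `start`/`end` when a start date is given, otherwise fall back to the default period. A minimal sketch under that reading follows. The helper `build_price_kwargs` is hypothetical, and ignoring `end_date` when no start date is given is an assumption of this example:

```python
from typing import Any, Dict, Optional

INTERVAL_KEY = "interval"
START_KEY = "start"
END_KEY = "end"
PERIOD_KEY = "period"
AUTO_ADJUST_KEY = "auto_adjust"
PREPOST_KEY = "prepost"

DEFAULT_INTERVAL = "1d"
DEFAULT_PRICE_PERIOD = "max"
DEFAULT_AUTO_ADJUST = False
DEFAULT_PREPOST = False


def build_price_kwargs(
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    interval: str = DEFAULT_INTERVAL,
) -> Dict[str, Any]:
    """Hypothetical helper: dates are YYYY-MM-DD strings, as in the table."""
    kwargs: Dict[str, Any] = {
        INTERVAL_KEY: interval,
        AUTO_ADJUST_KEY: DEFAULT_AUTO_ADJUST,
        PREPOST_KEY: DEFAULT_PREPOST,
    }
    if start_date:
        kwargs[START_KEY] = start_date
        if end_date:
            kwargs[END_KEY] = end_date
    else:
        # No start date: request the default period instead of a range.
        kwargs[PERIOD_KEY] = DEFAULT_PRICE_PERIOD
    return kwargs
```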
Units/Scope/Impact Notes
- DEFAULT_RATE_LIMIT (requests per minute, rpm): Request rate used by FlowController; affects overall throughput and API throttling risk.
- TRADING_DAYS_PER_YEAR (days): Batch estimation heuristic; used for Sharadar range sizing.
- WAL_AUTOCHECKPOINT_LIMIT (bytes string): DuckDB WAL auto-checkpoint; larger limits reduce checkpoint overhead during bursts.
- DEFAULT_TIME_ZONE (tz): All timestamps stored consistently; switching impacts downstream parsing/queries.
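To make the units concrete, the sketch below converts a requests-per-minute limit into a minimum spacing between requests and scales a calendar range into an estimated number of trading days. Both constant values and both helper names are assumptions for illustration only:

```python
import math
from datetime import date

DEFAULT_RATE_LIMIT = 60      # assumed value, in requests per minute
TRADING_DAYS_PER_YEAR = 252  # assumed value, in days


def min_interval_seconds(rate_limit_rpm: int = DEFAULT_RATE_LIMIT) -> float:
    """Minimum spacing between requests implied by an rpm limit."""
    if rate_limit_rpm <= 0:
        raise ValueError("rate limit must be positive")
    return 60.0 / rate_limit_rpm


def estimate_trading_days(start: date, end: date) -> int:
    """Rough row-count heuristic: scale calendar days by trading days per year."""
    calendar_days = max(0, (end - start).days)
    return math.ceil(calendar_days * TRADING_DAYS_PER_YEAR / 365)
```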
DIP Rules & Boundaries (Summary)
- Core → Abstractions: Depend on `IRouter`, `IWriter`, `IMapper` Protocols, not concrete classes.
- Providers → Implementations: Routers and library fetchers live under `vertex_forager/providers/*`.
- Routers Utilities: `vertex_forager/routers/transforms.py` and `routers/errors.py` are provider-agnostic helpers scoped to the routers layer.
- Factories/Registries: Use `create_router` and writer registries to inject implementations; avoid direct imports of concretes in core.
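The dependency rules above can be sketched with `typing.Protocol` and a small registry. The method names (`fetch`, `write`), the `create_router` signature, and the stand-in classes are hypothetical and chosen only to show the shape of the pattern. The real protocols may look different:

```python
from typing import Any, Callable, Dict, Iterable, List, Protocol

Row = Dict[str, Any]


class IRouter(Protocol):
    # Hypothetical method; the real protocol may expose a different surface.
    def fetch(self, dataset: str, symbols: Iterable[str]) -> List[Row]: ...


class IWriter(Protocol):
    # Hypothetical method; the real protocol may expose a different surface.
    def write(self, table: str, rows: List[Row]) -> int: ...


# Registry mapping provider names to router factories (illustrative).
_ROUTER_FACTORIES: Dict[str, Callable[[], IRouter]] = {}


def register_router(provider: str, factory: Callable[[], IRouter]) -> None:
    _ROUTER_FACTORIES[provider] = factory


def create_router(provider: str) -> IRouter:
    """Assumed signature: look up a registered factory by provider name."""
    try:
        return _ROUTER_FACTORIES[provider]()
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None


def run_job(router: IRouter, writer: IWriter, dataset: str, symbols: Iterable[str]) -> int:
    """Core code sees only the protocols, never a concrete class."""
    return writer.write(dataset, router.fetch(dataset, symbols))


class StaticRouter:
    """Stand-in provider implementation used only for this example."""

    def fetch(self, dataset: str, symbols: Iterable[str]) -> List[Row]:
        return [{"dataset": dataset, "symbol": s} for s in symbols]


class ListWriter:
    """Stand-in writer that collects rows in memory."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def write(self, table: str, rows: List[Row]) -> int:
        self.rows.extend(rows)
        return len(rows)


register_router("static", StaticRouter)
```

Because `run_job` is typed against the protocols, swapping in a different provider or writer only requires registering another factory; the core function is unchanged.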
Logging Prefix Rules
- Core HTTP (`vertex_forager.core.http`):
  - Error logs include `provider`, `status` (if available), and redacted messages.
  - Library fetch branch logs: include `scheme`, `dataset`, `symbol`, and exception type for traceability.
- Writers (`vertex_forager.writers.constants`):
  - Prefixes: `WRITER`, `DUCKDB`.
  - Correlation summary logs include counts and sample IDs for `trace_id`/`request_id`.
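As an illustration of the HTTP error-log rule (provider, optional status, redacted message), here is a minimal sketch. The message layout, the redaction pattern, and the helper names are assumptions and do not reflect the actual format used in `vertex_forager.core.http`:

```python
import re
from typing import Optional

# Assumed pattern: mask the value of an api_key query parameter.
_API_KEY_RE = re.compile(r"(api_key=)[^&\s]+")


def redact(message: str) -> str:
    """Hypothetical helper: hide credentials before a message is logged."""
    return _API_KEY_RE.sub(r"\1***", message)


def format_http_error(provider: str, status: Optional[int], message: str) -> str:
    """Hypothetical helper: include status only when one is available."""
    status_part = f" status={status}" if status is not None else ""
    return f"provider={provider}{status_part} msg={redact(message)}"
```

The returned string can be passed to any standard `logging` call, for example `logger.error(format_http_error("sharadar", 429, detail))`.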