# Configuration Reference
Public runtime configuration is now centered on `create_client(...)` plus a set of grouped config models.
## Required Inputs

- `provider: str` — provider identifier such as `"sharadar"` or `"yfinance"`
- `api_key: str | None` — required for Sharadar; not accepted by the `yfinance` overload
- `rate_limit: int` — required for Sharadar; the `yfinance` overload uses an internal default
## Top-Level Client Parameters

- `schedule: SchedulerConfig = SchedulerConfig()`
- `quality_check: Literal["warn", "error"] = "warn"`
- `concurrency: int | None = None`
Stage logs are always emitted at DEBUG level; the host application controls visibility and formatting via standard logging configuration.
`quality_check="warn"` records quality-rule violations in `RunResult.quality_violations` and continues. `quality_check="error"` raises `DataQualityError` on the first violating flush.
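The two modes can be sketched with a toy flush loop. This is purely illustrative: `quality_check` and `DataQualityError` are named on this page, but the loop below is not the library's implementation.

```python
class DataQualityError(Exception):
    """Stands in for the library's DataQualityError (name taken from the docs above)."""

def process_flushes(flushes, quality_check="warn"):
    # flushes: iterable of (flush_id, violations) pairs; an assumed shape for illustration.
    recorded = []
    for flush_id, violations in flushes:
        if not violations:
            continue
        if quality_check == "error":
            # "error" mode: fail fast on the first violating flush.
            raise DataQualityError(f"flush {flush_id}: {violations}")
        # "warn" mode: record the violations and keep going.
        recorded.extend(violations)
    return recorded
```

In "warn" mode every violation is collected and the run completes; in "error" mode the first violating flush aborts the run.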
## Provider-Specific Direct Clients

Some provider-specific configuration exists only on the concrete client, not on `create_client(...)`.
### YFinanceClient

- `pickle_compat_datasets: list[str] | None = None`
    - `None` or `[]` disables pickle compatibility
    - a non-empty list enables pickle fallback only for the named datasets
- available on `YFinanceClient(...)` and on `create_client(provider="yfinance", ...)`
- `create_client(provider="yfinance", ...)` uses an internal default rate limit and does not accept `api_key` or `rate_limit`
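The enable/disable rule for `pickle_compat_datasets` reduces to a simple predicate. The helper below is hypothetical (not part of the library); it just restates the bullets above:

```python
def pickle_fallback_enabled(pickle_compat_datasets, dataset):
    """Hypothetical helper: None or [] disables pickle compatibility entirely;
    a non-empty list enables the fallback only for the named datasets."""
    return bool(pickle_compat_datasets) and dataset in pickle_compat_datasets
```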
## Grouped Public Config

### RetryConfig

- `max_attempts: int`
- `base_backoff_s: float`
- `max_backoff_s: float` — must be `>= base_backoff_s`
- `backoff_mode: Literal["full_jitter", "equal"]`
- `retry_status_codes: tuple[int, ...]`
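As a hedged sketch of what these fields plausibly control (the library's exact backoff formula is not documented here), capped exponential backoff with the two jitter modes typically looks like:

```python
import random

def backoff_delay(attempt, base_backoff_s, max_backoff_s, backoff_mode="full_jitter"):
    """Illustrative delay for retry attempt N (1-based); an assumed formula,
    not the library's actual implementation."""
    # Exponential growth from the base, capped at the maximum.
    capped = min(max_backoff_s, base_backoff_s * (2 ** (attempt - 1)))
    if backoff_mode == "full_jitter":
        # Full jitter: sleep anywhere in [0, capped].
        return random.uniform(0.0, capped)
    # Equal jitter: half the delay is fixed, half is random.
    return capped / 2 + random.uniform(0.0, capped / 2)
```

This also shows why `max_backoff_s >= base_backoff_s` must hold: the cap would otherwise undercut the very first delay.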
### AdaptiveThrottleConfig

- `enabled: bool`
- `window_s: int`
- `error_rate_threshold: float` — in `[0, 1]`
- `rpm_floor_ratio: float` — in `[0, 1]`
- `recovery_factor: float` — in `[0, 1]`
- `healthy_window_s: int`
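One plausible reading of these knobs (purely illustrative, since the actual controller is internal): when the error rate measured over `window_s` exceeds `error_rate_threshold`, the request rate is cut, but never below `rpm_floor_ratio` times the base rate; after `healthy_window_s` of healthy traffic, the rate recovers toward the base by `recovery_factor`. A toy update rule under those assumptions:

```python
def adapt_rpm(current_rpm, base_rpm, error_rate, *,
              error_rate_threshold, rpm_floor_ratio, recovery_factor):
    """Hypothetical rate update mirroring the AdaptiveThrottleConfig fields;
    the library's real controller is not public."""
    floor = base_rpm * rpm_floor_ratio
    if error_rate > error_rate_threshold:
        # Unhealthy window: halve the rate, clamped at the floor.
        return max(floor, current_rpm / 2)
    # Healthy window: close a fraction of the gap back toward the base rate.
    return min(base_rpm, current_rpm + (base_rpm - current_rpm) * recovery_factor)
```

Note that `rpm_floor_ratio=1.0` (as in the example at the end of this page) effectively pins the rate at the base, disabling downward throttling.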
### HTTPConfig

- `max_connections: int`
- `max_keepalive_connections: int`
- `timeout_s: float` — HTTP request timeout in seconds
### SchedulerConfig

- `quantum: int = 3`
- `max_pending_per_symbol: int | None = None`
- `backpressure_threshold: int | None = None`
### StorageConfig

- `flush_threshold_rows: int` — DuckDB write-buffer threshold before a flush begins
- `checkpoint_retention_days: int = 7` — retention window for completed checkpoint state
- `run_history_retention_days: int = 90` — retention window for run-history records
- `dlq_tmp_retention_s: int` — retention window for DLQ `.tmp` artefacts
DLQ spooling, periodic cleanup, writer chunking, writer worker count, and memory guard thresholds are internal defaults and are no longer public tuning knobs.
## Example

```python
from vertex_forager import (
    AdaptiveThrottleConfig,
    HTTPConfig,
    RetryConfig,
    SchedulerConfig,
    StorageConfig,
    create_client,
)

client = create_client(
    provider="sharadar",
    api_key="...",
    rate_limit=300,
    concurrency=4,
    schedule=SchedulerConfig(
        quantum=3,
        max_pending_per_symbol=50,
        backpressure_threshold=120,
    ),
    retry=RetryConfig(max_attempts=3),
    throttle=AdaptiveThrottleConfig(rpm_floor_ratio=1.0),
    limits=HTTPConfig(max_connections=200, max_keepalive_connections=100, timeout_s=30.0),
    storage=StorageConfig(
        flush_threshold_rows=500_000,
        checkpoint_retention_days=7,
        run_history_retention_days=90,
    ),
)
```