Troubleshooting

Rate limits (HTTP 429) or throttling

  • Symptoms
  • Frequent 429 responses, long waits between requests.
  • Checks
  • Verify requests_per_minute and FlowController adaptive throttle settings.
  • Inspect structured logs for http_retry_reason:* and record_feedback.
  • Actions
  • Lower rate_limit or enable adaptive throttle:
    • throttle=AdaptiveThrottleConfig(enabled=True, error_rate_threshold=..., rpm_floor_ratio=...)
  • Increase retry backoff within safe bounds (base_backoff_s, max_backoff_s).
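
To make the interaction between these settings concrete, here is a minimal sketch of capped exponential backoff and a threshold-based rate reduction. The parameter names mirror the settings above, but the functions `backoff_delay` and `throttled_rpm` are hypothetical, and the halving rule is an assumption: how FlowController actually adapts the rate is not described here.

```python
import random


def backoff_delay(attempt: int, base_backoff_s: float, max_backoff_s: float,
                  jitter: bool = False) -> float:
    """Exponential backoff for a 0-based retry attempt, capped at max_backoff_s."""
    delay = min(max_backoff_s, base_backoff_s * (2 ** attempt))
    if jitter:
        # Full jitter: pick uniformly in [0, delay] to spread out retries.
        delay = random.uniform(0.0, delay)
    return delay


def throttled_rpm(requests_per_minute: float, error_rate: float,
                  error_rate_threshold: float, rpm_floor_ratio: float) -> float:
    """Halve the rate while errors exceed the threshold, never below the floor.

    Hypothetical policy for illustration; not the library's algorithm.
    """
    if error_rate <= error_rate_threshold:
        return requests_per_minute
    floor = requests_per_minute * rpm_floor_ratio
    return max(floor, requests_per_minute / 2)
```

With `base_backoff_s=0.5` and `max_backoff_s=8.0`, the delays grow as 0.5, 1, 2, 4, 8 and then stay at 8 seconds, which is why raising `max_backoff_s` only matters once retries reach the cap.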

Memory peaks during flush

  • Symptoms
  • High RSS or OOM during large-table writes.
  • Checks
  • Review writer_rows.{table} and writer_flush_duration_s.{table} histograms.
  • Actions
  • Adjust storage.flush_threshold_rows to control when buffered rows flush into chunked writes.
  • Consider splitting workloads by dataset or symbols if necessary.
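
The effect of a flush threshold on peak memory can be illustrated with a small buffer that writes in chunks. This is a sketch under stated assumptions: `BufferedWriter`, `chunk_rows`, and `write_chunk` are hypothetical names, and only `flush_threshold_rows` corresponds to a setting mentioned above. The real writer may behave differently.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class BufferedWriter:
    """Buffer rows and flush them in fixed-size chunks once a threshold is hit."""

    def __init__(self, flush_threshold_rows: int, chunk_rows: int,
                 write_chunk: Callable[[List[Row]], None]) -> None:
        self.flush_threshold_rows = flush_threshold_rows
        self.chunk_rows = chunk_rows          # hypothetical chunk size
        self.write_chunk = write_chunk        # hypothetical sink callback
        self._buffer: List[Row] = []

    def add(self, rows: List[Row]) -> None:
        self._buffer.extend(rows)
        # The buffer never grows much past the threshold, which bounds memory.
        if len(self._buffer) >= self.flush_threshold_rows:
            self.flush()

    def flush(self) -> None:
        # Write the buffer in chunks so no single write holds every row.
        for start in range(0, len(self._buffer), self.chunk_rows):
            self.write_chunk(self._buffer[start:start + self.chunk_rows])
        self._buffer.clear()
```

In this model a lower threshold trades more frequent, smaller flushes for a lower memory peak.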

DLQ persistence

  • Symptoms
  • No DLQ files on failure, or spool/write errors appear in summaries.
  • Checks
  • Inspect RunResult.dlq_counts and RunResult.errors for DLQ=spooled or DLQ=spool_failed.
  • Actions
  • Ensure the app root is writable.
  • Use the recovery CLI to reinject DLQ artifacts.
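
A quick way to confirm that a directory is writable is to try creating a temporary file in it. The helper below is a standalone sketch using only the standard library; `can_write` is a hypothetical name, and the directory to test (the app root) depends on your deployment.

```python
import tempfile
from pathlib import Path


def can_write(directory: str) -> bool:
    """Return True if a temporary file can be created inside `directory`."""
    path = Path(directory)
    if not path.is_dir():
        return False
    try:
        # The file is deleted automatically when the context exits.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True
```

Attempting a real write catches cases that a permission-bit check can miss, such as read-only mounts.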

Writer validation failures (PK missing/null)

  • Symptoms
  • Errors mention missing/null PK columns; rows not written.
  • Checks
  • Confirm that the provider schema's unique_key names the expected columns and that mapper normalization populates them.
  • Actions
  • Fix source or normalization to populate PKs.
  • Use the DLQ recovery flow to reinject corrected frames.
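
To find offending rows before they reach the writer, a pre-check along these lines can help. It is a minimal sketch on plain dictionaries rather than the library's frames; `split_by_pk` is a hypothetical helper, and `unique_key` is assumed to be a list of column names.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, object]


def split_by_pk(rows: Iterable[Row],
                unique_key: Sequence[str]) -> Tuple[List[Row], List[Row]]:
    """Separate rows with complete primary keys from rows missing any PK value."""
    valid: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        # row.get() returns None for both an absent column and an explicit null.
        if all(row.get(col) is not None for col in unique_key):
            valid.append(row)
        else:
            rejected.append(row)
    return valid, rejected
```

Inspecting the rejected rows usually shows whether the source omits the column entirely or normalization maps it to null.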

Connectivity

  • Symptoms
  • Timeouts or connection pool exhaustion.
  • Actions
  • Tune HTTP: limits.timeout_s, limits.max_connections, limits.max_keepalive_connections.
  • Reduce concurrency, or raise the keepalive connection limit, where appropriate.
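
Pool exhaustion typically occurs when more requests are in flight than the pool has connections. One way to reduce concurrency is to cap in-flight work with a semaphore, as in this sketch. `run_bounded` and `max_in_flight` are hypothetical names, and `worker` stands in for whatever coroutine issues the request; this is not the library's own concurrency control.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_bounded(items: Iterable[T],
                      worker: Callable[[T], Awaitable[R]],
                      max_in_flight: int) -> List[R]:
    """Run worker(item) for every item with at most max_in_flight running at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(item: T) -> R:
        # Wait for a free slot before starting, release it when done.
        async with semaphore:
            return await worker(item)

    # gather preserves the input order of results.
    return await asyncio.gather(*(guarded(item) for item in items))
```

Setting `max_in_flight` at or below `limits.max_connections` keeps tasks queued on the semaphore rather than timing out while waiting for a connection.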