Quickstart

This tutorial moves from a minimal in-memory example to a local DuckDB-backed bulk collection workflow.

Prerequisites

Python 3.10+
Install the package from GitHub:

pip install "vertex-forager[yfinance] @ git+https://github.com/coolbress/VertexLab.git@vertex-forager-v0.11.4#subdirectory=packages/vertex-forager"

Optional notebook extras:

pip install "vertex-forager[notebook] @ git+https://github.com/coolbress/VertexLab.git@vertex-forager-v0.11.4#subdirectory=packages/vertex-forager"
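To confirm that the active interpreter meets the Python 3.10+ prerequisite, a small check like the following can be run in the target environment. This is a minimal sketch; the helper name meets_minimum is illustrative and not part of vertex-forager.

```python
import sys

# Minimum version taken from the prerequisites above.
MIN_PYTHON = (3, 10)


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum


if not meets_minimum():
    found = sys.version.split()[0]
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```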

1. Create a client

Start with the minimum client setup:

provider selects the built-in data source
rate_limit is required for providers such as Sharadar; for YFinance, create_client(...) applies an internal default

from vertex_forager import create_client

client = create_client(provider="yfinance")

For Sharadar, pass an API key and a rate_limit when creating the client:

from vertex_forager import create_client

client = create_client(
    provider="sharadar",
    api_key="YOUR_SHARADAR_API_KEY",
    rate_limit=500,
)
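A literal key is fine for a first run, but shared scripts usually read it from the environment instead. The sketch below builds the same keyword arguments as the call above. It assumes the key is exported as SHARADAR_API_KEY; both that variable name and the helper sharadar_client_kwargs are illustrative and not part of vertex-forager.

```python
import os


def sharadar_client_kwargs(env=None, rate_limit=500):
    """Build keyword arguments for create_client(...) from the environment.

    Hypothetical helper: the SHARADAR_API_KEY variable name is an assumption,
    and rate_limit=500 simply mirrors the quickstart example.
    """
    env = os.environ if env is None else env
    api_key = env.get("SHARADAR_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("Set SHARADAR_API_KEY before creating a Sharadar client")
    return {"provider": "sharadar", "api_key": api_key, "rate_limit": rate_limit}


# Usage (requires vertex-forager to be installed):
#   from vertex_forager import create_client
#   client = create_client(**sharadar_client_kwargs())
```

Failing early with a clear message avoids sending requests with an empty key.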

2. Fetch data into memory

If you do not pass connect_db, the result stays in memory and you get a Polars DataFrame back immediately.

from vertex_forager import create_client

client = create_client(provider="yfinance")
prices = client.get_price_data(tickers=["AAPL", "MSFT"])
print(prices.head())

This is the right stopping point if you only need a DataFrame for analysis or ad hoc exploration.

3. Persist to DuckDB

Add connect_db when you want to build or update a local DuckDB database. In persistence mode, the method returns a run summary instead of an in-memory DataFrame.

from vertex_forager import create_client

client = create_client(provider="yfinance")
run = client.get_price_data(
    tickers=["AAPL", "MSFT"],
    connect_db="duckdb://./forager.duckdb",
)
print(run.tables)
print(run.errors)
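In a scheduled job, it can help to reduce the run summary to a single status line instead of printing both attributes. The sketch below assumes run.tables and run.errors are sized collections (or None), which this quickstart does not confirm. The helper summarize_run and the stand-in object are illustrative only.

```python
from types import SimpleNamespace


def summarize_run(run):
    """Return a one-line status string for a persistence-mode run.

    Assumes run.tables and run.errors are sized collections or None;
    adjust if the real run summary exposes different types.
    """
    tables = getattr(run, "tables", None) or []
    errors = getattr(run, "errors", None) or []
    status = "ok" if not errors else "completed with errors"
    return f"{status}: {len(tables)} table(s) reported, {len(errors)} error(s)"


# Stand-in object so the sketch runs without vertex-forager installed.
fake_run = SimpleNamespace(tables={"yfinance_price": 2}, errors=[])
print(summarize_run(fake_run))
```

With a real client, pass the object returned by get_price_data(..., connect_db=...) instead of the stand-in.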

You can inspect the stored tables directly with DuckDB:

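import duckdb

# Open the database file written by the persistence run above.
con = duckdb.connect("./forager.duckdb")
print(con.execute("show tables").fetchall())
# fetchdf() returns a pandas DataFrame, so pandas must be installed.
print(con.execute("select * from yfinance_price limit 5").fetchdf())
con.close()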
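The connect_db value and the path passed to duckdb.connect point at the same file. To keep the two in sync, the path can be derived from the URL. This is a minimal sketch that assumes the URL is just the duckdb:// prefix followed by a file path, as in the example above. The helper duckdb_path_from_url is illustrative and not part of vertex-forager.

```python
from pathlib import Path


def duckdb_path_from_url(connect_db):
    """Strip the duckdb:// prefix to get the file path for duckdb.connect().

    Assumption: the URL is the scheme followed directly by a file path,
    as in the quickstart example.
    """
    prefix = "duckdb://"
    if not connect_db.startswith(prefix):
        raise ValueError(f"expected a {prefix} URL, got {connect_db!r}")
    return Path(connect_db[len(prefix):])


db_path = duckdb_path_from_url("duckdb://./forager.duckdb")
print(db_path)
```

Pass str(db_path) to duckdb.connect. Checking db_path.exists() first is worthwhile, because duckdb.connect creates a new, empty database when the file is missing.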

Next Steps

Configure concurrency, retries, and grouped runtime settings: see Configuration
Tune local collection behavior: see Performance tuning
Tune write flush thresholds: see Chunked flush tuning
Resume runs and operate on local state: see Resume and recovery