Data Sources

Compare-my-stocks pulls from several independent sources. This page explains how they fit together — what each one gives you, when they overlap, and how merge / cache / refresh behavior shapes what shows up on the chart.

For initial setup walk-throughs see Quick start. For every field with screenshots see Config help. For a flat field reference see Configuration.


Market data: IB vs Polygon

You pick one source via Input.InputSource. There is no automatic fallback to the other live source: if the chosen source fails, the app serves cached data only (and prompts unless PromptOnConnectionFail is off).

| | Interactive Brokers | Polygon |
| --- | --- | --- |
| Auth | Local TCP to TWS / IB Gateway (no API key) | REST API key |
| Granularity | Daily bars only | Daily bars only (timespan=day) |
| Stocks / ETFs / Indexes | | |
| Crypto | ✅ (via IB) | |
| Forex | ✅ (used for FX conversion) | ❌ (raises NotImplementedError) |
| Pre-split-adjusted? | ❌ (adjustment runs separately; see Split adjustment below) | ✅ (adjusted=True in list_aggs()) |
| Position / portfolio reads | ✅ via get_positions() | ❌ (Polygon has no portfolio concept) |
| Rate limit | None enforced in code; requests serialized through async loop | 1 req / sec implied (free tier); single 429 retry with 1.2 s backoff |
| Connection failure handling | TCP preflight gate (PreflightFailCacheSec, default 40 s) before each retry | HTTP-level errors propagate |

Consequence: if you switch to Polygon, you lose live portfolio sync and forex — the app cannot compute multi-currency P&L without IB up, unless you've previously cached FX history.
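
The Polygon rate-limit row describes a retry-once policy. A minimal sketch of that idea, assuming a hypothetical RateLimited exception standing in for an HTTP 429 and a caller-supplied fetch callable (neither name comes from the app or from the Polygon client):

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def fetch_with_single_retry(
    fetch: Callable[[], T],
    backoff_sec: float = 1.2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(); on one RateLimited error, wait and try exactly once more."""
    try:
        return fetch()
    except RateLimited:
        sleep(backoff_sec)
        # A second RateLimited (or any other error) propagates to the caller.
        return fetch()
```

The sleep function is injectable only so the behavior can be exercised without actually waiting.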

Cache-only mode. Setting Input.InputSource = Cache disables all live fetching and reads exclusively from hist_file.cache. Useful for offline / demo / testing, but the app will silently serve stale data — there is no banner.
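
The selection behavior described in this section can be sketched as follows. This is an illustration under stated assumptions, not the app's code: the InputSource enum values, the load_history function, and the fetchers mapping are all hypothetical names.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class InputSource(Enum):
    # Hypothetical mirror of the Input.InputSource choices.
    IB = "IB"
    POLYGON = "Polygon"
    CACHE = "Cache"


def load_history(
    source: InputSource,
    fetchers: Dict[InputSource, Callable[[str], List[float]]],
    cache: Dict[str, List[float]],
    symbol: str,
) -> Tuple[Optional[List[float]], str]:
    """Return (bars, origin). Only the selected live source is ever tried."""
    if source is not InputSource.CACHE:
        try:
            return fetchers[source](symbol), source.value
        except ConnectionError:
            pass  # no second live source is attempted
    # Cache-only mode and the failure path both end here, possibly stale.
    return cache.get(symbol), "cache"
```

Returning the origin next to the data is one way a caller could surface the "served from cache" state that the app itself does not announce.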


Transactions: three sources, one merged view

Transactions are independent of market data. You can use any combination of:

| Source | What it gives you | Cadence |
| --- | --- | --- |
| IB Flex Web Service (token + query ID) | Recent IB trades, pulled on launch | Live trickle |
| IB Activity Statement CSV (Client Portal export) | Historical positions + realized P&L baseline | One-off seed |
| My Stocks Portfolio CSV (Peeksoft app) | Transactions from any broker | Manual export |

Typical setup

The common workflow is:

  1. Download an IB Activity Statement CSV once to seed historical positions. Point TransactionHandlers.IBStatement.SrcFile at the file.
  2. Configure the Flex Web Service token + query ID. On every launch the app fetches recent trades from Flex and merges them on top of the statement.
  3. Set OnlyNewerThanIBStatement so Flex trades dated before the statement's period end are skipped — they're already covered by the baseline.

My Stocks CSV is for users without IB, or to add trades from a second broker alongside the IB sources.

How the three sources merge

TransactionHandlerManager.combine_transactions() merges in this order:

  1. IB Flex + My Stocks merged by CombineStrategy:
     • PREFERIB: IB trades are the base; My Stocks rows are added only when no IB row matches by date and amount.
     • PREFERSTOCKS: My Stocks rows tagged Notes="IB:..." are matched against IB and deduplicated; others are added fresh.
     • Duplicate detection uses CombineDateDiff (days) and CombineAmountPerc (% of value).
  2. IB Activity Statement layered last: for any date not already present in the merged dict, synthetic entries seed the open positions implied by the statement.
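
The duplicate test and the PREFERIB strategy above can be sketched like this. The Trade record, both function names, and the exact comparison (same symbol, date gap within CombineDateDiff days, value difference within CombineAmountPerc percent of the larger value) are assumptions made for illustration; the real matching rules may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Trade:
    # Hypothetical minimal trade record.
    symbol: str
    day: date
    value: float


def is_duplicate(a: Trade, b: Trade, date_diff_days: int, amount_perc: float) -> bool:
    """True when two rows plausibly describe the same trade."""
    if a.symbol != b.symbol:
        return False
    if abs((a.day - b.day).days) > date_diff_days:
        return False
    base = max(abs(a.value), abs(b.value))
    if base == 0:
        return True
    return abs(a.value - b.value) / base * 100 <= amount_perc


def merge_prefer_ib(ib: List[Trade], mystocks: List[Trade],
                    date_diff_days: int, amount_perc: float) -> List[Trade]:
    """IB rows are the base; a My Stocks row is added only if no IB row matches it."""
    merged = list(ib)
    for row in mystocks:
        if not any(is_duplicate(row, t, date_diff_days, amount_perc) for t in ib):
            merged.append(row)
    return sorted(merged, key=lambda t: t.day)
```

With a one-day window and a 1% tolerance, a My Stocks row one day after an IB row and within half a percent of its value is treated as the same trade and dropped.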

The OnlyNewerThanIBStatement knob

When set, Flex trades older than the Activity Statement's period end date are filtered out. The intent is to avoid double-counting trades that the Statement already captures as closing positions.

The gotcha: the cutoff is read from the statement CSV's header, not from "now". If your statement is months old, recent Flex trades layer on cleanly. If your statement is fresh, anything Flex returns from before that date silently disappears.
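
A small sketch of the cutoff, assuming trades are (date, description) tuples and that a trade dated exactly on the period end counts as already covered by the statement (the source does not say which side the boundary falls on). The function name is hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple

FlexTrade = Tuple[date, str]


def filter_flex_trades(
    trades: List[FlexTrade],
    statement_period_end: Optional[date],
    only_newer_than_statement: bool = True,
) -> List[FlexTrade]:
    """Drop Flex trades that the statement baseline is assumed to cover."""
    if not only_newer_than_statement or statement_period_end is None:
        return list(trades)
    # The cutoff comes from the statement, never from today's date.
    return [t for t in trades if t[0] > statement_period_end]
```

With a statement ending 2024-06-30, a Flex trade from 2024-06-15 is removed and one from 2024-07-02 is kept, regardless of when the app runs.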

Generating a baseline from live IB positions

If you have IB connected but no Activity Statement yet, run with --generate-ib-statement (or click Config/Help… → Tutorial / Actions → Generate IB Statement now). The app calls IB's get_positions(), writes a synthetic CSV to TransactionHandlers.IBStatement.SrcFile (default ibstatement.cache), and that file then loads on next startup like a real export.

The generated CSV has only an Open Positions section — no realized P&L history. It seeds today's holdings; configure Flex separately for a deeper trade history going forward. It is not regenerated automatically — re-run the flag whenever you want a fresh baseline.


Fundamentals: yfinance vs Seeking Alpha (both via RapidAPI)

Two separate RapidAPI subscriptions back the fundamentals features:

| Config field | What it unlocks |
| --- | --- |
| Jupyter.RapidYFinanaceKey | Stock split history · YFinance-backed earnings (P/E, P/S) · per-symbol fundamentals in the embedded notebook |
| SeekingAlphaHeaders.X_RapidAPI_Key | Seeking Alpha-backed earnings only |

One yfinance key really does unlock three features — splits, earnings, and notebook fundamentals all read from the same key. If you're going to subscribe to only one thing, this is it.

The Earnings provider dropdown (TransactionHandlers.Earnings.Provider) controls which earnings source is used. Logic (TransactionHandlerManager._pick_yfinance_earnings):

| Setting | Behavior |
| --- | --- |
| PreferSeekingAlpha (default) | Use SA if its key is set, else YFinance if that key is set, else nothing |
| PreferYFinance | Use YFinance if its key is set, else SA if that key is set, else nothing |
| SeekingAlpha | Force SA; fails silently if its key is missing |
| YFinance | Force YFinance; fails silently if its key is missing |
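
The table can be read as an ordered preference list per setting. A minimal sketch of that reading; pick_earnings_provider is a hypothetical name and is not the signature of _pick_yfinance_earnings:

```python
from typing import Dict, List, Optional

# Preference order per setting, transcribed from the table above.
_ORDER: Dict[str, List[str]] = {
    "PreferSeekingAlpha": ["SeekingAlpha", "YFinance"],
    "PreferYFinance": ["YFinance", "SeekingAlpha"],
    "SeekingAlpha": ["SeekingAlpha"],
    "YFinance": ["YFinance"],
}


def pick_earnings_provider(setting: str, sa_key: str, yf_key: str) -> Optional[str]:
    """Return the provider to use, or None when no usable key is configured."""
    has_key = {"SeekingAlpha": bool(sa_key), "YFinance": bool(yf_key)}
    for provider in _ORDER[setting]:
        if has_key[provider]:
            return provider
    return None  # the "fails silently" / "else nothing" case
```

Returning None rather than raising mirrors the silent-failure behavior the table describes for the forced settings.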

Two cache files: earnings.cache (Seeking Alpha) and earnings_yf.cache (YFinance) live side by side. Switching providers does not invalidate the other — delete the right one if you change providers mid-stream.


Split adjustment

Splits are applied at transaction-merge time (not at fetch time and not at display time): InputProcessor.process_hist_internal walks transactions chronologically and adjusts share counts and cost basis using the splits dict that StockPrices.get_hist_split populates.

Source of truth is yfinance via RapidAPI — there is no IB or Polygon fallback for splits. If Jupyter.RapidYFinanaceKey is empty, get_hist_split returns early and the splits dict for that symbol stays empty.
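
To illustrate what adjusting share counts and cost basis means, here is a sketch under simple assumptions: transactions are (date, shares, price per share) tuples, splits are (date, ratio) pairs for one symbol, and a ratio of 4.0 means a 4-for-1 split. The function name is hypothetical and this is not the body of process_hist_internal.

```python
from datetime import date
from typing import List, Tuple

Transaction = Tuple[date, float, float]  # (day, shares, price per share)
Split = Tuple[date, float]               # (day, ratio)


def adjust_for_splits(transactions: List[Transaction], splits: List[Split]) -> List[Transaction]:
    """Restate each trade in post-split terms, walking trades chronologically."""
    adjusted = []
    for day, shares, price in sorted(transactions):
        factor = 1.0
        for split_day, ratio in splits:
            if split_day > day:  # only splits after the trade affect it
                factor *= ratio
        # Shares scale up and per-share cost scales down; total cost is unchanged.
        adjusted.append((day, shares * factor, price / factor))
    return adjusted
```

With an empty splits list the function returns the transactions unchanged, which corresponds to the missing-key case above: nothing fails, the share counts are simply never corrected.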

Silent staleness risk. Splits are cached in stocksplit.cache with TTL TransactionHandlers.IB.CacheSpan. If the cache is fresh but the key has been removed, no refresh fires and the old ratios stay in use. If a new split happens and the cache hasn't aged out, the app silently uses pre-split share counts. Mitigation: delete stocksplit.cache to force a refresh on next launch.

Polygon-without-RapidAPI gotcha. Polygon's daily bars are already split-adjusted, but your holdings are not — share counts in transactions still need the splits dict to back-adjust. A Polygon-only setup with no RapidAPI key will show correct price lines but wrong share counts after any split.


Currency normalization

Stock prices are fetched in each symbol's native currency. FX is applied at portfolio aggregation, not at fetch time.

  • Base currency: Symbols.Basecur (e.g. USD). Every position converts to this when forming portfolio totals.
  • Per-symbol currency override: Symbols.ExchangeCurrency patches IB's quirks (e.g. LSE returning GBp instead of GBP).
  • Multiplier: Symbols.CurrencyFactor handles the GBp→GBP=100 case and similar.
  • FX source: IB forex pairs only. Polygon does not implement get_currency_history. If you've switched to Polygon and have no cached FX history, multi-currency portfolios will show gaps.

FX rates are cached per-currency in memory at run time and persisted inside hist_file.cache alongside price history. The display-time FX lookup has its own age check separate from the price cache TTL.
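
A worked sketch of the conversion at aggregation time. It assumes the factor divides the quoted price (1250 GBp with a factor of 100 becomes 12.50 GBP) and that FX rates are keyed by (from, to) currency pairs; the direction of the factor, the function name, and the rate used are illustrative assumptions.

```python
from typing import Dict, Tuple


def to_base_currency(
    price: float,
    currency: str,
    base_currency: str,
    fx_rates: Dict[Tuple[str, str], float],
    currency_factor: float = 1.0,
) -> float:
    """Normalize a quoted price, then convert it into the base currency."""
    normalized = price / currency_factor  # assumption: factor divides the quote
    if currency == base_currency:
        return normalized
    # A missing pair raises KeyError, the sketch's analogue of an FX gap.
    return normalized * fx_rates[(currency, base_currency)]


# 1250 pence -> 12.50 GBP -> 15.625 USD at an illustrative rate of 1.25
value_usd = to_base_currency(1250.0, "GBP", "USD", {("GBP", "USD"): 1.25}, currency_factor=100.0)
```

Here the currency argument is the already-overridden one (GBP rather than GBp), which is the role the per-symbol override plays in the list above.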


Caches on disk

Files live in ~/.compare_my_stocks/ (or $COMPARE_STOCK_PATH).

| File | Holds | TTL knob | Force-refresh |
| --- | --- | --- | --- |
| hist_file.cache | Daily price history + currency history | Input.MaxCacheTimeSpan | Delete file, or set Input.InputSource ≠ Cache and lower Input.MaxCacheTimeSpan |
| hist_file.cache.back | Backup written before every save | | |
| Full data pickle (File.FullData) | Post-processing InputDataImpl snapshot | Input.MaxFullCacheTimeSpan | Set Input.FullCacheUsage = DONT |
| stocksplit.cache | Symbol → list of (date, ratio) | TransactionHandlers.IB.CacheSpan | Delete file |
| earnings.cache | (revdf, epsnorm) from Seeking Alpha | TransactionHandlers.Earnings.CacheSpan | Delete file |
| earnings_yf.cache | (revdf, epsnorm) from YFinance | TransactionHandlers.Earnings.CacheSpan | Delete file |
| ibtrans.cache | Cached Flex query response | TransactionHandlers.IB.CacheSpan | Delete file |
| buydicnk.cache | Cached My Stocks trades | per handler CacheSpan | Delete file |
| ibstatement.cache | Synthetic Activity Statement CSV | (no TTL; text file) | Re-run --generate-ib-statement |

Cache use modes. Input.FullCacheUsage is a three-state enum:

Value Meaning
DONT Never use the full-data cache; always reprocess
USEIFAVAILABLE Use if age < MaxFullCacheTimeSpan
FORCEUSE Use unconditionally — skip the age check
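
The three modes reduce to a small decision function. In this sketch the enum member names follow the table, while the numeric values, the function name, and the handling of a missing cache file are assumptions:

```python
from datetime import timedelta
from enum import Enum
from typing import Optional


class FullCacheUsage(Enum):
    # Member names from the table; the numeric values are arbitrary.
    DONT = 0
    USEIFAVAILABLE = 1
    FORCEUSE = 2


def should_use_full_cache(
    mode: FullCacheUsage,
    cache_age: Optional[timedelta],
    max_age: timedelta,
) -> bool:
    """Decide whether to load the full-data pickle instead of reprocessing."""
    if mode is FullCacheUsage.DONT or cache_age is None:
        return False  # disabled, or assumed: no cache file on disk
    if mode is FullCacheUsage.FORCEUSE:
        return True   # the age check is skipped entirely
    return cache_age < max_age
```

FORCEUSE is the only mode where an arbitrarily old snapshot is accepted, which is worth remembering when charts look out of date.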

There is no CLI flag for cache invalidation — to force a full refresh you delete the relevant .cache files or set FullCacheUsage = DONT temporarily.
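
Since invalidation is manual, a small helper script can make it less error-prone. This is a sketch, not part of the app: the function names are hypothetical, and it relies only on the directory rule stated above ($COMPARE_STOCK_PATH if set, otherwise ~/.compare_my_stocks/).

```python
import os
from pathlib import Path
from typing import Iterable, List, Mapping, Optional


def cache_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the data directory: $COMPARE_STOCK_PATH, else ~/.compare_my_stocks."""
    return Path(env.get("COMPARE_STOCK_PATH") or Path.home() / ".compare_my_stocks")


def delete_caches(names: Iterable[str], directory: Optional[Path] = None,
                  dry_run: bool = True) -> List[Path]:
    """Delete the named cache files; return the paths that were (or would be) removed."""
    directory = directory or cache_dir()
    removed = []
    for name in names:
        path = directory / name
        if path.is_file():
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

For example, delete_caches(["stocksplit.cache"], dry_run=False) forces a split refresh on the next launch. The default dry_run=True only reports what would be deleted.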

Pickle format risk. Caches are pickled with the active pandas version. hist_file.cache has a built-in repair pass for old nanosecond-vs-second Timestamp keys, so it survives most pandas upgrades; the other caches do not. If you upgrade pandas across a major version, expect to delete and re-fetch.


At the seams: things that surprise people

  • Polygon + no RapidAPI key → wrong share counts after splits. No warning, no log. Subscribe to yfinance-stock-market-data even if you don't want earnings.
  • Two earnings caches. Switching the Earnings provider dropdown leaves the other provider's cache stale on disk. Not a correctness problem (the unused one is never read), but disk hygiene.
  • OnlyNewerThanIBStatement cutoff is the statement's period end, not "today". A freshly exported statement can swallow Flex trades dated before its period end, while an old statement lets recent trades through.
  • Input.InputSource = Cache mode is silent. No banner, no log reminder. Easy to forget you set it.
  • Currency cache is inside the price cache. Deleting hist_file.cache blows away all FX history too; if Polygon is selected you cannot regenerate it without temporarily switching back to IB.
  • The IB preflight check has its own cooldown. After a failed connection, the app won't even try IB again for PreflightFailCacheSec seconds (default 40). Restart the app or wait it out — fixing TWS doesn't help instantly.
  • Two different RapidAPI keys, two different config paths. Splits and YFinance earnings live under Jupyter.RapidYFinanaceKey; Seeking Alpha lives under SeekingAlphaHeaders.X_RapidAPI_Key. The Jupyter. prefix is historical: the key is no longer used only by the embedded notebook.
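
The preflight cooldown in the list above can be pictured as a gate that remembers its last failure. A minimal sketch, assuming hypothetical names (tcp_probe, PreflightGate) and an injectable probe and clock; only the 40-second default comes from this page:

```python
import socket
import time
from typing import Callable, Optional


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class PreflightGate:
    """Skip connection attempts for a while after a failed probe."""

    def __init__(self, probe: Callable[[], bool], fail_cache_sec: float = 40.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._probe = probe
        self._fail_cache_sec = fail_cache_sec
        self._clock = clock
        self._last_fail: Optional[float] = None

    def allow_attempt(self) -> bool:
        now = self._clock()
        if self._last_fail is not None and now - self._last_fail < self._fail_cache_sec:
            return False  # still cooling down: the endpoint is not even probed
        if self._probe():
            self._last_fail = None
            return True
        self._last_fail = now
        return False
```

This shows why fixing TWS does not help instantly: until the cooldown expires, the gate answers from its remembered failure without touching the network.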

See also

  • Quick start — install, hook up your first source.
  • Config help — field-by-field credential reference with screenshots.
  • Configuration — every YAML field, what it does, defaults.
  • CLI: --generate-ib-statement and other launch flags.