Data Sources¶
Compare-my-stocks pulls from several independent sources. This page explains how they fit together — what each one gives you, when they overlap, and how merge / cache / refresh behavior shapes what shows up on the chart.
For initial setup walk-throughs see Quick start. For every field with screenshots see Config help. For a flat field reference see Configuration.
Market data: IB vs Polygon¶
You pick one source via Input.InputSource. There is no automatic
fallback from one live source to the other: if the chosen source
fails, the app falls back to cache only (and prompts unless
PromptOnConnectionFail is off).
| | Interactive Brokers | Polygon |
|---|---|---|
| Auth | Local TCP to TWS / IB Gateway (no API key) | REST API key |
| Granularity | Daily bars only | Daily bars only (timespan=day) |
| Stocks / ETFs / Indexes | ✅ | ✅ |
| Crypto | ✅ (via IB) | ✅ |
| Forex | ✅ (used for FX conversion) | ❌ (raises NotImplementedError) |
| Pre-split-adjusted? | ❌ (adjustment runs separately; see Split adjustment below) | ✅ (adjusted=True in list_aggs()) |
| Position / portfolio reads | ✅ via get_positions() | ❌ (Polygon has no portfolio concept) |
| Rate limit | None enforced in code; requests serialized through async loop | 1 req / sec implied (free tier); single 429 retry with 1.2 s backoff |
| Connection failure handling | TCP preflight gate (PreflightFailCacheSec, default 40 s) before each retry | HTTP-level errors propagate |
Consequence: if you switch to Polygon, you lose live portfolio sync and forex — the app cannot compute multi-currency P&L without IB up, unless you've previously cached FX history.
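The "single 429 retry with 1.2 s backoff" behavior in the Polygon column can be pictured roughly as follows. This is a minimal sketch, not the app's code: `fetch_with_single_retry`, the `fetch` callable returning `(status_code, payload)`, and the injectable `sleep` are all hypothetical names introduced for illustration.

```python
import time
from typing import Any, Callable, Tuple

RETRY_BACKOFF_SEC = 1.2  # backoff quoted in the comparison table


def fetch_with_single_retry(
    fetch: Callable[[], Tuple[int, Any]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call fetch(); on HTTP 429 wait once and retry exactly one time.

    Hypothetical helper: `fetch` returns (status_code, payload).
    Any other status, including a second 429, goes back to the caller.
    """
    status, payload = fetch()
    if status == 429:
        sleep(RETRY_BACKOFF_SEC)
        status, payload = fetch()
    return status, payload
```

Because there is only one retry, a second consecutive 429 surfaces to the caller, which matches the "HTTP-level errors propagate" row.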
Cache-only mode. Setting Input.InputSource = Cache disables all
live fetching and reads exclusively from hist_file.cache. Useful for
offline / demo / testing, but the app will silently serve stale data
— there is no banner.
Transactions: three sources, one merged view¶
Transactions are independent of market data. You can use any combination of:
| Source | What it gives you | Cadence |
|---|---|---|
| IB Flex Web Service (token + query ID) | Recent IB trades, pulled on launch | Live trickle |
| IB Activity Statement CSV (Client Portal export) | Historical positions + realized P&L baseline | One-off seed |
| My Stocks Portfolio CSV (Peeksoft app) | Transactions from any broker | Manual export |
Typical setup¶
The common workflow is:
- Download an IB Activity Statement CSV once to seed historical positions. Point TransactionHandlers.IBStatement.SrcFile at the file.
- Configure the Flex Web Service token + query ID. On every launch the app fetches recent trades from Flex and merges them on top of the statement.
- Set OnlyNewerThanIBStatement so Flex trades dated before the statement's period end are skipped; they're already covered by the baseline.
My Stocks CSV is for users without IB, or to add trades from a second broker alongside the IB sources.
How the three sources merge¶
TransactionHandlerManager.combine_transactions() merges in this
order:
- IB Flex + My Stocks merged by CombineStrategy:
  - PREFERIB: IB trades are the base; My Stocks rows added only when no IB row matches by date and amount.
  - PREFERSTOCKS: My Stocks rows tagged Notes="IB:..." are matched against IB and deduplicated; others added fresh.
  - Duplicate detection uses CombineDateDiff (days) and CombineAmountPerc (% of value).
- IB Activity Statement layered last: for any date not already present in the merged dict, synthetic entries seed the open positions implied by the statement.
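A PREFERIB-style merge with tolerance-based duplicate detection might look like the sketch below. It is illustrative only: the `Trade` shape, the function names, the default tolerances, and the exact comparison rule (same symbol, dates within the day window, amounts within a percentage of the IB amount) are assumptions, not the behavior of combine_transactions() itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Trade:
    # Hypothetical record shape used only for this sketch.
    symbol: str
    day: date
    amount: float


def is_duplicate(a: Trade, b: Trade, date_diff_days: int, amount_perc: float) -> bool:
    """Assumed rule: same symbol, dates within the window, amounts within a percentage."""
    if a.symbol != b.symbol:
        return False
    if abs((a.day - b.day).days) > date_diff_days:
        return False
    tolerance = abs(a.amount) * amount_perc / 100.0
    return abs(a.amount - b.amount) <= tolerance


def merge_prefer_ib(ib: List[Trade], my_stocks: List[Trade],
                    date_diff_days: int = 1, amount_perc: float = 2.0) -> List[Trade]:
    """IB rows are the base; My Stocks rows are added only if no IB row matches.

    The default tolerances are placeholders, not the app's defaults.
    """
    merged = list(ib)
    for row in my_stocks:
        if not any(is_duplicate(ib_row, row, date_diff_days, amount_perc) for ib_row in ib):
            merged.append(row)
    return sorted(merged, key=lambda t: t.day)
```

With these tolerances, a My Stocks row one day later and 1% larger than an IB row is treated as the same trade and dropped, while a row for a different symbol is added.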
The OnlyNewerThanIBStatement knob¶
When set, Flex trades older than the Activity Statement's period end date are filtered out. The intent is to avoid double-counting trades that the Statement already captures as closing positions.
The gotcha: the cutoff is read from the statement CSV's header, not from "now". If your statement is months old, recent Flex trades layer on cleanly. If your statement is fresh, anything Flex returns from before that date silently disappears.
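The cutoff can be sketched as a simple date filter. The tuple layout and the function name are hypothetical, and treating a trade dated exactly on the period end as already covered is an assumption; the text above only says that older trades are dropped.

```python
from datetime import date
from typing import Iterable, List, Tuple

# (symbol, trade date, quantity): hypothetical shape for this sketch
FlexTrade = Tuple[str, date, float]


def drop_trades_covered_by_statement(
    flex_trades: Iterable[FlexTrade], statement_period_end: date
) -> List[FlexTrade]:
    """Keep only Flex trades dated strictly after the statement's period end."""
    return [t for t in flex_trades if t[1] > statement_period_end]
```

Note that the comparison uses the date read from the statement, never the current date, which is why a freshly exported statement hides everything Flex returns from before it.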
Generating a baseline from live IB positions¶
If you have IB connected but no Activity Statement yet, run with
--generate-ib-statement (or click Config/Help… → Tutorial / Actions
→ Generate IB Statement now). The app calls IB's get_positions(),
writes a synthetic CSV to TransactionHandlers.IBStatement.SrcFile
(default ibstatement.cache), and that file then loads on next
startup like a real export.
The generated CSV has only an Open Positions section — no realized P&L history. It seeds today's holdings; configure Flex separately for a deeper trade history going forward. It is not regenerated automatically — re-run the flag whenever you want a fresh baseline.
Fundamentals: yfinance vs Seeking Alpha (both via RapidAPI)¶
Two separate RapidAPI subscriptions back the fundamentals features:
| Config field | What it unlocks |
|---|---|
| Jupyter.RapidYFinanaceKey | Stock split history · YFinance-backed earnings (P/E, P/S) · per-symbol fundamentals in the embedded notebook |
| SeekingAlphaHeaders.X_RapidAPI_Key | Seeking Alpha-backed earnings only |
One yfinance key really does unlock three features — splits, earnings, and notebook fundamentals all read from the same key. If you're going to subscribe to only one thing, this is it.
The Earnings provider dropdown (TransactionHandlers.Earnings.Provider)
controls which earnings source is used. Logic
(TransactionHandlerManager._pick_yfinance_earnings):
| Setting | Behavior |
|---|---|
| PreferSeekingAlpha (default) | Use SA if its key is set, else YFinance if that key is set, else nothing |
| PreferYFinance | Mirror of above with YFinance first |
| SeekingAlpha | Force SA; fails silently if its key is missing |
| YFinance | Force YFinance; same |
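The table above translates into a small decision function. This is a sketch of the documented behavior rather than the body of _pick_yfinance_earnings; the function name, the string return values, and the ValueError for unknown settings are assumptions.

```python
from typing import List, Optional, Tuple


def pick_earnings_provider(setting: str, sa_key: str, yf_key: str) -> Optional[str]:
    """Return "SeekingAlpha", "YFinance", or None, following the settings table."""
    has_sa, has_yf = bool(sa_key), bool(yf_key)
    order: List[Tuple[str, bool]]
    if setting == "PreferSeekingAlpha":
        order = [("SeekingAlpha", has_sa), ("YFinance", has_yf)]
    elif setting == "PreferYFinance":
        order = [("YFinance", has_yf), ("SeekingAlpha", has_sa)]
    elif setting == "SeekingAlpha":
        order = [("SeekingAlpha", has_sa)]
    elif setting == "YFinance":
        order = [("YFinance", has_yf)]
    else:
        raise ValueError(f"unknown provider setting: {setting}")
    for name, available in order:
        if available:
            return name
    return None  # forced provider without a key: nothing is fetched
```

The `None` branch is the "fails silently" case: a forced provider with a missing key yields no earnings data and no error.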
Two cache files: earnings.cache (Seeking Alpha) and
earnings_yf.cache (YFinance) live side by side. Switching providers
does not invalidate the other — delete the right one if you change
providers mid-stream.
Split adjustment¶
Splits are applied at transaction-merge time (not at fetch time and
not at display time): InputProcessor.process_hist_internal walks
transactions chronologically and adjusts share counts and cost basis
using the splits dict that StockPrices.get_hist_split populates.
Source of truth is yfinance via RapidAPI — there is no IB or
Polygon fallback for splits. If Jupyter.RapidYFinanaceKey is empty,
get_hist_split returns early and the splits dict for that symbol
stays empty.
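To make the adjustment concrete, here is a minimal sketch of back-adjusting transactions with a splits dict shaped like the one in stocksplit.cache (symbol → list of (date, ratio)). The tuple layout, the function name, and the choice to express every trade in post-split units are assumptions; process_hist_internal may organize the work differently.

```python
from datetime import date
from typing import Dict, List, Tuple

Splits = Dict[str, List[Tuple[date, float]]]  # symbol -> [(split date, ratio)]
Txn = Tuple[str, date, float, float]          # (symbol, date, shares, price per share)


def back_adjust(transactions: List[Txn], splits: Splits) -> List[Txn]:
    """Express every transaction in post-split shares and prices."""
    adjusted = []
    for symbol, day, shares, price in sorted(transactions, key=lambda t: t[1]):
        factor = 1.0
        for split_day, ratio in splits.get(symbol, []):
            if split_day > day:  # split happened after this trade
                factor *= ratio
        # Shares scale up, price scales down, so total cost is preserved.
        adjusted.append((symbol, day, shares * factor, price / factor))
    return adjusted
```

With an empty splits dict every factor stays at 1.0 and the transactions pass through unchanged, which is exactly the situation when the RapidAPI key is missing.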
Silent staleness risk. Splits are cached in stocksplit.cache with
TTL TransactionHandlers.IB.CacheSpan. If the cache is fresh but the
key has been removed, no refresh fires and the old ratios stay in use.
If a new split happens and the cache hasn't aged out, the app silently
uses pre-split share counts. Mitigation: delete stocksplit.cache
to force a refresh on next launch.
Polygon-without-RapidAPI gotcha. Polygon's daily bars are already split-adjusted, but your holdings are not — share counts in transactions still need the splits dict to back-adjust. A Polygon-only setup with no RapidAPI key will show correct price lines but wrong share counts after any split.
Currency normalization¶
Stock prices are fetched in each symbol's native currency. FX is applied at portfolio aggregation, not at fetch time.
- Base currency: Symbols.Basecur (e.g. USD). Every position converts to this when forming portfolio totals.
- Per-symbol currency override: Symbols.ExchangeCurrency patches IB's quirks (e.g. LSE returning GBp instead of GBP).
- Multiplier: Symbols.CurrencyFactor handles the GBp→GBP=100 case and similar.
- FX source: IB forex pairs only. Polygon does not implement get_currency_history. If you've switched to Polygon and have no cached FX history, multi-currency portfolios will show gaps.
FX rates are cached per-currency in memory at run time and persisted
inside hist_file.cache alongside price history. The display-time FX
lookup has its own age check separate from the price cache TTL.
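As a worked example of applying FX at aggregation time, the sketch below converts a pence-quoted LSE position into a USD base. The function names are hypothetical, and two conventions are assumed rather than taken from the app: the currency factor is the number of quoted units per major unit (100 for GBp → GBP), and the FX rate is base-currency units per one major unit.

```python
def to_base_currency(native_price: float, currency_factor: float, fx_rate: float) -> float:
    """Convert a native quote to the base currency.

    Assumptions: currency_factor is how many quoted units make one major unit
    (100 for GBp -> GBP); fx_rate is base-currency units per one major unit.
    """
    return native_price / currency_factor * fx_rate


def position_value(shares: float, native_price: float,
                   currency_factor: float = 1.0, fx_rate: float = 1.0) -> float:
    """Value of one position in the base currency."""
    return shares * to_base_currency(native_price, currency_factor, fx_rate)


# 10 shares quoted at 2500 GBp, with an illustrative GBP/USD rate of 1.25
lse_value = position_value(10, 2500.0, currency_factor=100.0, fx_rate=1.25)
```

Here 2500 GBp is 25 GBP per share, or 31.25 USD at the illustrative rate, so the position contributes 312.5 USD to the portfolio total. Without the factor the same position would be overstated by 100x.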
Caches on disk¶
Files live in ~/.compare_my_stocks/ (or $COMPARE_STOCK_PATH).
| File | Holds | TTL knob | Force-refresh |
|---|---|---|---|
| hist_file.cache | Daily price history + currency history | Input.MaxCacheTimeSpan | Delete file, or set Input.InputSource ≠ Cache and bump TTL |
| hist_file.cache.back | Backup written before every save | - | - |
| Full data pickle (File.FullData) | Post-processing InputDataImpl snapshot | Input.MaxFullCacheTimeSpan | Set Input.FullCacheUsage = DONT |
| stocksplit.cache | Symbol → list of (date, ratio) | TransactionHandlers.IB.CacheSpan | Delete file |
| earnings.cache | (revdf, epsnorm) from Seeking Alpha | TransactionHandlers.Earnings.CacheSpan | Delete file |
| earnings_yf.cache | (revdf, epsnorm) from YFinance | TransactionHandlers.Earnings.CacheSpan | Delete file |
| ibtrans.cache | Cached Flex query response | TransactionHandlers.IB.CacheSpan | Delete file |
| buydicnk.cache | Cached My Stocks trades | per handler CacheSpan | Delete file |
| ibstatement.cache | Synthetic Activity Statement CSV | (no TTL; text file) | Re-run --generate-ib-statement |
Cache use modes — Input.FullCacheUsage is a three-state enum:
| Value | Meaning |
|---|---|
| DONT | Never use the full-data cache; always reprocess |
| USEIFAVAILABLE | Use if age < MaxFullCacheTimeSpan |
| FORCEUSE | Use unconditionally; skip the age check |
There is no CLI flag for cache invalidation — to force a full
refresh you delete the relevant .cache files or set
FullCacheUsage = DONT temporarily.
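The three cache modes reduce to a short decision rule, sketched below. The enum member names come from the table above; the function name, the use of `timedelta` for ages, and the rule that a missing cache file is never used are assumptions.

```python
from datetime import timedelta
from enum import Enum


class FullCacheUsage(Enum):
    DONT = "DONT"
    USEIFAVAILABLE = "USEIFAVAILABLE"
    FORCEUSE = "FORCEUSE"


def should_use_full_cache(mode: FullCacheUsage, cache_exists: bool,
                          age: timedelta, max_age: timedelta) -> bool:
    """Decide whether to load the full-data pickle instead of reprocessing."""
    if not cache_exists or mode is FullCacheUsage.DONT:
        return False
    if mode is FullCacheUsage.FORCEUSE:
        return True  # age check skipped entirely
    return age < max_age  # USEIFAVAILABLE
```

Seen this way, setting FullCacheUsage = DONT is the only in-config way to guarantee reprocessing; FORCEUSE will happily serve a snapshot of any age.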
Pickle format risk. Caches are pickled with the active pandas
version. hist_file.cache has a built-in repair pass for old
nanosecond-vs-second Timestamp keys, so it survives most pandas
upgrades; the other caches do not. If you upgrade pandas across a
major version, expect to delete and re-fetch.
At the seams: things that surprise people¶
- Polygon + no RapidAPI key → wrong share counts after splits. No warning, no log. Subscribe to yfinance-stock-market-data even if you don't want earnings.
- Two earnings caches. Switching the Earnings provider dropdown leaves the other provider's cache stale on disk. Not a correctness problem (the unused one is never read), but disk hygiene.
- OnlyNewerThanIBStatement cutoff is the statement's period end, not "today". A freshly exported statement can swallow Flex trades dated before its period end.
- Input.InputSource = Cache mode is silent. No banner, no log reminder. Easy to forget you set it.
- Currency cache is inside the price cache. Deleting hist_file.cache blows away all FX history too; if Polygon is selected you cannot regenerate it without temporarily switching back to IB.
- The IB preflight check has its own cooldown. After a failed connection, the app won't even try IB again for PreflightFailCacheSec seconds (default 40). Restart the app or wait it out; fixing TWS doesn't help instantly.
- Two different RapidAPI keys, two different config paths. Splits and YFinance earnings live under Jupyter.RapidYFinanaceKey; Seeking Alpha lives under SeekingAlphaHeaders.X_RapidAPI_Key. The Jupyter. prefix is historical: the key is no longer specific to the embedded notebook.
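The preflight cooldown mentioned in the list above can be sketched as a small gate that remembers the last failed probe. Everything here is illustrative: `tcp_probe`, `PreflightGate`, and the injectable clock are hypothetical names, and only the 40-second default comes from this page.

```python
import socket
import time
from typing import Callable, Optional


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class PreflightGate:
    """Remember a failed probe for `fail_cache_sec` seconds before probing again."""

    def __init__(self, probe: Callable[[], bool], fail_cache_sec: float = 40.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._probe = probe
        self._fail_cache_sec = fail_cache_sec
        self._clock = clock
        self._last_failure: Optional[float] = None

    def may_connect(self) -> bool:
        now = self._clock()
        if self._last_failure is not None and now - self._last_failure < self._fail_cache_sec:
            return False  # still cooling down: do not even probe
        ok = self._probe()
        self._last_failure = None if ok else now
        return ok
```

The early return is what makes "fixing TWS doesn't help instantly" true: during the cooldown the gate answers from memory without touching the network. A fresh process starts with no remembered failure, which is why restarting the app bypasses the wait.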
See also¶
- Quick start: install, hook up your first source.
- Config help: field-by-field credential reference with screenshots.
- Configuration: every YAML field, what it does, defaults.
- CLI: --generate-ib-statement and other launch flags.