📊 Data source: SEC EDGAR (free, no API key)
🔧 Difficulty: Intermediate
📈 Output: Earnings quality score table per company
Part 4 of 5 — The Python Investor’s Toolkit: Practical quant tools built with free data.
Price and momentum tell you what the market thinks about a stock. Earnings quality analysis tells you whether the market is right to believe it.
The Sloan accrual anomaly, first documented by Richard Sloan in 1996, shows that companies with high accounting accruals relative to their assets tend to underperform in the years that follow. The reason: when earnings are driven by accruals (accounting adjustments) rather than cash flow, those earnings are fragile and tend to reverse. For years the market consistently mispriced this, and to a meaningful extent it still does.
This tool pulls data directly from SEC EDGAR’s free public API — no Bloomberg, no financial data subscription, no API key — and computes the core earnings quality metrics for any public company.
What You’ll Need
pip install requests pandas
💡 About SEC EDGAR’s Free API: The SEC provides structured XBRL financial data for all public companies at data.sec.gov. The only requirement is a User-Agent header identifying yourself — SEC policy requires this to prevent abuse. Use your real name and email. Requests without a proper User-Agent header will be rate-limited or blocked. No registration or API key required.
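A minimal sketch of how you might build the identifying header and space out your requests. The `build_headers` and `Throttle` names are hypothetical helpers, not part of any SEC library, and the 5-requests-per-second default is just a conservative assumption that stays below the 10-requests-per-second ceiling in the SEC's fair-access guidance at the time of writing.

```python
import time

def build_headers(name, email):
    """Build a User-Agent header that identifies the caller by name and email."""
    if not name.strip() or "@" not in email:
        raise ValueError("Provide a real name and a valid email address")
    return {"User-Agent": f"{name.strip()} {email.strip()}"}

class Throttle:
    """Sleep as needed so consecutive calls are at least 1/max_per_second apart."""

    def __init__(self, max_per_second=5):
        self.min_interval = 1.0 / max_per_second
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()

# Placeholder identity: replace with your own details
headers = build_headers("Jane Doe", "jane@example.com")
throttle = Throttle(max_per_second=5)
print(headers)
```

Call `throttle.wait()` immediately before each `requests.get(url, headers=headers)` when looping over many companies.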
The Key Metrics Explained
| Metric | Formula | High Quality Signal | Warning Signal |
|---|---|---|---|
| Accruals Ratio | (Net Income − CFO) ÷ Avg Assets | < −0.05 (CFO >> NI) | > +0.05 (NI >> CFO) |
| Cash Conversion | CFO ÷ Net Income | > 1.10 (cash-rich earnings) | < 0.80 (earnings thin on cash) |
| Net Income | From income statement | Stable or growing | Rising while CFO is flat |
| Operating CFO | From cash flow statement | Tracks or exceeds NI | Diverges from NI trend |
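To make the two formulas in the table concrete, here is a small worked example. The company and its figures are invented for illustration; only the formulas come from the table above.

```python
def accruals_ratio(net_income, cfo, assets_start, assets_end):
    """(Net Income - CFO) divided by the average of opening and closing total assets."""
    avg_assets = (assets_start + assets_end) / 2
    return (net_income - cfo) / avg_assets

def cash_conversion(net_income, cfo):
    """Operating cash flow generated per dollar of reported net income."""
    return cfo / net_income

# Hypothetical company, figures in $ millions
ni, cfo = 500.0, 680.0
ratio = accruals_ratio(ni, cfo, assets_start=2800.0, assets_end=3200.0)
conv = cash_conversion(ni, cfo)

print(f"accruals ratio:  {ratio:+.3f}")  # -0.060
print(f"cash conversion: {conv:.2f}")    # 1.36
```

With an accruals ratio below −0.05 and cash conversion above 1.10, this hypothetical company lands in the "High Quality Signal" column on both measures.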
Finding a Company’s CIK Number
Every public company has a CIK (Central Index Key), a unique identifier in the SEC system. Find it at EDGAR Full-Text Search, or use the quick lookup function in the code below, which maps a handful of common tickers to their CIKs.
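If you need more tickers than a hand-maintained table can hold, one option is the ticker-to-CIK file the SEC publishes at https://www.sec.gov/files/company_tickers.json. The sketch below assumes that file is a JSON object whose values look like `{"cik_str": ..., "ticker": ..., "title": ...}`; verify the shape against a live download before relying on it. `build_ticker_index` is a hypothetical helper, and the sample payload is hand-written rather than downloaded.

```python
def build_ticker_index(payload):
    """Map upper-case ticker -> integer CIK from a company_tickers.json-style payload."""
    return {row["ticker"].upper(): int(row["cik_str"]) for row in payload.values()}

# Small hand-written sample in the assumed shape (not downloaded data)
sample = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}

index = build_ticker_index(sample)
print(index.get("AAPL"))  # 320193
print(index.get("ZZZZ"))  # None

# Live use (assumption: the file keeps this shape):
#   payload = requests.get("https://www.sec.gov/files/company_tickers.json",
#                          headers=HEADERS, timeout=30).json()
#   index = build_ticker_index(payload)
```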
The Code
import requests
import pandas as pd
# Your info (required by SEC for User-Agent)
HEADERS = {'User-Agent': 'YourName your@email.com'}
def find_cik(ticker):
    """
    Look up a company's CIK by ticker symbol in a small hard-coded table.
    Returns None if the ticker is not listed; add entries as needed.
    """
    # Direct CIK lookup (use known CIKs for reliability)
    known_ciks = {
        'AAPL': 320193,
        'MSFT': 789019,
        'AMZN': 1018724,
        'GOOGL': 1652044,
        'META': 1326801,
        'NVDA': 1045810,
        'TSLA': 1318605,
        'JPM': 19617,
        'JNJ': 200406,
        'XOM': 34088,
        'WMT': 104169,
        'BAC': 70858,
        'BRK-B': 1067983,
    }
    return known_ciks.get(ticker.upper())
def get_company_facts(cik):
    """Download all XBRL financial facts for a company from SEC EDGAR."""
    cik_padded = str(cik).zfill(10)  # the API expects a 10-digit, zero-padded CIK
    url = f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik_padded}.json"
    response = requests.get(url, headers=HEADERS, timeout=30)
    if response.status_code != 200:
        raise ValueError(
            f"Request for CIK {cik} failed with HTTP {response.status_code}. "
            "Check the CIK and your User-Agent header."
        )
    return response.json()
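def extract_annual(facts, concept, unit='USD'):
    """
    Extract annual (10-K) values for a specific financial concept.
    Common concepts: NetIncomeLoss, NetCashProvidedByUsedInOperatingActivities, Assets
    """
    try:
        data = facts['facts']['us-gaap'][concept]['units'][unit]
        df = pd.DataFrame(data)
        df = df[df['form'] == '10-K'].copy()
        df['end'] = pd.to_datetime(df['end'])
        # Flow items (income, cash flow) carry a 'start' date. 10-Ks can also tag
        # quarterly figures ending on the fiscal year end, so keep full-year periods only.
        if 'start' in df.columns:
            days = (df['end'] - pd.to_datetime(df['start'])).dt.days
            df = df[days > 300]
        # Keep one entry per fiscal year end, preferring the most recently filed value
        df = df.sort_values(['end', 'filed']).drop_duplicates('end', keep='last')
        return df[['end', 'val']].rename(columns={'val': concept}).reset_index(drop=True)
    except KeyError:
        # Concept or unit not reported by this company
        return pd.DataFrame()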
def analyze_earnings_quality(ticker_or_cik, label=None):
    """
    Full earnings quality report for a company using SEC EDGAR data.
    Pass either a ticker symbol (for known companies) or a CIK integer.
    """
    if isinstance(ticker_or_cik, str):
        cik = find_cik(ticker_or_cik)
        label = label or ticker_or_cik
        if not cik:
            print(f"CIK not found for {ticker_or_cik}. Add it to known_ciks dict.")
            return None
    else:
        cik = ticker_or_cik
        label = label or f"CIK {cik}"
    print(f"\nFetching EDGAR data for {label} (CIK: {cik})...")
    facts = get_company_facts(cik)
    # Pull three core line items
    ni = extract_annual(facts, 'NetIncomeLoss')
    cfo = extract_annual(facts, 'NetCashProvidedByUsedInOperatingActivities')
    ta = extract_annual(facts, 'Assets')
    if ni.empty or cfo.empty or ta.empty:
        print(f" Incomplete data: one or more line items missing for {label}")
        return None
    # Merge on fiscal year end date
    df = ni.merge(cfo, on='end').merge(ta, on='end')
    df.columns = ['end', 'net_income', 'cfo', 'total_assets']
    df['year'] = df['end'].dt.year
    # Compute earnings quality metrics
    # (the first year has no prior-year assets, so its ratio is NaN and is dropped below)
    df['avg_assets'] = (df['total_assets'] + df['total_assets'].shift(1)) / 2
    df['accruals'] = df['net_income'] - df['cfo']
    df['accruals_ratio'] = (df['accruals'] / df['avg_assets']).round(3)
    df['cash_conversion'] = (df['cfo'] / df['net_income']).round(2)

    def signal(row):
        # Note: the HIGH QUALITY cutoffs here (-0.03, 1.05) are slightly looser
        # than the -0.05 / 1.10 thresholds shown in the metrics table.
        good = row['accruals_ratio'] < -0.03 and row['cash_conversion'] > 1.05
        bad = row['accruals_ratio'] > 0.05 or row['cash_conversion'] < 0.80
        return 'HIGH QUALITY' if good else ('INVESTIGATE' if bad else 'AVERAGE')

    df['signal'] = df.apply(signal, axis=1)
    # Format output: last six fiscal years with a computable accruals ratio
    out = df.dropna(subset=['accruals_ratio']).tail(6).copy()
    out['net_income'] = out['net_income'].apply(lambda x: f"${x/1e9:.1f}B")
    out['cfo'] = out['cfo'].apply(lambda x: f"${x/1e9:.1f}B")
    print(f"\n{'='*65}")
    print(f" Earnings Quality Report: {label}")
    print(f"{'='*65}")
    print(out[['year', 'net_income', 'cfo', 'accruals_ratio',
               'cash_conversion', 'signal']].to_string(index=False))
    print(f"{'='*65}")
    return df
# -------------------------------------------------------
# Run analysis on multiple companies
# -------------------------------------------------------
for ticker in ['AAPL', 'MSFT', 'AMZN', 'NVDA', 'META']:
    analyze_earnings_quality(ticker)
What Good vs. Bad Looks Like
Apple typically shows cash conversion ratios above 1.1: it consistently generates more than $1.10 of operating cash flow for every $1.00 of reported net income. That’s a hallmark of high-quality earnings. Contrast this with companies that report growing net income while operating cash flow stagnates: that divergence is one of the most reliable early warning signs of future earnings disappointment or restatement risk.
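The divergence pattern described above (net income growing while operating cash flow stagnates) can be checked mechanically. The sketch below is one possible way to do it: `flag_divergence` is a hypothetical helper, the 5% and 0% growth cutoffs are arbitrary assumptions you should tune, and the input series are invented.

```python
def flag_divergence(years, net_income, cfo, ni_growth_min=0.05, cfo_growth_max=0.0):
    """Return the years in which net income grew by at least ni_growth_min
    while operating cash flow growth was at or below cfo_growth_max."""
    flagged = []
    for i in range(1, len(years)):
        if net_income[i - 1] <= 0 or cfo[i - 1] <= 0:
            continue  # growth rates are not meaningful from a non-positive base
        ni_growth = net_income[i] / net_income[i - 1] - 1
        cfo_growth = cfo[i] / cfo[i - 1] - 1
        if ni_growth >= ni_growth_min and cfo_growth <= cfo_growth_max:
            flagged.append(years[i])
    return flagged

# Invented figures ($ millions): earnings climb every year, cash flow rolls over
years = [2020, 2021, 2022, 2023]
ni = [100, 112, 125, 140]
cfo = [120, 125, 124, 118]

print(flag_divergence(years, ni, cfo))  # [2022, 2023]
```

In 2021 both series grow, so nothing is flagged. In 2022 and 2023 net income rises by more than 10% while operating cash flow falls, which is the pattern worth investigating.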
📈 Key Insight: The most powerful use of this tool isn’t analyzing blue chips — it’s screening mid-caps before an earnings season. Run it on your watchlist in the week before earnings. Companies with deteriorating cash conversion going into a print are at significantly higher risk of guiding down, even if the headline EPS number meets expectations.
⚠️ Watch Out: XBRL taxonomy can vary between companies and fiscal years. Some companies report under a different concept name (e.g., ProfitLoss rather than NetIncomeLoss). If you get an empty result, check facts['facts']['us-gaap'].keys() to see which concepts are available for that company. The SEC EDGAR API is free and highly reliable, but data normalization requires occasional manual review.
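One way to handle differing concept names is to try a short list of candidates in order and use the first one the company actually reports. This is a sketch under assumptions: `first_available_concept` is a hypothetical helper, the candidate list is illustrative rather than exhaustive, and the `fake_facts` dictionary is a hand-written stand-in for a real companyfacts response.

```python
def first_available_concept(facts, candidates, taxonomy="us-gaap"):
    """Return the first concept name in `candidates` that the company reports, else None."""
    available = facts.get("facts", {}).get(taxonomy, {})
    for name in candidates:
        if name in available:
            return name
    return None

# Illustrative candidate list (an assumption; extend it for your own universe)
NET_INCOME_CANDIDATES = ["NetIncomeLoss", "ProfitLoss"]

# Hand-written stand-in for a companyfacts response, not real filing data
fake_facts = {"facts": {"us-gaap": {"ProfitLoss": {}, "Assets": {}}}}

concept = first_available_concept(fake_facts, NET_INCOME_CANDIDATES)
print(concept)  # ProfitLoss
```

The returned name can then be passed to `extract_annual(facts, concept)` in place of a hard-coded concept. Different concepts are not always defined identically, so review the numbers before comparing companies that report under different tags.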
📊 Portfolio Takeaway
Favor companies with cash conversion consistently above 1.1: they generate more operating cash than reported net income, a hallmark of durable earnings. Before buying into an earnings beat, check the accruals ratio: if it’s above +0.05, the beat may be accounting-driven rather than cash-backed and is more likely to reverse. Run this analysis quarterly on your top 5–10 holdings, keeping in mind that the code as written reads annual 10-K data, so its figures only change when a new annual report is filed. A sustained decline in cash conversion is one of the most reliable early warnings of future earnings disappointment, often appearing 1–2 quarters before the miss.
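To put a number on "sustained decline", you could count how many consecutive periods cash conversion has fallen. The helper below is a hypothetical sketch with invented values; how long a streak should be before you act is a judgment call, not something the data settles.

```python
def declining_streak(values):
    """Number of consecutive period-over-period declines ending at the latest value."""
    streak = 0
    # Walk backwards from the most recent pair of observations
    for prev, curr in zip(reversed(values[:-1]), reversed(values[1:])):
        if curr < prev:
            streak += 1
        else:
            break
    return streak

# Invented cash conversion history, oldest first
cash_conversion_history = [1.25, 1.30, 1.18, 1.05, 0.92]

print(declining_streak(cash_conversion_history))  # 3
```

Here cash conversion has fallen three periods in a row and the latest reading sits below the table's 1.10 high-quality threshold, so this holding would be a candidate for a closer look.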
What’s Next
Part 5 closes the series by zooming out to the macro picture — building a regime detector using FRED’s free API that tells you whether the economy is in expansion, late-cycle, slowdown, or recession risk, and what that implies for your asset allocation.
Series: The Python Investor’s Toolkit
Part 1: Momentum Scanner
Part 2: Monte Carlo Portfolio Stress-Test
Part 3: Mean Reversion Alert System
Part 4: Earnings Quality Analyzer — SEC EDGAR (this post)
Part 5: Macro Regime Detector (FRED API)
