Saturday, March 28, 2026
Analyze Earnings Quality for Free with Python and SEC EDGAR

Build time: ~25 min
📊 Data source: SEC EDGAR (free, no API key)
🔧 Difficulty: Intermediate
📈 Output: Earnings quality score table per company

Python Investor’s Toolkit: Part 1: Momentum · Part 2: Monte Carlo · Part 3: Mean Reversion · Part 4: Earnings Quality · Part 5: Macro Regime

Part 4 of 5 — The Python Investor’s Toolkit: Practical quant tools built with free data.

Price and momentum tell you what the market thinks about a stock. Earnings quality analysis tells you whether the market is right to believe it.

The Sloan accrual anomaly, first documented by Richard Sloan in 1996 and still cited as a source of alpha today, shows that companies with high accounting accruals relative to their assets systematically underperform. The reason: when earnings are driven by accruals (accounting adjustments) rather than cash flow, those earnings are fragile and tend to reverse. For years the market consistently mispriced this, and to a meaningful extent it still does.

This tool pulls data directly from SEC EDGAR’s free public API — no Bloomberg, no financial data subscription, no API key — and computes the core earnings quality metrics for any public company.

What You’ll Need

pip install requests pandas

💡 About SEC EDGAR’s Free API: The SEC provides structured XBRL financial data for all public companies at data.sec.gov. The only requirement is a User-Agent header identifying yourself — SEC policy requires this to prevent abuse. Use your real name and email. Requests without a proper User-Agent header will be rate-limited or blocked. No registration or API key required.
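To stay on the right side of the SEC's fair-access policy, it can help to space requests out. The sketch below assumes the SEC's published guideline of at most 10 requests per second (check the current policy before relying on it). `Throttle`, `polite_get`, and `MIN_INTERVAL` are illustrative names, not part of the original script.

```python
import time

MIN_INTERVAL = 0.11  # seconds between calls; assumption: keeps us under ~10 requests/second


class Throttle:
    """Block just long enough to keep a minimum gap between successive calls."""

    def __init__(self, min_interval=MIN_INTERVAL, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock      # injectable so the class can be tested without real waiting
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


def polite_get(url, headers, throttle):
    """GET a URL after waiting for the throttle (hypothetical helper)."""
    import requests  # imported lazily so Throttle itself has no third-party dependency
    throttle.wait()
    return requests.get(url, headers=headers, timeout=30)
```

Create one `Throttle()` and pass it to every `polite_get` call when looping over a long watchlist; for the five-ticker example in this post the delay is barely noticeable.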

The Key Metrics Explained

Metric | Formula | High Quality Signal | Warning Signal
Accruals Ratio | (Net Income − CFO) ÷ Avg Assets | < −0.05 (CFO >> NI) | > +0.05 (NI >> CFO)
Cash Conversion | CFO ÷ Net Income | > 1.10 (cash-rich earnings) | < 0.80 (earnings thin on cash)
Net Income | From income statement | Stable or growing | Rising while CFO is flat
Operating CFO | From cash flow statement | Tracks or exceeds NI | Diverges from NI trend
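To make the two ratio formulas concrete, here is a small worked example. The company and its figures are hypothetical, and `earnings_quality` is an illustrative helper rather than part of the tool below.

```python
def earnings_quality(net_income, cfo, assets_start, assets_end):
    """Return (accruals_ratio, cash_conversion) from the table's formulas."""
    avg_assets = (assets_start + assets_end) / 2
    accruals_ratio = round((net_income - cfo) / avg_assets, 3)
    # Cash conversion is not meaningful when net income is zero or negative
    cash_conversion = round(cfo / net_income, 2) if net_income > 0 else None
    return accruals_ratio, cash_conversion


# Hypothetical company: $8.0B net income, $10.0B operating cash flow,
# total assets of $28B at the start of the year and $32B at the end
ratio, conversion = earnings_quality(8.0e9, 10.0e9, 28e9, 32e9)
print(ratio, conversion)  # prints: -0.067 1.25
```

Both numbers land on the high-quality side of the table: operating cash flow exceeds net income, so accruals are negative and cash conversion is above 1.10.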

Finding a Company’s CIK Number

Every public company has a CIK (Central Index Key), a unique identifier in the SEC system. Look it up at EDGAR Full-Text Search, or use the find_cik helper in the code below, which maps a handful of common tickers to their CIKs.
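For tickers outside a hand-maintained list, the SEC also publishes a ticker-to-CIK file at https://www.sec.gov/files/company_tickers.json. The sketch below assumes that file keeps its current shape (a JSON object whose values carry `cik_str`, `ticker`, and `title`); `build_ticker_map` and `fetch_ticker_map` are illustrative names, not part of the original code.

```python
def build_ticker_map(payload):
    """Turn the company_tickers.json payload into {TICKER: cik_int}."""
    return {row["ticker"].upper(): int(row["cik_str"]) for row in payload.values()}


def fetch_ticker_map(headers):
    """Download the SEC ticker file and build the lookup (needs network access)."""
    import requests  # imported lazily so the parsing helper works offline
    url = "https://www.sec.gov/files/company_tickers.json"
    resp = requests.get(url, headers=headers, timeout=30)
    resp.raise_for_status()
    return build_ticker_map(resp.json())


# Offline demonstration with a two-row sample in the assumed format
sample = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}
ticker_map = build_ticker_map(sample)
print(ticker_map["AAPL"])  # prints: 320193
```

Fetching the file once and caching the resulting dictionary avoids an extra request per ticker.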

The Code

import requests
import pandas as pd

# Your info (required by SEC for User-Agent)
HEADERS = {'User-Agent': 'YourName your@email.com'}

def find_cik(company_name):
    """
    Look up a company's CIK from a small built-in table of ticker symbols.
    Returns None if the ticker is not listed; add more entries as needed.
    """
    # Hard-coded ticker -> CIK table (known CIKs are used for reliability)
    known_ciks = {
        'AAPL': 320193,
        'MSFT': 789019,
        'AMZN': 1018724,
        'GOOGL': 1652044,
        'META': 1326801,
        'NVDA': 1045810,
        'TSLA': 1318605,
        'JPM': 19617,
        'JNJ': 200406,
        'XOM': 34088,
        'WMT': 104169,
        'BAC': 70858,
        'BRK-B': 1067983,
    }
    return known_ciks.get(company_name.upper())

def get_company_facts(cik):
    """Download all XBRL financial facts for a company from SEC EDGAR."""
    cik_padded = str(cik).zfill(10)
    url = f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik_padded}.json"
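    response = requests.get(url, headers=HEADERS, timeout=30)
    if response.status_code != 200:
        # 404 usually means a bad CIK; 403 usually means a missing or rejected User-Agent
        raise ValueError(f"EDGAR returned HTTP {response.status_code} for CIK {cik}. "
                         "Check the CIK and your User-Agent header.")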
    return response.json()

def extract_annual(facts, concept, unit='USD'):
    """
    Extract annual (10-K) values for a specific financial concept.
    Common concepts: NetIncomeLoss, NetCashProvidedByUsedInOperatingActivities, Assets
    """
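    try:
        data = facts['facts']['us-gaap'][concept]['units'][unit]
        df = pd.DataFrame(data)
        df = df[df['form'] == '10-K'].copy()
        df['end'] = pd.to_datetime(df['end'])
        # Flow items (income, cash flow) carry a 'start' date. A 10-K can also tag
        # quarterly figures, so keep only periods of roughly one year.
        if 'start' in df.columns:
            days = (df['end'] - pd.to_datetime(df['start'])).dt.days
            df = df[days.isna() | days.between(350, 380)]
        # Keep one entry per fiscal year end, preferring the most recently filed value
        df = df.sort_values(['end', 'filed']).drop_duplicates('end', keep='last')
        return df[['end', 'val']].rename(columns={'val': concept}).reset_index(drop=True)
    except KeyError:
        # Concept or unit not reported by this company
        return pd.DataFrame()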

def analyze_earnings_quality(ticker_or_cik, label=None):
    """
    Full earnings quality report for a company using SEC EDGAR data.
    Pass either a ticker symbol (for known companies) or a CIK integer.
    """
    if isinstance(ticker_or_cik, str):
        cik = find_cik(ticker_or_cik)
        label = label or ticker_or_cik
        if not cik:
            print(f"CIK not found for {ticker_or_cik}. Add it to known_ciks dict.")
            return None
    else:
        cik = ticker_or_cik
        label = label or f"CIK {cik}"

    print(f"\nFetching EDGAR data for {label} (CIK: {cik})...")
    facts = get_company_facts(cik)

    # Pull three core line items
    ni  = extract_annual(facts, 'NetIncomeLoss')
    cfo = extract_annual(facts, 'NetCashProvidedByUsedInOperatingActivities')
    ta  = extract_annual(facts, 'Assets')

    if ni.empty or cfo.empty or ta.empty:
        print(f"  Incomplete data — one or more line items missing for {label}")
        return None

    # Merge on fiscal year end date
    df = ni.merge(cfo, on='end').merge(ta, on='end')
    df.columns = ['end', 'net_income', 'cfo', 'total_assets']
    df['year'] = df['end'].dt.year

    # Compute earnings quality metrics
    df['avg_assets']     = (df['total_assets'] + df['total_assets'].shift(1)) / 2
    df['accruals']       = df['net_income'] - df['cfo']
    df['accruals_ratio'] = (df['accruals'] / df['avg_assets']).round(3)
    df['cash_conversion']= (df['cfo'] / df['net_income']).round(2)

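    def signal(row):
        # The HIGH QUALITY cutoffs here (-0.03, 1.05) are looser than the
        # -0.05 / 1.10 guideposts in the metrics table; tighten them if you prefer.
        good = row['accruals_ratio'] < -0.03 and row['cash_conversion'] > 1.05
        bad  = row['accruals_ratio'] >  0.05 or  row['cash_conversion'] < 0.80
        return 'HIGH QUALITY' if good else ('INVESTIGATE' if bad else 'AVERAGE')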

    df['signal'] = df.apply(signal, axis=1)

    # Format output
    out = df.dropna(subset=['accruals_ratio']).tail(6).copy()
    out['net_income'] = out['net_income'].apply(lambda x: f"${x/1e9:.1f}B")
    out['cfo']        = out['cfo'].apply(lambda x: f"${x/1e9:.1f}B")

    print(f"\n{'='*65}")
    print(f"  Earnings Quality Report: {label}")
    print(f"{'='*65}")
    print(out[['year','net_income','cfo','accruals_ratio',
               'cash_conversion','signal']].to_string(index=False))
    print(f"{'='*65}")
    return df

# -------------------------------------------------------
# Run analysis on multiple companies
# -------------------------------------------------------
for ticker in ['AAPL', 'MSFT', 'AMZN', 'NVDA', 'META']:
    analyze_earnings_quality(ticker)

What Good vs. Bad Looks Like

Apple typically shows cash conversion ratios above 1.1: it consistently generates more than $1.10 of operating cash for every $1.00 of reported net income. That’s a hallmark of high-quality earnings. Contrast this with companies that report growing net income while operating cash flow stagnates: that divergence is one of the most reliable early warning signs of future earnings disappointment or restatement risk.
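The divergence described above (net income growing while operating cash flow stagnates) can be checked mechanically. This is a minimal sketch: the `gap` threshold of 0.15 is an arbitrary assumption, and `growth` and `ni_cfo_divergence` are illustrative names.

```python
def growth(series):
    """Total fractional change from the first to the last value."""
    first, last = series[0], series[-1]
    if first <= 0:
        return None  # growth from a non-positive base is not meaningful
    return (last - first) / first


def ni_cfo_divergence(net_income, cfo, gap=0.15):
    """True if net income grew and outpaced CFO growth by more than `gap`."""
    ni_growth, cfo_growth = growth(net_income), growth(cfo)
    if ni_growth is None or cfo_growth is None:
        return False
    return ni_growth > 0 and (ni_growth - cfo_growth) > gap


# Hypothetical three-year histories, in $B
print(ni_cfo_divergence([1.0, 1.2, 1.5], [1.0, 1.0, 1.02]))  # prints: True
print(ni_cfo_divergence([1.0, 1.1], [1.0, 1.2]))             # prints: False
```

With the DataFrame returned by analyze_earnings_quality, the inputs would be `df['net_income'].tolist()` and `df['cfo'].tolist()`.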

📈 Key Insight: The most powerful use of this tool isn’t analyzing blue chips; it’s screening mid-caps before earnings season. Run it on your watchlist in the week before earnings. Companies with deteriorating cash conversion going into a print are more likely to guide down, even if the headline EPS number meets expectations.

⚠️ Watch Out: XBRL tagging can vary between companies and fiscal years. Some companies report under a different concept name (e.g., ProfitLoss instead of NetIncomeLoss). If you get an empty result, check facts['facts']['us-gaap'].keys() to see which concepts are available for that company. The SEC EDGAR API is free and highly reliable, but data normalization requires occasional manual review.
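The manual check described above can be automated by trying a short list of alternative concept names in order. The candidate lists below are assumptions based on common us-gaap tags, not an exhaustive mapping, and `first_available_concept` is an illustrative helper.

```python
# Assumed fallbacks; extend these after inspecting a company's available concepts
NET_INCOME_CANDIDATES = ["NetIncomeLoss", "ProfitLoss"]
CFO_CANDIDATES = [
    "NetCashProvidedByUsedInOperatingActivities",
    "NetCashProvidedByUsedInOperatingActivitiesContinuingOperations",
]


def first_available_concept(facts, candidates):
    """Return the first candidate concept present in the company's us-gaap facts."""
    us_gaap = facts.get("facts", {}).get("us-gaap", {})
    for name in candidates:
        if name in us_gaap:
            return name
    return None


# Offline demonstration with a stub shaped like the companyfacts JSON
stub = {"facts": {"us-gaap": {"ProfitLoss": {}, "Assets": {}}}}
print(first_available_concept(stub, NET_INCOME_CANDIDATES))  # prints: ProfitLoss
print(first_available_concept(stub, CFO_CANDIDATES))         # prints: None
```

The returned name can be passed straight to extract_annual. Note that alternative tags are not always exact substitutes (ProfitLoss, for instance, includes income attributable to noncontrolling interests), so treat fallbacks as approximate.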

📊 Portfolio Takeaway

Favor companies with cash conversion consistently above 1.1: they generate more operating cash than reported net income, a hallmark of durable earnings. Before buying into an earnings beat, check the accruals ratio: if it’s above +0.05, the beat may be accounting-driven rather than cash-backed and is more likely to reverse. Re-run this analysis on your top 5–10 holdings whenever new filings land. A sustained decline in cash conversion is one of the more reliable early warnings of future earnings disappointment, and tracking it quarterly (which requires extending the script from 10-K to 10-Q data) can surface the problem 1–2 quarters before the miss.
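A "sustained decline" in cash conversion can be given a simple working definition: every one of the last few period-over-period changes is negative. The sketch below uses that definition; the default of two consecutive declines is an assumption, and `sustained_decline` is an illustrative name.

```python
def sustained_decline(values, periods=2):
    """True if each of the last `periods` changes in `values` is a decrease."""
    if len(values) < periods + 1:
        return False  # not enough history to judge
    recent = values[-(periods + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical cash conversion histories, oldest first
print(sustained_decline([1.30, 1.20, 1.10]))  # prints: True
print(sustained_decline([1.00, 1.20, 1.10]))  # prints: False
```

With the DataFrame returned by analyze_earnings_quality, pass `df['cash_conversion'].tolist()` to run the same check on real filings.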

What’s Next

Part 5 closes the series by zooming out to the macro picture — building a regime detector using FRED’s free API that tells you whether the economy is in expansion, late-cycle, slowdown, or recession risk, and what that implies for your asset allocation.

Series: The Python Investor’s Toolkit
Part 1: Momentum Scanner
Part 2: Monte Carlo Portfolio Stress-Test
Part 3: Mean Reversion Alert System
Part 4: Earnings Quality Analyzer — SEC EDGAR (this post)
Part 5: Macro Regime Detector (FRED API)
