Python Code Health

32 open-source Python repositories analyzed by activity-weighted risk (complexity × recent commit frequency), sorted highest risk first.
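As a rough illustration of "complexity × recent commit frequency", the sketch below ranks functions by a raw product of the two. The exact scoring formula behind this page is not stated here, and the published scores (for example 22.1 for a function with cyclomatic complexity 125) suggest some scaling is applied, so the `FunctionStats` type, the 30-day window, and the unscaled product are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FunctionStats:
    """Hypothetical per-function record; field names are illustrative."""
    name: str
    cyclomatic_complexity: int
    commits_last_30_days: int


def activity_risk(fn: FunctionStats) -> int:
    # Assumed formula: unscaled product of complexity and recent commit count.
    return fn.cyclomatic_complexity * fn.commits_last_30_days


def rank_by_risk(functions: list[FunctionStats]) -> list[FunctionStats]:
    # Highest risk first, matching the ordering used on this page.
    return sorted(functions, key=activity_risk, reverse=True)


if __name__ == "__main__":
    sample = [
        FunctionStats("stable_but_complex", 40, 0),
        FunctionStats("simple_but_busy", 5, 3),
        FunctionStats("complex_and_busy", 20, 2),
    ]
    for fn in rank_by_risk(sample):
        print(fn.name, activity_risk(fn))
```

Under this assumed formula a complex function with no recent commits scores zero, which is the point of weighting by activity: it pushes untouched code below code that is both complex and changing.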

Repositories 32

Avg Top Risk 19.6

Top Patterns complex_branching deeply_nested exit_heavy

safishamsi/graphify risk 22.5

graphify's extract layer carries the highest activity risk — 2 functions to address first

graphify's extraction layer is carrying extreme structural complexity while being actively changed — a combination that makes regressions likely with every commit touching extract.py.

complex_branching deeply_nested

Read →

pandas-dev/pandas risk 22.1

pandas-dev/pandas: parser, JSON encoder, and DataFrame construction carry the highest risk

Five functions in pandas are both structurally extreme and actively changing right now, putting the parser, JSON serialization, and DataFrame construction layers at live regression risk as of commit 395506f. The top-ranked function, `tokenize_bytes`, has an activity-weighted risk score of 22.1 with a cyclomatic complexity of 125 — the single sharpest edge for any engineer shipping CSV or text-parsing changes this week.

complex_branching deeply_nested

Read →

sqlmapproject/sqlmap risk 21.6

sqlmap's core engine carries the highest activity risk — 5 functions to address first

Every one of sqlmap's top five structural hotspots is actively changing right now, not sitting in a backlog. The injection-detection function alone carries a cyclomatic complexity of 484 and was touched twice in the last 30 days — that combination makes every commit to it a live regression risk.

complex_branching deeply_nested

Read →

agno-agi/agno risk 21.4

agno-agi/agno's async run layer — 5 functions to address first

Every one of agno's five highest-risk functions sits in the async streaming run layer — agent, team, and workflow — and each was committed to within the last day. That combination of extreme structural complexity and live development activity makes this a prioritization decision for engineers shipping code this week, not a backlog item.

complex_branching deeply_nested

Read →

apache/airflow risk 20.7

apache/airflow's dev tooling carries the highest activity risk — 5 functions to flag

Five functions in apache/airflow's developer tooling layer are structurally extreme and actively changing at the same time, the combination that makes a bug most likely to ship. The top-ranked function, `run_command`, carries an activity-weighted risk score of 20.73 and was modified today; it sits at the intersection of 38 independent execution paths and 29 distinct callees.

complex_branching exit_heavy

Read →

getsentry/sentry risk 20.7

getsentry/sentry's backup, metrics, and query layers carry the highest activity risk — 5 functions to address first

Five functions across sentry's backup, metrics, and query infrastructure are simultaneously structurally complex and actively changing right now — any of them could introduce a regression in the current development cycle. The most urgent is `import_by_model`, a god function with 48 independent execution paths and 45 distinct callees that was modified today.

exit_heavy god_function

Read →

deepspeedai/DeepSpeed risk 20.7

DeepSpeed's accelerator and MoE layers carry the highest risk — 5 functions to fix first

Five functions in DeepSpeed are simultaneously the most structurally complex and the most actively changing in the codebase right now — meaning any engineer merging into these paths this week is doing so under elevated regression risk. The highest-scoring function, `get_accelerator` in the accelerator dispatch layer, carries an activity-weighted risk score of 20.67 with a cyclomatic complexity of 112, and it was last changed 15 days ago.

complex_branching god_function

Read →

vllm-project/vllm risk 20.7

vllm's streaming layer carries the highest activity risk — 5 functions to address first

Five functions in vllm's streaming layer and v1 scheduler all carry activity-weighted risk scores above 19 — and every one of them was touched in the last three days. That combination of structural extremity and live commit activity is exactly the condition where a well-intentioned fix is most likely to introduce a regression.

complex_branching deeply_nested

Read →

crewAIInc/crewAI risk 20.6

crewAI's LLM streaming layer carries the highest activity risk — 5 functions to flag

Every one of crewAI's five highest-risk functions is both structurally extreme and actively changing right now — the streaming LLM handlers alone carry cyclomatic complexity scores that dwarf typical refactoring thresholds, and all five were touched within the last eight days. For any engineer shipping against this codebase this week, that combination is a live regression risk, not a backlog item.

complex_branching deeply_nested

Read →

zhayujie/CowAgent risk 20.3

CowAgent's LLM protocol layer carries the highest activity risk — 2 functions to address first

Two functions in CowAgent are both structurally extreme and changing right now — CC 120+ with nesting 7–11 deep, touched repeatedly in the last 30 days. That combination makes regressions a live risk rather than a cleanup backlog item.

complex_branching deeply_nested

Read →

gradio-app/gradio risk 20.0

gradio-app/gradio's event pipeline carries the highest activity risk — 5 functions to address first

At commit 601769e, five functions spread across gradio's TypeScript event pipeline, Python app router, and Python client all land in the 'fire' quadrant simultaneously — structurally complex and receiving commits this month. For any engineer shipping gradio code this week, that combination makes these functions live regression risks, not cleanup items for a future sprint.

complex_branching deeply_nested

Read →

mitmproxy/mitmproxy risk 19.9

mitmproxy's web layer carries the highest activity risk — 5 functions to address first

mitmproxy's web UI Redux layer has one function that is both structurally extreme and actively changing right now, while four others in the same critical band have been sitting untouched for over a month — structural debt with a high blast radius the moment anyone picks them up. The split between live-fire risk and accumulated complexity tells a clear story about where the next regression is most likely to emerge.

god_function long_function

Read →

streamlit/streamlit risk 19.8

streamlit's server and script-runner layer carries the highest activity risk — 5 functions to address first

Five functions in streamlit's core layers are simultaneously the most structurally complex and the most actively touched — meaning every commit landing on them this week is a regression bet made against code that already has up to 136 independent execution paths. If your team is shipping to streamlit's server or script-runner subsystems right now, these are the numbers that should be on your radar.

complex_branching exit_heavy

Read →

ray-project/ray risk 19.8

ray-project/ray's RLlib and DAG layers carry the highest risk — 5 functions to fix first

Five functions across ray-project/ray's RLlib, compiled DAG, Ray Data, and autoscaler layers all sit in the 'fire' quadrant right now — structurally complex and touched within the last three days. For any engineer shipping against this codebase this week, that combination means every commit to these files is a change landing on already-overloaded control flow.

complex_branching exit_heavy

Read →

run-llama/llama_index risk 19.4

llama_index's integration layer carries the highest activity risk — 5 functions to address first

Across 11,078 functions in run-llama/llama_index, five integration-layer functions are simultaneously the most structurally complex and the most recently changed — any engineer shipping to those connectors this week is working inside live regression territory. The pattern is consistent enough to suggest a systemic design convention in the integrations layer that's worth examining before the next round of changes lands.

complex_branching deeply_nested

Read →

Aider-AI/aider risk 19.2

Aider's core coder and I/O layers carry the highest structural debt — 5 functions to address first

Every one of aider's top five riskiest functions sits in the debt quadrant — structurally complex, untouched for 66 days, and waiting for the next developer to open them. The most striking case is `send_message` in `base_coder.py`, a single function with 99 independent execution paths that calls 47 distinct callees — in Python, where type resolution happens at runtime, that fan-out is broader than it looks on paper.

long_function exit_heavy

Read →
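The 'fire' and 'debt' quadrants mentioned in these summaries can be read as a split on two axes: whether a function is structurally complex, and whether it has been committed to recently. The sketch below is one possible way to express that split. The complexity threshold of 15, the 30-day recency window, and the `"other"` label for non-complex functions are assumptions for illustration, not values taken from the reports.

```python
# Hypothetical thresholds; the reports do not state the actual cut-offs.
COMPLEXITY_THRESHOLD = 15
RECENT_DAYS = 30


def classify_quadrant(cyclomatic_complexity: int, days_since_last_commit: int) -> str:
    """Return 'fire', 'debt', or 'other' for a single function."""
    if cyclomatic_complexity < COMPLEXITY_THRESHOLD:
        # Not structurally complex, so outside both named quadrants.
        return "other"
    if days_since_last_commit <= RECENT_DAYS:
        # Complex and recently changed: live regression risk.
        return "fire"
    # Complex but untouched: accumulated structural debt.
    return "debt"


if __name__ == "__main__":
    # Figures echo examples from the summaries: CC 70 touched 2 days ago,
    # and CC 99 untouched for 66 days.
    print(classify_quadrant(70, 2))
    print(classify_quadrant(99, 66))
```

Splitting on recency rather than total commit count keeps long-dormant but complex functions visible as debt instead of dropping them from the ranking.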

Textualize/textual risk 18.8

Textual's parser and layout engine carry the highest activity risk

Of the 222 critical functions I found in Textualize/textual, one stands out as a live regression risk right now: the `parse` function in `_xterm_parser.py` carries an activity-weighted risk score of 18.79, has been touched 3 times in the last 30 days, and was last modified just 2 days ago — all while carrying a cyclomatic complexity of 70. Four more critical functions in markup parsing, grid layout, and message dispatch haven't been touched in 61 days, but their structural mass means the next engineer to open those files will be walking into a high blast-radius situation.

complex_branching deeply_nested

Read →

roboflow/supervision risk 18.7

supervision's metrics layer carries the highest risk — 5 functions to fix first

Five functions in roboflow/supervision are both structurally extreme and actively changing right now — the metrics layer alone has three `_compute` implementations each touched within the last two days, sitting at cyclomatic complexity scores that demand dozens of test cases just to cover the existing paths. If you're shipping against supervision this week, these are the functions most likely to introduce regressions under you.

long_function complex_branching

Read →

paperless-ngx/paperless-ngx risk 18.3

paperless-ngx's signal handlers — 5 functions to address first

Every one of the top five riskiest functions in paperless-ngx is in the 'fire' quadrant right now: structurally complex and touched three times in the last 30 days. If you are shipping changes to document consumption, file renaming, or SVG validation this week, these are the functions most likely to produce a regression before the next release.

complex_branching deeply_nested

Read →

babysor/MockingBird risk 17.3

MockingBird's training loops carry the most structural debt — 5 functions to fix first

Every one of MockingBird's five highest-risk functions is structural debt, not active churn — they are complex, broadly coupled, and haven't been touched in over three years. The more interesting question is what happens to the two massive training loops the next time someone needs to extend them.

god_function complex_branching

Read →

google/langextract risk 16.9

langextract's batch provider and extraction carry the highest activity risk — 5 hotspots

langextract's core extraction and OpenAI batch provider are both structurally overloaded and actively changing right now — a combination that puts live regression risk on the table, not just future cleanup. With 72 critical-band functions across 348 total, the codebase has meaningful structural debt concentrated in exactly the modules users interact with most.

exit_heavy god_function

Read →

wshobson/agents risk 16.7

wshobson/agents' tooling layer carries the highest activity risk — 5 functions to address first

Every one of the five highest-risk functions in wshobson/agents sits in the tools/ layer, and all five are in the 'fire' quadrant — meaning they are both structurally complex and receiving commits right now. If you are shipping changes to the agent tooling pipeline this week, these are the functions where a regression is most likely to hide.

exit_heavy complex_branching

Read →

jingyaogong/minimind risk 14.0

minimind's RL trainers carry the highest activity risk — 3 functions to address first

minimind's RL training layer is both its most complex and most actively changing code — two trainer functions at CC 52 with commits landing days ago, making refactoring a live risk.

complex_branching god_function

Read →

virattt/ai-hedge-fund

ai-hedge-fund's frontend layer carries the highest activity risk — 3 files to address first

All five of ai-hedge-fund's top hotspots sit in the frontend layer — and every one is a god function flagged as structural debt with a blast radius waiting to be triggered.

exit_heavy god_function

Read →

exo-explore/exo

exo's benchmarking layer carries the highest activity risk — 5 functions to address first

exo has 191 critical functions out of 1,549 — and its highest-risk hotspot is a god-function with cyclomatic complexity of 93 that's actively changing right now, making it a live regression risk.

complex_branching exit_heavy

Read →

NanmiCoder/MediaCrawler

MediaCrawler's JS bundle dominates the top 5 hotspots — exclude to surface Python risk

Every top hotspot in MediaCrawler lives inside a single minified JS bundle — meaning the real Python risk profile is entirely hidden until that file is excluded from analysis.

exit_heavy complex_branching

Read →

hesreallyhim/awesome-claude-code

awesome-claude-code's scripts carry the highest activity risk — 5 to address first

The scripts powering awesome-claude-code's automation are its riskiest code right now: process_resources has a cyclomatic complexity of 59 and is actively changing — a live regression risk hiding in p

complex_branching god_function

Read →

2noise/ChatTTS

ChatTTS's scheduler carries the highest activity risk — 5 functions to address first

ChatTTS has 40 critical-band functions across 440 total, with its inference and scheduling layer accumulating structural debt that will bite hard the next time anyone opens those files.

god_function long_function

Read →

oobabooga/textgen

textgen's chat layer carries the highest activity risk — 3 functions to address first

textgen's chat prompt builder has a cyclomatic complexity of 152 and is actively changing right now — a live regression risk hiding in plain sight inside a single Python function.

complex_branching deeply_nested

Read →

microsoft/qlib

qlib's init and model layer carry the highest structural risk — 5 to refactor first

qlib's highest-risk functions aren't being actively changed right now — they're structural debt accumulating blast radius. Five god functions with CC scores up to 58 are overdue for refactoring before

god_function complex_branching

Read →

D4Vinci/Scrapling

Scrapling's engine layer — 5 functions with the highest activity-weighted risk

Two files in Scrapling's engine layer account for the codebase's highest structural risk — and the top-ranked function has been touched 6 times in the last 30 days with a cyclomatic complexity of 41.

exit_heavy complex_branching

Read →

HKUDS/nanobot

nanobot's message and CLI layers carry the highest activity risk — 2 functions to address first

nanobot's top two hotspots are both complex AND changing right now — one was touched today. That combination makes them live regression risks, not backlog items.

complex_branching deeply_nested

Read →
