**TL;DR:** Built an open-source tool that reverse-engineers GitHub repos and runs LLM analysis on Groq. Vercel serverless kept timing out (504s). Solved by splitting the work into two endpoints, repairing streamed JSON on the frontend, and bypassing the AI SDK wrappers to get raw telemetry. File metrics and maps now load in ~2s, and the LLM analysis streams in after that.
---
Hey r/nextjs,
I built [CodeAutopsy](https://github.com/Sidhant0707/codeautopsy) – a diagnostic engine that fetches any GitHub repo, builds a dependency graph (fan‑in, adjacency lists), and sends the context to Groq (Llama-3.3-70b) to generate architectural blueprints and blast‑radius maps.
**Live demo:** https://codeautopsy-lyart.vercel.app/
**Stack:** Next.js 16, Supabase, Groq, Upstash Redis, React Flow, D3.js
## The problem (Vercel serverless limits)
The naive approach – one endpoint that does AST extraction, filtering (node_modules, binaries), graph building, and LLM inference – consistently blew past Vercel's execution limits. Result: **504 Gateway Timeouts**.
I didn't want to fall back to a long‑running Node server. Wanted to keep it edge/serverless.
## How I solved it
### 1. Dual‑endpoint split
- **`/api/analyze`** (extractor):
Fetches Git tree, runs deterministic static analysis (regex + adjacency lists), calculates fan‑in scores, caches the graph in Supabase. Returns instantly (~200‑500ms).
- **`/api/ai`** (inference engine):
Triggered immediately by the client. Takes the parsed context, hits Groq API, and pipes a native Web `ReadableStream` directly back to the frontend.
No more monolithic timeouts – the heavy lift is now a streaming response.
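
For anyone unfamiliar with fan‑in scoring, here is a minimal sketch of the idea, not the actual repo code. It assumes the adjacency list maps each file to the files it imports; `AdjacencyList`, `fanInScores`, and `rankByFanIn` are illustrative names only.

```typescript
// Assumption: each key is a file path, each value is the list of files it imports.
type AdjacencyList = Record<string, string[]>;

// Fan-in of a file = number of distinct files that import it.
function fanInScores(graph: AdjacencyList): Map<string, number> {
  const scores = new Map<string, number>();
  for (const file of Object.keys(graph)) {
    // Make sure files that nobody imports still show up with a score of 0.
    if (!scores.has(file)) scores.set(file, 0);
    // De-duplicate so importing the same module twice counts once.
    const seen = new Set<string>();
    for (const dep of graph[file] ?? []) {
      if (seen.has(dep)) continue;
      seen.add(dep);
      scores.set(dep, (scores.get(dep) ?? 0) + 1);
    }
  }
  return scores;
}

// Return the `limit` most-imported files, ties broken alphabetically.
function rankByFanIn(graph: AdjacencyList, limit: number): string[] {
  const scores = fanInScores(graph);
  return Array.from(scores.keys())
    .sort((a, b) => (scores.get(b) ?? 0) - (scores.get(a) ?? 0) || a.localeCompare(b))
    .slice(0, limit);
}
```

The ranked list is one plausible way to decide which files go into the LLM context first when the token budget is tight.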
### 2. Streaming JSON repair loop
The LLM returns JSON matching a strict schema (used to render Mermaid graphs + UI components). But a half-streamed JSON document is invalid: `JSON.parse` throws on it, and schema validators like `Zod` reject whatever partial object you manage to build.
Built a native stream consumer on the frontend that:
- Catches raw text chunks
- Dynamically repairs incomplete JSON on the fly (closing open strings, appending missing brackets/braces based on count differentials)
Result: The UI "types out" the architectural report and renders interactive diagrams **while the AI is still generating**.
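
A simplified sketch of the repair step, under a few assumptions: it tracks open brackets on a stack rather than with raw count differentials (so nested `[`/`{` close in the right order), and it gives up with `null` on states it cannot fix, such as a key with no value yet. `repairPartialJson` and `tryParsePartial` are hypothetical names, not the functions in the repo.

```typescript
// Best-effort repair of a truncated JSON string: close an open string,
// then close any open arrays/objects in reverse order of opening.
function repairPartialJson(partial: string): string {
  const closers: string[] = []; // stack of closing brackets we still owe
  let inString = false;
  let escaped = false;

  for (let i = 0; i < partial.length; i++) {
    const ch = partial.charAt(i);
    if (inString) {
      if (escaped) escaped = false;          // character after a backslash
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false; // string closed normally
      continue;                              // ignore brackets inside strings
    }
    if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  let repaired = partial;
  if (inString) {
    // Drop a dangling backslash so it does not escape the quote we add.
    if (escaped) repaired = repaired.slice(0, -1);
    repaired += '"';
  } else {
    // A trailing comma (or whitespace) before a closer is invalid JSON.
    repaired = repaired.replace(/[\s,]+$/, "");
  }
  return repaired + closers.slice().reverse().join("");
}

// Returns the parsed value, or null if the buffer is not repairable yet.
function tryParsePartial<T>(partial: string): T | null {
  try {
    return JSON.parse(repairPartialJson(partial)) as T;
  } catch {
    return null;
  }
}
```

On the client, the idea would be to append each decoded chunk to a buffer, call `tryParsePartial(buffer)`, and only update React state when the result is non-null, so the UI keeps the last good snapshot while the stream is in an unrepairable state.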
### 3. Native telemetry
Vercel AI SDK wrappers swallowed telemetry data before the stream closed. Dropped down to raw `TextEncoder`/`TextDecoder` streams, which let me inject [0xtrace](https://github.com/Sidhant0707/0xtrace) (my custom observability lib) directly into the stream pipeline – accurate inference latency + token usage tracking.
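
To illustrate what tapping a raw stream can look like, here is a rough sketch using the standard Web `TransformStream`. It does not use 0xtrace's real API: `StreamStats`, `StreamMeter`, and `withTelemetry` are made-up names, and it only measures latency and byte counts (token usage would have to come from the provider's response, which is left out here).

```typescript
// Hypothetical stats shape; not the 0xtrace data model.
interface StreamStats {
  firstChunkMs: number | null; // time to first chunk, null if the stream was empty
  totalMs: number;             // time until the stream closed
  bytes: number;
  chunks: number;
}

class StreamMeter {
  private readonly now: () => number;
  private readonly startedAt: number;
  private firstChunkAt: number | null = null;
  private bytes = 0;
  private chunks = 0;

  // The clock is injectable so the meter can be tested without real time.
  constructor(now: () => number = Date.now) {
    this.now = now;
    this.startedAt = now();
  }

  record(chunk: Uint8Array): void {
    if (this.firstChunkAt === null) this.firstChunkAt = this.now();
    this.bytes += chunk.byteLength;
    this.chunks += 1;
  }

  finish(): StreamStats {
    return {
      firstChunkMs: this.firstChunkAt === null ? null : this.firstChunkAt - this.startedAt,
      totalMs: this.now() - this.startedAt,
      bytes: this.bytes,
      chunks: this.chunks,
    };
  }
}

// Pass chunks through untouched; report stats once the source stream closes.
function withTelemetry(
  source: ReadableStream<Uint8Array>,
  onDone: (stats: StreamStats) => void,
): ReadableStream<Uint8Array> {
  const meter = new StreamMeter();
  return source.pipeThrough(
    new TransformStream<Uint8Array, Uint8Array>({
      transform(chunk, controller) {
        meter.record(chunk);
        controller.enqueue(chunk);
      },
      flush() {
        onDone(meter.finish());
      },
    }),
  );
}
```

In a route handler, the wrapped stream could be returned as `new Response(withTelemetry(upstreamBody, report))`, where `upstreamBody` and `report` are placeholders for the provider's response body and whatever sink the stats go to.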
## What works well
- File metrics + visual maps load in ~2 seconds
- LLM streams deep‑dive analysis without ever hitting a server timeout
- Entirely serverless, edge‑deployed
## What I'd love your feedback on
- **Fan‑in ranking algorithm** – how would you optimise the prioritisation of context sent to the LLM?
- **Architectural blind spots** – any obvious flaws in the dual‑endpoint + stream repair approach?
- **JSON repair** – is there a more robust way to handle incomplete streaming JSON without blocking the UI?
- **Performance cliffs** – this works well for small/mid repos (<1000 files). What would break first at scale?
The entire project is open source. Tear it apart – that's why I'm posting.
**License:** MIT
Thanks for reading.