Build-in-public dashboard. Real architecture decisions, real progress, real trade-offs. No post-hoc polish — just what we're actually building.
Where each system stands right now. Numbers are rough estimates, not KPIs.
Last updated: April 2026
The choices that define how Cerebro is built. Context, decision, trade-offs — documented as they happened.
We need to store a code knowledge graph: functions, classes, modules, and their relationships (calls, imports, inherits, implements). The query pattern is traversal-heavy — "find all callers of this function up to depth 3" — not aggregation-heavy.
Use Neo4j (Community Edition, self-hosted) as the primary store for the code knowledge graph. Each codebase gets a logical namespace via a repo_id property on every node. Cypher is the query language.
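As a rough illustration of the traversal pattern described above, the sketch below builds a Cypher query for "all callers up to depth 3", scoped by `repo_id`. The `Function` label, the `CALLS` relationship type, the `name` property, and the `callers_query` helper are assumptions made for this example, not the actual Cerebro schema.

```python
from typing import Any


def callers_query(
    repo_id: str, function_name: str, max_depth: int = 3
) -> tuple[str, dict[str, Any]]:
    """Build a Cypher query that finds callers of a function up to max_depth hops.

    Hypothetical schema: (:Function {repo_id, name}) nodes and [:CALLS] edges.
    """
    # Variable-length bounds are normally written as literals in Cypher,
    # so the depth is validated as a plain int before being inlined.
    if type(max_depth) is not int or not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be an integer between 1 and 10")

    query = (
        "MATCH (target:Function {repo_id: $repo_id, name: $name})"
        f"<-[:CALLS*1..{max_depth}]-(caller:Function) "
        "WHERE caller.repo_id = $repo_id "  # keep the traversal inside one repo
        "RETURN DISTINCT caller.name AS caller"
    )
    # Everything user-controlled goes through parameters, never string formatting.
    return query, {"repo_id": repo_id, "name": function_name}


query, params = callers_query("repo-1", "parse_file")
```

The returned pair could then be handed to whatever Neo4j driver session the backend uses; only the depth is interpolated, and only after validation.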
Cerebro needs to call LLM APIs (OpenAI, Anthropic, etc.) to generate code and answer questions. The question is whether we act as a proxy (calling the API with our own keys and billing users for token usage) or require users to supply their own API keys (BYOK).
Bring Your Own Key (BYOK). Users configure their own API keys in the VS Code plugin settings. Cerebro never stores keys server-side — they are used exclusively within the user's local plugin context and never transmitted to our backend.
Cerebro's core intelligence is a multi-step pipeline: parse query → retrieve from Neo4j → retrieve from Qdrant → merge context → call LLM → stream response. This pipeline needs to be deterministic, debuggable, and interruptible (for streaming). Simple sequential code gets messy when steps have conditional branches, retries, and partial state.
Use LangGraph (from LangChain) to model the pipeline as an explicit state machine. Each step is a node; edges define transitions. State is typed (TypedDict). The graph is compiled once at startup and reused for every request. Streaming is handled via LangGraph's built-in async generator support.
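To make the "nodes plus edges over typed state" idea concrete, here is a library-free sketch of the same shape. It deliberately does not use LangGraph's API; the node names, the `PipelineState` fields, and the `run` loop are illustrative assumptions, and the retrieval steps are stubs rather than real Neo4j or Qdrant calls.

```python
from typing import Callable, Optional, TypedDict


class PipelineState(TypedDict, total=False):
    query: str
    graph_hits: list[str]
    vector_hits: list[str]
    context: str
    answer: str


# Each node reads the state and returns only the keys it updates.
def parse(state: PipelineState) -> PipelineState:
    return {"query": state["query"].strip()}


def retrieve_graph(state: PipelineState) -> PipelineState:
    return {"graph_hits": [f"graph:{state['query']}"]}  # stub for Neo4j


def retrieve_vectors(state: PipelineState) -> PipelineState:
    return {"vector_hits": [f"vector:{state['query']}"]}  # stub for Qdrant


def merge(state: PipelineState) -> PipelineState:
    return {"context": " | ".join(state["graph_hits"] + state["vector_hits"])}


def respond(state: PipelineState) -> PipelineState:
    return {"answer": f"answer using [{state['context']}]"}  # stub for the LLM


NODES: dict[str, Callable[[PipelineState], PipelineState]] = {
    "parse": parse,
    "retrieve_graph": retrieve_graph,
    "retrieve_vectors": retrieve_vectors,
    "merge": merge,
    "respond": respond,
}

# Edges are explicit: node name -> next node name (None ends the run).
EDGES: dict[str, Optional[str]] = {
    "parse": "retrieve_graph",
    "retrieve_graph": "retrieve_vectors",
    "retrieve_vectors": "merge",
    "merge": "respond",
    "respond": None,
}


def run(state: PipelineState, start: str = "parse") -> PipelineState:
    current: Optional[str] = start
    while current is not None:
        state = {**state, **NODES[current](state)}  # merge the node's update
        current = EDGES[current]
    return state


result = run({"query": "  who calls foo  "})
```

Conditional branches and retries are left out here; in this shape they would amount to replacing a fixed entry in `EDGES` with a function of the state.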
Cerebro is a multi-tenant product — each user's codebase must be isolated from other users'. The question is whether to use physical isolation (separate database instances per user) or logical isolation (a shared instance with namespace/tenant-ID filtering on all queries).
Logical namespace isolation. Every Neo4j node and Qdrant vector has a user_id property. Every query is scoped by this property. In Qdrant, this is a filter condition on every collection search. In Neo4j, the repo_id property gates every Cypher traversal. A single shared Neo4j instance and a single Qdrant collection serve all users in the current phase.
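One way to make "every query is scoped" hard to forget is to build the tenant condition in a single place. The sketch below produces a filter in the JSON shape used by Qdrant's search API (`must` with `key`/`match` conditions); the `scoped_qdrant_filter` helper and the `language` payload field are hypothetical names used only for illustration.

```python
from typing import Any, Optional


def scoped_qdrant_filter(
    user_id: str, extra_must: Optional[list[dict[str, Any]]] = None
) -> dict[str, Any]:
    """Return a Qdrant-style filter that always includes the tenant condition."""
    if not user_id:
        # Refuse to build an unscoped filter rather than search across tenants.
        raise ValueError("user_id is required for every search")

    tenant_condition = {"key": "user_id", "match": {"value": user_id}}
    # The tenant condition comes first; caller-supplied conditions are appended.
    return {"must": [tenant_condition, *(extra_must or [])]}


search_filter = scoped_qdrant_filter(
    "user-42", [{"key": "language", "match": {"value": "python"}}]
)
```

Callers can add conditions but cannot drop the `user_id` one, which is the property logical isolation depends on.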
The backend spans multiple distinct concerns: parsing source code, managing the knowledge graph, serving the AI pipeline, and handling user accounts. Without explicit architecture boundaries, these tend to collapse into a monolithic service where Neo4j queries appear inside auth handlers and vice versa.
Apply Clean Architecture (ports and adapters) with Domain-Driven Design bounded contexts. The four contexts are: Ingestion (parse + ingest code into the graph), Intelligence (the LangGraph query pipeline), Identity (auth and user management), and Billing (plans and subscriptions). Each context has its own domain model and communicates with others through defined interfaces — never direct imports across boundaries.
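A minimal sketch of what a port and adapter might look like inside the Ingestion context, assuming Python `Protocol` classes are used for ports. `CodeEntity`, `CodeGraphPort`, `InMemoryCodeGraph`, and `IngestionService` are hypothetical names, not the project's actual classes.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CodeEntity:
    repo_id: str
    name: str
    kind: str


class CodeGraphPort(Protocol):
    """Port: what the domain needs from a graph store, with no Neo4j types."""

    def save(self, entity: CodeEntity) -> None: ...

    def find_by_name(self, repo_id: str, name: str) -> list[CodeEntity]: ...


class InMemoryCodeGraph:
    """Adapter used for tests; a Neo4j adapter would expose the same methods."""

    def __init__(self) -> None:
        self._entities: list[CodeEntity] = []

    def save(self, entity: CodeEntity) -> None:
        self._entities.append(entity)

    def find_by_name(self, repo_id: str, name: str) -> list[CodeEntity]:
        return [
            e for e in self._entities if e.repo_id == repo_id and e.name == name
        ]


class IngestionService:
    """Application service: depends on the port, never on a concrete adapter."""

    def __init__(self, graph: CodeGraphPort) -> None:
        self._graph = graph

    def ingest(self, repo_id: str, function_names: list[str]) -> int:
        for name in function_names:
            self._graph.save(CodeEntity(repo_id, name, "function"))
        return len(function_names)


graph = InMemoryCodeGraph()
count = IngestionService(graph).ingest("repo-1", ["parse_file", "load_config"])
```

Because `IngestionService` only sees `CodeGraphPort`, swapping the in-memory adapter for a database-backed one would not touch the domain code.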
Every tool chosen for a reason. Every trade-off documented in the ADRs above.
Primary store for the code knowledge graph. Functions, classes, modules and their relationships (calls, imports, inherits) as nodes and edges. Cypher queries for graph traversal.
Semantic code search via embedding vectors. Each function and class is embedded and stored. Query time: retrieve the top-K most semantically similar code entities before graph traversal.
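For intuition, top-K retrieval can be sketched as a brute-force cosine-similarity ranking. In practice a vector store handles this with an index; the two-dimensional vectors and entity names below are toy values invented for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], entities: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k entities most similar to the query vector."""
    ranked = sorted(
        entities, key=lambda name: cosine(query, entities[name]), reverse=True
    )
    return ranked[:k]


# Toy embeddings: real ones would have hundreds of dimensions.
embeddings = {
    "parse_file": [1.0, 0.0],
    "load_config": [0.0, 1.0],
    "parse_dir": [0.9, 0.1],
}
nearest = top_k([1.0, 0.0], embeddings, k=2)
```

The names returned here would be the seed entities for the graph traversal step that follows.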
Models the query pipeline as an explicit state machine: parse → retrieve → merge context → stream. Typed state, conditional edges, async streaming, and checkpointing for long ingestion jobs.
Async Python API gateway with automatic OpenAPI docs, Pydantic validation, and Server-Sent Events (SSE) for streaming responses to the VS Code plugin. Multi-domain routing per bounded context.
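The SSE wire format itself is simple: each event is one or more `field: value` lines followed by a blank line. A minimal framing sketch follows, assuming tokens are sent as small JSON payloads and that a final `done` event marks the end of the stream; both conventions are assumptions for illustration, not the plugin's actual protocol.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Frame each token as a Server-Sent Event, then emit a closing event."""
    for token in tokens:
        # JSON-encoding keeps raw newlines out of the data field,
        # which would otherwise split the event.
        yield f"data: {json.dumps({'token': token})}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop.
    yield "event: done\ndata: {}\n\n"


frames = list(sse_events(["Hi", "\n"]))
```

In a FastAPI route, a generator like this could be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.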
Stores unstructured metadata about code entities — raw file contents, AST snapshots, ingestion job history, and per-user repository configurations. Flexible schema as the data model evolves.
Query result caching to reduce Neo4j and Qdrant load on repeated lookups. Also used as the task queue for background ingestion jobs (via Redis Streams), avoiding a separate message broker.
Managed PostgreSQL + Auth platform. Handles GitHub OAuth, user sessions, and subscription state. Also powers the engineering page comment system with Row Level Security.
LibCST for Python (preserves whitespace and comments for lossless round-trips), Babel for JS/TS (handles JSX and decorators). Output is a normalized entity model ingested into Neo4j and Qdrant.
Unfiltered build notes. What shipped, what broke, what changed.
user_id header validated against the session token and injected into all downstream Neo4j and Qdrant queries.
We're open to collaboration with developers, researchers, and teams who share the same problems. Whether it's contributing to the project, exploring integration opportunities, or future hiring, we want to hear from you.
Language parser plugins, embedding experiments, graph query optimization. If you've worked on similar problems, let's talk.
We're not hiring yet, but when we do, this is where we'll post it first. Backend, infra, DevEx — people who build tools for other developers.
Questions, ideas, corrections — logged-in GitHub users can comment.