About
Hi, I'm Jet Xu.
Turning private work files and code into reliable, evidence-backed context for AI.
Systems Architect | 15+ years building mission-critical infrastructure | AI Harnessing & Context Engineering
Reasoning is improving fast. Reliable context is still the bottleneck.
- Now: building DocMason, a local-first, evidence-first knowledge base for AI-assisted deep research over private work files.
- Before: built code-intelligence systems across llama-github, LlamaPReview, and repo-graph-rag.
- Direction: the Mason ecosystem—moving from deep document analysis to generating native, consulting-grade presentations.
Why This Path
Across 15+ years of architecting mission-critical systems, I have seen the same failure mode recur: in high-stakes environments, being “almost right” is useless. The bottleneck to reliable output—whether from humans or AI—is rarely raw reasoning capacity. It is context precision. That constraint drives everything I build.
Current Focus
DocMason is my current open-source focus: a local-first, provenance-first knowledge base for AI-assisted deep research over private work files. It is not a document chatbot. It compiles unstructured artifacts into knowledge infrastructure that agents can actually use. Its native operating pattern is simple: the repo is the app, and Codex is the runtime.
Core architectural priorities:
- Deterministic ingestion: parsing PDFs, decks, spreadsheets, emails, and repository-native text without silent failures.
- Reliable outputs: provenance-first retrieval instead of vague, hallucination-prone document chat.
- Actionable output: extending the Mason ecosystem beyond extraction. The next step is a deterministic pipeline that turns deep document analysis directly into native, consulting-grade presentations (PPTX) for serious white-collar work.
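To make the provenance-first priority concrete, here is a minimal sketch of the idea: every retrieved passage carries the evidence needed to verify it, and nothing unverifiable is returned. This is an invented illustration, not DocMason's actual API—the names `Evidence`, `Passage`, and `retrieve` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical sketch of provenance-first retrieval.
# Invariant: no passage leaves the retriever without evidence
# pointing back to its source location.

@dataclass(frozen=True)
class Evidence:
    source_file: str   # e.g. "q3_board_deck.pdf"
    locator: str       # page, slide, cell range, or line span
    excerpt: str       # the verbatim text the answer rests on

@dataclass(frozen=True)
class Passage:
    text: str
    evidence: Evidence

def retrieve(index: dict[str, Passage], query: str) -> list[Passage]:
    """Toy keyword lookup; a real system would rank semantically.
    The point is the invariant, not the matching."""
    hits = [p for p in index.values() if query.lower() in p.text.lower()]
    for p in hits:
        # Refuse to return text that cannot be traced to a source.
        assert p.evidence.excerpt, "unverifiable passage"
    return hits
```

The contrast with generic document chat is that the `Evidence` record travels with every answer, so a human reviewer can check the claim against the original file rather than trusting the model.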
The Foundation
I came to document intelligence through code intelligence.
- llama-github: the retrieval substrate, built to give LLMs GitHub-native context instead of raw repository dumps.
- LlamaPReview: field validation for that thesis. It achieved a measured 61% signal-to-noise ratio in AI code review across 4,000+ active repositories (35K+ combined stars).
- repo-graph-rag: the Code Mesh research artifact, exploring deterministic repository graphs and traversal-first retrieval.
- llamapreview-context-research: formalizing the exact failure mode of Context Instability.
This path started with helping AI understand code diffs and led to a broader conclusion: the real computing frontier is shifting toward understanding full knowledge environments and generating high-stakes output from them. Code Mesh was the logical end of one inquiry, but not the final product surface.