About

Hi, I'm Jet Xu.

Turning private work files and code into reliable, evidence-backed context for AI.

Systems Architect | 15+ years building mission-critical infrastructure | AI Harnessing & Context Engineering

DocMason | Blog | LinkedIn | Email

Reasoning is improving fast. Reliable context is still the bottleneck.

  • Now: building DocMason, a local-first, evidence-first knowledge base for AI-assisted deep research over private work files.
  • Before: built code-intelligence systems across llama-github, LlamaPReview, and repo-graph-rag.
  • Direction: the Mason ecosystem—moving from deep document analysis to generating native, consulting-grade presentations.

Why This Path

Over 15 years of architecting mission-critical systems, I have seen the same failure mode recur: in high-stakes environments, being “almost right” is useless. The bottleneck to reliable output—whether from humans or AI—is rarely raw reasoning capacity. It is context precision. That constraint drives everything I build.

Current Focus

DocMason is my current open-source focus: a local-first, provenance-first knowledge base for AI-assisted deep research over private work files. It is not a document chatbot. It compiles unstructured artifacts into knowledge infrastructure that agents can actually use. Its native operating pattern is simple: the repo is the app, and Codex is the runtime.

Core architectural priorities:

  • Deterministic ingestion: parsing PDFs, decks, spreadsheets, emails, and repository-native text without silent failures.
  • Reliable retrieval: provenance-first answers instead of vague, hallucination-prone document chat.
  • Actionable output: extending the Mason ecosystem beyond extraction. The next step is a deterministic pipeline that turns deep document analysis directly into native, consulting-grade presentations (PPTX) for serious white-collar work.

The Foundation

I came to document intelligence through code intelligence.

  • llama-github: the retrieval substrate, built to give LLMs GitHub-native context instead of raw repository dumps.
  • LlamaPReview: field validation for that thesis. It achieved a measured 61% signal-to-noise ratio in AI code review across 4,000+ active repositories (35K+ combined stars).
  • repo-graph-rag: the Code Mesh research artifact, exploring deterministic repository graphs and traversal-first retrieval.
  • llamapreview-context-research: formalizing the exact failure mode of Context Instability.

This path started with helping AI understand code diffs, but it led to a broader conclusion: the real computing frontier is shifting toward understanding full knowledge environments and generating high-stakes output from them. Code Mesh was the logical end of one inquiry, but not the final product surface.

Selected Writing