A Document-Based Legal AI That Can Say "I Don't Know"
Contents
- The dream, and the real problem
- Quick fix: this is not "training"
- How the pieces fit together
- 1) Turning the document into text, and a PDF bug
- 2) Chunking: why and how?
- 3) Embedding: text into numbers, without losing the meaning
- 4) Vector search: boring but enough
- 5) Generation: pushing the model to the source
- Does it really stop hallucination? A test
- When it finds more than one document
- Two engines: cloud Haiku or a local model
- Lessons from the road
- Where it is useful, and how it scales?
- 10 million documents: what big data has waiting for us
- Putting the three models side by side
- For the people who say "talk is cheap, show the code"
Software Engineering Series
- A Document-Based Legal AI That Can Say "I Don't Know" (reading)
- Opening 11 Million Character HTML in a Mobile WebView: Virtual Chunking
- Vibe Coding for 10 Years Experienced Software Developer


