What are minibases

A working definition, a worked example, and why one big knowledge base stops being enough once you have more than one topic you actually care about.

The first time I tried to give an LLM a real corpus to work with, I pasted seventeen blog posts into the chat window and asked a question. The answer was okay. The second time, I tried fifty posts. The answer was worse. Not because the model was less capable, but because the signal-to-noise ratio had collapsed. Half the posts weren't relevant. A quarter were too long. The rest were fighting each other for the model's attention.

That moment is when most people either give up on AI for their actual work, or they start building what I now call minibases — one per topic that matters to them.

A working definition

A minibase is a small, curated collection of notes on one topic, structured for a language model to read. Three parts of that sentence do most of the work.

Small

If you can't roughly summarize what's in it from memory, it's too big. That sounds extreme until you realize that the value of a minibase is not "everything you've ever read", it's "the stuff you'd want a smart friend to have read before they help you think." A friend who's read thirty deeply relevant essays is more useful than one who's skimmed three hundred.
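
If you want a number to check yourself against, word count is a decent proxy. Here's a rough sketch in Python; the 40,000-word budget is taken from the worked example later in this post, and it's a personal heuristic, not a rule.

```python
# A rough sanity check, not official tooling. Assumes a minibase is a
# flat folder of .md/.txt files; the budget is an assumption borrowed
# from the worked example below.
from pathlib import Path

WORD_BUDGET = 40_000  # assumption: roughly the size of my example minibase

def minibase_size(folder: str) -> tuple[int, int]:
    """Return (file_count, word_count) for the plain-text files in a folder."""
    files = [p for p in Path(folder).iterdir() if p.suffix in {".md", ".txt"}]
    words = sum(len(p.read_text(encoding="utf-8").split()) for p in files)
    return len(files), words

count, words = minibase_size("distribution")  # hypothetical folder name
print(f"{count} files, ~{words:,} words")
if words > WORD_BUDGET:
    print("Probably too big to summarize from memory. Consider pruning.")
```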

Curated

Someone, usually you, made deliberate choices about what to keep and what to throw out. A folder of auto-clipped articles isn't a minibase. A folder of articles you read, kept, and re-tagged when you found something better is a minibase. The curation is the asset.

For a language model to read

This is the part that's new. A minibase isn't just for you. It's designed so a model can ingest it cleanly: plain text or markdown, one file per idea, descriptive filenames, no PDF wrappers, no scattered screenshots, no "see attached." If a stranger could understand the folder structure in thirty seconds, a model probably can too.
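
To make that concrete, here's roughly what "structured for a model to read" buys you. Because a minibase is plain text with one file per idea, turning the whole folder into a single prompt-ready string takes a few lines of Python. The folder name is illustrative, and this sketch assumes markdown files only.

```python
# A minimal sketch: walk a minibase folder and assemble one context
# string, using each descriptive filename as a section header so the
# model knows where one idea ends and the next begins.
from pathlib import Path

def load_minibase(folder: str) -> str:
    """Concatenate every note in a minibase, labeled by its filename."""
    parts = []
    for path in sorted(Path(folder).glob("*.md")):
        # The descriptive filename doubles as a header for the model.
        parts.append(f"## {path.stem}\n\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)

context = load_minibase("distribution")  # hypothetical example folder
```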

A worked example

Here's one of mine. The topic is "how indie developers find distribution," because I think about it every week. The minibase lives in a folder called distribution/.

It contains twenty-one files, maybe forty thousand words total. Small enough that I could rebuild it from memory if my hard drive caught fire. Big enough that when I point Claude at it and ask "what's the weakest part of this launch plan," the answer feels like talking to someone who has done the homework.
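
If you're curious what "point Claude at it" looks like in practice, here's a minimal sketch using the Anthropic Python SDK (pip install anthropic, with ANTHROPIC_API_KEY set in your environment). The folder name and model string are assumptions; swap in your own. Later posts walk through this step properly.

```python
# A sketch of the "point Claude at it" step, assuming the minibase is a
# folder of markdown files as described above.
from pathlib import Path

import anthropic

client = anthropic.Anthropic()

# Concatenate the minibase into one context string (see the earlier sketch).
context = "\n\n".join(
    p.read_text(encoding="utf-8")
    for p in sorted(Path("distribution").glob("*.md"))  # hypothetical folder
)

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumption: use whatever model you prefer
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": f"Here are my notes on indie distribution:\n\n{context}\n\n"
                   "What's the weakest part of this launch plan?",
    }],
)
print(response.content[0].text)
```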

Why the chat-window approach stops working

Three reasons, all of them obvious in hindsight.

Context windows aren't selection mechanisms

Pasting more doesn't make the model smarter. It makes the model spread thinner. Every additional paragraph you paste competes for attention with the paragraph that actually mattered. The model has no way to know which is which, because you didn't tell it.

Quality decays faster than quantity helps

One genuinely useful essay is worth more than ten mediocre ones, and the mediocre ones aren't neutral. They drag the answer toward their center of gravity. A minibase is partly about what you include, and more about what you refuse to include.

You can't reuse a paste

Every chat starts from zero. Every paste is one-shot. A minibase, by contrast, is reusable, versioned, and improvable. The work you put in compounds.

What's next

The rest of this blog is about how to actually build them. Which tool to use. How to capture from the web without making a mess. How to structure notes so they retrieve cleanly. How to plug the result into an actual model. We'll cover Obsidian, Save, plain folders, web clippers, and the parts of the workflow nobody writes about because they're "obvious" (they aren't).

Start small. Pick one topic. Make twenty files. See how it feels. Then start the next one.
