Knowledge, Retrieval, and Evaluation

Context Strategies and Trade-Offs


Learning Objectives

  • You know several ways to provide context to an LLM application.
  • You can compare those strategies using engineering trade-offs.
  • You can choose a reasonable context strategy for a small CLI application.

Multiple strategies for providing context

When an application needs extra information, there are multiple options:

  • write the needed context directly into the prompt,
  • include one or more full documents in the request,
  • retrieve only the most relevant chunks,
  • or maintain a smaller structured state such as summaries, metadata, or extracted facts.

The right choice depends on the size of the data, the cost of errors, and the shape of the application.


Strategy 1: Hand-written fixed context

If the required context is small and stable, the simplest option is often to write it directly into the prompt or keep it in a small local file.

This works well when:

  • the application handles one narrow task,
  • the rules rarely change,
  • and the total context is short.

The main advantage is simplicity. The main risk is that the prompt becomes stale as the real rules change.
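As a minimal sketch, the fixed-context approach amounts to pasting the rules into every prompt. The rule text and the `build_prompt` helper below are hypothetical examples; a real application would pass the resulting string to whatever model client it uses.

```python
# Hand-written fixed context: the rules live directly in the source
# (or a small local file) and are included in every prompt.
COURSE_RULES = """\
- Deadlines are at 23:59 local time.
- Late submissions lose 10% per day.
- Resubmissions are allowed until the deadline."""


def build_prompt(question: str) -> str:
    """Combine the fixed, hand-written context with the user's question."""
    return (
        "Answer using only the rules below.\n\n"
        f"Rules:\n{COURSE_RULES}\n\n"
        f"Question: {question}"
    )


print(build_prompt("Can I resubmit my exercise?"))
```

Because the rules are baked into the code, any change to the real course rules requires editing and redeploying the prompt, which is exactly the staleness risk noted above.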

Strategy 2: Long-context prompting

Another approach is to send a larger document or several documents directly in the request.

This can work for small document sets, especially when:

  • the user needs answers from one known file,
  • the document is short enough to fit comfortably in the context window,
  • and it is still feasible to trace answers back to the document.

The trade-off is that latency and cost can grow quickly. It is also harder to tell which parts of the input the model actually relied on, and the quality of the response can depend considerably on the model used.
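A long-context setup can be sketched as reading an entire file into the prompt, with a guard against oversized input. The `MAX_CHARS` budget below is a hypothetical character-based stand-in; real limits are measured in tokens and depend on the model.

```python
from pathlib import Path

# Rough, illustrative budget. Real context limits are in tokens,
# not characters, and vary by model.
MAX_CHARS = 20_000


def build_long_context_prompt(doc_path: str, question: str) -> str:
    """Send the entire document in the request, refusing if it is too large."""
    text = Path(doc_path).read_text(encoding="utf-8")
    if len(text) > MAX_CHARS:
        raise ValueError("Document too large for long-context prompting")
    return f"Document:\n{text}\n\nQuestion: {question}"
```

Note that the whole document is re-sent on every request, which is where the latency and token-cost growth comes from.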

Strategy 3: Retrieval

Retrieval is a better fit when:

  • the document set is larger,
  • only a small part of it is relevant to any one question,
  • or the system should be able to cite the pieces it used.

Retrieval adds engineering work: chunking, indexing, ranking, and debugging retrieval quality. In return, it often reduces prompt size and improves traceability.
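The chunking-and-ranking pipeline can be sketched with a naive keyword-overlap score. The scoring here is a deliberately crude stand-in for real ranking methods such as BM25 or embedding similarity, and the function names are illustrative.

```python
def chunk(text: str, size: int = 200) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def score(chunk_text: str, question: str) -> int:
    """Count how many question words appear in the chunk.

    A crude stand-in for real ranking (BM25, embeddings, etc.).
    """
    words = set(question.lower().split())
    return sum(1 for w in words if w in chunk_text.lower())


def retrieve(chunks: list[str], question: str, k: int = 2) -> list[str]:
    """Return the k highest-scoring chunks for the question."""
    return sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]
```

Only the top-k chunks go into the prompt, which keeps the request small and makes it possible to show the user exactly which passages were used.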

Strategy 4: Structured state and preprocessing

Sometimes the application does not need free-form document retrieval at all. It may be enough to precompute a smaller representation such as:

  • a summary of each document,
  • a table of known facts,
  • or a JSON file of extracted fields.

This can be more reliable than retrieval when the task is highly structured. For example, a CLI tool that answers course deadline questions may work better with a verified schedule JSON file than with free-form search over long documents.
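The deadline example can be sketched as a plain lookup against a verified JSON file, with no retrieval and no model call needed for the structured part. The schedule contents below are made up for illustration.

```python
import json

# A hypothetical, manually verified schedule file. In a real tool this
# would be loaded from disk and kept up to date by the course staff.
SCHEDULE_JSON = """
{
  "exercise-1": "2025-01-31",
  "exercise-2": "2025-02-14"
}
"""


def deadline_for(task: str) -> str:
    """Answer a deadline question from the structured schedule."""
    schedule = json.loads(SCHEDULE_JSON)
    try:
        return schedule[task]
    except KeyError:
        return f"No deadline recorded for {task}"
```

Because the answer comes from a verified file rather than free-form generation, the tool cannot hallucinate a deadline; unknown tasks produce an explicit "not recorded" response instead.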


Trade-offs and KISS

One common mistake is to add retrieval because it is fashionable rather than because the application needs it. Another is to keep sending huge prompts even after the document set has grown too large for that approach to remain practical.

Good engineering means choosing a context strategy that matches the actual task, not the most impressive architecture.

A few engineering questions are especially useful:

  • How much context must be sent on each request?
  • How easy is it to inspect where the answer came from?
  • How expensive is the approach in tokens, latency, and implementation effort?
  • How often does the source material change?
  • Does the application need exact fields or open-ended explanation?

There is also no single best choice for every system. A small local tool might start with fixed context, move to retrieval as the document set grows, and keep some high-value facts in structured form.
