
Last post I talked about the code archaeology problem. Here's the thing I've been building.

The workflow

You drop an anchor comment on an entry point:

// @anchor:get-recommendations
public CompletableFuture<Page<Item>> getRecommendationsAsync(
    long userId, RecommendationType type, int pageSize) {
    return recommendationEngine.firstPage(userId, type, pageSize);
}

Then run the analyzer. It uses tree-sitter to parse the code, builds a call graph from that entry point, and spits out an .anchor.yaml file.
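I won't go into the analyzer internals here, but once the parser has produced a method-to-callees map, the walk out from the anchor is just a breadth-first traversal. A simplified sketch (class and method names are illustrative, and the real tool works off tree-sitter output rather than a hand-built map):

```java
import java.util.*;

// Illustrative sketch: given a map of method -> direct callees
// (which the parser produces), the set of methods reachable from
// an anchor is a plain breadth-first walk.
public class CallGraphWalk {
    public static List<String> reachableFrom(String entry,
            Map<String, List<String>> callees) {
        List<String> order = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>(List.of(entry));
        while (!queue.isEmpty()) {
            String m = queue.poll();
            if (!seen.add(m)) continue;   // skip already-visited methods
            order.add(m);
            queue.addAll(callees.getOrDefault(m, List.of()));
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = Map.of(
            "ApiService.getRecommendationsAsync",
                List.of("RecommendationEngine.firstPage"),
            "RecommendationEngine.firstPage",
                List.of("SortingUtils.sortByRecency"));
        System.out.println(reachableFrom("ApiService.getRecommendationsAsync", g));
        // → [ApiService.getRecommendationsAsync, RecommendationEngine.firstPage, SortingUtils.sortByRecency]
    }
}
```

Everything reachable from the anchor ends up in the .anchor.yaml file; anything unreachable is ignored, which keeps the file scoped to one feature.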

Then you run the enhance step. The LLM reads the method signatures and adds business context, parameter descriptions, and debug keywords: stuff that helps future prompts understand what the code does, not just how it's structured.
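The enhance step is mostly prompt assembly. Roughly, it looks something like this per method (simplified; the actual template and field names in my tool differ, this is just the shape):

```java
// Illustrative sketch of the enhance step's prompt assembly:
// send the signature plus a slice of the body, ask for the
// YAML fields back. Not the tool's actual template.
public class EnhancePrompt {
    public static String build(String signature, String bodySnippet) {
        return """
            You are documenting a Java codebase.

            Method signature:
            %s

            Body (may be truncated):
            %s

            Return YAML with: business_context, parameters
            (name + description), and debug_keywords.
            """.formatted(signature, bodySnippet);
    }
}
```

One call per method adds up fast, which is exactly why regeneration is slow today (more on that at the end).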

What an anchor file looks like for now

anchor_name: "get-recommendations"

methods:
  - id: "ApiService.getRecommendationsAsync"
    file: "src/.../ApiService.java"
    line_range: [279, 290]
    method_signature: "public CompletableFuture<Page<Item>> getRecommendationsAsync(...)"
    
    # LLM-enhanced fields
    business_context: |
      Main API endpoint for recommendations feature. Returns paginated items
      based on user's history and subscription preferences.
    parameters:
      - name: "userId"
        description: "User identifier"
      - name: "type"
        description: "Type of recommendation view (TRENDING, RECENT, etc.)"
    debug_keywords: ["recommendations", "ranking", "personalization"]
    
    calls_methods: ["RecommendationEngine.firstPage"]
    called_by_methods: []

The key bits: business_context tells you what it does in human terms. debug_keywords help you find it when searching for a problem. calls_methods gives you the full graph without the LLM having to crawl.
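To make the debug_keywords part concrete: once the anchor file is parsed, finding the relevant method for a bug report is a lookup, not a crawl. A toy sketch (types and names are illustrative, not the tool's real API):

```java
import java.util.*;

// Illustrative sketch: with anchor entries in memory, a debug
// query reduces to keyword matching instead of reading source files.
public class AnchorIndex {
    record AnchorMethod(String id, String file, List<String> debugKeywords) {}

    public static List<String> findByKeyword(List<AnchorMethod> methods,
            String keyword) {
        return methods.stream()
            .filter(m -> m.debugKeywords().contains(keyword))
            .map(m -> m.id() + " (" + m.file() + ")")
            .toList();
    }

    public static void main(String[] args) {
        var methods = List.of(
            new AnchorMethod("ApiService.getRecommendationsAsync",
                "src/ApiService.java",
                List.of("recommendations", "ranking", "personalization")),
            new AnchorMethod("SortingUtils.sortByRecency",
                "src/SortingUtils.java",
                List.of("sorting", "recency")));
        System.out.println(findByKeyword(methods, "sorting"));
        // → [SortingUtils.sortByRecency (src/SortingUtils.java)]
    }
}
```

In practice the model does this matching itself from the YAML in its context; the point is that the search space is already tiny and pre-labeled.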

The PoC

I tested this with a real bug I had already fixed. Old items from years ago were showing at the top of a feed, above recent ones. A sorting issue somewhere in the codebase.

Me, when I originally fixed it: maybe 10 minutes of context switching and another 10 writing a test to reproduce the behaviour and debug it. I don't really remember...

Without anchors: Claude CLI crawled through the codebase for 8 minutes. Reading files, backtracking, reading more files. Eventually found the right spot. (Somewhat impressive)

With anchors: I gave the same prompt to a basic chat model with just the anchor file. 15 seconds. It pointed to the exact lines in SortingUtils.java:x-y.

Same answer. 32x faster. And the anchor version didn't need an agentic model burning tokens on file exploration.

Why I think this works

The anchor file front-loads the hard work. The LLM doesn't have to figure out where the sorting logic is; the call graph already traced it. It doesn't have to guess what userId means; the enhanced parameters already explain it.

The LLM just reads structured context and reasons about it. Which is exactly what it's good at.

What's next

Still early, and I'm not sure this proves anything. It was an overly optimised, duct-taped test. I need to build something more robust so I can test it systematically.

Like using fingerprints (hashes of method bodies) to detect when code changes, so I don't re-enhance unchanged methods. Because right now it takes forever to generate these files...
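The fingerprint idea is simple enough to sketch: hash a whitespace-normalized copy of the method body, so pure reformatting doesn't invalidate anything, and only a changed hash triggers a re-enhance. Illustrative sketch, not the final implementation:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Illustrative sketch: fingerprint a method body by hashing its
// whitespace-normalized source. Same fingerprint as last run
// means the cached LLM enhancement can be reused.
public class Fingerprint {
    public static String of(String methodBody) {
        // Collapse all whitespace so indentation changes don't count.
        String normalized = methodBody.replaceAll("\\s+", " ").trim();
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(normalized.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```

Normalizing only whitespace is deliberately crude; renaming a local variable would still re-trigger enhancement. Hashing the parse tree instead would be stricter, but this is the cheap version.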

Not sure how far this goes yet, but the early results are promising enough that I'm going to keep poking at it. What surprised me: I thought the main reason to do this was to reduce hallucinations, but now the bigger win looks like speed and token reduction. Even more beneficial for large codebases with large teams...
