Why We Replace UUIDs With Short IDs Before Sending Data to LLMs

When you’re building features that send structured data to an LLM, you quickly run into a problem with UUIDs. They look like this:

019befe8-7c7e-7f5e-a498-b0fcc6cf8603

That single ID is about 10 tokens. If your prompt references 20 objects, you’ve just spent 200 tokens on identifiers alone. But the token cost isn’t the only issue.

The real problem: LLMs get confused

UUIDs are long strings of random hex characters. When a prompt contains many of them, the model has to track which UUID belongs to which object and return the correct one in its response. In practice, we found that LLMs occasionally return the wrong UUID, whether by swapping two characters, mixing up two IDs, or hallucinating one entirely. The more UUIDs in the prompt, the higher the error rate.

This matters when the LLM’s response drives real actions. If you ask a model to pick the best matching category for an invoice line item and it returns the wrong ID, you’ve silently mis-categorized that item.

The fix: a short ID mapper

We use a simple mapper that replaces UUIDs with 6-character alphanumeric IDs for the duration of a single LLM call. The IDs are short, distinct, and easy for the model to handle.

You might wonder why we didn’t just use simple incrementing numbers (1, 2, 3). In our case, Dexter prompts already contain a lot of numbers (currency amounts, quantities, percentages) and we found that numeric IDs made it easier for the model to confuse an ID with a value. Alphanumeric strings like a7x2k9 stand out clearly from numerical data. That said, this is specific to our use case. If your prompts don’t contain much numerical data, plain numbers might work just fine.

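require "securerandom"

# Maps UUIDs to short alphanumeric IDs and back for a single LLM call.
# Note: `blank?` comes from ActiveSupport (available by default in Rails).
class ShortIdMapper
  SHORT_ID_LENGTH = 6

  def initialize
    @uuid_to_short = {}
    @short_to_uuid = {}
  end

  # Returns the short ID for a UUID, generating one on first use.
  def shorten(uuid)
    return nil if uuid.blank?

    @uuid_to_short[uuid] ||= begin
      short_id = generate_unique_id
      @short_to_uuid[short_id] = uuid
      short_id
    end
  end

  # Returns the original UUID, or nil if this mapper never issued the short ID.
  def expand(short_id)
    return nil if short_id.blank?

    @short_to_uuid[short_id]
  end

  # Registers a fixed UUID/short ID pair (useful for deterministic tests).
  def add_mapping(uuid, short_id)
    @uuid_to_short[uuid] = short_id
    @short_to_uuid[short_id] = uuid
  end

  private

  # Retries until the random ID is not already in use by this mapper.
  def generate_unique_id
    loop do
      id = SecureRandom.alphanumeric(SHORT_ID_LENGTH).downcase
      return id unless @short_to_uuid.key?(id)
    end
  end
end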

Each mapper instance maintains its own bidirectional mapping table. You create one per LLM request, shorten all IDs before building the prompt, and expand the IDs in the response back to UUIDs.
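
As a quick illustration of that lifecycle, here is a round trip in plain Ruby. The class below is a condensed stand-in for the mapper above, with the Rails-only `blank?` swapped for a plain nil/empty check so the snippet runs without ActiveSupport. The UUIDs are made-up sample values.

```ruby
require "securerandom"

# Condensed stand-in for the ShortIdMapper above (assumption: plain Ruby,
# so a nil/empty check replaces ActiveSupport's `blank?`).
class ShortIdMapper
  SHORT_ID_LENGTH = 6

  def initialize
    @uuid_to_short = {}
    @short_to_uuid = {}
  end

  def shorten(uuid)
    return nil if uuid.nil? || uuid.to_s.empty?

    @uuid_to_short[uuid] ||= begin
      short_id = generate_unique_id
      @short_to_uuid[short_id] = uuid
      short_id
    end
  end

  def expand(short_id)
    @short_to_uuid[short_id]
  end

  private

  def generate_unique_id
    loop do
      id = SecureRandom.alphanumeric(SHORT_ID_LENGTH).downcase
      return id unless @short_to_uuid.key?(id)
    end
  end
end

uuid_a = "019befe8-7c7e-7f5e-a498-b0fcc6cf8603"
uuid_b = "019befe8-8a1b-7d2c-b612-d4e8a9f12345"

mapper = ShortIdMapper.new
short_a = mapper.shorten(uuid_a)
short_b = mapper.shorten(uuid_b)

# Shortening the same UUID again returns the same short ID.
same_again = mapper.shorten(uuid_a) == short_a
# Expanding a short ID returns the original UUID.
round_trip = mapper.expand(short_a) == uuid_a
# An ID this mapper never issued expands to nil.
unknown = mapper.expand("not-issued")
# A second mapper has its own table and knows nothing about the first one's IDs.
other_mapper_result = ShortIdMapper.new.expand(short_a)
```

Because each instance owns its table, a short ID from one request cannot be expanded by a mapper created for another request.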

How it looks in practice

Here’s a simplified version of how we use this in our spend categorization pipeline. When building the prompt, we shorten every category ID:

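mapper = ShortIdMapper.new
content = [] # assumed here to be an array of prompt lines

content << "Available categories:"
categories.each do |category|
  # Only the short ID ever reaches the prompt.
  content << "  ID: #{mapper.shorten(category.id)}, Name: #{category.name}"
end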

Instead of the LLM seeing this:

Available categories:
  ID: 019befe8-7c7e-7f5e-a498-b0fcc6cf8603, Name: IT Equipment
  ID: 019befe8-8a1b-7d2c-b612-d4e8a9f12345, Name: Office Supplies
  ID: 019befe8-9c3d-7e4f-c723-e5f9b0a23456, Name: Facilities

It sees this:

Available categories:
  ID: a7x2k9, Name: IT Equipment
  ID: m3p8v2, Name: Office Supplies
  ID: k9w4j6, Name: Facilities

When the LLM responds with a selected category ID, we expand it back:

short_id = parsed_response["selected_category_id"]
# expand returns nil if the model sent back an ID the mapper never issued.
uuid = mapper.expand(short_id)
category = categories.find { |c| c.id == uuid }

The model only ever works with short IDs. The mapping is invisible to it.
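
One detail worth handling explicitly: `expand` returns nil for a short ID the mapper never issued, which is what you get if the model invents one. Below is a minimal sketch of a stricter lookup that fails loudly instead. `expand_or_raise`, `UnknownShortIdError`, and the `StubMapper` stand-in are hypothetical names used for illustration. They are not part of the mapper shown earlier.

```ruby
# Hypothetical error type for IDs the mapper never handed out.
class UnknownShortIdError < StandardError; end

# Expands a short ID via any object that responds to `expand`,
# raising instead of returning nil when the ID is unknown.
def expand_or_raise(mapper, short_id)
  cleaned = short_id.to_s.strip
  uuid = mapper.expand(cleaned)
  raise UnknownShortIdError, "LLM returned unknown ID: #{cleaned.inspect}" if uuid.nil?

  uuid
end

# Stand-in with a fixed table so the example runs on its own.
class StubMapper
  def initialize(table)
    @table = table
  end

  def expand(short_id)
    @table[short_id]
  end
end

mapper = StubMapper.new({ "a7x2k9" => "019befe8-7c7e-7f5e-a498-b0fcc6cf8603" })

# Surrounding whitespace is stripped before the lookup.
known = expand_or_raise(mapper, " a7x2k9 ")

# An ID that was never issued raises, so the caller can retry or bail out.
hallucinated =
  begin
    expand_or_raise(mapper, "zz9999")
  rescue UnknownShortIdError => e
    e.message
  end
```

Whether to raise, retry the call, or fall back to a default is a judgment call that depends on what the response drives. The point is only that an unknown ID should not pass silently.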

The testing bonus

There’s a second benefit when you use a mapper like this: deterministic tests.

We use VCR to record and replay HTTP interactions in our integration tests. With randomly generated short IDs, the prompt and tool calls change on every test run, so the outgoing request no longer matches the recorded cassette.

The add_mapping method solves this. In tests, you pre-populate the mapper with known IDs:

let(:vcr_id_mapper) do
  ShortIdMapper.new.tap do |mapper|
    mapper.add_mapping(category_it.id, "cat001")
    mapper.add_mapping(category_office.id, "cat002")
  end
end

subject(:service) { described_class.new(id_mapper: vcr_id_mapper) }

Now the prompt is identical every time, the VCR cassette matches, and your tests are fully deterministic. Without the mapper, you’d need to either stub out the ID generation or accept that your integration tests can’t use recorded HTTP fixtures.
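
The spec above passes the mapper in through an `id_mapper:` keyword. The receiving side could look roughly like the sketch below. `CategoryPromptBuilder`, its `build_prompt` method, and the `Category` struct are hypothetical stand-ins, and the mapper is condensed to just the methods this example needs.

```ruby
require "securerandom"

# Condensed stand-in for ShortIdMapper: only what this example needs.
class ShortIdMapper
  def initialize
    @uuid_to_short = {}
    @short_to_uuid = {}
  end

  def shorten(uuid)
    @uuid_to_short[uuid] ||= begin
      short_id = SecureRandom.alphanumeric(6).downcase
      short_id = SecureRandom.alphanumeric(6).downcase while @short_to_uuid.key?(short_id)
      @short_to_uuid[short_id] = uuid
      short_id
    end
  end

  def add_mapping(uuid, short_id)
    @uuid_to_short[uuid] = short_id
    @short_to_uuid[short_id] = uuid
  end
end

Category = Struct.new(:id, :name)

# Hypothetical prompt builder: production code gets the default random mapper,
# tests inject a pre-seeded one.
class CategoryPromptBuilder
  def initialize(id_mapper: ShortIdMapper.new)
    @id_mapper = id_mapper
  end

  def build_prompt(categories)
    lines = ["Available categories:"]
    categories.each do |category|
      lines << "  ID: #{@id_mapper.shorten(category.id)}, Name: #{category.name}"
    end
    lines.join("\n")
  end
end

categories = [
  Category.new("019befe8-7c7e-7f5e-a498-b0fcc6cf8603", "IT Equipment"),
  Category.new("019befe8-8a1b-7d2c-b612-d4e8a9f12345", "Office Supplies")
]

# Builds a mapper seeded the same way on every call, like the `let` block above.
seeded_mapper = lambda do
  ShortIdMapper.new.tap do |mapper|
    mapper.add_mapping(categories[0].id, "cat001")
    mapper.add_mapping(categories[1].id, "cat002")
  end
end

# Two independent "test runs" produce byte-identical prompts.
first_run = CategoryPromptBuilder.new(id_mapper: seeded_mapper.call).build_prompt(categories)
second_run = CategoryPromptBuilder.new(id_mapper: seeded_mapper.call).build_prompt(categories)
```

Since `shorten` checks the existing table before generating anything, seeded UUIDs never trigger random ID generation, and that is what keeps the request body stable across runs.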

Wrapping up

If you’re sending structured data with IDs to an LLM, consider mapping them to shorter identifiers. It’s a small amount of code that gives you lower token usage, fewer ID mix-ups in the model’s responses, and deterministic tests.

The pattern works regardless of your LLM provider or programming language. The important thing is keeping the mapping scoped to a single request/response cycle and expanding IDs back on the way out.