The Zig HashMap gotcha: it doesn't copy your data

Posted on 2025-11-15 :: zig systems-programming

Note: This is based on Zig 0.15.2. Zig is pre-1.0 and constantly changing.

Here's a gotcha that bit me hard when I was building a key-value store in Zig while knowing little about manual memory management: HashMap doesn't copy your data. It stores slices that point to wherever you allocated it.

What HashMap actually allocates

When you create a HashMap in Zig, it allocates internal structures:

var map = std.StringHashMap([]const u8).init(allocator);
defer map.deinit();

The allocator is used for the map's own backing storage: the table's metadata and its arrays of key and value slots.

map.deinit() frees only this internal structure. This was a good gotcha: because the map takes an allocator, I assumed it would also allocate the values and free them when the map went away.
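If the map ends up pointing at heap-allocated keys and values, freeing them is on you, and it has to happen before deinit(). Here is a minimal sketch, assuming every key and value came from the same allocator. deinitOwned is a name I made up for this post, not something std provides.

```zig
const std = @import("std");

/// Hypothetical helper: frees every key and value the map points at,
/// then the map itself. Assumes all of them were allocated with `allocator`.
fn deinitOwned(map: *std.StringHashMap([]const u8), allocator: std.mem.Allocator) void {
    var it = map.iterator();
    while (it.next()) |entry| {
        allocator.free(entry.key_ptr.*);
        allocator.free(entry.value_ptr.*);
    }
    map.deinit(); // only now release the table's own storage
}

test "owned entries are freed by hand" {
    const allocator = std.testing.allocator; // reports leaks as test failures
    var map = std.StringHashMap([]const u8).init(allocator);

    const key = try allocator.dupe(u8, "mykey");
    const value = try allocator.dupe(u8, "myvalue");
    try map.put(key, value);

    deinitOwned(&map, allocator);
}
```

Calling only map.deinit() in that test would leave the two dupe'd strings allocated, and the testing allocator would flag them as leaks.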

But the hashmap does NOT allocate the actual key/value data you pass to put():

const key = "mykey";
const value = "myvalue";
try map.put(key, value);  // HashMap stores slices (pointer + length)

HashMap doesn't copy the string data. It stores slices pointing to YOUR memory, wherever you allocated those strings.
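The string literals above live in static memory, so that particular snippet happens to be safe. To make the aliasing visible, here is a small sketch that stores a slice of a mutable buffer and then overwrites the buffer. mapSeesLaterWrites is a made-up name for illustration.

```zig
const std = @import("std");

/// Puts a slice of `buf` into a fresh map, overwrites `buf` afterwards, and
/// reports whether the map's stored value changed along with it.
fn mapSeesLaterWrites(allocator: std.mem.Allocator) !bool {
    var map = std.StringHashMap([]const u8).init(allocator);
    defer map.deinit();

    var buf = [_]u8{ 'o', 'l', 'd' };
    const value: []const u8 = &buf;
    try map.put("mykey", value); // stores pointer + length, no copy of the bytes

    // Overwrite the caller's buffer after the put.
    buf[0] = 'n';
    buf[1] = 'e';
    buf[2] = 'w';

    const stored = map.get("mykey").?;
    // Same address as the caller's buffer, and the new contents show through.
    return stored.ptr == value.ptr and std.mem.eql(u8, stored, "new");
}

test "the map aliases the caller's buffer" {
    try std.testing.expect(try mapSeesLaterWrites(std.testing.allocator));
}
```

The map never saw the bytes "new", yet that is what get() returns, because it only ever held a pointer to buf.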

The memory ownership problem

This creates a subtle bug pattern:

// Request-scoped arena allocator
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();  // Frees ALL arena memory at once

// Parse request, allocate keys/values from arena
const key = try arena.allocator().dupe(u8, "mykey");
const value = try arena.allocator().dupe(u8, "myvalue");

// Store in HashMap
try map.put(key, value);

// Request ends, arena is freed
// HashMap now has dangling pointers!

When the arena is freed, the memory that key and value pointed to is gone. The HashMap still has pointers to that memory. Use-after-free.

This doesn't necessarily crash. It might work 99% of the time, as long as the freed memory hasn't been reused or unmapped yet, then randomly fail. Or worse, it's a big-ass security bug.

The fix: different lifetime allocators

In hindsight, it's obvious. Why would a hashmap that needs to persist between TCP requests use an allocator that gets deinitialized after the connection is terminated?

Request-scoped memory (arena for parsing) and storage-scoped memory (keys/values in HashMap) have different lifetimes. You need to handle them separately.

The copy pattern

// 1. Parse request into arena-allocated buffer (temporary)
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();

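const parsed_key = try parseKey(arena.allocator(), input);
const parsed_value = try parseValue(arena.allocator(), input);  // parseValue: hypothetical helper, like parseKey

// 2. Allocate permanent storage from the GPA (general-purpose allocator)
const stored_key = try gpa.dupe(u8, parsed_key);
const stored_value = try gpa.dupe(u8, parsed_value);

// 3. Store GPA-allocated data in HashMap
try map.put(stored_key, stored_value);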

// 4. Arena deinit frees parse buffer (safe - data is copied)
// 5. GPA-allocated key/value stays alive in HashMap

This means copying the data: an extra allocation + memcpy for each key and each value on every PUT. But it's necessary because request buffers and storage have different lifetimes.
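Put together, the copy pattern might look like the sketch below. putCopy is a name I made up for this post, not a std API, and std.testing.allocator stands in for the long-lived GPA so that leaks fail the test. It also covers the overwrite case, where the map already owns the key and only the old value needs freeing.

```zig
const std = @import("std");

const Map = std.StringHashMap([]const u8);

/// Hypothetical helper: copies `key` and `value` into memory owned by
/// `storage` and stores the copies in `map`. The caller's buffers can be
/// freed as soon as this returns.
fn putCopy(map: *Map, storage: std.mem.Allocator, key: []const u8, value: []const u8) !void {
    const stored_value = try storage.dupe(u8, value);
    errdefer storage.free(stored_value);

    if (map.getPtr(key)) |slot| {
        // The map already owns a copy of this key: free the old value
        // and swap in the new one.
        storage.free(slot.*);
        slot.* = stored_value;
        return;
    }

    const stored_key = try storage.dupe(u8, key);
    errdefer storage.free(stored_key);
    try map.put(stored_key, stored_value);
}

test "stored data outlives the request arena" {
    const storage = std.testing.allocator; // stand-in for the long-lived GPA
    var map = Map.init(storage);
    defer {
        // The map does not free what it points at, so do it by hand.
        var it = map.iterator();
        while (it.next()) |entry| {
            storage.free(entry.key_ptr.*);
            storage.free(entry.value_ptr.*);
        }
        map.deinit();
    }

    {
        // Request scope: everything parsed here dies with the arena.
        var arena = std.heap.ArenaAllocator.init(std.testing.allocator);
        defer arena.deinit();
        const key = try arena.allocator().dupe(u8, "mykey");
        const value = try arena.allocator().dupe(u8, "myvalue");
        try putCopy(&map, storage, key, value);
    }

    // The arena is gone, the stored copies are still valid.
    try std.testing.expectEqualStrings("myvalue", map.get("mykey").?);
}
```

The lookup-then-put costs two hash lookups on a new key. I went with it because it keeps the ownership logic easy to follow.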

Redis does this too (blessed pattern then :P). It copies data from read buffers to storage allocations.

You could skip the arena and allocate everything from GPA:

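// Parse and store both use GPA (parseValue is a hypothetical helper, like parseKey)
const key = try parseKey(gpa, input);
const value = try parseValue(gpa, input);
try map.put(key, value);  // map now points at GPA memory, no second copy

// Manually free temporary parse buffers
defer gpa.free(temp_buffer);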

This avoids the copy but requires manually tracking what's temporary vs. permanent. Which I don't want to do. What is this, C?

Why Rust doesn't have this problem

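In Rust, ownership rules and the borrow checker prevent this at compile time. Here the strings are moved into the map, so the map owns them:

use std::collections::HashMap;

fn owned_by_map() {
    let mut map = HashMap::new();
    {
        let key = String::from("mykey");
        let value = String::from("myvalue");
        map.insert(key, value);  // key and value are moved into the HashMap
    }  // nothing is dropped here: the HashMap owns both strings now
}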

Rust's ownership system enforces that whoever owns the data is responsible for freeing it. In safe Rust, you can't accidentally create dangling pointers.

At the same time, you mostly don't manually allocate in Rust, as far as I know. Types like String and Vec handle allocating and freeing for you.