Wire protocol framing (and why it prevents attacks)

Posted on 2025-12-09 :: systems-programming security

Building a wire protocol for my key-value store, I hit this immediately: how the hell do you know where one message ends and another begins?

TCP gives you a stream of bytes. Just bytes. One read() might give you half of a message, or two messages smooshed together, or one and a half messages. Without framing, you can't parse anything.

The problem (with actual suffering)

Client sends: "SET key value"
Server receives: "SET k" ... "ey val" ... "ue"

You don't know when you have the complete message. My first attempt just read until I had enough data that looked like a command. Worked great in my toy test. Broke immediately when I tried pipelining requests.

Text framing options

Delimiter approach - end each message with \n:

SET key value\n
GET key\n

Simple, human-readable, you can telnet in and type commands. But what if the value contains \n? Now you need escaping.
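Here's a minimal sketch of newline framing over an accumulated receive buffer, mostly to show where it works and where it falls over. The names nextLine and Line are made up for this example, and it assumes you keep appending incoming bytes to a buffer and call this after every read.

```zig
const std = @import("std");

/// One complete line plus whatever bytes follow it in the buffer.
/// (Hypothetical type for this sketch.)
const Line = struct {
    text: []const u8,
    rest: []const u8,
};

/// Returns the first '\n'-terminated line in `buf`, or null if no full
/// line has arrived yet (the caller should read more bytes and retry).
fn nextLine(buf: []const u8) ?Line {
    const end = std.mem.indexOfScalar(u8, buf, '\n') orelse return null;
    return Line{ .text = buf[0..end], .rest = buf[end + 1 ..] };
}

test "newline framing with partial and pipelined input" {
    // Half a message: nothing to hand to the parser yet.
    try std.testing.expect(nextLine("SET k") == null);

    // Two pipelined commands that arrived in a single read.
    const first = nextLine("SET key value\nGET key\n").?;
    try std.testing.expectEqualStrings("SET key value", first.text);
    const second = nextLine(first.rest).?;
    try std.testing.expectEqualStrings("GET key", second.text);

    // A value containing '\n' gets cut in the wrong place.
    const broken = nextLine("SET key line1\nline2\n").?;
    try std.testing.expectEqualStrings("SET key line1", broken.text);
}
```

Run it with zig test. The last case is the escaping problem in action: the framer has no way to tell a delimiter from data.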

Length-prefixed - tell me how long each piece is:

$3\r\nSET\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n

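This is how Redis's RESP protocol encodes strings (a real Redis command also starts with an array header like *3\r\n that gives the element count). $7 means "7 bytes follow". No escaping, handles binary data. Less readable but way more robust.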
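To make the "$7 means 7 bytes follow" idea concrete, here's a rough sketch of parsing one length-prefixed string of that shape. It is not Redis's parser: parseBulk, Bulk, and both limits are names and values I picked for illustration, and it ignores the rest of RESP (arrays, null strings, strict digit checking).

```zig
const std = @import("std");

// Assumed limits for this sketch; pick values that suit your store.
const max_header_len: usize = 16;
const max_bulk_len: usize = 1024 * 1024;

/// A parsed string plus the unconsumed remainder of the buffer.
const Bulk = struct {
    data: []const u8,
    rest: []const u8,
};

/// Parses "$<len>\r\n<data>\r\n" from the front of `buf`.
/// Returns null when more bytes are needed.
fn parseBulk(buf: []const u8) !?Bulk {
    if (buf.len == 0) return null;
    if (buf[0] != '$') return error.NotABulkString;

    const header_end = std.mem.indexOf(u8, buf, "\r\n") orelse {
        // Don't wait forever for a length line that never ends.
        if (buf.len > max_header_len) return error.HeaderTooLong;
        return null;
    };
    const len = try std.fmt.parseInt(usize, buf[1..header_end], 10);
    if (len > max_bulk_len) return error.BulkTooLarge;

    const data_start = header_end + 2;
    const total = data_start + len + 2; // payload plus trailing "\r\n"
    if (buf.len < total) return null;
    if (!std.mem.eql(u8, buf[data_start + len .. total], "\r\n")) {
        return error.MissingTerminator;
    }
    return Bulk{ .data = buf[data_start .. data_start + len], .rest = buf[total..] };
}

test "length prefix means no escaping" {
    const simple = (try parseBulk("$7\r\nmyvalue\r\n")).?;
    try std.testing.expectEqualStrings("myvalue", simple.data);

    // "\r\n" inside the payload is just data, because the length says so.
    const binary = (try parseBulk("$4\r\na\r\nb\r\n")).?;
    try std.testing.expectEqualStrings("a\r\nb", binary.data);

    // Payload not fully received yet.
    try std.testing.expect((try parseBulk("$7\r\nmyva")) == null);
}
```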

Binary framing (the fast way)

First 4 bytes = message length, rest is payload:

[0x00, 0x00, 0x00, 0x0A, ...10 bytes of payload...]

Read 4 bytes, parse the length, then read exactly that many bytes.

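var len_buf: [4]u8 = undefined;
// Assuming socket is a std.net.Stream: readAll returns the byte count,
// which is short only if the peer closed the connection early.
if ((try socket.readAll(&len_buf)) < len_buf.len) return error.EndOfStream;
const msg_len = std.mem.readInt(u32, &len_buf, .big); // Network byte order

const buffer = try allocator.alloc(u8, msg_len);
errdefer allocator.free(buffer);
if ((try socket.readAll(buffer)) < buffer.len) return error.EndOfStream;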

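Efficient, and a 4-byte length covers payloads up to just under 4 GiB (which is exactly why you shouldn't trust that length blindly).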
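The sending side is the mirror image. A small sketch, assuming the caller supplies an output buffer; encodeFrame and header_len are names invented for this example.

```zig
const std = @import("std");

const header_len = 4;

/// Writes a 4-byte big-endian length followed by `payload` into `out`
/// and returns the part of `out` that holds the finished frame.
fn encodeFrame(out: []u8, payload: []const u8) error{ PayloadTooLarge, BufferTooSmall }![]u8 {
    if (payload.len > std.math.maxInt(u32)) return error.PayloadTooLarge;
    const total = header_len + payload.len;
    if (out.len < total) return error.BufferTooSmall;

    const len: u32 = @intCast(payload.len);
    std.mem.writeInt(u32, out[0..header_len], len, .big);
    @memcpy(out[header_len..total], payload);
    return out[0..total];
}

test "encode a 10-byte payload" {
    var buf: [32]u8 = undefined;
    const frame = try encodeFrame(&buf, "0123456789");
    // Same header bytes as the example above: length 10 in network order.
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x00, 0x00, 0x00, 0x0A }, frame[0..4]);
    try std.testing.expectEqualStrings("0123456789", frame[4..]);
}
```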

There's also Type-Length-Value (TLV) where each field has a type byte, length, then data. More overhead but self-describing. A bit overkill for me.
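For the curious, a TLV walk looks roughly like this. The layout here (1-byte type, 2-byte big-endian length, then the value) is just one assumption I picked for the sketch; real TLV formats vary, and nextField and Field are made-up names.

```zig
const std = @import("std");

const tlv_header_len = 3; // 1 type byte + 2 length bytes (assumed layout)

/// One decoded field plus the bytes that follow it.
const Field = struct {
    tag: u8,
    value: []const u8,
    rest: []const u8,
};

/// Returns the next field in `buf`, or null if the buffer does not yet
/// hold a complete field.
fn nextField(buf: []const u8) ?Field {
    if (buf.len < tlv_header_len) return null;
    // Widen to usize before doing arithmetic so the sum cannot overflow u16.
    const len: usize = std.mem.readInt(u16, buf[1..3], .big);
    const total = tlv_header_len + len;
    if (buf.len < total) return null;
    return Field{
        .tag = buf[0],
        .value = buf[tlv_header_len..total],
        .rest = buf[total..],
    };
}

test "walk two TLV fields" {
    const msg = [_]u8{ 0x01, 0x00, 0x03, 'k', 'e', 'y', 0x02, 0x00, 0x01, 'v' };
    const key = nextField(&msg).?;
    try std.testing.expect(key.tag == 0x01);
    try std.testing.expectEqualStrings("key", key.value);
    const val = nextField(key.rest).?;
    try std.testing.expect(val.tag == 0x02);
    try std.testing.expectEqualStrings("v", val.value);
    try std.testing.expect(nextField(val.rest) == null);
}
```

The self-describing part is the tag: a reader can skip fields it doesn't recognize because the length tells it how far to jump.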

Why this matters for security (or: how I learned attackers are creative)

Without proper framing, attackers can wreck you in creative ways:

Buffer overflow - Server keeps reading into a fixed buffer → overflow → crash or worse. This is 90s-era exploitation but people still get it wrong.

Memory exhaustion - claim the message is 4GB:

[LENGTH: 0xFFFFFFFF, ...]

Server goes "okay cool, let me allocate 4GB" → OOM → crash. Fun for the attacker, less fun for you at 2am.

Slowloris - send data 1 byte at a time, never complete the message:

Send: "S"
Wait 10 seconds
Send: "E"
Wait 10 seconds
...

Server keeps connection open waiting for the rest → connection pool exhausted → nobody else can connect. Denial of service with minimal bandwidth.

How you actually fix this

Length-prefixed framing with validation:

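// Assuming socket is a std.net.Stream: readAll returns the byte count,
// so a short count means the peer hung up mid-message.
var len_buf: [4]u8 = undefined;
if ((try socket.readAll(&len_buf)) < len_buf.len) return error.EndOfStream;
const msg_len = std.mem.readInt(u32, &len_buf, .big);

// Validate BEFORE allocating (important!)
if (msg_len > MAX_MESSAGE_SIZE) return error.MessageTooLarge;

// Safe to allocate - we know the size
const buffer = try allocator.alloc(u8, msg_len);
errdefer allocator.free(buffer);

// Read exactly msg_len bytes with timeout.
// readAllWithTimeout is a hypothetical helper (std has no such method);
// you'd build it from poll() or a socket receive timeout.
try socket.readAllWithTimeout(buffer, TIMEOUT);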

Now you can reject oversized messages, allocate exactly what you need, set timeouts for slow clients, and actually know when the message is complete. Revolutionary, I know.
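One catch with timeouts: a per-read timeout alone doesn't stop the one-byte-every-10-seconds trick, because each individual read succeeds just in time. What helps is a deadline for the whole message. Here's a sketch of that bookkeeping with timestamps passed in as plain milliseconds, so it doesn't depend on any particular clock or socket API. remainingMs and the budget value are my own inventions for the example.

```zig
const std = @import("std");

/// How long the current frame may still take, given when its first byte
/// arrived. Feed the result into whatever per-read timeout you use, and
/// drop the connection on error.FrameTimedOut.
fn remainingMs(started_ms: i64, now_ms: i64, budget_ms: i64) error{FrameTimedOut}!i64 {
    const elapsed = now_ms - started_ms;
    if (elapsed >= budget_ms) return error.FrameTimedOut;
    return budget_ms - elapsed;
}

test "slow client runs out of budget" {
    const budget_ms: i64 = 15_000; // assumed per-message budget
    // First byte at t=0, next byte 10 seconds later: still within budget.
    try std.testing.expect((try remainingMs(0, 10_000, budget_ms)) == 5_000);
    // Third byte at t=20s: the frame as a whole has taken too long.
    try std.testing.expectError(error.FrameTimedOut, remainingMs(0, 20_000, budget_ms));
}
```

The point is that the budget shrinks with every read instead of resetting, so trickling bytes no longer buys the attacker more time.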

My actual implementation (with mistakes)

Initially? Just read until \n. Worked fine for simple tests where I manually typed commands.

Then I realized three things while staring at my code at midnight:

1. What if a value contains \n? (spoiler: it will)
2. What if someone sends a 10GB message? (my laptop has 16GB RAM and I like using some of it)
3. How do I handle partial reads? (TCP doesn't give a shit about your message boundaries)

Switched to length-prefixed:

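// Read 4-byte length header.
// (Assuming socket is a std.net.Stream: readAll returns the byte count,
// so a short count means the peer hung up mid-message.)
var len_buf: [4]u8 = undefined;
if ((try socket.readAll(&len_buf)) < len_buf.len) return error.EndOfStream;

const msg_len = std.mem.readInt(u32, &len_buf, .big);

// Validate BEFORE allocating
if (msg_len > MAX_MSG_SIZE) return error.MessageTooLarge;

// Now safe to allocate and read
const buffer = try allocator.alloc(u8, msg_len);
errdefer allocator.free(buffer);
if ((try socket.readAll(buffer)) < buffer.len) return error.EndOfStream;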

Simple, safe, handles binary data. The kind of code I should've written first but didn't because I was too clever (read: lazy).
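If you'd rather not block in readAll (say, with non-blocking sockets), the same framing works over a buffer you keep appending to. This is a sketch of that variant, not what my store does: nextFrame, Frame, and the 1 MiB limit are assumptions for the example. It covers all three midnight questions: binary payloads, oversized lengths, and partial or pipelined reads.

```zig
const std = @import("std");

const header_len = 4;
const max_message_size: u32 = 1024 * 1024; // assumed limit for this sketch

/// One complete payload plus the unconsumed remainder of the buffer.
const Frame = struct {
    payload: []const u8,
    rest: []const u8,
};

/// Tries to pull one length-prefixed frame off the front of `buf`.
/// Returns null when more bytes are needed.
fn nextFrame(buf: []const u8) error{MessageTooLarge}!?Frame {
    if (buf.len < header_len) return null;
    const msg_len = std.mem.readInt(u32, buf[0..header_len], .big);
    // Reject before waiting for (or buffering) the payload.
    if (msg_len > max_message_size) return error.MessageTooLarge;
    const total = header_len + @as(usize, msg_len);
    if (buf.len < total) return null;
    return Frame{ .payload = buf[header_len..total], .rest = buf[total..] };
}

test "partial reads and pipelined frames" {
    const stream = [_]u8{ 0, 0, 0, 3, 'S', 'E', 'T', 0, 0, 0, 3, 'G', 'E', 'T' };

    // Only part of the first frame has arrived so far.
    try std.testing.expect((try nextFrame(stream[0..5])) == null);

    // Everything arrived: two frames come out, one after the other.
    const first = (try nextFrame(&stream)).?;
    try std.testing.expectEqualStrings("SET", first.payload);
    const second = (try nextFrame(first.rest)).?;
    try std.testing.expectEqualStrings("GET", second.payload);

    // A header claiming ~4 GiB is refused without allocating anything.
    try std.testing.expectError(error.MessageTooLarge, nextFrame(&[_]u8{ 0xFF, 0xFF, 0xFF, 0xFF }));
}
```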

Text vs binary (the eternal debate)

For my learning project, I'm using a text-based protocol with length-prefixed framing. Redis does this (RESP with length prefixes). HTTP does this (Content-Length headers). Both work fine.

Binary protocols like Protocol Buffers are more compact and faster to parse. But for a toy KV store handling thousands of ops/sec (not millions), the difference is negligible. And I can telnet in to debug, which is worth the performance hit (that I'll never actually hit).

Choose framing that validates message size before allocating, handles binary data if needed, has clear message boundaries, and lets you detect incomplete messages. Everything else is preference.

I first learned about framing (the hard way) while building hush, a P2P secret sharing tool where trusting arbitrary peers is... optimistic.