Exercise 1: Chunked response encoding on the server side
const std = @import("std");
/// Streams a response body as Transfer-Encoding: chunked. Instead of buffering
/// the whole body, we ask a generator for the next chunk until it returns null.
pub const ChunkedWriter = struct {
stream: std.net.Stream,
header_sent: bool = false,
pub fn init(stream: std.net.Stream) ChunkedWriter {
return .{ .stream = stream };
}
fn sendHeader(self: *ChunkedWriter, status: u16) !void {
var buf: [128]u8 = undefined;
const head = try std.fmt.bufPrint(&buf,
"HTTP/1.1 {d} OK\r\n" ++
"Transfer-Encoding: chunked\r\n" ++
"Content-Type: text/plain\r\n\r\n",
.{status},
);
try self.stream.writeAll(head); // writeAll loops on short writes; write() may send only part of the buffer
self.header_sent = true;
}
/// Write one chunk. Each chunk is: {hex size}\r\n{data}\r\n
pub fn writeChunk(self: *ChunkedWriter, data: []const u8) !void {
if (!self.header_sent) try self.sendHeader(200);
if (data.len == 0) return; // never emit a 0-size chunk here -- that ends the body
var size_buf: [16]u8 = undefined;
const size_line = try std.fmt.bufPrint(&size_buf, "{x}\r\n", .{data.len});
try self.stream.writeAll(size_line);
try self.stream.writeAll(data);
try self.stream.writeAll("\r\n");
}
/// Terminate the stream with the final zero-length chunk.
pub fn finish(self: *ChunkedWriter) !void {
if (!self.header_sent) try self.sendHeader(200);
try self.stream.writeAll("0\r\n\r\n");
}
};
/// Example: stream a body in 4 pieces of varying size.
pub fn streamExample(stream: std.net.Stream) !void {
var writer = ChunkedWriter.init(stream);
const pieces = [_][]const u8{ "Chunk one. ", "Two! ", "Three is a bit longer. ", "Done." };
for (pieces) |piece| {
try writer.writeChunk(piece);
}
try writer.finish();
}
The key insight is that the server never needs to know the total size upfront -- it commits to Transfer-Encoding: chunked in the header and then emits each chunk with its own hex length prefix. The finish call sends the terminating 0\r\n\r\n that our ChunkedDecoder from episode 84 is waiting for. A client built on that decoder will reassemble the four pieces into one body without any change.
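To make the wire format concrete, here is a minimal sketch that frames one chunk into a caller-provided buffer instead of a socket. encodeChunk is a hypothetical helper for illustration only; it is not part of the ChunkedWriter above.

```zig
const std = @import("std");

/// Hypothetical helper: frame `data` as one HTTP/1.1 chunk inside `out`.
/// Layout: {hex length}\r\n{data}\r\n
pub fn encodeChunk(out: []u8, data: []const u8) ![]u8 {
    return std.fmt.bufPrint(out, "{x}\r\n{s}\r\n", .{ data.len, data });
}

test "first example piece on the wire" {
    var buf: [64]u8 = undefined;
    // "Chunk one. " is 11 bytes, so the length line is hex "b".
    const wire = try encodeChunk(&buf, "Chunk one. ");
    try std.testing.expectEqualStrings("b\r\nChunk one. \r\n", wire);
}
```

Note that the length is hexadecimal: the 23-byte third piece is announced as 17, not 23.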
Exercise 2: Request timeout handling with SO_RCVTIMEO
const std = @import("std");
const posix = std.posix;
/// Apply a receive timeout to a socket. After `seconds`, a blocked recv call
/// returns error.WouldBlock instead of hanging forever -- this limits slow-read
/// (Slowloris-style) attacks where a client dribbles bytes to tie up a worker.
pub fn setRecvTimeout(fd: posix.socket_t, seconds: u32) !void {
const tv = posix.timeval{
.sec = @intCast(seconds),
.usec = 0,
};
try posix.setsockopt(
fd,
posix.SOL.SOCKET,
posix.SO.RCVTIMEO,
std.mem.asBytes(&tv),
);
}
/// Read a full request, giving up if the client stalls. Returns error.Timeout
/// when the socket read times out before the double CRLF arrives.
pub fn readWithTimeout(stream: std.net.Stream, buf: []u8, timeout_s: u32) ![]u8 {
try setRecvTimeout(stream.handle, timeout_s);
var total: usize = 0;
while (true) {
const n = stream.read(buf[total..]) catch |err| switch (err) {
error.WouldBlock => return error.Timeout, // SO_RCVTIMEO fired
else => return err,
};
if (n == 0) return error.ConnectionClosed;
total += n;
// Stop as soon as we have a complete header section.
if (std.mem.indexOf(u8, buf[0..total], "\r\n\r\n") != null) {
return buf[0..total];
}
if (total == buf.len) return error.RequestTooLarge;
}
}
The trick is that SO_RCVTIMEO turns a blocking read into a bounded one at the kernel level -- no extra thread, no manual clock. When the timeout elapses, the read surfaces as error.WouldBlock in Zig's posix layer, and we translate that into our own error.Timeout. Nota bene: the timeout restarts on every read call, so a client that sends one byte every 4 seconds against a 5 second timeout keeps the connection alive indefinitely. A production server adds a total deadline on top of the per-read timeout to close that gap.
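One way to add that total deadline is sketched below. Deadline and nextTimeout are hypothetical names, and the timestamps are passed in by the caller (for instance from std.time.milliTimestamp()) so the logic can be tested without sleeping.

```zig
const std = @import("std");

/// Hypothetical helper: an overall time budget for reading one request.
pub const Deadline = struct {
    expires_at_ms: i64,

    pub fn init(now_ms: i64, budget_ms: i64) Deadline {
        return .{ .expires_at_ms = now_ms + budget_ms };
    }

    /// Seconds to use for the next per-read timeout: whatever is left of the
    /// budget, capped by `per_read_s`. Rounds up so 200 ms left does not become
    /// a 0 second timeout, which SO_RCVTIMEO treats as "no timeout".
    pub fn nextTimeout(self: Deadline, now_ms: i64, per_read_s: u32) !u32 {
        const left_ms = self.expires_at_ms - now_ms;
        if (left_ms <= 0) return error.Timeout;
        const left_s: i64 = @divTrunc(left_ms + 999, 1000);
        const capped: u32 = @intCast(@min(left_s, @as(i64, per_read_s)));
        return capped;
    }
};
```

Inside the read loop you would call nextTimeout before each read and pass the result to setRecvTimeout. With a 10 second budget, the one-byte-every-4-seconds client is cut off after at most 10 seconds, no matter how many reads succeed.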
Exercise 3: A simple HTTP/1.1 forward proxy
const std = @import("std");
const http = @import("http.zig"); // parseRequest + HttpRequest from episode 84
pub fn runProxy(listen_port: u16) !void {
const addr = try std.net.Address.parseIp4("127.0.0.1", listen_port);
var server = try addr.listen(.{ .reuse_address = true });
defer server.deinit();
var buf: [16384]u8 = undefined;
while (true) {
const conn = server.accept() catch continue;
defer conn.stream.close();
const n = conn.stream.read(&buf) catch continue;
const req = http.parseRequest(buf[0..n], 1024 * 1024) catch continue;
// The Host header tells us where to forward.
const host = req.getHeader("Host") orelse continue;
// An unreachable or unresolvable upstream must not take the whole proxy down.
const target = std.net.tcpConnectToHost(std.heap.page_allocator, host, 80) catch continue;
defer target.close();
// Relay the original request bytes verbatim.
_ = target.write(buf[0..n]) catch continue;
// Pump the upstream response back to the client until EOF.
var relay: [16384]u8 = undefined;
while (true) {
const got = target.read(&relay) catch break;
if (got == 0) break;
_ = conn.stream.write(relay[0..got]) catch break;
}
}
}
A forward proxy is conceptually simple: read the client's request, look at the Host header to decide where it goes, open a connection to that host, forward the request, and relay the response back. The version above relays raw bytes, which handles both Content-Length and chunked responses because it never interprets the framing; it just copies until the upstream closes the connection. The flip side is that an upstream which keeps the connection alive holds the relay loop (and, in this single-threaded version, the whole proxy) until it finally closes. A hardened proxy would parse the response framing so it knows where the response ends and can reuse the upstream connection (keep-alive) instead of closing after each request -- but that's a refinement, not a requirement for curl --proxy http://127.0.0.1:8080 http://example.com to work.
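One detail the proxy above glosses over: a Host header may carry a port (localhost:8080), and the code always connects to port 80. A minimal sketch of splitting the two follows; splitHostPort and HostPort are hypothetical names, and bracketed IPv6 literals are deliberately not handled.

```zig
const std = @import("std");

pub const HostPort = struct { host: []const u8, port: u16 };

/// Hypothetical helper: split a Host header value into name and port.
/// Assumption: no bracketed IPv6 literals such as "[::1]:8080".
pub fn splitHostPort(value: []const u8, default_port: u16) !HostPort {
    const colon = std.mem.lastIndexOfScalar(u8, value, ':') orelse
        return .{ .host = value, .port = default_port };
    const port = try std.fmt.parseInt(u16, value[colon + 1 ..], 10);
    return .{ .host = value[0..colon], .port = port };
}
```

The result would feed straight into tcpConnectToHost in place of the hard-coded 80.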
At the end of episode 84 I said understanding HTTP/1.1 at the byte level would make HTTP/2 easier, and we'd look at the framing layer soon. Well -- here we are ;-)
HTTP/2 looks intimidating because it is binary instead of text, but the semantics are exactly the same as the protocol we just dissected. Same methods, same headers, same status codes. What changed is how the bytes are laid out on the wire, and that change exists to solve one very specific, very annoying problem.
Remember pipelining from last episode? In theory HTTP/1.1 lets a client fire off request #2 before #1's response arrives. In practice almost nobody uses it, because of head-of-line blocking: responses must come back in the exact order the requests went out. If request #1 is a slow database query and #2 is a tiny static file, #2 still has to wait behind #1. One slow response stalls everything queued behind it.
Browsers worked around this by opening 6 parallel TCP connections per host. That "works", but each connection pays its own handshake cost, its own TCP slow-start ramp, and competes for bandwidth with the others. It's a hack on top of a limitation.
HTTP/2 fixes this properly with multiplexing: many independent streams share one TCP connection, and their data is interleaved frame by frame. Stream #1's slow response no longer blocks stream #3's fast one, because their frames are independent units that can be sent in any order. To make interleaving possible, the protocol had to stop being a text stream and become a sequence of self-describing binary frames. That's the whole idea.
Every HTTP/2 frame begins with a fixed 9-byte header, followed by a variable-length payload:
+-----------------------------------------------+
|                Length (24 bits)               |
+---------------+---------------+---------------+
|   Type (8)    |   Flags (8)   |
+-+-------------+---------------+-------------------------------+
|R|                 Stream Identifier (31 bits)                 |
+=+=============================================================+
|                 Frame Payload (length bytes)                  |
+---------------------------------------------------------------+
That's it. A 24-bit length (so a single frame payload maxes out at 16 MB, though the default limit is 16 KB), an 8-bit type, 8 bits of flags, one reserved bit, and a 31-bit stream identifier. Compared to scanning for \r\n delimiters in HTTP/1.1, this is a joy to parse -- fixed offsets, no ambiguity, no searching. Let's model it in Zig:
const std = @import("std");
pub const FrameType = enum(u8) {
data = 0x0,
headers = 0x1,
priority = 0x2,
rst_stream = 0x3,
settings = 0x4,
push_promise = 0x5,
ping = 0x6,
goaway = 0x7,
window_update = 0x8,
continuation = 0x9,
_, // unknown types MUST be ignored per the spec -- the open enum allows that
};
pub const FrameHeader = struct {
length: u24,
frame_type: FrameType,
flags: u8,
stream_id: u31,
pub const SIZE = 9;
pub fn parse(buf: []const u8) !FrameHeader {
if (buf.len < SIZE) return error.Incomplete;
const raw_id = std.mem.readInt(u32, buf[5..9], .big);
return .{
.length = std.mem.readInt(u24, buf[0..3], .big),
.frame_type = @enumFromInt(buf[3]),
.flags = buf[4],
// The top bit is reserved and MUST be ignored on receipt -- mask it off.
.stream_id = @intCast(raw_id & 0x7FFF_FFFF),
};
}
pub fn encode(self: FrameHeader, out: []u8) void {
std.mem.writeInt(u24, out[0..3], self.length, .big);
out[3] = @intFromEnum(self.frame_type);
out[4] = self.flags;
std.mem.writeInt(u32, out[5..9], @as(u32, self.stream_id), .big);
}
};
Two things make this clean in Zig. First, u24 and u31 are real types -- I can say exactly what the wire format means without manual bit-shifting into a u32. Second, the non-exhaustive enum (that lonely _, at the end of FrameType) is doing real work: the spec says an endpoint that receives an unknown frame type must ignore it, not crash. The open enum lets @enumFromInt accept any byte without UB, and our handler can match the known variants and discard the rest. With a closed enum, @enumFromInt on an unrecognized value trips a safety check (a panic) in safe builds and is undefined behaviour in ReleaseFast.
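Here is a small sketch of what "match the known variants and discard the rest" looks like at a call site. MiniFrameType is a trimmed, hypothetical copy of FrameType with only three known variants, and describe is an illustrative name.

```zig
const std = @import("std");

// Trimmed stand-in for FrameType: three known variants plus the open `_`.
const MiniFrameType = enum(u8) {
    data = 0x0,
    headers = 0x1,
    settings = 0x4,
    _,
};

fn describe(type_byte: u8) []const u8 {
    // Safe for any byte because the enum is non-exhaustive.
    const t: MiniFrameType = @enumFromInt(type_byte);
    return switch (t) {
        .data => "DATA",
        .headers => "HEADERS",
        .settings => "SETTINGS",
        // The `_` prong catches every unnamed value. Unlike `else`, it still
        // makes the compiler insist that all named variants are listed.
        _ => "unknown (ignored)",
    };
}
```

If a new named variant is added to the enum later, this switch stops compiling until it is handled, which is exactly the reminder you want.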
Flags are a bitfield whose meaning depends on the frame type. The same 0x1 bit means END_STREAM on a DATA frame but ACK on a SETTINGS frame. A small set of helpers keeps the call sites readable:
pub const Flags = struct {
pub const END_STREAM: u8 = 0x1;
pub const ACK: u8 = 0x1; // same bit, different frame types (SETTINGS, PING)
pub const END_HEADERS: u8 = 0x4;
pub const PADDED: u8 = 0x8;
pub const PRIORITY: u8 = 0x20;
pub fn has(flags: u8, flag: u8) bool {
return (flags & flag) != 0;
}
};
// Every HTTP/2 connection opens with this exact 24-byte client preface,
// sent before any frames. It is intentionally invalid HTTP/1.x so a 1.1-only
// server rejects it instead of misinterpreting it.
pub const CLIENT_PREFACE = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";
That preface string is one of my favorite bits of protocol design. It spells "PRISM" if you squint, and it's deliberately crafted to look like a broken HTTP/1.1 request so an old server bails out cleanly rather than doing something dangerous with it. After the preface, both sides immediately exchange SETTINGS frames to announce their limits.
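On the server side, the preface is the first thing to check, and like everything else on TCP it may arrive in pieces. A minimal sketch, with checkPreface and PrefaceResult as hypothetical names:

```zig
const std = @import("std");

pub const CLIENT_PREFACE = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

pub const PrefaceResult = enum { match, need_more, mismatch };

/// Compare whatever has arrived so far against the preface.
/// A wrong byte anywhere is a mismatch right away; a correct but short
/// prefix means "read more and ask again".
pub fn checkPreface(received: []const u8) PrefaceResult {
    const n: usize = @min(received.len, CLIENT_PREFACE.len);
    if (!std.mem.eql(u8, received[0..n], CLIENT_PREFACE[0..n])) return .mismatch;
    return if (n == CLIENT_PREFACE.len) .match else .need_more;
}
```

Bytes after the first 24 are left alone here; they already belong to the client's first SETTINGS frame.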
SETTINGS is the simplest real frame to parse -- its payload is just a list of 6-byte entries, each a 16-bit identifier plus a 32-bit value. It's where each side advertises things like its max concurrent streams and its initial flow-control window:
pub const Setting = enum(u16) {
header_table_size = 0x1,
enable_push = 0x2,
max_concurrent_streams = 0x3,
initial_window_size = 0x4,
max_frame_size = 0x5,
max_header_list_size = 0x6,
_,
};
pub const Settings = struct {
max_concurrent_streams: u32 = 100,
initial_window_size: u32 = 65535, // the protocol default
max_frame_size: u32 = 16384,
pub fn applyFrame(self: *Settings, payload: []const u8) !void {
if (payload.len % 6 != 0) return error.FrameSizeError;
var i: usize = 0;
while (i < payload.len) : (i += 6) {
const id: Setting = @enumFromInt(std.mem.readInt(u16, payload[i..][0..2], .big));
const value = std.mem.readInt(u32, payload[i + 2 ..][0..4], .big);
switch (id) {
.max_concurrent_streams => self.max_concurrent_streams = value,
.initial_window_size => self.initial_window_size = value,
.max_frame_size => self.max_frame_size = value,
else => {}, // unknown settings are ignored, same philosophy as frame types
}
}
}
};
Notice the payload.len % 6 != 0 check returning error.FrameSizeError. The name mirrors the spec's FRAME_SIZE_ERROR code -- a SETTINGS frame whose length isn't a multiple of 6 is a connection error. Validating the size before reading protects us from a malformed peer trying to walk us off the end of the buffer. This is the same defensive parsing mindset we used for the HTTP/1.1 Incomplete case last episode.
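For tests, and for sending our own SETTINGS, the opposite direction is handy too. A minimal sketch of writing one 6-byte entry; encodeSetting is a hypothetical helper that takes the raw 16-bit identifier.

```zig
const std = @import("std");

/// Hypothetical helper: write one SETTINGS entry (16-bit id, 32-bit value),
/// both big-endian, into a 6-byte slot.
pub fn encodeSetting(out: *[6]u8, id: u16, value: u32) void {
    std.mem.writeInt(u16, out[0..2], id, .big);
    std.mem.writeInt(u32, out[2..6], value, .big);
}

test "initial_window_size entry on the wire" {
    var entry: [6]u8 = undefined;
    encodeSetting(&entry, 0x4, 65535);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x00, 0x04, 0x00, 0x00, 0xFF, 0xFF }, &entry);
}
```

A payload built from such entries is always a multiple of 6 bytes long, so it passes the length check in applyFrame by construction.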
TCP gives us a byte stream, so just like in episode 84 we can't assume a whole frame arrives in one read. We need a reader that buffers bytes and yields complete frames as they become available:
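pub const Frame = struct {
    header: FrameHeader,
    payload: []const u8,
};

pub const FrameReader = struct {
    buf: [65536]u8 = undefined,
    len: usize = 0,
    // Size of the frame handed out by the last next() call. Its bytes stay
    // in place until the following feed()/next(), so the payload slice we
    // returned is not overwritten while the caller is still using it.
    consumed: usize = 0,

    /// Append freshly read socket bytes into the internal buffer.
    pub fn feed(self: *FrameReader, data: []const u8) !void {
        self.compact();
        if (self.len + data.len > self.buf.len) return error.BufferFull;
        @memcpy(self.buf[self.len..][0..data.len], data);
        self.len += data.len;
    }

    /// Try to pull one complete frame. Returns null when more bytes are needed.
    /// The payload points into the internal buffer and is only valid until
    /// the next call to feed() or next().
    pub fn next(self: *FrameReader) !?Frame {
        self.compact();
        if (self.len < FrameHeader.SIZE) return null;
        const header = try FrameHeader.parse(self.buf[0..self.len]);
        const total = FrameHeader.SIZE + @as(usize, header.length);
        // A frame that can never fit would otherwise stall the reader forever.
        if (total > self.buf.len) return error.FrameSizeError;
        if (self.len < total) return null; // payload not fully arrived yet
        self.consumed = total;
        return Frame{ .header = header, .payload = self.buf[FrameHeader.SIZE..total] };
    }

    /// Drop the previously returned frame and shift leftover bytes to the front.
    fn compact(self: *FrameReader) void {
        if (self.consumed == 0) return;
        const remaining = self.len - self.consumed;
        std.mem.copyForwards(u8, self.buf[0..remaining], self.buf[self.consumed..self.len]);
        self.len = remaining;
        self.consumed = 0;
    }
};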
The pattern is identical in spirit to the two-loop "try to parse, read more if incomplete" approach from the HTTP/1.1 connection handler -- but it's so much cleaner here because the length is right there in the header. No searching, no Content-Length lookup, no chunked state machine. We read 9 bytes, learn exactly how many more we need, and either have a complete frame or we don't. Binary protocols give up human-readability and get this kind of parsing simplicity in return.
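The length-prefix walk is small enough to show on its own. The sketch below counts how many complete frames sit in a byte buffer by looking only at the 24-bit length field; countCompleteFrames is a hypothetical helper, independent of FrameReader.

```zig
const std = @import("std");

/// Hypothetical helper: count the complete frames at the start of `buf`.
/// Only the 24-bit length in each 9-byte header is inspected.
pub fn countCompleteFrames(buf: []const u8) usize {
    var offset: usize = 0;
    var count: usize = 0;
    while (buf.len - offset >= 9) {
        const length = std.mem.readInt(u24, buf[offset..][0..3], .big);
        const total = 9 + @as(usize, length);
        if (buf.len - offset < total) break; // payload still in flight
        offset += total;
        count += 1;
    }
    return count;
}
```

Every step is a fixed-offset read followed by a jump; there is nothing to scan for.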
A stream is an independent, bidirectional sequence of frames within the connection, identified by that 31-bit stream id. Client-initiated streams use odd ids (1, 3, 5, ...), server-initiated (push) streams use even ids. Stream id 0 is reserved for connection-level frames like SETTINGS and PING.
Each stream marches through a well-defined lifecycle, and Zig's enums plus a switch make the state machine almost self-documenting -- this is exactly the tagged-union state-machine pattern from episode 33, applied to a real protocol:
pub const StreamState = enum {
idle,
open,
half_closed_local,
half_closed_remote,
closed,
};
pub const Stream = struct {
id: u31,
state: StreamState = .idle,
send_window: i32 = 65535,
recv_window: i32 = 65535,
/// Advance the state machine when we receive a frame on this stream.
pub fn onRecv(self: *Stream, frame_type: FrameType, end_stream: bool) !void {
switch (self.state) {
.idle => {
if (frame_type != .headers) return error.ProtocolError;
self.state = if (end_stream) .half_closed_remote else .open;
},
.open => {
if (end_stream) self.state = .half_closed_remote;
},
.half_closed_local => {
if (end_stream) self.state = .closed;
},
.half_closed_remote => return error.StreamClosed, // peer already finished
.closed => return error.StreamClosed,
}
}
};
A stream opens when its first HEADERS frame arrives. END_STREAM on a frame moves the sender's half to closed -- "half-closed" means one direction is done but the other can still send. When both halves close, the stream is finished and its id is never reused. Modelling this with an explicit enum means an illegal transition (say, DATA arriving on an idle stream) is caught immediately as a ProtocolError rather than corrupting state silently. That's the payoff of making invalid states unrepresentable, which is a theme we keep coming back to in this series.
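onRecv covers only the receiving direction. The sending direction mirrors it, and a sketch of that mirror is below. afterSend is a hypothetical function, StreamState is repeated so the snippet stands alone, and the snippet assumes the first frame we send on an idle stream is HEADERS.

```zig
const std = @import("std");

const StreamState = enum { idle, open, half_closed_local, half_closed_remote, closed };

/// Hypothetical counterpart to onRecv: the state after *we* send a HEADERS or
/// DATA frame. END_STREAM on a frame we send closes our (local) half.
fn afterSend(state: StreamState, end_stream: bool) error{StreamClosed}!StreamState {
    switch (state) {
        .idle, .open => return if (end_stream) StreamState.half_closed_local else StreamState.open,
        .half_closed_remote => return if (end_stream) StreamState.closed else StreamState.half_closed_remote,
        // Our half is already closed, so sending more is an error.
        .half_closed_local, .closed => return error.StreamClosed,
    }
}
```

A typical request/response runs through both directions: the client's HEADERS with END_STREAM puts the server's stream in half_closed_remote, and the server's final DATA with END_STREAM takes it to closed.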
The connection owns a table of streams. When a frame arrives, we look up (or create) its stream by id and dispatch. Because each stream tracks its own state and window independently, frames for different streams can interleave freely -- which is the entire point of HTTP/2:
pub const Connection = struct {
streams: std.AutoHashMap(u31, Stream),
settings: Settings,
next_server_id: u31 = 2,
pub fn init(allocator: std.mem.Allocator) Connection {
return .{
.streams = std.AutoHashMap(u31, Stream).init(allocator),
.settings = .{},
};
}
pub fn deinit(self: *Connection) void {
self.streams.deinit();
}
pub fn handleFrame(self: *Connection, frame: Frame) !void {
// Connection-level frames live on stream 0.
if (frame.header.stream_id == 0) {
switch (frame.header.frame_type) {
.settings => if (!Flags.has(frame.header.flags, Flags.ACK))
try self.settings.applyFrame(frame.payload),
.ping, .goaway, .window_update => {}, // handled elsewhere
else => return error.ProtocolError,
}
return;
}
const gop = try self.streams.getOrPut(frame.header.stream_id);
if (!gop.found_existing) {
gop.value_ptr.* = .{ .id = frame.header.stream_id };
}
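// END_STREAM is only defined for DATA and HEADERS; on other frame types bit 0x1 is undefined and must be ignored.
const carries_end_stream = frame.header.frame_type == .data or frame.header.frame_type == .headers;
const end_stream = carries_end_stream and Flags.has(frame.header.flags, Flags.END_STREAM);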
try gop.value_ptr.onRecv(frame.header.frame_type, end_stream);
}
};
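getOrPut (which we first met back in episode 22) is perfect here -- one hash lookup either finds the existing stream or gives us a slot to initialise a new one, no double lookup. A real server caps the number of concurrent open streams at max_concurrent_streams and rejects new streams beyond that with a RST_STREAM (error code REFUSED_STREAM), otherwise a malicious client could open millions of streams and exhaust memory. The cap alone is not enough, by the way: the "HTTP/2 Rapid Reset" attack from 2023 (CVE-2023-44487) opened streams and cancelled them immediately with RST_STREAM, so they never counted against the limit while the server still did the work for each one. Servers now also rate-limit resets.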
Multiplexing creates a new problem: if ten streams all dump data onto one connection, a slow client can be overwhelmed. HTTP/2 solves this with flow control windows. Every stream (and the connection as a whole) has a window of credit; sending DATA consumes credit, and the receiver replenishes it by sending WINDOW_UPDATE frames as it processes the data:
pub fn consumeSendWindow(stream: *Stream, conn_window: *i32, n: u32) !void {
const amount: i32 = @intCast(n);
if (stream.send_window < amount or conn_window.* < amount) {
return error.FlowControlBlocked; // must wait for a WINDOW_UPDATE
}
stream.send_window -= amount;
conn_window.* -= amount; // DATA counts against BOTH the stream and the connection
}
pub fn applyWindowUpdate(window: *i32, increment: u32) !void {
if (increment == 0) return error.ProtocolError; // a 0 increment is illegal
const new_val = @as(i64, window.*) + @as(i64, increment);
if (new_val > 0x7FFF_FFFF) return error.FlowControlError; // window overflow
window.* = @intCast(new_val);
}
The subtle part is that DATA frames count against two windows simultaneously: the per-stream window and the connection-wide window. A sender can only transmit min(stream_window, connection_window) bytes before it must stop and wait. Using i32 for the window (not u32) is deliberate -- a SETTINGS change to initial_window_size can retroactively push an active window negative, and the spec explicitly allows that. The signed type lets us represent it honestly instead of wrapping around.
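The negative-window case is easier to believe with numbers. The sketch below applies a change of initial_window_size to one stream's send window; adjustForNewInitialSize is a hypothetical helper, and it assumes both initial sizes were already validated to be at most 2^31-1.

```zig
const std = @import("std");

/// Hypothetical helper: when the peer changes SETTINGS_INITIAL_WINDOW_SIZE,
/// every open stream's send window moves by (new - old). The result may be
/// negative; the sender then waits for WINDOW_UPDATEs to bring it back above 0.
pub fn adjustForNewInitialSize(window: *i32, old_initial: u32, new_initial: u32) !void {
    const delta = @as(i64, new_initial) - @as(i64, old_initial);
    const new_val = @as(i64, window.*) + delta;
    if (new_val > 0x7FFF_FFFF or new_val < std.math.minInt(i32)) return error.FlowControlError;
    window.* = @intCast(new_val);
}

test "shrinking the initial size can push a window negative" {
    // We already sent 60000 of the default 65535 bytes: 5535 left.
    var window: i32 = 65535 - 60000;
    // The peer lowers the initial window from 65535 to 16384.
    try adjustForNewInitialSize(&window, 65535, 16384);
    try std.testing.expectEqual(@as(i32, -43616), window);
}
```

With a u32 window, that subtraction would have had nowhere to go.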
Frame parsers are exactly the kind of code that benefits from tests against hand-built byte sequences. As always in this series (since episode 12), the tests live right next to the code:
test "round-trip a frame header" {
const original = FrameHeader{
.length = 1024,
.frame_type = .data,
.flags = Flags.END_STREAM,
.stream_id = 5,
};
var buf: [9]u8 = undefined;
original.encode(&buf);
const parsed = try FrameHeader.parse(&buf);
try std.testing.expectEqual(original.length, parsed.length);
try std.testing.expectEqual(original.frame_type, parsed.frame_type);
try std.testing.expectEqual(original.stream_id, parsed.stream_id);
}
test "reserved bit in stream id is masked off" {
// Top bit set: 0x80000001 -- the reserved bit must be ignored, leaving id = 1.
const buf = [_]u8{ 0, 0, 0, 0x1, 0, 0x80, 0, 0, 0x01 };
const parsed = try FrameHeader.parse(&buf);
try std.testing.expectEqual(@as(u31, 1), parsed.stream_id);
}
test "frame reader yields complete frames only" {
var reader = FrameReader{};
// A 4-byte DATA frame on stream 1, fed in two halves.
const full = [_]u8{ 0, 0, 4, 0, 0, 0, 0, 0, 1, 'a', 'b', 'c', 'd' };
try reader.feed(full[0..7]);
try std.testing.expect((try reader.next()) == null); // incomplete
try reader.feed(full[7..]);
const frame = (try reader.next()).?;
try std.testing.expectEqualStrings("abcd", frame.payload);
}
test "settings frame rejects bad length" {
var s = Settings{};
const bad = [_]u8{ 0, 0x3, 0, 0, 0 }; // 5 bytes, not a multiple of 6
try std.testing.expectError(error.FrameSizeError, s.applyFrame(&bad));
}
test "stream rejects data before headers" {
var stream = Stream{ .id = 1 };
try std.testing.expectError(error.ProtocolError, stream.onRecv(.data, false));
}
test "window update overflow is caught" {
var window: i32 = 0x7FFF_FFFF;
try std.testing.expectError(error.FlowControlError, applyWindowUpdate(&window, 1));
}
The "fed in two halves" test is the important one -- it proves the reader handles TCP segmentation correctly, which is the bug that bites everyone who assumes one read equals one message. The reserved-bit test guards a classic interop failure: some buggy clients set that top bit, and a parser that doesn't mask it computes a wildly wrong stream id.
In C, you'd reach for nghttp2, the battle-tested reference library. It's excellent but it's a callback-driven C API with manual memory management for every header and frame -- and HTTP/2's HPACK header compression (a topic in its own right) is where most of the hand-rolled C bugs live. Our u24/u31 types and length-prefixed parsing would be manual bit-twiddling and memcpy calls there, with no compiler help if you get an offset wrong.
In Go, net/http speaks HTTP/2 transparently -- you basically get it for free, which is fantastic for productivity but hides the framing entirely. The goroutine-per-stream model maps beautifully onto multiplexing, at the cost of a heavier runtime and garbage collector.
In Rust, h2 plus hyper give you a memory-safe, async implementation with an ergonomic stream API, conceptually close to what we built but production-grade. Rust's enums and pattern matching express the stream state machine just as naturally as Zig's do.
Where Zig sits is its usual spot: you write the framing yourself, with no hidden allocations and no runtime, but the language gives you exact-width integers, non-exhaustive enums for forward-compatible frame handling, and switch exhaustiveness so a forgotten state is a compile error. For embedded or proxy work where you need full control over buffers and latency, that combination is genuinely hard to beat. Having said that -- for a typical web app, you'd use one of the aforementioned libraries and never touch a frame ;-)
We now have the framing layer: frames, streams, multiplexing, flow control. What we have not touched is HPACK, the header compression scheme HTTP/2 uses to avoid re-sending the same Host and User-Agent headers on every request. That, and securing all of this with TLS (you rarely see plaintext HTTP/2 in the wild), are natural next steps that build directly on the C interop and binary-parsing muscles we've been training. The frame reader you wrote today is the foundation everything else sits on.
Implement a PING frame handler. A PING frame has an 8-byte opaque payload and stream id 0. When you receive one without the ACK flag, you must echo the exact same payload back with the ACK flag set. Write the encoder, the handler, and a test that round-trips a ping and verifies the ACK bit is set on the reply.
Add RST_STREAM support to the Stream state machine. An RST_STREAM frame carries a 4-byte error code and immediately moves the stream to closed from any state. Add an onReset method, make sure a frame arriving on a reset stream returns error.StreamClosed, and test that a stream in open can be reset to closed.
Build a frame logger -- a small tool that reads a captured HTTP/2 byte stream from a file (you can generate one with curl --http2-prior-knowledge -v against a local server) and prints each frame as type stream_id length flags. Use your FrameReader to walk the stream and skip unknown frame types gracefully without crashing.