Learn Zig Series (#84) - HTTP/1.1 Deep Dive

What will I learn

How HTTP/1.1 works at the byte level -- request lines, headers, and message bodies;
How to build a compliant HTTP/1.1 parser that handles chunked transfer encoding;
How persistent connections (keep-alive) work and why they matter for performance;
How to implement pipelined request handling on a single TCP connection;
How content negotiation and conditional requests reduce bandwidth;
How to test HTTP parsing against edge cases and malformed input.

Requirements

A working modern computer running macOS, Windows or Ubuntu;
An installed Zig 0.14+ distribution (download from ziglang.org);
The ambition to learn Zig programming.

Difficulty

Intermediate

Curriculum (of the `Learn Zig Series`):

Learn Zig Series (#84) - HTTP/1.1 Deep Dive

Solutions to Episode 83 Exercises

Exercise 1: Statistics endpoint via special TXT query

const std = @import("std");
const dns = @import("dns_server.zig");

// Add to DnsServer struct:
pub const DnsServer = struct {
    sock: std.posix.socket_t,
    zone: dns.Zone,
    queries_served: u64,
    start_time: i64,
    bind_port: u16,

    pub fn init(port: u16) !DnsServer {
        const sock = try std.posix.socket(std.posix.AF.INET, std.posix.SOCK.DGRAM, 0);
        errdefer std.posix.close(sock);
        try std.posix.setsockopt(sock, std.posix.SOL.SOCKET, std.posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
        const addr = std.net.Address.initIp4(.{ 0, 0, 0, 0 }, port);
        try std.posix.bind(sock, &addr.any, addr.getOsSockLen());

        return .{
            .sock = sock,
            .zone = dns.Zone.init(),
            .queries_served = 0,
            .start_time = std.time.timestamp(),
            .bind_port = port,
        };
    }

    fn handleStatsQuery(self: *const DnsServer, query: []const u8, out: []u8) !usize {
        if (query.len < 12) return error.QueryTooShort;
        var hdr: dns.DnsHeader = undefined;
        @memcpy(std.mem.asBytes(&hdr), query[0..12]);
        hdr = hdr.fromNetworkOrder();

        var stats_buf: [256]u8 = undefined;
        const stats_str = std.fmt.bufPrint(&stats_buf, "queries={d} uptime={d}s records={d}", .{
            self.queries_served,
            std.time.timestamp() - self.start_time,
            self.zone.count,
        }) catch "stats_error";

        // Build response header
        const resp_hdr = (dns.DnsHeader{
            .id = hdr.id,
            .flags = 0x8580, // QR=1 AA=1 RD=1 RA=1
            .qdcount = 1,
            .ancount = 1,
            .nscount = 0,
            .arcount = 0,
        }).toNetworkOrder();

        var pos: usize = 0;
        @memcpy(out[0..12], std.mem.asBytes(&resp_hdr));
        pos = 12;

        // Echo question section from query
        var qend: usize = 12;
        while (qend < query.len and query[qend] != 0) qend += 1 + @as(usize, query[qend]);
        qend += 1 + 4; // null byte + qtype + qclass
        if (qend > query.len) return error.QueryTooShort; // truncated or malformed question
        @memcpy(out[pos..][0 .. qend - 12], query[12..qend]);
        pos += qend - 12;

        // Answer: pointer to question name, type TXT
        out[pos] = 0xC0;
        out[pos + 1] = 12;
        pos += 2;
        std.mem.writeInt(u16, out[pos..][0..2], 16, .big); // TXT
        std.mem.writeInt(u16, out[pos + 2 ..][0..2], 1, .big); // IN
        std.mem.writeInt(u32, out[pos + 4 ..][0..4], 0, .big); // TTL=0
        pos += 8;
        const rdlen: u16 = @intCast(1 + stats_str.len);
        std.mem.writeInt(u16, out[pos..][0..2], rdlen, .big);
        pos += 2;
        out[pos] = @intCast(stats_str.len);
        pos += 1;
        @memcpy(out[pos..][0..stats_str.len], stats_str);
        pos += stats_str.len;

        return pos;
    }
};

The trick is intercepting the query name before normal zone lookup. When the name matches stats.server.local and the type is TXT, we bypass the zone entirely and build the response from live counters. TTL=0 means the stats are never cached -- each query gets fresh numbers.
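The interception itself is not shown in the solution above, so here is a minimal sketch of how the name check could look. The helper names `qnameMatches` and `isStatsQuery` are hypothetical, and the sketch assumes a single question whose name is stored without compression pointers (which is what clients normally send).

```zig
const std = @import("std");

/// Compare the QNAME that starts at offset 12 of a raw DNS query against a
/// dotted name such as "stats.server.local" (case-insensitive, no compression).
fn qnameMatches(query: []const u8, dotted: []const u8) bool {
    var pos: usize = 12; // the question section starts right after the 12-byte header
    var labels = std.mem.splitScalar(u8, dotted, '.');
    while (labels.next()) |label| {
        if (pos >= query.len) return false;
        const len: usize = query[pos];
        if (len != label.len) return false;
        if (pos + 1 + len > query.len) return false;
        if (!std.ascii.eqlIgnoreCase(query[pos + 1 ..][0..len], label)) return false;
        pos += 1 + len;
    }
    // The name must end here with the zero-length root label
    return pos < query.len and query[pos] == 0;
}

/// True when the question is a TXT query (QTYPE 16) for the stats name.
fn isStatsQuery(query: []const u8) bool {
    const stats_name = "stats.server.local";
    if (!qnameMatches(query, stats_name)) return false;
    // Wire length of the name: one length byte per label, the label bytes,
    // and the root byte, which adds up to stats_name.len + 2
    const qtype_pos = 12 + stats_name.len + 2;
    if (qtype_pos + 2 > query.len) return false;
    return std.mem.readInt(u16, query[qtype_pos..][0..2], .big) == 16;
}

pub fn main() void {
    const header = [_]u8{0} ** 12;
    const query = header ++ "\x05stats\x06server\x05local\x00\x00\x10\x00\x01".*;
    std.debug.print("stats query: {}\n", .{isStatsQuery(&query)});
}
```

In the serve loop, a check like this would run before the normal zone lookup and route matching queries to handleStatsQuery.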

Exercise 2: Zone reload on SIGUSR1

const std = @import("std");
const posix = std.posix;
const dns = @import("dns_server.zig");

var reload_flag: std.atomic.Value(bool) = std.atomic.Value(bool).init(false);

fn sigusr1Handler(_: c_int) callconv(.C) void {
    reload_flag.store(true, .release);
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const zone_path = "zones.txt";

    var server = try dns.DnsServer.init(5353);
    defer server.deinit();

    // Load initial zone
    const zone_data = try std.fs.cwd().readFileAlloc(allocator, zone_path, 64 * 1024);
    defer allocator.free(zone_data);
    try server.zone.loadFromText(zone_data);
    std.debug.print("loaded {d} records from {s}\n", .{ server.zone.count, zone_path });

    // Install SIGUSR1 handler
    const act = posix.Sigaction{
        .handler = .{ .handler = sigusr1Handler },
        .mask = posix.empty_sigset,
        .flags = posix.SA.RESTART,
    };
    try posix.sigaction(posix.SIG.USR1, &act, null);

    // Serve loop (check reload flag each iteration)
    var recv_buf: [512]u8 = undefined;
    var resp_buf: [512]u8 = undefined;
    while (true) {
        if (reload_flag.load(.acquire)) {
            reload_flag.store(false, .release);
            const new_data = std.fs.cwd().readFileAlloc(allocator, zone_path, 64 * 1024) catch |err| {
                std.debug.print("reload failed: {}\n", .{err});
                continue;
            };
            var new_zone = dns.Zone.init();
            new_zone.loadFromText(new_data) catch |err| {
                std.debug.print("zone parse error on reload: {}\n", .{err});
                allocator.free(new_data);
                continue;
            };
            allocator.free(new_data);
            server.zone = new_zone;
            std.debug.print("zone reloaded: {d} records\n", .{server.zone.count});
        }

        var client_addr: posix.sockaddr.storage = undefined;
        var addr_len: posix.socklen_t = @sizeOf(posix.sockaddr.storage);
        const n = posix.recvfrom(server.sock, &recv_buf, 0, @ptrCast(&client_addr), &addr_len) catch continue;
        if (n < 12) continue;
        const resp_len = dns.buildResponse(recv_buf[0..n], &server.zone, &resp_buf) catch continue;
        _ = posix.sendto(server.sock, resp_buf[0..resp_len], 0, @ptrCast(&client_addr), addr_len) catch continue;
        server.queries_served += 1;
    }
}

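The reload is safe because we build a completely new Zone struct before swapping it in. The server.zone = new_zone assignment is a single struct copy -- since our serve loop is single-threaded and we check the flag between queries, there's no race. If we had multiple threads we'd need a mutex or an atomic pointer swap. One consequence of this design: because of SA.RESTART the blocking recvfrom resumes after the signal, so the flag is only seen once the next query has been received and answered (still from the old zone). If the reload has to take effect immediately, give the socket a receive timeout so the loop wakes up periodically.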

Exercise 3: DNS proxy with local zone + upstream forwarding

const std = @import("std");
const posix = std.posix;
const dns = @import("dns_server.zig");

pub const DnsProxy = struct {
    listen_sock: posix.socket_t,
    upstream_sock: posix.socket_t,
    zone: dns.Zone,
    upstream_addr: std.net.Address,

    pub fn init(port: u16, upstream_ip: [4]u8) !DnsProxy {
        const lsock = try posix.socket(posix.AF.INET, posix.SOCK.DGRAM, 0);
        errdefer posix.close(lsock);
        try posix.setsockopt(lsock, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
        const bind_addr = std.net.Address.initIp4(.{ 0, 0, 0, 0 }, port);
        try posix.bind(lsock, &bind_addr.any, bind_addr.getOsSockLen());

        const usock = try posix.socket(posix.AF.INET, posix.SOCK.DGRAM, 0);

        return .{
            .listen_sock = lsock,
            .upstream_sock = usock,
            .zone = dns.Zone.init(),
            .upstream_addr = std.net.Address.initIp4(upstream_ip, 53),
        };
    }

    pub fn serve(self: *DnsProxy) !void {
        var recv_buf: [512]u8 = undefined;
        var resp_buf: [512]u8 = undefined;

        while (true) {
            var client_addr: posix.sockaddr.storage = undefined;
            var addr_len: posix.socklen_t = @sizeOf(posix.sockaddr.storage);
            const n = posix.recvfrom(self.listen_sock, &recv_buf, 0, @ptrCast(&client_addr), &addr_len) catch continue;
            if (n < 12) continue;

            // Try local zone first
            const local_len = dns.buildResponse(recv_buf[0..n], &self.zone, &resp_buf) catch 0;
            if (local_len > 0) {
                // Check if we got a real answer (not NXDOMAIN)
                const rcode = std.mem.readInt(u16, resp_buf[2..4], .big) & 0x000F;
                const ancount = std.mem.readInt(u16, resp_buf[6..8], .big);
                if (rcode == 0 and ancount > 0) {
                    _ = posix.sendto(self.listen_sock, resp_buf[0..local_len], 0, @ptrCast(&client_addr), addr_len) catch continue;
                    continue;
                }
            }

            // Forward to upstream
            _ = posix.sendto(self.upstream_sock, recv_buf[0..n], 0, &self.upstream_addr.any, self.upstream_addr.getOsSockLen()) catch continue;

            var up_addr: posix.sockaddr.storage = undefined;
            var up_len: posix.socklen_t = @sizeOf(posix.sockaddr.storage);
            const resp_n = posix.recvfrom(self.upstream_sock, &resp_buf, 0, @ptrCast(&up_addr), &up_len) catch continue;

            _ = posix.sendto(self.listen_sock, resp_buf[0..resp_n], 0, @ptrCast(&client_addr), addr_len) catch continue;
        }
    }
};

pub fn main() !void {
    var proxy = try DnsProxy.init(5353, .{ 8, 8, 8, 8 });

    // Blocklist: ad domains resolve to 0.0.0.0
    try proxy.zone.loadFromText(
        \\ads.example.com A 3600 0.0.0.0
        \\tracker.example.com A 3600 0.0.0.0
        \\analytics.evil.com A 3600 0.0.0.0
    );

    std.debug.print("DNS proxy on :5353, upstream 8.8.8.8, {d} blocked domains\n", .{proxy.zone.count});
    try proxy.serve();
}

The proxy checks its local zone first -- if there's a match with actual answers (not NXDOMAIN), it serves that directly. For blocked domains this returns 0.0.0.0, which is how Pi-hole works. Everything else gets forwarded to 8.8.8.8 and the response relayed back. A production version would add caching (using the TTL from upstream answers) and timeout handling for the upstream query.

We've been building networking infrastructure from the ground up across these last several episodes -- UDP sockets, DNS resolution, a DNS server. Now it's time to jump up the protocol stack and look at the protocol that runs most of the web: HTTP/1.1.

Back in episodes 51-54 we built an HTTP server, but we treated request parsing as "find the first line, grab some headers, done." That works for simple cases, but HTTP/1.1 has some real complexity hiding under the surface: chunked transfer encoding, persistent connections, content negotiation, conditional requests -- the stuff that makes the protocol actually work in production. Today we're going deep on all of it ;-)

HTTP/1.1 request anatomy

An HTTP request is text-based (unlike DNS, which is binary). That sounds simpler, but text protocols have their own challenges -- you're scanning for delimiters instead of reading fixed-width fields, and the data can be arbitrarily long. Here's what a request actually looks like on the wire:

GET /api/users?page=2 HTTP/1.1\r\n
Host: example.com\r\n
Accept: application/json\r\n
Connection: keep-alive\r\n
\r\n

Every line ends with \r\n (CRLF), and the header section ends with an empty line (just \r\n by itself). Let's build a proper parser:

const std = @import("std");

pub const Method = enum {
    GET,
    POST,
    PUT,
    DELETE,
    HEAD,
    OPTIONS,
    PATCH,

    pub fn fromString(s: []const u8) !Method {
        if (std.mem.eql(u8, s, "GET")) return .GET;
        if (std.mem.eql(u8, s, "POST")) return .POST;
        if (std.mem.eql(u8, s, "PUT")) return .PUT;
        if (std.mem.eql(u8, s, "DELETE")) return .DELETE;
        if (std.mem.eql(u8, s, "HEAD")) return .HEAD;
        if (std.mem.eql(u8, s, "OPTIONS")) return .OPTIONS;
        if (std.mem.eql(u8, s, "PATCH")) return .PATCH;
        return error.UnknownMethod;
    }
};

pub const Header = struct {
    name: []const u8,
    value: []const u8,
};

pub const HttpRequest = struct {
    method: Method,
    path: []const u8,
    version: []const u8,
    headers: [64]Header,
    header_count: usize,
    body: []const u8,
    raw: []const u8,

    pub fn getHeader(self: *const HttpRequest, name: []const u8) ?[]const u8 {
        for (self.headers[0..self.header_count]) |h| {
            if (std.ascii.eqlIgnoreCase(h.name, name)) return h.value;
        }
        return null;
    }

    pub fn getContentLength(self: *const HttpRequest) ?usize {
        const val = self.getHeader("Content-Length") orelse return null;
        return std.fmt.parseInt(usize, val, 10) catch null;
    }

    pub fn isKeepAlive(self: *const HttpRequest) bool {
        if (self.getHeader("Connection")) |conn| {
            if (std.ascii.eqlIgnoreCase(conn, "close")) return false;
        }
        // HTTP/1.1 defaults to keep-alive
        return std.mem.eql(u8, self.version, "HTTP/1.1");
    }
};

The getHeader function does case-insensitive comparison, which is required by the HTTP spec -- Content-Type and content-type and CONTENT-TYPE are all the same header. This is one of those things that's easy to forget and annoying to debug when a client sends headers in unexpected casing.

Parsing the request line and headers

The tricky part of HTTP parsing is handling the boundaries correctly. You need to find CRLF sequences, decide what to do with header values folded across multiple lines (obs-fold, which RFC 7230 deprecated; the parser below does not unfold them), and figure out where the body starts:

pub const ParseError = error{
    Incomplete,
    InvalidRequestLine,
    UnknownMethod,
    HeaderTooLong,
    TooManyHeaders,
    InvalidContentLength,
    BodyTooLarge,
};

pub fn parseRequest(buf: []const u8, max_body: usize) ParseError!HttpRequest {
    var req: HttpRequest = undefined;
    req.raw = buf;
    req.header_count = 0;
    req.body = &.{};

    // Find the end of headers (double CRLF)
    const header_end = std.mem.indexOf(u8, buf, "\r\n\r\n") orelse return error.Incomplete;
    const header_section = buf[0..header_end];

    // Parse request line (a request without any headers has no CRLF inside header_section)
    const first_line_end = std.mem.indexOf(u8, header_section, "\r\n") orelse header_section.len;
    const request_line = header_section[0..first_line_end];

    var parts = std.mem.tokenizeScalar(u8, request_line, ' ');
    const method_str = parts.next() orelse return error.InvalidRequestLine;
    req.method = Method.fromString(method_str) catch return error.UnknownMethod;
    req.path = parts.next() orelse return error.InvalidRequestLine;
    req.version = parts.next() orelse return error.InvalidRequestLine;

    // Parse headers
    var header_data = header_section[@min(first_line_end + 2, header_section.len)..];
    while (header_data.len > 0) {
        const line_end = std.mem.indexOf(u8, header_data, "\r\n") orelse {
            // The final header always lands here: header_section stops before the terminating CRLF CRLF
            if (header_data.len > 0) {
                if (req.header_count >= req.headers.len) return error.TooManyHeaders;
                if (parseHeaderLine(header_data)) |h| {
                    req.headers[req.header_count] = h;
                    req.header_count += 1;
                }
            }
            break;
        };

        const line = header_data[0..line_end];
        if (line.len == 0) break;

        if (req.header_count >= req.headers.len) return error.TooManyHeaders;
        if (parseHeaderLine(line)) |h| {
            req.headers[req.header_count] = h;
            req.header_count += 1;
        }

        header_data = header_data[line_end + 2 ..];
    }

    // Determine body boundaries
    const body_start = header_end + 4; // skip \r\n\r\n
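    if (req.getHeader("Content-Length")) |cl_str| {
        // Reject a malformed Content-Length instead of silently treating the request as body-less
        const content_len = std.fmt.parseInt(usize, cl_str, 10) catch return error.InvalidContentLength;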
        if (content_len > max_body) return error.BodyTooLarge;
        if (buf.len < body_start + content_len) return error.Incomplete;
        req.body = buf[body_start .. body_start + content_len];
    }

    return req;
}

fn parseHeaderLine(line: []const u8) ?Header {
    const colon_pos = std.mem.indexOfScalar(u8, line, ':') orelse return null;
    const name = std.mem.trim(u8, line[0..colon_pos], " \t");
    const value = std.mem.trim(u8, line[colon_pos + 1 ..], " \t");
    if (name.len == 0) return null;
    return .{ .name = name, .value = value };
}

Notice the Incomplete error -- this is critical for a real server. TCP is a stream protocol, so you might receive half a request in one recv call and the other half in the next. The parser returns Incomplete when it hasn't seen the double CRLF yet (headers not fully received) or when Content-Length says there should be more body data than we have. The caller needs to keep reading from the socket and re-parsing until it gets a complete request. We dealt with this same framing issue in our key-value store TCP server (episode 42) -- it's a fundamental pattern for any TCP-based protocol.
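To see the framing problem in isolation, here is a small sketch that simulates one request arriving in two TCP reads. `completeRequestLen` is a hypothetical stand-in for the Incomplete check in parseRequest (it only looks at the blank line and Content-Length), and the 1 MiB body limit is an arbitrary assumption.

```zig
const std = @import("std");

const max_body: usize = 1024 * 1024; // arbitrary limit for this sketch

/// Hypothetical helper: returns the length of the first complete request in
/// `buf` (headers plus Content-Length body), or null when more bytes are needed.
fn completeRequestLen(buf: []const u8) !?usize {
    const header_end = std.mem.indexOf(u8, buf, "\r\n\r\n") orelse return null;
    var body_len: usize = 0;
    var lines = std.mem.splitSequence(u8, buf[0..header_end], "\r\n");
    _ = lines.next(); // skip the request line
    while (lines.next()) |line| {
        const colon = std.mem.indexOfScalar(u8, line, ':') orelse continue;
        const name = std.mem.trim(u8, line[0..colon], " \t");
        if (!std.ascii.eqlIgnoreCase(name, "Content-Length")) continue;
        const value = std.mem.trim(u8, line[colon + 1 ..], " \t");
        body_len = std.fmt.parseInt(usize, value, 10) catch return error.InvalidContentLength;
        if (body_len > max_body) return error.BodyTooLarge;
    }
    const total = header_end + 4 + body_len;
    return if (buf.len >= total) total else null;
}

pub fn main() !void {
    // Simulate one request arriving in two TCP reads, split mid-header.
    const parts = [_][]const u8{
        "POST /items HTTP/1.1\r\nHost: example.com\r\nContent-Le",
        "ngth: 5\r\n\r\nhello",
    };
    var buf: [256]u8 = undefined;
    var len: usize = 0;
    for (parts) |part| {
        @memcpy(buf[len..][0..part.len], part);
        len += part.len;
        if (try completeRequestLen(buf[0..len])) |total| {
            std.debug.print("complete request: {d} bytes\n", .{total});
        } else {
            std.debug.print("incomplete after {d} bytes, read more\n", .{len});
        }
    }
}
```

The important design point is that "not enough data yet" is a normal outcome, not an error: the caller keeps the bytes it has and appends the next read to them.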

Chunked transfer encoding

Here's where HTTP/1.1 gets interesting. Sometimes the server (or client) doesn't know the total body size upfront. Maybe it's generating content on the fly, or streaming from another source. Instead of a Content-Length header, it uses chunked transfer encoding -- the body arrives as a series of chunks, each prefixed with its size in hexadecimal:

HTTP/1.1 200 OK\r\n
Transfer-Encoding: chunked\r\n
\r\n
18\r\n
This is the first chunk.\r\n
e\r\n
Second chunk!!\r\n
0\r\n
\r\n

The hex number before each chunk is the number of bytes that follow. The stream ends with a zero-length chunk (0\r\n\r\n). Let's build a decoder:

pub const ChunkedDecoder = struct {
    output: [8192]u8,
    output_len: usize,
    state: State,
    current_chunk_remaining: usize,

    const State = enum {
        reading_size,
        reading_data,
        reading_trailer,
        done,
    };

    pub fn init() ChunkedDecoder {
        return .{
            .output = undefined,
            .output_len = 0,
            .state = .reading_size,
            .current_chunk_remaining = 0,
        };
    }

    pub fn feed(self: *ChunkedDecoder, data: []const u8) !usize {
        var consumed: usize = 0;

        while (consumed < data.len) {
            switch (self.state) {
                .reading_size => {
                    // Look for \r\n after the hex size
                    const remaining = data[consumed..];
                    const crlf = std.mem.indexOf(u8, remaining, "\r\n") orelse return consumed;

                    const size_line = remaining[0..crlf];
                    // Handle chunk extensions (stuff after semicolon), then trim
                    // whitespace so that "1a ;name=value" still parses
                    const size_part = if (std.mem.indexOfScalar(u8, size_line, ';')) |semi|
                        size_line[0..semi]
                    else
                        size_line;
                    const actual_size = std.mem.trim(u8, size_part, " \t");

                    self.current_chunk_remaining = std.fmt.parseInt(usize, actual_size, 16) catch return error.InvalidChunkSize;
                    consumed += crlf + 2;

                    if (self.current_chunk_remaining == 0) {
                        self.state = .reading_trailer;
                    } else {
                        self.state = .reading_data;
                    }
                },
                .reading_data => {
                    const remaining = data[consumed..];
                    const to_copy = @min(remaining.len, self.current_chunk_remaining);

                    if (self.output_len + to_copy > self.output.len) return error.OutputFull;
                    @memcpy(self.output[self.output_len..][0..to_copy], remaining[0..to_copy]);
                    self.output_len += to_copy;
                    self.current_chunk_remaining -= to_copy;
                    consumed += to_copy;

                    if (self.current_chunk_remaining == 0) {
                        // Expect trailing \r\n after chunk data
                        if (consumed + 2 > data.len) return consumed;
                        if (data[consumed] != '\r' or data[consumed + 1] != '\n') return error.MissingChunkTrailer;
                        consumed += 2;
                        self.state = .reading_size;
                    }
                },
                .reading_trailer => {
                    // After the 0-length chunk, there might be trailer headers
                    // For simplicity we just look for the final \r\n
                    const remaining = data[consumed..];
                    const crlf = std.mem.indexOf(u8, remaining, "\r\n") orelse return consumed;
                    consumed += crlf + 2;
                    if (crlf == 0) {
                        self.state = .done;
                        return consumed;
                    }
                    // Otherwise it's a trailer header -- skip it
                },
                .done => return consumed,
            }
        }

        return consumed;
    }

    pub fn getOutput(self: *const ChunkedDecoder) []const u8 {
        return self.output[0..self.output_len];
    }

    pub fn isDone(self: *const ChunkedDecoder) bool {
        return self.state == .done;
    }
};

The decoder is a state machine with four states: reading the chunk size (hex), reading chunk data, handling trailers (optional headers after the last chunk), and done. The feed function can be called multiple times as data arrives -- it processes as much as it can and returns how many bytes it consumed. Any bytes it did not consume (for example a size line that was cut off before its CRLF) must be kept by the caller and passed in again, together with the newly received data, on the next call. This incremental approach is essential because chunked data might arrive across multiple TCP reads.

The chunk extension handling (the ; check) is one of those HTTP details that most people don't know about. A chunk size line can include extension parameters like 1a;name=value\r\n. We just strip everything after the semicolon. Most servers don't use extensions, but a robust parser should handle them without crashing.
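For the sending side, and for a stricter take on the size line, here is a small sketch. Both `encodeChunk` and `parseChunkSizeLine` are hypothetical helpers, not part of the decoder above. The parser validates the hex digits by hand because std.fmt.parseInt is more lenient than the HTTP grammar.

```zig
const std = @import("std");

/// Encode one chunk as "<hex size>\r\n<data>\r\n" into `out`.
/// An empty `data` yields "0\r\n\r\n", the terminator with no trailer fields.
fn encodeChunk(out: []u8, data: []const u8) ![]u8 {
    return std.fmt.bufPrint(out, "{x}\r\n{s}\r\n", .{ data.len, data });
}

/// Parse a chunk-size line (CRLF already removed), ignoring chunk extensions.
fn parseChunkSizeLine(line: []const u8) !usize {
    const size_part = if (std.mem.indexOfScalar(u8, line, ';')) |semi| line[0..semi] else line;
    const digits = std.mem.trim(u8, size_part, " \t");
    if (digits.len == 0) return error.InvalidChunkSize;
    // Validate by hand: std.fmt.parseInt skips '_' separators, for example,
    // which the HTTP grammar does not allow.
    for (digits) |c| {
        if (!std.ascii.isHex(c)) return error.InvalidChunkSize;
    }
    return std.fmt.parseInt(usize, digits, 16) catch return error.InvalidChunkSize;
}

pub fn main() !void {
    var buf: [64]u8 = undefined;
    const chunk = try encodeChunk(&buf, "abcdefghijklmnopqrstuvwxyz"); // 26 bytes
    const size_line = chunk[0..std.mem.indexOf(u8, chunk, "\r\n").?];
    std.debug.print("size line: {s}, parsed: {d}\n", .{ size_line, try parseChunkSizeLine(size_line) });
    std.debug.print("with extension: {d}\n", .{try parseChunkSizeLine("1a;name=value")});
}
```

Being strict on input and exact on output is a reasonable default here, since lenient chunk-size parsing is one of the classic sources of request smuggling bugs.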

Persistent connections and keep-alive

HTTP/1.0 opened a new TCP connection for every single request -- connect, send request, receive response, close. HTTP/1.1 changed this with persistent connections: the TCP connection stays open by default, and multiple requests can flow over the same connection sequentially. This saves the overhead of TCP handshakes and slow start for every request.

Here's how to handle multiple requests on one connection:

pub const ConnectionHandler = struct {
    stream: std.net.Stream,
    recv_buf: [8192]u8,
    recv_len: usize,

    pub fn init(stream: std.net.Stream) ConnectionHandler {
        return .{
            .stream = stream,
            .recv_buf = undefined,
            .recv_len = 0,
        };
    }

    /// Read and handle requests until the connection closes
    pub fn handleConnection(self: *ConnectionHandler) !void {
        while (true) {
            // Try to parse a complete request from the buffer
            while (true) {
                const request = parseRequest(self.recv_buf[0..self.recv_len], 1024 * 1024) catch |err| switch (err) {
                    error.Incomplete => break, // need more data
                    else => {
                        try self.sendError(400, "Bad Request");
                        return;
                    },
                };

                // Process the request
                try self.processRequest(&request);

                // Check if client wants to close
                if (!request.isKeepAlive()) return;

                // Remove processed data from buffer
                const consumed = getConsumedBytes(&request);
                const remaining = self.recv_len - consumed;
                if (remaining > 0) {
                    std.mem.copyForwards(u8, self.recv_buf[0..remaining], self.recv_buf[consumed..self.recv_len]);
                }
                self.recv_len = remaining;
            }

            // Read more data
            if (self.recv_len >= self.recv_buf.len) {
                try self.sendError(413, "Content Too Large");
                return;
            }
            const n = self.stream.read(self.recv_buf[self.recv_len..]) catch return;
            if (n == 0) return; // connection closed
            self.recv_len += n;
        }
    }

    fn processRequest(self: *ConnectionHandler, request: *const HttpRequest) !void {
        // Build a simple response
        var resp_buf: [4096]u8 = undefined;
        const body = "Hello from Zig HTTP/1.1 server!\n";
        const resp = std.fmt.bufPrint(&resp_buf,
            "HTTP/1.1 200 OK\r\n" ++
            "Content-Type: text/plain\r\n" ++
            "Content-Length: {d}\r\n" ++
            "Connection: {s}\r\n" ++
            "\r\n{s}",
            .{
                body.len,
                if (request.isKeepAlive()) "keep-alive" else "close",
                body,
            },
        ) catch return;
        try self.stream.writeAll(resp); // write() may send only part of the buffer
    }

    fn sendError(self: *ConnectionHandler, code: u16, msg: []const u8) !void {
        var buf: [256]u8 = undefined;
        const resp = std.fmt.bufPrint(&buf,
            "HTTP/1.1 {d} {s}\r\nContent-Length: 0\r\nConnection: close\r\n\r\n",
            .{ code, msg },
        ) catch return;
        self.stream.writeAll(resp) catch {}; // best effort: the connection is closed right after
    }

    fn getConsumedBytes(request: *const HttpRequest) usize {
        const header_end = std.mem.indexOf(u8, request.raw, "\r\n\r\n").? + 4;
        return header_end + request.body.len;
    }
};

The key insight is the inner while (true) loop that tries to parse multiple requests from the same buffer. After processing one request, we shift the leftover bytes to the front of the buffer and try again -- there might already be another complete request waiting (this is called pipelining, and it's how aggressive HTTP clients send multiple requests without waiting for each response).

The outer loop reads more data from the socket when the parser says Incomplete. This two-loop pattern -- "try to parse, if incomplete read more, repeat" -- is the standard approach for stream-based text protocols.

Having said that, there's a subtle issue with pipelining: we must process requests in order and send responses in order. HTTP/1.1 requires this. We can't start sending a response for request #2 before request #1's response is fully sent. In our single-threaded implementation this happens naturally, but if you were using multiple threads you'd need to coordinate the response ordering.
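The buffer shifting that makes pipelining work can be tried without a socket. This sketch assumes two body-less GET requests that arrived in a single read, so each request ends at its blank line. `discardFront` is a hypothetical helper that mirrors the copyForwards step in handleConnection.

```zig
const std = @import("std");

/// Drop the first `consumed` bytes of buf[0..len] and return the new length.
fn discardFront(buf: []u8, len: usize, consumed: usize) usize {
    const remaining = len - consumed;
    // Source and destination overlap, so copy front to back.
    std.mem.copyForwards(u8, buf[0..remaining], buf[consumed..len]);
    return remaining;
}

pub fn main() void {
    const wire = "GET /a HTTP/1.1\r\nHost: x\r\n\r\nGET /b HTTP/1.1\r\nHost: x\r\n\r\n";
    var buf: [128]u8 = undefined;
    @memcpy(buf[0..wire.len], wire);
    var len: usize = wire.len;

    // Body-less requests end at the blank line, so each one is header_end + 4 bytes long.
    while (std.mem.indexOf(u8, buf[0..len], "\r\n\r\n")) |header_end| {
        const line_end = std.mem.indexOf(u8, buf[0..len], "\r\n").?;
        std.debug.print("handling: {s}\n", .{buf[0..line_end]});
        len = discardFront(&buf, len, header_end + 4);
    }
}
```

Because the requests are handled strictly front to back, the responses come out in request order, which is exactly the ordering guarantee described above.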

Content negotiation

Content negotiation is how a client tells the server what formats it can handle. The Accept header lists media types, each with an optional weight that expresses preference (the order of the entries itself carries no meaning):

Accept: application/json, text/html;q=0.9, */*;q=0.1

The q parameter is a quality factor from 0 to 1, where 1 is most preferred (and the default when q isn't specified). Let's parse this:

pub const MediaRange = struct {
    type_str: []const u8,
    subtype: []const u8,
    quality: f32,
};

pub fn parseAcceptHeader(accept: []const u8, out: []MediaRange) usize {
    var count: usize = 0;
    var it = std.mem.tokenizeScalar(u8, accept, ',');

    while (it.next()) |entry| {
        if (count >= out.len) break;
        const trimmed = std.mem.trim(u8, entry, " \t");

        // Split on semicolon to separate type from parameters
        var param_it = std.mem.tokenizeScalar(u8, trimmed, ';');
        const type_part = std.mem.trim(u8, param_it.next() orelse continue, " \t");

        var quality: f32 = 1.0;
        while (param_it.next()) |param| {
            const p = std.mem.trim(u8, param, " \t");
            if (p.len > 2 and p[0] == 'q' and p[1] == '=') {
                quality = std.fmt.parseFloat(f32, p[2..]) catch 1.0;
            }
        }

        // Split type/subtype
        if (std.mem.indexOfScalar(u8, type_part, '/')) |slash| {
            out[count] = .{
                .type_str = type_part[0..slash],
                .subtype = type_part[slash + 1 ..],
                .quality = quality,
            };
            count += 1;
        }
    }

    // Sort by quality descending (simple insertion sort)
    var i: usize = 1;
    while (i < count) : (i += 1) {
        var j = i;
        while (j > 0 and out[j].quality > out[j - 1].quality) {
            const tmp = out[j];
            out[j] = out[j - 1];
            out[j - 1] = tmp;
            j -= 1;
        }
    }

    return count;
}

pub fn selectContentType(accept: []const u8, available: []const []const u8) ?[]const u8 {
    var ranges: [16]MediaRange = undefined;
    const count = parseAcceptHeader(accept, &ranges);

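    for (ranges[0..count]) |range| {
        // q=0 means "explicitly not acceptable", so such a range must never be selected
        if (range.quality <= 0) continue;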
        for (available) |content_type| {
            if (std.mem.indexOfScalar(u8, content_type, '/')) |slash| {
                const ct_type = content_type[0..slash];
                const ct_sub = content_type[slash + 1 ..];
                const type_match = std.mem.eql(u8, range.type_str, "*") or std.mem.eql(u8, range.type_str, ct_type);
                const sub_match = std.mem.eql(u8, range.subtype, "*") or std.mem.eql(u8, range.subtype, ct_sub);
                if (type_match and sub_match) return content_type;
            }
        }
    }

    return null;
}

The selectContentType function takes the client's Accept header and a list of content types the server can produce, and returns the best match. Wildcards (*/* or text/*) match as you'd expect, although this version ranks purely by q-value and ignores the rule that a more specific range takes precedence over a wildcard. If there's no match at all, the server should return 406 Not Acceptable.

This matters more than you might think. REST APIs use content negotiation to serve JSON, XML, or HTML from the same endpoint. And Accept-Encoding (gzip, deflate, br) uses the same quality-factor mechanism for compression negotiation.
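To show the same mechanism applied to compression, here is a minimal sketch of Accept-Encoding selection. `pickEncoding` is a hypothetical helper; it deliberately ignores the `*` wildcard and the special rules for `identity`, so treat it as an illustration of q-value handling rather than a complete implementation.

```zig
const std = @import("std");

/// Pick the supported coding with the highest q-value from an Accept-Encoding
/// header. Returns null when nothing the server supports is acceptable.
fn pickEncoding(accept_encoding: []const u8, supported: []const []const u8) ?[]const u8 {
    var best: ?[]const u8 = null;
    var best_q: f32 = 0.0;
    var entries = std.mem.tokenizeScalar(u8, accept_encoding, ',');
    while (entries.next()) |entry| {
        var params = std.mem.tokenizeScalar(u8, entry, ';');
        const coding = std.mem.trim(u8, params.next() orelse continue, " \t");
        var q: f32 = 1.0; // default weight when no q parameter is given
        while (params.next()) |param| {
            const p = std.mem.trim(u8, param, " \t");
            if (p.len > 2 and (p[0] == 'q' or p[0] == 'Q') and p[1] == '=') {
                q = std.fmt.parseFloat(f32, p[2..]) catch 0.0;
            }
        }
        // q=0 means "not acceptable"; on a tie the earlier entry wins
        if (q <= best_q) continue;
        for (supported) |name| {
            if (std.ascii.eqlIgnoreCase(coding, name)) {
                best = name;
                best_q = q;
            }
        }
    }
    return best;
}

pub fn main() void {
    const supported = [_][]const u8{ "gzip", "br" };
    const choice = pickEncoding("gzip;q=0.8, br, deflate;q=0", &supported) orelse "none (send uncompressed)";
    std.debug.print("chosen coding: {s}\n", .{choice});
}
```

When the function returns null, sending the body uncompressed is usually the right fallback.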

Conditional requests

Conditional requests let the client ask "give me this resource, but only if it's changed since the last time I fetched it." The server compares an ETag (an opaque version tag, often a hash of the content) or a Last-Modified timestamp against the validator the client sent, and either sends the full response (200) or a short "nothing changed" response (304 Not Modified). This saves a lot of bandwidth for resources that don't change often:

pub const ConditionalCheck = struct {
    etag: ?[]const u8,
    last_modified: ?i64,

    pub fn init() ConditionalCheck {
        return .{ .etag = null, .last_modified = null };
    }

    pub fn setEtag(self: *ConditionalCheck, tag: []const u8) void {
        self.etag = tag;
    }

    pub fn setLastModified(self: *ConditionalCheck, timestamp: i64) void {
        self.last_modified = timestamp;
    }

    /// Check if the client's cached version is still valid
    pub fn shouldReturn304(self: *const ConditionalCheck, request: *const HttpRequest) bool {
        // Check If-None-Match (ETag comparison)
        if (self.etag) |current_etag| {
            if (request.getHeader("If-None-Match")) |client_etag| {
                // Handle multiple ETags separated by commas
                var it = std.mem.tokenizeScalar(u8, client_etag, ',');
                while (it.next()) |tag| {
                    const trimmed = std.mem.trim(u8, tag, " \t");
                    if (std.mem.eql(u8, trimmed, current_etag)) return true;
                    if (std.mem.eql(u8, trimmed, "*")) return true;
                }
                // If-None-Match takes precedence: when it is present and does not
                // match, If-Modified-Since must be ignored
                return false;
            }
        }

        // Check If-Modified-Since (timestamp comparison)
        if (self.last_modified) |mod_time| {
            if (request.getHeader("If-Modified-Since")) |since_str| {
                const since = parseHttpDate(since_str) orelse return false;
                if (mod_time <= since) return true;
            }
        }

        return false;
    }
};

/// Parse an HTTP date in the preferred RFC 7231 format
/// ("Sun, 06 Nov 1994 08:49:37 GMT") into seconds since the Unix epoch
fn parseHttpDate(date_str: []const u8) ?i64 {
    // Simplified parser -- handles only the most common format
    if (date_str.len < 25) return null;

    // Skip day name and comma
    const rest = if (std.mem.indexOfScalar(u8, date_str, ',')) |comma|
        std.mem.trim(u8, date_str[comma + 1 ..], " ")
    else
        return null;

    if (rest.len < 20) return null;

    const day = std.fmt.parseInt(u8, rest[0..2], 10) catch return null;
    const month_str = rest[3..6];
    const year = std.fmt.parseInt(u16, rest[7..11], 10) catch return null;
    const hour = std.fmt.parseInt(u8, rest[12..14], 10) catch return null;
    const minute = std.fmt.parseInt(u8, rest[15..17], 10) catch return null;
    const second = std.fmt.parseInt(u8, rest[18..20], 10) catch return null;
    if (day < 1 or day > 31 or hour > 23 or minute > 59 or second > 60) return null;

    const months = [_][]const u8{ "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };
    var month: u8 = 0;
    for (months, 1..) |m, i| {
        if (std.mem.eql(u8, month_str, m)) {
            month = @intCast(i);
            break;
        }
    }
    if (month == 0) return null;

    // Days since 1970-01-01 in the proleptic Gregorian calendar.
    // The year is treated as starting in March so the leap day falls at its end.
    const y: i64 = if (month <= 2) @as(i64, year) - 1 else year;
    const mp: i64 = @mod(@as(i64, month) + 9, 12); // March = 0 ... February = 11
    const era = @divFloor(y, 400);
    const yoe = y - era * 400; // year within the 400-year cycle
    const doy = @divFloor(153 * mp + 2, 5) + day - 1; // day within the shifted year
    const doe = yoe * 365 + @divFloor(yoe, 4) - @divFloor(yoe, 100) + doy;
    const days = era * 146097 + doe - 719468;

    return days * 86400 + @as(i64, hour) * 3600 + @as(i64, minute) * 60 + second;
}

pub fn send304Response(stream: std.net.Stream, etag: ?[]const u8) !void {
    var buf: [256]u8 = undefined;

    // A 304 never carries a body. Content-Length is left out: if present, it
    // would have to equal the length of the corresponding 200 response, not 0.
    // bufPrint is bounds-checked, so an oversized ETag produces an error
    // instead of a write past the end of buf.
    const response = if (etag) |tag|
        try std.fmt.bufPrint(&buf, "HTTP/1.1 304 Not Modified\r\nETag: {s}\r\n\r\n", .{tag})
    else
        try std.fmt.bufPrint(&buf, "HTTP/1.1 304 Not Modified\r\n\r\n", .{});

    // writeAll keeps writing until every byte is sent; a single write() may
    // send only part of the buffer
    try stream.writeAll(response);
}

ETags are often just a hash of the file content. When a resource hasn't changed, the 304 response is only a status line and a few headers, typically a couple of hundred bytes, instead of potentially megabytes of content. For static assets (CSS, JavaScript, images) that rarely change, this removes most of the transfer cost on repeat visits. That's why your browser sends If-None-Match whenever it revalidates a cached resource that came with an ETag -- it's a deliberate design choice in the protocol.
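To make the "hash of the file content" idea concrete, here is a minimal sketch of deriving an ETag from a response body. The helper name computeEtag is hypothetical and not part of the code above, and Wyhash is just one convenient non-cryptographic hash from the standard library; any stable hash of the content would do.

```zig
const std = @import("std");

/// Hypothetical helper (not part of the episode's code): derive an ETag
/// from the body. Writes a quoted 16-digit hex string into `out` and
/// returns the slice that was written.
fn computeEtag(content: []const u8, out: *[18]u8) []const u8 {
    // Non-cryptographic 64-bit hash; this choice is an assumption of the sketch.
    const digest = std.hash.Wyhash.hash(0, content);
    // 1 quote + 16 hex digits + 1 quote = 18 bytes, so bufPrint cannot overflow.
    return std.fmt.bufPrint(out, "\"{x:0>16}\"", .{digest}) catch unreachable;
}
```

The returned slice points into the caller's buffer, so it can be handed to setEtag as long as that buffer outlives the ConditionalCheck. Identical content always yields the same tag, which is exactly the property If-None-Match relies on.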

Building response packets

Let's build a proper response builder that handles status codes, headers, and body:

pub const HttpResponse = struct {
    buf: [16384]u8,
    pos: usize,
    header_sent: bool,
    status_code: u16,
    headers: [32]Header,
    header_count: usize,

    pub fn init(status: u16) HttpResponse {
        return .{
            .buf = undefined,
            .pos = 0,
            .header_sent = false,
            .status_code = status,
            .headers = undefined,
            .header_count = 0,
        };
    }

    pub fn setHeader(self: *HttpResponse, name: []const u8, value: []const u8) void {
        if (self.header_count >= self.headers.len) return;
        self.headers[self.header_count] = .{ .name = name, .value = value };
        self.header_count += 1;
    }

    pub fn build(self: *HttpResponse, body: []const u8) ![]const u8 {
        // Status line
        const reason = statusReason(self.status_code);
        const status_line = std.fmt.bufPrint(self.buf[0..], "HTTP/1.1 {d} {s}\r\n", .{ self.status_code, reason }) catch return error.BufferTooSmall;
        self.pos = status_line.len;

        // Write headers, noting whether the caller already set Content-Length
        var has_content_length = false;
        for (self.headers[0..self.header_count]) |h| {
            if (std.ascii.eqlIgnoreCase(h.name, "Content-Length")) has_content_length = true;
            const line = std.fmt.bufPrint(self.buf[self.pos..], "{s}: {s}\r\n", .{ h.name, h.value }) catch return error.BufferTooSmall;
            self.pos += line.len;
        }

        // Add Content-Length if not already set. It is printed straight into
        // the output buffer instead of being stored via setHeader, so no header
        // slice ends up pointing at a temporary stack buffer. An empty body
        // still gets "Content-Length: 0"; 1xx, 204 and 304 responses never
        // carry a body and are skipped.
        const bodyless = self.status_code < 200 or self.status_code == 204 or self.status_code == 304;
        if (!has_content_length and !bodyless) {
            const line = std.fmt.bufPrint(self.buf[self.pos..], "Content-Length: {d}\r\n", .{body.len}) catch return error.BufferTooSmall;
            self.pos += line.len;
        }

        // End of headers
        if (self.pos + 2 > self.buf.len) return error.BufferTooSmall;
        @memcpy(self.buf[self.pos..][0..2], "\r\n");
        self.pos += 2;

        // Body
        if (body.len > 0) {
            if (self.pos + body.len > self.buf.len) return error.BufferTooSmall;
            @memcpy(self.buf[self.pos..][0..body.len], body);
            self.pos += body.len;
        }

        return self.buf[0..self.pos];
    }

    fn statusReason(code: u16) []const u8 {
        return switch (code) {
            200 => "OK",
            201 => "Created",
            204 => "No Content",
            301 => "Moved Permanently",
            304 => "Not Modified",
            400 => "Bad Request",
            401 => "Unauthorized",
            403 => "Forbidden",
            404 => "Not Found",
            405 => "Method Not Allowed",
            406 => "Not Acceptable",
            413 => "Payload Too Large",
            500 => "Internal Server Error",
            else => "Unknown",
        };
    }
};

The response builder auto-adds Content-Length when you provide a body and haven't already set the header. This is important because without Content-Length or chunked encoding, the client has no way to know when the response body ends (except closing the connection, which defeats keep-alive).
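Seen from the client side, the framing decision can be sketched as a small function. The names BodyFraming and responseFraming are hypothetical, and the sketch leaves out responses to HEAD requests (which also have no body):

```zig
const std = @import("std");

/// How a client finds the end of a response body (illustrative names only).
const BodyFraming = union(enum) {
    none, // no body at all: 1xx, 204, 304
    chunked, // read chunks until the zero-length chunk
    length: usize, // read exactly this many bytes
    until_close, // no framing info: read until the server closes the socket
};

fn responseFraming(status: u16, chunked: bool, content_length: ?usize) BodyFraming {
    if (status < 200 or status == 204 or status == 304) return .none;
    // Transfer-Encoding: chunked takes precedence over Content-Length.
    if (chunked) return .chunked;
    if (content_length) |n| return .{ .length = n };
    return .until_close;
}
```

Only the until_close case forces the connection to end, which is why a keep-alive server wants every response to land in one of the other three.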

Testing the parser against edge cases

HTTP parsers are notorious for security bugs. Let's write tests that exercise the tricky parts:

test "parse basic GET request" {
    const raw = "GET /index.html HTTP/1.1\r\nHost: example.com\r\nAccept: text/html\r\n\r\n";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expectEqual(Method.GET, req.method);
    try std.testing.expectEqualStrings("/index.html", req.path);
    try std.testing.expectEqual(@as(usize, 2), req.header_count);
    try std.testing.expectEqualStrings("example.com", req.getHeader("Host").?);
    try std.testing.expect(req.isKeepAlive());
}

test "parse POST with body" {
    const raw = "POST /api/data HTTP/1.1\r\nHost: example.com\r\nContent-Length: 13\r\n\r\nHello, World!";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expectEqual(Method.POST, req.method);
    try std.testing.expectEqualStrings("Hello, World!", req.body);
    try std.testing.expectEqual(@as(?usize, 13), req.getContentLength());
}

test "incomplete request returns error" {
    const raw = "GET /test HTTP/1.1\r\nHost: exa";
    const result = parseRequest(raw, 1024 * 1024);
    try std.testing.expectError(error.Incomplete, result);
}

test "connection close header" {
    const raw = "GET / HTTP/1.1\r\nHost: x\r\nConnection: close\r\n\r\n";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expect(!req.isKeepAlive());
}

test "HTTP/1.0 defaults to close" {
    const raw = "GET / HTTP/1.0\r\nHost: x\r\n\r\n";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expect(!req.isKeepAlive());
}

test "case insensitive header lookup" {
    const raw = "GET / HTTP/1.1\r\ncontent-type: text/html\r\nCONTENT-LENGTH: 0\r\n\r\n";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expect(req.getHeader("Content-Type") != null);
    try std.testing.expect(req.getHeader("content-length") != null);
}

test "chunked decoder basic" {
    var decoder = ChunkedDecoder.init();
    const input = "5\r\nHello\r\n6\r\n World\r\n0\r\n\r\n";
    _ = try decoder.feed(input);
    try std.testing.expect(decoder.isDone());
    try std.testing.expectEqualStrings("Hello World", decoder.getOutput());
}

test "chunked decoder with extensions" {
    var decoder = ChunkedDecoder.init();
    const input = "5;ext=val\r\nHello\r\n0\r\n\r\n";
    _ = try decoder.feed(input);
    try std.testing.expect(decoder.isDone());
    try std.testing.expectEqualStrings("Hello", decoder.getOutput());
}

test "content negotiation" {
    const available = [_][]const u8{ "text/html", "application/json", "text/plain" };
    const result = selectContentType("application/json, text/html;q=0.9", &available);
    try std.testing.expect(result != null);
    try std.testing.expectEqualStrings("application/json", result.?);
}

test "content negotiation wildcard" {
    const available = [_][]const u8{"application/xml"};
    const result = selectContentType("text/html, */*;q=0.1", &available);
    try std.testing.expect(result != null);
    try std.testing.expectEqualStrings("application/xml", result.?);
}

test "conditional request 304" {
    var check = ConditionalCheck.init();
    check.setEtag("\"abc123\"");

    const raw = "GET / HTTP/1.1\r\nHost: x\r\nIf-None-Match: \"abc123\"\r\n\r\n";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expect(check.shouldReturn304(&req));
}

test "conditional request no match" {
    var check = ConditionalCheck.init();
    check.setEtag("\"abc123\"");

    const raw = "GET / HTTP/1.1\r\nHost: x\r\nIf-None-Match: \"xyz789\"\r\n\r\n";
    const req = try parseRequest(raw, 1024 * 1024);
    try std.testing.expect(!check.shouldReturn304(&req));
}

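These tests cover the normal cases and several edge cases: incomplete data, HTTP version differences for keep-alive defaults, case-insensitive header lookup, chunked encoding with extensions, content negotiation with wildcards, and conditional request matching. Each test exercises a specific aspect of the protocol. The Incomplete test is particularly important -- a parser that hangs or crashes on partial input makes the server an easy target for slow-request (Slowloris-style) attacks, where a client trickles a request in a few bytes at a time. Returning Incomplete cleanly lets the connection loop decide whether to keep reading, give up after a timeout, or reject a request that has grown too large.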
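As a rough illustration of what handling partial input can look like in a connection loop, here is a self-contained sketch. RequestBuffer, FeedResult and the 8192-byte cap are assumptions made for this example and are not the ConnectionHandler from this episode:

```zig
const std = @import("std");

const FeedResult = enum { need_more, complete, too_large };

/// Hypothetical accumulator for bytes read off a socket.
const RequestBuffer = struct {
    buf: [8192]u8 = undefined,
    len: usize = 0,

    /// Append newly received bytes and report whether the header block
    /// (terminated by an empty line) has fully arrived.
    fn feed(self: *RequestBuffer, data: []const u8) FeedResult {
        // Hard cap: a client cannot make the server buffer without bound.
        if (data.len > self.buf.len - self.len) return .too_large;
        @memcpy(self.buf[self.len..][0..data.len], data);
        self.len += data.len;
        if (std.mem.indexOf(u8, self.buf[0..self.len], "\r\n\r\n") != null) return .complete;
        return .need_more;
    }
};
```

A loop built on this would read again on need_more, hand the buffer to the parser on complete, and answer too_large with a 413 or simply close the connection. A size cap alone does not stop a very slow client, so it would be paired with a read timeout.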

Why HTTP/1.1 still matters

You might wonder: "isn't everyone using HTTP/2 or HTTP/3 now?" Not exactly. HTTP/1.1 is still everywhere. Most localhost development, most reverse proxy backends, most internal microservice communication, and a large share of public websites still use it. Even when an HTTP/2 frontend (like nginx or Cloudflare) sits in front of your service, the backend connection is often HTTP/1.1.

Understanding HTTP/1.1 at the byte level also makes HTTP/2 easier to learn. HTTP/2 doesn't change the semantics (methods, headers, status codes) -- it changes the framing (binary instead of text, multiplexed streams instead of sequential). HTTP/3 keeps those semantics as well and carries its own binary framing over QUIC instead of TCP. The conceptual layer -- requests, responses, headers, content negotiation, conditional requests -- is the same across all three versions.

We'll be looking at the HTTP/2 framing layer in an upcoming episode, and having this HTTP/1.1 parser as a reference point will make the binary framing much easier to follow. The wire format changes, but the concepts remain constant.

Exercises

Implement chunked response encoding on the server side. Add a function to the HttpResponse builder that, instead of buffering the whole body and sending it with Content-Length, sends the response with Transfer-Encoding: chunked. The function should accept a callback or iterator that yields body chunks of varying sizes. Test it by having a client decode the chunked response using our ChunkedDecoder.
Add request timeout handling to the ConnectionHandler. If a client connects but doesn't send a complete request within 5 seconds, the server should close the connection. Use setsockopt with SO_RCVTIMEO to set a receive timeout on the socket (we covered socket options back in episode 21). Write a test that connects and sends data byte-by-byte to verify the timeout fires.
Build a simple HTTP proxy that accepts HTTP/1.1 requests, connects to the target host (from the Host header), forwards the request, and relays the response back to the client. The proxy must handle both Content-Length and chunked responses correctly. Test it by configuring curl to use your proxy: curl --proxy http://127.0.0.1:8080 http://example.com.

Thanks for reading!

Hive account: @scipio
