Last episode we built a gRPC service from its three honest parts -- protobuf payloads, HTTP/2 transport, and a thin calling convention. The exercises pushed past the unary case: a client-streaming handler, deadline handling, and a typed client stub. Each solution reuses FrameReader, writeMessage, Service, Status, and the protobuf helpers (writeLenField, readVarint, readLen) from episode 92.
Exercise 1: Implement client-streaming
A client-streaming handler consumes many request messages and answers once. The framing did all the heavy lifting already -- this is purely looping FrameReader.next until it hands back null:
// Sum every varint the client streams, reply with the total as one protobuf message.
fn sumNumbers(alloc: std.mem.Allocator, request_body: []const u8) RpcError![]u8 {
    var fr = FrameReader{ .buf = request_body };
    var total: u64 = 0;
    while (fr.next() catch return error.InvalidArgument) |msg| {
        const n = readSingleVarint(msg) catch return error.InvalidArgument; // one number per framed message
        // Checked add: a hostile stream must not be able to overflow the sum.
        total = std.math.add(u64, total, n) catch return error.InvalidArgument;
    }
    var out: std.ArrayListUnmanaged(u8) = .{};
    errdefer out.deinit(alloc);
    writeVarintField(&out, alloc, 1, total) catch return error.Internal;
    return out.toOwnedSlice(alloc) catch return error.Internal;
}

test "client-streaming sums three framed messages" {
    const alloc = std.testing.allocator;
    var body: std.ArrayListUnmanaged(u8) = .{};
    defer body.deinit(alloc);
    for ([_]u64{ 10, 20, 12 }) |n| {
        var one: [10]u8 = undefined;
        try writeMessage(&body, alloc, encodeVarintInto(&one, n)); // frame each number separately
    }
    const reply = try sumNumbers(alloc, body.items);
    defer alloc.free(reply);
    var off: usize = 0;
    _ = try readVarint(reply, &off); // skip the field tag
    try std.testing.expectEqual(@as(u64, 42), try readVarint(reply, &off));
}
The handler never cares whether it got one message or a thousand -- it drains the stream. That is the whole "streaming" feature: same wire format, you just stop assuming exactly one frame.
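To make "stop assuming exactly one frame" concrete, here is a minimal sketch of the loop a frame reader runs. FrameIter is a hypothetical stand-in written for this illustration, not the FrameReader from episode 92; it only assumes the standard gRPC message framing of a one-byte compressed flag, a four-byte big-endian length, and then the payload.

```zig
const std = @import("std");

// Hypothetical stand-in for the series' FrameReader: walks gRPC length-prefixed
// messages laid out as [1-byte compressed flag][4-byte big-endian length][payload].
const FrameIter = struct {
    buf: []const u8,
    pos: usize = 0,

    fn next(self: *FrameIter) error{Truncated}!?[]const u8 {
        if (self.pos == self.buf.len) return null; // clean end of the stream
        if (self.buf.len - self.pos < 5) return error.Truncated; // header cut short
        const len = std.mem.readInt(u32, self.buf[self.pos + 1 ..][0..4], .big);
        const start = self.pos + 5;
        if (self.buf.len - start < len) return error.Truncated; // payload cut short
        self.pos = start + len;
        return self.buf[start..self.pos];
    }
};

test "one body, three frames" {
    const body = [_]u8{
        0, 0, 0, 0, 1, 0x0A, // frame 1: one payload byte
        0, 0, 0, 0, 2, 0x14, 0x15, // frame 2: two payload bytes
        0, 0, 0, 0, 0, // frame 3: empty payload
    };
    var it = FrameIter{ .buf = &body };
    var frames: usize = 0;
    var bytes: usize = 0;
    while (try it.next()) |msg| {
        frames += 1;
        bytes += msg.len;
    }
    try std.testing.expectEqual(@as(usize, 3), frames);
    try std.testing.expectEqual(@as(usize, 3), bytes);
}
```

A unary handler would call next once and stop; the streaming handler keeps calling it until it returns null. Nothing about the bytes differs between the two.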
Exercise 2: Honour the grpc-timeout header
Real clients send a header like grpc-timeout: 100m, where the trailing letter is the unit (n nanoseconds, u micros, m millis, S seconds, ...). Parse the digits, multiply by the unit, and you have a deadline in nanoseconds:
fn parseTimeoutNanos(header: []const u8) ?u64 {
    if (header.len < 2) return null;
    const unit = header[header.len - 1];
    const value = std.fmt.parseInt(u64, header[0 .. header.len - 1], 10) catch return null;
    const scale: u64 = switch (unit) {
        'n' => 1,
        'u' => 1_000,
        'm' => 1_000_000,
        'S' => 1_000_000_000,
        'M' => 60 * 1_000_000_000,
        'H' => 60 * 60 * 1_000_000_000,
        else => return null, // unknown unit -> treat as no deadline
    };
    return value *| scale; // saturating multiply: a silly-large timeout clamps instead of wrapping
}

test "grpc-timeout units parse correctly" {
    try std.testing.expectEqual(@as(?u64, 100_000_000), parseTimeoutNanos("100m"));
    try std.testing.expectEqual(@as(?u64, 1_000_000_000), parseTimeoutNanos("1S"));
    try std.testing.expectEqual(@as(?u64, null), parseTimeoutNanos("100x"));
}
In serveCall you would read the deadline before dispatch, and if the handler overruns it (checked against a std.time.Timer from episode 70), you skip the reply and write grpc-status: 4 (DEADLINE_EXCEEDED) in the trailer instead of a body. The saturating *| is the kind of detail that bites you only in production -- a client sending 99999999999H should not wrap around to a tiny deadline.
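The decision serveCall makes after the handler returns can be sketched as a pure function, which keeps it testable without a real clock. Outcome and checkDeadline are hypothetical names for this illustration; in the actual server the elapsed value would presumably come from timer.read() on the std.time.Timer started before dispatch.

```zig
const std = @import("std");

// Hypothetical: what the server does once the handler has returned.
const Outcome = enum { send_reply, deadline_exceeded };

fn checkDeadline(elapsed_ns: u64, deadline_ns: ?u64) Outcome {
    const limit = deadline_ns orelse return .send_reply; // no header means no deadline
    return if (elapsed_ns > limit) .deadline_exceeded else .send_reply;
}

test "deadline decides between a reply and grpc-status 4" {
    const hundred_ms: u64 = 100 * std.time.ns_per_ms;
    try std.testing.expectEqual(Outcome.send_reply, checkDeadline(40 * std.time.ns_per_ms, hundred_ms));
    try std.testing.expectEqual(Outcome.deadline_exceeded, checkDeadline(250 * std.time.ns_per_ms, hundred_ms));
    try std.testing.expectEqual(Outcome.send_reply, checkDeadline(250 * std.time.ns_per_ms, null));
}
```

Taking the elapsed time as a parameter rather than reading the timer inside the function is a design choice for the sketch: the comparison is the part worth testing, and it should not depend on how fast the test machine is.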
Exercise 3: Add a typed client stub
Codegen normally writes this for you. Doing it once by hand shows exactly what a generated stub hides: encode the request, frame it, run it through the service, and -- the crucial part -- translate a non-zero grpc-status back into a Zig error:
fn callSayHello(alloc: std.mem.Allocator, svc: *Service, name: []const u8) RpcError![]u8 {
    // 1. Encode HelloRequest { name = 1 } and frame it like a DATA payload.
    var req: std.ArrayListUnmanaged(u8) = .{};
    defer req.deinit(alloc);
    writeLenField(&req, alloc, 1, name) catch return error.Internal;
    var framed: std.ArrayListUnmanaged(u8) = .{};
    defer framed.deinit(alloc);
    writeMessage(&framed, alloc, req.items) catch return error.Internal;

    // 2. In a test we hand the frame straight to the handler instead of a socket.
    const handler = svc.lookup("/greeter.Greeter/SayHello") orelse return error.NotFound;
    var fr = FrameReader{ .buf = framed.items };
    const msg = (fr.next() catch return error.Internal) orelse return error.InvalidArgument;

    // 3. A returning handler IS grpc-status 0; an error becomes the typed failure.
    return handler(alloc, msg); // RpcError propagates as-is -- the stub is honest about failure
}

test "stub returns the greeting on success" {
    const alloc = std.testing.allocator;
    var svc = Service{ .alloc = alloc };
    defer svc.deinit();
    try svc.register("/greeter.Greeter/SayHello", sayHello);
    const reply = try callSayHello(alloc, &svc, "scipio");
    defer alloc.free(reply);
    var off: usize = 0;
    _ = try readVarint(reply, &off);
    try std.testing.expectEqualStrings("Hello, scipio!", try readLen(reply, &off));
}
The honest part is step 3: the stub does not swallow failures into a generic "something went wrong". A non-OK status comes back as the specific RpcError member, so the caller can write if (err == error.NotFound) and mean it. Three exercises, and our gRPC service grew a streaming handler, deadline awareness, and a typed client -- the three things that separate a demo from something you'd actually call.
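The in-process stub above never sees a numeric status, because the handler's error already is the typed failure. Over a real socket the client reads grpc-status from the trailers, and the translation could look roughly like this. The RpcError set here is a hypothetical subset declared only so the sketch compiles alone, and checkStatus is an illustrative name; the numeric values are the standard gRPC status codes.

```zig
const std = @import("std");

// Hypothetical subset of the series' RpcError, just enough for the sketch.
const RpcError = error{ InvalidArgument, DeadlineExceeded, NotFound, Internal, Unknown };

// Map a grpc-status trailer value onto a typed Zig error; 0 means OK.
fn checkStatus(code: u32) RpcError!void {
    switch (code) {
        0 => return,
        3 => return error.InvalidArgument,
        4 => return error.DeadlineExceeded,
        5 => return error.NotFound,
        13 => return error.Internal,
        else => return error.Unknown, // codes this sketch does not model
    }
}

test "status codes become typed errors" {
    try checkStatus(0);
    try std.testing.expectError(error.NotFound, checkStatus(5));
    try std.testing.expectError(error.DeadlineExceeded, checkStatus(4));
}
```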
Here we go ;-) At the close of episode 92 I pointed at the messy real world: many machines, behind firewalls, on networks that would rather not talk to each other directly. The very first tool anyone reaches for when a client cannot reach a server directly is a proxy -- a willing middleman that takes the connection you can make and uses it to make the connection you can't. And the proxy protocol that has quietly outlived every fancier alternative is SOCKS5, defined in RFC 1928 back in 1996 and still sitting underneath your SSH dynamic forwards, your Tor client, and half the "VPN" apps on your phone.
The beautiful thing about SOCKS5 -- and the reason it fits this arc so well -- is how little protocol there is. It is a short greeting, a single connect request, and then the proxy gets out of the way and copies bytes. No encryption, no message framing once the tunnel is up, no clever multiplexing. After the handshake, a SOCKS5 proxy is the dumbest program you'll write all year, and that dumbness is exactly the point. Let's build it from the bytes up.
Picture three machines. Your client (a browser, say) can reach the proxy, and the proxy can reach the destination (some website). The client can't reach the destination directly -- maybe a firewall blocks it, maybe the client has no route, maybe it deliberately wants to hide behind the proxy's IP. SOCKS5 is the conversation the client has with the proxy to say "please open a TCP connection to that host and port, and then just shuttle bytes between me and it".
That conversation has exactly three phases. First, a method-negotiation greeting where the two sides agree on authentication (usually "none"). Second, a request where the client names the destination and the command (almost always CONNECT). Third -- and this is the whole rest of the connection's life -- a relay, where the proxy reads from one socket and writes to the other, in both directions, until somebody hangs up. Having said that, phases one and two are a few dozen bytes total. Phase three is where the data lives.
The genius of SOCKS, versus an application-level proxy like an HTTP CONNECT proxy, is that it is protocol-agnostic. The proxy never parses HTTP, never looks at TLS, never knows what's flowing through it. It learns a destination, dials it, and copies. That's why it works for any TCP protocol -- and why, security-wise, it's a fat unauthenticated pipe you must never expose to the open internet without a good reason.
The client opens by listing the auth methods it supports. The wire format is three fields: a version byte (always 0x05), a count, and then that many one-byte method identifiers. The server picks one and replies with two bytes. Method 0x00 is "no authentication", 0x02 is username/password, and 0xFF means "none of your methods are acceptable, goodbye":
const std = @import("std");

const SOCKS_VERSION: u8 = 0x05;
const METHOD_NONE: u8 = 0x00;
const METHOD_USERPASS: u8 = 0x02;
const METHOD_REJECT: u8 = 0xFF;

// Read the client greeting and choose a method. Returns the chosen method byte.
fn negotiateMethod(stream: anytype) !u8 {
    var head: [2]u8 = undefined;
    try readExact(stream, &head); // [version][nmethods]
    if (head[0] != SOCKS_VERSION) return error.BadVersion;
    var methods: [255]u8 = undefined;
    const n = head[1];
    try readExact(stream, methods[0..n]); // the client's offered methods

    // We only speak "no auth" here. Walk the list; accept it if offered.
    const chosen: u8 = for (methods[0..n]) |m| {
        if (m == METHOD_NONE) break METHOD_NONE;
    } else METHOD_REJECT;

    try stream.writeAll(&[_]u8{ SOCKS_VERSION, chosen }); // [version][chosen method]
    if (chosen == METHOD_REJECT) return error.NoAcceptableMethod;
    return chosen;
}
The for ... else is one of Zig's quietly lovely shapes (we first leaned on it back in episode 3): the else branch runs only if the loop never breaks, so "the client offered no-auth" and "fall back to reject" land in one expression with no flag variable. And readExact -- a thin loop over stream.read that errors on a short read -- is the same defensive helper from the DNS and HTTP episodes. A proxy talks to hostile clients by definition, so every read is bounds-aware or it's a bug.
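Since readExact is only described here, this is one way it could look: a minimal sketch, assuming any stream type with a read([]u8) method that returns the number of bytes read and zero at end of stream. SliceStream is a hypothetical in-memory stand-in that deliberately hands out at most two bytes per call, the way a slow socket might, and the EndOfStream error name is an assumption as well.

```zig
const std = @import("std");

// Hypothetical in-memory stream; anything with a read([]u8) method would do.
const SliceStream = struct {
    data: []const u8,
    pos: usize = 0,
    chunk: usize = 2, // at most this many bytes per read, like a slow socket

    fn read(self: *SliceStream, dest: []u8) error{ConnectionReset}!usize {
        const n = @min(dest.len, self.chunk, self.data.len - self.pos);
        @memcpy(dest[0..n], self.data[self.pos..][0..n]);
        self.pos += n;
        return n;
    }
};

// Assumed shape of the helper: loop until dest is full, fail on an early EOF.
fn readExact(stream: anytype, dest: []u8) !void {
    var filled: usize = 0;
    while (filled < dest.len) {
        const n = try stream.read(dest[filled..]);
        if (n == 0) return error.EndOfStream; // peer closed before sending everything
        filled += n;
    }
}

test "readExact stitches short reads together and rejects truncation" {
    var ok = SliceStream{ .data = &[_]u8{ 0x05, 0x02, 0x00, 0x02 } };
    var greeting: [4]u8 = undefined;
    try readExact(&ok, &greeting);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x05, 0x02, 0x00, 0x02 }, &greeting);

    var cut = SliceStream{ .data = &[_]u8{ 0x05, 0x02, 0x00 } }; // claims 2 methods, sends 1
    var buf: [4]u8 = undefined;
    try std.testing.expectError(error.EndOfStream, readExact(&cut, &buf));
}
```

The second half of the test is the hostile case: a greeting that promises two methods and delivers one ends in a named error, not in a read of uninitialised bytes.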
Now the interesting parse. The CONNECT request names a destination, and SOCKS5 allows three address types: a 4-byte IPv4 address, a length-prefixed domain name, or a 16-byte IPv6 address. This is precisely what Zig's tagged unions (episode 6) were made for -- one type that is exactly one of several shapes, and a switch the compiler forces you to cover completely:
const ATYP_IPV4: u8 = 0x01;
const ATYP_DOMAIN: u8 = 0x03;
const ATYP_IPV6: u8 = 0x04;

const Destination = union(enum) {
    ipv4: struct { addr: [4]u8, port: u16 },
    domain: struct { name: []const u8, port: u16 }, // borrows from the read buffer
    ipv6: struct { addr: [16]u8, port: u16 },

    // Turn the parsed destination into something std.net can dial.
    fn resolve(self: Destination, alloc: std.mem.Allocator) !std.net.Address {
        return switch (self) {
            .ipv4 => |v| std.net.Address.initIp4(v.addr, v.port),
            .ipv6 => |v| std.net.Address.initIp6(v.addr, v.port, 0, 0),
            .domain => |v| blk: {
                // A domain means the PROXY does the DNS, not the client (episode 82).
                const list = try std.net.getAddressList(alloc, v.name, v.port);
                defer list.deinit();
                if (list.addrs.len == 0) return error.HostUnreachable;
                break :blk list.addrs[0];
            },
        };
    }
};
Notice the .domain case is where SOCKS5 earns its keep for privacy tools: when the client sends a hostname instead of an IP, the proxy resolves it. The client's local DNS never sees the lookup. That's the "DNS leak" everyone frets about with cheap VPN apps -- if the client resolves first and only sends an IP, the proxy can't help you. Send the name, and the resolution moves behind the proxy (using exactly the resolver machinery we built back in episode 82).
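Seen from the client, "send the name" means emitting address type 0x03 with the hostname bytes in the request itself. A minimal sketch of such an encoder follows; encodeConnectDomain and its error names are hypothetical and exist only for this illustration, while the byte layout is the one RFC 1928 defines.

```zig
const std = @import("std");

// Hypothetical client-side encoder: a CONNECT request that carries the hostname
// itself (ATYP 0x03), so the lookup happens on the proxy and not on the client.
fn encodeConnectDomain(buf: []u8, host: []const u8, port: u16) error{ InvalidName, BufferTooSmall }![]const u8 {
    if (host.len == 0 or host.len > 255) return error.InvalidName; // length must fit one byte
    const total = 5 + host.len + 2;
    if (buf.len < total) return error.BufferTooSmall;
    buf[0] = 0x05; // version
    buf[1] = 0x01; // command: CONNECT
    buf[2] = 0x00; // reserved
    buf[3] = 0x03; // address type: domain name
    buf[4] = @intCast(host.len); // one-byte length prefix
    @memcpy(buf[5..][0..host.len], host);
    std.mem.writeInt(u16, buf[5 + host.len ..][0..2], port, .big);
    return buf[0..total];
}

test "domain CONNECT for example.com:443" {
    var buf: [262]u8 = undefined; // 5 header bytes + 255 name bytes + 2 port bytes
    const wire = try encodeConnectDomain(&buf, "example.com", 443);
    try std.testing.expectEqual(@as(usize, 18), wire.len);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x05, 0x01, 0x00, 0x03, 11 }, wire[0..5]);
    try std.testing.expectEqualStrings("example.com", wire[5..16]);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x01, 0xBB }, wire[16..18]);
}
```

A client that resolved the name first would send address type 0x01 and four address bytes here, and the hostname would never reach the proxy at all.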
The request header mirrors the reply: a version byte, a command, a reserved zero, an address type, then the address and a big-endian port. We only implement CONNECT (0x01) -- the other two commands, BIND and UDP ASSOCIATE, are rare enough that a clean "command not supported" is the honest answer:
const CMD_CONNECT: u8 = 0x01;

const Request = struct {
    dest: Destination,
    name_buf: [256]u8 = undefined, // backing storage for a domain name

    fn parse(stream: anytype, self: *Request) !void {
        var head: [4]u8 = undefined;
        try readExact(stream, &head); // [version][cmd][reserved][atyp]
        if (head[0] != SOCKS_VERSION) return error.BadVersion;
        if (head[1] != CMD_CONNECT) return error.CommandNotSupported;
        switch (head[3]) {
            ATYP_IPV4 => {
                var raw: [6]u8 = undefined; // 4 addr + 2 port
                try readExact(stream, &raw);
                self.dest = .{ .ipv4 = .{ .addr = raw[0..4].*, .port = readPort(raw[4..6]) } };
            },
            ATYP_IPV6 => {
                var raw: [18]u8 = undefined; // 16 addr + 2 port
                try readExact(stream, &raw);
                self.dest = .{ .ipv6 = .{ .addr = raw[0..16].*, .port = readPort(raw[16..18]) } };
            },
            ATYP_DOMAIN => {
                var len_byte: [1]u8 = undefined;
                try readExact(stream, &len_byte);
                const len = len_byte[0];
                try readExact(stream, self.name_buf[0..len]); // length-prefixed, max 255
                var port_raw: [2]u8 = undefined;
                try readExact(stream, &port_raw);
                self.dest = .{ .domain = .{ .name = self.name_buf[0..len], .port = readPort(&port_raw) } };
            },
            else => return error.AddressTypeNotSupported,
        }
    }
};

fn readPort(b: []const u8) u16 {
    return std.mem.readInt(u16, b[0..2], .big); // network byte order, same as every episode in this arc
}
That domain length is a single byte, so a name is at most 255 characters -- which is exactly why name_buf is [256]u8 and we never allocate for it. The bound is baked into the protocol, so we lean on a fixed buffer instead of the heap (episode 7's lesson: don't allocate what the spec already bounds for you). And readPort is the same readInt(..., .big) discipline you've now seen in protobuf, msgpack, DNS, and HTTP/2 -- big-endian on the wire, named explicitly, so cross-compilation (episode 35) never reinterprets a port behind your back.
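A tiny worked example of what "named explicitly" buys you. portToWire is a hypothetical helper for this illustration; the std.mem.writeInt and std.mem.readInt calls are the same ones the parser relies on.

```zig
const std = @import("std");

// Encode a port the way SOCKS5 puts it on the wire: most significant byte first.
fn portToWire(port: u16) [2]u8 {
    var out: [2]u8 = undefined;
    std.mem.writeInt(u16, &out, port, .big);
    return out;
}

test "443 is 0x01 0xBB on the wire, whatever the host CPU does" {
    const wire = portToWire(443);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x01, 0xBB }, &wire);
    try std.testing.expectEqual(@as(u16, 443), std.mem.readInt(u16, &wire, .big));
    // Reading the same two bytes with the wrong byte order yields a different port.
    try std.testing.expectEqual(@as(u16, 0xBB01), std.mem.readInt(u16, &wire, .little));
}
```

0xBB01 is port 47873, so a parser that quietly used the host's native order on a little-endian machine would dial the wrong port without any error at all.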
When the proxy tries to connect to the destination, it can fail in a handful of well-defined ways, and SOCKS5 has a reply code for each. This is the same trick we pulled with gRPC status codes last episode: an error set for the failures, and a total mapping to the wire byte so the compiler refuses to let a new error escape unmapped:
const ProxyError = error{
    GeneralFailure,
    ConnectionNotAllowed,
    NetworkUnreachable,
    HostUnreachable,
    ConnectionRefused,
    CommandNotSupported,
    AddressTypeNotSupported,
};

// RFC 1928 reply codes. 0x00 is success; the rest mirror ProxyError.
fn replyCode(err: ?ProxyError) u8 {
    return switch (err orelse return 0x00) {
        error.GeneralFailure => 0x01,
        error.ConnectionNotAllowed => 0x02,
        error.NetworkUnreachable => 0x03,
        error.HostUnreachable => 0x04,
        error.ConnectionRefused => 0x05,
        error.CommandNotSupported => 0x07,
        error.AddressTypeNotSupported => 0x08,
    };
}

// Map a std.net dial failure onto the right SOCKS reply.
fn classifyDialError(e: anyerror) ProxyError {
    return switch (e) {
        error.ConnectionRefused => error.ConnectionRefused,
        error.NetworkUnreachable => error.NetworkUnreachable,
        error.HostUnreachable, error.UnknownHostName => error.HostUnreachable,
        else => error.GeneralFailure, // anything we didn't name degrades to a generic failure
    };
}
This is episode 4's philosophy doing real work again. The replyCode switch is exhaustive over ProxyError, so add a reply reason tomorrow and the build breaks until you give it a byte. The client always learns why a connection failed -- "refused" versus "unreachable" versus "not allowed" -- instead of a useless catch-all, because the type system makes the distinction cheap to keep honest.
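The mapping is worth having in both directions, since a client has to turn the reply byte back into something it can switch on. The sketch below pairs a server-side mapping with a hypothetical client-side checkReply and round-trips every member of the error set. codeFor is a simplified, non-optional variant of replyCode written so the block stands alone; both names are illustrative.

```zig
const std = @import("std");

const ProxyError = error{
    GeneralFailure,
    ConnectionNotAllowed,
    NetworkUnreachable,
    HostUnreachable,
    ConnectionRefused,
    CommandNotSupported,
    AddressTypeNotSupported,
};

// Server side: no else branch, so a new ProxyError member breaks the build here.
fn codeFor(err: ProxyError) u8 {
    return switch (err) {
        error.GeneralFailure => 0x01,
        error.ConnectionNotAllowed => 0x02,
        error.NetworkUnreachable => 0x03,
        error.HostUnreachable => 0x04,
        error.ConnectionRefused => 0x05,
        error.CommandNotSupported => 0x07,
        error.AddressTypeNotSupported => 0x08,
    };
}

// Hypothetical client side: turn the reply byte back into the same error set.
fn checkReply(code: u8) ProxyError!void {
    switch (code) {
        0x00 => return,
        0x02 => return error.ConnectionNotAllowed,
        0x03 => return error.NetworkUnreachable,
        0x04 => return error.HostUnreachable,
        0x05 => return error.ConnectionRefused,
        0x07 => return error.CommandNotSupported,
        0x08 => return error.AddressTypeNotSupported,
        else => return error.GeneralFailure, // 0x01, 0x06 (TTL expired), unassigned codes
    }
}

test "reply codes round-trip through the client mapping" {
    const all = [_]ProxyError{
        error.GeneralFailure,      error.ConnectionNotAllowed, error.NetworkUnreachable,
        error.HostUnreachable,     error.ConnectionRefused,    error.CommandNotSupported,
        error.AddressTypeNotSupported,
    };
    for (all) |e| {
        try std.testing.expectError(e, checkReply(codeFor(e)));
    }
    try checkReply(0x00);
}
```

The client side needs an else branch because the input is a byte from the network, not a closed set. That asymmetry is the point: exhaustiveness is for values you own, a catch-all is for values a peer sends you.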
Once parsed, the proxy dials the destination, then sends a reply with the bound address (the IPv4 zeros below are the conventional "I'm not telling you my local address" filler, which every real client accepts):
fn sendReply(stream: anytype, code: u8) !void {
    // [version][reply code][reserved][atyp=ipv4][bnd.addr 0.0.0.0][bnd.port 0]
    try stream.writeAll(&[_]u8{ SOCKS_VERSION, code, 0x00, ATYP_IPV4, 0, 0, 0, 0, 0, 0 });
}

fn connectDestination(req: *const Request, alloc: std.mem.Allocator) ProxyError!std.net.Stream {
    const address = req.dest.resolve(alloc) catch |e| return classifyDialError(e);
    return std.net.tcpConnectToAddress(address) catch |e| return classifyDialError(e);
}
See how connectDestination returns either a live Stream or a ProxyError, never an ambiguous result. The caller's job is then trivial: on error, send the mapped reply code and close; on success, send 0x00 and start relaying. The whole control-plane -- greeting, parse, dial, reply -- is maybe sixty lines, and after sendReply(stream, 0x00) the proxy never speaks SOCKS again. From here it's pure plumbing.
This is the data plane, and it is gloriously stupid. Two TCP streams, two directions, copy bytes from each into the other until one side closes. The cleanest way (and the one that doesn't burn a CPU on a busy spin) is one thread per direction -- and we already learned to spawn and join threads safely back in episode 30:
// Copy from src to dst until EOF or error. One half of the tunnel.
fn pump(src: std.net.Stream, dst: std.net.Stream) void {
    var buf: [16 * 1024]u8 = undefined; // 16 KiB on the stack, reused every read
    while (true) {
        const n = src.read(&buf) catch break;
        if (n == 0) break; // clean EOF: the source closed its side
        dst.writeAll(buf[0..n]) catch break;
    }
    // Tell the other end we're done writing, so its pump sees EOF and unwinds too.
    std.posix.shutdown(dst.handle, .send) catch {};
}

fn relay(client: std.net.Stream, upstream: std.net.Stream) !void {
    // One thread pushes client -> upstream; we run upstream -> client inline.
    const t = try std.Thread.spawn(.{}, pump, .{ client, upstream });
    pump(upstream, client); // this direction runs on the current thread
    t.join(); // wait for the other half before we let the sockets close
}
The shutdown(.send) at the end of each pump is the part people forget, and forgetting it hangs connections forever. When one direction sees EOF, it half-closes the write side of the socket it was copying into. The peer on that socket then reads EOF, finishes up, and closes its own side, so the opposite pump's read returns zero as well -- both pumps wind down instead of one of them blocking on a read that will never return. That's the difference between a proxy that closes connections cleanly and one that slowly leaks file descriptors until it falls over (episode 71's resource limits will tell you exactly when that happens).
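The loop itself can be exercised without sockets if it is made generic over its two endpoints. The sketch below does that with two hypothetical in-memory types, FakeSource and FakeSink, and a shutdownSend method standing in for std.posix.shutdown(dst.handle, .send). It cannot show a real half-close, only that the shutdown step runs once the source is drained.

```zig
const std = @import("std");

// Hypothetical in-memory endpoints standing in for the two sockets.
const FakeSource = struct {
    data: []const u8,
    pos: usize = 0,

    fn read(self: *FakeSource, dest: []u8) error{ConnectionReset}!usize {
        const n = @min(dest.len, self.data.len - self.pos);
        @memcpy(dest[0..n], self.data[self.pos..][0..n]);
        self.pos += n;
        return n; // 0 once drained, which is how a socket reports EOF
    }
};

const FakeSink = struct {
    buf: [64]u8 = undefined,
    len: usize = 0,
    send_closed: bool = false,

    fn writeAll(self: *FakeSink, bytes: []const u8) error{NoSpace}!void {
        if (bytes.len > self.buf.len - self.len) return error.NoSpace;
        @memcpy(self.buf[self.len..][0..bytes.len], bytes);
        self.len += bytes.len;
    }

    fn shutdownSend(self: *FakeSink) void {
        self.send_closed = true;
    }
};

// Same loop as pump, generic over the endpoints so it runs without sockets.
fn pumpGeneric(src: anytype, dst: anytype) void {
    var buf: [8]u8 = undefined; // deliberately small to force several iterations
    while (true) {
        const n = src.read(&buf) catch break;
        if (n == 0) break;
        dst.writeAll(buf[0..n]) catch break;
    }
    dst.shutdownSend(); // runs on EOF and on error alike
}

test "pump copies everything and then half-closes" {
    var src = FakeSource{ .data = "GET / HTTP/1.1\r\n\r\n" };
    var sink = FakeSink{};
    pumpGeneric(&src, &sink);
    try std.testing.expectEqualStrings("GET / HTTP/1.1\r\n\r\n", sink.buf[0..sink.len]);
    try std.testing.expect(sink.send_closed);
}
```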
The per-connection entry point reads like the three phases it implements -- which is the whole reason we kept each phase honest and small:
fn handleClient(client: std.net.Stream, alloc: std.mem.Allocator) void {
    defer client.close();
    _ = negotiateMethod(client) catch return; // phase 1: agree on "no auth"

    var req: Request = .{ .dest = undefined }; // dest has no default; parse fills it in
    Request.parse(client, &req) catch |e| { // phase 2: read the CONNECT request
        const code: u8 = switch (e) {
            error.CommandNotSupported => 0x07,
            error.AddressTypeNotSupported => 0x08,
            else => 0x01, // bad version, short read, ...: general failure
        };
        sendReply(client, code) catch {};
        return;
    };

    const upstream = connectDestination(&req, alloc) catch |e| {
        sendReply(client, replyCode(e)) catch {}; // dial failed: tell the client why
        return;
    };
    defer upstream.close();

    sendReply(client, 0x00) catch return; // success -- tunnel is open
    relay(client, upstream) catch {}; // phase 3: copy bytes until done
}
Each failure mode sends the right reply byte and bails. There's no shared mutable state between connections, so the accept loop (the same pattern from the HTTP server in episode 51) just spawns handleClient per accepted socket and moves on. Bam, jonguh -- that's a working SOCKS5 proxy, start to finish.
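The accept loop mentioned above could be sketched like this. It assumes the std.net listener API the rest of the episode's code is written against (Address.listen, Server.accept, Stream), and it uses a hypothetical handleClientStub in place of handleClient so the block compiles on its own. The limit parameter is also an illustration device, there so a test can stop the loop; a real proxy would loop forever.

```zig
const std = @import("std");

// Hypothetical stand-in for the episode's handleClient, so the sketch compiles alone.
fn handleClientStub(client: std.net.Stream) void {
    defer client.close();
    client.writeAll("!") catch {};
}

// Accept up to `limit` connections and give each one its own detached thread.
fn acceptLoop(server: *std.net.Server, limit: usize) !void {
    var served: usize = 0;
    while (served < limit) : (served += 1) {
        const conn = try server.accept();
        const t = std.Thread.spawn(.{}, handleClientStub, .{conn.stream}) catch {
            conn.stream.close(); // could not spawn: drop this client, keep serving
            continue;
        };
        t.detach(); // the handler owns the socket from here on
    }
}

test "accept loop hands a connection to a handler thread" {
    const addr = try std.net.Address.parseIp4("127.0.0.1", 0); // port 0: let the OS pick
    var server = try addr.listen(.{});
    defer server.deinit();

    const client = try std.net.tcpConnectToAddress(server.listen_address);
    defer client.close();

    try acceptLoop(&server, 1);
    var one: [1]u8 = undefined;
    try std.testing.expectEqual(@as(usize, 1), try client.read(&one));
    try std.testing.expectEqual(@as(u8, '!'), one[0]);
}
```

Detaching rather than joining is what lets the loop go straight back to accept. The price is that nothing bounds the number of live threads, which is the gap a connection budget would close.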
You can't easily unit-test the relay without two live sockets, but the parsing -- the part most likely to have an off-by-one against a hostile client -- tests beautifully against an in-memory stream (episode 12's discipline again). Here's the domain-name path, the trickiest of the three because of its length prefix:
test "parse CONNECT with domain destination" {
    // version, cmd=connect, reserved, atyp=domain, len=11, "example.com", port 443
    const wire = [_]u8{ 0x05, 0x01, 0x00, 0x03, 11 } ++
        "example.com".* ++ [_]u8{ 0x01, 0xBB }; // 0x01BB = 443
    var fbs = std.io.fixedBufferStream(&wire);
    var req: Request = .{ .dest = undefined }; // dest has no default; parse fills it in
    try Request.parse(fbs.reader(), &req);
    try std.testing.expect(req.dest == .domain);
    try std.testing.expectEqualStrings("example.com", req.dest.domain.name);
    try std.testing.expectEqual(@as(u16, 443), req.dest.domain.port);
}

test "unsupported command is rejected cleanly" {
    const wire = [_]u8{ 0x05, 0x02, 0x00, 0x01 }; // cmd=0x02 (BIND), not supported
    var fbs = std.io.fixedBufferStream(&wire);
    var req: Request = .{ .dest = undefined };
    try std.testing.expectError(error.CommandNotSupported, Request.parse(fbs.reader(), &req));
}
The second test is the one that matters for a network-facing service: a command we don't implement must produce a named error we can map to reply 0x07, not a panic and not a silent mis-parse. And the place to point a fuzzer (episode 12) is Request.parse -- a domain length byte of 255 with only three bytes following must surface as a short-read error from readExact, never a wild read past the buffer. Because every read funnels through that one helper, a malicious greeting gets a clean rejection instead of a crash.
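Short of a real fuzzer, one cheap property test is to truncate a valid request at every possible offset and insist on a clean error each time. The sketch below does that against parseDomainRequest, a hypothetical slice-based twin of the domain branch in Request.parse written for this illustration; the Truncated error name stands in for whatever short-read error readExact reports.

```zig
const std = @import("std");

const Parsed = struct { name: []const u8, port: u16 };

// Hypothetical slice-based twin of the domain branch in Request.parse:
// every index is checked against the input length before it is used.
fn parseDomainRequest(bytes: []const u8) error{ Truncated, BadVersion, Unsupported }!Parsed {
    if (bytes.len < 5) return error.Truncated;
    if (bytes[0] != 0x05) return error.BadVersion;
    if (bytes[1] != 0x01 or bytes[3] != 0x03) return error.Unsupported;
    const len: usize = bytes[4];
    if (bytes.len < 5 + len + 2) return error.Truncated; // name or port cut short
    const name = bytes[5 .. 5 + len];
    const port = std.mem.readInt(u16, bytes[5 + len ..][0..2], .big);
    return .{ .name = name, .port = port };
}

test "every truncation of a valid request is a clean error" {
    const full = [_]u8{
        0x05, 0x01, 0x00, 0x03, 11, // version, CONNECT, reserved, domain, length
        'e',  'x',  'a',  'm',  'p', 'l', 'e', '.', 'c', 'o', 'm',
        0x01, 0xBB, // port 443
    };
    var cut: usize = 0;
    while (cut < full.len) : (cut += 1) {
        try std.testing.expectError(error.Truncated, parseDomainRequest(full[0..cut]));
    }
    const ok = try parseDomainRequest(&full);
    try std.testing.expectEqualStrings("example.com", ok.name);
    try std.testing.expectEqual(@as(u16, 443), ok.port);
}
```

Eighteen inputs are not a fuzzing campaign, but they cover the exact class of bug the paragraph above worries about: a length field that promises more than the sender delivers.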
Here's the thing worth internalising: a proxy's CPU cost is almost never in the protocol. The handshake is a few dozen bytes parsed once per connection -- utterly free. Every cycle that matters is in pump, and specifically in the read/writeAll syscalls moving real traffic. So the optimisation playbook is short and it's all about not touching the data more than once.
The 16 KiB stack buffer in pump is already the right move -- it's reused on every read, so a tunnel that moves a gigabyte allocates exactly zero bytes on the heap (episode 26's reuse lesson, made trivial by a stack array). The next rung up, on Linux, is to stop copying through userspace at all: splice moves bytes between a file descriptor and a pipe inside the kernel, so chaining socket -> pipe -> socket carries the data across without it ever entering your process's address space. That's how the serious proxies (HAProxy and friends) push tens of gigabits without melting:
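// Linux zero-copy: hand the kernel two fds and a pipe, never touch the bytes.
// Assumes a splice wrapper at std.posix.splice; where your std version lacks one,
// the raw syscall (std.os.linux.syscall6 with .splice) takes the same six arguments.
fn spliceOnce(from: std.posix.fd_t, to: std.posix.fd_t, pipe: [2]std.posix.fd_t) !usize {
    const max = 64 * 1024;
    const in = try std.posix.splice(from, null, pipe[1], null, max, 0); // socket -> pipe
    if (in == 0) return 0;
    var left = in;
    while (left > 0) { // the pipe -> socket leg may move fewer bytes than asked
        const out = try std.posix.splice(pipe[0], null, to, null, left, 0);
        if (out == 0) return error.BrokenPipe;
        left -= out;
    }
    return in;
}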
But -- and this is the rule that closes every performance section in this arc -- profile before you reach for it (episode 34). For all but the busiest proxies, the plain 16 KiB copy loop is bottlenecked on the network, not the memcpy, and splice buys you nothing but complexity and a Linux-only code path. Measure the thing under real load, then decide whether the kernel needs to get involved ;-)
In Go, this is almost embarrassingly short. net.Dial gives you the upstream, two goroutines running io.Copy give you the relay, and the runtime's scheduler makes "one goroutine per direction per connection" cost basically nothing even at tens of thousands of connections. The garbage collector owns the buffers, and you'd likely never even think about the shutdown dance because io.Copy plus CloseWrite handles it. It's the most ergonomic of the four, full stop.
In Rust, you'd reach for tokio and tokio::io::copy_bidirectional, which does exactly our relay (including the half-close) in one call, fully async. The borrow checker keeps the two halves of the tunnel from racing on a buffer, and the result is fast and memory-safe -- at the usual price of async colouring and a heavier dependency tree.
In C, you own everything: the poll/epoll loop, the buffers, the half-close logic, and every read that might return EAGAIN. It's the most honest about what's really happening and the easiest to get subtly wrong -- a forgotten shutdown leaks connections, a missed bounds check on the domain length is a remote read primitive (the same footgun we guarded against with readExact).
Zig lands where it keeps landing: we wrote the greeting, the tagged-union address parse, the exhaustive reply mapping, and the threaded relay in a couple hundred readable lines, every buffer visible and every syscall ours. No runtime owns the connection, no GC owns the bytes, and the splice upgrade is right there when a profiler says you need it. For a real deployment you'd add auth, access control, and connection limits -- but you now know precisely what those few dozen handshake bytes mean, and why the data plane is the dumbest part of the whole thing.
Step back and look at what we just built. A proxy solves "the client can't reach the server" by inserting a machine that can reach both -- a willing, well-connected middleman that both sides agree to route through. It works, it's simple, and it's exactly how SSH dynamic forwards and Tor entry nodes get you out of a hostile network. But notice the catch hiding in plain sight: a proxy only works because somebody has a publicly reachable address. The proxy is on the open internet; that's the entire trick.
So what happens when neither machine has a clean public address? Two laptops, each behind its own home router, each with a private address its router rewrites on the way out -- and no friendly proxy in the middle that either of them can reach. That's the genuinely hard version of "two machines find each other", and the box rewriting those addresses has a name and a behaviour we're going to have to understand byte by byte before we can defeat it. The vocabulary we've built across this whole networking arc -- big-endian ports, treating the peer as hostile, half-closing cleanly, dialling by name -- is exactly the vocabulary that fight is written in.
The thread running through all of it hasn't moved since episode 21: a protocol is a contract written in bytes. Parse every length like the sender means you harm, name your endianness so the target never surprises you, and let the tagged union carry the shape so the compiler catches the case you'd otherwise forget. SOCKS5 sounds like infrastructure-grade plumbing from the outside. From the inside it's a few dozen bytes of handshake and a copy loop -- and now you've written both.
Add username/password authentication. Extend negotiateMethod to offer METHOD_USERPASS (0x02) when the client asks for it, then implement the auth subnegotiation from RFC 1929: read [version=0x01][ulen][username][plen][password], check it against a hardcoded credential, and reply [0x01][status] (status 0 = success). Reject the connection cleanly when the password is wrong, and write a test that drives the whole greeting-plus-auth exchange against an in-memory stream.
Enforce a connection budget. A SOCKS proxy with no limits is a denial-of-service waiting to happen. Add an atomic counter (episode 30) of active tunnels, refuse new connections past a configured maximum by sending reply 0x01 (general failure) before relaying, and decrement the counter when handleClient returns. Prove it works with a test that opens the maximum, asserts the next is refused, closes one, and asserts a new one is accepted.
Build a SOCKS5 client. Write the other half: a connectThrough(proxy_addr, dest_host, dest_port) that performs the greeting, sends a domain-name CONNECT request, reads and validates the reply code (returning the right ProxyError on a non-zero reply), and hands back the live Stream ready to use. Test it against your own proxy from this episode by tunnelling a tiny request through localhost and asserting the bytes arrive intact -- the client and server validating each other is the most satisfying test in the whole episode.