Object Properties part 2: examples

In my last post, I went over some of the pros and cons of various proposals for making PHP objects more immutable-ish, and the contexts in which they would be useful. I also posted the link to the PHP Internals list, where it generated some interesting if meandering discussion (as is par for the course on Internals).

One of the requests was for sample code to demonstrate why I felt particular feature proposals were better than others. Fair enough! This post is in response to that request, and I think it will help illuminate the challenges better.

For this exercise, I chose a junior version of the PSR-7 request object as a concrete example. The code below is not exactly PSR-7; it's a naive, reduced-scope version of PSR-7 requests only, written using every applicable PHP 8.0 feature. The goal is not a complete working object, but enough realistic examples of the situations that an immutability plan would need to address.

I have also ignored PSR-7's wrapping of streams for the body, as immutable streams are a vastly more complex topic and not at all relevant to the scope I'm interested in for now.

PSR-7 today

Here's what the current PSR-7-junior implementation looks like, in modern PHP 8.0 code. I'll explain it a bit after the code block, and then show what it would look like under various proposals. If you'd prefer, I also have a Gist with all of the samples. (Hive's code formatting is not as pretty as GitHub's.)

// PSR-7 today

class Request implements RequestInterface
{
   private UriInterface $uri;
   private array $headers = [];
   private string $method = 'GET';
   private string $version = '1.1';

   public function getUri(): UriInterface
   {
       return $this->uri;
   }

   public function getMethod(): string
   {
       return $this->method;
   }

   public function getProtocolVersion(): string
   {
       return $this->version;
   }

   public function getHeaders(): array
   {
       return $this->headers;
   }

   public function getHeader($name): string
   {
       return $this->headers[strtolower($name)] ?? '';
   }

   public function withMethod(string $method): static
   {
       $new = clone($this);
       $new->method = $method;
       return $new;
   }

   public function withProtocolVersion(string $version): static
   {
       if (!in_array($version, ['1.1', '1.0', '2.0'])) {
           throw new InvalidArgumentException();
       }
       $new = clone($this);
       $new->version = $version;
       return $new;
   }

   public function withUri(UriInterface $uri, bool $preserveHost = false): static
   {
       $new = clone($this);

       $new->uri = $uri;

       if ($preserveHost && isset($new->headers['host'])) {
           return $new;
       }

       $new->headers['host'] = $new->uri->getHost();

       return $new;
   }

   public function withHeader(string $name, string $value): static
   {
       $new = clone($this);
       $new->headers[strtolower($name)] = $value;
       return $new;
   }
}

$r1 = new Request();
$r2 = $r1->withMethod('POST');
$r3 = $r2->withUri(new Uri('https://php.net/'));
$r4 = $r3->withProtocolVersion('2.0');
$r5 = $r4->withHeader('cache', 'none');

print $r5->getMethod();
print $r5->getUri()->getHost();
print $r5->getProtocolVersion();
print_r($r5->getHeaders());
print $r5->getHeader('cache');

In this example, we have 4 properties. All are private, but are exposed through getters.

  • $uri is an inner object.
  • $headers is an inner array, but could be any more complex structure.
  • $method is an inner string; in practice it has a limited set of allowed values, but for the purposes of this example it's unrestricted.
  • $version is an inner string, with an explicitly limited set of allowed values. This is a stand-in for any extra validation that could be needed.

In practice, both $method and $version could and should be covered by enumerations if/when those pass, but we'll put that aside for the moment.
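
To make that concrete, here's a minimal sketch of $version as an enumeration. It assumes backed enums as available from PHP 8.1; HttpVersion and MiniRequest are made-up names for illustration, not part of PSR-7.

```php
<?php

// Hypothetical enum; the cases mirror the versions allowed in the example above.
enum HttpVersion: string
{
    case V1_0 = '1.0';
    case V1_1 = '1.1';
    case V2_0 = '2.0';
}

// Hypothetical cut-down request that holds only the version.
final class MiniRequest
{
    private HttpVersion $version = HttpVersion::V1_1;

    public function getProtocolVersion(): HttpVersion
    {
        return $this->version;
    }

    public function withProtocolVersion(HttpVersion $version): static
    {
        // No in_array() check: the type declaration is the validation.
        $new = clone $this;
        $new->version = $version;
        return $new;
    }
}

$r = (new MiniRequest())->withProtocolVersion(HttpVersion::from('2.0'));
print $r->getProtocolVersion()->value . "\n";

// An unknown string never becomes an HttpVersion in the first place.
var_dump(HttpVersion::tryFrom('the old one'));
```

With the legal values captured by the type, there's nothing left for the with-er to validate.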

Of particular note is that, per PSR-7, the Host header key in $headers must match up with the Host property of $uri. That is our representative sample for inter-relations between properties.

In conventional code, all four properties are accessed via a getX() method. Most are highly boring, save getHeader() which has some extra processing to do. Changing a value is done via a withX() method, which clones the object, then modifies the clone and returns it. That modification may be simple or elaborate depending on the field in question.

I want to highlight: This currently conventional code works. It accomplishes the goal of an object that is, externally, immutable, yet still reasonably efficient. The downside is that it's verbose, and has lots of boilerplate. Any improvement proposals would need to be at least as easy to write, read, and use as this example, while not losing any guarantees.

initonly with clone-with

Here's what the same code would look like with the addition of two language features:

  • A clone $obj with {proplist} syntax that allows an object to be cloned and the clone modified immediately.
  • An initonly flag on properties that would make the property read only except in certain "initialization" contexts: __construct(), __unserialize(), and clone with.

// PSR-7 with initonly and clone-with

class Request implements RequestInterface
{
   public initonly UriInterface $uri;
   public initonly array $headers = [];
   public initonly string $method = 'GET';
   public initonly string $version = '1.1';

   public function getHeader($name): string
   {
       return $this->headers[strtolower($name)] ?? '';
   }

   public function withProtocolVersion(string $version): static
   {
       if (!in_array($version, ['1.1', '1.0', '2.0'])) {
           throw new InvalidArgumentException();
       }
       return clone($this) with { version: $version };
   }

   public function withUri(UriInterface $uri, bool $preserveHost = false): static
   {
       $args = ['uri' => $uri];

       // If headers were itself a pseudo-immutable object,
       // this would be even uglier.
       // Update the host header unless we were asked to preserve an existing one.
       if (!($preserveHost && isset($this->headers['host']))) {
           $headers = $this->headers;
           $headers['host'] = $uri->host;
           $args['headers'] = $headers;
       }

       return clone($this) with { ...$args };
   }

   public function withHeader(string $name, string $value): static
   {
       $headers = $this->headers;
       $headers[strtolower($name)] = $value;
       return clone($this) with { headers: $headers };
   }
}

$r1 = new Request();
$r2 = clone $r1 with { method: 'POST' };
$r3 = $r2->withUri(new Uri('https://php.net/'));
$r4 = $r3->withProtocolVersion('2.0');
$r5 = $r4->withHeader('cache', 'none');

print $r5->method;
print $r5->uri->host;
print $r5->version;
print_r($r5->headers);
print $r5->getHeader('cache');

// This becomes allowed, but shouldn't be.
$r6 = clone($r5) with {
   uri: new Uri('https://python.org/'),
   headers: ['host' => 'http://java.com/'],
   version: 'the old one',
};

In this version, all four properties can be made public, which eliminates the need for all but one getX() method. getHeader() is unchanged. (This assumes the same change made to the Uri class, which seems reasonable.)

  • The withMethod() method can also be omitted, as its contents are now safe to run externally. Its contents are just moved outside the object.
  • withUri(), withProtocolVersion(), and withHeader() are all still needed. The first has to also modify the $headers property to keep it in sync. The second has to validate legal values. The third needs to munge the input a bit before setting it for case consistency.

So while it makes the read case easier, it doesn't help the write case all that much. The write case is only externalized in the case where the type system is able to completely and totally represent the legal values and relationships. That is certainly true in many cases, but clearly not all or even most. Whether the clone with syntax is something to expose to calling code is debatable. (I could probably argue both sides quite easily.)

Additionally, the withUri() method gets a bit uglier as all new properties need to be computed in advance. The number of values to set in the with clause is also dynamic, suggesting a need for a variadic signature, as shown here. Whether that would be feasible, or desirable, is another question.

An interesting side effect, though, is that in this setup any property could be modified externally with a clone-with, including in ways that are not valid. That is, it makes it trivially easy to bypass any non-type validation on a public initonly property. That, to me, is a death knell for this particular combination.
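
The back door is easy to reproduce in PHP 8.0 with a stand-in. This is only a sketch: clone_with() is a hypothetical helper that mimics what clone with would do, a plain public property stands in for public initonly, and MiniRequest is a made-up, cut-down class.

```php
<?php

// Hypothetical stand-in for the proposed `clone $obj with {...}` syntax.
// It assigns straight to public properties, as clone-with would.
function clone_with(object $obj, array $props): object
{
    $new = clone $obj;
    foreach ($props as $name => $value) {
        $new->$name = $value;
    }
    return $new;
}

// Hypothetical cut-down request; `public` stands in for `public initonly`.
class MiniRequest
{
    public string $version = '1.1';

    public function withProtocolVersion(string $version): static
    {
        if (!in_array($version, ['1.1', '1.0', '2.0'], true)) {
            throw new InvalidArgumentException($version);
        }
        return clone_with($this, ['version' => $version]);
    }
}

$r1 = new MiniRequest();

// The validated path rejects bad input...
try {
    $r1->withProtocolVersion('the old one');
} catch (InvalidArgumentException $e) {
    print "rejected by withProtocolVersion()\n";
}

// ...but the direct path only checks the type, so the bad value gets in.
$r2 = clone_with($r1, ['version' => 'the old one']);
print $r2->version . "\n";
```

The type check passes because 'the old one' is a perfectly good string; nothing else stands between the caller and the property.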

initonly with clone arguments

The second combination keeps the same initonly flag but, instead of clone with, modifies __clone() to take arguments. It's very similar to clone with, except that the __clone() method receives those arguments and can do whatever additional validation is needed. It could even do things with them other than assign them to properties if desired. In this case, the __clone() method is an "initialization context" in which properties are mutable.

Here's what this combination looks like:

// PSR-7 with initonly and __clone args

class Request implements RequestInterface
{
   public initonly UriInterface $uri;
   public initonly array $headers = [];
   public initonly string $method = 'GET';
   public initonly string $version = '1.1';

   public function __clone(...$args)
   {
       foreach ($args as $k => $v) {
           switch ($k) {
               case 'version':
                   if (!in_array($v, ['1.1', '1.0', '2.0'])) {
                       throw new InvalidArgumentException($k);
                   }
                   $this->version = $v;
                   break;
               case 'uri':
                   // This will type fail for us if $v isn't a Uri object.
                   $this->uri = $v;
                   $this->headers['host'] = $v->host;
                   break;
               case 'method':
               case 'headers':
                   $this->$k = $v;
                   break;
           }
       }
   }

   public function getHeader($name): string
   {
       return $this->headers[strtolower($name)] ?? '';
   }

   // Still needed because of the $preserveHost = true option.
   public function withUri(UriInterface $uri, bool $preserveHost = false): static
   {
       $args = ['uri' => $uri];

       // __clone() syncs the host header whenever it is given a uri, so to
       // preserve the old host we pass the current headers after it.
       // If headers were itself a pseudo-immutable object, this would be even uglier.
       if ($preserveHost && isset($this->headers['host'])) {
           $args['headers'] = $this->headers;
       }

       return clone($this, ...$args);
   }

   public function withHeader(string $name, string $value): static
   {
       $headers = $this->headers;
       $headers[strtolower($name)] = $value;
       return clone($this, headers: $headers);
   }
}

$r1 = new Request();
$r2 = clone($r1, method: 'POST');
$r3 = clone($r2, uri: new Uri('https://php.net/'));
$r4 = clone($r3, version: '2.0');
$r5 = $r4->withHeader('cache', 'none');

print $r5->method;
print $r5->uri->host;
print $r5->version;
print_r($r5->headers);
print $r5->getHeader('cache');

// This will now error out.
$r6 = clone($r5,
   uri: new Uri('https://python.org/'),
   headers: ['host' => 'http://java.com/'],
   version: 'the old one',
);

As with the previous version, most of the getX() methods can be omitted save one. This approach also allows more withX() methods to be omitted, as they are now "safe." The validation is handled by the __clone() method, which will also cause an invalid clone command to error out.

The main downside of this approach is that the __clone() method that results is absolute crap. It shoves all validation and extra handling into a single giant switch-based mess. It would be slightly cleaner if each property were spelled out as an argument to __clone() to type-check a little earlier, but that is just extra work to do, and would not make the code inside that method any nicer. Looking at it, I hate almost everything about the resulting __clone() method. The ability to remove a few withX() methods is not worth that heartache.
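
Here's a rough sketch of that "spelled out" variant. Since __clone() can't take arguments in PHP 8.0, an ordinary method stands in for it; cloneWith() and MiniRequest are made-up names, and $uri is left out to keep it short.

```php
<?php

// Hypothetical cut-down request; $uri is omitted to keep the sketch short.
class MiniRequest
{
    private array $headers = [];
    private string $method = 'GET';
    private string $version = '1.1';

    // Stand-in for a __clone() with one typed, optional parameter per property.
    public function cloneWith(
        ?array $headers = null,
        ?string $method = null,
        ?string $version = null,
    ): static {
        $new = clone $this;
        if ($version !== null) {
            if (!in_array($version, ['1.1', '1.0', '2.0'], true)) {
                throw new InvalidArgumentException('version');
            }
            $new->version = $version;
        }
        if ($method !== null) {
            $new->method = $method;
        }
        if ($headers !== null) {
            $new->headers = $headers;
        }
        return $new;
    }

    public function getMethod(): string
    {
        return $this->method;
    }

    public function getProtocolVersion(): string
    {
        return $this->version;
    }
}

$r1 = new MiniRequest();
$r2 = $r1->cloneWith(method: 'POST', version: '2.0');
print $r2->getMethod() . ' ' . $r2->getProtocolVersion() . "\n";

try {
    $r2->cloneWith(version: 'the old one');
} catch (InvalidArgumentException $e) {
    print "invalid " . $e->getMessage() . "\n";
}
```

Types get checked at the call and an unknown argument name fails outright, but the body is still one branch per property. It's the same mess with if in place of switch.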

Asymmetric visibility and clone with

In this model, properties can be flagged as publicly readable but writable only privately. They are never truly immutable, but rely on the discipline of the class author to not modify them in-place when they shouldn't be. The clone with logic is the same as before, although the idea of an "initialization context" is no longer relevant.

// PSR-7 with asymmetric visibility and clone-with

class Request implements RequestInterface
{
   get:public set:private UriInterface $uri;
   get:public set:private $headers = [];
   get:public set:private $method = 'GET';
   get:public set:private $version = '1.1';

   public function getHeader($name): string
   {
       return $this->headers[strtolower($name)] ?? '';
   }

   public function withMethod(string $method): static
   {
       return clone($this) with {method: $method};
   }

   public function withProtocolVersion(string $version): static
   {
       if (!in_array($version, ['1.1', '1.0', '2.0'])) {
           throw new InvalidArgumentException();
       }
       return clone($this) with {version: $version };
   }

   public function withUri(UriInterface $uri, bool $preserveHost = false): static
   {
       $new = clone($this) with {uri: $uri};

       if ($preserveHost && isset($new->headers['host'])) {
           return $new;
       }

       $new->headers['host'] = $new->uri->host;

       return $new;
   }

   public function withHeader(string $name, string $value): static
   {
       $headers = $this->headers;
       $headers[strtolower($name)] = $value;
       return clone($this) with { headers:  $headers };
   }
}

$r1 = new Request();
$r2 = $r1->withMethod('POST');
$r3 = $r2->withUri(new Uri('https://php.net/'));
$r4 = $r3->withProtocolVersion('2.0');
$r5 = $r4->withHeader('cache', 'none');

print $r5->method;
print $r5->uri->host;
print $r5->version;
print_r($r5->headers);
print $r5->getHeader('cache');

// This errors out correctly, because the properties
// are not publicly settable.
$r6 = clone($r5) with {
   uri: new Uri('https://python.org/'),
   headers: ['host' => 'http://java.com/'],
   version: 'the old one',
};

As with the previous examples, all but one getX() method goes away. However, none of the withX() methods do, since no property is publicly settable. (initonly was only able to omit a single withX() method.) The public clone with call also errors out, as it should, since the properties are not publicly settable.
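
For what it's worth, the read-public, write-private behavior can be roughly approximated in PHP 8.0 with a private property and __get(). This is a sketch of the semantics only, not the proposed syntax, and MiniRequest is a made-up, cut-down class.

```php
<?php

// Hypothetical cut-down request. A private property plus __get() gives
// roughly "get:public set:private" behavior in PHP 8.0.
class MiniRequest
{
    private string $method = 'GET';

    public function __get(string $name): mixed
    {
        return $this->$name;
    }

    public function withMethod(string $method): static
    {
        $new = clone $this;
        $new->method = $method; // allowed: we are inside the class
        return $new;
    }
}

$r = (new MiniRequest())->withMethod('POST');
print $r->method . "\n"; // a read from outside goes through __get()

try {
    $r->method = 'DELETE'; // a write from outside hits the private property
} catch (Error $e) {
    print "write refused\n";
}
```

The magic method costs a function call on every read, which is part of why a native flag would be nicer.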

Internally, the withX() methods all get simpler, but only one of them becomes a single line that could be condensed with short functions. (The same one that would have been omitted entirely with initonly.) As an aside, Nikita's recent RFC to allow associative arrays to use unpacking would likely simplify this code even further and make withHeader() a single expression, too.
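
As a sketch of what that unpacking buys, assuming string-key array unpacking as available from PHP 8.1 (MiniRequest is a made-up, cut-down class): the new headers array is built in one expression, which under clone with could sit directly inside the with clause.

```php
<?php

// Hypothetical cut-down request that carries only headers.
class MiniRequest
{
    private array $headers = [];

    public function getHeaders(): array
    {
        return $this->headers;
    }

    public function withHeader(string $name, string $value): static
    {
        $new = clone $this;
        // Later keys win, so an existing header of the same name is replaced.
        $new->headers = [...$this->headers, strtolower($name) => $value];
        return $new;
    }
}

$r = (new MiniRequest())
    ->withHeader('Cache', 'none')
    ->withHeader('cache', 'no-store')
    ->withHeader('Accept', 'text/html');

print_r($r->getHeaders());
```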

withUri() in particular feels considerably simpler than its initonly counterpart.

I'd say this version wins for making writing easier, but is kind of a wash with initonly in terms of making using it easier. It retains all the withX() methods, but doesn't provide a back door way to create an invalid object.

Asymmetric visibility and clone arguments

The final combination would be asymmetric visibility with the expanded __clone() option. It's very similar to the previous example, and I'd say offers absolutely no benefit over that one but includes the fugly __clone() mess.

// PSR-7 with asymmetric visibility and __clone args

class Request implements RequestInterface
{
   get:public set:private UriInterface $uri;
   get:public set:private $headers = [];
   get:public set:private $method = 'GET';
   get:public set:private $version = '1.1';

   public function __clone(...$args)
   {
       foreach ($args as $k => $v) {
           switch ($k) {
               case 'version':
                   if (!in_array($v, ['1.1', '1.0', '2.0'])) {
                       throw new InvalidArgumentException($k);
                   }
                   $this->version = $v;
                   break;
               case 'uri':
                   // This will type fail for us if $v isn't a Uri object.
                   $this->uri = $v;
                   $this->headers['host'] = $v->host;
                   break;
               case 'method':
               case 'headers':
                   $this->$k = $v;
                   break;
           }
       }
   }

   public function getHeader($name): string
   {
       return $this->headers[strtolower($name)] ?? '';
   }

   public function withMethod(string $method): static
   {
       return clone($this, method: $method);
   }

   public function withProtocolVersion(string $version): static
   {
       if (!in_array($version, ['1.1', '1.0', '2.0'])) {
           throw new InvalidArgumentException();
       }
       return clone($this, version: $version);
   }

   public function withUri(UriInterface $uri, bool $preserveHost = false): static
   {
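       // __clone() already syncs the host header to the new URI.
       $new = clone($this, uri: $uri);

       // Private set visibility lets us restore the old host in place.
       if ($preserveHost && isset($this->headers['host'])) {
           $new->headers['host'] = $this->headers['host'];
       }

       return $new;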
   }

   public function withHeader(string $name, string $value): static
   {
       $headers = $this->headers;
       $headers[strtolower($name)] = $value;
       return clone($this, headers:  $headers);
   }
}

$r1 = new Request();
$r2 = $r1->withMethod('POST');
$r3 = $r2->withUri(new Uri('https://php.net/'));
$r4 = $r3->withProtocolVersion('2.0');
$r5 = $r4->withHeader('cache', 'none');

print $r5->method;
print $r5->uri->host;
print $r5->version;
print_r($r5->headers);
print $r5->getHeader('cache');

// This errors out, because __clone() rejects
// the invalid version.
$r6 = clone($r5,
   uri: new Uri('https://python.org/'),
   headers: ['host' => 'http://java.com/'],
   version: 'the old one',
);

There's not much else to say on this option, other than "but... why?"

Conclusions

I think this was a highly valuable exercise; I certainly learned quite a bit from it. My conclusions in no particular order:

  • All of the options are effectively identical on reading from a safely exposed property, and all are nicer to work with than the status quo.
  • Adding arguments to __clone() is a terrible idea. Let's not do that.
  • clone with is useful, but it's not the prettiest thing externally. It absolutely has use cases, but most of the time it will likely end up internalized within a method anyway.
  • public initonly is great for simple cases, but breaks encapsulation and leads to invalid data on more complex use cases.
  • Asymmetric visibility effectively doesn't even try to make public-set nicer; it just makes public-get easy, and leaves the set/with workflow unchanged. (clone with makes writing withX() methods nicer.)

That is, of the available options, I am still of the belief that clone with and asymmetric visibility, in combination, is the best option.

However, it also suggests two other factors I had not previously considered, or at least not explicitly.

First, "modify only on cloning" does have some valid use cases, even in an otherwise asymmetric visibility world. I'm not sure if there's a non-ugly way to capture that. Perhaps, and this is just off-the-cuff, something like this:

get:public set:initonly $prop

Which would be world-readable, and settable only via clone with? I'm not sure I like the code that would result internally (compare the different versions of withUri()), but it's something to consider. Maybe initonly is still internally writable, just publicly writable only in clone?

Second, this really calls out the need for a way to integrate more validation into property definitions. Enumerations will certainly help here by allowing more use cases to be encapsulated directly into the type system. However, there are other cases that will never be reasonable to implement in a purely declarative fashion.

Do we want to explore baking "validate on set" callbacks into the object design? Or is that a case of "that's what property accessors are for, dummy, just go all the way?" (Assuming a performant implementation could be found.) I'm not sure, but it's something to consider. I think I'd probably lean toward a set accessor and a validation callback being the same thing, and thus not favor a dedicated validation process, but it's more food for thought.
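
To give that food for thought some shape, here's a rough PHP 8.0 sketch of per-property validation callbacks. Everything in it is an assumption for illustration: the VALIDATORS map, the with() method standing in for clone with, and the cut-down MiniRequest class.

```php
<?php

// Hypothetical cut-down request. The VALIDATORS map stands in for
// "validate on set" callbacks attached to property definitions.
class MiniRequest
{
    private string $method = 'GET';
    private string $version = '1.1';

    private const VALIDATORS = [
        'version' => 'validateVersion',
    ];

    public function __get(string $name): mixed
    {
        return $this->$name;
    }

    // Stand-in for clone-with: every write runs the property's validator first.
    public function with(mixed ...$props): static
    {
        $new = clone $this;
        foreach ($props as $name => $value) {
            if (!property_exists($this, (string) $name)) {
                throw new InvalidArgumentException((string) $name);
            }
            $validator = self::VALIDATORS[$name] ?? null;
            if ($validator !== null) {
                $this->$validator($value);
            }
            $new->$name = $value;
        }
        return $new;
    }

    private function validateVersion(string $version): void
    {
        if (!in_array($version, ['1.1', '1.0', '2.0'], true)) {
            throw new InvalidArgumentException('version');
        }
    }
}

$r = (new MiniRequest())->with(method: 'POST', version: '2.0');
print $r->method . ' ' . $r->version . "\n";

try {
    $r->with(version: 'the old one');
} catch (InvalidArgumentException $e) {
    print "invalid " . $e->getMessage() . "\n";
}
```

Written out like this, the validator is doing exactly what a set accessor would do, which is much of why I lean toward treating them as the same thing.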
