Identity, Ontology, and Pass-by-Value

Surprisingly, parameter passing is a contentious issue. One wouldn’t expect such an apparently simple thing to provoke disagreement, but it does. Perhaps the most prominent disagreement involves pass-by-value in C, C++ and Java. According to the language designers, these languages always pass by value (leaving aside C++ parameters that are explicitly declared as references); but according to many programmers, arrays are passed by var (that is, by reference) and in Java all objects are passed by var.

The roots of this controversy can be traced back to the ancient Greeks, who argued over how many kinds of enduring Substance there were. No, really. The argument over parameter passing comes down to ontology --what categories of things there are.

The two sides of the debate might be called the cyber-materialists and the cyber-dualists. According to the cyber-materialists, there is just one ultimate category of Substance and all programming language objects are essentially --or should be viewed as-- concrete material things. The cyber-dualists agree that concrete material things exist but maintain that there are also abstract things which are radically different.

To a cyber-materialist the pass-by-value issue is simple: when a parameter is passed by value, the called function cannot modify it. When a parameter is passed by var, the called function can modify it. In C, C++ and Java arrays are passed in such a way that the called function can modify the value, therefore arrays are passed by var. QED.
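The observation that the cyber-materialist starts from is easy to reproduce. Here is a minimal Java sketch of it; the class name MaterialistView and the helper zeroFirst are made up for illustration.

```java
import java.util.Arrays;

public class MaterialistView {
    // Hypothetical helper: writes through the array it was handed.
    static void zeroFirst(int[] a) {
        a[0] = 0;
    }

    public static void main(String[] args) {
        int[] data = {7, 8, 9};
        zeroFirst(data);
        // The caller sees the change made by the called function.
        System.out.println(Arrays.toString(data)); // [0, 8, 9]
    }
}
```

This much both sides agree on. The disagreement is over what to call it.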

To a cyber-dualist, the situation is quite a bit more complex. But before we get into the weeds of cyber-dualism, let’s talk about Bob. Bob is a typical grad student of the 1980s (we have to pick a time before all documents are stored on-line and passed by email or it confuses the issues). Bob drives a sweet 1972 Firebird Trans Am and he has just finished a paper on the semantics of pattern matching in SNOBOL4. Today, he has two errands to run: he must send his paper to the publisher and he must take his car in to be detailed because he has a first date tonight with a hot grad student from the business college.

As Bob is sitting in his Firebird holding his paper in his hand, he has two choices about how to send the paper. First, he can shove it into a manila envelope and send it to the publisher; second, he can make a copy of it and send the copy to the publisher. Either choice works.

For the car, Bob has no such option; he has to take in the very car that he is sitting in. Even if he could make an exact copy of the car, dirt and all, the car that gets detailed must be the car that he takes tonight. His date is not going to be impressed if he shows up in a filthy car but assures her that there is an identical clean copy at the detailer's.

The difference here is that the car is a concrete physical object. It is not the same object as any car that is not exactly in the same place at the same time. Even if you could make a car that was identical in all respects except for its physical location, there would be two different cars that would go on to become more different over time. By contrast, the SNOBOL4 paper is an abstract object that is represented physically with ink on paper but is not itself physical. Once the paper is published there will be many physical copies, but there will always be just one paper on the semantics of SNOBOL4 pattern matching written by Bob (Bob is getting out of semantics because compilers are more fun).

So in the real world there are these two dramatically different categories of things, and cyber-dualists see the same two categories in computer programs. In Java, an int is an abstract value represented by bits on the computer. The int itself is not the bits. The int is not in the computer at all; it is an abstraction that has no physical existence. If you want to pass an int to another function --or to another machine for that matter-- you send a copy of the bits that represent the int, but since the bits only represent the int, you are only sending a reference to the int, not the int itself.

If the called function changes the bits, it does not thereby change the int that the calling function is referencing because the calling function has a different copy of the reference. This is what call-by-value means: the caller sends a copy of its reference so the callee cannot change the caller's reference. In call by reference, the caller sends a reference to its reference. In this case, the callee can access the caller's reference and modify the bits in it, thereby making the caller's reference refer to a different int.
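The int case can be sketched in a few lines of Java. The names IntByValue and increment are invented for this example. The point is only that the callee works on its own copy of the bits.

```java
public class IntByValue {
    // Hypothetical helper: changes its own copy of the bits and returns the result.
    static int increment(int n) {
        n = n + 1; // affects only the callee's local variable
        return n;
    }

    public static void main(String[] args) {
        int x = 41;
        int y = increment(x);
        System.out.println(x); // 41, the caller's variable is untouched
        System.out.println(y); // 42
    }
}
```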

By contrast, a Java object is a concrete object that exists in a particular time and place, an object that changes over time. Changing the bits of an object is the same as changing the object itself. So when a caller sends a reference to an object, it is sending a reference to something that can change. If the called function changes this thing, then naturally the caller will see the effects of the change.

But here is the important part: just as in the case of int, the caller has only passed a copy of its reference to the object. The called function cannot change the caller's reference, so when the function returns, you know that whatever the object looks like, the object that the caller is referencing is still the very same object that it was referencing before the call. The called function cannot change what object the caller is holding on to. So let me stress this: it is call by value because the called function cannot change which object the calling function is referencing. In call by reference, the calling function passes a reference to its reference. This lets the called function change the reference of the caller, and so lets it change which object the caller is referencing.
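A small Java sketch makes the distinction visible. The Car class and the helpers detail and swapForCleanCopy are hypothetical, named after Bob's errand. One helper changes the object it was handed; the other tries to substitute a different object and fails to affect the caller.

```java
public class ObjectByValue {
    // Hypothetical concrete object with state that changes over time.
    static final class Car {
        boolean clean = false;
    }

    // Changes the object itself; the caller will see this.
    static void detail(Car c) {
        c.clean = true;
    }

    // Rebinds only the callee's copy of the reference; the caller is unaffected.
    static void swapForCleanCopy(Car c) {
        c = new Car();
        c.clean = true;
    }

    public static void main(String[] args) {
        Car firebird = new Car();
        Car sameCar = firebird; // remember which object the caller holds

        swapForCleanCopy(firebird);
        System.out.println(firebird == sameCar); // true, still the very same car
        System.out.println(firebird.clean);      // false, and still dirty

        detail(firebird);
        System.out.println(firebird == sameCar); // true, still the very same car
        System.out.println(firebird.clean);      // true, but now it has changed
    }
}
```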

Bob sent both his paper and his car by value. The journal editor could write obscene comments on the paper, but that would not change the paper that Bob wrote; it only changes the editor's physical instance of the paper. If the detailer writes obscene comments on the Firebird, it very definitely is changing Bob's car and Bob probably won't be happy about it. But it is still the same car; it is still Bob's Firebird. The detailer can't change that. Even if the detailer steals the car and sells it off to a chop shop, it is still the same car as long as it is a car at all (the analogy breaks down here because the detailer can, in fact, destroy the car).

So this is the real difference between pass by value and pass by reference as it was intended by the language designers: it isn't whether the called function can modify the object --that is a function of what category of object it is. The difference is that in call by value, the called function has an independent reference to the object and cannot modify the caller's reference. In call by reference the called function does have access to the caller's reference and can change what value or object the caller is referencing.
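Java has no call by reference, so the best it can do is imitate one. The sketch below is an emulation under that assumption, not a language feature: the caller wraps its reference in a hypothetical holder class, Ref, and passes the holder, which plays the part of "a reference to its reference". The callee can then change what the caller ends up referencing.

```java
public class SimulatedByReference {
    // Hypothetical holder standing in for "a reference to a reference".
    static final class Ref<T> {
        T value;
        Ref(T value) { this.value = value; }
    }

    // The callee changes which object the caller's slot refers to.
    static void replace(Ref<String> slot) {
        slot.value = "replacement";
    }

    public static void main(String[] args) {
        Ref<String> slot = new Ref<>("original");
        replace(slot);
        System.out.println(slot.value); // replacement
    }
}
```

Note that the holder itself is still passed by value: the callee cannot make the caller's slot variable point at a different Ref, only change what is inside it.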

There is more to this story, including the argument of the cyber-abstractionists and their relationship to functional languages.