Is a total, impure function considered to be Referentially Transparent (RT)?
Example:
// Given an 'id', try to retrieve an optional Person
// from the database, i.e. side-effect.
// Note that 'String' is a poor choice for the `Left`,
// but it's not important for this question
def getPerson(id: Long): Either[String, Option[Person]] =
// ignore implementation
Assuming that getPerson
is implemented to always return an Either
, i.e. ignoring any fatal errors (Out of Memory, Stack Overflow, etc.), is it considered to be RT?
Perhaps the possibility of an Out of Memory or Stack Overflow error breaks RT?
I read https://softwareengineering.stackexchange.com/a/234237/72147, but I’m not sure of the answer to my question.
2
The definition of RT that we use in “the red book” is as follows:
An expression
e
is referentially transparent if for every program
p
, every occurrence ofe
inp
can be replaced by the result of
evaluatinge
without affecting the meaning ofp
.
So the expression getPerson(0)
is considered to be RT if wherever you see getPerson(0)
, you could replace it with the result of the call.
Whether you can do this depends on the implementation. If the implementation contains an immutable map from Long
to Person
, and is simply a lookup in that never-changing map, then getPerson(x)
is RT for any x
and getPerson
is considered a pure function.
If instead the call to getPerson(x)
makes a connection to a time-varying database across a network, then the result is going to vary depending on the context. In other words, then getPerson(x)
is not purely a function of x
.
Concerning your other question, about whether OOMEs and SOEs break RT, the answer gets a bit more nuanced. You could argue that no complex Scala expression is RT. Consider:
case class Foo(s: String)
Now, Foo("hello") == Foo("hello")
. But any two expressions that evaluate to Foo("hello")
are observably different, because Foo("hello") eq Foo("hello")
is false
. So given the existence of eq
, the expression Foo("hello")
is not RT.
We’ll have to alter our definition of RT a bit:
An expression e is referentially transparent with regard to a program
p
if every occurrence ofe
inp
can be replaced by the result of evaluatinge
without affecting the meaning ofp
.
With this definition, functional programming becomes an optimization problem rather than an absolute. That is, we’re concerned with programming using expressions that are RT with regard to as many programs as possible.
So when we say “RT”, it’s really implied that we mean “RT with regard to the class of programs that don’t use eq
, or run out of memory, or dump core, or get into an infinite loop.”
2
Referential transparency is not some formally defined concept but
usually means that an expression always evaluates to the same result in any context. Source
Here’s a simple test for referential transparency for an expression e
: does inlining a declaration of e
change (potential) behavior?
That is, is there a difference between:
val x = e
doFoo(x)
doBar(x)
and the following
doFoo(e)
doBar(e)
In this case there is a huge difference! For example, this code will work fine:
val user = getPerson(32)
deleteUser(user)
displayUserDeletion(user)
whereas this code will blow up
deleteUser(getPerson(32))
displayUserDeletion(getPerson(32))
So this is definitely not referentially transparent. As for your second question:
Perhaps the possibility of an Out of Memory or Stack Overflow error breaks RT?
Usually when we think of referential transparency or really any property of our programs we are thinking of them as running on perfect machines with infinite memory. This simplifying assumption is of course wrong but really necessary to get anywhere in talking about programs. Basically, stack overflow and OOM “break RT” then RT is always broken because for virtually any expression e
there is some sufficiently tiny expression computer such that
val x = e
doFoo(x)
doBar(x)
works but
doFoo(e)
doFoo(e)
runs out of memory. To avoid referential transparency being a useless concept we have to ignore the limitation of our runtimes.
This definition also shows why we should care about RT, as it enables powerful optimizations such as Common Subexpression Elimination (Haskell compilers do this) where if e
is RT we can make it so that it is only evaluated once, e.g. by transforming
doFoo(e)
doBar(e)
into
val x = e
doFoo(x)
doBar(x)
As for the question in the title, there actually are impure functions that are referentially transparent, but it’s quite an unusual circumstance:
- Suppose we have a mutable object
S
that only has one method for modifying it,S.f()
, that is is idempotent. In this caseS.f()
is impure (changes state) but is referentially transparent.
To make this more concrete suppose we have
class Foo {
private var _x = 0
def this(x: Int) { this(); _x = x }
def x = _x
def setXToFive(): Unit = {
_x = 5
}
}
Here if foo : Foo
, foo.setXToFive()
is referentially transparent.
6
No. An impure function cannot be referentially transparent by definition*. The function you gave is not pure. Writing a function to return Either
does not automatically make it pure. It still does not always return the same value for the same input. If the function you gave accepted a person
parameter and a database
/lookup
parameter, then the function would be pure and referentially transparent, because if given the same referentially transparent database
/ lookup
, it would always return the same value.
In regards to fatal errors (Out of Memory, Stack Overflow, etc.), I guess, yes? You can be pedantic and say technically, no function or program is pure because the machines running could break down. Somebody could just pull the power cord. But then no function is pure
/ referentially transparent
and those terms lose all meaning…
*An expression e is referentially transparent if for every program p, every occurrence of e in p can be replaced by the result of evaluating e without affecting the meaning of p.