2.3 equality
2.3.1 Types of Equality
Pyret has three notions of equality. Two values can be equal now, always equal, and/or identical. The following table summarizes the functions and operators that test for these relationships, and how they compare to some other languages’ operators:
Name | Operator | Partial Predicate | Total Predicate | Similar To |
Equal Now | =~ | equal-now | equal-now3 | equal? (Racket) == (Python, Ruby) |
Always Equal | == | equal-always | equal-always3 | = (Ocaml) |
Identical | <=> | identical | identical3 | eq? (Scheme) == (Ocaml) === (JavaScript) is (Python) == (Java) |
In most programs, you should use always equal, or ==, to compare values that you want to check for same-ness. If you are working with mutable data, you may want to consider the special behavior of equal now. For some optimizations, defensive code, and capability patterns, you may have a reason to use identical.
2.3.2 Equal Now
Checks if the two values are equal now (they may not be later). Corresponds to the =~ operator.
2.3.2.1 Equal Now and Primitives
equal-now checks primitive equality on numbers, strings, and booleans:
check: 5 is%(equal-now) 5 5 is-not%(equal-now) 6 "abc" is%(equal-now) "abc" "a" is-not%(equal-now) "b" "a" is-not%(equal-now) 5 end
2.3.2.2 Equal Now and Structured Data
For instances of data (including, for example, instances of List), and objects, equal-now traverses their members and checks for pairwise equality. So, for example, lists will recursively check that their contents are the same, including the case where their contents are objects:
check: l1 = [list: 1, 2, 3] l2 = [list: 1, 2, 3] l1 is%(equal-now) l2 link(1, l1) is-not%(equal-now) l2 l3 = [list: {x: 5}] l4 = [list: {x: 5}] l5 = [list: {x: 6}] l3 is%(equal-now) l4 l3 is-not%(equal-now) l5 end
2.3.2.3 Equal Now and References
Equal Now checks the contents of mutable data it reaches. This gives it its name: since it only checks the current values, and those fields might change, it is not true that if e1 =~ e2, then later e1 =~ e2 will hold again. For example:
data MyBox: | my-box(ref x) end check: b1 = my-box(1) b2 = my-box(1) b1 is%(equal-now) b2 b1!{x : 2} b1 is-not%(equal-now) b2 end
Equal Now will recognize when references form a cycle, and cycles of the same shape are recognized as equal (even though the references might change their contents later):
data InfiniteList: | i-link(first, ref rest) | i-empty end check: l1 = i-link(1, i-empty) l2 = i-link(1, i-empty) l3 = i-link(1, i-link(2, i-empty)) l1!{rest : l1} l2!{rest : l2} l3!rest!{rest : l3} l1 is%(equal-now) l2 l1 is-not%(equal-now) l3 end
2.3.3 Identical
2.3.3.1 Identical and Primitives
Identical has the same behavior on primitives as Equal Now (Equal Now and Primitives).
2.3.3.2 Identical and Structural Equality
Identical does not visit members of objects or data instances. Instead, it checks if the values are actually the same exact value (the operator is meant to indicate that the values are interchangable). So objects with the same fields are not identical to anything but themselves:
check: o = { x: 5 } o2 = { x: 5 } o is-not%(identical) o2 o is%(identical) o o2 is%(identical) o2 end
2.3.3.3 Identical and Mutable Data
Identical does not inspect the contents of mutable data, either. It can be used to tell if two references are aliases for the same underlying state, or if they are in fact different (even though they may be equal right now).
data InfiniteList: | i-link(first, ref rest) | i-empty end check: l1 = i-link(1, i-empty) l2 = i-link(1, i-empty) l1!{rest : l1} l2!{rest : l2} l1 is%(identical) l1 l1!rest is%(identical) l1 l1 is-not%(identical) l2 l1!rest is-not%(identical) l2 l2 is%(identical) l2 l2!rest is%(identical) l2 l2 is-not%(identical) l1 l2!rest is-not%(identical) l1 end
2.3.4 Always Equal
Checks if the two values will always be equal, and corresponds to the == operator.
equal-always checks for primitive and structural equality like equal-now, with the exception that it stops at mutable data and only checks that the mutable values are identical. Stopping at mutable boundaries ensures that if two values were equal-always at any point, they will still be equal-always later.
2.3.4.1 Always Equal and Mutable Data
Here are some examples of equal-always stopping at mutable data, but checking immutable data, contrasted with equal-now
data MyBox: | my-box(ref x) end check: b1 = my-box(1) b2 = my-box(1) b1 is-not%(equal-always) b2 b1 is%(equal-now) b2 b2!{x : 2} b1 is-not%(equal-always) b2 b1 is-not%(equal-now) b2 b3 = my-box(2) # remember that b2 currently contains 2 l1 = [list: b1, b2] l2 = [list: b1, b2] l3 = [list: b1, b3] l1 is%(equal-now) l2 l1 is%(equal-always) l2 l1 is-not%(identical) l2 l1 is%(equal-now) l3 l1 is-not%(equal-always) l3 l1 is-not%(identical) l3 b2!{x: 5} l1 is%(equal-now) l2 l1 is%(equal-always) l2 l1 is-not%(identical) l2 l1 is-not%(equal-now) l3 l1 is-not%(equal-always) l3 l1 is-not%(identical) l3 end
2.3.5 Properties of Equality Functions
The discussion above hints at a relationship between the three functions. In particular, if two values are Identical, they ought to be Always Equal, and if they are Always Equal, they ought to be Equal Now. The following table summarizes this relationship, which in fact does hold:
If ↓, then → | v1 <=> v2 could be... | v1 == v2 could be... | v1 =~ v2 could be... |
v1 <=> v2 is true | - | true only | true only |
v1 == v2 is true | true or false | - | true only |
v1 =~ v2 is true | true or false | true or false | - |
This table doesn’t have all the false cases in it, because we need to complete the story for a few values that haven’t been discussed before we can give the whole picture.
2.3.6 Bounded Equalities
When comparing numbers, it’s often useful to be able to compare within a range. For example, if we write an algorithm that computes an answer to within a given tolerance, we may want to check if the answer is within that tolerance.
check: sqrt-5 = num-sqrt(5) (sqrt-5 < 2.23) is true (sqrt-5 > 2.22) is true end
Pyret has a family of built-in functions for cases like this, and the default is within:
It takes an argument representing the relative error, and returns a function that can be used to check equality up to that relative error. For example, we can check if an answer is within 10% of a desired result:
check: within-10-percent = within(0.1) within-10-percent(9.5, 10.5) is true end
Relative difference is defined by multiplying the mean of the two numbers by tol, and checking that the result is less than the difference between them. That is, in the expression above, within checks:
(((9.5 + 10.5) / 2) * 0.1) < (10.5 - 9.5)
Converting to exact numbers first avoids overflows on computing the mean. Put yet another way, aside from some slight differences in bounds checking for errors, we could implement the numeric comparison of within as:
fun my-within(tol): lam(left, right): (((num-exact(left) + num-exact(right)) / 2) * num-exact(tol)) < num-abs(num-exact(left) - num-exact(right)) end end
The tol argument must be between 0 and 1.
It’s common to use within along with is% to define the binary predicate inline with the test:
check: num-sqrt(10) is%(within(0.1)) 3.2 num-sqrt(10) is-not%(within(0.1)) 5 end
As a convenient shorthand, is-roughly is defined as a shorthand for is%(within(0.000001)), that is, an error tolerance of one-millionth, or six digits of accuracy:
check: num-acos(-1) is-roughly ~3.14159 num-acos(-1) is%(within(0.000001)) ~3.14159 end
Finally, within accepts any two values, not just numbers. On non-numeric arguments, within traverses the structures just as in equal-always, but deferring to the bounds-checking equality when a pair of numbers is encountered. All other values are compared with equal-always.
check: l7 = [list: 1] l8 = [list: ~1.2] l7 is%(within-rel(0.5)) l8 l7 is-not%(within-rel(0.1)) l8 l7 is%(within-rel(~0.5)) l8 l7 is-not%(within-rel(~0.1)) l8 end
An alias for within.
Like within-rel, but compares with absolute tolerance rather than relative. The definition is equivalent to:
fun my-within-abs(tol): lam(left, right): num-abs(num-exact(left) - num-exact(right)) <= tol end end
check: la = [list: 10] lb = [list: ~12] la is%(within-abs(2)) lb la is-not%(within-abs(1)) lb la is%(within-abs(~5.5)) lb la is-not%(within-abs(~1.9999)) lb end
Like within-rel and within-abs, but they traverse mutable structures as in equal-now.
check: aa = [array: 10] ab = [array: ~12] aa is%(within-rel-now(~0.2)) ab aa is-not%(within-rel(~0.2)) ab aa is%(within-abs-now(2)) ab aa is-not%(within-abs(2)) ab end
2.3.7 Undefined Equalities
For some values, Pyret refuses to report true or false for any equality predicate, and raises an error instead. For example:
check: (~3 == ~3) raises "equality-failure" (1 == ~1) raises "equality-failure" (lam(x): x end == lam(y): y end) raises "equality-failure" end
This section discusses why this is the case.
2.3.7.1 Roughnums and Equality
How to Design Programs describes this design space well. Numbers’ representations in programs reflect a number of tradeoffs, but the upshot is that numbers have finite, approximate representations for performance reasons. Numbers like e, π, and √2 are only represented up to some approximation of their true (irrational) value. When such a result is used in a computation, it represents a rough approximation of the true value.
Pyret calls these numbers Roughnums, and they have special rules related to equality. In particular, they cannot be directly compared for equality, even if it seems like they ought to be equal:
check: (~3 == ~3) raises "equality-failure" end
In addition, Roughnums cannot be compared for equality with Exactnums, either.
check: (~0.1 == 0.1) raises "equality-failure" end
This example is not Pyret-specific, but matches the behavior of IEEE floating point. Returning either true or false in this case would be misleading, as because of unavoidable inaccuracies, both of the following expressions evaluate to ~0.1:
(~1 - ~0.9) + 0.00000000000000003 ~0.2 - ~0.1
So in the following check block, if we chose either true or false for the result of ~0.1 == 0.1, one of the tests would have a misleading failure:
check: ((~1 - ~0.9) + 0.00000000000000003) is 0.1 (~0.2 - ~0.1) is 0.1 end
For example, if Pyret answered true for the rough equivalent, ~0.1 == ~0.1, then this test would pass:
check: ((~1 - ~0.9) + 0.00000000000000003) is (~0.2 - ~0.1) end
To avoid giving misleading answers in cases like these, Pyret triggers an error on any number-to-number comparison that involves a Roughnum, which looks like:
These two values cannot be compared for direct equality: |
|
~0.1 |
~0.1 |
|
Approximations of numbers (Roughnums) cannot be compared for equality. The |
program may need to use within(). |
If you see this error, it’s a hint that the program should be using an equality from the within family of functions to do a relative comparison, rather than a direct equality comparison. So in this case, we could check that the answer is equal up to an relative error of 0.001:
check: ((~1 - ~0.9) + 0.00000000000000003) is%(within(0.001)) (~0.2 - ~0.1) end
It can be useful to check that two Roughnums are actually indistinguishable, even though they may be approximating different values. This can be expressed by checking that the numbers are within a tolerance of ~0:
check: ((~1 - ~0.9) + 0.00000000000000003) is%(within(~0)) (~0.2 - ~0.1) end
Note that the same won’t work for a tolerance of 0, the exact zero, which will give an error if used to compare two Roughnums.
2.3.7.2 Functions and Equality
When comparing two functions or two methods, all the equality operators raise an exception. Why? Well, the traditional way to compare functions for equality (short of solving the halting problem), is to use reference equality (or identical) on the functions’ representations, the same way as mutable data works. For a hint of why this can be a misleading definition of equality, consider this data definition:
data Stream<a>: | stream(first :: a, rest :: (-> Stream<a>)) end check: fun mk-ones(): stream(1, mk-ones) end ones = mk-ones() ones is ones # Should this succeed? ones is mk-ones() # What about this? ones.rest() is mk-ones() # Or this...? end
All of these values (ones, mk-ones(), etc.) have the same behavior, so we could argue that is (which uses == behind the scenes) ought to succeed on these. And indeed, if we used reference equality, it would succeed. But consider this small tweak to the program:
check: fun mk-ones(): stream(1, lam(): mk-ones() end) # <-- changed this line end ones = mk-ones() ones is ones # Should this succeed? ones is mk-ones() # What about this? ones.rest() is mk-ones() # Or this...? end
If we used reference equality on these functions, all of these tests would now fail, and ones has the exact same behavior. Here’s the situation:
In fact, a famous result in theoretical computer science is that it is impossible to figure out out if two functions do the same thing in general, even if it is possible in certain special cases (like reference equality).
When reference equality returns true, we know that the two functions must have the same behavior. But when it returns false, we know nothing. The functions may behave exactly the same, or they might be completely different, and the equality predicate can’t tell us either way.
Pyret takes the following stance: You probably should rethink your program if it relies on comparing functions for equality, since Pyret cannot give reliable answers (no language can). So, all the examples above (with one notable exception) actually raise errors:
check: fun mk-ones(): stream(1, lam(): mk-ones() end) # <-- changed this line end ones = mk-ones() ones == ones is true ones == mk-ones() raises "Attempted to compare functions" ones.rest() == mk-ones() raises "Attempted to compare functions" end
The first test is true because two identical values are considered equal-always. This is an interesting point in this design space that Pyret may explore more in the future – it isn’t clear if the benefits of this relationship between identical and equal-always are worth the slight brittleness in the above example.
Note 1: Functions can be compared with non-function values and return false. That is, the equality operators only throw the error if actual function values need to be compared to one another, not if a function value is compared to another type of value:
check: f = lam(): "no-op" end g = lam(): "also no-op" end f == f raises "Attempted to compare functions" f == g raises "Attempted to compare functions" g == f raises "Attempted to compare functions" 5 is-not%(equal-always) f { x: 5 } is-not%(equal-always) { x: f } end
Note 2: This rule about functions interacts with structural equality. When comparing two values, it seems at first unclear whether the result should be false or an error for this test:
check: { x: 5, f: lam(): "no-op" end } is%(equal-always) { x: 6, f: lam(): "no-op" end } end
This comparison will return false. The rule is that if the equality algorithm can find values that differ without comparing functions, it will report the difference and return false. However, if all of the non-function comparisons are true, and some functions were compared, then an error is raised. A few more examples:
check: o = { x: 5, y: { z: 6 }, lam(): "no-op" end } o2 = { x: 5, y: { z: 7 }, lam(): "no-op" end } (o == o) raises "Attempted to compare functions" o is-not%(equal-always) o2 # Test succeeds, because z fields differ end
2.3.8 Total Equality Functions (Avoiding Incomparability Errors)
Most Pyret programs should be written using equal-always, equal-now, and identical, which guarantee that an error will be raised if functions are compared. Some programs, however, need to be able to compare arbitrary values, and it’s convenient to have the ability to compare values without raising an exception. Since the equality of functions is unknown, we define the result of a total equality check with a new datatype:
- NotEqual :: (
- reason :: String,
- value1 :: Any,
- value2 :: Any
- )
- -> EqualityResult
- Unknown :: (
- reason :: String,
- value1 :: Any,
- value2 :: Any
- )
- -> EqualityResult
We define three parallel functions to the equality predicates that return EqualityResult values. They return Equal and NotEqual whenever the corresponding function would, and Unknown whenever the corresponding function would throw an error:
check: f = lam(): 5 end equal-always3(f, f) is Unknown equal-always3(f, 5) satisfies is-NotEqual equal-now3(f, f) is Unknown equal-now3("a", f) satisfies is-NotEqual identical3(f, f) is Unknown identical3("a", f) satisfies is-NotEqual end
We can now modify our table from above to be more complete:
If ↓, then → | identical(v1, v2) could be... | equal-always(v1, v2) could be... | equal-now(v1, v2) could be... |
identical(v1, v2) is Equal | - | Equal only | Equal only |
equal-always(v1, v2) is Equal | Equal or NotEqual | - | Equal only |
equal-now(v1, v2) is Equal | Equal or NotEqual | Equal or NotEqual | - |
identical(v1, v2) is NotEqual | - | Equal or NotEqual or Unknown | Equal or NotEqual or Unknown |
equal-always(v1, v2) is NotEqual | NotEqual only | - | Equal or NotEqual or Unknown |
equal-now(v1, v2) is NotEqual | NotEqual only | NotEqual only | - |
identical(v1, v2) is Unknown | - | Unknown only | Unknown only |
equal-always(v1, v2) is Unknown | Unknown or NotEqual | - | Unknown only |
equal-now(v1, v2) is Unknown | Unknown or NotEqual | Unknown or NotEqual | - |
There are corresponding total functions defined for within as well:
2.3.9 Datatype-defined Equality
The functions equal-now and equal-always are defined to work over values created with data by comparing fields in the same position. However, sometimes user-defined values need a more sophisticated notion of equality than this simple definition provides.
For consider implementing an unordered set of values in Pyret. We might choose to implement it as a function that creates an object closing over the implementation of the set itself:
fun make-empty-set<a>(): { add(self, element :: a): ... end, member(self, element :: a) -> Boolean: ... end, equal-to-other-set(self, other) -> Boolean: ... end } end
We could fill in the bodies of the methods to have this implementation let clients create sets and add elements to them, but it won’t work well with testing:
check: s = make-empty-set().add(5) s2 = make-empty-set().add(5) s.member(5) is true s2.member(5) is true s.equal-to-other-set(s2) is true s == s2 raises "Attempted to compare functions" end
The final test raises an exception because it traverses the structure of the object, and the only visible values are the three methods, which cannot be compared. We might just say that users of custom datatypes have to use custom predicates for testing, for example they could write:
check: # as before ... fun equal-sets(set1, set2): set1.equal-to-other-set(set2) end s is%(equal-sets) s2 end
This works for sets on their own, but the built-in testing and equality operators will not work with nested user-defined data structures. For example, since lists are a dataype that checks built-in equality on their members, a list of sets as defined above will not use the equal-to-other-set method when comparing elements, and give an "Attempted to compare functions" error:
check: # as before ... ([list: s] == [list: s2]) raises "Attempted to compare functions" end
To help make this use case more pleasant, Pyret picks a method name to call, if it is present, on user-defined objects when checking equality. The method name is _equals, and it has the following signature:
- ._equals :: (
- other :: a,
- equal-rec :: (Any, Any -> EqualityResult)
- )
- -> EqualityResult
Where a is the type of the object itself (so for sets, other would be annotated with Set<a>).
The _equals method is called in the equality algorithm when:
The two values are either both data values or both objects, AND
If they are data values, the two values are of the same data type and variant, AND
If they are objects not created by data, they have the same set of Brands
So, for example, an object with an _equals method that always returns Equal is not considered equal to values that aren’t also objects:
import Equal from equality check: eq-all = { _equals(self, other, eq): Equal end } eq-all is-not== f eq-all is-not== m eq-all is-not== 0 eq-all is-not== "a" eq-all is== {} end
The last argument to _equals is the recursive equality callback to use for checking equality of any members. When checking for equality of members (say in our set implementation above), we would use this callback rather than one of equal-always3 or equal-now3. The reasons for this are threefold:
In order to check for equality of cyclic values, Pyret needs to do internal bookkeeping of visited references. This information is stored within the callback, and calling e.g. equal-now3 directly would not take previously visted references into account.
To avoid requiring datatypes to implement two equality methods, the callback also knows whether this equality call was started by equal-now or by equal-always. Any recursive calls should use the original semantics for comparing references, so using the callback ensures that equality checks on elements have the right semantics (even in deeply nested data structures).
The recursive equality predicate closes over and remembers the tolerance for within-family functions, and whether or not the tolerance is absolute or relative.
2.3.10 Inequalities
The inequality operators and functions in Pyret follow different rules than those for equality. In particular:
There are no 3-valued forms for the inequality functions, because...
All the inequalities (even non-strict inequality like <=) are defined on Roughnums.
Comparing approximate numbers with inequalities is technically a bit fraught. If x < y and x and y are both approximations, it may be that the approximation error causes the comparison to return true rather than false. The same argument holds for the other inequality operators. However, the inequality operators can be a part of correct use of approximations, for example by using a test like x < (y + tolerance), (where tolerance could be usefully specified as either positive or negative), in applications that closely track approximation error. Since in common cases inequality comparison of approximation is quite useful, and it is quite onerous to program with an analog of within for inequalities as well, Pyret chooses to allow the inequality operators to work on approximations.
The inequality operators all work on either:
Pairs of numbers (whether exact or approximate)
Pairs of strings
Left-hand-side objects with an appropriately-named method (for example, for the < operator, the object must have a _lessthan method.)
Numbers are compared in their standard mathematical order. Strings are compared lexicographically (examples below). For objects with overloaded methods, the method should return a Boolean and that return value is used as the result.
check "strings": "a" < "b" is true "b" > "a" is true "a" < "a" is false "a" > "a" is false "a" <= "a" is true "a" >= "a" is true "A" < "a" is true "a" > "A" is true "a" < "A" is false "A" > "a" is false "a" < "aa" is true "a" > "aa" is false "a" < "baa" is true "a" > "baa" is false "abb" < "b" is true "abb" > "b" is false "ab" < "aa" is false "ab" > "aa" is true "aa" < "ab" is true "aa" > "ab" is false end
check "numbers": ~5 < 5 is false ~5 > 5 is false ~5 <= 5 is true ~5 >= 5 is true ~5 < ~5 is false ~4.9 < ~5 is true ~5 <= ~5 is true ~5 >= ~5 is true end