Every programmer I know (including my past self!) goes through the following brain melting moments when learning Javascript.
First, we learn to set up some variables pointing to some values:
let form = "gato"
let gloss = "cat"
Then we learn that we can retrieve the value that variable labels, just by referring to it (Let's pretend Javascript's response is spoken by a robot…):
form
![]()
"gato"
…you get the string `"gato"` back. This is not amazing.
Then we learn about the `==` equality operator, which results in the further not-amazingness:
"gato" == "cat"
![]()
false
"gato" == "gato"
![]()
true
Incredible!
And we can do other incantations:
form == gloss // effectively the same as "gato" == "cat"
![]()
false
Or
gloss == "cat"
![]()
true
Okeydokey, we get the idea. Now we learn about the wondrous world of objects, where we can bundle up more than one key-value pair to represent something complex. So for instance, instead of futzing about with these two values to represent the notion of a "word", we can create a single object that represents our word:
let word = {"form": "gato", "gloss": "cat" }
This seems useful. Maybe let's make another one:
let word2 = {"form": "perro", "gloss": "dog" }
Well, now we should be able to use `==` on these two words and prove that they are not equal. Right?
word == word2
![]()
"false"
Well, sure looks that way!
Narrator: It is not that way.
Just to make sure, let's verify that a cat is a cat…
word == word
Ha! See? It works!
Narrator: It doesn't work.
It just has to work! I want it to work! Look, even if I don't use variables, it will work!
({"form": "gato", "gloss": "cat" } == {"form": "gato", "gloss": "cat" })
Note: the parens are necessary in the last line because otherwise the first `{` will be interpreted as beginning a code block and not an object. Don't worry too much about it at the moment, it's mostly irrelevant to the current discussion!
I mean, obviously those are the same two things! They are identical!
![]()
"false"
wat
A weird thing
This is indisputably a weird thing. Let's review:
"perro" == "gato"
// false, different strings.
let dogVariable = "perro"
let catVariable = "gato"
dogVariable == catVariable
// false, variables labeling different strings.
let dog = { "form": "perro", "gloss": "dog" }
let cat = { "form": "gato", "gloss": "cat" }
dog == cat
// false, of course not
dog == dog
// true, seems unsurprising…
dog == cat
// false, again unsurprising…
// so what if we just "skip the variables":
({ "form": "perro", "gloss": "dog" } == { "form": "perro", "gloss": "dog" })
// false! WAT?
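One way to make sense of the results above: for objects, `==` asks whether both sides are the very same object, not whether they contain the same things. Here is a small sketch of that behavior (the variable names are made up for illustration):

```javascript
// Two separate objects that happen to have the same contents
let dogA = { "form": "perro", "gloss": "dog" }
let dogB = { "form": "perro", "gloss": "dog" }

// A second label for the *same* object as dogA
let sameDog = dogA

console.log(dogA == dogB)    // false: two distinct objects
console.log(dogA == sameDog) // true: one object, two labels

// Changing the object through one label is visible through the other
sameDog.gloss = "doggo"
console.log(dogA.gloss)      // "doggo"
```

So `dog == dog` is true because both sides label one object, while two freshly typed object literals are always two objects.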
When an object is not an object (in Javascript)
It's worth noting that this is a Javascript-specific thing.
Python is very clever about comparing objects:
>>> { "form": "perro", "gloss": "dog" } == { "form": "perro", "gloss": "dog" }
True
Python is even clever enough to know that the order of the key/value pairs in an object shouldn't affect object equality, because objects (well, "dicts" in Python…) are compared without regard to key order. So this counts as true as well:
>>> { "gloss": "dog", "form": "perro" } == { "form": "perro", "gloss": "dog"}
True
On the other hand, making this kind of object equality the default has an efficiency cost. Also, what if you want to modify the definition of equality? For instance, suppose you have two "word" objects with the same form and gloss, but one of them happens to have a property `domain` with a semantic domain in it. You might very well want to ignore that `domain` property for the purposes of determining equality.
That is to say, are the two words in this array "equal"?
[
{ "gloss": "dog", "form": "perro" } ,
{ "gloss": "dog", "form": "perro", "domain": "animals" }
]
Yes? Kind of? Maybe? No? It depends?
The only way to tell your programming language how to handle such things is to write some code that implements your definition of equality, whether you're using Python, Javascript, or any other language.
In Python you override the `__eq__` … er, whatever `__those_python_thingies__` are called. (Ask @meaganvigus or @tillyb or @xrotwng or @sunny, your friendly post author is a Javascript guy!)
Anyway, the point is, you have to write some code to get your program to grok what you think of as being a word.
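For instance, a definition of equality that ignores `domain` could be sketched like this in Javascript. The function name `sameFormAndGloss` is made up for this example, and it assumes both words have `form` and `gloss` properties:

```javascript
// Hypothetical helper: two words count as "equal" when form and gloss match,
// no matter what other properties they carry.
let sameFormAndGloss = (a, b) => a.form == b.form && a.gloss == b.gloss

let plain = { "gloss": "dog", "form": "perro" }
let tagged = { "gloss": "dog", "form": "perro", "domain": "animals" }

console.log(sameFormAndGloss(plain, tagged)) // true: domain is ignored
console.log(plain == tagged)                 // false: different objects
```

This is only one possible answer to "Yes? Kind of? Maybe?"; a different project might reasonably decide that `domain` matters.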
The same thing ends up being the case in Javascript, it's just that you always have to write a function when you want to compare two objects. Conveniently, consider the next heading.
How to write a function to compare two objects in Javascript
Basically the strategy goes something like this:
- To compare two objects `a` and `b`:
  - For every property in `a` (let's call it `key`) you care about:
    - If `b` doesn't have `key`, they are not the same object.
    - If `b` does have `key` but `a`'s value for `key` and `b`'s value for `key` are different, they are not the same object.
  - Otherwise, they are the same object.
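The steps above can be spelled out as a plain loop, one step per line. This is only a sketch, and the name `equalsLoop` is hypothetical:

```javascript
// Hypothetical loop version of the strategy
let equalsLoop = (a, b) => {
  for (let key of Object.keys(a)) {
    // b doesn't have key: not the same
    if (!(key in b)) { return false }
    // b has key, but the values differ: not the same
    if (a[key] != b[key]) { return false }
  }
  // nothing disagreed, so they count as the same
  return true
}

console.log(equalsLoop({ form: "gato", gloss: "cat" }, { form: "gato", gloss: "cat" }))  // true
console.log(equalsLoop({ form: "gato", gloss: "cat" }, { form: "perro", gloss: "dog" })) // false
```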
Implementation of equals(a,b)
Here's a Javascript implementation of that:
let equals = (a,b) => {
  return Object.keys(a)
    .every(key => key in b && b[key] == a[key])
}
By way of explanation:
Object.keys(object)
The `Object.keys` method will return an array of all the keys in an object:
let word = {"form": "gato", "gloss": "cat" }
Object.keys(word)
![]()
["form", "gloss"]
The .every() Array method
The `every()` method tests whether all elements in the array pass the test implemented by the provided function. It returns a Boolean [`true` or `false`] value.
The `.every` Array method will go through every item in an array and check to see if the function it is passed returns `true` for every item in the array. Here are a couple examples:
let words = [ "perro", "gato", "rato" ]
words.every(word => word.includes("a")) // false
words.every(word => word.endsWith("o")) // true
let numbers = [1,2,3,4,5]
numbers.every(number => number < 10) // true
numbers.every(number => number < 3) // false
So in our `equals(a,b)` function, there is an "anonymous" function that takes each key of `a` in turn and uses that key to check our definition of equality.
So if we run:
equals(
{ form: "gato", gloss: "cat" }, // this is a
{ form: "gato", gloss: "cat" } // this is b
)
Then this code is run for each key:
key in b && b[key] == a[key]
The `in` operator is used to ask if the current key of `a` is also a key of `b`. So in our example we ask first if `b` has the key `form`.
"form" in b
Recall that `b` is `{ form: "gato", gloss: "cat" }`, so yes, `"form"` is in there:
![]()
true
Now, that `&&` business (the "logical AND") means "evaluate what's next only if what we've seen so far is true."
So only now do we ask if `b`'s value for `"form"` is the same as `a`'s, and that works out to true as well:
b["form"] == a["form"] // works out to "gato" == "gato"
![]()
true
Next, this happens for `gloss`, and running `key in b && b[key] == a[key]` also works out to `true` (since `gloss` is in `{ form: "gato", gloss: "cat" }` and `"cat" == "cat"`).
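To watch those steps happen, here is a small sketch that records what each key contributes. The helper name `traceEquals` is hypothetical and not part of the code above:

```javascript
// Hypothetical helper: for each key of a, record the pieces of
// `key in b && b[key] == a[key]` separately.
let traceEquals = (a, b) =>
  Object.keys(a).map(key => ({
    key,
    inB: key in b,
    sameValue: b[key] == a[key],
    passes: key in b && b[key] == a[key]
  }))

let rows = traceEquals(
  { form: "gato", gloss: "cat" },
  { form: "gato", gloss: "cat" }
)

console.log(rows)
// every row has passes: true, so equals(a, b) would return true
console.log(rows.every(row => row.passes)) // true
```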
Donesies! We have implemented an equals function for words.
Except of course
That our data might not be so tidy. What about this?
equals(
  { "gloss": "dog", "form": "perro" },
  { "gloss": "dog", "form": "perro", "domain": "animals" }
)
![]()
true
Here's a table representation of the step-by-step execution of the call above. Note that the words only qualify as equal if all of the values in the `equals` column are true:
key | in a ? | in b ? | a[key] | b[key] | equals |
---|---|---|---|---|---|
"gloss" | true | true | "dog" | "dog" | true |
"form" | true | true | "perro" | "perro" | true |
Yes! Victory! These are the same word!
Except, er…
equals(
  { "gloss": "dog", "form": "perro", "domain": "animals" },
  { "gloss": "dog", "form": "perro" }
)
![]()
false
key | in a ? | in b ? | a[key] | b[key] | equals |
---|---|---|---|---|---|
"gloss" | true | true | "dog" | "dog" | true |
"form" | true | true | "perro" | "perro" | true |
"domain" | true | false | "animals" | - | false |
Now we are looking at the `domain` key since it's in `a`. And hence we get a fail. But this is weird, because we're comparing the same two words as before.
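The lopsidedness is easy to reproduce. Here is a sketch that restates the first version of `equals` so the snippet runs on its own (the names `plain` and `tagged` are made up for illustration):

```javascript
// The first version of equals: only a's keys are checked
let equals = (a, b) => {
  return Object.keys(a)
    .every(key => key in b && b[key] == a[key])
}

let plain = { "gloss": "dog", "form": "perro" }
let tagged = { "gloss": "dog", "form": "perro", "domain": "animals" }

console.log(equals(plain, tagged)) // true: both of plain's keys match
console.log(equals(tagged, plain)) // false: plain has no "domain"
```

Swapping the arguments changes the answer, which is not what most people expect from something called "equals".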
If we really want to define our equality as meaning that both words have not just the same values for the keys of `a`, but rather, the exact same values and the exact same keys, then we have to say so.
One simple way to do this is to make sure that both objects have the same number of keys, and only then to check that the value of every key is equal in both objects:
let equals = (a, b) => {
return Object.keys(a).length == Object.keys(b).length &&
Object.keys(a)
.every(key => key in b && b[key] == a[key])
}
equals(
{"form": "gato", "gloss": "cat", "domain": "animals"},
{"form": "gato", "gloss": "cat"}
)
key | in a ? | in b ? | a[key] | b[key] | a keys length | b keys length | equals |
---|---|---|---|---|---|---|---|
 | | | | | 3 | 2 | false |
"form" | true | true | "gato" | "gato" | | | true |
"gloss" | true | true | "cat" | "cat" | | | true |
"domain" | true | false | "animals" | - | | | false |
Note that the visualization table above is quite distinct, since we only compare lengths the first time around. In fact, the rows after the first are irrelevant, since we've already failed to have `true`s in the last column. (In fact, Javascript doesn't bother to run them.)
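The "doesn't bother to run them" part can be checked with a counter. This is a sketch; the `checks` counter, `keysToTry`, and `visited` are illustrative names added here:

```javascript
// Count how many times the per-key test actually runs
let checks = 0
let a = {"form": "gato", "gloss": "cat", "domain": "animals"}
let b = {"form": "gato", "gloss": "cat"}

let result = Object.keys(a).length == Object.keys(b).length &&
  Object.keys(a).every(key => {
    checks = checks + 1
    return key in b && b[key] == a[key]
  })

console.log(result) // false: 3 keys versus 2
console.log(checks) // 0: the key-by-key test never ran

// every() itself also stops at the first failure
let visited = []
let keysToTry = ["form", "domain", "gloss"]
let allFound = keysToTry.every(key => {
  visited.push(key)
  return key in b
})

console.log(allFound) // false
console.log(visited)  // ["form", "domain"]: "gloss" was never looked at
```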
Note that this will also detect the case where the number of keys is identical, but one of the keys differs between `a` and `b`:
equals(
{"form": "gato", "gloss": "cat", "domain": "animals"},
{"form": "gato", "gloss": "cat", "wordClass": "noun"}
)
key | in a ? | in b ? | a[key] | b[key] | a keys length | b keys length | equals |
---|---|---|---|---|---|---|---|
 | | | | | 3 | 3 | true |
"form" | true | true | "gato" | "gato" | | | true |
"gloss" | true | true | "cat" | "cat" | | | true |
"domain" | true | false | "animals" | - | | | false |
"wordClass" | false | true | - | "noun" | | | false |
![]()
false
(Again, Javascript doesn't bother to run the last line.)
So, the last thing we want to enable is the case where we want to limit comparison to some of the fields. We can do this by adding an extra parameter to our equals function implementation:
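let equals = (a, b, keys=null) => {
  // No keys given: fall back to the strict comparison from before
  if(keys === null){
    if(Object.keys(a).length != Object.keys(b).length){ return false }
    keys = Object.keys(a)
  }
  return keys.every(key =>
    key in b &&
    key in a &&
    b[key] == a[key])
}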
Now, we can limit our comparison to `"form"` and `"gloss"` if we desire:
equals(
{"form": "gato", "gloss": "cat", "domain": "animals"},
{"form": "gato", "gloss": "cat", "wordClass": "noun"},
["form", "gloss"]
)
![]()
true
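A few more calls show how the list of keys steers the result. This sketch uses a stripped-down variant that always requires the list; the name `equalsOn` is made up here so that it stands apart from `equals`:

```javascript
// Hypothetical variant: same per-key test, but the list of keys is required
let equalsOn = (a, b, keys) =>
  keys.every(key => key in a && key in b && a[key] == b[key])

let a = {"form": "gato", "gloss": "cat", "domain": "animals"}
let b = {"form": "gato", "gloss": "cat", "wordClass": "noun"}

console.log(equalsOn(a, b, ["form", "gloss"]))  // true
console.log(equalsOn(a, b, ["form", "domain"])) // false: b has no "domain"
console.log(equalsOn(a, b, ["wordClass"]))      // false: a has no "wordClass"
console.log(equalsOn(a, b, []))                 // true: nothing to disagree about
```

The last call is worth noticing: `every()` on an empty array is `true`, so an empty list of keys makes any two objects "equal".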
The match.js module in docling.js
I have been making use of this sort of stuff a lot in `docling.js`. In fact we want to enable defining all these kinds of equality, and there's more to it than what we've gone over here. If you're interested, you can take a look at the JS module here:
https://docling.land/modules/match.js
And if you're familiar with testing then you can see some of the growing test library for the module here:
https://docling.land/modules/match.test.js
The implementation is as simple as I have been able to make it (but it still needs work):
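// Note: identifyType() is not defined in this excerpt; presumably it is a helper from elsewhere in docling.js.
export let match = (queries, comparand, fields=[]) => {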
if(!Array.isArray(queries)){
let queryObject = queries
queries = Object.entries(queryObject)
}
if(fields.length){
queries = queries
.filter(([key,value]) => fields.includes(key))
}
let comparandHasAllKeys = queries
.every(([key,value]) => comparand[key])
if(!comparandHasAllKeys){ return false }
let allValuesMatch = queries
.every(([key,value]) => {
if(typeof value == 'string' && value.trim().length == 0){
return false
} else if(typeof value == 'string'){
return comparand[key].includes(value)
} else if(identifyType(value) == 'number'){
return match(value, comparand[key])
} else if(identifyType(value) == 'object'){ // wtf
return match(value, comparand[key])
} else if(value instanceof RegExp){
return value.test(comparand[key])
}
})
return allValuesMatch
}
PS. Why should you care about any of this?
Fair question. The answer, primarily, is search. If you have a lexicon with 3000 words in it, you want to be able to search that lexicon for words that match criteria. And you want to be able to do that in flexible ways.
Because `match()` returns `true` or `false`, you can use it inside a `.filter()` method to filter an array. Like this:
let lexicon = { "metadata": {"title": "A tiny lexicon…"},
"words": [
{"form": "gato", "gloss": "kato"},
{"form": "perro", "gloss": "hundo"},
{"form": "pájaro", "gloss": "birdo"}
]
}
lexicon.words.filter(word => match({"form": "perro"}, word))
This sort of thing is the beginning of many kinds of search patterns.
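To try the filtering idea without the rest of the module, here is a self-contained sketch. `simpleMatch` is a simplified stand-in written for this example, not the real docling.js code; it only handles string values and treats the query string as a substring:

```javascript
// Simplified stand-in for match(): every string in the query must occur
// inside the corresponding field of the word. (Not the real docling.js code.)
let simpleMatch = (query, word) =>
  Object.entries(query).every(([key, value]) =>
    typeof word[key] == "string" && word[key].includes(value))

let lexicon = {
  "metadata": {"title": "A tiny lexicon…"},
  "words": [
    {"form": "gato", "gloss": "kato"},
    {"form": "perro", "gloss": "hundo"},
    {"form": "pájaro", "gloss": "birdo"}
  ]
}

// Only one word has a form containing "perro"
let perros = lexicon.words.filter(word => simpleMatch({"form": "perro"}, word))
console.log(perros.map(word => word.gloss)) // [ 'hundo' ]

// Substring search: forms containing the letter "a"
let withA = lexicon.words.filter(word => simpleMatch({"form": "a"}, word))
console.log(withA.map(word => word.form)) // [ 'gato', 'pájaro' ]
```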