This repository was archived by the owner on Mar 8, 2020. It is now read-only.

Description
I have already seen several times that it is often required to quickly compare two sets of UAST nodes.
The naive approach is to compare each to each which is O(n^2). If we had a standard way to hash UAST nodes, we would solve this problem in O(n).
There are two ways of hashing nodes: a single tiny change rewrites the hash (1) and locality sensitive hashing (2). (1) is easy to implement by hashing the serialization of a node. (2) requires some research, and I've got a sufficiently working implementation in https://github.com/src-d/treediff/blob/master/treediff.py#L33 which recursively uses hashes of the sorted children.