Skip to content
This repository was archived by the owner on Mar 8, 2020. It is now read-only.
This repository was archived by the owner on Mar 8, 2020. It is now read-only.

Feature request: node hashes #78

@vmarkovtsev

Description

@vmarkovtsev

I have already seen several times that it is often required to quickly compare two sets of UAST nodes.
The naive approach is to compare each to each which is O(n^2). If we had a standard way to hash UAST nodes, we would solve this problem in O(n).

There are two ways of hashing nodes: a single tiny change rewrites the hash (1) and locality sensitive hashing (2). (1) is easy to implement by hashing the serialization of a node. (2) requires some research, and I've got a sufficiently working implementation in https://github.com/src-d/treediff/blob/master/treediff.py#L33 which recursively uses hashes of the sorted children.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions