3 changes: 3 additions & 0 deletions README.md
@@ -13,3 +13,6 @@ Initially, design considerations will be discussed and voted on. Eventually, the
# Workflow Runner
Once stood up, the following repo provides an endpoint to which you can post workflows; registered ARAs then execute the operations they have implemented.
https://github.com/NCATSTranslator/workflow-runner

# New Operations
Creating a new operation involves writing the operation definition in the `operations` directory, then rebuilding the docs with `docs/build/generate_docs.py` and the schema with `schema/build/generate_schema.py`.
1 change: 1 addition & 0 deletions docs/index.md
@@ -19,6 +19,7 @@
- [filter_results_top_n](./filter_results_top_n.md)
- [lookup](./lookup.md)
- [lookup_and_score](./lookup_and_score.md)
- [normalize_nodes](./normalize_nodes.md)
- [overlay](./overlay.md)
- [overlay_compute_jaccard](./overlay_compute_jaccard.md)
- [overlay_compute_ngd](./overlay_compute_ngd.md)
30 changes: 30 additions & 0 deletions docs/normalize_nodes.md
@@ -0,0 +1,30 @@
# normalize nodes

This operation replaces the identifiers on qgraph and kgraph nodes with their preferred identifiers, and records the equivalent identifiers in a property on each knode. When two kgraph nodes normalize to the same preferred identifier, the two knodes are merged. The merged node contains the union of the properties of the two original nodes, and all edges attached to either original node are reattached to the merged knode. Qnodes are not merged, so that the structure of the query is preserved. Because kgraph node identifiers change, the node bindings in the results must be updated as well.
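The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by any ARA: the `preferred` mapping stands in for whatever normalization service supplies preferred identifiers, the `equivalent_identifiers` property name is hypothetical, and "union of the properties" is simplified to "merge category lists, keep the first value seen for any other key". The function takes the inner `message` object, not the full request body.

```python
from copy import deepcopy


def normalize_nodes(message, preferred):
    """Return a copy of `message` with node identifiers normalized.

    `preferred` (an assumption of this sketch) maps an identifier to its
    preferred identifier; identifiers missing from it are left unchanged.
    """
    msg = deepcopy(message)

    def norm(curie):
        return preferred.get(curie, curie)

    # Qnodes: rewrite ids in place, but never merge qnodes.
    for qnode in msg["query_graph"]["nodes"].values():
        if qnode.get("ids"):
            qnode["ids"] = [norm(i) for i in qnode["ids"]]

    # Knodes: re-key by preferred id, merging nodes that collide.
    kgraph = msg["knowledge_graph"]
    merged = {}
    for old_id, node in kgraph["nodes"].items():
        # "equivalent_identifiers" is a hypothetical property name.
        target = merged.setdefault(norm(old_id), {"equivalent_identifiers": []})
        for key, value in node.items():
            if key == "categories":
                cats = target.setdefault("categories", [])
                cats.extend(c for c in value if c not in cats)
            else:
                # Simplification: the first value seen wins on conflict.
                target.setdefault(key, value)
        if old_id not in target["equivalent_identifiers"]:
            target["equivalent_identifiers"].append(old_id)
    kgraph["nodes"] = merged

    # Kedges: point subject and object at the (possibly merged) knodes.
    for edge in kgraph["edges"].values():
        edge["subject"] = norm(edge["subject"])
        edge["object"] = norm(edge["object"])

    # Results: node bindings must follow the renamed knodes.
    for result in msg.get("results") or []:
        for bindings in result["node_bindings"].values():
            for binding in bindings:
                binding["id"] = norm(binding["id"])
    return msg


message = {
    "query_graph": {
        "nodes": {"n1": {"ids": ["HGNC:11603"]}, "n2": {"ids": ["NCBIGene:9496"]}},
        "edges": {},
    },
    "knowledge_graph": {
        "nodes": {
            "HGNC:11603": {"name": "TBX4", "categories": ["biolink:Gene"]},
            "NCBIGene:9496": {
                "name": "T-box transcription factor 4",
                "categories": ["biolink:Gene"],
            },
        },
        "edges": {
            "e": {
                "subject": "HGNC:11603",
                "object": "NCBIGene:9496",
                "predicate": "biolink:related_to",
            }
        },
    },
    "results": [{"node_bindings": {"n1": [{"id": "HGNC:11603"}]}, "edge_bindings": {}}],
}

# Assumed mapping for illustration only.
preferred = {"HGNC:11603": "NCBIGene:9496"}
out = normalize_nodes(message, preferred)
```

In this toy input the two gene knodes collapse into one, while qnodes `n1` and `n2` both survive with the same preferred identifier, which is the asymmetry the description calls out.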

### examples

- [input](../examples/normalize_nodes/messages/01_prenormalized_message.json), [output](../examples/)
- [input](../examples/), [output](../examples/normalize_nodes/messages/02_postnormalized_message.json)

### input requirements

None

### output guarantees

None

### allowed changes

- modify qnodes
- modify knodes
- remove knodes
- modify kedges
- modify node bindings

### parameters

```yaml
{}
```
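Since the operation takes no parameters, requesting it only requires naming it. The sketch below builds such a request body in Python; it assumes the TRAPI-style convention of a top-level `workflow` list whose entries identify an operation by `id`, and the helper name `with_normalization` is hypothetical. Check the generated schema before relying on this shape.

```python
import json


def with_normalization(message):
    """Wrap a message in a request body that asks for normalize_nodes.

    Assumption: the request carries a "workflow" list of {"id": ...} entries.
    """
    return {"message": message, "workflow": [{"id": "normalize_nodes"}]}


request_body = with_normalization({"query_graph": {"nodes": {}, "edges": {}}})
print(json.dumps(request_body, indent=2))
```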
170 changes: 170 additions & 0 deletions examples/normalize_nodes/messages/01_prenormalized_message.json
@@ -0,0 +1,170 @@
{
"message": {
"query_graph": {
"nodes": {
"n1": {
"ids": ["HGNC:11603"],
"categories": [
"biolink:Gene"
]
},
"n2": {
"ids": ["NCBIGene:9496"],
"categories": [
"biolink:Gene"
]
},
"n3": {
"ids": ["MONDO:0005002"],
"categories": [
"biolink:Disease"
]
},
"n4": {
"ids": ["DOID:3083"],
"categories": [
"biolink:Disease"
]
},
"n5": {
"categories": [
"biolink:Disease"
]
}
},
"edges": {
"e1": {
"subject": "n1",
"object": "n3"
},
"e2": {
"subject": "n2",
"object": "n4",
"predicates": ["biolink:related_to"]
},
"e3": {
"subject": "n1",
"object": "n5"
}
}
},
"knowledge_graph": {
"nodes": {
"HGNC:11603": {
"name": "TBX4",
"categories": [
"biolink:Gene"
]
},
"NCBIGene:9496": {
"name": "T-box transcription factor 4",
"categories": [
"biolink:Gene"
]
},
"MONDO:0005002": {
"name": "chronic obstructive pulmonary disease",
"categories": [
"biolink:Disease"
]
},
"DOID:3083": {
"name": "chronic obstructive pulmonary disease",
"categories": [
"biolink:Disease"
]
},
"UMLS:CN202575": {
"name": "heritable pulmonary arterial hypertension",
"categories": [
"biolink:Disease"
]
}
},
"edges": {
"a8575c4e-61a6-428a-bf09-fcb3e8d1644d": {
"subject": "HGNC:11603",
"object": "MONDO:0005002",
"predicate": "biolink:related_to"
},
"2d38345a-e9bf-4943-accb-dccba351dd04": {
"subject": "NCBIGene:9496",
"object": "DOID:3083",
"predicate": "biolink:related_to"
},
"044a7916-fba9-4b4f-ae48-f0815b0b222d": {
"subject": "HGNC:11603",
"object": "UMLS:CN202575",
"predicate": "biolink:related_to"
}
}
},
"results": [
{
"node_bindings": {
"n1": [
{
"id": "HGNC:11603"
}
],
"n3": [
{
"id": "MONDO:0005002"
}
]
},
"edge_bindings": {
"e1": [
{
"id": "a8575c4e-61a6-428a-bf09-fcb3e8d1644d"
}
]
}
},
{
"node_bindings": {
"n2": [
{
"id": "NCBIGene:9496"
}
],
"n4": [
{
"id": "DOID:3083"
}
]
},
"edge_bindings": {
"e2": [
{
"id": "2d38345a-e9bf-4943-accb-dccba351dd04"
}
]
}
},
{
"node_bindings": {
"n1": [
{
"id": "HGNC:11603"
}
],
"n5": [
{
"id": "UMLS:CN202575"
}
]
},
"edge_bindings": {
"e3": [
{
"id": "044a7916-fba9-4b4f-ae48-f0815b0b222d"
}
]
}
}
]
},
"logs": null,
"status": null
}