-
Notifications
You must be signed in to change notification settings - Fork 17
Description
Summary
update_codelist_version_compatibility runs when a new release of a coding system is imported.
This updates the compatible_versions property of relevant CodelistVersions.
As part of its compatibility checking, it compares the Hierarchy of a CodelistVersion within the previous CodingSystemRelease to the incoming release using _check_version_by_hierarchy.
This function compares parts of the hierarchies in an order-sensitive manner when an order-insensitive approach is more appropriate, leading to false assertions of incompatibility of a CodelistVersion with a CodingSystemRelease.
There are now several codelist versions presented on OpenCodelists with inaccurate coding system release compatibility information, which misleads researchers as to their suitability for use.
What did you do and what did you expect to happen?
- Visited this codelist version page
- Expected to see it was compatible with the latest release of SNOMED CT (confirmed by cloning, creating new version, and comparing).
- It was not stated to be compatible with this release
Impact of the bug
Renders large parts of the analysis I have done on usage of out-of-date codelists unreliable.
Provides inaccurate information on appropriate codelist use to researchers.
Worked Example
Get the most recent SNOMED CT release, and the previous one
coding_system_id="snomedct"
new_coding_system = CODING_SYSTEMS[coding_system_id](
...: database_alias=new_database_alias
...: )
coding_system=new_coding_system
previous_release = (
...: CodingSystemRelease.objects.filter(
...: coding_system=coding_system.id,
...: valid_from__lt=coding_system.release.valid_from,
...: )
...: .exclude(id=coding_system.release.id)
...: .latest("valid_from")
...: )Ascertain the versions to have their compatibility checked as per update_codelist_version_compatibility, isolate above-linked version of interest.
versions = set(
...: CodelistVersion.objects.exclude(status=Status.DRAFT).filter(
...: Q(compatible_releases__id=previous_release.id)
...: | Q(coding_system_release=previous_release)
...: )
...: )
version = [v for v in versions if v.hash == "7e32e792"][0]Call the version search and hierarchy compatibility inner functions
_check_version_by_search(
...: coding_system, version
...: )
Out[40]: True
_check_version_by_hierarchy(coding_system, version)
Out[41]: FalseManually compare state of original and new hierarchies as generated within _check_version_by_hierarchy
In [47]: original_hierarchy_data
Out[47]:
{'root': '138875005',
'nodes': ['73211009',
'362965005',
'170745003',
'164971000119101',
'64572001',
'404684003',
'170763003',
'308537004',
'441742003',
'24481000000101',
'75934005',
'243860001',
'44054006',
'126877002',
'362969004',
'307824009',
'444110003',
'127325009',
'243851009',
'138875005',
'20957000',
'243815002'],
'child_map': {'307824009': ['243815002'],
'362965005': ['362969004'],
'404684003': ['307824009', '127325009', '64572001'],
'441742003': ['170763003'],
'126877002': ['73211009'],
'170763003': ['444110003'],
'64572001': ['362965005', '75934005'],
'138875005': ['404684003'],
'243851009': ['243860001'],
'170745003': ['24481000000101'],
'243815002': ['308537004'],
'362969004': ['73211009'],
'20957000': ['126877002'],
'308537004': ['243851009'],
'127325009': ['441742003'],
'444110003': ['164971000119101'],
'44054006': ['444110003'],
'75934005': ['20957000'],
'73211009': ['44054006'],
'243860001': ['170745003']},
'parent_map': {'243815002': ['307824009'],
'362969004': ['362965005'],
'127325009': ['404684003'],
'170763003': ['441742003'],
'73211009': ['362969004', '126877002'],
'444110003': ['44054006', '170763003'],
'75934005': ['64572001'],
'307824009': ['404684003'],
'404684003': ['138875005'],
'243860001': ['243851009'],
'24481000000101': ['170745003'],
'308537004': ['243815002'],
'64572001': ['404684003'],
'126877002': ['20957000'],
'243851009': ['308537004'],
'441742003': ['127325009'],
'164971000119101': ['444110003'],
'20957000': ['75934005'],
'44054006': ['73211009'],
'362965005': ['64572001'],
'170745003': ['243860001']}}
In [48]: new_hierarchy_data
Out[48]:
{'root': '138875005',
'nodes': ['64572001',
'308537004',
'307824009',
'75934005',
'170763003',
'20957000',
'24481000000101',
'126877002',
'444110003',
'362965005',
'362969004',
'73211009',
'138875005',
'170745003',
'243815002',
'44054006',
'243860001',
'164971000119101',
'243851009',
'127325009',
'404684003',
'441742003'],
'child_map': {'444110003': ['164971000119101'],
'126877002': ['73211009'],
'404684003': ['307824009', '127325009', '64572001'],
'307824009': ['243815002'],
'64572001': ['75934005', '362965005'],
'127325009': ['441742003'],
'243851009': ['243860001'],
'138875005': ['404684003'],
'308537004': ['243851009'],
'170763003': ['444110003'],
'75934005': ['20957000'],
'362965005': ['362969004'],
'362969004': ['73211009'],
'441742003': ['170763003'],
'73211009': ['44054006'],
'20957000': ['126877002'],
'243815002': ['308537004'],
'44054006': ['444110003'],
'170745003': ['24481000000101'],
'243860001': ['170745003']},
'parent_map': {'164971000119101': ['444110003'],
'73211009': ['362969004', '126877002'],
'127325009': ['404684003'],
'243815002': ['307824009'],
'75934005': ['64572001'],
'441742003': ['127325009'],
'243860001': ['243851009'],
'404684003': ['138875005'],
'243851009': ['308537004'],
'307824009': ['404684003'],
'444110003': ['44054006', '170763003'],
'20957000': ['75934005'],
'64572001': ['404684003'],
'362969004': ['362965005'],
'170763003': ['441742003'],
'44054006': ['73211009'],
'126877002': ['20957000'],
'308537004': ['243815002'],
'362965005': ['64572001'],
'24481000000101': ['170745003'],
'170745003': ['243860001']}}
In [50]: original_hierarchy_data["nodes"] == new_hierarchy_data["nodes"]
Out[50]: False
In [51]: set(original_hierarchy_data["nodes"]) == set(new_hierarchy_data["nodes"])
Out[51]: True
In [55]: set(original_hierarchy_data["child_map"].keys()).difference(set(new_hierarchy_data["child_map"].keys()))
Out[55]: set()
In [56]: set(original_hierarchy_data["parent_map"].keys()).difference(set(new_hierarchy_data["parent_map"].keys()))
Out[56]: set()
In [62]: [(k,v) for k,v in original_hierarchy_data["child_map"].items() if v != new_hierarchy_data["child_map"][k]]
Out[62]: [('64572001', ['362965005', '75934005'])]
In [64]: new_hierarchy_data["child_map"]["64572001"]
Out[64]: ['75934005', '362965005']We can see that the nodes are the same between these hierarchies, just ordered differently. Since they are compared as lists rather than sets the comparison is order-sensitive.
Equally, the keys for the parent and child maps are the same, and where there are differing values, the codes within those values are the same just in a different order.