@GARRISON-2

Initial version of the VCF comparison program. There are still a lot of known issues, but it is in a relatively workable state. The main problems remaining in this version are the handling of position-only files and a mismatch between 1-based and 0-based offsets, which prevents the Levenshtein comparisons from ever returning a distance of 0.
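
For reviewers, here is a minimal sketch of how an offset mismatch can keep an edit distance from reaching 0. It is an illustration only, not the code in this PR: `levenshtein`, `slice_1_based`, `slice_0_based`, and the `reference` string are all hypothetical names and data. It assumes the issue comes from mixing 1-based inclusive coordinates (the VCF `POS` convention) with 0-based half-open slicing.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard row-by-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def slice_1_based(seq: str, start: int, end: int) -> str:
    """Extract [start, end], both 1-based and inclusive (VCF-style)."""
    return seq[start - 1:end]


def slice_0_based(seq: str, start: int, end: int) -> str:
    """Extract [start, end), 0-based and half-open (Python/BED-style)."""
    return seq[start:end]


# Hypothetical reference sequence, for illustration only.
reference = "ACGTACGTTAGC"

# The same region expressed correctly in each convention.
correct_a = slice_1_based(reference, 3, 6)
correct_b = slice_0_based(reference, 2, 6)

# The same 1-based coordinates mistakenly passed to the 0-based slicer.
shifted = slice_0_based(reference, 3, 6)
```

With consistent conventions, `levenshtein(correct_a, correct_b)` is 0. When the coordinates are mixed, the shifted slice loses a base and the distance is at least 1 even for identical alleles.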

GARRISON-2 and others added 2 commits January 7, 2026 18:06
* initial commit, scratch work using pandas

* redoing the file reading/parsing logic. Also added vcf data files to catalog

* created file info object, started work on comparisons

* Made changes to fileInfo object and began logic to grab genotype

* Changed file objects, improved position alignment

* Comparisons seem to be working, cleaned up code

* improved methodology for setting file info in VCF

* made script for looking at overlapping positions

* more program cleaning, added pos comps for vcfs

* began implementation of multi running VCFs

* logic for readers to be used in a with statement. added .gz streaming

* Created base class for readers

* Added more error handling and alignment checks

* moved functions to new util file, improved gt methods

* added logic for custom errors, simplified reader read method

* Created new class that uses VCFReader as super, removed internal self.fields use

* cleaned up mainloop, fixed chrom order check

* fixed error with straglr output, more method cleaning/debugging

* restructured folders for cleanliness
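
The commits above mention readers that work in a `with` statement and stream `.gz` input. A rough sketch of that pattern follows, assuming a tab-delimited file with `#`-prefixed header lines. The class name `LineReader` and its `records` method are hypothetical and are not taken from this PR's code.

```python
import gzip
from typing import IO, Iterator, List, Optional


class LineReader:
    """Hypothetical reader that opens plain or gzipped text lazily."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._handle: Optional[IO[str]] = None

    def __enter__(self) -> "LineReader":
        # gzip.open in text mode decompresses as lines are read,
        # so the whole file is never loaded into memory at once.
        if self.path.endswith(".gz"):
            self._handle = gzip.open(self.path, "rt")
        else:
            self._handle = open(self.path, "r")
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self._handle is not None:
            self._handle.close()
            self._handle = None
        return False  # do not suppress exceptions raised in the with body

    def records(self) -> Iterator[List[str]]:
        """Yield the tab-separated fields of each non-header line."""
        if self._handle is None:
            raise RuntimeError("LineReader must be used inside a with block")
        for line in self._handle:
            if line.startswith("#"):
                continue  # skip meta-information and column header lines
            yield line.rstrip("\n").split("\t")
```

Usage would look like `with LineReader("calls.vcf.gz") as reader:` followed by a loop over `reader.records()`. The file name here is a placeholder.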