Reference file update #7

@knaegle

Feature Request
Build a reference file update for the ProteomeScout flat text files. This should migrate all old records to current UniProt records.

Describe the solution you'd like
Considerations:
Migration:

  1. It should migrate all records with a UniProt ID.
  2. Migration works as follows: if the original sequence and the new sequence are identical, make no changes to PTMs. If they differ, run a pairwise alignment and migrate only if the global alignment is sufficiently good, then check the local alignment in the sequence before updating PTM information (site numbers).
  3. All protein annotation data should be updated (GO terms, domains from InterPro).

Update:

  4. Once migration is complete (all old data moved to new protein records), perform an integration update with key resources (PSP and UniProt PTM annotations). We need to decide how many species of PTMs we will pull.
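
To make the migration rule in step 2 concrete, here is a minimal sketch in plain Python. It is not an agreed implementation: the function names (`global_align`, `migrate_site`), the 0.9 identity threshold, the 3-residue flanking window, and the simple match/mismatch/gap scoring are all assumptions for illustration. A real version would likely use a substitution matrix and an established alignment library.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a toy scoring scheme (assumption).

    Returns (identity, mapping), where identity is matches / alignment columns
    and mapping sends 0-based positions in `a` to 0-based positions in `b`.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    mapping, matches, columns = {}, 0, 0
    i, j = n, m
    while i > 0 or j > 0:
        columns += 1
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            mapping[i - 1] = j - 1
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # residue in `a` aligned to a gap
        else:
            j -= 1  # residue in `b` aligned to a gap
    identity = matches / columns if columns else 1.0
    return identity, mapping


def migrate_site(old_seq, new_seq, site, min_identity=0.9, flank=3):
    """Map a 1-based PTM site from old_seq to new_seq, or return None.

    min_identity and flank are placeholder values, not project decisions.
    """
    if old_seq == new_seq:
        return site  # identical sequences: no change to the PTM
    identity, mapping = global_align(old_seq, new_seq)
    if identity < min_identity:
        return None  # global alignment not good enough to migrate
    old_idx = site - 1
    new_idx = mapping.get(old_idx)
    if new_idx is None:
        return None  # the modified residue was deleted in the new record
    # Local check: the residue and its flanking window must be unchanged.
    left = min(flank, old_idx, new_idx)
    right = min(flank, len(old_seq) - 1 - old_idx, len(new_seq) - 1 - new_idx)
    old_window = old_seq[old_idx - left:old_idx + right + 1]
    new_window = new_seq[new_idx - left:new_idx + right + 1]
    return new_idx + 1 if old_window == new_window else None
```

For example, with a made-up sequence `old = "MKTAYIAKQRQISFVKSHFSRQ"` and a new record that inserts one residue after the initial methionine, `migrate_site(old, "MA" + old[1:], 13)` shifts the site by one position, while a new record in which the modified serine itself has been substituted yields `None`. Returning `None` rather than a best guess keeps unmigrated sites easy to collect for the change log.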

Describe alternatives you've considered
Maintaining a MySQL database of this information is too labor-intensive without enough support for long-term maintenance, but the annotation and integration of the PTM database is a key feature that we want to maintain.

Additional context
We will want to log information and keep track of changes between old and new, update the reference files, and move forward with installation and configuration of the API.
