TASK-5564 - Update data sources for CellBase 6.2 #696

jtarraga · 2024-07-01T06:36:59Z

Update data sources for CellBase 6.2

Copilot

Pull request overview

This pull request updates data sources for CellBase 6.2, involving a significant refactoring of the builder infrastructure and modernization of clinical variant processing.

Key Changes:

- Refactored builder class hierarchy by replacing `CellBaseBuilder` with `AbstractBuilder` as the base class
- Updated clinical variant indexers to handle new data formats and sources (ClinVar, COSMIC, CIViC, GWAS)
- Added new builders for polygenic scores (PGS Catalog)
- Enhanced gene annotation with additional data sources (imprinted genes, gene fusions from ChimerDB, gnomAD constraints)
- Updated data source versions and file formats (e.g., UniProt to version 202502, new ClinVar XML structure)

Reviewed changes

Copilot reviewed 107 out of 226 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
CosmicBuilder.java	Changed parent class from `CellBaseBuilder` to `AbstractBuilder`
ClinicalVariantBuilder.java	Major refactoring: added file validation, updated to process new ClinVar/COSMIC/CIViC/GWAS formats with version tracking
ClinicalIndexer.java	Added version/assembly fields, updated normalization config, removed inner `SequenceLocation` class, added constant for original property ID
ClinVarParser.java	Changed parent class from `CellBaseBuilder` to `AbstractBuilder`
ClinVarIndexer.java	Updated to handle new ClinVar format with version parameter, added import for `SequenceLocation`, improved error handling
CivicIndexerCallback.java	New file implementing CIViC data indexing callback with evidence entry creation
CivicIndexer.java	New file for CIViC data source integration
VariationBuilder.java	Complete rewrite to handle VCF files for non-human species
SpliceBuilder.java	Updated parent class and constant references
RocksDbManager.java	Added methods for gene imprinting and gene fusion data retrieval, added missing imports
RevelScoreBuilder.java	Updated to use new data model `ProteinSubstitutionPrediction`, enhanced error handling
RepeatsBuilder.java	Added configuration-based file validation and support for multiple repeat data sources
RegulatoryRegionBuilder.java	File deleted (deprecated)
RegulatoryFeatureBuilder.java	Complete rewrite with PFM matrix download and new file format handling
RefSeqGeneBuilderIndexer.java	Simplified to delegate to common gene builder methods
RefSeqGeneBuilder.java	Major refactoring with configuration-based file validation and improved indexing
PubMedBuilder.java	Enhanced with configuration-based validation and improved logging
ProteinBuilder.java	Updated to UniProt 202502 format with InterPro integration and chunk processing
PolygenicScoreBuilder.java	New file for PGS Catalog data processing
OntologyBuilder.java	Refactored with configuration-based file validation
MiRTarBaseIndexer.java	New file extracting miRTarBase indexing logic
InteractionBuilder.java	Changed parent class to `AbstractBuilder`
GenomeSequenceFastaBuilder.java	Updated parent class and improved logging
GeneExpressionAtlasBuilder.java	Changed parent class to `AbstractBuilder`
GeneBuilderUtils.java	File deleted (deprecated)
GeneBuilderIndexer.java	Extensive additions for constraints, imprinted genes, gene fusions, and ChimerDB integration
GeneBuilder.java	Complete rewrite delegating to Ensembl and RefSeq gene builders
DbSnpBuilder.java	Updated constant name and parent class
CellBaseBuilder.java	File deleted (replaced by AbstractBuilder)
CaddAllAnnotationBuilder.java	Changed parent class to `AbstractBuilder`
pom.xml	Version bump to 6.7.0-SNAPSHOT, added dependencies for commons-compress and commons-csv

Copilot · 2025-12-19T10:21:57Z

.../src/main/java/org/opencb/cellbase/lib/builders/clinical/variant/ClinicalVariantBuilder.java

+        if (checked) {
+            return;
+        }


The boolean field `checked` is referenced but not declared in this class. It should be declared as a class field (e.g., `private boolean checked = false;`) or inherited from `AbstractBuilder`.
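
A minimal sketch of the first option, assuming the guard is meant to make `check()` safe to call more than once. `GuardedBuilder`, `downloadPath`, and `validationRuns` are hypothetical names used only for illustration; they are not taken from `ClinicalVariantBuilder` or `AbstractBuilder`:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for a builder; not the actual CellBase class.
class GuardedBuilder {
    private final Path downloadPath;
    // Declared explicitly so the early-return guard in check() compiles.
    private boolean checked = false;
    // Counts how many times the validation body really ran (illustration only).
    private int validationRuns = 0;

    GuardedBuilder(Path downloadPath) {
        this.downloadPath = downloadPath;
    }

    void check() {
        if (checked) {
            return; // validation already done, nothing to repeat
        }
        if (!Files.isDirectory(downloadPath)) {
            throw new IllegalStateException("Not a directory: " + downloadPath);
        }
        validationRuns++;
        checked = true;
    }

    int getValidationRuns() {
        return validationRuns;
    }
}

class GuardedBuilderDemo {
    public static void main(String[] args) {
        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        GuardedBuilder builder = new GuardedBuilder(tmpDir);
        builder.check();
        builder.check(); // returns early because checked is already true
        System.out.println(builder.getValidationRuns());
    }
}
```

If `AbstractBuilder` already exposes such a field, declaring it again in the subclass would shadow it, so only one of the two options should be used.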

Copilot · 2025-12-19T10:21:58Z

cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/GeneBuilder.java

+        if (1 == 1) {
+            CellBaseJsonFileSerializer refSeqGeneSerializer = new CellBaseJsonFileSerializer(buildPath, REFSEQ_GENE_BASENAME);
+            this.refSeqGeneBuilder = new RefSeqGeneBuilder(downloadPath.resolve(REFSEQ_DATA), speciesConfiguration, configuration,
+                    refSeqGeneSerializer);
+        }


The condition `if (1 == 1)` is always true and suggests incomplete or placeholder code. Either replace it with a proper condition or make the block unconditional.
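
A sketch of the "proper condition" option, under assumptions: if the RefSeq builder is optional, a named flag documents the intent better than a tautology. `GeneBuilderSetup`, `refSeqEnabled`, and `activeBuilders` are hypothetical and do not appear in the PR; if RefSeq is always required, the `if` can simply be dropped instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the constructor logic; not the actual GeneBuilder.
class GeneBuilderSetup {
    private final List<String> activeBuilders = new ArrayList<>();

    GeneBuilderSetup(boolean refSeqEnabled) {
        // Always created in this sketch.
        activeBuilders.add("ensembl");
        // A named flag replaces the always-true `1 == 1` placeholder.
        if (refSeqEnabled) {
            activeBuilders.add("refseq");
        }
    }

    List<String> getActiveBuilders() {
        return activeBuilders;
    }
}

class GeneBuilderSetupDemo {
    public static void main(String[] args) {
        System.out.println(new GeneBuilderSetup(true).getActiveBuilders());
        System.out.println(new GeneBuilderSetup(false).getActiveBuilders());
    }
}
```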

Copilot · 2025-12-19T10:21:58Z

cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/GeneBuilder.java

+        if (1 == 1) {
+            refSeqGeneBuilder.check();
+        }


The condition `if (1 == 1)` is always true and suggests incomplete or placeholder code. Either replace it with a proper condition or make the block unconditional.

Copilot · 2025-12-19T10:21:58Z

cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/GeneBuilder.java

+        if (1 == 1) {
+            if (!Files.exists(downloadPath.resolve(REFSEQ_DATA).resolve(REFSEQ_GENE_OUTPUT_FILENAME))) {
+                refSeqGeneBuilder.parse();
            } else {
-                tabixReader = new TabixReader(tfbsFile.toAbsolutePath().toString(), tabixFile.toAbsolutePath().toString());
-            }
-
-            // Preparing the fasta file for fast accessing
-//            System.out.println("genomeSequenceFilePath.toString() = " + genomeSequenceFilePath.toString());
-            FastaIndex fastaIndex = new FastaIndex(genomeSequenceFilePath);
-
-            // Empty transcript and exon dictionaries
-            transcriptDict.clear();
-            exonDict.clear();
-            logger.info("Parsing gtf...");
-            GtfReader gtfReader = new GtfReader(gtfFile);
-
-            // Gene->Transcript->Feature->GTF line
-            Map<String, Map<String, Map<String, Object>>> gtfMap = null;
-            if (flexibleGTFParsing) {
-                gtfMap = loadGTFMap(gtfReader);
-                initializePointers(gtfMap);
-            }
-
-            Gtf gtf;
-            while ((gtf = getGTFEntry(gtfReader, gtfMap)) != null) {
-
-                if (gtf.getFeature().equals("gene") || gtf.getFeature().equals("transcript")
-                        || gtf.getFeature().equals("UTR") || gtf.getFeature().equals("Selenocysteine")) {
-                    continue;
-                }
-
-                String geneId = gtf.getAttributes().get("gene_id");
-                String transcriptId = gtf.getAttributes().get("transcript_id");
-                String geneName = gtf.getAttributes().get("gene_name");
-                if (newGene(gene, geneId)) {
-                    // If new geneId is different from the current then we must serialize before data new gene
-                    if (gene != null) {
-                        serializer.serialize(gene);
-                    }
-
-                    GeneAnnotation geneAnnotation = new GeneAnnotation(indexer.getExpression(geneId), indexer.getDiseases(geneName),
-                            indexer.getDrugs(geneName), indexer.getConstraints(geneId), indexer.getMirnaTargets(geneName),
-                            indexer.getCancerGeneCensus(geneName), indexer.getCancerHotspot(geneName));
-
-                    gene = new Gene(geneId, geneName, gtf.getSequenceName().replaceFirst("chr", ""),
-                            gtf.getStart(), gtf.getEnd(), gtf.getStrand(), gtf.getAttributes().get("gene_version"),
-                            gtf.getAttributes().get("gene_biotype"), "KNOWN", SOURCE, indexer.getDescription(geneId),
-                            new ArrayList<>(), indexer.getMirnaGene(transcriptId), geneAnnotation);
-                }
-
-                // Check if Transcript exist in the Gene Set of transcripts
-                if (!transcriptDict.containsKey(transcriptId)) {
-                    transcript = getTranscript(gene, indexer, tabixReader, gtf, transcriptId);
-                } else {
-                    transcript = gene.getTranscripts().get(transcriptDict.get(transcriptId));
-                }
-
-                // At this point gene and transcript objects are set up
-                // Update gene and transcript genomic coordinates, start must be the
-                // lower, and end the higher
-                updateTranscriptAndGeneCoords(transcript, gene, gtf);
-
-                String transcriptIdWithoutVersion = transcript.getId().split("\\.")[0];
-                if (gtf.getFeature().equalsIgnoreCase("exon")) {
-                    // Obtaining the exon sequence
-                    String exonId = gtf.getAttributes().get("exon_id") + "." + gtf.getAttributes().get("exon_version");
-                    String exonSequence = fastaIndex.query(gtf.getSequenceName(), gtf.getStart(), gtf.getEnd());
-
-                    exon = new Exon(exonId, gtf.getSequenceName().replaceFirst("chr", ""),
-                            gtf.getStart(), gtf.getEnd(), gtf.getStrand(), 0, 0, 0, 0, 0, 0, -1, Integer.parseInt(gtf
-                            .getAttributes().get("exon_number")), exonSequence);
-                    transcript.getExons().add(exon);
-
-                    exonDict.put(transcriptIdWithoutVersion + "_" + exon.getExonNumber(), exon);
-                    if (gtf.getAttributes().get("exon_number").equals("1")) {
-                        cdna = 1;
-                        cds = 1;
-                    } else {
-                        // with every exon we update cDNA length with the previous exon length
-                        cdna += exonDict.get(transcriptIdWithoutVersion + "_" + (exon.getExonNumber() - 1)).getEnd()
-                                - exonDict.get(transcriptIdWithoutVersion + "_" + (exon.getExonNumber() - 1)).getStart() + 1;
-                    }
-                } else {
-                    exon = exonDict.get(transcriptIdWithoutVersion + "_" + exon.getExonNumber());
-                    if (gtf.getFeature().equalsIgnoreCase("CDS")) {
-                        // Protein ID is only present in CDS lines
-                        String proteinId = gtf.getAttributes().get("protein_id") != null
-                                ? gtf.getAttributes().get("protein_id") + "." + gtf.getAttributes().get("protein_version")
-                                : "";
-                        transcript.setProteinId(proteinId);
-                        transcript.setProteinSequence(indexer.getProteinFasta(proteinId));
-
-                        if (gtf.getStrand().equals("+") || gtf.getStrand().equals("1")) {
-                            // CDS states the beginning of coding start
-                            exon.setGenomicCodingStart(gtf.getStart());
-                            exon.setGenomicCodingEnd(gtf.getEnd());
-
-                            // cDNA coordinates
-                            exon.setCdnaCodingStart(gtf.getStart() - exon.getStart() + cdna);
-                            exon.setCdnaCodingEnd(gtf.getEnd() - exon.getStart() + cdna);
-                            // Set cdnaCodingEnd to prevent those cases without stop_codon
-
-                            transcript.setCdnaCodingEnd(gtf.getEnd() - exon.getStart() + cdna);
-                            exon.setCdsStart(cds);
-                            exon.setCdsEnd(gtf.getEnd() - gtf.getStart() + cds);
-
-                            // increment in the coding length
-                            cds += gtf.getEnd() - gtf.getStart() + 1;
-                            transcript.setCdsLength(cds - 1);  // Set cdnaCodingEnd to prevent those cases without stop_codon
-
-                            exon.setPhase(Integer.parseInt(gtf.getFrame()));
-
-                            if (transcript.getGenomicCodingStart() == 0 || transcript.getGenomicCodingStart() > gtf.getStart()) {
-                                transcript.setGenomicCodingStart(gtf.getStart());
-                            }
-                            if (transcript.getGenomicCodingEnd() == 0 || transcript.getGenomicCodingEnd() < gtf.getEnd()) {
-                                transcript.setGenomicCodingEnd(gtf.getEnd());
-                            }
-                            // only first time
-                            if (transcript.getCdnaCodingStart() == 0) {
-                                transcript.setCdnaCodingStart(gtf.getStart() - exon.getStart() + cdna);
-                            }
-                            // strand -
-                        } else {
-                            // CDS states the beginning of coding start
-                            exon.setGenomicCodingStart(gtf.getStart());
-                            exon.setGenomicCodingEnd(gtf.getEnd());
-                            // cDNA coordinates
-                            // cdnaCodingStart points to the same base position than genomicCodingEnd
-                            exon.setCdnaCodingStart(exon.getEnd() - gtf.getEnd() + cdna);
-                            // cdnaCodingEnd points to the same base position than genomicCodingStart
-                            exon.setCdnaCodingEnd(exon.getEnd() - gtf.getStart() + cdna);
-                            // Set cdnaCodingEnd to prevent those cases without stop_codon
-                            transcript.setCdnaCodingEnd(exon.getEnd() - gtf.getStart() + cdna);
-                            exon.setCdsStart(cds);
-                            exon.setCdsEnd(gtf.getEnd() - gtf.getStart() + cds);
-
-                            // increment in the coding length
-                            cds += gtf.getEnd() - gtf.getStart() + 1;
-                            transcript.setCdsLength(cds - 1);  // Set cdnaCodingEnd to prevent those cases without stop_codon
-                            exon.setPhase(Integer.parseInt(gtf.getFrame()));
-
-                            if (transcript.getGenomicCodingStart() == 0 || transcript.getGenomicCodingStart() > gtf.getStart()) {
-                                transcript.setGenomicCodingStart(gtf.getStart());
-                            }
-                            if (transcript.getGenomicCodingEnd() == 0 || transcript.getGenomicCodingEnd() < gtf.getEnd()) {
-                                transcript.setGenomicCodingEnd(gtf.getEnd());
-                            }
-                            // only first time
-                            if (transcript.getCdnaCodingStart() == 0) {
-                                // cdnaCodingStart points to the same base position than genomicCodingEnd
-                                transcript.setCdnaCodingStart(exon.getEnd() - gtf.getEnd() + cdna);
-                            }
-                        }
-
-                    }
-//                if (gtf.getFeature().equalsIgnoreCase("start_codon")) {
-//                    // nothing to do
-//                    System.out.println("Empty block, this should be redesigned");
-//                }
-                    if (gtf.getFeature().equalsIgnoreCase("stop_codon")) {
-                        //                      setCdnaCodingEnd = false; // stop_codon found, cdnaCodingEnd will be set here,
-                        //                      no need to set it at the beginning of next feature
-                        if (exon.getStrand().equals("+")) {
-                            updateStopCodingDataPositiveExon(exon, cdna, cds, gtf);
-
-                            cds += gtf.getEnd() - gtf.getStart();
-                            // If stop_codon appears, overwrite values
-                            transcript.setGenomicCodingEnd(gtf.getEnd());
-                            transcript.setCdnaCodingEnd(gtf.getEnd() - exon.getStart() + cdna);
-                            transcript.setCdsLength(cds - 1);
-
-                        } else {
-                            updateNegativeExonCodingData(exon, cdna, cds, gtf);
-
-                            cds += gtf.getEnd() - gtf.getStart();
-                            // If stop_codon appears, overwrite values
-                            transcript.setGenomicCodingStart(gtf.getStart());
-                            // cdnaCodingEnd points to the same base position than genomicCodingStart
-                            transcript.setCdnaCodingEnd(exon.getEnd() - gtf.getStart() + cdna);
-                            transcript.setCdsLength(cds - 1);
-                        }
-                    }
-                }
-            }
-
-            // last gene must be serialized
-            serializer.serialize(gene);
-
-            // cleaning
-            gtfReader.close();
-            serializer.close();
-            fastaIndex.close();
-            indexer.close();
-        } catch (Exception e) {
-            indexer.close();
-            throw e;
-        }
-    }
-
-    private Transcript getTranscript(Gene gene, EnsemblGeneBuilderIndexer indexer, TabixReader tabixReader, Gtf gtf, String transcriptId)
-            throws IOException, RocksDBException {
-        Map<String, String> gtfAttributes = gtf.getAttributes();
-
-        // To match Ensembl, we set the ID as transcript+version. This also matches the Ensembl website.
-        String transcriptIdWithVersion = transcriptId + "." + gtfAttributes.get("transcript_version");
-        String biotype = gtfAttributes.get("transcript_biotype") != null ? gtfAttributes.get("transcript_biotype") : "";
-        String transcriptChromosome = gtf.getSequenceName().replaceFirst("chr", "");
-        List<TranscriptTfbs> transcriptTfbses = getTranscriptTfbses(gtf, transcriptChromosome, tabixReader);
-
-        List<FeatureOntologyTermAnnotation> ontologyAnnotations = getOntologyAnnotations(indexer.getXrefs(transcriptId), indexer);
-        TranscriptAnnotation transcriptAnnotation = new TranscriptAnnotation(ontologyAnnotations, indexer.getConstraints(transcriptId));
-
-        Transcript transcript = new Transcript(transcriptIdWithVersion, gtfAttributes.get("transcript_name"), transcriptChromosome,
-                gtf.getStart(), gtf.getEnd(), gtf.getStrand(), biotype, "KNOWN",
-                0, 0, 0, 0, 0,
-                indexer.getCdnaFasta(transcriptIdWithVersion), "", "", "",
-                gtfAttributes.get("transcript_version"), SOURCE, new ArrayList<>(), indexer.getXrefs(transcriptId), transcriptTfbses,
-                new HashSet<>(), transcriptAnnotation);
-
-        // Adding Ids appearing in the GTF to the xrefs is required, since for some unknown reason the ENSEMBL
-        // Perl API often doesn't return all genes resulting in an incomplete xrefs.txt file. We must ensure
-        // that the xrefs array contains all ids present in the GTF file
-        addGtfXrefs(transcript, gene, gtfAttributes);
-
-        // Add HGNC ID mappings, with this we can know which Ensembl and Refseq transcripts match to HGNC ID
-        String hgncId = indexer.getHgncId(gene.getName());
-        if (StringUtils.isNotEmpty(hgncId)) {
-            transcript.getXrefs().add(new Xref(hgncId, "hgnc_id", "HGNC ID"));
-        }
-
-        // Add MANE Select mappings, with this we can know which Ensembl and Refseq transcripts match according to MANE
-        for (String suffix: Arrays.asList("refseq", "refseq_protein")) {
-            String maneRefSeq = indexer.getMane(transcriptIdWithVersion, suffix);
-            if (StringUtils.isNotEmpty(maneRefSeq)) {
-                transcript.getXrefs().add(new Xref(maneRefSeq, "mane_select_" + suffix,
-                        "MANE Select RefSeq" + (suffix.contains("_") ? " Protein" : "")));
+                logger.info(DATA_ALREADY_BUILT, getDataName(REFSEQ_DATA) + " gene");
            }
        }


The condition `if (1 == 1)` is always true and suggests incomplete or placeholder code. Either replace it with a proper condition or make the block unconditional.
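
With the placeholder removed, the remaining logic in this hunk is a skip-if-already-built check. A minimal sketch of that pattern, assuming the parse step can be passed in as a `Runnable`; `SkipIfBuilt` and `buildIfMissing` are hypothetical names, not CellBase APIs:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper illustrating the unconditional form of the check.
class SkipIfBuilt {
    // Runs parseStep only when outputFile does not exist yet.
    // Returns true if the step ran, false if it was skipped.
    static boolean buildIfMissing(Path outputFile, Runnable parseStep) {
        if (Files.exists(outputFile)) {
            System.out.println("Data already built: " + outputFile);
            return false;
        }
        parseStep.run();
        return true;
    }
}

class SkipIfBuiltDemo {
    public static void main(String[] args) {
        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        // A path that should not exist, so the parse step is expected to run.
        Path missing = tmpDir.resolve("refseq-gene-" + System.nanoTime() + ".json.gz");
        SkipIfBuilt.buildIfMissing(missing, () -> System.out.println("Parsing..."));
        // The temp directory itself exists, so the step is skipped.
        SkipIfBuilt.buildIfMissing(tmpDir, () -> System.out.println("Parsing..."));
    }
}
```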

Copilot · 2025-12-19T10:21:58Z

cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/GeneBuilder.java

+        if (1 == 1) {
+            if (isHSapiens || isDataSupported(configuration.getDownload().getManeSelect(), prefixId)) {
+                dataList.add(MANE_SELECT_DATA);
            }


The condition `if (1 == 1)` is always true and suggests incomplete or placeholder code. Either replace it with a proper condition or make the block unconditional.

The merge-base changed after approval.

jtarraga requested a review from imedina July 1, 2024 06:37

j-coll changed the title ~~TASK-5564 Update data sources for CellBase 6.2~~ TASK-5564 - Update data sources for CellBase 6.2 Jul 1, 2024

jtarraga and others added 28 commits July 24, 2024 16:10

lib: fix minor Sonar issue, #TASK-5575, #TASK-5564

4ba788d

lib: improve protein downloader by checking if data is already downloaded, #TASK-5575, #TASK-5564

6fc7129

lib: improve variation downloader by checking if data is already downloaded, #TASK-5575, #TASK-5564

8ed0e0d

lib: fix variation folder in downloader, #TASK-5575, #TASK-5564

1442766

core: remove DISGENET, #TASK-5575, #TASK-5564

e48d27d

lib: improve gene downloader, removing DISGENET, fixing Sonar issues, #TASK-5575, #TASK-5564

642935a

lib: fix command line to execute Perl script, #TASK-5575, #TASK-5564

8030b02

lib: add files generated by scripts in the version JSON files, #TASK-5575, #TASK-5564

e17e51d

lib: improve genome builder by checking files, and fixing Sonar issues, #TASK-5576, #TASK-5564

733cade

lib: take into account the parameter --keep when gunzip, #TASK-5576, #TASK-5564

ddc1056

lib: improve conservation builder by adding checks, log messages and fixing Sonar issues, #TASK-5576, #TASK-5564

8c6dc78

lib: add support for multi-species, checks and log messages in the repeats builder, #TASK-5576, #TASK-5564

847f835

lib: add support for multi-species, checks and log messages in regulation builder, #TASK-5576, #TASK-5564

b0d1c67

Merge branch 'release-6.2.x' into TASK-6399

280bb79

lib: fix protein builder, #TASK-5576, #TASK-5564

039aa81

lib: fix gene downloader for RefSeq files, #TASK-5575, #TASK-5564

7f77dec

Prepare Port patch #TASK-6515

b52c83d

Merge branch 'release-6.2.x' into TASK-6515

1c1d8a2

lib: improve gene (Ensembl/RefSeq) builder by supporting multi-species (e.g., mmusculus), #TASK-6426, #TASK-5564

0eb898e

lib: fix Sonar issues, #TASK-5576, #TASK-5564

1d47fd9

lib: add variant and variant_structural_variations in the configuration file for species (e.g., mmusculus), and update the variant downloader according to these changes, #TASK-6426, #TASK-5564

7fbc054

app: improve CellBase loader by creating a new function to be reused by the different data, e.g., repeats, #TASK-6142, #TASK-5564

d483dcf

lib: improve genome sequence and info loader, #TASK-6142, #TASK-5564

7f62ce7

- Re-using the function loadJsonFile
- Adding a MongoDB index for the collection genome info
- Adding log messages

app: update CellBase loader for conservation data, #TASK-6142, #TASK-5564

0602bba

app: update CellBase loader for genes and proteins according to the previous changes, #TASK-6142, #TASK-5564

2b4fbeb

lib: add VariantBuilder to generate the variation JSON files from VCF files, and rename some constants, #TASK-5776, #TASK-5564

d693f57

app: update the CellBase loader for variation data according to the latest changes, #TASK-6142, #TASK-5564

38400c1

app: add check before building variation data, #TASK-5776, #TASK-5564

3117337

j-coll and others added 25 commits November 7, 2025 15:38

cicd: Add missing dependency_repos. #TASK-7809

6ef97fa

lib: reduce batch size for PubMed data when loading, #TASK-6142, #TASK-5564

edf4515

lib: fix Sonar issues, #TASK-6142, #TASK-5564

7d1f297

lib: Fix avro native types check. #TASK-7809

c204863

Merge pull request #736 from opencb/TASK-7809

51ae209

TASK-7809 - Upgrade avro version from 1.9.1 to 1.11.4

Merge branch 'release-6.x.x' into TASK-5564

46de716

And fix checkstyle after merging

core: fix configuration file for JUnit test, #TASK-5564

604e42e

lib: fix loader (PubMed), #TASK-6142, #TASK-5564

28afbdd

server: improve default data releases for multiple species, #TASK-5564

abf94a0

server: improve messages in endpoint /meta/about, #TASK-5564

5addb64

cicd: added to test-analysis -DCELLBASE.WAR.NAME=cellbase #TASK-5564

c135e05

cicd: added to test-analysis -DCELLBASE.WAR.NAME=cellbase #TASK-5564

b6e7363

lib: use estimatedCount to speedup count queries, #TASK-5564

6f907ff

Merge branch 'TASK-5564' of https://github.com/opencb/cellbase into TASK-5564

1a60716

core: use the env. variable CELLBASE_SECRET_KEY, #TASK-8046, #TASK-5564

9b97afc

lib: add more indexes for the collection 'pubmed', #TASK-5564

0750f55

server: add some admin/endpoints to API key management, TASK-7912, #TASK-5564

f9129c4

lib: add DbSnpDownloader, and remove building and loading dbSNP data from the variation processing, #TASK-5564

167dcf1

lib: backwards compatibility, #TASK-5564

31286ce

lib: fix typo, #TASK-5564

0a2d312

lib: catch exceptions in the different consequence type calculators, #TASK-5564

210a4d7

lib: fix PGS include, and remove some System.out, #TASK-5564

3873120

app: implement a Python script to compare performances, #TASK-5564

9fe6dec

app: add Python script to get metrics for a given variant annotation query, #TASK-5564

45b2c44

lib: disable polygenic scores and mirna targets, to be enabled in future releases, #TASK-5564

8f2da9c

j-coll requested a review from Copilot December 19, 2025 10:19

Copilot AI reviewed Dec 19, 2025

gpveronica requested a review from dapregi December 19, 2025 12:05

dapregi previously approved these changes Dec 19, 2025

8 participants
