Gene regulatory network inference (with prior knowledge)

### Task motivation

Gene regulatory network (GRN) inference is a central problem in systems biology because it reveals the mechanisms that govern gene expression and cellular behavior. These insights advance our understanding of biological processes and matter for medical research, particularly for developing targeted therapies and understanding disease mechanisms.

#### Computational Challenges
Despite its importance, GRN inference from single-cell RNA-Seq (scRNA-Seq) data remains difficult for several reasons: the data are high-dimensional, noisy and sparse; the networks to be inferred are themselves sparse; no known negative edges are available (a positive-unlabeled setting); and several causal explanations can be consistent with the same data. Existing computational approaches often struggle with these issues, leading to inaccurate or overfitted models.

#### Research Gap
Current methods range from statistical correlation measures to advanced machine learning, each with limitations in accuracy, data requirements and interpretability. Several benchmarking studies exist, but they differ in their evaluation choices, such as the negative sampling strategy, the metrics used and whether synthetic or experimental data are used. What is missing is a more standardized benchmark built on biologically meaningful metrics.

### Task description

The task focuses on inferring GRNs from scRNA-Seq data. It is divided into two subtasks based on the availability of prior knowledge:
1. GRN inference **without** prior knowledge: inferring a GRN solely from scRNA-Seq data.
2. GRN inference **with** prior knowledge: inferring a GRN from scRNA-Seq data together with a prior knowledge graph (a subset of edges from the ground-truth GRN).

#### Input Data
- For Subtask 1: normalized and preprocessed scRNA-Seq data
- For Subtask 2: the same scRNA-Seq data plus a subset of the ground-truth GRN edges, given as prior knowledge
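
One way the Subtask 2 input could be derived is to sample a fraction of the ground-truth edges as the prior knowledge graph and keep the rest for evaluation. The sketch below is only an illustration: the function name `split_prior_edges`, the `prior_fraction` parameter, the toy gene names and the idea of scoring only on held-out edges are assumptions, not part of the proposal.

```python
import random


def split_prior_edges(ground_truth_edges, prior_fraction=0.2, seed=0):
    """Split ground-truth edges into a prior subset and a held-out subset.

    Hypothetical helper: the proposal only says the prior is "a subset of
    edges from the ground truth GRN"; uniform random sampling is an assumption.
    """
    edges = sorted(set(ground_truth_edges))  # deduplicate, fix the order
    rng = random.Random(seed)                # local RNG for reproducibility
    rng.shuffle(edges)
    n_prior = int(round(prior_fraction * len(edges)))
    return edges[:n_prior], edges[n_prior:]


# Toy ground truth as (regulator, target) pairs; gene names are made up.
ground_truth = [
    ("TF1", "G1"), ("TF1", "G2"), ("TF2", "G1"), ("TF2", "G3"), ("TF3", "G2"),
]
prior_edges, eval_edges = split_prior_edges(ground_truth, prior_fraction=0.4, seed=42)
```

Fixing the seed keeps the split reproducible across methods, so every method would see the same prior edges.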

#### Expected Output
The output for both subtasks is a predicted GRN, represented as a graph in which nodes are genes and edges indicate regulatory interactions. The quality of the predicted networks can be evaluated in two main ways:
1. Binary classification: each potential interaction (edge) is classified as either present or absent (as in [BEELINE](https://doi.org/10.1038%2Fs41592-019-0690-6))
2. Topological evaluation: the overall structure and properties of the predicted network are assessed (as in [this study](https://doi.org/10.1093/bioinformatics/btae267))

### Proposed ground-truth in datasets

1. Synthetic, curated and experimental datasets from [BEELINE](https://doi.org/10.1038%2Fs41592-019-0690-6)
2. Experimental datasets from [this paper](https://doi.org/10.1093%2Fg3journal%2Fjkad004)

### Initial set of methods to implement

1. MLPs
2. Graph Neural Network based diffusion models (GCN / GAT)

### Proposed control methods

1. Pearson / Spearman correlation
2. Random predictor
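
The two control methods can be sketched as edge-scoring functions over all ordered gene pairs. This is a minimal sketch under stated assumptions: the input layout (a dict mapping each gene to its expression values across cells), the function names and the toy data are hypothetical, and the absolute Pearson coefficient is used as the edge score. A Spearman variant would rank-transform each expression vector first.

```python
import math
import random
from itertools import permutations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_baseline(expression):
    """Score every ordered gene pair by |Pearson r| of their expression profiles."""
    return {
        (a, b): abs(pearson(expression[a], expression[b]))
        for a, b in permutations(expression, 2)
    }


def random_baseline(genes, seed=0):
    """Assign every ordered gene pair a uniform random score in [0, 1)."""
    rng = random.Random(seed)
    return {(a, b): rng.random() for a, b in permutations(genes, 2)}


# Toy expression matrix: gene -> values across four cells (made-up numbers).
expression = {
    "TF1": [1.0, 2.0, 3.0, 4.0],
    "G1": [2.0, 4.0, 6.0, 8.0],
    "G2": [1.0, 3.0, 2.0, 2.0],
}
corr_scores = correlation_baseline(expression)
random_scores = random_baseline(list(expression), seed=1)
```

Because correlation is symmetric, this baseline gives both directions of a gene pair the same score and cannot orient edges.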

### Proposed Metrics

#### Binary Classification
1. Link-equality metrics (AUROC / AUPRC)
2. Node-equality metrics (Mean Average Precision)
3. Precision@Top k
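
To make the link-level metrics concrete, here is a small dependency-free sketch. It assumes each candidate edge has a binary label (1 for a ground-truth edge, 0 otherwise) and a predicted score; the function names are hypothetical. AUPRC is approximated here by average precision, which is one common estimator and not necessarily the one the benchmark would adopt.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision values taken at the rank of each true edge."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def precision_at_k(labels, scores, k):
    """Fraction of true edges among the k highest-scoring candidates."""
    top = sorted(zip(scores, labels), key=lambda t: -t[0])[:k]
    return sum(label for _, label in top) / k


# Five candidate edges: two true edges ranked 1st and 3rd by score.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(auroc(labels, scores))              # 5/6
print(average_precision(labels, scores))  # (1/1 + 2/3) / 2 = 5/6
print(precision_at_k(labels, scores, 2))  # 0.5
```

A node-level variant such as Mean Average Precision could apply `average_precision` separately to the candidate edges of each gene and average the results.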

#### Topological Evaluation
1. Information Exchange (Average Shortest Path Length, Global and Local Efficiency)
2. Hub Topology (Assortativity, Clustering Coefficient, Centralization)
3. Hub Identification (PageRank, Betweenness, Radiality, Centrality)
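
As an illustration of the information-exchange metrics, the sketch below computes average shortest path length and global efficiency with breadth-first search. It makes simplifying assumptions that the proposal does not specify: the network is treated as undirected and unweighted, and the average path length is taken over reachable node pairs only. All function names are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_shortest_path_length(adj):
    """Mean hop distance over all reachable ordered node pairs."""
    total, count = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                count += 1
    return total / count if count else 0.0


def global_efficiency(adj):
    """Mean inverse hop distance over all ordered pairs (unreachable pairs add 0)."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += 1.0 / d
    return total / (n * (n - 1))


# Toy network: a three-gene chain A - B - C.
adjacency = build_adjacency([("A", "B"), ("B", "C")])
print(average_shortest_path_length(adjacency))  # 8/6
print(global_efficiency(adjacency))             # 5/6
```

Computing the same quantities on the predicted and the ground-truth network and comparing them is one possible way to turn these properties into a score.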
