something like this?
@slobentanzer @scottgigante-immunai Regardless of how we decide to resolve this issue, I'm sure there are already many items we can define.
Originally posted by @rcannood in #247 (comment)
For instance:
Common dataset workflow
```mermaid
graph LR
classDef component fill:#decbe4,stroke:#333,color:#000
classDef anndata fill:#d9d9d9,stroke:#333,color:#000
classDef group fill:#ffffff,stroke:#333,color:#000
normalization:::group
dataset_processors:::group
raw_dataset["Raw dataset"]:::anndata
common_dataset[Common<br/>dataset]:::anndata
dataset_loader[/Dataset<br/>loader/]:::component
subgraph normalization [Normalization methods]
  log_cpm[/"Log CPM"/]:::component
  l1_sqrt[/"L1 sqrt"/]:::component
  log_scran_pooling[/"Log scran<br/>pooling"/]:::component
  sqrt_cpm[/"Sqrt CPM"/]:::component
end
subgraph dataset_processors [Dataset processors]
  pca[/PCA/]:::component
  hvg[/HVG/]:::component
  knn[/KNN/]:::component
end
dataset_loader --> raw_dataset --> log_cpm & l1_sqrt & log_scran_pooling & sqrt_cpm --> pca --> hvg --> knn --> common_dataset
```
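As a rough illustration of one normalization component, here's a plain-Python sketch of log CPM (counts scaled to one million per cell, then log1p). The function name and list-of-lists layout are just for illustration; the real components operate on AnnData objects.

```python
import math

def log_cpm(counts):
    """Log CPM normalization sketch.

    `counts` is a list of cells, each a list of per-gene raw counts.
    Each cell is scaled to a total of 1e6 counts, then log1p-transformed.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        normalized.append([math.log1p(c / total * 1e6) for c in cell])
    return normalized

# Two toy cells with three genes each.
raw = [[0, 5, 95], [10, 10, 80]]
norm = log_cpm(raw)
```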
Task-specific benchmarking workflow
```mermaid
graph LR
classDef component fill:#decbe4,stroke:#333,color:#000
classDef anndata fill:#d9d9d9,stroke:#333,color:#000
common_dataset[Common<br/>dataset]:::anndata
dataset_processor[/Dataset<br/>processor/]:::component
solution[Ground-truth]:::anndata
masked_data[Input data]:::anndata
method[/Method/]:::component
control_method[/Control<br/>method/]:::component
output[Prediction]:::anndata
metric[/Metric/]:::component
score[Score]:::anndata
common_dataset --> dataset_processor --> masked_data
dataset_processor --> solution
masked_data --> method --> output
masked_data & solution --> control_method --> output
solution & output --> metric --> score
```
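To make the data flow concrete, here's a minimal Python sketch of the benchmarking loop with toy stand-ins for each component. All names, the dummy method, and the accuracy metric are illustrative only, not the real OpenProblems API.

```python
def dataset_processor(common_dataset):
    """Split the common dataset into method input and held-out ground truth."""
    masked_data = {"values": common_dataset["values"]}
    solution = common_dataset["labels"]
    return masked_data, solution

def method(masked_data):
    """A dummy method: predict label 0 for every observation."""
    return [0 for _ in masked_data["values"]]

def control_method(masked_data, solution):
    """A positive control: a 'method' that is allowed to peek at the solution."""
    return list(solution)

def metric(solution, output):
    """Accuracy of the prediction against the ground truth."""
    return sum(s == o for s, o in zip(solution, output)) / len(solution)

# Toy common dataset: four observations with binary labels.
common_dataset = {"values": [0.1, 0.9, 0.4, 0.7], "labels": [0, 1, 0, 1]}
masked_data, solution = dataset_processor(common_dataset)
score = metric(solution, method(masked_data))                          # 0.5
control_score = metric(solution, control_method(masked_data, solution))  # 1.0
```

The positive control gives an upper bound on the metric, which helps put each method's score in context.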
Discussion
However, this workflow might not be applicable to all tasks.
- Multimodal datasets will have to be processed differently from regular unimodal datasets
- Some tasks don't really have a ground truth and instead rely on internal scores. IMO these "benchmarks" should not be part of OpenProblems, since they don't really count as benchmarks.