Description
Is your feature request related to a problem?
We need to store information about how the funkyheatmap column info (docs) is generated. Right now, the column info contains the name of the component, the overall score (across all datasets and metrics), dataset-specific scores (per dataset, across metrics), and metric-specific scores (per metric, across datasets).
For batch integration we want to create additional columns in the column info which show:
- whether a method supports feature, embedding and/or graph outputs
- overall scores per metric type (feature, embedding, graph)
For spatially variable genes, we want to provide different ways of aggregating the datasets, because there are too many to show individually, e.g. by tissue or by disease.
Describe the solution you'd like
Perhaps the info could be stored in the _viash.yaml, e.g.
    info:
      column_info:
        - { geom: text, id: method_name, name: Name }
        - { geom: text, id: method_outputs_feature, name: Outputs feature matrix }
        - { geom: text, id: method_outputs_embedding, name: Outputs embedding }
        - { geom: text, id: method_outputs_graph, name: Outputs graph }
        # average of the scores that are relevant for the feature outputs,
        # i.e. the feature + embedding + graph metrics
        - { geom: bar, id: overall_feature, name: Overall feature scores }
        # average of the embedding + graph metrics
        - { geom: bar, id: overall_embedding, name: Overall embedding scores }
        # average of the graph metrics
        - { geom: bar, id: overall_graph, name: Overall graph scores }

However, this also requires being able to specify how values like method_outputs_feature and overall_graph are computed.
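To make the question concrete, here is a rough sketch of what such a computation could look like for a single method. It is only an illustration: `METRIC_TYPES`, `RELEVANT_TYPES`, `summarise_method` and the metric ids are hypothetical names, not part of viash or funkyheatmap, and the averaging rule is simply the one described in the comments of the snippet above.

```python
from statistics import mean

# Hypothetical mapping from metric id to metric type; in practice this
# would have to come from the metric components' metadata.
METRIC_TYPES = {
    "hvg_overlap": "feature",
    "asw_batch": "embedding",
    "graph_connectivity": "graph",
}

# Metric types that count towards each overall score, following the issue:
# feature = feature + embedding + graph, embedding = embedding + graph,
# graph = graph only.
RELEVANT_TYPES = {
    "feature": {"feature", "embedding", "graph"},
    "embedding": {"embedding", "graph"},
    "graph": {"graph"},
}


def summarise_method(output_types, scores):
    """Build the extra column values for one method.

    output_types: set of output types the method supports.
    scores: dict mapping metric id -> score (already averaged over datasets).
    """
    row = {}
    for out_type, relevant in RELEVANT_TYPES.items():
        # Text column: does the method produce this kind of output?
        row[f"method_outputs_{out_type}"] = "yes" if out_type in output_types else "no"
        # Bar column: mean over all available scores of the relevant types.
        values = [s for m, s in scores.items() if METRIC_TYPES[m] in relevant]
        row[f"overall_{out_type}"] = mean(values) if values else None
    return row
```

For example, `summarise_method({"embedding"}, {"asw_batch": 0.75, "graph_connectivity": 1.0})` would mark only the embedding output as supported and average both scores for `overall_embedding`. The open question is where rules like `RELEVANT_TYPES` should live so that they do not have to be hard-coded per task.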
In addition, when computing averages per dataset tissue type across metrics, for example, we don't know ahead of time which tissue types there are. We could require the user to know the tissue types beforehand, but then whenever datasets with new tissue types are added, the user would have to update the values in the _viash.yaml.
Therefore, perhaps the computation of these columns could be expressed as some kind of query?
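As a starting point for discussion, a query-like spec could name only the metadata field to group by, with the actual groups discovered from the datasets at run time. The sketch below assumes a hypothetical spec with `group_by`, `id_prefix` and `geom` keys and a hypothetical `expand_grouped_columns` helper; none of these exist in viash or funkyheatmap today.

```python
from collections import defaultdict
from statistics import mean


def expand_grouped_columns(spec, dataset_info, scores):
    """Expand one grouped column spec into concrete columns and values.

    spec: dict with 'group_by' (dataset metadata field), 'id_prefix', 'geom'.
    dataset_info: dict mapping dataset id -> metadata dict.
    scores: iterable of (method_id, dataset_id, score) tuples.
    Returns (columns, values), where values maps method id -> {column id: mean}.
    """
    field = spec["group_by"]
    prefix = spec["id_prefix"]

    # Collect scores per (method, group); the groups are whatever values
    # the metadata field takes, so nothing is listed ahead of time.
    grouped = defaultdict(list)
    for method_id, dataset_id, score in scores:
        group = dataset_info[dataset_id][field]
        grouped[(method_id, group)].append(score)

    # One column per discovered group, in a stable order.
    groups = sorted({group for _, group in grouped})
    columns = [
        {"geom": spec["geom"], "id": f"{prefix}_{g}", "name": f"{field}: {g}"}
        for g in groups
    ]

    # Mean score per method within each group.
    values = defaultdict(dict)
    for (method_id, group), vals in grouped.items():
        values[method_id][f"{prefix}_{group}"] = mean(vals)
    return columns, dict(values)
```

With a spec such as `{"group_by": "tissue", "id_prefix": "dataset_tissue", "geom": "bar"}`, adding a dataset with a new tissue type would produce a new column without any change to the _viash.yaml.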