
BayesianOptimization for parameter selection #374

@SeZuNo

Hi,
Thank you so much for this wonderful package!

I'm currently analyzing different subpopulations, and miloR performs well on immune and epithelial cells compared with Seurat clustering (after Harmony integration). For stromal cells (41K cells, very homogeneous) we have trouble with Seurat clustering, as none of the detected clusters follow markers from the literature. For that reason I tried using the neighborhood groups from Milo as clusters. At first glance the 15 detected groups appear to follow some real markers; ideally they would be grouped into 6-8 clusters (NhoodGroups).

Does it make sense to you to use the kNN graph from miloR to predict clusters, and to use the same graph to compute a UMAP projection of the nhood groups? (In my tests, when the usual Milo run is projected onto the previously calculated UMAP, the nhood groups appear scattered, whereas on the UMAP calculated from the kNN graph the nhood groups sit consistently close together; these would then be the clusters.)

To find the parameters (k, d and prop) that yield DA neighborhoods below an FDR of 0.2, I ran the optimization over a continuous space, which means the best combinations for k and d were decimals. I only noticed this detail later. The documentation does say that k and d are integers, but feeding in the optimized parameters raised no warnings, which suggests the values are coerced at some point.

Is there any degree of confidence in continuing with the decimal parameters, or do they have to be integers? At least three functions take explicit k, d parameters: buildGraph(), makeNhoods() and calcNhoodDistance(). I don't know whether more functions are involved under the hood; I guess so, and the question then is whether they all coerce the decimal in the same way (e.g. floor() or round()).
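
To illustrate why the coercion method matters: base R's conversions do not agree with each other on decimal input. The following is a minimal base-R sketch; the values 29.7 and 30.5 are made up for illustration, and which conversion miloR applies internally is not established here.

```r
# Hypothetical decimal values, as a continuous optimizer might return them
k_opt <- 29.7
d_opt <- 30.5

# Different conversions can give different integers for the same input
k_trunc <- as.integer(k_opt)  # drops the fractional part
k_floor <- floor(k_opt)       # rounds down
k_round <- round(k_opt)       # rounds to the nearest whole number
d_round <- round(d_opt)       # exact halves go to the even neighbour in R

# Converting explicitly before any call removes the ambiguity
k_use <- as.integer(round(k_opt))
d_use <- as.integer(round(d_opt))
print(c(k_trunc = k_trunc, k_floor = k_floor, k_round = k_round, d_round = d_round))
```

Passing `k_use` and `d_use` to every function that takes k or d would guarantee that all steps see the same integers, whatever each function does with non-integer input.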

I have also run the optimization over a discrete space, and there no k, d, prop combination yields any DA neighborhoods.

The data are endometrial and consist of 2 conditions and 7 samples.
k is searched between 20 and 40,
d between 30 and 50,
and prop between 0.1 and 0.3.
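
For concreteness, the ranges above can be sketched as a bounds list. This assumes the convention described in the ParBayesianOptimization documentation, where bounds written with an `L` suffix mark a parameter as an integer; the list below is an illustration only and does not call the optimizer.

```r
# Search ranges from this post; the L suffix makes k and d integer-typed,
# while prop stays continuous
bounds <- list(
  k    = c(20L, 40L),
  d    = c(30L, 50L),
  prop = c(0.1, 0.3)
)

# Show which parameters are stored as integers
int_flags <- sapply(bounds, is.integer)
print(int_flags)
```

A list of this shape is what would be supplied as the `bounds` argument of `bayesOpt()`.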

Any pointers would be greatly appreciated.

Thank you so much!
sessionInfo()

R version 4.4.2 (2024-10-31)
Platform: x86_64-conda-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS/LAPACK: /home/ubuntu/anaconda3/envs/R44/lib/libopenblasp-r0.3.29.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8          LC_NUMERIC=C             
 [3] LC_TIME=C.UTF-8           LC_COLLATE=C.UTF-8       
 [5] LC_MONETARY=C.UTF-8       LC_MESSAGES=C.UTF-8      
 [7] LC_PAPER=C.UTF-8          LC_NAME=C.UTF-8          
 [9] LC_ADDRESS=C.UTF-8        LC_TELEPHONE=C.UTF-8     
[11] LC_MEASUREMENT=C.UTF-8    LC_IDENTIFICATION=C.UTF-8

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] ParBayesianOptimization_1.2.6 scater_1.34.0                
 [3] scuttle_1.16.0                SingleCellExperiment_1.28.1  
 [5] miloR_2.3.1                   AUCell_1.28.0                
 [7] patchwork_1.3.0               CellChat_2.1.2               
 [9] igraph_2.0.3                  ComplexHeatmap_2.22.0        
[11] DEGreport_1.42.0              apeglm_1.28.0                
[13] pheatmap_1.0.12               pasilla_1.34.0               
[15] DEXSeq_1.52.0                 AnnotationDbi_1.68.0         
[17] BiocParallel_1.40.0           ROTS_1.34.0                  
[19] DESeq2_1.46.0                 SummarizedExperiment_1.36.0  
[21] Biobase_2.66.0                MatrixGenerics_1.18.1        
[23] matrixStats_1.5.0             GenomicRanges_1.58.0         
[25] GenomeInfoDb_1.42.3           IRanges_2.40.1               
[27] S4Vectors_0.44.0              BiocGenerics_0.52.0          
[29] edgeR_4.4.2                   limma_3.62.2                 
[31] openxlsx_4.2.8                xlsx_0.6.5                   
[33] data.table_1.17.0             Matrix_1.7-3                 
[35] viridis_0.6.5                 viridisLite_0.4.2            
[37] BUSpaRse_1.20.0               scales_1.3.0                 
[39] RColorBrewer_1.1-3            cowplot_1.1.3                
[41] ggpubr_0.6.0                  tidyseurat_0.8.2             
[43] ttservice_0.4.1               lubridate_1.9.4              
[45] forcats_1.0.0                 stringr_1.5.1                
[47] purrr_1.0.4                   readr_2.1.5                  
[49] tidyr_1.3.1                   tibble_3.2.1                 
[51] ggplot2_3.5.1                 tidyverse_2.0.0              
[53] dyno_0.1.2                    dynwrap_1.2.4                
[55] dynplot_1.1.2                 dynmethods_1.0.5.9000        
[57] dynguidelines_1.0.1           dynfeature_1.0.1             
[59] DoubletFinder_2.0.4           Seurat_5.2.1                 
[61] SeuratObject_5.0.2            sp_2.2-0                     
[63] dplyr_1.1.4                  

loaded via a namespace (and not attached):
  [1] graph_1.84.1                ica_1.0-3                  
  [3] plotly_4.10.4               Formula_1.2-5              
  [5] maps_3.4.2.1                zlibbioc_1.52.0            
  [7] tidyselect_1.2.1            bit_4.6.0                  
  [9] doParallel_1.0.17           clue_0.3-66                
 [11] lattice_0.22-6              rjson_0.2.23               
 [13] blob_1.2.4                  rngtools_1.5.2             
 [15] S4Arrays_1.6.0              parallel_4.4.2             
 [17] png_0.1-8                   cli_3.6.5                  
 [19] registry_0.5-1              ProtGenerics_1.38.0        
 [21] goftest_1.2-3               BiocIO_1.16.0              
 [23] lhs_1.2.0                   BiocNeighbors_2.0.0        
 [25] ggnetwork_0.5.13            uwot_0.2.3                 
 [27] curl_6.2.1                  mime_0.12                  
 [29] evaluate_1.0.3              stringi_1.8.7              
 [31] backports_1.5.0             desc_1.4.3                 
 [33] dyndimred_1.0.4             XML_3.99-0.17              
 [35] httpuv_1.6.15               magrittr_2.0.3             
 [37] rappdirs_0.3.3              splines_4.4.2              
 [39] ggraph_2.2.1                ggbeeswarm_0.7.2           
 [41] babelwhale_1.2.0            sctransform_0.4.1          
 [43] logging_0.10-108            DBI_1.2.3                  
 [45] jquerylib_0.1.4             genefilter_1.88.0          
 [47] withr_3.0.2                 emdbook_1.3.13             
 [49] systemfonts_1.2.1           lmtest_0.9-40              
 [51] GSEABase_1.68.0             bdsmatrix_1.3-7            
 [53] tidygraph_1.3.0             rtracklayer_1.66.0         
 [55] BiocManager_1.30.25         htmlwidgets_1.6.4          
 [57] biomaRt_2.62.1              IRkernel_1.3.2             
 [59] ggrepel_0.9.6               statnet.common_4.11.0      
 [61] SparseArray_1.6.2           ranger_0.17.0              
 [63] plyranges_1.26.0            annotate_1.84.0            
 [65] reticulate_1.41.0.1         zoo_1.8-13                 
 [67] GA_3.2.4                    XVector_0.46.0             
 [69] knitr_1.49                  network_1.19.0             
 [71] UCSC.utils_1.2.0            timechange_0.3.0           
 [73] foreach_1.5.2               fansi_1.0.6                
 [75] R.oo_1.27.0                 psych_2.4.12               
 [77] RSpectra_0.16-2             irlba_2.3.5.1              
 [79] fastDummies_1.7.5           lazyeval_0.2.2             
 [81] yaml_2.3.10                 survival_3.8-3             
 [83] scattermore_1.2             crayon_1.5.3               
 [85] RcppAnnoy_0.0.22            IRdisplay_1.1              
 [87] progressr_0.15.1            tweenr_2.0.3               
 [89] later_1.4.1                 ggridges_0.5.6             
 [91] codetools_0.2-20            base64enc_0.1-3            
 [93] GlobalOptions_0.1.2         KEGGREST_1.46.0            
 [95] bbmle_1.0.25.1              Rtsne_0.17                 
 [97] shape_1.4.6.1               Rsamtools_2.22.0           
 [99] filelock_1.0.3              dynparam_1.0.2             
[101] pkgconfig_2.0.3             xml2_1.3.7                 
[103] spatstat.univar_3.1-2       GenomicAlignments_1.42.0   
[105] spatstat.sparse_3.1-0       BSgenome_1.74.0            
[107] gridBase_0.4-7              xtable_1.8-4               
[109] hwriter_1.3.2.1             car_3.1-3                  
[111] plyr_1.8.9                  httr_1.4.7                 
[113] tools_4.4.2                 globals_0.16.3             
[115] beeswarm_0.4.0              broom_1.0.7                
[117] nlme_3.1-167                dbplyr_2.5.0               
[119] assertthat_0.2.1            digest_0.6.37              
[121] numDeriv_2016.8-1.1         farver_2.1.2               
[123] tzdb_0.4.0                  AnnotationFilter_1.30.0    
[125] reshape2_1.4.4              glue_1.8.0                 
[127] cachem_1.1.0                BiocFileCache_2.14.0       
[129] polyclip_1.10-7             dynutils_1.0.11            
[131] generics_0.1.4              proxyC_0.4.1               
[133] Biostrings_2.74.0           ggalluvial_0.12.5          
[135] mvtnorm_1.3-3               parallelly_1.42.0          
[137] ConsensusClusterPlus_1.70.0 mnormt_2.1.1               
[139] statmod_1.5.0               RcppHNSW_0.6.0             
[141] ScaledMatrix_1.14.0         carData_3.0-5              
[143] pbapply_1.7-2               httr2_1.1.1                
[145] fields_16.3.1               spam_2.11-1                
[147] gtools_3.9.5                graphlayouts_1.2.2         
[149] ggsignif_0.6.4              gridExtra_2.3              
[151] shiny_1.10.0                GenomeInfoDbData_1.2.13    
[153] R.utils_2.13.0              lmds_0.1.0                 
[155] RCurl_1.98-1.16             memoise_2.0.1              
[157] R.methodsS3_1.8.2           future_1.34.0              
[159] svglite_2.1.3               reshape_0.8.9              
[161] RANN_2.6.2                  spatstat.data_3.1-4        
[163] cluster_2.1.8               spatstat.utils_3.1-2       
[165] hms_1.1.3                   fitdistrplus_1.2-2         
[167] munsell_0.5.1               colorspace_2.1-1           
[169] FNN_1.1.4.1                 rlang_1.1.6                
[171] ggdendro_0.2.0              DelayedMatrixStats_1.28.0  
[173] sparseMatrixStats_1.18.0    dotCall64_1.2              
[175] ggforce_0.4.2               circlize_0.4.16            
[177] dbscan_1.2.3                xfun_0.51                  
[179] coda_0.19-4.1               sna_2.8                    
[181] remotes_2.5.0               iterators_1.0.14           
[183] abind_1.4-8                 rJava_1.0-11               
[185] carrier_0.1.1               geneplotter_1.84.0         
[187] repr_1.1.7                  bitops_1.0-9               
[189] ps_1.9.0                    promises_1.3.2             
[191] RSQLite_2.3.9               DelayedArray_0.32.0        
[193] pbdZMQ_0.3-13               compiler_4.4.2             
[195] prettyunits_1.2.0           beachmat_2.22.0            
[197] listenv_0.9.1               Rcpp_1.0.14                
[199] BiocSingular_1.22.0         tensor_1.5                 
[201] MASS_7.3-65                 progress_1.2.3             
[203] uuid_1.2-1                  spatstat.random_3.3-2      
[205] R6_2.6.1                    fastmap_1.2.0              
[207] rstatix_0.7.2               vipor_0.4.7                
[209] ensembldb_2.30.0            ROCR_1.0-11                
[211] rsvd_1.0.5                  gtable_0.3.6               
[213] KernSmooth_2.23-26          miniUI_0.1.1.1             
[215] deldir_2.0-4                htmltools_0.5.8.1          
[217] bit64_4.6.0-1               spatstat.explore_3.3-4     
[219] lifecycle_1.0.4             zip_2.3.2                  
[221] processx_3.8.6              restfulr_0.0.15            
[223] xlsxjars_0.6.1              sass_0.4.9                 
[225] vctrs_0.6.5                 zeallot_0.1.0              
[227] spatstat.geom_3.3-5         NMF_0.28                   
[229] pracma_2.4.4                future.apply_1.11.3        
[231] bslib_0.9.0                 pillar_1.10.2              
[233] GenomicFeatures_1.58.0      locfit_1.5-9.12            
[235] jsonlite_1.9.1              DiceKriging_1.6.0          
[237] GetoptLong_1.0.5
