Overview
The gDRimport package is part of the gDR suite. It helps prepare raw drug response data for downstream processing. It mainly contains helper functions for importing, loading, and validating dose response data provided in different file formats.
Use Cases
Test Data
There are currently four test datasets that illustrate the input data gDRimport expects.
# primary test data
td1 <- get_test_data()
summary(td1)
## Length Class Mode
## 1 gdr_test_data S4
td1
## class: gdr_test_data
## slots: manifest_path result_path template_path ref_m_df ref_r1_r2 ref_r1 ref_t1_t2 ref_t1
# test data in Tecan format
td2 <- get_test_Tecan_data()
summary(td2)
## Length Class Mode
## m_file 1 -none- character
## r_files 1 -none- character
## t_files 1 -none- character
## ref_m_df 1 -none- character
## ref_r_df 1 -none- character
## ref_t_df 1 -none- character
# test data in D300 format
td3 <- get_test_D300_data()
summary(td3)
## Length Class Mode
## f_96w 6 -none- list
## f_384w 6 -none- list
# test data obtained from EnVision
td4 <- get_test_EnVision_data()
summary(td4)
## Length Class Mode
## m_file 1 -none- character
## r_files 28 -none- character
## t_files 2 -none- character
## ref_l_path 1 -none- character
Load data
The load_data function is the key function. It wraps the load_manifest, load_templates and load_results functions and supports different file formats.
ml <- load_manifest(manifest_path(td1))
summary(ml)
## Length Class Mode
## data 4 data.table list
## headers 27 -none- list
t_df <- load_templates(template_path(td1))
summary(t_df)
## WellRow WellColumn Gnumber Concentration
## Length:768 Length:768 Length:768 Length:768
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
## Gnumber_2 Concentration_2 Template
## Length:768 Length:768 Length:768
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
r_df <- suppressMessages(load_results(result_path(td1)))
summary(r_df)
## Barcode WellRow WellColumn ReadoutValue
## Length:4587 Length:4587 Min. : 1.00 Min. : 12627
## Class :character Class :character 1st Qu.: 6.50 1st Qu.: 67905
## Mode :character Mode :character Median :12.00 Median : 140865
## Mean :12.49 Mean : 263996
## 3rd Qu.:18.00 3rd Qu.: 324707
## Max. :24.00 Max. :2423054
## BackgroundValue
## Min. :332.0
## 1st Qu.:351.0
## Median :374.0
## Mean :453.2
## 3rd Qu.:570.0
## Max. :704.0
l_tbl <-
  suppressMessages(
    load_data(manifest_path(td1), template_path(td1), result_path(td1)))
summary(l_tbl)
## Length Class Mode
## manifest 4 data.table list
## treatments 7 data.table list
## data 5 data.table list
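The summary above shows load_data returning a list with three tables: manifest, treatments and data. A quick sanity check on that shape could look like the sketch below. The helper check_loaded_tables is hypothetical and not part of gDRimport, and a small mock list stands in for l_tbl so that the snippet runs without the package; its column values are illustrative only.

```r
# Hypothetical helper (not part of gDRimport): verify that a list shaped like
# the load_data() result has the three tables shown in the summary above.
check_loaded_tables <- function(x,
                                expected = c("manifest", "treatments", "data")) {
  missing_tbls <- setdiff(expected, names(x))
  if (length(missing_tbls) > 0) {
    stop("Missing component(s): ", paste(missing_tbls, collapse = ", "))
  }
  # data.table objects inherit from data.frame, so this check covers both.
  is_tbl <- vapply(x[expected], is.data.frame, logical(1))
  if (!all(is_tbl)) {
    stop("Not a table: ", paste(expected[!is_tbl], collapse = ", "))
  }
  # Return the row count of each table so empty inputs are easy to spot.
  vapply(x[expected], nrow, integer(1))
}

# Mock input standing in for l_tbl; the values are illustrative only.
mock <- list(
  manifest = data.frame(Barcode = c("p1", "p2")),
  treatments = data.frame(WellRow = "A", WellColumn = "1"),
  data = data.frame(Barcode = "p1", ReadoutValue = 100)
)
row_counts <- check_loaded_tables(mock)
print(row_counts)
```

With gDRimport installed, the same helper could be called as check_loaded_tables(l_tbl) on the object created above.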
PRISM
PRISM, the multiplexed cancer cell line screening platform, enables rapid, high-throughput screening of a broad range of drugs across more than 900 human cancer cell line models. Publicly available PRISM data can be downloaded from the DepMap website.
The gDRimport package provides support for processing PRISM data at two levels: LEVEL5 and LEVEL6.
LEVEL5 Data: This format encapsulates all information about drugs, cell lines, and viability within a single file. To process LEVEL5 PRISM data, you can use the convert_LEVEL5_prism_to_gDR_input() function. This function not only transforms and cleans the data but also executes the gDR pipeline for further analysis.
LEVEL6 Data: In LEVEL6, PRISM data is distributed across three separate files:
prism_data: containing collapsed log fold change data for viability assays.
cell_line_data: providing information about cell lines.
treatment_data: containing treatment data.
Processing LEVEL6 PRISM data can be accomplished using the convert_LEVEL6_prism_to_gDR_input() function, which requires paths to these three files as input arguments.
Processing LEVEL5 PRISM Data
To process LEVEL5 PRISM data, you can use the following function:
convert_LEVEL5_prism_to_gDR_input("path_to_file")
Replace “path_to_file” with the actual path to your LEVEL5 PRISM data file. This function will handle the transformation, cleaning, and execution of the gDR pipeline automatically.
Processing LEVEL6 PRISM Data
To process LEVEL6 PRISM data, you can use the following function:
convert_LEVEL6_prism_to_gDR_input("prism_data_path", "cell_line_data_path", "treatment_data_path")
Replace “prism_data_path”, “cell_line_data_path”, and “treatment_data_path” with the respective paths to your LEVEL6 PRISM data files.
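Because the LEVEL6 import depends on three separate files, it can help to confirm that all of them exist before starting. The sketch below is one way to do that. The names find_missing_files and import_level6 are hypothetical helpers, not part of gDRimport, and the wrapper assumes the three paths are passed positionally in the order shown in the call above.

```r
# Hypothetical helper (not part of gDRimport): return the subset of `paths`
# that does not point at an existing file.
find_missing_files <- function(paths) {
  paths[!file.exists(paths)]
}

# Hypothetical wrapper: fail early with a readable message, then delegate.
import_level6 <- function(prism_data_path, cell_line_data_path,
                          treatment_data_path) {
  missing_files <- find_missing_files(
    c(prism_data_path, cell_line_data_path, treatment_data_path)
  )
  if (length(missing_files) > 0) {
    stop("File(s) not found: ", paste(missing_files, collapse = ", "))
  }
  # Requires gDRimport to be installed; the arguments are passed positionally,
  # in the same order as in the call shown above.
  gDRimport::convert_LEVEL6_prism_to_gDR_input(
    prism_data_path, cell_line_data_path, treatment_data_path
  )
}

# Demonstration of the check alone, using one temporary file that exists
# and one path that does not.
existing <- tempfile(fileext = ".csv")
file.create(existing)
absent <- tempfile(fileext = ".csv")
missing_demo <- find_missing_files(c(existing, absent))
print(identical(missing_demo, absent))
```

Checking the paths first keeps a typo in one of them from surfacing later as a less obvious read error.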
SessionInfo
## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] gDRimport_1.5.4 BiocStyle_2.30.0
##
## loaded via a namespace (and not attached):
## [1] SummarizedExperiment_1.32.0 xfun_0.49
## [3] bslib_0.8.0 Biobase_2.62.0
## [5] lattice_0.21-8 vctrs_0.6.5
## [7] tools_4.3.0 bitops_1.0-9
## [9] stats4_4.3.0 tibble_3.2.1
## [11] fansi_1.0.6 pkgconfig_2.0.3
## [13] Matrix_1.6-5 data.table_1.16.4
## [15] checkmate_2.3.2 desc_1.4.3
## [17] S4Vectors_0.40.2 gDRutils_1.5.3
## [19] rematch_2.0.0 readxl_1.4.3
## [21] assertthat_0.2.1 lifecycle_1.0.4
## [23] GenomeInfoDbData_1.2.11 compiler_4.3.0
## [25] textshaping_0.3.7 GenomeInfoDb_1.38.8
## [27] htmltools_0.5.8.1 sass_0.4.9
## [29] RCurl_1.98-1.16 yaml_2.3.10
## [31] pillar_1.9.0 pkgdown_2.0.7
## [33] crayon_1.5.3 jquerylib_0.1.4
## [35] DelayedArray_0.28.0 cachem_1.1.0
## [37] abind_1.4-8 digest_0.6.37
## [39] purrr_1.0.2 bookdown_0.37
## [41] fastmap_1.2.0 grid_4.3.0
## [43] cli_3.6.3 SparseArray_1.2.4
## [45] magrittr_2.0.3 S4Arrays_1.2.1
## [47] utf8_1.2.4 backports_1.5.0
## [49] rmarkdown_2.29 lambda.r_1.2.4
## [51] XVector_0.42.0 matrixStats_1.4.1
## [53] futile.logger_1.4.3 cellranger_1.1.0
## [55] ragg_1.2.7 memoise_2.0.1
## [57] evaluate_1.0.1 knitr_1.49
## [59] GenomicRanges_1.54.1 IRanges_2.36.0
## [61] rlang_1.1.4 futile.options_1.0.1
## [63] glue_1.8.0 BiocManager_1.30.22
## [65] formatR_1.14 BiocGenerics_0.48.1
## [67] jsonlite_1.8.9 R6_2.5.1
## [69] MatrixGenerics_1.14.0 systemfonts_1.0.5
## [71] fs_1.6.5 zlibbioc_1.48.2