(1084-A) Learning from the JUMP CP pilot data: Insights for platform development
Wednesday, May 24, 2023
13:30 - 14:30 CET
Location: Hall 3
Abstract: There is growing interest in adopting image-based phenotypic profiling for target and drug discovery. Such high-content approaches yield rich phenotypic data that can reveal critical, holistic insights into the mechanisms of candidate drug action and toxicity. Much of this growth has been driven by Cell Painting, a standardized high-content profiling method originally developed at the Broad Institute. The JUMP (Joint Undertaking in Morphological Profiling) Cell Painting (CP) consortium was established to generate a large public reference Cell Painting dataset, with the aim of creating a new phenotypic approach to drug discovery. Here, we focus on the preliminary JUMP CP dataset, which includes A549 and U2OS cell lines treated with chemical and genetic (CRISPR and ORF) perturbations, to explore which CellProfiler output features capture the variability in these data. We show how our web-based data analytics platform, StratoMineR, can be used to evaluate phenotypic data holistically.
One of the biggest barriers to analyzing the JUMP CP data is the large number of features (5792 in the pilot set). We therefore used Spearman's correlation to exclude highly redundant features, narrowing the feature set to 2000. We then performed Principal Component Analysis to reduce the complexity of the data. This allowed us to extract feature loading scores, which can subsequently be used to make informed decisions about which features to prioritize for a smaller feature set. We applied this approach in our collaboration with KML Vision to provide a feature set that users can explore within KML Vision's IKOSA AI platform for morphological cell profiling. This is critical not only to reduce computational demands, but also to allow a rapid overview of general phenotypic alterations.
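The feature-reduction steps described above can be illustrated with a minimal sketch. It is not the StratoMineR implementation: the function names, the greedy keep-or-drop rule, the 0.9 correlation threshold, and the synthetic data are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np

def rank_columns(X):
    # Ordinal ranks per column; ties are broken by position, which is
    # adequate for continuous features (assumption for this sketch).
    return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)

def drop_redundant_features(X, threshold=0.9):
    # Spearman's rho is the Pearson correlation of the ranks.
    rho = np.corrcoef(rank_columns(X), rowvar=False)
    kept = []
    for j in range(X.shape[1]):
        # Greedy rule (hypothetical): keep a feature only if it is not
        # highly correlated with any feature that was already kept.
        if all(abs(rho[j, k]) < threshold for k in kept):
            kept.append(j)
    return kept

def pca_loadings(X, n_components):
    # Standardize, then take the SVD; rows of Vt are the principal axes.
    # Assumes no constant columns (their standard deviation would be zero).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    # Loadings have shape (n_features, n_components).
    return Vt[:n_components].T, explained[:n_components]

# Synthetic stand-in for a wells-by-features matrix (not JUMP CP data).
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 4))
X = np.column_stack([
    base[:, 0],
    2.0 * base[:, 0] + 0.01 * rng.normal(size=200),  # near-duplicate of feature 0
    base[:, 1],
    np.exp(base[:, 1]),                              # monotonic transform of feature 2
    base[:, 2],
    base[:, 3],
])

kept = drop_redundant_features(X, threshold=0.9)
loadings, explained = pca_loadings(X[:, kept], n_components=2)

# Rank the kept features by absolute loading on the first component.
priority = [kept[i] for i in np.argsort(-np.abs(loadings[:, 0]))]
```

Because Spearman's correlation works on ranks, the exponential transform of a feature is treated as fully redundant with it, which a Pearson-based filter would not guarantee. The loading-based ranking is only one possible way to prioritize features for a smaller set.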
We also used the JUMP CP data to make several phenotypic comparisons between the two cell lines, and tracked phenotypic drift across time points and conditions. We examined the subset of data from the compound and CRISPR experiments and found more than 100 compounds and CRISPR guides that gave significant and diverse phenotypic Euclidean distance scores relative to the negative control, based on 10 principal components. While our analyses revealed a few phenotypic similarities between compound and CRISPR treatments that share common gene targets, hierarchical and UMAP clustering showed that compounds with unrelated gene targets gave similar phenotypic outcomes.
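As a rough illustration of a Euclidean distance score in principal component space, the sketch below measures how far each treatment's mean profile lies from the negative-control centroid. The abstract does not state how replicates were aggregated or how significance was assessed, so the per-treatment averaging, the function name, the `negcon` label, and the toy scores are assumptions.

```python
import numpy as np

def distance_from_control(scores, labels, control_label="negcon"):
    # scores: wells x principal components; labels: one treatment name per well.
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Centroid of the negative-control wells in PC space.
    centroid = scores[labels == control_label].mean(axis=0)
    distances = {}
    for lab in np.unique(labels):
        if lab == control_label:
            continue
        # Mean profile of the treatment's replicate wells (assumed aggregation).
        profile = scores[labels == lab].mean(axis=0)
        distances[str(lab)] = float(np.linalg.norm(profile - centroid))
    return distances

# Toy scores on 10 principal components (hypothetical values, not JUMP CP data).
n_pcs = 10
negcon = np.zeros((2, n_pcs))
cmpd_a = np.zeros((2, n_pcs))
cmpd_a[:, 0] = 3.0            # shifted along PC1 only
guide_b = np.zeros((2, n_pcs))
guide_b[:, 0] = 3.0           # shifted along PC1 and PC2
guide_b[:, 1] = 4.0

scores = np.vstack([negcon, cmpd_a, guide_b])
labels = ["negcon"] * 2 + ["cmpd_A"] * 2 + ["guide_B"] * 2
distances = distance_from_control(scores, labels)
```

In this toy case `cmpd_A` lies at distance 3 and `guide_B` at distance 5 from the control centroid. In practice, a threshold for calling a distance significant would need to come from the spread of the negative-control wells themselves or from a permutation-style null, which this sketch does not attempt.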
Taken together, our analysis demonstrates the importance and feasibility of a robust, iterative data analytics workflow that can yield important biological insights for the further development of Cell Painting assays akin to JUMP CP.