
Background:
A large-scale, NIAID-funded study (Successful Clinical Response In Pneumonia Treatment) of patients hospitalized with severe pneumonia, conducted over several years at Northwestern Memorial Hospital, is collecting multiple types of patient data: single-cell RNA-seq, flow cytometry of bronchoalveolar lavage fluid, cytokine abundance measurements, and electronic health records covering many clinical and biological factors, such as patient gender or days since intubation in the ICU. These data may help answer the following questions:
- Is the immune response to various pathogens pre-programmed or adaptive?
- Does the response depend on whether the infection is primary or secondary?
- Can we predict the onset of ventilator-associated pneumonia within the next 7 days in the intensive care unit?
- Can we predict the outcome of ventilator-associated pneumonia?
- How do long COVID patients stratify or cluster based on scRNA-seq?
Recent advances in biomedical deep learning have introduced several useful tools for exploring and integrating multimodal biological data. These tools can be used to address the questions above.
Methods:
We collected a diverse dataset of 1741 samples of bronchoalveolar lavage fluid from patients with lung diseases and from healthy volunteers. Single-cell RNA-seq data were additionally available for 263 samples. Several deep learning methods were selected to compare healthy and SARS-CoV-2 conditions, including factor decomposition methods, latent perturbation methods, and single-cell large language models. These methods were used to discover gene expression patterns that differ between conditions and to identify genetic drivers of the diseases studied. A comparison benchmark was introduced to fine-tune these models and to interpret their results.
Results:
We performed differential gene expression analysis using a pseudobulk subsampling technique. The resulting sets of genes that differed across conditions were used to benchmark the models. Finally, we trained a gradient-boosting-based model to select the most informative deep learning method for predicting a patient's clinical outcome.
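As an illustration of the pseudobulk subsampling idea only, the following minimal sketch draws a fixed number of cells per sample and sums their gene counts into one profile per sample. The function name `pseudobulk`, the input layout (a list of per-cell `(sample_id, counts)` pairs), the `n_cells` parameter, and the rule of skipping samples with too few cells are all assumptions made for this example; they are not taken from the study's actual pipeline.

```python
import random
from collections import defaultdict


def pseudobulk(cells, n_cells=2, seed=0):
    """Aggregate single-cell counts into one pseudobulk profile per sample.

    cells: list of (sample_id, {gene: count}) pairs, one entry per cell
           (hypothetical input layout chosen for this sketch).
    n_cells: number of cells subsampled from each sample before summing.
    """
    rng = random.Random(seed)

    # Group cells by the sample they came from.
    by_sample = defaultdict(list)
    for sample_id, counts in cells:
        by_sample[sample_id].append(counts)

    profiles = {}
    for sample_id, sample_cells in sorted(by_sample.items()):
        # Assumption: samples with fewer than n_cells cells are skipped.
        if len(sample_cells) < n_cells:
            continue
        # Subsample without replacement so every sample contributes
        # the same number of cells.
        chosen = rng.sample(sample_cells, n_cells)
        total = defaultdict(int)
        for counts in chosen:
            for gene, count in counts.items():
                total[gene] += count
        profiles[sample_id] = dict(total)
    return profiles


example_cells = [
    ("s1", {"IL6": 1, "TNF": 2}),
    ("s1", {"IL6": 3}),
    ("s2", {"TNF": 5}),
]
example_profiles = pseudobulk(example_cells, n_cells=2)
```

In this toy input, sample `s1` has exactly two cells, so both are used, while `s2` has only one cell and is skipped. The resulting per-sample profiles could then be passed to a standard bulk differential expression test.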
Conclusions:
We present a comprehensive study of patients with severe pneumonia using deep learning and traditional methods. Using clinical samples acquired in translational research settings, we identified the most informative methods for predicting the onset of ventilator-associated pneumonia, associating pathogens with clinical outcomes, and characterizing the pathogen-associated immune response.