Projects
Coding Projects
Title: Multimodal Perturbation Modeling for MoA Prediction [report] [code]
Duration: Sep. 2025 – Dec. 2025
- Generated embeddings for transcriptomic and cellular imaging data using representation learning.
- Combined representations with late fusion and refined multimodal embeddings via contrastive loss.
- Applied KNN, MLP, and GMM classifiers to evaluate embeddings and predict perturbation MoA.
- Quantified model performance across variations in embedding aggregation and modality combination.
Title: Deep Learning for Peripheral Blood Cell Classification [report] [poster] [code]
Duration: Apr. 2025 – May 2025
- Designed and trained ResNet-based CNN architectures to predict blood cell classes from images.
- Final models achieved 98.8% accuracy and 0.999 AUC, surpassing multiple published baselines.
- Conducted hyperparameter tuning and evaluated generalizability to ensure stable performance.
- Used attention mechanisms and applied Grad-CAM for interpretability, validating the biological basis of predictions.
Title: Pancreatic Cancer Differential Expression and Network Analysis [report] [code]
Duration: Nov. 2024 – Dec. 2024
- Analyzed differential gene expression in tumors vs. normal tissue using TCGA and GTEx data.
- Performed GSEA with GO and KEGG to identify dysregulated biological processes and pathways.
- Conducted network analysis with PANDA and KEGG to explore gene interactions and pathways.
- Identified distinguishing signatures and validated key genes specific to early-stage pancreatic cancer.
Title: Heart Attack and BMI Associations with Health Factors [report] [code]
Duration: Nov. 2024 – Dec. 2024
- Analyzed 450,000 records from the CDC's 2022 BRFSS survey using a variety of machine learning models.
- Conducted data wrangling and multiple imputation with chained equations to handle missing data.
- Identified lifestyle factors and comorbidities associated with heart attack, BMI, or both.
- Used R for statistical modeling, visualization, and reproducibility throughout the analysis.
Title: Computational Modeling of HIV Drug Efficacy [code]
Duration: Sep. 2023 – Nov. 2023
- Investigated compounds targeting CCR5 for HIV treatment using Python and the ChEMBL database.
- Calculated Lipinski molecular descriptors to indicate bioactivity and pIC50 to indicate efficacy.
- Used PaDEL descriptors to identify properties and fingerprints of CCR5-targeting drug molecules.
- Developed machine learning models to predict pIC50 and bioactivity to gauge structural efficacy.
Title: Machine Learning Pipeline for Recipe Interaction Prediction [report] [code]
Duration: Nov. 2022 – Dec. 2022
- Predicted whether a user would interact with a recipe, and the rating they would leave, given a user-recipe ID pair, using 880,000 data points.
- Conducted EDA and feature engineering, and built models using heuristics, regression, and NLP.
- Achieved accuracy of 0.977 for interaction and 0.711 for rating, up from baselines of 0.457.
- Utilized Python and tools including pandas, numpy, scipy, sklearn, nltk, seaborn, and matplotlib.
Title: Phylogenetic Analysis of Malarial Strains [report] [code]
Duration: May 2022 – June 2022
- Developed custom phylogenetic analysis pipelines for 25 Plasmodium strains using Bash and Python.
- Used web scraping, ClustalW, and RAxML to reconstruct phylogenetic trees from the original paper.
- Reverse-engineered the paper's software tools and parameters by documenting outcomes and discrepancies.
- Assessed original pipeline limitations and provided insights into reproducibility and result consistency.
Outreach Projects
Title: Lead Volunteer and Ambassador for Female Health and Wellbeing [organization]
Duration: July 2016 – June 2024
- Created data-driven outreach and intervention programs to address female health in South Asia.
- Led the production and distribution of biodegradable pads in underserved South Asian communities.
- Helped raise over $2M for female health, education, and sustainability initiatives focused on India.
- Worked with Indian organizations to care for girls with special needs and survivors of sexual violence.