Data Scientist

Projects
Jul 2024 - Aug 2024
- Developed a Convolutional Neural Network (CNN) model achieving 95% accuracy in classifying tomato plant diseases.
- Engineered a web application for real-time image classification, processing over 1,500 user-uploaded images.
- Improved disease diagnosis speed by 50%, significantly aiding agricultural productivity.
Jul 2024 - Aug 2024
- Achieved over 99% prediction accuracy with a machine learning regression model, trained on categorized
review text and thumbs-up counts, that estimates the impact of review sentiment on rating scores
and user engagement.
- Improved model performance, as measured by R² and MAE, by converting user-generated
content into sentiment scores (1–5) and applying extensive feature engineering and text preprocessing.
- Delivered key insights into user behavior by classifying review content as positive, neutral,
or negative, and visualizing trends with Python libraries such as Matplotlib and Seaborn.
Jun 2024 - Jul 2024
- Generated 10+ critical business insights, aligned with ad hoc executive requests,
by writing and optimizing complex SQL queries to support decision-making for a leading hardware company.
- Improved data accessibility and clarity by structuring queries for sales trends, inventory analysis, and customer segmentation,
simulating real-world challenges posed by a data analytics director.
- Demonstrated both technical proficiency and business communication skills by documenting
each SQL query clearly and explaining its rationale.
Jun 2024 - Aug 2024
- Achieved 92% classification accuracy in identifying 10+ celebrities from .png images by training a Convolutional
Neural Network (CNN) on a dataset of over 5,000 labeled images.
- Improved model robustness and reduced overfitting by applying data augmentation techniques,
early stopping, and dropout layers.
- Delivered a fully functional, user-friendly Streamlit web application that allows users to upload images and
receive real-time predictions, enabling interactive deployment of deep learning models.
Jun 2024 - Aug 2024
- Achieved 95%+ diagnostic accuracy in predicting diabetes, as measured on test data, by
training and deploying a machine learning model that takes user inputs such as age, BMI, and glucose levels.
- Enabled self-assessment for 100+ users by designing an intuitive Streamlit web interface for
real-time, accessible predictions.
- Improved model reliability and reduced uncertainty by selecting key features and fine-tuning classification
algorithms, leading to more confident health risk assessments.
Jul 2024 - Sep 2024
- Implemented an NLP pipeline using advanced machine-learning techniques to classify news articles with over 98% accuracy.
- Reduced false positives by 35%, ensuring reliable detection of misinformation.
- Designed a web interface allowing users to analyze news content interactively, boosting engagement by 30%.