I build Data Products, Automation, and AI systems that turn messy data into decisions people can trust. With 4+ years of experience, I focus on making solutions that are practical, clear, and built for real impact.
AI Researcher, Data Analyst, and Backend Software Engineer. Different roles, one focus: understanding systems deeply and making them work better.
Outside of work, I train consistently and stay active. I work as a Sports Assistant at the University of New Haven, which keeps me grounded and disciplined. I strongly believe that no opportunity is bigger than health, and that real success is built by balancing growth, energy, and long-term consistency.
Currently, I am supporting two nonprofit organizations by building automation, improving data workflows, and helping them scale. Solving real problems and contributing to meaningful social impact is a big part of why I do what I do.
EXPERIENCE
- Built an end-to-end donation pipeline connecting Zeffy (payments), Zapier (automation), and Systeme.io (data storage), reducing manual reconciliation effort by 40%
- Designed automated workflows that triggered real-time data sync across systems whenever a donation was recorded, improving data integrity and reporting accuracy
- Built a phishing and scam detection AI using Ollama, RAG, and LangChain, combining prompt engineering and NLP to identify threats and explain them to users in plain language
- Used fully open-source tools to run the entire system on-device, keeping user data private and eliminating dependency on paid APIs, which makes it accessible, reproducible, and ready for real-world deployment
- Cleaned and validated operational and financial data using SQL and Excel, then built a Power BI dashboard that improved reporting accuracy by 75%
- Consolidated scattered datasets into one clear report using Looker, making it easy for the team to track costs, program outcomes, and performance indicators
- Managed sports scheduling and tracked operational data using Excel, reducing reporting errors by 40% and cutting down unnecessary back-and-forth for the team
- Supported live campus events by handling logistics and on-ground coordination, which taught me how to stay calm and organized under pressure
- Worked at Sodexo managing daily service operations, building strong habits around time management, teamwork, and getting things done consistently
- Worked on large-scale telecom data pipelines using SQL, MongoDB, and AWS: validated datasets, fixed inconsistencies, and improved reporting accuracy by 50% across cross-functional teams
- Handled JSON parsing, regex-based data cleaning, and quality checks on live AT&T data, making sure everything flowing through the pipeline was reliable and ready for reporting
- Cleaned and structured financial datasets using Excel, implemented QA checks, and prepared recurring reports that improved downstream reporting accuracy by 25%
- Got hands-on experience with real data early on: learned how to handle messy numbers, document everything clearly, and deliver consistent output even on a tight timeline
EDUCATION
CERTIFICATIONS
LANGUAGES
SQL - Advanced
C/C++ - Generalist
HTML/CSS - Generalist
Kannada - Native
Hindi - Fluent
Telugu - Fluent
German - Beginner
FRAMEWORKS & TOOLS
- Cursor
- Docker
- FastAPI
- Flask
- Gemini
- Gemma
- Git
- LangChain
- Llama
- Looker
- MongoDB
- NumPy
- Ollama
- Pandas
- RAG
- Scikit-learn
- SQL
- Tableau
- VS Code
- Zapier
- HuggingFace
HOBBIES
- Fitness & Training
- Reading Books
- Coding Side Projects
MY PROJECTS
Here are some of the projects I've built across personal and professional work. Each one solves a real problem and reflects how I think and work.
Built a RAG system for answering clinical questions on diabetes and hypertension using 9 raw medical documents. Cleaned and chunked documents into 200-word segments with 40-word overlap, generated embeddings using MiniLM-L6-v2, indexed with FAISS for fast retrieval, and evaluated GPT-2 family models to reduce hallucinations and improve factual grounding.
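The chunking step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes words are split on whitespace, and the function name `chunk_words` and its parameters are hypothetical.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word-based chunks; neighbors share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()            # assumption: whitespace tokenization
    step = chunk_size - overlap     # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk made only of already-covered words is added.
        if start + chunk_size >= len(words):
            break
    return chunks
```

With the default settings each window advances 160 words, so the last 40 words of one chunk reappear at the start of the next. That overlap helps keep a sentence that straddles a boundary retrievable from at least one chunk.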
Identified regional food insecurity gaps across DC, MD, and VA using Capital Area Food Bank data. Cleaned and structured raw government datasets in Excel, then built an interactive Power BI dashboard comparing food distributed vs. unmet demand across regions. Revealed that Maryland had the largest shortfall despite the highest distribution, enabling Downtown Evening Soup Kitchen to make data-driven resource planning decisions.
Predicts Tesla stock movement by combining historical stock data and real-time news sentiment. Fetches stock data via yfinance and news via NewsAPI, merges and processes both, trains a machine learning model on AWS SageMaker, and serves predictions through a Streamlit web app. Visual insights are displayed using AWS QuickSight.
Designed an AI-driven healthcare analytics MVP to assist clinicians with real-time risk monitoring and decision support. The system analyzes patient vitals and lab data to generate risk scores, visual dashboards, and automated clinical reports, reducing manual review time. Built as a concept prototype focusing on practical clinical workflow integration.
Analyzed MTA department workforce budgets against actual staffing positions from 2017 to 2023. Cleaned and structured public NY state data in Excel, built pivot tables and KPI tiles tracking total budget, actuals, and variance. Delivered an interactive dashboard with year and status slicers โ enabling non-technical stakeholders to identify over- and under-performing departments and make data-driven workforce planning decisions.
Built a transformer-based deep learning model for pixel-level semantic segmentation of leaf health from real-world images. Fine-tuned SegFormer with transfer learning, optimized using Dice loss combined with Cross-Entropy to accurately classify healthy vs dry leaf regions. Evaluated ground truth masks against predicted outputs to measure segmentation accuracy.
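To show what combining Dice loss with Cross-Entropy means, here is a small dependency-free sketch. It assumes a binary case (1 = dry, 0 = healthy) over a flat, non-empty list of per-pixel probabilities, and an equal 50/50 weighting. The weighting, the function names, and the pure-Python form are all assumptions for illustration; the actual project would compute this on tensors in a deep learning framework.

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """1 - Dice coefficient; rewards overlap between prediction and mask."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy; penalizes each pixel independently."""
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]  # avoid log(0)
    per_pixel = [
        -(t * math.log(p) + (1 - t) * math.log(1 - p))
        for p, t in zip(clipped, targets)
    ]
    return sum(per_pixel) / len(per_pixel)

def combined_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of the two losses (the 0.5 weight is an assumption)."""
    return (dice_weight * dice_loss(probs, targets)
            + (1.0 - dice_weight) * bce_loss(probs, targets))
```

The usual motivation for the combination is that Cross-Entropy gives a stable per-pixel gradient, while the Dice term scores region overlap directly and is less dominated by the majority class when one region is small.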
Trained two separate TensorFlow CNN models: one to classify floor patches as dusty or clean (85% accuracy), and another to classify stair, plain floor, obstacle, or unknown regions. Integrated both models into a real-time simulation using OpenCV with memory-based movement logic to avoid re-cleaning visited areas. Enabled autonomous multi-floor cleaning by detecting stairs and virtually climbing to continue cleaning across floors.
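One simple way to picture memory-based movement is a grid walk that remembers visited cells and backtracks when it gets stuck. The sketch below is an assumption about how such logic could look, not the simulation's real code: the grid encoding (0 = free floor, 1 = obstacle), the function name `clean_grid`, and the fixed move order are all hypothetical.

```python
def clean_grid(grid, start):
    """Return the order in which reachable free cells are cleaned, once each."""
    rows, cols = len(grid), len(grid[0])
    visited = {start}                 # memory of cells already cleaned
    path = [start]                    # cleaning order
    stack = [start]                   # trail used for backtracking
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    while stack:
        r, c = stack[-1]
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in visited:
                visited.add((nr, nc))
                path.append((nr, nc))
                stack.append((nr, nc))
                break
        else:
            # No unvisited free neighbor: step back along the trail.
            stack.pop()
    return path
```

For example, on a 3x3 grid with an obstacle in the center, `clean_grid([[0, 0, 0], [0, 1, 0], [0, 0, 0]], (0, 0))` visits each of the 8 free cells exactly once and never enters the obstacle.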
GET IN TOUCH
Have a question or want to work together? Send me a message!
