Travel Insurance Purchase Prediction
Machine Learning Project
Built a predictive ML model to estimate the likelihood of customers purchasing travel insurance using demographic and travel-related features.
Performed extensive EDA, handled missing values, created visualizations, and engineered features to improve model signal quality.
Trained multiple models including Logistic Regression, Decision Tree, Random Forest, and XGBoost, using GridSearchCV for hyperparameter tuning.
Developed an ensemble classifier using VotingClassifier, achieving higher accuracy and more stable performance than individual models.
Identified key factors influencing insurance purchase through feature importance analysis, enabling actionable business insights.
Technologies: Python, Pandas, Scikit-learn, XGBoost, Matplotlib, Seaborn, Jupyter Notebook
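The tuning and ensembling steps above can be sketched roughly as follows. This is an illustrative sketch only, not the project's code: the data is synthetic (generated with make_classification), the parameter grid is a small hypothetical one, and XGBoost is left out so the example depends on Scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the travel insurance data (assumption: binary target).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Tune one base model with a small, hypothetical grid.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Soft voting averages the predicted class probabilities of the base models.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=4, random_state=0)),
        ("rf", search.best_estimator_),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

ensemble_accuracy = accuracy_score(y_test, ensemble.predict(X_test))
print(f"Ensemble accuracy: {ensemble_accuracy:.3f}")
```

Soft voting is used here because all three base models expose predict_proba; hard (majority) voting would be the alternative if one of them did not.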
Home Credit Default Risk Prediction
End-to-End ML Project
Built a complete ML pipeline to predict loan default probability using real-world credit bureau and financial datasets.
Performed advanced preprocessing including missing value treatment, categorical encoding, feature engineering, class imbalance handling, and outlier removal.
Conducted in-depth EDA involving distribution analysis, correlation heatmaps, feature visualization, and socioeconomic insights.
Trained and optimized XGBoost, LightGBM, and Gradient Boosting models, achieving a ROC-AUC of 0.77 using LightGBM.
Deployed the model on Google Cloud Platform (GCP) with an HTTP endpoint for real-time predictions.
Technologies: Python, Pandas, NumPy, Scikit-learn, XGBoost, LightGBM, Seaborn, Matplotlib, GCP, Jupyter Notebook
AI Interview Simulator Web App
LLM-Powered Application
Developed an interactive Streamlit web application that simulates job interviews using AI-generated questions and personalized feedback.
Integrated OpenAI/Gemini LLM APIs to generate tailored interview questions based on job title, job description, and uploaded resume.
Implemented advanced prompt engineering techniques including Zero-Shot, Few-Shot, Chain-of-Thought, Role-Based, and Self-Critique prompting.
Added automated candidate scoring with AI-driven strengths, weaknesses, model answers, and performance analysis.
Built customization options such as difficulty levels, creativity controls, question skipping, and raw LLM output debugging.
Implemented document parsing using PyPDF2 and python-docx for extracting resume content.
Designed the app with a modular structure for scalability and maintainability.
Technologies: Python, Streamlit, OpenAI/Gemini API, PyPDF2, python-docx, Regex, Virtual Environments
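As an illustration of the few-shot and role-based prompting mentioned above, a prompt can be assembled from a handful of example questions before it is sent to the LLM. The sketch below is hypothetical: the Example class, the build_few_shot_prompt function, and the prompt wording are assumptions made for illustration, not the app's actual code, and no API call is made.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One worked example shown to the model (hypothetical structure)."""
    job_title: str
    question: str


def build_few_shot_prompt(job_title, job_description, examples, num_questions=3):
    """Build a role-based, few-shot prompt as a single string."""
    lines = [
        "You are an experienced technical interviewer.",  # role-based framing
        "Write interview questions tailored to the role below.",
        "",
        "Examples:",
    ]
    # Few-shot part: each example pairs a role with a sample question.
    for example in examples:
        lines.append(f"Role: {example.job_title}")
        lines.append(f"Question: {example.question}")
        lines.append("")
    # The actual request comes last, in the same format as the examples.
    lines.append(f"Role: {job_title}")
    lines.append(f"Job description: {job_description.strip()}")
    lines.append(f"Write {num_questions} questions, one per line.")
    return "\n".join(lines)


examples = [
    Example("Data Analyst", "How would you handle missing values in a sales dataset?"),
    Example("Backend Engineer", "How do you design an idempotent REST endpoint?"),
]
prompt = build_few_shot_prompt(
    "ML Engineer", "Build and deploy models.  ", examples, num_questions=2
)
print(prompt)
```

Passing an empty examples list yields a zero-shot version of the same prompt, which makes it easy to compare the two techniques.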
Accenture North America Data Analytics and Visualization Job Simulation on Forage
Data Analytics and Visualization Project
In May 2024, I completed this job simulation, working on a data analytics project for a hypothetical social media client.
I cleaned and modeled the client's data to ensure it was accurate and ready for analysis.
I then analyzed the data to uncover trends and patterns that could inform strategic decisions, and presented the findings in clear visualizations and reports.
Research on Application of Artificial Intelligence in Medical Education
Data Management Project
In my final year, I undertook a research project on the application of Artificial Intelligence in Medical Education, with a focus on distance learning.
The project ran from December 2019 to May 2020 and involved developing an Operational Data Management application using Python and Django.
The platform supported interaction among students, trainers, and administrators, enabling them to share information and exchange knowledge.
The project aimed to use AI to increase the adoption of medical education technologies among healthcare providers, with the longer-term goal of improving patient outcomes and operational efficiency.
This experience strengthened my technical skills and deepened my understanding of AI's potential in the healthcare sector.
Spam SMS Filtering Using Machine Learning
Data Quality Improvement Project
In my mini project, "Spam SMS Filtering using Machine Learning," conducted from February to May 2019, I focused on enhancing data quality through real-time classification of spam SMS messages.
The project aimed to identify and filter out spam messages as soon as they arrive on a mobile device, addressing the challenge of zero-hour attacks (newly created spam messages that traditional filters might miss).
By leveraging machine learning algorithms, we developed a robust model capable of accurately distinguishing between legitimate and spam messages, even when encountering previously unseen spam content.
This project underscored the importance of adaptive and proactive spam detection mechanisms in improving communication security and user experience.
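The classification idea can be sketched with a small multinomial Naive Bayes filter. The project description does not name the algorithm used, so Naive Bayes is an assumption here, chosen as a common baseline for SMS spam filtering; the class name and the training messages are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the message and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with Laplace smoothing (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength
        self.word_counts = {"ham": Counter(), "spam": Counter()}
        self.message_counts = Counter()
        self.vocabulary = set()

    def fit(self, messages, labels):
        for text, label in zip(messages, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.message_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # Log prior: fraction of training messages with this label.
        total_messages = sum(self.message_counts.values())
        score = math.log(self.message_counts[label] / total_messages)
        total_words = sum(self.word_counts[label].values())
        denominator = total_words + self.alpha * len(self.vocabulary)
        for token in tokens:
            if token in self.vocabulary:  # words never seen in training are skipped
                count = self.word_counts[label][token]
                score += math.log((count + self.alpha) / denominator)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        # On a tie, max() keeps the first label, so the message is treated as ham.
        return max(("ham", "spam"), key=lambda label: self._log_score(tokens, label))


# Toy training data, invented for this example.
messages = [
    "Win a free prize now",
    "Free entry claim your cash prize",
    "Are we still meeting for lunch",
    "Call me when you get home",
]
labels = ["spam", "spam", "ham", "ham"]

spam_filter = NaiveBayesSpamFilter().fit(messages, labels)
print(spam_filter.predict("Claim your free prize"))
print(spam_filter.predict("Are you home for lunch"))
```

Because the score is built from individual word statistics rather than exact message matches, a new message that reuses typical spam vocabulary can still be flagged even if that exact text was never seen during training.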