Projects

End-to-end ML systems — from data preprocessing and model training to FastAPI deployment and Docker containerization.

Project 1

Real-Time Fraud Detection

End-to-end fraud detection system built from raw transaction data through production deployment. Includes feature engineering, XGBoost model training with Optuna hyperparameter optimization, 5-fold cross-validation, threshold tuning, and FastAPI-based inference services deployed with Docker.

  • Data preprocessing & feature engineering on transaction signals
  • XGBoost training with cross-validation and ROC-AUC evaluation
  • FastAPI batch inference service (POST /predict_batch)
  • Docker containerization for reproducible deployment
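
As a rough illustration of the feature engineering step, the sketch below derives a velocity count, an amount ratio, two time-based features, and a geospatial distance for one transaction. The dictionary keys (`ts`, `amount`, `lat`, `lon`), the one-hour window, and the output feature names are assumptions made for this example, not the project's actual schema.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def engineer_features(history, txn, window=timedelta(hours=1)):
    """Build features for `txn` from the same customer's earlier transactions.

    Each transaction is a dict with hypothetical keys:
    ts (datetime), amount (float), lat and lon (degrees).
    """
    past = [t for t in history if t["ts"] < txn["ts"]]
    recent = [t for t in past if txn["ts"] - t["ts"] <= window]
    mean_amount = sum(t["amount"] for t in past) / len(past) if past else None
    last = max(past, key=lambda t: t["ts"]) if past else None
    return {
        # Velocity: how many transactions fell inside the look-back window
        "txn_count_1h": len(recent),
        # Amount ratio: current amount relative to the customer's past mean
        "amount_to_mean_ratio": txn["amount"] / mean_amount if mean_amount else 1.0,
        # Time-based features
        "hour_of_day": txn["ts"].hour,
        "is_weekend": int(txn["ts"].weekday() >= 5),
        # Geospatial: distance from the previous transaction's location
        "km_from_last_txn": (
            haversine_km(last["lat"], last["lon"], txn["lat"], txn["lon"]) if last else 0.0
        ),
    }
```

Only transactions strictly earlier than the one being scored feed the features, which keeps future information out of the training data.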

ROC-AUC: 0.997

F1 Score: 0.861

Deployment: FastAPI + Docker

Python, XGBoost, FastAPI, Pandas, Docker, scikit-learn

Live Demo

Upload a CSV of transaction data to the external FastAPI inference service. Results include fraud probabilities, risk levels, summary statistics, and downloadable predictions.
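
To make the shape of those results concrete, here is a minimal sketch of how a batch response could be assembled from an uploaded CSV. The risk cutoffs (0.3 and 0.7), the response field names, and the `score_row` callable standing in for the trained model are all assumptions for illustration; the deployed service may structure its output differently.

```python
import csv
import io
from statistics import mean


def risk_level(probability, medium=0.3, high=0.7):
    """Map a fraud probability to a coarse label (cutoffs are illustrative)."""
    if probability >= high:
        return "high"
    if probability >= medium:
        return "medium"
    return "low"


def build_batch_response(csv_text, score_row):
    """Score each CSV row and assemble predictions plus summary statistics.

    `score_row` is a stand-in for the trained model: any callable that
    takes a row dict and returns a fraud probability in [0, 1].
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    predictions = []
    for i, row in enumerate(rows):
        p = float(score_row(row))
        predictions.append(
            {"row": i, "fraud_probability": round(p, 4), "risk_level": risk_level(p)}
        )
    counts = {level: 0 for level in ("low", "medium", "high")}
    for pred in predictions:
        counts[pred["risk_level"]] += 1
    summary = {
        "n_transactions": len(predictions),
        "mean_fraud_probability": (
            round(mean(p["fraud_probability"] for p in predictions), 4)
            if predictions
            else None
        ),
        "risk_level_counts": counts,
    }
    return {"predictions": predictions, "summary": summary}
```

The `predictions` list maps directly onto a downloadable file, one row per input transaction.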

Launch Fraud Detection Demo

System Architecture

1. Data Preprocessing

Clean transaction records, handle missing values, process timestamps, and prepare categorical features for modeling.

2. Feature Engineering

Transaction velocity, customer spending behavior, amount ratios, time-based features, and geospatial distance calculations.

3. Model Training

XGBoost with Optuna hyperparameter optimization, class imbalance handling, and 5-fold cross-validation.

4. FastAPI Serving

Batch inference via POST /predict_batch, containerized with Docker.

5. Model Evaluation

ROC-AUC, precision-recall curves, and business-aligned fraud metrics.
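
Threshold tuning ties the training and evaluation steps together: the model outputs probabilities, and the cutoff for flagging fraud is chosen afterwards. The sketch below shows one common approach, picking the cutoff that maximizes F1 on held-out data. It is an assumption that F1 is the criterion used here; a cost-weighted business metric could be swapped in the same way.

```python
def precision_recall_f1(y_true, y_prob, threshold):
    """Precision, recall, and F1 when flagging every score >= threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, y_prob) if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def best_f1_threshold(y_true, y_prob):
    """Try each distinct score as a cutoff and keep the one with the highest F1.

    Returns (threshold, f1); falls back to (0.5, 0.0) if no cutoff scores above zero.
    """
    best = (0.5, 0.0)
    for t in sorted(set(y_prob)):
        f1 = precision_recall_f1(y_true, y_prob, t)[2]
        if f1 > best[1]:
            best = (t, f1)
    return best
```

The cutoff should be selected on validation folds rather than on the final test set, otherwise the reported F1 is optimistic.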

Batch CSV inference, FastAPI + Docker deployment, External model serving

Project 2

Exoplanet Host Star Classification

Binary classification system that identifies stars resembling known exoplanet hosts, using Gaia DR3 and NASA Exoplanet Archive data. Includes astrophysical feature engineering; XGBoost training with cross-validation; evaluation using ROC-AUC, F1 score, precision, and recall; and a FastAPI inference service for interactive predictions.

  • Gaia DR3 and NASA Exoplanet Archive data preprocessing
  • Astrophysical feature engineering from stellar parameters
  • XGBoost model training with cross-validation
  • Evaluation with ROC-AUC, F1 score, precision, recall, and confusion matrix
  • FastAPI inference API for interactive host-likeness predictions
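
As an example of what astrophysical feature engineering from stellar parameters can look like, the sketch below derives three quantities from mass, radius, and effective temperature using standard scaling relations. The page does not list the 13 features the model uses, so these particular derived features, their names, and the solar reference values are assumptions for illustration only.

```python
import math

LOG_G_SUN = 4.438   # log10 of solar surface gravity in cgs units (cm/s^2)
T_EFF_SUN = 5772.0  # nominal solar effective temperature in kelvin


def derived_stellar_features(mass, radius, teff):
    """Derive features from mass and radius (solar units) and teff (kelvin)."""
    return {
        # Surface gravity scales as M / R^2, so
        # log g = log g_sun + log10(M) - 2 * log10(R)
        "logg": LOG_G_SUN + math.log10(mass) - 2 * math.log10(radius),
        # Stefan-Boltzmann law: luminosity scales as R^2 * T^4 (solar units)
        "luminosity": radius ** 2 * (teff / T_EFF_SUN) ** 4,
        # Mean density relative to the Sun: M / R^3
        "density": mass / radius ** 3,
    }
```

Derived quantities like these can also serve as consistency checks when a catalog already reports surface gravity or luminosity directly.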

Accuracy: 0.991

F1 Score: 0.912

Features Used: 13

Python, XGBoost, FastAPI, scikit-learn, Pandas, Docker

Live Prediction Form

Interactive demo for estimating how similar a star is to known exoplanet-host stars using stellar properties such as metallicity, mass, radius, temperature, luminosity, age, and surface gravity. The form sends the inputs to the FastAPI inference service and returns a host-likeness score.

View Exoplanet Demo