Open to Junior ML Engineer Roles

Semyon Sidorov

Aspiring Machine Learning Engineer

I build practical machine learning applications from raw data to deployment, including feature engineering, model training, evaluation, FastAPI inference services, and Dockerized APIs.

End-to-End ML

Data preprocessing, feature engineering, model training, and rigorous evaluation before deployment.

ML API Deployment

FastAPI inference services and Docker containerization for reproducible, portable model serving.

Measurable Impact

Evaluation with ROC-AUC, F1, precision-recall tradeoffs, and business-aligned metrics.
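As a minimal sketch of what a precision-recall tradeoff looks like in practice, the snippet below sweeps candidate decision thresholds over a set of scores and keeps the one with the highest F1. The helper names and the toy labels and scores are made up for illustration; they are not taken from either project.

```python
from typing import List, Tuple


def precision_recall_f1(
    y_true: List[int], y_score: List[float], threshold: float
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 when scores >= threshold count as positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_f1_threshold(y_true: List[int], y_score: List[float]) -> Tuple[float, float]:
    """Try each distinct score as a cutoff; return (threshold, f1) with the highest F1."""
    best_threshold, best_f1 = 0.5, -1.0
    for candidate in sorted(set(y_score)):
        _, _, f1 = precision_recall_f1(y_true, y_score, candidate)
        if f1 > best_f1:
            best_threshold, best_f1 = candidate, f1
    return best_threshold, best_f1


# Toy labels and scores, invented for this example only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.10, 0.35, 0.40, 0.45, 0.60, 0.80, 0.55, 0.90]

threshold, f1 = best_f1_threshold(y_true, y_score)
print(f"best threshold = {threshold:.2f}, F1 = {f1:.3f}")
```

Raising the threshold trades recall for precision, so the chosen cutoff depends on which error is more costly for the use case; maximizing F1 is only one reasonable choice.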

Projects

Two end-to-end ML systems with documented training pipelines, evaluation metrics, and deployment architecture.

Real-Time Fraud Detection

End-to-end fraud detection system built from raw transaction data through production deployment. Includes feature engineering, XGBoost model training with Optuna hyperparameter optimization, 5-fold cross-validation, threshold tuning, and FastAPI-based inference services deployed with Docker.

  • Data preprocessing & feature engineering on transaction signals
  • XGBoost training with cross-validation and ROC-AUC evaluation
  • FastAPI batch inference service (POST /predict_batch)
  • Docker containerization for reproducible deployment

ROC-AUC: 0.997

F1 Score: 0.861

Deployment: FastAPI + Docker

Python, XGBoost, FastAPI, Pandas, Docker, scikit-learn

Exoplanet Host Star Classification

Binary classification system that identifies stars resembling known exoplanet-host stars, using Gaia DR3 and NASA Exoplanet Archive data. Includes astrophysical feature engineering, XGBoost model training with cross-validation, evaluation with ROC-AUC, F1 score, precision, and recall, and a FastAPI inference service for interactive predictions.

  • Gaia DR3 and NASA Exoplanet Archive data preprocessing
  • Astrophysical feature engineering from stellar parameters
  • XGBoost model training with cross-validation
  • Evaluation with ROC-AUC, F1 score, precision, recall, and confusion matrix
  • FastAPI inference API for interactive host-likeness predictions
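The cross-validated evaluation step can be illustrated with a short sketch. It makes several assumptions: the data is synthetic (generated with 13 features to mirror the feature count, not drawn from Gaia DR3 or the NASA Exoplanet Archive), scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only scikit-learn, and the choice of five stratified folds is illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic placeholder data with 13 features; NOT the real stellar catalog.
X, y = make_classification(
    n_samples=400, n_features=13, n_informative=6, random_state=0
)

# Stand-in for the XGBoost classifier used in the project.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One ROC-AUC value per held-out fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")

print("ROC-AUC per fold:", scores.round(3))
print(f"mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

Reporting the spread across folds alongside the mean gives a rough sense of how stable the score is, which a single train/test split cannot show.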

Accuracy: 0.991

F1 Score: 0.912

Features Used: 13

Python, XGBoost, FastAPI, scikit-learn, Pandas, Docker

Tech Stack

Tools and technologies used across ML projects.

Python, scikit-learn, FastAPI, Pandas, NumPy, SQLite, Docker, XGBoost, Feature Engineering, Model Evaluation, Data Pipelines

Open to ML Engineering Opportunities

I'm actively seeking Machine Learning Engineer opportunities. Reach out to discuss projects, collaborations, or open roles.