Registered: 11.01.2025

Anton Kuvaldin

Specialization: Data Scientist

Skills

Airflow
Apache Spark
AutoML
Bash
BERT
Big Data
Bootstrap
Catboost
Clustering
Data Pipelining
Data Scientists
Data Visualization
Deep Learning
Docker
Git
Groovy
Hadoop
Hive
Hyperopt
Jenkins
K-Means
LightGBM
Linux
Machine learning
Matplotlib
Neural Networks
NLP
NumPy
Onefactor
Optuna
Pandas
Plotly
PSI
Pyspark
Python
PyTorch
Random Forest
SciPy
Seaborn
SHAP
Sklearn
SQL
Statistics

Work experience

Senior Data Scientist
08.2024 - Present |Sberbank, Compliance department
K-Means, Spark, Python
Insider Threat Detection / Clustering. ● Prepared a review of novel approaches to detecting insider trading activity on the stock exchange and implemented Dynamic Clustering based on KMeans, which made it possible to introduce AI into the business process and increased the efficiency of expert investigations. ● Conducted 10+ technical interviews, which led to hiring 3 new team members.
Data Scientist
02.2023 - 08.2024 |Sberbank, Compliance department
BERT, LightGBM, Apache Spark, NLP, Python, SQL, LogReg
1. Multiclass classification. ● As lead of a unit of 2 junior data scientists, led the research and developed a BERT+LightGBM ensemble for classifying transactions subject to mandatory control by the Federal Financial Monitoring Service; the solution increased recall by 30% at a given precision of 99.9%. Core stack: BERT, LightGBM, Apache Spark, NLP, Python.
2. Binary classification. ● Developed a Gradient Boosting model for detecting transactions subject to mandatory control by the Federal Financial Monitoring Service and translated it into SQL code, which made it possible to introduce AI into the business process and decreased customer costs by 10%. Core stack: LightGBM, SQL, Spark, Python.
3. Fraud detection / Binary classification. ● Built an end-to-end pipeline for an ensemble of models for compliance control of business account opening at the bank, which led to a 20% increase in recall at a given precision of 65% and an increase in client base coverage from 40% to 80%. Core stack: LightGBM, LogReg, Spark, Python.
Data Scientist
09.2021 - 02.2023 |Sberbank, Compliance department
LightGBM, Catboost, LogReg, Spark, SQL, Python
Binary classification. ● Researched a new data source and developed a new feature data mart that was deployed to Prod and is used by 50% of DS models in the Compliance department. ● Developed 5 ML models for different business processes that were successfully deployed to Prod.
Data Engineer
05.2021 - 09.2021 |Sberbank, Compliance department
Jenkins, Groovy, Qlik Sense, OpenShift, Python, REST API
● Developed Jenkins pipelines in Groovy and Python for automatically building and publishing distributions with ML models to Nexus and OpenShift (REST API), which saved about 1 hour per ML model deployment to Prod.

Education

Computer Science (Master's)
2020 - 2022
Moscow Institute of Physics and Technology (MIPT)
Physics (Bachelor's)
2014 - 2020
Moscow Institute of Physics and Technology (MIPT)

Languages

English: Advanced