Registered: 11.05.2022

Portfolio

EPAM Systems Inc.

- Software engineering.
- Customer communication.
- Hadoop infrastructure support.
- Interviewing interns and mentoring on internal projects.
- Prototyping potential architecture improvements.
1. Data lake development for a healthcare reporting platform. Technologies: MongoDB, Apache Spark (Core), Scala, AWS S3, AWS EMR, TeamCity, Sumologic.
2. Building an end-to-end data pipeline (ingestion, transformation, reporting) on top of Apache Hive. Technologies: Apache Hive, MapReduce, HDFS, Python, Bash, HiveQL, Nexus, Jenkins.
3. Complex data streaming platform handling telemetry data. Technologies: Apache Hive, Azure DataLake, Java, Apache Spark (Spark Dataset API, Spark Streaming), Redis, Azure HDInsight.
4. ETL pipeline development. Technologies: Apache Impala, Apache Spark (Spark Dataset API), Apache Kafka, Parquet, Avro, StreamSets, Airflow, Cloudera.
5. Serverless image recognition pipeline development. Technologies: AWS S3, AWS Lambda, AWS EC2, Docker, Python, ANN.

Grid Dynamics

1. Data platform for marketing data.
   - Designing and implementing data pipelines.
   - Optimizing ETL pipelines and Spark jobs.
   - Fixing bugs.
2. Cloud data platform for a manufacturer.
   - Architecting and documenting features.
   - Developing features.
   - Integrating ML models into data pipelines.

DataArt

- Software engineering:
  - Bug fixes.
  - Ingestion of new sources.
  - ETL pipeline development.
  - Participating in architecture design and prototyping.

Skills

Airflow
Apache Cassandra
Apache Hadoop
Apache Hive
Apache Kafka
Apache Spark
Avro
AWS EC2
AWS EMR
AWS Lambda
AWS S3
Azure Datalake
Azure EventHubs
Azure HDInsight
Bash
Big Data
Cloudera
Git
Hortonworks
Java
Jenkins
JSON
Jupyter Notebooks
Linux
Parquet
Scala
Spark Streaming
SQL

Work experience

Data Engineer
07.2021 - 05.2022 |Grid Dynamics
Python, Apache Spark, AWS EMR, Airflow, Spark SQL, AWS S3, Lambda, SQS, ECR, Snowflake
1. Data platform for marketing data.
   - Designing and implementing data pipelines.
   - Optimizing ETL pipelines and Spark jobs.
   - Fixing bugs.
2. Cloud data platform for a manufacturer.
   - Architecting and documenting features.
   - Developing features.
   - Integrating ML models into data pipelines.
Data Engineer
06.2020 - 07.2021 |DataArt
- Software engineering:
  - Bug fixes.
  - Ingestion of new sources.
  - ETL pipeline development.
  - Participating in architecture design and prototyping.
Software engineer
10.2019 - 06.2020 |Perfect Art Inc.
Python, Apache Hive, HiveQL, Apache Spark, Jupyter
Fuel efficiency analysis:
- Dataflow analysis and optimization.
- Data mart creation.
- Development of basic automated data quality tests (Deequ).
- Optimization of existing Hive transformations.
Software engineer
05.2019 - 10.2019 |DINS
Scala, Apache Spark, Apache Kafka, Postgres, Apache Impala, HDFS
Data platform:
- Software development (features, pipelines, modernization).
- Hadoop cluster support.
- Service architecture refactoring.
Software engineer
07.2018 - 04.2019 |Nexign Systems Saint Petersburg
Kotlin(Java), Spring, Spring Boot, Postgres, Apache Cassandra, Apache Spark, Spark Streaming, Kafka Streams, Apache Hive
- Software engineering.
- Product migration from traditional storage systems.
Software engineer
07.2015 - 07.2018 |EPAM Systems Inc. (Russia) Saint Petersburg
MongoDB, Apache Spark(Core), Scala, AWS S3, AWS EMR, TeamCity, Sumologic
- Software engineering.
- Customer communication.
- Hadoop infrastructure support.
- Interviewing interns and mentoring on internal projects.
- Prototyping potential architecture improvements.
1. Data lake development for a healthcare reporting platform. Technologies: MongoDB, Apache Spark (Core), Scala, AWS S3, AWS EMR, TeamCity, Sumologic.
2. Building an end-to-end data pipeline (ingestion, transformation, reporting) on top of Apache Hive. Technologies: Apache Hive, MapReduce, HDFS, Python, Bash, HiveQL, Nexus, Jenkins.
3. Complex data streaming platform handling telemetry data. Technologies: Apache Hive, Azure DataLake, Java, Apache Spark (Spark Dataset API, Spark Streaming), Redis, Azure HDInsight.
4. ETL pipeline development. Technologies: Apache Impala, Apache Spark (Spark Dataset API), Apache Kafka, Parquet, Avro, StreamSets, Airflow, Cloudera.
5. Serverless image recognition pipeline development. Technologies: AWS S3, AWS Lambda, AWS EC2, Docker, Python, ANN.

Education

HDP Certified Administrator
Until 2016
Certificates/Courses
Technical Cybernetics, Software Engineering (Master's degree)
Until 2018
Peter the Great St. Petersburg Polytechnic University

Languages

English: Upper-intermediate
Russian: Native