Registered: 08.08.2022

Alexey Zabolotskii

Specialization: Data Science

Portfolio

Cyberway

The task was to detect fraudulent behaviour by users in a blockchain ecosystem. There was no representative ground truth or training data to establish whether a particular user was cheating, so the goal was to use unsupervised methods. Many users built complex interaction schemes, and it was important to detect and recognize such behaviour. These actions included bribes, arbitrage, fake votes, and fake boosting. Later I proposed a cascade voting system algorithm to exclude fraudulent users, but it was decided that such an algorithm could drastically reduce transaction speed because of the limitations of the EOS-based blockchain. I also proposed a SOM model for network recovery and routing optimisation.

Leantech

Tasks were mainly, but not exclusively, research-oriented: image recognition and medical image processing algorithms. Various tasks on detecting and tracking objects in changing environments, where object segmentation or feature extraction is made difficult by masking, hiding, noise, and obstacles. Investigation of possible segmentation methods for medical purposes.

Libertas Navitas

Recently I have been working at this company as well as at the Institute of Economics and Industrial Engineering. While under contract with an oil and gas company, my tasks involved building oil/water extraction prediction models based on input variables. The next step was to extend the model to a whole-reservoir model with different ML/DL models covering resource extraction. The main goal was to create a reservoir parameter prediction model. Various sequential and parallel DL models were built for reservoir parameter prediction.

Skills

AI/ML idea generation
Code deployment
Code writing
R&D

Work experience

Data Science
Present |Leantech
Tasks were mainly, but not exclusively, research-oriented: image recognition and medical image processing algorithms. Various tasks on detecting and tracking objects in changing environments, where object segmentation or feature extraction is made difficult by masking, hiding, noise, and obstacles. Investigation of possible segmentation methods for medical purposes. A wide range of tasks on applying CV to problems such as predicting where electric lines intersect with trees, 3D point cloud implementation, relational segmentation, relational image recognition, canonical forms, etc.
Data Science
11.2021 - 03.2022 |Arameem
Data Science. Geospatial taxi demand prediction. Unlike conventional models, I proposed parallelizing the model across hexagonal cells, which increased precision manifold. A combined autoregressor-boosting model was applied. The model was deployed on Airflow with FastAPI output.
Data Science
11.2019 - 04.2021 |Libertas Navitas
Pandas, NumPy, scikit-learn, Keras, TensorFlow, PyTorch, Spark
Oil and Gas • Oil Extraction • Gas Extraction • Fuels and Lubricants (Retail)
Data Science. Company: Novatis (data science: oil and gas ML models, etc.), Novosibirsk.
Recently I have been working at this company as well as at the Institute of Economics and Industrial Engineering. While under contract with an oil and gas company, my tasks involved building oil/water extraction prediction models based on input variables. The next step was to extend the model to a whole-reservoir model with different ML/DL models covering resource extraction. The main goal was to create a reservoir parameter prediction model. Various sequential and parallel DL models were built for reservoir parameter prediction.
Extracted water/oil prediction models. Models implemented: DN (Keras) models (MLP, LSTM, ConvLSTM2D, attention-based LSTM and ConvLSTM), parallel models (MLP-MLP, LSTM-LSTM, ConvLSTM2D-ConvLSTM2D), multicore ConvLSTM2D, graph models (PyTorch), and development of a spatial LSTM neuron model for pressure prediction (not finished). Parallel models were introduced to combine short-term impacts (pressure gradients) with long-term ones (water level/percentage). The LSTM-repeater-LSTM core showed the best results. Later, parallel 1-dimensional ConvLSTM cores showed promising results.
Tools and libraries used: Python (Pandas, NumPy, scikit-learn, Keras, TensorFlow, PyTorch, Spark, etc.). Data preprocessing (Keras layers, scikit-learn). Data analysis. Various models reached up to 90% precision on 1-10 step models, though that precision was due to short-term data quality, such as heteroscedasticity or the absence of noise and outliers.
Recently studied computer vision models based on scene understanding by means of real-world physics mapping. Created my own custom VAR/Encoder-Decoder layers that achieved superb learning curves.
Data Science
07.2018 - 04.2019 |Cyberway
Fraud detection
Data Scientist. Company: Cyberway, Novosibirsk, www.linkedin.com/company/cyberway-core/
The task was to detect fraudulent behaviour by users in a blockchain ecosystem. There was no representative ground truth or training data to establish whether a particular user was cheating, so the goal was to use unsupervised methods. Many users built complex interaction schemes, and it was important to detect and recognize such behaviour. These actions included bribes, arbitrage, fake votes, and fake boosting. Later I proposed a cascade voting system algorithm to exclude fraudulent users, but it was decided that such an algorithm could drastically reduce transaction speed because of the limitations of the EOS-based blockchain. I also proposed a SOM model for network recovery and routing optimisation.
The following tasks were implemented:
1) Data analysis: data clustering and analysis with neural and clustering tools (R, RStudio, Matlab, Visual Studio Code (Python 3 for ML/AI, C++), Google Colab (TensorFlow (SOM, Keras DN))). Data sets of 1 000 000 to 10 000 000 observations of various kinds (users, votes, transfers, fake votes, boosting, asset relocation, and others).
2) Cryptography algorithms: zero-knowledge-based systems such as SNARKs and STARKs.
3) Cryptography: crypto algorithms in C++ and Python, Crypto++, ZK (SNARK, SNARG, STARK, Bulletproofs). ZK systems such as PLONK, Sonic, Supersonic, Marlin, Fractal, Aurora, Groth16, Halo, Ligero, BCTV14.
4) Data Science: R (support vector machines (hyperbolic, Gaussian, and others)), TensorFlow (SOM, DN), OPTICS.
Invented a cascade algorithm with an infection-like spreading node activation function to prevent boosting and fake votes. Invented a self-organized network reconfiguration that adjusts nodes for better data transfer, replacing heavy STP-based network reconfigurations.
Data Science
03.2003 - 03.2022 |Institute of Economics and Industrial Engineering
Researcher/Data Science (ML)
I have worked at the Institute of Economics and Industrial Engineering since 2004. My work is aimed at investigating innovation processes in industries such as biotechnology and microelectronics. Since 2004 I have participated in various projects. The main tasks and methods are writing research articles and implementing ML algorithms for panel data processing (Matlab, Python, R, SPSS, etc.). Research algorithms include SOM, GSOM, DL, KNN, K-Means, gradient boosting methods, regressions, etc. The projects are:
1) Big data (scientific articles) processing with ML.
2) Feasibility study of the TERRD systems biology database. The study was carried out to assess the possibility of entering foreign markets. The result was negative because of potentially dangerous barriers in the biotechnology industry production chains of the EU, Russia, and the USA.
3) Joint project with NEVZ (Novosibirsk Electrovacuum Plant). The project was carried out to create a reconfigurable production chain strategy for a new materials company (NEVZ).
4) Study of biotechnology and microelectronics production chains and their integration with global and local innovation systems.
5) A recent project on creating an integrator system based on the selection and compilation of innovations with innovative companies.

Education

Economics (PhD)
since 2002
Institute of Regional Studies

Languages

German: Intermediate
English: Advanced