AI Team Lead
12.2023 - 03.2025 |SpeaxAI
Python, OpenAI, FastAPI, Docker, Kubernetes, ElevenLabs
LipSync Video Dubbing Application.
● Train a custom Whisper model and deploy it efficiently on GPUs for batch-wise inference.
● LLM-based Multi-Prompt Translation, evaluated end-to-end with LLM-as-a-Judge.
● Emotion-aware Text-to-Audio models with ZeroShot voice cloning.
● Audio-Visual LipSync based on diffusion, with video restoration.
Task description:
Led AI strategy and development for a multilingual lip-sync video dubbing platform combining speech, language, and video AI.
Trained a custom Whisper model and optimized it for batch-wise inference on GPUs, supporting high-throughput dubbing workflows.
Developed a Multi-Prompt Translation pipeline evaluated by LLMs such as ChatGPT-4 and Claude, improving translation fluency and contextual accuracy across more than a dozen languages.
Built an emotion-aware Text-to-Audio generation system with ZeroShot voice cloning to produce natural, expressive speech aligned with original speaker tone.
Designed an Audio-Visual LipSync module using diffusion-based alignment techniques, enhanced by video restoration tools to deliver high-fidelity output.
Oversaw prompt engineering workflows and multilingual evaluation strategies to ensure dubbing quality, sync accuracy, and emotional consistency in final outputs.
NLP Developer
08.2022 - 01.2024 |Giotto
Software Development, AR, MATLAB, Datasets, MLOps, Visualization, Scikit-Learn, Data Analysis, SQL, Red Hat Linux, Flask, AWS, Technical Vision, NLP, Data Science, PyTorch, Computer Vision, Docker Products, PySpark, Data Cleaning, SAS, Google API, FastAPI, Deep Learning, Generative AI, AI, Strategy, Automation, VR, LLM
Medical Text Mining Platform.
● Graph-based Language model for news anomaly detection.
● LLM-based social media alerting system.
● Table Information Parsing.
● Visual Question Answering.
Task description:
Developed graph-based language models to identify anomalies in news and regulatory data, enabling early detection of narrative shifts and compliance risks.
Built an LLM-driven social media alert system that identifies high-risk or non-compliant content across platforms.
Implemented Table Information Parsing and Visual Question Answering modules to extract insights from complex documents with mixed modalities.
Led the development of medical text classification models based on BERT architectures, later integrated into Giotto’s core offerings.
As part of an external collaboration with Google Mind, contributed to refining ALBERT, RoBERTa, and DistilBERT for low-resource language tasks, particularly Turkish.
Published and showcased BERT fine-tuning work on Hugging Face, including domain-specific language models for healthcare and legal use cases.
Machine Learning Consultant
11.2020 - 07.2022 |Ucraft
Software Development, MATLAB, Datasets, MLOps, Visualization, Scikit-Learn, Data Analysis, SQL, Red Hat Linux, Flask, AWS, NLP, Data Science, PyTorch, Docker Products, PySpark, Data Cleaning, SAS, Google API, FastAPI, Deep Learning, Generative AI, AI, Strategy, VR, LLM, gRPC
AI Auto Website Builder.
● Website template ranking based on the URLs a user is interested in.
● Image captioning based on OpenAI CLIP.
● Conversational AI for building a website.
Task description:
Designed a template ranking system using user behavior modeling and interest-based URL tracking.
Built a CLIP-based image captioning model for dynamic image-to-text generation across Ucraft's template editor.
Developed a Conversational AI assistant for guided website building using NLP and rule-based NLU engines.
Led research on medical text mining, creating structured outputs from clinical narratives and patient data.
CTO
02.2020 - 06.2021 |NSAtech
Software Development, MATLAB, Datasets, MLOps, Visualization, Data Analysis, SQL, Flask, Suse, NLP, PyTorch, Docker Products, Data Cleaning, SAS, Google API, Microsoft Azure, FastAPI, Deep Learning, AI, GCP, LLM
AI-based Talent Identification System.
● Develop an action similarity system using 2D skeleton and 3D body mesh reconstruction, with a simple RGB camera as input, to compare amateur and professional players.
● Train, evaluate, and serve deep learning models, preparing trained models for deployment on servers.
● Design and develop APIs that connect the UI to backend services.
Deep Learning Developer (membership)
12.2019 - 03.2020 |Iran's National Elites Foundation
Software Development, MATLAB, Visualization, Scikit-Learn, Flask, NLP, PyTorch, Computer Vision, Data Cleaning, SAS, Deep Learning, AI, VR
Persian Sign Language Project:
● Converting Video Signs to the corresponding text (Gloss) labels.
● Converting Gloss to plain Farsi text (Neural Machine Translation task).
● Using SOTA language models such as BERT, XLM, and GPT-2, along with attention-based approaches.
● Pre-training “GPT-Neo” and “DistilRoBERTa” language models from scratch on more than 125GB of pre-processed, self-crawled Persian text on a Google TPU v3-32 over 12 days (thanks to Hugging Face and Google Research for sponsoring the hardware).
These models outperform previous results on downstream Persian NLP tasks such as NER, Sentiment Classification, Text Classification, and so on.
VP of AI Dept
01.2019 - 11.2020 |Alpha Reality
SabteAhval, SahmAshena Brokerage, Golrang
eKYC Platform.
● Deploy Face Recognition and Face AntiSpoofing via the Iranian National ID Card API (SabteAhval) for Sejam Stock Market Electronic Authentication (SahmAshena Brokerage).
● Retail market analysis engine and recommender system based on customer segmentation and behavior prediction (Golrang System).
● Train and deploy Customer RFM analysis for millions of customers and thousands of concurrent requests.
Computer Vision Developer
04.2018 - 01.2019 |Alpha Reality
Deep Learning, EfficientNet, MobileNetV2 CNN
● Indoor Navigation via Camera Positioning based on an end-to-end deep learning approach using state-of-the-art EfficientNet and MobileNetV2 CNN architectures.
● Face AntiSpoofing web service based on Face 3D Pose Estimation for Electronic Authentication.
● AlphaCognition Project: VIP and BlackList Face Recognition System at MegaMarkets from Multiple CCTV Streams. (ShahrVand Chain Stores).
Machine Learning Developer
06.2017 - 04.2018 |Aradow
SQL, PyTorch, Written Communication, AI, VR, Scikit-Learn, Flask, RESTful API, MongoDB, Keras
● Implementation of a diagnostic system for livestock diseases based on given symptoms, served as a server-side microservice in a Docker container using an SVM with a Gaussian kernel and a RESTful API (Scikit-Learn, Flask).
● Implementation of a plant disease recognition system based on a client-side leaf image, using the MobileNetV2 CNN architecture (Keras, Flask).
● Implementation of a recommendation system based on user preferences and the sentiment of their comments (extracting information from MongoDB).
● Prediction of harvesting rates based on land climatic features and past years’ reports.
● Crop recommendation system based on nutrient report and soil features.