senior
Registered: 02.04.2025

Ali Ghofrani

Specialization: Deep Learning Researcher / Python Developer
● Experienced ML/DL researcher with a demonstrated history of working in the AI industry.
● Skilled in Python, OpenCV, SKLearn, TensorFlow, OpenAI, ChatGPT, Claude, LLaMA, LangChain, LlamaIndex, PyTorch, HuggingFace, NLTK, FastAPI, MongoDB, PostgreSQL, and Docker.
● Strong research professional with an MSc focused on Signal Processing and Telecommunications.
● As an ML Developer, I'm interested in training from scratch or fine-tuning state-of-the-art deep architectures based on TensorFlow, PyTorch, or HF PEFT for different Vision, Speech, or NLP tasks.
● With 9 years of experience with real-world challenges, as an MLOps Developer I enjoy quantizing, pruning, and distilling trained models into lighter and faster production-ready microservices based on TRT, ONNX, OpenVINO, and CoreML, bringing huge models to server-side or client-side applications.
Publications:
● Knowledge Distillation in Plant Disease Recognition - Neural Computing and Applications Journal.
● CaTILoc: Camera Image Transformer for Indoor Localization - The International Conference on Acoustics, Speech, & Signal Processing (ICASSP 2021).
● APS: A Large-Scale Multi-Modal Indoor Camera Positioning System (Conference Best Paper Award) - The 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence (MedPRAI 2020).
● Plant Disease Recognition using Optimized Deep Convolutional Neural Networks (Oral Presentation) - The 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence (MedPRAI 2020).
● L-ICPSnet: LiDAR Indoor Camera Positioning System for RGB to Point Cloud Translation using End2End Generative Network (Oral Presentation) - The 8th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS).
● Attention-Based Face AntiSpoofing of RGB Camera using a Minimal End-2-End Neural Network (Oral Presentation) - 2020 International Conference on Machine Vision and Image Processing (MVIP).
● ICPS-net: An End-to-End RGB-based Indoor Camera Positioning System using deep convolutional neural networks (Oral Presentation) - The 12th International Conference on Machine Vision, Amsterdam, The Netherlands.
● Real-time Face-Detection and Emotion Recognition Using MTCNN and miniShuffleNet V2 (Oral Presentation, Best Paper Candidate) - 5th Conference on Knowledge-Based Engineering and Innovation [IEEE KBEI2019].
● Capsule-Based Persian/Arabic Robust Handwritten Digit Recognition Using EM Routing (Oral Presentation) - Sharif 4th International Conference on Pattern Recognition and Image Analysis [IEEE IPRIA2019].

Skills

Python
C
Matlab
Scikit-Learn
Keras
TensorFlow
PyTorch
MXNet
GluonCV
Caffe
Numpy
Matplotlib
Pandas
Plotly
Dash
FAISS
ScaNN
Chroma
Pinecone
Milvus
Qdrant
OpenCV
I3D
SlowFast
LLM
GPT-3.5 Turbo
LangChain
LVM
LLaVA
OpenAI CLIP
PoseNet
OpenPose
AlphaPose
Linux
Bash
Debian
Ubuntu
RasPi
RESTful API
Flask
FastAPI
SQL
NoSQL
MongoDB
Redis
MySQL
ARM
Intel TBB
GStreamer
Agile
Scrum
Git
GitLab
GitHub
Trello
Jira

Work experience

AI Team Lead
12.2023 - 03.2025 |SpeaxAI
Python, OpenAI, FastAPI, Docker, Kubernetes, ElevenLabs
LipSync Video Dubbing Application.
● Train a custom Whisper model and deploy it efficiently on GPUs to handle batched inference.
● Use LLM-based Multi-Prompt Translation, evaluated end to end with an LLM as judge.
● Emotion-aware Text-to-Audio models with a ZeroShot voice cloning system.
● Audio-Visual LipSync based on Diffusion, plus a Video Restoration application.
Task description: Led AI strategy and development for a multilingual lip-sync video dubbing platform combining speech, language, and video AI. Trained a custom Whisper model and optimized it for batch-wise inference on GPUs, supporting high-throughput dubbing workflows. Developed a Multi-Prompt Translation pipeline evaluated by LLMs such as ChatGPT-4 and Claude, improving translation fluency and contextual accuracy across over a dozen languages. Built an emotion-aware Text-to-Audio generation system with ZeroShot voice cloning to produce natural, expressive speech aligned with the original speaker's tone. Designed an Audio-Visual LipSync module using diffusion-based alignment techniques, enhanced by video restoration tools to deliver high-fidelity output. Oversaw prompt engineering workflows and multilingual evaluation strategies to ensure dubbing quality, sync accuracy, and emotional consistency in final outputs.
NLP Developer
08.2022 - 01.2024 |Giotto
Software Development, AR, MATLAB, Datasets, MLOps, Visualization, Scikit-Learn, Data Analysis, SQL, Red Hat Linux, Flask, AWS, Technical Vision, NLP, Data Science, PyTorch, Computer Vision, Docker Products, PySpark, Data Cleaning, SAS, Google API, FastAPI, Deep Learning, Generative AI, AI, Strategy, Automation, VR, LLM
Medical Text Mining Platform.
● Graph-based language model for news anomaly detection.
● LLM-based social media alerting system.
● Table Information Parsing.
● Visual Question Answering.
Task description: Developed graph-based language models to identify anomalies in news and regulatory data, enabling early detection of narrative shifts and compliance risks. Built an LLM-driven social media alert system that identifies high-risk or non-compliant content across platforms. Implemented Table Information Parsing and Visual Question Answering modules to extract insights from complex documents with mixed modalities. Led the development of medical text classification models based on BERT architectures, later integrated into Giotto’s core offerings. As part of an external collaboration with Google Mind, contributed to refining ALBERT, RoBERTa, and DistilBERT for low-resource language tasks, particularly Turkish. Published and showcased BERT fine-tuning work via my Hugging Face profile, including domain-specific language models for healthcare and legal use cases.
Machine Learning Consultant
11.2020 - 07.2022 |Ucraft
Software Development, MATLAB, Datasets, MLOps, Visualization, Scikit-Learn, Data Analysis, SQL, Red Hat Linux, Flask, AWS, NLP, Data Science, PyTorch, Docker Products, PySpark, Data Cleaning, SAS, Google API, FastAPI, Deep Learning, Generative AI, AI, Strategy, VR, LLM, gRPC
AI Auto Website Builder.
● Website template ranking considering user-interested URLs.
● Image captioning based on OpenAI CLIP.
● Conversational AI for building a website.
Task description: Designed a template ranking system using user behavior modeling and interest-based URL tracking. Built a CLIP-based image captioning model for dynamic image-to-text generation across Ucraft's template editor. Developed a Conversational AI assistant for guided website building using NLP and rule-based NLU engines. Led research on medical text mining, creating structured outputs from clinical narratives and patient data.
CTO
02.2020 - 06.2021 |NSAtech
Software Development, MATLAB, Datasets, MLOps, Visualization, Data Analysis, SQL, Flask, Suse, NLP, PyTorch, Docker Products, Data Cleaning, SAS, Google API, Microsoft Azure, FastAPI, Deep Learning, AI, GCP, LLM
AI-based Talent Identification System.
● Developed an action similarity system that compares amateur and professional players using 2D skeleton and 3D body mesh reconstruction from a simple RGB camera input.
● Trained, evaluated, and served deep learning models to prepare them for deployment on servers.
● Designed and developed APIs connecting the UI with backend services.
Deep Learning Developer (membership)
12.2019 - 03.2020 |Iran's National Elites Foundation
Software Development, MATLAB, Visualization, Scikit-Learn, Flask, NLP, PyTorch, Computer Vision, Data Cleaning, SAS, Deep Learning, AI, VR
Persian Sign Language Project:
● Converting video signs to the corresponding text (gloss) labels.
● Converting gloss to plain Farsi text (Neural Machine Translation task).
● Using SOTA language models such as BERT, XLM, and GPT-2, along with attention approaches.
● Pre-training the “GPT-neo” and “DistilRoBERTa” language models from scratch on more than 125 GB of pre-processed, self-crawled Persian text on a Google TPU v3-32 over 12 days (thanks to Hugging Face and Google Research for sponsoring the hardware resources). These models outperform previous results on downstream Persian NLP tasks such as NER, Sentiment Classification, and Text Classification.
VP of AI Dept
01.2019 - 11.2020 |Alpha Reality
SabteAhval, SahmAshena Brokerage, Golrang
eKYC Platform.
● Deployed Face Recognition and Face AntiSpoofing via the Iranian National ID Card API (SabteAhval) for Sejam Stock Market Electronic Authentication (SahmAshena Brokerage).
● Built a retail market analysis engine and recommender system based on customer segmentation and behavior prediction (Golrang System).
● Trained and deployed customer RFM analysis for millions of customers and thousands of concurrent requests.
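The RFM (Recency, Frequency, Monetary) analysis mentioned above can be illustrated with a minimal sketch. It is not the production system: the `(customer_id, order_date, amount)` order schema, the `rfm_scores` name, and the equal-size rank binning are all assumptions made for illustration.

```python
from datetime import date


def rfm_scores(orders, today, bins=3):
    """Score customers on Recency, Frequency, Monetary (1 = worst, bins = best).

    orders: iterable of (customer_id, order_date, amount) tuples.
    This schema is hypothetical and chosen only for the example.
    """
    # Aggregate last order date, order count, and total spend per customer.
    stats = {}
    for customer, day, amount in orders:
        last, count, total = stats.get(customer, (day, 0, 0.0))
        stats[customer] = (max(last, day), count + 1, total + amount)

    recency = {c: (today - last).days for c, (last, _, _) in stats.items()}
    frequency = {c: count for c, (_, count, _) in stats.items()}
    monetary = {c: total for c, (_, _, total) in stats.items()}

    def rank_score(values, descending):
        # Order customers from worst to best, then cut the ranking into equal bins.
        ordered = sorted(values, key=values.get, reverse=descending)
        n = len(ordered)
        return {c: 1 + (i * bins) // n for i, c in enumerate(ordered)}

    r = rank_score(recency, True)     # many days since the last order = worst
    f = rank_score(frequency, False)  # few orders = worst
    m = rank_score(monetary, False)   # low total spend = worst
    return {c: (r[c], f[c], m[c]) for c in stats}
```

Calling `rfm_scores(orders, date(2024, 3, 31))` returns one `(R, F, M)` tuple per customer, which can then feed segmentation rules. At the scale of millions of customers, the same aggregation would more likely run in a database or a dataframe engine than in a Python loop.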
Computer Vision Developer
04.2018 - 01.2019 |Alpha Reality
Deep Learning, EfficientNet, MobileNetV2 CNN
● Indoor navigation via camera positioning based on an End2End deep learning approach using the state-of-the-art EfficientNet and MobileNetV2 CNN architectures.
● Face AntiSpoofing web service based on 3D face pose estimation for electronic authentication.
● AlphaCognition Project: VIP and BlackList Face Recognition System at MegaMarkets from multiple CCTV streams (ShahrVand Chain Stores).
Machine Learning Developer
06.2017 - 04.2018 |Aradow
SQL, PyTorch, Written Communication, AI, VR, Scikit-Learn, Flask, RESTful API, MongoDB, Keras
● Implemented a diagnostic system for livestock diseases based on the given symptoms, served as a Docker-containerized server-side microservice using an SVM with a Gaussian kernel behind a RESTful API (Scikit-Learn, Flask).
● Implemented a plant disease recognition system based on client-side leaf images, using the MobileNetV2 CNN architecture (Keras, Flask).
● Implemented a recommendation system based on user preferences and the sentiment of their comments (extracting information from MongoDB).
● Predicted harvesting rates based on land climatic features and reports from past years.
● Built a crop recommendation system based on nutrient reports and soil features.
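For readers unfamiliar with the Gaussian (RBF) kernel SVM named in the first bullet above, here is a minimal sketch of how such a model scores an input. It is an illustration only: the function names, the binary symptom vectors, the `gamma` value, and the support vectors are hypothetical, and a real service would rely on a trained Scikit-Learn model rather than hand-written code.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Binary SVM decision value: sum_i coef_i * K(sv_i, x) + bias.

    dual_coefs[i] is alpha_i * y_i for support vector i. A positive result
    means the positive class, a negative result the negative class.
    """
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias


# Hypothetical symptom vectors (1 = symptom present) acting as support vectors.
support_vectors = [[1, 1, 0], [0, 0, 1]]
dual_coefs = [1.0, -1.0]  # first vector supports "diseased", second "healthy"
decision = svm_decision([1, 1, 0], support_vectors, dual_coefs, bias=0.0)
```

The kernel makes the decision depend on how close the input is to each support vector, so an input matching the first symptom pattern gets a positive decision value here.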

Education

Engineer (Bachelor's)
Until 2013
Isfahan University of Technology
Telecommunication Systems (Doctorate)
2021 - present
Shahid Beheshti University
Telecommunication Systems (Master's)
2017 - 2020
IRIBU
Electrical, Electronics and Communications Engineering (Bachelor's)
2014 - 2017
IRIBU

Languages

English: Advanced