Sharif Mulani
Portfolio
Aligned Automation
2. DELL - Address Validation & Geocoding (Python, NLP / DL). This project corrects invalid or incorrect geographic addresses from LATAM countries such as Mexico, Argentina, and Chile, using the Google Maps and OpenStreetMap APIs together with sentence embeddings from SBERT (Sentence-BERT).
● Designed and implemented the solution architecture on Azure Blob Storage, Azure ML, and an Azure DevOps CI/CD framework.
● Trained an NLP model (SimCSE: Simple Contrastive Learning of Sentence Embeddings) on a Tesla V100 GPU system using a domain-specific corpus (Spanish public-domain addresses) for the downstream task of textual semantic search. The model was used to store pre-validated Mexico address embeddings in an Elasticsearch 8.0 dense-vector database and to search them by cosine similarity.
● Exposed the solution as a RESTful service through the Flask API framework and connected it to a Node.js / React UI for a better user experience.
● Handled project management, leading a team of data scientists / machine learning engineers, data engineers, and test engineers.
SunGard Global Solutions
1. NEC Japan - GTC Framework Development (Python, PyTorch DL, NLP). This project developed a Generic Text Classifier (GTC) framework that offers (a) training (using only the architecture) and additional training (using the architecture and its weights) of various pre-trained embedding models on a custom dataset, (b) several validation methods (hold-out, k-fold), and (c) building a classification model on top of the embedding model from (a), with custom network layers defined through configuration.
● The requirement was to build a GTC framework in PyTorch that supports several kinds of customization (custom data, custom embedding models, and custom classification models).
● Led the framework design and development, including defining the framework architecture, selecting the AWS EC2 (GPU) infrastructure, and writing the GTC framework user manual.
● Trained and tested various Transformer-based embedding models (BERT, BART, LaBSE, GPT-2) and a pre-defined classification model (AutoModelForSequenceClassification) on a custom dataset.
● Received a SPOT award for developing the framework in a short span of time.
2. Generative AI based POC and Proposal (Python, Generative AI / LLM, LangChain, Azure OpenAI).
● Contributed to a project proposal based on a POC for a Q&A bot that addresses banking customer queries using the Azure OpenAI API service.
● In-house POC: built a Q&A bot using the LangChain framework to address HR policy related queries.
Aligned Automation
1. SAP SD - Information Extraction (Python, PyTorch, Generative AI / LLM). This project extracts information from SAP business requirements (supplied as a Q&A dataset for the SAP SD module) using generative AI approaches such as one-shot and few-shot prompting, Tree of Thoughts, static and dynamic prompt engineering, Alpaca-style prompt engineering, and automatic prompt tuning.
● The requirement was to extract keywords (single or multiple) from SAP business requirements supplied in the form of questions and answers.
● An existing SAP implementation AI tool was used to map each Q&A pair to the Process Element and field name of a specific SAP SD BDC screen. The field value, however, had to be extracted from the question and answer through prompt engineering.
● Applied these prompt engineering techniques (one-shot, few-shot, Tree of Thoughts, automatic prompt tuning) with various LLMs, including Falcon, Vicuna, and Llama-based models such as OpenLLaMA and Nous-Hermes-Llama.
● Contributed to research and led the team of NLP prompt engineers.