● Designed and implemented scalable data pipelines using Spark, Kafka, and Flink for processing large volumes of streaming data.
● Implemented a one-time data migration of multi-state data from SQL Server to Snowflake using Python and SnowSQL.
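A minimal sketch of how such a migration's load step might be assembled, assuming illustrative table and stage names (the `STATE_DATA_*` tables and `@migration_stage` stage are hypothetical, not the actual migration's):

```python
# Hypothetical sketch: assemble per-state COPY INTO statements for a
# one-time SQL Server -> Snowflake load. Names are illustrative only.

def build_copy_statement(state: str, stage: str = "@migration_stage") -> str:
    """Build a SnowSQL COPY INTO statement loading one state's extract."""
    table = f"STATE_DATA_{state.upper()}"
    return (
        f"COPY INTO {table} "
        f"FROM {stage}/{state.lower()}/ "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) "
        "ON_ERROR = 'ABORT_STATEMENT';"
    )

# The generated statements would then be run through the snowsql CLI or
# snowflake-connector-python; here we only assemble them.
statements = [build_copy_statement(s) for s in ("ny", "ca", "tx")]
```

Keeping the statement construction in a pure function like this makes the per-state loads easy to review and test before any connection to Snowflake is opened.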
● Created multiple AWS Glue ETL jobs in Glue Studio, applied transformations to the data, and loaded the results into S3, Redshift, and RDS.
● Performed end-to-end architecture and implementation assessments of various AWS services, including Amazon EMR, Redshift, and S3.
● Managed infrastructure and configuration almost entirely through GitHub and Ansible.
● Managed Hadoop deployments in the AWS cloud using S3 storage and Elastic MapReduce (EMR).
● Configured an AWS Virtual Private Cloud (VPC), NACL, and Database Subnet Group for isolation of resources within the Confidential RDS and Aurora DB clusters.
● Demonstrated proficiency in deploying applications to various cloud platforms such as TKGS, Azure, GCP, and AWS, ensuring adaptability to different cloud environments and deployment strategies.
● Day-to-day responsibilities included developing ETL pipelines into and out of the data warehouse and building major regulatory and financial reports using advanced SQL queries in Snowflake.
● Proficient in data analysis tools such as Excel, Python, R, and SQL for querying and manipulation.
● Built microservices using Spring Boot, deployed to Docker containers, and hosted in Kubernetes clusters.
● Created automated pipelines in AWS CodePipeline to deploy Docker containers in ECS using S3.
● Designed and set up an Enterprise Data Lake to support various use cases including analytics, processing, storing, and reporting of rapidly changing data.
● Utilized Microsoft Power BI to design and develop interactive dashboards and reports, providing stakeholders with actionable insights into key performance indicators (KPIs) and business metrics.
● Leveraged Power BI and Power Pivot for data analysis prototypes, utilizing Power View and Power Map for effective report visualization.
● Implemented CI/CD pipelines with AWS for efficient deployment and optimized workspace and cluster configurations for performance.
● Designed and developed a data management system using MySQL and managed large datasets with pandas DataFrames.
● Designed and developed security frameworks for fine-grained access control in AWS S3 using Lambda and DynamoDB.
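One way such fine-grained access control is often structured (a hedged sketch, not the actual framework): a Lambda function reads a per-user permissions record from DynamoDB and decides whether to grant the requested S3 action. The record shape, bucket names, and key patterns below are assumptions for illustration:

```python
# Hedged sketch of a fine-grained S3 access decision. In the described
# design, a Lambda handler would fetch this record from DynamoDB and, on
# a True result, issue a presigned URL; the record shape is an assumption.
from fnmatch import fnmatch

def is_allowed(record: dict, bucket: str, key: str, action: str) -> bool:
    """Return True if the permissions record grants `action`
    (e.g. 'read' / 'write') on s3://bucket/key."""
    for grant in record.get("grants", []):
        if (grant["bucket"] == bucket
                and action in grant["actions"]
                and fnmatch(key, grant["key_pattern"])):
            return True
    return False

# Illustrative record: read-only access to the finance/ prefix.
record = {"grants": [
    {"bucket": "data-lake", "key_pattern": "finance/*", "actions": ["read"]},
]}
```

Keeping the policy decision as a pure function separates it from the boto3 calls that fetch the record and sign the URL, which makes the authorization logic unit-testable.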
● Utilized Spark Structured Streaming for real-time data processing and integrated Databricks with various ETL and orchestration tools.
● Implemented machine learning algorithms in Python for predictive analytics and utilized AWS EMR for data transformation and movement.
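As an illustrative sketch only (real predictive work would use a proper ML library rather than this toy), a minimal least-squares predictor of the kind underlying such analytics can be written in plain Python:

```python
# Minimal ordinary-least-squares sketch for y = a*x + b; illustrative
# toy code, not the actual production models.

def fit_line(xs, ys):
    """Fit slope and intercept by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def predict(model, x):
    a, b = model
    return a * x + b

# Perfectly linear toy data: y = 2x, so the fit recovers slope 2, intercept 0.
model = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
```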
● Prepared and delivered presentations to stakeholders summarizing A/B test methodologies, results, and actionable recommendations.
● Strong understanding of the Software Development Life Cycle and project implementation methodology.