Minimum 4 years of experience as a motivated ETL Developer, skilled in the design, development & deployment of Python & PySpark based XML frameworks for ingesting data from various source systems into Snowflake in AWS & Google Cloud Platform (GCP) environments.
Hands-on coding in Python & PySpark in cloud-based environments involving big-data frameworks and AWS resources such as EMR, Lambda, S3, RDS, EC2, ECS & EKS.
Experience building ETL frameworks & data pipelines for ingestion into the Snowflake Data Warehouse in high-frequency, high-volume scenarios (gigabytes to terabytes of data ingested per day / month).
Optimize PySpark jobs for performance & efficiency, troubleshoot issues, and ensure data quality & availability.
Implement processes for automating data delivery, monitoring data quality & managing production deployments.
The candidate should be an expert developer with hands-on experience in Git / GitLab based repository management and an understanding of Government Commercial Cloud (GCC) requirements.
Proficient in consulting communication skills: able to articulate business problems, propose solution approaches & respond to changes in business scenarios.
Degree in Computer Science, Computer Engineering or an equivalent STEM field.
Familiar with working in a Government Commercial Cloud (GCC) environment.
Independent, self-motivated contributor who is passionate about software development.