Roles & Responsibilities
Data Engineer (Azure): Azure Synapse, PySpark, Python, data warehousing, Azure Data Explorer, and Azure DevOps
Job Scope
- Design, review, and develop PySpark scripts; test and troubleshoot data pipelines and orchestration
- Implement and maintain the data lake
- Establish connections to source data systems such as on-premises databases, IoT devices, and APIs
- Manage the collected data in appropriate storage or database solutions (e.g. file systems, SQL servers, or Big Data platforms such as Hadoop and HANA), as required by the specific project
- Design and develop data pipelines using PySpark and Copy Data activities for batch ingestion
- Perform data integration (e.g. using database table joins or other mechanisms) at the level required by the project's analysis requirements
- Deploy pipeline artifacts from one environment to another using Azure DevOps
- DevOps expertise is preferred
Skills & Experience
- Bachelor’s degree in Computer Science or Engineering with 2 years of experience in Azure data engineering, Python, PySpark, or Big Data development
- Sound knowledge of Azure Synapse Analytics for pipelines, orchestration, and setup
- 1-2 years of experience in visualization design and development with Power BI, including knowledge of row-level security and access control
- Sound experience in SQL, data warehouses, data marts, and data ingestion with PySpark and Python
- Expertise in developing and maintaining ETL processing pipelines on cloud-based platforms such as AWS or Azure (Azure Synapse or Azure Data Factory preferred)
- Team player with good interpersonal, communication, and problem-solving skills
Good to have
- Experience with data cataloging in Purview, enabling effective metadata management, lineage tracking, and data discovery
- Candidates should demonstrate the ability to use Purview to ensure data governance, compliance, and efficient data exploration within Azure environments
Others
- Able to work independently on assignments according to the agreed schedule without much supervision
- Own assignments and take the initiative to resolve issues that hinder their completion
- Proactively reach out for help or guidance whenever required
Skills
PySpark
Data Factory
Azure
Big Data
Pipelines
Hadoop
ETL
Data Integration
Data Governance
Data Engineering
SQL
Python
Visualization
Metadata
Power BI
Databases