Technical Support Specialist - SEDHA CONSULTING PTE. LTD.

SEDHA CONSULTING PTE. LTD., Islandwide, SG

8 days ago

Job description

Roles & Responsibilities

About the Role

We are looking for a skilled and driven Technical Software / Support Engineer (Operations) to join our team. In this role, you will drive our operations and incident management initiatives, ensuring our systems remain robust, scalable, and resilient. You will work closely with cross-functional teams to identify operational gaps and implement solutions that enable seamless deployment, observability, and maintenance of our systems.

Key Responsibilities

Incident Management & Response (60%)

Lead or contribute to incident response efforts during critical system outages and performance degradations
Develop and maintain incident response procedures, runbooks, and escalation protocols
Conduct thorough post-incident reviews and drive implementation of preventive measures
Coordinate cross-functional teams during high-severity incidents
Build and maintain incident management tooling and automation
Manage stakeholder expectations

System Operations & Reliability (20%)

Design, implement, and maintain monitoring, alerting, and observability across our systems

Develop automation tools to reduce manual operational overhead

Ensure system SLAs and SLOs are met consistently

Software Development (10%)

Build internal tools, APIs, and platforms to improve operational efficiency

Create dashboards and reporting systems for operational metrics

Collaboration & Process Improvement (10%)

Partner with development teams to improve system reliability and operability

Establish and refine operational processes and best practices

Mentor team members on incident response and operational procedures

Participate in on-call rotation and provide operational leadership during incidents

Drive continuous improvement initiatives based on operational data and feedback

Required Qualifications

Technical Skills

5+ years of software engineering experience with a focus on operations

Proficiency in at least one programming language (Python, Java / Kotlin, TypeScript or similar)

Experience with modern web application technologies and tools such as PostgreSQL, Kotlin, and AWS

Knowledge of CI/CD pipelines and deployment automation

Experience with AWS and container technologies (Docker, Kubernetes)

Understanding of monitoring and observability tools (Prometheus, Grafana, ELK stack, or similar)

Experience with APM tools (New Relic, Datadog, AppDynamics)

Experience with infrastructure-as-code tools (Terraform, Ansible, CloudFormation)

Background in DevOps or Site Reliability Engineering practices

Experience with log aggregation and analysis tools

Understanding of security operations and compliance requirements

Ability to contribute to system architecture decisions with operational considerations in mind

Operational Experience

Proven experience in incident management and response procedures

Experience with on-call responsibilities and escalation processes

Understanding of system reliability concepts (SLAs, SLOs)

Knowledge of networking, security, and database administration concepts

Experience with configuration management and deployment strategies

Soft Skills

Excellent problem-solving and analytical thinking abilities

Strong communication skills for technical and non-technical audiences

Ability to work effectively under pressure during incident situations

Collaborative mindset when working with cross-functional teams

Detail-oriented approach to documentation and process improvement

Skills

Terraform

Security Operations

Kubernetes

DevOps

AWS

Kotlin

TypeScript

Automation Tools

AppDynamics

Database Administration

Reliability Engineering

Python

Docker

Ansible

Java

Grafana

System Architecture

Incident Management
