Roles & Responsibilities
Key Responsibilities:
- Provide L2/L3 production support for Java and web-based applications, primarily in banking environments.
- Manage 24/7 application support, including high-severity and high-impact incidents, with a focus on fast recovery and minimal disruption.
- Monitor, troubleshoot, and resolve application and infrastructure issues promptly.
- Perform root cause analysis and post-resolution follow-ups for incidents.
- Participate in disaster recovery drills and major incident investigations.
- Generate and maintain reports based on ticket metrics and application performance.
- Coordinate and execute infrastructure maintenance activities including patching, upgrades, and deployments.
- Manage and maintain Docker containers and batch jobs (e.g., TWS, cron jobs).
- Conduct regular health checks and performance monitoring using tools and scripts.
- Maintain documentation for technical procedures, incident handling, and change management.
- Support CI/CD pipelines and DevOps practices for application releases and change requests.
- Ensure SIT and UAT approval processes are completed, including UAT Business Unit sign-off.
- Collaborate with developers, project teams, vendors, and infrastructure teams to ensure seamless operations.
- Drive effective communication between business and technology regarding production service reliability and performance.
- Champion production resilience and availability, focusing on superior client experience.
- Drive the implementation of Site Reliability Engineering (SRE) and Chaos Engineering for strategic systems.
- Improve system reliability and availability by gathering data and designing for performance.
- Drive continuous improvements in processes and systems using SRE methodologies.
- Provide expert advice and training on technology solutions and advanced reliability techniques.
Requirements
Required Skills & Experience:
- Minimum 10 years of experience in production support, system administration, or application support roles.
- Strong knowledge of Linux / AIX / UNIX, Windows Servers, and cloud platforms (AWS, Azure).
- Proficient in SQL, MySQL, MariaDB, and database batch scripting.
- Experience with WebSphere, WebLogic, Apache Tomcat, and VMware.
- Familiarity with monitoring tools, log analysis, and performance tuning.
- Hands-on experience with DevOps tools and CI/CD pipelines.
- Strong communication and problem-solving skills.
- Ability to work in high-pressure environments and manage multiple priorities.
Education & Certifications:
- Bachelor’s Degree in Science (Computer Science) or a related field.
Skills:
Puppet
Azure
Change Management
Pipelines
MySQL
Root Cause Analysis
Scripting
MariaDB
SQL
Application Support
Reliability Engineering
Apache Tomcat
Windows
System Administration
Banking
Performance Tuning
Disaster Recovery
Databases
Linux
Amazon Cloud