Roles & Responsibilities
We are seeking an experienced Technical Subject Matter Expert (SME) with a strong background in onboarding infrastructure devices—including network devices, storage systems, and server hardware—into Logstash-based log monitoring solutions. The ideal candidate will possess deep expertise in ELK Stack administration, data pipeline design, and the integration of diverse infrastructure logs for comprehensive monitoring and analytics across complex environments.
Key Responsibilities
Infrastructure Device Onboarding
Lead the onboarding of network devices (routers, switches, firewalls), storage arrays, and server systems into Logstash pipelines for centralized log collection and monitoring.
Develop and maintain device-specific Logstash configurations for parsing syslog, SNMP traps, and proprietary log formats from various infrastructure vendors.
Integrate log sources from multi-vendor environments (e.g., Cisco, Juniper, Dell EMC, NetApp, HPE), ensuring compatibility and data normalization.
Logstash Pipeline Management
Design, implement, and optimize Logstash pipelines to handle high-volume log ingestion from infrastructure devices.
Apply advanced parsing, enrichment, and transformation techniques to extract actionable metrics and event data from raw device logs.
Implement filtering and tagging strategies to categorize logs by device type, location, and criticality.
Monitoring & Visualization
Create and maintain Kibana dashboards tailored for infrastructure monitoring, providing real-time visibility into device health, performance, and security events.
Develop alerting rules for critical infrastructure events (e.g., interface down, storage threshold breaches, hardware failures).
Automation & Integration
Automate device onboarding and log pipeline deployment using scripting (Shell, Python) and configuration management tools.
Integrate log monitoring with ITSM platforms (e.g., ServiceNow) for automated incident creation and remediation workflows.
Maintain version control of pipeline configurations and scripts using Git.
Operational Excellence
Perform routine health checks and troubleshooting of ELK Stack components and infrastructure log sources.
Collaborate with network, storage, and server teams to ensure comprehensive log coverage and rapid incident response.
Develop and update runbooks and SOPs for infrastructure log onboarding, monitoring, and incident management.
Required Experience
Technical Skills
Switches
Syslog
Version Control
Remediation
Pipelines
VMware
Scripting
Logstash
Configuration Management
Windows
Juniper
Kibana
Routers
ServiceNow
Incident Management
Engineer Islandwide Sg • Islandwide, SG