Bangalore
Splunk SME - SPL, AppDynamics/Dynatrace/Nagios/Zabbix, Linux, Bash/Python

Splunk Administration & Engineering

- Serve as the SME for Splunk architecture, deployment, and configuration across the enterprise.
- Maintain and optimize Splunk infrastructure, including indexers, forwarders, search heads, and clusters.
- Develop and manage custom dashboards, reports, saved searches, and visualizations.
- Implement and tune log ingestion pipelines using Splunk Universal Forwarders, HTTP Event Collector (HEC), and other inputs.
- Ensure high availability, scalability, and performance of the Splunk environment.
- Create advanced dashboards and visualizations for various teams to monitor application health and performance.
- Apply expertise in SPL (Search Processing Language) and Splunk architecture, including configuration files and parsing.
- Monitor and troubleshoot applications using tools such as AppDynamics, Splunk, Grafana, Argos, and OpenTelemetry (OTEL).
- Design dashboards to detect network issues, monitor microservice health, and configure alerting.
- Apply excellent debugging and triaging skills in large-scale, distributed systems.
- Develop and maintain documentation, runbooks, and usage guidelines for multi-cloud and microservices platforms.
- Optimize search queries and improve performance using summary indexing and other techniques.
- Manage and monitor the performance of the Splunk infrastructure.
- Develop a long-term strategy and roadmap for integrating AI/ML tooling across the Splunk portfolio.
- Diagnose and resolve network-related issues affecting CI/CD pipelines, including DNS, firewall, proxy, and SSL/TLS problems, using tools like tcpdump, curl, and netstat.

Enterprise Monitoring & Observability

- Design and implement comprehensive monitoring solutions integrating Splunk with tools such as AppDynamics, Dynatrace, Prometheus, Grafana, and SolarWinds.
- Collaborate with application, infrastructure, and security teams to define monitoring KPIs, SLAs, SLOs, and thresholds.
- Build end-to-end visibility into system performance, application health, and user experience.
- Integrate Splunk with ITSM tools (e.g., ServiceNow) for event and incident management automation.

Operations, Troubleshooting & Optimization

- Perform data onboarding, field extraction, and parsing for structured and unstructured data.
- Support incident response and root cause analysis using Splunk for troubleshooting and diagnostics.
- Audit and optimize search performance, data retention policies, and index lifecycle configurations.
- Create runbooks, SOPs, and documentation for Splunk usage and monitoring best practices.

Required Qualifications

- 5+ years in IT infrastructure, DevOps, or monitoring roles.
- 3+ years of hands-on experience with Splunk Enterprise as an admin, architect, or engineer.
- Experience designing, deploying, and managing large-scale, multi-site Splunk environments.
- Strong command of SPL (Search Processing Language) and dashboard/report development.
- Proficiency in Linux environments, scripting languages (e.g., Bash, Python), and REST APIs.
- Experience with enterprise monitoring tools and their integration with Splunk (AppDynamics, Dynatrace, Nagios, Zabbix, etc.).
- Understanding of logging, metrics, and distributed tracing in modern IT environments.
- Strong grasp of network protocols, system logs, and application telemetry.