Software Engineer - Splunk/AWS
What you’ll be doing:
You will join the Enterprise Monitoring team as a peer to 3 other engineers (plus a Product Owner and a Manager), working to support, maintain, and iteratively improve our Splunk logging platform (3-4 TB/day), which runs entirely in AWS. You'll help our internal partners across DevOps and Enterprise Apps adopt Enterprise Monitoring standards and best practices, and assist these teams in onboarding and standardizing their monitoring and alerting. You'll contribute to an Enterprise Monitoring standard for Logging and Time Series Metrics adoption. You'll help the organization achieve its observability goals and further our adoption of SRE as teams across the org identify, instrument, and operationalize SLOs. You may also contribute to the management and evolution of our Prometheus metrics platform, which encompasses Prometheus, Trickster, Thanos, and Grafana.
Daily activities might include:
- Being a collaborative, outcome-focused, open-minded teammate
- Infrastructure maintenance and application upgrades
- Troubleshooting data inputs
- Assisting internal partners with configuring alerting in alignment with our predefined alert management pipeline
- Ad-hoc support tickets
- Installing and configuring Splunk applications on an as-needed basis
- Iteratively improving logging and parsing to make our data more usable
- Creating data summaries, dashboards, and alerts
- Troubleshooting and remediating monitoring platform performance issues as needed
- Creating and updating documentation
Notes from the hiring team:
I’m looking to add an engineer to my Enterprise Monitoring team. The ideal candidate would have over a year of experience managing Splunk infrastructure/deployments. This would entail building and running the instances the application stack runs on, in addition to managing the installation and maintenance of Splunk apps and data inputs. They would have a strong grasp of the Splunk query language and experience troubleshooting data quality issues.
Cloud experience is highly recommended, particularly AWS. This extends to managing apps and infrastructure as code (we use Terraform). Experience with IAM is a plus. Experience with industry DevOps tools is required, including familiarity with CI/CD. The candidate should be comfortable operating as an admin from the OS ‘up’ the stack; experience with Python, Bash, and PowerShell is recommended.
Experience with Prometheus, Grafana, PagerDuty, SolarWinds, or Catchpoint is nice to have.
Note: If your Splunk experience is limited to viewing dashboards and creating ad hoc queries, it is likely not deep enough for this role.