Systems Engineer
Job Requirements:
• Build, deploy, and manage the enterprise Lucene DB systems (Splunk and
Elastic) so that legacy physical systems, virtual systems, and container
infrastructure for business-critical services receive high-quality, highly
available logging services.
• Support periodic observability and infrastructure monitoring tool releases
and upgrades, environment creation, and performance tuning of large-scale
Prometheus systems.
• Serve as dev, ops, and SRE for the internal observability systems in the
Client's various data centers across the globe, including cloud environments.
• Lead the evaluation, selection, design, deployment, and advancement of the
portfolio of tools used to provide infrastructure and service monitoring.
Ensure the tools provide critical visibility into modern architectures that
leverage technologies such as cloud and containers.
• Build and grow the scope and capabilities of the Enterprise Monitoring team
with a top-down, service-driven focus. Ensure methodologies keep pace with the
shifts & transformations taking place within IT.
• Ensure the monitoring team increases its use of automation and adopts a
DevOps/SRE mentality.
Qualifications:
• 10+ years of experience with enterprise system logging and monitoring tools,
with a desired 5+ years in relevant critical infrastructure running
Elasticsearch, ECE, Open Distro for Elasticsearch, and enterprise Splunk
• Experience with designing and engineering solutions to monitor critical
systems and container infrastructure across a wide array of technologies and
platforms
• In-depth experience managing monitoring tools such as Prometheus, Grafana,
commercial APMs, Nagios, SCOM, Zabbix, Sysdig, and BMC Patrol
• Strong knowledge of open-source logging and monitoring tools
• Experience with container logging and monitoring solutions
• Experience with Windows and Linux operating system management and
administration
• Familiarity with LAN/WAN technologies and a clear understanding of basic
network concepts and services
• Strong understanding of multi-tier application architectures and application
runtime environments
• Experience with monitoring infrastructure in cloud platforms such as AWS and
Azure is desired
• Knowledge of Python and other scripting languages and infrastructure
automation technologies such as Ansible is desired
• CKA (Certified Kubernetes Administrator) or CKAD is a plus
Req #: 115774
Job Id: 3717-1
Category: IT / Software Development
Job Type: Contract
Job Status: Other
Experience Level: Experienced (Non-Manager)
Location: TX, Austin