DevOps Engineer:
Primary Skill: Grafana, Prometheus, Kubernetes
Primary Location
Alpharetta, GA, United States (must relocate within 2-3 months; local candidates preferred)
- Hands-on experience with Grafana and Prometheus, including various exporters and Blackbox setup.
- Experience working with some of the following technologies: Kubernetes, Docker, Python, AWS, Java, Spring Boot.
- Definition and deployment of systems for metrics, logging, and monitoring
- Hands-on experience with CI/CD pipelines, Python, Groovy, and Jira automation.
- Ensuring availability, performance, security, and scalability of production systems.
- System integration using a wide variety of protocols and formats, including but not limited to REST, SOAP, MQ, TCP/IP, and JSON.
- Ensure scalability & high availability with efficient server virtualization, networking and storage.
- Manage large-scale deployments (thousands of servers, auto scaling, etc.) using Kubernetes container orchestration.
- Create architecture runways that enable rapid recovery, repair, and cleanup of faulty migrations, with the objective of building fault-tolerant systems.
- Create automations to resolve common faults in a system to reduce MTTI/MTTR.
- Identify & enable self-service capabilities for infrastructure & application management tasks
- Strong problem-solving ability, the ability to work under pressure, and strong written and verbal communication skills.
- Solid understanding of systems, data structures, and modern scripting, and a passion for enterprise-level languages and open-source tools.
- Bachelor's degree in Computer Science, Computer Engineering, or relevant technologies
- Good communication skills and troubleshooting skills
- 4+ years of development experience with Kubernetes, Kafka streaming, Docker, and AWS
- Design and develop pub/sub messaging using Kafka and Kubernetes clusters
- Install Kafka and Kubernetes clusters
- Write design specs
- Monitoring and Automation
- Build test scripts and support testing
- Support environment and service issues
- Performance tuning and verification
- Participate in on-call support when necessary