Site Reliability Engineer Mount Technical Group Dallas, TX Full Time

Job Description: Site Reliability Engineer



Overview

The Site Reliability Engineer is responsible for keeping all user-facing services and other production systems running efficiently and smoothly through the effective use of automation and development practices. Your hands-on knowledge of system design, application development, testing, and operational stability will help the team deliver highly reliable products and solutions.



Position Responsibilities and Duties

  • Design, code, test, and deliver software to automate manual operational work
  • Ensure high availability and disaster recovery capabilities across solutions
  • Improve existing monitoring and alerting capabilities to prevent incidents
  • Monitor system capacity and performance
  • Design, build and maintain infrastructure that enables auto-scaling for peak performance
  • Design self-healing and resiliency patterns using Chaos Engineering practices
  • Become a technical leader and contributor to projects, including coding, code reviews, and architectural discussions
  • Assist in debugging production issues across services and levels of the stack
  • Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents


Qualifications

  • Strong verbal and written communication skills and a positive, can-do attitude required
  • DevOps / Infrastructure Engineer with a development background (must possess coding skills)
  • Proficiency in modern programming languages and runtimes – Node.js, Python, Java, PHP
  • Development of automation/monitoring scripts and an understanding of interfaces
  • Working knowledge of infrastructure components (e.g., routers, load balancers, containers, storage, and networks)
  • Expertise in AWS Cloud, CloudFormation, SAM templates, and IaC
  • Experience in automated Quality Assurance techniques and practices
  • Knowledge of performance, monitoring, and telemetry tools
  • Experience in managing DevOps practices and toolsets – CI/CD, Ansible, AWS CodePipeline
  • A Bachelor's degree in IT or equivalent experience in a software engineering discipline


Job Type

  • Full-time

Recommended Skills

Systems Design

PHP (Scripting Language)

Infrastructure

Testing

Storage (Computing)

Application Development
 