DevOps-Automation (Open to Remote) First American Folsom, CA

Kate

Administrator
Forum staff
Company Summary

Join a team that puts its People First! First American's National Production Services division provides global title and escrow production support across all channels within First American Title including the Mortgage Services, Commercial, Direct, and Agency divisions. Since 1889, First American (NYSE: FAF) has held an unwavering belief in its people. They are passionate about what they do, and we are equally passionate about fostering an environment where all feel welcome, supported, and empowered to be innovative and reach their full potential. Our inclusive, people-first culture has earned our company numerous accolades, including being named to the Fortune 100 Best Companies to Work For® list for six consecutive years. We have also earned awards as a best place to work for women, diversity and LGBTQ+ employees, and have been included on more than 50 regional best places to work lists. First American will always strive to be a great place to work, for all. For more information, please visit www.careers.firstam.com.

Job Summary

The DevOps (Site Reliability Engineer) applies strong engineering experience and an innate drive to improve existing systems and processes, with the creativity to develop novel solutions to evolving challenges. The role is responsible for the availability, reliability, integrity, and efficient operation of critical platform services and applications, ensuring they meet the requirements of internal and external users. This is achieved by monitoring, maintaining, automating and developing solutions that focus on uninterrupted delivery of applications throughout the software lifecycle.

Essential Functions
  • Measure and monitor availability and overall system and environment health.
  • Build and implement monitoring and recovery tools to provide optimum delivery and resilience.
  • Implement orchestration and tooling solutions to ensure that repetitive administration tasks are performed efficiently and free of defects.
  • Provide primary operational support and engineering for multiple large distributed software applications.
  • Build permanent solutions to mitigate common system failures and bugs.
  • Partner with development and engineering teams to automate and optimize service availability, scalability, performance, monitoring and alerting.
  • May perform complex technical tasks with guidance from senior technical personnel.
  • Provide standard problem identification and resolution support services.
  • Use infrastructure management tools to perform performance monitoring and capacity planning functions.
  • Contribute to the design, development and implementation of departmental workflow processes and procedures.
  • Required to provide on-call support during off-duty hours on weekdays, weekends and holidays on a scheduled/rotating basis.
  • Required to perform duties outside of normal work hours based on business needs.
Knowledge and Skills/Technology Used

Cloud Platform:
  • Good working knowledge of cloud services and architecture. (AWS, Azure)
  • Distributed Systems. (Architectures, micro-services, high availability)
  • Good working knowledge of distributed message buses.
  • Good working knowledge of container computing. (Docker, Kubernetes, Service Mesh)
  • Build and configure Azure and AWS services. (AWS Lambda, Azure Functions)
  • Proxies and Load Balancing. (Nginx, HAProxy, Envoy)
Monitoring and Tools:
  • Knowledge of ServiceNow integrations.
  • Expertise with log event aggregation, metric collection, application monitoring and event handling. (Elastic, SCOM, AppD, Uptrends, AppInsights, CloudWatch)
  • Basic working knowledge of Windows and UNIX/Linux technologies.
  • Basic working knowledge of network triage, packet loss and routing.
  • Understands Service Level Objectives (SLO), Service Level Indicators (SLI), Error Budgeting and Burn Rates.
Development:
  • Applies “everything as code” methodologies across configuration, infrastructure and orchestration.
  • Experience with programming languages. (.NET, C#, C++, Python)
  • Experience with continuous integration tools. (Chef, Ansible, Jenkins, Stash/Git)
  • Experience with configuration management tools. (Puppet, Hiera, Terraform, Terragrunt, Ansible)
  • Knowledge of scripting languages or other tools to enable workflow automation.
Related:
  • Partners with development or engineering teams to automate and optimize service availability.
  • Introduce and test new technologies, tools and systems that enable fast and safe code deployment.
Administrative/Interpersonal Skills (All):
  • Good organization skills to balance and prioritize work assignments.
  • Good verbal and written communication skills.
  • Ability to work as a member of a multi-cultural, multi-location team.
Typical Education
  • Generally requires a BS degree or equivalent work experience.
Typical Range of Experience
  • Employees entering this job typically have 5+ years of hands-on technical experience in system support and application development.
License or Certification
  • Desired Certifications:
  • Cloud foundation (Azure, AWS, GCP, OCI)
  • Infrastructure Management Tools
  • Server and storage technologies
  • Networking
First American invests in its employees' development and well-being, empowers them to provide superior customer service and encourages them to serve the communities where they live and work. First American is committed to diversity and inclusion. We are an equal opportunity employer.

Based on eligibility, First American offers a comprehensive benefits package including medical, dental, vision, 401k, PTO/paid sick leave and other great benefits like an employee stock purchase plan.
 