Director, SRE and DevOps, NVIDIA, Santa Clara, CA

Kate

Administrator
Forum staff
NVIDIA is seeking a Director, SRE & DevOps to lead the SRE & Database Engineering teams in operationalizing, visualizing & automating world-class products that solve engineering, collaboration & cloud challenges. You will lead a team of systems & software engineers who build and run large-scale, fault-tolerant systems and services in the Engineering, Collaboration & Database Infrastructure space. Cultural fit is a must: you will need to be a self-motivated, data-driven, results-oriented critical thinker focused on delivering an outstanding user experience.

What you'll be doing:

Lead by example, mentor the team of managers & individual contributors (ICs), establish credibility through quality technical execution, and pitch in with hands-on help and code as needed to keep things running smoothly

Mentor team members, enabling them to deliver high-quality systems & end-user experiences

Own the strategy and development of incident response management and service capacity management through core engineering execution

Help implement automated tests, automated deployments, monitoring, and operational tools along with true observability

Apply engineering leadership and deep knowledge of infrastructure and software development at scale to lead the operation, adoption, and evolution of these services

Strong understanding of software development, debugging, optimization, and/or troubleshooting - hands-on experience with common programming languages preferred

Own AI/ML & CI/CD operations to shorten time to market for products

Develop tight relationships with multiple software development partners, ensuring that product needs are met from an operations perspective

What we need to see:

8+ years of demonstrated ability in site reliability and technical operations

Experience building large, geographically dispersed infrastructure supporting business-critical cloud & on-premises services

BS or MS in Computer Science, a related field, or equivalent experience

7+ years of people management and team leadership experience including headcount planning and developing strong and motivated teams

Experience running AI/ML operations through CI/CD pipelines

Experience with 24/7 site monitoring and the ability to own uptime & performance SLAs

Operational experience at scale - designing and operating highly available, scalable, and fault-tolerant systems using best-of-breed technologies like containers, APIs, Data Platform, etc.

Excellent written and verbal communication, able to collaborate and rally support

Comfortable leading discussions with upper management and experienced in tailoring the level of technical detail to suit the audience

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, results-oriented and enjoy having fun, then what are you waiting for? Apply today!

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
 