NVIDIA is searching for a DevOps and Infrastructure Software/Systems Engineer to bring up, develop, and prototype a new class of server products and appliances for our Metropolis platforms. Data is the lifeblood of the modern city. Today, it’s being captured by over 500 million cameras worldwide, and that number is growing exponentially. This is creating a tsunami of information that’s impossible for humans to analyze. AI is the key to turning this information into insight. It’s redefining how we collect, inspect, and analyze data to impact everything from public safety, traffic, and parking management to law enforcement and city services. NVIDIA Metropolis is leading this AI revolution, providing the tools, technologies, and expertise to meet every challenge with smarter, faster applications.
This exciting role requires someone who can architect the build and deployment process for our compute servers, which help bring the application of artificial intelligence and machine learning for streaming video and data analytics to market. Practical experience in the use and administration of server virtualization technology will be highly helpful. An understanding of complex applications built on both on-prem and cloud infrastructure, across operating systems, device classes, and cloud services, is a prerequisite. Your ability to automate all aspects of a modern code delivery and deployment pipeline using source code management tools, build tools, test automation tools, containerization, configuration management tools, monitoring tools, and orchestration will be critical to your success.
What you'll be doing:
As a key member of our Metropolis team, you will build, deploy, and maintain GPU-based servers for use in Metropolis platforms and machine learning applications across test, development, and production environments
Lead the design of, and be responsible for, infrastructure components covering network topologies, streaming servers, and security
Collaborate with software, IT, security, and hardware teams across geographies to solve critical problems and performance issues
Establish the configuration environment for these servers by creating processes and tools that can be widely deployed in the industry for software development, debugging, testing, benchmarking, and documentation
Automate provisioning and management of bare-metal servers, internal cloud, Microsoft Azure, and Amazon Web Services (AWS)
Implement automated monitoring and operating procedures for a range of domains across on-premises and cloud environments
Build and maintain the infrastructure for delivering software artifacts produced by Metropolis application development teams
Create detailed documentation that allows customers, partners, and system integrators to replicate the prototyped deployment architecture
What we need to see:
BS or MS in Computer Science, Computer Engineering, Electrical Engineering, or a related field (or equivalent experience)
6+ years of proven experience in configuration management and Linux server administration in an engineering hardware lab environment
Good programming skills in Python, shell scripting, Ansible, Terraform, and Helm templates
Good understanding of configuring and managing Elasticsearch, Logstash, Kibana, and the Kafka ecosystem
Software build, packaging, and delivery skills with Jenkins, pipeline scripting, Dockerfiles, Artifactory integration, container registries, and Helm package repositories
Good understanding of the Kubernetes ecosystem and Helm-based application deployment patterns
Infrastructure provisioning automation with AWS, GCP, and Azure
Experience building configuration management, monitoring, and automation tools
Familiarity with managing edge servers deployed at large scale in indoor and outdoor environments
Strong interpersonal skills
Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/
NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.