High Performance Computing Cluster Administration

Company	NVIDIA
Address	Santa Clara
Employment type	Full-time
Salary	$144,000 - $270,250 a year
Expires	2023-09-10
Posted at	9 months ago

Job Description

NVIDIA's Deep Learning Optimized Frameworks Group is looking for a deeply technical HPC cluster administrator to lead a diverse cluster of GPU-accelerated systems and provide architectural mentorship to product teams in the deep learning and scientific computing domains. As a member of the DLFW Infrastructure team, you will provide leadership in the design and implementation of a groundbreaking GPU compute cluster that runs demanding deep learning, high-performance computing, and other computationally intensive workloads. We are looking for an expert to identify architectural changes and/or completely innovative approaches for our GPU compute cluster. In this role, you will help us with the strategic challenges we encounter, including compute, networking, and storage design for large-scale, high-performance workloads and effective resource utilization in a heterogeneous compute environment.

What you'll be doing:

Coordinate storage solutions and plan for growth.
Automate configuration management, software updates, maintenance, and monitoring of system availability using modern DevOps tools (Ansible, GitLab, etc.).
Actively communicate with management about any equipment problems and propose resolutions.
Plan, build, and install or upgrade systems that support NVIDIA DL software.
Administer Linux systems, ranging from powerful DGX servers to embedded systems and from bring-up hardware to publicly available systems.

What we need to see:

Experience with containers (Docker, Singularity, LXC)
Deep understanding of operating systems, computer networks, and high-performance applications
BA, BS, or MS in CS, EE, CE, or equivalent experience
Proven ability to script in Bash, Perl, or Python
Ability to work well with developers and test engineers
Familiarity with resource scheduling managers (Slurm preferred; LSF, etc.)
Dedication to providing quality support for your users
5+ years of experience deploying and administering HPC clusters

Ways to stand out from the crowd:

Experience with mobile and embedded systems
Experience coding/scripting in Perl/Python/Bash
Familiarity with GPU usage in compute clusters and with CUDA
Basic knowledge of deep learning
Familiarity and prior work experience with technologies such as Ansible, Git, Slurm, Zabbix, Prometheus, Grafana, and Docker

The base salary range is $144,000 - $270,250. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
