[Remote] Senior Software Engineer, Distributed Systems - NIM Factory
Job summary
NVIDIA: Remote Senior Software Engineer, Distributed Systems - NIM Factory
NVIDIA is seeking a senior engineer to design and build factory infrastructure and automation for NVIDIA Inference Microservices (NIMs). This remote role, open to candidates in the USA, involves developing a factory pipeline for AI models, collaborating with teams, and mentoring colleagues to improve productivity and performance.
Responsibilities
- Develop a factory pipeline that takes an AI model as input and produces a deployable service validated across cloud, on-prem, and Kubernetes environments
- With the team, define the group's technical strategies and roadmaps and iterate on them rapidly to deliver and improve the NIM factory
- Design interfaces, data models, and schemas, and expand observability over the factory pipeline and its compute infrastructure
- Work with technical leaders designing and developing scalable and reliable factory components
- Collaborate with multiple AI model teams to understand their requirements and build efficient infrastructure that improves every team's productivity
- Define metrics and drive improvements based on user feedback
- Mentor and collaborate throughout the team and with other teams to grow your colleagues and yourself
- Demonstrate a history of learning and of growing both your own skills and those of the people around you
Skills
- Advanced programming skills to build distributed and compute systems, backend services, microservices, and cloud technologies
- Experience working with multi-functional teams, principals, and architects across organizational boundaries
- Experience mentoring and growing teams and team members, with the flexibility to adjust direction and expectations based on customer needs
- Deep technical expertise in distributed containerized applications using technologies such as Docker, K8s, Cloud Endpoints, Helm, and Prometheus
- Passion for building rich microservice applications and build/test automation pipelines
- Excellent interpersonal skills and the ability to lead multi-functional efforts
- Proven experience debugging and analyzing the performance of distributed microservices or cloud systems
- BS or MS in Computer Science, Computer Engineering, or related field (or equivalent experience)
- 8+ years of demonstrated experience developing performant microservices, cloud software, and/or tooling
- Experience delivering event-driven applications using services like Temporal, Kafka, Redis, or others, with a demonstrable ability to discuss their pros and cons
- Experience building and deploying containers for microservice, cloud, and on-prem deployments, along with their associated CI/CD pipelines
- Prior experience in large-scale full-stack development
Benefits
- Equity
- Benefits
Company Overview
NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. Founded in 1993 and headquartered in Santa Clara, California, USA, NVIDIA has more than 10,000 employees. Visit https://www.nvidia.com.
Company H-1B Sponsorship
NVIDIA has a track record of offering H-1B sponsorship. Please note that this does not guarantee sponsorship for this specific role.