Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications.

As we continue to grow, we're looking for a skilled GPU Systems Engineer (CUDA) to join our dynamic team and contribute to our mission of transforming business processes through technology.

This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Position Details

Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies Statement of Work (SOW) engagement (no third-party client or vendor)
Experience: 6+ years
Salary: 100k - 150k
Employment Type: Full-time, direct W2 (no C2C, no 1099, no third-party)
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.

Employment Terms & Visa Policy

This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies. This role is part of our in-house Statement of Work (SOW) engagement. We do not engage in C2C, 1099, or third-party arrangements.

Candidates must be willing to work directly as a full-time W2 employee. While no new H1B sponsorship is available, we support H1B transfers for qualified candidates. A technical coding assessment is mandatory.

Job Summary

We are seeking a GPU Systems Engineer with deep expertise in CUDA programming, GPU architecture, and high-performance computing to design and optimize compute-intensive workloads. This role focuses on extracting maximum performance from GPU platforms for AI training, inference, scientific computing, and high-throughput data processing.

Key Responsibilities

Design and implement high-performance CUDA kernels for compute-intensive workloads.
Profile and optimize GPU code using Nsight Systems, Nsight Compute, and CUDA profilers.
Tune memory access patterns, occupancy, register usage, and shared memory utilization.
Develop highly optimized libraries for linear algebra, attention, and other ML primitives.
Optimize multi-GPU and multi-node training using NCCL, RDMA, and high-performance networking.
Implement custom operators and fused kernels in PyTorch, JAX, or Triton.
Collaborate with ML engineers to identify performance bottlenecks.
Develop benchmarks and regression tests to safeguard performance.
Evaluate new GPU architectures and advise on adoption strategy.
Implement mixed-precision and quantized compute paths.

Required Qualifications

Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field.
6+ years of experience in GPU programming and performance engineering.
Deep expertise in CUDA C/C++ and GPU programming models.
Strong understanding of modern GPU architectures, memory hierarchies, and execution models.
Hands-on experience profiling and optimizing GPU workloads in production.
Familiarity with NCCL, MPI, and high-performance interconnect technologies.
Experience integrating custom kernels into ML frameworks.
Strong C++ skills and familiarity with modern systems programming practices.
Solid grounding in linear algebra and numerical methods.

Preferred Qualifications

Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks.
Familiarity with TensorRT, FasterTransformer, or vLLM internals.
Exposure to compiler infrastructure such as LLVM or MLIR.
Open-source contributions to GPU or ML performance libraries.
Experience with large-scale distributed training infrastructure.

How To Apply

For immediate consideration, please send your resume to [email protected] or contact us at (908) 676-4399. Learn more at www.bvteck.com.

Bright Vision Technologies is an equal opportunity employer. We do not discriminate on the basis of any protected attribute.
