Senior System Software Engineer - AI Data Platform - Inference Factory

ABOUT CLIENT

Our client is a leading technology company specializing in graphics processing units (GPUs) and artificial intelligence (AI).

JOB DESCRIPTION

Create infrastructure and tools to automate complex software processes effectively.
Improve performance: Deploy advanced test harnesses, benchmarking frameworks, and analytical tools to thoroughly evaluate and enhance the performance and efficiency of software and hardware platforms.
Utilize expertise in operating systems, kernel internals, device drivers, memory management, storage, networking, and high-speed interconnects to construct and troubleshoot high-performance systems.
Collaborate with engineering teams to comprehend requirements and deliver efficient solutions.
Establish performance objectives, assess feedback, analyze data, and continually enhance system reliability.
Shape technical strategies: Contribute to developing technical strategies and roadmaps for platform automation initiatives to ensure they are in line with company goals and industry best practices.

JOB REQUIREMENTS

Required: Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related technical field, or equivalent experience.
Minimum 5 years of industry experience in software development, focusing on infrastructure, distributed systems, automation, and/or performance engineering.
Proficiency in System-Level Programming: Proven ability to develop robust tools and automation using programming languages such as C++, Python, or Go.
Thorough Understanding of System Software: Experience with operating system internals, device drivers, memory management, and debugging performance issues in complex compute applications.
Distributed Systems Expertise: Experience in designing, building, and operating large-scale distributed systems, with knowledge of networking protocols, cluster management, and high-performance interconnects.
Automation and CI/CD Proficiency: Experience building and maintaining automated testing, benchmarking, and continuous integration/continuous deployment pipelines.
Strong Problem-Solving and Analytical Skills: Outstanding analytical, problem-solving, and debugging skills, with a track record of resolving complex technical challenges.
Collaboration and Communication Skills: Excellent interpersonal and communication skills, with the ability to articulate complex technical concepts to diverse audiences and collaborate effectively across teams.
Preferred Qualifications:
Experience optimizing performance for AI/Machine Learning workloads, especially inference applications, on diverse hardware platforms.
Prior experience building or contributing to large-scale compute infrastructure solutions in cloud environments or on-premises data centers.
Familiarity with containerization and orchestration technologies, such as Docker and Kubernetes.
Knowledge of performance profiling tools and methodologies for hardware and software systems.
Track record of driving significant efficiency gains or architectural improvements in large-scale systems.

WHAT'S ON OFFER

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, please visit our website at www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Product
Technical Skills: DevOps, C/C++, Python, Golang
Location: Ho Chi Minh - Viet Nam
Working Policy: Hybrid
Salary: Negotiable
Job ID: J02058
Status: Active
