Senior Platform Engineer (Core Platforms)

ABOUT CLIENT

Our client is a global investment firm.

JOB DESCRIPTION

Providing Architecture-as-a-Service solutions built on open-source software.
Collaborating with the team on supported products.
Creating tools to enhance deployment and monitoring of services in a distributed environment.
Collaborating with teams to develop innovative software solutions for DevOps and Agile transformation.
Participating in a periodic on-call rotation.

JOB REQUIREMENTS

Advanced proficiency in Linux-based systems.
Strong skills in Python for building APIs, developing automation scripts, and creating operational tools.
Experience with modern authentication and authorization protocols such as OIDC, SAML, and Kerberos for secure access to platforms.
Knowledge of container orchestration with Kubernetes (K8s) is preferred. Experience with other cluster management systems like Slurm is a plus, especially in large-scale computational environments.
Proven experience with Infrastructure as Code (IaC) to automate the provisioning and management of infrastructure, using tools like Terraform, Ansible, and Helm.
Hands-on experience with containerization systems such as Docker or Podman.
Familiarity with CI/CD and with GitLab (preferred), GitHub, or Git.
Agile development experience.
Strong background in observability, including monitoring and logging using tools like Prometheus, Grafana, and the OpenTelemetry (OTEL) stack.
Experience using AI/LLM-powered tools, and an understanding of their underlying concepts, to support development workflows is a plus.
A collaborative team attitude.
Strong written and verbal communication skills in English.

WHAT'S ON OFFER

Competitive compensation package with clear career progression
Emphasis on continuous learning and development through training courses, library access, speaker sessions, and knowledge sharing events
Opportunity to collaborate with intelligent and talented colleagues
Support for diversity and inclusion through employee resource groups
Premium health insurance and Employee Assistance Program
Generous time-off policy and sabbatical leave based on tenure
Employee benefits through the Trade Union for staff and their families
Monthly team-building activities and employee clubs for various interests
Annual company trip and occasional global conferences to connect with global teams
Daily tea break, snacks, and meals provided in the office

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new career opportunity, please visit our website at www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Product
Technical Skills: DevOps, Backend
Location: Ho Chi Minh, Ha Noi - Viet Nam
Working Policy: Onsite
Salary: Negotiable
Job ID: J00574
Status: Active
