Senior Engineer (Backend)

ABOUT CLIENT

Our client is a leading research company specializing in technology innovation.

JOB DESCRIPTION

Plan and construct the API Platform with a strong emphasis on reliability, speed, and an exceptional developer experience.
Create user-friendly APIs and SDKs that simplify developers’ access to and integration of multimodal AI capabilities.
Deploy and refine AI/ML models in scalable production environments in close collaboration with research and applied ML teams.
Oversee a modern, cloud-native infrastructure stack built on Kubernetes, Docker, and infrastructure-as-code (IaC) tools.
Ensure platform dependability by developing telemetry, monitoring, alerting, autoscaling, failover, and disaster recovery systems.
Contribute to developer and operations workflows, encompassing CI/CD pipelines, release management, and on-call rotations.
Work collaboratively across teams to implement secure APIs with precise access control, usage metering, and billing integration.
Continually improve platform performance, cost-efficiency, and observability to serve millions of users at global scale.

JOB REQUIREMENTS

At least 3 years of experience developing and operating large-scale production systems with high uptime expectations
Proficiency in containerization and orchestration tools such as Kubernetes, Docker, Helm, and service mesh technologies like Istio or Linkerd
Strong command over cloud platforms like AWS, GCP, or Azure, including expertise in IAM, networking, and serverless tooling
Proficiency in Python and JavaScript/TypeScript, with practical knowledge of backend frameworks such as FastAPI and Express
Deep understanding of API architecture and design patterns, including WebRTC/WebSockets, REST, gRPC, and OpenAPI/Swagger, as well as API versioning and authentication methods such as OAuth2 and API keys
Experience with a range of databases, including PostgreSQL, Redis, and modern vector databases such as Pinecone, Weaviate, or FAISS
Familiarity with CI/CD pipelines, GitOps practices, and relevant tools such as GitHub Actions, Argo CD, or Jenkins
Comfortable with monitoring and observability tools like Prometheus, Grafana, Datadog, or OpenTelemetry
Bonus Experience
Familiarity with MLOps tools and AI workflows, including model versioning, inference pipelines, and model registries
Understanding of access control systems such as ACLs, RBAC, multi-tenant security, and isolation
Knowledge of billing systems, rate limiting, chargeback models, quota enforcement, and usage metering
Previous experience working on developer platforms or API products with a strong emphasis on UX and documentation

WHAT'S ON OFFER

Work remotely in an environment that promotes open-source collaboration
Enjoy 14 days of leave and unlimited sick days
Access to GPUs, AI credits, opportunities for fast career progression, and other perks.

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, please visit our website at www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Product
Technical Skills: Python, NodeJS
Location: Others - Singapore
Salary: Negotiation
Job ID: J01840
Status: Active
