Site Reliability Engineer

JOB DESCRIPTION

We are seeking an engineer to ensure the reliability and performance of our client's Data Platform. Successful candidates will work with researchers, operations, and other technology teams to maintain the smooth functioning of our production data pipeline, which is sourced from an enormous and continuously updating catalog of vendor and market data. This engineer will also develop solutions to improve the efficiency and scalability of our ever-growing, business-critical management system.
Operate, monitor, and provision the system to make sure it works smoothly
Provide feedback for system improvement
Provide solutions for live monitoring of the production data pipeline
Design and implement continuous integration and test automation
Deliver release management solutions
Collaborate with engineering, analyst, and research teams to ensure the reliability and operability of new data pipeline components
Analyze and diagnose platform performance and reliability problems
Understand, manage, and use the right technologies for building our platforms, such as Kubernetes, Kafka, and Spark

JOB REQUIREMENTS

Bachelor’s degree in Computer Science or equivalent experience
Excellent analytical skills and a passion for solving problems
Experience in Linux administration; fluent in standard Linux command-line programs
Fluency in Python and its ecosystem (numpy, pandas, etc.) is strongly recommended
Experience in metrics and logs aggregation and analysis with a focus on performance optimization
Understanding of Git and CI/CD concepts
A great support attitude (our job is to make life easier for other teams!)
Strong written and verbal communication skills; fluency in English
Knowledgeable in:
Computer science fundamentals (algorithms and data structures)
Relational databases
Modern service architectures
Experience in the following technologies is relevant: Kafka, Docker, Helm, Kubernetes, GC, AWS, Spark and PySpark, Hadoop, Redis, MySQL, gRPC, Apache Arrow, Apache Airflow

WHAT'S ON OFFER

Competitive and attractive compensation package with a clear career roadmap, where you feel challenged every day
We offer a strong culture of learning and development: training courses, library, speakers, share and learn events
Learn from those who sit next to you! Working in our client's environment, you are surrounded by smart and talented people
Employee resource groups with a strong diversity and inclusion culture
Premium Health Insurance and Employee Assistance Program
Generous time-off policy, unlimited sick days, recreation sabbatical leave (based on tenure), Trade Union benefits for staff and family
Team-building activities every month: local engagement events and monthly team lunches. Employee clubs: football, ping-pong, badminton, yoga, running, PS5, movies, etc.
Annual company trips and occasional global conferences – the opportunity to travel and connect with our global teams
Happy hour with tea breaks, snacks, and meals every day in the office!

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, kindly visit our website, www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Offshore
Technical Skills: DevOps, Kubernetes, Kafka, Python
Location: Ho Chi Minh, Ha Noi - Viet Nam
Working Policy:
Salary: Negotiable
Job ID: J01251
Status: Closed
