Site Reliability Engineer

JOB DESCRIPTION

Maintain systems and troubleshoot system issues.

Identify bottlenecks in various Java applications and implement performance improvements.

Identify and analyze user requirements.

Prioritize, assign, and execute tasks throughout the software development life cycle.

Develop, configure, and deploy tools for cloud-based systems and services.

Containerize new and legacy applications.

Maintain awareness of new and emerging technologies.

Support development and operations teams.

Enhance, modify or debug developer code as needed.

JOB REQUIREMENTS

Must-have

Understanding of an object-oriented language, preferably the latest version of Java (with experience in Hibernate, multithreading, and Spring Boot).

Experience in configuration, Jenkins for CI/CD pipeline creation, automation scripts, and Kubernetes implementation with Google.

Proficiency in supporting a 24×7 critical operation.

Experience in a cloud computing platform and associated automation patterns it provides, preferably GCP.

Proficient in production systems design, including high availability, disaster recovery, performance, efficiency, and security.

Proficient in user, application performance, system, log, and time-series monitoring and dashboarding.

Familiarity with open-source concepts and tools such as Prometheus, Grafana, and ELK.

Proficient in a modern infrastructure automation toolkit such as Terraform or Helm.

Proficient in a Linux or Unix based environment.

Experience in destructive testing methodologies and tools such as Chaos Monkey.

Experience in defensive coding practices and patterns for high availability.

Nice-to-have

Experience in a cloud computing platform and associated automation patterns it provides, preferably GCP

Proficient in a modern scripting language such as Go or Python.

Knowledge of APM fundamentals or experience in tools like New Relic or AppDynamics.

WHAT'S ON OFFER

Negotiable base salary with additional project allowances.

Full salary during probation and full social insurance coverage.

Performance & salary review: twice a year

Monthly childcare support.

Premium healthcare insurance and health check-up services for employees and their family members.

15 days of annual leave, plus 10 days of bereavement leave and 1.5 months of paternity leave.

Premium package at a top gym service provider.

Diverse internal activities: Football, Billiards, Badminton, E-sport clubs & other regular company events.

Frequent opportunities to travel to the US headquarters for 3-6 months.

Free parking for motorbikes and cars.

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666

We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, kindly visit our website, www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Outsource

Technical Skills: DevOps, Java

Location: Ho Chi Minh, Da Nang - Viet Nam

Working Policy:

Salary: Negotiable

Job ID: J01196

Status:
