SRE Lead/Manager (DevOps, AWS)

JOB DESCRIPTION

As a Support Site Reliability Engineer (SRE) leader, you will lead our efforts to establish a Support SRE team that works closely with The Company's Product SRE to increase productivity. The ideal candidate will use leadership and technical skills to streamline the operational tasks that affect Product SRE team efficiency, collaborating with SRE teams located in Japan and Vietnam.
Design and execute the Support SRE team's strategic roadmap.
Collaborate with The Company's Product SRE teams to identify opportunities for improving operational efficiency and reducing toil.
Mentor and coach team members to foster their growth and development in technical and collaboration areas.
Drive a culture of continuous improvement and knowledge sharing within the team.
Design and implement automation solutions to standardize operational tasks, reducing manual effort and improving efficiency.
Develop and maintain tools, scripts and processes to automate routine operational tasks.
Build, maintain, and improve our infrastructure, including monitoring, diagnosing, and resolving incidents promptly.
Participate in incident response, on-call rotations, and post-mortem analysis.

JOB REQUIREMENTS

At least 5 years of experience as a DevOps Engineer or in a similar role (experience with on-premises environments is a plus).
3+ years of hands-on experience with AWS or other cloud platforms. Experience with managed AWS services is a plus.
Solid understanding of CI/CD pipelines and best practices.
Working understanding of containerization technologies (Docker and Kubernetes).
Experience with monitoring and logging solutions.
Proficiency with IaC (e.g., Terraform).
Deep understanding and hands-on experience with MySQL or similar relational databases.
Proven track record in training and educating team members, promoting a culture of continuous learning.
Strong ownership and responsibility, with a proactive and solutions-oriented mindset.
Experience in developing and operating web applications built in Go or Ruby is a plus.
Project management experience.
English language proficiency at a professional working level.
People management or team leadership experience is a plus.

WHAT'S ON OFFER

Caring Mental & Physical Recreation:
Hybrid working: 2 days at the office and 3 days WFH
Working hours: flexible start between 8 AM and 9 AM, Monday to Friday
Full salary during probation
Insurance (applied from the probation period):
Social Insurance, Health Insurance, Unemployment Insurance (on 100% salary)
Private health insurance & accident insurance; from managing level, extra coverage for family members
Bonus: 13th month salary
17 - 24 paid days off and more
Paternity leave: Extra 5 days
Annual company trip; Quarterly team building
Billiards & Running club
Annual health check
Well-equipped facility: MacBook Pro, additional monitor, etc.
Caring Career & Development:
Clear Career path
Sponsorship for foreign-language and international technology-related certifications
External & internal training courses
Soft-skill workshops
Tech seminars
Monthly and biannual Recognition Awards
Performance & salary review: twice/year (Jun & Dec)

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, please visit our website www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Product, Fintech
Technical Skills: DevOps, AWS
Location: Ha Noi - Viet Nam
Salary: Negotiation
Job ID: J01508
Status: Closed

Related Job:

Python Developer (DevOps-focused)

Ho Chi Minh - Viet Nam


Outsourcing

  • Python
  • DevOps

We are looking for a technically strong Python Developer to join our dynamic operations team. In this role, you will be the first line of support for researchers and internal users, managing and resolving issues via our JIRA service desk. You will play a key role in driving operational efficiency through automation and smart tooling, ensuring timely and effective support. You will help manage the flow of issues and resolve them. More complex issues can be redirected to the DevOps team for handling, but the support engineer must still keep ownership. You will also come up with solutions to improve the efficiency of resolving issues, including exploring the use of bots to automate some of the common tasks and writing scripts to programmatically categorize and handle tickets in the JIRA service desk.

Negotiation


Distributed Systems Engineer

Ho Chi Minh - Viet Nam


Product

  • Data Engineering
  • DevOps

Design & build large-scale distributed services for telemetry ingestion, event streaming, and command orchestration across edge and cloud environments.
Implement real-time data pipelines using Kafka, NATS, or gRPC streams, ensuring low-latency, high-throughput processing.
Maintain and optimize stateful services (Redis, InfluxDB, Postgres) for consistency, replication, and failover in multi-region deployments.
Collaborate with embedded, controls, and ML teams to define API contracts, message schemas (Protobuf), and service SLAs.
Develop infrastructure-as-code (Terraform, Helm) and CI/CD workflows to automate testing, security scans, and rolling upgrades.
Monitor & troubleshoot production systems with Prometheus, Grafana, Jaeger, and custom observability tooling to meet 99.99% uptime goals.
Champion best practices in reliability engineering, capacity planning, and incident response for distributed platforms.

Negotiation


QA Lead

Ho Chi Minh - Viet Nam


Wellness and Fitness Services

  • Manual Test
  • Automation Test

Proven experience leading and mentoring a QA team of 10+ engineers, fostering technical growth and collaboration. Actively contributed to scaling the team by interviewing, hiring, onboarding, and delegating responsibilities to optimize team performance. Designed and maintained detailed test plans across features, projects, and release cycles to ensure consistent quality standards. Established a strong team structure with clear ownership, accountability, and alignment to delivery goals. Conducted regular one-on-one meetings and performance reviews to support career development and maintain team motivation. Delivered transparent reporting on bugs, test results, and overall quality status to cross-functional teams. Led test evaluations and produced quality reports for each release, including clear GO/NO GO recommendations. Provided developers with targeted test checklists and test cases to enable effective self-testing prior to QC handoff. Demonstrated a hands-on, proactive mindset with a strong commitment to achieving goals and resolving challenges. Maintained a bias for action, driving momentum and ensuring timely delivery in fast-paced environments.

Negotiation
