SRE Lead/Manager (DevOps, AWS)

JOB DESCRIPTION

As a Support Site Reliability Engineer (SRE) leader, you will lead our efforts in establishing a Support SRE team that works closely with The Company's Product SRE team to increase productivity. The ideal candidate will use leadership and technical skills to streamline the operational tasks that affect Product SRE team efficiency, collaborating with SRE teams located in Japan and Vietnam.
Design and execute the Support SRE team's strategic roadmap.
Collaborate with The Company's Product SRE teams to identify opportunities for improving operational efficiency and reducing toil.
Mentor and coach team members to foster their growth and development in technical and collaboration areas.
Drive a culture of continuous improvement and knowledge sharing within the team.
Design and implement automation solutions to standardize operational tasks, reducing manual effort and improving efficiency.
Develop and maintain tools, scripts and processes to automate routine operational tasks.
Build, maintain, and improve our infrastructure, including monitoring, diagnosing, and resolving incidents promptly.
Participate in incident response, on-call rotations, and post-mortem analysis.

JOB REQUIREMENT

At least 5 years of experience as a DevOps Engineer or in a similar role (experience with on-premises environments is a plus).
3+ years of hands-on experience with AWS or other cloud platforms. Experience with managed AWS services is a plus.
Solid understanding of CI/CD pipelines and best practices.
Working understanding of containerization technologies (Docker and Kubernetes).
Experience with monitoring and logging solutions.
Proficiency with IaC (e.g., Terraform).
Deep understanding and hands-on experience with MySQL or similar relational databases.
Proven track record in training and educating team members, promoting a culture of continuous learning.
Strong ownership and responsibility, with a proactive and solutions-oriented mindset.
Experience in developing and operating web applications built in Go or Ruby is a plus.
Project management experience.
English language proficiency at a professional working level.
People management or team leadership experience is a plus.

WHAT'S ON OFFER

Caring Mental & Physical Recreation:
Hybrid working: 2 days at the office and 3 days WFH
Working hours: flexible start between 8 AM and 9 AM, Monday to Friday
Full salary during probation
Insurance (applied from the probation period):
Social Insurance, Health Insurance, Unemployment Insurance (on 100% salary)
Private health insurance & accident insurance. From manager level: extra coverage for family members
Bonus: 13th month salary
17 - 24 paid days off and more
Paternity leave: Extra 5 days
Annual company trip; Quarterly team building
Billiards & Running club
Annual health check
Well-equipped facilities: MacBook Pro, additional monitor, etc.
Caring Career & Development:
Clear Career path
Sponsorship for foreign language & international technology-related certifications
External & internal training courses
Soft-skill workshops
Tech seminars
Monthly and biannual Recognition Awards
Performance & salary review: twice/year (Jun & Dec)

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career path, kindly visit our website www.pegasi.com.vn for your reference. Thank you!

Job Summary

Company Type: Product
Technical Skills: DevOps, AWS
Location: Ha Noi - Viet Nam
Working Policy:
Job ID: J01508
Status: Close
