SRE Lead/Manager (DevOps, AWS)

JOB DESCRIPTION

As a Support Site Reliability Engineer (SRE) leader, you will lead our efforts to establish a Support SRE team that works closely with The Company's Product SRE to increase productivity. The ideal candidate will use leadership and technical skills to streamline the operational tasks that affect Product SRE team efficiency, collaborating with SRE teams located in Japan and Vietnam.
Design and execute the Support SRE team's strategic roadmap.
Collaborate with The Company's Product SRE teams to identify opportunities for improving operational efficiency and reducing toil.
Mentor and coach team members to foster their growth and development in technical and collaboration areas.
Drive a culture of continuous improvement and knowledge sharing within the team.
Design and implement automation solutions to standardize operational tasks, reducing manual effort and improving efficiency.
Develop and maintain tools, scripts and processes to automate routine operational tasks.
Build, maintain, and improve our infrastructure, including monitoring, diagnosing, and resolving incidents promptly.
Participate in incident response, on-call rotations, and post-mortem analysis.

JOB REQUIREMENT

At least 5 years of experience as a DevOps Engineer or in a similar role (experience with on-premises environments is a plus).
3+ years of hands-on experience with AWS or other cloud platforms. Experience with managed AWS services is a plus.
Solid understanding of CI/CD pipelines and best practices.
Working understanding of containerization technologies (Docker and Kubernetes).
Experience with monitoring and logging solutions.
Proficiency with IaC (e.g., Terraform).
Deep understanding and hands-on experience with MySQL or similar relational databases.
Proven track record in training and educating team members, promoting a culture of continuous learning.
Strong ownership and responsibility, with a proactive and solutions-oriented mindset.
Experience in developing and operating web applications built in Go or Ruby is a plus.
Project management experience.
English language proficiency at a professional working level.
People management or team leadership experience is a plus.

WHAT'S ON OFFER

Caring Mental & Physical Recreation:
Hybrid working: 2 days at the office and 3 days WFH
Working hours: Flexible start between 8 AM and 9 AM, Monday to Friday
Full salary during probation
Insurance (applied from the probation period):
Social Insurance, Health Insurance, Unemployment Insurance (on 100% salary)
Private health insurance & accident insurance; from managerial level, extra coverage for family members
Bonus: 13th month salary
17 - 24 paid days off and more
Paternity leave: Extra 5 days
Annual company trip; Quarterly team building
Billiards & Running club
Annual health check
Well-equipped facility: MacBook Pro, additional monitor, etc.
Caring Career & Development:
Clear career path
Sponsorship for foreign-language and international technology certifications
External & internal training courses
Soft-skill workshops
Tech seminars
Monthly and biannual Recognition Awards
Performance & salary review: twice/year (Jun & Dec)

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, please visit our website at www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Product, Fintech
Technical Skills: Devops, AWS
Location: Ha Noi - Viet Nam
Working Policy:
Salary: Negotiation
Job ID: J01508
Status: Close
