Site Reliability Engineer

JOB DESCRIPTION

Maintain systems and troubleshoot system issues.
Identify bottlenecks in various Java applications and implement performance improvements.
Identify and analyze user requirements.
Prioritize, assign, and execute tasks throughout the software development life cycle.
Develop, configure, and deploy tools for cloud-based systems and services.
Containerize new and legacy applications.
Maintain awareness of new and emerging technologies.
Support development and operations teams.
Enhance, modify or debug developer code as needed.

JOB REQUIREMENTS

Must-have
Understanding of an object-oriented language, preferably the latest version of Java (with experience in Hibernate, multithreading, and Spring Boot).
Experience configuring Jenkins for CI/CD pipeline creation, writing automation scripts, and implementing Kubernetes with Google.
Proficiency in supporting a 24×7 critical operation.
Experience with a cloud computing platform and the automation patterns it provides, preferably GCP.
Proficient in production systems design, including high availability, disaster recovery, performance, efficiency, and security, as well as user, application performance, system, log, and time-series monitoring and dashboarding.
Familiarity with open-source concepts and tools such as Prometheus, Grafana, and ELK.
Proficient in a modern infrastructure automation toolkit such as Terraform or Helm.
Proficient in a Linux- or Unix-based environment.
Experience in destructive testing methodologies and tools such as Chaos Monkey.
Experience in defensive coding practices and patterns for high availability.
Nice-to-have
Experience with a cloud computing platform and the automation patterns it provides, preferably GCP.
Proficient in a modern scripting language such as Go or Python.
Knowledge of APM fundamentals or experience in tools like New Relic or AppDynamics.

WHAT'S ON OFFER

Negotiable base salary with additional project allowances.
Full salary during probation and full social insurance coverage.
Performance and salary review twice a year.
Monthly childcare support.
Premium healthcare insurance and health check-up services for employees and their family members.
15 days of annual leave, plus 10 days of bereavement leave and 1.5 months of paternity leave.
Premium package at a top gym service provider.
Diverse internal activities: football, billiards, badminton, and e-sports clubs, plus other regular company events.
Frequent opportunities to travel to the US headquarters for 3 to 6 months.
Free parking for motorbikes and cars.

CONTACT

PEGASI – IT Recruitment Consultancy | Email: recruit@pegasi.com.vn | Tel: +84 28 3622 8666
We are PEGASI – IT Recruitment Consultancy in Vietnam. If you are looking for a new opportunity in your career, please visit our website, www.pegasi.com.vn. Thank you!

Job Summary

Company Type: Outsource
Technical Skills: DevOps, Java
Location: Ho Chi Minh, Da Nang - Viet Nam
Working Policy:
Salary: Negotiable
Job ID: J01196
Status: Closed
