Certified Pega System Architect
Job Summary
Company Type: Offshore
Technical Skills:
Location: Ho Chi Minh - Viet Nam
Working Policy: Hybrid
Salary: Negotiable
Job ID: J01552
Status: Closed
Related Jobs:
Platform Reliability Engineer
Ho Chi Minh - Viet Nam
Outsource
- DevOps
Maintain production reliability of the Linux-based research and trading platform within a globally distributed engineering team. Respond quickly to production infrastructure issues. Comprehend internal client needs and effectively communicate them to regional and global leadership. Identify risks, develop contingency plans, and implement solutions to mitigate them. Enhance the observability platform to monitor the performance and health of critical computing environments. Take part in occasional on-call rotations and support on-call staff during their shifts. Contribute to organizational knowledge through documentation, education, and writing maintainable code.
Negotiable
Storage System Engineer (Linux)
Ho Chi Minh - Viet Nam
Outsource
Monitoring storage performance, capacity, and availability for optimal performance and reliability. Troubleshooting storage-related issues and providing timely resolutions to users. Developing and maintaining scripts and automation tools for storage administration tasks. Performing regular data backup and recovery procedures to ensure data availability.
Negotiable
Principal Engineer, System Software Platform Engineering
Ho Chi Minh, Ha Noi - Viet Nam
Product
- DevOps
- Backend
- AI
Build and operate an AI platform that serves multiple users, handles identity and policy management, configures quotas, controls costs, and gives teams easy paths for working on AI projects. Oversee the deployment of AI models at scale, including routing, autoscaling, and safety measures that ensure reliability and observability. Manage GPU resources in a Kubernetes environment, including device plugins, feature discovery, and scheduling strategies. Own the entire GPU lifecycle, ensuring that driver, firmware, and runtime updates are rolled out safely and consistently. Implement GPU virtualization strategies such as vGPU and PCIe passthrough, and define policies for resource placement, isolation, and preemption. Establish secure traffic and networking, including gateways, service mesh, and authentication/authorization. Improve observability and operational efficiency through GPU monitoring, incident response procedures, and cost optimization. Develop reusable templates, integrate SDKs and CLIs, and implement infrastructure-as-code standards for the platform. Influence the platform's direction by writing design documents, mentoring engineers, and aligning platform development with the needs of AI products.