Role: Senior Cloud Architect - GPU Infrastructure & AI Platforms
Function: Cloud Infrastructure & Platform Engineering
Location: Singapore
Type: Full-time
Compensation: Not specified
Industry: Information Technology & Services
About Company
A deeptech startup building the next generation of AI-native infrastructure. As a Zero-Trust and Confidential-by-Design hyperscaler, we are developing the Kluisz Secure Fabric™️ to redefine how the world handles massive-scale compute.
We are looking for a visionary Senior Cloud Architect to lead the design, deployment, and operation of our large-scale GPU cloud infrastructure in Singapore. You will own the end-to-end GPU platform architecture—from hardware and cluster design to Kubernetes scheduling and customer solutioning.
Position Overview
You will architect and deploy GPU clusters of 2,000+ GPUs that power the next generation of AI infrastructure, bridging specialized hardware teams and AI engineers to deliver production-ready cloud solutions. This role defines the technical direction of a hyperscaler fleet from the ground up, with high ownership and direct impact on global cloud infrastructure.
Role & Responsibilities
- Design and evolve large-scale GPU clusters (2,000+ GPUs) for AI training and inference workloads
- Define GPU server topology including PCIe/NVLink configurations and high-speed networking architecture
- Architect and oversee Kubernetes-based GPU platforms with scheduling, isolation, and multi-tenant strategies
- Support deployment and optimization of AI models while troubleshooting complex distributed training issues
- Evaluate and select GPU, server, and networking vendors with technical input for RFPs
- Lead customer engagements for solution architecture and provide technical leadership
- Mentor infrastructure and platform engineers while driving capacity planning initiatives
Must Have Criteria
- 10-15 years of experience in cloud infrastructure, platform engineering, or systems architecture
- Proven experience designing or operating GPU clusters with at least 2,000 GPUs
- Strong production experience with Kubernetes specifically for GPU workloads and resource scheduling
- Deep understanding of GPU architecture and compute optimization across different GPU vendors
- Experience with high-performance networking protocols (InfiniBand or RoCE) for GPU clusters
- Hands-on experience with distributed storage systems optimized for AI workloads
- Experience in customer-facing solution architecture or technical leadership roles
Nice to Have
- Experience with AI/ML frameworks and large-scale training platforms (PyTorch, TensorFlow)
- Exposure to multi-cloud or hybrid cloud environments
- Track record of improving GPU utilization and cost efficiency at scale
- Experience with container orchestration tools beyond Kubernetes (Docker Swarm, Nomad)
- Background in hyperscaler or cloud service provider environments
What We Offer
- Opportunity to build next-generation AI infrastructure from the ground up
- Leadership role in a cutting-edge Zero-Trust and Confidential-by-Design hyperscaler
- Work with state-of-the-art GPU technology and secure fabric architecture
- High ownership and impact in shaping global cloud infrastructure
- Collaborative environment focused on redefining massive-scale compute

