Elastic Computing SRE/DevOps

Sunnyvale, CA
Full Time
Experienced
Job Title: Elastic Computing Site Reliability Engineer
Position Type: Full-time
Location: Sunnyvale, CA
Salary Range: $180,000 - $240,000 (USD)

About the Job:
Elastic Compute Service (ECS) is a core product of the company. The Elastic Compute team is dedicated to building world-leading cloud computing infrastructure. As a key component of the self-developed Apsara operating system, ECS provides full-stack computing resources covering virtual machine instances, container services, and heterogeneous computing clusters. Through technological innovation and product optimization, the Elastic Compute team continuously drives advancements in cloud computing technologies, delivering high-quality computing services to users worldwide.
Our goal is not only to support enterprises in achieving elastic scalability but also to deeply empower infrastructure innovation in the new era. Our mission is to build an intelligent foundation of "Computing as a Service," enabling developers and businesses to concentrate on breakthroughs without worrying about the complex engineering implementations from chips to clusters.

About the team:
The Elastic Compute Service (ECS) SRE (Site Reliability Engineering) team is a critical force in ensuring system stability and reliability. The SRE team focuses on guaranteeing the high availability, high performance, and robust stability of ECS products through technical expertise and innovation.
The ECS SRE team is not only a core technical safeguard but also a driver of technological innovation and continuous optimization. By leveraging technical capabilities and collaborative teamwork, we ensure the stability and reliability of ECS products, safeguarding global customers' businesses. Additionally, we are committed to advancing cloud computing technologies through knowledge sharing and industry collaboration.
Joining the ECS SRE team offers the opportunity to engage in the development and optimization of world-leading cloud computing technologies, while growing alongside a passionate and creative team.

Job Description:
This is an SRE or DevOps position focused on the entire Elastic Computing product line. The responsibilities of this role include:
• Stability, Performance Optimization, Monitoring, and Operations: Oversee the stability, performance optimization, monitoring, and operational work for multiple core products (such as ECS, ACK, ACS, heterogeneous computing clusters, OOS, and Compute Nest), taking responsibility for the online stability of these products.
• Operations System and Online System Development: Engage in the development of operations systems and some online systems. Through tooling, process optimization, and system improvements, ensure the stability and performance of Elastic Computing products.
• Customer and Team Collaboration: Work closely with other teams (such as R&D, after-sales support, etc.) to ensure efficient technical support and problem resolution.
• Candidates can choose to take responsibility for one or more core duties based on their expertise. Meanwhile, we are looking for experts who possess cross-team collaboration skills and system-level thinking abilities.

Responsibilities:
• Drive Customer Onboarding: Guide customers through the onboarding process for our behavior analytics platform — including data collection design, SDK and event validation, and dashboard setup.
• Be the Analytics Expert: Act as a strategic partner to customers, helping translate business goals into measurable data strategies across collection, analysis, and reporting.
• Enable Data-Driven Culture: Provide ongoing consultation to support customers' digital transformation and promote best practices for data-informed decision-making.
• Industry-Focused Guidance: Research and curate analytics best practices tailored to game genres, lifecycle stages (soft launch, LiveOps), and business models (IAP, IAA, hybrid).
• Deep-Dive Analytics: Apply descriptive and diagnostic analysis to uncover gameplay trends, monetization bottlenecks, and user segmentation opportunities.
• Customer Support & Training: Be responsive to product questions, troubleshooting requests, and on-site or remote training needs.
• Knowledge Sharing: Contribute to internal knowledge bases covering analytical techniques, success stories, and game-specific metric frameworks.

Qualifications:

Minimum qualification:
• Professional Knowledge and Experience
o Bachelor's degree or higher in Computer Science, Information Technology, or a related field.
o At least 3 years of experience in system operations or SRE, with familiarity with cloud computing services and core products (e.g., ECS, Kubernetes, heterogeneous computing).
o Familiarity with the design and optimization of cloud resource provisioning and delivery systems; experience in serving overseas customers is preferred.
o In-depth understanding of the overall architecture and operational mechanisms of the elastic computing product line, with the ability to quickly identify and resolve complex issues.

Preferred qualification:
• Possession of cloud-related certifications (e.g., ACP, ACE, or other major cloud vendor certifications).
• Participation in the architectural design or performance optimization projects of large cloud platforms.
• Outstanding contributions in system stability assurance, automation tool development, or cloud-native domains are highly valued.

About Us:
Founded in 2009, IntelliPro is a global leader in talent acquisition and HR solutions. Our commitment to delivering unparalleled service to clients, fostering employee growth, and building enduring partnerships sets us apart. We continue leading global talent solutions with a dynamic presence in over 160 countries, including the USA, China, Canada, Singapore, Japan, Philippines, UK, India, Netherlands, and the EU.
IntelliPro, a global leader connecting individuals with rewarding employment opportunities, is dedicated to understanding your career aspirations. As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at https://intelliprogroup.com/.

Compensation: The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.