Role Overview
We are seeking an experienced Staff Performance Engineer to lead and scale performance engineering practices for our cloud-native SaaS platform. This role is responsible for driving performance, scalability, reliability, and cost efficiency at an organizational level, with a strong focus on serverless and distributed architectures.
You will define performance engineering strategy, build scalable and AI-driven performance platforms, and influence architectural decisions across teams. The role requires deep expertise in modern cloud environments and a commitment to embedding performance into the entire software lifecycle, from development to production.
Key Responsibilities
- Define and drive organization-wide performance engineering strategy aligned with business KPIs, customer experience, and cost efficiency
- Architect and build scalable, self-service performance engineering platforms enabling teams to run performance tests and analysis independently
- Design and implement AI-driven performance engineering solutions including anomaly detection, predictive performance insights, adaptive load testing, and automated optimization recommendations
- Lead the design and execution of advanced performance testing strategies for serverless, distributed, and event-driven systems
- Establish and standardize performance benchmarks, SLAs, SLOs, and KPIs across services
- Drive integration of performance testing and validation into CI/CD pipelines to enable continuous performance engineering (shift-left approach)
- Analyze system-wide performance bottlenecks including latency, cold starts, concurrency limits, and resource utilization across distributed systems
- Collaborate with engineering, SRE, and architecture teams to influence system design for scalability, resilience, and performance optimization
- Own performance in production environments by leveraging observability tools, distributed tracing, and real-time monitoring systems
- Implement intelligent observability solutions using tools such as CloudWatch, Datadog, New Relic, and AI-based monitoring platforms
- Lead capacity planning and scalability initiatives for high-throughput and globally distributed systems
- Drive cost-performance optimization strategies in cloud-native environments (FinOps alignment)
- Mentor and guide engineers across teams, promoting a performance-first culture and best practices
- Stay current with emerging trends in performance engineering, including AI/ML-driven optimization and cloud-native innovations
Desired Skills and Requirements
Must Have
- 8+ years of experience in performance engineering within large-scale SaaS or cloud-native environments
- Performance testing tools - JMeter, Gatling, k6, Locust, or similar
- Serverless architectures - AWS Lambda, API Gateway, event-driven systems
- Performance monitoring and observability tools - CloudWatch, Datadog, New Relic, distributed tracing systems
- Building performance engineering frameworks or platforms at scale
- Performance optimization in distributed and serverless systems - latency, cold starts, concurrency, and scaling behavior
- Integration of performance engineering into CI/CD pipelines
- Programming/scripting - Python (preferred), Java, or similar
- AI/ML-based performance optimization techniques - anomaly detection, predictive analysis, adaptive load modeling
- Cloud platforms (AWS preferred) and performance optimization techniques
- Ability to identify and resolve complex performance bottlenecks
- Large-scale load testing and capacity planning
- Cost-performance optimization in cloud environments
Good To Have
- Kubernetes and containerized workloads alongside serverless architectures
- Chaos engineering and resilience testing
- Internal developer platforms and self-service tooling
- FinOps and cloud cost optimization strategies
- Globally distributed and multi-region architectures
- API performance optimization
- Modern distributed data stores - DynamoDB, Aurora Serverless, NoSQL systems
- AIOps platforms and intelligent observability systems
Soft Skills
- Strong problem-solving and analytical thinking
- Ability to influence architectural and technical decisions across teams
- Excellent communication and stakeholder management skills
- Ownership mindset with the ability to drive cross-functional initiatives
- Mentorship and leadership capabilities
- Ability to operate in a fast-paced, high-growth SaaS environment
Experience
- 8+ years of experience in performance engineering in large-scale SaaS or cloud-native environments
- 3+ years of experience in Senior, Lead, or Staff-level performance engineering roles
- 4+ years of experience in performance testing of large-scale SaaS or distributed systems
- 5+ years of hands-on experience with performance testing tools such as JMeter, Gatling, k6, or Locust
- Experience designing and executing large-scale performance tests in production-like environments
- Experience identifying and resolving performance bottlenecks across application, database, network, and infrastructure layers
- Experience tuning databases for performance at scale
- Experience defining and implementing performance benchmarks, KPIs, and capacity planning strategies
- Experience working with observability and monitoring platforms for performance analysis
- Experience optimizing event-driven and serverless architectures
- Experience influencing architecture and engineering decisions across teams and domains
- Experience operating in fast-paced, high-growth SaaS environments
Education
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- Or equivalent practical experience in performance engineering or cloud-native systems