CommerceIQ
Company Overview
At CommerceIQ, we help consumer brands accelerate their retail ecommerce market share growth and profitability through machine learning algorithms. We are building the world’s most complete and sophisticated Retail Ecommerce Management Platform, which connects and intelligently automates the management of retail ecommerce channels like Amazon, Walmart, and Instacart, across the entire ecommerce operational chain of retail media management, sales operations, supply chain, and digital shelf analytics.
We are in hyper-growth mode, having recently raised our Series D funding at a unicorn valuation (>$1B) and ended our third year of triple-digit revenue growth. Continued acceleration of our growth is fueled by landing new customers, expanding our platform through new products, managing new retail ecommerce platforms, and delivering exceptional customer service to unlock high net retention rates.
DevOps Lead:
Top consumer brands like Nestle, Kimberly-Clark, Nature's Bounty, Johnson & Johnson, Mondelez, and Kellogg, to name a few, rely on CommerceIQ's suite of products to make efficient business decisions on a daily basis. It is critical to have services that run high-quality data and algorithms to drive business decisions for our customers in a timely manner. As a DevOps lead, you are responsible for building scalable, extensible, and secure cloud infrastructure for CommerceIQ’s applications and data science teams.
In this role, you will work with a couple of Infra/DevOps SDEs and partner with engineering managers to achieve CommerceIQ’s QoQ goals. As a DevOps lead, you will play a crucial role in the upkeep of security, cost, and deployment infrastructure, and in maintaining and developing the services owned by the Infra/DevOps team.
A successful candidate will be obsessed with technology and relentlessly raise the bar on the architecture, design, and quality of code delivered while aggressively pursuing optimizations to meet cost and scale SLAs. The candidate should be capable of managing a fast-paced delivery schedule, of influencing and driving a high-level engineering strategy with the leadership, and of taking a hands-on approach to implementing that strategy.
Functional level Expectations
Required Skills
Below is a brief description of the charter for the Infra team (subject to product/business deliverables).
Prometheus / Grafana
K8s metrics and deployment metrics
We use New Relic (NR) for log collection and operational metrics. However, certain shortcomings of NR, and problems that Prometheus/Grafana already solve, might force us to move away from NR. Apart from the Prometheus track, the lead will have to work with all stakeholders [apps/DS] to migrate from OCD to K8s, with all checks and balances in place.
EFK stack
Log collection and analysis is done via New Relic, and its cost will grow as we move more applications to K8s. We need to optimise our usage or explore options such as the EFK stack for logs, Sumo Logic, etc.
Security audit
We need to be proactive and do frequent security audits of our applications. We need a security center that holds all our documentation, including a place to store the docs and questionnaires we have filled out for our clients.
BCP
We need to audit all our services for disaster recovery and develop a BCP across the company.
CI/CD
The lead is expected to enforce git branching and set code coverage and style check guidelines across teams and projects, e.g. publishing the guidelines for Java, NodeJS, and Python projects, establishing average code coverage metrics, and running static code analysis, FindBugs, and other security code scanning tools.
Infra services
The Infra team also owns many services, including crucial ones like BSS (CSS). These infra services should set the benchmark for DevOps best practices in the org.
Azkaban
Azkaban is our ETL orchestrator. It is run in-house with custom code deployed, and it needs ongoing maintenance to keep pace with our growth. The lead is also expected to explore and guide the use of other orchestrators such as Airflow, AWS Step Functions, and Astronomer.
AWS admin and access / SF admin and access / AWS/SF cost, unit cost for client
Constant efforts to identify and reduce costs
AWS partnership [case studies, blogs, trainings, tech conf talks]
Cost savings across all our infrastructure.
SOC2 and other compliance
Automate our SOC2 process with diligence. This becomes more important as we expand to other pillars of SOC2 and ISO.