Engineering Manager, Cloud Platform
Verdigris Technologies
Software Engineering, Other Engineering
Palo Alto, CA, USA
USD 190k-240k / year + Equity
Posted on Apr 14, 2026
GPU racks pull 120-140 kW each, heading toward 600 kW-1 MW per rack by 2027. Design margins in data centers have compressed from 30% to 10-15%. Standard building management systems (BMS) poll at 1-second intervals, but GPU workloads ramp in 8 milliseconds. The gap between what operators can see and what they need to see is widening fast.
Verdigris closes that gap. We build the electrical intelligence layer for AI infrastructure: continuous 8 kHz measurement that detects hidden degradation and validates safe operating headroom. Our platform sits between monitoring infrastructure (Schneider, Eaton, Vertiv) and autonomous controls; we are the validation layer that makes both trustworthy. We are building toward a world where data center operators can safely unlock stranded capacity, prevent failures before they cascade, and ultimately enable autonomous power orchestration for AI workloads.
About 50 people. Series B. Real customers, real revenue, real hardware deployed in colocation facilities running AI workloads. The cloud platform already processes billions of 8 kHz waveform readings from deployed sensors and turns them into validated operating limits that operators use daily. Today that means reliability and early warning. Tomorrow it means capacity optimization and machine-facing orchestration APIs that GPU schedulers consume directly.
We are hiring an Engineering Manager to lead the cloud platform team, the system that makes all three product pillars (Observability, Intelligence, Orchestration) work. You would manage a team of 3-5 engineers, reporting to Jon (co-founder/CTO), with a mandate to grow the team and raise the bar.
Here is the situation. The platform works. Customers depend on it. The 8 kHz ingestion pipeline is real and running in production. But the architecture has grown faster than the team's ability to maintain it cleanly, and the org structure has not kept pace with what we need to build next. AI infrastructure capex is projected at $250-650B, and demand for validated electrical intelligence is accelerating with it. We need someone who can own the platform, organize the team around clear areas of responsibility, and raise the quality of how we build and ship, while also building toward the orchestration layer that does not exist yet.
This is a player-coach role. You will manage people, set direction, and run the engineering operating cadence. You will also read code, debug production issues, and make architectural calls. If you have not been in a codebase recently, this is not the right fit.
First 6 months
- Audit the platform: reliability, scalability, observability, tech debt. Form your own view, not just ours.
- Organize team ownership across the three-pillar stack: Observability (ingestion, 8 kHz data pipeline), Intelligence (ML signal processing, validated operating limits), and the APIs and dashboards that deliver them.
- Stand up an engineering operating cadence: roadmap reviews, incident reviews, delivery planning, architecture reviews.
- Get your hands dirty on the hardest reliability and performance problems. Ship fixes, not just plans.
- Identify hiring gaps and start filling them. Raise the bar on who we bring in.
By 12 months, here is what success looks like
- Platform reliability and deployment velocity are measurably better. Fewer fires, faster fixes.
- The team ships consistently with clear ownership. They do not need you in every decision.
- There is an engineering roadmap people trust, one that connects today's reliability work to the capacity optimization and orchestration capabilities we are building toward.
- Cycle time from issue discovery to production fix has dropped.
- You have made at least two hires who noticeably strengthened the team.
- The platform is positioned to support machine-facing orchestration APIs, the layer where validated intelligence feeds directly into GPU schedulers and demand response systems.
What we are looking for
- You have real technical depth in cloud infrastructure, data systems, or ML platforms. You can review architecture, debug production, and make tradeoffs, not just delegate them.
- You have inherited or built a small team before and made it better. Not by replacing everyone, but by setting clear expectations, building ownership, and coaching people up, or making hard calls when coaching was not enough.
- You can operate without a clean roadmap. Cross-functional dependencies, incomplete requirements, competing priorities. You turn that into a plan with owners and timelines.
- You care about production quality. Observability, incident response, release discipline. You build the habits, not just the systems.
- You are genuinely interested in what happens when AI meets physical infrastructure. Our customers run mission-critical facilities where electrical reliability directly determines whether AI workloads stay online. The validation layer we are building does not exist anywhere else. This is new territory.
Compensation
- Base salary: USD 190,000-240,000 / year
- Equity: 0.15 - 0.30% (Series B, four-year vest)
- Palo Alto-based preferred. Pacific Time overlap required.
Why this role
- You would work directly with the founding team and own the platform that makes the product work.
- The company is small enough that your decisions show up in the product and the culture within months, not years.
- The 8 kHz ingestion pipeline is already running in production. You are not starting from zero. You are taking something real and making it significantly better.
- If you are at a bigger company and wondering whether you will ever get to build something from a position of real ownership, this is that role.