Whitepaper · June 2026
Cloud-Agnostic Supercompute.
A cost and architecture model for escaping hyperscaler lock-in on GPU, ML training, and high-throughput analytics workloads.
AI workloads have outgrown the hyperscaler default.
For a decade, the right answer to "where should we run this workload?" was almost always the same: pick the hyperscaler you already use, choose the instance type closest to your need, accept the price. For most enterprise applications, that answer is still correct. For AI workloads (GPU-bound training, high-throughput inference, large-scale data preparation), the answer has changed. The economics have moved fast enough that the hyperscaler default is no longer cost-defensible for the workloads where compute is the dominant line item.
POWERON is Ionate's cloud-agnostic compute platform: a unified control plane that runs the same workload across hyperscalers, specialty GPU clouds, sovereign clouds, on-prem clusters, and bare-metal partners. This whitepaper describes the architecture, the cost model that justifies it, the workload categories where the model is most compelling, and the operating discipline required to make agnostic compute work at enterprise scale.
Cloud-agnostic compute is not a position against hyperscalers. It is a stance about pricing compute as a portable input. Pricing curves move; the vendor with the strongest position on AI workloads this year is unlikely to hold that position uncontested across the rest of the decade.
Where the bill comes from.
Hyperscaler pricing for AI workloads has three components that have not converged in the way commodity compute did. Understanding the components is the prerequisite for evaluating alternatives.
GPU instance pricing
Top-tier GPU instances on major hyperscalers are priced at levels that have not declined materially in two years. The supply is constrained, demand is unconstrained, and the hyperscalers are not racing to the bottom. List prices in the multi-dollar-per-hour range per GPU are now the norm for inference; training is a multiple of that.
Egress economics
Data egress charges remain the most overlooked component of cloud AI cost. Training datasets, model artifacts, and inference responses all move data, and that movement is billed at rates that, for high-throughput workloads, can equal or exceed the underlying compute cost. Egress fees are also, by design, the lever the hyperscaler uses to make multi-cloud architectures economically unattractive.
Network and adjacent service stickiness
An AI workload that runs on Cloud X is rarely just compute. It is compute plus storage plus networking plus authentication plus observability plus a half-dozen other adjacent services. Each adjacency adds another opinionated API, another vendor-specific dependency, and another reason it would be hard to leave. Hyperscalers price compute aggressively because the stickiness of the adjacencies funds it.
Specialty providers have changed the curve.
The market has produced a new generation of compute providers whose economics are materially different from the hyperscalers'. They are not better at everything. Most do not offer the breadth of services a hyperscaler does. But for the specific workloads where compute is the dominant cost, the price gap is material.
| Substrate | Strength | Limitation | Best For |
|---|---|---|---|
| Hyperscalers | Breadth, adjacencies | Price on raw compute | Stateful applications, regulated workloads |
| Specialty GPU clouds | GPU price, scale | Limited adjacencies | Training, large-scale inference |
| Sovereign clouds | Data residency | Geographic scope | Public-sector, regulated regions |
| Bare-metal partners | Predictable cost | Operational responsibility | Steady-state, capacity-planned |
| On-prem GPU clusters | Lowest unit cost | Capital intensity | Continuous, high-utilization |
The portfolio principle
The new economics are not "leave the hyperscaler." They are "build a substrate portfolio." Different workloads belong on different substrates. The portfolio approach captures the price advantage of each substrate where it applies without giving up the convenience advantages where they apply.
The portability prerequisite
A substrate portfolio is only economically meaningful if the workloads are actually portable across substrates. If moving a workload from substrate A to substrate B costs more than a quarter's worth of the savings the move produces, the portfolio is theatre. POWERON exists to make workload portability operationally cheap.
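The portability threshold can be read as a payback test. The sketch below assumes that "a quarter of savings" means three months of net savings; the function name, the default window, and the figures are hypothetical and serve only to illustrate the check.

```python
def is_portable_enough(migration_cost: float,
                       monthly_net_savings: float,
                       max_payback_months: float = 3.0) -> bool:
    """Return True when the one-off move pays for itself within the window."""
    if monthly_net_savings <= 0:
        return False  # no savings, so the move never pays back
    payback_months = migration_cost / monthly_net_savings
    return payback_months <= max_payback_months


# Hypothetical figures, for illustration only.
cheap_move = is_portable_enough(migration_cost=40_000, monthly_net_savings=20_000)
costly_move = is_portable_enough(migration_cost=90_000, monthly_net_savings=20_000)
print(cheap_move, costly_move)
```

The first move pays back in two months and passes; the second takes four and a half months and fails, which is the case the text calls theatre.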
What if your control plane knew about every substrate?
POWERON has three architectural layers. Each is engineered for substrate independence.
Layer 01
Workload Abstraction
Workloads are expressed as Kubernetes-native manifests extended with accelerator hints. The same definition runs on every substrate.
Layer 02
Substrate Adapters
Per-substrate adapters translate the abstract workload into the substrate's native deployment primitives.
Layer 03
Scheduler & Placement
A policy-driven scheduler that places each workload on the substrate that best fits its constraints — cost, latency, residency, capacity.
Workload abstraction
Every POWERON workload is declared as a Kubernetes-native manifest extended with accelerator and data-locality hints. The manifest is the source of truth; the substrate-specific deployment is derived from it. Customers declare intent; POWERON resolves that intent to a substrate. Where customers already operate the IONATE KUBERNETES platform, POWERON integrates as the substrate-placement layer beneath it: workload manifests are identical, and the platform contract carries through (see our companion paper Production Kubernetes for Modernized Workloads).
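To make "intent in, manifest out" concrete, here is a minimal sketch of a workload declaration rendered as a Kubernetes-style Job manifest. The class names, field names, and the `poweron.example/...` annotation keys are assumptions made for illustration; they are not a documented POWERON schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorHint:
    kind: str   # hypothetical label, e.g. "gpu-large"
    count: int


@dataclass(frozen=True)
class WorkloadIntent:
    name: str
    image: str
    accelerator: AcceleratorHint
    data_locality: str                 # where the dataset currently lives
    residency: tuple = ()              # allowed jurisdictions; empty means unconstrained


def to_manifest(w: WorkloadIntent) -> dict:
    """Render the intent as a Kubernetes-style Job manifest (a plain dict)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": w.name,
            "annotations": {
                # Hypothetical annotation keys carrying the placement hints.
                "poweron.example/accelerator": f"{w.accelerator.kind}x{w.accelerator.count}",
                "poweron.example/data-locality": w.data_locality,
                "poweron.example/residency": ",".join(w.residency),
            },
        },
        "spec": {"template": {"spec": {
            "restartPolicy": "Never",
            "containers": [{"name": w.name, "image": w.image}],
        }}},
    }


intent = WorkloadIntent(
    name="finetune-demo",
    image="registry.example/finetune:latest",
    accelerator=AcceleratorHint(kind="gpu-large", count=8),
    data_locality="site-eu-1",
    residency=("eu",),
)
manifest = to_manifest(intent)
print(manifest["metadata"]["annotations"])
```

Nothing in the declaration names an instance type or a provider; that is left to whatever resolves the hints downstream.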
Substrate adapters
Adapters exist for AWS (EKS + p4/p5/p6), GCP (GKE + A3/A4), Azure (AKS + ND-series), CoreWeave, Lambda Labs, OCI bare-metal, several sovereign-cloud providers, and customer-owned Kubernetes clusters. Each adapter translates the workload's abstract requirements into the substrate's actual primitives — instance types, accelerator IDs, network and storage classes, identity bindings.
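The adapter idea can be sketched as a shared interface with one implementation per substrate. Everything named here (`SubstrateAdapter`, `CatalogAdapter`, the substrate names, instance types, and storage classes) is hypothetical; a real adapter would read the provider's actual catalog rather than a hard-coded dictionary.

```python
from typing import Protocol


class SubstrateAdapter(Protocol):
    """What every adapter is assumed to offer in this sketch."""
    name: str

    def resolve(self, gpus: int) -> dict: ...


class CatalogAdapter:
    """Picks the smallest catalogued instance type that satisfies a GPU request."""

    def __init__(self, name: str, catalog: dict, storage_class: str):
        self.name = name
        # Sort by GPU count so the first match is the smallest fit.
        self._catalog = sorted(catalog.items(), key=lambda item: item[1])
        self._storage_class = storage_class

    def resolve(self, gpus: int) -> dict:
        for instance_type, gpu_count in self._catalog:
            if gpu_count >= gpus:
                return {
                    "substrate": self.name,
                    "instance_type": instance_type,
                    "storage_class": self._storage_class,
                }
        raise ValueError(f"{self.name}: no instance type offers {gpus} GPUs")


# Hypothetical catalogs, for illustration only.
adapters: list[SubstrateAdapter] = [
    CatalogAdapter("cloud-a", {"a.gpu.1x": 1, "a.gpu.8x": 8}, "a-fast-ssd"),
    CatalogAdapter("cloud-b", {"b-accel-4": 4, "b-accel-8": 8}, "b-nvme"),
]

for adapter in adapters:
    print(adapter.resolve(gpus=4))
```

The same abstract request (four GPUs) resolves to a different native primitive on each substrate, which is the translation step the adapters perform.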
Scheduler and placement
The scheduler evaluates each incoming workload against the customer's policy: minimum substrate count, residency constraints, latency SLA, cost ceiling. It places the workload on the substrate that fits — and re-places when the fit changes (capacity disappears, price moves, residency tightens).
Think of it as the difference between a calculator and a spreadsheet. You can price one workload against one substrate at a time with a calculator. A portfolio of dozens of workloads against half a dozen substrates needs the spreadsheet. And the spreadsheet has to be live.
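The placement step described above can be sketched as a filter followed by a cost sort. This is a simplified illustration, not POWERON's scheduler: the `Substrate` and `Policy` fields and all figures are hypothetical, and the minimum-substrate-count rule is left out for brevity.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Substrate:
    name: str
    region: str
    price_per_gpu_hour: float
    latency_ms: float
    free_gpus: int


@dataclass(frozen=True)
class Policy:
    allowed_regions: frozenset
    max_latency_ms: float
    cost_ceiling_per_gpu_hour: float


def place(gpus: int, policy: Policy, substrates: list) -> Optional[Substrate]:
    """Return the cheapest substrate that satisfies every constraint, or None."""
    eligible = [
        s for s in substrates
        if s.region in policy.allowed_regions
        and s.latency_ms <= policy.max_latency_ms
        and s.price_per_gpu_hour <= policy.cost_ceiling_per_gpu_hour
        and s.free_gpus >= gpus
    ]
    return min(eligible, key=lambda s: s.price_per_gpu_hour, default=None)


# Hypothetical market snapshot.
market = [
    Substrate("specialty-a", "eu", 2.0, 40.0, 16),
    Substrate("hyperscaler-b", "eu", 4.0, 20.0, 64),
    Substrate("baremetal-c", "us", 1.5, 30.0, 32),
]
policy = Policy(frozenset({"eu"}), max_latency_ms=50.0, cost_ceiling_per_gpu_hour=5.0)

first = place(8, policy, market)

# Capacity disappears on the chosen substrate; re-running place() re-places.
market[0] = replace(market[0], free_gpus=0)
second = place(8, policy, market)
print(first.name, "->", second.name)
```

Re-placement in this sketch is simply the same function run against a fresh snapshot, which is why the snapshot has to be live.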
Where the savings actually are.
The savings of cloud-agnostic compute are not uniform. They are concentrated in workload categories where compute is the dominant line item and where substrate diversity exists. The table below describes the workload categories where POWERON customers consistently see material savings.
| Workload | Dominant Cost | Typical Substrate Mix | Savings vs Hyperscaler-Only |
|---|---|---|---|
| Foundation model training | GPU-hours | Specialty + on-prem peak | 50–70% |
| Fine-tuning & LoRA | GPU-hours | Specialty + hyperscaler burst | 40–60% |
| Steady-state inference | Compute + egress | Bare-metal + hyperscaler façade | 40–60% |
| High-throughput analytics | Compute + storage | Sovereign or bare-metal | 30–50% |
| Embedded enterprise apps | Operational adjacencies | Hyperscaler-native | Minimal |
What the savings exclude
The savings ranges are net of the operational overhead of managing a substrate portfolio, the cost of cross-substrate networking, and the cost of POWERON itself. They are not gross compute deltas. Reported ranges reflect the spread observed across actual customer engagements; specific outcomes depend heavily on workload mix, baseline contracts, and reservation patterns.
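The difference between a gross compute delta and a net saving can be shown with simple arithmetic. The function below is a sketch; its parameter names are assumptions, and the monthly figures are invented for illustration rather than taken from any customer engagement.

```python
def savings_pct(baseline: float, new_total: float) -> float:
    """Percentage saved relative to the hyperscaler-only baseline."""
    return 100.0 * (baseline - new_total) / baseline


# Hypothetical monthly figures, for illustration only.
baseline = 100_000            # hyperscaler-only spend
portfolio_compute = 35_000    # compute spend across the portfolio
ops_overhead = 8_000          # running a multi-substrate operation
networking = 4_000            # cross-substrate networking
platform_fee = 5_000          # cost of the control plane itself

gross = savings_pct(baseline, portfolio_compute)
net = savings_pct(
    baseline, portfolio_compute + ops_overhead + networking + platform_fee
)
print(f"gross {gross:.1f}%, net {net:.1f}%")
```

With these made-up inputs the gross delta is 65% while the net figure is 48%; the ranges in the table correspond to the second kind of number.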
Where the savings do not apply
Workloads dominated by hyperscaler-specific managed services (proprietary databases, vendor-specific identity, deeply integrated observability stacks) do not benefit from substrate diversity. The cost of breaking the adjacency is larger than the cost saved on compute. Those workloads stay where they are.
What is hard, and what is solved.
Multi-substrate operations are not free. The cost is real and must be planned for. POWERON's design has been shaped by what is actually hard at scale.
Identity
Every substrate has its own IAM model. POWERON federates them through a single control plane: workloads carry their identity, substrates accept it. Customers integrate their existing IDP (Okta, Entra ID, Ping) once.
Observability
Cross-substrate observability is the operational capability that determines whether a portfolio is sustainable. POWERON ships a substrate-agnostic telemetry layer (OpenTelemetry-based) that produces a unified view of workloads regardless of where they are running.
Data locality and movement
The most expensive mistake in cross-substrate operations is moving data unnecessarily. POWERON's scheduler tracks data residency and places workloads where the data is — not the reverse. When data movement is required, it is explicit, audited, and budgeted.
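A minimal sketch of the "compute goes to the data" rule, with movement treated as an explicit, logged, budgeted exception. The `MovementBudget` type, the site names, and the gigabyte figures are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MovementBudget:
    remaining_gb: float
    audit_log: list = field(default_factory=list)  # (source, target, gigabytes)


def choose_site(dataset_site: str, dataset_gb: float,
                candidate_sites: list, budget: MovementBudget) -> str:
    """Prefer the site that already holds the data; otherwise move it on the record."""
    if not candidate_sites:
        raise ValueError("no candidate sites")
    if dataset_site in candidate_sites:
        return dataset_site  # no data movement needed
    if dataset_gb > budget.remaining_gb:
        raise RuntimeError("data movement exceeds the remaining budget")
    target = candidate_sites[0]
    budget.remaining_gb -= dataset_gb
    budget.audit_log.append((dataset_site, target, dataset_gb))
    return target


budget = MovementBudget(remaining_gb=500.0)

# The data already sits on an eligible site: the workload follows the data.
local = choose_site("site-eu-1", 300.0, ["site-eu-1", "site-eu-2"], budget)

# No eligible site holds the data: movement is charged to the budget and logged.
moved = choose_site("site-us-1", 300.0, ["site-eu-2"], budget)
print(local, moved, budget.remaining_gb, budget.audit_log)
```

A second 300 GB move would raise an error here, because only 200 GB of budget remains; the failure is deliberate rather than a silent transfer.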
Capacity reservation
Specialty providers have different capacity-reservation models than hyperscalers. POWERON exposes a unified reservation interface so customers can plan capacity across the portfolio without learning N substrate-specific reservation languages.
When compliance is the substrate driver.
For some workloads, the cost savings of cloud-agnostic compute are secondary to the compliance benefits. Sovereign-cloud and hybrid-cloud deployments are most common in three customer categories.
Public sector
Federal, state, and provincial agencies are increasingly required to run AI workloads in jurisdiction-specific cloud regions. POWERON's substrate adapters cover the major sovereign-cloud providers globally, enabling agencies to apply the same workload portfolio across approved providers.
Financial services in restricted regions
Some financial-services regulators require all customer-data processing to remain in-jurisdiction. POWERON's residency-aware scheduler enforces this by construction: a workload tagged with a residency constraint cannot be placed on a substrate outside the allowed region.
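One way to picture "by construction" is a placement record that cannot be created when it would violate residency. This is an illustrative sketch only; the `Placement` type and the region labels are hypothetical and do not describe POWERON internals.

```python
from dataclasses import dataclass


class ResidencyViolation(ValueError):
    """Raised when a placement would leave the allowed jurisdictions."""


@dataclass(frozen=True)
class Placement:
    workload: str
    substrate_region: str
    allowed_regions: frozenset  # empty set means no residency constraint

    def __post_init__(self):
        # Validation runs at construction time, so an invalid placement never exists.
        if self.allowed_regions and self.substrate_region not in self.allowed_regions:
            raise ResidencyViolation(
                f"{self.workload}: region {self.substrate_region!r} is not allowed"
            )


ok = Placement("ledger-scoring", "eu", frozenset({"eu"}))

try:
    Placement("ledger-scoring", "us", frozenset({"eu"}))
    rejected = False
except ResidencyViolation:
    rejected = True

print(ok.substrate_region, rejected)
```

Because the check lives in the constructor, downstream code that receives a `Placement` does not need to re-validate residency.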
Defense and critical infrastructure
Where the workload must run on customer-owned hardware (on-prem clusters, air-gapped data centers), POWERON treats the on-prem cluster as another substrate. The workload definition that runs on a commercial substrate runs unchanged in the secure facility.
From workload inventory to first portable deployment.
Most customer engagements start with the same question: of your existing AI workloads, which actually benefit from a substrate portfolio? The grid below is the working tool we hand to AI infrastructure leads in the first meeting. It is the input to the portfolio design that follows.
| If your workload is... | And its dominant cost is... | Then the candidate substrate is... |
|---|---|---|
| Foundation model training | GPU-hours | Specialty GPU cloud + on-prem peak; hyperscaler only for bursts |
| Fine-tuning / LoRA | GPU-hours | Specialty GPU cloud; hyperscaler reservation as fallback |
| Steady-state inference | Compute + egress | Bare-metal partner with a hyperscaler façade for variable traffic |
| High-throughput analytics | Compute + storage | Sovereign or bare-metal substrate; data stays put |
| Embedded enterprise app | Operational adjacencies | Stay on the hyperscaler; the adjacency cost dominates |
| Residency-constrained workload | Any cost driver | Sovereign cloud or on-prem; residency wins regardless |
Once a candidate substrate is identified for the top workloads by spend, the portfolio takes shape on its own. Ionate's role is to wire up the adapters, codify the scheduler policy, and run a pilot deployment on the first workload that proves the model.
Curious what a substrate portfolio would cost your workloads?
We will inventory your top AI workloads against the substrate market and return a fully priced portfolio recommendation. The output is a document your finance team can model directly.