What is AI COGS?

The direct, recurring cost of serving an AI product to customers — and the line on the income statement that decides whether your AI product carries software margins. A working definition for finance teams.

By COGScontrol Team · June 12, 2026

AI COGS — AI cost of goods sold — is the set of direct, recurring costs incurred to serve an AI product to customers: model inference (API tokens or self-hosted GPU compute), the cloud infrastructure attributable to running the product, and the per-request services around them. It excludes research, experimentation, and most training, which belong in R&D.

Cost of goods sold is one of the oldest lines in accounting: the direct cost of producing what you sold, subtracted from revenue to give gross profit. Software spent two decades treating it as a rounding item, because serving one more customer cost close to nothing. Inference ended that. Every AI request consumes tokens with a price attached, and the line is back at the center of the income statement.

What counts as AI COGS?

Four categories qualify, and the test for each is the same: does the cost recur, directly, in the path of serving a customer? If fulfilling one more request makes the line grow, it belongs in COGS.

Model inference. API tokens from providers such as OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI — or GPU compute if you host models yourself. For most AI products this is the largest single component.
AI-attributable cloud infrastructure. The share of AWS, Google Cloud, or Microsoft Azure spend that exists to run the AI product in production: orchestration compute, GPU capacity, caching layers, and the data pipelines feeding live features.
Vector databases and retrieval infrastructure. Embedding storage, index maintenance, and the retrieval calls that precede generation in a RAG architecture.
Third-party AI services in the request path. Per-request transcription, moderation, OCR, guardrails, reranking — anything invoked in production to fulfill a customer request.
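To make the model inference category concrete, here is a minimal sketch of how a single request's cost follows from token counts and per-token prices. The model name, the prices, and the request_cost helper are placeholder assumptions for illustration, not any provider's actual rates or API.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens. Real rates differ by
# provider and model and change over time.
PRICE_PER_MILLION = {
    "example-model": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the inference cost in USD of one request."""
    price = PRICE_PER_MILLION[model]
    return (
        price["input"] * input_tokens / MILLION
        + price["output"] * output_tokens / MILLION
    )


# One request with 2,000 input tokens and 500 output tokens.
cost = request_cost("example-model", input_tokens=2_000, output_tokens=500)
print(cost)
```

Multiplying a per-request figure like this by request volume is what makes inference the usage-coupled part of COGS.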

What does not count as AI COGS?

Research and experimentation sit outside COGS: prototype tokens, evaluation runs in development, and one-off training efforts are R&D, because they do not scale with customer usage. Fine-tuning is the genuinely contested case. Some finance teams expense it as incurred; others capitalize the cost and amortize it into COGS over the model’s useful life. Both positions are defensible, and treatment varies across companies and auditors. What is not defensible is switching between them. Choose a policy, document it, and apply it consistently, or your gross margin trend stops meaning anything.
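The two fine-tuning policies can be compared side by side. The sketch below assumes straight-line amortization over a fixed useful life; the cost, the 12-month life, and the function names are hypothetical, and the appropriate schedule is a question for your auditors.

```python
from decimal import Decimal


def expense_as_incurred(cost: Decimal, months: int) -> list[Decimal]:
    """Policy A: the full cost is expensed in month one, nothing afterwards."""
    if months < 1:
        raise ValueError("months must be at least 1")
    return [cost] + [Decimal(0)] * (months - 1)


def straight_line_amortization(cost: Decimal, useful_life_months: int) -> list[Decimal]:
    """Policy B: capitalize, then charge an equal share to COGS each month."""
    if useful_life_months < 1:
        raise ValueError("useful_life_months must be at least 1")
    monthly = (cost / useful_life_months).quantize(Decimal("0.01"))
    schedule = [monthly] * (useful_life_months - 1)
    # Put any rounding remainder in the final month so the schedule sums to cost.
    schedule.append(cost - sum(schedule))
    return schedule


fine_tune_cost = Decimal("120000.00")  # hypothetical figure
policy_a = expense_as_incurred(fine_tune_cost, 12)
policy_b = straight_line_amortization(fine_tune_cost, 12)

for month, (a, b) in enumerate(zip(policy_a, policy_b), start=1):
    print(f"month {month:2d}  expensed: {a:>10}  amortized: {b:>10}")
```

Both schedules sum to the same total; only the timing differs, which is why switching policy mid-stream distorts the margin trend.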

How is AI COGS different from traditional software COGS?

Traditional software COGS is small, stable, and usually confined to one or two invoices. AI COGS is usage-coupled, priced per token, spread across several providers, and arrives on invoices that say nothing about which product incurred them.

Dimension	Traditional SaaS COGS	AI COGS
Cost driver	Seats, storage, hosting; broadly flat per customer	Usage; every request consumes priced tokens
Pricing unit	Instances, seats, gigabytes	Tokens in and out, per model, per provider
Volatility	Predictable quarter to quarter	Can move by multiples within a quarter
Invoices	One or two vendors	Several model providers plus several clouds
Attribution	Rarely necessary	Essential — invoices do not name products or customers

The volatility is not hypothetical. TechCrunch reported that Priceline saw a four-to-five-fold cost increase at its Cursor renewal, and the same report recounts a CTO whose engineer ran up $40,000 in token charges in one month. Those read as engineering anecdotes, but they settle on the finance team’s gross-margin line.

This is less a failure of tooling than a change in the problem. Cloud cost platforms and the FinOps discipline are genuinely good at the infrastructure half — rates, commitments, utilization, anomaly detection — and any company running serious cloud workloads should use them. What they were not designed for is the accounting half: deciding which product, customer, and P&L category each token belongs to, and tying the result to revenue. That division of labor is covered in FinOps vs. AI Value Management.

How do you calculate AI COGS?

Sum the direct serving costs across every provider, attribute them to a product or customer, and state them per accounting period. The arithmetic is the easy part:

AI COGS = model inference + AI-attributable cloud infrastructure + retrieval & vector infrastructure + per-request AI services

The work is in the inputs. The costs arrive on separate invoices (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, and Google Vertex AI on the model side; AWS, Google Cloud, and Microsoft Azure on the infrastructure side), and no invoice says which product or P&L line its charges belong to. A defensible AI COGS figure therefore requires normalizing every provider’s billing data into one ledger, classifying each cost by rule against dimensions such as product line, cost center, and environment, and reconciling the total back to the invoices so it always ties out. That is the job COGScontrol’s unified ledger and attribution engine were built for, with rules reapplied retroactively and an audit trail behind every classification.
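The normalize, classify, and reconcile steps described above can be sketched in a few lines. This is an illustrative sketch only and does not represent COGScontrol's implementation; the LedgerLine fields, the rule format, the tag names, and every amount are made-up assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


@dataclass(frozen=True)
class LedgerLine:
    """One billing record after normalization into a common shape."""
    provider: str
    environment: str  # e.g. "prod" or "dev"
    tag: str          # free-form label carried over from the billing export
    amount: Decimal   # USD


# A rule is a predicate plus the category it assigns; the first match wins.
Rule = tuple[Callable[[LedgerLine], bool], str]

RULES: list[Rule] = [
    (lambda line: line.environment != "prod", "R&D"),
    (lambda line: line.tag.startswith("support-bot"), "COGS:support-bot"),
    (lambda line: line.tag.startswith("search"), "COGS:search"),
]


def classify(line: LedgerLine) -> str:
    for predicate, category in RULES:
        if predicate(line):
            return category
    return "UNCLASSIFIED"  # surfaced for review rather than guessed


def build_ledger(lines: list[LedgerLine]) -> dict[str, Decimal]:
    totals: dict[str, Decimal] = {}
    for line in lines:
        category = classify(line)
        totals[category] = totals.get(category, Decimal(0)) + line.amount
    return totals


def reconcile(totals: dict[str, Decimal], invoices: dict[str, Decimal]) -> bool:
    """The classified totals must tie out to the sum of the invoices."""
    return sum(totals.values()) == sum(invoices.values())


# Hypothetical sample data.
lines = [
    LedgerLine("openai", "prod", "support-bot-chat", Decimal("1200.50")),
    LedgerLine("anthropic", "prod", "search-rerank", Decimal("800.25")),
    LedgerLine("openai", "dev", "support-bot-eval", Decimal("310.00")),
    LedgerLine("aws", "prod", "misc", Decimal("95.00")),
]
invoices = {
    "openai": Decimal("1510.50"),
    "anthropic": Decimal("800.25"),
    "aws": Decimal("95.00"),
}

totals = build_ledger(lines)
print(totals)
print("ties out:", reconcile(totals, invoices))
```

Because the rules are plain data applied to stored lines, changing a rule and rerunning build_ledger reclassifies history as well, and an UNCLASSIFIED bucket keeps unmatched spend visible instead of silently dropping it from the total.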

Stated cleanly, AI COGS feeds every figure a board will ask about: it is the numerator of cost per interaction and cost per customer, and a direct deduction in contribution margin. You can pressure-test your own figures in the AI unit economics calculator.
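As a rough worked example of those board-level figures, the sketch below derives them from one period's totals. All inputs are hypothetical, and contribution margin is taken here as revenue minus AI COGS minus other variable costs, which is an assumption about how your company defines it.

```python
from decimal import Decimal


def unit_economics(
    revenue: Decimal,
    ai_cogs: Decimal,
    other_variable_costs: Decimal,
    interactions: int,
    customers: int,
) -> dict[str, Decimal]:
    """Derive per-unit figures for one accounting period."""
    if interactions <= 0 or customers <= 0 or revenue <= 0:
        raise ValueError("revenue, interactions and customers must be positive")
    contribution = revenue - ai_cogs - other_variable_costs
    return {
        "cost_per_interaction": ai_cogs / interactions,
        "cost_per_customer": ai_cogs / customers,
        "contribution_margin": contribution,
        "contribution_margin_ratio": contribution / revenue,
    }


# Hypothetical month: $100k revenue, $30k AI COGS, $10k other variable costs.
figures = unit_economics(
    revenue=Decimal("100000"),
    ai_cogs=Decimal("30000"),
    other_variable_costs=Decimal("10000"),
    interactions=600_000,
    customers=200,
)
for name, value in figures.items():
    print(f"{name}: {value}")
```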

Why does AI COGS determine your gross margin?

Because for an AI product, COGS is the only large cost line that scales with revenue, so it sets the gross margin more than anything else does. When serving costs run at a substantial share of revenue, the margin profile stops resembling software: Andreessen Horowitz has observed AI companies with gross margins often in the 50 to 60 percent range, against a benchmark of 60 to 80 percent or more for comparable SaaS businesses.
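The link between serving costs and margin is simple arithmetic: gross margin is revenue minus COGS, divided by revenue. The sketch below uses made-up figures to show how a higher COGS share moves the result; the function name and inputs are illustrative assumptions.

```python
from decimal import Decimal


def gross_margin(revenue: Decimal, cogs: Decimal) -> Decimal:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


revenue = Decimal("1000000")  # hypothetical annual revenue

# Serving costs at 45% of revenue versus 20% of revenue.
ai_heavy = gross_margin(revenue, Decimal("450000"))
saas_like = gross_margin(revenue, Decimal("200000"))

print(f"COGS at 45% of revenue -> gross margin {ai_heavy:.0%}")
print(f"COGS at 20% of revenue -> gross margin {saas_like:.0%}")
```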

The spread between those outcomes is wide, and partly within your control. ServiceNow CFO Gina Mastantuono can tell investors that AI reasoning is “less than 10% of our cost to serve” — a sentence only available to a company that measures its serving costs precisely. That precision is the whole game: a defensible AI COGS is one normalized ledger, attributed by rule and reconciled to invoice daily, so the number that sets your gross margin always ties out. From there the next question is what the spend produced — the subject of measuring the ROI of AI initiatives. Cost tools tell you what you spent. AI Value Management tells you what it bought.