AI features can look inexpensive during a prototype and become uncomfortable once real users arrive. A chat assistant, document summarizer, support agent, or internal workflow may make several model calls for one visible user action. If the product also retrieves documents, processes images, retries failed calls, or keeps long chat history, the simple estimate can drift quickly.
The goal is not to predict the invoice perfectly. The goal is to understand which assumptions matter before launch. This article explains the tradeoff between API, subscription, and open-source AI, the math behind API estimates, what can break the estimate, and how Canadian businesses should think about CAD planning.
The tradeoff: API vs subscription vs open-source AI
There is no single cheapest AI path. API-based AI is flexible because you can build it directly into your product, automate workflows, control prompts, and meter usage. The tradeoff is that cost depends on real behaviour: request volume, token length, model choice, retries, and product design.
Subscription AI tools are usually easier for internal use. A business may pay a predictable monthly fee per employee and avoid building infrastructure. But subscription tools may not fit a customer-facing product, may have usage policies or limits, and may not provide the exact workflow or data controls a team needs.
Open-source or self-hosted models can reduce dependency on a per-token API price, but they are not automatically free. Hosting, GPUs, engineering time, observability, evaluation, security review, and maintenance can become the real cost. For some teams that control is worth it; for others it adds complexity too early.
| Path | Usually stronger when | Watch for |
|---|---|---|
| API | You need product integration, metering, and model flexibility | Token growth, retries, agent loops, provider price changes |
| Subscription | The use case is mostly internal and seat-based | Usage limits, workflow fit, data policy, per-seat scaling |
| Open-source | You have technical capacity and need control | Hosting, engineering time, security, monitoring, support |
The math: how AI API costs are calculated
Most AI API estimates start with a simple formula: multiply the number of model calls by the input tokens per call and by the output tokens per call, price each total at the provider's rate per million tokens (input and output are usually priced separately), and add the two results. That sounds clean, but the hard part is choosing realistic inputs.
A customer support bot might receive one message from the user, then add a system prompt, conversation history, retrieved policy documents, tool outputs, and a long final response. An agent workflow might make three to eight model calls behind the scenes. A document feature might process far more tokens than the user sees on screen.
A better estimate separates base monthly requests, agent loop multiplier, retrieved-document tokens, input tokens, output tokens, image or vision calls, cache savings, and non-AI business costs. That structure makes it easier to see which assumption deserves a stress test.
- Monthly requests estimate how often the feature is used.
- Input tokens include prompts, chat history, retrieved documents, and tool context.
- Output tokens include model responses, structured JSON, summaries, or generated text.
- Agent loop multiplier estimates hidden repeated calls in multi-step workflows.
- Image or vision costs may be priced separately from text tokens depending on provider rules.
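The inputs listed above can be combined in a small function. This is a minimal sketch, not a provider's billing logic: the field names, the flat per-image price, and the way cache savings are applied (a discount on a share of input tokens) are all assumptions made for illustration, and the prices in the example are placeholders rather than real rates.

```python
from dataclasses import dataclass


@dataclass
class ApiCostInputs:
    """Hypothetical input set for a monthly API estimate (all prices in USD)."""
    monthly_requests: int            # visible user actions per month
    agent_loop_multiplier: float     # average model calls per user action
    input_tokens_per_call: int       # prompt, chat history, tool context
    rag_tokens_per_call: int         # retrieved-document tokens added to the input
    output_tokens_per_call: int      # generated tokens per call
    input_price_per_million: float   # placeholder; check the provider's pricing page
    output_price_per_million: float  # placeholder; check the provider's pricing page
    image_calls: int = 0             # image or vision calls per month
    price_per_image_call: float = 0.0    # assumption: flat price per image call
    cached_input_fraction: float = 0.0   # share of input tokens served from cache
    cache_discount: float = 0.0          # discount on cached input tokens (0 to 1)


def monthly_api_cost_usd(c: ApiCostInputs) -> float:
    """Estimate the direct monthly API cost in USD."""
    calls = c.monthly_requests * c.agent_loop_multiplier
    input_tokens = calls * (c.input_tokens_per_call + c.rag_tokens_per_call)
    output_tokens = calls * c.output_tokens_per_call

    # Cached input tokens are billed at a reduced rate under this assumption.
    billable_input = input_tokens * (1 - c.cached_input_fraction * c.cache_discount)

    input_cost = billable_input / 1_000_000 * c.input_price_per_million
    output_cost = output_tokens / 1_000_000 * c.output_price_per_million
    image_cost = c.image_calls * c.price_per_image_call
    return input_cost + output_cost + image_cost


example = ApiCostInputs(
    monthly_requests=10_000,
    agent_loop_multiplier=3,
    input_tokens_per_call=1_500,
    rag_tokens_per_call=500,
    output_tokens_per_call=400,
    input_price_per_million=1.0,   # placeholder price
    output_price_per_million=4.0,  # placeholder price
)
print(f"Estimated monthly API cost: {monthly_api_cost_usd(example):.2f} USD")
```

With these placeholder numbers, 10,000 requests become 30,000 model calls, 60 million input tokens, and 12 million output tokens, which works out to 108 USD. Note how the agent loop multiplier scales every other input at once.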
What could break the estimate
The fragile part of an AI estimate is usually not the spreadsheet. It is the single assumption nobody challenged. Monthly usage might be higher than expected. Average output length might double. A product decision might add RAG context to every request. A support agent might retry or call tools more often than the prototype suggested.
Pricing can also change. Provider token prices, subscription tiers, caching discounts, rate limits, model availability, and usage policies can move over time. Treat official provider pricing pages and real invoices as the final source before making a commitment.
The estimate can also miss costs outside the API bill: implementation time, employee training, prompt engineering, monitoring, data privacy compliance, security review, human review, downtime risk, and support. Direct model cost is important, but it is not the whole operating cost.
Canadian business planning notes
Many AI providers publish prices in USD. A Canadian budget should convert those assumptions into CAD and leave room for exchange-rate movement. If revenue is in Canadian dollars but AI cost is in USD, the margin can move even when user behaviour does not.
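A short sketch can make the exchange-rate effect concrete. The function names, the 5% buffer, and the exchange rates below are illustrative assumptions, not current rates or recommended values.

```python
def monthly_margin_cad(revenue_cad: float, ai_cost_usd: float, usd_to_cad: float) -> float:
    """Margin in CAD when revenue is in CAD and the AI bill is in USD."""
    return revenue_cad - ai_cost_usd * usd_to_cad


def budget_with_fx_buffer(ai_cost_usd: float, usd_to_cad: float, fx_buffer: float = 0.05) -> float:
    """CAD budget line with headroom for exchange-rate movement."""
    return ai_cost_usd * usd_to_cad * (1 + fx_buffer)


# Placeholder numbers: same revenue and same usage, only the rate changes.
revenue_cad = 2_000.0
ai_cost_usd = 1_000.0
for rate in (1.30, 1.40):  # hypothetical values for 1 USD in CAD
    margin = monthly_margin_cad(revenue_cad, ai_cost_usd, rate)
    print(f"1 USD = {rate:.2f} CAD: margin {margin:.2f} CAD")

buffered = budget_with_fx_buffer(ai_cost_usd, 1.40)
print(f"Budget with a 5% buffer at 1.40: {buffered:.2f} CAD")
```

In this example the margin falls from 700 CAD to 600 CAD with no change in usage, which is the kind of movement the buffer is meant to absorb.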
Canadian teams should also think separately about GST/HST, income tax treatment, bookkeeping, and whether the AI system touches sensitive customer or employee data. This article and the calculator are planning tools, not accounting or legal advice.
For a small Canadian business, the practical question is often: what has to be true for this feature to stay affordable? That may include usage limits, cheaper fallback models, shorter context windows, caching, pricing tiers, or a slower rollout.
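One way to frame the "what has to be true" question is to turn a monthly budget into a request ceiling. The sketch below assumes a fixed CAD budget, a constant cost per request, and placeholder prices and exchange rate; the function names are hypothetical.

```python
import math


def cost_per_request_cad(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million_usd: float,
    output_price_per_million_usd: float,
    calls_per_request: float,
    usd_to_cad: float,
) -> float:
    """Direct model cost of one user-visible request, converted to CAD."""
    usd_per_call = (
        input_tokens * input_price_per_million_usd
        + output_tokens * output_price_per_million_usd
    ) / 1_000_000
    return usd_per_call * calls_per_request * usd_to_cad


def max_monthly_requests(budget_cad: float, per_request_cad: float) -> int:
    """Largest whole number of requests that fits inside the budget."""
    if per_request_cad <= 0:
        raise ValueError("per_request_cad must be positive")
    return math.floor(budget_cad / per_request_cad)


# Placeholder prices and rate; compare a long context with a shorter one.
long_context = cost_per_request_cad(2_000, 400, 1.0, 4.0, 3, 1.40)
short_context = cost_per_request_cad(1_000, 400, 1.0, 4.0, 3, 1.40)

print(max_monthly_requests(500.0, long_context))   # request ceiling with long context
print(max_monthly_requests(500.0, short_context))  # request ceiling with shorter context
```

The resulting ceiling can feed a usage limit or a pricing tier, and rerunning it with shorter context or a cheaper model shows how much headroom each design choice buys.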
Try the AI Cost Calculator
The AI Cost Calculator lets you model requests, tokens, agent loops, retrieved-document tokens, image or vision cost, CAD/USD conversion, infrastructure, payment fees, support costs, and break-even pricing in one place.
Use it before launch to compare whether the workflow still makes sense under conservative assumptions. Then check the broader tools page if the estimate changes your budget, pricing, savings, or account-planning decisions.
- Start with /tools/ai-cost-calculator to estimate direct and supporting costs.
- Use /tools to compare other financial planning calculators if the AI estimate changes your budget.
- Use /tools/account-decision-tool if the project affects personal cash flow, TFSA/RRSP/FHSA contributions, or emergency-fund planning.
Next path after estimating AI costs
Once the estimate is built, avoid treating the first answer as the budget. Run a low, middle, and high usage scenario. Change one input at a time: monthly requests, response length, agent loop multiplier, RAG tokens, paid conversion rate, and subscription price.
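The scenario run and the one-at-a-time changes can be sketched as follows. This is an illustration under stated assumptions: the cost function is a simplified token formula, the baseline values and prices are placeholders, and only cost-side inputs are varied (paid conversion rate and subscription price belong to a revenue model that is not shown here).

```python
def monthly_cost_usd(p: dict) -> float:
    """Simplified direct API cost from a dict of assumptions."""
    calls = p["monthly_requests"] * p["agent_loop_multiplier"]
    input_tokens = calls * (p["input_tokens"] + p["rag_tokens"])
    output_tokens = calls * p["output_tokens"]
    return (
        input_tokens * p["input_price_per_million"]
        + output_tokens * p["output_price_per_million"]
    ) / 1_000_000


def usage_scenarios(baseline: dict, requests_by_scenario: dict) -> dict:
    """Cost for each named usage level, holding everything else fixed."""
    return {
        name: monthly_cost_usd({**baseline, "monthly_requests": requests})
        for name, requests in requests_by_scenario.items()
    }


def one_at_a_time(baseline: dict, names: tuple, factors: tuple = (0.5, 2.0)) -> dict:
    """Scale one input at a time and record the cost at each factor."""
    return {
        name: tuple(
            monthly_cost_usd({**baseline, name: baseline[name] * f}) for f in factors
        )
        for name in names
    }


baseline = {  # placeholder assumptions
    "monthly_requests": 10_000,
    "agent_loop_multiplier": 3,
    "input_tokens": 1_500,
    "rag_tokens": 500,
    "output_tokens": 400,
    "input_price_per_million": 1.0,
    "output_price_per_million": 4.0,
}

print(usage_scenarios(baseline, {"low": 5_000, "middle": 10_000, "high": 20_000}))
varied = ("monthly_requests", "output_tokens", "agent_loop_multiplier", "rag_tokens")
for name, (low, high) in one_at_a_time(baseline, varied).items():
    print(f"{name}: {low:.2f} to {high:.2f} USD")
```

Inputs whose low-to-high range is widest are the ones worth validating against real usage before the estimate becomes a budget.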
If the estimate is tight, the next move is usually not optimism. It is a design decision: reduce token length, choose a cheaper model for routine tasks, cache stable context, set usage caps, simplify the workflow, or delay expensive automation until there is enough revenue to support it.
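As a rough illustration of one of those design decisions, the sketch below estimates the average cost per request when routine tasks are routed to a cheaper model. The routing share and the per-request costs are hypothetical placeholders, and the sketch assumes both models handle their share with the same number of calls.

```python
def blended_cost_per_request(
    routine_share: float,
    cheap_cost_per_request: float,
    premium_cost_per_request: float,
) -> float:
    """Average cost per request when a share of traffic uses a cheaper model."""
    if not 0.0 <= routine_share <= 1.0:
        raise ValueError("routine_share must be between 0 and 1")
    return (
        routine_share * cheap_cost_per_request
        + (1.0 - routine_share) * premium_cost_per_request
    )


# Placeholder costs in USD per request, not real provider prices.
premium_only = blended_cost_per_request(0.0, 0.002, 0.02)
with_routing = blended_cost_per_request(0.8, 0.002, 0.02)
print(f"Premium model only: {premium_only:.4f} USD per request")
print(f"80% routed to the cheaper model: {with_routing:.4f} USD per request")
```

The saving only holds if the cheaper model is good enough for the routine share, so the routing share itself is an assumption worth testing.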
If the AI project affects your own savings or business cash flow, connect the estimate back to broader financial planning. Registered accounts like TFSA, RRSP, and FHSA are separate personal decisions, but major business spending can change how much room you have for them.