# Usage & model rates

> How your monthly usage budget is tracked, what each model costs, and what happens when you reach the cap.

### How do limits work?

Each plan includes a monthly agent usage budget denominated in dollars. The cost of each Notis message is deducted from that budget at Notis model rates; for OpenAI models, these rates include a 20% service markup over the provider's published API price. You can view your billing cycle, included usage, On-Demand usage, and request breakdowns on the [Usage page](https://app.notis.ai/usage) of your portal. In a Manager thread, the usage indicator also shows the thread's completed cost and how much context the latest completed run used.

If you use Notis with a team, everyone shares one monthly usage pool. For example, a 10-person team on a plan with \$150 of included monthly usage per seat gets a shared \$1,500 pool for the month. Any teammate can use it, and all team activity counts toward the same total. The Usage page shows the team's shared usage for the current billing cycle, and billing settings are managed by the team owner.
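The pool arithmetic above can be sketched in a few lines of Python. This is an illustration only: `shared_pool`, `remaining_pool`, and the teammate names are hypothetical and not part of any Notis API.

```python
from decimal import Decimal


def shared_pool(seats: int, included_per_seat: Decimal) -> Decimal:
    """Total monthly usage shared by the whole team, in dollars."""
    return seats * included_per_seat


def remaining_pool(pool: Decimal, usage_by_member: dict) -> Decimal:
    """Every teammate's usage draws down the same pool."""
    return pool - sum(usage_by_member.values(), Decimal("0"))


pool = shared_pool(10, Decimal("150"))
print(pool)  # 1500

# Hypothetical usage so far this billing cycle.
print(remaining_pool(pool, {"ana": Decimal("420.50"), "ben": Decimal("79.50")}))  # 1000.00
```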

Notis manages model routing automatically based on the task. Prices below are shown per 1M tokens and reflect the current Notis rates after markup.

### Auto mode

By default, Notis runs on **Auto** (recommended). Instead of using a fixed model, Notis's smart router reads each turn and picks the intelligence level (model and reasoning effort together) that fits the task: a light, inexpensive model for a quick reply, and a stronger model for a hard, multi-step task. Because the choice is made per turn, a single conversation can move up and down levels as the work changes, so you only pay for the capability each step actually needs.

### Intelligence levels

You can **pin** an intelligence level — per conversation in the [Desktop and Web app](/channels/manager#intelligence-and-priority), as a per-channel default in [Channels](/channels/channels-compared#default-intelligence--priority), or per automation in the [automation editor](/get-started/automate#intelligence) — instead of letting Auto choose. Each level sets the model and its reasoning effort together:

| Level  | Model        | Reasoning effort | Price vs Medium |
| ------ | ------------ | ---------------- | --------------: |
| High   | gpt-5.5      | Medium           |              ×2 |
| Medium | gpt-5.4      | None             |              ×1 |
| Low    | gpt-5.4-mini | Low              |              ÷4 |

The multiplier is an estimate for a typical message: it combines the model's per-token rates (tables below) with the extra reasoning tokens its effort level produces. Actual billing always uses the real token counts and rates.

### Processing tiers (priority)

Notis chooses the processing tier automatically, but on a plan with priority access you can pin one per thread or per channel alongside the intelligence level (it shows as **Standard / Fast / Economy** in the picker):

| Tier (picker label) | Price vs Standard | When it's used automatically                                       |
| ------------------- | ----------------: | ------------------------------------------------------------------ |
| Fast (Priority)     |                ×2 | Voice interactions when the model supports it                      |
| Standard (Normal)   |                ×1 | Default for regular messages and user-triggered work               |
| Economy (Flex)      |                ÷2 | Background or automation-triggered work when the model supports it |

If a model does not support Flex or Priority, Notis uses its Normal rate instead. The Flex and Priority token tables below show the resulting per-token rates.

<Note>
  **Automations always run on Economy (Flex)** — about half the base price, at a lower (slower) priority. That's fixed and separate from the per-automation [Intelligence](/get-started/automate#intelligence) setting: pinning an automation to High or Low changes only its model and effort, never its priority, so it keeps the Flex discount either way.
</Note>
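The tier rules above (automations fixed to Flex, automatic selection otherwise, and fallback to Normal when a model lacks a tier) can be sketched as follows. This is a simplified illustration, not Notis's actual routing logic: the function names and the `trigger` labels are hypothetical, and only two models are included, with rates copied from this page's rate tables.

```python
from decimal import Decimal
from typing import Optional

# (input, cached input, output) in dollars per 1M tokens.
RATES = {
    "normal": {
        "gpt-5.4": (Decimal("3.00"), Decimal("0.30"), Decimal("18.00")),
        "o4-mini-deep-research": (Decimal("2.40"), Decimal("0.60"), Decimal("9.60")),
    },
    "flex": {"gpt-5.4": (Decimal("1.50"), Decimal("0.15"), Decimal("9.00"))},
    "priority": {"gpt-5.4": (Decimal("6.00"), Decimal("0.60"), Decimal("36.00"))},
}


def choose_tier(trigger: str, pinned: Optional[str] = None) -> str:
    """Pick the requested tier. The `trigger` values are hypothetical labels."""
    if trigger == "automation":
        return "flex"  # automations always run on Economy (Flex)
    if pinned is not None:
        return pinned  # per-thread or per-channel pin
    return "priority" if trigger == "voice" else "normal"


def effective_rates(model: str, tier: str):
    """Return (tier actually billed, rates), falling back to Normal if unsupported."""
    if model in RATES.get(tier, {}):
        return tier, RATES[tier][model]
    return "normal", RATES["normal"][model]


print(effective_rates("gpt-5.4", choose_tier("automation", pinned="priority"))[0])  # flex
print(effective_rates("o4-mini-deep-research", choose_tier("automation"))[0])       # normal
```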

<Tabs>
  <Tab title="Normal">
    | Model                 |  Input | Cached input |  Output |
    | --------------------- | -----: | -----------: | ------: |
    | gpt-5.5               | \$6.00 |       \$0.60 | \$36.00 |
    | gpt-5.4               | \$3.00 |       \$0.30 | \$18.00 |
    | gpt-5.4-mini          | \$0.90 |       \$0.09 |  \$5.40 |
    | gpt-5.4-nano          | \$0.24 |      \$0.024 |  \$1.50 |
    | o4-mini-deep-research | \$2.40 |       \$0.60 |  \$9.60 |
  </Tab>

  <Tab title="Flex">
    | Model                 |         Input |  Cached input |        Output |
    | --------------------- | ------------: | ------------: | ------------: |
    | gpt-5.5               |        \$3.00 |        \$0.30 |       \$18.00 |
    | gpt-5.4               |        \$1.50 |        \$0.15 |        \$9.00 |
    | gpt-5.4-mini          |        \$0.45 |       \$0.045 |        \$2.70 |
    | gpt-5.4-nano          |        \$0.12 |       \$0.012 |        \$0.75 |
    | o4-mini-deep-research | Not available | Not available | Not available |
  </Tab>

  <Tab title="Priority">
    | Model                 |         Input |  Cached input |        Output |
    | --------------------- | ------------: | ------------: | ------------: |
    | gpt-5.5               |       \$12.00 |        \$1.20 |       \$72.00 |
    | gpt-5.4               |        \$6.00 |        \$0.60 |       \$36.00 |
    | gpt-5.4-mini          |        \$1.80 |        \$0.18 |       \$10.80 |
    | o4-mini-deep-research | Not available | Not available | Not available |
  </Tab>
</Tabs>
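To make the per-1M-token rates concrete, here is a minimal sketch that estimates the cost of one message at Normal rates. The function name and the token counts are hypothetical, and the sketch assumes that uncached input, cached input, and output tokens are each billed at their own rate; actual billing is whatever the Usage page reports.

```python
from decimal import Decimal

# Normal-tier Notis rates in dollars per 1M tokens, copied from the Normal tab.
NORMAL_RATES = {
    "gpt-5.5": {"input": Decimal("6.00"), "cached_input": Decimal("0.60"), "output": Decimal("36.00")},
    "gpt-5.4": {"input": Decimal("3.00"), "cached_input": Decimal("0.30"), "output": Decimal("18.00")},
    "gpt-5.4-mini": {"input": Decimal("0.90"), "cached_input": Decimal("0.09"), "output": Decimal("5.40")},
}
PER_MILLION = Decimal("1000000")


def message_cost(model: str, input_tokens: int, cached_input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated dollar cost of one message (assumes `input_tokens` excludes cached tokens)."""
    rates = NORMAL_RATES[model]
    total = (
        input_tokens * rates["input"]
        + cached_input_tokens * rates["cached_input"]
        + output_tokens * rates["output"]
    )
    return total / PER_MILLION


# Hypothetical message: 20,000 fresh input, 80,000 cached input, 1,500 output tokens.
cost = message_cost("gpt-5.4", 20_000, 80_000, 1_500)
print(cost)  # about 0.111 dollars
```

Using `Decimal` rather than floats keeps small per-message amounts exact when they are summed over a billing cycle.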

Specialized tools use their own model rates:

| Tool model                                     | Pricing basis                                      |                                        Notis rate |
| ---------------------------------------------- | -------------------------------------------------- | ------------------------------------------------: |
| Nano Banana Pro (`gemini-3-pro-image-preview`) | Text input / image input / reasoning / text output | \$2.40 / \$2.40 / \$14.40 / \$14.40 per 1M tokens |
| Nano Banana Pro (`gemini-3-pro-image-preview`) | Image output                                       |                            \$144.00 per 1M tokens |
| gpt-image-2                                    | Text input / cached text input / text output       |           \$6.00 / \$1.50 / \$12.00 per 1M tokens |
| gpt-image-2                                    | Image input / cached image input / image output    |           \$9.60 / \$2.40 / \$36.00 per 1M tokens |
| whisper-1                                      | Transcription                                      |                               \$0.0072 per minute |
| gpt-4o-mini-tts                                | Text input / audio output                          |                    \$0.72 / \$14.40 per 1M tokens |
| sora-2-pro                                     | 720p video output                                  |                                 \$0.36 per second |
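For the tools billed per minute or per second, the estimate is a single multiplication. The sketch below uses the rates from the table above; the function names are hypothetical, and it assumes cost scales linearly with duration, since this page does not say how partial minutes or seconds are rounded.

```python
from decimal import Decimal

WHISPER_PER_MINUTE = Decimal("0.0072")  # whisper-1 transcription
SORA_720P_PER_SECOND = Decimal("0.36")  # sora-2-pro 720p video output


def transcription_cost(minutes: Decimal) -> Decimal:
    """Estimated dollar cost of transcribing `minutes` of audio."""
    return minutes * WHISPER_PER_MINUTE


def video_cost(seconds: Decimal) -> Decimal:
    """Estimated dollar cost of `seconds` of 720p video output."""
    return seconds * SORA_720P_PER_SECOND


print(transcription_cost(Decimal("12")))  # 0.0864
print(video_cost(Decimal("8")))           # 2.88
```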

### What happens when I reach my limit?

Once your monthly usage is exhausted, Notis notifies you and prompts you to upgrade to a higher tier, activate **On-Demand Usage**, or wait for your billing cycle to reset. On teams, this happens when the shared team pool is exhausted.

On-Demand is optional. If you leave it disabled, Notis pauses paid work once you reach your plan cap. If you enable it, Notis can keep working past your included usage and bill the extra usage separately. You can choose a fixed monthly On-Demand limit or use Unlimited mode.

Read the full guide here: [On-Demand Usage](/account-and-settings/billing/on-demand-usage).
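The On-Demand behavior described above can be summarized as a small decision function. This is a sketch under stated assumptions, not Notis's implementation: the function and parameter names are hypothetical, `None` stands in for Unlimited mode, and the sketch assumes paid work pauses again once a fixed On-Demand limit is spent, which this page does not state explicitly.

```python
from decimal import Decimal
from typing import Optional


def can_run_paid_work(
    used: Decimal,
    included: Decimal,
    on_demand_enabled: bool,
    on_demand_limit: Optional[Decimal] = None,  # None stands for Unlimited mode
) -> bool:
    """Decide whether paid work may continue in the current billing cycle."""
    if used < included:
        return True  # still inside the plan's included usage
    if not on_demand_enabled:
        return False  # On-Demand disabled: paid work pauses at the cap
    if on_demand_limit is None:
        return True  # Unlimited mode
    # Assumption: work pauses once the fixed On-Demand limit is spent.
    return used - included < on_demand_limit


print(can_run_paid_work(Decimal("150"), Decimal("150"), False))                # False
print(can_run_paid_work(Decimal("150"), Decimal("150"), True, Decimal("50")))  # True
print(can_run_paid_work(Decimal("200"), Decimal("150"), True, Decimal("50")))  # False
print(can_run_paid_work(Decimal("900"), Decimal("150"), True))                 # True
```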
