theJugglingCompany.com

Blog · 22 May 2026 · 7 min read BrainTech

Count Your Props Before You Start

You cannot throw what is not in your hands. Before building, check what you actually have - API rate limits, SaaS tier caps, hardware ceilings. The pattern you design must fit the props you own.

Two cupped hands cradling three glowing RGB balls - one red, one green, one blue - against a dark background

Before a juggler starts, they do something that looks invisible: they count.

Not because they do not know how many props they have. Because the count is where the planning lives. Three balls means a three-ball pattern - the cascade, the shower, the columns. Four balls means four-ball patterns. Five means five. You do not design the pattern and then count the props. You count the props and then design the pattern.

This sequence matters more than it sounds. A pattern designed for the wrong number of props is not ambitious. It is broken from the first throw.

3
types of limits
Rate limits, quota limits, and hard platform limits
1st
real work
Counting your props is not a step before the work - it is the first work
0
fixes available
Hard limits cannot be changed by cleverness, money, or architecture

What the hands hold is the constraint

The inventory is whatever is in hand right now. Not the props you are hoping to find. Not the props on order from somewhere. The actual count, today.

The pattern that emerges has to fit that count. Shannon’s juggling theorem (1980) made this exact: F + D = N(B + E), where flight time, dwell time, hands, balls per hand, and empty time are not independent variables. Change the ball count and every other term in the pattern shifts. There is no pattern that works at the wrong count.

Building with software - particularly cloud infrastructure, SaaS platforms, or external APIs - is the same discipline. And it is violated constantly. People design systems for the resources they expect, or the tier they plan to upgrade to, or the API limits they read about in a blog post, rather than the actual constraints of the plan they are currently on.

The three categories of limits

Every platform you build on has three kinds of limits worth counting before you start.

Rate limits are the props you can juggle per unit of time. An LLM API might allow 60 requests per minute on a free tier. A geocoding API might cap at 1,000 calls per day. If your pattern requires 200 calls per minute and you have 60, you do not have a scaling problem you can solve by being clever. You have a counting problem. The number is wrong.

Quota limits are the total you can consume within a billing period. Storage quotas, compute hours, token budgets, seats. These deplete continuously and do not reset until the period ends. Running into a quota in production is not a failure mode worth engineering around. It is a math error that happened before any code was written.

Hard limits are things the platform will simply not do regardless of tier or price. Maximum context window for a model. Maximum concurrent database connections. Maximum file upload size. No amount of money or architectural cleverness changes these. They are the physics of the platform you chose.

Rate LimitsQuota LimitsHard Limitsprops per unit timetotal per billing periodplatform physics60 req/min1,000 calls/day10 concurrentstorage quotatoken budgetcompute hourscontext windowmax connectionsupload file sizeFix: throttle, queue,or upgrade tierFix: budget ahead,track depletion rateFix: none. Choosea different platform.Count all three before writing code. Rate and quota limits have engineering responses. Hard limits do not.
Three categories of platform limits - each requires a different kind of counting before you design the pattern

Open source and hardware share the same structure

This is not only a SaaS concern. Self-hosted systems have their own inventory that requires the same counting.

A machine has RAM, CPU cores, disk I/O throughput, and network bandwidth. These are fixed at a given point in time. A pattern that requires 64GB of working memory on a 16GB machine is not a promising approach to cut down later - it is a miscounted inventory. An inference workload that assumes 4 parallel processes on a 2-core server will run at half the expected speed at best and not at all at realistic load.

Open source projects have similar structures in their documented hardware requirements. Running a vector database, a message broker, and a model inference server on a single low-memory virtual machine is juggling four props when you have hands for two. The props do not care that you optimistically believed otherwise.

Juggling Cloud and infrastructure planning
Count the props before designing the pattern Read rate limits and quotas before writing code that calls an API
Three balls means a three-ball pattern - not a four-ball plan Design for the tier you are on, not the tier you expect to upgrade to
Pattern designed for wrong count breaks on first throw System designed for wrong limits fails at first production load
Four props, hands for two: immediate drop Vector DB + broker + inference server on 8GB RAM: immediate OOM
Hard limit: physics of the human body Hard limit: platform maximum context window or connection count
The prop count before a juggling routine maps directly to resource counting before system design

The discipline that gets skipped

The pattern for checking limits is straightforward: before writing code that calls an API, read the rate limits in the documentation. Before designing a database schema, verify the storage tier you are on. Before deploying to a machine, run the actual workload on that machine and measure what it costs.

None of these steps is exciting. All of them feel like delay. Schedule pressure makes them easy to skip with the justification that you can handle limits later when you hit them.

The count is not a bureaucratic step before the real work. It is the first real work.

But a juggler who starts throwing before knowing what is in their hands is not a juggler who will run out of time mid-pattern. They are a juggler who was going to drop something on the first throw. The count is not a bureaucratic step before the real work. It is the first real work.

The three balls are not a limitation

A good three-ball routine, run cleanly, is worth more than a four-ball pattern that drops on the first throw. The props you have are not a placeholder on the way to something bigger. They are what you are working with right now, and the pattern you build for them should be good enough to run perfectly.

Know your props. Then build the pattern.


Related: The Wrong Number - on what happens when the count is off and the pattern has already started.