Notes on time complexity
Imagine we’re given the task of sorting a list of names for a search engine, a billion of them. There are dozens of ways to sort. Each one solves the problem, but some finish in seconds, others in hours. Algorithm analysis helps us compare them.
We mainly care about running time. We also consider memory, but time is the star. Why? Even if you have enough memory, an algorithm that takes too long can make a program useless.
But it runs fast on my machine!
When we write any algorithm, we want it to be fast, reliable, and most importantly, ready for the real world. But how do we know if it’s fast enough compared to another algorithm doing the same task?
You could run both algorithms on your laptop and time them. But that result isn’t universal. An algorithm that flies on a high-end computer or server might limp on a budget PC. And what about production, where the hardware could be anything from a cloud cluster to an old desktop?
Execution time depends on hardware, programming language, and even your coding style. Counting the number of statements or instructions doesn’t help either, because a “single step” in Python is not the same as in C++ or Java. To pick the best algorithm, we need a universal yardstick. It should be tied to the algorithm itself, not the machine it runs on or the programming language we are using.
Let’s think about input size: n. It’s the number of elements we’re working with: the length of an array, the nodes in a graph, the bits in a number. Bigger n means more work — more numbers to sort, more data to process. If you express running time as a function of n, say T(n), you get a machine-independent measure. Now you can compare algorithms head-to-head, no stopwatch needed.
Good! But what is in T(n)?
T(n) is a function of the input size n. It counts operations — the basic steps like adding numbers, comparing values, or reading memory. Suppose you’re summing a list of n numbers. You loop through each, adding one at a time. That’s roughly n operations, so T(n) = n.
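A minimal sketch of that summing loop (assuming a plain Python list; the function name is just for illustration):

```python
def sum_list(numbers):
    """Sum a list by visiting every element once: about n additions."""
    total = 0                 # constant setup work
    for value in numbers:     # the loop body runs n times
        total += value        # one addition per element
    return total

print(sum_list([3, 1, 4, 1, 5]))  # 14
```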
Now imagine sorting that list with Bubble sort. You compare pairs of numbers, possibly in nested loops, checking each item against others. That could lead to roughly n² comparisons, plus some other setup steps, giving T(n) = n² + 4n + 100.
Those terms in T(n) — like n², 4n, or 100 — reflect the algorithm’s moves. The n² comes from the nested loops, the heavy lifting. The 4n might be a single loop for swaps or checks. The 100 is fixed work, like setting up variables.
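Here is a bare-bones Bubble sort sketch (a generic textbook version, not necessarily the exact variant those term counts were derived from), just to show where the nested loops, the linear work, and the fixed setup live:

```python
def bubble_sort(items):
    """Sort a list in place with repeated passes of neighbour comparisons."""
    n = len(items)                      # fixed setup work
    for i in range(n):                  # outer loop: n passes
        for j in range(n - 1 - i):      # inner loop: the ~n^2 comparisons
            if items[j] > items[j + 1]:
                # swap the out-of-order neighbours
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

print(bubble_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```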
We count these operations on an idealized machine, where each step takes the same time. This way, we ignore real-world quirks like slow CPUs or cache delays. It’s a clean way to focus on the algorithm’s logic.
Inputs are not always the same
Not all inputs behave the same for a given algorithm. Some inputs take a long time, while others let the algorithm finish almost instantly.
For example, a sorted list might let a sorting algorithm finish early, while a reverse-sorted list forces it through every step and takes more time. That’s why we analyze every type of input:
- Best Case: The luckiest input, where the algorithm runs fastest. For linear search, finding the item in the very first position takes 1 step — blazing fast.
- Worst Case: The toughest input, where the algorithm limps. In linear search, the item is last or missing, requiring n steps.
- Average Case: The typical scenario, assuming random inputs. For linear search, you check about half the list, so roughly n/2 steps.

From these, the worst case is our safety net — it guarantees performance even under attack or the worst possible scenario. The average case predicts real-world use but needs assumptions about input patterns. The best case is nice but rare. The linear search sketch below makes the three cases concrete.
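A minimal linear search sketch (the helper name and the sample list are just illustrative):

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is absent."""
    for i, value in enumerate(items):
        if value == target:
            return i              # found it: stop early
    return -1                     # scanned all n elements

data = [7, 3, 9, 4, 1]
print(linear_search(data, 7))     # best case: first element, 1 comparison
print(linear_search(data, 1))     # worst case: last element, n comparisons
print(linear_search(data, 8))     # worst case: missing, n comparisons
```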
Each case gives its own T(n). For example, in the worst case an algorithm’s T(n) might be n² + 4n + 100, while in the best case it might be just n. You see, we need a better way to describe these cleanly.
Making T(n) simpler
Most of the time, for any algorithm, we want to know what happens when our input size is huge, like really huge! And with huge inputs in mind, not all parts of T(n) matter equally when measuring running time. For huge n, the biggest term takes over.
In T(n) = n² + 4n + 100, when n = 1,000,000, the n² term is a trillion, while 4n is just 4 million, and the 100 is a speck. So the quadratic term dominates the others.
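A quick sanity check, just evaluating the three terms of that example T(n) in Python:

```python
n = 1_000_000
print(n ** 2)   # 1000000000000 -> a trillion
print(4 * n)    # 4000000       -> four million
print(100)      # 100           -> barely registers
```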
This thinking is about the “rate of growth”: how running time scales with the input. This is what computer scientists call the algorithm’s time complexity. Initially, we consider all terms, including the smaller contributions, to understand the exact number of steps. Later, for large inputs, we focus on the dominant term, because it tells us how the algorithm behaves as the input grows.
It’s like budgeting: if you’re buying a car and a coffee, the car’s cost overshadows the rest. We focus on the dominant term because it decides the algorithm’s fate for large inputs. So, T(n) = n² + 4n + 100 behaves essentially like n² once n is large.
Asymptotic notation
Previously, we had T(n) carrying a lot of unnecessary terms instead of just the dominant one. We’re going to change that here, using asymptotic notation and the rate-of-growth idea to make algorithm analysis simpler. So, what does “asymptotic” mean here?
A function f(n) is asymptotic to another function g(n) if:

f(n) / g(n) → 1 as n → ∞
This means:
- For very large n, f(n) and g(n) grow at almost the same rate.
- g(n) is like a “simpler approximation” of f(n) at infinity. Let’s say we’ve got f(n) = n² + 4n + 100. Here, for a very large n, the 4n + 100 part barely matters. So we can say that as n grows without bounds, f(n) is asymptotic to n², as the short check below shows.
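To see the ratio settle toward 1, here is a tiny numeric check using that example f(n) and g(n) = n²:

```python
def f(n):
    return n ** 2 + 4 * n + 100   # the example running-time function

def g(n):
    return n ** 2                 # the simpler approximation

for n in (10, 1_000, 100_000, 10_000_000):
    print(n, f(n) / g(n))
# the ratio approaches 1.0 as n grows, so f(n) is asymptotic to n^2
```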
We’re going to do the same here: we’ll focus on the behavior of T(n) as n approaches infinity. We’ll ignore the minor terms of T(n) and zoom in on the dominant-term trend.
So our task is to find some simpler g(n) for any T(n), so we can analyze the running time of an algorithm asymptotically instead of juggling all the terms.
Now see, there is some flexibility here. We could pick a g(n) that grows slightly faster than T(n), or one that grows slightly slower. For example, if T(n) = n² + 4n + 100, a natural choice is g(n) = n², which closely tracks the growth of T(n). But we could also pick g(n) = n³, which will eventually be larger than T(n), giving an upper estimate, or g(n) = n, which underestimates it. These ideas of upper and lower estimates are formalized through Big-O, Big-Ω, and Big-Θ.
Big-O: analyzing the upper bound
Big-O sets a ceiling on growth. It describes an upper bound, meaning the function will never grow faster than this, for any sufficiently large input size n. If f(n) = O(g(n)), then f(n) grows no faster than g(n), up to a constant. Formally, there are constants c > 0 and n₀ so that

f(n) ≤ c · g(n) for all n ≥ n₀
Notice something important: there are infinitely many valid choices for g(n). For example:
- If f(n) = n², then technically f(n) = O(n²), but also f(n) = O(n³), f(n) = O(n⁴), or even f(n) = O(2ⁿ).
- All of these are valid upper bounds because they grow at least as fast as n². So Big-O is really a set of functions that provide an upper estimate. In practice, we always choose the “smallest” or tightest one, which gives the clearest description of growth.
For example, if f(n) = n² + 4n + 100, then the tightest choice is O(n²), because n² dominates and we ignore the smaller terms. Big-O is about safety: “It won’t get worse than this.” That’s why linear search is O(n) and binary search is O(log n).
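As an illustration of the formal condition, here is a small check on the example f(n) with g(n) = n². The constants c = 2 and n₀ = 13 are hand-picked for this example, not derived from anything deeper:

```python
def f(n):
    return n ** 2 + 4 * n + 100

c, n0 = 2, 13   # hand-picked: 2 * n^2 >= n^2 + 4n + 100 once n >= 13

# verify f(n) <= c * g(n) with g(n) = n^2 over a large range of n >= n0
print(all(f(n) <= c * n ** 2 for n in range(n0, 100_000)))  # True
```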
Big-Ω: analyzing the lower bound
Big-Ω sets a floor. If f(n) = Ω(g(n)), then f(n) grows at least as fast as g(n). Formally, there are constants c > 0 and n₀ so that

f(n) ≥ c · g(n) for all n ≥ n₀
Similar to Big-O, notice something important: there are infinitely many valid choices for g(n). For example:
- If f(n) = n², then technically f(n) = Ω(n²), but also f(n) = Ω(n), f(n) = Ω(log n), or even f(n) = Ω(1).
- All of these are valid lower bounds because they grow no faster than n². So Big-Ω is really a set of functions that provide a lower estimate. In practice, we usually choose the “largest” or tightest one, which gives the clearest description of growth. Hence, we seek the largest g(n) that stays below f(n).
For example, if f(n) = n² + 4n + 100, then f(n) = Ω(n²), because for large n, n² gives a lower bound on the growth of f(n).
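The mirror-image check for the lower bound, again with hand-picked constants (c = 1, n₀ = 1):

```python
def f(n):
    return n ** 2 + 4 * n + 100

c, n0 = 1, 1    # hand-picked: n^2 <= n^2 + 4n + 100 for every n >= 1

# verify c * g(n) <= f(n) with g(n) = n^2 over a large range of n >= n0
print(all(c * n ** 2 <= f(n) for n in range(n0, 100_000)))  # True
```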
Big-Ω says, “It won’t be faster than this.” It’s less commonly discussed than Big-O, but it helps confirm an algorithm’s minimum cost.
Big-Θ: analyzing both bounds
So far:
- Big-O gave us a ceiling → “it won’t grow faster than this.”
- Big-Ω gave us a floor → “it won’t grow slower than this.”

But what if we want the exact growth rate, not just loose boundaries? For example, if f(n) = n² + 4n + 100, then technically f(n) = O(n³) and f(n) = Ω(n). Those are true, but not very informative. What we’d really like is a description that pins down the growth rate from both sides at once.
That’s where Big-Θ (Theta) comes in. It is not too high, not too low, but just right. Formally, f(n) = Θ(g(n)) if there exist positive constants c₁, c₂, and n₀ such that for all n ≥ n₀:

c₁ · g(n) ≤ f(n) ≤ c₂ · g(n)
This means f(n) is squeezed between two multiples of g(n) when n gets large. So it doesn’t just grow at most like g(n) or at least like g(n) — it grows exactly like g(n) (up to constant factors). Example, with f(n) = n² + 4n + 100:
- Upper bound: f(n) = O(n²)
- Lower bound: f(n) = Ω(n²)
- Together: f(n) = Θ(n²), which the quick check below confirms numerically.
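Putting the two earlier checks together gives the sandwich c₁ · n² ≤ f(n) ≤ c₂ · n² (constants again hand-picked for this example):

```python
def f(n):
    return n ** 2 + 4 * n + 100

c1, c2, n0 = 1, 2, 13   # hand-picked constants for the two-sided bound

print(all(c1 * n ** 2 <= f(n) <= c2 * n ** 2 for n in range(n0, 100_000)))  # True
```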
Everything together
Now let’s see how best, worst, and average cases relate to Big-O, Big-Ω, and Big-Θ for linear search.
Worst Case (T(n) = n):
- Big-O: O(n) → the search will take at most linear time.
- Big-Ω: Ω(n) → the search takes at least linear time in the worst scenario.
- Big-Θ: Θ(n) → exactly linear, up to constants.
Best Case (T(n) = 1):
- Big-O: O(1) → the search will take at most constant time (tightest bound). Technically also O(n), O(n²), … but these are looser.
- Big-Ω: Ω(1) → at least constant time.
- Big-Θ: Θ(1) → exactly constant.
Average Case (T(n) ≈ n/2):
- Big-O: O(n) → at most linear.
- Big-Ω: Ω(n) → at least linear.
- Big-Θ: Θ(n) → exactly linear on average.
Does O(n) mean “worst-case time complexity”?
Not necessarily! Saying an algorithm’s time complexity is O(n) just means its running time grows no faster than linear, but it doesn’t automatically mean worst case. We need to specify the case. Let’s take linear search as an example:
- Worst Case: The item is last or missing. You check all n elements, so T(n) = n.
- Tight Big-O: O(n).
- Best Case: The item is first. You check only one element, so T(n) = 1.
- Tight Big-O: O(1).
- Technically, Big-O is flexible: the best case is also O(n) or even O(n²), but those are looser, less informative bounds.
So next time you see “O(n)”, ask: best, worst, or average case? Big-O, Big-Ω, or Big-Θ alone don’t tell the whole story.
So…
- Big-O provides an upper bound — “it won’t grow faster than this.”
- Big-Ω provides a lower bound — “it won’t grow slower than this.”
- Big-Θ gives a tight bound — the function grows exactly at this rate (up to constant factors).
- In practice, we most often discuss worst-case performance using Big-O notation, because knowing the maximum time an algorithm could take is critical for guaranteeing performance.
- Remember: case analysis (best/worst/average) and asymptotic notation (O, Ω, Θ) are independent concepts. You can describe any case with any notation, though by convention we typically focus on worst-case Big-O.
Have a great day!