In the timing experiments of the previous
section, you may have noticed that the running time of the block
increases approximately by a factor of four when the size of the list
is doubled; also, the running time increases approximately by a factor
of 100 when the size of the list is increased by a factor of ten.
In general, as we scale the size of the list by a certain amount, we
also scale the running time of the block by the square of that
amount. If we were to plot the running time of the block against
the size of the list, we would get a parabola, the graph of a quadratic
function: along a parabola, scaling the x-coordinate by some factor
scales the y-coordinate by the square of that factor. Algorithms similar
to the one used by this block are thus called quadratic-time algorithms.
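The original block is visual, but the same idea can be written as a short Python sketch; the function name `has_duplicates`, the test sizes, and the timing loop below are illustrative assumptions, not the block from the previous section.

```python
import random
import time

def has_duplicates(numbers):
    """Return True if any value appears more than once in the list."""
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            if numbers[i] == numbers[j]:
                return True
    return False

for size in [1000, 2000, 4000]:
    # Distinct values force the worst case: every pair gets compared.
    data = random.sample(range(10 * size), size)
    start = time.perf_counter()
    has_duplicates(data)
    print(size, round(time.perf_counter() - start, 4))
# Doubling the list size roughly quadruples the elapsed time.
```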
Why does the algorithm that checks whether the elements of a list are
distinct run in quadratic time? Let's count how many comparisons we
have to do in the worst case for a list of length N. We compare the
first number against (N - 1) other numbers, the second number against
(N - 2) numbers, the third number against (N - 3) numbers, and so on,
until we reach the last number (if we're unlucky), which has no
remaining numbers to compare against. In total, we have to do
(N - 1) + (N - 2) + (N - 3) + ... + 2 + 1 comparisons, which, according
to Gauss, is [N * (N - 1)]/2, or (N² - N)/2, comparisons. For large
values of N, the N² term begins to eclipse the value of N:
Size of input | N | N² |
---|---|---|
N = 10 | 10 | 100 |
N = 100 | 100 | 10,000 |
N = 1,000 | 1,000 | 1,000,000 |
N = 10,000 | 10,000 | 100,000,000 |
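As a quick check on the arithmetic, a hypothetical counting helper (not the original block) can tally the worst-case comparisons made by the nested check and compare the total against Gauss's formula:

```python
def count_comparisons(n):
    """Count comparisons made by the nested-loop check on n distinct values."""
    comparisons = 0
    for i in range(n):
        for j in range(i + 1, n):
            comparisons += 1
    return comparisons

for n in [10, 100, 1000]:
    print(n, count_comparisons(n), n * (n - 1) // 2)
# Both columns agree: 45, 4950, and 499500 comparisons.
```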
The difference between N² and N is even more pronounced for larger
values of N. Since we only care about the runtime of algorithms for
really large inputs, we say that the contribution of the N term is
essentially negligible, and that the runtime of the algorithm varies as N².
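To see how quickly the N term fades next to N², we can compare the exact comparison count (N² - N)/2 with the simpler N²/2; the numbers below are plain arithmetic, not part of the original text.

```python
for n in [10, 1000, 100000]:
    exact = (n * n - n) / 2   # exact worst-case comparison count
    approx = (n * n) / 2      # approximation that drops the N term
    print(n, exact, approx, round(exact / approx, 5))
# For N = 100,000 the two differ by only about 0.001%.
```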
You may find the approximation of (N² - N)/2 as N² a bit shocking
(mathematicians definitely do); however, in computer science we make
many such approximations, both because we don't want to drown in
detail and because this runtime analysis is rough and approximate to
begin with. The overarching theme of abstraction is evident here: we
want to abstract away from the details of the particular computer the
algorithm will eventually run on, and instead get a sense of how the
runtime of an algorithm behaves for larger and larger inputs.
We would like to answer questions such as: does one algorithm slow down more than another as its input grows larger? If so, can we make the algorithm faster or better? Answering these questions can sometimes lead to surprising results, as you saw when comparing two different algorithms for summing the numbers in a list.