Why Zero Raised to the Zero Power IS One

The question of what value {0^0} should evaluate to has been discussed since the time of Euler (1700s). There are three candidate choices: 1, 0, or “indeterminate” (i.e., throw an error).[1]

In this article, I argue that the only reasonable choice (for discrete mathematics) is {0^0 = 1}, and I’ll give a tangible, feel-the-grit-in-your-palms reason why it can’t be any other way.

There are important implications for software developers. In particular, the developers of various popular mathematical computing platforms have adopted differing conventions, among them R, Octave, Maxima, Ruby, Google, Excel, and various software calculators.



Context of the Debate: Discrete and Continuous Mathematics

The reason for having three choices becomes clear when considering {x^y} as a function of two continuous variables. If we fix {x > 0} and let {y \rightarrow 0}, we have {x^y \rightarrow 1}. But if we fix {y > 0} and let {x \rightarrow 0}, then we have {x^y \rightarrow 0}. Clearly, {x^y}, as a function of two variables, is discontinuous at {(0,0)}: the limiting value depends on the path of approach. This is why {0^0} is counted among the indeterminate forms in Calculus, where L’Hôpital’s rule treats limits of ratios of quantities that each tend to 0 by considering the relative rates of their approach (i.e., the limit of the ratio of the derivatives).
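The path dependence can be seen numerically. The following is a minimal Python sketch, not part of the original argument; the helper name power_along_path and the choice of the curve {y = c / \ln x} are illustrative assumptions. Along that curve, {x^y = e^c} for every {x}, so the value near {(0,0)} can be made to be any positive number.

```python
import math


def power_along_path(x: float, c: float) -> float:
    """Evaluate x**y on the curve y = c / ln(x) (hypothetical helper).

    For c < 0 and 0 < x < 1, y is positive and tends to 0 as x -> 0+,
    so the point (x, y) approaches (0, 0) while x**y stays equal to e**c.
    """
    y = c / math.log(x)
    return x ** y


for x in (1e-2, 1e-4, 1e-8):
    along_x_axis = x ** 0.0      # y held at 0, x shrinking: always 1.0
    along_y_axis = 0.0 ** x      # x held at 0, y shrinking: always 0.0
    along_curve = power_along_path(x, -1.0)  # stays near exp(-1)
    print(x, along_x_axis, along_y_axis, along_curve)
```

Three different routes to the origin give three different limiting values, which is all that "indeterminate" means in the continuous setting.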

But in discrete mathematics, the situation is somewhat different. There is no “approaching”: you are either at {0^0} or you are away from it, at {1^0 = 1} or at {0^1 = 0}.[2]

Now, Euler in the 1700s and, in more recent times, Knuth (of The Art of Computer Programming and TeX fame) each argued strongly that {0^0 = 1}. Knuth based his argument on consistency with the binomial theorem and on how frequently the expression appears in discrete mathematical computations:

“Some textbooks leave the quantity 0^0 undefined, because the functions 0^x and x^0 have different limiting values when x decreases to 0. But this is a mistake. We must define x^0=1 for all x, if the binomial theorem is to be valid when x=0, y=0, and/or x=-y. The theorem is too important to be arbitrarily restricted! By contrast, the function 0^x is quite unimportant.”
– from Concrete Mathematics, p.162, R. Graham, D. Knuth, O. Patashnik, Addison-Wesley, 1988
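The cases named in the quote can be checked directly. This is a small Python sketch under the assumption that integer exponentiation follows Python's built-in convention, in which 0 ** 0 evaluates to 1; the function name binomial_expansion is made up for the illustration.

```python
from math import comb


def binomial_expansion(x: int, y: int, n: int) -> int:
    """Right-hand side of the binomial theorem: sum of C(n,k) * x^k * y^(n-k)."""
    return sum(comb(n, k) * x**k * y ** (n - k) for k in range(n + 1))


# (x, y, n) triples covering x = 0, y = 0, and x = -y with n = 0.
cases = [(0, 3, 4), (2, 0, 3), (2, -2, 0), (0, 0, 0)]

for x, y, n in cases:
    # Each comparison relies on 0**0 == 1 somewhere on one side or the other.
    print((x, y, n), binomial_expansion(x, y, n) == (x + y) ** n)
```

With {x = 0}, the only surviving term of the expansion is the one carrying {x^0}, so the theorem holds at that point only if {0^0 = 1}.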

Different Conventions Among Mathematical Computing Platforms

Though “majority opinion” is never an acceptable reason in mathematics, it is interesting to consider how implementations of {0^0} differ even among common modern computing platforms:

  • Google: 0^0 is 1.
  • Ruby: 0^0 is 1.
  • R: 0^0 is 1.
  • Octave: 0^0 is 1.
  • Microsoft’s Calculator: 0^0 is 1.

  • Maxima: 0^0 is indeterminate (throws an error).
  • Microsoft Excel (2000): 0^0 is indeterminate (throws an error).

  • Hexalon Max (software calculator): 0^0 is 0.
  • TI-36 Hand calculator: 0^0 is 0.
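Conventions can differ even within a single platform. As one more data point, not drawn from the list above, here is how Python's standard library behaves; the sketch assumes the default decimal context, and the helper name decimal_zero_to_zero is made up for the illustration.

```python
import math
from decimal import Decimal, InvalidOperation


def decimal_zero_to_zero() -> str:
    """Report what the decimal module does with 0 ** 0 (hypothetical helper)."""
    try:
        return str(Decimal(0) ** Decimal(0))
    except InvalidOperation:
        # The default context traps this signal and raises an exception.
        return "InvalidOperation"


print("int   0 ** 0        :", 0 ** 0)
print("float 0.0 ** 0.0    :", 0.0 ** 0.0)
print("math.pow(0.0, 0.0)  :", math.pow(0.0, 0.0))
print("Decimal(0)**Decimal(0):", decimal_zero_to_zero())
```

The built-in integer and floating-point powers return 1, while the decimal module treats the same expression as an invalid operation.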

Principles for a Decision in Mathematics: Extension and Consistency

Often in mathematics, where more than one choice is available or a definitional decision is required, the decision proceeds in two steps: first one argues that extending an operation into otherwise undefined territory is desirable, and then one chooses the extension that stays consistent with the evidence already accumulated and accepted. It is precisely in this way that ordinary multiplication of positive numbers is extended, first to multiplication by a single negative number, then to multiplication by two negative numbers, i.e. {(-1)(-1) = 1}.

To see that we must have (-1)(-1) = 1, take the view of elementary mathematics as an empirical science and not as an axiomatic science, since it is as an empirical science that mathematics is taught to young children, and where this arithmetic rule is first encountered, typically along with some sort of reference to authority:

“Minus times minus is plus.
The reason for this we need not discuss!”
— W.H. Auden

From the point of view of an empirical science, the multiplication of two positive numbers has a well-defined, tangible meaning as repeated addition, and this meaning remains valid even when one of the factors is extended to a negative number. But a product with both factors negative is ill-defined from this point of view.

It is at this point that the mathematician sees an opportunity. The consequence of declaring something to be undefined (throwing an error) is an enormous loss of efficiency: every product now has to be checked for the case that both factors are negative, and that case has to be treated separately. If a definition could be found that remains consistent with all other empirically obtained rules, and if that definition means that multiplication can proceed indifferent to the sign of the factors, then that is a big win. The consistency requirement in this particular case is the distributivity of multiplication over addition, a law which, for positive numbers, can be accepted on entirely empirical grounds. (The full argument is given in footnote 3.)

Forced to a Decision: A Tangible Computation That Requires an Answer

I claim that the case of 0^0 is similar. Not only would it be a significant loss of efficiency to treat this case separately, but indeed, in the finite summation of integer powers, we have a problem with a real, tangible result (a finite sum), whose value (an empirically determinable fact) depends unavoidably on the chosen value of 0^0. The crucial step in this argument occurs in the derivation of (*1b) from (*1a) in Finite Summation of Integer Powers, Part 2.

Extracting the relevant part of that derivation, we have:

\sum_{K=0}^{N-1} (N-K)(K+1)^{p-1}\ \ \ \ \ \mbox{(*1a)}

After expanding the binomial power using the binomial formula and further manipulation, we arrive at:

= \sum_{j=0}^{p-1} \binom{p-1}{j} \sum_{K=0}^{N-1} \left[NK^j -K^{j+1}\right]

(Pull the {K=0} term out of both summations. Note: {0^0 = 1}.) (***)

= N + \sum_{j=0}^{p-1} \binom{p-1}{j} \sum_{K=1}^{N-1} \left[NK^j -K^{j+1}\right]

(which, after additional manipulation, yields)

= N + N(N-1) - S_p(N-1) + \sum_{j=1}^{p-1} S_j(N-1)\left[N\binom{p-1}{j} - \binom{p-1}{j-1}\right]

= N^2 - S_p(N-1) + \sum_{j=1}^{p-1} S_j(N-1)\left[N\binom{p-1}{j} - \binom{p-1}{j-1}\right]\ \ \ \ \ \mbox{(*1b)}

The key step happens in (***) above: we peel off the {K=0} term of the inner summation to get {N \cdot 0^j - 0^{j+1}}. Peeling this out of the outer summation requires considering the expression for all {j}. Now, 0 raised to any positive power is 0, so we can dismiss the case {j>0}; only the {j=0} term remains.
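Written out, the contribution of the {K=0} terms looks as follows. This is a worked restatement of the step, using only the fact that {0^m = 0} for every integer {m \geq 1}:

```latex
\sum_{j=0}^{p-1} \binom{p-1}{j} \left[ N \cdot 0^{j} - 0^{j+1} \right]
  = \binom{p-1}{0} \left[ N \cdot 0^{0} - 0^{1} \right]
    + \sum_{j=1}^{p-1} \binom{p-1}{j} \left[ N \cdot 0 - 0 \right]
  = N \cdot 0^{0}
```

So the term pulled out in front of the double sum is exactly {N \cdot 0^0}: it equals {N} if {0^0 = 1} and vanishes if {0^0 = 0}.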

But what is {0^0}? A decision must be made: it is either {0} or {1}. Indeterminacy is not an option, since the sum is a real, finite quantity and a value is required to continue the simplification.

The Argument for 0^0 = 1

What are the consequences of choosing the other definition, i.e. {0^0 = 0}? In this case, the final formula for {S_p(N)} is off by the linear term {-N}, while the choice {0^0 = 1} leads to the exact formula and a computed value that matches a brute force summation. For {S_5(10)}, the difference is between 220,825 (the correct, verifiable answer) and 220,815 (verifiably NOT correct). In the face of counting pebbles, trees, integer powers, the correct choice seems clear.
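The step can be checked by brute force. The sketch below is an illustration rather than the code behind the article: all function names are made up, {S_p(n)} is assumed to mean {1^p + 2^p + \dots + n^p}, and the sum in (*1a) is assumed to run over {K = 0, \dots, N-1}, as the expanded double sum does. It tests only the extracted step from (*1a) to (*1b), not the full formula for {S_p(N)}.

```python
from math import comb


def power(base: int, exp: int, zero_zero: int) -> int:
    """Integer power with the value of 0^0 supplied by the caller."""
    if base == 0 and exp == 0:
        return zero_zero
    return base ** exp


def S(p: int, n: int) -> int:
    """Brute-force power sum 1^p + 2^p + ... + n^p (empty sum for n < 1)."""
    return sum(k**p for k in range(1, n + 1))


def direct_sum(p: int, n: int) -> int:
    """(*1a) summed term by term; no 0^0 occurs because the base is K+1 >= 1."""
    return sum((n - k) * (k + 1) ** (p - 1) for k in range(n))


def expanded_sum(p: int, n: int, zero_zero: int) -> int:
    """Double sum obtained after the binomial expansion; K = 0, j = 0 hits 0^0."""
    return sum(
        comb(p - 1, j)
        * sum(n * power(k, j, zero_zero) - power(k, j + 1, zero_zero) for k in range(n))
        for j in range(p)
    )


def closed_form_1b(p: int, n: int) -> int:
    """Right-hand side of (*1b)."""
    tail = sum(
        S(j, n - 1) * (n * comb(p - 1, j) - comb(p - 1, j - 1)) for j in range(1, p)
    )
    return n * n - S(p, n - 1) + tail


p, n = 5, 10
print("direct (*1a)        :", direct_sum(p, n))
print("expanded, 0^0 = 1   :", expanded_sum(p, n, 1))
print("expanded, 0^0 = 0   :", expanded_sum(p, n, 0))
print("closed form (*1b)   :", closed_form_1b(p, n))
print("brute-force S_5(10) :", S(5, 10))
```

With {0^0 = 1} the expanded sum agrees with the direct sum and with (*1b); with {0^0 = 0} it falls short by exactly {N}.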

Compelling? For discrete mathematics, I think so. Certainly, the door on {0^0=0} as a candidate seems to shut, and the choice of abstaining by throwing an error is weakened as well. The evidence, I claim, provides strong support for adopting the following convention, by definition, at least in discrete mathematics:[4]

Definition (Empirical) {0^0 = 1} for {k^j}, where {k,j} are discrete variables.


(If you’re a software developer of a mathematical package, I’d be interested in how you arrived at your decision. You can send me an email using the Comments link below.)



Footnotes

  1. Further discussion of {0^0} is at: The Math Forum
  2. You might also be at {0^{-1}}, but that, I believe we would all agree, is division by zero, and so appropriately remains undefined (unless working in the extended reals, in which case it may be taken to be {+\infty}).
  3. Considering that any quantity times zero is zero, and that one times any quantity is the quantity, we have no hesitation in granting {-1 \times 0 = 0}. But then observe that we may write {0 = (1 + (-1))}, which means, combining the two expressions, we have {-1 (1 + (-1)) = 0}. If we accept the law of distribution of multiplication over addition for positive whole numbers, purely on empirical grounds, and if we wish negative numbers to behave in the same manner as our empirically accepted positive whole numbers, then we want the distributive law to hold as well. And therefore we have {0 = (-1)(1) + (-1)(-1) = -1 + (-1)(-1)}. This means that {(-1)(-1)} must be the opposite (additive inverse) of {(-1)}, and hence {(-1)(-1) = +1}.
  4. The implications for continuous mathematics are a consideration for another discussion. The statement that a discontinuity exists at the origin (x,y) = (0,0) isn’t quite enough.