We can finally get to the point. The extreme value theorem is a fundamental theorem about continuous functions that most engineers don’t learn. Most engineering math courses teach what is immediately useful and hence skip a lot of details covered by other courses. That’s okay, because the situation is being rectified here. The extreme value theorem itself is rather simple. Bernard Bolzano and Karl Weierstrass developed a more general version that we will use in higher dimensions. We’re still restricting ourselves to the one-dimensional case, so we’ll just call it the extreme value theorem for now.
The extreme value theorem says that if a function $f$ is continuous on a closed and bounded interval in $\mathbb{R}$, then $f$ attains both its minimum and maximum, at least once for each. We know what functions are; we also know what it means to be closed in $\mathbb{R}$ (in a very non-rigorous sense) and what intervals are. What’s missing is a better definition for continuous. Two other important things are going to be introduced here in order to motivate you to read past Step 3 and move on to Step 4 (which will include the idea of convexity).
Thing 1: The extreme value theorem is actually the combination of two sub-theorems, which nobody really talks about: lower semi-continuous functions on closed bounded subsets of $\mathbb{R}$ reach their minimums (at least once), and upper semi-continuous functions on closed bounded subsets of $\mathbb{R}$ reach their maximums (at least once). Optimization is normally formulated to only concern itself with minimizing things because maximization problems can be turned into minimization problems by flipping the sign. (Maximize $f(x)$ is in a certain sense the same problem as minimize $-f(x)$.) A function is continuous if and only if it is both lower semi-continuous and upper semi-continuous.
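Here is a minimal sketch of the "flip it" trick in Python. The brute-force grid search and the objective `g` are my own stand-ins for illustration (not a real optimizer); the only point is that the maximizer of `g` is found by minimizing `-g`.

```python
def argmin_on_grid(f, a, b, n=10001):
    """Return the grid point in [a, b] where f is smallest (brute force)."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    return min(xs, key=f)


def argmax_on_grid(f, a, b, n=10001):
    """Maximize f by minimizing -f on the same grid."""
    return argmin_on_grid(lambda x: -f(x), a, b, n)


# Hypothetical objective, chosen only for illustration: its peak is at x = 1.
def g(x):
    return -(x - 1.0) ** 2 + 3.0


x_star = argmax_on_grid(g, 0.0, 2.0)
print(x_star, g(x_star))
```

The maximum value itself is recovered as `g(x_star)`, i.e. minus the minimum value of `-g`.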
Thing 2: Compactness. Two mathematicians named Heine and Borel are famous for a theorem called the Heine-Borel theorem: sets in $\mathbb{R}$ are compact if and only if they are closed and bounded. This is a post I decided to turn into Optimization, Step 4.
Ok, I’ve whetted your appetite. Let’s get to the math.
Recall that a function being continuous is a property of topological spaces. Since every metric space is a topological space, it is sometimes easier to define continuity using metrics. We normally do the same for the definition of openness, neighbourhoods, etc. It’s okay to do this because I don’t optimize things in strict topological spaces that don’t have metrics. I want at least the structure that a metric gives us before I even think about optimizing, so it’s fine to formulate neighbourhoods, openness, and continuity in the context of a metric space.
There are many definitions of continuity, and they are all equivalent. They may not look the same, but they fundamentally are. First we need the definition of a ball as well as a neighbourhood (in the context of a metric space).
3.1 Definition: a ball
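Let $(X, d)$ be a metric space (recall $X$ is the set, $d$ is the metric, jointly referred to as a metric space). The open ball of radius $r > 0$ centered at a point $p \in X$ is defined by:
$B_r(p) = \{x \in X : d(x, p) < r\}$
Closed balls (with radius $r$ centered at $p$) are:
$\bar{B}_r(p) = \{x \in X : d(x, p) \leq r\}$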
Balls in one dimension are just intervals on the line, centered at $p$ with radius $r$ out to each side. An open ball at $p$ with radius $r$ is: $(p - r, p + r)$. (Why? $d(x, p) = |x - p| < r$, so $x - p < r$ and $-(x - p) < r$, which leads to $x < p + r$, the upper bound, and $x > p - r$, the lower bound.) Closed balls include the endpoints, of course. So then what’s a neighbourhood? A neighbourhood of a point $p$ is a set (we’ll call it $V$) such that there is an open ball centered at $p$ with radius $r > 0$ that is contained in $V$. Here’s the formal definition.
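To make the "balls are intervals" claim concrete, here is a small Python sketch assuming the usual metric $d(x, p) = |x - p|$ on the real line. The function names are mine, picked for illustration; the sample points are chosen to be exactly representable so the boundary cases behave as expected.

```python
def in_open_ball(x, p, r):
    """Membership in the open ball: d(x, p) = |x - p| < r."""
    return abs(x - p) < r


def in_open_interval(x, p, r):
    """Membership in the open interval (p - r, p + r)."""
    return p - r < x < p + r


def in_closed_ball(x, p, r):
    """Membership in the closed ball: |x - p| <= r (endpoints included)."""
    return abs(x - p) <= r


p, r = 2.0, 0.5
samples = [1.0, 1.5, 1.75, 2.0, 2.25, 2.5, 3.0]
for x in samples:
    # The two descriptions of the open ball agree at every sample point.
    print(x, in_open_ball(x, p, r), in_open_interval(x, p, r), in_closed_ball(x, p, r))
```

The endpoints 1.5 and 2.5 are excluded from the open ball but included in the closed one.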
3.2 Definition: a neighbourhood
Given a metric space $(X, d)$, a set $V \subseteq X$ is called a neighbourhood of a point $p \in X$ if there exists an open ball with centre $p$ and radius $r > 0$ such that: $B_r(p) = \{x \in X : d(x, p) < r\} \subseteq V$
That is, the ball has to be a subset of $V$, i.e. fully contained within $V$. If a ball centered at $p$ with some positive radius fits inside of $V$, then $V$ is a neighbourhood of $p$. Now, I repeat: the definitions of continuity that follow are equivalent.
3.3a Definition: continuity at a point
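A function $f: X \to Y$ between metric spaces $(X, d_X)$ and $(Y, d_Y)$ is continuous at a point $x_0 \in X$ if
for all $\varepsilon > 0$ there is a $\delta > 0$ such that $d_Y(f(x), f(x_0)) < \varepsilon$ if $d_X(x, x_0) < \delta$.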
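As a sanity check on the $\varepsilon$-$\delta$ definition, here is a rough numerical sketch for the real line with the metric $|x - y|$. Everything in it is my own illustration: the choice $f(x) = x^2$, the formula $\delta = \min(1, \varepsilon / (2|x_0| + 1))$ (which works because $|x^2 - x_0^2| = |x - x_0||x + x_0| < \delta(2|x_0| + 1)$ when $\delta \leq 1$), and the sampling check, which can only fail to find a counterexample rather than prove anything.

```python
def delta_for_square(x0, eps):
    """A delta that works for f(x) = x**2 at x0 (illustrative choice)."""
    return min(1.0, eps / (2.0 * abs(x0) + 1.0))


def check_eps_delta(f, x0, eps, delta, samples=1000):
    """Sample points with |x - x0| < delta and test |f(x) - f(x0)| < eps."""
    for i in range(1, samples):
        t = -1.0 + 2.0 * i / samples      # t runs over (-1, 1)
        x = x0 + 0.999 * t * delta        # stays strictly inside the ball
        if abs(f(x) - f(x0)) >= eps:
            return False
    return True


def square(x):
    return x * x


def step(x):
    """Jumps at 0, so no delta works there for eps = 0.5."""
    return 0.0 if x < 0 else 1.0


for x0 in (0.0, 1.5, -3.0):
    for eps in (1.0, 0.1, 1e-3):
        print(x0, eps, check_eps_delta(square, x0, eps, delta_for_square(x0, eps)))

print(check_eps_delta(step, 0.0, 0.5, 1e-6))
```

For the step function the check fails no matter how small `delta` is made, which is exactly what discontinuity at $0$ means.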
3.3b Definition: continuity at a point
A function $f$ is continuous at a point $x_0$ if the limit of $f(x)$ as $x$ approaches $x_0$ through the domain of $f$ exists and is equal to $f(x_0)$, i.e. $\lim_{x \to x_0} f(x) = f(x_0)$.
3.4 Definition: continuity
A function $f: X \to Y$ is continuous if it is continuous at every $x_0 \in X$.
The Wikipedia page does a great job with continuity. Definition 3.3a has some metrics in it, and it can be reformulated into a nice sentence about neighbourhoods: “More intuitively, we can say that if we want to get all the $f(x)$ values to stay in some small neighbourhood around $f(x_0)$, we simply need to choose a small enough neighbourhood for the $x$ values around $x_0$, and we can do that no matter how small the $f(x)$ neighbourhood is; $f$ is then continuous at $x_0$.”
If $f: X \to Y$ is a function between metric spaces which is continuous and $(x_n)$ is a sequence which converges to a point $x \in X$ where each $x_n \in X$, then $(f(x_n))$ is a sequence which converges to $f(x)$ and where each $f(x_n) \in Y$.
There’s a good proof of this here. In English, this means that continuous functions map convergent sequences to convergent sequences. This is a very important property of continuous functions and actually characterizes them (i.e. it can be used to define continuity, like a definition 3.3c, equal and equivalent to the others).
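The sequence characterization can be poked at numerically. This is only a sketch under my own choices: `math.cos` as the continuous function, the sequence $x_n = 1 + 1/n$, and a step function as the counterexample. It looks at a single far-out term of the sequence, which suggests convergence but does not prove it.

```python
import math


def tail_gap(f, seq, limit, n):
    """Distance |f(x_n) - f(limit)| for the n-th term of the sequence."""
    return abs(f(seq(n)) - f(limit))


def step(x):
    """Discontinuous at 0: equals 0 to the left of 0 and 1 from 0 onwards."""
    return 0.0 if x < 0 else 1.0


# Continuous case: x_n = 1 + 1/n converges to 1, and cos(x_n) follows to cos(1).
for n in (10, 1000, 10**6):
    print(n, tail_gap(math.cos, lambda k: 1.0 + 1.0 / k, 1.0, n))

# Discontinuous case: x_n = -1/n converges to 0, but step(x_n) = 0 never approaches step(0) = 1.
for n in (10, 1000, 10**6):
    print(n, tail_gap(step, lambda k: -1.0 / k, 0.0, n))
```

In the first loop the gap shrinks as $n$ grows; in the second it stays stuck at 1, so the step function does not map this convergent sequence to one converging to the right value.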
We already know what a bounded set is: bounded sets have both an upper and lower bound.
3.6 Definition: boundedness of a function
A function $f: A \to \mathbb{R}$ is called bounded from above if there is a real number $M$ such that $f(x) \leq M$ for all $x \in A$. The number $M$ is called an upper bound of $f$.
A function $f: A \to \mathbb{R}$ is called bounded from below if there is a real number $m$ such that $f(x) \geq m$ for all $x \in A$. The number $m$ is called a lower bound of $f$.
A function is bounded if it has both lower and upper bounds.
Sometimes we don’t care if the lower bound and upper bound differ, just that the function is indeed bounded.
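For example, if $m \leq f(x) \leq 5$ for some lower bound $m \geq -5$, then $m \leq f(x) \leq 5 \Rightarrow |f(x)| \leq 5$.
Why? If $m \leq f(x)$ and $-5 \leq m$, surely $-5 \leq f(x)$ and thus $-5 \leq f(x) \leq 5$.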
Note one final subtlety with functions. Sets don’t necessarily have maximums or minimums. Open sets like $(0, 1)$ don’t have a maximal or minimal element, but they do have supremums and infimums. If $A = [a, b]$ and $f: A \to \mathbb{R}$ is a continuous function, it stands to reason from the definition of continuity that $m \leq f(x) \leq M$ for some $m$ and $M$. But is it true that if $m = \inf_{x \in A} f(x)$ (i.e. $f$ is bounded) then $m = f(c)$ for some $c \in A$? Yes. It is true. The first part of this idea is called the boundedness theorem. The boundedness theorem is used to prove the extreme value theorem, which is what we set out to do in this post.
The Extreme Value Theorem
The boundedness theorem says:
3.7 Theorem: the boundedness theorem
If $f$ is a continuous function on the closed and bounded interval $[a, b]$, then $m \leq f(x) \leq M$ for all $x \in [a, b]$, for some $m, M \in \mathbb{R}$.
The interval $[a, b]$ is bounded because $a$ and $b$ are real numbers, and $\infty$ is not a real number. What the boundedness theorem does not say is that the number $m$ is the infimum of $f$, or that $M$ is the supremum of $f$. That is what the extreme value theorem is for. The proof of the boundedness theorem is so important, it gets its own post here (once I make it, I will link it).
3.8 Theorem: the extreme value theorem
If $f$ is a continuous function on the closed and bounded interval $[a, b]$, then $f$ must attain its maximum and minimum, each at least once.
Recall that if a maximum exists, it is the supremum. A supremum can exist even if a maximum doesn’t. Recall that for a given subset $S \subseteq X$, in order for a maximum to exist it has to be an element of $S$ itself, whereas the supremum can lie in $X$. Let $X = \mathbb{R}$ and $S = (0, 1)$. Then $\max S$ does not exist, but $\sup S = 1$. If $X = \mathbb{R}$ and $S = [0, 1]$ then $\max S = 1$ and $\sup S = 1$. This logic about supremums/maximums is true for sets and for functions since functions map to sets.
From Wikipedia: “The extreme value theorem enriches the boundedness theorem by saying that not only is the function bounded, but it also attains its least upper bound as its maximum and its greatest lower bound as its minimum.” So the boundedness theorem means $\sup f$ exists but we don’t know if the maximum exists, and the extreme value theorem says $\max f = \sup f$. It says the same for the min/infimum.
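To see the difference between "bounded" and "attains its bounds", here is a brute-force Python sketch. The function $f(x) = x^3 - 3x$, the interval $[-2, 3]$, and the helper names are all my own choices for illustration. On the closed interval the maximum 18 is attained at the endpoint $x = 3$ and the minimum $-2$ is attained twice (at $x = -2$ and $x = 1$). Drop the endpoints and the supremum is still 18, but no sampled point reaches it.

```python
def grid(a, b, n, closed=True):
    """n evenly spaced points on [a, b]; drop both endpoints if closed=False."""
    pts = [a + (b - a) * i / (n - 1) for i in range(n)]
    return pts if closed else pts[1:-1]


def extrema(f, xs):
    """Return ((x_min, f(x_min)), (x_max, f(x_max))) over the sample points."""
    x_min = min(xs, key=f)
    x_max = max(xs, key=f)
    return (x_min, f(x_min)), (x_max, f(x_max))


# Illustrative continuous function (an assumption, not from any particular source).
def f(x):
    return x ** 3 - 3.0 * x


closed_min, closed_max = extrema(f, grid(-2.0, 3.0, 5001))
open_min, open_max = extrema(f, grid(-2.0, 3.0, 5001, closed=False))

print("closed interval:", closed_min, closed_max)
print("open interval:  ", open_min, open_max)
```

A finer grid pushes the open-interval maximum closer to 18 without ever hitting it, which is the supremum-versus-maximum distinction in action.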
Functions map inputs (elements) from a set $A$ called the domain, to a set $B$ called the codomain. We ask that $A$ and $B$ be subsets of a complete ordered field, i.e. $A \subseteq \mathbb{R}$ and $B \subseteq \mathbb{R}$. We write a function as $f: A \to B$. If $A$ is a closed (meaning it contains all its boundary points) and bounded (meaning it does not extend to infinity in either direction, required since $\mathbb{R}$ itself is closed) interval (sets of the form $A = [a, b]$) and if $f$ is a continuous function (meaning it can’t change too quickly), then $f$ reaches its maximum and minimum (in $B$, where since $B$ is also a subset of a complete ordered field, it can have supremums and infimums).
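Why did I introduce metrics? To deal with the idea of completeness. A function can be continuous on $\mathbb{Q}$ and yet have no continuous extension to $\mathbb{R}$; for instance, $f(x) = 1/(x^2 - 2)$ is continuous at every rational point but blows up near the irrational points $\pm\sqrt{2}$. This is where the idea of $\mathbb{R}$ being the nice set we can do calculus on comes from. You can define similar functions that break rules if the underlying fields are not complete.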
It turns out that some topological spaces that lack a metric still admit something called a pseudometric, which differs from a metric in that the distance between two points can be zero even if they are not the same point.
E.g. For a metric, $d(x, y) = 0 \iff x = y$. The only time the metric can be zero is when you’re asking for the distance between an element and itself. With a pseudometric, it can be true that $d(x, y) = 0$ for a given $x$ and $y$ such that $x,y\in \mathbb A, x \neq y$.
Thing 1: semi-continuity
3.9 Definition: lower semi-continuity
A function is lower semi-continuous if it looks ‘sharp’ when looked at from the bottom. Formally, $f$ is lower semi-continuous at $x_0$ if $\liminf_{x \to x_0} f(x) \geq f(x_0)$. The modification to the extreme value theorem is that lower semi-continuous functions on closed bounded subsets of $\mathbb{R}$ attain their minimum.
3.10 Definition: upper semi-continuity
A function is upper semi-continuous if it looks ‘sharp’ when looked at from the top. Formally, $f$ is upper semi-continuous at $x_0$ if $\limsup_{x \to x_0} f(x) \leq f(x_0)$. The modification to the extreme value theorem is that upper semi-continuous functions on closed bounded subsets of $\mathbb{R}$ attain their maximum.
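A tiny Python sketch shows why lower semi-continuity is the property that matters for minimums. Both functions below are made up for illustration and live on $[0, 1]$: each equals $x$ for $x > 0$ and differs only in its value at $0$. Putting the value at $0$ above the nearby values breaks lower semi-continuity and the infimum $0$ is never attained; putting it below keeps lower semi-continuity and the minimum is attained at $0$.

```python
def grid(a, b, n):
    """n evenly spaced points on the closed interval [a, b]."""
    return [a + (b - a) * i / (n - 1) for i in range(n)]


def not_lsc(x):
    """f(0) = 1 but f(x) = x nearby: the value at 0 sits above the liminf."""
    return 1.0 if x == 0.0 else x


def lsc(x):
    """f(0) = -1 and f(x) = x nearby: the value at 0 sits below the liminf."""
    return -1.0 if x == 0.0 else x


# For not_lsc the sampled minimum keeps shrinking towards the infimum 0
# as the grid is refined, but it is never reached.
coarse = min(not_lsc(x) for x in grid(0.0, 1.0, 101))
fine = min(not_lsc(x) for x in grid(0.0, 1.0, 10001))

# For lsc the minimum -1 is attained at x = 0 on any grid containing 0.
attained = min(lsc(x) for x in grid(0.0, 1.0, 101))

print(coarse, fine, attained)
```

Neither function is continuous, so the ordinary extreme value theorem says nothing about them; only the lower semi-continuous one is guaranteed a minimum.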