The Wikipedia article for compact spaces is large, but extremely informative. I want to distill the important points of what it means to be compact in the context of optimization.
The motivation for something to be compact is a generalization of being closed and bounded. Indeed, any subset that is both closed and bounded is compact. Take this as a definition for now, and we’ll expand on it and re-adjust our thinking. Recall that is both closed and open, but it is definitely not bounded, as it has no upper or lower bound. Without loss of generality, we will only consider closed intervals, i.e. . Being closed means that contains its endpoints, but in general it means that a sequence with terms in has a limit that also stays within . We would like it if compactness would mean just that – that sequences with terms in have their limit point in . Is the ‘definition’ of compactness we used at the beginning of this paragraph consistent with this logic? Yes and no, and we’ll delve into that now.
Compactness was studied from two different fronts at the same time: Bolzano and Weierstrass in terms of sequences, Borel and Lebesgue from the point of view of something called open covers. If you have the time, this is what you want to read about the history of compactness. When math students are asked about what compactness is, they normally spit out the open cover version, because it is the more complete one. As with continuity, there are now multiple equivalent definitions.
3.11a Definition: compact
A topological space is called compact if each of its open covers has a finite subcover.
Here’s an equivalent definition.
3.11b Definition: compact
A topological space is compact if every net on has a convergent subnet.
This sounds like the terminology for sequences, but it isn’t true for sequences. I only want to tease you with what nets are, because they’re cool. Sequences are treated as functions from the natural numbers to a set , i.e. . With sequences we treat as the label of the function, recalling that and hence . The natural numbers are countably infinite, but it might make sense to allow the domain of the sequence be uncountable, but it has to have a similar ordering as the natural numbers. The domain needs to be what is called a directed set, and functions from directed sets onto other sets are called nets. E.g. for a directed set , is a net. See the Wikipedia page on nets for more. Finally, it is worth checking out this page to get an understanding for the open-cover definition, amongst other things.
We want the Definition 3.11b to work with sequences instead of nets, if that is possible. Unfortunately it’s not possible: the open cover definition and the sequences definition are mutually incompatible, and so we call it something slightly different.
3.12 Definition: sequentially compact
A topological space is called sequentially compact if every sequence in has a convergent subsequence whose limit is in .
Does being compact imply sequential compactness? Nope, unfortunately not.
Does being sequentially compact imply compactness? Nope, unfortunately not.
Definition 3.12 is what we would like to focus our attention on and we’d like it to be the full version of compact, but it’s not. Two things are important to note, however:
Thing 1: In a metric space, compactness and sequential compactness are the same thing. More precisely, a metric space is compact iff it is sequentially compact. Recall that all metric spaces are topological spaces, i.e. metric spaces topological spaces (however not all topological spaces are metric spaces). That little bit of extra structure that metric spaces affords lets the two completely different definitions mean the same thing for metric spaces.
Thing 2: Borel proved that a subset of is compact iff it is closed and bounded. It’s now called the Heine-Borel theorem. A mathematician named Cousins proved it for subsets of in the same year, and then Borel provided an argument for dimensions three years later. Wikipedia uses the -dimensional case. Note that it says for a subset of Euclidian space ; Euclidian space has the structure of a Euclidian metric and is thus a metric space. We are still only doing things in one dimension, but when referring to the Heine-Borel theorem I will probably say something like “in finite dimensional Euclidian spaces, closed and bounded compact”.
Our train of logic goes something like this: Say I have a closed and bounded interval . We will call the interval , i.e. . Because is bounded, the infimum exists. Because it is also closed, the minimum exists ( however undefined. ). In the definition of infimum I can create a sequence converging to the infimum, i.e. a sequence exists such that . Even better, we know because of closedness that the sequence goes to the minimum of , i.e. . This minimizing sequence exists because the space has an infimum (minimizing sequences are defined to be any sequence that tends to the infimum of a set). Equip the space with the Euclidian metric. By the Heine-Borel theorem, is a closed and bounded subset of (i.e. a Euclidian space) hence is compact. Because is a metric space and it is compact it is also sequentially compact, meaning that all minimizing sequences have convergent subsequences. Continuous functions map convergent sequences to convergent sequences by definition: . Now we have a sort-of proof that minimizers exist in the space of a function so long as is a metric space.
Continuous functions have a lot of definitions that are all equivalent; they’re very interesting objects when viewed topologically. The “convergent sequences” definition (as opposed to the ) is the one that is useful here. The chain of logic above told us that the minimizer exists.
Look at this paragraph in the Wikipedia article of the Bolzano-Weierstrass theorem. It brings the idea of the Heine-Borel theorem and the Bolzano-Weierstrass theorems together. Terence Tao has a great exposition on compactness here.
We like dealing with compact spaces because the image of a compact function under a continuous function is compact. If the image is compact and a metric space, then it has a minimizing sequence with a convergent subsequence.
Or, you can think of it as:
If the space is compact and a metric space, it is also sequentially compact and has a minimizing sequence with a convergent subsequence. A continuous function maps convergent sequences to convergent sequences: i.e. also implies .