Optimization, Step 4: Compactness

Compactness

The Wikipedia article for compact spaces is large, but extremely informative. I want to distill the important points of what it means to be compact in the context of optimization.

The motivation for something to be compact is a generalization of being closed and bounded. Indeed, any subset A \subseteq \mathbb R that is both closed and bounded is compact. Take this as a definition for now, and we’ll expand on it and re-adjust our thinking. Recall that \mathbb R is both closed and open, but it is definitely not bounded, as it has no upper or lower bound. Without loss of generality, we will only consider closed intervals, i.e. [a,b]. Being closed means that A contains its endpoints, but in general it means that a sequence with terms in A has a limit that also stays within A. We would like it if compactness would mean just that – that sequences with terms in A have their limit point in A. Is the ‘definition’ of compactness we used at the beginning of this paragraph consistent with this logic? Yes and no, and we’ll delve into that now.
Continue reading “Optimization, Step 4: Compactness”

Optimization, Step 3: Extreme Value Theorem

We can finally get to the point. The extreme value theorem is a fundamental theorem about continuous functions that most engineers don’t learn. Most engineering math courses use what’s valuable and hence skip out on a lot of details covered by other courses. That’s okay, because the situation is being rectified here. The extreme value theorem is rather simple. Bernard Bolzano and Karl Weierstrass made a much more complete version that we will be using in higher dimensions. We’re restricting ourselves to the one dimensional case still, and so we’ll just call it the extreme value theorem for now.
Continue reading “Optimization, Step 3: Extreme Value Theorem”

Optimization, Step 2

We’ll get really close to finding out why functions have minimums or maximums in this section. In order to do that, we’re going to need a few more tools. We’ll start with functions from \mathbb R \rightarrow \mathbb R, which are one dimensional in nature. Why are they one dimensional? The superscript on the first \mathbb R is implicitly a 1, i.e. \mathbb R^1 \rightarrow \mathbb R^1  in that we’re looking at functions of one independent variable. Soon we’ll generalize the results to n dimensions, and eventually to infinitely many dimensions, but we have to start with one dimension.
Continue reading “Optimization, Step 2”

Optimization, Step 1

Some preliminaries.

Engineers often take for granted that functions have minimums and maximums. In general, the underlying goals of machine learning and optimization are to find the minima and maxima of functions. As things get more complicated and more complex ideas as thrown into the mix of real-world problems, it’s not always clear that a goal of finding these maxima or minima is even attainable. Luckily there is a solid framework from which mathematicians stand in order to know that algorithms are going to achieve their goals, i.e. solutions will be found.
Continue reading “Optimization, Step 1”