Now that it looks like I have a real job (more on that coming soon), it’s time to make my new bosses regret the decision by showing them how dumb I really am. 🙂
I feel like I should have known this since my freshman year in high school, but I have finally realized why people talk so much about linear algebra, and why we care so much about vector spaces in real life. I didn’t come up with this by myself; most of it came from three excellent sources: Shewchuk’s conjugate gradients paper, Strang and Borre’s “Linear Algebra, Geodesy and GPS” textbook, and Luenberger’s “Optimization by Vector Space Methods”. In fact, you’d be much better served if you stopped reading this post and read those instead. But you’re probably reading this to laugh at me anyway, so let’s go back to how dumb I am.
First, there are two properties of vector spaces.
1. If $v$ is in a vector space, then $\alpha v$ is also in that vector space, for every number $\alpha$. That is, vectors can be stretched.
2. If $u$ and $v$ are in a vector space, then so is $u + v$. That is, you can add two vectors together.
Now, in linear algebra, we study vectors. But more importantly, we study things that change vectors into new vectors. One of the basic ideas is to look at the things (let’s call them transformations) that change vectors linearly, that is, transformations $T$ such that $T(\alpha v) = \alpha T(v)$ and $T(u + v) = T(u) + T(v)$. These transformations themselves also form a vector space: you can add them together and scale them.
When I started thinking about vector spaces other than $\mathbb{R}^n$ (and their linear transformations), my only intuition was that linear operators “don’t blow up”. That is, if you change a vector a little bit, and you look at a linear transformation of this changed vector, then the transformation of the changed vector is not too far from the transformation of the original vector.
That is true, but it’s not the point of linear algebra and vector spaces. The really important intuition is: break your vectors up, because you know exactly what will happen to the parts! That is, the trick of linear algebra is in applying property 2 “backwards”: think of your vector $v$ as a combination of other vectors $v_1$, $v_2$, etc., such that $v = v_1 + v_2 + \cdots$. Obviously, “backwards” is a bad adjective: no one ever said that the rule applied in only one direction. However, this is how I thought about that property for the longest time. As I said, you’d all see how dumb I am.
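Here is a tiny sketch of that idea in plain Python. Everything in it is made up for illustration: the 2×2 matrix `A` stands in for some linear transformation, `apply` and `add` are throwaway helpers, and the split of `v` along the two axes is just one of many possible ways to break it up.

```python
def apply(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def add(u, v):
    """Add two vectors component by component."""
    return [a + b for a, b in zip(u, v)]


# A hypothetical linear transformation, chosen only for illustration.
A = [[2, 1],
     [0, 3]]

v = [5, 7]

# Break v into two cuddly pieces, one along each axis, so that v = v1 + v2.
v1 = [5, 0]
v2 = [0, 7]

whole = apply(A, v)                           # transform the scary vector directly
from_parts = add(apply(A, v1), apply(A, v2))  # transform the pieces, then recombine

print(whole, from_parts)
```

Both routes should land on the same vector, which is all that property 2 (together with linearity) promises.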
But the cool part of this realization is that it gives a great strategy for understanding linear “situations”: if you see a scary vector, break it up into cuddly vectors, and whatever understanding you get from these small vectors combines exactly back into the original vector. In particular, if you’re trying to understand what a scary transformation $T$ does to a scary vector $v$, find a nice transformation that you understand, say $T_n$, and define the “purely scary part” of $T$ as the $T_s$ such that $T = T_n + T_s$, and do the same for your vector: $v = v_n + v_s$.
Then, linear algebra tells you that $Tv = T_n v_n + T_n v_s + T_s v_n + T_s v_s$. Hopefully, you will be able to understand three of the four terms on that right-hand side. One trick is to define the decompositions such that you know that some of these terms will become zero, or at least know that they will be small.
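To convince myself that the four-term expansion really does add back up, here is a small numeric sketch. The names `T_nice`, `T_scary`, `v_nice`, and `v_scary` are my own labels for the “understood” and “purely scary” parts, and the particular matrices and vectors are arbitrary choices, not anything special.

```python
def apply(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def add_vectors(*vectors):
    """Add any number of vectors component by component."""
    return [sum(components) for components in zip(*vectors)]


def add_matrices(A, B):
    """Add two matrices entry by entry."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Hypothetical split of a transformation into a nice part and a scary part.
T_nice = [[2, 0], [0, 2]]    # pure scaling by 2, easy to understand
T_scary = [[0, 1], [-1, 0]]  # whatever is left over
T = add_matrices(T_nice, T_scary)

# Hypothetical split of a vector in the same spirit.
v_nice = [3, 0]
v_scary = [0, 4]
v = add_vectors(v_nice, v_scary)

direct = apply(T, v)

# The four terms from expanding (T_nice + T_scary)(v_nice + v_scary).
terms = [
    apply(T_nice, v_nice),
    apply(T_nice, v_scary),
    apply(T_scary, v_nice),
    apply(T_scary, v_scary),
]
by_parts = add_vectors(*terms)

print(direct, by_parts)
```

In a real problem the payoff comes from picking the split so that some of the entries in `terms` vanish or stay small; here they are just numbers that happen to add up.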
Once you notice this pattern, lots of mysterious (to me, at least) things in linear algebra start to make sense. The reason for linear algebraists’ obsession with eigenvectors is obvious: a matrix simply scales its eigenvectors, so if you can write a vector as a linear combination of the matrix’s eigenvectors, you can see exactly what the matrix does to the whole vector. The matrix simply scales each piece, and all you have to do is add the scaled pieces to get the entire result. Eigenvectors used to seem like magic, but they’re straightforward once you realize that the whole point is to split scary things into warm and fuzzy ones. I suspect most other mysterious things in linear algebra work similarly.
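As a sanity check on the eigenvector story, here is a minimal sketch with a hand-picked symmetric 2×2 matrix. The matrix `A`, its eigenpairs, and the coefficients `c1` and `c2` were all worked out by hand for this example; in general you would have to compute them, and a matrix need not have enough eigenvectors to span the whole space.

```python
def apply(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector (a list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def scale(c, v):
    """Stretch a vector by the number c."""
    return [c * x for x in v]


def add(u, v):
    """Add two vectors component by component."""
    return [a + b for a, b in zip(u, v)]


# A hypothetical matrix with eigenpairs that are easy to find by hand.
A = [[2, 1],
     [1, 2]]
e1, lam1 = [1, 1], 3    # A e1 = 3 * e1
e2, lam2 = [1, -1], 1   # A e2 = 1 * e2

# Confirm that A really does nothing but scale these two vectors.
assert apply(A, e1) == scale(lam1, e1)
assert apply(A, e2) == scale(lam2, e2)

# Break a vector into eigenvector pieces: v = c1 * e1 + c2 * e2.
v = [5, 1]
c1, c2 = 3, 2
assert add(scale(c1, e1), scale(c2, e2)) == v

direct = apply(A, v)
# Each piece just gets scaled by its eigenvalue; then add the pieces back.
via_eigen = add(scale(lam1 * c1, e1), scale(lam2 * c2, e2))

print(direct, via_eigen)
```

The matrix multiplication disappears entirely in the second route: all that is left is scaling two known vectors and adding them.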