we're using fundamentally the same algorithm, just in a different order.
We have got to somehow reduce the
number of multiplications. We are going to reduce it to 7. The claim is that if we have two two-by-two matrices, we can take their product using seven multiplications. If that is true, we reduce the 8 to a 7 and presumably make things run faster. We will see how fast in a moment; you can compute it in your head if you are bored and like computing non-integral logs, so go ahead.

All right. Here we are. This algorithm is unfortunately rather long to write down, but it uses only seven multiplications. Each of these P's is a product of two terms, and each term involves only additions and subtractions of the submatrices. Those are seven multiplications, and I can compute them in 7T(n/2) time. Oh, indeed it is. Six was wrong; I had written six and seven the same. Very good. You know, you would think that copying something would not be such a challenging task, but when you become an absent-minded professor like me, you will know how easy it is to get wrong. OK. We have them all correct, hopefully. We continue.

That wasn't enough, of course. We have seven things, and clearly we have to reduce them down to four things, the elements of C. Here they are, the elements of C: r, s, t, u. It turns out r = P_5 + P_4 - P_2 + P_6. Of course. Didn't you all see that? [LAUGHTER] I mean, these ones are really easy: s = P_1 + P_2 and t = P_3 + P_4. That is clearly how they were chosen. And then u is another tricky one: u = P_5 + P_1 - P_3 - P_7.

OK. Now, which one of these would you like me to check? Don't be so nice. How about s? I can show you s is right. Any preferences? u. Oh, no, sign errors. OK. Here we go. To claim that this really works, you have to check all four of them, and I did in my notes. u = P_5 + P_1 - P_3 - P_7. P_5 = (ae + ah + de + dh). That is P_5. Check me; if I screw up, I am really hosed. Then P_1 = (af - ah). P_3 has a minus sign in front, so it contributes -(ce + de). And then we have minus P_7, which is a big one: -(ae + af - ce - cf). Now I need the assistant that crosses off things in parallel, like in the movie, right? The ah, de, af, ce and ae terms all cancel, thank you, and hopefully these survive: dh, and minus minus cf, which is plus cf. And, if we are lucky, that is exactly what is written here, u = cf + dh, except in the opposite order. Magic, right? Where the hell did Strassen get this?

You have to be careful here. It is OK that the plus comes out in the wrong order, because plus is commutative, but the multiplications had better not be in the wrong order, because multiplication over matrices is not commutative. I check cf, OK, and dh; they are in the right order. I won't check the other three. That is matrix multiplication in hopefully subcubic time.
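The board with the seven products is not reproduced in the transcript, but the checks above pin them down to Strassen's standard products. Here is a short sketch, not from the lecture, that verifies all four combination formulas symbolically with sympy. The symbols are declared noncommutative because in the recursion a through h stand for submatrices, so the order of every multiplication matters:

```python
# A sketch (not from the lecture) verifying Strassen's identities.
# P1..P7 below are the standard Strassen products, consistent with the
# checks in the lecture (e.g., P_5 = ae + ah + de + dh).
from sympy import symbols, expand

# Noncommutative symbols model matrix blocks: a*e != e*a in general.
a, b, c, d, e, f, g, h = symbols('a b c d e f g h', commutative=False)

# A = [[a, b], [c, d]] and B = [[e, f], [g, h]].
P1 = a * (f - h)
P2 = (a + b) * h
P3 = (c + d) * e
P4 = d * (g - e)
P5 = (a + d) * (e + h)
P6 = (b - d) * (g + h)
P7 = (a - c) * (e + f)

r = P5 + P4 - P2 + P6
s = P1 + P2
t = P3 + P4
u = P5 + P1 - P3 - P7

# The four entries of C = A*B, with A's entries kept on the left.
assert expand(r - (a*e + b*g)) == 0
assert expand(s - (a*f + b*h)) == 0
assert expand(t - (c*e + d*g)) == 0
assert expand(u - (c*f + d*h)) == 0
print("all four entries of C check out")
```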
Let's write down the recurrence. T(n) now has a 7 in it. Maybe I should write down the algorithm for kicks. Why not? Assuming I have time. Lots of time. Last lecture I ended ten minutes early. I apologize for that. I know it really upsets you. I didn't realize exactly when the class was supposed to end. So, today, I get to go ten minutes late. OK. Good. I'm glad you all agree. [LAUGHTER] I am kidding. Don't worry.

OK. Algorithm. This is Strassen. First we divide, then we conquer, and then we combine, as usual. I don't have it written anywhere here. Fine. Step one: divide A and B. This is sort of trivial. Then we compute the terms for the products. This means we get ready to compute all the P's: we compute a+b, c+d, g-e, a+d, e+h and so on, all of the terms that appear in the products. That takes order n^2 time, because it is just a constant number of additions and subtractions of (n/2)-by-(n/2) matrices. No big deal. Step two: we conquer by recursively computing all the P_i's, seven recursive multiplications, P_1, P_2, up to P_7, each a product of two (n/2)-by-(n/2) matrices. And, finally, step three: we combine, which is to compute r, s, t and u. Those are just additions and subtractions again, so they take order n^2 time. So, here we finally get an algorithm that is nontrivial both in dividing and in combining. Recursion is always recursion, but now we have interesting steps one and three.

The recurrence is T(n) = 7T(n/2) + Θ(n^2): seven recursive subproblems, each of size n/2, plus order n^2 to do all this addition work. Now we need to solve this recurrence. We compute n^(log_b a), which here is n^(log_2 7). And we know log base 2 of 7 is a little bit less than 3, because log base 2 of 8 is 3, but still bigger than 2, because log base 2 of 4 is 2. So it is going to be polynomially larger than n^2 but polynomially smaller than n^3. We are again in Case 1, and the answer is Θ(n^(log_2 7)). And this is the cheating way to write it: n^(lg 7). lg means log base 2. You should know that; it is all over the textbook and in our problem sets and whatnot. And, in particular, if I had my calculator here, a good old-fashioned calculator, I could tell you that lg 7 is about 2.807, strictly less than 2.81. That is cool. I mean, it is polynomially better than n^3. Still not as good as matrix addition, which is n^2. We don't know whether you can multiply matrices as fast as you can add them. It is generally believed that you cannot get n^2, but who knows? It could still happen. There are no good lower bounds.

This is not the best algorithm for matrix multiplication; it is sort of the simplest one that beats n^3. The best so far is something like n^2.376, getting closer to 2. You might think these numbers are a bit weird, that maybe the constants dominate the improvement you are getting in the exponent. It turns out improving the exponent is a big deal. As n gets larger, exponents really come out to bite you, so n^3 is pretty impractical for very large values of n. And we know that Strassen will beat normal matrix multiplication if n is sufficiently large. The claim is that already at roughly n = 32 or so you get an improvement, partly for other reasons, not just because the exponent gets better, but there you go. So, this is pretty good. The n^2.376 algorithm, on the other hand, is completely impractical, so don't use it. I don't have the reference handy, but it is just trying to get a theoretical improvement. There may be others in between that are more reasonable, but that is not it.
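For concreteness, here is a minimal sketch of the whole algorithm, not from the lecture, assuming n is a power of two and representing matrices as plain lists of lists. A practical implementation would stop the recursion at a cutoff, around the n = 32 crossover mentioned above, and finish with the ordinary algorithm rather than recursing down to one-by-one matrices:

```python
# A sketch (not from the lecture) of Strassen's algorithm for n-by-n
# matrices, n a power of two.
# T(n) = 7*T(n/2) + Theta(n^2) = Theta(n^(lg 7)), about n^2.81.

def add(X, Y):
    """Entrywise sum of two equal-size matrices: Theta(n^2) work."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    """Entrywise difference of two equal-size matrices: Theta(n^2) work."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]  # base case: scalar product
    m = n // 2
    # Step 1 (divide): split A into blocks a, b, c, d and B into e, f, g, h.
    a = [row[:m] for row in A[:m]]
    b = [row[m:] for row in A[:m]]
    c = [row[:m] for row in A[m:]]
    d = [row[m:] for row in A[m:]]
    e = [row[:m] for row in B[:m]]
    f = [row[m:] for row in B[:m]]
    g = [row[:m] for row in B[m:]]
    h = [row[m:] for row in B[m:]]
    # Step 2 (conquer): the seven recursive products.
    P1 = strassen(a, sub(f, h))
    P2 = strassen(add(a, b), h)
    P3 = strassen(add(c, d), e)
    P4 = strassen(d, sub(g, e))
    P5 = strassen(add(a, d), add(e, h))
    P6 = strassen(sub(b, d), add(g, h))
    P7 = strassen(sub(a, c), add(e, f))
    # Step 3 (combine): r, s, t, u via additions and subtractions only.
    r = add(sub(add(P5, P4), P2), P6)
    s = add(P1, P2)
    t = add(P3, P4)
    u = sub(sub(add(P5, P1), P3), P7)
    # Reassemble C = [[r, s], [t, u]] from the four blocks.
    return [r[i] + s[i] for i in range(m)] + [t[i] + u[i] for i in range(m)]

if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(strassen(A, B))  # [[19, 22], [43, 50]]
```

The three phases map directly onto steps one through three: the slicing is the divide, the seven recursive calls are the conquer, and the additions and subtractions at the top and bottom are the order-n^2 combine work.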
Wow, lots of time. Any questions? We're not done yet, but any questions before we move on from matrix multiplication? OK. I have one more problem. Divide-and-conquer is a pretty general idea. I mean, you can use it to dominate countries. You can use it to multiply matrices. Who would have thought? Here is a very different kind of problem you can solve with divide-and-conquer. It is not exactly an algorithmic problem, although it is computer science, that is clear. This is VLSI, very large-scale integration. The chips, they are very large-scale integrated. Probably even more these days, but that is the catchphrase. Here is a problem, and it arises in VLSI layout. We won't get into too many details of why, but you have some circuit, and here I am going to assume that the circuit is a binary tree. This is just part of a circuit. Assume for now that it is a complete binary tree. A complete binary tree looks like this. In all of my teachings, I have drawn this figure for sure the most. It is my favorite figure, the height-four complete binary tree. OK, there it is.

I have some tree like that, of some height, and I want to embed it into some chip layout on a grid. Let's say it has n leaves; I want to embed it into a grid with minimum area. This is a very cute problem, and it really shows you another way in which divide-and-conquer is a useful and powerful tool. So, I have this tree. I like to draw it this way, and I want to somehow draw it on the grid. What that means is that the vertices have to be embedded onto dots on the grid, and I am talking about the square grid; each vertex of the tree has to go to a vertex of the grid. And the edges have to be routed as orthogonal paths from one dot to another, so that should be an edge, and they shouldn't cross, and all these good things, because wires