Listen to this story
Recently, DeepMind researchers did what they do best—making AIs champions at games. The team tackled the fundamental matrix multiplication problem by turning it into a single-player 3D board game called ‘TensorGame’.
But they aren’t the only players succeeding in the field.
Two Austrian researchers, Jakob Moosbauer and Manuel Kauers at Johannes Kepler University Linz, bested that new record by one step. Speaking with Analytics India Magazine, the researchers shared their stories about how their curiosity led them to compete against one of the biggest AI companies today.
Why are matrices important?
Matrices are not very complicated objects. Kauers said they are just quadratic tables that contain numbers that somehow encode transformations. For example, a rotation can be written by a matrix.
“In mathematics, we work in higher dimensional spaces. Lots of problems can be translated into geometric problems; they may not be in three-dimensional or two-dimensional space. The computation time for multiplying two matrices grows when the matrix becomes bigger. If you have a three 1000 times matrix, the usual algorithm for doing this just takes a certain number of operations, which, even for a fast computer, you would have to wait a bit,” he explained.
Enter the matrix
Since his undergraduate program in 2015, Jakob Moosbauer has been under Manuel Kauers’ guidance. Even though Moosbauer’s main topic for PhD has been matrix multiplication for only two years, he has been fascinated by the subject since school.
Similarly, as far as he can think back, Kauers’ field of interest has been computer algebra. He explained that in modern mathematics, instead of just numbers, we compute with more complicated objects like variable words or polynomials. Computing with such objects is much harder for a computer though. “Because the way we deal with this as humans is we somehow understand on a higher level, what these things mean, and how we can maybe solve a mathematical problem. Now, computer algebra is about developing computer programmes that can do mathematics on this more advanced level. So, this has been interesting to me, as far as I can think back.”
Kauers, who has been in the field for over two decades now, was introduced to computer algebra in high school, “In high school, I had a teacher who said, computers are completely useless because they can only compute with numbers. And that’s not mathematics. Somehow, I already had this suspicion that this is not right. A computer can also do more advanced things; I had just the feeling that there is no inherent limitation, while your computer shouldn’t be able to do that.”
After high school, Kauers studied computer science in Germany and landed in Austria, the centre for computer algebra and mathematics.
Moosbauer chose matrix multiplication as his PhD topic because Manuel Kauers was also working on that topic. “For my PhD, he came up with a list of possible topics that he said he was working on and thought there’s some potential to do something there as well. So, it was more like that. I relied on Manuel’s judgement that there would be some interesting questions and problems we could solve there. So, he’s been the one who has accompanied me most of where I am now,” Moosbauer said.
The (week later) breakthrough
The 4×4 matrices are special. Since one can use the algorithm for the 2×2 case that has been discovered. So, for 4×4, one can subdivide it and use it twice. So, Kauers didn’t spend much time improving this because that’s already very good. Instead, he looked at 5×5.
Along with Moosbauer, he wrote a computer programme that searches for new algorithms. Collectively, they had been working on the search algorithm for about two to three months to reduce the multiplication to 97 from 98 for 5×5.
“We had it fully implemented and were really happy about that. And, then on Wednesday came the news that we were beaten and our new result wasn’t going to work out anymore,” Moosbauer said.
When the DeepMind paper dropped, the JKU researchers took their algorithms and fed them to their own programme. It took just a few seconds to find the result.
Man VS Machine
From DeepMind’s paper, what stood out for Moodbauer was the basic idea of the Tensor Game. “For every move in that game, the AI has to choose from 20 to 100 and then design a game where literally the AI could put anything there and find those solutions by some pattern recognition.”
“I was actually a bit angry because they’re a big company with a lot of computational resources behind them. When they are on the topic where I want to write my PhD, that’s a tough competition. But, that changed very quickly. It’s nice that now there’s so much attention on the field. And we could also use their results and publish our results now,” said Moosbauer.
So, there’s a little bit of a competition to find something that the other party doesn’t find. For 5×5, we found something with 97. And then there was the Nature article, and they had 96. Moreover, they also did an incredible thing for 4×4; I said before that 49 is not improvable, and they managed to bring it down to 47. Usually, we improve one step at a time, and here they go two steps, which is amazing and completely unexpected.
Kauers said, “I cannot explain how DeepMind did it, but it’s not the first time that a machine learning approach has solved a problem that otherwise could not be solved. The interesting thing about the machine learning approach is that it somehow starts from scratch. So the computer has to find on its own how to optimise it.”
Moosbauers and Kauers’ approach encodes mathematical understanding into the search procedure. “So now, that’s positive and negative; the positive thing is that the search procedure can be more efficient because we know what it’s looking for. On the other hand, the machine learners would say that there is a bias that somehow our understanding is going in a specific direction, and maybe that’s not the right direction,” said Kauers.
Explaining the 50-year gap, Kauers said, “The way matrix multiplication is defined, it looks like there’s no way to escape the complexity. And, it was long believed that you could not improve this until the 60s. This was a German mathematician [Volker Strassen] who tried to prove that you cannot do better than the obvious way of matrix multiplication. And accidentally proved that you cannot do better, he found odd. There’s a way to do it better. And so this was a start for development. So, he found a way for two by two matrices”.
Moosbauer said, “Everybody thought the known [Strassen’s] solution might be the best out there. So, people didn’t even believe that much into improving four times four”.
Moosbauer believes that most of the results in the DeepMind paper, as well as their own results, are more like theoretical interests than purely mathematical. “The newfound algorithms will not be practical right now. And they [DeepMind] also don’t expect those algorithms to be implemented like on a wide set of software, maybe in some computer algebra systems,” he said.
DeepMind trains the algorithm to work fastest on specific hardware. So, maybe if this really brings improvement, those things can have a measurable improvement, he added.
The natural next step
Kauers believes machine learning is a great tool.
“But why do we start on a white piece of paper? Let it start with the knowledge accumulated by the mathematical community and let it improve, like what we did with their solution. So, the machine learning approach would possibly be even more successful if it’s combined with if we feed more mathematical knowledge into this,” he said.
The next step for Moosbauer is to write the actual paper to explain how their technique works. They will also apply it to other matrix sizes to see whether there will be some improvement.
“From then on, I think there might be some potential because our search algorithm actually works by a random search. This kind of random search can use some kind of heuristics or even machine learning to improve the algorithm to be able to search even deeper and find probably better algorithms,” he said.