With the rise of massive networks there is a push to find new ways to parallelize training and make research on them more accessible. The sheer size of these networks, which now reach into the billions of parameters, cuts many researchers off from making advances in the field. There are various attempts to […]
A very interesting paper that goes through the process of making reversible blocks within a neural network. The idea here is that the output of a block in the network can be reversed with a few simple operations to recover its input. This means that instead of storing enormous amounts of activation function output […]
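To make the idea concrete, here is a minimal sketch of the general reversible-coupling trick the paper builds on: because the block's inputs can be recomputed exactly from its outputs, they don't have to be kept around for the backward pass. This is my own illustrative PyTorch example of the RevNet-style pattern, not the paper's exact block.

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """Illustrative reversible coupling block.

    Forward:  y1 = x1 + F(x2),  y2 = x2 + G(y1)
    Inverse:  x2 = y2 - G(y1),  x1 = y1 - F(x2)
    so the inputs can be recomputed from the outputs instead of stored.
    """
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f = f
        self.g = g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

# Quick check that inversion recovers the original inputs.
dim = 8
block = ReversibleBlock(nn.Linear(dim, dim), nn.Linear(dim, dim))
x1, x2 = torch.randn(2, dim), torch.randn(2, dim)
y1, y2 = block(x1, x2)
r1, r2 = block.inverse(y1, y2)
assert torch.allclose(r1, x1, atol=1e-5) and torch.allclose(r2, x2, atol=1e-5)
```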
First, I have to say: wow, this is impressive work, not just as a discovery but as a brilliant paper that explains the idea well with good examples. It is also a paper that doesn't lean too heavily on mathematical formulas for its description, which I feel makes the paper more accessible for a […]
A quick little paper. I've become used to reading journal articles or the occasional thesis in the field of machine learning; I'd refer to this as more of a monograph. It is a brief look at a small modification of vision transformers, which certainly suited me on what was a very busy day for unrelated reasons. […]
This is a fun paper for me, as it involves symbolic regression as a core methodology. My machine learning journey started with symbolic regression, and I didn't even realise it at the time I was doing it; that is, I didn't realise the journey I was about to undertake. I was simply experimenting with creating […]
This is a master's thesis, which is quite a divergence from the journal papers I've become used to reading, but it is a high-quality one that is well worth a read for anyone with a passing interest in the audio generation of music. The explanation is detailed, the diagrams are useful and the work is […]
The following is about the paper that most appealed to me today: Multimodal Chain-of-Thought Reasoning in Language Models (https://arxiv.org/pdf/2302.00923v1.pdf). This is a very interesting paper claiming superhuman performance on a specific task with a relatively small language model. Of course, this must always be couched with the warning that the language model is small […]
Every day I at least try to read through a new paper in the field of machine learning. Today I read through a paper that can be found at the following: Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions – ScienceDirect. This is an interesting idea that I would say […]
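For anyone unfamiliar with the phrase "adaptive activation functions" in the title, here is a minimal sketch of the general idea (my own illustration, not the paper's Kronecker framework): an activation with a learnable parameter that is trained alongside the network weights.

```python
import torch
import torch.nn as nn

class AdaptiveTanh(nn.Module):
    """Illustrative adaptive activation: tanh with a learnable slope.

    The slope `a` is a trainable parameter, so the shape of the
    non-linearity adapts during training along with the weights.
    (A sketch of the general concept only, not the paper's architecture.)
    """
    def __init__(self, init_slope: float = 1.0):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(init_slope))

    def forward(self, x):
        return torch.tanh(self.a * x)

# Drop-in use inside an ordinary feed-forward network.
model = nn.Sequential(nn.Linear(4, 16), AdaptiveTanh(), nn.Linear(16, 1))
out = model(torch.randn(8, 4))
```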