Swarm Parallelism – Paper of the Day

With the rise of massive networks, there is a push to find new ways to parallelise training and make research on them more accessible. The sheer size of these networks, which now reach into the billions of parameters, cuts many researchers off from making advances in the field. There are various attempts to […]

Paper of the Day: The Reversible Residual Network

A very interesting paper that goes through the process of making reversible blocks within a neural network. The idea here is that the output of a block in a network can be reversed with a few simple operations to recover its input. This means that instead of storing enormous amounts of activation function output […]
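To make the reversal idea concrete, here is a minimal sketch in plain Python. It assumes the additive coupling form, where the input is split into two halves and each half is updated using a function of the other. The functions `f` and `g` are hypothetical stand-ins for the residual functions, not anything taken from the paper's code.

```python
import math

def f(x):
    # Hypothetical residual function F; it does not need to be invertible.
    return [math.tanh(2.0 * v) for v in x]

def g(x):
    # Hypothetical residual function G; also not invertible on its own.
    return [0.5 * v * v for v in x]

def forward(x1, x2):
    # y1 = x1 + F(x2), then y2 = x2 + G(y1)
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def inverse(y1, y2):
    # Undo the two additions in the opposite order to recover the inputs.
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

x1, x2 = [0.1, -0.4], [0.7, 0.2]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)  # r1, r2 match x1, x2 up to floating-point rounding
```

Because the inputs can be recomputed from the outputs, the activations do not have to be kept in memory for the backward pass; they can be rebuilt block by block instead.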


First, I have to say: wow, this is impressive work, not just as a discovery but as a brilliant paper that explains the idea well with good examples. It is also a paper that doesn’t lean too heavily on mathematical formulas for its description, which I feel makes the paper more accessible for a […]

Paper of the Day: Dual PatchNorm

A quick little paper. I’ve become used to reading journal articles or the occasional thesis in the field of machine learning. I’d refer to this as more of a monograph. A quick little look at a small modification of vision transformers, which certainly suited me on what was a very busy day for unrelated reasons. […]

Paper of the Day: Discovering Symbolic Models from Deep Learning with Inductive Biases

This is a fun paper for me. This paper involves symbolic regression as a core methodology. My machine learning journey started with symbolic regression, and at the time I didn’t even realise the journey I was about to undertake. I was simply experimenting with creating […]

Paper of the Day: ArchiSound: Audio Generation with Diffusion

This is a master’s thesis, which is quite a departure from the journal papers I’ve become used to reading, but it is a high-quality one that is well worth a read for anyone with a passing interest in audio generation for music. The explanation is detailed, the diagrams are useful and the work is […]

MultiModal CoT – Paper of the Day

The following is about the paper that most appealed to me today: Multimodal Chain-of-Thought Reasoning in Language Models https://arxiv.org/pdf/2302.00923v1.pdf This is a very interesting paper claiming superhuman performance on a specific task with a relatively small language model. Of course, this must always come with the caveat that the language model is small […]

Deep Kronecker Neural Network – Paper of the Day

Every day I try to read through at least one new paper in the field of machine learning. Today I read through a paper that can be found at the following: Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions – ScienceDirect This is an interesting idea that I would say […]

Automated Machine Learning From Scratch #1

Automated machine learning is about building a suite of tools that automates the task of applying machine learning to problems. I have decided to create my own automated learning system, and to begin this journey I have started working on feature engineering. In this Kaggle notebook I went through creating generic functions to handle the […]

Adaptive Boosting From Scratch

To better understand the process involved in adaptive boosting, I recently made a simple boosting model myself, which can be found here. Now I’d like to go through what boosting is and what it can do for us. Adaptive Boosting uses many decision trees with a depth of one to split the data: […]
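As a rough illustration of boosting with depth-one trees (often called stumps), here is a minimal sketch in plain Python. It is not the model linked above. It assumes labels of +1 and -1 and the standard AdaBoost weight update, and the names `fit_stump`, `stump_predict`, `adaboost` and `predict`, along with the toy data, are made up for this example.

```python
import math

def fit_stump(X, y, w):
    # Try every feature, threshold and direction; keep the lowest weighted error.
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def stump_predict(stump, row):
    _, j, t, polarity = stump
    return polarity if row[j] <= t else -polarity

def adaboost(X, y, rounds):
    n = len(X)
    w = [1.0 / n] * n  # start with equal weight on every sample
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's vote strength
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single stump can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
predictions = [predict(model, row) for row in X]
```

Each stump on its own is a weak classifier, but the weighted vote across rounds can fit a pattern that a single split cannot.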