A recent paper on OpenReview, “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale”, uses pretrained transformers at scale for vision tasks. Transformers have been highly successful on language tasks but have seen less success in vision, where they are usually either applied in conjunction with Convolutional Neural Networks (CNNs) or used to replace some components of a CNN. Transformers have recently shown good results on object detection (End-to-End Object Detection with Transformers). This paper applies transformers to vision tasks without using a CNN and shows that state-of-the-art results can be obtained without one. The cost of attention is quadratic…

Photo by Jeremy Bishop on Unsplash

I decided to implement some of the traditional machine learning algorithms using NumPy to understand them better. The repository can be found here. This will be a series of posts on these algorithms, starting with the decision tree, written mostly for my own understanding.

The decision tree is one of the simplest and most popular methods for classification and regression tasks. It is a supervised, non-parametric learning method. The idea is to build a model (a tree) that predicts the target using simple decision rules learned from the training dataset.
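To make "simple decision rules learned from the training dataset" concrete, here is a minimal NumPy sketch of how a single rule (one node of the tree) could be chosen. It assumes Gini impurity as the split criterion and midpoints between sorted feature values as candidate thresholds; the function names `gini` and `best_split` and the toy data are my own illustration, not code taken from the repository.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a 1-D array of class labels."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Return (feature index, threshold, weighted impurity) of the best single split."""
    best = (None, None, np.inf)
    n_samples, n_features = X.shape
    for j in range(n_features):
        values = np.unique(X[:, j])  # sorted unique values of feature j
        # Candidate thresholds: midpoints between consecutive unique values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = y[X[:, j] <= t]
            right = y[X[:, j] > t]
            # Impurity of the two children, weighted by their sizes.
            score = (left.size * gini(left) + right.size * gini(right)) / n_samples
            if score < best[2]:
                best = (j, t, score)
    return best

# Toy dataset (hypothetical): feature 0 separates the classes, feature 1 does not.
X = np.array([[2.0, 7.0], [3.0, 1.0], [6.0, 8.0], [7.0, 2.0]])
y = np.array([0, 0, 1, 1])

feature, threshold, impurity = best_split(X, y)
print(feature, threshold, impurity)
```

A full tree would apply `best_split` recursively to the left and right subsets until a stopping condition is met, such as a pure node or a maximum depth.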

Behavioural cloning is literally cloning the behaviour of the driver. The idea is to train a Convolutional Neural Network (CNN) to mimic the driver, using training data recorded from the driver’s own driving. NVIDIA released a paper in which they trained a CNN to map raw pixels from a single front-facing camera directly to steering commands. Surprisingly, the results were very powerful: the car learned to drive in traffic on local roads, with or without lane markings, and on highways, all with a minimal amount of training data. Here, we’ll use the simulator provided by Udacity. The simulation car is equipped with 3 cameras in the front…

Identifying the lanes of the road is a very common task for a human driver, and it is what keeps the vehicle within the bounds of its lane. It is also a critical task for an autonomous vehicle. A basic lane detection pipeline can be built with simple computer vision techniques. This article describes one such pipeline using Python and OpenCV. This exercise was done as part of the “Udacity Self-Driving Car Nanodegree”.

Note that this pipeline comes with its own limitations (described below) and can be improved. …


The learning rate might be the most important hyperparameter in deep learning, because it scales how much of the gradient is applied to the weights at each update. This in turn decides how far we move towards the minimum at each step. A small learning rate makes the model converge slowly, while a large learning rate can make the model diverge. So the learning rate needs to be just right.

Gradient descent with small (top) and large (bottom) learning rates. Source: Andrew Ng’s Machine Learning course

Still, finding and setting the correct learning rate is largely a matter of trial and error. The naive way is to try different learning rates and choose the one that gives the smallest loss value without sacrificing learning speed. (Validation loss also matters, for spotting underfitting and overfitting.)
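The naive search can be sketched in a few lines. This is only a toy illustration, assuming plain gradient descent on the one-dimensional function f(w) = w², a fixed budget of 50 steps, and a hand-picked list of candidate learning rates; the function name `gradient_descent` and all the values are my own choices.

```python
def gradient_descent(lr, steps=50, w0=5.0):
    """Minimise f(w) = w**2 starting from w0 and return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w       # derivative of w**2
        w = w - lr * grad    # the learning rate scales the update step
    return w ** 2

# Candidate learning rates to try (hypothetical values).
learning_rates = [0.001, 0.01, 0.1, 1.1]
losses = {lr: gradient_descent(lr) for lr in learning_rates}

for lr, loss in losses.items():
    print(f"lr={lr}: final loss={loss:.3e}")

# Pick the learning rate with the smallest loss after the fixed step budget.
best_lr = min(losses, key=losses.get)
print("best learning rate:", best_lr)
```

With the same step budget, the two smallest rates are still far from the minimum, the largest one overshoots on every step and ends with a loss larger than where it started, and 0.1 gets closest. On a real model the comparison would be made on training and validation loss rather than on a toy function.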

