What Is Deep Learning and How Does It Work?

The idea driving Artificial Intelligence (AI) is to make machines or computers as efficient and intelligent as humans (maybe more than humans, but that’s a long way off!). The best way to do that is to make them learn by themselves, i.e. Machine Learning (ML), which is what researchers are knuckling down to. Deep Learning (DL) is one of many approaches to ML. It cannot be said with certainty that it is the best, but it is among the most trusted, because it imitates the human nervous system!

Deep Learning is thus of broad and current interest to, well, everyone! Some people even argue that it is DL, not AI in general, that is here to stay and slay. From salespeople to business owners and analysts to scientists, everyone knows it is going to change how they earn and how they live. But when people want to learn more about the next big thing, they often get confused, and you are probably one of them. Don’t worry; this read will give you a head start on AI and DL.

How Does Deep Learning Work?

As I mentioned earlier, DL is an approach to making machines learn different tasks and perform them when the need arises, without someone telling them to, i.e. automating tasks in a field. The basic idea is straightforward: a machine is given a ton of historical data, “trains” on it, and grasps how the task is done.

DL puts a twist on this to make an AI system learn effectively. Instead of showing the AI exactly how a job is done, DL lets the AI attempt the job and corrects it at every step. A Deep Learning system attaches different weights to its inputs to compute the final output. At first, the algorithm gives wrong answers because it is untrained. It then adjusts the weights and tries again, comparing its own output with the actual output in the data. This process is repeated until the AI is trained enough to produce the right output, i.e. it has learned to do that specific job.

To keep it simple, let’s take the example of an AI system designed to predict how much time a machine in an industry will take to wear out and need replacement, i.e. the age of a machine. Many factors (inputs) relate to this; we take the following four to keep it easy:
  • Type of machine
  • Time per day for which the machine is used
  • Environmental conditions around the machine
  • Impact Damage
Of course, there are many other factors, and a real system would not be purely software, but this example is good enough to understand the process.
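Before the AI can use these factors, they have to become numbers. The encoding below is a hypothetical sketch (the field names and values are made up for illustration), just to show what the four factors might look like as an input vector:

```python
# Hypothetical encoding of the four factors as a numeric feature vector.
# The names and values are illustrative, not taken from real machine data.
machine = {
    "machine_type": 2,         # machine category encoded as an integer
    "hours_per_day": 16.0,     # daily usage time
    "environment_score": 0.7,  # harshness of conditions, 0 (mild) to 1 (harsh)
    "impact_damage": 0.3,      # accumulated damage, 0 (none) to 1 (severe)
}

# The network only sees numbers, so the inputs become an ordered vector.
features = [machine["machine_type"], machine["hours_per_day"],
            machine["environment_score"], machine["impact_damage"]]
print(features)  # [2, 16.0, 0.7, 0.3]
```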

Now, the historical data of different machines in that industry is fed into the AI, and initial weights are assigned to the different inputs (factors). A factor presumed to be more important can be given a larger starting weight (impact damage, in this particular case). The rest of the operation is handled automatically by DL.

DL processes the data and predicts the wrong answer at first (the system knows this by comparing its prediction with the historical answers). It then changes the weights of the inputs using a systematic process, works through all the cases in the data, and keeps improving with every pass. It is considered “trained” and deployable in the industry when the difference between its answers and the actual answers is small enough.
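The train-compare-adjust loop above can be sketched in a few lines. This toy version uses one made-up factor (daily usage as a fraction of 24 hours) and a single-weight linear model; the numbers are illustrative, not real industrial data:

```python
# Toy train-compare-adjust loop: predict years until wear-out from daily
# usage. Each pair is (usage fraction of 24 h, actual years of life left).
data = [(8 / 24, 12.0), (16 / 24, 7.0), (24 / 24, 2.0)]

weight, bias = 0.0, 0.0   # start untrained: the first answers will be wrong
lr = 0.1                  # learning rate: how big each correction step is

for _ in range(2000):                  # repeat until trained "enough"
    for usage, actual in data:
        predicted = weight * usage + bias
        error = predicted - actual     # compare with the historical answer
        weight -= lr * error * usage   # adjust the weight against its
        bias -= lr * error             # contribution to the error

print(round(weight * (16 / 24) + bias, 1))  # ≈ 7.0 years for the middle case
```

After enough passes, the prediction for the middle machine lands near its historical answer of 7 years, which is exactly the "difference small enough" condition described above.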

That is the working of DL in the simplest way possible. I will now define some of the most important terms of DL below, linking them to the above example so that you understand the whole process well when researching it further.


Neurons and Perceptrons:

A neuron is a cell of the human brain that processes and transmits information. A perceptron can be considered an algorithmic version of a neuron: it takes several inputs and weighs them up to produce a single output. Each input is weighted according to its importance to the output decision.
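A minimal perceptron sketch looks like this (the weights and threshold below are arbitrary, chosen only to show the mechanism):

```python
# A perceptron: weight each input, sum them up, and "fire" (output 1)
# only if the weighted sum clears a threshold.
def perceptron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Two inputs; the second is weighted as twice as important as the first.
print(perceptron([1, 0], [0.5, 1.0], 1.0))  # 0: weighted sum 0.5 < 1.0
print(perceptron([1, 1], [0.5, 1.0], 1.0))  # 1: weighted sum 1.5 >= 1.0
```

The second input alone is enough to make the perceptron fire, while the first alone is not; that is what "weighted according to its importance" means in practice.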

Artificial Neural Networks:

Often referred to as an ANN, this is where the magic of Deep Learning happens. Remember I said that DL imitates the human nervous system? The neurons of the human brain make complex connections with each other, forming the brain and nervous system. An ANN is a computational model inspired by this: perceptrons are arranged in layers, and the layers connect with each other.

The layers of the ANN can be classified as:
  • Input layer
  • Hidden (processing) Layers
  • Output Layer
The input layer receives the input data (the four factors in our case), the hidden layers perform mathematical computations on the inputs, and the output layer returns the result (the age of a machine in our example). The data passes through all these layers on every pass, and DL solves a part of the problem at every layer.
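Data flowing through the layers can be sketched as below. The weights are made-up constants, and a real network would also apply a nonlinear activation after each layer; this bare-bones version only shows the layer-by-layer flow from the four inputs to a single output:

```python
# Hypothetical forward pass: 4 inputs -> hidden layer of 3 units -> 1 output.
# Each unit computes a weighted sum of all inputs from the previous layer
# plus a bias. All weights below are illustrative constants.
def layer(inputs, weights, biases):
    return [sum(x * w for x, w in zip(inputs, ws)) + b
            for ws, b in zip(weights, biases)]

inputs = [2.0, 16.0, 0.7, 0.3]         # the four factors from the example

hidden_w = [[0.1, -0.05, 0.2, -0.3],   # 3 hidden units, 4 weights each
            [0.0, 0.02, -0.1, 0.1],
            [-0.2, 0.01, 0.3, 0.2]]
hidden_b = [0.5, 0.0, 1.0]

output_w = [[1.0, 2.0, 0.5]]           # 1 output unit, 3 weights
output_b = [3.0]

hidden = layer(inputs, hidden_w, hidden_b)  # input layer -> hidden layer
age = layer(hidden, output_w, output_b)[0]  # hidden layer -> output layer
print(round(age, 3))                        # 4.025 with these toy weights
```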

Cost function:

The cost function is simply a measure of the difference between the actual output and the algorithm’s output. This is one of the most important pieces of DL, as it specifies how correct the algorithm is and tells whether the AI system is ready for practical use or not.
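One common choice of cost function (assumed here for illustration, since the article does not name one) is mean squared error, the average of the squared differences between predictions and actual answers:

```python
# Mean squared error: average squared gap between predictions and targets.
def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Two machines: predicted 11 and 8 years, actual 12 and 6 years.
print(mse([11.0, 8.0], [12.0, 6.0]))  # ((-1)**2 + 2**2) / 2 = 2.5
```

A perfectly trained system would score 0; training aims to drive this number as low as possible.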

Gradient descent:

Gradient Descent is the systematic process that changes the weights of the inputs. Its job is to make the cost function as low as possible. It finds a (possibly local) minimum of the cost function, thus increasing the accuracy of the AI system.
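On a single weight, gradient descent is easy to see. The sketch below uses a deliberately simple cost function, cost(w) = (w - 3)², whose minimum sits at w = 3; each step nudges the weight downhill along the slope:

```python
# Gradient descent on a one-weight cost function: cost(w) = (w - 3)^2.
def cost_gradient(w):
    return 2 * (w - 3)   # derivative (slope) of (w - 3)^2

w = 0.0                  # arbitrary untrained starting weight
lr = 0.1                 # step size

for _ in range(100):
    w -= lr * cost_gradient(w)   # step against the slope, downhill

print(round(w, 4))  # converges to 3.0, the minimum of the cost
```

In a real network the same idea applies to thousands or millions of weights at once, each nudged in the direction that lowers the cost.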
