What is deep learning & how does it work?
We've covered a number of basic concepts in machine learning: you know what supervised and unsupervised learning techniques are, you understand common machine learning use cases such as regression, classification, and clustering, you know how these models work at a very high level, and you know what metrics you can use to evaluate them.
We've also discussed traditional machine learning models and the basic structure of some of them, specifically linear regression, logistic regression, and decision trees. It's time for us to turn our attention to the most powerful models that we have today: deep learning models, which are built using neural networks.
What is a deep learning model:
But first, let's understand the difference between traditional models and deep learning models. A traditional model fits an algorithmic structure to data, whether a line, a curve, or a tree, and uses that structure to extract insights from the data. In contrast, deep learning models are not built around any one specific algorithm.
They use neural networks that are made up of active learning units called neurons. These neurons are arranged in layers, and the layers that make up a neural network can extract arbitrarily complex relationships that exist in the training data, then use these relationships to make predictions. All of these terms, neurons, neural networks, and layers, might seem strange right now, but don't worry, we'll understand what each of them means. Neural networks don't use algorithms; instead, they are constructed using different architectures.
Neural networks in deep learning models:
The architecture of a neural network comprises how many layers the network has, the number of neurons in each layer, how the layers are connected together, everything. There are other differences between traditional models and deep learning models as well; let me go through a few. Traditional machine learning models cannot be trained directly on raw data. The data you feed in to train such models has to be heavily pre-processed. This means that the developer of the model, whether a data scientist or a machine learning engineer, has to perform manual feature extraction.
It's the developer who specifies which features are relevant for the problem, which features are important, and how those features should be processed; the model doesn't know how to do this. Deep learning models can often work directly on raw data, and they perform automatic feature extraction, which means you can feed in data with a lot of features. It's quite possible that many of those features are irrelevant; the model learns to ignore irrelevant features and focus only on the relevant ones.
So the model is capable of identifying the relevant features it needs to train. Traditional machine learning models tend to work well on smaller datasets, which also means that training these models may not be very computationally heavy. The training may not need to be distributed either; you don't need multiple machines to run it. Deep learning models tend to be trained on huge corpora of training data, millions of records. This means that training tends to be computationally intensive and may require the use of graphics processing units, or GPUs.
You may also need to train these models in a distributed manner on a cluster of machines. And here is a point we've discussed before: traditional machine learning models are built using well-known and well-understood algorithms, and different models have their own algorithms, whereas deep learning models are built using neurons as active learning units. It's these neurons that learn from data; neurons are arranged in layers, and the layers are interconnected in different ways.
When you work with traditional machine learning models, there are certain parameters of the model that can be configured and controlled by the model developer. These design parameters are referred to as hyperparameters, and they can be tuned by the developer. With deep learning models, what the developer specifies is the model's design, the architecture of the neural network. The developer specifies how the different layers are connected together: is every neuron connected to every other neuron?
Are the connections dense or sparse? These are the choices the developer makes. Traditional machine learning models are white-box models: model parameters can be examined once the model is trained, and the parameters are well understood. Traditional models lend themselves very well to model explainability and interpretability. Deep learning models, on the other hand, have parameters that are very hard to interpret and may not be well understood; they tend to be black-box models.
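These architecture choices, the number of layers, the neurons in each layer, and how densely the layers are connected, can be written down as a simple description. Here is a minimal sketch in Python; the structure and every value in it are hypothetical, chosen just to illustrate the idea:

```python
# A hypothetical architecture description for a small fully-connected network.
# "layers" lists the number of neurons in each layer, from input to output;
# "connectivity" records how consecutive layers are wired together.
architecture = {
    "layers": [4, 8, 8, 1],   # 4 inputs, two hidden layers of 8 neurons, 1 output
    "connectivity": "dense",  # every neuron connects to every neuron in the next layer
    "activation": "relu",     # the transformation each neuron applies to its signal
}

# Count the connections (weights) implied by this dense architecture:
# each pair of adjacent layers contributes (neurons in) * (neurons out) connections.
num_connections = sum(a * b for a, b in zip(architecture["layers"],
                                            architecture["layers"][1:]))
print(num_connections)  # 4*8 + 8*8 + 8*1 = 104
```

Even this toy description shows why these are design choices rather than learned quantities: the developer fixes them before training begins.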
Artificial Neural Networks (ANNs) and how they work:
Deep learning models may give you results that are correct, but you don't know why they are correct. They don't lend themselves well to model interpretability or explainability. And finally, traditional models work well on small datasets, maybe a few hundred thousand records, while deep learning models tend to work well with very large datasets, millions of records. The deep learning models that we work with today are neural networks. These neural networks are computing systems loosely inspired by the biological neural networks that constitute human and animal brains.
Now, these neural networks are often referred to as ANNs, or artificial neural networks. Neural networks were built to mimic how the brain works, which is why the active learning units in neural networks are referred to as neurons. Neural network models are deep learning models, and deep learning models are part of a broader family of machine learning (ML) models based on artificial neural networks.
A neural network can have hundreds of layers extracting information from the data. This idea of neurons and layers might seem very abstract right now, but it is exactly what we'll seek to understand next. Let's talk about deep learning with neural networks, starting with a very simplified picture of how they work.
Neural networks are made up of active learning units called neurons, and these neurons are arranged in layers. Every layer of the neural network extracts some information from the data that you feed in, and each layer extracts slightly different information. Let's say you were feeding in images.
The first layer might look at pixel-level information, the second layer might find edges in the images, the third layer might find corners, the fourth layer might identify objects, and the fifth layer might put all of this information together to identify the image of a puppy. These neurons are connected to one another, so the output of one layer of neurons serves as the input to the next layer of neurons.
Earlier layers in the neural network learn very fine-grained details about the input. Later layers aggregate this information and learn more complex representations. All of this will start making more sense if you just bear with me; for now, I'm showing you the layers and how data flows through a neural network. If you were to take a magnifying glass and look into each of these layers, you'd see that they are made up of active learning units referred to as neurons, though today they are often just called units or learning units. In a neural network, these neurons are interconnected with one another.
That is, a neuron will process data and pass its output along to other neurons in the network. Here I've described a setup where every neuron in one layer is connected to every neuron in the next layer, but exactly how these interconnections are set up is part of neural network design and architecture. The interconnections between neurons work like synapses in the biological brain: every connection in a neural network can transmit a signal from one neuron to another, so however many connections you have, that many signals can be transmitted.
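The fully-connected arrangement just described, where every neuron in one layer receives a signal over a separate connection from every neuron in the previous layer, can be sketched in a few lines of plain Python. The function name and all the numbers here are illustrative, not from any particular library:

```python
# One dense (fully-connected) layer: every neuron in this layer receives a
# signal from every neuron in the previous layer. Each connection carries a
# weight, so a layer with n_in inputs and n_out neurons has n_in * n_out
# connections in total.
def dense_layer(inputs, weights, biases):
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum of all incoming signals, plus this neuron's bias.
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(total)
    return outputs

# 3 input signals feeding 2 neurons: 3 * 2 = 6 connections carrying 6 signals.
inputs = [1.0, 2.0, 3.0]
weights = [[0.1, 0.2, 0.3],   # weights on the connections into neuron 1
           [0.4, 0.5, 0.6]]   # weights on the connections into neuron 2
biases = [0.0, 0.0]
print([round(v, 6) for v in dense_layer(inputs, weights, biases)])  # [1.4, 3.2]
```

Each weight plays the role of a synapse's strength: it scales the signal travelling along that one connection.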
A neuron will receive signals from the neurons that connect to it. It will then process those signals in some way, perform some kind of transformation, and then signal the other neurons to which it is connected. During the training of a neural network, and when it's used for prediction, all of these interconnections are actively transmitting signals. So every neuron receives signals and transmits signals. The input to a neuron is a signal from some neuron in the previous layer: neurons from previous layers signal this neuron, this neuron processes the data, and the processed signal is then transmitted to neurons in the next layer. A neuron can thus receive multiple signals at its input. A neuron typically receives many signals, processes and transforms them in some way, and transmits its output signal to multiple neurons.
Conclusion:
Deep learning does not use any one specific algorithm the way traditional machine learning does; instead, it uses neural networks, which are made up of active learning units called neurons. When you consider all of the neurons in a layer together, each layer performs a different transformation on the input data. The input data that you feed into a neural network makes up the signals that traverse from the first layer to the last, and these signals are transformed along the way. The output of the last layer of the neural network gives you the final prediction of the model.
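The end-to-end flow described above, signals entering at the first layer, being transformed layer by layer, and emerging from the last layer as the prediction, can be sketched as a tiny two-layer network. All the weights here are made up purely for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Forward pass: the input signals traverse each layer in turn, and each layer
# applies its own transformation (weighted sums followed by an activation).
def forward(inputs, layers):
    signals = inputs
    for weights, biases in layers:
        signals = [sigmoid(sum(w * s for w, s in zip(neuron_w, signals)) + b)
                   for neuron_w, b in zip(weights, biases)]
    return signals

# Two layers with made-up weights: 2 inputs -> 2 hidden neurons -> 1 output.
layers = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer: its signal is the prediction
]
prediction = forward([1.0, 2.0], layers)[0]
print(0.0 < prediction < 1.0)  # True: the last layer's output is the model's prediction
```

Real networks have far more layers and learn their weights from data, but the flow of signals is exactly this.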