# Activation Functions: Linear, Sigmoid, Tanh, Hard Tanh, Softmax, Rectified Linear

Here are some key points about common activation functions used in neural networks:

1. Linear Activation Function:
• The linear (identity) activation function passes the weighted sum of the inputs and bias through unchanged, applying no nonlinearity.
• Its output is simply f(x) = x, so a neuron's output equals its pre-activation value.
• The linear activation function is mainly used in regression problems where the output needs to be continuous and unbounded.
• However, it is rarely used in hidden layers, because a stack of purely linear layers collapses into a single linear transformation and therefore cannot learn the complex, nonlinear patterns and relationships in the data.
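A minimal NumPy sketch (function and variable names here are illustrative) of the identity activation, plus a quick check of the collapse argument above: without a nonlinearity between them, two weight matrices compose into one.

```python
import numpy as np

def linear(x):
    # Identity activation: the pre-activation value passes through unchanged.
    return x

z = np.array([-2.0, 0.0, 3.5])
assert np.array_equal(linear(z), z)

# Two stacked linear layers are equivalent to one layer with W = W2 @ W1,
# which is why linear activations add no representational power in depth.
W1 = np.array([[1.0, 2.0], [0.5, -1.0]])
W2 = np.array([[0.0, 1.0], [1.0, 1.0]])
x = np.array([1.0, -1.0])
assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)
```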
2. Sigmoid Activation Function:
• The sigmoid activation function applies a sigmoid function to the weighted sum of the inputs to produce an output between 0 and 1.
• The sigmoid function is a nonlinear function that has an S-shaped curve, which makes it useful in binary classification problems where the output is a probability value.
• However, the sigmoid function has some drawbacks, such as the vanishing gradient problem, which can make it difficult to train deep neural networks.
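A short NumPy sketch (names are illustrative) of the sigmoid and its derivative; the derivative never exceeds 0.25, which hints at why gradients shrink when many sigmoid layers are chained.

```python
import numpy as np

def sigmoid(x):
    # Maps any real input into (0, 1); sigmoid(0) = 0.5.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative s*(1-s) peaks at 0.25 (at x = 0) and decays toward 0
    # for large |x| -- one source of vanishing gradients in deep stacks.
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(0.0))       # 0.5
print(sigmoid_grad(0.0))  # 0.25
```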
3. Tanh Activation Function:
• The tanh activation function applies the hyperbolic tangent function to the weighted sum of the inputs to produce an output between -1 and 1.
• The tanh function is similar to the sigmoid function, but it is symmetric around 0 and has a steeper gradient near the origin; its zero-centered output often makes optimization easier when a bounded activation is needed.
• However, like the sigmoid function, the tanh function can suffer from the vanishing gradient problem, which can make it difficult to train deep neural networks.
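A brief NumPy sketch (illustrative names) of tanh, including a check of the standard identity relating it to the sigmoid, tanh(x) = 2·sigmoid(2x) − 1, which makes the "rescaled, zero-centered sigmoid" view concrete.

```python
import numpy as np

def tanh(x):
    # Hyperbolic tangent: squashes input into (-1, 1), zero-centered.
    return np.tanh(x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3.0, 3.0, 7)
# tanh is a rescaled, shifted sigmoid: tanh(x) = 2*sigmoid(2x) - 1
assert np.allclose(tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0)
print(tanh(0.0))  # 0.0
```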
4. Hard Tanh Activation Function:
• The hard tanh activation function is a piecewise-linear approximation of tanh that clips its input: it outputs -1 for inputs below -1, the input itself between -1 and 1, and 1 for inputs above 1.
• The hard tanh function is faster to compute than the tanh function and is commonly used in embedded systems and real-time applications.
• However, like the tanh function, the hard tanh function can suffer from the vanishing gradient problem.
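A minimal NumPy sketch (illustrative names) of hard tanh as a clip operation; note the gradient is exactly zero outside [-1, 1], which is where its saturation problem comes from.

```python
import numpy as np

def hard_tanh(x):
    # Piecewise-linear clip: -1 below -1, x in between, 1 above 1.
    return np.clip(x, -1.0, 1.0)

print(hard_tanh(np.array([-3.0, -0.5, 0.0, 0.5, 3.0])))
```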
5. Softmax Activation Function:
• The softmax activation function is used in the output layer of multi-class classification problems.
• The softmax function normalizes the output so that the sum of all outputs is 1, representing the probability distribution over all possible classes.
• The softmax function is useful for multi-class classification problems because it provides a clear output that indicates the probability of the input belonging to each class.
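A short NumPy sketch (illustrative names) of softmax. Subtracting the maximum logit before exponentiating is a common numerical-stability trick; it does not change the result because softmax is invariant to shifting all logits by a constant.

```python
import numpy as np

def softmax(z):
    # Subtract the max logit for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs.sum())  # probabilities sum to 1
```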
6. Rectified Linear Activation Function (ReLU):
• The rectified linear activation function (ReLU) outputs the weighted sum of the inputs unchanged when it is positive, and outputs 0 otherwise: f(x) = max(0, x).
• The ReLU function is commonly used in hidden layers of deep neural networks and is known to improve training performance.
• The ReLU function is fast to compute, and its gradient does not saturate for positive inputs, which mitigates the vanishing gradient problem in deep networks; however, units whose inputs stay negative receive zero gradient and can "die" during training.
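A minimal NumPy sketch (illustrative names) of ReLU and its gradient, showing the constant gradient of 1 on the positive side and the zero gradient on the negative side that underlies "dying" units.

```python
import numpy as np

def relu(x):
    # max(0, x): passes positives through unchanged, zeroes out negatives.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is 1 for positive inputs, 0 otherwise (undefined at 0,
    # conventionally taken as 0 here).
    return np.where(x > 0, 1.0, 0.0)

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))
print(relu_grad(z))
```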