ModReLU is an activation function, used in machine learning and artificial neural networks, that modifies the Rectified Linear Unit (ReLU) activation function. Activation functions determine the output each neuron produces based on the input it receives.

What is an Activation Function?

An activation function is an essential part of a neural network that introduces non-linearity, allowing the network to model complex patterns and make accurate predictions. In essence, it applies a mathematical function to each neuron's weighted input, transforming it into that neuron's output.

Without activation functions, a neural network, no matter how many layers it has, collapses into a single linear transformation and can only model linear relationships. Adding non-linear activation functions makes the network far more expressive, allowing it to adapt to different types of data and learn to make accurate predictions.
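As a quick illustration of this point, the NumPy sketch below (purely illustrative; the matrix names are made up for this example) shows that two stacked linear layers with no activation in between compute exactly the same function as a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation function in between.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)

# Passing x through both layers...
two_layers = W2 @ (W1 @ x + b1) + b2

# ...matches a single linear layer with W = W2 @ W1 and b = W2 @ b1 + b2.
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)

print(np.allclose(two_layers, one_layer))  # True
```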

What is ReLU?

ReLU, or Rectified Linear Unit, is one of the most widely used activation functions in neural networks. It is a simple function that returns the input value if it is positive, and zero if it is negative. Mathematically, ReLU is defined as:

$$ \text{ReLU}(z) = \begin{cases} z, & \text{if } z\geq 0\\ 0, & \text{otherwise} \end{cases} $$
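As a minimal sketch, here is how this definition looks in NumPy (the helper name `relu` is just illustrative):

```python
import numpy as np

def relu(z):
    """Element-wise ReLU: returns z where z >= 0, and 0 otherwise."""
    return np.maximum(z, 0.0)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))
# [0.  0.  0.  1.5 3. ]
```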

ReLU is preferred over other activation functions like the sigmoid function because it is very fast to compute and is less likely to suffer from the problem of vanishing gradients.

What is modReLU?

ModReLU is a modification of ReLU that aims to address some of the limitations of the original function. Specifically, modReLU introduces a bias parameter that can be learned during training. This bias parameter adjusts the activation threshold of the function (the point below which the output is zero), making it more flexible and adaptable.

The modReLU activation function is defined as:

$$ \sigma_{\text{modReLU}}\left(z\right) = \begin{cases} (|z| + b)\frac{z}{|z|}, & \text{if } |z| + b \geq 0\\ 0, & \text{otherwise} \end{cases} $$

In this equation, $b$ is the bias parameter of the nonlinearity. For an $n_h$-dimensional hidden space, one bias parameter is learned per dimension. Note that $z$ may be real or complex-valued (modReLU was originally proposed for the complex-valued hidden states of unitary recurrent networks); the factor $z/|z|$ preserves the sign or phase of $z$, while its magnitude is rescaled to $\text{ReLU}(|z| + b)$.

By adding the bias parameter, modReLU can shift the activation threshold up or down depending on the value of $b$, controlling how large an input's magnitude must be before the neuron activates. This allows the neural network to learn more complex features in the data and make more accurate predictions.
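As a concrete sketch of the definition above (the function and variable names are illustrative, not part of any particular library), modReLU can be written in NumPy so that it works for both real and complex inputs:

```python
import numpy as np

def modrelu(z, b, eps=1e-8):
    """modReLU: rescale the magnitude of z to ReLU(|z| + b) while keeping
    its sign/phase via z / |z|. Works for real or complex-valued z.

    z : array of shape (n_h,) -- pre-activations (may be complex)
    b : array of shape (n_h,) -- one learnable bias per hidden dimension
    """
    magnitude = np.abs(z)
    scale = np.maximum(magnitude + b, 0.0)  # ReLU(|z| + b)
    phase = z / (magnitude + eps)           # z / |z|, guarded against |z| = 0
    return scale * phase

# With b = -1, inputs whose magnitude is below 1 are zeroed out;
# larger inputs keep their phase but have their magnitude reduced by 1.
z = np.array([0.5 + 0.5j, 2.0 + 0.0j, -3.0 + 1.0j])
b = np.full(z.shape, -1.0)
print(modrelu(z, b))
```

Running this with $b = -1$ zeroes out the first entry (its magnitude is about 0.71, below the threshold of 1), which illustrates how the learned bias shifts the activation threshold.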

Advantages of modReLU

The modReLU activation function has several advantages over other activation functions, including:

  • Flexibility: The learnable bias parameter allows modReLU to adjust the activation threshold of the function, making it more flexible and adaptable to different types of data.
  • Improved Accuracy: By adjusting the activation threshold, modReLU can learn more complex patterns in the data, resulting in improved accuracy.
  • Reduced Overfitting: Overfitting occurs when a neural network becomes too specialized to the training data and performs poorly on unseen data. ModReLU's flexibility allows it to avoid overfitting by adjusting to the data more effectively.

ModReLU is a modification of the ReLU activation function that introduces a learnable bias parameter. This bias lets modReLU adjust its activation threshold, making it more flexible and adaptable to different types of data, which can translate into improved accuracy, reduced overfitting, and better performance in neural networks.
