Activation functions play a crucial role in training neural networks. Currently, the most widely used and default activation function in most neural network frameworks is the Rectified Linear Unit (ReLU), defined as ReLU(x) = max(0, x), which passes positive inputs through unchanged and returns zero for negative inputs. Although ReLU is computationally simple and induces sparsity, on datasets whose input distribution is skewed toward negative values, the abundance of negative inputs often results in a large number of ReLU