Neural Networks: life as alchemy
[ Last Updated: 2024-07-21 ]
After two weeks of deep-diving into code and feeling the soul-crushing weight of hyperparameter tuning, I’ve finally gathered the courage to write about Neural Networks. The core of this algorithm is incredibly "brute-force," yet the implementation details require immense patience.
What is a Neural Network?
Neural Networks (NN) aren't new; they gained academic attention back in the 1980s, inspired by the biological interactions between neurons in the human brain. Engineers love this analogy, but a modern NN's inner workings are quite distinct from those of an actual human brain.
After a period of stagnation due to limited data and computing power, the field exploded recently under the name "Deep Learning." Today, NNs solve complex tasks in image recognition, natural language processing, and more. Let's break down what actually happens inside the "box."
The Components of a Neural Network

1. Hidden Layers
The most significant difference between a simple regression and a Neural Network is the inclusion of Hidden Layers. These layers allow the model to capture interactions between input features to create new, higher-level information.

Imagine predicting whether someone will buy a product based on Price ($x_1$) and Utility ($x_2$). A hidden layer might combine these to create new "indices":
- Value-for-Money Index: High utility + Low price.
- Necessity Index: High utility (where price matters less, like a washing machine).
- Bargain Hunter Index: Extremely low price, even if utility is low.
By adding more layers, these indices can interact with each other, creating even more complex patterns.
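To make this concrete, here is a minimal NumPy sketch of one hidden layer doing exactly that combination. The weights are hand-picked for illustration (a real network would learn them), and the input values are made up:

```python
import numpy as np

# Inputs: [price, utility], already scaled to roughly 0-1.
x = np.array([0.2, 0.9])   # cheap and very useful

# Hand-picked weights, one row per hidden "index":
#   value-for-money : high utility, low price
#   necessity       : high utility, price barely matters
#   bargain-hunter  : very low price, utility barely matters
W = np.array([
    [-1.0, 1.0],   # value-for-money
    [-0.1, 1.2],   # necessity
    [-1.5, 0.1],   # bargain-hunter
])
b = np.array([0.0, -0.5, 0.5])

hidden = W @ x + b   # each entry is one "index"
print(hidden)        # three "index" activations, roughly [0.7, 0.56, 0.29]
```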
2. Activation Functions: The "On/Off" Switch
If every neuron only performed linear calculations (multiplication and addition), the entire network would just collapse into one big linear function. To capture non-linear patterns, we use Activation Functions.

- ReLU (Rectified Linear Unit): $f(x) = \max(0, x)$. It sets all negative inputs to 0 and keeps positive inputs as they are. It is computationally efficient and speeds up convergence compared to Sigmoid.
- Sigmoid: Maps values to a (0, 1) range, often used in the output layer for binary classification.
In our "Buy/No-Buy" example, if an item is useful but expensive, the "Bargain Hunter" neuron might output a negative value. ReLU switches that neuron "off" (setting its output to 0), so it contributes nothing rather than dragging the final decision down in proportion to how negative it is.
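For reference, here are the textbook definitions of both activations as a small NumPy sketch (nothing here is specific to my project):

```python
import numpy as np

def relu(z):
    # Negative inputs are clamped to 0; positive inputs pass through unchanged.
    return np.maximum(0, z)

def sigmoid(z):
    # Squashes any real number into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))     # [0.  0.  0.  1.5]
print(sigmoid(z))  # approx [0.12 0.38 0.5  0.82]
```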
3. The Output Layer and Loss Functions
The Output Layer produces the final prediction. The choice of activation function here depends on the task:
- Binary Classification: Sigmoid.
- Multi-class Classification: Softmax.
- Regression: Linear function.
Softmax is particularly cool for multi-class problems (like identifying digits 0-9). It turns raw scores into probabilities that sum to 1:

$$\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$

where $z_i$ is the raw score for class $i$.
We then use Cross-Entropy Loss to measure how far our prediction is from the truth.
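Here is a minimal NumPy sketch of how softmax and cross-entropy fit together; the three raw scores are made up for illustration:

```python
import numpy as np

def softmax(z):
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(probs, true_class):
    # Loss is the negative log-probability assigned to the correct class.
    return -np.log(probs[true_class])

scores = np.array([1.0, 3.0, 0.2])          # raw scores z_i for 3 classes
probs = softmax(scores)
print(probs, probs.sum())                    # probabilities, summing to 1
print(cross_entropy(probs, true_class=1))    # small loss: class 1 got most of the mass
```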
4. Backpropagation: The "Recall" Process
How does the network learn? Through Backpropagation. We calculate the error at the output and use the Chain Rule from calculus to "propagate" that error backward through the layers, updating every weight ($w$) and bias ($b$) along the way.
It’s a massive, iterative game of "tuning the knobs" until the total error is minimized.
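To see the knob-tuning in action, here is a deliberately tiny sketch: one input, one weight, one bias, a squared-error loss, and the chain rule written out by hand. Real networks do the same bookkeeping layer by layer, and frameworks automate it:

```python
# Minimal backprop sketch: fit y = w*x + b to a single target with squared error.
x, y_true = 2.0, 7.0
w, b = 0.5, 0.0          # arbitrary starting values
lr = 0.05                # learning rate

for step in range(200):
    y_pred = w * x + b               # forward pass
    loss = (y_pred - y_true) ** 2    # squared-error loss

    # Backward pass via the chain rule:
    #   dloss/dy_pred = 2*(y_pred - y_true), dy_pred/dw = x, dy_pred/db = 1
    grad_y = 2 * (y_pred - y_true)
    grad_w = grad_y * x
    grad_b = grad_y * 1

    # Nudge every weight and bias a small step against its gradient.
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b, loss)   # w*x + b should now be very close to 7.0
```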
Conclusion: The "Alchemy" of Coefficients
In a deep network, individual neurons lose their obvious "meaning" (like our "Bargain Hunter" example). Below is a visualization of 60 neurons' weights from my handwritten digit recognition (MNIST) project. Some look like strokes of numbers; others look like random static.
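(For anyone who wants to reproduce a picture like this: a rough Matplotlib sketch, assuming the first-layer weights sit in a (60, 784) NumPy array, one row of 784 weights per neuron, each reshaped back to the 28x28 MNIST grid.)

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in array with the assumed shape (60, 784);
# replace it with your trained first-layer weights.
weights = np.random.randn(60, 784)

fig, axes = plt.subplots(6, 10, figsize=(10, 6))
for ax, row in zip(axes.ravel(), weights):
    ax.imshow(row.reshape(28, 28), cmap="gray")  # one neuron's weights as a 28x28 image
    ax.axis("off")
plt.tight_layout()
plt.show()
```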

It’s magical that these "static" patterns, when combined, can recognize digits with incredible accuracy. It feels a bit like alchemy—if you provide enough data and turn the knobs long enough, you eventually find gold.
I’m currently looking into Convolutional Neural Networks (CNNs) for better performance on MNIST, though that might take another week or two to write up!

Honks:
Honestly, successfully "demystifying" Neural Networks for myself has been a highlight of the month. They aren't "unfathomable"—just complex and incredibly powerful.
— Untitled Penguin 2024/07/21 18:30