Neural Network
Also known as: ANN, Artificial Neural Network
A computing system inspired by biological neural networks, consisting of interconnected nodes (neurons) organized in layers that process information using learnable weights and activation functions.
Overview
Neural networks underpin much of modern AI. Inspired by the structure of biological brains, they consist of layers of interconnected nodes (neurons) that process and transform data. During training, a network adjusts the strengths of the connections between neurons (the weights) so that it learns patterns in the data.
Structure
Input Layer
Receives the raw data and passes it to the network. Each neuron in the input layer represents a feature of the input data.
Hidden Layers
One or more intermediate layers where most of the computation and pattern recognition occur. Each layer transforms its input by taking weighted sums and applying a non-linear activation function.
Output Layer
Produces the final result, such as a classification, prediction, or generated output.
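The three layer types above can be sketched as a single forward pass in NumPy. This is a minimal illustration, not a reference implementation: the layer sizes, the random initialization, and the names `W1`, `b1`, `W2`, `b2`, and `forward` are all assumptions chosen for the example, with ReLU in the hidden layer and softmax at the output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 input features, 8 hidden neurons, 3 output classes.
n_in, n_hidden, n_out = 4, 8, 3

# Learnable weights and biases (small random values, chosen for illustration).
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

def forward(x):
    """Run a batch of inputs (shape: batch x n_in) through the network."""
    # Hidden layer: weighted sum followed by a non-linear activation (ReLU).
    hidden = np.maximum(0.0, x @ W1 + b1)
    # Output layer: another weighted sum, producing one score per class.
    logits = hidden @ W2 + b2
    # Softmax turns each row of scores into a probability distribution.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Input layer: each column of x is one feature of the input data.
x = rng.normal(size=(2, n_in))
probs = forward(x)
print(probs.shape)  # one row of class probabilities per input
```

Each row of `probs` sums to 1, so it can be read as the network's (untrained) class probabilities for that input.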
How Learning Works
Neural networks are typically trained with gradient descent, using backpropagation to compute the gradients. Each training step has four stages:
- Forward Pass: Input data flows through the network to produce an output
- Loss Calculation: The output is compared to the desired result using a loss function
- Backward Pass: Gradients of the loss are computed with respect to each weight
- Weight Update: Weights are adjusted in the direction that reduces the loss
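The four stages above can be sketched as a small training loop. The example below is a hedged illustration under several assumptions that do not come from the text: a toy regression task (fitting y = x^2), one hidden layer of 8 ReLU neurons, a mean squared error loss, plain full-batch gradient descent, and a hypothetical learning rate of 0.05. The gradients are derived by hand with the chain rule, which is what backpropagation does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): learn y = x^2 on [-1, 1].
X = np.linspace(-1.0, 1.0, 32).reshape(-1, 1)
y = X ** 2

# One hidden layer with 8 ReLU neurons and a linear output neuron.
W1 = rng.normal(scale=0.5, size=(1, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

learning_rate = 0.05  # hypothetical value
losses = []

for step in range(500):
    # Forward pass: input flows through the network to produce an output.
    z1 = X @ W1 + b1
    h = np.maximum(0.0, z1)
    pred = h @ W2 + b2

    # Loss calculation: mean squared error against the desired result.
    loss = np.mean((pred - y) ** 2)
    losses.append(loss)

    # Backward pass: gradient of the loss with respect to each weight.
    d_pred = 2.0 * (pred - y) / len(X)
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0)
    d_h = d_pred @ W2.T
    d_z1 = d_h * (z1 > 0)  # derivative of ReLU is 1 where z1 > 0, else 0
    dW1 = X.T @ d_z1
    db1 = d_z1.sum(axis=0)

    # Weight update: step in the direction that reduces the loss.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

In practice, frameworks compute the backward pass automatically and usually update weights on mini-batches rather than the full dataset; the hand-written version here is only meant to make each stage visible.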
Activation Functions
- ReLU: A common default for hidden layers; outputs zero for negative inputs and the input value itself for positive inputs
- Sigmoid: Squashes output to the range (0, 1), useful for binary classification
- Softmax: Produces a probability distribution across multiple classes
- GELU: A smoother variant of ReLU used in many Transformer models
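The four functions listed above can be written in a few lines of NumPy. This is a minimal sketch: the function names are chosen for the example, the softmax subtracts the row maximum for numerical stability, and the GELU uses the widely used tanh approximation rather than the exact form based on the Gaussian CDF.

```python
import numpy as np

def relu(x):
    # Zero for negative inputs, the input itself for positive inputs.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtracting the maximum does not change the result but avoids overflow.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def gelu(x):
    # Tanh approximation of GELU (an assumption; the exact form uses erf).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))           # negative entry is clipped to zero
print(sigmoid(x))        # sigmoid(0) is exactly 0.5
print(softmax(x).sum())  # probabilities sum to 1
print(gelu(x))           # close to ReLU for large |x|, smooth near zero
```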
Related Terms
Artificial Intelligence (AI)
The simulation of human intelligence processes by computer systems, including learning, reasoning, self-correction, and the ability to perform tasks that typically require human cognition.
Deep Learning
A subset of machine learning based on artificial neural networks with multiple layers (deep architectures) that can learn hierarchical representations of data for complex pattern recognition.
Machine Learning
A subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed, using algorithms that identify patterns in data.
Transformer
A neural network architecture based on self-attention mechanisms that processes input sequences in parallel, forming the foundation of virtually all modern large language models.