The thing about neural networks is that you kill off ineffective permutations as training is refined.
So it seems from your response that rather than having a directed goal, the function acts more or less like a random walk?
Sorry if my previous response wasn’t that clear.
The network learns through “gradient descent.”
Gradient descent is a fundamental optimization algorithm used in training neural networks. It works by iteratively adjusting the network’s parameters to minimize the difference between its predicted outputs and the actual targets in the training data.
The process starts with initializing the parameters randomly. Then, for each training example, the network calculates the error between its prediction and the target output. The gradient of the error with respect to the parameters is computed, indicating the direction and magnitude in which the error grows most steeply, so stepping the opposite way is the direction of steepest descent in the error space.
The parameters are updated by subtracting the gradients, scaled by a small factor known as the learning rate, from their current values. This step is repeated for all training examples over multiple iterations, or epochs, gradually reducing the error.
By repeatedly adjusting the parameters based on the gradients, the neural network “descends” the error surface, converging towards a set of parameter values that yield better predictions. This iterative process enables the network to learn and improve its performance over time.
— ChatGPT
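To make that loop concrete, here is a rough sketch in plain Python of the same idea: it fits a single weight w so that w * x approximates y on a few data points. The data, learning rate, and number of epochs are all made up for illustration; a real network does exactly the same thing, just with millions of parameters and a much more complicated error surface.

```python
# Minimal gradient-descent sketch (toy example, all values invented for illustration):
# fit a single weight w so that w * x approximates y.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]      # the "right answer" here is w = 2

w = 0.5                        # 1. start from an arbitrary initial parameter
learning_rate = 0.05

for epoch in range(50):        # 4. repeat over many epochs
    for x, y in zip(xs, ys):
        prediction = w * x
        error = prediction - y             # 2. how far off is the prediction?
        gradient = 2 * error * x           # 3. slope of the squared error w.r.t. w
        w -= learning_rate * gradient      #    step a little bit "downhill"

print(w)   # ends up very close to 2.0, the value that minimizes the error
```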
Think of the gradient as a slope: you are always trying to work out which way is “downhill” and move toward the lowest possible point. A good visualization of gradient descent is this video: https://youtu.be/hfMk-kjRv4c?t=907 (from 15:07).
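For example, if the slope at the current value of a parameter is +2 and the learning rate is 0.1, the update moves that parameter by -0.2; if the slope were -2, the step would be +0.2. Either way the sign of the gradient tells you which direction is downhill, and the learning rate controls how big the step is.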