How the foundational component of a new architecture for AI was forged, validated, and ultimately revealed the limits of standard benchmarks.

This work represents a complete research pipeline built from scratch, encompassing formal mathematical foundations, a PyTorch implementation, visualization tools, and systematic experimental validation across a series of experiments. The entire system is open-source, enabling reproducible research and collaborative development.

## Key Points

- **Vector synapses:** Multi-dimensional synaptic communication replacing single weights with informational, plasticity, and modulatory signals.
- **Adaptive topology:** ~92.7% accuracy using only 20% of connections, suggesting that roughly 80% of the connections in a dense network are redundant.
- **Complete implementation:** Formal math, PyTorch integration, visualization tools, and a CLI for in-depth research.
- **Local learning:** Replaced backpropagation with Hebbian rules, achieving ~92% accuracy through purely local updates.
- **Representation insight:** Multi-layer "failures" revealed successful representation learning, challenging standard evaluation methods.

## Why This Matters

Current AI systems like large language models build their intelligence on a product of human cognition: language itself. They excel at learning patterns in text and generating coherent responses. This got me thinking: what if we could build AI that learns about the world directly, rather than just from human-written text and data?

This question sparked my journey into exploring whether brain-inspired networks might learn differently from the AI systems we're used to. This article documents my first step toward that vision — developing and validating a novel, bio-inspired neuron architecture — and how its success was almost missed because of conventional evaluation methods.

## Establishing a Strong Baseline

For my initial independent research, I focused on MNIST to validate the core innovations with limited resources before scaling.
This neuron model builds on existing spiking neural network (SNN) advancements, which makes a standard SNN a natural baseline for direct comparison and for demonstrating the novel neuron's unique contributions. To validate the experimental setup and set a conventional target, a standard SNN was built using snntorch and `nn.Linear` layers. It achieved a baseline accuracy of 98.11% on MNIST, setting a high bar for the narrow task of classification.

With the baseline established, it was time to integrate the neuron model with the traditional PyTorch framework tools step by step. This incremental approach made it possible to isolate the impact of each novel component and observe its interaction with the existing network architecture.

## Introducing the Vector Synapse

Traditional neural networks connect neurons with a single number, the synaptic weight, which strengthens or weakens signals. The vector synapse instead uses a rich data packet for this connection, delivering multiple instructions simultaneously:

- **Immediate message:** a fast informational signal, similar to the traditional weight, dictating the neuron's immediate reaction.
- **Learning message:** a parallel instruction guiding the synapse's future self-modification.
- **Modulatory message:** a slow background signal influencing the neuron's overall state, such as excitability or learning rules.

This bundling allows for complex, dynamic conversations between neurons, separating urgent computation from slower adaptation and learning.

## The First Unexpected Result

The first novel component, the `VectorSynapse`, was implemented as a PyTorch `nn.Module`. Replacing the first `nn.Linear` layer with the custom `VectorSynapse` module and attempting to train it with a generic surrogate-gradient method caused accuracy to collapse to 9.80%, effectively chance level for a ten-class problem.
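To make the vector-synapse idea concrete, here is a minimal, purely illustrative sketch of what such a module could look like in PyTorch. The class name, the shapes, and the way the three messages combine are my own assumptions for illustration, not the article's actual `VectorSynapse` implementation:

```python
import torch
import torch.nn as nn

class ToyVectorSynapse(nn.Module):
    """Hypothetical sketch: each connection carries three parallel signals
    instead of a single scalar weight."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # Immediate message: fast informational signal (like a classic weight)
        self.w_info = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Learning message: guides the synapse's future self-modification
        self.w_plast = nn.Parameter(torch.zeros(out_features, in_features))
        # Modulatory message: slow signal shifting the neuron's overall state
        self.w_mod = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x):
        info = x @ self.w_info.t()            # drives the immediate response
        plast = x @ self.w_plast.t()          # shapes future plasticity
        mod = torch.tanh(x @ self.w_mod.t())  # slow modulatory contribution
        # Return the activation plus a separate learning signal
        return info + mod, plast
```

In a sketch like this, the returned `plast` tensor would feed the layer's local learning rule rather than the forward computation, keeping urgent computation and slower adaptation on separate channels.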
The problem was not the model but a fundamental mismatch of methods: the generic learning algorithm was incompatible with the custom module's logic. Regardless of the number of epochs, accuracy stayed the same and the loss did not change at all. Simply dropping the custom implementation in place of the traditional one was not enough.

A likely contributing cause was oversaturated synaptic strength from a poor choice of initial values; recalibrating them restored accuracy to the baseline. This indicated that, when carefully tuned, the new implementation can replace the classical one, although without any practical gains and with slower learning. It did, however, create the ground for customizing synaptic behavior further.

## Breakthrough — Replacing Backpropagation

While calibrating the synapse proved it could replace a standard layer, it offered no real advantage and was slower to train. The true potential of the vector synapse was only unlocked when it was allowed to operate according to its own principles.

To test this, the module was integrated with its intended, custom Hebbian-like learning rule. The experiment used a hybrid architecture: a single layer of custom `VectorSynapse` neurons performed the initial feature extraction, followed by a standard, backpropagation-trained layer for the final classification.

The hybrid setup hit ~92.06% accuracy — not bad, though still trailing the conventional baseline of 98%. But here's where it gets interesting.
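The article does not spell out the exact Hebbian-like rule, but the core idea of purely local learning (strengthening connections between co-active units using only pre- and post-synaptic activity, with no global error signal) can be sketched as follows. This is a toy version under my own assumptions, not the author's actual rule:

```python
import torch

def hebbian_update(w, pre, post, lr=0.01, decay=0.001):
    """Toy local Hebbian rule: strengthen weights between co-active units.

    Uses only locally available pre-/post-synaptic activity; there is no
    global error signal and no backpropagated gradient.
    """
    # Batch-averaged outer product of post- and pre-synaptic activity
    dw = post.t() @ pre / pre.shape[0]
    # Mild weight decay keeps synapses from saturating
    return w + lr * dw - decay * w
```

In a hybrid setup like the one described, a feature layer would be trained with such local updates, while a standard `nn.Linear` readout on top of the resulting features is trained with ordinary backpropagation.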
It proved that the model, operating entirely on its own local rules without any global error signal, could successfully replace backpropagation for the critical task of feature extraction.

Interestingly, varying the number of custom neurons and the number of readout-training epochs reveals more optimal configurations: for example, a 10x reduction in custom neurons cost only about 10% of readout accuracy while cutting feature-extraction time by 4x. Balancing the configuration therefore plays a large role in overall model performance.

## The Challenge of Integrated Complexity

With the core learning mechanism validated, the next step was to integrate the full suite of advanced features from the formal model, including dynamic neuromodulation and component-wise retrograde signaling. The integration was a technical success; the network remained highly stable and effective, achieving a strong accuracy of ~92.17%. While this was slightly below the peak performance of the simpler hybrid model, it established a new, more biologically complete baseline and highlighted a mature research lesson: increasing a model's biological fidelity often turns a straightforward optimization task into a complex, multi-parameter tuning challenge.

## The Path to an Efficient Topology

This increased fidelity also came with a significant computational cost, raising a critical question about how the model could scale efficiently. The initial hypothesis was that a sparse, "small-world" topology could provide a solution. Experiments showed that a Watts-Strogatz network with just 0.5% of the connections — a 200-fold reduction — could still achieve a surprisingly strong 85.3% accuracy. While this revealed the model's resilience, the performance trade-off was still too high.

This led to the final advancement of this phase: an adaptive topology.
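Before looking at the result, it helps to picture what such a mechanism could look like. The growth rule below is a toy sketch under my own assumptions (the article does not specify the actual rule): keep a binary connectivity mask that starts sparse and periodically enable a few new connections, here scored by some accumulated activity statistic.

```python
import torch

def grow_mask(mask, scores, grow_frac=0.01):
    """Toy adaptive topology: enable a small fraction of currently disabled
    connections, choosing candidates with the highest scores (e.g. an
    accumulated pre/post activity correlation)."""
    n_grow = max(1, int(grow_frac * mask.numel()))
    # Only disabled connections are candidates for growth
    candidate_scores = torch.where(mask == 0, scores,
                                   torch.full_like(scores, float('-inf')))
    _, idx = torch.topk(candidate_scores.flatten(), n_grow)
    grown = mask.clone().flatten()
    grown[idx] = 1.0
    return grown.view_as(mask)

# Start near 15% density and grow a little on each call
mask = (torch.rand(100, 784) < 0.15).float()
mask = grow_mask(mask, torch.rand(100, 784))
```

Multiplying a weight matrix elementwise by such a mask in the forward pass keeps most connections off while letting the structure fill in gradually during training.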
The network was initialized with sparse (15%) connectivity and given a simple rule to progressively grow new connections as learning progressed. The result was remarkable: the network's structure grew organically and reached ~92.7% accuracy at just 20% density, nearly matching the fully connected model with 80% fewer connections. This demonstrated that most connections in a dense network are redundant and that an intelligent, adaptive structure is key to balancing complexity with efficiency.

## A Different Perspective: Building Intuition

In parallel with the quantitative benchmarking in PyTorch, a separate tool was developed to provide a more qualitative understanding of the model's behavior: a complete, pure-Python implementation of the neuron, with a command-line interface and a web-based visualization tool.

This system was never intended for benchmarking. Its purpose was to build intuition. By presenting MNIST digits to this network and watching the real-time visualization of membrane potentials and propagating spikes, it was possible to see how the network responded to patterns outside the rigid constraints of a supervised task.

This interactive exploration confirmed that the model's dynamics behaved as the formal theory intended, and it provided the crucial visual feedback needed to question the results of the purely quantitative experiments that followed.

## Reinterpreting "Failure" — Success in Representation

The most pivotal moment of the research came when attempting to build a pure multi-layer network in which every layer consisted of the custom `VectorSynapse` neurons. The initial result looked like a complete failure: classification accuracy dropped to around 16% without a classification layer, reaching 9.8% (chance level) in the worst cases.

The conventional interpretation would be simple: the model's local learning rules are incapable of hierarchical credit assignment.
But given the model's core philosophy and the insights from the visualization tool, a different hypothesis emerged. What if this "failure" was actually telling me something interesting? Perhaps the deeper layers were learning something about the digits that simply doesn't matter for the 0–9 labeling task I was testing.

The high performance of the single-layer hybrid model had already shown that one custom layer was sufficient to extract the simple features needed for MNIST classification. The subsequent layers in the pure network therefore lacked a strong, goal-oriented signal for this specific task. Freed from that constraint, they were not failing to learn; they were fulfilling their primary purpose: building a rich, hierarchical representation of the data. They were likely learning to represent more abstract patterns within the digits — such as variations in handwriting style, stroke thickness, or slant — features that are irrelevant to the simple 0–9 labels used for evaluation.

The low accuracy score, therefore, was not a model failure but an evaluation failure: a mismatch between a simplistic benchmark and the system's more sophisticated objective. The model was building a representation of MNIST digits, but I was only asking it to perform a trivial labeling task.

## Reality Check

Let me be honest about where this stands. The performance isn't beating conventional methods yet; 92% vs. 98% is a meaningful gap. The system is computationally heavier and needs a lot of parameter tweaking, which isn't exactly the self-organizing system I'm working towards. And I've only tested this on MNIST, which is pretty much the "hello world" of AI datasets.

## Future Work

So where does this go from here? The obvious next step is testing on something harder than MNIST. CIFAR-10 would be a good start to see whether the representational-learning idea actually holds up once I can run proper transfer-learning experiments.
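One concrete way to test the representation hypothesis, independent of the end-to-end score, is a linear probe: freeze the learned features and train only a linear classifier on top. If the probe scores well, the representation is useful even when the full network's accuracy looks poor. This is standard practice rather than something from the article, and the feature tensor below stands in for any frozen feature extractor's output:

```python
import torch
import torch.nn as nn

def linear_probe(features, labels, num_classes, epochs=100, lr=0.1):
    """Train only a linear classifier on frozen features.

    High probe accuracy means the representation is linearly useful,
    even if the full network's end-to-end accuracy looked poor.
    """
    probe = nn.Linear(features.shape[1], num_classes)
    opt = torch.optim.SGD(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(probe(features), labels)
        loss.backward()
        opt.step()
    # Fraction of samples the frozen-feature probe classifies correctly
    return (probe(features).argmax(1) == labels).float().mean().item()
```

Probing the "failed" deeper layers this way, on the original labels or on auxiliary ones like stroke thickness or slant, would directly separate "learned nothing" from "learned something the benchmark doesn't reward."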
The adaptive topology work also looked promising for efficiency, so that's another avenue worth exploring.

But honestly, the bigger challenge might be figuring out how to properly evaluate what these systems are actually learning. If they're really building internal models of the world rather than just optimizing for specific tasks, I need better ways to measure that. Traditional benchmarks might be missing the point entirely.

The long-term vision is still about building systems that learn continuously from their environment, developing rich internal models the way biological systems do. But I need to test this on much harder problems, compare it properly with other brain-inspired approaches, and find better ways to measure what the system is actually learning. This is very much the beginning of the story, not the end.

## Conclusion: Lessons Beyond the Benchmark

This story of validating a novel neuron revealed that its performance could not be measured by conventional metrics alone. The most profound insight was not achieving a high score, but understanding why the model "failed" when it was actually succeeding at its intended purpose. This whole experience taught me that sometimes our models might be succeeding in ways we're not measuring; the challenge now is figuring out how to actually test that hypothesis.

This article validated the foundational building block and laid a foundation for what comes next. The next article will explore the why behind this neuron's representational power, detailing its formal mathematics.
In the articles that follow, we will zoom back out to the full architecture, exploring how to build and — more importantly — how to evaluate a system designed not just to solve tasks, but to understand its world.

The source code for the neuron model, the CLI tool, and the web visualization tool, as well as the Python script for MNIST visualization, is available at https://github.com/arterialist/neuron-model

This article is part of ongoing research presented at https://al.arteriali.st

I invite you to follow me on Hackernoon, Medium, LinkedIn, and X (Twitter) to get notified about further research on this topic. Thank you!