Artificial and biological neural networks

Andrey Belkin
Published in DataDrivenInvestor · Jan 8, 2023


The search for the algorithm by which biological neural networks operate can be compared to the search for the Holy Grail. Of course, many will say that there is no Grail, that it is all legend, that everything essential is already implemented in artificial neural networks: just wait for the technology and the computing resources to mature, and real artificial intelligence will be created, with no need to understand such a complex and confusing organ. But I hope there is a good number of adventurers who will be interested in some thoughts on where to look for this Grail. In this article we will analyze and compare the workings of artificial neural networks with hypotheses about how biological neural networks work, accompany this with practical experiments, and take apart a new artificial neural network whose principle of operation is closer to its biological counterpart.

So how does an artificial neural network work?

We will start with artificial neural networks. We will not restate their basic principles here, since numerous other sources cover them better; instead we will try to grasp something fundamental about how they work in terms of pattern recognition. One simple example demonstrates this very clearly:

Imagine a network consisting of two parts, an encoder and a decoder, whose input is examples from the MNIST set of handwritten digits and whose output must reproduce the input. We train it by the classical method, backpropagation of error. The narrowest part of the network, the smallest layer between the encoder and the decoder, will have only 2 neurons. After training, it turns out that these 2 neurons carry the information, so to speak, encoding the shapes of all the digits. If you plot the activity of one neuron along the x axis and the activity of the other along the y axis, you can mark which points on the diagram correspond to which class, and use the decoder to reproduce the outline of the digit corresponding to any point.
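To make this concrete, here is a minimal sketch of such an autoencoder, assuming PyTorch; the hidden-layer size and training details are illustrative, not taken from the original experiment.

import torch
import torch.nn as nn

# Minimal MNIST autoencoder with a 2-neuron bottleneck (sizes illustrative).
encoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 2),               # the 2-neuron "Small Representation"
)
decoder = nn.Sequential(
    nn.Linear(2, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.Sigmoid(),
)
autoencoder = nn.Sequential(encoder, decoder)

opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(batch):               # batch: (B, 784) tensor with values in [0, 1]
    recon = autoencoder(batch)       # the target is the input itself
    loss = loss_fn(recon, batch)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()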

A great interactive example of this can be tried here

What we got is called a Small Representation (SR): "small" because only 2 real numbers reflect a vector of 784 real numbers, and "representation" because it reflects the external world inside the network. Knowing only these two values, we can estimate what the network "sees" at the moment: which digit, and even the approximate shape of its outline. We can say that the Small Representation is a kind of model of the world, only the network had to build a very simple one, just two numbers, although the external world is more complex and consists of vectors of 784 real numbers.

Let's take a closer look at what this Small Representation is. We see several areas corresponding to particular classes, and neighboring points correspond to digits of very similar shape. It turns out that the SR can greatly simplify learning to recognize. We can take the trained encoder, discard the decoder, add a small perceptron, and quickly teach this new network to recognize handwritten digits (this is called transfer learning). That is, the first configuration of our network was trained only on images, without connecting them to actual digit labels, a kind of self-training (without supervision) to obtain the SR; the next configuration then uses the obtained SR and is trained, with supervision, to match images to digit labels, and does so much faster than a network trained from scratch.
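Continuing the sketch above (same assumptions, same encoder), the transfer-learning step might look like this: the encoder is frozen and a small classifier head is trained on the 2-dimensional SR.

# Freeze the trained encoder and train a small head on its 2-D output.
for p in encoder.parameters():
    p.requires_grad = False          # the SR is kept fixed

classifier = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 10))
head_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

def classify_step(batch, labels):
    with torch.no_grad():
        sr = encoder(batch)          # two numbers per image
    logits = classifier(sr)
    loss = ce(logits, labels)
    head_opt.zero_grad(); loss.backward(); head_opt.step()
    return loss.item()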

In effect, the class of a digit here is defined by certain ranges of activity of the two neurons. These ranges, or areas, as we see in the diagram, can be called a kind of embedding: representations of classes within the network that were formed in the process of self-learning.

Next we will talk about what the Small Representation is in biological networks, in the brain, and how these embedding entities are represented there, but first let us highlight some qualities of these entities in artificial neural networks. First, the whole layer is involved in the SR; all neurons of the layer matter. Imagine that we removed one of the two neurons from the network: from the single remaining neuron it would be problematic to detect exactly what the network "sees"; we could not always answer even the question of whether the network currently sees, say, the digit "5". Yes, there is pruning, but it is only possible when there are too many neurons, when we can remove neurons without consequences for the SR. Second, the more neurons of a layer are involved in the SR, the harder it is to isolate individual features, since each neuron creates its own dimension in the space of class representations: 2 neurons, 2 dimensions; 10 neurons, 10 dimensions; and all 10 will be interdependent. Third, and obvious for artificial networks: all neurons of a layer, or even of the whole network, are involved at every training step; each neuron gets its weight correction.

In contrast to classical artificial neural networks, biological networks embody the neuron-detector approach. In a biological network, a neuron acts as a detector of a certain feature or group of features; one can say that individual neurons are responsible for features, or embeddings, here. This is confirmed by numerous experiments; there is even a joking name for the phenomenon, "grandmother neurons", suggesting that any image in our brain corresponds to the activity of a particular cell, possibly even a cell for the image of our grandmother. Of course, when we say "neuron" in this case, we mean a whole group of active neurons: first, before the excitation reaches the notorious grandmother neuron, it passes through a number of other neurons; second, behind any such activity there is usually the activity of a whole group of neurons distributed throughout the cortex, a neural ensemble or pattern of activity.

The formal neuron of artificial networks reflects the principle of operation of a biological neuron quite well: the weights of the formal neuron reflect the sizes of the neurotransmitter portions released into the synaptic cleft during synaptic activation, and the cumulative effect of the neurotransmitter leading to an action potential is reflected in the threshold function of the formal neuron. That is great, but

the mapping of the primary visual cortex (see David Hubel's "Eye, Brain, and Vision", describing his work with Torsten Wiesel) records neuronal activity in response to specific stimuli, for example lines of different orientations, segments, and ends of segments. And the point here is not that cortical neurons work on different principles; the point lies in the very organization of these neurons.

In general, we can say that the formal neurons of artificial networks handle information more efficiently than biological neurons: as we have seen, it may take only two neurons to encode a set of handwritten digits, while encoding by the neuron-detector principle requires at least 10 neurons just to encode 10 digit classes, not to mention the different shapes of those digits. A reasonable question then arises: why try to understand this principle at all, why search for this Grail?

First, by working on a model of neural networks in which the neuron-detector approach is applied, we can learn more about the principles on which biological neural networks operate. Second, although the evolutionary process is not perfect and does not always lead to optimal solutions, evolution has been successful at saving energy and making the brain efficient: with the neuron-detector approach there is no need to activate all neurons when training and running the model. For a mathematical model this gives little advantage, but when implementing such a model in hardware the savings can be significant. Third, imagine how convenient it is to use a neural network consisting of neuron-detectors: we can tell what the network is "thinking" at any moment from its neuron activity, we can understand how it makes decisions, and we can manage those decisions simply by activating the necessary neurons. Fourth, one-shot learning: if you have a neuron-detector responding to the word "table" and a neuron-detector responding to an image of a table, you would have no problem making a connection between them.

Fifth, such a model would be more robust, both to retraining (in the good sense of continued learning) and to physical damage: if you destroy the neuron-detector responsible for the digit "5", the remaining neurons still recognize the other digits well, simply because there is no tight coupling of neuron activity levels during detection. A few more arguments in favor of such networks: the flexibility of controlling them dynamically, which means the possibility of creating high-level control algorithms for them, similar to the way emotional systems control memory and reactions in the nervous systems of animals and humans. It is also possible to evaluate the novelty of incoming information at any level of abstraction at which the network operates: if our system has a detector that reacts to an external stimulus, and we can measure the degree of that reaction, then we can measure how close the external stimulus is to the internal memory of the system, which is an evaluation of novelty. If novelty criteria become part of neural network control, this should bring us closer to understanding how to create real artificial intelligence, because novelty is a very important factor in human and animal behavior.

How do biological neural networks encode information?

The Small Representation of the brain is the SDR (Sparse Distributed Representation). "Sparse" means that only a small number of neurons are active at any one time, and "distributed" means that this activity is spread evenly throughout the tissue. The word "small" also applies here, since any representation in the brain is always of smaller dimension than the outside world. In fact, the sparseness is simply a consequence of the neuron-detector approach, as we will see in practice.

Many may associate the notion of SDR with the works of Jeff Hawkins, since he wrote extensively about SDR and applied it in his model of Hierarchical Temporal Memory (HTM), but SDR is a general, experimentally observed phenomenon inherent in the nervous systems of all mammals (and insects), from mice to humans. The author does not share Jeff Hawkins's views on HTM, because an HTM neuron as Jeff describes it cannot exist; it does not fit the conventional neuronal paradigm (again, in the author's opinion).

Let's try to create an artificial neural network based on the neuron-detector approach, guided and inspired by what we know of the biological brain and the principles of its work.

Network diagram and organization

No tool works equally well with dense and sparse data at the same time, so the mechanism of the neuron-detectors tied to external data will differ from that of the detectors tied to data arising inside the network. The first layer of our model receives ordinary input, in this case a 784-dimensional vector representing a digit from the MNIST set, and produces a sparse representation (SDR) at its output; all subsequent layers receive SDR and output SDR. Essentially, the main task of the first layer is to convert the data into a sparse representation, and there is no need to train this layer. After trying many approaches to constructing this layer, I arrived at the simplest and best option: a set of manually selected convolution kernels, sixteen kernels of size 5x5.

The choice of kernels is determined by the prevalence of patterns in the data we will be processing. For example, the digits in MNIST are characterized by empty, unfilled spaces: in almost every example you can find a 5x5 area where all pixels are 0, so it is reasonable to include such a kernel in our set. The digits are also made of lines of different thickness and orientation, which suggests further kernels: lines of different orientations. What matters is that the set turns out as diverse as possible.

As we know, the primary visual cortex is characterized by line-detector neurons of different orientations, with each column actually tuned to a different stimulus. In their work on mapping the primary visual cortex, David Hubel and Torsten Wiesel introduced the term hypercolumn, or module: the set of all columns that share a common receptive field, for example a small area of a screen onto which luminous lines of different orientations are projected. That is, a module (the term "module" is more acceptable to me than "hypercolumn", since it does not echo the term "column", so I will use it hereinafter) contains representations of all line orientations for both eyes and amounts to about 1,000 columns.

Hypercolumn, or module; illustration from "Eye, Brain, and Vision" by David Hubel

In our variant there will be only 16 neuron-detectors in a module (yes, our neuron-detector is the equivalent of one column in the cortex), each corresponding to its own kernel. Each module will be responsible for its own 5x5 fragment of the picture, and, as in the cortex, where the receptive fields of columns usually overlap, we will divide the whole picture into 5x5 fragments with a step of 1.

The modular principle of organization is fundamental to our network, both for the first layer and for subsequent ones. The main feature of the modular construction is that there is always exactly one winner in a module. That is, the neuron-detectors of a module compete, and the competition determines which neuron gets priority in activation and learning. Similarly, living columns have lateral inhibition, due to which weakly active columns are completely suppressed, while through the effect of irradiation (the spreading of excitation) the winning columns are strengthened in their activity.

In my previous works I described experiments with a pixel shader modeling lateral inhibition and irradiation, which showed that any initial activity is converted into small, point-like, distributed foci. This is the origin of distributed and sparse activity.

The cortex has no explicit module boundaries, but when creating a mathematical model we can make the network more precise and structured, which to some extent works to our advantage. Lateral inhibition can therefore be replaced by a rule: only one neuron-detector may be active in a module.

The initial activity of each neuron in the first layer is estimated by Manhattan distance, but in the end we consider only one neuron per module active: the one closest to the input vector within the module. The operation of a module may remind us of the principles of the Kohonen network; our future model is, as it were, composed of many small Kohonen networks.

Examples of the first layer at work. The input vector is 28x28, the kernel is 5x5 with step 1, and there are 16 neuron-detectors in each module, so the output is 24x24x16, which is converted into the planar form 96x96. This output of the first layer is an SDR for any input vector.
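A sketch of how such a first layer might be implemented, assuming numpy; the specific kernel set below is my own illustrative guess, since the text only says the sixteen 5x5 kernels were picked by hand.

import numpy as np

K = 16
kernels = np.zeros((K, 5, 5))          # illustrative hand-picked kernel set
for k in range(1, 6):
    kernels[k, k - 1, :] = 1.0         # horizontal lines at different heights
for k in range(6, 11):
    kernels[k, :, k - 6] = 1.0         # vertical lines at different offsets
np.fill_diagonal(kernels[11], 1.0)     # one diagonal...
np.fill_diagonal(kernels[12][::-1], 1.0)   # ...and the other
kernels[13, 2, :] = 1.0
kernels[13, :, 2] = 1.0                # a cross
kernels[14] = 1.0                      # fully filled area
kernels[15, 1:4, 1:4] = 1.0            # filled centre (kernel 0 stays empty)

def first_layer(img):                  # img: 28x28 array with values in [0, 1]
    out = np.zeros((24, 24, K))
    for i in range(24):
        for j in range(24):
            patch = img[i:i + 5, j:j + 5]
            # Manhattan distance to every kernel; the closest kernel wins
            d = np.abs(kernels - patch).sum(axis=(1, 2))
            out[i, j, d.argmin()] = 1.0
    # planar 96x96 form: each module's 16 neurons laid out as a 4x4 tile
    return out.reshape(24, 24, 4, 4).transpose(0, 2, 1, 3).reshape(96, 96)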

Planarity, the representation of data in two-dimensional form, is very important in our case, because subsequent layers will receive data in this form; it is also more visual and closer to how data is represented in the brain cortex.

As mentioned before, the main task of the first layer is to represent external stimuli in the form of an SDR. Since the number of neurons in a module is 16, the ratio of ones to zeros in the resulting SDR will always be 1 to 15, for any input vector. Our first layer is not trained; it only transforms data.

Since the layer performs an ordinary data transformation, an inverse transformation can be obtained. It shows that the transformation loses some quality, but this is a necessary sacrifice, because the resulting data is perfectly normalized: the total sum of the output vector is always the same, regardless of what is fed to the network input. In biology, similar transformations already take place at the level of the retina, thanks to on- and off-cells: no matter what is projected onto the retina, the total nerve activity coming from the eye is constant.
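A possible sketch of that inverse transformation, reusing the names from the first-layer sketch above: each module's winning kernel is pasted back into its receptive field and the overlaps are averaged.

def first_layer_inverse(sdr96):
    # undo the planar 96x96 layout back to (24, 24, 16) module form
    out = sdr96.reshape(24, 4, 24, 4).transpose(0, 2, 1, 3).reshape(24, 24, K)
    acc = np.zeros((28, 28))
    cnt = np.zeros((28, 28))
    for i in range(24):
        for j in range(24):
            acc[i:i + 5, j:j + 5] += kernels[out[i, j].argmax()]
            cnt[i:i + 5, j:j + 5] += 1
    return acc / cnt                   # lossy, but close to the original digit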

Neurons of subsequent layers receive sparse vectors as input. If the neurons of the first layer, so to speak, extracted features from the input data (strokes, dots, empty areas), the neurons of subsequent layers will generalize over the extracted features.

Unsupervised bottom-up learning

Self-learning is the main clue to where our Holy Grail should be sought. Learning in animals and humans is mostly based on data that is not labeled; often this data is incomplete and noisy. Labels appear only occasionally in the process of learning, in the form of comparisons between modalities. Learning by labels, supervised learning, is only auxiliary; the foundation is self-learning.

If we talk about the pattern recognition problem, in essence we need an algorithm that would form a diverse set of neuron-detectors, each responding to clearly expressed representatives of its class. Earlier in my research I used the Kohonen network algorithm and its variations, but it does not work well with sparse representations, which prevented me from creating a multilayer version of the network. Moreover, the neuron of that network determines its activity by Manhattan distance, while I was inclined toward a simpler version of the neuron, a simple adder.

Solving the classification problem with simple adders turned out to be very easy. I was intrigued by an article describing successful clustering of MNIST using spiking networks, but that solution seemed too big, cumbersome, and complicated to me.

So I tried to implement the mechanics of this clustering without anything unnecessary; the result fits literally in a few lines of code (link to Google colab).

Network weights after 10,000 MNIST examples

The example above shows the weights of 256 neurons in a layer (16x16). Only one neuron, the most active in the whole layer, is trained per network cycle.

The training follows a very simple formula:

map[:, winner] += rate * (inp - 0.5)

To the winner's weights we add the learning rate (rate = 0.2) multiplied by the input minus 0.5. Why exactly 0.5? It is simply the midpoint of the input range, which is a real number from 0 to 1; let's call this number the target intensity. The target intensity draws a line between weak and strong inputs: the formula weakens the weights tied to weak inputs and strengthens those tied to strong ones, and the weaker the input, the more we decrease the associated weight, and vice versa. Because of this, not all data suits the algorithm: where a number in the vector carries more complex information than intensity alone, a Kohonen network is probably the better choice, but for MNIST it works well enough.

The algorithm also has a distinct advantage: the "super winner", or leader, problem barely manifests itself. The essence of that problem is that the one neuron that wins most often takes over all the learning; in Kohonen networks it is kept in check by also training the winner's neighbors. Here, if the starting sum of the weights is large enough, then the more a neuron has been trained, the lower its activation for an input vector compared to less-trained neurons, so less-trained neurons get a better chance to win. This is of key importance at the beginning of learning, when a neuron is just acquiring its specialization. It was therefore logical to set all weights to one at network initialization.
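A minimal sketch of this rule, assuming numpy; the weight matrix is called weights here rather than map, to avoid shadowing the Python builtin.

import numpy as np

n_inputs, n_neurons, rate = 784, 256, 0.2
weights = np.ones((n_inputs, n_neurons))   # all weights start at 1 (see text)

def train_example(inp):                    # inp: 784 values in [0, 1]
    winner = (inp @ weights).argmax()      # neuron = simple adder; best sum wins
    weights[:, winner] += rate * (inp - 0.5)   # the formula from the article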

At birth the brain has its maximum number of neurons, and in the first days and weeks the maximum number of synaptic connections unfolds. Then, as the brain grows and learns, there is a general reduction and loss of synaptic connections and nerve cells. This process is called pruning, and in the human brain it is especially pronounced in the first three years after birth. Pruning allows neurons to specialize more precisely, and it can be compared to what happens in our clustering algorithm: initially all weights are equal to 1, then in the course of learning the excess is reduced to zero.

But this algorithm has a drawback: it drives the sum of the weights down. With more and more learning the activation level decreases, the sum of weights decreases too, and in order to win, neurons specialize on examples with smaller and smaller sums.

Network weights after 10 epochs on all 60,000 examples

The way to eliminate this problem is to normalize the input vector, bringing its sum to a constant. And the best solution for this is SDR.

In fact, though, simply combining this algorithm with SDR does not solve the problem of the shrinking total sum of weights, although the diversity of classes is preserved. The algorithm needs even more stability.

The general idea with a biological neuron is that regularly used connections are strengthened, while rarely used connections degrade and are lost; stable pathways of nerve signal propagation are formed. To implement this, we need to keep statistics for each synapse: how often and how effectively it is used. Therefore, in addition to the weight map, we will have a "temperature trace", or trace map, with values in the range from 0 to 1. Every time both neurons are active, the synapse connecting them adds +0.3 to its trace, while on all other cycles of the network the trace gradually decays by 0.001. Thus, synapses that are regularly and effectively used will have a high trace value, close to 1; synapses that are hardly used will have a trace value of zero or close to zero. This indicator directly affects the synapse weight, because frequently used synapses should be strengthened. The final formula of the resulting algorithm is as follows:

map[winner,:] = (1 - rate)*map[winner,:] + rate*(inp - 0.5) + trace[winner,:]*rate

In addition to the familiar formula, the neuron weights now also increase by the product of trace and rate, so a frequently used synapse is strengthened; in contrast, there is regular degradation of all synapses, their weights decreasing in magnitude through the (1 - rate)*map[winner,:] term. Since unbounded growth is impossible in biological cells (there is always some limited resource), this gives rise to heterosynaptic competition. The weights are also limited to the range from -1 to 1, where -1 is the strongest inhibitory synapse and 1 is the strongest excitatory synapse.
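A sketch of the trace-stabilized version, under the same numpy assumptions; note that, following the final formula, the weight matrix here is indexed as (neuron, input), i.e. map[winner,:].

# Reuses n_inputs, n_neurons, rate from the earlier sketch.
weights = np.ones((n_neurons, n_inputs))   # initialized to 1, as before
trace = np.zeros((n_neurons, n_inputs))

def train_example_sdr(inp):                # inp: sparse binary (SDR) vector
    winner = (weights @ inp).argmax()
    trace[:] -= 0.001                      # all traces slowly cool down
    trace[winner] += 0.3 * inp             # co-active synapses heat up
    np.clip(trace, 0.0, 1.0, out=trace)
    # the article's final formula, with weights clipped to [-1, 1]
    weights[winner] = ((1 - rate) * weights[winner]
                       + rate * (inp - 0.5)
                       + trace[winner] * rate)
    np.clip(weights, -1.0, 1.0, out=weights)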

The described algorithm is quite simple and makes it possible to train the network in a self-supervised way, even when it is organized into many hierarchical layers.

Next, let’s look at a particular configuration of a multilayer network as an example.

In the network variant considered here, the modules of the next layer will also contain 16 neuron-detectors each, sharing a common receptive field. The receptive field of a neuron will have size 20x20, which covers 25 (5x5) modules of the previous layer. The receptive fields of modules overlap with a step of 4. The next layer thus contains 400 (20x20) modules of 16 neurons each, 6,400 (80x80) neurons in total. Recall that the first layer has 96x96 (9,216) neurons.

Using a similar scheme, let's add two more layers. The third layer keeps the module size of 16 (4x4) neurons, with a receptive field of 20x20 and a step of 4, giving 64x64 neurons in total. The fourth layer's receptive field covers the whole previous layer (64x64), but its module size is 36, so the activity of this layer is characterized by a single neuron out of 36, giving a clearly pronounced detector of one digit. The last, fifth layer is the output layer, containing just 10 neurons, one per recognized class. This layer is needed to match the result of the previous layers against the digit labels. It has only one module, so only one of its neurons can be active, and each neuron of the output layer has the same receptive field of size 6x6 (= 36).

The result is the scheme: 28x28 (receptors) → 96x96 (first layer, untrained, simply converts the input signal to SDR) → 80x80 (second layer) → 64x64 (third layer) → 6x6 (fourth layer) → 10 (output layer). This many layers were chosen in order to demonstrate all aspects of the network's operation, in particular "deep learning"; for the MNIST task a simpler configuration would also do.
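For reference, here is the same geometry written out as an illustrative config; the field names are mine, the numbers follow the text (grid counts modules, rf is the receptive field).

layers = [
    dict(grid=(24, 24), per_module=16, rf=(5, 5),   step=1),  # layer 1: 96x96, untrained
    dict(grid=(20, 20), per_module=16, rf=(20, 20), step=4),  # layer 2: 80x80
    dict(grid=(16, 16), per_module=16, rf=(20, 20), step=4),  # layer 3: 64x64
    dict(grid=(1, 1),   per_module=36, rf=(64, 64), step=1),  # layer 4: 6x6
    dict(grid=(1, 1),   per_module=10, rf=(6, 6),   step=1),  # output: 10 neurons
]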

The output layer has only 10 neurons, one per recognized class, and all of them belong to a single module. This means that all neurons of the output layer share a common receptive field equal in size to the previous layer (6x6 in this example), and only one neuron of the output layer can be active.

The penultimate layer consists of one module; during self-training it forms a certain representation, with each of its 36 neurons responding to a particular type of input signal. Training of the last layer consists of matching this representation to the class label of the example presented to the network. If the label matches, we increase the synapse weight by the rate (0.1); for all other synapses, those connected with other classes, the weight is reduced by the same value divided by the total number of classes, i.e. by a number ten times smaller.
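A sketch of this output-layer rule, under the same numpy assumptions; prev_sdr stands for the 36-dimensional activity of the penultimate layer, and the names are illustrative.

import numpy as np

out_w = np.zeros((10, 36))
out_rate = 0.1

def train_output(prev_sdr, label):
    # prev_sdr: 36-dim activity of the penultimate layer (one neuron active)
    out_w[label] += out_rate * prev_sdr              # reinforce the true class
    for c in range(10):
        if c != label:
            out_w[c] -= (out_rate / 10) * prev_sdr   # weaken the other classes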

As a result of self-training on 5,000 examples presented to the network once, the accuracy is about 77%. That is, we clearly see that self-learning works and digit classes are being separated even without reference to labels. But we can improve this result by spreading the influence of the class label deeper into the layers.

Supervised learning, or backpreference

The backpropagation algorithm for connectionist neural networks has become an incredibly powerful engine of machine learning and has enabled artificial neural networks to produce amazing results. But this algorithm does not match what we know about the biological nervous system. The problem is not spreading signals backward through the network as such; it is the amount of information that has to be transmitted, the error value, and there are no known mechanisms, mechanics, or organelles in biology that could even evaluate this error in the first place. Many articles propose supposedly biologically plausible variants of backpropagation, but all of them, assessed from the biology side, are implausible.

On the one hand, there are feedback connections between the different cortical areas involved in hierarchical information processing (the analogs of layers in artificial neural networks), and they sometimes outnumber the forward connections several times over. There is a constant circulation of excitation between cortical areas, reverberation. But there are no complicated calculations: computation happens on the membrane, and its algorithm is described by the Hodgkin-Huxley model; in this system of knowledge there is no place for backprop and error calculation.

And yet a neuron at a low level of the hierarchy, with a limited receptive field, cannot learn effectively without a high-level conductor, without cues coming from a common task or goal, no matter how good its self-learning algorithm is.

The modular organization of the network and SDR allow us to apply the backpreference algorithm. Neurons in a module compete with each other, since there can be only one winner per module. Which neuron wins is determined by the ascending connections and their weights, but descending connections make it possible to give certain neurons a preference in this competition. We can take any neuron of the last layer, evaluate which neurons of the previous layer are most likely to lead it to victory, and then descend further to the lower levels.

Knowing the class label, we can select the neurons of the last layer most relevant to this label, then, moving down from layer to layer, designate the neurons that will eventually give preference to the right neurons of the last layer, and grant those neurons an advantage in the competition by adding some value to the final sum of their activations. In this way we interfere only slightly with the process of self-learning, steering it in a more appropriate direction. We can adjust how strongly supervised learning influences unsupervised learning; it makes sense for supervised learning to be stronger on the last layers, and self-training stronger on the first layers, those closer to the receptor input.

In our example there are three trainable layers. For the last (third) of them, supervised learning operates with a coefficient of 1, which means that the neuron receiving the preference wins 100% of the time. For the second layer the coefficient is 0.5, and for the first only 0.1; there, self-training has the greater advantage.
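A sketch of how such a preference might enter the winner selection, with the coefficients from the text; the form of the bonus is my own illustrative choice.

import numpy as np

pref_coef = (0.1, 0.5, 1.0)            # first, second, third trainable layer

def module_winner(activations, preferred, coef):
    # activations: per-neuron input sums within one module
    # preferred: boolean mask of neurons favoured by the descending pass
    bonus = coef * activations.max() * preferred
    # with coef = 1.0 a preferred neuron effectively always wins;
    # with coef = 0.1 bottom-up self-learning still dominates
    return int(np.argmax(activations + bonus))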

This is a very flexible combination of supervised and unsupervised learning: you can switch supervision off completely at any time and let self-learning take over, and when the necessary labels appear, switch supervised learning back in. This is very similar to how the biological brain works and learns.

At the beginning of learning, when the output layer has not yet established the correspondence between the last layer's neurons and the class labels, supervised learning does not occur at all; only weak correlations form through self-training, which are then picked up and strengthened by supervised learning.

The result with supervised learning on 5,000 examples presented to the network once is about 93% accuracy. Of course, the algorithm is inferior in quality metrics to connectionist networks with backprop, but it is a very decent result, given that the algorithm has no machinery for convergence or error calculation. Increasing the dimensionality of the network, and hence its potential capacity, with the same training parameters can improve the quality slightly.

Interpretability of an artificial neural network based on the neuron-detector approach

The question of a network's interpretability is very important: it is not enough to take a model, train it on a large amount of data, and trust it completely, especially when people's lives and health are at stake. We need to understand well why the model makes certain decisions; it must be possible to predict its actions for any external stimulus. In this respect, neural networks based on the neuron-detector approach will be unrivaled.

We have already touched on interpretability when speaking about small representations: in our example the latent space has only two dimensions, the two neurons between the encoder and the decoder, so the learning results are easy to interpret, though only for that narrowest layer. But in face recognition tasks, for example, the latent space is rather large, say 18 parameters. Representing an 18-dimensional space in a clear, readable form and relating it to face images is quite difficult, to say nothing of the other layers of the network.

For networks based on the neuron-detector approach, interpreting the results is easy: one can perform inverse transformations and obtain the receptive-field activity values most likely to activate a particular neuron, and this can be done for any neuron of any layer. In fact, the algorithm for propagating influence downward already performs these inverse transformations; we simply apply it to interpret the training results.

The ability to interpret the results of network training is needed not only to increase confidence in the network, but also to be able to exclude unacceptable actions from its behavior by correcting the network directly, changing its weights or blocking specific neurons, rather than revising the contents of the dataset or applying additional algorithms.

Reverse analysis via the output layer (third trainable layer)
Reverse analysis via a hidden layer (second trainable layer)
Reverse analysis via a hidden layer (first trainable layer)

The local minimum of connectionist networks with backprop

Artificial neural networks, and the error backpropagation algorithm in particular, are among humanity's great discoveries and will contribute much more to progress. This perfect mathematical tool can turn data into a function, but in the matter of creating general artificial intelligence, or human-level intelligence, developing networks as a mathematical tool will not lead to the desired result. There is no sense in striving to surpass backprop in quality metrics. The study of neurobiology, neuropsychology, and the physiology of the nervous system provides insight into the principles on which cognitive functions are based, and these principles are in no way consistent with the mechanisms of classical artificial networks. That is why I have long been searching for the "Holy Grail": a neural network algorithm consistent with the paradigm of how biological brain networks operate. The next stage of this research will focus on applying knowledge of network control, based on the neuron-detector approach, using emotional mechanisms, and on implementing reinforcement learning on top of this network.

If we imagine the task of creating general-purpose artificial intelligence as a space of solutions, then with connectionist networks we are sitting in a kind of local minimum; one cannot fail to note the great achievements obtained by models based on this algorithm. But there is an alternative, still only nascent, without bright and significant results yet, which, who knows, may with further development give us a way to solve the problem of creating artificial intelligence.

Source on Google colab

