Tuesday, November 16, 2010

Recursion

Recursion Extends Code Usefulness Rather Seriously In Our Neuron
Smoke that!

After running quite a few iterations of the net, it became apparent that very little useful linking went on. From time to time a long chain would emerge, only to be dismantled a few hundred iterations later. For the most part, neurons simply were never active and never linked to the source or the drain.

The problem seemed to be that randomly linking neurons is simply not effective enough. Even with a high number of neurons, once the critical path between source and drain is broken, it's rare to see it re-established.

Moreover, because of the pathway-rewards and active decay, once the source-drain path is broken, no pathways are rewarded, because none provide output.

So it seems a modification is in order: neurons must be added to an existing pathway.

This means that a basic net consists of at least one input, directly connected to at least one output (neurons with callbacks registered). A net can then have any number of free, unlinked neurons at startup. These will link randomly, with the stipulation that every link must form part of a path between the source-drain pair.

So, a neuron may interpose itself between two connected neurons, effectively extending the path. Or it may bridge an entire section of an existing path, forming a fork. In more complicated cases such a bridged connection may join two existing paths.

Consider:

src->a->b->c->drain

Connect x and get

src->a->x->b->c->drain
or
src->a->b->c->drain and src->a->x->drain

where x now bridges the b->c section of the original path.
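In code, that bridging step is just two extra calls to connect. A minimal sketch, using the Neuron::connect API introduced in the previous post:

// Sketch: inserting free neuron x as a bridge over the b->c section
Neuron src, a, b, c, drain, x;
src.connect(&a);
a.connect(&b);
b.connect(&c);
c.connect(&drain); // original path: src->a->b->c->drain

a.connect(&x);
x.connect(&drain); // new fork: src->a->x->drain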

All that is good and well, but we will need some help in linking neurons up in a sensible way. To do this, we add two helpers to the Neuron class: indexToDrain and distanceToDrain, detailed below.

The basic gist is that a neuron needs to know which of its neighbours ultimately leads to a drain. In these functions, a further stipulation is that it must be the shortest route to a drain. This should keep the nets as tight as they can be. Long neuron chains/paths are still possible, with interlinking.

We use recursion to resolve this problem, so that each neuron just needs to know whether it is a drain, and if not, whether its direct neighbours are.

There is also an added guard against doubling back on the path: a neuron will not consider its caller (by definition a neighbour) when considering routes to a drain.

neuron.h:

...
public:
    int indexToDrain();
    int distanceToDrain(Neuron* from);
...

neuron.cpp:


int Neuron::indexToDrain() {
    int index = ERR_NOT_FOUND;
    if (callback != 0) {
        return NEURON_RANK; // this neuron is itself a drain
    } else {
        int distance = 0;
        for (int i = 0; i < NEURON_RANK; i++) {
            if (_links[i] != 0) {
                int link_distance = 1 + _links[i]->distanceToDrain(this);
                // link_distance > 0 filters out links that returned ERR_NOT_FOUND
                if ((link_distance > 0) && ((link_distance < distance) || (distance == 0))) {
                    //if distance is still 0 here, we are not a drain and we have not found a drain either (yet)
                    distance = link_distance;
                    index = i;
                }
            }
        }
        if (distance == 0) {
            return ERR_NOT_FOUND;
        }
    }
    return index;
}


int Neuron::distanceToDrain(Neuron* from) {
    int distance = 0;
    if (callback != 0) {
        return THIS_NODE; // this neuron is itself a drain
    } else {
        for (int i = 0; i < NEURON_RANK; i++) {
            if (_links[i] != 0 && _links[i] != from) { // never double back to the caller
                int link_distance = 1 + _links[i]->distanceToDrain(this);
                // link_distance > 0 guards against a dead-end link (ERR_NOT_FOUND) overwriting a distance we already found
                if ((link_distance > 0) && ((link_distance < distance) || (distance == 0))) {
                    //if distance is still 0 here, we are not a drain and we have not found a drain either (yet)
                    distance = link_distance;
                }
            }
        }
    }
    if (distance == 0) { // we never found a drain
        return ERR_NOT_FOUND;
    }
    return distance;
}




With these in hand, and tested of course, I can now make the net do proper linking to maintain src->drain pathways...
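As a rough illustration, here is the kind of check those tests make (a sketch only, assuming THIS_NODE is 0, ERR_NOT_FOUND is -1, and that any non-null callback marks a neuron as a drain):

// Sketch: src->a->drain; a's first link should lead to the drain
Neuron src, a, drain;
drain.callback = some_functor; // hypothetical: any non-null TFunctor* marks a drain
src.connect(&a);
a.connect(&drain);

assert(a.indexToDrain() == 0);        // a's link 0 leads to the drain
assert(a.distanceToDrain(&src) == 1); // one hop from a
assert(src.distanceToDrain(0) == 2);  // two hops from the source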

Thursday, November 4, 2010

#include "code"

Last post I covered the project overview, the basic Makefile and also introduced some classes which seemed to fit the bill for what I want: bR41nzz...

In this post I will delve a little deeper into the class implementations themselves. I'm not going to explain every line of code; I presume you can read enough of it to follow.

So, let's start with the smallest building block, and digress from there...

Note: I'll be describing the system in header files first, building them up as I go along. Then, when it's time to test the concept, we will write the test and code to satisfy that test.

Another note: I'll be writing the most basic c++ I can, always going for the simplest route rather than the more correct one. For instance, rather than using a private member with a getter and setter, I'll use a public member, until such time as I find I need to implement accessors for some reason.


Neuron
./source/neuron.h


#ifndef __NEURON_H__
#define __NEURON_H__

class Neuron {
//...
};

#endif /* __NEURON_H__ */




First we will need a header guard. It's ALWAYS a good idea to put one in: it prevents the header file from being included more than once in any single compilation.

Next, I'll add some obvious parts to the class:
  • constructor/destructor
  • store a value for the neuron (an int for now, as I mentioned before)
  • some way to act/fire the neuron
  • a way to pass input to and receive output from the neuron
  • a way to link to other neurons
./source/neuron.h



class Neuron {
public:
    Neuron();
    ~Neuron();

    int connect(Neuron *dest); //connect to another neuron
    Neuron* operator[](const int index) const; //get a connection by index
    bool input(const int value);

    TFunctor* callback;

private:
    int (*_gate)(int a, int b); //the mathematical function to use when this neuron fires
    int _value;
    Neuron* _links[NEURON_RANK]; //all the neurons this one connects to
};




So, now I can create a neuron and destroy it. I can connect it to another neuron and access linked neurons by index. I have a gate/action, and I have a value that the neuron stores. I also have a callback to get output from the neuron.

The callback is a functor; I'll write up more on that in the next post.
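For now, all the Neuron class needs is the shape of that interface. A minimal sketch of what it might look like (my guess at the shape; the real TFunctor gets its own write-up):

class TFunctor {
public:
    virtual ~TFunctor() {}
    virtual void operator()(int value) = 0; // invoked with the neuron's output
};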

_gate is a function pointer. I've opted to implement the gates separately as free inline functions, so that they are easily interchangeable (to my thinking, a neuron should know its value, not what to do with it).

The gate functions are implemented like this:

./source/gate.h


#ifndef __GATE_H__
#define __GATE_H__

inline int add(int a, int b) { return (a + b); }
inline int subtract(int a, int b) { return (a - b); }
inline int multiply(int a, int b) { return (a * b); }
inline int divide(int a, int b) { if (b != 0) { return (a / b); } else { return 1; } } //guard against division by zero
inline int pass_a(int a, int b) { return a; }
inline int pass_b(int a, int b) { return b; }

#endif /* __GATE_H__ */




Where pass_a and pass_b are special gates which just propagate one of their inputs.
(I think some neurons don't have to do anything, they just relay info. Some will relay what is told to them, others will relay their own values regardless of what is told to them)
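Swapping a neuron's behaviour is then just a pointer assignment. A quick sketch:

// Sketch: gates are interchangeable through the function pointer
int (*gate)(int a, int b) = add;
int v = gate(2, 3); // 5
gate = pass_a;
v = gate(2, 3);     // 2: relays the first input, ignores the second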

Before I delve into how neurons work together, I think it's time to test a single neuron on its own.
Test it like a bi-curious nun
./test/test_neuron.cpp


#include <cstdio>
#include "neuron.h"

//... test implementations omitted for brevity

int main() {
    printf("test_constructor: %d\n", test_constructor());
    printf("test_connect: %d\n", test_connect());
    printf("test_operator[]: %d\n", test_operator());
    printf("test_input: %d\n", test_input());
    printf("test_callback: %d\n", test_callback());
    printf("end of tests\n");
    return 0;
}
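To give an idea of the omitted parts, one of those tests might look something like this (a hypothetical sketch, not the actual test code; it assumes connect returns 0 on success):

int test_connect() {
    Neuron a, b;
    if (a.connect(&b) != 0) return 1; // assumed: 0 means success
    if (a[0] != &b) return 2;         // operator[] returns the linked neuron
    return 0;                         // 0 == pass
}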


Wednesday, November 3, 2010

The bird men are coming!

...also known as: Some c++ for the hell of it.

From time to time I fire up a c++ compiler. I started my career out as a c++ developer and have written some fairly cool stuff over the years, I like to think.

Every few years I return to one of two problems that have haunted me since my varsity days. The first: I've always wanted to write my own text MUD, from scratch. The second: neural networks and code learning.

In the next few entries in this blog, I'll take on problem two, again*.

* I looked at mudconnect today to see what's out there... man, muds are sooo retro. It's like playing a modfile and pressing the turbo button on your PC.

Brainz...
Bear in mind I have no extensive formal education in neural networking or what passes under the deceptively cool sounding banner of "AI" in computer science.

To my mind, most of what they teach in AI, even at university level, is just rubbish. There are a lot of formulaic neural nets and that weird-as-shit algebra they apply to them. But in the end, all they are building is a system of weights and counter-weights which calculates a known result. What's the use of that?

I must give some credit to efforts along the lines of swarming and hive intelligence. There is definitely something to what these guys are up to. I realise some of them build neural nets into the swarming parts, but in the end that's just to get an "expert system"; the AI itself never learns. It just gets better at doing a known thing.

Now, enough of my opinionated rant (hey, this is a blog) and on to some code...

The wind up
I work on linux, so all the examples and references here are in that context. If you are using anything else, I'm sorry. If you are using Visual c++, just fuck off, and take your crap compiler with you.

First, there is the makefile. I won't go through the details of a makefile here, there are many many many pages on the topic already, go read them. I opted for a simple makefile, foregoing my old nemesis autotools. I love autotools, it's a fantastic toolset, but I just don't have the time or patience to read up on it now (it's been years, I used to know the auto-book backwards.), and this is not supposed to be such a big project, so I will manage the makefile manually.

The project folders layout will look like this:
./
./bin
./source
./test
./objects

If you build it they will come

Makefile:
# NOTE: recipe lines must be indented with a real tab character
PROGRAM = zombie

INCLUDES = \
	-I./source

OBJ_DIR = ./objects
BIN_DIR = ./bin
SRC_DIR = ./source
TST_DIR = ./test

CXX_SOURCES = main.cpp
CXX_OBJECTS = $(CXX_SOURCES:%.cpp=$(OBJ_DIR)/%.o)

TEST_FOO_SOURCES = foo.cpp test_foo.cpp

CXX_FLAGS = -c
CXX = g++

all: $(PROGRAM) tests

tests: test_foo

$(PROGRAM): $(CXX_OBJECTS)
	$(CXX) $(CXX_OBJECTS) -o $(BIN_DIR)/$@

$(OBJ_DIR)/%.o: $(SRC_DIR)/%.cpp
	$(CXX) $(INCLUDES) $(CXX_FLAGS) $< -o $@

$(OBJ_DIR)/%.o: $(TST_DIR)/%.cpp
	$(CXX) $(INCLUDES) $(CXX_FLAGS) $< -o $@

clean:
	$(RM) $(OBJ_DIR)/* $(BIN_DIR)/*

test_foo: $(TEST_FOO_SOURCES:%.cpp=$(OBJ_DIR)/%.o)
	$(CXX) $(TEST_FOO_SOURCES:%.cpp=$(OBJ_DIR)/%.o) -o $(BIN_DIR)/$@
We will swap out 'foo' with something a bit more useful later on.

Building blocks
I'll try and run through this in the same way that I approached the programming.

First off, we need a sense of what we want to build. The idea is to build a learning brain. That's a very ambitious statement, but not if you break it down into its component parts... Let's start with building a brain. The idea is that if we build a half decent brain, the learning will take care of itself.

So, building blocks: Brains have neurons, our brain will need them too. Moreover, brains have functional groups of neurons, networks of them that all work together to achieve the same goal. So we will need 'nets'. That is 1 brain, many nets per brain, many neurons per net.

To keep things simple on the code front, I've decided to only deal with integers (int) for now. You will later see that swapping this out with any other arbitrarily complex type is trivial. I just did not want to get bogged down with anything but the core for now.
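For instance, a single typedef would make that later swap painless (a sketch of the idea, not something in the code yet):

typedef int value_t; // later: double, or some arbitrarily complex class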

So, some classes we will likely need:
neuron
net
brain

I will also need some tests to verify progress and test parts of the classes. I will be writing these myself; they are fairly trivial. I will do a unit test program per class:
test_neuron
test_net
test_brain
Something I should mention here is that every neuron needs to do something. Neurons store information and act on that information. You can imagine it with logic gates, or in this case mathematical functions. I've decided to use four basic integer functions:

add
subtract
multiply
divide

Next episode... the code