
Topic: Artificial Neural Network & Genetic Algorithm Library For Deja Vu

donator
Activity: 452
Merit: 252
Also, bitfreak or anyone else interested: I just started a blog on neural nets (and job hunting, but mostly neural nets :P). If anyone's interested in having a conversation, I'm all ears!

http://learningann.wordpress.com/
donator
Activity: 452
Merit: 252

I am not using any libraries other than the libraries I have created myself. The fitness of each net is calculated as a percentage of profit or loss with respect to the starting balance. If the net starts with a balance of $1000 and it makes a profit of $10 during the virtual trading test then it gets a score of 1.01 but if it makes a loss of $10 then it gets a score of 0.99. If it were to double its starting balance then it would obviously get a score of 2. A set number of nets with the highest scores are placed into the elite group for breeding (although the parent pairings and offspring mutations are random). To make the process faster I also included a mechanism which allows nets to "die" if they aren't performing well enough. For example if the net seems to be making a consistent loss or if it isn't placing enough trades then the testing process will be cut short and that net will incur a large score penalty to make sure it isn't included in the elite group. Doing that helps speed up the training process by a large degree and it also mimics natural evolution where the subjects who are bad at surviving die quickly.

Interesting design, does it function well? It seems a tad simplistic in its parameters to be effective, but I could see it working given enough back-data and a high enough hidden neuron count.

I'm currently converting all of my members and functions into openCL kernels, as I've recently mastered (or mastered enough) openCL over the weekend, and I plan on optimising the shit out of my algorithms. I plan on having a 3-layered network with 1 output each; the first network will have something on the order of ~350 inputs and ~250 hidden neurons, so it could definitely get messy, but I'm confident my openCL design will still make short work of the computation time.
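Roughly, each fully-connected layer becomes a kernel like this (an illustrative sketch, not my actual kernels: one work-item per output neuron, with the source held as a C++ string for clCreateProgramWithSource):

Code:
// Illustrative sketch only: a fully-connected layer forward pass as an
// openCL kernel, one work-item per output neuron.
const char* layerKernelSrc = R"CLC(
__kernel void forward_layer(__global const float* inputs,   // [nIn]
                            __global const float* weights,  // [nOut * nIn] row-major
                            __global const float* biases,   // [nOut]
                            __global float* outputs,        // [nOut]
                            const int nIn)
{
    int j = get_global_id(0);            // output neuron index
    float sum = biases[j];
    for (int i = 0; i < nIn; ++i)
        sum += weights[j * nIn + i] * inputs[i];
    outputs[j] = tanh(sum);              // activation
}
)CLC";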

However, with this shit-storm at mtgox, I don't think any bot (that I can think of, at least) would have been able to predict the mtgox price collapse. I think I could get really close, though, if I had data from multiple exchanges with trading bots running on all of them, communicating with each other through an even higher-tier neural net.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
Also (I know you're using built-in libraries but you might know) how are you calculating normalized fitness?
I am not using any libraries other than the libraries I have created myself. The fitness of each net is calculated as a percentage of profit or loss with respect to the starting balance. If the net starts with a balance of $1000 and it makes a profit of $10 during the virtual trading test then it gets a score of 1.01 but if it makes a loss of $10 then it gets a score of 0.99. If it were to double its starting balance then it would obviously get a score of 2. A set number of nets with the highest scores are placed into the elite group for breeding (although the parent pairings and offspring mutations are random). To make the process faster I also included a mechanism which allows nets to "die" if they aren't performing well enough. For example if the net seems to be making a consistent loss or if it isn't placing enough trades then the testing process will be cut short and that net will incur a large score penalty to make sure it isn't included in the elite group. Doing that helps speed up the training process by a large degree and it also mimics natural evolution where the subjects who are bad at surviving die quickly.
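In C++-style pseudocode (the library itself is PascalScript), the scoring and early-death rule amount to something like this, with illustrative thresholds:

Code:
// Score = final balance / starting balance, so $1000 -> $1010 scores 1.01
// and doubling the balance scores 2.0.
double fitnessScore(double startBalance, double finalBalance) {
    return finalBalance / startBalance;
}

// Cut the virtual trading test short for nets that are clearly failing,
// mimicking early death in natural evolution. Constants are placeholders.
bool shouldDie(double runningScore, int tradesPlaced, int stepsElapsed) {
    const double lossCutoff = 0.90;   // consistent loss: kill it early
    const int    minTrades  = 5;      // too inactive after the grace period
    const int    graceSteps = 500;
    if (runningScore < lossCutoff) return true;
    if (stepsElapsed > graceSteps && tradesPlaced < minTrades) return true;
    return false;
}

const double DEATH_PENALTY = 0.5;     // large score penalty keeps dead nets
                                      // out of the elite group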
donator
Activity: 452
Merit: 252
for any incredibly complex system like the bitcoin market it's possible to forecast market conditions with only 1 network, however holy crap is it inadvisable
My plan wasn't to train just one net and then use it to trade for me. My plan was to train multiple nets and then use them to create a committee of machines and average the outputs from all of them. Associative neural networks seem to be an extension of that basic concept but I'm not really sure of the exact details. Autoassociative neural networks seem to be a whole other thing and I haven't read much about how they work either. Apparently the Hopfield network is one example of an autoassociative neural network but I'd have to read more about it to understand how it works. However I did read that one of the main uses of the Hopfield network is to pick out patterns from noisy data so I can see why you'd think that was a good choice.

Quote
Also, your bot being able to pick out trends: is it autoassociative and unsupervised?
I described exactly how my nets are structured when I described how the DNA is formatted. The design is not autoassociative as far as I know, but the nets are obviously trained using unsupervised learning because they use a genetic algorithm for training and nothing else. All I do is give them some training data and then let them evolve by themselves based on how well they perform at virtual trading (which is linked to their ability to predict future trends in the data). If I had ranked them purely on their ability to predict what data comes next in their training data then I would still have to build my own trading algorithms based on the predictions they make. I wanted them to evolve their own trading strategies by themselves, so I ranked them by how well they performed at virtual trading.

Back again, it's been a crazy week. My foundation net is almost functioning perfectly, but I think I need to tweak it for a few more hours to get the RMS per iteration to improve significantly.

I'm currently using a really interesting hybrid approach that I came up with: I rank every individual from 0 to popsize - 1, and then feed that rank into an exponentially decaying function to determine whether it's a parent or not. I then have an if statement in the reproduction function that takes the individual's rank as the parameter for another exponentially decaying probability check, and if it succeeds the genes of the parent are passed directly to the child (child = parent). I've worked it out so that there's only a probability of ~10% for the best of the generation, but that's something I'll be tweaking.
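Roughly, in code (a sketch, not my actual implementation; the constants are placeholders I'll be tuning):

Code:
#include <cmath>
#include <random>

std::mt19937 rng{std::random_device{}()};

// Rank-based selection: the individual ranked r (0 = best) becomes a
// parent with exponentially decaying probability.
bool selectedAsParent(int rank, double decay = 0.05) {
    std::bernoulli_distribution coin(std::exp(-decay * rank));
    return coin(rng);
}

// Elitist passthrough inside reproduction: with a small, exponentially
// decaying probability the child is a straight copy of the parent
// (child = parent). Tuned so the best of the generation passes ~10%.
bool passGenesDirectly(int rank, double p0 = 0.10, double decay = 0.5) {
    std::bernoulli_distribution coin(p0 * std::exp(-decay * rank));
    return coin(rng);
}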

The rest of the reproductive system is all stochastic; everything is probability based: parent selection, gene pairing, mutation, etc. I've done a few Monte Carlo simulations (i.e. I run a very large population through one iteration and cout every if-statement success). For a genetic algorithm I think this method is solid, but as a backup I also have a "shake" function, which applies a single iteration of back propagation to every child before it's tested for fitness. One thing that worries me about my back propagation is that since my training regimen is "all in" (i.e. I run each individual through the entire training data and then average the RMS of the output errors), I run the risk of a faulty backpropagation, since I'm averaging the values of my inputs over all of the possible values those inputs can take throughout the training data.
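The Monte Carlo check is nothing fancy, something like this (reusing the two helpers from the sketch above):

Code:
#include <iostream>

// Run a very large population through one iteration and count every
// if-statement success to verify the selection probabilities.
int main() {
    const int popSize = 1000000;
    int parents = 0, passthroughs = 0;
    for (int rank = 0; rank < popSize; ++rank) {
        if (selectedAsParent(rank))  ++parents;
        if (passGenesDirectly(rank)) ++passthroughs;
    }
    std::cout << "parents: " << parents
              << ", direct passthroughs: " << passthroughs << "\n";
}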

I think I might do a few tests without the back prop to see if that's the issue, and have a read through some of my textbooks to figure out the best way to incorporate back prop into a genetic algo; I may have to change my training regimen to be stage-based.

Also (I know you're using built-in libraries but you might know) how are you calculating normalized fitness? I'm currently taking the negative exponential of the average RMS over the training data times a lambda constant (i.e. fitness = e^(-lambda*avg_rms)) for an individual, and then dividing each individual's fitness by the average fitness of the iteration, so I always have fitness values greater than 0 and they're always "different" enough for the computer to recognize the differences.
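In code the normalization boils down to this (a sketch):

Code:
#include <cmath>
#include <cstddef>
#include <vector>

// fitness_i = e^(-lambda * avgRms_i), then divide by the mean fitness of
// the iteration so the values stay > 0 and remain distinguishable.
std::vector<double> normalizedFitness(const std::vector<double>& avgRms,
                                      double lambda) {
    std::vector<double> fit(avgRms.size());
    double mean = 0.0;
    for (std::size_t i = 0; i < avgRms.size(); ++i) {
        fit[i] = std::exp(-lambda * avgRms[i]);
        mean += fit[i];
    }
    mean /= fit.size();
    for (double& f : fit) f /= mean;
    return fit;
}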


I haven't gotten to the point of testing my network on real data yet; I'm still testing with an XOR table. But since I'm writing this from scratch, I feel I'll be able to reach the goal of a real bitcoin forecaster/profit maker soon; once my foundation is finished, everything should be (knock on wood) as easy as using a library.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
for any incredibly complex system like the bitcoin market it's possible to forecast market conditions with only 1 network, however holy crap is it inadvisable
My plan wasn't to train just one net and then use it to trade for me. My plan was to train multiple nets and then use them to create a committee of machines and average the outputs from all of them. Associative neural networks seem to be an extension of that basic concept but I'm not really sure of the exact details. Autoassociative neural networks seem to be a whole other thing and I haven't read much about how they work either. Apparently the Hopfield network is one example of an autoassociative neural network but I'd have to read more about it to understand how it works. However I did read that one of the main uses of the Hopfield network is to pick out patterns from noisy data so I can see why you'd think that was a good choice.
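The committee idea in miniature (C++-style pseudocode, not the actual library; evaluate() is an assumed per-net prediction method):

Code:
#include <vector>

// Average the outputs of several independently trained nets to form a
// simple committee of machines.
template <typename Net>
double committeeOutput(const std::vector<Net>& nets,
                       const std::vector<double>& inputs) {
    double sum = 0.0;
    for (const Net& n : nets)
        sum += n.evaluate(inputs);   // assumed per-net prediction method
    return sum / nets.size();
}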

Quote
Also, your bot being able to pick out trends: is it autoassociative and unsupervised?
I described exactly how my nets are structured when I described how the DNA is formatted. The design is not autoassociative as far as I know, but the nets are obviously trained using unsupervised learning because they use a genetic algorithm for training and nothing else. All I do is give them some training data and then let them evolve by themselves based on how well they perform at virtual trading (which is linked to their ability to predict future trends in the data). If I had ranked them purely on their ability to predict what data comes next in their training data then I would still have to build my own trading algorithms based on the predictions they make. I wanted them to evolve their own trading strategies by themselves, so I ranked them by how well they performed at virtual trading.
donator
Activity: 452
Merit: 252
I haven't yet sat down and decided on exactly what kind of structure I want to train
I took a rather lazy approach to this problem. I simply generated random nets with a fixed number of inputs and outputs. I fed market data into the inputs and then conducted virtual trades based on their outputs. The fitness of any given net was determined by how much profit it made in the virtual trading. My hope was that I would be able to evolve a net with quite good trading skills. That is why I used a different training data set for each generation, so that they could learn to pick up on a range of different trading patterns and not just one pattern from a single set of training data.
I'd recommend you pick up some light reading on theory. For any incredibly complex system like the bitcoin market it's possible to forecast market conditions with only 1 network, however holy crap is it inadvisable to use only 1, just because of the absolute mountain of neurons + training data you'll have to sift through to get any meaningful forecasts, which means each iteration of your network is going to take an obscenely large amount of time for little gain.
Also, your bot being able to pick out trends: is it autoassociative and unsupervised? Those are the best net designs for trend spotting.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
What's your current method of error reduction? Are you using back propagation? Genetics?
I'm only using a fairly simple genetic algorithm to "evolve" the nets.

I'm also strongly considering transforming my currently 100% genetic algorithm into a hybrid with a single backpropagation step after the creation of each child.
Probably a good idea. Training the nets using only a genetic algorithm seems to be a very time-consuming task. A hybrid method might speed it up significantly and avoid the limitations of backpropagation at the same time.

I haven't yet sat down and decided on exactly what kind of structure I want to train
I took a rather lazy approach to this problem. I simply generated random nets with a fixed number of inputs and outputs. I fed market data into the inputs and then conducted virtual trades based on their outputs. The fitness of any given net was determined by how much profit it made in the virtual trading. My hope was that I would be able to evolve a net with quite good trading skills. That is why I used a different training data set for each generation, so that they could learn to pick up on a range of different trading patterns and not just one pattern from a single set of training data.
donator
Activity: 452
Merit: 252
.....

Interesting! I do like your format for the connection system. I'm currently using a significantly more unwieldy series of if statements to generate the number of weights and to designate the design and layout of the network (I'm making assumptions that aren't really what should be done for a foundation library). What's your current method of error reduction? Are you using back propagation? Genetics?

I'm currently using genetics with an extensive RNG factor for essentially everything, including gene splicing. I haven't yet begun work on transforming my weight values into Gray code (http://en.wikipedia.org/wiki/Gray_code if you want a refresher) to minimise the generation cycle length. I'm also strongly considering transforming my currently 100% genetic algorithm into a hybrid with a single backpropagation step after the creation of each child.
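For reference, the binary-to-Gray conversion (and its inverse) is tiny, which is part of the appeal: adjacent values differ in exactly one bit, so a small mutation in the encoding means a small step in the decoded weight.

Code:
// Gray code: adjacent integers differ in exactly one bit.
unsigned binaryToGray(unsigned b) { return b ^ (b >> 1); }

unsigned grayToBinary(unsigned g) {
    for (unsigned mask = g >> 1; mask != 0; mask >>= 1)
        g ^= mask;
    return g;
}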

I haven't yet sat down and decided on exactly what kind of structure I want to train, but I'm strongly considering the large hierarchical multi-network specialist strategy when it comes to trading. I plan on having 1 autoassociative network to decide what the trend is, then for each kind of trend that the network decides is unique, a new specialist recurrent network designed to work in that specific regime. The only problem I can see with that is the boundary states between each "trend", but I think that's an issue that can be solved with a strong autoassociative network training regime for the top-tier network.

With regards to open source, I'm personally not against it in the slightest; I haven't used any open source code (besides the boost:: library) myself, but I can definitely imagine it being helpful and I would love to help others. Right now, though, I have a 76-year-old unemployed father who I'm taking care of, and I'd like to make enough money that I don't have to choose between myself and him. Hopefully when this project is complete I won't have to worry, ahah. Sorry for spilling out my personal life in a tech thread :P
member
Activity: 114
Merit: 10
A quick way to test out neural network schemes and the number of neurons in each level on specific datasets would be to use the Matlab Neural Network toolbox. Once you figure out what gives you the best result, you can port that schema back to C/C++ or PHP if you so desire.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
bitfreak: I'm nearly done with my foundations, I've written an RNN from scratch (nearly got it to work, just going to test for bugs tomorrow). Strongly contemplating making this an open source project, but I'm on the fence given that I'm literally sleeping on my dad's couch until he throws me out. Looks really good though, can't wait to see how it looks finished.
I would be interested in seeing your source code, but don't make it open source if you don't want to. What I designed is also a type of recurrent neural network, although I think it's a bit different from most other types of RNNs. I'll give you an example of what one of my nets would look like. First of all I want to show you what the DNA code for a net looks like; here is an example DNA string for a very simple net:

I:0/I:0|0:1-1/0:1;2-3/0:1;2-1/0:1-2/0:1;2-3/C:4;5/B:1|1:3;4-2/1:1;4-1/1:1;5-1/1:1;2-3/1:3;4-3/C:5;5/B:1|2:1;2;3;4;5-2/B:1#4,23,13

I designed it so that random DNA strings can easily be generated and stored in a simple text file, and so that a net object can easily be constructed from the DNA. First I will quickly explain the format of the DNA string and then I'll post a picture which shows the actual structure of the net associated with the DNA string above. The weights are stored in a separate file from the DNA, but I won't explain the format of the weight text because it's fairly simple to understand and not really important to this explanation.

Everything after the # character is "meta data" for the net. So we have the numbers 4, 23 and 13 separated by commas after the # character. What that means is that the net has 4 layers, 23 connections (not including bias connections and context connections) and 13 neurons (not including bias neurons and context neurons). Each layer in the DNA string is separated by the | character, so if we remove the meta data and then split the string into substrings using | as the separator we get the following strings of text:

I:0/I:0
0:1-1/0:1;2-3/0:1;2-1/0:1-2/0:1;2-3/C:4;5/B:1
1:3;4-2/1:1;4-1/1:1;5-1/1:1;2-3/1:3;4-3/C:5;5/B:1
2:1;2;3;4;5-2/B:1

So now it becomes clear that we have 4 layers in this particular example. Then as you may have already guessed, each neuron is separated by the / character. The two neurons in the first layer are obviously the input neurons; I:0 means that it is an input neuron with a default value of 0 (kind of pointless to have a default value but I wanted consistency in the format). The second and third layers both have 5 neurons each as well as one context neuron and one bias neuron. B:1 obviously means a bias neuron with an output value of 1.

The context neurons are a little bit more complicated. If we take the context neuron in the second layer it reads "C:4;5", which means it's a context neuron with a link to the 4th neuron in the layer in which it exists, and it will remember and feed back the output of that neuron for 5 iterations before it accepts a new output from the 4th neuron. In other words it will hold the same value for 5 iterations, then accept a new value from the 4th neuron in the 2nd layer, remember that value for 5 iterations, and feed it back to the 4th neuron. We could have more than 1 context neuron on each layer but I've avoided that for this example.
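In C++-style pseudocode (the library itself is PascalScript), a context neuron like "C:4;5" behaves something like this:

Code:
// Context neuron "C:4;5": feeds back the remembered output of neuron 4
// in its own layer, accepting a fresh value only every 5 iterations.
struct ContextNeuron {
    int sourceIndex;     // which neuron in this layer to watch (the 4)
    int memoryLength;    // how many iterations to hold a value (the 5)
    int age = 0;
    double held = 0.0;

    double step(double sourceOutput) {
        if (age == 0) held = sourceOutput;  // accept a new value
        age = (age + 1) % memoryLength;
        return held;                        // fed back into the layer
    }
};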

Now if we look at the 5th neuron in the 2nd layer it has a value of "0:1;2-3". The 0 means that it has connections to neurons in layer 0 (the input layer) and the 1;2 part means that it connects to the 1st and 2nd neurons in layer 0 (so it has a connection to both the input neurons). So it's listing where connections come from, not where they go. The 3 after the - character refers to the activation function. Every neuron in this net can be assigned one of three different types of activation functions, meaning not all neurons need to have the same activation function. If a value of 0 is placed after the - character it means the neuron doesn't use an activation function at all.

It is even possible in my system to have a single neuron link to neurons which are separated by multiple layers. For example if one of the neurons in the 3rd layer (layer 2 if we start from 0) had a value of "0:1;2,1:1;3-3" then it would connect to neurons in the first two layers (layer 0 and layer 1) because it can be separated into "0:1;2" and "1:1;3" using the comma as the separator (keeping in mind I removed the -3 from the end because it's the activation function and not part of the connections). But I haven't included multi-layer connections in this example because it would make it a bit too messy.
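To make the splitting concrete, here is a rough C++ sketch of tokenizing such a DNA string into layers and neuron specs (the real library is PascalScript; this shows only the tokenization, not full net construction). Each neuron spec like "0:1;2-3" would then be parsed further: source layer before the :, ;-separated source neurons, and the activation id after the -.

Code:
#include <sstream>
#include <string>
#include <vector>

// '#' separates the metadata, '|' separates layers, '/' separates
// neurons within a layer.
struct Dna {
    std::vector<std::vector<std::string>> layers;  // layers[l][n] = neuron spec
    std::string meta;                              // e.g. "4,23,13"
};

static std::vector<std::string> split(const std::string& s, char sep) {
    std::vector<std::string> out;
    std::stringstream ss(s);
    for (std::string tok; std::getline(ss, tok, sep); )
        out.push_back(tok);
    return out;
}

Dna parseDna(const std::string& dna) {
    Dna d;
    const auto hash = dna.find('#');
    if (hash != std::string::npos) d.meta = dna.substr(hash + 1);
    for (const auto& layer : split(dna.substr(0, hash), '|'))
        d.layers.push_back(split(layer, '/'));
    return d;
}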

I also left out the bias neurons and their connections from the following diagram because it would just make it unnecessarily messy and overly complex. But you can easily imagine the bias neurons feeding a value of 1 into all of the neurons (apart from the input neurons). The context neurons and their corresponding connections are shown in the lighter gray color. The numbers inside the context neurons represent their memory length in iterations. The numbers inside the rest of the neurons represent the type of activation function. The numbers inside the input neurons are 0 because they have no activation function.

Hopefully you understood my above explanation and you will be able to see how the DNA string example I gave above can be transformed into the following neural network:

donator
Activity: 452
Merit: 252
bitfreak: I'm nearly done with my foundations, I've written an RNN from scratch (nearly got it to work, just going to test for bugs tomorrow). Strongly contemplating making this an open source project, but I'm on the fence given that I'm literally sleeping on my dad's couch until he throws me out. Looks really good though, can't wait to see how it looks finished.
donator
Activity: 452
Merit: 252
Hey guys, I haven't got the time to make a blog post since I've sort of set a deadline for myself to master ANNs in the next 2 weeks, but I found a text that's absolutely wonderful if you have knowledge of C/C++.

"Practical Neural Network Recipes in C++" by Tim Masters, extremely well written and useful text.
legendary
Activity: 2562
Merit: 1071
Ahh ok, that looks pretty good too, but I would prefer a course dedicated to neural networks, so I think I'll keep watching the lectures from the course I found and then maybe check out the machine learning course when I'm finished (is it also free?).

Sure, have fun. :) And yes, the machine learning course is also free; in fact I think all the courses on Coursera are, which is pretty cool.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
Ah, sorry, actually I meant this one: https://www.coursera.org/course/ml with Andrew Ng from Stanford University. You can see the videos the course is based on in the preview, and it starts with neural networks at week 4.
Ahh ok, that looks pretty good too, but I would prefer a course dedicated to neural networks, so I think I'll keep watching the lectures from the course I found and then maybe check out the machine learning course when I'm finished (is it also free?).
legendary
Activity: 2562
Merit: 1071
Ah, sorry, actually I meant this one: https://www.coursera.org/course/ml with Andrew Ng from Stanford University. You can see the videos the course is based on in the preview, and it starts with neural networks at week 4.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
I assume you are talking about this class?

https://www.coursera.org/course/neuralnets

It actually seems to be exceptionally good based on what I've seen so far. And it's free! Thanks a lot for the link.
legendary
Activity: 2562
Merit: 1071
I've watched almost all of his videos, I'm extremely grateful that you could show me this man's channel, what a brilliant teacher.
No problem, he is certainly a great teacher. He is able to explain things in a very clear and simple fashion and he doesn't overcomplicate anything beyond what is necessary. I've tried watching other lectures on ANNs and all of them are very hard to follow compared to this lecture series. He also has a lot of other great lecture series related to ANNs if you look through the videos in his channel.

I personally liked following the Machine Learning course on https://www.coursera.org/. The teacher is great and covers most of the basics; unfortunately, it never gets to anything much more elaborate.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
I'll probably draw a nice flowchart tonight after I finish mastering contextual neural nets
Please do, as I'm having trouble fully understanding the training process you intend to use.
donator
Activity: 452
Merit: 252
From what I've been reading, we're following a similar approach, except that you aren't nesting your nets. The best time periods, neural configurations, and inputs to use are what I'll be determining with a "high net"; this guy uses a genetic algorithm to change the inputs used and which neural configuration works best for the current problem.

This "neural network maker" will first make a "trend spotting" noise filter, I'll most likely be working around the 10 min interval with 6 contextual neurons (short term memory of 60 min), however I'll be doing some small scale tests to determine the perfect point for that, and I'll most likely be reinforcement training this guy.

The second project for the neural network maker is to create specialist trading nets designed to work within the specific trends that the trend spotter picks up. This will probably be pretty time consuming, as I'll have nested GAs running simultaneously, but it should only really have to be done once if enough different trends are created by the trend spotter. For this part I also plan to have each GA running in parallel with an openCL hook for some GPGPU computing on my 7950, which I figure will speed up the training time considerably.

I'll probably draw a nice flowchart tonight after I finish mastering contextual neural nets and start work on genetic and simulated annealing algorithms; this shit is super cool.
legendary
Activity: 1536
Merit: 1000
electronic [r]evolution
What language are you writing in?
Like I said, the library is written in PascalScript because it's meant to be used with Deja Vu (you can download the library from the Deja Vu page). I used a very object-oriented and modular approach in my design of the net structure, and I didn't use matrices anywhere in the code as would typically be done in an ANN library. The most important files are the core.pas and evol.pas files; they contain the bulk of the code related to the ANN and genetic algorithm stuff. I took a relatively simple approach to the breeding process: the layers in the parent nets are essentially just spliced together to produce a child net (the parent nets must have the same number of layers and the same number of neurons on each layer, otherwise they won't breed properly; I haven't really figured out a way around that, it's like asking how our DNA evolved and what would happen if a human was born with an extra chromosome; how could that person breed with other humans?).
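In C++-style pseudocode (the actual code is PascalScript), the splice is essentially a per-layer coin flip between two same-shaped parents; a minimal sketch, assuming whole layers are the unit of crossover:

Code:
#include <cstddef>
#include <random>
#include <vector>

// A net's weights grouped per layer; both parents must share the same
// layer and neuron counts for this splice to make sense.
using LayerWeights = std::vector<double>;
using NetWeights   = std::vector<LayerWeights>;

std::mt19937 rng{std::random_device{}()};

NetWeights spliceBreed(const NetWeights& mum, const NetWeights& dad) {
    std::bernoulli_distribution coin(0.5);
    NetWeights child(mum.size());
    for (std::size_t l = 0; l < mum.size(); ++l)
        child[l] = coin(rng) ? mum[l] : dad[l];  // whole layers are spliced
    return child;
}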

The entire process is basically this: first a specified number of random nets are generated to form the 1st generation of nets. Then the fitness/performance of each net in the group is tested and a desired number of the best performers are placed into the elite group. The rest of the nets are discarded and the elite nets are used for breeding the 2nd generation of nets. Then the fitness of all the nets in the 2nd generation is tested, and the elite nets from the previous generation are also retested (this is necessary because the training data is broken up into multiple sets so that each generation isn't faced with exactly the same problem, which prevents the nets from honing in on just one specific pattern in the training data). Then the best performers from the 2nd generation are placed into the elite group (elite nets from previous generations can stay in the elite group if they continue to perform well) and they are used to breed the 3rd generation. And this process repeats over and over again for the specified number of generations, and the final group of elite nets are hopefully going to be trained fairly well.
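That loop in outline (C++ pseudocode; the helper functions are stand-ins for the library's actual routines and are only declared here, not defined):

Code:
#include <vector>

struct Net { /* DNA, weights, ... */ };

// Stand-ins for the library's actual routines (declarations only).
std::vector<Net> randomNets(int count);
double testFitness(const Net& n, int dataSet);   // one virtual trading run
std::vector<Net> pickElite(const std::vector<Net>& pool,
                           const std::vector<double>& scores, int eliteSize);
std::vector<Net> breed(const std::vector<Net>& elite, int count);

std::vector<Net> train(int popSize, int eliteSize, int generations) {
    std::vector<Net> population = randomNets(popSize);   // 1st generation
    std::vector<Net> elite;
    for (int g = 0; g < generations; ++g) {
        // Retest carried-over elites too: each generation gets a
        // different slice of training data, so old scores go stale.
        std::vector<Net> pool = population;
        pool.insert(pool.end(), elite.begin(), elite.end());
        std::vector<double> scores;
        for (const Net& n : pool) scores.push_back(testFitness(n, g));
        elite = pickElite(pool, scores, eliteSize);
        population = breed(elite, popSize);              // next generation
    }
    return elite;   // hopefully a group of fairly well-trained nets
}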

I say that I've never gotten anything useful from it, but to tell the truth I haven't tried running the training process for more than 24 hours. One of the major hurdles to training the nets properly is working out what time interval is the most profitable. High-frequency trading doesn't seem to be as profitable as longer-term trading. I can't really remember now what interval of time was giving me the best result; I think it was 30 mins or maybe an hour. The other major problem is that even if you get a well-trained net, you need to make sure that the simulated trading which was used to test the fitness of the net is going to carry over into real-life trading, and that appears to be a bit more difficult than you might imagine at first glance. Anyway, I've rambled on quite a bit and I've gotta go get some sleep so I can't continue this discussion right now, but let me know if you do create a blog to document your progress because I'd be interested to see how you go.