Tuesday, April 17, 2012

Postpruning - First Whack


Zoo dataset A-B, before and after post-pruning.
It's funny I'm working on pruning for decision trees while earlier I was out in the yard pruning real trees.

It would be nice to eliminate the construction of these redundant nodes in the first place, but that seems to mean comparing the counts of the dataset and sub-datasets to see if the test is redundant which in Linq could be expensive.

It's a good thing I implemented edges because in previous versions of this function I was trying to remove and add nodes which threw an exception because you can't modify collections while your iterating over them.  I got around this by filling a list of replacement structures which was iterated over after the main routine was finished.  What was really a deal breaker is that this changed the order of the branches.  I finally realized I don't need to modify any collections I can just modify the nodes and edges.

Yay edges!

        private DtBranch Postprune(DtBranch branch)
        {
            DtBranch bottom = branch;
            //
            if (branch.HasAllLeafChildren)
                return branch;
            else if (branch.HasSingleChild)
            {
                DtBranch child = branch.Children.Single() as DtBranch;
                bottom = Postprune(child);
                bottom.Edge = branch.Edge;
                bottom.Edge.To = bottom;
            }
            else
                foreach (DtNode child in branch.Children)
                    if(child.Kind == DtElementKind.Branch)
                        Postprune(child as DtBranch);
            //
            return bottom;
        }

Saturday, April 14, 2012

Decision Tree Workshop up on Google Code


Decision Tree Workshop is now available at http://code.google.com/p/decision-tree-workshop/.

It's a demo translated from python to C# based upon the article Building Decision Trees in Python
 by Christopher Roach.  I highly recommend reading it if you are unfamiliar with decision trees.

I decided to use Linq for the ID3 code because it's basically a bunch of database operations, and it makes for more understandable code in my opinion.  Besides it's just a demo, I'm not worried about speed at this point.

If the code seems strange it's because I'd like to experiment building the tree in parallel in the future.  So that is why I'm implementing worker classes instead of just using recursion.  Ideally these workers would be able to farm out some work to the GPU.  Is there Linq to GPU?  Hmmm, it appears so.  I think I will look at this Brahma thing.

I've driven myself nuts trying to validate it.  When you  translate a program you don't even really understand, things tend to get lost in translation and then you have to go back and compare the two sets of code over and over until things work.  However, it's that same process that helps to actually understand the algorithms underlying the code.  Well, I'm happy to say I'm satisfied enough with it now or I wouldn't even be writing this post.

Figure 1.  The Purchase dataset, target attribute = purchase?
Figure 1 is an example of a good tree, ideally you want the tree to be pyramid shaped which means that at run-time it requires the least amount of tests to get from the root to a result.

Figure 2.  The Iris dataset, target = class.
Figure 2 is an example of a bad tree, because there are too many child nodes under most of the nodes.  This is because the Iris dataset's attributes are all numeric except for the class attribute.  The solution is to create split points based upon the distribution of the attribute.  This is what C4.5 does.

Figure 3 looks especially depressing because at run-time it will perform a bunch of unnecessary tests.  What's happening is that the records in these branches have the same values for the attribute node that is being constructed, and then finally when it runs out of attributes to split on it reaches the target attribute and branches to multiple results.  The solution here is to use post-pruning like C4.5 does and remove the redundant nodes.

Well that about sums it up.  Next phase will be to clean up ID3 some more and then implement C4.5.
Figure 3.  The Zoo dataset, target = animal.

Wednesday, December 9, 2009

Behavior Language WIP

namespace SequenceTestNamespace

using System
using System.Collections.Generic
using System.Text
using System.IO

using BwBot.Runtime
using BwBot.Runtime.Tasks

class SequenceTest()
    def Main(sequence-test)
        {Console.WriteLine("a");}
        sequence
            {Console.WriteLine("1");}
            {Console.WriteLine("2");}
            {Console.WriteLine("3");}
        {Console.WriteLine("b");}
        sequence
            {Console.WriteLine("1");}
            {Console.WriteLine("2");}
            {Console.WriteLine("3");}
        {Console.WriteLine("c");}
        sequence
            {Console.WriteLine("1");}
            {Console.WriteLine("2");}
            {Console.WriteLine("3");}
            
sequence-test

Tuesday, June 16, 2009

Redo

I finally decided to go forward with a C# redo since the C++ version is too long and hard of a row to plow.

I've also cooked up a new syntax that is basically a prefix version of Python's syntax.

class Test():
def Foo1($x is foo): #forward chaining method
rule:
$x has $y
$y is bar
-->
$y is foo

I've already written a simple tokenizer based on this article about regular expression parsing in C#. Another article that was helpful was Python: Myths about Indentation. That only took one day. The rest of the week was spent agonizing whether to have the parser just read the token list or convert to s-expressions and then parse.

I finally went back to an earlier idea of writing a class called TokenTree that extends List(object) where each element is either a token or a subtree. This should make things easier on the parsing end for various reasons. The s-expression idea was kind of neat but it seems redundant since I need AST classes anyways.

Later on I'll have to dump the RegEx stuff and write a more sophisticated tokenizer but at least this was a quick way to get started. For now I need to start writing the recursive descent parsing functions and the AST classes.

Thursday, December 11, 2008

Agent Language Preview

This is a preview of the Agent Language I'm working on. More details will go in a Wiki or something.

Methods are triggered by the addition and deletion of facts within the state. Methods without triggers are evaluated upon state creation. Rules within the methods are evaluated sequentially. States are evaluated until no new information can be generated at which time any facts that were proposed will cause the creation of successor states into which those proposals will be added along with any other facts in the current state.

This is a work in progress therefore any and all comments or criticisms are extremely welcome!

$ = Logic variable
. = Property of the rule, method, state, or agent

;
; Missionaries and Cannibals 
;

(domain Mac ()
 (method main ()
  ;state success
  (rule
   (topic $g (type goal))
   ($g object $bank)
   ($g missionaries $m1)
   ($bank missionaries $m2)
   ($m1 = $m2)
   ($g cannibals $c1)
   ($bank cannibals $c2)
   ($c1 = $c2)
   ($g boat $b1)
   ($bank boat $b2)
   ($b1 = $b2)
   ==>
   halt
  ) 
  ;state failure
  (rule
   (Bank-1 missionaries $m1)
   ($m1 > 0)
   (Bank-1 cannibals $c1)
   ($m1 < $c1)
   -->
   (assert(.state fail "Too many cannibals on Bank1!!!"))
   ==>
   fail
  )
  ;state failure
  (rule
   (Bank-2 missionaries $m2)
   ($m2 > 0)
   (Bank-2 cannibals $c2)
   ($m2 < $c2)
   -->
   (assert(.state fail "Too many cannibals on Bank2!!!"))
   ==>
   fail
  )
  ;state failure
  (rule
   (.parent (.goal missionaries $m1) (.goal cannibals $c1) )
   (.goal missionaries $m2)
   ($m1 = $m2)
   (.goal cannibals $c2)
   ($c1 = $c2)
   -->
   (assert(.state fail "State undoes parent goal!!!"))
   ==>
   fail
  )
  (rule
   (topic $g (type goal))
   (not($g is achieved))  
   -->
   (temporary(.agent achieve $g))
  )
 ) 
 (method MoveMissionaryPropose (.agent achieve $g)
  (rule
   (topic $bank1 (type bank))
   ($bank1 boat $boat)
   ($bank1 missionaries $m)
   ($m > 0) 
   (topic $bank2 (type bank))
   ($bank2 <> $bank1)   
   -->
   (propose((.agent move-boat $boat)
    (missionaries 1)
    (cannibals 0)
    (source $bank1)
    (destination $bank2) ) )
  )
 )
 (method MoveCannibalPropose (.agent achieve $g)
  (rule
   (topic $bank1 (type bank))
   ($bank1 boat $boat)
   ($bank1 cannibals $c)
   ($c > 0) 
   (topic $bank2 (type bank))
   ($bank2 <> $bank1)   
   -->
   (propose ((.agent move-boat $boat)
    (missionaries 0)
    (cannibals 1)
    (source $bank1)
    (destination $bank2) )
   )
  )
 )
 (method MoveMissionariesPropose (.agent achieve $g)
  (rule
   (topic $bank1 (type bank))
   ($bank1 boat $boat)
   ($bank1 missionaries $m)
   ($m > 1) 
   (topic $bank2 (type bank))
   ($bank2 <> $bank1)   
   -->
   (propose ((.agent move-boat $boat)
    (missionaries 2)
    (cannibals 0)
    (source $bank1)
    (destination $bank2) )
   )
  )
 )
 (method MoveCannibalsPropose (.agent achieve $g)
  (rule
   (topic $bank1 (type bank))
   ($bank1 boat $boat)
   ($bank1 cannibals $c)
   ($c > 1) 
   (topic $bank2 (type bank))
   ($bank2 <> $bank1)   
   -->
   (propose ((.agent move-boat $boat)
    (missionaries 0)
    (cannibals 2)
    (source $bank1)
    (destination $bank2) )
   )
  )
 )
 (method MoveMissionaryCannibalPropose (.agent achieve $g)
  (rule
   (topic $bank1 (type bank))
   ($bank1 boat $boat)
   ($bank1 missionaries $m)
   ($m > 0)
   ($bank1 cannibals $c)
   ($c > 0) 
   (topic $bank2 (type bank))
   ($bank2 <> $bank1)   
   -->
   (propose ((.agent move-boat $boat)
    (missionaries 1)
    (cannibals 1)
    (source $bank1)
    (destination $bank2) )
   )
  )
 )
 (method MoveBoatApply (.agent move-boat $boat)
  (rule
   (.thought source $bank1)
   (.thought destination $bank2)
   (.thought missionaries $m)
   (.thought cannibals $c)
   ($bank1 missionaries $m1)
   ($bank2 missionaries $m2)
   ($bank1 cannibals $c1)
   ($bank2 cannibals $c2)
   -->
   (retract 
                ($bank1 boat $boat)
    ($bank1 missionaries $m1)
    ($bank2 missionaries $m2)
    ($bank1 cannibals $c1)
    ($bank2 cannibals $c2)            
            )
   (assert 
    ($bank2 boat $boat)   
    ($bank1 missionaries (- $m1 $m))
    ($bank2 missionaries (+ $m2 $m))
    ($bank1 cannibals (- $c1 $c))
    ($bank2 cannibals (+ $c2 $c))   
   )
  )
 )
)
;
;
;
(problem Mac
 (topic Boat-1
  (type boat)
 )
 (topic Bank-1
  (type bank)
  (missionaries 3)
  (cannibals 3)
  (boat Boat-1)
 )
 (topic Bank-2
  (type bank)
  (missionaries 0)
  (cannibals 0)
 )
 (topic Goal-1
  (type goal)
  (object Bank-2)
  (missionaries 3)
  (cannibals 3)
  (boat Boat-1)
 )
)

Syntax Highlighter Test

I found a way to insert code snippets into the blog. As long as you can find somewhere to host the files SyntaxHighlighter seems to be the best solution. Here is a good tutorial to get rolling.


class MyClass {
bool blah(){
return 2 > 1 ;
}
}

Sunday, May 11, 2008

Looks are Everything

Well, I managed to rearrange and rebuild everything with Visual C++ 2008 express.

Now it's time to get rid of the uglies!

I went over to a place called Psionic where most Ogre folks get their free models.

Thankfully I found a patch for the Ogre Blender Exporter that does the axis swapping again.

Only problem is that the Jeep model is still backwards. Either I research how to rotate vertices in Blender or I can just rotate the mesh in Ogre.

It's never easy.