Archive for April, 2011

Gaussian mutation in Clojush

Sunday, April 24th, 2011

A couple of Clojush updates, mostly involving a new mutation operator that perturbs floating-point constants in a program with Gaussian noise. Details below.

Available as usual from: https://github.com/lspector/Clojush

220110424: - Added Gaussian mutation for floating point literals. This is 
            a genetic operator on par with ordinary mutation, crossover,
            and simplification, with the probability for applying this operator
            set with the gaussian-mutation-probability argument to pushgp
            (which defaults to zero). The behavior of this operator, when used,
            is controlled by two other arguments to pushgp, 
            gaussian-mutation-per-number-mutation-probability (which is the
            probability that any particular floating point number in the 
            program will actually be mutated -- this defaults to 0.5) and
            gaussian-mutation-standard-deviation (which specifies the standard
            deviation of the Gaussian noise generator that is used to 
            produce changes to numbers -- this defaults to 0.1).
          - Added examples/gaussian_mutation_demo.clj to demonstrate Gaussian
            mutation.
          - Added examples/korns_regression_p12.clj, a symbolic regression
            problem based on Michael Korns's draft chapter from GPTP11.

No-pop tagging in Clojush

Saturday, April 9th, 2011

Added an option that changes the behavior of tagging instructions in Clojush, so that they tag without popping the tagged items off of the stacks from which they came. Details below.

Available as usual from: https://github.com/lspector/Clojush

20110409: - Added support for no-pop tagging through a var called
            global-pop-when-tagging (holding an atom with a boolean value)
            and a boolean argument to pushgp called pop-when-tagging. 
            The defaults are true, for backwards compatibility. If 
            @global-pop-when-tagging is false (which will result from 
            passing false as a :pop-when-tagging keyword argument to pushgp)
            then executing instructions of the form tag_[type]_[number]
            will tag a value as usual, but the tagged item will not be popped
            from its source stack.

Writing new Push instructions for Clojush

Thursday, April 7th, 2011

A few key facts and tips for anyone who wants to write new Push instructions for use in Clojush:

  • All Push instructions in Clojush must be written to take one argument (a Push interpreter state) and return a Push interpreter state. When you think of the execution of a Push program you should generally think of the instructions as taking arguments from stacks and pushing results on stacks, but at the Clojure level the function that implements an instruction actually takes a complete Push interpreter state (containing all of the stacks with all of their contents) and returns a complete Push interpreter state (containing stacks that have possibly been modified by the execution of the instruction).
  • Push instructions in Clojush should be defined not with the normal Clojure function definition macro “defn” but rather with “define-registered”. This does the work of defn but also registers the instruction in the Push instruction table, so that the Push interpreter will know that it is a Push instruction and know how to call it properly.
  • To understand what a call to define-registered should look like it helps to first appreciate how to define Clojure functions in yet another way, with the “def” special form (which creates a var) and the “fn” special form (which creates a function). A function defined as (defn foo [x] (+ x 2)), which returns the value of its single argument plus 2, can also be defined as (def foo (fn [x] (+ x 2))). In this version the fn expression returns an anonymous function, which def then puts in a var called foo. This is just like defining a constant var, with an expression like (def bar 3), but with a function as the value. The defn version is a little neater, but defn is really just a macro that expands into the version with def and fn.
  • Calls to define-registered should look like calls to def with a function value, in particular with a function value that takes one argument (which will be a Push interpreter state) and which returns a possibly-modified Push interpreter state.
  • A Push interpreter state is a Clojure struct, which is really just a Clojure hash-map with some pre-defined keys. You can access the stacks using hash-map accessors to pull things out (for example, (:integer state) to get the integer stack) and assoc to return a state with a modified stack (for example, (assoc state :integer ()) to return state but with an empty integer stack), but it is probably simpler and clearer in most cases to use the utility function stack-ref to access stacks and the functions push-item and pop-item to return states with modified stacks.
  • One very simple instruction definition that is in a lot of examples is:
    (define-registered in
      (fn [state] (push-item (stack-ref :auxiliary 0 state) :integer state)))

    This defines and registers a function called “in” that takes one argument, a state, and returns the state that results from pushing something — specifically the thing that’s on the top of the auxiliary stack — onto the integer stack in the given state.

  • An example that demonstrates a bit more, and which is probably a good template for many more complex instructions, is float_sin:
    (define-registered float_sin
      (fn [state]
        (if (not (empty? (:float state)))
          (push-item (keep-number-reasonable (Math/sin (stack-ref :float 0 state)))
            :float
            (pop-item :float state))
          state)))

    This follows a generally useful pattern of first checking to see if the conditions for executing the instruction are met — in this case, that means checking to make sure that there’s at least one thing on the float stack (that is, that it’s not empty) — and then returning either the result of executing the instruction or the original state unchanged. In this particular case executing the instruction means replacing the top float with its sine, which we get by pushing the sine onto the float stack in the state that results from popping the float stack in the original state. Here we compute the sine by grabbing the top item on the float stack in the original state, calling Math/sin on it, and then running it through a utility function called keep-number-reasonable that prevents the generation of problematic numerical values (see the implementation in clojush.clj for details).

  • Sometimes the Clojure “threading” operators like ->> can be handy for running a state through a bunch of modifications. See the definition of integer_fromfloat in clojush.clj for a simple example. You never have to do this; it’s just a convenience if you like the threading style.
  • If you use the utility function top-item to access the top item of a stack, then note that it will return :no-stack-item if there was no top item to return. This will also be returned from stack-ref when it tries to access an empty stack, but note that stack-ref is not safe for invalid positions (like trying to get the second item on a stack with only one item), so you should check that there are enough arguments (e.g. with (count (:boolean state)) to see how many things are on the boolean stack) and avoid calling stack-ref if it might be looking too deep.
  • You will notice that many of the Push instructions defined in clojush.clj are actually defined with the help of a higher-order function that returns the necessary fn form when given additional arguments, e.g. for the type on which you want to operate. For example, look at the definitions of exec_dup and all of the other dup instructions. Rather than writing all of the code for checking if the necessary argument is there, and then duplicating it if it is there (but returning the state unchanged if it is not), over again for each type, I wrote a higher-order utility called “duper” that takes a Push type and returns a function with all of the right stuff for the appropriate stack. That means that I only had to write the guts of the instruction once, and that I could define dup for each new type with a simple new statement like (define-registered boolean_dup (duper :boolean)). This is one of the nice things about working in a Lisp-like language, but it’s just a convenience and you could define all of the Push instructions that you need without using this trick.