Stackless tagging

October 28, 2010
by Lee Spector (lspector)

Following up the recent post and comments onĀ Lexicographic names, along with some off-blog discussion, I think I’m ready to implement something called “stackless tagging.”

First the tagging part: This is really just what we’ve been discussing as lexicographic names, but (as we discussed) the name “name” isn’t quite right, and we might want this new thing to co-exist with Push3 names some day. So we were casting about for other names/metaphors, ranging from the vacuous (“index”) to the wacky (something based on The Price is Right), and I subsequently scoured a thesaurus and even Charles Sanders Peirce’s 19th century classification of signifiers…. leading to a couple of candidates I didn’t love (I think the leader was “lexeme” — blah)…. before I realized what should have been obvious to me: “tags”. This comes from Holland via Riolo and company in their model of the evolution of altruism, on which Jon Klein and I later did some work (ALife paper, GPTP06 chapter)

The key idea is that a tag is like a name except that you don’t have to match it exactly. In the Riolo-style models tags were floating point numbers and similarity was numeric difference (which was thresholded), and we may implement something similar (probably more like integers, using differences but not thresholding), but Holland’s notion was more general (see his book Hidden Order). In any event, at a conceptual level all that we care about is that there’s a way to address things with inexact references (via similarity), and the tag concept covers this nicely. (BTW in my quick look back at Hidden Order I don’t see explicit mention of non-identical similarity, but I think it’s implicit.)

The stackless part is Tom’s suggestion that, for parsimony, we just build the tags directly into tagging and tag-lookup instruction names. So for example we may have things like integer.tag-4123 which pops the top integer and tags it with 4123, along with things like at-tag-3745 which pushes whatever is tagged with 3745 (OR, if nothing is tagged with 3745, whatever has the closet tag) onto the exec stack.

We could have typed at-tag instructions and separate tag stores by type, but for now, for simplicity if nothing else, I plan to have only one store and have at-tag-<num> instuctions push whatever has the closest tag onto the exec stack, from where it may end up on another stack via execution. I will have type-based tagging instructions, since otherwise it would be hard to tag different types of things, but not type-based tag lookup/reference instructions.

This kind of stackless mechanism would be a awkward with Push3 names, at least if you have a potentially large or unbounded set of possible names, because in that situation you really want a name stack so that you can correlate definitions with references, etc. But with similarity-based (e.g. lexicographic) lookup this seems much more reasonable. And since it does provide a highly parsimonious way to do this “nameish” thing, and is also simple to implement (although I think that I will treat these new “instructions” in a non-standard way in clojush…), I think it’s a good thing to try first.



10 Responses to “Stackless tagging”

  1.   tom Says:

    I really like how this is shaping up, both in concept and nomenclature. I think the stackless part is important for keeping things clean and useful. The main thing to determine concept-wise at this point seems to be whether to have typed at-tag instructions, but this can be explored later. I guess the other issue is to figure out what at-tag instructions will do before any tags have been defined.

    As for the names of things, I like “tag”, but the “at” instruction for referring seems a bit vague. I think I’d prefer “get” or “refer”.

    I’m putting this at the top of my post-Other-Minds list of things to do with Psh.

  2.   lspector Says:

    I think that for now I will make references prior to any tagging be no-ops. Pushing a default might be reasonable but I don’t know what the default would be (or even what type it would be in the untyped-store scheme I’m going to start with).

    On the names for the reference instructions: I don’t like “get” because get-tag-3212 suggests you’d be getting a tag rather than an associated value, and get-3212 is too vague… I think that “ref” or “refer” are better but also don’t seem 100% right to me… maybe “lookup”? Here’s a weird idea: “gat”, which is tag backwards and also “get at tag”… Too weird? Yeah, too weird. What about “from”? So you’d tag something with something like integer.tag-3212 and retrieve with from-3212. I guess retrieve-3212 is okay too.

    Anyone see what the best choice would be?

  3.   lspector Says:

    Ah — how about “near”, as in “near-2132”?

    That emphasizes the approximate matching lookup feature, which is core.

  4.   tom Says:

    I somewhat like “near” and “lookup”, but am not sold yet. “gat” would actually strike me as intriguing, if it weren’t for the fact that along with Git, there would be too many g*t words in my life.

    How about “find” or “retrieve”? These to some extent incorporate both getting and approximation matching.

    Or maybe this requires a new word, meaning “lookup what’s at this index or, if not defined, the next defined index.” How about “zorch”? (This was first a joke, but it sounds sort of like “search”, which sort of makes sense?) Or “referch”, combining “refer” and “search” in a way that pleasingly sounds like “research”.

    I guess a real word is probably the way to go. I’ll post if I have any other ideas.

  5.   lspector Says:

    You’re right that “find” and “retrieve” say “get” and also “match,” and they’re compatible with approximate matching, but they both make it seem like you’re getting the tag itself: find-x makes it seem like you’ll get x or something close, not something connected to x (or connected to something close). There’s indirection, and it’d be nice to capture that.

    Tags are like receptors on molecules that allow binding by matching, but it’s the molecule that you want, not the receptor itself. “At” and “near” deal with this because of their grammar (you wouldn’t think that at-x or near-x would give you x, but rather something that’s at or near x) but I agree they’re not great, in part because of their spatial connotations.

    Semantically I guess what we really want is “item-with-tag-that-best-matches-x,” but that’s not very catchy.

    It seems like a good term could come from the molecular binding analogy, but the connotations of “bind” in computer science are too different, I think.

  6.   lspector Says:

    Maybe just “tagged-x”. The approximate matching isn’t there, but it goes well with “tag-x”.

  7.   tom Says:

    I think “retrieve-x”, “tagged-x”, and “near-x” work for me, in that order of preference. But, I’m not too picky, just let us know what you go with so that we can keep things the same across implementations.

  8.   bill Says:

    “match”?

  9.   Push » Blog Archive » Stackless tagging (+ fixes/enhancements) in Clojush Says:

    […] Stackless tagging […]

  10.   lspector Says:

    “match” isn’t bad, but I went with “tagged” — see http://push.i3ci.hampshire.edu/2010/11/04/stackless-tagging-fixesenhancements-in-clojush/

Leave a Reply

You must be logged in to post a comment.