arrows and regex

This post introduces Ripple’s application and regular expression syntax. The very first releases of Ripple (as in the screencast) included a single infix symbol, “/”, for the application of mappings. For example, to map the numbers 2 and 3 to their sum, you would have written (2 3 /add). While the slash is still supported for the sake of backwards compatibility, Ripple’s preferred syntax is now entirely postfix-based, and includes various constructions for “forward” and “backward” application of mappings, as well as for regular expressions. These are:

  • “>>” — forward application. Simply applies a mapping, exactly once. For instance, the example above should be written: (2 3 add >>) using the preferred syntax.
  • “<<” — backward application. Applies the inverse of a mapping, insofar as this is defined. For instance, the expression (2 3 add <<) yields (-1), as the inverse mapping of add is defined to be sub, the subtraction primitive. In RDF applications, backward application is useful for traversing links from head to tail rather than from tail to head. For example, if (:timbl foaf:knows >>) yields all individuals known by Tim Berners-Lee, then (:timbl foaf:knows <<) yields all individuals who know Tim Berners-Lee, according to Ripple’s knowledge base. In the current application, you can even traverse backwards from literal values. For example, ("Timothy Berners-Lee" foaf:name <<) yields (:timbl) himself.
  • “?” — optional quantifier. This and the following constructions provide POSIX-style regular expressions in Ripple. When it stands before an application operator, “?” applies the operator both once and not at all. For instance, the expression (:timbl foaf:knows? >>) yields both Tim Berners-Lee and the individuals he knows. The expression (42 neg? >>) yields both (42) and (-42).
  • “*” — star quantifier. When it stands before an application operator, “*” applies the operator zero or more times. This is particularly useful when working with recursive data structures such as lists. For example, the expression ((10 20 30) rdf:rest* >> rdf:first >>) yields (10), (20), and (30).
  • “+” — plus quantifier. Like “*”, but applies its operator at least once. Thus, the expression ((10 20 30) rdf:rest+ >> rdf:first >>) yields only (20) and (30), as the rdf:rest mapping is applied once, then twice before the end of the list is reached.
  • “{n}” — numeric quantifier. Applies its operator a single, specified number n of times. For instance, ((10 20 30) rdf:rest{2} >> rdf:first >>) yields (30). The expression (:timbl foaf:knows{2} >>) yields all individuals known by Tim Berners-Lee, in transitive fashion for two degrees. This is the same as (:timbl foaf:knows >> foaf:knows >>).
  • “{n,m}” — range quantifier. Applies its operator at least n times and at most m times. For instance, ((10 20 30) rdf:rest{0,1} >> rdf:first >>) yields (10) and (20). (:timbl foaf:knows{2,3} <<) yields all individuals from whom Tim Berners-Lee is two or three degrees removed, according to the foaf:knows mapping.
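
The quantifiers above can be modelled outside of Ripple. What follows is a minimal Python sketch, not Ripple's implementation: a "mapping" is assumed to be a function from one value to a list of results, Python tuples stand in for RDF lists, and the names apply_range, flat_map, rest, first and neg are made up for this illustration.

```python
def rest(lst):
    """Stand-in for rdf:rest: the tail of a non-empty list, no result for an empty one."""
    return [lst[1:]] if lst else []


def first(lst):
    """Stand-in for rdf:first: the head of a non-empty list, no result for an empty one."""
    return [lst[0]] if lst else []


def neg(x):
    """Stand-in for the neg primitive."""
    return [-x]


def apply_range(mapping, lo, hi, value):
    """Collect the results of applying `mapping` at least `lo` and at most `hi` times.

    hi=None means "no upper bound". This sketch does not guard against cycles,
    so an unbounded quantifier over a cyclic graph would never terminate here.
    """
    results = []
    frontier = [value]  # everything reachable with exactly `count` applications
    count = 0
    while frontier and (hi is None or count <= hi):
        if count >= lo:
            results.extend(frontier)
        frontier = [y for x in frontier for y in mapping(x)]
        count += 1
    return results


def flat_map(mapping, values):
    """Apply `mapping` once to each value and concatenate the results."""
    return [y for x in values for y in mapping(x)]


numbers = (10, 20, 30)
print(apply_range(neg, 0, 1, 42))                             # "?"     -> [42, -42]
print(flat_map(first, apply_range(rest, 0, None, numbers)))   # "*"     -> [10, 20, 30]
print(flat_map(first, apply_range(rest, 1, None, numbers)))   # "+"     -> [20, 30]
print(flat_map(first, apply_range(rest, 2, 2, numbers)))      # "{2}"   -> [30]
print(flat_map(first, apply_range(rest, 0, 1, numbers)))      # "{0,1}" -> [10, 20]
```

In this model every quantifier is just a (lo, hi) pair, which is why “?”, “*”, “+” and “{n}” can all be read as special cases of “{n,m}”.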

Note that despite this diversity of syntax, there is and always has been only one true application operator in Ripple, still called op. Apart from the forward application symbol “>>” which is simply an alias for op, all of the above constructions are merely syntactic sugar for expressions involving op together with one primitive mapping or another. For example, the expression (:timbl foaf:knows{2} >>) parses to the same Ripple program as (:timbl foaf:knows 2 timesApply op).
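
The idea of one application operator plus higher-order primitives can also be sketched in Python. This is only a rough model of the shape of the idea, not of Ripple's evaluator: the toy graph, the node names, and the functions op, times_apply and inverse_of are all assumptions invented for the example.

```python
# Toy foaf:knows graph; the node names are made up for illustration.
KNOWS = {("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")}


def knows(x):
    """Forward direction: everyone x knows."""
    return sorted(o for s, o in KNOWS if s == x)


def knows_inverse(x):
    """Backward direction: everyone who knows x."""
    return sorted(s for s, o in KNOWS if o == x)


# Hypothetical table of inverses, standing in for "insofar as this is defined".
INVERSES = {knows: knows_inverse, knows_inverse: knows}


def op(mapping, value):
    """The single application operator: apply `mapping` to `value` exactly once."""
    return list(mapping(value))


def times_apply(mapping, n):
    """Higher-order primitive: build a new mapping that applies `mapping` n times."""
    def repeated(value):
        values = [value]
        for _ in range(n):
            values = [y for x in values for y in mapping(x)]
        return values
    return repeated


def inverse_of(mapping):
    """Higher-order primitive: look up the inverse of a mapping, if one is defined."""
    return INVERSES[mapping]


# knows{2} >>  modelled as one op over a derived mapping
two_hops = op(times_apply(knows, 2), "a")
# knows >> knows >>  modelled as two separate applications
by_hand = [z for y in op(knows, "a") for z in op(knows, y)]
print(two_hops == by_hand)          # True
print(op(inverse_of(knows), "a"))   # ['c', 'd']
```

Here the quantifier and the backward arrow never apply anything themselves; they only build a new mapping, and op remains the one place where application happens.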

Prettifying the command line

Right. Blog. Keyboard. Fingers. Just start typing. So, I needed to take a screen capture of the Ripple command line for a presentation yesterday, and was a little embarrassed by this old and awkward formatting:

    
    1 >>  :timbl >> foaf:knows >> foaf:name >> .
    
    rdf:_1  ("Dan Brickley"@en)
    rdf:_2  ("Libby Miller")
    rdf:_3  ("Jim Hendler")
    rdf:_4  ("Henry J. Story")
    
    2 >>
    

Old, because this is how it has been since the dawn of Ripple time. Awkward, because:

  1. The >> input prompt clashes with the >> application operator (which in earlier versions of Ripple was a slash, as well as being an infix operator; more on the new syntax later).
  2. The RDF Bag-styled index for query results (rdf:_1 and so on) has always been a little misleading. It’s particularly wrong now that Ripple is much more loosely coupled with the RDF data model.
  3. Without the spurious RDF Bag syntax, the parentheses around individual query results (indicating that they are lists) are as unnecessary as they are unsightly. Just as the top-level parentheses of a line of input are omitted — so you can type 2 3 add >> instead of the more obviously list-like (2 3 add >>) — so it can be with output: just pretend the parentheses are there, and remember that query results really are lists.

It took all of five minutes to put a much improved format in place:

    
    1) :timbl >> foaf:knows >> foaf:name >> .
    
      [1]  "Dan Brickley"@en
      [2]  "Libby Miller"
      [3]  "Jim Hendler"
      [4]  "Henry J. Story"
    
    2)
    

This does look better, doesn’t it?
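
For the curious, the new result layout boils down to something like the following Python sketch. Ripple itself is written in Java, so this is only an illustration of the format, and the function name format_results is made up.

```python
def format_results(results):
    """Render results as bracketed, 1-based indices with no surrounding parentheses."""
    return "\n".join(f"  [{i}]  {r}" for i, r in enumerate(results, start=1))


print(format_results(['"Dan Brickley"@en', '"Libby Miller"']))
```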

Ripple’s not dead

Alright, so I’m not much of a blogger. I suppose that’s obvious by now. Nonetheless, the subject of this very neglected blog, the Ripple language, has come a long way in the last seven-and-a-half months. Ripple is now used commercially, which has driven its development in new and interesting directions. The language and query environment, now compatible with any Sesame 2.0 Sail implementation, are clearly separated from the linked data client, which in turn is compatible with Sesame-based applications distinct from Ripple. The API has been extended to allow for more specialized network algorithms. A developer may now embed Ripple query strings in Java source code, making it much easier to use Ripple as a software component, as opposed to a stand-alone tool. The syntax of the language has grown and matured, with support for regular expressions, backward and forward traversal of networks, and user-friendly, pattern-matching program definitions. In short, Ripple is becoming a real programming language. As it’s an open-source language, I’ve decided that the source code really ought to be accessible somewhere (other than in months-old release packages), so I’ve put it on Google Code. Maybe I should pipe the SVN commit messages into my blog.

Note: you can check out an up-to-the-minute working copy of Ripple like so (requires a Subversion client):

svn checkout http://ripple.googlecode.com/svn/trunk/ ripple

Ripple 0.4 released

This is the first software release for Ripple since its debut at ESWC last June. While the syntax and computational model of the language have not changed, the implementation contains a lot of new material in its libraries, and also makes better use of Java concurrency, speeding up queries and improving interactivity. Particular new features include:

  • Streaming query results. One shortcoming of previous releases was that there was no way to interrupt a query. It was all too easy to find yourself stuck waiting on a program which either would never terminate or was busy churning out far more results than you needed. What’s more, you had no way of seeing what was going on in those situations, because query results were tucked away in a buffer until the program halted. Now, individual query results stream onto your terminal as soon as they’re computed. A double tap to the ESC key tells the query engine that you’ve seen enough.
  • Floating point math. Ripple 0.3 was limited to integer arithmetic. Ripple 0.4, on the other hand, has over two dozen math primitives (borrowed entirely from Java), including trig functions and a random number generator, all of which play well with both integer and floating point numbers.
  • RDF document primitives. Ripple’s RDF-specific features are largely hidden behind its syntax. RDF documents are requested, parsed and manipulated transparently, as needed to answer queries. That’s as it should be. However, so often do I find myself using document-centric tools in conjunction with Ripple that it’s been very handy to just stick a couple of document-centric primitives in Ripple itself: the graph:triples primitive consumes the URI of a Semantic Web document and produces each of the statements contained in the document, while graph:namespaces produces each of the namespaces defined in the document.
  • Literal reification and type-casting primitives. A list of rewiring scenarios by Stefano Mazzocchi inspired a number of primitives for manipulating data types, including graph:toUri which consumes a string literal and produces a URI. Additionally, xsd:type and xml:lang are no longer mere URIs in Ripple: they’re primitive functions. xsd:type gives you the data type of a typed literal, and xml:lang gives you the language tag (if any) of a plain literal.
  • Even more new primitives. This release makes most java.lang.String methods available through a new string: library. I’ve thrown in a few new stack: primitives, too, as well as a services: library with hooks into selected Semantic Web services, including Sindice and Swoogle.
  • owl:sameAs smushing for backwards compatibility. Ripple’s primitive functions are identified with URIs which are as cool as my domain name (and I think that’s cool enough for alpha software). But what happens when primitives change? Older versions of Ripple will not understand programs which reference newer primitives, and that’s just how it goes. However, new versions of Ripple must be able to understand old programs, which is where owl:sameAs links come in. If a primitive described in a newer namespace is linked through owl:sameAs to a primitive in an older namespace, the application understands them to be the same thing. It makes publishing library descriptions that much easier.
  • Extension loader. Previously, if you wanted to add new primitives to Ripple, you had to insert them into Ripple’s own source code and rebuild. Now all you need is a Ripple dependency and a text file called extensions.txt which tells the application what to load when it starts up. To include custom libraries in the query environment, just list them by name.
  • Improved crawling. Ripple is now able to handle multiple HTTP connections in parallel, which means dramatically faster crawling. In keeping with crawler etiquette, Ripple maintains a history of HTTP requests as it goes, taking care not to burden any one host with rapid-fire requests.
  • Miscellaneous: this release has a better ratio of unit tests to core classes than 0.3, as well as an improved launcher script, new command-line options, and a context-preserving TriG cache.
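
To make the owl:sameAs smushing item a little more concrete, here is a rough Python sketch of one way to treat linked URIs as the same primitive, using a union-find structure. It is not taken from Ripple's source: the class name SameAs, the resolve helper, and the example.org URIs are all hypothetical.

```python
class SameAs:
    """Groups URIs connected by owl:sameAs links into equivalence classes (union-find)."""

    def __init__(self):
        self._parent = {}

    def find(self, uri):
        """Return the representative URI of the class that `uri` belongs to."""
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every URI on the walked path straight at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def link(self, a, b):
        """Record an owl:sameAs statement between two URIs."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self._parent[root_a] = root_b


def resolve(uri, same, implementations):
    """Find an implementation registered under `uri` or under any URI linked to it."""
    root = same.find(uri)
    for known, impl in implementations.items():
        if same.find(known) == root:
            return impl
    raise KeyError(uri)


# Hypothetical URIs: an old program refers to OLD_ADD, the new library defines NEW_ADD.
OLD_ADD = "http://example.org/ripple/2007/05/math#add"
NEW_ADD = "http://example.org/ripple/2007/08/math#add"

same = SameAs()
same.link(NEW_ADD, OLD_ADD)
implementations = {NEW_ADD: lambda a, b: a + b}

add = resolve(OLD_ADD, same, implementations)
print(add(2, 3))  # 5
```

The point of the sketch is only that an old URI keeps working once a sameAs link to it has been published alongside the new definition.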

Finally, you might notice a side-project called the “Ripple media extension” alongside Ripple proper on the downloads page. It includes some new functionality which doesn’t necessarily belong in a command-line tool, but which I find interesting:

  • media:play — plays an audio (for now: MIDI) file
  • media:show — displays an image
  • media:speak — speaks a passage of text in the default FreeTTS voice

I expect this to grow into something of an experiment in visual and spoken Semantic Web UI.