Wednesday, 7 January 2009

restish resources ... from the ground up

I've been working on a WSGI web framework called restish. Yes, another one ... sorry! Only, this one's got a serious preference for trying to stay close to HTTP and the way of the web, and therefore encourages REST principles.

restish is really simple and, in my opinion, a pleasure to use. It's also extremely light-weight compared to some other web frameworks, mostly because it doesn't actually attempt to do that much :).

One feature that will hopefully be of interest is that there is no reliance on threads. There's no thread-local use anywhere. In fact, thread locals are banned; I despise the things! That allows a restish app to run happily inside a threaded Paste Deploy server, a Spawning server in greenlet mode (i.e. with threads switched off), or potentially any other web server. Basically, *you* choose how you want to deploy it. If you decide to use a threaded model then fine, but why should the web framework dictate to you from the start?

Anyway enough, let's see some code. I thought I'd try to demonstrate the basic idea of restish resources. (I do intend to move all this to the restish documentation at some point but that will take longer than an informal blog post.)

Oh, all the following bits of code *should* run. If you want to try them out, the 'application' is actually a WSGI application. Running under Spawning is as easy as:


$ spawn module_name.application


The simplest resource imaginable


from restish import app, http

def hello_world(request):
    return http.ok([('Content-Type', 'text/plain')], 'Hello, world!')

application = app.RestishApp(hello_world)


OK, so there's this thing called a RestishApp. It's just a WSGI application that kicks off the request handling process. Nothing too interesting there. When it's created it's passed the root resource for the site.

A resource, at its simplest, is something callable that takes a http.Request instance as its only arg and returns a http.Response instance. You can build a http.Response yourself but the http module provides some response factories to simplify application code and save a bit of typing.
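If you'd like to see the "callable in, response out" idea without restish installed, here's a minimal stdlib-only sketch. Response, ok and make_wsgi_app are hypothetical stand-ins I've made up for illustration, not restish's real classes, and the "request" is simply the WSGI environ dict.

```python
from collections import namedtuple

# Hypothetical stand-in for http.Response: status line, header list, body.
Response = namedtuple('Response', 'status headers body')


def ok(headers, body):
    """Response factory, loosely modelled on http.ok (assumed shape)."""
    return Response('200 OK', headers, body)


def hello_world(request):
    # A resource: takes the request, returns a response.
    return ok([('Content-Type', 'text/plain')], 'Hello, world!')


def make_wsgi_app(root_resource):
    """Wrap a resource callable as a WSGI application."""
    def application(environ, start_response):
        # In this sketch the "request" is just the WSGI environ dict.
        response = root_resource(environ)
        start_response(response.status, response.headers)
        return [response.body.encode('utf-8')]
    return application


application = make_wsgi_app(hello_world)
```

The point is only that the application object does very little: it hands the request to the root resource and turns whatever comes back into a WSGI response.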

The Resource class


from restish import app, http, resource

class HelloWorld(resource.Resource):
    def __call__(self, request):
        return http.ok([('Content-Type', 'text/plain')], 'Hello, world!')

root_resource = HelloWorld()
application = app.RestishApp(root_resource)


You weren't really expecting anything interesting so soon were you? ;-)

Most of the time using a function as a resource is too limiting so restish provides a Resource class. It has some magic abilities as we'll see later but, just like the hello_world(request) function above, it's basically something callable.

Request parameters


from restish import app, http, resource

class Users(resource.Resource):
    def __call__(self, request):
        username = request.GET.get('username') or 'anonymous'
        return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(username,))

root_resource = Users()
application = app.RestishApp(root_resource)


I hope no one's impressed by that code. In fact, I'm not even going to describe it but would like to point out that passing the username as a URL segment is almost certainly a nicer way to do things. So, moving swiftly on ...

Resource children


from restish import app, http, resource

class Users(resource.Resource):

    def __call__(self, request):
        doc = "matt: %s" % (request.path.child('matt'),)
        return http.ok([('Content-Type', 'text/plain')], doc)

    @resource.child()
    def matt(self, request, segments):
        return user

def user(request):
    return http.ok([('Content-Type', 'text/plain')], 'Hello, matt!')

root_resource = Users()
application = app.RestishApp(root_resource)


The Users resource (the root of the site) returns a document that looks like, "matt: /matt", where "/matt" is the URL of the "matt" resource. Notice how the URL for the matt resource is created? 'request.path' is a url.URL instance - a smart string that knows how to parse and manipulate URLs, e.g. by adding a child segment. http.Request has a few URL instance attributes.
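To give a feel for what a "smart string" might look like, here's a rough stdlib-only sketch. The Path class is hypothetical and far simpler than restish's url.URL; the percent-encoding of the new segment is my own assumption about sensible behaviour.

```python
from urllib.parse import quote


class Path(str):
    """Hypothetical smart string for URL paths (not restish's url.URL)."""

    def child(self, segment):
        # Join with exactly one slash, whether or not self ends with one,
        # and percent-encode the new segment so it stays a single segment.
        return Path(self.rstrip('/') + '/' + quote(segment, safe=''))


root = Path('/')
matt = root.child('matt')
```

Because Path subclasses str, the result can be dropped straight into a format string, which is exactly how the Users resource above builds its document.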

The 'matt' method has a @child decorator to expose it as a child resource factory. By default @child() uses the name of the decorated method as the name of the segment it matches so here it will be called to create a resource for the 'matt' child, i.e. the thing at the URL '/matt'.

(You can pass an explicit segment name to @child instead, e.g. @child('matt'), allowing you to call your method whatever you want.)

Statically-named children are useful and quite common but dynamically-named children are more interesting.

Dynamically named resource children


from restish import app, http, resource

USERS = ['alice', 'matt', 'rebecca']

class Users(resource.Resource):

    def __call__(self, request):
        doc = '\n'.join(['%s: %s' % (username, request.path.child(username))
                         for username in USERS])
        return http.ok([('Content-Type', 'text/plain')], doc)

    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return User(username)

class User(resource.Resource):

    def __init__(self, username):
        self.username = username

    def __call__(self, request):
        return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(self.username,))

root_resource = Users()
application = app.RestishApp(root_resource)


We now have a "database" of users. OK, so it's just a list of usernames but you get the idea. The Users resource returns a document containing a list of users, each with their URL.

This time, the @child decorator has been passed a segment match template. '{username}' means match a single URL segment, extract the segment and pass it to the method as the username keyword arg.
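If you're wondering how a template like '{username}' could turn into a keyword arg, here's one way to sketch it with a regular expression. compile_template and match_segment are hypothetical helpers of my own; restish's real matcher may well work differently.

```python
import re


def compile_template(template):
    """Turn a segment template such as '{username}' into a compiled regex."""
    pattern = ''
    pos = 0
    for m in re.finditer(r'\{(\w+)\}', template):
        # Literal text before the placeholder is matched exactly.
        pattern += re.escape(template[pos:m.start()])
        # The placeholder captures one URL segment under its own name.
        pattern += '(?P<%s>[^/]+)' % m.group(1)
        pos = m.end()
    pattern += re.escape(template[pos:])
    return re.compile(pattern)


def match_segment(template, segment):
    """Return a dict of keyword args if the segment matches, else None."""
    m = compile_template(template).fullmatch(segment)
    return m.groupdict() if m else None
```

A match returns the extracted values as a dict, ready to be passed on as **kwargs; a static template such as 'matt' matches with an empty dict, and anything else gives None.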

The child_user method returns a User resource instance, giving it the username the resource represents, or None to signal a 404.

If you think the User class is a bit "heavy" then, no problem, use a partial function instead, or a lambda if you prefer:


import functools

class Users(resource.Resource):
    [...]
    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return functools.partial(user, username)

def user(username, request):
    return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(username,))
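In case functools.partial is unfamiliar, this standalone sketch shows why user takes username as its first argument: the partial binds it up front, leaving a callable that only needs the request. To keep it runnable without restish, this version of user returns a plain string rather than an http.Response.

```python
import functools


def user(username, request):
    # Simplified for the sketch: a plain string instead of http.ok(...).
    return 'Hello, %s!' % (username,)


# Bind the first positional argument; what's left takes just the request.
partial_resource = functools.partial(user, 'matt')

# The lambda equivalent does the same binding by closing over the name.
lambda_resource = lambda request: user('matt', request)
```

Either way, the thing handed back to restish is just another callable that accepts a request.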


Request methods

So far, every resource would respond in exactly the same way for all HTTP methods. It doesn't differentiate between GET, POST, PUT, DELETE, etc. Let's fix that now.


from restish import app, http, resource

USERS = ['alice', 'matt', 'rebecca']

class Users(resource.Resource):

    @resource.GET()
    def text(self, request):
        doc = '\n'.join(['%s: %s' % (username, request.path.child(username))
                         for username in USERS])
        return http.ok([('Content-Type', 'text/plain')], doc)

    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return User(username)

class User(resource.Resource):

    def __init__(self, username):
        self.username = username

    @resource.GET()
    def text(self, request):
        return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(self.username,))

root_resource = Users()
application = app.RestishApp(root_resource)


The only difference here is that we've replaced the resource's __call__ method with a nicely named method decorated with @resource.GET(). Now the resources only respond to a HTTP GET; anything else returns a "405 Method Not Allowed" response.


$ curl -X GET http://localhost:8080/
alice: /alice
matt: /matt
rebecca: /rebecca
$ curl -X POST http://localhost:8080/
405 Method Not Allowed
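
For the curious, method dispatch of this kind can be sketched in a few lines of plain Python. Everything here is hypothetical: handles and MethodResource are names I've invented, the request is a dict with a REQUEST_METHOD key, and responses are (status, body) tuples. It is not how restish's Resource is actually implemented.

```python
def handles(method):
    """Hypothetical decorator factory: tag a method with an HTTP method."""
    def decorator(func):
        func.http_method = method
        return func
    return decorator


class MethodResource(object):
    """Sketch of HTTP method dispatch (not restish's Resource class)."""

    def __call__(self, request):
        method = request['REQUEST_METHOD']
        for name in dir(type(self)):
            handler = getattr(self, name)
            # Bound methods expose attributes set on the underlying function.
            if getattr(handler, 'http_method', None) == method:
                return handler(request)
        # No handler was tagged for this method.
        return ('405 Method Not Allowed', '')


class Users(MethodResource):

    @handles('GET')
    def text(self, request):
        return ('200 OK', 'alice: /alice')
```

The decorator only leaves a marker on the method; the base class's __call__ does the lookup and falls back to a 405 when nothing matches.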


I've actually just sneakily introduced some content negotiation too. Not only does @resource.GET() match the HTTP method but it also matches the request's "Accept" header. However, GET defaults to an "Accept" match of '*/*', i.e. any content type the client asks for.

Content negotiation ... at last

I mentioned above that decorating with @GET also performs '*/*' content negotiation. We can easily configure a resource to handle requests for different content types.


import simplejson
from restish import app, http, resource

USERS = ['alice', 'matt', 'rebecca']

class Users(resource.Resource):

    @resource.GET(accept='text/plain')
    def text(self, request):
        doc = '\n'.join(['%s: %s' % (username, request.path.child(username))
                         for username in USERS])
        return http.ok([], doc)

    @resource.GET(accept='application/json')
    def json(self, request):
        users = [{'username': username, 'url': request.path.child(username)}
                 for username in USERS]
        return http.ok([], simplejson.dumps(users))

    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return User(username)

class User(resource.Resource):

    def __init__(self, username):
        self.username = username

    @resource.GET(accept='text')
    def text(self, request):
        return http.ok([], 'Hello, %s!'%(self.username,))

    @resource.GET(accept='json')
    def json(self, request):
        doc = simplejson.dumps({'username': self.username, 'url': request.path})
        return http.ok([], doc)

root_resource = Users()
application = app.RestishApp(root_resource)


This time we have 'text' and 'json' methods, decorated with @GET(accept='text/plain') and @GET(accept='application/json') respectively. Now we have a resource that will look at the Accept header, find the best matching method and call it. No match results in a "406 Not Acceptable" error.


$ curl -H "Accept: text/plain" http://localhost:8080/
alice: /alice
matt: /matt
rebecca: /rebecca
$ curl -H "Accept: application/json" http://localhost:8080/
[{"username": "alice", "url": "/alice"}, {"username": "matt", "url": "/matt"}, {"username": "rebecca", "url": "/rebecca"}]
$ curl -H "Accept: text/html" http://localhost:8080/
406 Not Acceptable
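
Here's a simplified sketch of the kind of Accept matching that produces those three results. The function names are hypothetical, and it deliberately ignores some of the finer rules of HTTP (for instance, a more specific media range taking precedence over a wildcard with the same q value), so treat it as an illustration rather than restish's algorithm.

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs, best first."""
    entries = []
    for part in header.split(','):
        fields = [f.strip() for f in part.split(';')]
        media_range = fields[0]
        if not media_range:
            continue
        q = 1.0  # the default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith('q='):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        entries.append((media_range, q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def matches(media_range, media_type):
    """True if a range such as 'text/*' or '*/*' covers media_type."""
    if media_range == '*/*':
        return True
    rtype, _, rsub = media_range.partition('/')
    mtype, _, msub = media_type.partition('/')
    return rtype == mtype and rsub in ('*', msub)


def best_match(offered, accept_header):
    """Return the offered type the client prefers, or None (i.e. a 406)."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        for media_type in offered:
            if matches(media_range, media_type):
                return media_type
    return None
```

With offered set to the two types the Users resource handles, 'text/plain' and 'application/json' select themselves and 'text/html' finds nothing, which is where the 406 comes from.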


Note that the resource no longer has to specify Content-Type headers. That's because the Accept matching process knows what it found and fills it in for you ... how kind :). (Don't worry, you can still include the Content-Type in the response headers if you want to handle it yourself.)

Note also that the User resource uses shorthand in the form of @GET(accept='text') and @GET(accept='json'). They're expanded to the full MIME type on your behalf and so work just the same. Frankly, typing 'application/json' is tedious and 'application/xhtml+xml' is perverse ;-).
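
The shorthand expansion could be as simple as a lookup table. The table and the expand function below are my own guess at the idea, built only from the names mentioned in this post; restish's actual table may contain different or additional entries.

```python
# Assumed shorthand table, for illustration only.
SHORTHAND = {
    'text': 'text/plain',
    'json': 'application/json',
    'html': 'text/html',
    'xhtml': 'application/xhtml+xml',
}


def expand(accept):
    """Expand shorthand names; takes a single name or a list of names."""
    names = [accept] if isinstance(accept, str) else list(accept)
    # Anything not in the table is assumed to be a full MIME type already.
    return [SHORTHAND.get(name, name) for name in names]
```

Full MIME types and wildcard patterns pass through untouched, so the long and short spellings can be mixed freely.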


Well, that's all for now, although there are a few other things I wanted to mention. A quick list will have to do:


  • wildcard accept matching, e.g. 'image/*'

  • PUT, POST, DELETE, etc

  • Content-Type header matching (basically the same as Accept matching but for data sent from the client)

  • Handling multiple content types with one method, e.g. @GET(accept=['html', 'xhtml'])

  • @child URL matching in general

  • @child that matches any URL

  • Consuming additional URL segments during traversal



Hope someone finds this post interesting!

Saturday, 3 January 2009

The Royal Institution Christmas Lectures

I really enjoyed watching the Royal Institution Christmas Lectures, presented by Chris Bishop. I don't think I've actually watched them properly since I was a kid!

We sat down as a family every night to watch and it clearly sparked some interest from the kids. We've had the cover off the computer looking at its innards, we've been playing with Phun (the kids remembered it after seeing the multi-touch displays), we played with GNOME's Dasher. Heck, we even touched on public key encryption.

Sure, there were a couple of bits that weren't so good. In particular, Bill Gates was seriously dull and the programme on software, The Ghost in the Machine, was not exactly great (the kids still don't know what I do ;-)) but overall, fantastic.

If only all TV was that interesting!

GET and idempotence

I was reminded today of what seems to be a common misunderstanding of the idempotence requirement of a GET in a REST-ful architecture. Idempotence is about the effect a request has on the server, not about the response. GET doesn't mean the response must be the same each time; only that it must have no side effects, i.e. it should never cause the server's state to change.

For instance, CouchDB includes a resource, /_uuids, that returns a number of server-generated UUIDs. As far as I know, it only exists to support languages without a decent UUID library. It has no effect on the server.

However, CouchDB will only respond when /_uuids is POST'ed to:

$ curl -X GET http://localhost:5984/_uuids?count=2
$ curl -X POST http://localhost:5984/_uuids?count=2
{"uuids":["d267c530c9591eeeed72f589fcef5599","f4c12de5147ec97e7a770440cc7a69f2"]}

One suggestion on the mailing list (although not from one of CouchDB's core developers) for the use of POST is to, "comply with REST as it returns a different output each time".

A GET would be just fine here. In fact, it would be more in keeping with the intended use of the HTTP methods.

It's easy to come up with examples of a REST-ful resource that sends a different response every request, with probably the most obvious being some sort of time server. Consider the following URLs:
  • /time
  • /time/BST
  • /time/EST
  • etc
You would GET the current time from an appropriate resource and I sure hope there will be a different response each time ;-). You could also PUT a time to one of those resources to set the time.
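
As a toy illustration of the time server, here's a sketch in plain Python. TimeResource, its GET and PUT methods and the injectable clock are all hypothetical names of mine (the clock argument exists only so the behaviour can be checked deterministically); there's no HTTP or restish involved.

```python
import time


class TimeResource(object):
    """Hypothetical /time resource, modelled as plain method calls."""

    def __init__(self, clock=time.time):
        self.clock = clock   # injectable so the sketch can be tested
        self.offset = 0.0    # server state: correction applied to the clock

    def GET(self):
        # Reads state but never changes it, yet every response differs.
        return self.clock() + self.offset

    def PUT(self, new_time):
        # Changes server state, so this must not be done with a GET.
        self.offset = new_time - self.clock()
```

Calling GET repeatedly returns a different value each time while leaving offset untouched, which is the whole point: a changing response is not a side effect.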