Wednesday, 7 January 2009

restish resources ... from the ground up

I've been working on a WSGI web framework called restish. Yes, another one ... sorry! Only, this one's got a serious preference for trying to stay close to HTTP and the way of the web, and therefore encourages REST principles.

restish is really simple and, in my opinion, a pleasure to use. It's also extremely light-weight compared to some other web frameworks, mostly because it doesn't actually attempt to do that much :).

One feature that will hopefully be of interest is that there is no reliance on threads. There's no thread-local use anywhere. In fact, thread locals are banned; I despise the things! That allows a restish app to run happily inside a threaded Paste Deploy server, a Spawning server in greenlet mode (i.e. with threads switched off), or potentially any other web server. Basically, *you* choose how you want to deploy it. If you decide to use a threaded model then fine, but why should the web framework dictate to you from the start?

Anyway, enough; let's see some code. I thought I'd try to demonstrate the basic idea of restish resources. (I do intend to move all this to the restish documentation at some point but that will take longer than an informal blog post.)

Oh, all the following bits of code *should* run. If you want to try them out, the 'application' is actually a WSGI application. Running under Spawning is as easy as:


$ spawn module_name.application


The simplest resource imaginable


from restish import app, http

def hello_world(request):
    return http.ok([('Content-Type', 'text/plain')], 'Hello, world!')

application = app.RestishApp(hello_world)


OK, so there's this thing called a RestishApp. It's just a WSGI application that kicks off the request handling process. Nothing too interesting there. When it's created it's passed the root resource for the site.

A resource, at its simplest, is something callable that takes a http.Request instance as its only arg and returns a http.Response instance. You can build a http.Response yourself but the http module provides some response factories to simplify application code and save a bit of typing.
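To make the "callable in, response out" idea concrete without depending on restish itself, here is a minimal stdlib-only sketch. `Response`, `ok`, `TinyApp` and `call_app` are hypothetical stand-ins invented for this illustration, not restish's real classes, and the raw WSGI environ dict stands in for the http.Request instance.

```python
from wsgiref.util import setup_testing_defaults


class Response(object):
    """Hypothetical stand-in for a response object: status, headers, body."""

    def __init__(self, status, headers, body):
        self.status = status
        self.headers = headers
        self.body = body


def ok(headers, body):
    """Hypothetical response factory for a 200 response."""
    return Response('200 OK', headers, body)


def hello_world(request):
    # A resource: any callable that takes the request and returns a response.
    return ok([('Content-Type', 'text/plain')], 'Hello, world!')


class TinyApp(object):
    """A WSGI application that hands each request to the root resource."""

    def __init__(self, root_resource):
        self.root_resource = root_resource

    def __call__(self, environ, start_response):
        # Assumption: the raw environ dict plays the part of the request.
        response = self.root_resource(environ)
        start_response(response.status, response.headers)
        return [response.body.encode('utf-8')]


def call_app(app):
    """Call a WSGI app once with a default test environ."""
    environ = {}
    setup_testing_defaults(environ)
    seen = {}

    def start_response(status, headers):
        seen['status'] = status
        seen['headers'] = headers

    body = b''.join(app(environ, start_response))
    return seen['status'], seen['headers'], body


application = TinyApp(hello_world)
status, headers, body = call_app(application)
```

The point of the sketch is only that the application object knows nothing about what the resource does; it calls it and turns the result into a WSGI response.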

The Resource class


from restish import app, http, resource

class HelloWorld(resource.Resource):
    def __call__(self, request):
        return http.ok([('Content-Type', 'text/plain')], 'Hello, world!')

root_resource = HelloWorld()
application = app.RestishApp(root_resource)


You weren't really expecting anything interesting so soon, were you? ;-)

Most of the time using a function as a resource is too limiting so restish provides a Resource class. It has some magic abilities as we'll see later but, just like the hello_world(request) function above, it's basically something callable.

Request parameters


from restish import app, http, resource

class Users(resource.Resource):
    def __call__(self, request):
        username = request.GET.get('username') or 'anonymous'
        return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(username,))

root_resource = Users()
application = app.RestishApp(root_resource)


I hope no one's impressed by that code. In fact, I'm not even going to describe it but would like to point out that passing the username as a URL segment is almost certainly a nicer way to do things. So, moving swiftly on ...

Resource children


from restish import app, http, resource

class Users(resource.Resource):

    def __call__(self, request):
        doc = "matt: %s" % (request.path.child('matt'),)
        return http.ok([('Content-Type', 'text/plain')], doc)

    @resource.child()
    def matt(self, request, segments):
        return user

def user(request):
    return http.ok([('Content-Type', 'text/plain')], 'Hello, matt!')

root_resource = Users()
application = app.RestishApp(root_resource)


The Users resource (the root of the site) returns a document that looks like, "matt: /matt", where "/matt" is the URL of the "matt" resource. Notice how the URL for the matt resource is created? 'request.path' is a url.URL instance - a smart string that knows how to parse and manipulate URLs, e.g. by adding a child segment. http.Request has a few URL instance attributes.
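As a rough illustration of the "smart string" idea, the sketch below builds child URLs from a str subclass. `Path` is a hypothetical stand-in written for this example (Python 3, stdlib only); it is not restish's url.URL and implements nothing beyond `child`.

```python
from urllib.parse import quote


class Path(str):
    """Hypothetical stand-in for a URL-aware string (not restish's url.URL)."""

    def child(self, segment):
        # Drop any trailing slash, then append the percent-encoded segment.
        return Path(self.rstrip('/') + '/' + quote(segment, safe=''))


root = Path('/')
matt = root.child('matt')
nested = Path('/users/').child('matt goodall')
doc = "matt: %s" % (matt,)
```

Because the result is still a string, it can be dropped straight into a format expression, which is what the Users resource above relies on.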

The 'matt' method has a @child decorator to expose it as a child resource factory. By default @child() uses the name of the decorated method as the name of the segment it matches so here it will be called to create a resource for the 'matt' child, i.e. the thing at the URL '/matt'.

(You can pass an explicit segment name to @child instead, e.g. @child('matt'), allowing you to call your method whatever you want.)

Statically-named children are useful and quite common but dynamically-named children are more interesting.

Dynamically named resource children


from restish import app, http, resource

USERS = ['alice', 'matt', 'rebecca']

class Users(resource.Resource):

    def __call__(self, request):
        doc = '\n'.join(['%s: %s' % (username, request.path.child(username)) \
                for username in USERS])
        return http.ok([('Content-Type', 'text/plain')], doc)

    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return User(username)

class User(resource.Resource):

    def __init__(self, username):
        self.username = username

    def __call__(self, request):
        return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(self.username,))

root_resource = Users()
application = app.RestishApp(root_resource)


We now have a "database" of users. OK, so it's just a list of usernames but you get the idea. The Users resource returns a document containing a list of users, each with their URL.

This time, the @child decorator has been passed a segment match template. '{username}' means match a single URL segment, extract the segment and pass it to the method as the username keyword arg.

The child_user method returns a User resource instance, giving it the username the resource represents, or None to signal a 404.
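The matching and the "None means 404" convention can be sketched in a few lines. This is an assumption-laden toy, not how restish implements @child: `match_segment` and `locate` are hypothetical helpers, only a whole-segment '{name}' placeholder is supported, and a plain string stands in for the User resource.

```python
import re

USERS = ['alice', 'matt', 'rebecca']


def match_segment(template, segment):
    """Return extracted keyword args on a match, or None otherwise."""
    placeholder = re.match(r'^\{(\w+)\}$', template)
    if placeholder:
        # '{name}' matches any single segment and captures it under 'name'.
        return {placeholder.group(1): segment}
    # Anything else is treated as a literal segment name.
    return {} if template == segment else None


def child_user(username):
    # Returning None (here, by falling off the end) means "no such child".
    if username in USERS:
        return 'resource for %s' % (username,)


def locate(path):
    """Resolve a one-segment path such as '/matt' to a status and a result."""
    segment = path.strip('/')
    kwargs = match_segment('{username}', segment)
    resource = child_user(**kwargs) if kwargs is not None else None
    if resource is None:
        return '404 Not Found', None
    return '200 OK', resource
```

For example, `locate('/matt')` finds a resource while `locate('/bob')` falls through to the 404 branch, because 'bob' matches the template but is not in USERS.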

If you think the User class is a bit "heavy" then, no problem, use a partial function instead, or a lambda if you prefer:


import functools

class Users(resource.Resource):
    [...]
    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return functools.partial(user, username)

def user(username, request):
    return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(username,))
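For anyone who hasn't used functools.partial, this stdlib-only sketch shows why it works as a resource: binding the username up front leaves a callable that takes only the request. The `user` function here returns a plain string rather than an http.Response, and `fake_request` is a placeholder, both purely to keep the example self-contained.

```python
import functools


def user(username, request):
    # Plain string instead of an http.Response, to keep the sketch stdlib-only.
    return 'Hello, %s!' % (username,)


# Both objects are callables that take only the request.
via_partial = functools.partial(user, 'matt')
via_lambda = lambda request: user('matt', request)

fake_request = object()  # placeholder; the sketch never inspects it
greeting = via_partial(fake_request)
```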


Request methods

So far, every resource would respond in exactly the same way for all HTTP methods. It doesn't differentiate between GET, POST, PUT, DELETE, etc. Let's fix that now.


from restish import app, http, resource

USERS = ['alice', 'matt', 'rebecca']

class Users(resource.Resource):

    @resource.GET()
    def text(self, request):
        doc = '\n'.join(['%s: %s' % (username, request.path.child(username)) \
                for username in USERS])
        return http.ok([('Content-Type', 'text/plain')], doc)

    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return User(username)

class User(resource.Resource):

    def __init__(self, username):
        self.username = username

    @resource.GET()
    def text(self, request):
        return http.ok([('Content-Type', 'text/plain')], 'Hello, %s!'%(self.username,))

root_resource = Users()
application = app.RestishApp(root_resource)


The only difference here is that we've replaced the resource's __call__ method with a nicely named method decorated with @resource.GET(). Now the resources only respond to a HTTP GET; anything else returns a "405 Method Not Allowed" response.


$ curl -X GET http://localhost:8080/
alice: /alice
matt: /matt
rebecca: /rebecca
$ curl -X POST http://localhost:8080/
405 Method Not Allowed
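One way to picture the dispatch is a decorator that tags methods and a `__call__` that looks the tag up. The sketch below is a guess at the general shape, not restish's implementation: `handles` and this `Resource` class are hypothetical, and the "request" is reduced to the HTTP method name.

```python
def handles(http_method):
    """Hypothetical decorator: tag a method with the HTTP method it serves."""
    def decorator(func):
        func.http_method = http_method
        return func
    return decorator


class Resource(object):
    """Hypothetical base class that dispatches on the request method."""

    def __call__(self, http_method):
        for name in dir(self):
            attr = getattr(self, name)
            if getattr(attr, 'http_method', None) == http_method:
                return '200 OK', attr()
        # No tagged method serves this HTTP method.
        return '405 Method Not Allowed', None


USERS = ['alice', 'matt', 'rebecca']


class Users(Resource):

    @handles('GET')
    def text(self):
        return '\n'.join('%s: /%s' % (name, name) for name in USERS)


users = Users()
get_result = users('GET')
post_result = users('POST')
```

The subclass never writes any 405 handling itself; an untagged method simply falls through to the default.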


I've actually just sneakily introduced some content negotiation too. Not only does @resource.GET() match the HTTP method but it also matches the request's "Accept" header. However, GET defaults to an "Accept" match of '*/*', i.e. any content type the client asks for.

Content negotiation ... at last

I mentioned above that decorating with @GET also performs '*/*' content negotiation. We can easily configure a resource to handle requests for different content types.


import simplejson
from restish import app, http, resource

USERS = ['alice', 'matt', 'rebecca']

class Users(resource.Resource):

    @resource.GET(accept='text/plain')
    def text(self, request):
        doc = '\n'.join(['%s: %s' % (username, request.path.child(username)) \
                for username in USERS])
        return http.ok([], doc)

    @resource.GET(accept='application/json')
    def json(self, request):
        users = [{'username': username, 'url': request.path.child(username)} \
                for username in USERS]
        return http.ok([], simplejson.dumps(users))

    @resource.child('{username}')
    def child_user(self, request, segments, username):
        if username in USERS:
            return User(username)

class User(resource.Resource):

    def __init__(self, username):
        self.username = username

    @resource.GET(accept='text')
    def text(self, request):
        return http.ok([], 'Hello, %s!'%(self.username,))

    @resource.GET(accept='json')
    def json(self, request):
        doc = simplejson.dumps({'username': self.username, 'url': request.path})
        return http.ok([], doc)

root_resource = Users()
application = app.RestishApp(root_resource)


This time we have 'text' and 'json' methods, decorated with @GET(accept='text/plain') and @GET(accept='application/json') respectively. Now we have a resource that will look at the Accept header, find the best matching method and call it. No match results in a "406 Not Acceptable" error.


$ curl -H "Accept: text/plain" http://localhost:8080/
alice: /alice
matt: /matt
rebecca: /rebecca
$ curl -H "Accept: application/json" http://localhost:8080/
[{"username": "alice", "url": "/alice"}, {"username": "matt", "url": "/matt"}, {"username": "rebecca", "url": "/rebecca"}]
$ curl -H "Accept: text/html" http://localhost:8080/
406 Not Acceptable


Note that the resource no longer has to specify Content-Type headers. That's because the Accept matching process knows what it found and fills it in for you ... how kind :). (Don't worry, you can still include the Content-Type in the response headers if you want to handle it yourself.)

Note also that the User resource uses shorthand in the form of @GET(accept='text') and @GET(accept='json'). They're expanded to the full MIME type on your behalf and so work just the same. Frankly, typing 'application/json' is tedious and 'application/xhtml+xml' is perverse ;-).
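For the curious, here is a simplified sketch of Accept matching with shorthand expansion. It is not restish's algorithm: `SHORTHAND`, `parse_accept`, `matches` and `choose` are hypothetical names, the shorthand table is an assumption, and the matching only orders by q-value, ignoring the finer precedence rules of the HTTP spec.

```python
# Assumed shorthand table, for illustration only.
SHORTHAND = {'text': 'text/plain', 'json': 'application/json',
             'html': 'text/html'}


def parse_accept(header):
    """Return [(media_range, q)] ordered by descending q-value."""
    ranges = []
    for part in header.split(','):
        fields = [field.strip() for field in part.split(';')]
        if not fields[0]:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith('q='):
                q = float(param[2:])
        ranges.append((fields[0], q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range, offered):
    """True if a range such as 'image/*' or '*/*' covers the offered type."""
    if media_range == '*/*':
        return True
    range_type, _, range_sub = media_range.partition('/')
    offered_type, _, offered_sub = offered.partition('/')
    return range_type == offered_type and range_sub in ('*', offered_sub)


def choose(accept_header, offered):
    """Pick the offered type the client prefers, or None (think 406)."""
    offered = [SHORTHAND.get(name, name) for name in offered]
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for candidate in offered:
            if matches(media_range, candidate):
                return candidate
    return None
```

The returned type is also what would be filled in as the Content-Type, which is why the resource methods above can pass an empty header list.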


Well, that's all for now, although there are a few other things I wanted to mention. A quick list will have to do:


  • wildcard accept matching, e.g. 'image/*'

  • PUT, POST, DELETE, etc

  • Content-Type header matching (basically the same as Accept matching but for data sent from the client)

  • Handling multiple content types with one method, e.g. @GET(accept=['html', 'xhtml'])

  • @child URL matching in general

  • @child that matches any URL

  • Consuming additional URL segments during traversal



Hope someone finds this post interesting!

2 comments:

Anonymous said...

So, what is the problem with thread locals? I'm looking at different frameworks right now trying to pick one out. Are thread locals something I should be concerned about? (I don't have much experience with thread locals or their downsides.)

Matt Goodall said...

@Anonymous thread locals may or may not be a problem for you. It depends how you want to write and run your web application.

If you use thread locals you're limiting yourself to running in a web server that is threaded. That quite likely excludes event-driven servers, e.g. Twisted, eventlet/spawning, Tornado, and all the generator-based servers, that can typically handle huge numbers of concurrent connections.

If it makes sense for your application to use thread locals then fine, no problem. But why be forced down that route by a library or framework unless truly necessary?

I'm not saying that thread locals have no purpose but I believe they are often abused:

* to provide pseudo-globals. WSGI already has a great request-scoped place for this - the environ.

* for the "convenience" of not having to pass args to functions. Passing args around is a good thing - it's explicit and makes code more readable, testable and predictable.

restish doesn't use thread locals (it works fine in a threaded or event-driven server) and explicitly passes the request/environ to anything that needs it. That's the way I like it ;-).
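A small stdlib-only sketch of the problem, under the assumption that an event-driven server interleaves several requests on one thread. Generators stand in for the server's cooperative tasks; `handle_implicit`, `handle_explicit` and `run_interleaved` are hypothetical names made up for the demonstration.

```python
import threading

_local = threading.local()


def handle_implicit(environ):
    # Stash the request in a thread local, as some frameworks do.
    _local.environ = environ
    yield  # pretend to wait on I/O; another request runs meanwhile
    yield _local.environ['REMOTE_USER']


def handle_explicit(environ):
    yield  # same pause, but the environ travels with the handler
    yield environ['REMOTE_USER']


def run_interleaved(handler):
    """Start two requests on one thread, pause both, then finish both."""
    first = handler({'REMOTE_USER': 'alice'})
    second = handler({'REMOTE_USER': 'matt'})
    next(first)
    next(second)
    return next(first), next(second)


implicit_users = run_interleaved(handle_implicit)
explicit_users = run_interleaved(handle_explicit)
```

With the thread local, the second request overwrites the first one's environ while it is paused, so both handlers report the same user; passing the environ explicitly keeps each request's data separate.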