WSGIDispatcher

Joe Gregorio

Introducing WSGIDispatcher [code][docs][tests]. It's just like Luke Arno's Selector, except for the following:

  • The license is MIT.
  • It does not use setuptools.
  • It has no external dependencies.
  • It has unit tests.
  • Regular expressions are compiled lazily.
  • Non-regex path expressions are treated as string matches, not regular expressions.
  • You can mix and match templates and regular expressions in the same instance of Dispatcher().
  • Hooking up applications that handle all methods doesn't require _ANY_.

The lazy compilation of regular expressions is important when running as CGI; you don't want to have each template compiled until you know all the previous template rules failed to match.
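As a rough illustration of the idea (a minimal sketch, not the actual WSGIDispatcher code), lazy compilation with caching might look like the following. The class name `LazyRoutes`, its methods, and the `compile_count` counter are all hypothetical and exist only for this example.

```python
import re


class LazyRoutes:
    """Hypothetical route table; NOT the WSGIDispatcher API."""

    def __init__(self):
        # Each entry: [pattern, is_regex, compiled-or-None, app]
        self._routes = []
        # Counter kept only to make the laziness visible in this sketch.
        self.compile_count = 0

    def add(self, pattern, app, is_regex=False):
        self._routes.append([pattern, is_regex, None, app])

    def match(self, path):
        for route in self._routes:
            pattern, is_regex, compiled, app = route
            if not is_regex:
                # Plain paths are compared as strings; no regex involved.
                if path == pattern:
                    return app, {}
                continue
            if compiled is None:
                # Compile only when this rule is actually reached,
                # then cache the result for later requests.
                compiled = route[2] = re.compile(pattern)
                self.compile_count += 1
            m = compiled.match(path)
            if m:
                return app, m.groupdict()
        return None, {}


# Apps are stand-in strings here; real entries would be WSGI callables.
routes = LazyRoutes()
routes.add("/", "home")
routes.add(r"^/index/(?P<name>\w+)$", "index", is_regex=True)
routes.add(r"^/archive/(?P<year>\d{4})$", "archive", is_regex=True)
```

Under CGI every request starts a fresh process, so a request for `/index/Joe` pays for compiling only the one pattern it reaches. In a long-running process the cached pattern objects are reused, so each pattern is compiled at most once.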

For now the code sits in the 1812 project. I haven't decided if it's enough code to warrant being in a standalone project or if it can just live as a part of 1812.

I've mentioned previously that I'm not thrilled with Python documentation options, so for this module I'm experimenting with docutils. All of the documentation is in the module docstrings as reStructuredText, which I then extract and convert into HTML using the docutils library.

Regarding the documentation problem, have you considered the use of doctests? Make the documentation do double-duty as functional tests in addition to the unit tests you have. If you do explore this option, you may want to also use zope.testbrowser for testing the web based functionality.

BTW, note that in your current docs, you repeat the URL 'http://localhost:8000/index/Joe' twice, with two different outputs ('Hello Joe' vs. 'Hi there'), which is exactly the kind of documentation error doctests are good at catching.
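To make that concrete, here is a small sketch using the standard library `doctest` module. The `greet` function and the documentation snippet are made up for illustration; they are not taken from the WSGIDispatcher docs.

```python
import doctest


def greet(path):
    """Return the greeting for a request path (hypothetical handler).

    >>> greet('/index/Joe')
    'Hello Joe'
    """
    name = path.rsplit('/', 1)[-1]
    return 'Hello %s' % name


# A documentation snippet that repeats the same call with two different
# outputs, like the error described above; the second example must fail.
INCONSISTENT_DOCS = """
>>> greet('/index/Joe')
'Hello Joe'
>>> greet('/index/Joe')
'Hi there'
"""


def count_failures(text):
    """Run the examples in text as doctests; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, {'greet': greet}, 'docs', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter here.
    results = runner.run(test, out=lambda s: None)
    return results.failed, results.attempted
```

Running `count_failures(INCONSISTENT_DOCS)` reports one failed example out of two attempted, which is the kind of signal that would have flagged the duplicated URL.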

Posted by Michael R. Bernstein on 2007-04-24

Ha, you are just reluctant to separate it into its own package because you aren't using the handy things Setuptools gives you to manage the packages!

Posted by Ian Bicking on 2007-04-24

Michael,

I fixed the bug in the docs, thanks! I will look at doctests.

Posted by Joe on 2007-04-24

I am not really sure why you list "does not use setuptools" as though it is a feature. While I understand setuptools can make some aspects difficult regarding packaging modules for distributions, overall, using setuptools has been extremely helpful in my own experience. Can you explain?

Posted by Eric on 2007-04-24

Eric,

Setuptools is so focused on helping the developer it actually ends up being hostile to the consumer of any libraries that use it.

For example:


joe@joe-laptop:~$ wget http://cheeseshop.python.org/packages/source/P/Pylons/Pylons-0.9.2.tar.gz
joe@joe-laptop:~$ tar -xzf Pylons-0.9.2.tar.gz 
joe@joe-laptop:~$ cd Pylons-0.9.2
/home/joe/Pylons-0.9.2
joe@joe-laptop:~/Pylons-0.9.2$ python setup.py install
The required version of setuptools (>=0.6c2) is not available, and
can't be installed while this script is running. Please install
 a more recent version first.

(Currently using setuptools 0.6c1 (/usr/lib/python2.4/site-packages/setuptools-0.6c1-py2.4.egg))
joe@joe-laptop:~/Pylons-0.9.2$ 

Now let's say you were actually lucky enough to download a package that used a version of setuptools that was actually present, or one that used a version of setuptools that was actually there when it went to download the file from the cheeseshop. Your problems, as a user, do not end there. For example:

joe@joe-laptop:~$ wget http://cheeseshop.python.org/packages/source/w/wsgiref/wsgiref-0.1.2.zip
joe@joe-laptop:~$ unzip wsgiref-0.1.2.zip 
joe@joe-laptop:~$ cd wsgiref-0.1.2
/home/joe/wsgiref-0.1.2
joe@joe-laptop:~/wsgiref-0.1.2$ sudo python setup.py install

joe@joe-laptop:~/wsgiref-0.1.2$ cd ..
joe@joe-laptop:~$ python
Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02) 
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import wsgiref
>>> help(wsgiref)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/site-packages/setuptools-0.6c1-py2.4.egg/site.py", line 339, in __call__
    
  File "/usr/lib/python2.4/pydoc.py", line 1656, in __call__
    self.help(request)
  File "/usr/lib/python2.4/pydoc.py", line 1700, in help
    else: doc(request, 'Help on %s:')
  File "/usr/lib/python2.4/pydoc.py", line 1483, in doc
    pager(title % desc + '\n\n' + text.document(object, name))
  File "/usr/lib/python2.4/pydoc.py", line 303, in document
    if inspect.ismodule(object): return self.docmodule(*args)
  File "/usr/lib/python2.4/pydoc.py", line 1060, in docmodule
    for file in os.listdir(object.__path__[0]):
OSError: [Errno 20] Not a directory: '/usr/lib/python2.4/site-packages/wsgiref-0.1.2-py2.4.egg/wsgiref'
>>> 

Oops.

And don't even bother trying to read the source, because it's locked up in an egg.

And shall we mention what setuptools is doing behind the scenes? See Jacob's post on that.

So, as you can see, 'not using setuptools' is indeed a feature, for the users of my code.

Posted by Joe on 2007-04-24

Is there a reason nobody's taking HTTP_HOST into account in doing routing? It's kinda nice to use host names as params...

Posted by Bill Seitz on 2007-04-24

Hi, great news, but one thing I found disturbing with Selector was (at least a few months ago) that it wasn't able to dispatch to static files. Maybe I'm wrong in my reasoning (design), but I think it would be interesting to have this functionality in the same module. What are your thoughts on this?

Posted by sébastien on 2007-04-24

Thanks for the explanation. I must say that I have not run into these issues through using easy_install, but I could see how it would be a problem. With that said, I would prefer to use easy_install in the same way I prefer apt over working with RPMs. Like everything in life there is a healthy balance that should be strived for.

Posted by Eric on 2007-04-24

Bill,

Presumably because that type of routing is best left to routers?

Posted by Joe on 2007-04-25

The documentation, when discussing range specifiers, uses the word 'color' where I think the word 'colon' is meant:

"The range specifier follows a color in the template name"

Regarding setuptools: does this mean I cannot get this module via easy_install? If so, that's too bad. Currently I use Selector, but have to hand-install Collection.py, which is a pain. Anyway, I think this is a great combination of functionality that previously required two different packages. Thanks!

Posted by Brad on 2007-04-25

Sorry, I'll try to clarify the use-case. If you're hosting multiple "spaces" that share the same code base, you might like to use different host names, or even domain names, per user. I suppose if your WSGI server is proxied behind Apache you could have rules in there to map http://webseitz.fluxent.com/wiki/FrontPage to http://fluxent.com/webseitz/wiki/FrontPage or something. Is that an approach you'd recommend? I don't see using a *router* for something like that.

Posted by Bill Seitz on 2007-04-25

Brad,

Fixed the typo, thanks!

Posted by Joe on 2007-04-25

Bill,

Yeah, if you're on the same machine it's a matter of how your web server is configured. If you're going for scale then you will distribute those names among a bunch of servers via DNS, which is what I meant when I said router.

Posted by Joe on 2007-04-25

I think it’s quite premature to dismiss host-based dispatching off hand like that. There are many good reasons to want to do it.

One is speed: browsers are generally configured to be very conservative about the number of concurrent connections to the same host; if you serve your images, stylesheets and script sources from multiple different hostnames, you can get them much faster to users with browsers in the default configuration (ie. 98% of the population).

Another is security: Javascript loaded from jcg.users.example.org won’t be able to access cookies set by headers or scripts from ap.users.example.org. There are much lower barriers between the cookies on pages example.org/users/jcg and example.org/users/ap.

Feeds are another use case: since the number of bots which don’t know a 410 from a hole in the ground remains greater than the number of those who do, it is desirable to serve feeds from single-purpose hosts such as feed.weblog.example.org rather than example.org/weblog/feed/, because then you can throw out the DNS entry when you pull the rug from under the feed, causing all future requests to die instantly long before they reach your webserver.

Etc etc etc…

Sure, some of these might involve distributing the responsibilities to several machines and involving routers. However, that is usually something that takes place quite far along into the lifetime of the app; prior to very high loads that might make such separation necessary, these requests will realistically all end up in the same process. For that process, the hostname is really nothing more than another part of the path, only written backwards and with funny directory separators.
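A minimal sketch of that view, assuming a plain WSGI stack: fold the host name into the path before any path-based dispatcher sees the request. The names `host_as_path`, `HostPrefixMiddleware`, and `echo_path` are hypothetical and not part of Selector or WSGIDispatcher.

```python
def host_as_path(environ, base_domain):
    """Return the host labels below base_domain as a reversed path prefix.

    'jcg.users.example.org' with base 'example.org' gives '/users/jcg'.
    """
    # Drop any port and normalise case; IPv6 literals are not handled.
    host = environ.get('HTTP_HOST', '').split(':', 1)[0].lower()
    suffix = '.' + base_domain
    if not host.endswith(suffix):
        return ''
    labels = host[:-len(suffix)].split('.')
    return '/' + '/'.join(reversed(labels))


class HostPrefixMiddleware:
    """Hypothetical WSGI middleware: prepend the host prefix to PATH_INFO."""

    def __init__(self, app, base_domain):
        self.app = app
        self.base_domain = base_domain

    def __call__(self, environ, start_response):
        prefix = host_as_path(environ, self.base_domain)
        # Copy so the caller's environ is left untouched.
        environ = dict(environ)
        environ['PATH_INFO'] = prefix + environ.get('PATH_INFO', '')
        return self.app(environ, start_response)


def echo_path(environ, start_response):
    """Tiny demo app that returns the path it was dispatched with."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['PATH_INFO'].encode('utf-8')]
```

Wrapped as `HostPrefixMiddleware(echo_path, 'fluxent.com')`, a request for `/wiki/FrontPage` on host `webseitz.fluxent.com` reaches the inner app as `/webseitz/wiki/FrontPage`, so an ordinary path dispatcher can route on it with no changes.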

Posted by Aristotle Pagaltzis on 2007-04-26

Also, if you're adding new "spaces" all the time, does mod_proxy do dynamic mapping from rules or regexp, or do you have to manually add each mapping (and restart Apache)?

Posted by Bill Seitz on 2007-04-27

Hi All,

I want to address a few things above. (Disclaimer: I am the author of Selector.)

1. Is Setuptools a Feature or a Bug?

This totally depends on your target users. It is absurd to be absolute about it. If you are mainly worried about ~$5/month hosting environments with no rights, it sucks. If you are focused on other environments (like the ones that more serious developers usually have) it is a huge help. For me it has been an extremely valuable set of tools.

Joe, I think the way you portray your setuptools difficulties is a little disingenuous. Anyone who has read the APP list might suspect that your real problem with it is that it helps the rest of us at the expense of shared hosting users. That is a valid point, but does not justify your dramatic reenactments of setuptools errors.

I think that supporting setuptools is a huge help for some environments and a huge hindrance to others. Let's not limit ourselves to one-size-fits-all solutions.

2. Selector and Host-based Dispatch.

There is more than one type of dispatcher in the Selector distribution. It is very easy to do host-based dispatch and the right place to do it depends on the app. A router is not always the right place, even if saying so makes for an easy, glib blow-off. Please email me or the WSGI-Components list if you want to talk about it more.

3. Selector and Static Content.

I have another module called Static that is easy to hook to Selector.

http://lukearno.com/projects/static/

If you are looking for a light framework that uses Selector, you can use SimpleWeb, Ro Baccia (though, I suppose that will be changing) and I just released one myself, called Rest in Python:

http://lukearno.com/projects/rip/

4. MIT License

I hear a lot of FUD about the LGPL but that is all it is. I hope that everyone understands that the LGPL is not viral for Python:

http://www.gnu.org/licenses/lgpl-java.html

5. Selector and Unit Tests

Joe, you have asked me about unit tests a couple of times and I was sort of hoping that you would contribute. They are needed. Actually, someone was working on some but they had a baby and got a little distracted. Anybody want to write some unit tests? :) If not, it is on my to-do list.

6. Lazy Regex Compilation

This is _not_ a feature, unless you are running CGI apps... say on a cheap shared hosting account with no rights. It means that all the patterns up to the point where a match is made are compiled to handle each request instead of all of them at start-up. For my apps, I usually run an SCGI (flup) server for months at a time and like my regexes to be compiled once at start-up and not at all for each request. Again, this is all about holding us all to the lowest common denominator of the ~$5/month shared hosting with no rights and CGI as your only option for Python.

I think it is good to have alternatives that are tailored to different types of users, but let's acknowledge that and not act as if our favorite way to do things is the One Right Way. I am glad that there will be another alternative for low-end shared hosting deployment.

Cheers,
- Luke
http://lukearno.com/

Posted by Luke Arno on 2007-04-27

Luke,

"It means that all the patterns up to the point where a match is made are compiled to handle each request..."

Looking at the code you'll see that the compiled regexes are cached, and they will only be compiled once each in a long-running process. Also, the problems I had were not on a shared host; those sessions were captured on my laptop. There are plenty of other technical points that could be made, other points that could be responded to, but all of that is beside the point. The real point:

"If you are mainly worried about ~$5/month hosting environments with no rights, it sucks. If you are focused on other environments (like the ones that more serious developers usually have) it is a huge help."

Thank you for so perfectly exemplifying the setuptools attitude. An attitude that is baked into setuptools, and I believe, the same attitude that's the root cause of why setuptools got rejected from inclusion in Python 2.5. Until there is a version of setuptools that works for everyone, including "the lowest common denominator", and not just "serious developers", then no one should be advocating its use.

Posted by Joe on 2007-04-27

Joe,

If you have an idea of how to provide some of the key features that setuptools provides (easy_install, dependency handling, resource management, etc...), which are far too useful for me to justify passing up, in a way that will work for cheap shared hosting too, please tell us about it. If you can't, then we ought to blaze forward. By making the experience of using Python as compelling as possible, I think we might eventually push shared hosting providers to make arrangements to provide that experience. If we just try to wait it out, the problems involved will never be solved and Python will be the next PHP.

Setuptools is far from perfect, but we need to make it better or replace it with something better. Just throwing it away is not constructive. Pretending that the problems that setuptools solves are not there is not a solution. If it doesn't work for you, don't use it, but try to understand the priorities of those for whom it does. For me and my work setuptools is a huge win. I, like many developers these days, generally write software that provides a service and the "consumer" just installs a standard client (a browser). My dream is not to have my blogging program installed in the cgi-bins of tens of thousands of shared hosting accounts. If that is your goal, fine. Just don't expect the rest of us to cripple our software for your goals. If I can also support the shared hosting case, great. First, I have to support the environment that I and my target audience (developers who provide _services_) work and play in.

I hope that setuptools will become more user friendly and easier to provide in a shared hosting setting. I hope that at the same time, as setuptools matures and more people use it, that shared hosting providers will feel a greater demand for it and make the effort to provide it. (This could be setuptools or something like it. I don't care about that.) If these two currents do not converge, we may never escape the stone age of distribution and deployment. ...Unless, of course, you have that all-at-once, should-have-been-obvious-all-along idea about how to meet everyone's needs, which would be great. I am all ears. ;)

Cheers,
- Luke

Posted by Luke Arno on 2007-04-28

Luke,

I told you:

... the problems I had were not on a shared host, those sessions were captured on my laptop.

You responded:

If you have an idea of how to provide some of the key features that setuptools provides ... in a way that will work for cheap shared hosting too, please tell us about it.

Then added:

I hope that setuptools will become more user friendly and easier to provide in a shared hosting setting.

And later declared:

I am all ears.

I would say the evidence was to the contrary.

Posted by Joe on 2007-04-28

Joe,

You said that the real point was that, [u]ntil there is a version of setuptools that works for everyone, that no one should be advocating its use. If you were not talking about shared hosting, at least primarily, I don't know what your real point was. Maybe just you and your laptop?

I already stated that I found your dramatic reenactment disingenuous and I am still not buying it. Your mis-configured laptop proves nothing. It takes a few seconds to correct your problem by getting a more recent version, as the error suggests. You have been waving these error messages or ones like them around for months (at least). I was ignoring your sales demonstration. ("The _other_ brand of paper towel just pushes the mess around!" does nothing for me.)

If you want to talk about how to actually solve the real issues, like shared hosting; if you want to talk honestly about how we can improve setuptools or replace it with something better, I _am_ all ears. If you are going to treat this as only a verbal sparring match or simply insist that setuptools be thrown away without offering an alternative, I will be bored soon.

Posted by Luke Arno on 2007-04-28
