Routes 1.9 Release

Posted by ben Sat, 14 Jun 2008 03:53:52 GMT

I released Routes 1.9 today, which is another step on the Road to Routes 2.0. Some of the highlights that people will be most interested in, and that I had previously blogged about, are now available:

Minimization is optional

Pylons 0.9.7 will default to turning minimization off (projects are free to leave it on if desired). This means that constructing a route like this with minimization off:
map.connect('/:controller/:action/')
will actually require both the controller and the action to be present, and the trailing slash. This addresses the trailing slash issue I wanted to fix as well.

Named Routes will always use the route named

This is now on by default in Routes 1.9, which results in faster url_for calls as well as the predictability that comes with knowing exactly which route will be used.

Optional non-Rails’ish syntax

You can now specify route paths in the same syntax that Routes 2 will be using:

map.connect('/{controller}/{action}/{id}')
Or if you wanted to include the requirement that the id should be 2 digits:
map.connect('/{controller}/{action}/{id:\d\d}')
Routes automatically builds the appropriate regular expression for you, keeping your routes a lot easier to skim over than a bunch of regular expressions.
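
To give a feel for what that translation involves, here is a minimal sketch in plain Python. It is not Routes' actual implementation; the function name template_to_regex and the default pattern for unconstrained parts ([^/]+) are my own assumptions for illustration. It also assumes the requirement itself contains no closing brace, so something like \d{2} would not work here.

```python
import re

# Matches {name} or {name:requirement}. Assumption: the requirement
# itself contains no '}' character.
_PLACEHOLDER = re.compile(r'\{(\w+)(?::([^}]+))?\}')


def template_to_regex(template):
    """Turn a '/{controller}/{action}' style path into a compiled regex."""
    pattern = ''
    pos = 0
    for m in _PLACEHOLDER.finditer(template):
        # Literal text between placeholders is matched verbatim.
        pattern += re.escape(template[pos:m.start()])
        name = m.group(1)
        # Hypothetical default: any run of characters other than '/'.
        requirement = m.group(2) or '[^/]+'
        pattern += '(?P<%s>%s)' % (name, requirement)
        pos = m.end()
    pattern += re.escape(template[pos:])
    # Anchored at both ends, so every part of the path must be present.
    return re.compile('^' + pattern + '$')


route = template_to_regex(r'/{controller}/{action}/{id:\d\d}')
match = route.match('/blog/view/42')
print(match.groupdict())
```

Because the pattern is anchored at both ends, a URL that is missing a part, or whose id is not exactly two digits, simply fails to match.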

Routes 2 will be bringing redirect routes and generation-only routes, making Routes 1.9 a great way to transition to Routes 2 when it’s ready.

Pylons on JVM's (and other VMs)

Posted by ben Wed, 07 May 2008 18:40:20 GMT

Phil Jenvey has been making some great progress getting all the components of Pylons running on Jython, and posted a good write-up of the remaining work being done. It’s interesting to note that one of the big issues will affect any web framework on Jython, not just Pylons: the time it takes to reload when restarting the server during development.

While I don’t plan on deploying Pylons apps in WAR files anytime soon, it’s nice to see Jython emerging as a candidate for deployment.

Most bizarre Git service and other stupid Rails powered "businesses"

Posted by ben Wed, 07 May 2008 03:18:21 GMT

I can’t help but get totally baffled when I see a business model like this.

Yes, that’s right, you can pay for the privilege of keeping a copy of your distributed version control system (DVCS) private repositories on someone else’s machines. You also get to pay depending on how many people you want to allow to collaborate on it.

Never mind that one of the entire points of a DVCS is that you do NOT need a central repository. Does anyone actually work at a “Large Company” (as the page indicates) that would be stupid enough to pay $100/month so they can put all their proprietary and very personal code repositories on a third party web service?

So what are you paying for? Well, to start with, they have awesome integration with Lighthouse, since we all know there’s no decent free open-source issue tracking system… cough trac cough roundup cough. Oh wait, since there are absolutely no simple web-based issue tracking systems, let’s have another slick business model to get people to pay for a stripped down Trac (but this time with a really pretty UI)!

What do these sites have in common? Rails, “look ma, I can copy-paste the business plan too” pricing models, and some good graphic designers at the helm. There also seems to be an interesting amount of promotion between these sites, as well as a nice blog post from the Rails creator himself promoting GitHub. I’m sure no one who has read this rant should be surprised though.

I only hope that no one starts to believe that a DVCS actually requires these “please pay” copies of their DVCS repo.

Google Datastore and the shift from a RDBMS

Posted by ben Sun, 13 Apr 2008 23:23:47 GMT

There are so many random musings and theories on Google App Engine that I won’t bother musing about it myself, except to mention that Ian Bicking put together instructions for running Pylons on it. These also work fine with the latest Pylons 0.9.7 beta.

I got Beaker, the session and caching WSGI middleware that Pylons uses, running fine on Google now, using Google Datastore as the backend. Diving into the Datastore docs to get a grip on what’s the best way to implement it shed some light on the transition any developer thinking about writing data-backed apps for GAE (Google App Engine) will need to tackle.

Some notes on terminology: Google has Entities, Kinds, and Properties. These correspond roughly to Rows, Tables, and Columns in RDBMS-speak. Kinds can also be called classes, because in the Python API, you create a class and inherit from the appropriate datastore class. Entities may also be referred to as instances, since performing a query returns a list of objects (instances).

Sessions and Datastore

First, regarding sessions. Beaker will now let a Pylons app use normal sessions on GAE; the real question is, should you?

The Google User API makes it trivial to get the currently logged-in user, and the datastore comes with a property type for a ‘table’ that is specifically made for a Google user account reference. So with just one short command, you can have an entity from the Datastore that corresponds to a given user, i.e.:

userpref = UserPrefs.all().filter('user =', users.get_current_user()).get()

The Datastore is blindingly fast for reads and queries, so there’s a compelling reason to ignore sessions altogether and just fetch the appropriate preferences or what-have-you. This leaves people with the usual reason for wanting more, i.e., a session: “But wait, I want to stash other little things with the user when they run around my app!” Not a problem.

Google’s Datastore has an Expando class for entities that lets you dynamically add properties of various types. It’s like having an RDBMS where you can just add columns to each row, on the fly. The dynamic_properties() entity method makes it easy, upon pulling an object, to see what dynamic properties were already assigned.

As far as I’m concerned, this pretty much eliminates the need for a session system. If you didn’t want to require user login, you could always make a little session ID yourself, keep that on the UserPrefs table as a separate property, and then query on that.
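
To make the idea concrete without needing the App Engine SDK, here is a rough plain-Python stand-in. It is not the Datastore API: ExpandoLike, PREFS_BY_SESSION and prefs_for are hypothetical names, and an in-memory dict plays the part of the query on a session ID property. Only the dynamic_properties() name is borrowed from the real Expando class, and here it simply lists whatever was assigned after creation.

```python
class ExpandoLike(object):
    """In-memory stand-in for an entity that accepts arbitrary properties."""

    def __init__(self, **declared):
        # Remember which names were declared up front, so anything
        # assigned later counts as a dynamic property.
        self._declared = set(declared)
        for name, value in declared.items():
            setattr(self, name, value)

    def dynamic_properties(self):
        """Return the names of properties added after creation."""
        return sorted(
            name for name in vars(self)
            if not name.startswith('_') and name not in self._declared
        )


# Hypothetical replacement for "query UserPrefs on the session ID".
PREFS_BY_SESSION = {}


def prefs_for(session_id):
    """Fetch (or create) the preferences entity for a session ID."""
    if session_id not in PREFS_BY_SESSION:
        PREFS_BY_SESSION[session_id] = ExpandoLike(session_id=session_id)
    return PREFS_BY_SESSION[session_id]


prefs = prefs_for('abc123')
prefs.last_page = '/cart'   # stash "other little things" on the fly
prefs.theme = 'dark'
print(prefs.dynamic_properties())
```

On the real thing you would subclass the datastore's Expando class and call put() on the entity; the sketch only shows why the extra properties make a separate session store feel redundant.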

Rethinking how you store/query/insert data

Going slowly through all the Datastore docs and especially reading some of the performance information people were drumming up on the GAE mail list brought up a number of issues with how people with RDBMS backgrounds approached Datastore. Many of the table layouts I saw pasted on the mail list were clearly written for how an RDBMS works, with sometimes significant work required to adapt it to deal with Datastore.

A little background might help in understanding this difference. Google Datastore is implemented on top of BigTable, which is described briefly in the paper as a “sparse, distributed, persistent multi-dimensional sorted map”. One of the other descriptions I heard in a talk on data storage techniques at FOO Camp from a Google developer was, “think of a BigTable table as a spreadsheet, except with pretty much as many columns as you want”.

This brings about a fairly big shift in thinking for the developer who grew up on an RDBMS. The fairly normalized organization of data written without regard to massively distributed data stores suddenly becomes a rather big problem. Consider a few of the ‘limitations’ of Datastore that will jump right out at you:

  • You cannot query across relations
  • You cannot retrieve more than 1000 rows in a query
  • Writes are much, much slower than you’re used to (a developer on the mail list said 50 inserts with 2 fields each almost ate up the 3 seconds allowed for a web request)
  • There are zero database functions available
  • There is no “GROUP BY…”, which doesn’t matter much if you read the prior bullet point
  • Transactions can only be wrapped around entities in the same entity group (i.e., the same section of the distributed database)
  • Referential integrity only sort of exists
  • No triggers, no views, no constraints
  • No GIS Polygon types, or anything beyond just a GeoPoint (Odd, considering that Google has so much mapping stuff)

Then of course, a few of the new things that might leave you scratching your head, quite happy, or both:

  • Keys for an entity may have ancestors (ancestors aren’t relations, they’re different and have to do with Entity Groups, which determine what you can do in a transaction, wheeee!)
  • An Entity Group doesn’t have to all be of the same Kind; it’s more of an instruction to Datastore to keep these near each other when distributed
  • Keys can be made before the entity, just so you can make descendant entities of the key, then make the ancestor
  • The handy ListProperty, when used in a query, will let you use the conditional argument and apply it to every item in the list (sort of like an uber ‘IN (...)’ query, except it can also find all the data where a member in the list was <, >, or = to something else)
  • Making more Entity groups is a good idea when you frequently need a batch of “these few things” for a request, especially if you need to alter them all at once in a transaction
  • Normalizing is frequently bad since you can’t query across relations; dynamic properties make it easy to heavily denormalize. If you do normalize some data and it’s for the same batch of ‘things you always need at once’, use Entity groups. Or use a ReferenceProperty if it’s merely something related you may occasionally hit.
  • The ReferenceProperty() does not have to refer to a known kind; you can decide on the fly what datastore classes to reference if not specified when declaring the ReferenceProperty
  • Many to Many relations aren’t what you think; now you could have a ListProperty() of ReferenceProperty()’s, which may or may not all refer to instances of the same class
  • A query may return entities of different kinds, if querying for entities of a given ancestor

(There’s probably a bunch more as well, these were some of the obvious ones that jumped out at me)
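
The ListProperty behaviour mentioned in the list above is the one that takes the most getting used to, so here is a small plain-Python sketch of the matching rule as I understand it: an entity matches when any item in its list satisfies the condition. This is an in-memory illustration only, not Datastore code; list_filter, the dict-based entities and the scores property are made up for the example.

```python
import operator

# The comparison operators named in the bullet above.
OPS = {'<': operator.lt, '>': operator.gt, '=': operator.eq}


def list_filter(entities, prop, op, value):
    """Keep entities where ANY item of the list property passes the test."""
    compare = OPS[op]
    return [
        entity for entity in entities
        if any(compare(item, value) for item in entity[prop])
    ]


# Hypothetical entities, with plain dicts standing in for instances.
entities = [
    {'name': 'a', 'scores': [1, 5]},
    {'name': 'b', 'scores': [7, 9]},
]

low = list_filter(entities, 'scores', '<', 3)
print([entity['name'] for entity in low])
```

Only the first entity has a score below 3, so it is the only one kept; asking for scores above 4 keeps both, since each list has at least one such item.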

The end result of this is that the standard way a developer writes out the table schema for an RDBMS should be dumped almost entirely when considering an app using Google Datastore. Storing data and using Google Datastore isn’t difficult, but it is a pretty hefty paradigm shift, especially if you’ve never left RDBMS-land. This is not a trivial change to make in approaching your data.

I rather enjoyed working with these new ways of tackling data, and the possibilities opened up by the ways it lets me store and refer to data go beyond the traditional RDBMS. In the short term though, I doubt I’ll be making any GAE apps until there’s an alternative implementation that’s production ready… I just can’t handle the lock-in.

And of course, please note any corrections or inaccuracies in the comments.

Where's the Capistrano knock-off for us Python web devs?

Posted by ben Thu, 10 Apr 2008 01:27:24 GMT

Rails, and Ruby in general, has had Capistrano for a while now to help with the task of deployment and automating builds for servers, and even clusters of servers. Where is something like this for Python?

Now, before people note that I could easily use Capistrano for my Python project, I should note that it is rather annoying having to install yet another language. On the other hand, given that I will likely only need to install it on my development machine (which, running OS X, already has Ruby… and gems), it doesn’t seem too horrible to just use Capistrano and be done with it.

However, Capistrano doesn’t quite manage the Python eggs, and the task isn’t exactly trivial. zc.buildout, which I previously ranted about due to odd docs, does the management pretty well. It even results in a rather consistent build experience no matter where it occurs. Two commands, and boom, the app is ready to go.

Unfortunately, life isn’t quite that easy. When something does go wrong with buildout, trying to track it down can be exceptionally hairy. Having a tool so ‘magical’, as I’ve heard some describe it, carries its own penalties when things fail. Buildout also fails to automate the task of deploying the app itself to the other machine, which is still a manual process. It does manage eggs rather well, though it does some very odd mangling of sys.path to accomplish this in every script.

I don’t need something as full-featured as Capistrano, but I’d love to see something that has no more requirements than I’m already depending on (Python), that can handle the task of easily automating deployment of a Python application – including ensuring all the proper versions of the eggs I want are used – on a remote *nix machine. I recall seeing a post (I think by Jeff Rush) a while back, on a system just like this that he unfortunately never released. Vellum also looks like it could be hacked further to do this task…

Is there some build/deployment tool that is just Python that I’ve missed? Something that will let me set up a script with some commands for deploying my app on another server and set up the webapp (hopefully in a virtualenv) so it’s ready to run (and optionally restart it/migrate the db/etc :)?
