I've just started as a Developer Advocate for Google App Engine and there's been a lot of talk of "lock-in" recently. Google will have more to say officially in the future, as obviously my blog isn't the place to read about Google's official position, but I would like to point out the following things that have come up in conversation that people didn't know:
Also, you can see that other issues are on the roadmap to be addressed, such as bulk upload and download.
Posted by Fred Blasdel on 2008-11-06
If the App Engine team really can't see how incredibly proprietary App Engine appears to outsiders, y'all have a big problem. I've had a couple of discussions about this with folks at Google, and I'm pretty disappointed at the disingenuous tone about lock-in I've gotten so far. The general answer seems to be "There's no lock-in: it's almost all Open Source!"
Unfortunately, the "lock-in" problem has less to do with open source than it does with the simple fact that App Engine is drastically different from any other sort of application hosting available anywhere.
It's not that it's hard to take an application written against App Engine elsewhere; it's literally impossible:
Nothing else on the planet (well, nothing approaching production quality) will start WSGI servers given app.yaml, nor is there a spec or a schema describing how some hypothetical app.yaml server would work.
urllib or anything else that talks to sockets (httplib2!); I have to use urlfetch. Yes, urlfetch is open source... but am I really going to write code against that API not intended for App Engine?
So look. I obviously have a horse in this race and want App Engine to do well. More than that, it's damned cool tech and I want to be able to use it. I very well might be able to ignore the lock-in stuff given how nice App Engine promises to be. But the first step needs to be some real honesty about the unprecedented levels of lock-in App Engine represents.
On preview: this comes across as pretty harsh, and I'm sorry for that. I really want to like App Engine, but I'm just very frustrated that an honest discussion about lock-in really isn't happening. With luck this blog post will be the beginnings of that discussion!
Posted by Jacob Kaplan-Moss on 2008-11-06
I don't agree with you about the Users API, especially when Google has just committed to providing OpenIDv2 authentication for Google Accounts — the only thing that is not straightforwardly replicable is the full functionality of the nickname() instance method on User objects.
I also don't begrudge the need for urlfetch, as the design of AppEngine totally precludes the availability of vanilla sockets. Their intention is for all traffic to be via HTTP: buffered, proxied, and accounted for via Google Frontend. If they let you at raw sockets, you could get up to all sorts of trouble! Their urlfetch is certainly not a perfect solution (no way to set a timeout), but it is not awful. Anything you could do with sockets but not with urlfetch is disallowed for a reason.
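One way to keep a urlfetch dependency from spreading through application code is to hide it behind a single function. The following is only a sketch in present-day Python 3: `make_fetcher` is a hypothetical helper name, and the fallback to the standard library is an assumption about how such code might run outside App Engine.

```python
def make_fetcher():
    """Return (backend_name, fetch), where fetch(url) -> (status, body).

    Hypothetical helper: prefers App Engine's urlfetch when it can be
    imported, otherwise falls back to the standard library.
    """
    try:
        # Only importable inside the App Engine runtime or SDK.
        from google.appengine.api import urlfetch
    except ImportError:
        from urllib.request import urlopen

        def fetch(url):
            # Plain-socket path, used anywhere except App Engine.
            with urlopen(url) as resp:
                return resp.status, resp.read()

        return "urllib", fetch

    def fetch(url):
        # App Engine path: the request goes through Google's HTTP proxy.
        result = urlfetch.fetch(url)
        return result.status_code, result.content

    return "urlfetch", fetch
```

Application code would call `make_fetcher()` once and use only the returned `fetch`, so the choice of backend stays in one place.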
A large part of why progress on AppEngine is so slow-going is because the group working on it is kept small, yet I wouldn't really have it any other way — it's certainly preferable to the confused mess of marketing materials that is Windows Azure!
Posted by Fred Blasdel on 2008-11-06
Posted by brian on 2008-11-08
Until there’s another provider and some people have successfully migrated their App Engine apps to that other service, there’s lock-in for practical purposes.
That said, the app I’m developing has neither a database nor user logins, so it is not as lock-in prone as many other apps. However, it’s written in Java… (Yes, I realize that offering the full JVM functionality on App Engine would be very problematic and limiting the functionality would be problematic, too, when existing code expects to be able to do stuff.)
Posted by Henri Sivonen on 2008-11-09
I understand your frustration. I do want this to begin the conversation, and I hope to have some more news to share in the coming days.
Nothing else on the planet (well, nothing approaching production quality) will start WSGI servers given app.yaml, nor is there a spec or a schema describing how some hypothetical app.yaml server would work.
Here is the documentation for app.yaml, though I'm not sure that addresses your concern completely.
The datastore API runs against something (BigTable) that I can't buy at any price from anywhere but Google.
Worse, the non-relational-database world is even more proprietary and incompatible than the relational world. If I decide to ditch Oracle I've got to rewrite a bunch of SQL queries... but at least both use SQL! I literally would have to rewrite my application from scratch if I wanted to replace App Engine's datastore with, say, CouchDB.
This is something (MegaData) that I've been talking about for a while now, and it's one of the reasons that I was drawn to App Engine. The point I've been making is that datastores built to manage huge amounts of data work differently from a traditional RDBMS. The general-purpose-RDBMS-all-your-data-in-fourth-normal-form does not scale. The reality is that there is no one-right-way to handle large amounts of data today; Google has BigTable and that's the model exposed in App Engine, Amazon has exposed their model with S3, and there are plenty of other competing ideas in this space, such as CouchDB, the streaming database Michael Stonebraker talks about in his paper, "'One size fits all': an idea whose time has come and gone", and who-knows-what is ultimately going to be built with Drizzle.
While all of them are different, there are some underlying commonalities that will be surprising and uncomfortable if you are coming from a relational background, such as a restricted scope for transactions, a lack of joins, and the accompanying denormalization of data. That's because you can't do those things efficiently in a general way across a large number of machines. Given that, are there idioms and best practices that are common across all these systems? Could you build an abstraction layer that worked across all of them? I don't know the answers to those questions. I do know it's different, I do know it can be frustrating, and I do want to work on bridging that gap between RDBMSs and MegaData datastores.
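To make the "no joins, so denormalize" point concrete, here is a small illustration using plain Python dictionaries rather than any real datastore API. The data and all function names are invented for the example.

```python
# Normalized data: a post refers to its author only by id.
users = {"u1": {"name": "Ada"}}
posts = [{"id": "p1", "author_id": "u1", "title": "Hello"}]


def titles_with_authors_join(posts, users):
    """Relational style: look the author up at read time (a join)."""
    return [(p["title"], users[p["author_id"]]["name"]) for p in posts]


def denormalize(posts, users):
    """Join-free style: copy the author's name onto every post at write time."""
    return [dict(p, author_name=users[p["author_id"]]["name"]) for p in posts]


def rename_user(users, denormalized_posts, user_id, new_name):
    """The cost of denormalizing: every stored copy must be updated too."""
    users[user_id]["name"] = new_name
    for post in denormalized_posts:
        if post["author_id"] == user_id:
            post["author_name"] = new_name
```

Reads of the denormalized posts need no second lookup, but a rename now touches every post by that user, which is the trade-off described above.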
Posted by Joe Gregorio on 2008-11-10
I do want this to begin the conversation, and I hope to have some more news to share in the coming days.
Extremely good news! As far as I'm concerned, as long as there's an open conversation about the issues (and benefits) that's a real win even if there's stuff that still needs to be private. I do prefer the "don't announce until it works" approach over the hype and vaporware that dominates so much of the industry. But when "don't hype" becomes an excuse to not discuss shortcomings (*looks at Apple*)... that's bad.
I'm very much looking forward to seeing what's next... these are certainly interesting times to be a web developer.
Could you build an abstraction layer that worked across all of [the non-relational databases]?
I think you can*. If you look at the internals of relational databases, you'll notice that they by no means work the same way. Even databases that implement common patterns (e.g. MVCC) do so in radically different ways. However, they do expose a (mostly) common interface in the form of SQL.
Now, anyone who's worked with relational databases for some time knows that even if you ignore the different SQL dialects there are dramatic differences in performance and semantics across different engines. Even something seemingly simple like SELECT COUNT(*) has radically different characteristics in different databases. Like all abstractions, SQL leaks.
But this isn't a problem -- or, more accurately, the benefits of a common interface outweigh the drawbacks. You write your SQL against the idealized abstract interface, and then when it starts hurting you dive in and optimize for your specific situation.
I see no reason why we can't treat MegaData stores (nice term, BTW) in a similar manner. Expose a common interface that works "most of the time," and then just dive down and work at a lower level when the abstraction leaks.
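As a rough sketch of that idea (not how Django or any existing project does it), a common interface could cover put, get, and simple equality queries, with an explicit escape hatch for when the abstraction leaks. Every class and method name below is hypothetical, and the in-memory backend stands in for a real store.

```python
from abc import ABC, abstractmethod


class EntityStore(ABC):
    """Hypothetical common interface over non-relational stores."""

    @abstractmethod
    def put(self, kind, key, entity): ...

    @abstractmethod
    def get(self, kind, key): ...

    @abstractmethod
    def query(self, kind, **equals):
        """Return entities of `kind` whose fields equal the given values."""

    @abstractmethod
    def raw(self):
        """Escape hatch: hand back the backend's native handle."""


class InMemoryStore(EntityStore):
    """Toy backend; a real one would wrap a specific datastore's client."""

    def __init__(self):
        self._data = {}

    def put(self, kind, key, entity):
        self._data.setdefault(kind, {})[key] = dict(entity)

    def get(self, kind, key):
        entity = self._data.get(kind, {}).get(key)
        return dict(entity) if entity is not None else None

    def query(self, kind, **equals):
        return [
            dict(e)
            for e in self._data.get(kind, {}).values()
            if all(e.get(field) == value for field, value in equals.items())
        ]

    def raw(self):
        return self._data
```

Code written against `EntityStore` works "most of the time"; anything backend-specific goes through `raw()`, which keeps the non-portable parts easy to find.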
* At least, I'm giving it a shot in Django. We'll see where it goes.
Posted by Jacob Kaplan-Moss on 2008-11-11
Here's one of the bits of news to share that I alluded to earlier: gae-sqlite.
Posted by Joe Gregorio on 2008-11-12
2008-11-06
Posted by Dave Tauzell on 2008-11-06