Initial Thoughts on Google App Engine

Apr 2, 2009 · 3 min read

I got some time this weekend to experience what cloud computing is all about. There are a few well-known “infrastructures” you can use at the moment, including Microsoft Azure, Amazon Elastic Compute Cloud (Amazon EC2) and Google App Engine. Honestly, I haven’t had time to try Azure and EC2 yet. What attracted me to App Engine was its “advertisements” on YouTube, which make it look very simple to build a scalable web application from scratch. So, I gave it a go this weekend.

No doubt at all: it is amazingly simple to get started. The only prerequisite is that you must have Python 2.5.x installed. Then you can just download the SDK. For Windows users, the SDK comes in the form of an installer which sets up the required paths for you (that is, the PATH environment variable) (YouTube video about installation). For Mac users, there is even a UI that saves you a few console commands (YouTube video about installation). After downloading the SDK, I followed the tutorial to set up a simple guestbook application. The whole process took me only 30 minutes (including time to digest the Python scripts and .yaml files), and it was painless.

As you may know, I come from a .NET programming background. However, I found that understanding the Python scripts is not at all difficult. The hurdle to learning how to use App Engine is very low. For learning purposes, the quotas that Google gives you (CPU usage, datastore access, bandwidth usage, disk space, etc.) are surely more than enough.

Before I go on to build some really useful applications, I’m trying to understand the differences between “normal” client-server computing and cloud computing (and, actually, between distributed computing and cloud computing as well). What I have found so far is mainly about the datastore and caching. There are a few BIG words that I’ve come across about the datastore, like Bigtable, entities, keys, entity groups and transactions. No matter what my data looks like, App Engine stores all of it in a single table, and that’s Bigtable, which allows good scalability. In the code, all I need to do is define a class which, in turn, defines what an entity is. Calling its “put” method then saves the entity into Bigtable.

Bigtable is not a relational database, and it uses timestamps to maintain consistency. There are no table locks or row locks for write access. Writers compete with each other: if another transaction is writing to the same data, the transaction is rolled back and tried again. In Bigtable, a transaction is scoped to an entity group (a hierarchy of entities), and how entities are assigned to entity groups is entirely up to you. Thus, entity groups have to be designed very carefully to allow efficient transactions and high throughput (basically, fewer disk writes). This is the first difference I have encountered so far.

Relating to Bigtable, there are a few interesting articles giving some insights into datastore design, like Sharding Counters (related to the pattern and anti-pattern problem), Modeling Entity Relationships, Extending Model Properties, etc. There is also a talk from Brett Slatkin about some advanced techniques.
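To make the “no locks, roll back and try again” idea a bit more concrete for myself, here is a minimal sketch in plain Python. It is only a toy in-memory simulation and not the App Engine API: the names ToyStore, ConflictError and retry_transaction are made up for illustration, and I use a simple version number in place of whatever timestamps Bigtable really keeps.

```python
class ConflictError(Exception):
    """Raised when a commit finds that someone else wrote first."""


class ToyStore(object):
    """Hypothetical in-memory store; each key maps to (version, value)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # A missing key is reported as version 0 with no value.
        return self._data.get(key, (0, None))

    def commit(self, key, expected_version, new_value):
        # The write succeeds only if nobody has written since we read.
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            raise ConflictError(key)
        self._data[key] = (current_version + 1, new_value)


def retry_transaction(store, key, update, retries=3):
    """Read, compute, try to commit; on a conflict start over from the read."""
    for _ in range(retries + 1):
        version, value = store.read(key)
        new_value = update(value)
        try:
            store.commit(key, version, new_value)
            return new_value
        except ConflictError:
            continue  # our work is thrown away and redone on fresh data
    raise ConflictError("gave up on %r after %d retries" % (key, retries))


store = ToyStore()
store.commit("counter", 0, 0)  # create the entity with value 0

attempts = []  # records the value seen by each attempt


def increment(value):
    attempts.append(value)
    if len(attempts) == 1:
        # Simulate a competing writer sneaking in during our first attempt.
        version, current = store.read("counter")
        store.commit("counter", version, current + 10)
    return value + 1


result = retry_transaction(store, "counter", increment)
```

The first attempt loses the race and is redone against the competitor’s value, so nothing is ever locked, but the work is done twice. As far as I understand it, that is why busy entity groups hurt throughput and why something like a sharded counter spreads writes over several entities.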

I’ll try to blog about caching in App Engine, and share more interesting links on this topic, soon.

Happy coding! :)