Archive for the ‘SQLAlchemy’ Category

Best of Breed: TG’s job is hard. Here’s why.

Thursday, September 17th, 2009

In 2005 Kevin Dangoor made the decision to glue together a few existing technologies into something that would be useful for the creation of web pages.  This was marketed as the “Best of Breed” selection.  The challenge with Best of Breed is that the best is constantly in flux.  Also, making that claim requires TG to put in a serious amount of effort in _finding_ and evaluating the best of breed technology and then integrating them with TG.  This is a job I LOVE doing.  Even better than that, when I find something that doesn’t cut the mustard when it comes to being the best of things, I get to write it.

For example, when SQLAlchemy blazed onto the scene, TG was one of the first frameworks to support it.  My own frustrations with SQLObject led to an early adoption of SQLAlchemy, and I never looked back.  This illustrates TG’s challenge succinctly.  In order to choose SA, I had to give up ModelDesigner and Catwalk.  For me, I just want to write code, not use a [buggy] web page to manipulate data or move pictures around to auto-generate my code.

In 2007 at Pycon, I tried in vain to make Catwalk work with SA.  It just wouldn’t happen.  This was the spark that lit the fire which has become the new TGAdmin, driven by Sprox.  The interesting thing here is that if the “Next ORM” is found, a reasonably small amount of code has to be written to make this happen.   We already have some successes with this with Sprox.

Let’s look at the template language choices you have as a TG developer.  TG started with Kid, and while this is a decent XML-oriented template language, if you used it in the early days and you are like me, you probably hit your head on the keyboard a few times for each complex page you tried to write.  www.percious.com still runs Kid though, and that’s important to note.  TG has to support not only new users who want to use new technologies, but also those sites that are still running and want to migrate over to newer technology.  I think we do a pretty good job of that, but again, this is a more difficult job than that of, say, a Dj-framework that has absolute control over the template language.

So, TG 1.1 will use SA and Genshi.  The entire community decided to make the decision I made 3 years ago: that the benefits of SA outweigh those of Catwalk and ModelDesigner.  Hooray!  Again, framework decisions made by committee are challenging.  This is what I LOVE about TG.  I didn’t have to wait 3 years for the community to catch up; I added SA to my stack of tools and went happily on my way.

Even the framework is not immune to TG’s “Best of Breed” mentality.  In 2007, while I was futzing with Catwalk, Mark Ramm hid in a room with Ben Bangert for a few hours and created what would become TG2.0.  We had been struggling with the changes that CherryPy 3.0 represented, and even though in some ways CP3 is a better back end server, we decided to use Pylons for our server level stuff because of the collaboration potential there.  Pylons is indeed a great platform for server-level development.

TG’s team spends an unbelievable amount of time evaluating what _could_ be the best of breed, and some things make the cut, some don’t.  Sometimes we even add stuff because it’s neat and we want to support it for that person who thinks it is the best of breed.  Sometimes folks come to us with a request to support what they believe is the best of breed, and we do our best to enable them to provide their solution as yet another way of using TurboGears.  ToscaWidgets is probably the best example of this.  If you look at the repository, you will see over 15 JavaScript library wrappers, and quite a few other libs that make creating web content easier.  Again, TG allows developers to choose which JS library fits them best.  For 2.0 we decided to leave that choice up to the developer.

In short, TG makes the choices so that you don’t have to.  We spend a lot of time examining new technologies, and exploring what _might_ work for you.  All of our developers use TG in real world applications that range from giant source code repositories to scientific database management applications.  TG handles a diverse set of applications in stride.

The next part in this three part series is entitled: “Best of Breed: TG is still the best choice for people who hate hitting the wall.” This will further express the flexibility of TurboGears and describe a bit about what we are doing to make it even easier to get started with TG.

Pycon 2009

Wednesday, February 4th, 2009

So, Pycon registration has been up for a few days, I will be speaking both on and off-podium (read: open space) and providing assistance to and presenting tutorials.  Here is a run down of what I am planning in case you wanted a little bit more in-depth information.

Tutorials:

Turbogears2 Beginner and Intermediate:

I will be assisting Mark Ramm by giving individuals help installing and using the new TurboGears2 framework.  Mark is an experienced tutorial presenter, an expert in the technology, and in general a fun character to spend a few hours with.  When you leave his tutorials you should expect to have a working version of TG2 on your machine, along with an understanding of Model, View, and Controller paradigms.  Middleware, Forms, and REST will also be covered.  One note, if you are getting started with TG2, it’s best to have it installed and running if you plan to attend only the Intermediate Section.  We will not be going over installation in the second-half.

Toscawidgets: Test Driven Modular Ajax:

I am presenting this tutorial, which will describe how to use the valuable Toscawidgets package to create web content.  If you are currently using WSGI technology, and are interested in creating reusable, modular web content, this is a perfect way to get started.  I will show you how to configure TW middleware to work with Pylons (which is applicable to other frameworks like repoze.bfg, paste, or even Plone/Grok).  I will then describe how you might use this middleware to generate web forms.  The last few hours of class will be devoted to using the JavaScript utilities of TW to create an Ajaxified website, and test it using YUITest.

The Big F’ing Tutorial: Development Using the repoze.bfg Web Framework

I will assist/present with Chris McDonough about this up-and-coming framework, whose goals are to utilize bits of the Zope 3 framework, WSGI, and new technologies to make a lightning-fast web server.  Those of you who are familiar with Zope technologies may be interested to find how nicely some of the familiar bits of Zope are integrated with WSGI in repoze.bfg.

Presentations:

Using Sphinx and Doctests to provide Robust Documentation

This is a 1/2 hour slot which describes how you can integrate tested documentation with your source code… with sanity!  I go over a quick install of Sphinx, and use some screencasts to demonstrate how to add, run, and display doctests using it.

Open Space: Agile Development with SQLAlchemy and Python Testing Tools

I really enjoy giving this talk, and even though it was not accepted as a formal talk, I will find a venue by way of Open Space to share my knowledge of Testing, SA, and Nose.  I have given this talk a few times now, and it’s fairly polished.  My presentation, while on some dry topics, won’t put you to sleep.  Carefully prepared screencasts and photograph-punctuated slides make the 45 minutes breeze by.  Questioners/Hecklers welcome!

Sprint Topics

I want to spend some time with the Dispatch of TG2, and probably push Sprox further a bit.  If you are just starting with TG, please feel free to contribute.  Sprinting is a great way to learn a lot from the experts in the domain.  We usually do a meet-greet-install the night before the sprints.  Oh, and I’ve been known to provide refreshments to all of our sprinting hordes (read: FREE BEER).

So, I hope to see all of you there!  If you see me in the hall, feel free to introduce yourself and tell me what you are using Python for!

Coding Binge

Tuesday, January 27th, 2009

I haven’t written to the blog in a while.  Quite frankly, I’ve been busy.  In the last 30 days, I have released 3 new software packages, updated 1, deprecated 1, participated in a sprint that lasted a virtual 2 weeks, closed countless tickets, and pushed forward TG2 functionality.

TG2b4 was released last Saturday.  This was mostly a bug-fix release, but b3 is where the new functionality really came onto the scene.  TG2b3 is the first build to include Sprox, a new library for schema-driven widget generation.  Sprox is the offspring of DBSprockets.  I decided I liked the declarative part of DBSprockets so much I wanted to spin it off as its own entity.  Sprox loses DBSprockets’ table-based dependency, utilizing the mapping provided by SQLAlchemy.  I realized that much of DBSprockets’ code was doing precisely what SQLSoup was designed to do, and decided to focus my efforts on making an extremely configurable widget base.  The result was a considerable removal of the cruft that was associated with DBSprockets.  Sprox releases with an excellent documentation base provided by Sphinx.

There has been a bit of resistance to Sprox, people were/are confused/upset about my providing yet more options for schema based widget generation.  The fact is I have yet to find anything that performs as well as Sprox from a developer/speed standpoint, and I needed to provide our TurboGears user base with a better way to administrate their site, and also allow them to use that tool component-wise in their system.  I think this method for developing widgets is well done in other frameworks, and we need a solid answer to this problem.  Sprox is just that.

The next step was to re-work Catwalk to use Sprox.  This took a little effort, and I put in RESTful URLs while I was at it, but struggled with making the URLs work within TG2’s dispatch system.  The result was as close to REST as you can get without conforming to a set standard.

The result of hacking REST into Catwalk got me thinking, and I decided to implement content-type dispatch as well as RESTful dispatch in TG2.  I went back for another round on Catwalk, and converted it over to the standard.

I’ve also been toying around with Dojo at NREL.  I’m pretty much done with ExtJS due to licensing issues, a not-so-hot codebase, and weak support from IRC.  It’s bad when you go on to ask a question on the channel as a 6 month-user of a software project and end up spending all your time answering everyone else’s questions (as the most experienced person in the room).  Something must be said for an organization that does not push paid consulting as a primary focus on their website…  #dojo has been an exceptional resource for getting my work done.  Those guys know their software, and lend a great hand to help you with it.

Back to the topic at hand… I was able to shoe-horn Dojo into Sprox with little effort, and implemented DojoCatwalk, which worked, but was ultimately not what I wanted.  What I really wanted was configurability.  I started work on tgext.admin, which was supposed to provide enough functionality to replace tgcrud, a command to auto-create crud in your own TG application.  To support tgext.admin, I created a new package called tgext.crud, which provided a CrudRestController, which is a simple way of providing crud for any object in your model.  AdminController combines this functionality with that of Mark’s lookup code to provide a fast, configurable set of tables/forms/etc for all objects in your model.  AdminController takes a declarative AdminConfig as input which provides a consistent way to create your administrative toolset.  Did I mention it does Dojo tables with ajax loading?  Yeah.

I’m not done with this binge yet.  Catwalk is going to mutate one more time before I’m through with it.  It is going to become a default-configured AdminController specifically designed to work within the context of a quickstarted TG2 application.  I had one blocker ticket which was solved last weekend, so it’s time to get Catwalk good and finished.

DBSprockets is back, Baby!

Friday, December 12th, 2008

So I have had a pretty long hiatus from working on dbsprockets, but I’m back… with a vengeance.  So, I worked hard to get Rum working in TG2, but struggled my ass off and got nowhere.  Left tangled in a web of peak-rules that I did not want to decipher, I began to think about dbsprockets again.  I mean, were we really that far off from what RUM is offering?

The answer turned out to be no.  All I needed to do was to replace the dbsprockets primitives with a class structure.  This turned out to be about an 8 hour job.  Not bad for a day’s work.  And now you can do things like:

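# Imports below are assumed from the usual TG2 / ToscaWidgets / FormEncode locations.
import pylons
from formencode import Schema
from formencode.validators import FieldsMatch
from tg import expose, validate
from tw.forms import PasswordField, TextField

from dbsprockets.declaratives import FormBase
from myWebapp.lib.base import BaseController  # assumed quickstart-style location
from myWebapp.model import User
...

class RegistrationForm(FormBase):
    __model__ = User
    # Show only these fields, and require all of them.
    __limit_fields__ = 'user_name', 'email_address', 'display_name', 'password', 'verify'
    __required_fields__ = 'user_name', 'email_address', 'display_name', 'password', 'verify'
    # Override the widget type for one field and add a field the model lacks.
    email_address = TextField
    verify = PasswordField('verify')
    # Form-level validator: the two password fields must match.
    __base_validator__ = Schema(chained_validators=(FieldsMatch('password',
                                                                'verify',
                                                                messages={'invalidNoMatch':
                                                                          "Passwords do not match"}),))

registration_form = RegistrationForm()

class ATurboGearsController(BaseController):
    @validate(form=registration_form.__widget__)
    @expose('genshi:sproxtest.templates.register')
    def register(self, **kw):
        # Hand the form to the template through the Pylons context object.
        pylons.c.widget = registration_form
        return dict(value={})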

This turns out to be a much simpler way of handling forms, because now you can subclass to your heart’s content (for instance, make one base user form which you subclass for admin, registration, profile and login), or even come up with your own wacky WidgetSelector that chooses widgets for you and subclass as you desire. Here is initial documentation, which I will express in more detail at a later date when I have more time to do it the right way.

The simple fact is that you can customize form widgets with ease, limit fields to a set of fields, drop a few fields, basically anything you can think of to change how the database schema actually displays on the page.  The same is true of field validators.  Simply define an attribute of the class that has a validator, widget instance, or widget type, and dbsprockets will do the right thing.  If you want to override both the field and the validator, all you must do is create a widget with the validator attribute populated.  The greatest thing is that your forms (and tables!) will change with you as your database schema migrates.

Right now I am keeping the primitives way of getting the values from the provider to populate the tables/forms.   This is likely to change in the future.  The primitive way of doing forms is now deprecated.  I will be adding deprecation warnings to the code.  Hooray!

Look for ajax support in the near future on both the view and data side of things.  I have a clear understanding of ExtJS (2.0.1), JSON, and LGPL now and I am not afraid to use it.  In the meantime, a dev version of dbsprockets 0.5 is up at pypi.

SQLAlchemy Migrate Process Hiccups

Friday, December 5th, 2008

So, I’ve done migration processes for two medium-large database schemas (50-100 tables) and I have found what I believe to be a disconnect between the process of migrating a database and the process of developing a database application.


The Problem
——————-
Here is the problem in general.  I am sorry it is so long winded, but it’s hard to see what is going on without this full expression.

You create a schema, and use that schema to create a database.  Maybe you are using Pylons, and using setup-app to create the schema. Everyone is happy in schema land, and then someone realizes you need to make a change, and maybe even more changes in the future.  You decide to use migrate, because hell, migrate will make things easier. So, you write the migration, modify your model code and everything is hunky dory.  Clients are happy, your developer feels like he has a maintainable codebase.  Your project begins to grow.  Awesome.

Not Awesome.  Your second developer needs a development environment. You give them your model, they setup-app it and they are ready to go. While they are plugging along, you realize you need a new migration, so you create it, and upgrade your development version of the database, which is now at migration 2.  The production upgrade also goes off without a hitch; it’s now on 2.

What about your new development buddy?  When he set up his migrate_version table, his database was recorded as version 0.  So, migrate will try to do 0->1 and then 1->2, but 0->1 will fail because his database is already at version 1 even though his version table says it’s 0.  Now he is in pain, and re-creates his entire database to get moving, and the whole cycle starts over again, although he can work… until you do the next migration.

This is not ideal, and I have been struggling with it at one of my clients.  In that situation, I am the developer buddy, bopping in from time to time to lend a hand.  I spend probably 5% of my time getting up and running, where if their migration system was somehow linked to their database model, I could be synchronized, and actually use it for a development tool, rather than a dusty old production tool.
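To make the failure concrete, here is a minimal sketch that reproduces it using only Python's sqlite3 module.  It is not how migrate itself is implemented: the account table, the single-column migrate_version table, and the migration SQL are all made up for illustration.

```python
import sqlite3

# Hypothetical migration 0->1: adds an email column.
MIGRATIONS = {
    1: "ALTER TABLE account ADD COLUMN email TEXT",
}

# Schema as the *current* model code defines it (email is already there).
CURRENT_MODEL_DDL = (
    "CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)


def create_from_model(conn):
    """Roughly what a setup-app style command does: build from today's model."""
    conn.execute(CURRENT_MODEL_DDL)
    # The version table is stamped 0 even though the schema is really at 1.
    conn.execute("CREATE TABLE migrate_version (version INTEGER)")
    conn.execute("INSERT INTO migrate_version VALUES (0)")


def upgrade(conn):
    """Apply every migration newer than the recorded version, in order."""
    (version,) = conn.execute("SELECT version FROM migrate_version").fetchone()
    for target in sorted(v for v in MIGRATIONS if v > version):
        conn.execute(MIGRATIONS[target])
        conn.execute("UPDATE migrate_version SET version = ?", (target,))


def demo():
    """Return the error text from the failed upgrade, if it fails."""
    conn = sqlite3.connect(":memory:")
    create_from_model(conn)
    try:
        upgrade(conn)
    except sqlite3.OperationalError as exc:
        return str(exc)
    return "upgraded cleanly"
```

Calling demo() returns the database error rather than "upgraded cleanly", because the 0->1 step tries to add a column the model-built schema already has.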

Solution 1
—————
So, I came up with my own solution, and have talked to a close friend about an alternative solution.  I’d like to discuss both of them as they pertain to migrate, in the hopes that we can improve migrate, and  re-connect the development environment with the production one.

My solution is to connect the model base code to the migration version.  The way I accomplish this is to add a version.py file at the root of my model code, which has, incredibly, the version of the migration changeset that matches this model definition.  I also have a custom database creation script which creates a migrate_version table and fills it with a record of the correct version number.  Now I can svn up, and migrate up and everyone is happy. The problem with this method is that I have to maintain a damn file with the version number, and not go insane doing it.  Since my model contains 3 different database schemas in it, that means I have to maintain 3 variables in my version.py.  Personally, I don’t mind this, but I had to write more documentation than the amount in this post to make certain that my successors (or me in 6 months) can figure out what the heck is going on.
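Here is a rough sketch of solution 1, again with plain sqlite3 rather than migrate's real API.  MODEL_VERSION stands in for the variable kept in version.py, and the table, columns, and migrations are hypothetical.

```python
import sqlite3

# Stand-in for version.py: the migration changeset this model matches.
MODEL_VERSION = 2

# Hypothetical schema as the model code defines it at MODEL_VERSION.
MODEL_DDL = (
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, active INTEGER)"
)

# Hypothetical migrations; 3 was written after this checkout was created.
MIGRATIONS = {
    1: "ALTER TABLE account ADD COLUMN email TEXT",
    2: "ALTER TABLE account ADD COLUMN active INTEGER",
    3: "ALTER TABLE account ADD COLUMN last_login TEXT",
}


def create_db(conn):
    """Custom creation script: build from the model, then stamp its version."""
    conn.execute(MODEL_DDL)
    conn.execute("CREATE TABLE migrate_version (version INTEGER)")
    conn.execute("INSERT INTO migrate_version VALUES (?)", (MODEL_VERSION,))


def current_version(conn):
    return conn.execute("SELECT version FROM migrate_version").fetchone()[0]


def upgrade(conn):
    """Apply only the migrations newer than the stamped version."""
    start = current_version(conn)
    applied = []
    for target in sorted(v for v in MIGRATIONS if v > start):
        conn.execute(MIGRATIONS[target])
        conn.execute("UPDATE migrate_version SET version = ?", (target,))
        applied.append(target)
    return applied
```

Because the database is stamped with the version the model was written against, a later upgrade skips changesets 1 and 2 and applies only 3.  The cost is the one described above: MODEL_VERSION has to be bumped by hand every time the model changes.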

Solution 2
————–
The second solution, which Mark came up with, is to preserve the initial model.  Then, you do all of the migrations to bring you up to the current codebase.  This does not require any version file tracking, and if your migrations add boilerplate entries, you don’t have to record them in a second place. I think solution 2 is reasonable, but could be more time consuming to execute (which is probably not really a problem with everyone’s 2 GHz machines these days).  The problem I see is that the boilerplate entries you make for production may not be the same as the ones you make for development.  This makes testing a bit harder if they differ. I also feel that solution 2 is more prone to problems, simply because there are a lot more cranks to turn to get a development database.
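Solution 2 can be sketched the same way.  This is only an illustration with sqlite3, not migrate's own machinery; INITIAL_DDL plays the part of the preserved schema 0, and the migrations are invented.

```python
import sqlite3

# The preserved initial schema (schema 0); hypothetical table.
INITIAL_DDL = "CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)"

# Hypothetical migrations, in the order they were written.
MIGRATIONS = [
    (1, "ALTER TABLE account ADD COLUMN email TEXT"),
    (2, "ALTER TABLE account ADD COLUMN active INTEGER"),
]


def create_dev_db(conn):
    """Build schema 0, then replay every migration up to the present."""
    conn.execute(INITIAL_DDL)
    conn.execute("CREATE TABLE migrate_version (version INTEGER)")
    conn.execute("INSERT INTO migrate_version VALUES (0)")
    for version, statement in MIGRATIONS:
        conn.execute(statement)
        conn.execute("UPDATE migrate_version SET version = ?", (version,))


def column_names(conn):
    """Column names of the account table, in definition order."""
    return [row[1] for row in conn.execute("PRAGMA table_info(account)")]
```

There is no version file to maintain here, but every new development database pays for the full migration history, and any boilerplate rows the migrations insert come along whether or not they suit development.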

How can migrate help
—————————–
The way to provide solution 1 to the problem in migrate would be to allow the developer to connect his migrations to his model code, and allow migrate some level of control over the “versions” module in the model code.  So, when you do a migrate commit, you are also saying that the model code supports this version of migration.  It will probably only take me  a few hours to make this happen.  We then create an easy way to create the migrate_version table by importing something.  This way you can have a custom createdb script that also creates the migrate_version table.  Perhaps we could even provide a reasonable createdb template.

Solution 2 only requires us to provide a custom creation script.  For this we will have to preserve the initial schema (schema 0) and then upgrade.  We could probably also include a template instead of a script so that you can manage boilerplate on your own.

My own conclusions
—————————-
I lean more towards solution 1, mostly because I feel that it will provide for easier boilerplating, because you will have access to your whole model at once, instead of bits and pieces of your model as it moves from one version to the next.  I also think the boilerplate for dev. is likely to be a lot different than that of prod for most folks, and I feel like solution 1 offers an easier way of handling it.  It also seems like solution 1 is much more efficient, because you don’t have to go and re-do all of that migration work; you have already arrived at the correct-for-now solution.

If you made it this far congratulations!  Now, go grab a cup of coffee, stretch your arms, come back and tell me what you think.

Plonecon Recap

Monday, October 13th, 2008

Last week I attended Plonecon in Washington D.C.  We use Plone quite a bit at NREL, and I wanted to come up to speed on the state of the art, and see if I could find some folks to collaborate with on the Scientific Data Management front.  I was surprised not only by the number of people in attendance, but also by the progress being made at the forefront of Plone development.

Of course, one of the big news items was what is known in the community as the “Plone tax”, which is a reference to the 30+ seconds it takes to start Plone.  It was announced that this has been reduced to 6 seconds, which in my mind is still too much, but I am glad they are making progress in this area.  [edit] I believe the speedups were achieved in the trunk of plone, not for 3.2.   Plone 3.2 is supposed to be all eggs, which I think is a great benefit to the community.  I have already been using Plone in this manner with Repoze, which packaged Plone as an egg a few months ago.

I was impressed by the changes to the user interface which were proposed by Alexander Limi in his talk on the future of Plone’s user experience.  It looks like portlet/viewlet terminology is being supplanted with the term “widget” which I think is more standard in the web design world. I think the new user interface is more intuitive, simplifying the page down to those components which will be most important, while still making available all of the functionality of Plone.

I was also impressed with Kapil Thangavelu’s content mirror.  While I don’t think his project, which takes Plone content and injects it into an SQL database, will be immediately useful to me, I think it offers a nice path for someone who wants to convert their Zope/Plone site over to a relational-database based site.  It also provides a nice bridge for someone who wants to use existing relational database tools to mine data penned up within a Zope database.

Coming from a TG background, I was of course interested in Repoze.  Chris McDonough gave a compelling talk on repoze, specifically repoze.bfg and what his motivation was for creating yet another framework. We ran a BOF together, and had quite a bit of response.  I demo’d using Toscawidgets within Plone (running on repoze.plone).

Craig Swank and I also demo’d one of our repoze.plone applications for a group of people interested in laboratory informatics systems.  I started up a google group for this, and I sincerely hope that our small community can find a way to combine efforts in order to increase our productivity in a way only OSS can.

Speaking at Front Range Pythoneers

Tuesday, June 17th, 2008

Tomorrow night I will be speaking about agile technologies with SQLAlchemy at the Front Range Pythoneers monthly meeting. I spent a few hours last night working out my presentation and creating a number of screencasts which show how to use tools like virtualenv, paster, and nosetests.  I also touch upon SQLAlchemy, and how one would set up a test environment for a database schema.  Even if you don’t live in the Boulder/Denver area the information could be valuable to you, so I decided to set up a googlecode repository to store all of my tutorial-related materials.  It is called PythonTutorials.