Archive for the ‘Python’ Category

TG’s Killer Features: SQLAlchemy. Obvious, no?

Monday, September 28th, 2009

So, as I stated in a previous entry, I’ve been using SQLAlchemy for more than three years now.  If you know anything about me, you’d find it amazing that I have contributed little more than a patch here or there.  Why is that?  I think that’s because every time I try to find something that I need SQLAlchemy to do, it already does it.  I have spent so little time trying to make SQLAlchemy do what I want it to do, and so _much_ time getting work done with it.

Here is a good example.  One of my clients is a company developing sports management software by the name of MVP.  Although I work on their next-gen stuff, I was called on by them to promote the students on the older system to the next grade over the summer.  Their database had well over 40 tables, and no fewer than 10 of them needed modification.  I fired up SA, reflected the tables, wrote the changes in Python with simple loops around the records I needed to change, and it was done.  In one hour’s time I had a happy client, and a school system with a functioning system.  Two things about this are amazing.  1.  At the time, I had practically zilch in the Postgres experience department.  2. I only needed to learn the schema of the tables I was modifying, and I could do this with introspection.  The fact that I was able to do this task in less than an hour was only made possible by my knowledge of SQLAlchemy’s table-based architecture.  I did not need to know the nuances of Postgres’ SQL dialect (I was more familiar with MySQL at the time), or have an in-depth knowledge of the database schema.  I was able to pick and prod until the job was done, and it was painless.
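As a rough illustration of that "introspect, then loop over the rows" workflow, here is a minimal sketch.  It deliberately leaves SQLAlchemy out so that it has no dependencies: the standard library's sqlite3 module and its PRAGMA table_info query stand in for SA's reflection.  The students table and its columns are made up for the example and have nothing to do with the client's real schema.

```python
import sqlite3


def table_columns(conn, table):
    """Return the column names of `table` by asking the database itself."""
    # PRAGMA table_info yields one row per column: (cid, name, type, ...)
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def promote_students(conn, table="students", grade_column="grade"):
    """Move every student up one grade; return the number of rows changed."""
    if grade_column not in table_columns(conn, table):
        raise ValueError(f"{table} has no {grade_column} column")
    rows = conn.execute(f"SELECT rowid, {grade_column} FROM {table}").fetchall()
    # A simple loop around the records that need to change.
    for rowid, grade in rows:
        conn.execute(
            f"UPDATE {table} SET {grade_column} = ? WHERE rowid = ?",
            (grade + 1, rowid),
        )
    conn.commit()
    return len(rows)


# Hypothetical stand-in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, grade INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?)", [("Ann", 3), ("Bo", 7)])
promoted = promote_students(conn)
```

With SA the introspection step would be table reflection rather than a PRAGMA, which is what keeps the same script portable across databases.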

I really like the fact that SQLAlchemy’s approach to making an ORM is layered.  I can drop to whatever layer I need to to meet the requirements of my project.  SQL, table objects, and mapped objects all have their place in the grand scheme of things, and I have used them all to varying degrees. What SQLAlchemy achieves with this layering is the ability to adapt to existing projects, schemas, etc, and make considerably useful software, in a short amount of time.  By figuring out the nuances of different SQL dialects, it frees me up to focus on the task at hand, and provide products to my clients that work in a broad range of situations.

But this post is supposed to be about TurboGears, right?  For me, SQLAlchemy is more than a tool I use for TurboGears projects, or even web applications, it’s something I can use any time I have to access a relational database.  It is the ultimate base for writing tools that help me get my job done.  That job might be providing a system to allow schools to schedule matches against each other, or enabling scientists to access their data directly using objects at a python prompt, instead of assembling arcane sql strings to gather data.  SQLAlchemy is such a good basis in fact, that it makes building tools to help do my job even easier.


I have had the opportunity to contribute to sqlalchemy-migrate.  If you don’t know anything about the project, consider this:  You have a production database you cannot break.  You have 5 minutes downtime to modify the schema, update records, etc.  You need to be able to back out the changes and bring the system back up if everything breaks.  Migrate lets you do all of this, and provides a versioning system to ensure that such a process moves smoothly.  I have used migrate with Postgres, found some rough edges and fixed them.  SA’s table mapping makes this possible; migrate just adds a layer to make versioning and some table creation processes easier.  I only hope that someday some of the migrate code makes its way back into the SA codebase.

I do lots of testing.  I made numerous schema changes today to one of my databases at www.nrel.gov in fact.  I test the database schema for matching against my definitions with some yet-to-be-released software.  There are about 3000 tests.  With tables numbering fifty-some-odd, I needed something to make data entry easier for my tests.  I wrote bootalchemy to do this.  You pass it some YAML, and the models module, and it performs all the entries for you.  It does a bit of introspection to determine dates, and it has reference pointing so you can inter-connect your objects within your YAML (using & and * like my old friend C).  This has vastly decreased the amount of time it takes me to create new test data for new tests.  Again, this is possible only with the framework that SA provides me.

Lately I have been interested in providing a broader base of scientists at NREL access to their data using Python as a medium.  Scientists (especially physicists) are really good at conceptualizing data.  They use crazy tools like Matlab, and R and all sorts of proprietary tools to manipulate their data.  They aren’t afraid of a command prompt.  My idea is to give them something like Sage, but with direct access to their data as mapped objects.  I also want to be able to show up at a scientist’s desk with laptop in tow, connect to one of their existing databases, and spit out an admin-style web interface based on TG in a few minutes.  These notions have driven me to contribute to sqlautocode.  At this point sqlautocode will spew out a page (or 7) of Python code that provides you with Declarative Objects and an interactive prompt.  sqlautocode works as a library, so you can use sqlautocode’s output directly in memory without generating any code at all.  All this is possible with SA, and I don’t know of anything else that can do all of this.

I have focused on the technical here, but beyond that is a great team of individuals like Mike Bayer and Jason Kirtland who put in long hours and answer questions promptly on the mailing list.  I often wonder how these guys get any sleep.  The thing that excites me most about SA is that it will soon release version 0.6, which to me means that these guys have 4 more versions of increased functionality before they consider it “done.”  So, as you can see, there is more to SA than just the ORM it provides.  Its framework gives you freedom to expand your horizons and get your job done, by focusing its task on the challenges that relational databases all have, so you don’t have to.  This is what makes SA a killer feature of TG.

The third segment in this series is: TG’s Killer Feature: The Admin (Yes, We do, and it rocks)

TG’s Killer Features: Object Dispatch

Wednesday, September 23rd, 2009

When people ask me what really sets TurboGears apart from the rest of the frameworks out there, I throw away everything we have built upon our foundation and find one thing remaining.  Object Dispatch.  Dispatch is probably a heinously boring topic for some folks.  For me, it’s a fascinating topic, one full of complexity, twists, and turns.  This is my main contribution to TurboGears, creating a dispatch system which is flexible enough to handle the most complex url mapping you can come up with, simple enough for the average Joe to use.  Like most of the technologies TG builds upon, the dispatch system is designed to be easy to start with, and easy to mould when you find that the defaults don’t meet your needs.

Let’s look at TG’s humble roots.  CherryPy was probably not the first framework to realize that nested classes are a fantastic way to organize urls.  After all, web addresses are nested, right?  (More on that later)  As the web server of choice for our founder, Kevin Dangoor, CherryPy offered a solid foundation in what remains in my mind the killer feature of TG.  Without OD, you pretty much don’t have TurboGears.  (You have Pylons).  Speaking of our brethren, let’s look at a few frameworks to see how you dispatch a simple url.

Let’s take this url for instance:

rules/section/3

Pylons:

in routes.py you would add the line:

    map.connect('/rules/section/:id', controller='section', action='show')

create the controller:

    class SectionController:
        def show(self, id):
            return 'i am a section, hear me roar'

Django:

edit urls.py to add:

    (r'^rules/section/(?P<id>\d+)$', 'myapp.views.section'),

create the controller:

    from django.http import HttpResponse

    def section(request, id):
        return HttpResponse('i am a section, hear me roar')

Both are remarkably similar here, but I would give the advantage to Django for three reasons.  First, it uses regular expressions.  That means I don’t have to learn yet another dialect for matching strings.  Second, it specifies the controller directly, instead of the framework determining the location of the controller code with some magic lookup method.  Third, it provides a serialized method for routing, which means that you can provide a set of lookups in a file which can be read in, modified, and re-serialized.  This is a huge advantage if you are planning on making a large system of swappable parts.   However, I don’t think the routes system is bad, and I have built my own system around it to manage the above problems, for a very plugin-heavy system I am building.  By the way, I’m no Django expert, so please feel free to comment on my code samples and I will fix them if need be.

Okay, so let’s look at the TG version of the same thing.

routing match code:

huh?  what’s routing?

controller code:

    from tg import expose

    class SectionController:
        @expose()
        def index(self, id):
            return 'i am a section, hear me roar'

    class RulesController:
        section = SectionController()

    class RootController:
        rules = RulesController()

Hmm.  Notice the lack of weird symbols in the router code?  Oh, waitaminute, there _is_ no routing code!  That’s one less thing for someone to learn, one less thing to have to code.  Except there’s a catch.  That little @expose; what’s that all about?  Well, because the routing is done with introspection, you have to tell the dispatcher that something is _allowed_ to be dispatched upon or not.  So, already you have addressed some security concerns, where otherwise you would have to test that your routes don’t end up in a method that should not be exposed to the outside world.  And in complicated routing, this can be a definite issue.

So, Object Dispatch provides a provably simpler path to nirvana for the web programmer, but it’s just not cutting it for you.  Well, first off, with TG2 you can easily drop back and use Pylons’ routing mechanism.  Secondly, we built a RestController, which routes based on HTTP verbs.  Lastly, in TG 2.1 you have the crème de la crème of dispatch power.  You may write your own dispatch mechanism.  Theoretically, you could write a DjangoDispatchedController that uses regular expressions to determine the enclosed members’ dispatch mechanism.

Wait, huh?  Okay, so dispatch works in a similar manner for every framework, but not every framework can so easily switch dispatch mechanisms on the fly.  Here’s basically how it works in TG:

First, we go through the normal Pylons routing; this usually deposits us at the RootController of your app, unless you have overridden the Pylons routes for your app to do something else.  Next, the routing mech takes your url, and splits it by ‘/’ into a list of strings.  It now looks for attributes of the root controller that match the first string in the list.  If that attribute is a method, and the method’s number of arguments jibes with the remaining items in the list, the dispatcher says it’s done, and fires that method.

If the attribute found in the initial dispatch is an instance of a class, and the class has no _dispatch method, it traverses that class with the remaining elements of the list, repeating down the list until it finds a method that matches.

Now, if the attribute is an instance, and it has a _dispatch method, the dispatcher will then _become_ the _dispatch method, and dispatch will continue in whatever method was identified by the new _dispatch method.

So, if you read that whole description, I commend you, but if you only looked at the picture that works too.  Basically what I am saying here is that TG2 provides you with the ability to create your own method for dispatching your controller code based on the remaining URL in the path (that’s the “black” box).  And it works too, this is how we implemented TGController and RestController, and soon we will have an AmfController that does a Remote Procedure Call method of dispatch, to help simplify controller code for our Adobe Flex friends.
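The walk described above can be sketched in a few lines of plain Python.  This is only an illustration of the idea, not TG's actual dispatcher: the expose decorator, the dispatch and handle functions, and the NotFound exception are all stand-ins written for this example, and the fallback to an exposed index() method is an assumption based on the controller example earlier in the post.

```python
import inspect


class NotFound(Exception):
    """Raised when no exposed method matches the URL."""


def expose(func):
    """Mark a method as reachable from a URL (a stand-in for TG's @expose)."""
    func.exposed = True
    return func


def _callable_with(method, args):
    """True if `method` is an exposed method that accepts these arguments."""
    if not inspect.ismethod(method) or not getattr(method, "exposed", False):
        return False
    try:
        inspect.signature(method).bind(*args)
    except TypeError:
        return False
    return True


def dispatch(controller, parts):
    """Walk the URL parts down the controller tree and call the match."""
    # A controller that defines _dispatch takes over the rest of the walk.
    custom = getattr(controller, "_dispatch", None)
    if custom is not None:
        return custom(parts)
    if parts and not parts[0].startswith("_"):
        target = getattr(controller, parts[0], None)
        if _callable_with(target, parts[1:]):
            return target(*parts[1:])
        if target is not None and not callable(target):
            # A nested controller instance: keep walking with what is left.
            return dispatch(target, parts[1:])
    # Assumption: fall back to an exposed index(), as in the example above.
    index = getattr(controller, "index", None)
    if _callable_with(index, parts):
        return index(*parts)
    raise NotFound("/".join(parts))


class SectionController:
    @expose
    def index(self, id):
        return "i am a section, hear me roar (%s)" % id


class RulesController:
    section = SectionController()

    def internal(self):  # not exposed, so no URL can ever reach it
        return "hidden"


class RootController:
    rules = RulesController()


def handle(url, root):
    """Split the URL on '/' and hand the pieces to the dispatcher."""
    return dispatch(root, [part for part in url.split("/") if part])


response = handle("rules/section/3", RootController())
```

Here rules/section/3 ends up in SectionController.index with "3" as its argument, while rules/internal raises NotFound because that method was never exposed.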

TurboGears is in my mind designed correctly, with very few tradeoffs made for the sake of pragmatism.  This no-sacrifices approach to the framework is yielding results.  As a developer, choosing TurboGears means that you can get started pretty easily, and are not stuck once you hit the limitations of the TG developers’ imaginations.

I have decided to make this blog entry into a series on what makes TG 2.x a compelling choice as a web framework.  The next blog entry will be:  TG’s Killer Features: SQLAlchemy (obvious, no?)

Best of Breed: TG is still the best choice for people who hate hitting the wall.

Monday, September 21st, 2009

We get a lot of refugees who come to TG from other frameworks where they got themselves to a place they could not get out of.  I have worked with some of these frameworks in the past.  Things are awesome in the beginning.  You work on some less complex stuff, maybe change a template around, or the theme for a site.  Then you have some technical detail you need changed and BAM!  You just hit the wall.  Now you are forced to dig, or suck it up and go on IRC and ask a noob question.  Many times I find myself getting shut down, told to RTFM, or whatever.

With TG I never really felt the wall.  TW was a bit of a hump to get over, but seriously, things are so un-coupled in TG that using a technology you are already familiar with, or swapping out the standard stack is not something totally unheard of.  I for one do not like working with Genshi.  I’ll do it, and I support it for major OSS work I do, but really I prefer Mako.  Mako is fast, works in a non-xml way (which makes it great for writing form-emails for instance).  What this means is that TG has good support for Mako in the “standardized” parts like the admin.

The thing is, we really did not have to add much to TG to allow it to work with Mako.  I think one of the ingenious things that TG supports is dotted template lookup.  What this does is allow you to pull data from any package, because the lookup occurs using pkg_resources.  Beautiful, now we have the ability to move templates into their own succinct packages.  Also, we support non-dotted template lookup for template languages like Jinja, and adding support for dotted lookup wouldn’t be too hard to do in the future.
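To make the dotted lookup idea concrete, here is a small sketch of how a dotted template name might be turned into a file path.  TG does this through pkg_resources; the sketch below uses importlib from the standard library as a stand-in, and the function name dotted_template_path is made up for this example.  The last component of the dotted name is treated as the template file and everything before it as the package that contains it.

```python
import importlib.util
import os


def dotted_template_path(dotted_name, extension=".html"):
    """Resolve 'package.subpackage.template' to a file path in that package."""
    package_name, _, template_name = dotted_name.rpartition(".")
    if not package_name:
        raise ValueError("expected a dotted name, got %r" % dotted_name)
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise LookupError("%r is not an importable package" % package_name)
    # A package can span several directories; this sketch uses the first one.
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, template_name + extension)


# The stdlib json package stands in for a real template package here.
path = dotted_template_path("json.decoder", ".py")
```

Because the lookup goes through the import system rather than a configured template directory, the templates can live in any installed package.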

The new TGAdmin interface is another example where hitting the wall is just not something that happens.  TGAdmin is built on technologies that are new for TG2.0, namely RestController, Sprox, and lookup.  It creates a custom controller for each of your model classes, and therefore you have a good place to start from when you get going with your application.  This is great for demoing.  With SQLAutocode, Sprox, and TGAdmin you can literally hook into your client’s database (MySQL, Postgres, sqlite or MSSQL) and generate web forms where you can have an infinitely scrolling tableview, edit, and create forms.  From there you have the ability to customize further, by hiding table/form fields or changing the look and feel of any component in the chain.  This is made possible by Sprox’s configuration interface.  You can also modify the controller code for each model by adding controller methods to your default controller in the admin’s config object.  All of this customization is outlined in a tutorial.

The great thing with the new TGAdmin is that since it is based on Sprox, you can re-use the knowledge of sprox externally to the admin, and also bring any existing knowledge you may have about ToscaWidgets to the table.  You can even use sprox outside of TurboGears for any other python web-based applications that you have that also use SQLAlchemy (read: Pylons).

To get back to the TurboGears discussion, a lot of folks don’t want everything that TG brings to the table.  Some folks need different auth/auth models,  have no use for a widget library, or even a relational database connection.  TG can serve you too, and it’s still fast.  Recently SourceForge moved over to TG for their main website.  This requires a connection to MongoDB on the back end (read: no SQLAlchemy), zero ToscaWidgets, and a different authentication method.  According to Mark Ramm, they had started with a different framework which lends itself to the Jinja2 templating engine.  As far as I know, Mark was able to meet all of these goals, and eliminate 9/10ths of the server load they had with their previous system, written in something non-python.  This just goes to show how flexible TG was to be able to meet all of these needs, and still provide a technically sufficient solution.  Maybe he will comment more about this in the future.

Now, there are those folks out there who have decided to roll their own.  And WSGI definitely supports, if not encourages, this behavior.  But I just have to ask:  Whose agenda are you really fulfilling?  Even if people like your code and you have a small following, are you helping to further the benefits of your projects, or your career in general?  Are you helping your customer in the long run if you leave the project, or leaving them with a dead-end piece of code?  Will they be able to find someone to replace you, should you decide to leave?

TG has an active community of folks who are willing and able to help get your contributions into the main code branch.  We have embraced mercurial, and use its abilities to offer a lessened barrier for those who want to contribute.  So, please, before you go make your own framework, see if you can help us make TG better, and your reward is that you now have a community of folks that will help maintain your work, even if you decide not to.

Basically what I am saying here is that TG has an even learning curve.  Sure, as you get more involved, the problems will get harder, but you won’t end up having to re-write half of the framework just to get it to do what you want.  And, if you do find something that needs work to meet your needs, you have options to participate in the development of TG.

In 2008 TurboGears ran a sprint series to flesh out the 2.0 release.  We successfully released in Spring of 2009, thanks to the hard work and dedication of a number of folks who saw the process through.  This was a great opportunity for folks who wanted to be actively involved with a web framework to jump in.  It is very likely in the coming months that TG 2.1 will see the same sort of community out-reach as we prepare to move from a development cycle, to a release one.  So, look forward to that, find out how you can contribute, and by all means, give us feedback as to what you really want from a Best of Breed framework.

Best of Breed: TG’s job is hard. Here’s why.

Thursday, September 17th, 2009

In 2005 Kevin Dangoor made the decision to glue together a few existing technologies into something that would be useful for the creation of web pages.  This was marketed as the “Best of Breed” selection.  The challenge with Best of Breed is that the best is constantly in flux.  Also, making that claim requires TG to put in a serious amount of effort in _finding_ and evaluating the best of breed technologies and then integrating them with TG.  This is a job I LOVE doing.  Even better than that, when I find something that doesn’t cut the mustard when it comes to being the best of things, I get to write it.

For example, when SQLAlchemy blazed onto the scene, TG was one of the first frameworks to support it.  My own frustrations with SQLObject led to an early adoption of SQLAlchemy, and I never looked back.  This illustrates TG’s challenge succinctly.  In order to choose SA, I had to give up ModelDesigner and Catwalk.  For me, I just want to write code, not use a [buggy] web page to manipulate data or move pictures around to auto-generate my code.

In 2007 at Pycon, I tried in vain to make Catwalk work with SA.  It just wouldn’t happen.  This was the spark that lit the fire which has become the new TGAdmin, driven by Sprox.  The interesting thing here is that if the “Next ORM” is found, a reasonably small amount of code has to be written to make this happen.   We already have some successes with this with Sprox.

Let’s look at the template language choices you have as a TG developer.  TG started with Kid, and while this is a decent XML-oriented framework, if you used it in the early days and you are like me, you probably hit your head on the keyboard a few times for each complex page you tried to write.  www.percious.com still runs Kid though, and that’s important to note.  TG has to support not only new users who want to use new technologies, but also those sites that are still running and want to migrate over to newer technology.  And I think we do a pretty good job of that, but again, this is a more difficult job than that of, say, a Dj-framework that has absolute control over the template language.

So, TG 1.1 will use SA and Genshi.  The entire community decided to make the decision I made 3 years ago: that the usage of SA outweighs the benefits of Catwalk and ModelDesigner.  Hooray!  Again, framework decisions made by committee are challenging.  This is what I LOVE about TG.  I didn’t have to wait 3 years for the community to catch up; I added SA to my stack of tools, and went happily on my way.

Even the framework is not immune to TG’s “Best of Breed” mentality.  In 2007, while I was futzing with Catwalk, Mark Ramm hid in a room with Ben Bangert for a few hours and created what would become TG2.0.  We had been struggling with the changes that CherryPy 3.0 represented, and even though in some ways CP3 is a better back end server, we decided to use Pylons for our server level stuff because of the collaboration potential there.  Pylons is indeed a great platform for server-level development.

TG’s team spends an unbelievable amount of time evaluating what _could_ be the best of breed, and some things make the cut, some don’t.  Sometimes we even add stuff because it’s neat and we want to support it for that person who thinks it is the best of breed.  Sometimes folks come to us with a request to support what they believe is the best of breed, and we do the best to enable them to provide their solution as yet another way of using TurboGears.  ToscaWidgets is probably the best example of this.  If you look at the repository, you will see over 15 JavaScript library wrappers, and quite a few other libs that make creating web content easier.  Again, TG allows the developer to choose which JS library fits them the best.  For 2.0 we decided to leave that choice up to the developer.

In short, TG makes the choices so that you don’t have to.  We spend a lot of time examining new technologies, and exploring what _might_ work for you.  All of our developers use TG in real world applications that range from giant source code repositories to scientific database management applications.  TG handles a diverse set of applications in stride.

The next part in this three part series is entitled: “Best of Breed: TG is still the best choice for people who hate hitting the wall.” This will further express the flexibility of TurboGears and describe a bit about what we are doing to make it even easier to get started with TG.

Best of Breed: TurboGears is alive and breathing. We are even thriving.

Tuesday, September 15th, 2009

I think a lot of people wonder what’s happened to TurboGears.  Where is TG going?  Where has it gone?  In a recent mailing list post, we were blasted for our documentation, or lack thereof.  People seem sort of frustrated that they have a great tool in TG2.0, but have to spend so much time isolating their own technical problems that they fail to see that there is considerable documentation in most areas, even if the docs have a few sore spots here and there.

2008 was all about making TG2 _work_.  We’re past that now.  Most things pretty much work, some things work really well.  Other things need some attention.  Now that we’ve got the hard part of actually designing a functioning framework behind us, we can focus on documentation, and on using that valuable framework we have written to push the envelope of what TG can do.

One of the “things” we need to provide to the user community is better documentation.  In the past few weeks I have seen more drive in our community to improve the docs than ever before.  Michael Pedersen has taken over responsibility for our documentation.  I cannot thank him enough for his work in reviewing, reorganizing, adding to, and fixing errors in our existing documentation.  His kind of no-sacrifices attitude towards the docs means that we won’t just have “something” up there, we will have what it takes for developers to create web applications using TurboGears.

ToscaWidgets is a sore spot for a lot of folks.  I feel your pain.  Lots of folks say you don’t really need TW to do what it does because you are just creating HTML forms; what is so hard about that?  Well, I’ll tell you that I could not have written Sprox without its flexibility.  Here’s the good news: TW has been re-written from the ground up by Paul Johnston in the past few months.  I’ve been helping in this process, providing the tests that will make it more stable than the previous version, and making sure the codebase is not so complex a feeble mind like my own cannot comprehend it.  I spent some time benchmarking it, and making sure it’s as fast as it can be.   TW2 is 2x as fast as TW.  It approaches the speed of simpler frameworks that _only_ produce html (they don’t do resource injection, parameter cascading, etc.)

On other fronts, Jorge Vargas and I have been working on integrating MongoDB with Sprox.  This will become the “killer app” for sprox 0.7.  For me, this represents proof of concept for Sprox.  We have successfully integrated the basic workings of MongoDB into Sprox, which means I generalized in the right places enough for this to work.  The result is a TG Admin that will work for MongoDB or for SQLAlchemy equally.

So yeah, there’s still a lot of activity on the TG front, and if you pop into IRC you can feel free to chat up at least one of the TG dev team at almost any hour.  Also, we are having a DocSprint Sept. 25-27 (with a main emphasis on Sept. 26), in Boulder, CO and worldwide remotely.  We will be addressing the over 100 todo items that Michael has so graciously gathered for us.

This is the first part in a 3 part series on TurboGears.  The next part is entitled: “TG’s Job is hard.  Here’s why.” which will discuss various philosophical challenges with running a project like TurboGears.

Pycon 2009 Recap

Friday, April 3rd, 2009

It felt like this year Pycon was executed to near perfection. Many struggles I had with last year’s Pycon were addressed both by the organizers, and some creative thinking.
In this post I will recap everything that happened from my perspective.

WSGI House

I gathered a few close friends from the TG team and a couple of wildcards for perspective to share a house for the duration of the conference.  Having a house gave us a place to go home to at night and meet with friends, often staying up late talking about issues surrounding our favorite software.  Having a focused group I feel is important because you spend less time off on wild tangents.  The first (and pretty much only) rule of the house was that you pay the same amount whether you stay one night or nine.  At least one of our members was encouraged by this rule to stay for the sprints, which he hadn’t done before. Success!

Tutorials

For me, tutorials got off to a shaky start, but we seemed to recover nicely.  TurboGears has a lot of momentum right now, and it makes it hard to come up with a succinct tutorial when there is so much functionality to cover.  I think we were able to recover and that our students managed to soak in enough knowledge from our proverbial fire hose to create some useful applications.  I think we have a good start on a new book.

I was extremely impressed with the quality of students who were attending my ToscaWidgets tutorial.  Every single student finished every example.  I chose Pylons to give the tutorial, and although it is a little harder to integrate TW into the stream there than it is with TurboGears2, it installed quickly and flawlessly.  Overall, I think the tutorial was a success.

Talks

This year I did not focus on attending the talks, but instead chose wisely based on speaker and topic and allowed my feet to do the walking if the talk became uninteresting.  I definitely missed some talks, but the AV team has done an incredible job putting the talks up on blip.tv so that I can review them later.

This year I did not miss Raymond Hettinger’s talk on AI in python and was enthralled by a speaker who could successfully put a page of code on the screen and keep my interest.  I showed up to support Philip Jenvey in his talk on Pylons on Jython but was impressed by his ability to provide a succinct example on where Jython really shines.  I am hoping that more people take a second look at this really well done presentation.

Now, I am a SQLAlchemy supporter through and through, but find the domain of database mapping an interesting ecosystem.  While the ORM panel was littered by advertising chatter from one of the panelists who did not even write an ORM, an obvious omission was Robert Brewer, who wrote Dejavu, a very nice way to map persistent resources of different types for use in an “objecty” way.  Bob’s talk was especially interesting and makes me wonder if SQLAlchemy could leverage some of the work with AST that Bob beautifully displayed with some of the most amazing one-handed keyboarding I have ever seen.

Open Space

Well, I said I was going to give a talk at the Open Space, and ended up not doing so.  Part of the problem was the utter lack of projectors in the OS rooms, and part of it was a reluctance to break up the collaborative/discussive vibe that was going on in these sessions.  WSGIers hammered out a 2.0 spec, which involved a discussion I only monitored in passing.  I was disappointed by the lack of people who showed up for the GSoC BOF, but I think the economy held back a lot of students from attending Pycon.  It was also nice to allow my feet to walk around and see what was up in different projects.  I met one guy who took REST way too far and got to express some of my dissatisfaction with one of the available tools.  On a more positive note, the TG BOF was well-attended and it was nice to see so many users wondering what was up in TG land.

Sprints

This year I refused to let the noobs get me down and actually wrote some code.  I am sorry if I did not act as a good host of the TG project, but we have some important milestones coming up and I just wanted to get work done on that.  Sprinting remains a cornerstone of our development process and I will see if we can’t get our monthly sprints happening again in 2009.  I was however able to completely re-engineer our dispatch system, and while it is not currently 100% complete, it should be finished in a matter of days.  RestController now supports variable arguments for get_one, delete, and put, as well as supporting lookup and default.  Anyone can actually now create their own dispatch mechanism, since this functionality has been generalized.  Simply subclass Dispatcher, override _dispatch() and go to town.  I look forward to seeing what kind of crazy code this brings to TG land.  A lot of discussion has been had on how to make “plugins” or “extensions” for TG, and you can rest assured that we will have this functionality soon.
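To give a feel for what "subclass Dispatcher, override _dispatch()" could look like, here is a toy sketch of a verb-based dispatcher in the spirit of RestController.  Everything in it is hypothetical: the Dispatcher base class below is a stand-in written for this example, and the real TG class and the real _dispatch() signature are richer than the (verb, remainder) pair assumed here.  The get_one and delete names come from the paragraph above; get_all, VerbDispatcher and MoviesController are only assumed for illustration.

```python
class Dispatcher:
    """Hypothetical base class; it only defines the hook to override."""

    def _dispatch(self, verb, remainder):
        raise NotImplementedError


class VerbDispatcher(Dispatcher):
    """Pick a handler from the HTTP verb and the URL parts that are left."""

    def _dispatch(self, verb, remainder):
        verb = verb.upper()
        if verb == "GET":
            # No URL parts left means "list everything".
            name = "get_one" if remainder else "get_all"
        elif verb in ("POST", "PUT", "DELETE"):
            name = verb.lower()
        else:
            raise LookupError("unsupported verb: %s" % verb)
        handler = getattr(self, name, None)
        if handler is None:
            raise LookupError("no %s handler on this controller" % name)
        # Variable arguments: whatever is left of the URL is passed along.
        return handler(*remainder)


class MoviesController(VerbDispatcher):
    """Example controller; the movie data is made up."""

    def __init__(self):
        self.movies = {"1": "Brazil", "2": "Time Bandits"}

    def get_all(self):
        return sorted(self.movies.values())

    def get_one(self, movie_id):
        return self.movies[movie_id]

    def delete(self, movie_id):
        return self.movies.pop(movie_id)


controller = MoviesController()
listing = controller._dispatch("GET", [])
single = controller._dispatch("get", ["2"])
removed = controller._dispatch("DELETE", ["1"])
```

The controller itself never looks at the URL or the verb; swapping in a different _dispatch() changes how requests reach the same handler methods.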

Thanks

Thanks to all of my house mates who put up with my “mothering”.  Thanks to all of you who tolerated my “um”s at my talk on Sphinx, and especially to Georg Brandl who answered some questions.  Thanks to the organizers, volunteers and staff that came together to create what has been my best Pycon to date.

Pycon 2009

Wednesday, February 4th, 2009

So, Pycon registration has been up for a few days.  I will be speaking both on and off-podium (read: open space) and providing assistance to and presenting tutorials.  Here is a rundown of what I am planning in case you wanted a little bit more in-depth information.

Tutorials:

Turbogears2 Beginner and Intermediate:

I will be assisting Mark Ramm by giving individuals help installing and using the new TurboGears2 framework.  Mark is an experienced tutorial presenter, an expert in the technology, and in general a fun character to spend a few hours with.  When you leave his tutorials you should expect to have a working version of TG2 on your machine, along with an understanding of Model, View, and Controller paradigms.  Middleware, Forms, and REST will also be covered.  One note: if you are getting started with TG2, it’s best to have it installed and running if you plan to attend only the Intermediate Section.  We will not be going over installation in the second half.

Toscawidgets: Test Driven Modular Ajax:

I am presenting this tutorial, which will describe how to use the valuable Toscawidgets package to create web content.  If you are currently using WSGI technology, and are interested in creating reusable, modular web content, this is a perfect way to get started.  I will show you how to configure TW middleware to work with pylons (which is applicable to other frameworks like repoze.bfg, paste, or even plone/Grok).  I will then describe how you might use this middleware to generate web forms.  The last few hours of class will be devoted to using the JavaScript utilities of TW to create an Ajaxified website, and test it using YUITest.

The Big F’ing Tutorial: Development Using the repoze.bfg Web Framework

I will assist/present with Chris McDonough about this up-and-coming framework whose goals are to utilize bits of the zope 3 framework, wsgi, and new technologies to make a lightning-fast web server.  Those of you who are familiar with Zope technologies may be interested to find how nicely some of the familiar bits of zope are integrated with wsgi with repoze.bfg.

Presentations:

Using Sphinx and Doctests to provide Robust Documentation

This is a 1/2 hour slot which describes how you can integrate tested documentation with your source code… with sanity!  I go over a quick install of Sphinx, and use some screencasts to demonstrate how to add, run, and display doctests using it.

Open Space: Agile Development with SQLAlchemy and Python Testing Tools

I really enjoy giving this talk, and even though it was not accepted as a formal talk, I will find a venue by way of Open Space to express my knowledge of Testing, SA, and Nose.  I have given this talk a few times now, and it’s fairly polished.  My presentation, while on some dry topics, won’t put you to sleep.  Carefully prepared screencasts and photograph-punctuated slides make the 45 minutes breeze by.  Questioners/Hecklers welcome!

Sprint Topics

I want to spend some time with the Dispatch of TG2, and probably push Sprox further a bit.  If you are just starting with TG, please feel free to contribute.  Sprinting is a great way to learn a lot from the experts in the domain.  We usually do a meet-greet-install the night before the sprints.  Oh, and I’ve been known to provide refreshments to all of our sprinting hordes (read: FREE BEER).

So, I hope to see all of you there!  If you see me in the hall, feel free to introduce yourself and tell me what you are using Python for!

Coding Binge

Tuesday, January 27th, 2009

I haven’t written to the blog in a while.  Quite frankly, I’ve been busy.  In the last 30 days, I have released 3 new software packages, updated 1, deprecated 1, participated in a sprint that lasted a virtual 2 weeks, closed countless tickets, and pushed forward TG2 functionality.

TG2b4 was released last Saturday.  This was mostly a bug-fix release, but b3 is where the new functionality really came into the scene.  TG2b3 is the first build to include Sprox, a new library for schema-generated widget generation.  Sprox is the offspring of DBSprockets.  I decided I liked the declarative part of DBSprockets so much I wanted to spin it off as its own entity.  Sprox loses DBSprockets’ table-based dependency, utilizing the mapping provided by SQLAlchemy.  I realized that much of DBSprockets’ code was doing precisely what SQLSoup was designed to do, and decided to focus my efforts on making an extremely configurable widget base.  The result was a considerable removal of the cruft that was associated with DBSprockets.  Sprox releases with an excellent documentation base provided by Sphinx.

There has been a bit of resistance to Sprox; people were/are confused/upset about my providing yet more options for schema-based widget generation.  The fact is I have yet to find anything that performs as well as Sprox from a developer/speed standpoint, and I needed to provide our TurboGears user base with a better way to administrate their site, and also allow them to use that tool component-wise in their system.  I think this method for developing widgets is well done in other frameworks, and we need a solid answer to this problem.  Sprox is just that.

The next step was to re-work Catwalk to use Sprox.  This took a little effort, and I put in RESTful URLs while I was at it, but struggled with making the URLs work within TG2’s dispatch system.  The result was as close to REST as you can get without conforming to a set standard.

The result of hacking REST into Catwalk got me thinking, and I decided to implement content-type dispatch as well as RESTful dispatch in TG2.  I went back for another round on Catwalk, and converted it over to the standard.

I’ve also been toying around with Dojo at NREL.  I’m pretty much done with ExtJS due to licensing issues, a not-so-hot codebase, and weak support from IRC.  It’s bad when you go on to ask a question on the channel as a 6 month-user of a software project and end up spending all your time answering everyone else’s questions (as the most experienced person in the room).  Something must be said for an organization that does not push paid consulting as a primary focus on their website…  #dojo has been an exceptional resource for getting my work done.  Those guys know their software, and lend a great hand to help you with it.

Back to the topic at hand… I was able to shoe-horn Dojo into Sprox with little effort, and implemented DojoCatwalk, which worked, but was ultimately not what I wanted.  What I really wanted was configurability.  I started work on tgext.admin, which was supposed to provide enough functionality to replace tgcrud, a command to auto-create crud in your own TG application.  To support tgext.admin, I created a new package called tgext.crud, which provided a CrudRestController, which is a simple way of providing crud for any object in your model.  AdminController combines this functionality with that of Mark’s lookup code to provide a fast, configurable set of tables/forms/etc for all objects in your model.  AdminController takes a declarative AdminConfig as input which provides a consistent way to create your administrative toolset.  Did I mention it does Dojo tables with ajax loading?  Yeah.

I’m not done with this binge yet.  Catwalk is going to mutate one more time before I’m through with it.  It is going to become a default-configured AdminController specifically designed to work within the context of a quickstarted TG2 application.  I had one blocker ticket which was solved last weekend, so it’s time to get Catwalk good and finished.

DBSprockets is back, Baby!

Friday, December 12th, 2008

So I have had a pretty long hiatus from working on dbsprockets, but I’m back… with a vengeance.  So, I worked hard to get Rum working in TG2, but struggled my ass off and got nowhere.  Left tangled in a web of peak-rules that I did not want to decipher, I began to think about dbsprockets again.  I mean, were we really that far off from what RUM is offering?

The answer turned out to be no.  All I needed to do was to replace the dbsprockets primitives with a class structure.  This turned out to be about an 8 hour job.  Not bad for a day’s work.  And now you can do things like:

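from dbsprockets.declaratives import FormBase
from formencode import Schema
from formencode.validators import FieldsMatch
# widget classes assumed to come from ToscaWidgets' tw.forms
from tw.forms import TextField, PasswordField
from tg import expose, validate
import pylons
from myWebapp.model import User
# assumed location of BaseController in a quickstarted TG2 app
from myWebapp.lib.base import BaseController
...
# form generated from the User model, limited to the registration fields
class RegistrationForm(FormBase):
    __model__ = User
    __limit_fields__ = 'user_name', 'email_address', 'display_name', 'password', 'verify'
    __required_fields__ = 'user_name', 'email_address', 'display_name', 'password', 'verify'
    # override a field with a widget type, or with a widget instance
    email_address = TextField
    verify = PasswordField('verify')
    # form-level validator: password and verify must match
    __base_validator__ = Schema(chained_validators=(FieldsMatch('password',
                                                                'verify',
                                                                messages={'invalidNoMatch':
                                                                          "Passwords do not match"}),))

registration_form = RegistrationForm()

class ATurboGearsController(BaseController):
    @validate(form=registration_form.__widget__)
    @expose('genshi:sproxtest.templates.register')
    def register(self, **kw):
        # hand the form to the template through the pylons context
        pylons.c.widget = registration_form
        return dict(value={})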

This turns out to be a much simpler way of handling forms, because now you can subclass to your heart’s content (for instance, make one base user form which you subclass for admin, registration, profile and login), or even come up with your own wacky WidgetSelector that chooses widgets for you and subclass as you desire. Here is initial documentation, which I will express in more detail at a later date when I have more time to do it the right way.

The simple fact is that you can customize form widgets with ease, limit fields to a set of fields, drop a few fields, basically anything you can think of to change how the database schema actually displays on the page.  The same is true of field validators.  Simply define an attribute of the class that has a validator, widget instance, or widget type, and dbsprockets will do the right thing.  If you want to override both the field and the validator, all you must do is create a widget with the validator attribute populated.  The greatest thing is that your forms (and tables!) will change with you as your database schema migrates.

Right now I am keeping the primitives way of getting the values from the provider to populate the tables/forms.   This is likely to change in the future.  The primitive way of doing forms is now deprecated.  I will be adding deprecation warnings to the code.  Hooray!

Look for ajax support in the near future on both the view and data side of things.  I have a clear understanding of ExtJS (2.0.1), JSON, and LGPL now and I am not afraid to use it.  In the meantime, a dev version of dbsprockets 0.5 is up at pypi.

SQLAlchemy Migrate Process Hiccups

Friday, December 5th, 2008

So, I’ve done migration processes for two medium-large database schemas (50-100 tables) and I have found what I believe to be a disconnect between the process of migrating a database and the process of developing a database application.


The Problem
——————-
Here is the problem in general.  I am sorry it is so long winded, but it’s hard to see what is going on without this full expression.

You create a schema, and use that schema to create a database.  Maybe you are using Pylons, and using setup-app to create the schema. Everyone is happy in schema land, and then someone realizes you need to make a change, and maybe even more changes in the future.  You decide to use migrate, because hell, migrate will make things easier. So, you write the migration, modify your model code and everything is hunky-dory.  Clients are happy, your developer feels like he has a maintainable codebase.  Your project begins to grow.  Awesome.

Not Awesome.  Your second developer needs a development environment. You give them your model, they setup-app it and they are ready to go. While they are plugging along, you realize you need a new migrate, so you create it, and upgrade your development version of the database, which is now at migration 2.  The production also goes off without a hitch, it’s now on 2.

What about your new development buddy?  When he ran migrate_version, he set his database up to version 0.  So, migrate will try to do 0->1 and then 1->2, but 0->1 will fail because his database is already at version 1 even though his version table says it’s 0.  Now he is in pain, and re-creates his entire database to get moving, and the whole cycle starts over again, although he can work… until you do the next migration.
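
The failure described above is easy to reproduce in miniature.  The following is only a sketch using the standard library's sqlite3 and not migrate itself: the student table, the schema_version table, and the function names are made up for illustration.  The database is built from the current model (which already includes the column that the first migration adds), stamped as version 0, and then the 0->1 upgrade is attempted.

```python
import sqlite3


def create_from_current_model(conn):
    # What the new developer gets from setup-app: the *current* model,
    # which already has the "grade" column that migration 1 introduces.
    conn.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)")


def stamp_version(conn, version):
    # Hypothetical stand-in for putting the database under version control.
    conn.execute("CREATE TABLE schema_version (version INTEGER)")
    conn.execute("INSERT INTO schema_version VALUES (?)", (version,))


def upgrade_0_to_1(conn):
    # The first migration, written back when "grade" did not exist yet.
    conn.execute("ALTER TABLE student ADD COLUMN grade INTEGER")


conn = sqlite3.connect(":memory:")
create_from_current_model(conn)
stamp_version(conn, 0)  # the version table claims 0; the schema is really at 1

try:
    upgrade_0_to_1(conn)
    outcome = "upgraded"
except sqlite3.OperationalError as exc:
    # The column is already there, so the upgrade blows up.
    outcome = "failed: %s" % exc

print(outcome)
```

The version table and the real schema disagree, so the very first upgrade step trips over a column that already exists.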

This is not ideal, and I have been struggling with it at one of my clients.  In that situation, I am the developer buddy, bopping in from time to time to lend a hand.  I spend probably 5% of my time getting up and running, where if their migration system was somehow linked to their database model, I could be synchronized, and actually use it for a development tool, rather than a dusty old production tool.

Solution 1
—————
So, I came up with my own solution, and have talked to a close friend about an alternative solution.  I’d like to discuss both of them as they pertain to migrate, in the hopes that we can improve migrate, and  re-connect the development environment with the production one.

My solution is to connect the model base code to the migration version.  The way I accomplish this is to add a version.py file at the root of my model code, which has, incredibly, the version of the migration changeset that matches this model definition.  I also have a custom database creation script which creates a migrate_version table and fills it with a record of the correct version number.  Now I can svn up, and migrate up and everyone is happy. The problem with this method is that I have to maintain a damn file with the version number, and not go insane doing it.  Since my model contains 3 different database schemas in it, that means I have to maintain 3 variables in my version.py.  Personally, I don’t mind this, but I had to write more documentation than the amount in this post to make certain that my successors (or me in 6 months) can figure out what the heck is going on.
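
A rough sketch of this approach, using the standard library's sqlite3 so that it runs on its own.  Everything named here is an assumption for illustration: the constants stand in for what would live in version.py, create_schema() stands in for creating the tables from the real model, and the migrate_version column layout (repository_id, repository_path, version) is assumed to mirror the table migrate keeps.

```python
import sqlite3

# Would live in model/version.py; bumped in the same commit as each migration.
MIGRATE_VERSION = 2
REPOSITORY_ID = "myapp"          # hypothetical repository name
REPOSITORY_PATH = "migrations"   # hypothetical repository path


def create_schema(conn):
    # Stand-in for creating every table from the model as it is *today*.
    conn.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)")


def stamp_version(conn):
    # Record which migration changeset the freshly created schema matches.
    # Column layout is an assumption, modeled on migrate's version table.
    conn.execute(
        "CREATE TABLE migrate_version ("
        "repository_id VARCHAR(250) PRIMARY KEY, "
        "repository_path TEXT, "
        "version INTEGER)")
    conn.execute(
        "INSERT INTO migrate_version VALUES (?, ?, ?)",
        (REPOSITORY_ID, REPOSITORY_PATH, MIGRATE_VERSION))


def create_db(conn):
    create_schema(conn)
    stamp_version(conn)
    conn.commit()


conn = sqlite3.connect(":memory:")
create_db(conn)
stamped = conn.execute("SELECT version FROM migrate_version").fetchone()[0]
```

A new development database starts out already stamped at the version the model was written for, so the next upgrade only runs the migrations that are actually missing.  With several schemas in one model, each would need its own constant and its own stamped row.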

Solution 2
————–
The second solution, which Mark came up with, is to preserve the initial model.  Then, you do all of the migrations to bring you up to the current codebase.  This does not require any version file tracking, and if your migrations add boilerplate entries, you don’t have to record them in a second place. I think solution 2 is reasonable, but could be more time consuming to execute (which is probably not really a problem with everyone’s 2GHz machines these days).  The problem I see is that the boilerplate entries you make for production may not be the same as the ones you make for development.  This makes testing a bit harder if they differ. I also feel that solution 2 is more prone to problems, simply because there are a lot more cranks to turn to get a development database.
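
Here is a comparable sketch of the second approach, again with plain sqlite3 and not migrate.  The preserved initial schema, the two upgrade functions, and the schema_version table are all invented for the example.

```python
import sqlite3


def create_initial_schema(conn):
    # The preserved "schema 0", frozen as it was before any migration.
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE schema_version (version INTEGER)")
    conn.execute("INSERT INTO schema_version VALUES (0)")


def upgrade_1(conn):
    conn.execute("ALTER TABLE student ADD COLUMN grade INTEGER")


def upgrade_2(conn):
    # A migration that also inserts a boilerplate row.
    conn.execute("CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO school (name) VALUES ('Default School')")


# Ordered list of every migration written so far.
MIGRATIONS = [upgrade_1, upgrade_2]


def create_db(conn):
    # Start from schema 0 and replay each migration in order.
    create_initial_schema(conn)
    for number, upgrade in enumerate(MIGRATIONS, 1):
        upgrade(conn)
        conn.execute("UPDATE schema_version SET version = ?", (number,))
    conn.commit()


conn = sqlite3.connect(":memory:")
create_db(conn)
version = conn.execute("SELECT version FROM schema_version").fetchone()[0]
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
schools = conn.execute("SELECT count(*) FROM school").fetchone()[0]
```

Nothing has to be tracked by hand here, and the boilerplate row arrives with the migration that defines it.  The cost is that every new database replays the full history, and any boilerplate that should differ between development and production has to be handled inside the migrations themselves.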

How can migrate help
—————————–
The way to provide solution 1 to the problem in migrate would be to allow the developer to connect his migrations to his model code, and allow migrate some level of control over the “versions” module in the model code.  So, when you do a migrate commit, you are also saying that the model code supports this version of migration.  It will probably only take me  a few hours to make this happen.  We then create an easy way to create the migrate_version table by importing something.  This way you can have a custom createdb script that also creates the migrate_version table.  Perhaps we could even provide a reasonable createdb template.

Solution 2 only requires us to provide a custom creation script.  For this we will have to preserve the initial schema (schema 0) and then upgrade.  We could probably also include a template instead of a script so that you can manage boilerplate on your own.

My own conclusions
—————————-
I lean more towards solution 1, mostly because I feel that it will provide for easier boilerplating, because you will have access to your whole model at once, instead of bits and pieces of your model as it moves from one version to the next.  I also think the boilerplate for dev. is likely to be a lot different than that of prod for most folks, and I feel like solution 1 offers an easier way of handling it.  It also seems like solution 1 is much more efficient, because you don’t have to go and re-do all of that migration work; you have already arrived at the correct-for-now solution.

If you made it this far congratulations!  Now, go grab a cup of coffee, stretch your arms, come back and tell me what you think.