Another brilliant list from McSweeney's: Email addresses it would be really annoying to give out over the phone.
With some of these youtube videos, even if parts are fake (and I have no reason to think this isn't legit), enormous imagination and effort were required. This video gets the alpinegizmo seal of approval.
If you want one new show to watch now that Battlestar Galactica has ended, make it NBC's new show, Kings. I haven't been this excited by imaginative television in years. In the US you can catch it on hulu. The ratings weren't good, by the way, so let's all cross our fingers and hope NBC gives this show a chance to breathe.
If you prefer to wait and watch a whole season of something at once, take a look at the first season of Damages, also on hulu. Both Glenn Close and Ted Danson play characters who will stop at nothing to get what they want -- and of course they are in conflict, and can't both win. Ted Danson is driven by greed and ambition; Glenn Close wants to bring justice to those ruined by Ted Danson. She's on the good side of this battle, but she's got more darkness in her than a black hole. And I've left out the appealing main character -- Glenn Close's young associate -- will she succumb and become just as twisted as her mentor?
[Part of the Rails on XML series.]
An essential part of working with XML and Ruby is being able to serialize ruby objects to XML. There are several approaches out there for doing this; the one that appeals to me most is the one implemented by Rails in ActiveSupport. It provides a nice to_xml method for your ActiveRecord objects, and deals with arrays and hashes that contain ActiveRecord models. For example, here's what you might get from a really simple blog engine in response to Page.find(:first).to_xml:
<?xml version="1.0" encoding="UTF-8"?>
<page>
  <allow-comments type="boolean" nil="true"></allow-comments>
  <body>blah blah blah</body>
  <id type="integer">1</id>
  <title>my first post</title>
  <created-at type="datetime">2008-01-08T17:52:54+01:00</created-at>
  <updated-at type="datetime">2008-11-26T17:01:51+01:00</updated-at>
  <user-id type="integer">1</user-id>
</page>
There may be more information here than we will need for XSLT rendering (eg it probably doesn't matter that id is an integer), but I like the thoroughness of this approach, and the way they handle arrays and hashes is quite nice.
For our purposes we need to be able to serialize to XML anything we might put in a controller instance variable. The xml_serialization gem adds support for integers, symbols, and strings, and arrays and hashes that contain them.
For example, {:numbers => [1, 2], :strings => ['one', 'two']}.to_xml produces:
<?xml version="1.0" encoding="UTF-8"?>
<hash>
  <numbers type="array">
    <number>1</number>
    <number>2</number>
  </numbers>
  <strings type="array">
    <string>one</string>
    <string>two</string>
  </strings>
</hash>
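To make the shape of that output concrete, here is a minimal, standalone sketch of how a hash of arrays could be turned into XML like the above. It is not the code from the xml_serialization gem or from ActiveSupport; the SketchXml module, its method names, and the naive "drop the trailing s" singularization are all assumptions made for illustration.

```ruby
# Illustrative sketch only; the xml_serialization gem's real code may differ.
module SketchXml
  # Escape the characters that are special in XML text.
  def self.escape(text)
    text.to_s.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
  end

  # Naive singular form for array items: "numbers" -> "number".
  def self.singular(name)
    name.end_with?('s') ? name[0..-2] : "#{name}-item"
  end

  # Serialize a value (hash, array, or scalar) as an element called +name+.
  def self.element(name, value, indent = '')
    case value
    when Hash
      inner = value.map { |k, v| element(k.to_s, v, indent + '  ') }.join
      "#{indent}<#{name}>\n#{inner}#{indent}</#{name}>\n"
    when Array
      inner = value.map { |v| element(singular(name), v, indent + '  ') }.join
      "#{indent}<#{name} type=\"array\">\n#{inner}#{indent}</#{name}>\n"
    else
      "#{indent}<#{name}>#{escape(value)}</#{name}>\n"
    end
  end

  # Wrap the serialized hash in a root element with an XML declaration.
  def self.to_xml(hash, root = 'hash')
    %(<?xml version="1.0" encoding="UTF-8"?>\n) + element(root, hash)
  end
end

puts SketchXml.to_xml({ :numbers => [1, 2], :strings => ['one', 'two'] })
```

The recursion is the whole trick: hashes become nested elements, arrays become a typed wrapper around singular child elements, and everything else is escaped text.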
We also cover the important case where you already have XML (eg from a database or web service) that you want to pass through unmolested.
@data = RawXML.new '<tag>content</tag>'
In this case, @data.to_xml returns the same string that went in, i.e. '<tag>content</tag>'.
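The pass-through idea is simple enough to sketch in a few lines. What follows is a hypothetical stand-in written for illustration, not the gem's actual source; in particular the ignored options argument is an assumption, added only so the method can be called like other to_xml methods.

```ruby
# Hypothetical sketch of a pass-through wrapper; not the gem's actual source.
class RawXML
  def initialize(xml)
    @xml = xml
  end

  # Return the wrapped markup unchanged. The options argument is accepted only
  # so the call looks like other to_xml methods; this sketch ignores it.
  def to_xml(_options = {})
    @xml
  end
end

data = RawXML.new('<tag>content</tag>')
puts data.to_xml
```

Because the wrapper never parses or escapes its contents, it is up to the caller to make sure the string really is well-formed XML.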
[Part of the Rails on XML series.]
I am pleased to report that Mirai España has agreed to release as free software the code we wrote for using XSLT views in Ruby on Rails applications. I have packaged this software as two separate components: the xml_serialization gem for serializing ruby objects to XML, and the xslt_render rails plugin.
At this point documentation is lacking. However, there isn't that much code and the tests do illustrate how the pieces are intended to be used, so the adventurous might be able to drop this into an app and get it working. I do plan future posts that will explain more of the details, which I'm sure will help. However, I know a few of you are interested in seeing the code now, so I don't want to delay in releasing it.
This week I was noticing one of those small, but possibly revealing cultural differences between France and the US. It has to do with what doctors wear at work. I was having a routine examination done, and the specialist I was there to see -- male, mid-40s -- was wearing jeans and a long-sleeved fantasy sport league polo shirt. No white lab coat or stethoscope or any other of the traditional trappings of the medical profession (traditional, that is, from an American perspective). At first when I saw him coming and going about the office, I wasn't sure if he was the doctor, or a technician there to repair something. If I weren't used to the fact that French physicians dress extremely casually, it never would have occurred to me that he could be the doctor.
I'm not complaining, mind you. Just curious about what else might be different, but less obvious.
- Weeks at home: 30
- Country points: 4
- Airline flights: 36
- Photos: 3222
Countries visited for the first time: Portugal, Russia, Latvia, Lithuania
Flight departures: Marseille (12), Madrid (11), Lyon (2), Hamburg (2), Nice (2), Riga (2), Amsterdam (1), Berlin (1), Paris (1), Porto (1), St Petersburg (1)
Note for travelers: calling 112 in Portugal is not guaranteed to get you someone who speaks English.
Today is a travel day for me, and I'm writing these posts while sitting in various airport terminals. First Hamburg, and now Frankfurt. What I find rather odd is that it is possible to travel within Europe without ever showing any ID. I used my credit card to get my boarding pass, and I showed the boarding pass at security and at the gate. At the gate there wasn't even a person involved, just a barcode scanner.
Yesterday I decided it would be a good idea to modify the installation of mysql that I use on my laptop so that the data on the disk is encrypted. With google I found a few people looking for help with this, but no solutions, so I'll share what I figured out.
First some configuration details. I'm running mysql 5.0 on a macbook pro with leopard (OS X 10.5.6). I'm pretty sure the same approach I'm using will work across any similar setup, but your mileage may vary. Also, I installed mysql from mysql.com, but you should be able to easily adapt this technique to work with the version from macports.
After shutting down the mysql server, the first thing I did was to create an encrypted sparse disk image (using disk utility), move /usr/local/mysql/data to the new disk image, and create a symbolic link from /usr/local/mysql/data to the new, encrypted location. I expected that to pretty much take care of it (and if it had, I wouldn't be blogging about it). When I looked at why mysql would no longer start up, I found that the wrong user now owned the data files -- I owned everything, instead of the mysql user. Nothing chown can't fix, right? Wrong.
Turns out that the files and directories in mounted disk images are owned by whoever mounts them -- the ownership information stored in the filesystem within the disk image is ignored. Yikes! Fortunately, this is merely the default behavior, and can be switched off. If you bring up a mounted disk image in the finder's get info window, at the very bottom, inside the "Sharing & Permissions" section, you will find a checkbox labeled "Ignore ownership on this volume". It is checked by default; you want to turn this off.

You will notice that I gave everyone read and write permission to this volume. I'm not super happy about this, but so far I haven't found any other way to allow the mysql user to be able to reach inside and do what it needs to do. Of course, what this volume contains is a data directory that only the mysql user is able to access, so I don't feel too bad. Nevertheless, if someone finds a solution to this, let me know, ok?
The other problem I've found comes up when I want to unmount the mysql data disk image. I usually find the finder telling me I can't eject this volume because it is still in use. In these cases lsof is my friend, as in "lsof /Volumes/mysql-data". And the culprit is usually mds, aka spotlight. You can stop this by going into System Preferences > Spotlight > Privacy and adding the encrypted disk image volume to the list of things spotlight should not index. Unfortunately it seems that this preference is not sufficiently persistent -- I keep having to re-set it -- so I may resort to disabling spotlight entirely, or try spotless.
Of course, it is also necessary to mount the encrypted disk image before starting mysql, and to stop mysql before unmounting it. Kudos to anyone who goes to the trouble of cleanly automating these steps.
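As a starting point for that automation, here is a rough sketch of the ordering described above: mount, then start; stop, then unmount. The image location is hypothetical, and the mysql.server path assumes the layout of the mysql.com binary install, so treat every path as an assumption to adjust. It defaults to a dry run that only prints the commands.

```ruby
# All paths below are assumptions; adjust them for your own machine.
IMAGE  = File.expand_path('~/mysql-data.sparseimage') # hypothetical image location
VOLUME = '/Volumes/mysql-data'
MYSQL  = '/usr/local/mysql/support-files/mysql.server'

# Commands to bring the database up: mount first, then start the server.
def up_commands
  [['hdiutil', 'attach', IMAGE],
   ['sudo', MYSQL, 'start']]
end

# Commands to take it down: stop the server first, then unmount.
def down_commands
  [['sudo', MYSQL, 'stop'],
   ['hdiutil', 'detach', VOLUME]]
end

# Run each command in order, stopping at the first failure.
# With dry_run set, only print what would be executed.
def run_all(commands, dry_run = true)
  commands.each do |cmd|
    puts cmd.join(' ')
    next if dry_run
    system(*cmd) or raise "command failed: #{cmd.join(' ')}"
  end
end

run_all(up_commands)
```

Calling run_all(down_commands, false) would actually execute the shutdown sequence. This sketch does nothing about the spotlight problem, so the detach step can still fail if mds has files open on the volume.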
[Part of the Rails on XML series.]
Two really nice things flow from using XSLT for the views in the way I described last time, in Part 4.
First is that you get partials for free. To understand why this is so, you need to understand a little about how the XSLT language works. When you are writing a view in XSLT, you are describing a transformation from XML to HTML. You are saying, when you see this in the XML coming in, produce that in the HTML coming out. This means that setting up a controller action to re-render part of a page is staggeringly simple. All you have to do is provide a subset of the XML used to render the full page, and only the relevant portion of the HTML will be generated. You can use exactly the same XSLT view for both the full page and the partial.
The second thing you get is a much cleaner separation between your controllers and your views. Everything the views need will be, and must be, provided in the XML. And given the rendering pipeline we established, that means everything will come from the controller instance variables. Of course most everything in normal rails views comes from controller instance variables too, but it is always possible to do something dirty in an rhtml view, like reach around to the models and include a bunch of business logic in your view. Since we have no ruby code in our views, these cheats are simply impossible. That may seem like a limitation, but it can be liberating to be able to say that there is a contract between the backend team and the frontend team whereby the backend team will produce XML that contains everything that’s needed, and the frontend team will produce HTML from that XML. We found it was a good thing to have a real boundary there.
[Part of the Rails on XML series.]
One of our project’s major goals was to use XSLT for the views, and to integrate an XSLT rendering path into Rails in as natural a way as possible. This meant that each typical controller action would have a corresponding .xsl file, rather than a .html.erb file.
An XSL stylesheet is a program that describes a transformation; in our case a transformation from XML to XHTML. In some ways the XSL transformation language (XSLT) is very awkward, but it is a programming language. For various reasons – because our client already had XSLT files for his site, because there were members of the team who knew XSLT and not Ruby, and because it just seemed cleaner – we decided to see how far we could get if we used pure XSLT for the Rails views, with no embedded Ruby.
We realized that if we didn’t use Ruby in these views, we would not have direct access to the controllers’ instance variables. We decided the most Rails-like approach we could think of would be to have the framework automatically make all controller instance variables available to the XSLTs, by serializing this data to XML, and then provide the resulting XML as the input to the XSLT view. At this point we decided it was time to dig in and read through the Rails ActionView code and figure out how to extend it to work this way.
We found it is pretty easy to add a new renderer to Rails, and we experimented with that approach for a while. Ultimately we found it better for our application to write a simple module that we include in our controllers. This module defines a render_through_xslt method that works like this:
module XSLTRender
  def apply_xslt(template_file, xml)
    # use xsltproc to perform the transformation
  end

  def instance_hash
    hash = instance_variables.inject({}) do |vars, var|
      key = var[1..-1]
      vars[key] = instance_variable_get(var)
      vars
    end
    hash['flash'] = flash
    hash
  end

  def render_through_xslt(options = {})
    template = find_template(options)
    page_xml = instance_hash.to_xml
    html = apply_xslt(template, page_xml)
    render :text => html
  end
end
The final version is a bit more complex, as we added support for rendering partials and doing internationalization.
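To show the two interesting pieces outside of Rails, here is a small standalone sketch. PageController and its variables are made up, and the body of apply_xslt is only one plausible way to fill in the stub: it assumes xsltproc is on the PATH and relies on xsltproc reading the XML document from standard input when the file argument is "-". It is not the code from the released plugin.

```ruby
require 'open3'

# Run xsltproc on an XML string; assumes xsltproc is installed and on the PATH.
# The "-" argument asks xsltproc to read the document from standard input.
def apply_xslt(template_file, xml)
  html, err, status = Open3.capture3('xsltproc', template_file, '-', stdin_data: xml)
  raise "xsltproc failed: #{err}" unless status.success?
  html
end

# Stand-in for a controller; PageController and its variables are made up.
class PageController
  def initialize
    @title = 'my first post'
    @numbers = [1, 2]
  end

  # Collect every instance variable into a hash keyed by name without the "@".
  def instance_hash
    instance_variables.each_with_object({}) do |var, vars|
      vars[var.to_s[1..-1]] = instance_variable_get(var)
    end
  end
end

p PageController.new.instance_hash
```

Calling to_s on each variable name keeps the sketch working whether instance_variables returns strings or symbols. In a real controller you would also want to leave out the framework's own internal instance variables before serializing.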
Last night I spent a couple of hours integrating disqus into this site. This is a web service that implements comments. My server is still just serving up static web pages, now with a bit of javascript included. Look at the source for this site if you want to see how it works on this end.
So far I'm really delighted with how this switch to jekyll + disqus has gone. I can't recommend this solution if you're not comfortable with dealing with HTML, CSS, and the command line, but if you are a programmer -- especially if you are a ruby programmer -- you may find the simplicity compelling. I am a little concerned in the long run about depending on a company with no visible business model for the comment functionality, but I guess I'm willing to take the risk. Maintaining a viable comment feature in the face of torrents of comment spam is too much work, so I'm happy to outsource the problem.
[Part of the Rails on XML series.]
A bit of overly simplified, sample XML. Don't get hung up on the details; the schema and data are made up. I just want to give you something concrete to look at so I can raise a few issues for you to consider.
<property>
  <type>office space</type>
  <area>100</area>
  <rooms>
    <room size="60" />
    <room size="30" />
    <room size="10" type="bathroom" />
  </rooms>
  <owner>
    <name>John Q. Public</name>
    <contactinfo>
      <phones>
        <phone number="1.555.555.1212" type="voice" use="contact" />
        <phone number="1.666.555.1212" type="mobile" use="contact" />
      </phones>
    </contactinfo>
  </owner>
  <location>
    <position lat="40.5000801086426" lng="-3.37295293807983" />
    <address>
      <street>123 Main St</street>
      <postalcode>28805</postalcode>
    </address>
  </location>
</property>
Creating, editing, and validating the XML
If you are thinking of writing an XML-based web application, where is the XML going to come from? How will it be edited? Where will you implement the business logic to validate that the data being entered makes sense?
Changes to the structure/schema of the XML (migrations)
How will you manage the evolution of the XML schema over time -- in other words, what will take the place of Rails migrations? Will you validate XML schema changes against an XML DTD?
Will (some) users be able to make changes to the XML schema, or will this require programmer intervention?
Finding good solutions to the problems of validation and migration is a key challenge when considering Rails on XML. In our application we generated the XML from other sources, and thereby sidestepped these issues, at least for a while.
[Part of the Rails on XML series.]
Among the first design decisions to be made with any web application are how to represent the data, and where to store it. The assumption behind this set of articles is that you are going to have XML data, probably in some sort of XML database. But why might someone come to this conclusion, and what are the alternatives?
For many applications, modeling the data for a relational database is pretty straightforward, especially after you've done it many times before. Customer records, addresses, phone numbers, orders, login-ids and passwords -- we all know what to do with this stuff. Moreover, there are good tools available, and there are lots of talented people around who know how to use them. Both MySQL and PostgreSQL have their weak points, but are nevertheless pretty well-behaved and a small team isn't going to waste much time fighting with either one of them. Rails, with ActiveRecord, migrations, and validations, provides a good foundation for keeping complex applications relatively sane.
By contrast, choosing an XML data representation is much riskier. The tools are less mature, and few there be that know how to use them. Don't do it unless you have a good reason.
One obvious domain for applying the XML web application toolchain is for working with an existing set of XML documents. I suspect this is a relatively rare case, however.
As a motivating example, imagine being asked to build a commercial real estate site that will have detailed listings for business properties for rent and for sale: farms, office space, retail space, restaurants, hardware stores, etc. New types of properties will be added to the site on a regular basis -- six months from now you may encounter the first listing for a marina, or a mine. The web site is expected to have rich editing and search interfaces, making it possible, for example, to search for bars for sale with seating for more than 50 customers, or for a 2-room office with an on-street entrance and its own bathroom.
In this case, modeling each property as an XML document seems like an interesting idea. Attributes that many or all of the properties have can be managed uniformly (eg listing agent, asking price, size in square-meters, taxes), and more unique characteristics (eg number of berths of various lengths in a marina) can also be represented, edited, and searched.
Once you have decided to use XML for at least some of your data, you have to decide if you are going to be a purist and use XML for everything, or not. The cleanliness of a pure approach appealed to us, but didn't seem practical. We already had a login system we liked, based on restful authentication and open-id; we planned to use attachment-fu for uploaded images; and in general we wanted to leave the door open for any and all Rails plugins. Our client had already chosen to use DB2, which supports hybrid use of both XML and SQL, and we decided to take advantage of that.
Another interesting choice that I've been playing with, but haven't used for anything real, would be to use one of the new document-oriented data stores, such as CouchDB. It is my belief -- but I can't completely justify it -- that if you are planning to exploit the hierarchical structure that an XML document allows you to use, then XQuery is going to win over what you can do with CouchDB.
[Part of the Rails on XML series.]
I'm sure the first question in many readers' minds is "Why bother?". The intersection between the XML community and the Ruby community is very small. A rubyist might use XML to implement a RESTful web service, but that's going far enough, thank you very much. However, if you've been paying attention to the pure XML workflow for creating web applications, it has some interesting ideas, and some exuberant followers. I'm not sure I'm tapped into the most forceful propaganda for this way of thinking, but you can sample it here, in this quote from 2006:
A profound change is likely about to shake up your world if you’re a web developer, one that I suspect will make the recent efforts in the AJAX space pale in comparison as far as its effect. Very quietly, over the last few weeks, the Mozilla team has been upgrading their XForms capabilities ...
In point of fact, you can create some incredibly rich “web experience” applications that are mixed XHTML and XForms, and can do so in a remarkably short amount of time. These are not simply empty text fields that you lay out as you do with HTML input elements -- XForms bears about as much resemblance to HTML forms as a Bengal tiger has to a ferret. http://www.oreillynet.com/xml/blog/2006/03/why_xforms_matter_revisited.html
and in this quote from 2007:
[E]nlightened innovators ... have seen the real world benefits of dumping object middle-tier stacks and relational databases and going with a pure declarative approach to solving business problems based around XForms/REST and XQuery. These [are] the people that are going to lead innovations in application development. ...
At one of the XForms sessions a young Ruby advocate stated that he thought his Ruby code was "beautiful" but he did not himself think that XForms code was "beautiful". Most of the people in the audience agreed that XForms was beautiful but like any new language, it takes getting used to. ...
XForms/REST/XQuery to me is indeed beautiful. Its beauty comes from its ability to quickly map real-world requirements into working systems with high fidelity. No other system in the world approaches its elegant architecture. http://datadictionary.blogspot.com/2007/12/impressions-of-xml-2007-in-boston.html
To put these remarks in context, I need to back up and give an outline of the entire XML web application toolchain.
- XML database: your application data lives here, in XML
- XQuery: database query language, it takes the place of SQL
- XSLT: transforms XML to HTML (can be applied in the DB, your app server, or in the browser)
- XForms: forms that post XML back to the database
Note that if you buy into all of this, you don't really need an application server any more, be it implemented in Rails, or not. (If you want to play with some examples of this pure XML web app architecture, I think the sample apps distributed with eXist, an open source, native XML database, are the best place to start.)
Of course it is now 2009, and we've yet to see the XForms/REST/XQuery approach gain traction in the way you might've expected, if you bought into what was being said a couple of years ago. Yes, in some ways this XML architecture for web application development is truly elegant -- but some pieces of the puzzle are rather hideous.
I'll have more to say about what's good, and bad, in future postings. I'll also talk about various approaches we explored for combining these tools with Rails.
For much of 2008 my partner Larry Baltz and I were exploring how to build Ruby on Rails applications on top of an XML database. Our client was convinced that some of their data was inherently better suited to an XML representation, and wanted to see if we could craft a productive development environment from a marriage of these technologies. The journey has been quite interesting, revealing both pitfalls and pleasant surprises. Hopefully this account will save others some grief, and also point out some interesting directions for future work.
In the series of postings to follow I will cover various aspects of this journey.
I'm starting off the New Year right -- not simply with a long overdue blog posting, but also a whole new blogging infrastructure. For a while I've been using Typo, but I've come to realize it is way too bloated for my simple needs. So now I am using jekyll, which generates the static site you see here from a simple git repository. Take a look if you want to see how it works. So far I think it's a big improvement -- for you, because the site is so much faster, and for me, because it is much simpler and easier to maintain.
If you are following the feed for my blog, I recommend you update your subscription. So as not to lose readers who've been using Typo's RSS feed, I added some rudimentary RSS support to what I found in jekyll, but the atom feed is more robust.
I am painfully aware that I've spent several hours playing with the code for this blog today, and only a few minutes writing. This may say something about which activity I enjoy more, but hopefully I will sustain a more regular posting schedule in 2009.
Rather than producing a native XML database like eXist or MarkLogic, IBM has chosen to add XML features to DB2, creating an interesting -- and convoluted -- hybrid that IBM has branded "pureXML".
If you've drunk the XML database koolaid, you may be thinking about how going down this path is going to (1) give you a toolchain where web app development becomes a much more declarative process, with better separation of concerns; (2) give you more flexibility than a relational database; (3) fit more naturally with a web-services model; and/or (4) get you moving toward the semantic web.
For me, one of the attractive prospects would be to clean up the process of generating HTML. If the data coming out of your database is already XML, then it's not much of a leap to use an XSLT to transform that XML into HTML. Traditional view templating schemes (such as erb) can be a mess, so maybe this world of XML databases offers a cleaner solution.
Architecturally you've got three choices of where to apply an XSL transform -- in the database, in your application server, or in the browser. Each of these has advantages and disadvantages, but as a first pass we've been exploring having DB2 do the job. The results have been somewhat disappointing.
To get DB2 to apply an XSL stylesheet to some XML, you have to use an SQL function called XSLTRANSFORM. The XSL stylesheet has to be provided as the result of an SQL expression, which generally means that it is stored in an XML column in the database. The XML document to be transformed might also come from an XML column, or it might be the result of an XQUERY embedded in the SQL you've been forced to use as the overall structure for this operation. Given this structure it's not surprising that xsl:include declarations are not supported, but this is definitely problematic. Also disappointing is that stylesheets incorporating XQUERY statements are not supported.
A simple example in which the XML to be transformed is coming from the xmldata column in the employees table:
SELECT XSLTRANSFORM(xmldata
USING (SELECT xsldata FROM xslts WHERE name = 'employee.xsl')) AS html
FROM employees WHERE id = 833038373
It's a little more fun when the source of the XML is an embedded xquery:
SELECT XSLTRANSFORM (
(SELECT XMLCAST(XMLQUERY('$data//Phones' PASSING xmldata AS "data") AS XML)
FROM employees WHERE id = 833038373)
USING (SELECT xsldata FROM xslts WHERE name = 'phones.xsl')) AS html
FROM employees WHERE id = 833038373
It may look a bit strange that the WHERE id = 833038373 clause occurs twice. The second instance could be pretty much anything that returns one row. While many SQL databases will accept a simple "select 1", DB2 insists on a FROM clause, such as "select 1 from employees fetch first 1 rows only". The only really important part of the last line above (i.e. "FROM employees WHERE id = 833038373") is that it return exactly one row.
XSLTRANSFORM isn't the only case where DB2's pureXML forces you to use an unholy mixture of SQL and XQUERY. As IBM explains, "With plain XQuery you can not exploit full-text search capabilities provided by the DB2 Net Search Extender (NSE). You need to involve SQL for full-text search." This means doing something like this:
xquery
<Employees> {
for $e in db2-fn:sqlquery(
"SELECT xmldata FROM EMPLOYEES.EMPLOYEES
WHERE CONTAINS(xmldata,
'SECTION(""/Employee/Info/Address/LocationInfo"") ""madrid"" ') > 0"
)
let $ei := $e/Employee/Info
return
<EmployeeInfo>
{ $ei/Name, <Departments> {$ei/DepartmentInfo/Name} </Departments> }
</EmployeeInfo>
} </Employees>
For clarity, this example has been broken across multiple lines. Don't expect DB2 to be happy with input this easy to read.
This whole business of XQUERY embedded in SQL, and vice versa, can get amazingly messy. We've spent quite a bit of time figuring out how to do quoting and casting in order to keep both sides happy. And although DB2 supports SQL sub-queries within XQUERY, so far we've found it impossible to figure out how to quote SQL sub-queries within an XQUERY that in turn is embedded in SQL, which is necessary if you want to use XSLTRANSFORM on the result of an xquery like the one above.
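For the simpler two-level case shown earlier (SQL inside db2-fn:sqlquery), the quoting can at least be mechanized. The sketch below is an assumption-laden illustration, not our in-house tooling: the helper names are invented, and it relies only on the standard rules that an SQL literal doubles embedded single quotes and an XQuery literal doubles embedded double quotes. It does not solve the three-level nesting problem described above.

```ruby
# Quote a string as an SQL literal: wrap in single quotes, double any inside.
def sql_literal(text)
  "'" + text.gsub("'", "''") + "'"
end

# Quote a string as an XQuery literal: wrap in double quotes, double any inside.
# Note: does not handle "&", which starts an entity reference in XQuery strings.
def xquery_literal(text)
  '"' + text.gsub('"', '""') + '"'
end

# The full-text search argument from the example, written without any escaping.
search = 'SECTION("/Employee/Info/Address/LocationInfo") "madrid"'

# Embed the search argument in SQL, then embed the SQL in the XQuery call.
sql = "SELECT xmldata FROM EMPLOYEES.EMPLOYEES " \
      "WHERE CONTAINS(xmldata, #{sql_literal(search)}) > 0"
puts "db2-fn:sqlquery(#{xquery_literal(sql)})"
```

Writing each layer unescaped and quoting it only at the moment it is embedded in the next layer keeps the doubled quotes out of the code you actually have to read.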
For a new project we've been using IBM's DB2 database and its pureXML features. This has been my first serious exposure to the brave new world of building web applications on an XML platform. Now that I've been doing battle with DB2 for seven weeks, I thought I'd give a progress report. In subsequent posts I'll try to share what we're learning about how well this all works.
In the past I've used SQLite, MySQL and PostgreSQL (usually underneath Rails) for web development. Compared to these open source databases, my first impression of DB2 has been -- to be blunt -- that it sucks. Now I'm aware that proponents of DB2 (see Zen And The Art Of Ruby Programming, for example) will tell you that because Rails is database agnostic (with a mysql bias) it doesn't show off the strengths of DB2's advanced database features. Apparently, DB2's strengths include "utmost speed, pureXML, compression, replication, high availability, [and] affordable 24/7 support." That may be true. I feel a bit like I woke up one morning, and found an 18-wheeler in my garage. It may be extremely capable and powerful, but so far, I'm just feeling proud that I've managed to get it down the driveway and onto the street without hitting anything or killing anyone. I still miss all the things that made driving my car a comfortable experience.
IBM, if you're listening, there are a few simple things you could do to make the transition to DB2 less painful:
Put some resources into improving the usability of the tools. For starters, put readline support into the command line processor, and modify the command editor in db2cc to wrap lines within a fixed-width textarea. (Yes, I'm aware of the history and edit features in CLP.) Figuring out how to do really complex queries with the current tools has been a seriously frustrating experience.
Get rid of silly limits. For example, the shell interface truncates XML results to 4000 bytes. What's the point of offering a shell-based command line processor if you can't use it in shell scripts?
Find some way to make routine administrative tasks, including backup and restore, managing text and xml indexes, and doing server restarts, less of a problem. We're spending way too much time fighting with the basics.
The ibm_db ruby on rails adapter is half-finished. Please get us to the point where "rake test" works.
We're getting past these issues by building in-house tools to smooth over the rough edges. Why bother? pureXML is the answer. In the future I hope to let you know if it proved worth the bother.