Bangkok Unrest

Written by Matt on March 11, 2010 – 2:26 am -

In celebration of my arrival in Bangkok, the opposition party is apparently planning a million-person “red shirt” rally. Exciting! On the bright side, “The UDD [United Front for Democracy against Dictatorship] can only afford to keep its protest going for three to five days. If the government has not fallen by that time, it will have to withdraw and draw up a new strategy.” I always pick the best times to travel. (Mom, don’t worry. I’ll stay safe!)

Subscribe to my RSS feed

Interview with Matt Mullenweg and Mike Little

Written by mike on March 9, 2010 – 6:30 pm -

The interview I did with Matt Mullenweg at WordCamp UK in Cardiff last year has finally made it on to WordPress.tv.

In it, Gurbir Singh of astrotalkuk interviews Matt and me. We discuss the history of WordPress, the open source philosophy behind it, a little about our backgrounds, fame, and… astronomy.

Go watch the interview, it’s pretty cool.


Back to Firefox

Written by Matt on March 8, 2010 – 10:54 pm -

After a good while on Chrome (I can’t search my Twitter stream to say exactly how long), I’m switching back to Firefox as my primary browser, and I’ve actually uninstalled Chrome. Why? I was getting the “Oh snap” failure page all the time, even on Google’s own YouTube! The only support I was pointed to was this page, and when I followed the instructions there and restarted Chrome, everything was gone. The sentence “copy the relevant files from the “Backup User Data” folder to your new “User Data” folder.” is useless when you consider the folder has 50+ files to sort through and I wasn’t sure which one was causing my previous problems. So back to Firefox, and thanks to Xmarks all of my stuff is there. I’m also using this persona, which is pretty sweet. The feature I missed most on Chrome was lame: the ability to click and hold a folder, then release on a bookmark I wanted to open. On Chrome you have to click twice. It bugged me. Now back on Firefox I feel like the browser has a large head.


Distributed Company

Written by Matt on March 8, 2010 – 9:44 pm -

Toni Schneider, the CEO of Automattic, writes 5 reasons why your company should be distributed.


Happy Thirteenth Birthday Megan

Written by mike on March 8, 2010 – 7:00 pm -

Happy Birthday Megs, a teen at last!

Megan

Have a great day. See you later

Gramps, Nan, and Jamie

xoxoxox


Bug Chasing

Written by Dougal on March 7, 2010 – 6:00 pm -

Okay, so in my post about Code Spelunking I mentioned how working on a project can lead you to explore the code because you need to become more familiar with how it works. But it can also lead you to explore the code to figure out why it doesn’t work. In this particular case, I spent many hours puzzling over why something didn’t work correctly, chasing down the root cause, and eventually finding a bug in the WordPress core. I documented the bug in Ticket #12394, provided a patch, and it was committed to core in Changeset [13561], which will be part of WordPress 3.0.

And how did I find this little buglet? As usual, it’s because I was doing something a little off the beaten track. I was working on some code which imports XML data into WordPress, on a scheduled basis (hourly, daily, weekly, etc). During testing, sometimes the images in the imported content would come through fine, and other times, they would be missing the src attribute, without which, there really isn’t an image, is there? So you’d view the post and there would be this big 300-pixel square hole with just the alt text where the image should have been.

At first, I didn’t know why it worked only some of the time. Then I saw the pattern: when I ran the code “manually” via a “Run now” button in my options screen, the images worked. But when the code ran via WP-Cron, they didn’t. My first guess was a timing issue, and that when the cron action hooks fired, some piece of WordPress functionality wasn’t loaded yet. But shunting my execution hook to run at a later point didn’t fix anything.

Next, I decided that one key difference when running manually versus running from cron was me — I was logged in as an admin. And, in fact, after some debugging, I determined that there was no user context at all when running from cron. When I modified the code to run as myself, the image tags came through cleanly. Well, I didn’t want to hard-code the program to always run as me, so I added a user selector to the options so that the owner of the posts could be set.

But then when I started testing again, with users of various roles, the problem cropped up again. In particular, it worked great for a user with the Editor role, but not for the Author role. Digging a little deeper into the differences between the two roles, the thing that jumped out at me is that Editors (and Admins) have the “unfiltered_html” capability.

You see, normally, when you write a post, it is sent through a series of filters which take your free-form writing and turn it into cleaner HTML. One of these filters is called ‘kses’ (which stands for ‘kses strips evil scripts’). This filter is especially important on multi-author blogs where you might not be able to give 100% trust to the other authors. Otherwise, one of them would be able to (for instance) put JavaScript in a post which would steal the cookie information from another user who reads the post. So it is the job of kses to ensure that only “safe” HTML is kept. This would also keep you from embedding things like YouTube videos, Java applets, and other fun, useful things. So users with the unfiltered_html capability set in their profiles are able to post without this filtering.
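To make the idea of keeping only “safe” HTML concrete, here is a minimal allowlist sketch in Python. It is not how kses is implemented (kses is PHP and far more thorough). The `ALLOWED` table, the `AllowlistFilter` class, and the `sanitize` function are all hypothetical names invented for this illustration.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration only; tag name -> permitted attributes.
ALLOWED = {
    "p": set(),
    "a": {"href"},
    "img": {"src", "alt", "width", "height"},
}


class AllowlistFilter(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open(self, tag, attrs, end):
        if tag not in ALLOWED:
            return  # the tag is not on the list, so drop it
        kept = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name in ALLOWED[tag]  # attributes not on the list are dropped
        )
        self.out.append(f"<{tag}{kept}{end}")

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, " />")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is kept, but escaped so it cannot be read as markup.
        self.out.append(escape(data))


def sanitize(html_text):
    parser = AllowlistFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="x()">hi</p>'))
print(sanitize("<img src='people.jpg' onerror='steal()'/>"))
```

In this sketch the `onclick` and `onerror` attributes are discarded while `src` survives, which is the behavior the post expects from kses for a normal image.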

This certainly seemed like a likely culprit, except for one thing: even when post content is filtered through kses, the HTML img tag is not filtered out. And neither is the src attribute on an image. That is specifically supposed to be allowed. An image is a perfectly normal thing to have in a post. So why, oh why, was my src attribute being stripped?

I started looking very closely at the kses library. It’s a rather hairy bit of code, full of complex regular expressions and state-machine logic. But when reverse-engineering how the attribute-cleaning bits work, I noticed something in one of the regular expressions: it was hardcoded to expect a space between the end of an attribute and the closing of a tag. In other words, it expected an image tag to look something like this:

<img width='400' height='300' src='people.jpg' />

But, since my data was coming from an XML source, there was no extraneous space. My image tags looked like this:

<img width='400' height='300' src='people.jpg'/>

Notice the subtle difference? There is no space between the final single-quote around 'people.jpg' and the /> which closes the tag. And because of the way the match was being done, kses was throwing away any attribute that abutted the tag-close in that fashion.
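The effect can be reproduced with a small Python sketch. The two patterns below are hypothetical stand-ins, not the actual kses regular expressions: `STRICT_ATTR` assumes an attribute only counts when its quoted value is followed by whitespace, and `TOLERANT_ATTR` also accepts a value that sits directly against the closing slash. The helper `parse_attributes` is likewise invented for the illustration.

```python
import re

# Hypothetical pattern, NOT the real kses expression: the quoted value
# must be followed by at least one whitespace character.
STRICT_ATTR = re.compile(r"""(\w+)=(['"])(.*?)\2\s+""")

# Tolerant variant: the value may be followed by whitespace, "/", or the end.
TOLERANT_ATTR = re.compile(r"""(\w+)=(['"])(.*?)\2(?=\s|/|$)""")


def parse_attributes(tag, pattern):
    """Return the attributes of an <img ...> tag that `pattern` recognizes."""
    # Strip the leading "<img" and the trailing ">" to leave the attribute text.
    inner = tag[len("<img"):-1]
    return {m.group(1): m.group(3) for m in pattern.finditer(inner)}


with_space = "<img width='400' height='300' src='people.jpg' />"
no_space = "<img width='400' height='300' src='people.jpg'/>"

print(parse_attributes(with_space, STRICT_ATTR))
print(parse_attributes(no_space, STRICT_ATTR))
print(parse_attributes(no_space, TOLERANT_ATTR))
```

Under these assumptions the strict pattern recovers all three attributes from the first tag but only width and height from the second, because src is the attribute that abuts the tag close. The tolerant pattern recovers all three in both cases.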

The next question was: was this (technically) a bug, or was kses just being strict about some rules of formatting? A quick search turned up the Empty Elements section of the XHTML spec, which covers the syntax for empty tags like img, br, and hr. The examples given there do not include a space before the end of the elements. Furthermore, this section points to the HTML Compatibility Guidelines, which show that adding a space is for compatibility with older HTML browsers. So, since the XHTML spec does not require the space, and WordPress is supposed to render XHTML code, the behavior in kses was definitely a bug, and not just bad manners. I quickly worked up a patch, submitted it on Trac, and brought it to the attention of the core team.

Fortunately, the WordPress system of filters allows you to alter just about anything on the fly, so I was able to “trick” the system into thinking that the posting user selected in my plugin had the unfiltered_html capability, even when they really didn’t. This allowed me to work around the bug while my plugin is running.

This bug was pretty minor in the grand scheme of things. Probably not many people had ever run into it. But after hours of puzzling over those broken image tags, it felt darned good to find it, and — more importantly — squash it. And after the release of WordPress 3.0, nobody will have to scratch their heads over it again. Yay me!

Related posts:

  1. WordPress Code Spelunking
  2. WordPress Webhooks Plugin
  3. Easy Gravatars 1.2


old and new [Flickr]

Written by michel v — intraordinaire.com on March 6, 2010 – 5:27 pm -

michel v — intraordinaire.com posted a photo:

old and new

When old and new meet, south of piazza San Marco in Venice.
Italy, February 2010.


First Day at #WCIRL

Written by Donncha on March 6, 2010 – 12:35 pm -

So, day one of WordCamp Ireland draws to a close, there is a dinner tonight but the talks and sessions are over for the day.

I briefly helped John Handelaar during his talk on WordPress MU, but my main talk was on WP Super Cache. Thank you Hanni, Jane and Sheri for recording the talk. Hopefully it’ll be available online next week. In the meantime, here are the OpenOffice slides of my talk.

I must extend a big thank you to Sabrina Dent and Katherine Nolan for organising a great day and to the sponsors who made the weekend possible.

Looking forward to the dinner tonight, and the rest of the conference tomorrow.



LA Saturday

Written by Matt on March 6, 2010 – 3:00 am -

A day in LA spent looking at Fort Street carpets and vintage furniture around town, and then SOHO House for the Montblanc / Harvey Weinstein pre-Oscars dinner and party. (Stopped taking photos once the actual party started, didn’t want to get kicked out.)


Harvard Gazette

Written by Matt on March 4, 2010 – 7:14 pm -

The Harvard Gazette is now on WordPress, with a beautiful magazine-style design. There’s a whole meme/argument going around a few blogs and Twitter saying WordPress isn’t a CMS. Who cares what you call it, look at the amazing sites you can create. (And manage content on.) Who woulda thunk it. I thought WordPress was only good for “just a blog” — what are these Harvard gonzos doing? Fie! I say.
