If-Modified-Since date formats in Firefox and IE7
June 21st, 2008 | by Sean
Everybody who has a webserver wants to use less bandwidth. A webserver sends out a lot of copies of the same information: stylesheets, images, static HTML pages, JavaScript files. Many of these rarely, if ever, change from one visit by a browser to the next. Fortunately, there are a few schemes implemented in HTTP that help save bandwidth.
I’ve been writing a webserver recently, so I’ve become more acquainted with HTTP features than I might ever have wanted to. Don’t ask me why I’m writing a webserver, I’m beginning to wonder myself. It seemed like a good idea at the time I started it.
When I first started watching Apache access logs (this is a kind of mental illness, I can’t recommend it), I noticed 304 responses for some popular files. HTTP 304 is ‘Not Modified‘. Most browsers cache files they download from webservers, and don’t download them again if they haven’t changed. The way they do this is by conditionally fetching content from the webserver.
Firefox has a super handy extension called ‘LiveHTTPHeaders’ that lets you see exactly what messages your browser exchanges with the web server. Even if you don’t have a direct use for it, the conversations between your browser and web servers can tell you a lot about how the Web works. Last time I looked, this extension wasn’t available in Firefox 3. Epiphany has something very similar, so I’ve been using that browser while I’m developing my webserver. You can also read the HTTP conversations with Wireshark, a very powerful and complex network analysis tool.
A browser makes a conditional request to a webserver when it has an earlier copy of the object in its cache. To avoid the possibility of presenting out-of-date content to the user, the browser sends an ‘If-Modified-Since’ header field, with the date the browser last downloaded the content. The webserver checks this date against the date the content was last changed on the server. If the server content has changed, the content is sent as normal. If the content has not changed since the browser last downloaded it, the server makes a ‘304 Not Modified’ response, and the browser uses the cached content.
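The server-side decision described above can be sketched roughly as follows. This is only an illustration: the class name ConditionalGet and the method statusFor are made up for this example, and the sketch assumes the If-Modified-Since header has already been parsed into a java.util.Date (or null if it was absent or unparseable).

```java
import java.util.Date;

// Hypothetical helper, not taken from any real server.
final class ConditionalGet {

    /**
     * Returns 304 if the resource has not changed since the date the
     * client sent, otherwise 200 (send the content as normal).
     */
    static int statusFor(Date ifModifiedSince, Date lastModified) {
        if (ifModifiedSince == null) {
            // No usable If-Modified-Since header: treat as a plain request.
            return 200;
        }
        // HTTP dates carry whole seconds only, so drop the milliseconds
        // before comparing.
        long clientSeconds = ifModifiedSince.getTime() / 1000;
        long serverSeconds = lastModified.getTime() / 1000;
        return serverSeconds > clientSeconds ? 200 : 304;
    }

    public static void main(String[] args) {
        Date cachedCopy = new Date(1212136478000L);  // Fri, 30 May 2008 08:34:38 GMT
        Date unchanged = new Date(1212136478000L);
        Date changedLater = new Date(1212136478000L + 60000L);
        System.out.println(statusFor(cachedCopy, unchanged));    // prints 304
        System.out.println(statusFor(cachedCopy, changedLater)); // prints 200
    }
}
```

Falling back to 200 when the header is missing or unreadable is the cautious choice here: the worst case is an unnecessary download, never stale content.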
I noticed in the log file of my webserver a number of
java.text.ParseException: Unparseable date
messages. I use Java’s SimpleDateFormat class on the server for all sorts of different date parsing, but ran into some extra difficulties on this occasion. The difficulties arose because just one of the machines here uses IE7 on Windows XP. This browser was sending a different format from all the other browsers:
- Linux-Firefox 3: “Fri May 30 10:14:52 MYT 2008”
- Windows-Firefox 2: “Fri May 30 16:34:38 MYT 2008”
- Linux-Epiphany 2: “Fri May 30 16:34:38 MYT 2008”
- Linux-Opera 9: “Fri May 30 16:34:38 MYT 2008”
- Windows-IE7: “Fri, 30 May 2008 08:34:38 GMT”
I’m not convinced this is a locale problem, since it was the same Windows PC that produced the consistent date from Firefox 2 and the inconsistent date from IE7. There doesn’t seem to be a standard for the date format in the If-Modified-Since header field, so it seems I have no choice but to code my webserver for differing date formats.
I’m currently using 2 different SimpleDateFormats to make the problem go away, but I wonder how many different formats will I get when I deploy this webserver? If-Modified-Since experts, your comments, please!
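A minimal sketch of that kind of fallback parsing might look like the following. The class name LenientHttpDateParser and the two pattern strings are assumptions for illustration, not the actual server code. Note that whether a zone abbreviation such as "MYT" parses depends on the time zone names known to the JVM, so the sketch makes no promise about that case.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: tries several date patterns in turn.
final class LenientHttpDateParser {

    // Patterns are tried in order (both are assumptions for this sketch).
    private static final String[] PATTERNS = {
        "EEE, dd MMM yyyy HH:mm:ss zzz", // e.g. Fri, 30 May 2008 08:34:38 GMT
        "EEE MMM dd HH:mm:ss zzz yyyy"   // e.g. Fri May 30 08:34:38 GMT 2008
    };

    /** Returns the parsed date, or null if no pattern matches. */
    static Date parse(String headerValue) {
        if (headerValue == null) {
            return null;
        }
        for (String pattern : PATTERNS) {
            // SimpleDateFormat is not thread-safe, so build one per call.
            // Locale.US keeps day and month names in English.
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("GMT"));
            try {
                return format.parse(headerValue.trim());
            } catch (ParseException e) {
                // This pattern did not match; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Date d = parse("Fri, 30 May 2008 08:34:38 GMT");
        System.out.println(d.getTime()); // prints 1212136478000
    }
}
```

Returning null instead of throwing lets the caller treat an unreadable header as if it were absent and simply send the content.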
5 Responses to “If-Modified-Since date formats in Firefox and IE7”
By Poon Poi Ming on Jun 21, 2008
Hi Seanie, I got utterly lost halfway. Too technical for me, but I guess the target audience of this article is more for the tech-savvy, LOL.
By Gareth Farrington on Jul 17, 2008
I was writing some file serving code and I saw something like this.
The HTTP spec recommends that the client store the exact date/time sent from the server, in the format the server sent it. So Firefox does this and if your server sends timestamps in the Last-Modified header in some funky format the browser will spit it back out in that same format.
IE is using an RFC 1123 date format which is THE preferred format for all dates in HTTP headers. IE transformed the date sent from the server into the standard format. This can cause problems if the server and client clocks are not synchronized or if the server uses a non-standard date format.
So the Firefox behavior is safer but it returns illegal date strings. The IE behavior is less robust but its output is not illegal.
Check the server code to see if you are sending out Last-Modified headers in the “Fri May 30 10:14:52 MYT 2008” format (which is invalid).
By Sean on Jul 17, 2008
Thanks very much for that pointer Gareth – that put me on the right path. My server was indeed sending out dates in a funky format – the Date.toString() format! It’s close to the asctime() format mentioned in the IETF’s draft HTTP/1.0 spec, but not quite the same.
One last wrinkle that surprised me was the insistence on GMT:
I’ve re-coded the If-Modified-Since behaviour of my webserver. Now it follows IETF’s recommendation in RFC 1123’s Robustness Principle:
The non-IE browsers I was using must be designed along the lines of “When in Rome, do what the Romans do”!
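For anyone hitting the same problem, here is a rough sketch of producing a Last-Modified value in the RFC 1123 style, always in GMT, instead of relying on Date.toString(). The class name HttpDateFormatter is invented for this example, and this is not the exact code from my server.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for writing HTTP date headers.
final class HttpDateFormatter {

    /** Formats a date like "Fri, 30 May 2008 08:34:38 GMT". */
    static String format(Date date) {
        // The literal 'GMT' is safe only because the formatter's zone is
        // forced to GMT on the next line. Locale.US keeps the day and
        // month names in English whatever the server's locale is.
        SimpleDateFormat rfc1123 =
                new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);
        rfc1123.setTimeZone(TimeZone.getTimeZone("GMT"));
        return rfc1123.format(date);
    }

    public static void main(String[] args) {
        Date lastModified = new Date(1212136478000L);
        System.out.println("Last-Modified: " + format(lastModified));
        // prints "Last-Modified: Fri, 30 May 2008 08:34:38 GMT"
    }
}
```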
By spenser on Jun 23, 2009
There is a very good reason for the use of GMT. It is unambiguous, even when countries or regions decide to unadvisedly and unilaterally change the rules for calculating daylight saving time. Those who look at GMT regularly soon get the hang of making the required mental adjustment for timezone offsets as they browse logs.
By Sean on Jun 23, 2009
Thanks, Spenser. In retrospect, it’s not surprising at all. The If-Modified-Since technique (for example) is obviously more robust and more efficient if all servers use the same time standard. I think my surprise had more to do with my own ignorance of time standards and the idea that such an important standard should be based on a location ‘just up the road’ from where I was born!