When I'm adding events to the R Community Calendar, I can only enter dates and times in my local time zone (US Pacific Time). This is problematic when I need to enter events in other countries: I have to do some fancy (and unreliable) mental gymnastics to, say, convert a future date and time in London into my local time to make sure it's accurately listed on the calendar. So I decided to tackle the problem with R and learned a lot about dates, times, and time zones in the process.
pb.txt <- "2009-06-03 19:30"
I've used a standardized, unambiguous time format: year-month-day eliminates the confusion between US and British-style dates, and a 24-hour clock eliminates the AM/PM ambiguity. This is the format the as.POSIXct function expects by default, and I'll use that function to convert the string to an object in R representing the date and time in the London time zone:
pb.date <- as.POSIXct(pb.txt, tz="Europe/London")
Now, I can convert it to my local timezone using the format function (documented in the help for strptime):
> format(pb.date, tz="America/Los_Angeles", usetz=TRUE)
[1] "2009-06-03 11:30:00 PDT"
d <- c("2009-03-07 12:00", "2009-03-08 12:00", "2009-03-28 12:00", "2009-03-29 12:00", "2009-10-24 12:00", "2009-10-25 12:00", "2009-10-31 12:00", "2009-11-01 12:00")
t1 <- as.POSIXct(d, tz="America/Los_Angeles")
cbind(US=format(t1),UK=format(t1,tz="Europe/London"))
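The table above is easier to read as a single number per date: how many hours London is ahead of Los Angeles. The following is a minimal sketch of that comparison, reusing the same dates; the names us_hour, uk_hour and offsets are made up for illustration and are not part of the original code.

```r
# Noon in Los Angeles on dates either side of the 2009 US and UK clock changes
d <- c("2009-03-07 12:00", "2009-03-08 12:00", "2009-03-28 12:00", "2009-03-29 12:00",
       "2009-10-24 12:00", "2009-10-25 12:00", "2009-10-31 12:00", "2009-11-01 12:00")
t1 <- as.POSIXct(d, tz = "America/Los_Angeles")

# Hour of the day in each zone for the same instants
us_hour <- as.integer(format(t1, "%H", tz = "America/Los_Angeles"))
uk_hour <- as.integer(format(t1, "%H", tz = "Europe/London"))

# Hours London is ahead of Los Angeles on each date
offsets <- data.frame(date = substr(d, 1, 10), hours_ahead = uk_hour - us_hour)
offsets
# hours_ahead: 8 7 7 8 8 7 7 8
```

The gap drops to 7 hours during the weeks when only one of the two countries has changed its clocks, which is exactly the case that mental arithmetic tends to get wrong.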
- You can work with dates represented as character strings in various formats like "12/25/2008" or "8 Jan 2009 2:43 PM" using the function strptime
- It's a bit tricky to find valid time zones to use with the tz= argument. Common time zone abbreviations like "BST" (British Summer Time) sometimes work, but might not mean what you think they mean: EST refers to a time zone in Canada that does not observe daylight saving time, NOT US Eastern Standard Time. It's safest to use the time zone names listed on this Wikipedia page (especially if you're dealing with data from the archives of Louisville, Kentucky). The list of known time zones is system-specific, but these should work on all R implementations.
- Important warning: If you misspell a time zone, or use one that isn't known, you don't get an error: UTC (basically Greenwich Mean Time) is used instead. If you're not sure, print your POSIXct object. If it looks like "2009-06-03 19:30:00 UTC", your time zone wasn't recognized. On my machine this happens if I try to use "PDT" or "PST" as a time zone, even though dates are printed with those characters representing the time zone. "US/Pacific" works, but I can't find any documentation listing time zones beginning with "US/".
- You might think you can convert a POSIXct object from one time zone to another like this: as.POSIXct(pb.date,tz="US/Pacific"), but it doesn't work. You can strip the time-zone information from the object first with c: as.POSIXct(c(pb.date),tz="US/Pacific"). But it's probably safest just to use format.
- POSIXlt objects represent times like POSIXct objects do, but you only want to use them if you need to extract information like the day of the week, or whether daylight saving time is in effect. I had difficulties working with time zones using POSIXlt objects, so you'll probably want to stick with as.POSIXct for most purposes. The differences between the two are documented in ?DateTimeClasses.
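A few of the notes above can be sketched in code. This is an illustration under stated assumptions, not a definitive recipe: tz_ok is a hypothetical helper written for this example, OlsonNames() is only available in reasonably recent versions of R, and setting the "tzone" attribute is one alternative to format for viewing the same instant in another zone.

```r
# Parse a US-style date string; strptime returns POSIXlt, so convert to POSIXct
x <- as.POSIXct(strptime("12/25/2008 14:43", "%m/%d/%Y %H:%M", tz = "Europe/London"))

# Hypothetical helper: check a zone name against the names R knows on this
# system, rather than silently falling back to UTC on a typo
tz_ok <- function(tz) tz %in% OlsonNames()
tz_ok("America/Los_Angeles")  # a full Olson name
tz_ok("PDT")                  # an abbreviation, not a zone name

# Same instant, displayed in another zone: change the "tzone" attribute
y <- x
attr(y, "tzone") <- "America/Los_Angeles"
format(y, usetz = TRUE)

# The underlying instant is unchanged
x == y
```

Checking the name first turns the silent UTC fallback into something you can test for before any conversion happens.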



And when you think you have it all figured out, try this (inspired by help("DateTimeClasses")):
> d <- c("2005-12-31 23:59:59", "2005-12-31 23:59:60", "2006-01-01 00:00:00")
> as.POSIXct(d,tz="UTC")
[1] "2005-12-31 23:59:59 UTC" "2006-01-01 00:00:00 UTC"
[3] "2006-01-01 00:00:00 UTC"
> as.POSIXlt(d,tz="UTC")
[1] "2005-12-31 23:59:59 UTC" "2005-12-31 23:59:60 UTC"
[3] "2006-01-01 00:00:00 UTC"
:-)
Also note that DST rules are implemented by your operating system, so if you upgrade your system, the external representation of POSIXct objects may change. I spent a couple of weeks debugging this once. Use POSIXlt for legal dates (“the contract/assembly line/whatever starts...”) and POSIXct for events in the universe (asteroid impacts, say).
Posted by: Allan Engelhardt | June 02, 2009 at 08:38
Try tz="PST8PDT" for daylight saving time in the Pacific time zone. On my Mac you can see all the available time zones in the /usr/share/zoneinfo directory, and I'm pretty sure it would be similar on most Unix-based systems. Not sure where you would find the list on Windows.
Also note that this is highly dependent on the system you run the code on, so it is not really portable code. I usually store everything as GMT to try to keep things consistent and as portable as possible. Also be aware of the data you are getting: I once wasted a day finding out that a data set was stored strictly in PST and never adjusted for daylight saving time.
Posted by: Ben Kujala | June 03, 2009 at 10:35
I want to find the maximum value between two columns (a, b) and create a third column in a data frame, c = max(a,b). See the example below.
I tried using apply, but it does not work because it coerces the data frame to a character matrix. How can this be done?
> mydf=data.frame(a=rep(Sys.time()+rnorm(1),10), b=rep(Sys.time(), 10))
> mydf
a b
1 2009-06-26 08:53:36 2009-06-26 08:53:35
2 2009-06-26 08:53:36 2009-06-26 08:53:35
3 2009-06-26 08:53:36 2009-06-26 08:53:35
4 2009-06-26 08:53:36 2009-06-26 08:53:35
5 2009-06-26 08:53:36 2009-06-26 08:53:35
6 2009-06-26 08:53:36 2009-06-26 08:53:35
7 2009-06-26 08:53:36 2009-06-26 08:53:35
8 2009-06-26 08:53:36 2009-06-26 08:53:35
9 2009-06-26 08:53:36 2009-06-26 08:53:35
10 2009-06-26 08:53:36 2009-06-26 08:53:35
Posted by: henrique chang | June 26, 2009 at 06:55
That's awesome! Exactly the code I was looking for! Thanks, dude!
Posted by: GMTSlider.com | September 28, 2010 at 05:25