Reading a datetime vector

10 messages · D Wolf, Peter Dalgaard, Jim Lemon +1 more

#
This is a mailing list. I don't know how you are interacting with it... using a website rather than an email program can lead to some confusion since there can be many ways to accomplish the task of interacting with the mailing list. My email program has a "reply-all" button when I am looking at an email. It also has an option to write the email in plain text, which often prevents the message from getting corrupted (recipient not seeing what you sent to the list).

Using the str function on a literal string (the name of a file) will indeed tell you that you gave it a character string. Specifying a column in your data might tell you something more interesting... e.g.

str( df2_TZ$DateTimeStamp )

If that says you have character data then Jim Lemon's suggestion would be a good next thing to look at.  If it is factor data then you should use the as.character function on the data column and then follow Jim's suggestion. If it is numeric then you probably need to convert it using an appropriate origin (e.g. as described at [1] or [2]).
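
A minimal sketch of the character and factor branches described above. The helper name parse_stamp and the sample strings (assumed to be in month/day/year order) are illustrative only and do not come from the thread. A factor is passed through as.character first so that its labels are parsed rather than its integer codes.

```r
# Hypothetical sample data: the same stamps as character and as factor.
x_chr <- c("1/1/2013 0:00", "1/1/2013 0:15")
x_fac <- factor(x_chr)

# Hypothetical helper: normalise factor input, then parse the text.
parse_stamp <- function(x, format = "%m/%d/%Y %H:%M", tz = "GMT") {
  if (is.factor(x)) x <- as.character(x)  # use the labels, not the codes
  stopifnot(is.character(x))
  as.POSIXct(x, format = format, tz = tz)
}

from_chr <- parse_stamp(x_chr)
from_fac <- parse_stamp(x_fac)
format(from_fac, "%Y-%m-%d %H:%M")
```

Any element that does not match the format comes back as NA, so checking anyNA() on the result is a quick way to spot a wrong format string.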

I have had best luck setting the default timezone string when converting to POSIXt types... e.g.

# specify timezone assumed by input data
Sys.setenv( TZ="GMT" )
testdtm <- as.POSIXct( "1/1/2016 00:00", format = "%m/%d/%Y %H:%M" )
# inspect the result
testdtm
str( testdtm )
# view data from a different timezone
Sys.setenv( TZ="Etc/GMT+8" )
# no change to the underlying data, but it prints out differently now because the tz attribute is "" which implies using the default TZ
testdtm
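
An alternative sketch that leaves the session-wide TZ untouched: the timezone is passed to as.POSIXct when parsing and to format when displaying. This is a different technique from the Sys.setenv approach above, shown only for comparison. Note that the zone name Etc/GMT+8 follows the POSIX sign convention, so it is 8 hours behind GMT.

```r
# Parse with an explicit timezone instead of relying on the TZ variable.
testdtm <- as.POSIXct("1/1/2016 00:00", format = "%m/%d/%Y %H:%M", tz = "GMT")
before <- as.numeric(testdtm)  # seconds since 1970-01-01 00:00 UTC

# Display the same instant in another zone; the stored value is untouched.
shown <- format(testdtm, "%Y-%m-%d %H:%M", tz = "Etc/GMT+8")
shown

after <- as.numeric(testdtm)
```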

[1] http://blog.mollietaylor.com/2013/08/date-formats-in-r.html
[2] https://www.r-project.org/doc/Rnews/Rnews_2004-1.pdf
2 days later
#
Hello Everyone,

The column begins populated with integers, like so:
1/1/2013 0:00 in the spreadsheet equals 41257 in R's dataframe
1/1/2013 0:15 in the spreadsheet equals 41257.010416666664 in R's dataframe
...
41257 must be in minutes, since 1440 min/day * .010416666664 day = 15 minutes. 41257 minutes is about 29 days: 41257 min / 1440 min/day = 28.65 days. So I don't know why the dataframe is showing 41257 for 1/1/2013 0:00.

Oddly, R sees the vector as NULL despite the fact it has integers in each record in the column:
data_type = str(df2_TZ$DateTimeStamp)
produces a NULL (empty) variable.

I tried:
df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")
Sys.setenv(TZ = "GMT")
testdtm <- as.POSIXct(df2_TZ$DateTimeStamp, format = "%m/%d/%Y %H:%M")
# Inspect the result
testdtm
str(testdtm)

testdtm is a vector filled with NA values, which figures since DateTimeStamp is NULL.

I noticed in the table on page 32 of the R Help Desk pdf you linked to that dp-as.POSIXct(format(dp, tz="GMT")) is the only option listed for time zone difference. So I tried:
df2_TZ = read.xlsx2("DF_exp.xlsx", sheetName = "Sheet1")
df2_TZ_seq <- as.POSIXct(format(dt2_TZ, tz="GMT"))
and got:
Error in format(dt2_TZ, tz = "GMT") : object 'dt2_TZ' not found

Is the vector neither character nor factor, since it's NULL? Where do I go from here?

Thank You,
Doug

Hi Doug,
What you have done is to ask whether the character string "DF_exp.xlsx" is a character string. I think Yogi Berra, were he still around, could have told you that. What will give you some useful information is:
str(DF_exp.xlsx)
which asks for information about the object, not its name.
Jim
On Friday, February 19, 2016 12:41 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
On February 19, 2016 7:48:31 AM PST, D Wolf <doug45290 at yahoo.com> wrote:
Hello Jeff,
I ran str() on the vector and it returned character.
> str("DF_exp.xlsx")
 chr "DF_exp.xlsx"
This is my first thread on this forum, and I'm not sure how to reply to the thread instead of just sending the reply to your email account; I don't see a 'reply' link in the thread.
I've read this page and I don't think it advises on how to reply in the thread:
R: Posting Guide: How to ask good questions that prompt useful answers

Thank You,
Doug Wolfinger
On Friday, February 19, 2016 12:51 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
You are being rather scattershot in your explanation, so I suspect you are not being systematic in your troubleshooting. Use the str function to examine the data column after you pull it in from excel. It may be numeric, factor, or character, and the approach depends on which that function returns.
#
It is not minutes... read the Excel documentation for representing dates... it is days since December 30, 1899 on Windows.  Read the links I provided in my last email. 

Also read ?str ... that function does not return anything... it only prints out information so don't expect to get anything useful by assigning the output of that function to a variable. 

Also read the examples section of the help file ?read.xlsx2 for relevant help.
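
A minimal sketch of the two points above, using a hypothetical two-row data frame in place of the real spreadsheet. It shows that str() returns NULL (so class() is the function to use when the type is needed as a value) and that a day count from 1899-12-30 converts to a date-time once multiplied by 86400 seconds per day. The sample values use 41275, which is the serial that maps to 2013-01-01 under this origin; the 41257 quoted earlier in the thread would give 2012-12-14. Rounding to whole seconds is an assumption that suits minute-resolution data.

```r
# Hypothetical stand-in for the column read from the spreadsheet.
df <- data.frame(DateTimeStamp = c(41275, 41275.010416666664))

# str() prints a description and returns NULL invisibly,
# so assigning its result captures nothing useful.
data_type <- str(df$DateTimeStamp)
is.null(data_type)

# class() returns the type as a value that can be stored and tested.
col_class <- class(df$DateTimeStamp)
col_class

# Days since 1899-12-30 -> seconds. round() absorbs floating-point error:
# the fractional day may land just off 900 seconds, and formatting
# truncates rather than rounds.
secs <- round(df$DateTimeStamp * 86400)
dtm <- as.POSIXct(secs, origin = "1899-12-30", tz = "GMT")
format(dtm, "%Y-%m-%d %H:%M")
```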
#
Hi Doug,
It is difficult for us to work out what is happening as we don't have
access to a toy data set that we can play with. Excel spreadsheets are one
of those things that you can't just attach to your email to the help list.
If there is somewhere you can leave a _small_ Excel sample file (take the
first 10 rows, say) that we can download (Google Drive, Dropbox?) and
include the URL in your email, maybe someone can offer more than guesses.

Jim
#
On 22 Feb 2016, at 18:30 , Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

            
I seem to recall that that is actually only true for dates after March 1, 1900. (The reason the count does not start from December 31st is that someone thought that 1900 was a leap year.)

<Checks Wikipedia: Yep, 1900 is still a leap year in Excel. The original perpetrator was Lotus 1-2-3 and Microsoft went for bug-compatibility.>

-pd
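
A small sketch of the leap-year quirk described above, assuming the Windows Excel convention in which serial 1 is 1900-01-01 and serial 60 is the nonexistent 1900-02-29. The helper name excel_serial_to_date is hypothetical. Serials from 61 onward work with the 1899-12-30 origin directly; earlier serials need a one-day shift, and serial 60 has no real date.

```r
# Hypothetical helper: convert Excel (Windows) day serials to Date,
# compensating for Excel treating 1900 as a leap year.
excel_serial_to_date <- function(serial) {
  # Serials before the fictitious Feb 29 are one day "short" relative
  # to the 1899-12-30 origin, so shift them forward by one day.
  shifted <- ifelse(serial < 60, serial + 1, serial)
  # Serial 60 is Excel's 1900-02-29, which never existed.
  shifted[which(floor(serial) == 60)] <- NA
  as.Date(shifted, origin = "1899-12-30")
}

dates <- excel_serial_to_date(c(1, 59, 60, 61, 41275))
as.character(dates)
```

For data from 2013, as in this thread, the correction never applies, so the plain 1899-12-30 origin is sufficient.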
#
In addition to my previous message, DF_extract_clean.R is the program in the dropbox folder that I am currently working on.
Doug
On Tuesday, February 23, 2016 4:02 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
#
Doug,
We're getting warm. If we ask really nicely, will you tell us the URL of
the "dropbox folder you are working on"?

Jim
On Wed, Feb 24, 2016 at 9:29 AM, D Wolf <doug45290 at yahoo.com> wrote:
#
Hi again,
My apologies - I didn't see the other email.

Jim
On Wed, Feb 24, 2016 at 10:29 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
#
Hi Doug,
I see what the problem is now. When your Excel file is read in with
read.xlsx2, the DateTimeStamp is read as days since Microsoft's time epoch
(see earlier posts on this). As these values are numeric, they cannot be
converted in the same way as a human readable date/time string. The easiest
way I could think of to get around this is to export the XLSX file as CSV.
Then you will have the date/time strings and can convert them to POSIX
date/time values. Note that your format spec was slightly wrong - day is
first.

# first export the EXCEL file as a CSV file then
df2_TZ = read.csv("/media/KINGSTON/DF_exp2.csv",stringsAsFactors=FALSE)
df2_TZ$DateTimeStamp<-strptime(df2_TZ$DateTimeStamp,"%d/%m/%Y %H:%M")
# and I get
df2_TZ$DateTimeStamp
 [1] "2013-01-01 00:00:00 EST" "2013-01-01 01:00:00 EST"
 [3] "2013-01-02 23:15:00 EST" "2013-01-02 23:30:00 EST"
 [5] "2013-01-02 23:45:00 EST" "2013-01-03 00:00:00 EST"
 [7] "2013-01-03 01:00:00 EST" "2013-01-03 01:15:00 EST"
 [9] "2013-01-04 23:00:00 EST" "2014-11-24 15:04:00 EST"
[11] "2013-01-04 23:15:00 EST" "2013-01-04 23:30:00 EST"
[13] "2013-01-05 00:30:00 EST" "2013-01-05 00:45:00 EST"
[15] "2013-01-26 00:00:00 EST" "2013-07-19 15:42:00 EST"

Jim
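
A hedged variation on the conversion above, using a hypothetical in-memory data frame instead of the CSV file. strptime returns a POSIXlt object, and wrapping it in as.POSIXct gives the compact type that is usually more convenient as a data frame column. Passing tz explicitly is an assumption added here; without it the session timezone is used, which is why the output above is labelled EST.

```r
# Hypothetical stand-in for the exported CSV, with day-first stamps.
df2 <- data.frame(
  DateTimeStamp = c("1/1/2013 0:00", "2/1/2013 23:15"),
  stringsAsFactors = FALSE
)

# Parse day/month/year text, then store as POSIXct in the data frame.
parsed <- strptime(df2$DateTimeStamp, "%d/%m/%Y %H:%M", tz = "GMT")
df2$DateTimeStamp <- as.POSIXct(parsed)

# Any NA here would indicate a stamp that did not match the format.
n_failed <- sum(is.na(df2$DateTimeStamp))
format(df2$DateTimeStamp, "%Y-%m-%d %H:%M")
```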
#
You are overthinking this. The answer is in the help file for read.xlsx2.