Skip to content

Saving Google worksheets with common prefix

5 messages · Jennifer Sabatier, Henrik Bengtsson

#
Hi R-Help,

So, I will try to provide a reproducible example...I basically made a dummy
spreadsheet that contains the same number of tabs as the spreadsheet I am
really interested in.  The data on that spreadsheet is really sensitive so
I couldn't use it.

Anyway, here are the various sheets in the spreadsheet:
[1] "Operations"    "Financing"     "Income"        "Balance sheet" "Cash
Flows211"
[6] "Cash Flows210" "Cash Flows29"  "Cash Flows28"  "Cash Flows27"

I am only interested in these sheets:
[1] "Cash Flows211" "Cash Flows210" "Cash Flows29"  "Cash Flows28"  "Cash
Flows27"

I want to save them to csv files that contain the same name or similar as
the sheet name.

Here's the error I'm getting (using traceback i ran it twice to get the
trace):
[1] "Cash Flows211"
[1] "Cash Flows211 - 2014-11-20.csv"
6: stop(err)
5: stop.if.HTTP.error(http.header)
4: getURLContent(uri, .opts = .opts, .encoding = .encoding, binary =
binary,
       curl = curl)
3: getForm("https://www.google.com/accounts/ClientLogin", accountType =
"HOSTED_OR_GOOGLE",
       Email = login, Passwd = password, service = service, source = appID,
       .opts = list(ssl.verifypeer = FALSE))
2: getGoogleAuth(usrname, pword, "...", service = "wise")
1: getGoogleDocsConnection(getGoogleAuth(usrname, pword, "...",
       service = "wise"))
Error in getURL(sheet at cellsfeed, curl = getCurlCon(con), followlocation =
TRUE) :
  trying to get slot "cellsfeed" from an object of a basic class ("NULL")
with no slots


Here's the code:

# install the RGoogleDocs package
install.packages("RGoogleDocs", repos = "http://www.omegahat.org/R",
type="source", dep=F)

library(RGoogleDocs)

usrname <- "r.project.user at gmail.com"

pword <- "fakepword"

sheets.con <- getGoogleDocsConnection(getGoogleAuth(usrname, pword, "...",
service = "wise"))

a <- getDocs(sheets.con)

ts <- getWorksheets('Google spreadsheet example', sheets.con)

ts2 <- ts[grep("^Cash Flow", names(ts))]
nms <- names(ts2)
lnth <- length(ts2)
sheetz <- list("integer" = lnth, "names" = nms)
sheetz

for (i in sheetz$names) {
print(i)
file.name <- paste(i, " - ", Sys.Date(), ".csv", sep="")
print(file.name)
traceback()
tab <- sheetAsMatrix(ts2$i, header = TRUE, as.data.frame = TRUE, trim =
TRUE)
writecsv(tab, file.name)

}


Now, if I do this as below it works:



tab <- sheetAsMatrix(ts$"Cash Flows211", header = T, as.data.frame = TRUE,
trim = TRUE)

head(tab)
STATEMENTS OF CASH FLOWS    NA    NA    NA    NA
1                                 Year   0.0   1.0   2.0   3.0
2                           Net income  <NA> -43.0  -6.0  32.0
3                    Plus depreciation  <NA> 100.0 100.0 100.0
4           Less increase in inventory -10.0 -15.0 -10.0  -8.0
5 Less increase in accounts receivable     - -60.0 -24.0 -18.0
6    Plus increase in accounts payable   8.0  12.0   8.0   6.0



So, why can't I automate this?

BTW, you should be able to access this spreadsheet.  I made a dummy Google
account and put this dummy spreadsheet on it.
#
Anyone can help?

On Thu, Nov 20, 2014 at 6:44 PM, Jennifer Sabatier <
plessthanpointohfive at gmail.com> wrote:

            

  
  
#
Add a print(ts2$i) to see if that really gives what you think it does.

Then take it from there.

Henrik
On Nov 21, 2014 7:19 AM, "Jennifer Sabatier" <plessthanpointohfive at gmail.com>
wrote:

  
  
#
On Fri, Nov 21, 2014 at 10:36 AM, Henrik Bengtsson <hb at biostat.ucsf.edu>
wrote:
You're right.  It's not doing what I think it's supposed to be doing.  The
print(ts2$i) gives NULL:


[1] "Cash Flows211"
NULL
[1] "Cash Flows211 - 2014-11-21.csv"
7: getURL(sheet at cellsfeed, curl = getCurlCon(con), followlocation = TRUE)
6: getCells(sheet, con = con)
5: xmlParse(getCells(sheet, con = con))
4: xpathApply(doc, path, fun, ..., namespaces = namespaces, sessionEncoding
= sessionEncoding,
       addFinalizer = addFinalizer)
3: getNodeSet(doc, "//a:entry/gs:cell", c(a = "http://www.w3.org/2005/Atom",

       gs = "http://schemas.google.com/spreadsheets/2006"))
2: processCells2(doc, trim, header, as.data.frame, stringsAsFactors =
stringsAsFactors)
1: sheetAsMatrix(ts$i, header = TRUE, as.data.frame = TRUE, trim = TRUE) at
#7
Error in getURL(sheet at cellsfeed, curl = getCurlCon(con), followlocation =
TRUE) :
  trying to get slot "cellsfeed" from an object of a basic class ("NULL")
with no slots

So, I've got to figure out how to feed that list of names properly.

Thanks!

Jen
#
I got it!


for (i in sheetz$names) {
print(i)
file.name <- paste(i, " - ", Sys.Date(), ".csv", sep="")
print(file.name)
#traceback()
i <- sheetAsMatrix(ts[[i]], header = TRUE, as.data.frame = TRUE, trim =
TRUE)
write.csv(i, file.name)
i
}

I have to use ts[[i]] rather than ts$i (or ts2, either)

Thanks!

Jen

PS - if anyone wants to use that code and that dummy gmail account to play
around with R, that's fine - be my guest.  The email address is hidden but
it's "r dot project dot user at gmail dot com" and the password is in the
code.  Upload your own data and play, if you like.




On Fri, Nov 21, 2014 at 11:02 AM, Jennifer Sabatier <
plessthanpointohfive at gmail.com> wrote: