Read fst files

13 messages · Eric Berger, Duncan Murdoch, Jan van der Laan +3 more

#
R-Help Forum

 

Anyone know why the following line of code would error out:  myObject <-
read_fst(unz("Dataset.zip", filename = "filename.fst"))

 

Error: Incomplete expression: filename <- read_fst(unz("Dataset.zip",
filename = "filename.fst") 

 

I often use similar code with *.csv files in a zipped folder. For example:
myObject <- read.csv(unz("Dataset.zip", filename = "filename.csv")), which
works just fine.

 

Jeff Reichman
#
You are missing the second closing parenthesis. This is what the error
message is telling you.


On Wed, Jun 9, 2021 at 2:44 AM Jeff Reichman <reichmanj at sbcglobal.net>
wrote:

  
  
#
Eric

 

Typo on my part.

 

setwd("C:/Users/reichmaj/Documents/My_Reference_Library/Regression")

myObject <- read_fst(unz("Dataset.zip", filename = "myFile.fst")) # read fst file

 

Error in path.expand(path) : invalid 'path' argument

 

So then I tried

 

myObject <- read_fst(unz("C:/Users/reichmaj/Documents/My_Reference_Library/Regression/Dataset.zip", filename = "myFile.fst"))

 

Error in path.expand(path) : invalid 'path' argument

 

Error in the path??

 

Because this works just fine

 

myObject <- read.csv(unz("C:/Users/reichmaj/Documents/My_Reference_Library/Regression/Dataset.zip", filename = "myFile.csv"))

 

My only thought is that I can't use the two functions together when dealing with fst files?

 

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Even if ultimately you want to use the functions together, for debugging
the problem you should split them into two, as in

a <- unz("C:/Users/reichmaj/Documents/My_Reference_Library/Regression/Dataset.zip", filename = "myFile.fst")
See if that works, and examine 'a'.

And once that is working

read_fst(a)

to see what that does.

Let us know.
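
A minimal sketch of the inspection step suggested above, using only base R. The archive and member names are the placeholders from this thread. With the default `open = ""`, `unz()` only creates the connection object, so the archive does not need to exist for this to run.

```r
# Step 1: create the connection on its own instead of nesting it in read_fst().
a <- unz("Dataset.zip", filename = "myFile.fst")

# Step 2: examine 'a'. It is a connection object, not a file path.
a_class   <- class(a)
a_is_path <- is.character(a)
a_summary <- summary(a)

print(a_class)
print(a_is_path)
print(a_summary$description)
print(a_summary$opened)

# Release the connection once finished with it.
close(a)
```

If `a` turns out to be a connection rather than a character string, the next question is whether the reader being called accepts connections at all.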


#
It looks as though read_fst wants a filename, not a connection.

You should do it in two steps:

  unzip("Dataset.zip", files = "myFile.fst")
  myObject <- read_fst("myFile.fst")

This is obviously untested; you didn't even say what package read_fst() 
comes from.

Duncan Murdoch
#
Duncan

Yes, that will work. It appears to be related to setting my working directory; for whatever reason, neither of these seems to work:
(1) knitr::opts_knit$set(root.dir ="~/My_Reference_Library/Regression") # from R Notebook or
(2) setwd("C:/Users/reichmaj/Documents/My_Reference_Library/Regression") # from R chunk

So it appears I can either use two steps (as you suggested) or combine them, but then I need to enter the full path. Why don't other file types seem to need the full path?

myObject <- read_fst(unz("C:/Users/reichmaj/Documents/My_Reference_Library/Regression/Dataset.zip", filename = "myFile.fst"))

Thank you. I guess it's just one of those R things.

Jeff



#
On 09/06/2021 9:12 a.m., Jeff Reichman wrote:
You need to read the documentation for read_fst() to find what it needs.
If it doesn't explain this, then you should report the issue to its
author.

No, it's a read_fst() thing.

Duncan Murdoch
#
read_fst is from the package fst. The file format fst uses is a binary
format designed to be fast to read. It is a column-oriented format and
compressed. So, to be able to work, fst needs access to the file itself
and won't accept a file connection the way functions like read.table and
its variants do.

Also, because it is a binary compressed format using a compression
method that is fast to read, additionally compressing it to zip seems to
defeat the purpose of fst.

HTH,
Jan
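
The error reported earlier in the thread is consistent with this explanation. The sketch below uses only base R and the placeholder file names from this thread. It assumes, based on the error message quoted above, that `read_fst()` passes its argument to `path.expand()`. That function expects a character vector, so a connection object is rejected before any file is read. No archive needs to exist for this to run.

```r
# unz() with the default open = "" only creates the connection object;
# the zip file is not opened until something reads from it.
con <- unz("Dataset.zip", filename = "myFile.fst")

# path.expand() wants a character path, so handing it a connection fails.
result <- tryCatch(path.expand(con), error = function(e) e)
failed <- inherits(result, "error")

print(failed)
if (failed) print(conditionMessage(result))

# By contrast, a plain character path is accepted.
expanded <- path.expand("Dataset.zip")
print(expanded)

close(con)
```

Text readers such as `read.csv()` accept either a path or a connection, which would explain why the same `unz()` pattern works for csv files.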
#
Jan

Makes sense. It's just that I often receive large zip files that contain a variety of file types.

Jeff

#
... but if you are receiving multiple-file zips then you should not be using unz() the way you are in your original post.

I have to agree with other responders suggesting that you handle unzipping fst zips manually rather than as part of an R one-liner.
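
For an archive holding several file types, one option is to list the members first and extract only those a given reader needs. The sketch below relies on base R's `unzip(list = TRUE)`. The helper name `members_with_ext` is hypothetical, and the archive name in the usage comment is the placeholder from this thread.

```r
# Hypothetical helper: keep only the archive members with a given extension
# (compared case-insensitively).
members_with_ext <- function(members, ext) {
  members[tolower(tools::file_ext(members)) == tolower(ext)]
}

# Usage (assumes Dataset.zip exists in the working directory):
# members   <- utils::unzip("Dataset.zip", list = TRUE)$Name
# fst_names <- members_with_ext(members, "fst")
# exdir     <- tempfile("unzipped_")
# fst_paths <- utils::unzip("Dataset.zip", files = fst_names, exdir = exdir)

# Quick check on a made-up listing:
example_members <- c("myFile.fst", "myFile.csv", "notes/readme.txt", "other.FST")
fst_members <- members_with_ext(example_members, "fst")
print(fst_members)
```

The extracted paths could then be passed one at a time to whichever reader matches the file type.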
#
Try using unzip(zipfile, files="desiredFile", exdir=tf<-tempfile()), not
unz(zipfile, "desiredFile"), to copy the desired file from the zip file to
a temporary location and use read_fst(tf) to read the desired file.

-Bill
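
A minimal sketch of the approach described above. One detail worth checking: `exdir` names a directory, so the extracted file would sit at `file.path(tf, "desiredFile")` rather than at `tf` itself. `unzip()` also returns the extracted paths invisibly, which the sketch uses. The helper name `read_zip_member` and its `reader` argument are hypothetical. `fst::read_fst` appears only in the usage comment and is assumed to be installed.

```r
# Hypothetical helper: extract one member of a zip archive into a temporary
# directory and hand the resulting file path to a reader that needs a real file.
read_zip_member <- function(zipfile, member, reader, ...) {
  exdir <- tempfile("unzipped_")   # unzip() creates this directory if needed
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)

  # unzip() returns the paths of the extracted files (invisibly).
  extracted <- utils::unzip(zipfile, files = member, exdir = exdir)
  if (length(extracted) != 1L) {
    stop("expected exactly one extracted file for member: ", member)
  }

  # The reader must load the data eagerly, because the temporary copy is
  # deleted when this function exits.
  reader(extracted, ...)
}

# Usage (assumes the fst package is installed and the archive exists):
# myObject <- read_zip_member("Dataset.zip", "myFile.fst", fst::read_fst)
```

Taking the reader as an argument means the same helper could serve the other file types in a mixed archive, for example `read.csv` for csv members.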

#
On 09/06/2021 1:55 p.m., Jan van der Laan wrote:
Thanks for the info.  I think it is possible to handle such a file in a 
binary connection, but doing that in C/C++ would be kind of horrible, so 
I can understand your choice.

Duncan Murdoch
#
Bill

 

So, as I understand it, that's just unzipping the file to a temporary directory, which then allows read_fst to access the file directly.

 

Jeff

 
