Extracting from zip, removing certain file extensions
4 messages · Mathew Brown, jim holtman, Duncan Murdoch
Use pattern matching (regular expressions), e.g.:
myFileNames[grepl("slt$", myFileNames)]
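A small self-contained sketch of the same idea, using a hypothetical three-element vector in place of the real file list. Escaping the dot ("\\.slt$") is a slight tightening of the pattern above; it assumes that only names ending in the literal extension .slt should be kept. The names isSlt, sltFiles and otherFiles are made up for illustration.

```r
# Hypothetical sample standing in for the vector returned by unzip()
myFileNames <- c("./2011/A20112961503.flx",
                 "./2011/A20112961503.log",
                 "./2011/A20113211730.slt")

# grepl() returns one TRUE/FALSE per element; "\\.slt$" matches a
# literal ".slt" at the very end of the string
isSlt <- grepl("\\.slt$", myFileNames)

sltFiles   <- myFileNames[isSlt]    # names to keep
otherFiles <- myFileNames[!isSlt]   # names that were filtered out
```

If the unwanted files should also be deleted from disk, file.remove(otherFiles) would probably do it, but it is worth inspecting otherFiles first.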
On Tue, Nov 29, 2011 at 8:36 AM, Mathew Brown
<mathew.brown at forst.uni-goettingen.de> wrote:
Hi there,

I'm running R on Windows 7 with RStudio. Every day I receive a zip file where a bunch of half-hourly files are zipped together. I then use xx=unzip(ind) to get xx, which consists of:

 [1] "./2011/A20112961503.flx" "./2011/A20112961503.log" "./2011/A20113211730.slt" "./2011/A20113211800.slt" "./2011/A20113211830.slt" "./2011/A20113211900.slt"
 [7] "./2011/A20113211930.slt" "./2011/A20113212000.slt" "./2011/A20113212030.slt" "./2011/A20113212100.slt" "./2011/A20113212130.slt" "./2011/A20113212200.slt"
[13] "./2011/A20113212230.slt" "./2011/A20113212300.slt" "./2011/A20113212330.slt" "./2011/A20113220000.slt" "./2011/A20113220030.slt" "./2011/A20113220100.slt"
[19] "./2011/A20113220130.slt" "./2011/A20113220200.slt" "./2011/A20113220230.slt" "./2011/A20113220300.slt" "./2011/A20113220330.slt" "./2011/A20113220400.slt"
[25] "./2011/A20113220430.slt" "./2011/A20113220500.slt" "./2011/A20113220530.slt" "./2011/A20113220600.slt" "./2011/A20113220630.slt" "./2011/A20113220700.slt"
[31] "./2011/A20113220730.slt" "./2011/A20113220800.slt" "./2011/A20113220830.slt" "./2011/A20113220900.slt" "./2011/A20113220930.slt" "./2011/A20113221000.slt"
[37] "./2011/A20113221030.slt" "./2011/A20113221100.slt" "./2011/A20113221130.slt" "./2011/A20113221200.slt" "./2011/A20113221230.slt" "./2011/A20113221300.slt"
[43] "./2011/A20113221330.slt" "./2011/A20113221400.slt" "./2011/A20113221430.slt" "./2011/A20113221500.slt" "./2011/A20113221530.slt" "./2011/A20113221600.slt"
[49] "./2011/A20113221630.slt" "./2011/A20113221700.slt" "./2011/A20113221730.slt"

What I want is to keep all the slt files and remove the other file types. How do I remove all the non-slt files from xx? I want this to be automated so I don't have to state the entire file name each time.

Thanks
Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Use a regular expression:
xx <- grep("slt$", xx, value=TRUE)
If you want to do more complicated matching, read ?glob2rx or ?regexp.
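As a rough illustration of the glob2rx() route, here is a sketch on a hypothetical four-element sample vector. The comparison via tools::file_ext() is an alternative that is not part of the advice above; it is included only as a cross-check. The names pattern, viaGlob and viaExt are made up for this example.

```r
# Hypothetical sample standing in for the vector returned by unzip()
xx <- c("./2011/A20112961503.flx",
        "./2011/A20112961503.log",
        "./2011/A20113211730.slt",
        "./2011/A20113211800.slt")

# glob2rx() translates a shell-style wildcard into a regular expression
pattern <- utils::glob2rx("*.slt")
viaGlob <- grep(pattern, xx, value = TRUE)

# Alternative (an assumption, not from the thread): compare the
# file extension directly instead of matching a pattern
viaExt <- xx[tools::file_ext(xx) == "slt"]
```

Both approaches should select the same two .slt names here; the wildcard form may be easier to read for anyone more used to shell globs than to regular expressions.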
Duncan Murdoch
Great, many thanks.
On 11/29/2011 3:09 PM, Duncan Murdoch wrote:
Use a regular expression:
xx <- grep("slt$", xx, value=TRUE)
If you want to do more complicated matching, read ?glob2rx or ?regexp.
Duncan Murdoch