Mathew Brown
2011-Nov-29 13:36 UTC
[R] Extracting from zip, removing certain file extensions
Hi there,

I'm running R on Windows 7 with RStudio. Every day I receive a zip file containing a bunch of half-hourly files. I then use

    xx = unzip(ind)

to get xx, which consists of:

 [1] "./2011/A20112961503.flx" "./2011/A20112961503.log" "./2011/A20113211730.slt" "./2011/A20113211800.slt" "./2011/A20113211830.slt" "./2011/A20113211900.slt"
 [7] "./2011/A20113211930.slt" "./2011/A20113212000.slt" "./2011/A20113212030.slt" "./2011/A20113212100.slt" "./2011/A20113212130.slt" "./2011/A20113212200.slt"
[13] "./2011/A20113212230.slt" "./2011/A20113212300.slt" "./2011/A20113212330.slt" "./2011/A20113220000.slt" "./2011/A20113220030.slt" "./2011/A20113220100.slt"
[19] "./2011/A20113220130.slt" "./2011/A20113220200.slt" "./2011/A20113220230.slt" "./2011/A20113220300.slt" "./2011/A20113220330.slt" "./2011/A20113220400.slt"
[25] "./2011/A20113220430.slt" "./2011/A20113220500.slt" "./2011/A20113220530.slt" "./2011/A20113220600.slt" "./2011/A20113220630.slt" "./2011/A20113220700.slt"
[31] "./2011/A20113220730.slt" "./2011/A20113220800.slt" "./2011/A20113220830.slt" "./2011/A20113220900.slt" "./2011/A20113220930.slt" "./2011/A20113221000.slt"
[37] "./2011/A20113221030.slt" "./2011/A20113221100.slt" "./2011/A20113221130.slt" "./2011/A20113221200.slt" "./2011/A20113221230.slt" "./2011/A20113221300.slt"
[43] "./2011/A20113221330.slt" "./2011/A20113221400.slt" "./2011/A20113221430.slt" "./2011/A20113221500.slt" "./2011/A20113221530.slt" "./2011/A20113221600.slt"
[49] "./2011/A20113221630.slt" "./2011/A20113221700.slt" "./2011/A20113221730.slt"

What I want is to keep all the slt files and remove the other file types. How do I remove all the non-slt files from xx? I want this to be automated so I don't have to state the entire file name each time.

Thanks
jim holtman
2011-Nov-29 14:04 UTC
[R] Extracting from zip, removing certain file extensions
Use pattern matching (regular expressions), e.g.,

    myFileNames[grepl("slt$", myFileNames)]

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
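The logical-indexing approach in this reply can be sketched on a shortened sample of the vector from the question. The name xx comes from the original post; the names keep, slt_files and other_files are illustrative, and the stricter pattern "\\.slt$" is an assumption added here, not part of the reply.

```r
# A shortened sample of the vector returned by unzip() in the question
xx <- c("./2011/A20112961503.flx",
        "./2011/A20112961503.log",
        "./2011/A20113211730.slt",
        "./2011/A20113211800.slt")

# grepl() returns one TRUE/FALSE per element.
# "\\.slt$" matches names that end in a literal ".slt".
keep <- grepl("\\.slt$", xx)

slt_files   <- xx[keep]    # names ending in .slt
other_files <- xx[!keep]   # everything else (.flx, .log, ...)
```

The shorter pattern "slt$" from the reply selects the same files for this data; escaping the dot only guards against a name that happens to end in "slt" without it being the extension.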
Duncan Murdoch
2011-Nov-29 14:09 UTC
[R] Extracting from zip, removing certain file extensions
Use a regular expression:

    xx <- grep("slt$", xx, value = TRUE)

If you want to do more complicated matching, read ?glob2rx or ?regexp.

Duncan Murdoch
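For readers who prefer wildcards to regular expressions, here is a minimal sketch of the glob2rx() route mentioned in this reply, next to an alternative based on tools::file_ext(). The file_ext() variant is not from the thread and is shown only as one possible option; the sample vector is a shortened version of the one in the question, and the names pattern, slt_by_glob and slt_by_ext are illustrative.

```r
# A shortened sample of the vector returned by unzip() in the question
xx <- c("./2011/A20112961503.flx",
        "./2011/A20112961503.log",
        "./2011/A20113211730.slt",
        "./2011/A20113211800.slt")

# glob2rx() turns a shell-style wildcard into a regular expression
pattern <- utils::glob2rx("*.slt")

# value = TRUE makes grep() return the matching names instead of their indices
slt_by_glob <- grep(pattern, xx, value = TRUE)

# Alternative without any regular expression: compare the extension directly
slt_by_ext <- xx[tools::file_ext(xx) == "slt"]
```

Both variants give the same result on this sample. Comparing extensions directly may be easier to read when only one file type is wanted, while a pattern is more flexible if the selection rule grows.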