Hello List,

I have a one-column data frame that stores file names with their extensions. I want to create a new column that holds the file name only, without the extension. I tried strsplit("name1.csv", "\\.")[[1]], but it handles only the first row and returns a vector. How can I do this for all rows and put the result into a new column?

Thank you,
Kai
Hello,

I would suggest `tools::file_path_sans_ext` instead of `strsplit` to remove the file extension. It is vectorized, so you won't need to wrap it in `sapply` or `vapply`.

I hope this helps!

On Wed, Jul 7, 2021 at 9:28 PM Kai Yang via R-help <r-help at r-project.org> wrote:
> [...]
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
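As a minimal sketch of this suggestion applied to a data frame column: the data frame `d`, the column names `fn` and `base`, and the file names below are made up for illustration. Note that `file_path_sans_ext` strips only the final extension, so a name with an inner period keeps it.

```r
# Hypothetical example data; the column names are assumptions, not from the original post.
d <- data.frame(fn = c("name1.csv", "name2.txt", "report.final.pdf"),
                stringsAsFactors = FALSE)

# Vectorized: the whole column is processed in one call, no loop needed.
d$base <- tools::file_path_sans_ext(d$fn)

d$base
# [1] "name1"        "name2"        "report.final"
```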
You would need to loop over the list that strsplit() returns; the confusion here is about list structure.

Here's a simple way to do it with a regex, **assuming that there is only one period in each name and that it delineates the extension.** If this is not true, then this **will fail**, because everything from the first period onward is removed. This is vectorized, and so will be more efficient than looping.

    d <- data.frame(fn = c("name1.csv", "name2.txt"))
    d
    d$first <- sub("\\..+", "", d$fn)
    d

Cheers,
Bert
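To make the caveat concrete, here is a small sketch of what the pattern `"\\..+"` does when a name contains more than one period. The vector `fn` and its file names are invented for the example.

```r
# Hypothetical file names: one ordinary, one with two periods, one with none.
fn <- c("name1.csv", "my.data.csv", "README")

# The pattern matches from the FIRST period to the end of the string,
# so "my.data.csv" loses ".data.csv", not just ".csv".
first <- sub("\\..+", "", fn)

first
# [1] "name1"  "my"     "README"
```

A name without any period is returned unchanged, since the pattern finds nothing to replace.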
On Thu, 8 Jul 2021 01:27:48 +0000 (UTC) Kai Yang via R-help <r-help at r-project.org> wrote:

> I tried to use strsplit("name1.csv", "\\.")[[1]] to do that, but it
> just retain the first row only and it is a vector. how can do this
> for all of rows and put it into a new column?

Your example is confusing/garbled. You are applying strsplit() to a single character string, namely "name1.csv". Your intent presumably is to apply it to a *vector* (say "v") of file names. A syntax which would work is

    sapply(strsplit(v, "\\."), function(x){x[1]})

E.g.

    v <- c("clyde.txt", "irving.tex", "melvin.pdf", "fred.csv")
    sapply(strsplit(v, "\\."), function(x){x[1]})

which gives the output

    [1] "clyde"  "irving" "melvin" "fred"

Note that the output of strsplit() is a *list*, the i-th entry of which is a vector consisting of the "split" of the i-th entry of the vector to which strsplit() is applied. In your example (corrected, so that it makes sense, by replacing the string "name1.csv" by my vector "v") you get

    strsplit(v, "\\.")[[1]]
    [1] "clyde" "txt"

the result of splitting "clyde.txt"; you want the first entry of this result, i.e. "clyde". My sapply() construction produces the first entry of each entry of the list produced by strsplit().

It is useful to get your thoughts clear, understand what you are doing and understand what the functions that you are using do. (Read the help!)

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
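The list structure described above, and the step of putting the result into a new column, can be sketched as follows. This is one possible variant, not the only way to do it: it swaps sapply() for vapply(), which checks that each result is a single string, and uses `fixed = TRUE` so the period is matched literally without a regex. The data frame `d` and the column name `name` are assumptions for the example.

```r
v <- c("clyde.txt", "irving.tex", "melvin.pdf", "fred.csv")

# strsplit() returns a list with one character vector per input string.
parts <- strsplit(v, ".", fixed = TRUE)
length(parts)   # 4, one entry per file name
parts[[2]]      # the pieces of "irving.tex": "irving" "tex"

# Hypothetical data frame; take the first piece of every list entry.
d <- data.frame(fn = v, stringsAsFactors = FALSE)
d$name <- vapply(parts, function(x) x[1], character(1))

d$name
# [1] "clyde"  "irving" "melvin" "fred"
```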
sub( "\\.[^.]*$", "", fname )
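A short sketch of how this pattern behaves, assuming `fname` is a character vector of file names (the values below are invented). Because `[^.]*$` cannot cross a period, only the last period and what follows it are removed. One edge case worth knowing: a name that starts with a period and has no other one is reduced to an empty string.

```r
# Hypothetical file names, including some edge cases.
fname <- c("name1.csv", "my.data.csv", "README", ".bashrc")

# Remove the LAST period and everything after it.
stem <- sub("\\.[^.]*$", "", fname)

stem
# [1] "name1"   "my.data" "README"  ""

# Storing the result as a new data frame column (column names are assumptions).
d <- data.frame(fn = fname, stringsAsFactors = FALSE)
d$stem <- stem
```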