Suppose I have a file with the following structure - call the two space-separated fields 'label' and 'count':

ABC 3
DDG 5
ABB 2

What I need to do is parse each line of the file and then, depending on the value of 'count', write the value of 'label' to a new file 'count' times. In other words, take the preceding and output

ABC
ABC
ABC
DDG
DDG
DDG
DDG
DDG
ABB
ABB

I was wondering if there is an elegant/simple way to do this? I can do this relatively easily in perl or awk, but am stumped trying to get a bit of R code to accomplish the same thing.

Many thanks in advance...
Figured it out on my own. Basically, use the replicate command for each line of the data.frame, then append the result to a file.

On 7/6/2021 9:27 AM, Evan Cooch wrote:
> Suppose I have a file with the following structure - call the two
> space-separated fields 'label' and 'count':
> [...]
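Evan did not post his code, so the following is only a guess at what a row-by-row version might look like. It is a minimal sketch that assumes the two-column layout from the question; the temporary file names and the column names 'label' and 'count' are made up for illustration. It uses rep(), although replicate(count, label) would build the same vector for a single row. Rather than reopening the file for every row, it keeps one connection open and writes each row's repeats to it in turn.

```r
# Hypothetical file names; the input is recreated here so the sketch runs on its own.
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("ABC 3", "DDG 5", "ABB 2"), infile)

# Read the two space-separated fields; the column names are an assumption.
dat <- read.table(infile, col.names = c("label", "count"),
                  stringsAsFactors = FALSE)

# Open the output once, then write each row's repeated label in turn.
con <- file(outfile, open = "w")
for (i in seq_len(nrow(dat))) {
  writeLines(rep(dat$label[i], dat$count[i]), con)
}
close(con)

result <- readLines(outfile)
```

A row with a count of 0 simply contributes no lines, because writeLines() writes nothing for a zero-length vector.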
Hi Evan,

I assume you know how to get the data into a data frame (e.g. via read.csv). Here I will create the example data explicitly, creating a data frame x.

    x <- data.frame( label=c("ABC","DDG","ABB"), count=c(3,5,2) )

Then create a character vector with the data as you want it.

    # lapply() always returns a list; sapply() would return a matrix if all counts were equal
    y <- unlist(lapply( seq_len(nrow(x)), function(i) rep( x$label[i], x$count[i] ) ))

Finally print it to a file, say 'myfile' (to get one element per line I did a bit of a trick: t(t(y)) turns the vector into a one-column matrix).

    write.table(x=t(t(y)),file="myfile",row.names=FALSE,col.names=FALSE,quote=FALSE)

HTH,
Eric

On Wed, Jul 7, 2021 at 10:15 AM Evan Cooch <evan.cooch at gmail.com> wrote:
> Suppose I have a file with the following structure - call the two
> space-separated fields 'label' and 'count':
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
On Tue, 6 Jul 2021 09:27:20 -0400
Evan Cooch <evan.cooch at gmail.com> wrote:

> I was wondering if there was an elegant/simple way to do this?

rep(label, times = count) should give you a character vector with the answer ready for writeLines().

-- 
Best regards,
Ivan
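Ivan's one-liner can be put together with the reading and writing steps roughly as follows. This is a sketch under assumptions, not code from the thread: the temporary file names and the column names 'label' and 'count' are invented for the example, and the input file is recreated first so the snippet is self-contained.

```r
# Hypothetical file names; the input is recreated here so the sketch runs on its own.
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("ABC 3", "DDG 5", "ABB 2"), infile)

# Read the two space-separated fields; the column names are an assumption.
dat <- read.table(infile, col.names = c("label", "count"),
                  stringsAsFactors = FALSE)

# rep() is vectorised over 'times': each label is repeated by its own count.
expanded <- rep(dat$label, times = dat$count)

# writeLines() writes one element per line, so no reshaping is needed.
writeLines(expanded, outfile)
```

Because rep() handles the whole column at once, there is no need for an explicit loop over rows or for appending to the output file.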
Hello,

Use ?rep.
Since you say you have a file, in the code below I will read the data from a connection. Then create the string.

    txtfile <- "
    ABC 3
    DDG 5
    ABB 2
    "

    tc <- textConnection(txtfile)
    df1 <- read.table(tc)
    close(tc)

    rep(df1[[1]], df1[[2]])
    #[1] "ABC" "ABC" "ABC" "DDG" "DDG" "DDG" "DDG" "DDG" "ABB" "ABB"

Hope this helps,

Rui Barradas

At 14:27 on 06/07/21, Evan Cooch wrote:
> Suppose I have a file with the following structure - call the two
> space-separated fields 'label' and 'count':
> [...]
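For readers following the pointer to ?rep, the short sketch below contrasts the forms of rep() that are easy to mix up. The vectors 'labels' and 'counts' are made-up stand-ins for the two columns in the question; only base R is used.

```r
# Hypothetical stand-ins for the two columns of the example file.
labels <- c("ABC", "DDG", "ABB")
counts <- c(3, 5, 2)

# 'times' as a vector of the same length: each label gets its own count.
by_element <- rep(labels, times = counts)

# 'times' as a single number: the whole vector is repeated end to end.
whole_vec <- rep(labels, times = 2)
# "ABC" "DDG" "ABB" "ABC" "DDG" "ABB"

# 'each': every element is repeated in place before moving to the next.
each_two <- rep(labels, each = 2)
# "ABC" "ABC" "DDG" "DDG" "ABB" "ABB"

# A count of zero drops that label from the result.
with_zero <- rep(c("a", "b"), times = c(0, 2))
# "b" "b"
```

The first form is the one the question needs; a vector-valued 'times' must have the same length as the vector being repeated.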