markleeds at verizon.net
2006-Jul-05 21:33 UTC
[R] apologes if you already saw this :efficiency question
Hi everyone: I'm not sure whether my previous mail about this got sent. I was typing, accidentally hit a button, and lost what I had written.

Anyway, I have the code below (it works), in which I run through the rows of a data frame, take the first two fields, which are character strings (with some extra spacing, so I use gsub), and append these character strings to a list so that I can build one big list.

There are 17,000 rows, so I was hoping there might be a more efficient way to do this (even a slight improvement would do; it doesn't have to be incredible). I also think I remember someone saying that using the c command to make something bigger is not a good idea.

The code is below. Thanks.

    for (paircounter in 1:nrow(tempdata)) {

        # strip the extra spaces from the first two fields of this row
        firststock  <- gsub(" ", "", tempdata[paircounter, 1])
        secondstock <- gsub(" ", "", tempdata[paircounter, 2])

        if (paircounter == 1) {
            stocklist <- c(firststock, secondstock)
        } else {
            stocklist <- c(stocklist, firststock, secondstock)
        }
    }
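On the worry about using c() to make something bigger: each c(stocklist, ...) call copies everything accumulated so far, so the work grows with the length of the result. A minimal sketch of one common remedy, keeping the loop but allocating the result once, is shown here. The contents of `tempdata` are a made-up stand-in, since the real data frame is not shown in the post.

```r
# Hypothetical stand-in for the poster's data frame; the real `tempdata` is not shown.
tempdata <- data.frame(first  = c(" IBM ", "MSFT  ", " GE"),
                       second = c("AAPL ", " T", "XOM  "),
                       stringsAsFactors = FALSE)

n <- nrow(tempdata)

# Allocate the full result once instead of growing it with c() on every pass.
stocklist <- character(2 * n)

for (paircounter in seq_len(n)) {
  # Slots 2k-1 and 2k hold the two cleaned strings from row k.
  stocklist[2 * paircounter - 1] <- gsub(" ", "", tempdata[paircounter, 1])
  stocklist[2 * paircounter]     <- gsub(" ", "", tempdata[paircounter, 2])
}

print(stocklist)
```

This keeps the row-by-row order of the original loop; whether the gain matters at 17,000 rows would have to be measured on the real data.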
jim holtman
2006-Jul-06 02:49 UTC
[R] apologes if you already saw this :efficiency question
Is this what you want to do?

> x <- data.frame(a=paste(letters[1:10], 1:10),
+     b=paste(letters[11:20], 1:10), c=paste(LETTERS[1:10], 1:10))
> x
      a    b    c
1   a 1  k 1  A 1
2   b 2  l 2  B 2
3   c 3  m 3  C 3
4   d 4  n 4  D 4
5   e 5  o 5  E 5
6   f 6  p 6  F 6
7   g 7  q 7  G 7
8   h 8  r 8  H 8
9   i 9  s 9  I 9
10 j 10 t 10 J 10
> (y <- as.vector(t(x[,1:2])))
 [1] "a 1"  "k 1"  "b 2"  "l 2"  "c 3"  "m 3"  "d 4"  "n 4"  "e 5"  "o 5"  "f 6"  "p 6"  "g 7"  "q 7"
[15] "h 8"  "r 8"  "i 9"  "s 9"  "j 10" "t 10"
> gsub(" ", "", y)
 [1] "a1"  "k1"  "b2"  "l2"  "c3"  "m3"  "d4"  "n4"  "e5"  "o5"  "f6"  "p6"  "g7"  "q7"  "h8"  "r8"
[17] "i9"  "s9"  "j10" "t10"
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?
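Applied to a data frame shaped like the one in the original question, the transpose-and-flatten idea from this reply might look as follows. This is only a sketch: the contents of `tempdata` and the extra `price` column are invented for illustration, and `fixed = TRUE` is an optional addition that treats the space as a literal string rather than a regular expression.

```r
# Hypothetical stand-in for the poster's `tempdata`; a third column is included
# only to show that the first two columns are the ones being used.
tempdata <- data.frame(first  = c(" IBM ", "MSFT  ", " GE"),
                       second = c("AAPL ", " T", "XOM  "),
                       price  = c(1.5, 2.5, 3.5),
                       stringsAsFactors = FALSE)

# t() turns the two selected columns into a 2 x n character matrix;
# as.vector() reads that matrix column by column, which interleaves the
# first and second fields row by row, as the loop in the question does.
pairs <- as.vector(t(tempdata[, 1:2]))

# gsub() is vectorized, so a single call cleans every element.
stocklist <- gsub(" ", "", pairs, fixed = TRUE)

print(stocklist)
```

No loop and no repeated c() calls are needed, because both the reshaping and the substitution operate on whole vectors.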
markleeds at verizon.net
2006-Jul-06 05:52 UTC
[R] apologes if you already saw this :efficiency question
jim: I don't want to take advantage of your kindness and generosity, but when you have time, could you think about the following?

Remember the function Gabor gave me to pick out, among same-named columns of a data frame, the column that had the most nonzero elements? It was

    tapply(seq(DF), names(DF), f)

where f was

    function(x) x[which.max(colSums(DF[x] != 0))]

I was hoping it wouldn't be too difficult to change the criterion to the following. Rather than pick out the column with the maximum number of nonzero elements, I want to take the average of the same-named columns, but without including zero-valued elements in any row. So the resulting matrix would have the unique names as its columns, and each column would hold the averages of the same-named columns; if a column had a zero in one of its rows, that zero wouldn't be included in the average. Basically, this is because in this case zero doesn't really mean 0. It means "leave it out because it's not involved".

I'm sorry to bother you; it's not urgent, and I won't start bothering you all the time. I am very aware (not in the R sense but in other ways) of how generosity can get taken advantage of, so that's the last thing I want to do. Thanks a lot.

Also, sometimes examples help, so if you need one, I can definitely make one up. Actually, I will make one up and send it to you in the next email. I want to send this now because if I write too long an email, my email dies and I lose it.

Mark
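One possible reading of the request above is a row-by-row average across same-named columns, with zeros treated as missing. The sketch below follows that reading only; it is not the function from Gabor mentioned in the message, and the data frame `DF` is invented for illustration.

```r
# Made-up example data with repeated column names. check.names = FALSE keeps
# the duplicate names instead of renaming them to a, b, a.1, b.1.
DF <- data.frame(a = c(1, 0, 3), b = c(0, 0, 4),
                 a = c(3, 2, 0), b = c(2, 0, 0),
                 check.names = FALSE)

m <- as.matrix(DF)

# Zero means "not involved" here, so mark zeros as missing.
m[m == 0] <- NA

# For each unique name, average the same-named columns row by row,
# skipping the missing (originally zero) entries.
avg <- sapply(unique(colnames(m)), function(nm) {
  rowMeans(m[, colnames(m) == nm, drop = FALSE], na.rm = TRUE)
})

print(avg)
```

A row in which every same-named column is zero has nothing to average, so rowMeans() returns NaN for it (row 2 of `b` in this example). Whether that should instead become 0 or NA depends on what the result is used for.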