Hi, my data looks like this:
  id    forma       program     kod   obor                         rocnik stav       ukrok
1 10001 kombinovaná Matematika  M1101 matematika                   1      zanechal   2002/2003
2 10002 prezenční   Informatika N1801 teoretická informatika       1      studuje
3 10002 prezenční   Informatika B1801 obecná informatika           3      absolvoval 2008/2009
4 10003 prezenční   Informatika M1801 softwarové systémy           5      absolvoval 2005/2006
5 10004 prezenční   Informatika B1801 obecná informatika           2      zanechal   2007/2008
6 10005 kombinovaná Informatika P1801 diskrétní modely a algoritmy 2      zanechal   2004/2005

data$ukrok is a factor
data$rocnik is numeric

I want to create a new column (data$z) that holds, for each row,
as.numeric(first 4 characters of data$ukrok) - data$rocnik.
If ukrok is empty, it means 2009.
I know how to do it with a for loop, but that is not the right way: I have
too many observations and it is too slow.
Does anyone know how to do it using tapply, or another apply function?
--
View this message in context:
http://r.789695.n4.nabble.com/help-with-tapply-or-other-apply-tp2122683p2122683.html
Sent from the R help mailing list archive at Nabble.com.
You don't show how you are doing it with
a 'for' loop, but I suspect that you just
need to eliminate the subscript you are
using for rows.
For example:
for(i in 1:nrow(data)) {
    data$z[i] <- data[i, 'x'] + data[i, 'y']
}
can be written more simply and much more
efficiently as:
data$z <- data[, 'x'] + data[, 'y']
Using an "apply" function is not going to
improve the efficiency. This is the subject
of Circles 3 and 4 of 'The R Inferno'.
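
To make the point concrete, here is a minimal sketch on made-up data. The data frame and its columns x and y are hypothetical stand-ins, not the poster's data. It builds the same column three ways and checks that the results agree; the sapply() version still calls a function once per row, which is why it should not be expected to beat the vectorized form.

```r
# Hypothetical toy data; x and y stand in for real columns.
data <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Row-by-row loop: one subscripted assignment per row.
z_loop <- numeric(nrow(data))
for (i in seq_len(nrow(data))) {
  z_loop[i] <- data[i, "x"] + data[i, "y"]
}

# Vectorized form: a single expression that adds whole columns.
z_vec <- data[, "x"] + data[, "y"]

# sapply() hides the loop but still evaluates the function once per row.
z_apply <- sapply(seq_len(nrow(data)),
                  function(i) data[i, "x"] + data[i, "y"])

# All three approaches produce the same vector.
print(identical(z_loop, z_vec))
print(identical(z_apply, z_vec))
```

On a large data frame, wrapping each variant in system.time() is a simple way to compare them.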
On 02/05/2010 11:26, peterko wrote:
> [...]
--
Patrick Burns
pburns at pburns.seanet.com
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')
On 05/02/2010 08:26 PM, peterko wrote:
> [...]

Hi Peterko,
You can do this with vector calculation like this:

# make sure that ukrok is character, not factor
data$ukrok<-as.character(data$ukrok)
# set the empty elements to the correct value
data$ukrok[nchar(data$ukrok)==0]<-"2009/2009"
# split ukrok on "/", unlist it, take only the first component,
# convert to numeric and subtract rocnik
data$z<-as.numeric(matrix(unlist(strsplit(data$ukrok,"/")),
 ncol=2,byrow=TRUE)[,1])-data$rocnik

Jim
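
Since the question asks for the first four characters of ukrok, the same result can probably also be reached with substr() instead of splitting on the slash. The following is only a sketch under assumptions: the data frame is a cut-down, hand-typed stand-in for the data shown in the question (only rocnik and ukrok), and empty ukrok values are assumed to be empty strings rather than NA.

```r
# Hypothetical stand-in for the poster's data: only the two columns needed.
data <- data.frame(
  rocnik = c(1, 1, 3, 5, 2, 2),
  ukrok  = factor(c("2002/2003", "", "2008/2009",
                    "2005/2006", "2007/2008", "2004/2005"))
)

# Convert the factor to character so we work on labels, not integer codes.
ukrok <- as.character(data$ukrok)

# Take the first four characters; an empty string means 2009.
start_year <- ifelse(nzchar(ukrok), substr(ukrok, 1, 4), "2009")

# Vectorized subtraction over the whole column, no loop needed.
data$z <- as.numeric(start_year) - data$rocnik

print(data)
```

Unlike the strsplit() version, this leaves data$ukrok itself unchanged, which may or may not be what you want.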
Thank you Jim, it works very well. I do not need to do it with tapply after all; I only thought that was the way. And thank you too, Patrick.