Jonas Walter
2013-Feb-25 13:38 UTC
[R] creating variable that codes for the match/mismatch between two other variables
Dear all,

I have two vectors coding for the stimulus presented in the current trial (mydat$Stimulus) and the prediction made in the same trial (mydat$Prediction), respectively. Using an if-conditional, I want to create a new vector that indicates whether the two vectors match in a given trial, that is, whether the prediction equals the stimulus.

When I pick out some trials at random, I find some trials with no match (mydat$Stimulus[1] != mydat$Prediction[1]) as well as some trials with a match (mydat$Stimulus[1] == mydat$Prediction[1]).

However, if I apply the following code, every trial is coded as a match. Why? What am I doing wrong?

In some blocks, no prediction was recorded. I want those trials to be labeled differently [that is, match = 7].

Coding legend:

1 = match
0 = no match
7 = no prediction recorded

The code:

# create variable that codes match/mismatch of prediction vs. stimulus

mydat$match <- 0

for (i in seq_along(1:nrow(mydat))) {
  # if there is a match, mydat$match[i] = 1
  if (mydat$Stimulus[i] == mydat$Prediction[i]) {
    mydat$match = 1
  # the next two conditions refer to blocks without prediction recording.
  # Therefore, the corresponding trials are coded with mydat$match[i] = 7.
  } else if (mydat$BlockOrder[i] == 1 & mydat$Block_nr[i] == 1) {
    mydat$match = 7
  } else if (mydat$BlockOrder[i] == 2 & mydat$Block_nr[i] == 4) {
    mydat$match == 7
  }
}

# The corresponding dataframe structure:

str(mydat)
'data.frame': 9302 obs. of 18 variables:
 $ BlockOrder       : int 1 1 1 1 1 1 1 1 1 1 ...
 $ Block_nr         : num 1 1 1 1 1 1 1 1 1 1 ...
 $ Trial_nr         : int 1 2 3 4 5 6 7 8 9 10 ...
 $ PreSeq.Length    : int 1 2 2 1 1 2 0 2 2 2 ...
 $ PreSeq           : int 21 12 21 20 20 12 0 21 22 11 ...
 $ Sequence         : int 121111 121212 121111 121111 112212 121221 121111 121111 122112 121111 ...
 $ Category         : int 2 1 3 2 1 1 3 3 1 3 ...
 $ FixCross.Latency : int 1429 1043 1093 1297 1155 1449 1140 1396 1341 1427 ...
 $ Stimulus         : int 2 1 2 2 1 1 1 1 2 1 ...
 $ RT               : int 333 275 378 428 442 388 340 394 414 542 ...
 $ RT.Button_pressed: int 2 1 2 2 1 1 1 1 2 1 ...
 $ RT.Accuracy      : int 1 1 1 1 1 1 1 1 1 1 ...
 $ Prediction       : int 0 0 0 0 0 0 0 0 0 0 ...
 $ Confidence       : int 0 0 0 0 0 0 0 0 0 0 ...
 $ ITI              : int 1053 1182 1467 1431 1103 1170 1232 1393 1356 1495 ...
 $ Subject          : num 4 4 4 4 4 4 4 4 4 4 ...
 $ ITruns           : num 0 0 0 1 0 1 2 3 0 0 ...
 $ match            : num 1 1 1 1 1 1 1 1 1 1 ...

# mydat$match, the new variable, only contains ones.

min(mydat$match)
[1] 1
max(mydat$match)
[1] 1

# example: row 1699: no match Stimulus - Prediction

mydat$Stimulus[1699] == mydat$Prediction[1699]
# [1] FALSE

# but:

mydat$match[1699]
# [1] 1

How can I get the right coding? Where is the mistake?

Thanks!

Best,
Jonas Walter
PIKAL Petr
2013-Feb-25 14:53 UTC
[R] creating variable that codes for the match/mismatch between two other variables
Hi

Well, why so complicated?

(mydat$Stimulus == mydat$Prediction)*1

gives you a vector with 1 where there is a match and 0 where there is no match.

I do not understand your "no prediction" case, though. How is a missing prediction coded? If it is coded as NA, the resulting vector will have NA in the corresponding position too.

Regards
Petr
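A note on why the loop in the original post codes every trial as 1: the assignments inside the loop are written as mydat$match = 1 and mydat$match = 7, without the [i] index, so each one overwrites the whole column rather than a single row, and the column ends up holding whatever the last assigning iteration wrote. The third branch uses == (a comparison) instead of an assignment, so it does nothing. The sketch below shows one possible fix, both as a corrected loop and as a vectorized version. The toy data frame is invented for illustration, the names no_pred and match_loop are hypothetical, and it assumes that a missing prediction is stored as 0 rather than NA, as the str() output in the question suggests.

```r
# Toy data standing in for mydat (values invented for illustration only)
mydat <- data.frame(
  BlockOrder = c(1, 1, 1, 1, 2, 2),
  Block_nr   = c(1, 1, 2, 2, 4, 3),
  Stimulus   = c(2, 1, 2, 1, 1, 2),
  Prediction = c(0, 0, 2, 2, 0, 2)
)

# Blocks without prediction recording (conditions taken from the original loop)
no_pred <- (mydat$BlockOrder == 1 & mydat$Block_nr == 1) |
           (mydat$BlockOrder == 2 & mydat$Block_nr == 4)

# Corrected loop: index with [i] so only the current row is changed
match_loop <- numeric(nrow(mydat))   # starts as all 0 (= no match)
for (i in seq_len(nrow(mydat))) {
  if (mydat$Stimulus[i] == mydat$Prediction[i]) {
    match_loop[i] <- 1               # match
  } else if (no_pred[i]) {
    match_loop[i] <- 7               # no prediction recorded
  }
}

# Vectorized version with the same branch order as the loop:
# 1 = match, otherwise 7 if no prediction was recorded, otherwise 0
mydat$match <- ifelse(mydat$Stimulus == mydat$Prediction, 1,
                      ifelse(no_pred, 7, 0))

print(mydat$match)
# [1] 7 7 1 0 7 1
```

If missing predictions were stored as NA instead of 0, the comparison would return NA for those rows (and the if() in the loop would stop with an error), so they would need to be handled with is.na() first.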