It's generally a very good idea to examine the structure of your data after you have read it in. str(data_2) would have shown you that read.csv() turned your strings into factors, and that's why the == operator no longer does what you think it does.

Use ...

data_2 <- read.csv("excel_data.csv", stringsAsFactors = FALSE)

... to turn this off. Also, the %in% operator will achieve more directly what you are trying to do. No need for loops.

B.

> On Oct 12, 2017, at 4:25 PM, Yasin Gocgun <yasing053 at gmail.com> wrote:
>
> Hi,
>
> I have two columns that contain numbers along with letters (as shown below)
> and have different lengths. Each entry in the first column is likely to be
> found in the second column at most once.
>
> For each entry of the first column, if that entry is found in the second
> column, I would like to get the corresponding index. For instance, if the
> first entry of the first column is 5th entry in the second column, I would
> like to keep this index 5.
>
> AST2017000005534 TUR2017000001428
> CTS2017000079930 CTS2017000071989
> CTS2017000079931 CTS2017000072015
>
> In a loop, when I use the following code to get those indices,
>
> data_2 = read.csv("excel_data.csv")
> column_1 = data_2$data1
> column_2 = data_2$data2
>
> match_list <- array(0,dim=c(310,1)); # 310 is the length of the first column
>
> for (indx in 1: 310){
>   for(indx2 in 1:713){ # 713 is the length of the second column
>     if(column_1[indx] == column_2[indx2] ){
>       match_list[indx,1] = indx2;
>       break;
>     }
>   }
> }
>
> R provides the following error:
>
> Error in Ops.factor(column_1[indx], column_2[indx2]) :
>   level sets of factors are different
>
> So can someone explain me how I can resolve this issue?
>
> Thank you,
>
> Yasin
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
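To make the factor problem concrete, here is a minimal sketch, not taken from the thread. The vectors col1 and col2 are made-up stand-ins for the two columns (the ID strings are borrowed from the question), and f1 and f2 are hypothetical names. It shows the same comparison failing on factors and working on character vectors. Note that from R 4.0.0 onward read.csv() defaults to stringsAsFactors = FALSE, so the error mainly appears with older versions or when factors are created explicitly.

```r
# Made-up stand-ins for the two columns; IDs borrowed from the question.
col1 <- c("AST2017000005534", "CTS2017000079930", "CTS2017000079931")
col2 <- c("TUR2017000001428", "CTS2017000071989",
          "CTS2017000079930", "CTS2017000072015")

# What older read.csv() defaults would have produced: factors.
f1 <- factor(col1)
f2 <- factor(col2)

# Comparing factors whose level sets differ raises an error;
# capture the message instead of stopping the script.
res <- tryCatch(f1[1] == f2[1], error = function(e) conditionMessage(e))
print(res)

# On character vectors the comparison is well defined.
# %in% answers "is this entry present at all?" with TRUE/FALSE.
present <- col1 %in% col2
print(present)

# An existing factor can also be rescued with as.character().
present_from_factor <- as.character(f1) %in% as.character(f2)
print(present_from_factor)
```

%in% only reports membership. If the position in the second column is needed, as in the question, match() returns it directly.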

Combining and completing the advice from Greg and Boris, the complete solution is two lines:

data_2 <- read.csv("excel_data.csv", stringsAsFactors = FALSE)
match_list <- match(data_2$data1, data_2$data2)

The vector match_list will hold the matching position where a match exists and NA otherwise. Its length will be the same as the length of data_2$data1.

You should get experience in reading the help information for R functions. In this case, type ?match to get information about the 'match' function.

HTH,
Eric

On Fri, Oct 13, 2017 at 12:16 AM, Boris Steipe <boris.steipe at utoronto.ca> wrote:

> It's generally a very good idea to examine the structure of data after you
> have read it in. str(data_2) would have shown you that read.csv() turned
> your strings into factors, and that's why the == operator no longer does
> what you think it does.
>
> use ...
>
> data_2 <- read.csv("excel_data.csv", stringsAsFactors = FALSE)
>
> ... to turn this off. Also, the %in% operator will achieve more directly
> what you are trying to do. No need for loops.
>
> B.
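As a small self-contained illustration of the match() approach, the sketch below reads an inline CSV instead of "excel_data.csv". The inline data, the csv_text variable, and the found/result names are assumptions made for the example; only the column names data1 and data2 and the ID format come from the question.

```r
# Inline stand-in for excel_data.csv; the rows are invented for illustration.
csv_text <- "data1,data2
AST2017000005534,TUR2017000001428
CTS2017000079930,CTS2017000071989
CTS2017000079931,CTS2017000079930
"

# Read strings as character vectors rather than factors.
data_2 <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the structure: both columns should be of type chr.
str(data_2)

# For each entry of data1, the position of its first match in data2,
# or NA when there is no match.
match_list <- match(data_2$data1, data_2$data2)
print(match_list)

# Keep only the entries that were actually found, with their indices.
found <- !is.na(match_list)
result <- data.frame(entry = data_2$data1[found],
                     index = match_list[found])
print(result)
```

Here only the second entry of data1 occurs in data2 (at position 3), so match_list has one non-NA value and result has one row.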