Dear list,

Sorry if this is a really simple question to ask, but I have been stuck on this for some time and have had no luck so far.

I have 2 data files: a file containing the IDs I am interested in (about 20 of them), let's call this file A, and another file containing all the IDs (more than 1 million) and the corresponding expression values, let's call this file B.

So for example, this is what file B looks like (one ID followed by 20 values per row):

2534215 4.73483 3.06499 0.70032 2.20247 2.57014 2.68004 5.20362 4.56531 5.53275 4.6597 4.49301 3.26696 4.45926 3.33294 4.91008 4.69106 2.80828 3.85168 4.31348 3.84926
2620456 7.13062 7.40361 7.41215 7.12816 6.03347 7.77204 8.37129 8.4811 8.4156 8.48804 8.33357 7.77168 7.60426 7.0769 8.06268 7.4646 6.29976 6.76743 7.41375 7.30186
2830090 0.54261 2.14015 2.30475 1.85821 0.46711 0.92093 0.77438 0.94734 0.68643 0.57625 0.72098 0.988 2.11342 0.4447 0.3774 0.67744 2.0072 1.90465 0.87868 0.5177
2575215 3.40684 2.08213 1.69333 3.01074 2.28529 1.9541 2.14518 1.67518 2.83476 3.28296 2.21263 2.46864 3.30994 2.42013 5.22249 1.98937 2.09089 1.94941 2.43488 3.59808
2849041 1.48954 3.91934 1.7037 1.77033 1.54604 0.93467 0.96066 2.24328 2.92171 2.15935 0.42652 1.46658 1.49061 1.72481 1.09684 0.98344 1.83167 2.10103 2.40971 0.84771
3102546 5.88775 5.67887 5.42377 3.45853 4.99174 3.04696 5.29118 4.43669 5.18792 5.34072 5.06025 3.25504 5.50719 5.92674 4.71267 5.51699 3.54069 5.28343 6.36131 4.7219
2925428 1.55075 1.74401 2.07019 2.13188 0.8098 1.40341 0.99192 1.39257 0.51152 2.50187 1.39294 1.29555 0.67688 1.2912 1.75177 1.61535 1.31611 0.91411 1.1937 0.79358
2372272 7.69508 7.32337 7.47569 7.13584 5.86709 6.15057 8.03573 7.77657 7.93019 7.6854 8.23643 3.42619 8.90083 9.18137 8.54825 7.39468 7.88283 8.08621 8.41234 8.23991
3360788 1.8126 3.57725 1.68462 2.83186 2.29188 1.723 2.27151 3.57716 3.12974 1.7842 2.49073 3.42127 1.52801 2.43608 2.38312 4.10846 2.34492 1.68719 0.87868 2.5681
3443892 6.03618 5.53088 3.59312 2.43822 4.67525 2.89741 3.57465 3.25502 3.30023 3.06377 3.4388 2.31478 4.74265 5.6276 2.82511 4.8615 5.65537 5.93262 6.29311 6.13882
3298798 1.16559 2.3974 1.28858 0.566 0.74531 0.93919 2.68312 1.20792 1.48076 1.69595 1.67204 0.5811 1.02014 0.84925 1.54777 1.30054 2.14382 3.31986 1.19556 2.7532
3019819 1.73561 3.53995 2.28098 1.91677 3.64978 2.23223 2.39018 1.66967 2.60563 2.82343 3.60435 1.7998 1.10665 1.42922 1.60167 1.39237 3.08337 1.07063 2.08086 1.9265
3173936 1.55062 1.50538 1.87352 1.83726 3.59104 2.50134 0.71658 2.07377 1.50345 2.71367 1.68669 2.05536 1.88829 0.86234 0.80113 2.2544 0.86643 2.03882 1.53434 1.46399
2424908 0.42785 1.66894 2.27908 2.99882 3.284 1.99961 0.79164 0.44145 1.40957 2.05127 1.40553 3.38542 0.82951 1.06653 1.43876 2.15272 0.69042 0.66962 2.02502 0.7646

And this is what file A looks like:

2620456
2830090
2575215
2849041

How do I extract and export the corresponding values for the IDs in A from B?

Thanks in advance!
Jeremy
Hello,

Try the following.

idx <- B$ID %in% A
B[idx, ]

In the case of your data example, the return value is 0 rows. (No matches)

Also, the best way to post a data example is using ?dput

dput( head(B, 30) )  # copy & paste the output of this

Hope this helps,

Rui Barradas

On 27-10-2012 07:07, Jeremy Ng wrote:
> How do I actually extract and export the corresponding values for A in B?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
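The `%in%` approach above presupposes that B is already a data frame with a column named ID and that A is a plain vector of IDs. A minimal sketch of the whole round trip (read, filter, export) might look like the following. It uses a cut-down stand-in for file B with only three value columns, and the output path is a hypothetical temporary file; with real files one would pass the file names to `read.table()` instead of `text =`.

```r
# Cut-down stand-in for file B: first column is the ID, the rest are values.
# (Assumption: whitespace-separated, no header row, as in the posted example.)
B <- read.table(text = "
2534215 4.73483 3.06499 0.70032
2620456 7.13062 7.40361 7.41215
2830090 0.54261 2.14015 2.30475
3102546 5.88775 5.67887 5.42377
", header = FALSE)
names(B)[1] <- "ID"   # name the ID column so B$ID works

# Stand-in for file A: the IDs of interest
A <- c(2620456, 2830090)

# Logical index of the rows of B whose ID appears in A
idx  <- B$ID %in% A
hits <- B[idx, ]

# Export the matching rows; out_file is a hypothetical path
out_file <- tempfile(fileext = ".txt")
write.table(hits, out_file, quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Rows come back in the order they appear in B, not in the order of A.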
Hi,

Try this:

B<-readLines(textConnection("2534215 4.73483 3.06499 0.70032 2.20247 2.57014 2.68004 5.20362 4.56531 5.53275 4.6597 4.49301 3.26696 4.45926 3.33294 4.91008 4.69106 2.80828 3.85168 4.31348 3.84926
2620456 7.13062 7.40361 7.41215 7.12816 6.03347 7.77204 8.37129 8.4811 8.4156 8.48804 8.33357 7.77168 7.60426 7.0769 8.06268 7.4646 6.29976 6.76743 7.41375 7.30186
2830090 0.54261 2.14015 2.30475 1.85821 0.46711 0.92093 0.77438 0.94734 0.68643 0.57625 0.72098 0.988 2.11342 0.4447 0.3774 0.67744 2.0072 1.90465 0.87868 0.5177
2575215 3.40684 2.08213 1.69333 3.01074 2.28529 1.9541 2.14518 1.67518 2.83476 3.28296 2.21263 2.46864 3.30994 2.42013 5.22249 1.98937 2.09089 1.94941 2.43488 3.59808
2849041 1.48954 3.91934 1.7037 1.77033 1.54604 0.93467 0.96066 2.24328 2.92171 2.15935 0.42652 1.46658 1.49061 1.72481 1.09684 0.98344 1.83167 2.10103 2.40971 0.84771
3102546 5.88775 5.67887 5.42377 3.45853 4.99174 3.04696 5.29118 4.43669 5.18792 5.34072 5.06025 3.25504 5.50719 5.92674 4.71267 5.51699 3.54069 5.28343 6.36131 4.7219
2925428 1.55075 1.74401 2.07019 2.13188 0.8098 1.40341 0.99192 1.39257 0.51152 2.50187 1.39294 1.29555 0.67688 1.2912 1.75177 1.61535 1.31611 0.91411 1.1937 0.79358
2372272 7.69508 7.32337 7.47569 7.13584 5.86709 6.15057 8.03573 7.77657 7.93019 7.6854 8.23643 3.42619 8.90083 9.18137 8.54825 7.39468 7.88283 8.08621 8.41234 8.23991
3360788 1.8126 3.57725 1.68462 2.83186 2.29188 1.723 2.27151 3.57716 3.12974 1.7842 2.49073 3.42127 1.52801 2.43608 2.38312 4.10846 2.34492 1.68719 0.87868 2.5681
3443892 6.03618 5.53088 3.59312 2.43822 4.67525 2.89741 3.57465 3.25502 3.30023 3.06377 3.4388 2.31478 4.74265 5.6276 2.82511 4.8615 5.65537 5.93262 6.29311 6.13882
3298798 1.16559 2.3974 1.28858 0.566 0.74531 0.93919 2.68312 1.20792 1.48076 1.69595 1.67204 0.5811 1.02014 0.84925 1.54777 1.30054 2.14382 3.31986 1.19556 2.7532
3019819 1.73561 3.53995 2.28098 1.91677 3.64978 2.23223 2.39018 1.66967 2.60563 2.82343 3.60435 1.7998 1.10665 1.42922 1.60167 1.39237 3.08337 1.07063 2.08086 1.9265
3173936 1.55062 1.50538 1.87352 1.83726 3.59104 2.50134 0.71658 2.07377 1.50345 2.71367 1.68669 2.05536 1.88829 0.86234 0.80113 2.2544 0.86643 2.03882 1.53434 1.46399
2424908 0.42785 1.66894 2.27908 2.99882 3.284 1.99961 0.79164 0.44145 1.40957 2.05127 1.40553 3.38542 0.82951 1.06653 1.43876 2.15272 0.69042 0.66962 2.02502 0.7646"))
A<-read.table(text="
2620456
2830090
2575215
2849041
",sep="",header=FALSE)
# B1 is a single flattened column holding IDs and values together (21 fields per row)
B1<-data.frame(ID=as.numeric(unlist(strsplit(B," "))))
# returns the matching IDs only, not the expression values that follow them
B1[B1$ID %in% A$V1,]
#[1] 2620456 2830090 2575215 2849041
# positions in the flattened vector; 22, 43, 64, 85 are the first fields of rows 2 to 5
which(B1$ID %in% A$V1)
#[1] 22 43 64 85

A.K.

----- Original Message -----
From: Jeremy Ng <jeremy.ng.wk1990 at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Saturday, October 27, 2012 2:07 AM
Subject: [R] Searching up a list of values

> How do I actually extract and export the corresponding values for A in B?
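If the rows should come back in the order the IDs are listed in A, or if some IDs in A might not occur in B at all, `match()` is one alternative to `%in%` and `which()`. The sketch below is only an illustration under stated assumptions: the data are a cut-down stand-in for file B, the column names `s1` and `s2` are made up, and the ID 9999999 is a hypothetical one added to show what happens with an absent ID.

```r
# Cut-down stand-in for file B, read row-wise so each ID keeps its values.
# Column names s1 and s2 are hypothetical.
B <- read.table(text = "
2534215 4.73483 3.06499
2620456 7.13062 7.40361
2830090 0.54261 2.14015
2575215 3.40684 2.08213
", header = FALSE, col.names = c("ID", "s1", "s2"))

# IDs of interest, deliberately not in B's order;
# 9999999 is a hypothetical ID that is absent from B
A <- c(2830090, 2620456, 9999999)

# match() gives, for each ID in A, its row position in B (NA if absent)
pos <- match(A, B$ID)

missing_ids <- A[is.na(pos)]      # IDs in A that were not found in B
hits <- B[pos[!is.na(pos)], ]     # matching rows, in the order of A
```

Unlike `B[B$ID %in% A, ]`, this keeps the order of A, and `missing_ids` makes it easy to spot IDs that were silently dropped. Note that `match()` returns only the first position of each ID, so it assumes IDs are unique in B.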