roslinazairimah zakaria
2017-Mar-13 04:37 UTC
[R] Extract student ID that match certain criteria
Hi Bert,

Thank you so much for your help. However, I am not really sure what the y values are for. Can we do without them?

    x <- as.character(FKASA$STUDENT_ID)
    y <- c(1, 786)
    My.Data <- data.frame(x, y)

    My.Data[grep("^AA14", My.Data$x), ]

I got the following data:

             x   y
    1  AA14068   1
    7  AA14090   1
    11 AA14099   1
    14 AA14012 786
    15 AA14039   1
    22 AA14251 786

On Mon, Mar 13, 2017 at 11:51 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> 1. Your code is incorrect. All entries are character strings and must be
> quoted.
>
> 2. See ?grep and note in particular (in the "Value" section):
>
> "grep(value = TRUE) returns a character vector containing the selected
> elements of x (after coercion, preserving names but no other
> attributes)."
>
> 3. While the fixed = TRUE option will work here, you may wish to learn
> about "regular expressions", which can come in very handy for
> character string manipulation tasks. ?regex in R has a terse, but I
> have found comprehensible, discussion. There are many good gentler
> tutorials on the web, also.
>
> Cheers,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Sun, Mar 12, 2017 at 8:32 PM, roslinazairimah zakaria
> <roslinaump at gmail.com> wrote:
> > Dear r-users,
> >
> > I have this list of student IDs,
> >
> > dt <- c(AA14068, AA13194, AE11054, AA12251, AA13228, AA13286, AA14090,
> > AA13256, AA13260, AA13291, AA14099, AA15071, AA13143, AA14012, AA14039,
> > AA15018, AA13234, AA13149, AA13282, AA13218)
> >
> > and I would like to extract all students with IDs AA14... only.
> >
> > I searched and tried substr, subset and select, but they failed.
> >
> > > substr(FKASA$STUDENT_ID, 2, nchar(string1))
> > Error in nchar(string1) : 'nchar()' requires a character vector
> > > subset(FKASA, STUDENT_ID=="AA14" )
> >  [1] FAC_CODE FACULTY STUDENT_ID NAME PROGRAM KURSUS
> > CGPA ACT_SS ACT_VAL ACT_CS ACT_LED ACT_PS
> > ACT_IM
> > [14] ACT_ENT ACT_CRE ACT_UNI ACT_VOL...
> >
> > Thank you so much for your help.
> >
> > How do I do it?
> > --
> > *Roslinazairimah Zakaria*
> > *Tel: +609-5492370; Fax. No.+609-5492766*
> >
> > *Email: roslinazairimah at ump.edu.my; roslinaump at gmail.com*
> > Faculty of Industrial Sciences & Technology
> > University Malaysia Pahang
> > Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
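On the question of whether the y column is needed: the match appears to depend only on the ID strings, so a minimal sketch without y might look like the one below. The vector `ids` is a quoted copy of the IDs from the original post, and `FKASA_demo` is a hypothetical one-column stand-in for the real FKASA data frame, which is not shown in the thread.

```r
# Sample IDs copied from the original post, quoted as character strings.
ids <- c("AA14068", "AA13194", "AE11054", "AA12251", "AA13228", "AA13286",
         "AA14090", "AA13256", "AA13260", "AA13291", "AA14099", "AA15071",
         "AA13143", "AA14012", "AA14039", "AA15018", "AA13234", "AA13149",
         "AA13282", "AA13218")

# value = TRUE returns the matching strings rather than their positions.
aa14 <- grep("^AA14", ids, value = TRUE)
print(aa14)

# The same idea on a data frame column: grepl() returns a logical vector,
# which can be used to keep whole rows. FKASA_demo is a hypothetical
# stand-in, not the real FKASA data.
FKASA_demo <- data.frame(STUDENT_ID = ids, stringsAsFactors = FALSE)
aa14_rows <- FKASA_demo[grepl("^AA14", FKASA_demo$STUDENT_ID), , drop = FALSE]
print(aa14_rows)
```

This also suggests why `subset(FKASA, STUDENT_ID == "AA14")` returned no rows: `==` tests for exact equality, and no ID is exactly "AA14".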
roslinazairimah zakaria
2017-Mar-13 05:18 UTC
[R] Extract student ID that match certain criteria
Another question:

How do I extract IDs based on the third and fourth characters?

I have, for example, AA14004, AB15035, CB14024, PA14009, PA14009, etc.

I would like to extract the ID numbers of the form AB14..., CB14..., PA14...
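One way to test the third and fourth characters directly, without regular expressions, is `substr()`. The sketch below assumes every ID has at least four characters; `ids2` is simply the example list from this message.

```r
# Example IDs from the message above.
ids2 <- c("AA14004", "AB15035", "CB14024", "PA14009", "PA14009")

# as.character() guards against the column being stored as a factor.
ids2 <- as.character(ids2)

# Compare characters 3 to 4 of each ID with "14".
is_14 <- substr(ids2, 3, 4) == "14"

# Keep only the IDs whose third and fourth characters are "14".
ids_14 <- ids2[is_14]
print(ids_14)
```

Note that this keeps AA14004 as well, since it tests only positions 3 and 4 and ignores the two leading letters.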
Hi Roslinazairimah,

As Bert suggested, you should get acquainted with regular expressions. They can be confusing at times, but learning them pays off in the long run.

In your case, the pattern "^[A-Z]{2}14.*" might work.

Best,
Ulrik
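As a rough illustration of how the suggested pattern behaves, the sketch below applies it to the example IDs from the question. The second pattern, "^(AB|CB|PA)14", is not from the thread; it is an assumed variant for the case where only those three prefixes are wanted.

```r
# Example IDs from the question.
ids3 <- c("AA14004", "AB15035", "CB14024", "PA14009", "PA14009")

# Suggested pattern: start of string, any two uppercase letters, then "14".
pattern <- "^[A-Z]{2}14.*"
matched <- grep(pattern, ids3, value = TRUE)
print(matched)

# Assumed variant (not from the thread): restrict to the prefixes AB, CB, PA.
selected <- grep("^(AB|CB|PA)14", ids3, value = TRUE)
print(selected)

# unique() drops repeated IDs if each one should appear only once.
print(unique(selected))
```

The suggested pattern also matches AA14004, because `[A-Z]{2}` accepts any two uppercase letters; the restricted variant excludes it.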