Thank you David,

My intention was that if I change the status column to numeric (0 = Lost, 1 = Won), I can then use this numeric variable to calculate the percentage of games won by each country.

How did you read the data first? That was my problem. The actual data is in a file that has to be read or loaded.

Thank you!

On Sat, Nov 14, 2015 at 6:10 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> It is always good to read the manual page for a function, but especially when it is not working as you expected. In this case, if you look at the arguments for read.table(), you will find one called fill=TRUE that is useful here.
>
> Based on your ifelse(), you seem to be assuming that a blank is not missing data but a lost game. You may also discover that in your example wins are coded as w and W. Since character variables get converted to factors by default, you could use something like:
>
>> levels(test$STATUS) <- c("L", "W", "W")
>> addmargins(xtabs(~Country+STATUS, test), 2)
>        STATUS
> Country L W Sum
>     FRA 2 3   5
>     GER 1 3   4
>     SPA 2 1   3
>     UNK 1 2   3
>     USA 1 2   3
>
> I'll let you figure out how to get the last column.
>
> David L. Carlson
> Department of Anthropology
> Texas A&M University
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashta
> Sent: Saturday, November 14, 2015 4:28 PM
> To: R help <r-help at r-project.org>
> Subject: [R] Ranking
>
> Hi all,
>
> I have the following raw data; some records don't have the second variable.
>
> test <- read.table(textConnection(" Country STATUS
> USA
> USA W
> USA W
> GER
> GER W
> GER w
> GER W
> UNK W
> UNK
> UNK W
> FRA
> FRA
> FRA W
> FRA W
> FRA W
> SPA
> SPA W
> SPA "),header = TRUE, sep= "\t")
> test
>
> It is not reading it correctly.
>
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 17 did not have 2 elements
>
> After reading, I want to change the status column to numeric so that I
> can use the table function:
>
> test$STATUS <- ifelse(is.na(test$STATUS), 0, 1)
>
> At the end I want the following table (Country, Won, Lost, Number of
> games played and % of score) and to pick the top 3 countries.
>
> COUNTRY  Won  Lost  NG  %W
> USA      2    1     3   (2/3)*100
> GER      3    1     4   (3/4)*100
> UNK      2    1     3   (2/3)*100
> FRA      3    2     5   (3/5)*100
> SPA      1    2     3   (1/3)*100
>
> Thank you in advance
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
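The 0/1 idea described at the top of this message can be sketched roughly as follows. This is only an illustration: the data frame is typed in by hand to match the example data (blank status treated as a lost game, as the thread assumes), and the column name `won` is made up for the sketch.

```r
# Hand-built copy of the example data; "" stands for a blank status.
test <- data.frame(
  Country = rep(c("USA", "GER", "UNK", "FRA", "SPA"), times = c(3, 4, 3, 5, 3)),
  STATUS  = c("", "W", "W",
              "", "W", "w", "W",
              "W", "", "W",
              "", "", "W", "W", "W",
              "", "W", ""),
  stringsAsFactors = FALSE
)

# 1 = won, 0 = lost; toupper() folds the stray lower-case "w" into "W".
test$won <- as.integer(toupper(test$STATUS) == "W")

# The mean of a 0/1 variable is the proportion of 1s, so this is the
# percentage of games won per country.
pct_won <- 100 * tapply(test$won, test$Country, mean)
print(round(pct_won, 1))
```

Because the mean of a 0/1 indicator is a proportion, no separate count of wins and games is needed for the percentage itself.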
I used your code but deleted sep="\t", since there were no tabs in your email, and added the fill= argument I mentioned before.

David

-------- Original message --------
From: Ashta <sewashm at gmail.com>
Date: 11/14/2015 6:40 PM (GMT-06:00)
To: David L Carlson <dcarlson at tamu.edu>
Cc: R help <r-help at r-project.org>
Subject: Re: [R] Ranking

> How did you read the data first? That was my problem. The actual data is in a file that has to be read or loaded.
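A minimal sketch of the reading step described above, assuming the fields are separated by spaces rather than tabs. It uses `read.table(text = ...)` in place of `textConnection()` for brevity; the file name in the comment is hypothetical. With `fill = TRUE`, the short rows are padded, and the blank status appears to come in as an empty string rather than `NA`, so the recode below tests for "W" instead of using `is.na()`.

```r
raw <- "Country STATUS
USA
USA W
USA W
GER
GER W
GER w
GER W
UNK W
UNK
UNK W
FRA
FRA
FRA W
FRA W
FRA W
SPA
SPA W
SPA"

# No sep= argument: the default splits on any whitespace.
# fill = TRUE pads rows that have fewer fields than the header.
test <- read.table(text = raw, header = TRUE, fill = TRUE,
                   stringsAsFactors = FALSE)

# For data in a file, the same arguments should apply, e.g.
# (hypothetical file name):
#   test <- read.table("games.txt", header = TRUE, fill = TRUE)

# Treat anything that is not a win (including blanks) as a loss,
# and fold the lower-case "w" into "W".
test$STATUS <- ifelse(toupper(test$STATUS) == "W", "W", "L")

print(table(test$Country, test$STATUS))
```

Keeping the column as character and recoding explicitly avoids depending on the order of factor levels.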
It is perhaps worth mentioning that the OP's desire to convert to numeric in order to calculate won-lost percentages is completely unnecessary and indicates that he/she could benefit from spending some additional time learning R. See, e.g., ?tapply, ?table, ?prop.table, and friends.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
   -- Clifford Stoll

On Sun, Nov 15, 2015 at 8:28 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> I used your code but deleted sep="\t", since there were no tabs in your email, and added the fill= argument I mentioned before.
>
> David
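One way the functions named above might be combined to get the requested table and the top three countries, without converting anything to numeric. This is a sketch under assumptions: the data frame is typed in by hand with blanks already recoded as "L", and the output column name `PctW` is invented here (standing in for "%W").

```r
test <- data.frame(
  Country = rep(c("USA", "GER", "UNK", "FRA", "SPA"), times = c(3, 4, 3, 5, 3)),
  STATUS  = c("L", "W", "W",
              "L", "W", "W", "W",
              "W", "L", "W",
              "L", "L", "W", "W", "W",
              "L", "W", "L"),
  stringsAsFactors = FALSE
)

# Counts of L and W per country (rows come out in alphabetical order).
tab <- table(test$Country, test$STATUS)

# Row-wise proportions; the "W" column is the share of games won.
pct <- 100 * prop.table(tab, margin = 1)[, "W"]

result <- data.frame(
  COUNTRY = rownames(tab),
  Won     = as.vector(tab[, "W"]),
  Lost    = as.vector(tab[, "L"]),
  NG      = as.vector(rowSums(tab)),
  PctW    = round(as.vector(pct), 1),
  stringsAsFactors = FALSE
)

# Sort by percentage won, highest first, and keep the first three rows.
result <- result[order(-result$PctW), ]
top3 <- head(result, 3)
print(top3)
```

Note that UNK and USA are tied in this example, so which of them ranks second depends only on the sort's tie handling, not on the data.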