Hi all,

I have the following raw data; some records don't have the second variable.

test <- read.table(textConnection(" Country STATUS
USA
USA W
USA W
GER
GER W
GER w
GER W
UNK W
UNK
UNK W
FRA
FRA
FRA W
FRA W
FRA W
SPA
SPA W
SPA "),header = TRUE, sep= "\t")
test

It is not reading it correctly:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 17 did not have 2 elements

After reading, I want to change the status column to numeric so that I can use the table function:

test$STATUS <- ifelse(is.na(test$STATUS), 0, 1)

At the end I want the following table (Country, Won, Lost, Number of games played, and % of score) and to pick the top 3 countries.

COUNTRY  Won  Lost  NG  %W
USA      2    1     3   (2/3)*100
GER      3    1     4   (3/4)*100
UNK      2    1     3   (2/3)*100
FRA      3    2     5   (3/5)*100
SPA      1    2     3   (1/3)*100

Thank you in advance
It is always good to read the manual page for a function, but especially when it is not working as you expected. In this case, if you look at the arguments for read.table(), you will find one called fill (set fill=TRUE) that is useful here.

Based on your ifelse(), you seem to be assuming that a blank is not missing data but a lost game. You may also discover that in your example wins are coded as both w and W. Since character variables get converted to factors by default, you could use something like:

> levels(test$STATUS) <- c("L", "W", "W")
> addmargins(xtabs(~Country+STATUS, test), 2)
       STATUS
Country L W Sum
    FRA 2 3   5
    GER 1 3   4
    SPA 2 1   3
    UNK 1 2   3
    USA 1 2   3

I'll let you figure out how to get the last column.

David L. Carlson
Department of Anthropology
Texas A&M University

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashta
Sent: Saturday, November 14, 2015 4:28 PM
To: R help <r-help at r-project.org>
Subject: [R] Ranking

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
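For anyone following along, here is a minimal sketch of the fill=TRUE read and the win/loss cross-tabulation described in the reply above. It is not David's exact code: the sample data is retyped with explicit tabs (an assumption about the original file layout), and the recoding is done by value rather than by factor-level position, because from R 4.0.0 onward read.table() no longer converts character columns to factors by default.

```r
# Sample data from the original post, written with explicit tabs.
# Rows without a STATUS value contain only one field.
raw <- "Country\tSTATUS
USA
USA\tW
USA\tW
GER
GER\tW
GER\tw
GER\tW
UNK\tW
UNK
UNK\tW
FRA
FRA
FRA\tW
FRA\tW
FRA\tW
SPA
SPA\tW
SPA"

# fill = TRUE pads short rows instead of stopping with
# "line N did not have 2 elements".
test <- read.table(text = raw, header = TRUE, sep = "\t",
                   fill = TRUE, stringsAsFactors = FALSE)

# Treat a blank (or NA) status as a loss and fold "w" into "W".
is_win <- !is.na(test$STATUS) & toupper(trimws(test$STATUS)) == "W"
test$STATUS <- factor(ifelse(is_win, "W", "L"), levels = c("L", "W"))

# Losses and wins per country, plus a Sum column of games played.
tab <- addmargins(xtabs(~ Country + STATUS, data = test), 2)
print(tab)
```

Recoding by value avoids depending on how the levels "", "w" and "W" happen to sort in a given locale.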
Thank you David,

My intention was that if I change the status column to numeric (0 = Lost, 1 = Won), then I can use this numeric variable to calculate the percentage of games won by each country.

How did you read the data first? That was my problem. The actual data is in a file that has to be read or loaded.

Thank you!

On Sat, Nov 14, 2015 at 6:10 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> It is always good to read the manual page for a function, but especially when it is not working as you expected. In this case if you look at the arguments for read.table(), you will find one called fill=TRUE that is useful in this case.
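Since the real data sits in a file, the numeric 0/1 idea from this message could be sketched roughly as follows. The temporary file, the column name PctW (standing in for "%W", which is not a syntactic R name) and the tab-separated layout are all assumptions made for illustration, not details confirmed in the thread.

```r
# Hypothetical file standing in for the real data file.
path <- tempfile(fileext = ".txt")
writeLines(c("Country\tSTATUS",
             "USA", "USA\tW", "USA\tW",
             "GER", "GER\tW", "GER\tw", "GER\tW",
             "UNK\tW", "UNK", "UNK\tW",
             "FRA", "FRA", "FRA\tW", "FRA\tW", "FRA\tW",
             "SPA", "SPA\tW", "SPA"), path)

# Read from the file; fill = TRUE pads rows that lack a STATUS field.
games <- read.table(path, header = TRUE, sep = "\t",
                    fill = TRUE, stringsAsFactors = FALSE)

# 1 = won, 0 = lost (a blank or missing status counts as a loss).
games$won <- as.integer(!is.na(games$STATUS) &
                          toupper(trimws(games$STATUS)) == "W")

# Per-country totals: games won and games played.
Won <- tapply(games$won, games$Country, sum)
NG  <- tapply(games$won, games$Country, length)

res <- data.frame(Country = names(Won),
                  Won     = as.vector(Won),
                  Lost    = as.vector(NG - Won),
                  NG      = as.vector(NG),
                  stringsAsFactors = FALSE)
res$PctW <- round(100 * res$Won / res$NG, 1)

# Rank by percentage won and keep the top 3 countries.
res  <- res[order(-res$PctW), ]
top3 <- head(res, 3)
print(top3)
```

Note that UNK and USA tie on percentage won in this sample, so which of them ranks second depends on how ties are broken; order() keeps them in their existing (alphabetical) order here.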