jagadishpchary
2015-Jun-09 08:40 UTC
[R] Cross tabulation with top one variable and side as multiple variables
Hi,

I have a large data set with many variables, and I need to check how they vary from year to year. To do so, I have to cross-tabulate with the year variable across the top (constant) and all the remaining variables down the side (cross-tabulation report attached). I have searched the forums, but the syntax I could find handles cross-tabulation between only 2 or 3 variables. Could someone provide code that prints the data in the same layout as the attachment? <http://r.789695.n4.nabble.com/file/n4708379/Untitled.png>

-- View this message in context: http://r.789695.n4.nabble.com/Cross-tabulation-with-top-one-variable-and-side-as-multiple-variables-tp4708379.html Sent from the R help mailing list archive at Nabble.com.
David Winsemius
2015-Jun-09 17:50 UTC
[R] Cross tabulation with top one variable and side as multiple variables
On Jun 9, 2015, at 1:40 AM, jagadishpchary wrote:

> So i would request to provide a code which can print the data in the same way as
> in the attached. <http://r.789695.n4.nabble.com/file/n4708379/Untitled.png>

I think you will find that people on this list expect you to provide data in the form of text rather than pictures. When I looked at the request there were two routes I considered: 1) combine margin.table with ftable, and 2) investigate one (but not both) of the plyr or dplyr packages.

> View this message in context: http://r.789695.n4.nabble.com/Cross-tabulation-with-top-one-variable-and-side-as-multiple-variables-tp4708379.html
> Sent from the R help mailing list archive at Nabble.com.

Nabble is neither the R help mailing list nor its archive. Nabble also removes this message from replies. You should read the material about the list.

> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
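The first route mentioned above (margin.table combined with ftable) might look roughly like the sketch below. The data frame and its column names (year, age, sex) are invented for illustration, since no data had been posted at this point in the thread.

```r
# Toy data: column names and values are hypothetical, for illustration only.
mydata <- data.frame(
  year = c(2013, 2013, 2014, 2014, 2015, 2015),
  age  = c("18-34", "35+", "18-34", "35+", "35+", "35+"),
  sex  = c("F", "M", "F", "F", "M", "F")
)

# One multiway table over all variables (dimensions: age, sex, year).
full <- table(mydata[, c("age", "sex", "year")])

# Collapse the array to two-way margins against year (dimension 3).
age_by_year <- margin.table(full, c(1, 3))
sex_by_year <- margin.table(full, c(2, 3))

# ftable() flattens the full array into one two-dimensional display,
# with year across the top and the other variables nested down the side.
flat <- ftable(full, row.vars = c("age", "sex"), col.vars = "year")

print(age_by_year)
print(sex_by_year)
print(flat)
```

Note that ftable() nests the side variables within each other, whereas the margins give one independent two-way table per side variable.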
Jeff Newmiller
2015-Jun-09 18:39 UTC
[R] Cross tabulation with top one variable and side as multiple variables
There are two issues here... calculation and presentation. The table function from base R can work with many variables. If your data set is so large that you have problems with memory then you could investigate data.table or sqldf packages, which perform the computations but do not present the data in cross tabulation form. You could use table or perhaps the tables package to render the data into the desired form.

Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)

Sent from my phone. Please excuse my brevity.
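The split between calculation and presentation can be sketched in base R alone. The data below are invented, and the names survey, counts and wide are placeholders; with a very large data set the long-form counts could instead come from data.table or sqldf, as suggested above.

```r
# Toy data: names and values are hypothetical.
survey <- data.frame(
  year = c(2013, 2013, 2014, 2014, 2015, 2015),
  sex  = c("F", "M", "F", "F", "M", "F")
)

# Calculation: counts in long form, one row per (sex, year) combination.
counts <- as.data.frame(table(sex = survey$sex, year = survey$year),
                        responseName = "n")

# Presentation: turn the long counts into a table with year across the top.
wide <- xtabs(n ~ sex + year, data = counts)

print(counts)
print(wide)
```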
John Kane
2015-Jun-09 21:18 UTC
[R] Cross tabulation with top one variable and side as multiple variables
We probably should have a better idea of what the raw data look like, and perhaps a better idea of what the analysis is meant to show. Have a look at http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and http://adv-r.had.co.nz/Reproducibility.html for some suggestions. In particular, see the discussion of dput() for the best way to provide sample data to the help list.

John Kane
Kingston ON Canada
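As a small illustration of the dput() suggestion, the sketch below uses an invented three-row data frame (mydata and its columns are placeholders, not the poster's data). It shows that the text dput() prints is enough for a reader to rebuild the same object.

```r
# Hypothetical sample data standing in for a few rows of the real data set.
mydata <- data.frame(TREND = c(2014, 2015, 2015), SEXT = c("M", "F", "F"))

# dput() writes an R expression that recreates the object; this text is
# what would be pasted into a message to the list.
txt <- capture.output(dput(head(mydata, 3)))
cat(txt, sep = "\n")

# A reader can evaluate the pasted text to rebuild the same data.
rebuilt <- eval(parse(text = txt))
print(rebuilt)
```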
jagadishpchary
2015-Jun-18 06:46 UTC
[R] Cross tabulation with top one variable and side as multiple variables
I think my explanation in the first post did not give the full details of the job to be done. Sorry for that. Here is what I am doing:

1. I have an SPSS data set with more than 2000 variables. For test purposes, however, I have created a temporary data set with 5 variables, which I am reading into the R environment (test.sav file attached).
2. There is a variable called "TREND" which holds the year. So all I need to do is cross-tabulate the other variables against this TREND variable.

In SPSS the syntax would be:

    CTABLES
      /VLABELS VARIABLES=ALL DISPLAY=LABEL
      /TABLES (AGET + SEXT + EDUCRT + JOBRT) [COUNT F40.0] BY TREND.

The final cross-tabulation results are in the attached Excel report, on the sheet named "Results".

As I am new to R, I searched the forums for cross-tabulation with one constant top variable and multiple side variables, but could not find it. I tried the syntax below:

    xtabs(~ AGET + SEXT + EDUCRT + JOBRT + TREND, data = mydata)
    summary(~ AGET + SEXT + EDUCRT + JOBRT, data = mydata, fun = table)
    ftable(mydata, row.vars = c("AGET", "SEXT", "EDUCRT", "JOBRT"), col.vars = "TREND")

but the results are not identical to what I get in SPSS. Could you suggest R code that produces the results shown in the attached Excel report (sheet "Results")?

Test.sav <http://r.789695.n4.nabble.com/file/n4708799/Test.sav>
Cross_tabulation.xlsx <http://r.789695.n4.nabble.com/file/n4708799/Cross_tabulation.xlsx>
David L Carlson
2015-Jun-18 15:09 UTC
[R] Cross tabulation with top one variable and side as multiple variables
They do not match because xtabs() in R produces a multidimensional array (one dimension for each variable). Looking at your spreadsheet on Nabble, it appears that SPSS is just creating 4 cross-tabulations, with TREND against each of the other variables. That is easily done in R, but for tested code you need to give us reproducible data using dput(). I get an error using read.spss() on your uploaded file. You should also read some of the extensive free documentation available on R.

The ftable() function creates a two-dimensional representation of that 5-dimensional array. But your spreadsheet is just a stack of two-dimensional tables. You could get there with the margin.table() function, but unless you really need the 5-dimensional array, you probably want something more like this (untested, and assuming your data frame is called mydata as in your post):

    rowvars <- c("AGET", "SEXT", "EDUCRT", "JOBRT")
    # reformulate() turns each variable name into a formula such as ~ AGET + TREND
    table.lst <- lapply(rowvars, function(x) xtabs(reformulate(c(x, "TREND")), data = mydata))

That would give you a list containing a cross-tabulation between each of the variables and TREND. A spreadsheet with 2000 tables seems a bit unwieldy, so you might want to give some thought to what you really want as output.

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
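A minimal sketch of the "list of two-way tables" idea, taken one step further by stacking the tables into a single block with TREND across the top. The variable names come from the thread, but the values are invented because the posted file could not be read. The formula is built with reformulate(), because a variable name held as a character string cannot be placed directly into a formula.

```r
# Toy data: variable names from the thread, values are hypothetical.
mydata <- data.frame(
  TREND = c(2013, 2013, 2014, 2014, 2015, 2015),
  AGET  = c("18-34", "35+", "18-34", "35+", "35+", "35+"),
  SEXT  = c("F", "M", "F", "F", "M", "F")
)
rowvars <- c("AGET", "SEXT")

# One two-way table per side variable; reformulate(c("AGET", "TREND"))
# yields the formula ~ AGET + TREND, and so on for each name.
table.lst <- lapply(rowvars, function(v)
  xtabs(reformulate(c(v, "TREND")), data = mydata))
names(table.lst) <- rowvars

# Stack the tables into one matrix, prefixing each row label with the
# name of the variable it belongs to.
stacked <- do.call(rbind, lapply(rowvars, function(v) {
  m <- unclass(table.lst[[v]])
  rownames(m) <- paste(v, rownames(m), sep = ": ")
  m
}))

print(stacked)
```

With 2000 side variables the same loop applies by setting rowvars to setdiff(names(mydata), "TREND"), although the stacked result would be very long.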
jagadishpchary
2015-Jul-10 12:55 UTC
[R] Reading the non delimited file with no particular patterns in the data to R
I am a beginner in R and I want to read an ASCII file into the R environment. However, the file is not delimited and the data are not contiguous (there are blank spaces between the variables), so to read the data I used the syntax below:

    test <- read.fwf("D:/R_process/ASCII.txt", width = c(10, 4, 1, 4, 9, 9, 1,1,1,1,1,1,1,3,8))

Now I am able to read the file, but the data are read incorrectly. My output should contain only the applicable variables, not the blank data. The ASCII data are attached. Please let me know how I should write the syntax to read only the applicable data in the file.

Thanks in advance for your help.

test.txt <http://r.789695.n4.nabble.com/file/n4709699/test.txt>
Clint Bowman
2015-Jul-10 18:08 UTC
[R] Reading the non delimited file with no particular patterns in the data to R
Is this what you are expecting?

> test <- read.fwf("test.dat", width = c(10, 4, 1, 4, 9,12,26,1,1,1,1,1,1,3,8))
> test
  V1   V2 V3 V4       V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
1  1 2015  1  4 0.766696  1  1  0  0   0   0   0   0   0  10
2  2 2015  1  4 1.458186  1  1  0  0   0   1   0   0   0  20
3  3 2015  1  4 0.185492  1  1  0  0   0   0   0   0   0  15
4  4 2015  1  4 0.961584  1  1  0  0   0   0   0   0   0   3
5  5 2015  1  4 0.650091  2  0  0  0   1   0   0   0   0  NA
6  6 2015  1  4 0.430350  1  1  0  0   0   0   0   0   0  20
7  7 2015  1  4 3.192895  2  1  0  1   1   0   0   0   0   0
8  8 2015  1  4 0.617127  1  1  0  1   0   1   0   0   0  15
9  9 2015  1  4 0.399207  1  1  0  0   0   0   0   0   0  10

Clint

Clint Bowman
Air Quality Modeler
Department of Ecology
INTERNET: clint at ecy.wa.gov
INTERNET: clint at math.utah.edu
VOICE: (360) 407-6815
FAX: (360) 407-7534
USPS: PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274
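The reply above absorbs the blank stretches by widening the neighbouring fields. An alternative that may be worth trying is to skip the blanks explicitly: read.fwf() treats a negative width as "skip this many characters". The sketch below uses two invented records and hypothetical column names rather than the attached test.txt, so the widths would need adjusting to the real layout.

```r
# Two invented fixed-width records:
# id (3 chars), 2 blanks, year (4 chars), 3 blanks, flag (1 char).
lines <- c("001  2015   1",
           "002  2015   0")
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Negative widths tell read.fwf() to skip that many characters, so only
# the three data fields are returned as columns.
test <- read.fwf(path, widths = c(3, -2, 4, -3, 1),
                 col.names = c("id", "year", "flag"))
print(test)
```

The full argument name is widths; the width spelling used in the thread works only because R partially matches argument names.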