thr3ads.net - similar to: "how to compare two datasets in R>?"

Displaying 20 results from an estimated 70000 matches similar to: "how to compare two datasets in R>?"

2011 Jan 23

Creating subsets of a matrix

Hello, Say I have 2 columns, bmi and gender, the first being all the values and the second being male or female. How would I subset this into males only and females only? I have searched these fora and read endlessly about select[] and split() functions but to no avail. Also the table is not ordered. bmi gender -> bmi gender + bmi gender 1 24.78 male

2010 Aug 13

Lattice xyplots plots with multiple lines per cell

Hello, I need to plot the means of some outcome for two groups (control vs intervention) over time (discrete) on the same plot, for various subsets such as gender and grade level. What I have been doing is creating all possible subsets first, using the aggregate function to create the means over time, then plotting the means over time (as a simple line plot with both control & intervention

2005 Apr 19

How to make combination data

Dear R-user, I have a data like this below, age <- c("young","mid","old") married <- c("no","yes") income <- c("low","high","medium") gender <- c("female","male") I want to make some of combination data like these, age.income.dat <- expand.grid(age,

2017 Jul 26

How long to wait for process?

UseRs, I have a dataframe with 2547 rows and several hundred columns in R 3.1.3. I am trying to run a small logistic regression with a subset of the data. know_fin ~ comp_grp2+age+gender+education+employment+income+ideol+home_lot+home+county > str(knowf3) 'data.frame': 2033 obs. of 18 variables: $ userid : Factor w/ 2542 levels

2012 Sep 11

Plotting every probability curve

I have a logistic regression model and am trying to generate probability curves for all possible combinations of the variables. My logit model has 5+ variables, and I want to draw curves for every scenario. See code below. When home_owner is 0 and 1, I want curves. The same goes for all other variable categories, so that I have permutations for all possible combinations. I've

2011 Aug 24

data manipulation and summaries with few million rows

I have a data set with about 6 million rows and 50 columns. It is a mixture of dates, factors, and numerics. What I am trying to accomplish can be seen with the following simplified data, which is given as dput output below. > head(myData) mydate gender mygroup id 1 2012-03-25 F A 1 2 2005-05-23 F B 2 3 2005-09-08 F B 2 4 2005-12-07 F B 2

2017 Jul 27

How long to wait for process?

Michael, Thank you for the suggestion. I will take your advice and look more critically at the covariates. John On 7/27/2017 8:08 AM, Michael Friendly wrote: > Rather than go to a penalized GLM, you might be better off > investigating the sources of quasi-perfect separation and simplifying > the model to avoid or reduce it. In your data set you have several > factors with large

2005 Dec 20

Help to find only one class and different class

Dear R users, I have a problem, which I can not find a solution. Probably someone could help me? I have a result from my classification, like this > credit.toy [[1]] age married ownhouse income gender class 1 20-30 no no low male good 2 40-50 no yes medium female good [[2]] age married ownhouse income gender class 1 20-30 yes yes high male

2010 Oct 19

ANOVA stuffs_How to save each result from FOR command?

Dear R experts, I'm new in R and a beginner in terms of statistics. It should be simple question, but definitely difficult to solve it by myself. I'd like to see main effect of group(gender: sample size is different(M:F=23:18) and one of condition(cond) and the interaction at each subset from 90 datasets So I perform anova 90 times using a command like below; for(i in 1:90)

2010 Sep 04

Levels in returned data.frame after subset

Dear List, When I subset a data.frame, the levels are not re-adjusted (see example). Why is this? Am I missing out on some basic stuff here? Thanks Ulrik > m <- data.frame(gender = c("M", "M","F"), ht = c(172, 186.5, 165), wt = c(91,99, 74)) > dim(m) [1] 3 3 > levels(m$gender) [1] "F" "M" > s <- subset(m, m$gender ==

2017 Jul 27

How long to wait for process?

Rather than go to a penalized GLM, you might be better off investigating the sources of quasi-perfect separation and simplifying the model to avoid or reduce it. In your data set you have several factors with large number of levels, making the data sparse for all their combinations. Like multicollinearity, near perfect separation is a data problem, and is often better solved by careful

2017 Jul 27

How long to wait for process?

Hi, Late to the thread here, but I noted that your dependent variable 'know_fin' has 3 levels in the str() output below. Since you did not provide a full c&p of your glm() call, we can only presume that you did specify 'family = binomial' in the call. Is the dataset 'knowf3' the result of a subsetting operation, such that there are only two of the three levels of

2017 Jul 27

How long to wait for process?

Marc, Sorry for the lack of info on my part. Yes, I did use 'family = binomial' and I did drop the 3rd level before running the model. I think the str(<subset>) that I wrote into my original email might not have been my final step before using glm. Thank you for reminding me of the potential problem. I think Michael Friendly's idea is probably the solution I need to consider.

2013 Mar 28

unique not working

i am using mac OSX 10.7.5, running R version 2.15.2 (2012-10-26) -- "Trick or Treat" when i do: uncountry <- unique(wvsAB[,7]) wvsAB$numcountry <- match(wvsAB$country, uncountry) "unstate" isn't attaching. > library(base) > uncountry <- unique(wvsAB[,7]) > wvsAB$numcountry <- match(wvsAB$country, uncountry) > ls(wvsAB) [1] "age"

2017 Oct 19

looping using 'diverse' package measures

Hi everyone, I'm new at R (although I'm a Stata user for some time and somehow proficient in it) and I'm trying to use the 'diverse' R package to compute a few diversity measures on a sample of firms for a period of about 10 years. I was wondering if you can give me some hints on how to best proceed on using the 'diverse' package. My sample has the following setup.

2013 Mar 28

bayesian HLM random effects

Hello, all. I've been working on this for some time and was almost at the end/last chunk of code I would need... when I received an error. Rather than go to bed and think about it in the morning, I messed with my data and now I am not getting anything. I was up until 4am trying to fix this. Zip files of my data are attached (the data which ends in 'a' matches with wvsA and the

2008 Feb 26

OLS standard errors

Hi, the standard errors of the coefficients in two regressions that I computed by hand and using lm() differ by about 1%. Can somebody help me to identify the source of this difference? The coefficient estimates are the same, but the standard errors differ. ####Simulate data happiness=0 income=0 gender=(rep(c(0,1,1,0),25)) for(i in 1:100){ happiness[i]=1000+i+rnorm(1,0,40)

2006 Nov 05

struggling to plot subgroups

Hi Folks, I have data that looks like this: freq gender xBar 1000 m 2.32 1000 f 3.22 2000 m 4.32 2000 f 4.53 3000 m 3.21 3000 f 3.44 4000 m 4.11 4000 f 3.99 I want to plot two lines (with symbols) for the two groups "m" and "f". I have tried the following: plot(xBar[gender=="m"]~freq[gender=="f"]) followed by

2007 Jun 26

A really simple data manipulation example

In response to those who asked for a better explanation of what the Vilno software does, here's a simple example that gives some idea of what it does. LABRESULTS is a dataset with multiple rows per patient , with lab sodium measurements. It has columns: PATIENT_ID, VISIT_NUM, and SODIUM. DEMO is a dataset with one row per patient, with demographic data. It has columns: PATIENT_ID, GENDER.

2006 Sep 26

colClasses: suppressed 'NA'

Hi, The colClasses seem to be suppressing 'NA' values. How do I fix this? R script and first 5 lines of output is below. File "test2.dat" has blanks that are read as "NA" when I do not use 'colClasses', but as blanks when I use 'colClasses'. temp.df <- read.fwf("test2.dat", width=c(10,1,1,1,1,2,2,3,3,1),
