similar to: how to compare two datasets in R>?

Displaying 20 results from an estimated 70000 matches similar to: "how to compare two datasets in R>?"

2011 Jan 23
2
Creating subsets of a matrix
Hello, Say I have 2 columns, bmi and gender, the first being all the values and the second being male or female. How would I subset this into males only and females only? I have searched these fora and read endlessly about select[] and split() functions but to no avail. Also the table is not ordered. bmi gender -> bmi gender + bmi gender 1 24.78 male
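A minimal sketch of the kind of split this post asks for, assuming the data sits in a data frame with columns `bmi` and `gender`. The object name `dat` and every value except the first row are made up for illustration; the post itself is cut off before the full table.

```r
# Hypothetical data in the shape described: one numeric column, one label column.
# Only the first row (24.78, male) comes from the post; the rest is invented.
dat <- data.frame(bmi = c(24.78, 22.10, 27.35, 19.80),
                  gender = c("male", "female", "male", "female"))

# Logical indexing: keep the rows where the condition holds, all columns.
males <- dat[dat$gender == "male", ]

# subset() does the same and lets you refer to columns by bare name.
females <- subset(dat, gender == "female")

# split() returns a named list with one data frame per gender value.
by_gender <- split(dat, dat$gender)
```

None of these approaches needs the rows to be sorted by gender first.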
2010 Aug 13
2
Lattice xyplots plots with multiple lines per cell
Hello, I need to plot the means of some outcome for two groups (control vs intervention) over time (discrete) on the same plot, for various subsets such as gender and grade level. What I have been doing is creating all possible subsets first, using the aggregate function to create the means over time, then plotting the means over time (as a simple line plot with both control & intervention
2005 Apr 19
1
How to make combination data
Dear R-user, I have a data like this below, age <- c("young","mid","old") married <- c("no","yes") income <- c("low","high","medium") gender <- c("female","male") I want to make some of combination data like these, age.income.dat <- expand.grid(age,
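As a rough illustration of the approach the post starts on, the sketch below builds every age/income pairing with `expand.grid()`. The vectors are taken from the post; the result name `age_income` and the named arguments are assumptions, since the original call is truncated.

```r
# Vectors as given in the post.
age <- c("young", "mid", "old")
income <- c("low", "high", "medium")

# expand.grid() returns one row per combination of the supplied vectors.
# Naming the arguments gives readable column names; stringsAsFactors = FALSE
# keeps the values as plain character strings.
age_income <- expand.grid(age = age, income = income,
                          stringsAsFactors = FALSE)

nrow(age_income)  # 3 ages x 3 income levels = 9 rows
```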
2017 Jul 26
3
How long to wait for process?
UseRs, I have a dataframe with 2547 rows and several hundred columns in R 3.1.3. I am trying to run a small logistic regression with a subset of the data. know_fin ~ comp_grp2+age+gender+education+employment+income+ideol+home_lot+home+county > str(knowf3) 'data.frame': 2033 obs. of 18 variables: $ userid : Factor w/ 2542 levels
2012 Sep 11
1
Plotting every probability curve
I have a logistic regression model and am trying to generate probability curves for all possible combinations of the variables. My logit model has 5+ variables, and I want to draw curves for every scenario. See code below. When home_owner is 0 and 1, I want curves. The same goes for all other variables categories, so that I have permutations for all possible combinations. I've
2011 Aug 24
2
data manipulation and summaries with few million rows
I have a data set with about 6 million rows and 50 columns. It is a mixture of dates, factors, and numerics. What I am trying to accomplish can be seen with the following simplified data, which is given as dput output below. > head(myData) mydate gender mygroup id 1 2012-03-25 F A 1 2 2005-05-23 F B 2 3 2005-09-08 F B 2 4 2005-12-07 F B 2
2017 Jul 27
2
How long to wait for process?
Michael, Thank you for the suggestion. I will take your advice and look more critically at the covariates. John On 7/27/2017 8:08 AM, Michael Friendly wrote: > Rather than go to a penalized GLM, you might be better off > investigating the sources of quasi-perfect separation and simplifying > the model to avoid or reduce it. In your data set you have several > factors with large
2005 Dec 20
1
Help to find only one class and different class
Dear R users, I have a problem, which I can not find a solution. Probably someone could help me? I have a result from my classification, like this > credit.toy [[1]] age married ownhouse income gender class 1 20-30 no no low male good 2 40-50 no yes medium female good [[2]] age married ownhouse income gender class 1 20-30 yes yes high male
2010 Oct 19
2
ANOVA stuffs_How to save each result from FOR command?
Dear R experts, I'm new in R and a beginner in terms of statistics. It should be simple question, but definitely difficult to solve it by myself. I'd like to see main effect of group(gender: sample size is different(M:F=23:18) and one of condition(cond) and the interaction at each subset from 90 datasets So I perform anova 90 times using a command like below; for(i in 1:90)
2010 Sep 04
3
Levels in returned data.frame after subset
Dear List, When I subset a data.frame, the levels are not re-adjusted (see example). Why is this? Am I missing out on some basic stuff here? Thanks Ulrik > m <- data.frame(gender = c("M", "M","F"), ht = c(172, 186.5, 165), wt = c(91,99, 74)) > dim(m) [1] 3 3 > levels(m$gender) [1] "F" "M" > s <- subset(m, m$gender ==
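The behaviour described here can be reproduced with the data from the post. This is only a sketch: the filter value `"M"` and the use of `droplevels()` are assumptions, because the original `subset()` call is cut off.

```r
# Data as given in the post. stringsAsFactors = TRUE makes gender a factor
# on every R version (assumed, since the post calls levels() on it).
m <- data.frame(gender = c("M", "M", "F"),
                ht = c(172, 186.5, 165),
                wt = c(91, 99, 74),
                stringsAsFactors = TRUE)

# Hypothetical filter: keep only the rows where gender is "M".
s <- subset(m, gender == "M")

# The factor still records the level "F" even though no row uses it.
levels(s$gender)

# droplevels() removes factor levels that no longer occur in the data.
s2 <- droplevels(s)
levels(s2$gender)
```

Subsetting keeps a factor's full set of levels, so unused levels have to be dropped explicitly.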
2017 Jul 27
0
How long to wait for process?
Rather than go to a penalized GLM, you might be better off investigating the sources of quasi-perfect separation and simplifying the model to avoid or reduce it. In your data set you have several factors with large number of levels, making the data sparse for all their combinations. Like multicollinearity, near perfect separation is a data problem, and is often better solved by careful
2017 Jul 27
0
How long to wait for process?
Hi, Late to the thread here, but I noted that your dependent variable 'know_fin' has 3 levels in the str() output below. Since you did not provide a full c&p of your glm() call, we can only presume that you did specify 'family = binomial' in the call. Is the dataset 'knowf3' the result of a subsetting operation, such that there are only two of the three levels of
2025 Jan 19
1
Test For Difference of Betas By Group in car
Sent from my iPhone > On Jan 19, 2025, at 1:57 PM, David Winsemius <dwinsemius at comcast.net> wrote: > > I don't understand why you don't include the full text of the error. > > > David > Sent from my iPhone > >> On Jan 19, 2025, at 10:00 AM, Sparks, John via R-help <r-help at r-project.org> wrote: >> >> Hello R-Helpers, >>
2017 Jul 27
1
How long to wait for process?
Marc, Sorry for the lack of info on my part. Yes, I did use 'family = binomial' and I did drop the 3rd level before running the model. I think the str(<subset>) that I wrote into my original email might not have been my final step before using glm. Thank you for reminding of the potential problem. I think Michael Friendly's idea is probably the solution I need to consider.
2013 Mar 28
1
unique not working
i am using mac OSX 10.7.5, running R version 2.15.2 (2012-10-26) -- "Trick or Treat" when i do: uncountry <- unique(wvsAB[,7]) wvsAB$numcountry <- match(wvsAB$country, uncountry) "unstate" isn't attaching. > library(base) > uncountry <- unique(wvsAB[,7]) > wvsAB$numcountry <- match(wvsAB$country, uncountry) > ls(wvsAB) [1] "age"
2025 Jan 19
2
Test For Difference of Betas By Group in car
Hello R-Helpers, I was looking into how to test whether the beta coefficient from a regression would be the same for two different groups contained in the dataset for the regression. When I put that question into google, AI returned a very nice looking answer (and a couple of variations on it). library(car) data <- data.frame(income = c(30, 45, 50, 25, 60, 55), education =
2017 Oct 19
1
looping using 'diverse' package measures
Hi everyone, I'm new at R (although I've been a Stata user for some time and am somewhat proficient in it) and I'm trying to use the 'diverse' R package to compute a few diversity measures on a sample of firms for a period of about 10 years. I was wondering if you can give me some hints on how to best proceed on using the 'diverse' package. My sample has the following setup.
2013 Mar 28
4
bayesian HLM random effects
Hello, all. I've been working on this for some time and was almost at the end/ last chunk of code I would need... when I received an error. Rather than go to bed and think about it in the morning, I messed with my data and now I am not getting anything. I was up until 4am trying to fix this. Zip files of my data are attached (the data which ends in 'a' matches with wvsA and the
2025 Jan 19
1
Test For Difference of Betas By Group in car
I don't understand why you don't include the full text of the error. David Sent from my iPhone > On Jan 19, 2025, at 10:00 AM, Sparks, John via R-help <r-help at r-project.org> wrote: > > Hello R-Helpers, > > I was looking into how to test whether the beta coefficient from a regression would be the same for two different groups contained in the dataset for the
2008 Feb 26
3
OLS standard errors
Hi, the standard errors of the coefficients in two regressions that I computed by hand and using lm() differ by about 1%. Can somebody help me to identify the source of this difference? The coefficient estimates are the same, but the standard errors differ. ####Simulate data happiness=0 income=0 gender=(rep(c(0,1,1,0),25)) for(i in 1:100){ happiness[i]=1000+i+rnorm(1,0,40)