thr3ads.net - similar to: "test for contingency table when there are many zeros"

Displaying 20 results from an estimated 20000 matches similar to: "test for contingency table when there are many zeros"

Why two chisq.test p values differ when the contingency table is transposed?

2003 Jul 15

I'm using R 1.7.0 running on Win XP. Thanks, ...Tao
> x
     [,1] [,2]
[1,]  149  151
[2,]    1    8
> t(x)
     [,1] [,2]
[1,]  149    1
[2,]  151    8
> chisq.test(x, simulate.p.value=T, B=100000)
Pearson's Chi-squared test with simulated p-value (based on 1e+05 replicates)
data: x
X-squared = 5.2001, df =

Getting all possible contingency tables

2012 Dec 01

Hello all, let's say I have a 2-way contingency table:
Tab <- matrix(c(8, 10, 12, 6), nr = 2)
and the Chi-squared test could not reject independence:
> chisq.test(Tab)
Pearson's Chi-squared test with Yates' continuity correction
data: Tab
X-squared = 1.0125, df = 1, p-value = 0.3143
However, I want to get all possible contingency tables under this independence
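The excerpt is cut off, so the poster's exact requirement is not known. One common reading, assumed here, is "all 2x2 tables that share the observed row and column totals". Under that assumption, a minimal R sketch could enumerate them as follows; the names `row_tot`, `col_tot`, `a_values`, `tables` and `probs` are illustrative and do not come from the thread.

```r
# Observed table from the post (nr is partial matching for nrow).
Tab <- matrix(c(8, 10, 12, 6), nrow = 2)

row_tot <- rowSums(Tab)   # row totals, held fixed
col_tot <- colSums(Tab)   # column totals, held fixed
total   <- sum(Tab)

# With both margins fixed, a 2x2 table is determined by its top-left cell.
# These are the values that keep every cell non-negative.
a_values <- max(0, row_tot[1] + col_tot[1] - total):min(row_tot[1], col_tot[1])

# Build each table column by column from the top-left cell and the margins.
tables <- lapply(a_values, function(a) {
  matrix(c(a,
           col_tot[1] - a,
           row_tot[1] - a,
           row_tot[2] - (col_tot[1] - a)),
         nrow = 2)
})

# Probability of each table under independence, conditional on the margins
# (hypergeometric distribution of the top-left cell).
probs <- dhyper(a_values, m = row_tot[1], n = row_tot[2], k = col_tot[1])

print(length(tables))
print(round(probs, 4))
```

This only covers the 2x2 case; for larger tables the enumeration grows quickly, and sampling with `r2dtable()` may be more practical.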

chisq test and fisher exact test

2005 Jun 22

Hi, I have a text mining project and currently I am working on the feature generation/selection part. My plan is to select a set of words or word combinations which have better discriminant capability than other words in telling the group IDs (2 classes in this case) for a dataset which has 2,000,000 documents. One approach is using "contrast-set association rule mining" while the

Chi Square Test on two groups of variables

2005 May 26

Dear R help, I have been trying to conduct a chi-square test on two groups of variables to test whether there is any relationship between the two sets of variables:
> chisq.test(oxygen, train)
Pearson's Chi-squared test
data: oxygen
X-squared = 26.6576, df = 128, p-value = 1
> chisq.test(oxygen)
Pearson's Chi-squared test
data: oxygen
X-squared = 26.6576, df = 128,

mimic SPSS contingency table results

2010 Jul 08

Dear all, it seems that puzzles always come in packs. I was asked to help with some statistics in blood analysis (you cannot refuse your wife's requests :-). She has a contingency table for values of IgVH mutation and ZAP expression. I can do a chi-square test (in R) and get results, and with some literature I can try to explain them. However, she found an article in which they use SPSS and use

Testing for strength of fit using R

2009 Nov 26

Dear all, I am trying to validate a model by comparing simulated output values against observed values. I have produced a simple X-y scatter plot with a 1:1 line, so that the closer the points fall to this line, the better the 'fit' between the modelled data and the observation data. I am now attempting to quantify the strength of this fit by using a statistical test in R. I am no

Chi-squared approximation may be incorrect in: chisq.test(x)

2006 Dec 02

I am getting "Chi-squared approximation may be incorrect in: chisq.test(x)" with the data below. Frequency distribution of number of male offspring in families of size 5.
Number of Male Offspring      N
0                           518
1                          2245
2                          4621
3                          4753
4                          2476
5

chisq.test vs manual calculation - why are different results produced?

2012 Feb 20

Hello, I am trying to fit gamma, negative exponential and inverse power functions to a dataset, and then test whether the fit of each curve is good. To do this I have been advised to calculate predicted values for bins of data (I have grouped a continuous range of distances into 1km bins), and then apply a chi-squared test. Example: > data <- data.frame(distance=c(1,2,3,4,5,6,7),

Why two chisq.test p values differ when the contingency

2003 Jul 15

Hi Tao: The P-values for a 2x2 table are generated based on a random (discrete uniform distribution) sampling of all possible 2x2 tables, conditioning on the observed margin totals. If one of the cells is extremely small, as in your case, you get a big difference in P-values. Suppose you changed the cell with value 1 to, say, 5 or 6; then the two P-values are nearly the same. However, I
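For readers who want to try the comparison themselves, here is a minimal R sketch using the 2x2 table quoted in the original question of this thread. It is not the code from the thread: the seed, the number of replicates `B = 10000`, and the variable names are arbitrary choices made for illustration, and the Fisher test is added only as a simulation-free cross-check. In a current R release one would expect the two simulated p-values to agree up to Monte Carlo error, though the behaviour of R 1.7.0 may have differed.

```r
# Table from the thread, entered column by column.
x <- matrix(c(149, 1, 151, 8), nrow = 2)

# Pearson statistic without continuity correction; it should not depend
# on the orientation of the table. suppressWarnings() hides the
# small-expected-count warning that the asymptotic test emits here.
stat_x  <- suppressWarnings(chisq.test(x, correct = FALSE)$statistic)
stat_tx <- suppressWarnings(chisq.test(t(x), correct = FALSE)$statistic)

# Monte Carlo p-values, conditional on the observed margins.
set.seed(42)
p_x  <- chisq.test(x, simulate.p.value = TRUE, B = 10000)$p.value
p_tx <- chisq.test(t(x), simulate.p.value = TRUE, B = 10000)$p.value

# Fisher's exact test as a cross-check that involves no simulation.
f_x  <- fisher.test(x)$p.value
f_tx <- fisher.test(t(x))$p.value

print(c(stat_x = unname(stat_x), stat_tx = unname(stat_tx)))
print(c(p_x = p_x, p_tx = p_tx, f_x = f_x, f_tx = f_tx))
```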

chisq.test using amalgamation automatically (possible ?!?)

2005 Jun 26

Dear List, If any of the observed and/or expected frequencies is less than 5, then chisq.test (Pearson's Chi-squared Test for Count Data from package:stats) gives warning messages. For example,
x<-c(10, 14, 10, 11, 11, 7, 8, 4, 1, 4, 4, 2, 1, 1, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
y<-c(9.13112391745095, 13.1626482033341, 12.6623267638188, 11.0130706413029, 9.16415925139016,

Find if there is independence

2002 May 23

Hello, I have the matrix
a<-matrix(c(2,1,0,1,2,2,1,5,7,2,5,12),nrow=6)
a
     [,1] [,2]
[1,]    2    1
[2,]    1    5
[3,]    0    7
[4,]    1    2
[5,]    2    5
[6,]    2   12
Suppose that in the first row we have 3 men from England, 2 with hair and 1 without. In the second we have 6 Italian men, 1 with hair and 5 without ... I want to find out if there is a dependence between men without hair and

Chi-Square test and survey results

2011 Oct 11

An organization has asked me to comment on the validity of their recent all-employee survey. Survey responses, by geographic region, compared with the total number of employees in each region, were as follows:
> ByRegion
         All.Employees Survey.Respondents
Region_1           735                142
Region_2           500                 83
Region_3           897                 78

chisq.test with simulate.p.value=TRUE (PR#13292)

2008 Nov 16

Full_Name: Reginaldo Constantino
Version: 2.8.0
OS: Ubuntu Hardy (32 bit, kernel 2.6.24)
Submission from: (NULL) (189.61.88.2)
For many tables, chisq.test with simulate.p.value=TRUE gives a p value that is obviously incorrect and inversely proportional to the number of replicates:
> data(HairEyeColor)
> x <- margin.table(HairEyeColor, c(1, 2))
>

How to apply two parameter function in data frame

2012 Mar 06

I know this is something simple that I cannot do because I do not yet "think" in R. I have a data frame that has a variable participation (a factor), and several other factors. I want a chisq test (no contingency tables) for participation vs all of the other factors. In SPSS I would do:
CROSSTABS
  /TABLES= (my other factors) BY participation
  /FORMAT=NOTABLES
  /STATISTICS=CHISQ

Tests on contingency tables

2005 Feb 15

Dear all, I have a dataset with qualitative variables (factors) and I want to test the null hypothesis of independence between two variables for each pair by using appropriate tests on contingency tables. I first applied chisq.test and obtained dependence in almost all cases with extremely small p-values and warning messages.
> chisq.test(table(data$ins.f, data$ins.st))$p.val
[1]

p-value from chisq.test working strangely on 1.8.1

2003 Dec 09

Hello everybody, I'm seeing some strange behavior on R 1.8.1 on Intel/Linux compiled with gcc 3.2.2. The p-value calculated from the chisq.test function is incorrect for some input values:
> chisq.test(matrix(c(0, 1, 1, 12555), 2, 2), simulate.p.value=TRUE)
Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)
data: matrix(c(0, 1, 1,

Validity of Pearson's Chi-Square for Large Tables

2009 Jun 03

Is Pearson's Chi-Square test for contingency tables asymptotically unbiased for large tables (large degrees of freedom) regardless of the expected values in each cell? The rule of thumb is that Pearson's Chi-square should not be used when large numbers of cells have expected values < 5. However, I compared the results on 4x4 contingency tables for R's chisq.test using chi-square

Chi-Square Test Disagreement

2008 Nov 26

I was asked by my boss to do an analysis on a large data set, and I am trying to convince him to let me use R rather than SPSS. I think Sweave could make my life much much easier. To get me a little closer to this goal, I ran my analysis through R and SPSS and compared the resulting values. In all but one case, they were the same. Given the matrix
     [,1] [,2]
[1,]  110  358
[2,]   71  312
[3,]

Is it safe? Cochran etc

2004 Oct 09

I have the following contingency table
dat <- matrix(c(1,506,13714,878702),nr=2)
and I want to test if there is an association between events A:{a,not(a)} and B:{b,not(b)}
        |  b  | not(b) |
--------+-----+--------+
a       |   1 |  13714 |
--------+-----+--------+
not(a)  | 506 | 878702 |
--------+-----+--------+
I am worried that prop.test and chisq.test are not valid given the
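The excerpt ends before the poster states the exact concern, so the following is only a sketch of one common check, not the thread's answer. It computes the expected cell counts under independence for the posted table, which is what the usual rule of thumb for the chi-squared approximation refers to, and runs Fisher's exact test as an alternative that does not rely on that approximation. The `dimnames` labels and the names `expected` and `exact` are added here for readability and are not in the original post.

```r
# Table from the post, with labels added for readability (an assumption).
dat <- matrix(c(1, 506, 13714, 878702), nrow = 2,
              dimnames = list(A = c("a", "not(a)"), B = c("b", "not(b)")))

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(dat), colSums(dat)) / sum(dat)
print(round(expected, 2))

# Fisher's exact test conditions on the margins and needs no
# large-sample approximation.
exact <- fisher.test(dat)
print(exact$p.value)
```

The same expected counts are available as `chisq.test(dat)$expected`; computing them by hand just makes the formula explicit.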

fisher exact vs. simulated chi-square

2003 Apr 22

Dear All, I have a problem understanding the difference between the outcome of a fisher exact test and a chi-square test (with simulated p.value). For some sample data (see below), fisher reports p=.02337. The normal chi-square test complains about "approximation may be incorrect", because there is a column with cells with very small values. I therefore tried the chi-square with
