Muhuri, Pradip (SAMHSA/CBHSQ)
2012-Dec-14 19:48 UTC
[R] format.pval () and printCoefmat ()
Hi List,

My goal is to stop R from printing scientific notation in the sixth column (p_mean, the p-value) of my data frame (not a matrix).

I have used the format.pval() and printCoefmat() functions on the data frame. The R script is appended below.

The issue is that using format.pval() and printCoefmat() on the data frame gives me the desired results, but coerces the two character variables to NAs, because my object is a data frame, not a matrix (see contrast_level1 and contrast_level2 in the first output below).

Is there a way I could avoid printing the NAs in the character fields when using format.pval() and printCoefmat() on the data frame?

I would appreciate receiving your help.

Thanks,

Pradip

setwd ("F:/PR1/R_PR1")

load (file = "sigtests_overall_withid.rdata")

#format.pval(tt$p.value, eps=0.0001)

# keep only selected columns from the above data frame
keep_cols1 <- c("contrast_level1", "contrast_level2","mean_level1",
                "mean_level2", "rel_diff",
                "p_mean", "cohens_d")

#subset the data frame
y0410_1825_mf_alc <- subset (sigtests_overall_withid,
                             years=="0410" & age_group=="1825"
                             & gender_group=="all" & drug=="alc"
                             & contrast_level1=="wh",
                             select=keep_cols1)
#change the row.names
row.names (y0410_1825_mf_alc)= 1:dim(y0410_1825_mf_alc)[1]

#force
format.pval(y0410_1825_mf_alc$p_mean, eps=0.0001)

#print the observations from the sub-data frame
options (width=120,digits=3 )
#y0410_1825_mf_alc

printCoefmat(y0410_1825_mf_alc, has.Pvalue=TRUE, eps.Pvalue=0.0001)

####################### When format.pval () and printCoefmat () used

   contrast_level1 contrast_level2 mean_level1 mean_level2 rel_diff p_mean cohens_d
1               NA              NA      18.744      11.911    0.574   0.00    0.175
2               NA              NA      18.744      14.455    0.297   0.00    0.110
3               NA              NA      18.744      13.540    0.384   0.00    0.133
4               NA              NA      18.744       6.002    2.123   0.00    0.333
5               NA              NA      18.744       5.834    2.213   0.00    0.349
6               NA              NA      18.744       7.933    1.363   0.00    0.279
7               NA              NA      18.744      10.849    0.728   0.00    0.203
8               NA              NA      18.744       7.130    1.629   0.00    0.298
9               NA              NA      18.744       9.720    0.928   0.00    0.242
10              NA              NA      18.744       9.600    0.952   0.00    0.242
11              NA              NA      18.744      16.135    0.162   0.17    0.067 .
12              NA              NA      18.744          NA       NA     NA       NA
13              NA              NA      18.744      10.465    0.791   0.00    0.213
14              NA              NA      18.744      15.149    0.237   0.02    0.092 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Warning messages:
1: In data.matrix(x) : NAs introduced by coercion
2: In data.matrix(x) : NAs introduced by coercion

####################### When format.pval () and printCoefmat () not used

   contrast_level1 contrast_level2 mean_level1 mean_level2 rel_diff    p_mean cohens_d
1               wh            2+hi        18.7       11.91    0.574  1.64e-05   0.1753
2               wh            2+rc        18.7       14.46    0.297  9.24e-06   0.1101
3               wh            aian        18.7       13.54    0.384  9.01e-05   0.1335
4               wh            asan        18.7        6.00    2.123 2.20e-119   0.3326
5               wh            blck        18.7        5.83    2.213  0.00e+00   0.3490
6               wh            csam        18.7        7.93    1.363  1.27e-47   0.2793
7               wh             cub        18.7       10.85    0.728  6.12e-08   0.2025
8               wh            dmcn        18.7        7.13    1.629  1.59e-15   0.2981
9               wh            hisp        18.7        9.72    0.928 3.27e-125   0.2420
10              wh             mex        18.7        9.60    0.952 8.81e-103   0.2420
11              wh            nhpi        18.7       16.14    0.162  1.74e-01   0.0669
12              wh            othh        18.7          NA       NA        NA       NA
13              wh              pr        18.7       10.47    0.791  3.64e-23   0.2131
14              wh             spn        18.7       15.15    0.237  1.58e-02   0.0922

Pradip K. Muhuri, PhD
Statistician
Substance Abuse & Mental Health Services Administration
The Center for Behavioral Health Statistics and Quality
Division of Population Surveys
1 Choke Cherry Road, Room 2-1071
Rockville, MD 20857
Tel: 240-276-1070
Fax: 240-276-1260
e-mail: Pradip.Muhuri@samhsa.hhs.gov
On Dec 14, 2012, at 11:48 AM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

> Hi List,
>
> My goal is to stop R from printing scientific notation in the sixth column (p_mean, the p-value) of my data frame (not a matrix).
>
> I have used the format.pval() and printCoefmat() functions on the data frame. The R script is appended below.
>
> The issue is that using format.pval() and printCoefmat() on the data frame gives me the desired results, but coerces the two character variables to NAs, because my object is a data frame, not a matrix (see contrast_level1 and contrast_level2 in the first output below).
>
> Is there a way I could avoid printing the NAs in the character fields

They are probably factor columns.

> when using format.pval() and printCoefmat() on the data frame?
>
> I would appreciate receiving your help.
>
> Thanks,
>
> Pradip
>
> setwd ("F:/PR1/R_PR1")
>
> load (file = "sigtests_overall_withid.rdata")
>
> #format.pval(tt$p.value, eps=0.0001)
>
> # keep only selected columns from the above data frame
> keep_cols1 <- c("contrast_level1", "contrast_level2","mean_level1",
>                 "mean_level2", "rel_diff",
>                 "p_mean", "cohens_d")
>
> #subset the data frame
> y0410_1825_mf_alc <- subset (sigtests_overall_withid,
>                              years=="0410" & age_group=="1825"
>                              & gender_group=="all" & drug=="alc"
>                              & contrast_level1=="wh",
>                              select=keep_cols1)
> #change the row.names
> row.names (y0410_1825_mf_alc)= 1:dim(y0410_1825_mf_alc)[1]
>
> #force
> format.pval(y0410_1825_mf_alc$p_mean, eps=0.0001)

Presumably that call will produce the desired results, since it is applied to only one column. (I'm not sure why you think format.pval contributed to your NA output.)

>
> #print the observations from the sub-data frame
> options (width=120,digits=3 )
> #y0410_1825_mf_alc
>
> printCoefmat(y0410_1825_mf_alc, has.Pvalue=TRUE, eps.Pvalue=0.0001)

Why not use `cbind.data.frame` rather than trying to get `printCoefmat` to do something it (apparently) wasn't designed to do?

cbind( y0410_1825_mf_alc[ 1:2],
       printCoefmat(y0410_1825_mf_alc[ -(1:2) ],
                    has.Pvalue=TRUE, eps.Pvalue=0.0001) )

--
David.

>
> ####################### When format.pval () and printCoefmat () used
>
>    contrast_level1 contrast_level2 mean_level1 mean_level2 rel_diff p_mean cohens_d
> 1               NA              NA      18.744      11.911    0.574   0.00    0.175
> 2               NA              NA      18.744      14.455    0.297   0.00    0.110
> 3               NA              NA      18.744      13.540    0.384   0.00    0.133
> 4               NA              NA      18.744       6.002    2.123   0.00    0.333
> 5               NA              NA      18.744       5.834    2.213   0.00    0.349
> 6               NA              NA      18.744       7.933    1.363   0.00    0.279
> 7               NA              NA      18.744      10.849    0.728   0.00    0.203
> 8               NA              NA      18.744       7.130    1.629   0.00    0.298
> 9               NA              NA      18.744       9.720    0.928   0.00    0.242
> 10              NA              NA      18.744       9.600    0.952   0.00    0.242
> 11              NA              NA      18.744      16.135    0.162   0.17    0.067 .
> 12              NA              NA      18.744          NA       NA     NA       NA
> 13              NA              NA      18.744      10.465    0.791   0.00    0.213
> 14              NA              NA      18.744      15.149    0.237   0.02    0.092 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Warning messages:
> 1: In data.matrix(x) : NAs introduced by coercion
> 2: In data.matrix(x) : NAs introduced by coercion
>
> ####################### When format.pval () and printCoefmat () not used
>
>    contrast_level1 contrast_level2 mean_level1 mean_level2 rel_diff    p_mean cohens_d
> 1               wh            2+hi        18.7       11.91    0.574  1.64e-05   0.1753
> 2               wh            2+rc        18.7       14.46    0.297  9.24e-06   0.1101
> 3               wh            aian        18.7       13.54    0.384  9.01e-05   0.1335
> 4               wh            asan        18.7        6.00    2.123 2.20e-119   0.3326
> 5               wh            blck        18.7        5.83    2.213  0.00e+00   0.3490
> 6               wh            csam        18.7        7.93    1.363  1.27e-47   0.2793
> 7               wh             cub        18.7       10.85    0.728  6.12e-08   0.2025
> 8               wh            dmcn        18.7        7.13    1.629  1.59e-15   0.2981
> 9               wh            hisp        18.7        9.72    0.928 3.27e-125   0.2420
> 10              wh             mex        18.7        9.60    0.952 8.81e-103   0.2420
> 11              wh            nhpi        18.7       16.14    0.162  1.74e-01   0.0669
> 12              wh            othh        18.7          NA       NA        NA       NA
> 13              wh              pr        18.7       10.47    0.791  3.64e-23   0.2131
> 14              wh             spn        18.7       15.15    0.237  1.58e-02   0.0922
>
> Pradip K. Muhuri, PhD
>

David Winsemius
Alameda, CA, USA
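
For readers following the thread, here is a minimal sketch of one way to print the p-value column without scientific notation while leaving the character columns alone. It is not the method either poster settled on. The data frame `df` is a hypothetical stand-in with a few values copied from the output above, and `digits = 3` is an arbitrary choice. The sketch relies on two points: format.pval() returns a character vector, so its result has to be assigned back to the column to affect printing, and extra arguments such as `scientific = FALSE` are passed through to format().

```r
# Hypothetical stand-in for the poster's subset; values are illustrative only.
df <- data.frame(
  contrast_level1 = c("wh", "wh", "wh", "wh"),
  contrast_level2 = c("asan", "nhpi", "spn", "othh"),
  rel_diff        = c(2.123, 0.162, 0.237, NA),
  p_mean          = c(2.20e-119, 1.74e-01, 1.58e-02, NA),
  stringsAsFactors = FALSE
)

# Replace the numeric p-value column with its formatted (character) version.
# Values below eps are shown as "< eps"; scientific = FALSE is forwarded to
# format() so that neither the p-values nor eps are printed with an exponent.
df$p_mean <- format.pval(df$p_mean, digits = 3, eps = 0.0001,
                         scientific = FALSE)

# Ordinary data frame printing: the character columns are untouched because
# nothing is coerced to a numeric matrix.
print(df)
```

Because only one column is reformatted, data.matrix() is never called on the character columns, which is where the "NAs introduced by coercion" warnings in the original output came from. Note also that, per its documentation, printCoefmat() with has.Pvalue=TRUE treats the last column as the p-value, which in the original call is cohens_d rather than p_mean.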