Sharma, Dhruv
2010-Dec-11 22:45 UTC
[R] Is there a package or code to generate Markov chains in R?
Hi,

If I have data in the following time series format:

time, amount, state
1, 2222, A
1, 333, B
2, 45, A
2, 77, B

where there could be n states and t time periods, is there a package in R that would calculate the transition probabilities of a Markov chain for each t except t=0, i.e. a matrix of the states A, B against A, B? Perhaps the best output structure would be: t, AA, AB, BA, BB.

Is there already a package for this? I know I could hard-code a solution, but I was hoping a package exists, so that if there are t time windows then t-1 Markov chains (one per pair of consecutive windows) could be generated.

I hope this makes sense.

Thanks,
Dhruv
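A minimal base-R sketch of the "hard-coded" route mentioned in the question, offered only as an illustration. It assumes unit-level data, that is, one row per unit per period with a hypothetical `id` column; the aggregated amounts per state shown in the post do not by themselves identify who moved from A to B. The data values, the `id` column and the helper name `transition_matrix` are all made up for this example.

```r
# Hypothetical unit-level data: one row per unit (id) per time period.
# The 'id' column is an assumption; the post only shows time, amount, state.
obs <- data.frame(
  id    = rep(1:5, times = 3),
  time  = rep(1:3, each = 5),
  state = c("A", "A", "B", "B", "A",   # t = 1
            "A", "B", "B", "A", "A",   # t = 2
            "B", "B", "A", "A", "A"),  # t = 3
  stringsAsFactors = FALSE
)

states <- sort(unique(obs$state))

# Estimate the transition matrix for the step t -> t + 1
# from row-normalised transition counts.
transition_matrix <- function(d, t) {
  from <- d[d$time == t,     c("id", "state")]
  to   <- d[d$time == t + 1, c("id", "state")]
  pair <- merge(from, to, by = "id", suffixes = c(".from", ".to"))
  counts <- table(factor(pair$state.from, levels = states),
                  factor(pair$state.to,   levels = states))
  prop.table(counts, margin = 1)
}

times <- sort(unique(obs$time))
steps <- head(times, -1)   # t time windows give t - 1 transitions
P <- lapply(steps, function(t) transition_matrix(obs, t))

# Flatten each matrix row by row into the layout t, AA, AB, BA, BB.
wide <- t(sapply(P, function(m) as.vector(t(m))))
colnames(wide) <- as.vector(t(outer(states, states, paste0)))
result <- data.frame(time = steps + 1, wide)
print(result)
```

Each row of `result` describes the transition into the period given in `time`. A state that never occurs as a starting state in some period would produce NaN in its row, so real data may need a guard for empty rows.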
-----Original Message-----
From: r-help-bounces at r-project.org on behalf of r-help-request at r-project.org
Sent: Sat 12/11/2010 6:00 AM
To: r-help at r-project.org
Subject: R-help Digest, Vol 94, Issue 11

----------------------------------------------------------------------

Message: 1
Date: Fri, 10 Dec 2010 02:56:13 -0800 (PST)
From: jothy <gnanajothyk at gmail.com>
To: r-help at r-project.org
Subject: [R] Need help on nnet
Message-ID: <AANLkTikK7qyrFZPyuSZMqDZV17GQhZMYdU0pZjhcbL8m at mail.gmail.com>

Hi,

I am working on a neural network. Below are the code and the output:

> library(nnet)
> uplift.nn <- nnet(PVU ~ ConsumerValue + Duration + PromoVolShare, y, size = 3)
# weights:  16
initial  value 4068.052704
final  value 3434.194253
converged
> summary(uplift.nn)
a 3-3-1 network with 16 weights
options were -
  b->h1  i1->h1  i2->h1  i3->h1
  16.64    6.62  149.93    2.24
  b->h2  i1->h2  i2->h2  i3->h2
 -42.79  -17.40 -507.50   -5.14
  b->h3  i1->h3  i2->h3  i3->h3
   3.45    1.87   18.89    0.61
   b->o   h1->o   h2->o   h3->o
 402.81   41.29  236.76    6.06

I have a few questions and would appreciate help:

Q1: How should I interpret the above output?
Q2: My objective is to know the contribution of each independent variable.
Q3: Which neural network package provides the AIC or BIC values?

Regards,
jothy

--
View this message in context: http://r.789695.n4.nabble.com/Need-help-on-nnet-tp3081744p3081744.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------

Message: 2
Date: Fri, 10 Dec 2010 03:20:23 -0800 (PST)
From: Camille <c.pelat at imperial.ac.uk>
To: r-help at r-project.org
Subject: Re: [R] How to enable Arial font for postcript/pdf figure on Windows?
Message-ID: <1291980023177-3081774.post at n4.nabble.com>

Hi Agnes,

I converted the Arial font files from ttf to afm using ttf2afm from the MiKTeX complete installation. When used in R with the line recommended by PLoS, they seem to give correct Arial font graphics: I checked by opening the ps file with a viewer (gsview), a text editor (Notepad++) and Adobe Illustrator. However, I have not yet tried (but hopefully will soon) the ultimate test: submission to PLoS.

Here is a way to do it (mind the trick at step 5):

1) Download ttf2afm.exe (available, for example, in the install directory of the MiKTeX complete installation).
2) Fetch the Arial ttf files from C:\Windows\Fonts.
3) Place ttf2afm.exe and the ttf files in a directory (e.g. C:/ttf2afm/).
4) Open a DOS window (using cmd). Change into the created directory (using cd). Then type:

   ttf2afm.exe arial.ttf > arial.afm
   ttf2afm.exe arialbd.ttf > arial-Bold.afm
   ttf2afm.exe ariali.ttf > arial-Oblique.afm
   ttf2afm.exe arialbi.ttf > arial-BoldOblique.afm

5) If you used ttf2afm.exe from MiKTeX, open the created afm files with a text editor (e.g. Notepad++) and correct the following:
   - Remove the copyright line (or make it start with "Comment" and get rid of the (c) copyright symbol).
   - At the beginning of 4 of the first lines, the variable name is missing, so add it. E.g. in arial.afm:
       FontName ArialMT
       FullName Arial
       FamilyName Arial
       Weight Normal

Now you can use these fonts with the postscript function in R with the following lines:

   postscript(file = "try.ps", horizontal = FALSE, onefile = FALSE,
              width = 4, height = 4, pointsize = 12,
              family = c("C:/ttf2afm/arial.afm",
                         "C:/ttf2afm/arial-Bold.afm",
                         "C:/ttf2afm/arial-Oblique.afm",
                         "C:/ttf2afm/arial-BoldOblique.afm"))
   hist(rnorm(100))
   dev.off()

Cheers,
Camille

--
View this message in context: http://r.789695.n4.nabble.com/How-to-enable-Arial-font-for-postcript-pdf-figure-on-Windows-tp3017809p3081774.html

------------------------------

Message: 3
Date: Tue, 7 Dec 2010 16:26:40 -0700
From: Cameron Bracken <cameron.bracken at gmail.com>
To: R-Packages <r-packages at r-project.org>
Subject: [R] [R-pkgs] pgfSweave 1.1.1 Released
Message-ID: <AANLkTikApA4_FpOi=PGsHZD8GM=UScK0ZRCo92dgEC++ at mail.gmail.com>

The next release of pgfSweave is now on CRAN! pgfSweave has seen some significant changes in the past couple of months.

The main new features are:

- Automatic code highlighting via the highlight package. This can be turned off with the new `highlight` option.
- "Tidying" of source code output via the tidy option.
- Access to tikzDevice sanitization through a code chunk option `sanitize`.
- Automatic addition of the \pgfrealjobname command if it does not exist, similarly to the addition of the \usepackage{Sweave} line.
- Setting tex.driver=latex will now (in addition to working) generate an eps file.

And of course bug fixes:

- Fixes for bunches of issues related to the changes in Sweave in R 2.12. I think these issues are now resolved (fingers crossed).
- keep.source actually works now.

See the NEWS file for the complete list of changes and the vignette for information on how to use the new options.

Cheers!

-Cameron

_______________________________________________
R-packages mailing list
R-packages at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

------------------------------

Message: 4
Date: Fri, 10 Dec 2010 03:24:33 -0800 (PST)
From: deriK2000 <derik2000 at yahoo.de>
To: r-help at r-project.org
Subject: [R] Compare one level of a factor with *all* other non-missing levels
Message-ID: <1291980273092-3081777.post at n4.nabble.com>

Dear list,

I am trying to compare the mean of a variable, given one value of a factor, with the mean of the same variable over all K-1 other non-missing values of this factor. I want to repeat this procedure for each level of the factor.

Having read the recommendations of this list, I want to avoid creating K-1 dummy variables, so I searched the options of pairwise.t.test but could not find a solution.

Does anyone have a suggestion for how to do the comparisons?

Cheers,
Derik

--
View this message in context: http://r.789695.n4.nabble.com/Compare-one-level-of-a-factor-with-all-other-non-missing-levels-tp3081777p3081777.html

------------------------------

Message: 5
Date: Fri, 10 Dec 2010 12:25:13 +0100
From: "Martin Spindler" <Martin.Spindler at gmx.de>
To: r-help at R-project.org
Subject: [R] subset with two factors
Message-ID: <20101210112513.214200 at gmx.net>

Dear all,

I have a data frame with the following structure:

  numacc_b coverage_b Geschlecht GG
1        0          1          W  A
2        0          1          M  A
3        0          1          M  B
4        0          1          M  B
5        0          1          W  A
6        0          1          M  B

I would like to form a subset consisting of all entries with Geschlecht=M and GG=A.

Using

> T1 <- subset(daten1, Geschlecht=="M", GG=="A")

delivers

data frame with 0 columns and 6 rows

while

> T1 <- subset(daten1, Geschlecht=="M")

delivers

   numacc_b coverage_b Geschlecht GG
2         0          1          M  A
3         0          1          M  B
4         0          1          M  B
6         0          1          M  B
9         0          1          M  B
10        0          1          M  B

But I want to select from the data frame according to both factors.

What can I do?
[[elided Yahoo spam]]

Best,
Martin

------------------------------

Message: 6
Date: Fri, 10 Dec 2010 22:33:11 +1100
From: Felix Andrews <felix at nfrac.org>
To: "Girish A.R." <garamach at gmail.com>
Cc: r-help at r-project.org
Subject: Re: [R] [lattice xyplot] Help needed in help in customizing the panel.abline() function
Message-ID: <AANLkTimD6dRUkkjEhW4-sB8CBDcFVASRRKkF6Xex+cb9 at mail.gmail.com>

Hi Girish,

Try this:

disc <- xyplot(cnt_gt50pct_disc ~ week_num | sku_num, data = DF,
               type = "h", lwd = 2,
               panel = function(x, y, ...) {
                   panel.abline(v = x[which.max(y)], lty = 2)
                   panel.xyplot(x, y, ...)
               })

-Felix

On 9 December 2010 17:35, Girish A.R. <garamach at gmail.com> wrote:
>
> Hi folks,
>
> I need some help in customizing the abline() function to be used in a
> lattice plot. I have attached a reproducible example below.
>
> I need help in the following snippet:
> disc <- xyplot(cnt_gt50pct_disc ~ week_num|sku_num, data=DF, type = "h",
>                lwd=2, panel = function(...) {
>           panel.abline(v = 8, lty = 2)
>           panel.xyplot(...)
>       })
>
> Is there a way I can give panel.abline() input from a which.max() function?
> Essentially I need the vertical line to be drawn at the week_num
> corresponding to the max (cnt_gt50pct_disc).
>
> Thanks in advance,
>
> -Girish
>
> ==========================================
> Lines <- "sku_num week_num pct_inv_left cnt_gt50pct_disc
> 1 1 99.88 47
> 1 2 99.54 109
> 1 3 98.7 260
> 1 4 97.83 202
> 1 5 96.53 389
> 1 6 94.11 450
> 1 7 90.42 459
> 1 8 86.63 448
> 1 9 83.39 411
> 1 10 77 478
> 1 11 71.65 476
> 1 12 67.3 463
> 1 13 62.45 472
> 1 14 52.47 488
> 1 15 40.86 486
> 1 16 31.34 484
> 1 17 23.2 472
> 1 18 17 458
> 1 19 12.66 423
> 1 20 10.18 364
> 1 21 7.6 343
> 1 22 3.09 343
> 1 23 1.05 211
> 2 1 99.94 30
> 2 2 99.4 151
> 2 3 98.85 146
> 2 4 97.92 274
> 2 5 97.03 204
> 2 6 95.59 378
> 2 7 92.81 452
> 2 8 89.07 470
> 2 9 85.11 454
> 2 10 81.68 421
> 2 11 75.34 479
> 2 12 70.05 476
> 2 13 66.11 456
> 2 14 61.85 465
> 2 15 53.2 485
> 2 16 42.75 486
> 2 17 33.58 481
> 2 18 25 477
> 2 19 18.13 450
> 2 20 12.97 416
> 2 21 10.03 343
> 2 22 7.03 293
> 2 23 2.33 283
> 2 24 0.77 116
> "
>
> DF <- read.table(con <- textConnection(Lines), skip = 1);
> names(DF) <- scan(textConnection(Lines), what = "", nlines = 1);
> close(con);
>
> require(latticeExtra)
> DF$sku_num <- as.factor(DF$sku_num)
> disc <- xyplot(cnt_gt50pct_disc ~ week_num|sku_num, data=DF, type = "h",
>                lwd=2, panel = function(...) {
>           panel.abline(v = 8, lty = 2)
>           panel.xyplot(...)
>       })
> sales <- xyplot(pct_inv_left ~ week_num|sku_num, data=swtop16, type = "l",
>                 lwd=2, panel = function(...) {
>           panel.abline(h = 75, lty = 2)
>           panel.xyplot(...)
>       })
> doubleYScale(disc, sales, style1 = 0, style2 = 2, add.ylab2 = TRUE,
>              text = c("# stores with gt 50pct disc", "% Unsold"))
> --
> View this message in context: http://r.789695.n4.nabble.com/lattice-xyplot-Help-needed-in-help-in-customizing-the-panel-abline-function-tp3079656p3079656.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Felix Andrews
http://www.neurofractal.org/felix/

------------------------------

Message: 7
Date: Fri, 10 Dec 2010 22:35:56 +1100
From: Michael Bedward <michael.bedward at gmail.com>
To: Martin Spindler <Martin.Spindler at gmx.de>
Cc: r-help at r-project.org
Subject: Re: [R] subset with two factors
Message-ID: <AANLkTikspbP6oyxaxTYAf6ua_eW3-DkfYk4KWWUcb3bW at mail.gmail.com>

Hello Martin,

You were almost there :)

T1 <- subset(daten1, Geschlecht=="M" & GG=="A")

Hope this helps.

Michael
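To make the subset() answer in the "subset with two factors" thread concrete, here is a small self-contained sketch. The toy data frame only mimics the six rows printed in the question; its values are illustrative, not the poster's real data.

```r
# Toy data shaped like the data frame in the question (illustrative values).
daten1 <- data.frame(
  numacc_b   = c(0, 0, 0, 0, 0, 0),
  coverage_b = c(1, 1, 1, 1, 1, 1),
  Geschlecht = c("W", "M", "M", "M", "W", "M"),
  GG         = c("A", "A", "B", "B", "A", "B")
)

# The second argument of subset() is the row filter;
# combine several conditions with &.
T1 <- subset(daten1, Geschlecht == "M" & GG == "A")

# The third argument is 'select' (columns). Passing GG == "A" in that
# position selects no columns, hence the 0-column result in the question.
T2 <- subset(daten1, Geschlecht == "M" & GG == "A",
             select = c(Geschlecht, GG))

# Equivalent row selection with plain logical indexing.
T3 <- daten1[daten1$Geschlecht == "M" & daten1$GG == "A", ]

print(T1)
```

Logical indexing as in `T3` is often preferred inside functions, because subset() evaluates its conditions non-standardly.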
------------------------------

Message: 8
Date: Fri, 10 Dec 2010 03:38:03 -0800 (PST)
From: "Girish A.R." <garamach at gmail.com>
To: r-help at r-project.org
Subject: Re: [R] [lattice xyplot] Help needed in help in customizing the panel.abline() function
Message-ID: <1291981083178-3081792.post at n4.nabble.com>

Thanks, Felix! That works.

best,
-Girish

--
View this message in context: http://r.789695.n4.nabble.com/lattice-xyplot-Help-needed-in-help-in-customizing-the-panel-abline-function-tp3079656p3081792.html

------------------------------

Message: 9
Date: Fri, 10 Dec 2010 03:43:18 -0800 (PST)
From: Adam Carr <adamlcarr at yahoo.com>
To: r-help at r-project.org
Subject: [R] New Installs, Same Trouble Loading doBy and coin Packages
Message-ID: <539652.57795.qm at web35307.mail.mud.yahoo.com>

I tried Tal's suggestion of deleting the doBy and coin packages and then reinstalling them from a different mirror. The first install was from the Harvard mirror and the second was from the Case Western Univ. mirror. The new packages generate the same errors when I call them using the library() command.

Also, I tried to load these packages using R and its script editor, thinking that the problem may have something to do with Tinn-R, but the same errors are generated on the R terminal when I use the library() function.

Any help would be appreciated.

Again, the errors for these two packages:

Error in length(label) : could not find function ".extendsForS3"
Error: package/namespace load failed for 'doBy'

> library(coin)
Loading required package: mvtnorm
Loading required package: modeltools
Loading required package: stats4   #This is odd. I cannot find any reference for this package. AC
Error in length(sig) : could not find function ".extendsForS3"
Error: package 'stats4' could not be loaded

----- Forwarded Message ----
From: Adam Carr <adamlcarr at yahoo.com>
To: Tal Galili <tal.galili at gmail.com>
Cc: r-help at r-project.org
Sent: Thu, December 9, 2010 1:12:21 PM
Subject: Re: [R] Trouble Loading doBy and coin Packages

Hi Tal:

No, I have not tried this. I will do it this evening and we'll see what happens. Thanks for the suggestion.

Adam

________________________________
From: Tal Galili <tal.galili at gmail.com>
Cc: r-help at r-project.org
Sent: Thu, December 9, 2010 12:29:20 PM
Subject: Re: [R] Trouble Loading doBy and coin Packages

Hi Adam,

Have you tried deleting the package files and then reinstalling them from a different CRAN mirror?

Tal

> Good Evening R-Help Community:
>
> I have attached a file that contains the output from sessionInfo() and a summary
> of my Win XP system. I am running R 2.12.0 and using Tinn-R 2.3.6.2 as my
> interface. When I attempt to call either the doBy or coin packages R generates
> an error that I do not understand and have so far not been able to resolve by
> searching R resources.
>
> I exchanged a couple of emails with Soren Hojsgaard who does not think the doBy
> error is directly related to the package itself, and he suggested that I post
> this problem for input from others.
>
> When the doBy package is loaded, the following error appears in the Tinn-R log:
>
> Error in length(label) : could not find function ".extendsForS3"
> Error: package/namespace load failed for 'doBy'
>
> When the coin package is called, this error appears in the Tinn-R log:
>
> Error in length(sig) : could not find function ".extendsForS3"
> Error: package 'stats4' could not be loaded
>
> No functions in either package work, and when I attempt to call them the same
> errors are generated in the log.
>
> Any help or direction would be appreciated.
>
> Thanks very much,
>
> Adam

------------------------------

Message: 10
Date: Fri, 10 Dec 2010 03:45:46 -0800 (PST)
From: sadanandan <sudeesh.sadanandan at gmail.com>
To: r-help at r-project.org
Subject: [R] Help..Neural Network
Message-ID: <1291981546611-3081800.post at n4.nabble.com>

Hi all,

I am trying to develop a neural network in R, with a single target variable and 5 input variables, to predict the importance of the input variables. I used the packages nnet and RSNNS, but unfortunately I could not interpret the output properly, and the documentation of those packages does not give proper direction either.

Please help me find a good package with proper documentation for neural networks.
Advance thanks s.sadanand -- View this message in context: http://r.789695.n4.nabble.com/Help-Neural-Network-tp3081800p3081800.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 11 Date: Fri, 10 Dec 2010 12:47:27 +0100 From: "Martin Spindler" <Martin.Spindler at gmx.de> To: r-help at r-project.org Subject: Re: [R] subset with two factors Message-ID: <20101210114727.128910 at gmx.net> Content-Type: text/plain; charset="utf-8" Hey Michael, [[elided Yahoo spam]] Best, Martin -------- Original-Nachricht --------> Datum: Fri, 10 Dec 2010 22:35:56 +1100 > Von: Michael Bedward <michael.bedward at gmail.com> > An: Martin Spindler <Martin.Spindler at gmx.de> > CC: r-help at r-project.org > Betreff: Re: [R] subset with two factors> Hello Martin, > > You were almost there :) > > T1 <- subset(daten1, Geschlecht=="M" & GG=="A") > > Hope this helps. > > Michael > > On 10 December 2010 22:25, Martin Spindler <Martin.Spindler at gmx.de> wrote: > > Dear all, > > > > I have a dataframe of the following strucutre > > > > ?numacc_b coverage_b Geschlecht GG > > 1 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?W ?A > > 2 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?M ?A > > 3 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > 4 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > 5 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?W ?A > > 6 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > > > I would like to form a subset consisting of all entries with > Geschlecht=M and GG=A. > > > > Using > > > >>T1 <- subset(daten1, Geschlecht=="M", GG=="A") > > > > delievers > > > > data frame with 0 columns and 6 rows > > > >> T1 <- subset(daten1, Geschlecht=="M") > > > > delievers > > > > ?numacc_b coverage_b Geschlecht GG > > 2 ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? ?M ?A > > 3 ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > 4 ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > 6 ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > 9 ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > 10 ? ? ? ?0 ? ? ? ? ?1 ? ? ? ? ?M ?B > > > > But I want to select the dataframe according to both factos. 
> > > > What can I do? > >[[elided Yahoo spam]]> > > > Best, > > > > Martin > > -- > > GMX DSL Doppel-Flat ab 19,99 €/mtl.! Jetzt auch mit > > gratis Notebook-Flat! http://portal.gmx.net/de/go/dsl > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > >-- GRATIS! Movie-FLAT mit ?ber 300 Videos. -- GMX DSL Doppel-Flat ab 19,99 €/mtl.! Jetzt auch mit gratis Notebook-Flat! http://portal.gmx.net/de/go/dsl ------------------------------ Message: 12 Date: Fri, 10 Dec 2010 12:56:59 +0100 From: Hans-Peter Suter <gchappi at gmail.com> To: tkdweber <tkd.weber at gmail.com>, eixcxisx at bca.bai.ne.jp Cc: r-help at r-project.org Subject: Re: [R] ReadWrite.xls problem Message-ID: <AANLkTi=qon_=O+4so7Au-_gRYjGOouEbPnnQ954G+qqw at mail.gmail.com> Content-Type: text/plain; charset=UTF-8 Toby, haruo0409, 2010/12/8 tkdweber <tkd.weber at gmail.com>:> This is my Error-Message in its German original: > Fehler in .Call("ReadXls", file, colNames, sheet, type, from, rowNames, ?: > ?Falsche Anzahl von Argumenten (11), erwarte 10 f?r ReadXlsThere was a wrong DLL for a short while in the old 1.5.2 version (I fixed a R2.12.0 related issue and unfortunately introduced this error). If you delete the old xlsReadWrite package and re-install the package (either from CRAN or see www.swissr.org/download) it really should work. 2010/12/10 haruo0409 <eixcxisx at bca.bai.ne.jp>:> I'm also annoyed at same problem. > I installed xlsReadWriter today and entered > x <- read.xls("data.xls",sheet=1) > But I got Error Message: > ?????? .Call("ReadXls", file, colNames, sheet, type, from, rowNames, : > ?????(11)??????10 ?? ReadXls ????????? > (It's Japanese.Its English translation is the same as yours)What's the 'library(xlsReadWrite)' startup message? 
For the current version it should be: 'xlsReadWrite version 1.5.3 (0b78c1)'.

Could you please give more details about 'I installed xlsReadWriter today' (which CRAN mirror, 'R.version' and '.Platform' output, is there only one 'xlsReadWrite.dll' file on your system)? It is supposed to work.

2010/12/8 tkdweber <tkd.weber at gmail.com>:
> Without being able to read data, the
> programme renders pointless for me :-(

There are many ways to read/write data in R:

* Other than load/save you could use read.table/write.table (see R Data Import/Export).
* Using Excel files is not the recommended way. However, when you want/need it, there are several options (see http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows)

Cheers,
Hans-Peter

PS. Not that I mind discussing things here, but as these are package-specific problems I'd suggest switching to the xlsReadWrite forum (http://dev.swissr.org/projects/xlsreadwrite/boards). You can also create an issue (http://dev.swissr.org/projects/xlsreadwrite/issues/new) or just send an email to 'support' at 'swissr.org'.

------------------------------

Message: 13
Date: Fri, 10 Dec 2010 04:13:17 -0800
From: Peter Ehlers <ehlers at ucalgary.ca>
To: Adam Carr <adamlcarr at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Subject: Re: [R] New Installs, Same Trouble Loading doBy and coin Packages
Message-ID: <4D02195D.6070500 at ucalgary.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2010-12-10 03:43, Adam Carr wrote:
> I tried Tal's suggestion of deleting the doBy and coin packages and then
> reinstalling them from a different mirror. The first install was from the
> Harvard mirror and the second was from the Case Western Univ. mirror. The new
> packages generate the same errors when I call them using the library() command.
>
> Also, I tried to load these packages using R and its script editor thinking that
> the problem may have something to do with Tinn-R, but the same errors are
> generated on the R terminal when I use the library() function.
>
> Any help would be appreciated.
>
> Again, the errors for these two packages:
>
> Error in length(label) : could not find function ".extendsForS3"
> Error: package/namespace load failed for 'doBy'
>
>> library(coin)
> Loading required package: mvtnorm
> Loading required package: modeltools
> Loading required package: stats4 #This is odd. I cannot find any reference for
> this package. AC
> Error in length(sig) : could not find function ".extendsForS3"
> Error: package 'stats4' could not be loaded
>

I would remove and re-install R. 'stats4' is a base package and if that can't be loaded, your installation may be broken. Try

  require(stats4)

or

  help(package=stats4)

Peter Ehlers

> ----- Forwarded Message ----
> From: Adam Carr<adamlcarr at yahoo.com>
> To: Tal Galili<tal.galili at gmail.com>
> Cc: r-help at r-project.org
> Sent: Thu, December 9, 2010 1:12:21 PM
> Subject: Re: [R] Trouble Loading doBy and coin Packages
>
> Hi Tal:
>
> No I have not tried this. I will do it this evening and we'll see what happens.
> Thanks for the suggestion.
>
> Adam
>
> ________________________________
> From: Tal Galili<tal.galili at gmail.com>
> Cc: r-help at r-project.org
> Sent: Thu, December 9, 2010 12:29:20 PM
> Subject: Re: [R] Trouble Loading doBy and coin Packages
>
> Hi Adam,
> Have you tried deleting the package files and then reinstalling them from a
> different CRAN mirror?
>
> Tal
>
> ----------------Contact
> Details:-------------------------------------------------------
> Contact me: Tal.Galili at gmail.com | 972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ----------------------------------------------------------------------------------------------
>
>> Good Evening R-Help Community:
>>
>> I have attached a file that contains the output from sessionInfo() and a summary
>> of my Win XP system. I am running R 2.12.0 and using Tinn-R 2.3.6.2 as my
>> interface. When I attempt to call either the doBy or coin packages R generates
>> an error that I do not understand and have so far not been able to resolve by
>> searching R resources.
>>
>> I exchanged a couple of emails with Soren Hojsgaard who does not think the doBy
>> error is directly related to the package itself, and he suggested that I post
>> this problem for input from others.
>>
>> When the doBy package is loaded, the following error appears in the Tinn-R log:
>>
>> Error in length(label) : could not find function ".extendsForS3"
>> Error: package/namespace load failed for 'doBy'
>>
>> When the coin package is called, this error appears in the Tinn-R log:
>>
>> Error in length(sig) : could not find function ".extendsForS3"
>> Error: package 'stats4' could not be loaded
>>
>> No functions in either package work, and when I attempt to call them the same
>> errors are generated in the log.
>>
>> Any help or direction would be appreciated.
>>
>> Thanks very much,
>>
>> Adam
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
------------------------------

Message: 14
Date: Fri, 10 Dec 2010 04:16:31 -0800 (PST)
From: Roger Bivand <Roger.Bivand at nhh.no>
To: r-help at r-project.org
Subject: Re: [R] Using Lagsarlm
Message-ID: <1291983391495-3081833.post at n4.nabble.com>
Content-Type: text/plain; charset=us-ascii

This has been answered offlist (the poster also wrote directly to me as package maintainer, but did not post on the R-sig-geo list, as would have seemed natural). The resolution was to read ?formula, and to use either errorsarlm() or lagsarlm() in spdep with formula=y ~ 1. Apparently an insurance analyst in a hurry ...

Roger

Saswati Neogi wrote:
> I'm trying to use the spdep package to calculate this:
>
> y = rho W y + e
>
> I don't want to use explanatory variables, just the lag from the dependent
> variable.
>
> How would I code this?

-----
Roger Bivand
Economic Geography Section
Department of Economics
Norwegian School of Economics and Business Administration
Helleveien 30
N-5045 Bergen, Norway
--
View this message in context: http://r.789695.n4.nabble.com/Using-Lagsarlm-tp3080368p3081833.html
Sent from the R help mailing list archive at Nabble.com.
------------------------------

Message: 15
Date: Sat, 11 Dec 2010 00:01:38 +1100
From: Michael Sumner <mdsumner at gmail.com>
To: Barry Rowlingson <b.rowlingson at lancaster.ac.uk>
Cc: r-help at r-project.org, mathijsdevaan <mathijsdevaan at gmail.com>
Subject: Re: [R] Projecting data on a world map using long/lat
Message-ID: <AANLkTi=079ZiWJSZnhLfvoTneO6x=JLhtGKqNTG6my=+ at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Oh, whoops I was looking for the "vote up" button and accidentally hit "Reply All".

On Fri, Dec 10, 2010 at 8:25 PM, Barry Rowlingson <b.rowlingson at lancaster.ac.uk> wrote:
> On Fri, Dec 10, 2010 at 2:21 AM, mathijsdevaan <mathijsdevaan at gmail.com> wrote:
>>
>> Hi,
>>
>> I have a dataset (CSV) with some counts of firms located around the globe.
>> Each count is assigned to the longitude and latitude of the specific
>> location. Now I want to plot these counts on a world map using dots (size of
>> dots represent the count). I have been unable to find any info on whether [...]
>>
>
> Plotting points is trivial - plot(data$x,data$y,pch=19,cex=data$size)
> will do for a start. I'm guessing your real problem is when you say
> 'on a world map'.
>
> How detailed a world map do you need? There's an outline one in the
> 'maps' package, or you should be able to find a shapefile of the world
> on the web somewhere and use that via the rgdal package.
>
> Other options include making a KML file of your points and overlaying
> on google earth. Or getting google map tiles and overlaying on
> that.... Or exporting your data to a GIS format and doing the pretty
> map in something like Quantum GIS. What are you trying to do exactly?
>
> also, you might want to post to r-sig-geo
>
> Barry

--
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsumner at gmail.com

------------------------------

Message: 16
Date: Fri, 10 Dec 2010 02:41:12 -0800 (PST)
From: Amelia Vettori <amelia_vettori at yahoo.co.nz>
To: r-help at r-project.org
Subject: [R] Adding numbers in Outputs
Message-ID: <533730.19296.qm at web121402.mail.ne1.yahoo.com>
Content-Type: text/plain

Hello!

I am Amelia from Auckland and work for a bank. I am new to R and I have started my venture with R just a couple of weeks back and this is my first mail to R-forum. I need following assistance

Suppose my R code generates following outputs as

> X
[[1]]
[1] 40

[[2]]
[1]  80 160

[[3]]
[1] 160  80 400

> Y
[[1]]
[1] 10

[[2]]
[1] 10 30

[[3]]
[1]  5 18 20

and suppose

Z = c(1, 2, 3)

I need to perform the calculation where I will be multiplying corresponding terms of X and Y individually and multiplying their sum by Z and store these results in a dataframe.

I.e. I need to calculate

(40*10) * 1                      # (first element of X * first element of Y) * Z[1] = 400
((80*10)+(160*30)) * 2           # 2nd row of X and 2nd row of Y and Z[2] = 11200
((160*5)+(80*18)+(400*20)) * 3   # 3rd row of X and 3rd row of Y and Z[3] = 30720

So the final output should be

400
11200
30720

One way of doing it is write R code for individual rows and arrive at the result e.g.

(X[[1]] * Y[[1]]) * 1 will result in 400. However, I was just trying to know some smart way of doing it as there could be number of rows and writing code for each row will be a cumbersome job. So is there any better way to do it?
Please guide me.

I thank you in advance.

Thanking all

Amelia

------------------------------

Message: 17
Date: Fri, 10 Dec 2010 05:06:39 -0800
From: Peter Ehlers <ehlers at ucalgary.ca>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Subject: Re: [R] Compare one level of a factor with *all* other non-missing levels
Message-ID: <4D0225DF.40705 at ucalgary.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2010-12-10 03:24, deriK2000 wrote:
>
> Dear list,
>
> I try to compare the mean of a variable given a value of a factor with the
> mean of the same variable for all K-1 other non-missing values of this
> factor. This procedure I want to repeat for each level of the factor.
>
> Having read the recommendations of this list I want to avoid creating K-1
> dummy variables and searched for options of the pairwise.t.test. But
> couldn't find a solution. Anyone with a suggestion how to do the
> comparisons?

Sounds like you want the Dunnett test procedure which seems to be implemented in a number of packages: multcomp, asd, MCPAN and others.

It would probably be a good idea to install package 'sos' and learn how to search with it.

Peter Ehlers

>
> Cheers,
>
> Derik

------------------------------

Message: 18
Date: Fri, 10 Dec 2010 08:31:56 -0500
From: Duncan Murdoch <murdoch.duncan at gmail.com>
To: Søren Højsgaard <Soren.Hojsgaard at agrsci.dk>
Cc: "r-help at stat.math.ethz.ch" <r-help at stat.math.ethz.ch>
Subject: Re: [R] Sweave: Setting options with SweaveOpts{} when using driver=RweaveHTML
Message-ID: <4D022BCC.9020805 at gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 10/12/2010 3:40 AM, Søren Højsgaard wrote:
> When using Sweave in connection with the driver RweaveLatex(), global options can be set with \SweaveOpts{}, e.g.
> \SweaveOpts{keep.source=T}.
> Does anybody know if it is possible to set global options in the same way when using Sweave with the driver RweaveHTML().
>

I haven't used R2HTML, but it looks from the SweaveSyntaxHTML variable that the same syntax should work there.

> Regards
> Søren

------------------------------

Message: 19
Date: Fri, 10 Dec 2010 08:43:37 -0500
From: jim holtman <jholtman at gmail.com>
To: Amelia Vettori <amelia_vettori at yahoo.co.nz>
Cc: r-help at r-project.org
Subject: Re: [R] Adding numbers in Outputs
Message-ID: <AANLkTikPfKZ56JcpaX5yRE29TYBdzE4n_Rw1DVh28dzS at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

try this:

> x <- list(40, c(80,160), c(160,80,400))
> y <- list(10, c(10,30), c(5,18,20))
> z <- c(1,2,3)
> mapply(function(a1,a2,a3){
+     a3 * sum(a1 * a2)
+ }
+ , x
+ , y
+ , z
+ )
[1]   400 11200 30720

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

------------------------------

Message: 20
Date: Fri, 10 Dec 2010 05:50:56 -0800
From: Peter Ehlers <ehlers at ucalgary.ca>
To: Amelia Vettori <amelia_vettori at yahoo.co.nz>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Subject: Re: [R] Adding numbers in Outputs
Message-ID: <4D023040.2030106 at ucalgary.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2010-12-10 02:41, Amelia Vettori wrote:
> Hello!
>
> I am Amelia from Auckland and work for a bank. I am new to R and I have started my venture with R just a couple of weeks back and this is my first mail to R-forum.
> I need following assistance
>
> Suppose my R code generates following outputs as
>
>> X
> [[1]]
> [1] 40
>
> [[2]]
> [1]  80 160
>
> [[3]]
> [1] 160  80 400
>
>> Y
> [[1]]
> [1] 10
>
> [[2]]
> [1] 10 30
>
> [[3]]
> [1]  5 18 20
>
> and suppose
>
> Z = c(1, 2, 3)
>
> I need to perform the calculation where I will be multiplying corresponding terms of X and Y individually and multiplying their sum by Z and store these results in a dataframe.
>
> I.e. I need to calculate
>
> (40*10) * 1                      # (first element of X * first element of Y) * Z[1] = 400
> ((80*10)+(160*30)) * 2           # 2nd row of X and 2nd row of Y and Z[2] = 11200
> ((160*5)+(80*18)+(400*20)) * 3   # 3rd row of X and 3rd row of Y and Z[3] = 30720
>
> So the final output should be
>
> 400
> 11200
> 30720
>
> One way of doing it is write R code for individual rows and
> arrive at the result e.g.
>
> (X[[1]] * Y[[1]]) * 1 will result in 400. However, I was just trying to know some smart way of doing it as there could be number of rows and writing code for each row will be a cumbersome job. So is there any better way to do it?
>
> Please guide me.
>

Why not just write a function to do what you've done by hand:

  f <- function(x, y, z){
    len <- length(x)
    res <- rep(NA, len)        # one result per list element
    for(i in seq_len(len)){
      # multiply matching terms, add them up, then scale by z[i]
      res[i] <- sum(x[[i]] * y[[i]]) * z[i]
    }
    res
  }
  f(X, Y, Z)

Peter Ehlers

> I thank you in advance.
>
> Thanking all
>
> Amelia
>

------------------------------

Message: 21
Date: Fri, 10 Dec 2010 14:57:17 +0100
From: Jinyan Huang <jinyan.fr at gmail.com>
To: Peter Ehlers <ehlers at ucalgary.ca>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Subject: Re: [R] Adding numbers in Outputs
Message-ID: <AANLkTik+2=5EOEcM-bnz4st-Pcrv=q9dqVnQjTE0MjUB at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

X<-list(40,c(80,160),c(160,80,400))
Y<-list(10,c(10,30),c(5,18,20))
Z<-c(1,2,3)
as.data.frame(do.call("rbind",X))->x
as.data.frame(do.call("rbind",Y))->y
x*y*Z->r
r[upper.tri(r)] <- 0
rowSums(r)

------------------------------

Message: 22
Date: Fri, 10 Dec 2010 05:58:30 -0800 (PST)
To: r-help at r-project.org
Subject: Re: [R] Compare one level of a factor with *all* other non-missing levels
Message-ID: <1291989510070-3082005.post at n4.nabble.com>
Content-Type: text/plain; charset=us-ascii

Peter Ehlers wrote:
>
> Sounds like you want the Dunnett test procedure which seems
> to be implemented in a number of packages: multcomp, asd, MCPAN
> and others.
>
> It would probably be a good idea to install package 'sos' and
> learn how to search with it.
>
> Peter Ehlers
>

Unfortunately, Dunnett compares mean(x) for one factor level with mean(x) of each of the K-1 other levels separately, resulting in K-1 comparisons for each level (printed in a lower triangular matrix for the results). Instead, I just want to compare this one mean(x) with the single mean(x) of all the K-1 other levels taken together (printed in a vector of length K for the results).

Cheers,
Derik

--
View this message in context: http://r.789695.n4.nabble.com/Compare-one-level-of-a-factor-with-all-other-non-missing-levels-tp3081777p3082005.html
Sent from the R help mailing list archive at Nabble.com.
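
A minimal sketch of the one-level-against-all-the-rest comparison Derik describes, using only base R. The data and the names `one_vs_rest`, `x` and `g` are made up for illustration and do not come from the thread, and a Welch t-test is only one possible choice of test. The p-values are not adjusted for multiplicity; something like p.adjust() could be applied afterwards.

```r
# Hypothetical data: numeric variable x, factor g with K = 4 levels
set.seed(1)
g <- factor(rep(c("a", "b", "c", "d"), each = 10))
x <- rnorm(40, mean = as.integer(g))
x[c(3, 17)] <- NA                       # a few missing values

# For each level, compare its mean with the mean of all other
# non-missing observations taken together (one test per level).
one_vs_rest <- function(x, g) {
  keep <- !is.na(x) & !is.na(g)
  x <- x[keep]
  g <- droplevels(g[keep])
  t(sapply(levels(g), function(lev) {
    in_lev <- g == lev
    tt <- t.test(x[in_lev], x[!in_lev])  # Welch two-sample t-test
    c(mean_level = mean(x[in_lev]),
      mean_rest  = mean(x[!in_lev]),
      p_value    = tt$p.value)
  }))
}

res <- one_vs_rest(x, g)
print(res)                               # one row per factor level
```

This gives K rows (one per level) rather than the K-1 comparisons per level of a pairwise procedure.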
------------------------------

Message: 23
Date: Fri, 10 Dec 2010 06:00:12 -0800 (PST)
From: Amelia Vettori <amelia_vettori at yahoo.co.nz>
To: jim holtman <jholtman at gmail.com>
Cc: r-help at r-project.org
Subject: Re: [R] Adding numbers in Outputs
Message-ID: <762099.6806.qm at web121409.mail.ne1.yahoo.com>
Content-Type: text/plain

Dear Mr Holtman Sir,

Thanks a lot for your great solution. This certainly is helping me achieve what I need to get. However, I shall be hugely thankful to you if you can guide me in one respect.

Sir, you have used following commands to assign values to x and y.

> x <- list(40, c(80,160), c(160,80,400))
> y <- list(10, c(10,30), c(5,18,20))
> z <- c(1,2,3)

But Sir, the problem is these values are basically outputs of some other process which I am running and chances are these will vary. Sir, it will be a great help if you can guide me to convert the output (which I am getting)

X
[[1]]
[1] 40

[[2]]
[1]  80 160

[[3]]
[1] 160  80 400

to what you have suggested

x <- list(40, c(80,160), c(160,80,400))

So, in that case once I get output in my format, I will convert that output as provided by you.

I apologize for taking the liberty of writing to you, but I shall be really grateful to you, as I have just started getting the feel of 'R' and I know I need to take lots of efforts to begin with.

Thanks and eagerly waiting for your guidance.

Amelia Vettori
------------------------------

Message: 24
Date: Fri, 10 Dec 2010 15:18:41 +0100
From: andrija djurovic <djandrija at gmail.com>
To: r-help at r-project.org
Subject: [R] (no subject)
Message-ID: <AANLkTinMTKS-Jr1x5WwXXV3A-WFEaHoxmBpVpRL5T+=K at mail.gmail.com>
Content-Type: text/plain

Hi R-help,

I am trying to find a way to select the five highest values in a data frame according to some variable. I will demonstrate:

c

   X1 X2
1   1  1
2   1  2
3   1  3
4   1  4
5   1  5
6   1  6
7   1  7
8   1  8
9   1  9
10  1 10
11  2 11
12  2 12
13  2 13
14  2 14
15  2 15
16  2 16
17  2 17
18  2 18
19  2 19
20  2 20
21  2 21
22  2 22
23  2 23
24  2 24
25  2 25

So I would like to select the rows with the highest values of X2 inside X1. Expected result should be:

X1 X2
 1 10
 1  9
 1  8
 1  7
 1  6
 2 25
 2 24
 2 23
 2 22
 2 21

I first ordered the data frame using

c=c[with(c,order(X1,-X2)),]

but I need help to select the highest five. It is easy to select when I have just 2 unique values of X1, but what if I have 500 unique values in X1?

Thanks
Andrija

------------------------------

Message: 25
Date: Fri, 10 Dec 2010 09:42:44 -0500
From: jim holtman <jholtman at gmail.com>
To: andrija djurovic <djandrija at gmail.com>
Cc: r-help at r-project.org
Subject: Re: [R] (no subject)
Message-ID: <AANLkTik+Ssu45gg2PvGF5r0di46pEnEgmHQr_7pAHYWV at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

try this:

> do.call(rbind, lapply(split(x, x$X1), function(.grp){
+     .ord <- .grp[order(.grp$X2, decreasing = TRUE),]
+     .ord[seq(min(5, nrow(.grp))),]
+ }))
     X1 X2
1.10  1 10
1.9   1  9
1.8   1  8
1.7   1  7
1.6   1  6
2.25  2 25
2.24  2 24
2.23  2 23
2.22  2 22
2.21  2 21

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

------------------------------

Message: 26
Date: Fri, 10 Dec 2010 14:47:24 +0000
From: Daniel Brewer <daniel.brewer at icr.ac.uk>
To: r-help at stat.math.ethz.ch
Subject: [R] melt causes errors when characters and values are used
Message-ID: <4D023D7C.1080607 at icr.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1

Hello,

I am finding that the melt function from the reshape library causes errors when applied to a data.frame that contains numeric and character columns.
For example,

melt(id.vars="ID",data.frame(ID=1:3,date=c("a","b","c"),value=c(1,4,5)))
  ID variable value
1  1     date     a
2  2     date     b
3  3     date     c
4  1    value  <NA>
5  2    value  <NA>
6  3    value  <NA>
Warning message:
In `[<-.factor`(`*tmp*`, ri, value = c(1, 4, 5)) :
  invalid factor level, NAs generated

It would be useful if the numerical column got converted to a character column in this situation. Any ways round this?

In actual fact I have got a situation where it is more like this

ID Date_1 Value_1 Date_2 Value_2 ...

and I would like to convert it to a data.frame of ID, Date & Value, but I thought the above would be an appropriate middle step.

Thanks

Dan

--
**************************************************************
Daniel Brewer, Ph.D.

Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
**************************************************************

------------------------------

Message: 27
Date: Fri, 10 Dec 2010 06:54:56 -0800 (PST)
From: haruo0409 <eixcxisx at bca.bai.ne.jp>
To: r-help at r-project.org
Subject: Re: [R] ReadWrite.xls problem
Message-ID: <1291992896890-3082108.post at n4.nabble.com>
Content-Type: text/plain; charset=us-ascii

Hans-Peter

I have checked the 'library(xlsReadWrite)' startup message. I found that I had simply not run 'xls.getshlib()'. After entering 'xls.getshlib()', read.xls() works normally.

Thank you.

--
View this message in context: http://r.789695.n4.nabble.com/ReadWrite-xls-problem-tp3078348p3082108.html
Sent from the R help mailing list archive at Nabble.com.
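
On the melt question in Message 26 above: one possible way round it, sketched here with base R's reshape() rather than the reshape package, is to go straight from the wide layout (ID, Date_1, Value_1, Date_2, Value_2) to ID, Date, Value so that dates and values never share a column. The example data and the column name `occasion` are invented for illustration; this is not a fix proposed anywhere in the thread.

```r
# Hypothetical wide data in the layout described in Message 26
wide <- data.frame(ID      = 1:3,
                   Date_1  = c("a", "b", "c"),
                   Value_1 = c(1, 4, 5),
                   Date_2  = c("d", "e", "f"),
                   Value_2 = c(2, 3, 6),
                   stringsAsFactors = FALSE)

# Stack the Date_* columns into one character column and the Value_*
# columns into one numeric column; 'occasion' records which pair a row
# came from (1 or 2).
long <- reshape(wide,
                direction = "long",
                idvar     = "ID",
                varying   = list(c("Date_1", "Date_2"),
                                 c("Value_1", "Value_2")),
                v.names   = c("Date", "Value"),
                timevar   = "occasion",
                times     = 1:2)

print(long)
```

Because Date and Value stay in separate columns, no numeric-to-factor coercion is involved and the warning from the intermediate melt step does not arise.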
------------------------------

Message: 28
Date: Fri, 10 Dec 2010 09:31:21 -0500
From: jim holtman <jholtman at gmail.com>
To: Amelia Vettori <amelia_vettori at yahoo.co.nz>
Cc: r-help at r-project.org
Subject: Re: [R] Adding numbers in Outputs
Message-ID: <AANLkTi=Von-bjo0XgXQqkxJcuGAg-TyK2Ny3XeYsVEm5 at mail.gmail.com>
Content-Type: text/plain

You should be able to use whatever values you are getting from your script right now. I just did the assignment to match what you were showing on the output. The easiest thing to do is to do 'str(X)' from your data and compare it to the 'x' I created -- str(x). Here is what 'str(x)' gives:

> x <- list(40, c(80,160), c(160,80,400))
> str(x)
List of 3
 $ : num 40
 $ : num [1:2] 80 160
 $ : num [1:3] 160 80 400

When providing sample data, it is probably best to use 'dput' so it can be reconstructed by the reader:

> dput(x)
list(40, c(80, 160), c(160, 80, 400))

This gives back what I was using and hopefully it compares with your current output.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
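
To tie the pieces of this thread together, here is a short sketch under the assumption that X and Y really are lists of numeric vectors, as the printed output suggests. It shows that the text dput() prints is itself R code that rebuilds the same object, and then stores the mapply() result in a data frame as originally requested. The names `X_text`, `X_again`, `total` and `result` are illustrative only.

```r
X <- list(40, c(80, 160), c(160, 80, 400))
Y <- list(10, c(10, 30), c(5, 18, 20))
Z <- c(1, 2, 3)

# deparse() yields the same text that dput() prints; evaluating that
# text reconstructs the object, so no manual conversion is needed.
X_text  <- paste(deparse(X), collapse = "")
X_again <- eval(parse(text = X_text))
identical(X, X_again)

# Element-wise: multiply matching terms, sum them, scale by Z[i]
total  <- mapply(function(x, y, z) z * sum(x * y), X, Y, Z)
result <- data.frame(row = seq_along(total), total = total)
print(result)   # totals are 400, 11200, 30720
```

If str(X) on the real data shows something other than a list of numeric vectors, the mapply() call would need adjusting accordingly.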
------------------------------

Message: 29
Date: Fri, 10 Dec 2010 08:57:55 -0600
From: David L Lorenz <lorenz at usgs.gov>
To: andrija djurovic <djandrija at gmail.com>
Cc: r-help at r-project.org
Subject: Re: [R] (no subject)
Message-ID: <OF0496E27C.59EA4F61-ON862577F5.00517845-862577F5.00523549 at usgs.gov>
Content-Type: text/plain

Andrija,

You should be able to extract the data that you want using a call like this (AD substituted for your c)

with(AD, tapply(X2, X1, function(x) sort(x, decreasing=TRUE)[1:5]))

That returns a list like this:

$`1`
[1] 10  9  8  7  6

$`2`
[1] 25 24 23 22 21

Just package it the way that you want.

Dave
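
One way to "package it", sketched in base R: keep whole rows rather than just the X2 values, so the result is already a data frame in the shape Andrija asked for. The function name `top_n_by_group` is invented for this illustration, and `AD` is rebuilt here from the data shown in Message 24.

```r
# Data from Message 24: 10 rows with X1 = 1, 15 rows with X1 = 2
AD <- data.frame(X1 = rep(1:2, times = c(10, 15)), X2 = 1:25)

top_n_by_group <- function(df, n = 5) {
  # sort by group, then by X2 descending within each group
  ord <- df[order(df$X1, -df$X2), ]
  # position of each row within its group after sorting (1 = largest X2)
  rank_in_group <- ave(seq_len(nrow(ord)), ord$X1, FUN = seq_along)
  ord[rank_in_group <= n, ]
}

top5 <- top_n_by_group(AD)
print(top5)
```

Groups with fewer than n rows are returned whole rather than padded with NA, and the same call works unchanged for 500 distinct values of X1.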
[[alternative HTML version deleted]] ------------------------------ Message: 30 Date: Fri, 10 Dec 2010 08:27:23 -0600 From: Scott Chamberlain <schamber at rice.edu> To: r-help <r-help at r-project.org> Subject: [R] Textwrangler Languages Folder Message-ID: <AANLkTikUF+4B3krpbePb5Oym_ni3H9FN1scx8FzF4y9g at mail.gmail.com> Content-Type: text/plain Dear R Community, I recently switched to a Mac (10.6.5), and have installed Textwrangler to run code to R. However, I can't install the syntax highlighting file because I can't find the directory: "~Users/username/Library/Application Support/TextWrangler/Language Modules/". Is there a different location I can place the syntax highlighting file? Scott [[alternative HTML version deleted]] ------------------------------ Message: 31 Date: Fri, 10 Dec 2010 04:55:14 -0800 (PST) From: bluesky <yooyoodd at yahoo.com.cn> To: r-help at r-project.org Subject: Re: [R] Minimization of the distance Message-ID: <1291985714124-3081900.post at n4.nabble.com> Content-Type: text/plain; charset=us-ascii its really help,thanks a lot -- View this message in context: http://r.789695.n4.nabble.com/Minimization-of-the-distance-tp3081345p3081900.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 32 Date: Fri, 10 Dec 2010 15:04:41 +0000 From: Francesco Nutini <nutini.francesco at gmail.com> To: "[R] help" <r-help at r-project.org> Subject: [R] [r] overlap different line in a xyplot (lattice) Message-ID: <BAY156-w3AAC96CF0EA9ACD3137FEDF2F0 at phx.gbl> Content-Type: text/plain dear [R] users, is there a way to plot different data (but with the same x-variables) in the same xyplot window? There are already a similar question, but the answer is not enought explanatory... 
Thanks a lot, Francesco [[alternative HTML version deleted]] ------------------------------ Message: 33 Date: Fri, 10 Dec 2010 09:07:42 -0600 From: Terry Therneau <therneau at mayo.edu> To: r-help at r-project.org, Damjan Krstajic <dkrstajic at hotmail.com> Subject: Re: [R] survival: ridge log-likelihood workaround Message-ID: <1291993662.26439.22.camel at punchbuggy> Content-Type: text/plain ------ begin inclusion --------- Dear all, I need to calculate likelihood ratio test for ridge regression. In February I have reported a bug where coxph returns unpenalized log-likelihood for final beta estimates for ridge coxph regression. In high-dimensional settings ridge regression models usually fail for lower values of lambda. As the result of it, in such settings the ridge regressions have higher values of lambda (e.g. over 100) which means that the difference between unpenalized log-likelihood and penalized log-likelihood is not insignificant. I would be grateful if someone can confirm that the below code is correct workaround. --- end included message ---- First, the "bug" you report is not a bug. The log partial likelihood from a Cox model LPL(beta) is well defined for any vector of coefficients beta, whether they are result of a maximization or taken from your daily horoscope. The loglik component of coxph is the LPL for the reported coefficients. For a ridge regression the coxph function maximizes LPL(beta) - penalty(beta) = penalized partial likelihood = PPL(beta). You have correctly recreated the PPL. Second: how do you do formal tests on such a model? This is hard. The difference LPL1- LPL2 is a chi-square when each is the result of maximizing the Cox LPL over a set of coefficients; when using a PPL we are maximizing over something else. The distribution of the difference of constrained LPL values can be argued to be a weighed sum of squared normals where the weights are in (0,1), which is something more complex than a chisq distribution. 
In a world with infinite free time I'd have pursued this, worked it all out, and added appropriate code to coxph. What about the difference in PPL values, which is the test you propose? I'm not aware of any theory showing that these have any relation to a chi-square distribution. (Said theory may well exist, and I'd be happy for pointers.) Terry Therneau ------------------------------ Message: 34 Date: Fri, 10 Dec 2010 10:13:22 -0500 From: Ben Tupper <btupper at bigelow.org> To: r-help <r-help at r-project.org> Cc: Scott Chamberlain <schamber at rice.edu> Subject: Re: [R] Textwrangler Languages Folder Message-ID: <B9A94995-291A-4A4F-9092-21318D042EDC at bigelow.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Hi, On Dec 10, 2010, at 9:27 AM, Scott Chamberlain wrote:> Dear R Community, > > I recently switched to a Mac (10.6.5), and have installed > Textwrangler to > run code to R. However, I can't install the syntax highlighting file > because > I can't find the directory: "~Users/username/Library/Application > Support/TextWrangler/Language Modules/". Is there a different > location I can > place the syntax highlighting file? >I don't have OSX 10.6 and I happen to use SubEthaEdit, but I suspect that you are really looking for something like the following: /Users/ben/Library/Application Support/TextWrangler/Language Modules/ Note that you would place your user name in the place where mine ("ben") appears. If the "TextWrangler/Language Modules" directory doesn't exist in the Application Support directory you can create it. Cheers, Ben Ben Tupper Bigelow Laboratory for Ocean Sciences 180 McKown Point Rd. P.O. 
Box 475
West Boothbay Harbor, Maine 04575-0475
http://www.bigelow.org/

------------------------------

Message: 35
Date: Fri, 10 Dec 2010 15:27:53 +0000
From: Daniel Brewer <daniel.brewer at icr.ac.uk>
To: r-help at stat.math.ethz.ch
Subject: [R] Remove 100 years from a date object
Message-ID: <4D0246F9.8070907 at icr.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1

Hello,

I have some data that has dates in the form 27.02.37. I convert them to a date object as follows:

as.Date(data$date,format="%d.%m.%y")

But this gives me years such as 2037 when I would like them to be 1937. I thought of trying to take off some time, i.e.

as.Date(camCD$DoB,format="%d.%m.%y") - 100*365

But that doesn't seem to work out correctly. Any ideas how to do this?

Thanks

Dan

--
**************************************************************
Daniel Brewer, Ph.D.

Institute of Cancer Research
Molecular Carcinogenesis
Email: daniel.brewer at icr.ac.uk
**************************************************************

The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP.

This e-mail message is confidential and for use by the a...{{dropped:2}}

------------------------------

Message: 36
Date: Fri, 10 Dec 2010 07:20:55 -0800 (PST)
From: profaar <profaar at live.com>
To: r-help at r-project.org
Subject: [R] help requested
Message-ID: <1291994455364-3082147.post at n4.nabble.com>
Content-Type: text/plain; charset=us-ascii

Hi friends,

I have very lengthy graph data in edge list format. I want to convert it into node list format.

Example:

EDGE LIST FORMAT
1 2
1 3
1 4
1 5
2 3
2 4
3 2
4 1
4 3
4 5
5 2
5 4

ITS NODE LIST FORMAT SHOULD BE LIKE:
1 2 3 4 5
2 3 4
3 2
4 1 3
5 2 4

Kindly suggest which package in R supports this task.
Thank you, friends, in advance.
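The edge-list-to-node-list conversion described above does not strictly need a package; it can be sketched in base R with split(). This is a minimal sketch, not a recommendation from the thread: the column names from and to, and the objects edges, adjacency and node_lines, are made up here for illustration.

```r
# Edge list from the question; the column names "from" and "to" are assumptions
edges <- data.frame(
  from = c(1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5),
  to   = c(2, 3, 4, 5, 3, 4, 2, 1, 3, 5, 2, 4)
)

# split() groups the target nodes by their source node, giving a named list
adjacency <- split(edges$to, edges$from)

# One line per node: the node followed by its neighbours
node_lines <- vapply(
  names(adjacency),
  function(node) paste(node, paste(adjacency[[node]], collapse = " ")),
  character(1)
)
writeLines(node_lines)
```

Because the edge list contains the pair 4 5, the line for node 4 comes out as "4 1 3 5" rather than the "4 1 3" shown in the question.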
-- View this message in context: http://r.789695.n4.nabble.com/help-requested-tp3082147p3082147.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 37 Date: Fri, 10 Dec 2010 10:25:48 -0500 From: Simon Kiss <simonjkiss at yahoo.ca> To: r-help at r-project.org Subject: [R] 45 Degree labels on barplot? Help understanding code previously posted. Message-ID: <6EC1F9BA-28EF-4052-9F54-CD8E38B06E79 at yahoo.ca> Content-Type: text/plain; charset=us-ascii Dear colleagues, i found a line or two of code in the help archives from Uwe Ligges about creating slanted x-labels for a barplot and it works well for my purposes (code below). However, I was hoping someone could explain to me precisely what the code is doing. I'm aware it's invoking the text command, and I know the first ttwo arguments to text are x and y co-ordinates. I'm also aware that par("usr")[3] is grabbing the third element of the vector of plotting co-ordinates. But I tried replacing par("usr")[3] with just "0" and that didn't work; all the labels got bunched up on the left. Is it necessary to create a new object via "barplot" and then quote that in the x,y coordinates of text? Like I said, the code works great, but I'm trying to actually understand the rationale behind the elements so I can apply it in future. 
Yours, Simon Kiss

#Reproducible Code
mydat<-data.frame(countries=c("Canada", "Denmark", "France", "United Kingdom", "Germany", "Australia", "New Zealand", "Switzerland", "Belgium", "Netherlands"),
  stories_total=c(429, 25, 239, 99, 100, 96, 18, 21, 0, 6),
  avg=c(4.165048544, 6.25, 6.459459459, 0.908256881, 1.923076923, 1.103448276, 1.058823529, 1.615384615, 0, 0.107142857),
  steps=c(2, 2, 2, 0, 1, 1, 1, 0, 0, 0),
  newspapers=c(103, 4, 37, 109, 52, 87, 17, 13, 10, 56))
mydat.sort1<-mydat[order(-mydat$avg), ]
myplot<-barplot(mydat.sort1$avg, col=c("black", "black", "black", "grey", "white", "grey", "grey", "white", "white", "white"), ylim=c(0,7), main="Regulatory Action On Bisphenol A By Newspaper Coverage")
col.vec=c("black", "grey", "white")
legend("topright", col=col.vec, fill=c("black", "grey", "white"), legend=c("Meaningful Ban", "Recommendations To Withdraw", "No Legislative Action"))
labels=mydat.sort1$countries
#These lines create the labels
text(myplot, par("usr")[3], labels=labels, srt=35, offset=1, adj=1, xpd=TRUE)
axis(2)
par("usr")[3]

*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606

------------------------------

Message: 38
Date: Fri, 10 Dec 2010 16:17:42 +0000
From: Barry Rowlingson <b.rowlingson at lancaster.ac.uk>
To: Daniel Brewer <daniel.brewer at icr.ac.uk>
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Remove 100 years from a date object
Message-ID: <AANLkTiksEzwdCwM1JCeN3c8JAkohcMAchxy7qRS8kkD+ at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Fri, Dec 10, 2010 at 3:27 PM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:
> Hello,
>
> I have some data that has dates in the form 27.02.37. I convert them to
> a date object as follows:
> as.Date(data$date,format="%d.%m.%y")
>
> But this gives me years such as 2037 when I would like them to be 1937.
> I thought of trying to take off some time i.e.
> as.Date(camCD$DoB,format="%d.%m.%y") - 100*365
> But that doesn't seem to work out correctly. Any ideas how to do this?

Normally to adjust dates you can use as.difftime() and do arithmetic, but a year is a variable thing (can be 365 or 366 days) so you can't make a difftime of years. Days are variable things if you worry about leap seconds...

Also, you could end up with an invalid date if you have 29-Feb-2000 and 29-Feb-1900. One wasn't a leap year...

A solution minus those caveats is to convert to POSIXlt and adjust the $year element:

 > dob="27.02.37"
 > as.Date(dob,format="%d.%m.%y")
 [1] "2037-02-27"
 > dobp = as.POSIXlt(as.Date(dob,format="%d.%m.%y"))
 > dobp$year = dobp$year - 100
 > dobp
 [1] "1937-02-27 UTC"
 > as.Date(dobp)
 [1] "1937-02-27"

although it might be easier to paste a '19' into your character variable

 > paste(substr(dob,1,6),"19",substr(dob,7,9),sep="")
 [1] "27.02.1937"

and then do it the way you started. Assumes you have leading zeroes on all fields though.

Barry

------------------------------

Message: 39
Date: Fri, 10 Dec 2010 17:20:00 +0100
From: Jinyan Huang <jinyan.fr at gmail.com>
To: profaar <profaar at live.com>
Cc: r-help at r-project.org
Subject: Re: [R] help requested
Message-ID: <AANLkTi=e+XhReMP8p+LaLG3Mibwjq_bzkhEBvGVAjRVN at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

awk '{arr[$1]=arr[$1] " " $2}END{for( i in arr){print i,arr[i]}}' edgelist.txt | sort -k1

On Fri, Dec 10, 2010 at 4:20 PM, profaar <profaar at live.com> wrote:
> 1 2
> 1 3
> 1 4
> 1 5
> 2 3
> 2 4
> 3 2
> 4 1
> 4 3
> 4 5
> 5 2
> 5 4

------------------------------

Message: 40
Date: Fri, 10 Dec 2010 14:23:48 -0200
From: Henrique Dallazuanna <wwwhsd at gmail.com>
To: profaar <profaar at live.com>
Cc: r-help at r-project.org
Subject: Re: [R] help requested
Message-ID: <AANLkTineF+bDByVa0vRP2X8ak29_dUmNT+kq7eSWaE6g at mail.gmail.com>
Content-Type: text/plain

Try this:

> DF
   V1 V2
1   1  2
2   1  3
3   1  4
4   1  5
5   2  3
6   2  4
7   3  2
8   4  1
9   4  3
10  4  5
11  5  2
12  5  4
> aggregate(V2 ~ V1, DF, paste, collapse = ' ')
  V1      V2
1  1 2 3 4 5
2  2     3 4
3  3       2
4  4   1 3 5
5  5     2 4

On Fri, Dec 10, 2010 at 1:20 PM, profaar <profaar at live.com> wrote:
>
> HI friends,
> I have very lengthy graph data in edge list format. I want to convert it
> into node list format.
>
> example:
> EDGE LIST FORMAT
> 1 2
> 1 3
> 1 4
> 1 5
> 2 3
> 2 4
> 3 2
> 4 1
> 4 3
> 4 5
> 5 2
> 5 4
>
> ITS NODE LIST FORMAT SHOULD BE LIKE:
> 1 2 3 4 5
> 2 3 4
> 3 2
> 4 1 3
> 5 2 4
>
> Kindly suggest me which package in R provides the support to do my task.
> Thank u friends in advance.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

------------------------------

Message: 41
Date: Fri, 10 Dec 2010 16:21:29 -0000
From: "Eleni Rapsomaniki" <er339 at medschl.cam.ac.uk>
To: <r-help at r-project.org>
Subject: [R] survreg vs. aftreg (eha) - the relationship between fitted coefficients?
Message-ID: <807B01905E122642ACFED8271250B0C306129AF8 at me-mail1.medlan.cam.ac.uk>
Content-Type: text/plain; charset="iso-8859-1"

Dear R-users,

I need to use the aftreg function in package 'eha' to estimate failure times for left truncated survival data. Apparently, survreg still cannot fit such models. Both functions should be fitting the accelerated failure time (Weibull) model. However, as Göran Broström points out in the help file for aftreg, the parameterisation is different giving rise to different coefficients.
The betas for adjusted covariates are opposite in sign but otherwise identical, whereas the intercept is quite different in a non-obvious way. The log-likelihoods are also similar, but not identical. I would like to find out how I can convert one set of coefficients to the other so as to obtain the same linear predictors using either model. Any ideas?

#the example below uses right-censored data for simplicity (the principle should be the same with left truncation I hope)
library(survival)
library(eha)

# COMPARE coefs between survreg ('survival' pkg) and aftreg ('eha' pkg)
#Fitting NULL models (no covariates) results in (approximately) the same coefs (which is good!)
m1_NULL=survreg(Surv(futime/365, status==1) ~ 1, data=pbcseq)
m2_NULL=aftreg(Surv(futime/365, status==1) ~ 1, data=pbcseq)

c(m1_NULL$coef, 1/m1_NULL$scale)
#--> intercept= 3.878656 , shape = 1.478177
c(m2_NULL$coef[1], exp(m2_NULL$coef[2]))
#--> intercept= 3.878859 , shape=1.478150

# NOW I adjust for covariates
m1=survreg(Surv(futime/365, status==1) ~ chol+stage, data=pbcseq)
m2= aftreg(Surv(futime/365, status==1) ~ chol+stage, data=pbcseq)

### m1 #######
#Coefficients:
# (Intercept)         chol        stage
# 5.944641913 -0.001692574 -0.470861324
#Scale= 0.6416744
#Loglik(model)= -483.9   Loglik(intercept only)= -506.8
#        Chisq= 45.91 on 2 degrees of freedom, p= 1.1e-10
#n=1124 (821 observations deleted due to missingness)

### m2 #######
#Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
#chol              303.777     0.002     1.002     0.000     0.000
#stage               3.298     0.460     1.584     0.119     0.000
#
#log(scale)                    5.029   152.807     0.477     0.000
#log(shape)                    0.467     1.595     0.095     0.000
#
#Events                    92
#Total time at risk        9017
#Max. log. likelihood      -484.31
#LR test statistic         45.0
#Degrees of freedom        2
#Overall p-value           1.64669e-10

Many thanks for any help you may be able to provide.
Eleni Rapsomaniki Research Associate University of Cambridge Institute of Primary and Public Health ------------------------------ Message: 42 Date: Fri, 10 Dec 2010 08:27:33 -0800 (PST) From: Clint Bowman <clint at ecy.wa.gov> To: Barry Rowlingson <b.rowlingson at lancaster.ac.uk> Cc: r-help at stat.math.ethz.ch, Daniel Brewer <daniel.brewer at icr.ac.uk> Subject: Re: [R] Remove 100 years from a date object Message-ID: <alpine.LRH.2.02.1012100824120.8321 at aeolus.ecy.wa.gov> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed" There still may be a problem if the dates go back far enough, e.g., 1909. Is '09' 1909 or 2009? No matter what, you have to decide which values need 1900 added and which need 2000. I'd split the date on the delimiter '.', decide whether to add 1900 or 2000, and then paste them together and then as.Date(). Clint -- Clint Bowman INTERNET: clint at ecy.wa.gov Air Quality Modeler INTERNET: clint at math.utah.edu Department of Ecology VOICE: (360) 407-6815 PO Box 47600 FAX: (360) 407-7534 Olympia, WA 98504-7600 USPS: PO Box 47600, Olympia, WA 98504-7600 Parcels: 300 Desmond Drive, Lacey, WA 98503-1274 On Fri, 10 Dec 2010, Barry Rowlingson wrote:> On Fri, Dec 10, 2010 at 3:27 PM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote: >> Hello, >> >> I have some data that has dates in the form 27.02.37. ?I convert them to >> a date object as follows: >> as.Date(data$date,format="%d.%m.%y") >> >> But this gives me years such as 2037 when I would like them to be 1937. >> ?I thought of trying to take off some time i.e. >> as.Date(camCD$DoB,format="%d.%m.%y") - 100*365 >> But that doesn't seem to work out correctly. ?Any ideas how to do this? > > Normally to adjust dates you can use as.difftime() and do arithmetic, > but a year is a variable thing (can be 365 or 366 days) so you cant > make a difftime of years. Days are variable things if you worry about > leap seconds... 
> > Also, you could end up with an invalid date if you have 29-Feb-2000 > and 29-Feb-1900. One wasn't a leap year... > > A solution minus those caveats is to convert to POSIXlt and adjust the > $year element: > > > dob="27.02.37" > > as.Date(dob,format="%d.%m.%y") > [1] "2037-02-27" > > dobp = as.POSIXlt(as.Date(dob,format="%d.%m.%y")) > > dobp$year = dobp$year - 100 > > > dobp > [1] "1937-02-27 UTC" > > as.Date(dobp) > [1] "1937-02-27" > > although it might be easier to paste a '19' into your character variable > > > paste(substr(dob,1,6),"19",substr(dob,7,9),sep="") > [1] "27.02.1937" > > and then do it the way you started. Assumes you have leading zeroes on > all fields though. > > Barry > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 43 Date: Fri, 10 Dec 2010 11:38:11 -0500 From: Gabor Grothendieck <ggrothendieck at gmail.com> To: Daniel Brewer <daniel.brewer at icr.ac.uk> Cc: r-help at stat.math.ethz.ch Subject: Re: [R] Remove 100 years from a date object Message-ID: <AANLkTik9sDts866P12KifkC4WWKVu_2qAoGJx4q+OTKF at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 On Fri, Dec 10, 2010 at 10:27 AM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote:> Hello, > > I have some data that has dates in the form 27.02.37. ?I convert them to > a date object as follows: > as.Date(data$date,format="%d.%m.%y") > > But this gives me years such as 2037 when I would like them to be 1937. > ?I thought of trying to take off some time i.e. > as.Date(camCD$DoB,format="%d.%m.%y") - 100*365 > But that doesn't seem to work out correctly. ?Any ideas how to do this? >The easiest is just to use chron dates since it uses a cut.off of 30 by default. 
That is, if yy is less than that then 2000+yy is used and if greater than that then 1900+yy is used. Thus try this:

library(chron)
d <- "27.02.37"
as.Date(dates(d, format = "d.m.y")) # "1937-02-27"
as.Date(d, format = "%d.%m.%y") # "2037-02-27"

Also if that is not good enough and you want a different value for the cut.off then note that the default in chron is to use the year.expand function to expand two digit dates but you can change that via something like this:

options(chron.year.expand =
  function(..., cut.off = 25) year.expand(..., cut.off = cut.off))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

------------------------------

Message: 44
Date: Fri, 10 Dec 2010 11:38:53 -0500
From: Stavros Macrakis <macrakis at alum.mit.edu>
To: r-help <r-help at r-project.org>
Subject: [R] Stricter read.table?
Message-ID: <AANLkTikJm=VtsYcigvcMhVW4ru6nwBN7N32j_qkcQ=0a at mail.gmail.com>
Content-Type: text/plain

read.table gives idiosyncratic results when the input is formatted strangely, for example:

read.table(textConnection("a'b\nc'd\n"),header=FALSE,fill=TRUE,sep="",quote="'")
=> "c'd" "a'b" "c'd"

read.table(textConnection("a'b\nc'd\nf'\n'\n"),header=FALSE,fill=TRUE,sep="",quote="'")
=> "f'" "\na" "b" "c'd" "f'" "\n"

Though read.table doesn't specify the syntax of its input precisely, these results don't seem particularly useful or consistent.

Is there a stricter version of read.table (perhaps in a package) that gives errors or warnings if it finds quotation marks in the middle of fields or encounters other such peculiar situations?
Thanks, -s [[alternative HTML version deleted]] ------------------------------ Message: 45 Date: Fri, 10 Dec 2010 16:40:39 +0000 From: Daniel Brewer <daniel.brewer at icr.ac.uk> To: Barry Rowlingson <b.rowlingson at lancaster.ac.uk> Cc: r-help at stat.math.ethz.ch Subject: Re: [R] Remove 100 years from a date object Message-ID: <4D025807.9030400 at icr.ac.uk> Content-Type: text/plain; charset=ISO-8859-1 On 10/12/2010 4:17 PM, Barry Rowlingson wrote:> On Fri, Dec 10, 2010 at 3:27 PM, Daniel Brewer <daniel.brewer at icr.ac.uk> wrote: >> Hello, >> >> I have some data that has dates in the form 27.02.37. I convert them to >> a date object as follows: >> as.Date(data$date,format="%d.%m.%y") >> >> But this gives me years such as 2037 when I would like them to be 1937. >> I thought of trying to take off some time i.e. >> as.Date(camCD$DoB,format="%d.%m.%y") - 100*365 >> But that doesn't seem to work out correctly. Any ideas how to do this? > > Normally to adjust dates you can use as.difftime() and do arithmetic, > but a year is a variable thing (can be 365 or 366 days) so you cant > make a difftime of years. Days are variable things if you worry about > leap seconds... > > Also, you could end up with an invalid date if you have 29-Feb-2000 > and 29-Feb-1900. One wasn't a leap year... > > A solution minus those caveats is to convert to POSIXlt and adjust the > $year element: > > > dob="27.02.37" > > as.Date(dob,format="%d.%m.%y") > [1] "2037-02-27" > > dobp = as.POSIXlt(as.Date(dob,format="%d.%m.%y")) > > dobp$year = dobp$year - 100 > > > dobp > [1] "1937-02-27 UTC" > > as.Date(dobp) > [1] "1937-02-27" > > although it might be easier to paste a '19' into your character variable > > > paste(substr(dob,1,6),"19",substr(dob,7,9),sep="") > [1] "27.02.1937" > > and then do it the way you started. Assumes you have leading zeroes on > all fields though. > > BarryMany thanks. Thats great Dan -- ************************************************************** Daniel Brewer, Ph.D. 
Institute of Cancer Research Molecular Carcinogenesis Email: daniel.brewer at icr.ac.uk ************************************************************** The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the a...{{dropped:2}} ------------------------------ Message: 46 Date: Fri, 10 Dec 2010 17:53:45 +0100 From: Martin Maechler <maechler at stat.math.ethz.ch> To: Steve Lianoglou <mailinglist.honeypot at gmail.com> Cc: r-help at r-project.org Subject: Re: [R] Perl "cut" equivalent in R Message-ID: <19714.23321.622897.948731 at lynne.math.ethz.ch> Content-Type: text/plain; charset=iso-8859-1>>>>> "SL" == Steve Lianoglou <mailinglist.honeypot at gmail.com> >>>>> on Mon, 6 Dec 2010 14:21:59 -0500 writes:>>> if(FALSE) { stuff your don't want executed ? ? ? ? ?} >>> >>> Switching a block of code off/on with editing a single>> character may be done using 0/1 instead of FALSE/TRUE. SL> Or even F/T Bad Idea: F <- 1 ------------------------------ Message: 47 Date: Fri, 10 Dec 2010 08:53:46 -0800 (PST) From: mathijsdevaan <mathijsdevaan at gmail.com> To: r-help at r-project.org Subject: Re: [R] Projecting data on a world map using long/lat Message-ID: <1292000026484-3082305.post at n4.nabble.com> Content-Type: text/plain; charset=us-ascii Thanks for the suggestions, but I am not there yet (I'm a real novice). In the code provided by Patrick (see below), I changed the shape input (from sids to world) which I downloaded here: http://thematicmapping.org/downloads/world_borders.php. 
As a result I also need to change the "CNTY_ID" and "id" in the code, but I have no idea what [[elided Yahoo spam]] Mathijs library(maptools) library(ggplot2) gpclibPermit() myshp<- readShapeSpatial(system.file("shapes/sids.shp", package="maptools")) ## see licence, not GPL myshp.points<- fortify.SpatialPolygonsDataFrame(myshp, region="CNTY_ID") shpm<- merge(myshp.points, myshp, by.x="id", by.y="CNTY_ID") head(shpm) p <- ggplot(shpm, aes(long, lat, group=group, fill=NWBIR74)) p <- p + geom_polygon() + geom_path(color="white") + coord_equal() ## Add some locations cities <- read.table(textConnection(" long lat city val -78.644722 35.818889 Raleigh 323 -80.843333 35.226944 Charlotte 510 -82.555833 35.58 Asheville 400"), header = TRUE) p <- p + geom_point(aes( fill=NULL, group = NULL, size=val), data = cities, color= 'black') p -- View this message in context: http://r.789695.n4.nabble.com/Projecting-data-on-a-world-map-using-long-lat-tp3081298p3082305.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 48 Date: Fri, 10 Dec 2010 09:05:24 -0800 From: "William Dunlap" <wdunlap at tibco.com> To: "Martin Maechler" <maechler at stat.math.ethz.ch>, "Steve Lianoglou" <mailinglist.honeypot at gmail.com> Cc: r-help at r-project.org Subject: Re: [R] Perl "cut" equivalent in R Message-ID: <77EB52C6DD32BA4D87471DCD70C8D70003B9AB1D at NA-PA-VBE03.na.tibco.com> Content-Type: text/plain; charset="iso-8859-1"> -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Maechler > Sent: Friday, December 10, 2010 8:54 AM > To: Steve Lianoglou > Cc: r-help at r-project.org > Subject: Re: [R] Perl "cut" equivalent in R > > >>>>> "SL" == Steve Lianoglou <mailinglist.honeypot at gmail.com> > >>>>> on Mon, 6 Dec 2010 14:21:59 -0500 writes: > > >>> if(FALSE) { stuff your don't want executed ? ? ? ? 
?} > >>> > >> > > Switching a block of code off/on with editing a single > >> character may be done using 0/1 instead of FALSE/TRUE. > > SL> Or even F/T > > Bad Idea: > > F <- 1Another approach is to write the following function dontRun <- function(expr) {} and replace that if (FALSE) { ... questionable code ... } with dontRun( {... questionable code ...} ) If you do want the questionable code to run, redefine dontRun to be dontRun <- function(expr) { expr } You can use this approach to put assertion tests into your code that only get run when the assertion function is defined to do something. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com> > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 49 Date: Fri, 10 Dec 2010 12:18:03 -0500 From: "John Fox" <jfox at mcmaster.ca> To: <r-help at r-project.org> Subject: [R] new edition of R Companion to Applied Regression Message-ID: <005f01cb988e$345736e0$9d05a4a0$@ca> Content-Type: text/plain; charset="us-ascii" Dear all, Sandy Weisberg and I would like to announce the publication of the second edition of An R Companion to Applied Regression (Sage, 2011). As is immediately clear, the book now has two authors and S-PLUS is gone from the title (and the book). The R Companion has also been thoroughly rewritten, covering developments in the nearly 10 years since the first edition was written and expanding coverage of topics such as R graphics and R programming. As before, however, the R Companion provides a general introduction to R in the context of applied regression analysis, broadly construed. 
It is available from the publisher at <http://www.sagepub.com/books/Book233899?> (US) or <http://www.uk.sagepub.com/books/Book233899?> (UK), and from Amazon at <http://www.amazon.ca/R-Companion-Applied-Regression/dp/141297514X/ref=sr_1_ 3?s=books&ie=UTF8&qid=1291995545&sr=1-3>. The book is augmented by a web site <http://socserv.mcmaster.ca/jfox/Books/Companion/> with data sets, appendices on a variety of topics, and more, and it associated with the car package on CRAN, which has recently undergone an overhaul. Regards, John and Sandy -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox ------------------------------ Message: 50 Date: Fri, 10 Dec 2010 12:55:32 -0500 From: Mike Marchywka <marchywka at hotmail.com> To: <jinyan.fr at gmail.com>, <profaar at live.com> Cc: r-help at r-project.org Subject: Re: [R] help requested Message-ID: <BLU113-W7141585491F2E4027F5C6BE2F0 at phx.gbl> Content-Type: text/plain; charset="iso-8859-1" ----------------------------------------> From: jinyan.fr at gmail.com > Date: Fri, 10 Dec 2010 17:20:00 +0100 > To: profaar at live.com > CC: r-help at r-project.org > Subject: Re: [R] help requested > > awk '{arr[$1]=arr[$1] " " $2}END{for( i in arr){print i,arr[i]}}' > edgelist.txt | sort -k1My first thought PERL hash but I guess my answer would still be to consider any R hash-like structures. 
I guess any array that accepts arbitrary subscripts amounts to a hash.> > > > On Fri, Dec 10, 2010 at 4:20 PM, profaar wrote: > > 1 2 > > 1 3 > > 1 4 > > 1 5 > > 2 3 > > 2 4 > > 3 2 > > 4 1 > > 4 3 > > 4 5 > > 5 2 > > 5 4 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.------------------------------ Message: 51 Date: Fri, 10 Dec 2010 13:09:10 -0500 From: Duncan Murdoch <murdoch.duncan at gmail.com> To: William Dunlap <wdunlap at tibco.com> Cc: r-help at r-project.org, Martin Maechler <maechler at stat.math.ethz.ch> Subject: Re: [R] Perl "cut" equivalent in R Message-ID: <4D026CC6.8010804 at gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 10/12/2010 12:05 PM, William Dunlap wrote:> > -----Original Message----- > > From: r-help-bounces at r-project.org > > [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Maechler > > Sent: Friday, December 10, 2010 8:54 AM > > To: Steve Lianoglou > > Cc: r-help at r-project.org > > Subject: Re: [R] Perl "cut" equivalent in R > > > > >>>>> "SL" == Steve Lianoglou<mailinglist.honeypot at gmail.com> > > >>>>> on Mon, 6 Dec 2010 14:21:59 -0500 writes: > > > > >>> if(FALSE) { stuff your don't want executed } > > >>> > > >> > > > Switching a block of code off/on with editing a single > > >> character may be done using 0/1 instead of FALSE/TRUE. > > > > SL> Or even F/T > > > > Bad Idea: > > > > F<- 1 > > Another approach is to write the following function > dontRun<- function(expr) {} > and replace that > if (FALSE) { ... questionable code ... } > with > dontRun( {... 
questionable code ...} ) > If you do want the questionable code to run, > redefine dontRun to be > dontRun<- function(expr) { expr } > > You can use this approach to put assertion tests > into your code that only get run when the assertion > function is defined to do something.That's a nice idea! Duncan Murdoch ------------------------------ Message: 52 Date: Fri, 10 Dec 2010 10:13:00 -0800 From: Peter Ehlers <ehlers at ucalgary.ca> To: Francesco Nutini <nutini.francesco at gmail.com> Cc: "\[R\] help" <r-help at r-project.org> Subject: Re: [R] [r] overlap different line in a xyplot (lattice) Message-ID: <4D026DAC.6070701 at ucalgary.ca> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 2010-12-10 07:04, Francesco Nutini wrote:> > dear [R] users, > is there a way to plot different data (but with the same x-variables) in the same xyplot window? > There are already a similar question, but the answer is not enought explanatory...Something like this? x <- rep(1:10, 2) y1 <- rnorm(10); y2 <- rnorm(10) + 2 y <- c(y1, y2) g <- gl(2, 10) xyplot( y ~ x, groups = g, type = 'b') Peter Ehlers> > > Thanks a lot, > Francesco >------------------------------ Message: 53 Date: Fri, 10 Dec 2010 13:13:32 -0500 From: Anthony Damico <ajdamico at gmail.com> To: r-help at r-project.org Subject: [R] Could concurrent R sessions mix up variables? Message-ID: <AANLkTi=mqRZqf1zSkg1L=xmEDub_VeiN+9pYQ=nDAak_ at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi, I'm working in R 2.11.1 x64 on Windows x86_64-pc-mingw32. I'm experiencing a strange problem in R that I'm not even sure how to begin to fix. I've got a huge (forty-pages printed) simulation written in R that I'd like to run multiple times. When I open up R and run it on its own, it works fine. At the beginning of the program, there's a variable X that I set to 1, 5, 10, 20, depending on how sensitive I want the simulation to be to a certain parameter. 
When I just run one instance of R, the X variable stays the same throughout the program. I have a quad-core machine, so I'd like to take advantage of all four processors. If I open up four sessions and set X to 1, 5, 10, and 20 in those different sessions, then run all four simulations all the way through (about eighteen hours of processing time) at the same time, the variable X ends up being 20 at the end of all four sessions. It's as if R mixed up the variable setting between the four concurrent sessions. I can't figure out why else my variable X would ever get changed to 20 in the three simulations that I set it to 1, 5, and 10, respeectively (it doesn't get updated anywhere during the simulation). When I have all four of these simulations running concurrently, I am absolutely maxing out my computer. All four processors are at 100%, and my Windows Task Manager says I'm using almost 100% of my 16 GB of RAM. Is it possible that intense resource use would cause a variable conflict like this? I have no idea where to start troubleshooting this error, so any advice would be appreciated. Thanks! Anthony Damico Kaiser Family Foundation ------------------------------ Message: 54 Date: Fri, 10 Dec 2010 13:18:21 -0500 From: Michael D <mike409 at gmail.com> To: r-help at r-project.org Subject: [R] help with RSQLite adding a new column Message-ID: <AANLkTikL9tqK=RZphOVs8FCPYKCXF2CMAiXfWQtuEsGt at mail.gmail.com> Content-Type: text/plain I'm new to using sql so I'm having difficulties (and worries) in adding a new column of data to a table I have. Its a very large file (around 5 Gb) which is why I'm having to use SQL I have a table with variables ID, IDrec and IDdes and the variables IDrec and IDdes give a mapping of some other values but the other values are associated with the ID variable (think of IDrec and IDdes being character strings and ID being numeric) (Imagine the transposed) Table1: ID: 1,2,3,4,... IDrec: A,B,C,D... IDdes: B,C,A,E... 
So I've created a table with the final form I need it to be in dbGetQuery(db, "CREATE TABLE Map (ID int, IDrec int, IDrec1 int, IDdes int, IDdes1 int)") And the finished table would look something like: Map: ID: 1, 2, 3, 4,... IDrec: 1, 2, 3, 4,... IDrec1: A, B, C, D,... IDdes: 2, 3, 1, 5,.... IDdes1: B, C, A, E,... So I copy in the first set of values easily: dbGetQuery(db, "INSERT INTO Map(ID, IDrec, IDrec1, IDdes1) SELECT ID, ID, IDrec, IDdes FROM Ntemp") Giving me a table that looks like: Map: ID: 1, 2, 3, 4,... IDrec: 1, 2, 3, 4,... IDrec1: A, B, C, D,... IDdes: NA,NA,NA,NA,... IDdes1: B, C, A, E,... Then I create a new table with just the IDdes values I need: dbGetQuery(db, "Create table temp2 as SELECT temp.ID FROM Ntemp, temp WHERE Ntemp.IDdes1 = temp.IDrec1") Giving me temp2 (not sure what the variable name is) V1: 2, 3, 1, 5,... But when I try to copy in the new data: dbGetQuery(db, "INSERT INTO Map(IDdes) SELECT * FROM temp2") My map table isn't updated: Map: ID: 1, 2, 3, 4,... IDrec: 1, 2, 3, 4,... IDrec1: A, B, C, D,... IDdes: NA,NA,NA,NA,... IDdes1: B, C, A, E,... Is there something I'm missing? Or am I just going about inserting the IDdes variables the wrong way? Thanks for the help. Michael [[alternative HTML version deleted]] ------------------------------ Message: 55 Date: Fri, 10 Dec 2010 19:31:04 +0100 From: Petr Savicky <savicky at cs.cas.cz> To: r-help at r-project.org Subject: Re: [R] help requested Message-ID: <20101210183104.GA24654 at cs.cas.cz> Content-Type: text/plain; charset=utf-8 On Fri, Dec 10, 2010 at 07:20:55AM -0800, profaar wrote:> > HI friends, > I have very lengthy graph data in edge list format. I want to convert it > into node list format. 
> > example: > EDGE LIST FORMAT > 1 2 > 1 3 > 1 4 > 1 5 > 2 3 > 2 4 > 3 2 > 4 1 > 4 3 > 4 5 > 5 2 > 5 4 > > ITS NODE LIST FORMAT SHOULD BE LIKE: > 1 2 3 4 5 > 2 3 4 > 3 2 > 4 1 3 > 5 2 4 > > Kindly suggest me which package in R provides the support to do my task.How long the list of egdes is? For not too large lists, consider also library(graph) G <- new("graphNEL", edgemode="directed") G <- addNode(as.character(1:5), G) edges <- read.table(file=stdin(), colClasses="character") 1 2 1 3 1 4 1 5 2 3 2 4 3 2 4 1 4 3 4 5 5 2 5 4 G <- addEdge(from=edges[, 1], to=edges[, 2], G) edgeL(G) $`1` $`1`$edges [1] 2 3 4 5 $`2` $`2`$edges [1] 3 4 $`3` $`3`$edges [1] 2 $`4` $`4`$edges [1] 1 3 5 $`5` $`5`$edges [1] 2 4 Very large lists can be handled by unix/linux sort command (if not sorted already) and by extracting the end-nodes of the edges starting in each node. In a sorted file, they form blocks of consecutive lines, so a simple text processing with perl is sufficient. Petr Savicky. ------------------------------ Message: 56 Date: Fri, 10 Dec 2010 13:41:38 -0500 From: David Winsemius <dwinsemius at comcast.net> To: Simon Kiss <simonjkiss at yahoo.ca> Cc: r-help at r-project.org Subject: Re: [R] 45 Degree labels on barplot? Help understanding code previously posted. Message-ID: <F3071826-3CCB-4396-B85D-7BCADF1D3FEE at comcast.net> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes On Dec 10, 2010, at 10:25 AM, Simon Kiss wrote:> Dear colleagues, > i found a line or two of code in the help archives from Uwe Ligges > about creating slanted x-labels for a barplot and it works well for > my purposes (code below). However, I was hoping someone could > explain to me precisely what the code is doing. > I'm aware it's invoking the text command, and I know the first ttwo > arguments to text are x and y co-ordinates. 
I'm also aware that > par("usr")[3] is grabbing the third element of the vector of > plotting co-ordinates.More accurately the limits of the plot area in plot dimensions.> But I tried replacing par("usr")[3] with just "0" and that didn't > work; all the labels got bunched up on the left.That was the "y" argument, not the rotation argument. (Which means I am surprised that it bunched things to the side ... and for me it did nothing at all... same graphic.) It is the srt argument that controls the angle.> Is it necessary to create a new object via "barplot"That gives you appropriate positions for the labels in plot coordinate terms and the xpd argument allows these locations to be used outside the plot area.> and then quote that in the x,y coordinates of text?What do you mean by "then quote it in the x,y,coordinates"? I don't see any quotes. You could of course just look at the plot area and supply your own locations. You would need to figure out what the unlabeled x-axis scale really was, but that too is documented. -- David.> Like I said, the code works great, but I'm trying to actually > understand the rationale behind the elements so I can apply it in > future. 
> Yours, Simon Kiss > > #Reproducible Code > mydat<-data.frame(countries=c("Canada", "Denmark", "Framce", "United > Kingdom", "Germany", "Australia", "New Zealand", "Switzerland", > "Belgium", "Netherlands"), stories_total=c(429, 25, > 239, 99, 100, 96, 18, 21, 0, 6), avg=c(4.165048544, 6.25, > 6.459459459, 0.908256881, 1.923076923, 1.103448276, 1.058823529, > 1.615384615, 0, 0.107142857), steps=c(2, 2, 2, 0,1, 1, 1, 0,0,0), > newspapers=c(103, 4, 37, 109, 52, 87, 17, 13, 10, 56)) > mydat.sort1<-mydat[order(-mydat$avg), ] > myplot<-barplot(mydat.sort1$avg, col=c("black", "black", "black", > "grey", "white", "grey", "grey", "white", "white", "white"), > ylim=c(0,7), main="Regulatory Action On Bisphenol A By Newspaper > Coverage") > col.vec=c("black", "grey", "white") > legend("topright", col=col.vec, fill=c("black", "grey", "white"), > legend=c("Meaningful Ban", "Recommendations To Withdraw", "No > Legislative Action")) > labels=mydat.sort1$countries > #These lines create the labels > text(myplot, par("usr")[3], labels=labels, srt=35, offset=1, adj=1, > xpd=TRUE) > axis(2) > par("usr")[3] > > ********************************* > Simon J. Kiss, PhD > Assistant Professor, Wilfrid Laurier University > 73 George Street > Brantford, Ontario, Canada > N3T 2C9 > Cell: +1 519 761 7606 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius, MD West Hartford, CT ------------------------------ Message: 57 Date: Fri, 10 Dec 2010 13:50:45 -0500 From: Duncan Murdoch <murdoch.duncan at gmail.com> To: Anthony Damico <ajdamico at gmail.com> Cc: r-help at r-project.org Subject: Re: [R] Could concurrent R sessions mix up variables? 
Message-ID: <4D027685.5050700 at gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 10/12/2010 1:13 PM, Anthony Damico wrote:> Hi, I'm working in R 2.11.1 x64 on Windows x86_64-pc-mingw32. > > I'm experiencing a strange problem in R that I'm not even sure how to > begin to fix. > > I've got a huge (forty-pages printed) simulation written in R that I'd > like to run multiple times. When I open up R and run it on its own, > it works fine. At the beginning of the program, there's a variable X > that I set to 1, 5, 10, 20, depending on how sensitive I want the > simulation to be to a certain parameter. When I just run one instance > of R, the X variable stays the same throughout the program. > > I have a quad-core machine, so I'd like to take advantage of all four > processors. > > If I open up four sessions and set X to 1, 5, 10, and 20 in those > different sessions, then run all four simulations all the way through > (about eighteen hours of processing time) at the same time, the > variable X ends up being 20 at the end of all four sessions. It's as > if R mixed up the variable setting between the four concurrent > sessions. I can't figure out why else my variable X would ever get > changed to 20 in the three simulations that I set it to 1, 5, and 10, > respeectively (it doesn't get updated anywhere during the simulation). > > When I have all four of these simulations running concurrently, I am > absolutely maxing out my computer. All four processors are at 100%, > and my Windows Task Manager says I'm using almost 100% of my 16 GB of > RAM. Is it possible that intense resource use would cause a variable > conflict like this? I have no idea where to start troubleshooting > this error, so any advice would be appreciated.If you are running something that takes 18 hours to complete, a common practice is to save intermediate results to disk occasionally. Have you (or whoever wrote the simulation) done this and forgotten about it? 
If all 4 processes are saving to the same place, then reading results back, you'd see something like you describe. If all calculations are held in memory, you shouldn't.

A simple approach that might debug this is to create a new variable initX, set equal to X. Then sprinkle statements

  stopifnot(X == initX)

through your simulation code. That should quit when the change happens, and you can try to figure out why it happened.

Duncan Murdoch

------------------------------

Message: 58
Date: Fri, 10 Dec 2010 10:58:01 -0800 (PST)
From: Phil Spector <spector at stat.berkeley.edu>
To: Anthony Damico <ajdamico at gmail.com>
Cc: r-help at r-project.org
Subject: Re: [R] Could concurrent R sessions mix up variables?
Message-ID: <alpine.DEB.2.00.1012101047210.20652 at springer.Berkeley.EDU>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

Anthony -
   I would advise you to use the multicore or snowfall packages to utilize multiple CPUs. As an example using multicore:

> library(multicore)
> sim = function(mu)max(replicate(100000,max(rnorm(100,mu))))
> unlist(mclapply(c(1,5,10,20),sim))
[1]  6.569332 10.268091 15.335847 25.291502

Using snowfall:

> library(snowfall)
> sim = function(mu)max(replicate(100000,max(rnorm(100,mu))))
> sfInit(cpus=4,type='SOCK',parallel=TRUE)
> sfSapply(c(1,5,10,20),sim)
[1]  6.200161 10.307807 15.271581 25.055950

Hope this helps.
                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spector at stat.berkeley.edu

On Fri, 10 Dec 2010, Anthony Damico wrote:

> Hi, I'm working in R 2.11.1 x64 on Windows x86_64-pc-mingw32.
>
> I'm experiencing a strange problem in R that I'm not even sure how to
> begin to fix.
>
> I've got a huge (forty-pages printed) simulation written in R that I'd
> like to run multiple times. When I open up R and run it on its own,
> it works fine. At the beginning of the program, there's a variable X
> that I set to 1, 5, 10, 20, depending on how sensitive I want the
> simulation to be to a certain parameter. When I just run one instance
> of R, the X variable stays the same throughout the program.
>
> I have a quad-core machine, so I'd like to take advantage of all four
> processors.
>
> If I open up four sessions and set X to 1, 5, 10, and 20 in those
> different sessions, then run all four simulations all the way through
> (about eighteen hours of processing time) at the same time, the
> variable X ends up being 20 at the end of all four sessions. It's as
> if R mixed up the variable setting between the four concurrent
> sessions. I can't figure out why else my variable X would ever get
> changed to 20 in the three simulations that I set it to 1, 5, and 10,
> respectively (it doesn't get updated anywhere during the simulation).
>
> When I have all four of these simulations running concurrently, I am
> absolutely maxing out my computer. All four processors are at 100%,
> and my Windows Task Manager says I'm using almost 100% of my 16 GB of
> RAM. Is it possible that intense resource use would cause a variable
> conflict like this? I have no idea where to start troubleshooting
> this error, so any advice would be appreciated.
>
> Thanks!
>
> Anthony Damico
> Kaiser Family Foundation
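
A minimal sketch of the same idea using the parallel package, which is bundled with R releases later than the 2.11.1 mentioned in this thread (that is an assumption about the reader's setup, not something the posters used). It starts a socket cluster, so it should also run on Windows. The worker count of 2, the seed, and the reduced replicate count are arbitrary choices made only to keep the example quick.

```r
library(parallel)

# Same toy simulation as in the message above, with fewer replicates
# (1000 is an arbitrary choice) so the example finishes quickly.
sim <- function(mu) max(replicate(1000, max(rnorm(100, mu))))

cl <- makeCluster(2)                 # two socket workers; adjust to your machine
clusterSetRNGStream(cl, iseed = 1)   # separate, reproducible RNG stream per worker
res <- parSapply(cl, c(1, 5, 10, 20), sim)
stopCluster(cl)

res  # one maximum per value of mu
```

Each worker is a separate R process with its own workspace, and the parameter is passed as a function argument rather than set as a global, so one run's value cannot be overwritten by another.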

------------------------------

Message: 59
Date: Fri, 10 Dec 2010 20:07:40 +0100
From: Andreas Wittmann <andreas_wittmann at gmx.de>
To: r-help <r-help at r-project.org>
Subject: [R] survival package - calculating probability to survive a given time
Message-ID: <4D027A7C.2020804 at gmx.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Dear R users,

I am trying to calculate the probability of surviving a given time using the survival curve estimated by Kaplan-Meier.

What is the right way to do that? As far as I can see, I cannot use the predict methods from the survival package.

  library(survival)
  set.seed(1)
  time <- cumsum(rexp(1000)/10)
  status <- rbinom(1000, 1, 0.5)

  ## kaplan meier estimates
  fit <- survfit(Surv(time, status) ~ 1)
  s <- summary(fit)

  ## 1. possibility to get the probability for surviving 20 units of time
  ind <- findInterval(20, s$time)
  cbind(s$surv[ind], s$time[ind])

  ## 2. possibility to get the probability for surviving 20 units of time
  ind <- s$time >= 20
  sum(ind) / length(ind)

Thanks and best regards

Andreas

------------------------------

Message: 60
Date: Fri, 10 Dec 2010 11:16:26 -0800
From: Peter Ehlers <ehlers at ucalgary.ca>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Subject: Re: [R] Compare one level of a factor with *all* other non-missing levels
Message-ID: <4D027C8A.8080902 at ucalgary.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2010-12-10 05:58, deriK2000 wrote:
>
> Peter Ehlers wrote:
>>
>> Sounds like you want the Dunnett test procedure which seems
>> to be implemented in a number of packages: multcomp, asd, MCPAN
>> and others.
>>
>> It would probably be a good idea to install package 'sos' and
>> learn how to search with it.
>>
>> Peter Ehlers
>>
>
> [[elided Yahoo spam]]
>
> Unfortunately, Dunnett compares the mean(x) for a factor level with the
> means(x) of all single K-1 other levels resulting in K-1 comparisons for
> each level (printed in a lower triangular matrix for the results). Instead,
> I just want to compare this one mean(x) with one other mean(x) of all the
> K-1 other levels (printed in a vector of length K for the results).
>

Okay, I misunderstood; should have read more carefully. I would just use a loop (I'm not as loop-averse as some R users).

  x <- rnorm(20)
  f <- gl(4, 5, labels = letters[1:4])
  lev <- levels(f)
  len <- length(lev)
  pv <- numeric(len)
  for(i in 1:len){
    pv[i] <- t.test(x[f == lev[i]], x[f != lev[i]])$p.value
  }
  pv

For pvalue adjustment (if you think that's needed), see ?p.adjust.

[[elided Yahoo spam]]

Yes, it's an excellent tool.

Peter Ehlers

>
> Cheers,
>
> Derik

------------------------------

Message: 61
Date: Fri, 10 Dec 2010 15:11:47 -0500
From: David Winsemius <dwinsemius at comcast.net>
To: Andreas Wittmann <andreas_wittmann at gmx.de>
Cc: r-help <r-help at r-project.org>
Subject: Re: [R] survival package - calculating probability to survive a given time
Message-ID: <FBE2BF70-878C-46FE-BFCE-19BAFAC3107C at comcast.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

On Dec 10, 2010, at 2:07 PM, Andreas Wittmann wrote:

> Dear R users,
>
> I am trying to calculate the probability of surviving a given time
> using the survival curve estimated by Kaplan-Meier.
>
> What is the right way to do that? As far as I can see, I cannot use
> the predict methods from the survival package.
>
> library(survival)
> set.seed(1)
> time <- cumsum(rexp(1000)/10)
> status <- rbinom(1000, 1, 0.5)
>
> ## kaplan meier estimates
> fit <- survfit(Surv(time, status) ~ 1)
> s <- summary(fit)
>
> ## 1. possibility to get the probability for surviving 20 units of time
> ind <- findInterval(20, s$time)
> cbind(s$surv[ind], s$time[ind])

See if this helps:

  > head(which(s$surv < 0.5))
  [1] 368 369 370 371 372 373
  > plot(fit)
  > abline(h=0.5)
  > abline(v=s$time[368])

>
> ## 2. possibility to get the probability for surviving 20 units of time
> ind <- s$time >= 20
> sum(ind) / length(ind)

--
David Winsemius, MD
West Hartford, CT

------------------------------

Message: 62
Date: Fri, 10 Dec 2010 12:14:04 -0800
From: Rob James <rob at aetiologic.ca>
To: r-help at r-project.org
Subject: [R] Reorder factor and address embedded escapes
Message-ID: <4D028A0C.3010801 at aetiologic.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I am trying to reorder a factor variable that has embedded escape characters. The data begins as a csv file with a factor that includes embedded new line characters. By the time read.table has rendered it into a data frame, the variable has an extra backslash.

e.g.
"This\nLabel" in the csv becomes "This\\nLabel" in the data frame.

So, I am trying to reorder the factor and deal with the introduction of a second \ . Here's a small example:

  A = c("A\\nB", "C\\nD")
  test <-data.frame(A)
  str(test)
  test$reorderA <-factor(test$A, c("C\\nD", "A\\nB"))
  str(test)
  test$reorderB <-sub("\\\\n", "\n", test$reorderA)
  str(test)

When sub is applied to the now-correctly ordered factor, it returns to the default ordering.

Suggestions?

------------------------------

Message: 63
Date: Fri, 10 Dec 2010 12:16:23 -0800 (PST)
From: dreadgazebo <ernie.tedeschi at gmail.com>
To: r-help at r-project.org
Subject: [R] Time Series Row Label
Message-ID: <1292012183267-3082639.post at n4.nabble.com>
Content-Type: text/plain; charset=us-ascii

Simple question...

I know that when referencing data in a multivariate time series matrix, I can use the variable name instead of the column number (such as budget.ts[4,"incometax"]). Is there a way I can use the time unit (say, the year in an annual time series) instead of the row number?

--
View this message in context: http://r.789695.n4.nabble.com/Time-Series-Row-Label-tp3082639p3082639.html
Sent from the R help mailing list archive at Nabble.com.
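
One possible way to do this, sketched with made-up data: the names budget.ts and incometax come from the question above, while the numbers and the salestax column are invented purely for illustration. window() selects rows by time rather than by position, and time() exposes the time stamps if a plain row index is preferred.

```r
# Hypothetical annual series starting in 2000; all values are invented.
budget.ts <- ts(cbind(incometax = c(100, 110, 120, 130, 140),
                      salestax  = c(50, 52, 54, 56, 58)),
                start = 2000)

# Select the row for 2003 by time instead of by row number.
row_2003 <- window(budget.ts, start = 2003, end = 2003)

# Alternatively, turn the time stamps into a logical row index.
# Exact comparison is fine for whole years; for sub-annual data a
# tolerance-based comparison would be safer.
yrs <- as.numeric(time(budget.ts))
val <- budget.ts[yrs == 2003, "incometax"]
```

Here row_2003 holds both columns for 2003, and val is the single incometax entry for that year, the same element that budget.ts[4, "incometax"] would return.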

------------------------------

Message: 64
Date: Fri, 10 Dec 2010 14:20:33 -0600
From: Erik Iverson <eriki at ccbr.umn.edu>
To: Rob James <rob at aetiologic.ca>
Cc: r-help at r-project.org
Subject: Re: [R] Reorder factor and address embedded escapes
Message-ID: <4D028B91.5050408 at ccbr.umn.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Does the following help?

  A = c("A\\nB", "C\\nD")
  test <-data.frame(A)

  #access levels directly to change names
  levels(test$A) <- sub("\\\\n", "\n", levels(test$A))

  #re-order levels of the factor
  test$A <- relevel(test$A, "C\nD")

Rob James wrote:
> I am trying to reorder a factor variable that has embedded escape
> characters. The data begins as a csv file with a factor that includes
> embedded new line characters. By the time read.table has rendered it
> into a data frame, the variable now has an extra backslash.
>
> e.g.
> "This\nLabel" in the csv becomes "This\\nLabel" in the data frame.
>
> So, I am trying to reorder the factor and deal with the introduction of
> a secondary \ . Here's a small example:
>
> A = c("A\\nB", "C\\nD")
> test <-data.frame(A)
> str(test)
> test$reorderA <-factor(test$A, c("C\\nD", "A\\nB"))
> str(test)
> test$reorderB <-sub("\\\\n", "\n", test$reorderA)
> str(test)
>
> When sub is applied to the now-correctly ordered factor, it returns to
> the default ordering.
>
> Suggestions?

------------------------------

Message: 65
Date: Fri, 10 Dec 2010 15:31:21 -0500
From: David Winsemius <dwinsemius at comcast.net>
To: Rob James <rob at aetiologic.ca>
Cc: r-help at r-project.org
Subject: Re: [R] Reorder factor and address embedded escapes
Message-ID: <A25DDB11-1027-47F8-89C2-41AB1D7DA30B at comcast.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes

On Dec 10, 2010, at 3:14 PM, Rob James wrote:

> I am trying to reorder a factor variable that has embedded escape
> characters. The data begins as a csv file with a factor that
> includes embedded new line characters. By the time read.table has
> rendered it into a data frame, the variable now has an extra
> backslash.
>
> e.g.
> "This\nLabel" in the csv becomes "This\\nLabel" in the data frame.
>
> So, I am trying to reorder the factor and deal with the introduction
> of a secondary \ . Here's a small example:
>
> A = c("A\\nB", "C\\nD")
> test <-data.frame(A)
> str(test)
> test$reorderA <-factor(test$A, c("C\\nD", "A\\nB"))
> str(test)
> test$reorderB <-sub("\\\\n", "\n", test$reorderA)
> str(test)
>
> When sub is applied to the now-correctly ordered factor, it returns
> to the default ordering.

Actually it returns a character vector rather than a factor.

>
> Suggestions?

Perhaps you should ask a question.
-- David Winsemius, MD West Hartford, CT ------------------------------ Message: 66 Date: Fri, 10 Dec 2010 22:34:23 +0200 From: Tal Galili <tal.galili at gmail.com> To: Matt Shotwell <shotwelm at musc.edu> Cc: "r-help at r-project.org" <r-help at r-project.org> Subject: Re: [R] Encoding problem - I fails to read Hebrew text from online Message-ID: <AANLkTikb1HeF+5LwWqBkvav59eZLkpHx6=T6eR_sygBL at mail.gmail.com> Content-Type: text/plain Hi Matt and everyone else, Thanks for the help so far. I ended up using the tips provided to create a "dirty hack" based on a translation table between the code and the Hebrew letters. For the future (and for any suggestions), I am attaching this code bellow: Best, Tal # the translation table: translation.table.Hebrew <- structure(list(V1 = structure(1:27, .Label c("05D0", "05D1", "05D2", "05D3", "05D4", "05D5", "05D6", "05D7", "05D8", "05D9", "05DA", "05DB", "05DC", "05DD", "05DE", "05DF", "05E0", "05E1", "05E2", "05E3", "05E4", "05E5", "05E6", "05E7", "05E8", "05E9", "05EA"), class = "factor"), V2 = structure(1:27, .Label = c("??", "?'", "?'", "?"", "?"", "?.", "?-", "?-", "?~", "?T", "?s", "?>", "?o", "??", "?z", "?Y", "??", "??", "??", "??", "??", "??", "??", "??", "??", "??", "??" ), class = "factor")), .Names = c("CODE", "HEBREW"), class = "data.frame", row.names = c(NA, -27L)) # translation.table # STRING = inp turn_nohash <- function(STRING) { require(stringr) nohash <- str_replace(STRING, "#", "0") # cvrt # to 0 nohash <- str_replace(nohash, ";", "") # cvrt # to 0 nohash <- str_replace(nohash, "&", "") # cvrt # to 0 nohash <- str_replace(nohash, "x", "") # cvrt # to 0 return(nohash) } translate.all.chars <- function(STRING, TABLE = translation.table.Hebrew) { # TABLE is of the form: # CODE HEBREW # 1 05D0 ?? # 2 05D1 ?' # 3 05D2 ?' 
require(stringr) i.chars.to.check <- seq_len(dim(TABLE)[1]) for(i in i.chars.to.check) { STRING <- str_replace(STRING, as.character(TABLE[i,1]), as.character(TABLE[i,2])) } return(STRING) } HTML_heb_decode <- function(STRING, TABLE = translation.table.Hebrew) { STRING <- turn_nohash(STRING) STRING <- translate.all.chars(STRING, TABLE) return(STRING) } # example of use: inp <- "שלום" HTML_heb_decode(inp) inp <- "שלום חנוך\ " HTML_heb_decode(inp) ourput:> HTML_heb_decode(inp)Loading required package: stringr Loading required package: plyr [1] "???o?.??"> inp <- "שלום חנוך\ " > HTML_heb_decode(inp)[1] "???o?.?? ?-???.?s " ----------------Contact Details:------------------------------------------------------- Contact me: Tal.Galili at gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) ---------------------------------------------------------------------------------------------- On Fri, Dec 10, 2010 at 12:00 AM, Matt Shotwell <shotwelm at musc.edu> wrote:> Tal, > > OK, let me clarify my understanding. The original and decoded file are > text, encoded by UTF-8. In the original file, there are HTML `entities' > that represent UTF-8 Hebrew characters. In the decoded file, the > entities are converted to UTF-8 characters. The question is how to > convert these entities within R. It's not the same as converting between > character encodings, otherwise iconv() might offer a solution. > > I'll have a look around to find a solution, and I hope others will too. > My first idea is to check RCurl, XML, and the related utils::URLdecode. > If there really is no existing solution, I think it might be worthwhile > to look at how PHP and Python do it (and maybe borrow some code :) ). > > -Matt > > > On Thu, 2010-12-09 at 14:27 -0500, Tal Galili wrote: > > Hi Matt, > > Thanks for having a look at this. > > I just spent some time looking around and couldn't find any R function > > to decode decimal HTML code. 
> > > > > > Do you (or someone else on the list) knows how to program this sort of > > thing? (is there a formula for the translation? > > > > > > > > > > p.s: > > For it to work on my end I added the encoding parameter: > > readLines("http://biostatmatt.com/temp/Hebrew-decoded", warn=FALSE, > > encoding= "UTF-8") > > > > > > p.p.s: The Hebrew word I used means "peace" > > > > > > Cheers, > > Tal > > > > > > ----------------Contact > > Details:------------------------------------------------------- > > Contact me: Tal.Galili at gmail.com | 972-52-7275845 > > Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) > > | www.r-statistics.com (English) > > > ---------------------------------------------------------------------------------------------- > > > > > > > > > > On Thu, Dec 9, 2010 at 8:38 PM, Matt Shotwell <shotwelm at musc.edu> > > wrote: > > Tal, > > > > It looks like the data you received has HTML special hex > > characters. > > That is, 'ש' is just an ASCII HTML representation of a > > hex > > character. It's not encoded in a special manner. > > > > The trick is to substitute the HTML encoded hex character for > > its binary > > representation, or "decode" the character. I don't know of any > > R > > function that does this, but there are web services, for > > example: > > http://www.hashemian.com/tools/html-url-encode-decode.php > > > > I decoded your file using this service and posted it on my > > website. You > > can see the difference by running: > > > > readLines("http://biostatmatt.com/temp/Hebrew-original", > > warn=FALSE) > > > > readLines("http://biostatmatt.com/temp/Hebrew-decoded", > > warn=FALSE) > > > > The second should display the Hebrew characters correctly (it > > does in my > > terminal). The next thing to think about is how to automate > > this in R > > without using the web service... We may need to write an > > HTMLDecode > > function if there isn't one already. > > > > By the way, what's the Hebrew text in English? 
> > > > Best, > > Matt > > > > > > > > On Thu, 2010-12-09 at 12:21 -0500, Tal Galili wrote: > > > I am bumping this question in the hopes that someone might > > be able to > > > advise. > > > This Hebrew and R business is not as smooth as I had > > hoped... > > > > > > Thanks, > > > Tal > > > > > > Older massage: > > > > > > On Tue, Dec 7, 2010 at 2:30 PM, Tal Galili > > <tal.galili at gmail.com> wrote: > > > > > > > Hello all, > > > > > > > > # I am trying to read the text in this URL: > > > > u <- > > > > http://google.com/complete/search?output=toolbar&q=%d7%a9% > > d7%9c%d7%95%d7%9d > > > > # By using this command: > > > > readLines(u) > > > > > > > > And no matter what variation I tried, I keep getting this > > output: > > > > [1] "<?xml version=\"1.0 > > \"?><toplevel><CompleteSuggestion><suggestion > > > > data=\"שלום\"/>< (etc...) > > > > > > > > > > > > > > Instead of this output: > > > > <?xml > > version="1.0"?><toplevel><CompleteSuggestion><suggestion > > data="???o?.?? > > > > "/><num_queries > > > int="16800000"/></CompleteSuggestion><CompleteSuggestion><suggestion > > > > data="???o?.?? ?-???.?s"/><num_queries > > int="232000"/></CompleteSuggestion> > > > > <CompleteSuggestion><suggestion data="???o?.?? ???o?T?>??"/ > > > > (etc....) > > > > > > > > > > > > > > > I tried: > > > > readLines(u, encoding= "latin1") > > > > readLines(u, encoding= "UTF-8") > > > > And also changing Sys.setlocale: > > > > Sys.setlocale("LC_ALL", "Hebrew") # must be done for > > Hebrew to work. > > > > Sys.setlocale("LC_ALL", "English") # must be done for > > Hebrew to work. > > > > > > > > Are there any more options I could try to get this text > > properly encoded? 
> > > >[[elided Yahoo spam]]> > > > Tal > > > > > > > > > > > > > > > > ----------------Contact > > > > > > Details:------------------------------------------------------- > > > > Contact me: Tal.Galili at gmail.com | 972-52-7275845 > > > > Read me: www.talgalili.com (Hebrew) | > > www.biostatistics.co.il (Hebrew) | > > > > www.r-statistics.com (English) > > > > > > > > > > > ---------------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > > > > > > -- > > Matthew S. Shotwell > > Graduate Student > > Division of Biostatistics and Epidemiology > > Medical University of South Carolina > > > > > > > > -- > Matthew S. Shotwell > Graduate Student > Division of Biostatistics and Epidemiology > Medical University of South Carolina > >[[alternative HTML version deleted]] ------------------------------ Message: 67 Date: Fri, 10 Dec 2010 23:26:28 +0200 From: "dorina.lazar" <dorina.lazar at econ.ubbcluj.ro> To: <r-help at r-project.org> Subject: [R] spatial clusters Message-ID: <7aad2cfb7a3d00c107ecb37cecea9f7d at econ.ubbcluj.ro> Content-Type: text/plain; charset=UTF-8 Dear all, I am looking for a clustering method usefull to classify the countries in some clusters taking account of: a) the geographical distance (in km) between countries and b) of some macroeconomic indicators (gdp, life expectancy...). Are there some packages in R usefull for this? Thanks a lot for your help, Dorina ------------------------------ Message: 68 Date: Fri, 10 Dec 2010 16:31:32 -0500 From: Gabor Grothendieck <ggrothendieck at gmail.com> To: dreadgazebo <ernie.tedeschi at gmail.com> Cc: r-help at r-project.org Subject: Re: [R] Time Series Row Label Message-ID: <AANLkTikYh1iFRCBAeDyUoAhO+ETP3KOrFBuzpSH8q1fX at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 On Fri, Dec 10, 2010 at 3:16 PM, dreadgazebo <ernie.tedeschi at gmail.com> wrote:> > Simple question... 
> > I know that when referencing data in a multivariate time series matrix, I > can use the variable name instead of the column number (such as > budget.ts[4,"incometax"]). Is there a way I can use the time unit (say, the > year in an annual time series) instead of the row number?You can use window(), e.g. tt <- ts(1:4, start = 2000) window(tt, start = 2001, end = 2002) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com ------------------------------ Message: 69 Date: Fri, 10 Dec 2010 13:39:33 -0800 From: Patrick McKann <pcmckann at gmail.com> To: r-help at r-project.org Subject: [R] WriteXLS error:Error in get(x, envir = envir) : variable names are limited to 256 bytes Message-ID: <AANLkTim5tFDvcu9hRj3jGMOqK8cNzqg8WhveSbpB67WG at mail.gmail.com> Content-Type: text/plain Hello all, I don't understand why this won't work. I have entered: WriteXLS(alldata,'test.xls') and I get this error message: Error in get(x, envir = envir) : variable names are limited to 256 bytes. My variable names are not very long, and are accepted by write.csv. alldata is a list containing 4 dataframes, with each dataframe having the the same variable names, which are:> names(avg8302)[1] "ID" "cluster" "rec.unit" "int.hib" "yr.hib" "yr0309.hib" "int.hib.se" "yr.hib.se" " yr0309.hib.se" "int.cl" [11] "yr.cl" "yr0309.cl" "int.cl.se" "yr.cl.se" " yr0309.cl.se" "int.ru" "yr.ru" "yr0309.ru" "int.ru.se" "yr.ru.se" [21] "yr0309.ru.se" "int.sp" "yr.sp" "yr0309.sp" " int.sp.se" "yr.sp.se" "yr0309.sp.se">Does anybody know how I can fix this? Or another way to write a multi-sheet xls? Thank you. [[alternative HTML version deleted]] ------------------------------ Message: 70 Date: Fri, 10 Dec 2010 13:53:08 -0800 (PST) From: casperyc <casperyc at hotmail.co.uk> To: r-help at r-project.org Subject: [R] How to print colorful R output?? 
Message-ID: <1292017988649-3082750.post at n4.nabble.com> Content-Type: text/plain; charset=us-ascii Hi All, I wonder if there is a way to print the R output with COLOR? Not the color plots, but the outputs in the console. Thank. casper -- View this message in context: http://r.789695.n4.nabble.com/How-to-print-colorful-R-output-tp3082750p3082750.html Sent from the R help mailing list archive at Nabble.com. ------------------------------ Message: 71 Date: Fri, 10 Dec 2010 17:02:43 -0500 From: David Winsemius <dwinsemius at comcast.net> To: Patrick McKann <pcmckann at gmail.com> Cc: r-help at r-project.org Subject: Re: [R] WriteXLS error:Error in get(x, envir = envir) : variable names are limited to 256 bytes Message-ID: <2C68E697-86FF-4D7C-ABC0-5667437E548F at comcast.net> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes On Dec 10, 2010, at 4:39 PM, Patrick McKann wrote:> Hello all, > I don't understand why this won't work. I have entered: > > WriteXLS(alldata,'test.xls')I have gotten tripped up by the argument syntax in WriteXLS myself, many times. Please check the help page for argument names and use them, especially paying attention to the fact that the first argument needs to be a character _vector_ (and I suspect that passing it a list may not qualify) and I always use the name for the Excel file argument. I suspect that this may work: WriteXLS('alldata','test.xls') -- David.> > and I get this error message: > > Error in get(x, envir = envir) : variable names are limited to 256 > bytes. > > My variable names are not very long, and are accepted by write.csv. 
> > alldata is a list containing 4 dataframes, with each dataframe > having the > the same variable names, which are: > >> names(avg8302) > [1] "ID" "cluster" "rec.unit" "int.hib" > "yr.hib" "yr0309.hib" "int.hib.se" "yr.hib.se" " > yr0309.hib.se" "int.cl" > [11] "yr.cl" "yr0309.cl" "int.cl.se" "yr.cl.se" " > yr0309.cl.se" "int.ru" "yr.ru" "yr0309.ru" > "int.ru.se" > "yr.ru.se" > [21] "yr0309.ru.se" "int.sp" "yr.sp" "yr0309.sp" " > int.sp.se" "yr.sp.se" "yr0309.sp.se" >> > > Does anybody know how I can fix this? Or another way to write a > multi-sheet > xls? > > Thank you. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius, MD West Hartford, CT ------------------------------ Message: 72 Date: Fri, 10 Dec 2010 17:14:38 -0500 From: David Winsemius <dwinsemius at comcast.net> To: David Winsemius <dwinsemius at comcast.net> Cc: r-help at r-project.org, Patrick McKann <pcmckann at gmail.com> Subject: Re: [R] WriteXLS error:Error in get(x, envir = envir) : variable names are limited to 256 bytes Message-ID: <29D67E79-2FA8-4827-BBA2-E1C668C22E5F at comcast.net> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes On Dec 10, 2010, at 5:02 PM, David Winsemius wrote:> > On Dec 10, 2010, at 4:39 PM, Patrick McKann wrote: > >> Hello all, >> I don't understand why this won't work. I have entered: >> >> WriteXLS(alldata,'test.xls') > > I have gotten tripped up by the argument syntax in WriteXLS myself, > many times. 
Please check the help page for argument names and use > them, especially paying attention to the fact that the first > argument needs to be a character _vector_ (and I suspect that > passing it a list may not qualify) and I always use the name for the > Excel file argument. I suspect that this may work: > > WriteXLS('alldata','test.xls')OOOPs. I wrote that before I noted that you said you were using a list, and I forgot to go back and fix it, so that would NOT work.> > -- > David. > >> >> and I get this error message: >> >> Error in get(x, envir = envir) : variable names are limited to 256 >> bytes. >> >> My variable names are not very long, and are accepted by write.csv. >> >> alldata is a list containing 4 dataframes, with each dataframe >> having the >> the same variable names, which are: >> >>> names(avg8302) >> [1] "ID" "cluster" "rec.unit" "int.hib" >> "yr.hib" "yr0309.hib" "int.hib.se" "yr.hib.se" " >> yr0309.hib.se" "int.cl" >> [11] "yr.cl" "yr0309.cl" "int.cl.se" >> "yr.cl.se" " >> yr0309.cl.se" "int.ru" "yr.ru" "yr0309.ru" >> "int.ru.se" >> "yr.ru.se" >> [21] "yr0309.ru.se" "int.sp" "yr.sp" >> "yr0309.sp" " >> int.sp.se" "yr.sp.se" "yr0309.sp.se" >>> >> >> Does anybody know how I can fix this? Or another way to write a >> multi-sheet >> xls? >> >> Thank you. >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
> > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius, MD West Hartford, CT ------------------------------ Message: 73 Date: Fri, 10 Dec 2010 23:06:27 +0000 (UTC) From: Ben Bolker <bbolker at gmail.com> To: r-help at stat.math.ethz.ch Subject: Re: [R] Stricter read.table? Message-ID: <loom.20101211T000025-611 at post.gmane.org> Content-Type: text/plain; charset=us-ascii Stavros Macrakis <macrakis <at> alum.mit.edu> writes:> > read.table gives idiosyncratic results when the input is formatted > strangely, for example: > > read.table(textConnection("a'b\nc'd\n"),header=FALSE, fill=TRUE,sep="",quote="'")> => "c'd" "a'b" "c'd" > > > read.table(textConnection("a'b\nc'd\nf'\n'\n"), header=FALSE,fill=TRUE sep="",quote="'")> => "f'" "\na" "b" "c'd" "f'" "\n" > > Though read.table doesn't specify the syntax of its input precisely, these > results don't seem particularly useful or consistent. > > Is there a stricter version of read.table (perhaps in a package) that gives > errors or warnings if it finds quotation marks in the middle of fields or > encounters other such peculiar situations?I dissected this behavior a bit more here <https://stat.ethz.ch/pipermail/r-devel/2010-November/059016.html> (it is due to an inconsistency between the way that scan() and readLines() handle lines with unterminated quotes, IIRC) and Martin Maechler said <https://stat.ethz.ch/pipermail/r-devel/2010-November/059107.html> "I think it can be defended to file as a bug, but it is tricky to pinpoint exactly what the issue is." 
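Short of a stricter reader, one possible stopgap is to screen the raw lines before handing them to read.table(). The sketch below is only a heuristic: `has_embedded_quote` is a hypothetical helper, not an R function, and it flags any line in which the quote character has a non-blank character on both sides.

```r
# Hypothetical pre-check: TRUE for lines where the quote character appears
# in the middle of a field (non-whitespace immediately before and after it).
has_embedded_quote <- function(lines, quote = "'") {
  pattern <- paste0("[^[:space:]][", quote, "][^[:space:]]")
  grepl(pattern, lines)
}

input <- c("a'b", "c d", "'x y'")
flagged <- has_embedded_quote(input)
print(flagged)

# Only parse when nothing suspicious was found
if (!any(flagged)) {
  parsed <- read.table(text = input, header = FALSE, quote = "'")
}
```

A doubled quote inside a quoted field would also be flagged, so this is a warning aid rather than a definition of valid input.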
I don't know of a stricter version of read.table(), but if you had the time and inclination to pick through the code and (i) provide a careful definition of desired behavior and (ii) supply patches, you could do your little bit to make R better. (If I posted a bug report would you annotate it with a discussion of desired behavior?) ------------------------------ Message: 74 Date: Fri, 10 Dec 2010 18:10:42 -0500 From: Stavros Macrakis <macrakis at alum.mit.edu> To: r-help <r-help at r-project.org> Subject: [R] Quantile with discrete types Message-ID: <AANLkTi=YXVGnd4NcrurJ3TXBsH74WMEgwWeMM3DaP6T_ at mail.gmail.com> Content-Type: text/plain I don't understand why 'quantile' works in this case:> tt <- rep(c('a','b'),c(10,3)) > sapply(0:6/6,function(q) quantile(tt,probs=q,type=1))0% 16.66667% 33.33333% 50% 66.66667% 83.33333% 100% "a" "a" "a" "a" "a" "b" "b" and also> quantile(tt,0:5/5,type=1)0% 20% 40% 60% 80% 100% "a" "a" "a" "a" "b" "b" but gives an error in this, which I would have thought equivalent to the first case above:> quantile(tt,0:6/6,type=1)Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : argument is not a numeric vector I could of course write something like sort(tt)[seq(1,length(tt),length.out=7)] -- but I'm wondering why quantile fails in this case. Thanks, -s [[alternative HTML version deleted]] ------------------------------ Message: 75 Date: Fri, 10 Dec 2010 23:10:18 +0000 (UTC) From: Ben Bolker <bbolker at gmail.com> To: r-help at stat.math.ethz.ch Subject: Re: [R] [r] overlap different line in a xyplot (lattice) Message-ID: <loom.20101211T000730-842 at post.gmane.org> Content-Type: text/plain; charset=us-ascii Peter Ehlers <ehlers <at> ucalgary.ca> writes:> > On 2010-12-10 07:04, Francesco Nutini wrote: > > > > dear [R] users, > > is there a way to plot different data (but with the same x-variables) > in the same xyplot window? > > There are already a similar question, but the answer is > not enought explanatory... 
> > Something like this? >[snip] Also possibly the layer() command in the latticeExtra package. If there is an answer that doesn't make sense to you it might be most efficient to post an edited version of that question/answer, attempting to clarify which parts of the answer you do and don't understand ... A reproducible example would be nice too. Ben Bolker ------------------------------ Message: 76 Date: Fri, 10 Dec 2010 15:51:49 -0500 From: Layla Parast <lparast at fas.harvard.edu> To: r-help at r-project.org Subject: [R] locfit weights not working as expected Message-ID: <AANLkTikzs66VrZbO13iq=WhSWxc-GSLmyOtCRNgxsNV2 at mail.gmail.com> Content-Type: text/plain; charset="iso-8859-1" Hello! I am having a problem understanding what the weights option in the locfit command of the locfit package is doing. I have written a sample program which illustrates the issue (below). The example involves using bootstrap however, that is not my main goal but it illustrates where my problem lies. As you know, to compute a bootstrap estimate of a particular quantity using a sample size of size N, I would sample a multinomial vector which should be of size N with possible values 1:N. I should get the exact same answer whether I a) estimate the quantity using this multinomial vector as the indices i.e. literally make a new sample by choosing each element according to the number of time in the corresponding index or b) estimate the quantity using the original data and specify this multinomial vector as the "weights". In all commands that I have tried, the weights functions works such that this is always true. This is not true for the locfit command. The models are different, the coefficients are different, the predictions are different. I have tried several combinations of things including specifying different options for the evaluation points, different kernels, different datasets to predict and I cannot find any way to make these equal. 
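For comparison, the equivalence described here (resampled rows versus the original rows with multinomial counts as weights) can be checked with a fitter where it does hold. The sketch below uses base R's glm() on simulated data; the data, sample size, and variable names are assumptions made for illustration and have nothing to do with locfit's internals.

```r
set.seed(1)
n <- 200
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.5 * dat$x))

# Multinomial bootstrap counts and the equivalent vector of row indices
w <- as.vector(rmultinom(1, n, prob = rep(1 / n, n)))
idx <- rep(seq_len(n), times = w)

# (a) fit on the literally resampled rows
fit_resampled <- glm(y ~ x, family = binomial, data = dat[idx, ])
# (b) fit on the original rows, with the counts as weights
fit_weighted <- glm(y ~ x, family = binomial, data = dat, weights = w)

# The two coefficient vectors agree up to convergence tolerance
print(all.equal(coef(fit_resampled), coef(fit_weighted), tolerance = 1e-4))
```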
For my particular simulation, I need to understand exactly what the weights option is doing, since it is clearly not doing what I expected. I would gratefully appreciate any advice or help that anyone can give on this issue, and I appreciate your time very much.

library(MASS)
library(splines)
library(sm)
library(quantreg)
library(locfit)

set.seed(20)
Zi = rnorm(5000,0, 2)
ei = mvrnorm(5000, c(.6,2), matrix(c(.7,0,0,.9),2,2))
X1i = exp(-.5*Zi + ei[,1])
X2i = exp(-.5*(Zi*log(T1i) -log(T1i) + Zi) + ei[,2])
X1i = log(X1i)
boot.weight = rmultinom(1,5000, prob=rep(1/5000, 5000))
boot.subset = rep(1:5000, boot.weight)
loc.model.1 = locfit(1*(X2i[boot.subset]< 10) ~ lp(X1i[boot.subset], Zi[boot.subset], style = c("n","cpar"), h=0.4, deg = 1, nn=0), family = "binomial", link = "logit")
loc.model.2 = locfit(1*(X2i< 10) ~ lp(X1i, Zi, style = c("n","cpar"), h=0.4, deg = 1,nn=0), family = "binomial", link = "logit", weights=boot.weight)
print(loc.model.1)
print(loc.model.2)
t.vector = log(seq(0.5,10, length=300))
p.0.1 = predict(loc.model.1, newdata=cbind(t.vector, rep(0,length(t.vector))))
p.0.2 = predict(loc.model.2, newdata=cbind(t.vector, rep(0,length(t.vector))))
#quantity below should be vector of zeros but it is not.
p.0.1-p.0.2

------------------------------

Message: 77
Date: Fri, 10 Dec 2010 17:05:10 -0600
From: simon lu <simonlmg at gmail.com>
To: r-help at r-project.org
Subject: [R] R question: memory usage
Message-ID: <AANLkTimMP1NibCS4xpK_zSwhhuSG=nis5EegLrZGHmbv at mail.gmail.com>
Content-Type: text/plain

Hi,

I have a large R process that is constantly running into memory issues, and I am trying to rewrite some of the code to make it more efficient. However, one thing I have found is that the memory usage shown by gc() is very different from what I see in the Unix terminal:

Garbage collection 2047 = 1398+243+406 (level 2) ...
12.4 Mbytes of cons cells used (3%)
998.6 Mbytes of vectors used (23%)
            used  (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells    231884  12.4    8140556  434.8  12719620  679.4
Vcells 130881306 998.6  572307468 4366.4 713657158 5444.8

vs

  PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
26291 s_edge 25  0 2883m 8.2g 4276 R 99.7 48.0 10:35.31 R

Am i missing something here? Thanks a lot for the help

Simon

[[alternative HTML version deleted]]

------------------------------

Message: 78
Date: Fri, 10 Dec 2010 23:59:10 -0500
From: Dennis Duro <dennis.duro at gmail.com>
To: r-help at r-project.org
Subject: [R] randomForest: help with combine() function
Message-ID: <AANLkTind5kV=Ku5GXX=YGvk5fx5DY-cRE_gbHCGB=QYz at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

I've built two RF objects (RF1 and RF2) and have tried to combine them, but I get the following error:

Error in rf$votes + ifelse(is.na(rflist[[i]]$votes), 0, rflist[[i]]$votes) :
  non-conformable arrays
In addition: Warning message:
In rf$oob.times + rflist[[i]]$oob.times :
  longer object length is not a multiple of shorter object length

Both RF models use the same variables, although the NAs in both models likely differ (using na.roughfix in both models). I assume this is part of the reason that my arrays are "non-conformable". If so, does anyone have any suggestions on how to combine in such a situation? How similar do RFs have to be in order to combine?

Cheers

------------------------------

Message: 79
Date: Fri, 10 Dec 2010 23:15:22 -0600
From: Yihui Xie <xie at yihui.name>
To: casperyc <casperyc at hotmail.co.uk>
Cc: r-help at r-project.org
Subject: Re: [R] How to print colorful R output??
Message-ID: <AANLkTimPn=WhvAyrMOH6U+kMrWDVM7Pyjgf=fvGWzT2e at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Yes (depending on what you mean by "output"), but maybe there is a long way to go.
The vignette of the Rd2roxygen package is an example: http://cran.r-project.org/web/packages/Rd2roxygen/vignettes/Rd2roxygen.pdf It makes use of the highlight package and Sweave to output colored code. If the style of the above vignette is what you want, more details are here: http://yihui.name/en/2010/10/how-to-start-using-pgfsweave-in-lyx-in-one-minute/ Regards, Yihui -- Yihui Xie <xieyihui at gmail.com> Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Fri, Dec 10, 2010 at 3:53 PM, casperyc <casperyc at hotmail.co.uk> wrote:> > Hi All, > > I wonder if there is a way to print the R output with COLOR? > Not the color plots, but the outputs in the console. > > Thank. > > casper > -- > View this message in context: http://r.789695.n4.nabble.com/How-to-print-colorful-R-output-tp3082750p3082750.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 80 Date: Fri, 10 Dec 2010 23:57:14 -0600 From: Marc Schwartz <marc_schwartz at me.com> To: David Winsemius <dwinsemius at comcast.net> Cc: r-help at r-project.org, Patrick McKann <pcmckann at gmail.com> Subject: Re: [R] WriteXLS error:Error in get(x, envir = envir) : variable names are limited to 256 bytes Message-ID: <0DF6AE39-8D86-424C-9E1B-765D372DC376 at me.com> Content-Type: text/plain; charset=us-ascii On Dec 10, 2010, at 4:14 PM, David Winsemius wrote:> > On Dec 10, 2010, at 5:02 PM, David Winsemius wrote: > >> >> On Dec 10, 2010, at 4:39 PM, Patrick McKann wrote: >> >>> Hello all, >>> I don't understand why this won't work. 
I have entered: >>> >>> WriteXLS(alldata,'test.xls') >> >> I have gotten tripped up by the argument syntax in WriteXLS myself, many times. Please check the help page for argument names and use them, especially paying attention to the fact that the first argument needs to be a character _vector_ (and I suspect that passing it a list may not qualify) and I always use the name for the Excel file argument. I suspect that this may work: >> >> WriteXLS('alldata','test.xls') > > OOOPs. I wrote that before I noted that you said you were using a list, and I forgot to go back and fix it, so that would NOT work. >> >> -- >> David. >> >>> >>> and I get this error message: >>> >>> Error in get(x, envir = envir) : variable names are limited to 256 bytes. >>> >>> My variable names are not very long, and are accepted by write.csv. >>> >>> alldata is a list containing 4 dataframes, with each dataframe having the >>> the same variable names, which are: >>> >>>> names(avg8302) >>> [1] "ID" "cluster" "rec.unit" "int.hib" >>> "yr.hib" "yr0309.hib" "int.hib.se" "yr.hib.se" " >>> yr0309.hib.se" "int.cl" >>> [11] "yr.cl" "yr0309.cl" "int.cl.se" "yr.cl.se" " >>> yr0309.cl.se" "int.ru" "yr.ru" "yr0309.ru" "int.ru.se" >>> "yr.ru.se" >>> [21] "yr0309.ru.se" "int.sp" "yr.sp" "yr0309.sp" " >>> int.sp.se" "yr.sp.se" "yr0309.sp.se" >>>> >>> >>> Does anybody know how I can fix this? Or another way to write a multi-sheet >>> xls? >>> >>> Thank you.Hi David and Patrick, Apologies for the delay in my reply as I am away on vacation at the moment. As David surmised initially, the name of the object(s) to be exported, needs to be passed as a character vector. The vector can either contain the names of one or more data frames, or can be the single name of a list of data frames. The latter option was added in a September update. See the help page for an example of use. Thus: WriteXLS("alldata", "test.xls") should work. 
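The error text in this thread comes from get(), which looks an object up by its name. A base-R sketch of the difference between passing the name and passing the object follows; it does not call WriteXLS at all, and the small `alldata` list is a made-up stand-in for the one described above.

```r
# Hypothetical stand-in for the list of data frames described in the thread
alldata <- list(first = data.frame(ID = 1:2), second = data.frame(ID = 3:4))

# Looking the object up by its name (a character string) works
by_name <- get("alldata")

# Passing the list itself is an error, because get() expects a name
by_object <- tryCatch(get(alldata), error = function(e) e)

print(identical(by_name, alldata))
print(inherits(by_object, "error"))
```

The exact wording of the error depends on the R version and on how the calling function prepares its argument, so the sketch checks only that an error is raised.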
HTH, Marc Schwartz ------------------------------ Message: 81 Date: Sat, 11 Dec 2010 10:12:32 +0100 From: Liviu Andronic <landronimirc at gmail.com> To: casperyc <casperyc at hotmail.co.uk> Cc: r-help at r-project.org Subject: Re: [R] How to print colorful R output?? Message-ID: <AANLkTikLBC2-VA0k7r8Wj5xfCGa1PmOd-9o37MdXZy+- at mail.gmail.com> Content-Type: text/plain; charset=UTF-8 Hello On Fri, Dec 10, 2010 at 10:53 PM, casperyc <casperyc at hotmail.co.uk> wrote:> I wonder if there is a way to print the R output with COLOR? > Not the color plots, but the outputs in the console. >I once asked for this on the list [1], and the are two points: - although technically feasible, say using ncurses, there are currently no facilities for automated formatting of input/output - go for editors that do this automatically, say Emacs or JGR Regards Liviu [1] http://www.mail-archive.com/r-help at r-project.org/msg86512.html ------------------------------ Message: 82 Date: Sat, 11 Dec 2010 20:17:00 +1100 From: Michael Bedward <michael.bedward at gmail.com> To: Michael D <mike409 at gmail.com> Cc: r-help at r-project.org Subject: Re: [R] help with RSQLite adding a new column Message-ID: <AANLkTikMWCtY7wkLDcknnXunHRsQhHYfnUgxUVV0dKtk at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi Michael, Sorry if I'm being slow, but I've read your post three times and still can't quite work out what you're trying to do (the changing variables names are a bit confusing). I use RSQLite a lot and might be able to help if you could explain your inputs and desired output in simple terms. (another) Michael On 11 December 2010 05:18, Michael D <mike409 at gmail.com> wrote:> I'm new to using sql so I'm having difficulties (and worries) in adding a > new column of data to a table I have. 
Its a very large file (around 5 Gb) > which is why I'm having to use SQL > > I have a table with variables ID, IDrec and IDdes and the variables IDrec > and IDdes give a mapping of some other values but the other values are > associated with the ID variable (think of IDrec and IDdes being character > strings and ID being numeric) > > (Imagine the transposed) > Table1: > ID: 1,2,3,4,... > IDrec: A,B,C,D... > IDdes: B,C,A,E... > > So I've created a table with the final form I need it to be in > > dbGetQuery(db, "CREATE TABLE Map > ? ? ? ? ? ? ? ?(ID int, IDrec int, IDrec1 int, > ? ? ? ? ? ? ? ?IDdes int, IDdes1 int)") > > And the finished table would look something like: > Map: > ID: 1, 2, 3, 4,... > IDrec: 1, 2, 3, 4,... > IDrec1: A, B, C, D,... > IDdes: 2, 3, 1, 5,.... > IDdes1: B, C, A, E,... > > So I copy in the first set of values easily: > dbGetQuery(db, "INSERT INTO Map(ID, IDrec, IDrec1, IDdes1) > ? ? ? ? ? ? ? ?SELECT ID, ID, IDrec, IDdes FROM Ntemp") > > Giving me a table that looks like: > Map: > ID: 1, 2, 3, 4,... > IDrec: 1, 2, 3, 4,... > IDrec1: A, B, C, D,... > IDdes: NA,NA,NA,NA,... > IDdes1: B, C, A, E,... > > Then I create a new table with just the IDdes values I need: > dbGetQuery(db, "Create table temp2 as > ? ? ? ? ? ? ? ?SELECT temp.ID > ? ? ? ? ? ? ? ?FROM Ntemp, temp > ? ? ? ? ? ? ? ?WHERE Ntemp.IDdes1 = temp.IDrec1") > > Giving me temp2 (not sure what the variable name is) > V1: 2, 3, 1, 5,... > > But when I try to copy in the new data: > dbGetQuery(db, "INSERT INTO Map(IDdes) > ? ? ? ? ? ? ? ?SELECT * FROM temp2") > > My map table isn't updated: > Map: > ID: 1, 2, 3, 4,... > IDrec: 1, 2, 3, 4,... > IDrec1: A, B, C, D,... > IDdes: NA,NA,NA,NA,... > IDdes1: B, C, A, E,... > > Is there something I'm missing? Or am I just going about inserting the IDdes > variables the wrong way? > > Thanks for the help. > Michael > > ? ? ? 
?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >------------------------------ Message: 83 Date: Sat, 11 Dec 2010 20:51:02 +1100 From: Jim Lemon <jim at bitwrit.com.au> To: Simon Kiss <simonjkiss at yahoo.ca> Cc: r-help at r-project.org Subject: Re: [R] 45 Degree labels on barplot? Help understanding code previously posted. Message-ID: <4D034986.9070001 at bitwrit.com.au> Content-Type: text/plain; charset=ISO-8859-1; format=flowed On 12/11/2010 02:25 AM, Simon Kiss wrote:> Dear colleagues, > i found a line or two of code in the help archives from Uwe Ligges about creating slanted x-labels for a barplot and it works well for my purposes (code below). However, I was hoping someone could explain to me precisely what the code is doing. > I'm aware it's invoking the text command, and I know the first ttwo arguments to text are x and y co-ordinates. I'm also aware that par("usr")[3] is grabbing the third element of the vector of plotting co-ordinates. But I tried replacing par("usr")[3] with just "0" and that didn't work; all the labels got bunched up on the left. Is it necessary to create a new object via "barplot" and then quote that in the x,y coordinates of text? > Like I said, the code works great, but I'm trying to actually understand the rationale behind the elements so I can apply it in future.Hi Simon, The staxlab function will add an axis to an existing plot with either staggered or rotated labels. It is somewhat similar to Uwe's function. Looking at both the code and the examples on the help page might give you some idea why the function was written. 
It is principally to allow the user to add more labels than would be displayed on the default axes, and to decide whether staggering or rotating those labels will produce a better looking plot. The problem most often mentioned in adding custom labels to a plot produced by barplot is that the bars are centered on non-integer values, and so the user must get the return value of barplot and use that for the axis label positions. Jim ------------------------------ Message: 84 Date: Sat, 11 Dec 2010 11:36:30 +0100 From: Petr PIKAL <petr.pikal at precheza.cz> To: Amelia Vettori <amelia_vettori at yahoo.co.nz> Cc: r-help at r-project.org Subject: Re: [R] Adding numbers in Outputs Message-ID: <OFB14F81D0.633C61D4-ONC12577F6.0039EC79-C12577F6.003A5B1A at precheza.cz> Content-Type: text/plain; charset="US-ASCII" Hi r-help-bounces at r-project.org napsal dne 10.12.2010 15:00:12:> Dear Mr Holtman Sir, > > Thanks a lot for your great solution. This certainly is helping meachieve> what I need to get. However, I shall be hugely thankful to you if youcan> guide me in one respect. > > Sir, you have used following commands to assign values to x and y. > > > x <- list(40, c(80,160), c(160,80,400)) > > y <- list(10, c(10,30), c(5,18,20)) > > z <- c(1,2,3) > > But Sir, the problem is these values are basically outputs of some other> process which I am running and chances are these will vary. Sir, it willbe a> great help if you can guide me to convert the output (which I amgetting)> > X > [[1]] > [1] 40 > > [[2]] > [1] 80 160 > > [[3]] > [1] 160 80 400 >I believe this object is probably list too so Jim's answer shall work on it. See what str(X) gives you as an output. Regards Petr> to what you have suggested > > x <- list(40, c(80,160), c(160,80,400)) > > So, in that case once I get output in my format, I will convert thatoutput as> provided by you. 
> > I apologize for taking the liberty of writing to you, but I shall bereally> grateful to you, as I have just started getting the feel of 'R' and Iknow I> need to take lots of efforts to begin with. > > Thanks and eagerly waiting for your guidance. > > Amelia Vettori > > --- On Fri, 10/12/10, jim holtman <jholtman at gmail.com> wrote: > > From: jim holtman <jholtman at gmail.com> > Subject: Re: [R] Adding numbers in Outputs > To: "Amelia Vettori" <amelia_vettori at yahoo.co.nz> > Cc: r-help at r-project.org > Received: Friday, 10 December, 2010, 1:43 PM > > try this: > > > x <- list(40, c(80,160), c(160,80,400)) > > y <- list(10, c(10,30), c(5,18,20)) > > z <- c(1,2,3) > > mapply(function(a1,a2,a3){ > + a3 * sum(a1 * a2) > + } > + , x > + , y > + , z > + ) > [1] 400 11200 30720 > > > On Fri, Dec 10, 2010 at 5:41 AM, Amelia Vettori > <amelia_vettori at yahoo.co.nz> wrote:[[elided Yahoo spam]]> > > > I am Amelia from Auckland and work for a bank. I am new to R and Ihave> started my venture with R just a couple of weeks back and this is myfirst> mail to R-forum. I need following assistance > > > > Suppose my R code generates following outputs as > > > > > >> X > > [[1]] > > [1] 40 > > > > [[2]] > > [1] 80 160 > > > > [[3]] > > [1] 160 80 400 > > > > > >> Y > > > > [[1]] > > > > [1] 10 > > > > > > > > [[2]] > > > > [1] 10 30 > > > > > > > > [[3]] > > > > [1] 5 18 20 > > > > and suppose > > > > Z = c(1, 2, 3) > > > > I need to perform the calculation where I will be multiplyingcorresponding> terms of X and Y individually and multiplying their sum by Z and storethese> results in a dataframe. > > > > I.e. 
I need to calculate > > > > (40*10) * 1 # (first element of X +First> element of Y) * Z[1] = 400 > > > > ((80*10)+(160*30)) * 2 # 2 row of X and 2nd row of Y =11200> > > > ((160*5)+(80*18)+(400*20)) * 3 # 3rd row of X and 3 row of Y andZ[3] = 30720> > > > > > > > So the final output should be > > > > 400 > > 11200 > > 30720 > > > > > > One way of doing it is write R code for individual rows and > > arrive at the result e.g. > > > > ([[X]][1]*[[Y]][1])*1 will result in 400. However, I was just tryingto know> some smart way of doing it as there could be number of rows and writingcode> for each row will be a cumbersome job. So is there any better way to doit?> > > > Please guide me. > > > > I thank you in advance. > > > > Thanking > > all > > > > Amelia > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> > and provide commented, minimal, self-contained, reproducible code. > > > > > > > > -- > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.------------------------------ _______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
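Putting the pieces of the "Adding numbers in Outputs" thread together: a sketch that applies the mapply() approach to the list-valued outputs and stores the result in a data frame, as originally asked. The column names `row` and `total` are illustrative assumptions, and the str() call is the check suggested earlier for confirming that the inputs really are lists.

```r
# Outputs of the upstream process, as shown in the thread
X <- list(40, c(80, 160), c(160, 80, 400))
Y <- list(10, c(10, 30), c(5, 18, 20))
Z <- c(1, 2, 3)

str(X)  # confirms X is a list of numeric vectors

# For each position: multiply matching terms, sum them, scale by Z
totals <- mapply(function(x, y, z) z * sum(x * y), X, Y, Z)

result <- data.frame(row = seq_along(totals), total = totals)
print(result)
# total column: 400, 11200, 30720
```

Because mapply() walks the three inputs in parallel, the same code works for any number of rows as long as X, Y, and Z have the same length.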
End of R-help Digest, Vol 94, Issue 11
**************************************