similar to: how to create a variable to rank within subgroups

Displaying 20 results from an estimated 10000 matches similar to: "how to create a variable to rank within subgroups"

2005 Jan 06
6
"labels" attached to variable names
Hi, Can we attach a more descriptive "label" (I may be using the wrong terminology, which would explain why I found nothing in the FAQ) to variable names, and later have an easy way to switch to these labels in plots? I fear this is not possible and one must enter this by hand as ylab and xlab when making plots. Thanks in advance, Denis Chabot
2005 Sep 26
4
p-level in packages mgcv and gam
Hi, I am fairly new to GAM and started using package mgcv. I like the fact that optimal smoothing is automatically used (i.e. df are not determined a priori but calculated by the gam procedure). But the mgcv manual warns that p-level for the smooth can be underestimated when df are estimated by the model. Most of the time my p-levels are so small that even doubling them would not result
2009 Mar 10
4
puzzled by math on date-time objects
Hi, I don't understand the following. When I create a small artificial set of date information in class POSIXct, I can calculate the mean and the median: a = as.POSIXct(Sys.time()) a = a + 60*0:10; a [1] "2009-03-10 11:30:16 EDT" "2009-03-10 11:31:16 EDT" "2009-03-10 11:32:16 EDT" [4] "2009-03-10 11:33:16 EDT" "2009-03-10 11:34:16
2007 Mar 18
2
italics letter in roman string
Hi, As part of the legend to a plot, I need to have the "n" in italics because it is a requirement of the journal I aim to publish in: "This study, n = 3293" Presently I have: legend(20, 105, "This study, n = 3293", pch=1, col=rgb(0,0,0,0.5), pt.cex=0.3, cex=0.8, bty="n") I suppose I could leave a blank in place of the "n",
2006 Sep 20
1
functionality of "update" in SAS
Dear list, I've tried to search the archives but found nothing, although I may use the wrong wording in my searches. I've also double-checked the upData function in Hmisc, but it does something else. I'm wondering if one can update a dataframe by "forcing into" it a shorter dataframe containing the corrections, like the "update" provided in SAS data steps.
2005 Feb 04
5
2 small problems: integer division and the nature of NA
Hi, I'm wondering why 48 %/% 2 gives 24 but 4.8 %/% 0.2 gives 23... I'm not trying to round up here, but to find out how many times something fits into something else, and the answer should have been the same for both examples, no? On a different topic, I like the behavior of NAs better in R than in SAS (at least they are not considered the smallest value for a variable), but at the
2006 Feb 08
1
plotting lines that break if data break
Hi, Sometimes data series (not necessarily time series) suffer breaks where data were expected, but not collected. Often the regular "lines" command to add such data to a plot is what I want, but other times I'd like the line to break where the data series is interrupted, instead of the line jumping to the next point in the series uninterrupted. Usually my data file
2006 Oct 26
2
distance between legend title and legend box
Hi, I've looked at the parameters available for the legend function and cannot find a way to change the distance between the top of the box surrounding a legend and the legend's title. I have a math expression that raises the height of my title. If you don't mind the non-sensical title I give to the legend for this plot (Figure 3.20 in R Graphics): with(iris,
2005 Jan 19
2
recoding large number of categories (select in SAS)
Hi, I have data on stomach contents. Possible prey species are in the hundreds, so a list of prey codes has been in use in many labs doing this kind of work. When it comes time to do analyses on these data one often wants to regroup prey in broader categories, especially for rare prey. In SAS you can nest a large number of "if-else", or do this more cleanly with "select"
2006 Nov 08
2
combining dataframes with different numbers of columns
Dear list members, I have to combine dataframes together. However they contain different numbers of variables. It is possible that all the variables in the dataframe with fewer variables are contained in the dataframe with more variables, though it is not always the case. There are key variables identifying observations. These could be used in a merge statement, although this won't
2006 Feb 05
1
how to extract predicted values from a quantreg fit?
Hi, I have used package quantreg to estimate a non-linear fit to the lowest part of my data points. It works great, by the way. But I'd like to extract the predicted values. The help for predict.qss1 indicates this: predict.qss1(object, newdata, ...) and states that newdata is a data frame describing the observations at which prediction is to be made. I used the same technique I used
2007 May 22
5
Reducing the size of pdf graphics files produced with R
Hi, Without trying to print 1000000 points (see <http://finzi.psych.upenn.edu/R/Rhelp02a/archive/42105.html>), I often print maps for which I do not want to lose too much coastline detail, and/or plots with 1000-5000 points (yes, some are on top of each other, but using transparency (i.e. rgb colors with alpha information) this actually comes through as useful information.
2006 Sep 13
1
reshaping a dataset
Hi, I'm trying to move to R the last few data handling routines I was performing in SAS. I'm working on stomach content data. In the simplified example I provide below, there are variables describing the origin of each prey item (nbpc is a ship number, each ship may have been used on different trips, each trip has stations, and individual fish (tagno) can be caught at each
2010 Jan 08
2
A better way to Rank Data that considers "ties"
This will start off sounding very easy, but I think it will be very complicated. Let's say that I have a matrix, which shows the number of apples that each person in a group has. OriginalMatrix<-matrix(c(2,3,5,4,6),nrow=5,ncol=1,byrow=T,dimnames=list(c("Bob","Frank","Joe","Jim","David"),c("Apples"))) Apples Bob 2
2005 Oct 05
3
testing non-linear component in mgcv:gam
Hi, I need further help with my GAMs. Most models I test are very obviously non-linear. Yet, to be on the safe side, I report the significance of the smooth (default output of mgcv's summary.gam) and confirm it deviates significantly from linearity. I do the latter by fitting a second model where the same predictor is entered without the s(), and then use anova.gam to compare the
2003 Jul 22
1
rank with ties
Hi, Is there a function like rank but that resolves ties by randomly assigning a value (i.e. doesn't average the ranks of ties)? This is what I actually need: I want to make NA all elements of each column in an array that are ranked in a position larger than rankmax for each column. # Say I've got an array b: b<-cbind(c(1:5,5:1),c(1,12,14,2,5,4:8)) #> b # [,1] [,2] #[1,] 1 1
2006 Oct 10
3
Rank Function
Does anyone know why the two rank functions give different results? I need to use the rank function in a "for" loop, so the sequence to be ranked is given values in the form of part (1). How can I use assignment like in part (1) to get correct ranks as in part (2)? Thank You Part (1) i<-1.94 b<-0.95-i c<-1.73-i d<-2.62-i y<-c(0.68,0.95,b,c,d) y 0.68 0.95 -0.99
2009 Jul 23
2
alternative to rbind within a loop
Hi, I often have to do this: select a folder (directory) containing a few hundred data files in csv format (up to 1000 files, in fact); open each file, transform some character variables into date-time format; make it into a dataframe (which involves getting rid of a few variables I don't need); concatenate it to the master dataframe that will eventually contain the data from all the files in the
2010 Oct 17
1
lattice xyplot - formatting of multiple Y variables when using subgroups
Hi all, Using xyplot I want to print two Y variables (y1, y2) versus X, conditional on the group. How can I obtain a line (type="l") for one relationship (i.e. y1 ~ x) and points (type="p") for the other (y2 ~ x)? library(lattice) # create some sample data df<-data.frame(group=as.factor(c(rep("a",4), rep("b",4))), # grouping variable for conditional
2005 Nov 01
1
percent rank by an index key?
What is the easiest way to calculate a percent rank “by” an index key? For example, I have a dataset with 3 fields: Year, State, Income. I wish to calculate the rank, by year, by state. I also wish to calculate the “percent rank”, where I define percent rank as rank/n. (n is the number of numeric data points within each date-state grouping.) This is what I am