similar to: "hist" combines two lowest categories -- is there a workaround?

2006 Sep 27
3
Space required by object?
Does R provide a function analogous to ls() or str() that reports the storage space, on disk or in memory, required by objects? Ben Fairbank
2006 Dec 14
5
Better way to change the name of a column in a dataframe?
Hello R users -- If I have a dataframe such as the following, named "frame" with the columns intended to be named col1 through col6, > frame col1 col2 cmlo3 col4 col5 col6 [1,] 3 10 2 6 5 7 [2,] 6 8 4 10 7 1 [3,] 7 5 1 3 1 8 [4,] 10 6 5 4 9 2 and I want to correct or otherwise change the
2007 Apr 24
4
Size of an object in workspace
Hi folks, Is there a function to show the size of an R object? e.g. in Kbytes? Couple months ago Bendix Carstensen posted this marvelous little function lls(), which shows all objects in the current workspace by mode, class and 'size'. This is a wonderful enhancement to the built-in ls() already and I now have it sourced in my Rprofile.site at startup. The only drawback is,
2006 Jan 05
2
Splitting the list
I've changed the heading because this really is another thread. I think it inevitable that there will, in the course of time, be other lists that are devoted, in some shape or form, to the concerns of practitioners (at all levels) who are using R. One development I'd not like to see is fracture along application area lines, allowing those who are comfortable in coteries whose
2008 Jun 20
2
The Green Book and its relevance to R
I bogged down about half way through reading the Green Book, in part because it became increasingly difficult to understand how some of the ideas related to R, as opposed to S (which I have not used). Does any reader know whether there is a document that points out differences between S and R that would be helpful in reading the Green Book? Ideally, perhaps, I need a "crib sheet" to
2005 Jun 28
1
Using data frames for EDA: Insert, Change name, delete columns? (Newcomer's question)
I am finding complex analyses easier than some elementary operations in R. In particular I want to do some low level exploratory data analyses with data in a data frame but cannot find commands to easily insert, remove (delete), rename, and re-order (arbitrarily, not sort) columns. I see that the micEcon package has an insertCol command, but that is for matrices, not data frames. I have looked
2011 Feb 14
1
problem running scripts
Dear all, I have encountered an odd situation. I have various R scripts interconnected via the source() function. After examining the results I noticed that not all the functions or procedures within a script were adequately conducted, especially with the longest script (about 180 lines). Then, I ran every script individually (not using source()), selecting all (Ctrl + a) and running the
2007 Jan 19
4
Newbie question: Statistical functions (e.g., mean, sd) in a "transform" statement?
Greetings listeRs - Given a data frame such as times time1 time2 time3 time4 1 70.408543 48.92378 7.399605 95.93050 2 17.231940 27.48530 82.962916 10.20619 3 20.279220 10.33575 66.209290 30.71846 4 NA 53.31993 12.398237 35.65782 5 9.295965 NA 48.929201 NA 6 63.966518 42.16304 1.777342 NA one can use "transform" to
2009 Feb 02
1
survfit using quantiles to group age
I am using the package Design for survival analysis. I want to plot a simple Kaplan-Meier fit of survival vs. age, with age grouped as quantiles. I can do this: survplot(survfit(Surv(time,status) ~ cut(age,3), data=veteran)) but I would like to do something like this: survplot(survfit(Surv(time,status) ~ quantile(age,3), data=veteran)) #will not work ideally I would like to superimpose
2012 Mar 06
1
How to eliminate for next loops in this script
I needed to compute a complicated cross tabulation to show weighted means and standard deviations and the only method I could get that worked uses a series of nested for next loops. I know that there must be a better way to do so, but could use some assistance pointing the way. Here is my working, but inefficient script: library(Hmisc) rm(list=ls()) load('NHTS.Rdata') day.wt <-
2006 May 13
2
What does it mean to be "masked from data" when attaching? (Newbie question)
I have several data frames, each with six variables and several hundred cases broken out from a larger dataframe by eleven values of a factor called "Division". I have to perform the same analysis on each one. I would like to do it by creating a data frame called data2 eleven times, once with data corresponding to each value of the factor, and performing the same analysis on each of
2002 Oct 03
0
[Fwd: curiousity with hist]
Just realized that the bin value is actually the relative frequency divided by the bin width. Sorry for consuming bandwidth. Alas, is there any way to make hist() calculate relative frequencies irrespective of bin width? thanks Murad Nayal wrote: > > Hello, > > I am rather new to R. in trying to use the hist() command I get behavior > that is somewhat puzzling me, in short,
2007 Feb 12
1
'Save Workspace' gives "recursive default argument reference" -- workaround?
When signing off R or trying to save a workspace in Windows XP pro SP2, I receive the following error message - save.image("C:\\Program Files\\R\\R-2.4.1\\Responses3.RData") Error in save.image("C:\\Program Files\\R\\R-2.4.1\\Responses3.RData") : recursive default argument reference Everything else seems to work fine, and the only function I have written
2023 Oct 16
1
Create new data frame with conditional sums
Dear Jason, The code could look something like: dummyData = data.frame(Tract=seq(1, 10, by=1), Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03), Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800)) # Define the cutoffs # - allow for duplicate entries; by = 0.03; # by = 0.01; cutoffs <- seq(0, 0.20, by = by) # Create a new column with cutoffs dummyData$Cutoff
2023 Oct 16
1
Create new data frame with conditional sums
If one makes the reasonable assumption that Pct is much larger than Cutoff, sorting Cutoff is the expensive part, e.g. O(n log2(n)) for Quicksort (n = length of Cutoff). I believe looping is O(n^2). Jeff's approach using findInterval may be faster. Of course implementation details matter. -- Bert On Mon, Oct 16, 2023 at 4:41 AM Leonard Mada <leo.mada at syonic.eu> wrote: > > Dear
2010 Aug 25
1
accessing the attr(*,label.table) after importing from spss
Dear all, I just received a file from a colleague in spss. The read.spss could not finish the file due to an error (Unrecognized record type 7, subtype 18 encountered in system file) so instead I converted the file using stat-transfer. Looking at my data I see that most labels are in the attributes and I'd love to access them and assign the pertinent variables to factors without doing the whole
2008 Mar 19
1
[PS] Two Way ANOVA
Ben, I would like to test the sulfur on the clover field, nitrogen on the clover field and then test for the presence of interaction. Sorry about the last email, seems it really screwed itself over, here it is again, hopefully nicer: Nitrogen(0) Nitrogen(20) Sulfur(0) 4.54 5.73 Sulfur(3) 4.64
2023 Oct 15
1
Create new data frame with conditional sums
Dear Jason, I do not think that the solution based on aggregate offered by GPT was correct. That quasi-solution only aggregates for every individual level. As I understand, you want the cumulative sum. The idea was proposed by Bert; you need only to sort first based on the cutoff (e.g. using an ordered factor). And then only extract the last value for each level. If Pct is unique, then you
2005 Jun 24
5
Memory limits using read.table on Windows XP Pro
Hello, When I try: geno <-read.table("2500.geno.tab",header=TRUE,sep="\t",na.strings=".",quote=" ",comment.char="",colClasses=c("factor"),nrows=2501) I get, after hour(s) of work: Error: cannot allocate vector of size 9 Kb I have: Rgui.exe --max-mem-size=3Gb and multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft
2011 Jan 27
2
creating categorical frequency tables from continuous data
Hello, I am working with a dataset which essentially has only one column - a list of distances in metres, accurate to several decimal places. eg distance 1000 6403.124 1000 1414.214 1414.214 1000 I want to organise this into a frequency table, grouping into categories of 0 - 999, 1000 - 1999, 2000-2999 etc. I'd also like the rows where there are no data points in that category to