Displaying 20 results from an estimated 1000 matches similar to: ""hist" combines two lowest categories -- is there a workaround?"
2006 Sep 27
3
Space required by object?
Does R provide a function analogous to ls() or str() that reports the
storage space, on disk or in memory, required by objects?
Ben Fairbank
2006 Dec 14
5
Better way to change the name of a column in a dataframe?
Hello R users --
If I have a dataframe such as the following, named "frame" with the
columns intended to be named col1 through col6,
> frame
     col1 col2 cmlo3 col4 col5 col6
[1,]    3   10     2    6    5    7
[2,]    6    8     4   10    7    1
[3,]    7    5     1    3    1    8
[4,]   10    6     5    4    9    2
and I want to correct or otherwise change the
2007 Apr 24
4
Size of an object in workspace
Hi folks,
Is there a function to show the size of an R object, e.g. in Kbytes?
A couple of months ago Bendix Carstensen posted this marvelous little function lls(), which shows all objects in the current workspace by mode, class and 'size'. This is a wonderful enhancement to the built-in ls() already and I now have it sourced in my Rprofile.site at startup.
The only drawback is,
2006 Jan 05
2
Splitting the list
I've changed the heading because this really is another thread. I
think it inevitable that there will, in the course of time, be other
lists that are devoted, in some shape or form, to the concerns of
practitioners (at all levels) who are using R. One development I'd
not like to see is fracture along application area lines, allowing
those who are comfortable in coteries whose
2008 Jun 20
2
The Green Book and its relevance to R
I bogged down about half way through reading the Green Book, in part
because it became increasingly difficult to understand how some of the
ideas related to R, as opposed to S (which I have not used). Does any
reader know whether there is a document that points out differences
between S and R that would be helpful in reading the Green Book?
Ideally, perhaps, I need a "crib sheet" to
2005 Jun 28
1
Using data frames for EDA: Insert, Change name, delete columns? (Newcomer's question)
I am finding complex analyses easier than some elementary operations in
R. In particular I want to do some low level exploratory data analyses
with data in a data frame but cannot find commands to easily insert,
remove (delete), rename, and re-order (arbitrarily, not sort) columns.
I see that the micEcon package has an insertCol command, but that is for
matrices, not data frames. I have looked
2011 Feb 14
1
problem running scripts
Dear all,
I have encountered an odd situation.
I have various R scripts interconnected via the source () function.
After examining the results I noticed that not all the functions or procedures within a script were adequately conducted.
Especially with the longest script ( about 180 lines)
Then, I ran every script individually (not using source () ) selecting all (Ctrl + a) and running the
2007 Jan 19
4
Newbie question: Statistical functions (e.g., mean, sd) in a "transform" statement?
Greetings listeRs -
Given a data frame such as
times
      time1    time2     time3    time4
1 70.408543 48.92378  7.399605 95.93050
2 17.231940 27.48530 82.962916 10.20619
3 20.279220 10.33575 66.209290 30.71846
4        NA 53.31993 12.398237 35.65782
5  9.295965       NA 48.929201       NA
6 63.966518 42.16304  1.777342       NA
one can use "transform" to
2009 Feb 02
1
survfit using quantiles to group age
I am using the package Design for survival analysis. I want to plot a
simple Kaplan-Meier fit of survival vs. age, with age grouped as
quantiles. I can do this:
survplot(survfit(Surv(time,status) ~ cut(age,3), data=veteran))
but I would like to do something like this:
survplot(survfit(Surv(time,status) ~ quantile(age,3), data=veteran))
#will not work
ideally I would like to superimpose
2012 Mar 06
1
How to eliminate for next loops in this script
I needed to compute a complicated cross tabulation to show weighted means
and standard deviations and the only method I could get that worked uses a
series of nested for next loops. I know that there must be a better way to
do so, but could use some assistance pointing the way.
Here is my working, but inefficient script:
library(Hmisc)
rm(list=ls())
load('NHTS.Rdata')
day.wt <-
2006 May 13
2
What does it mean to be "masked from data" when attaching? (Newbie question)
I have several data frames, each with six variables and several hundred
cases broken out from a larger dataframe by eleven values of a factor
called "Division". I have to perform the same analysis on each one. I
would like to do it by creating a data frame called data2 eleven times,
once with data corresponding to each value of the factor, and performing
the same analysis on each of
2002 Oct 03
0
[Fwd: curiousity with hist]
just realized that the bin value is actually the relative frequency
divided by the bin width. sorry for consuming bandwidth.
Alas, is there any way to make hist() calculate relative frequencies
irrespective of bin width?
thanks
Murad Nayal wrote:
>
> Hello,
>
> I am rather new to R. in trying to use the hist() command I get behavior
> that is somewhat puzzling me, in short,
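The relationship described in this post (bin value = relative frequency divided by bin width) can be checked directly. The following is only a minimal sketch: the data vector and the unequal breaks are made up for illustration and do not come from the original thread.

```r
# Illustrative data and deliberately unequal bin widths (both are assumptions).
x <- c(0.5, 1.2, 1.8, 2.5, 3.1, 4.7, 6.3, 9.9)
breaks <- c(0, 2, 4, 10)

# plot = FALSE returns the histogram object without drawing it.
h <- hist(x, breaks = breaks, plot = FALSE)

# Relative frequency of each bin: counts divided by the total count.
rel_freq <- h$counts / sum(h$counts)

# The density reported by hist() is the relative frequency divided by bin width.
bin_width <- diff(h$breaks)
stopifnot(isTRUE(all.equal(h$density, rel_freq / bin_width)))

# rel_freq does not depend on bin width and sums to 1.
print(rel_freq)
```

One way to display relative frequencies irrespective of bin width would be to pass rel_freq to barplot() rather than plotting the hist object itself.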
2007 Feb 12
1
'Save Workspace' gives "recursive default argument reference" -- workaround?
When signing off R or trying to save a workspace in Windows XP pro SP2,
I receive the following error message -
save.image("C:\\Program Files\\R\\R-2.4.1\\Responses3.RData")
Error in save.image("C:\\Program Files\\R\\R-2.4.1\\Responses3.RData") :
recursive default argument reference
Everything else seems to work fine, and the only function I have written
2023 Oct 16
1
Create new data frame with conditional sums
Dear Jason,
The code could look something like:
dummyData = data.frame(Tract=seq(1, 10, by=1),
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
# Define the cutoffs
# - allow for duplicate entries;
by = 0.03; # by = 0.01;
cutoffs <- seq(0, 0.20, by = by)
# Create a new column with cutoffs
dummyData$Cutoff
2023 Oct 16
1
Create new data frame with conditional sums
If one makes the reasonable assumption that Pct is much larger than
Cutoff, sorting Cutoff is the expensive part, e.g. O(n log2(n)) for
Quicksort (n = length of Cutoff). I believe looping is O(n^2). Jeff's
approach using findInterval may be faster. Of course implementation
details matter.
-- Bert
On Mon, Oct 16, 2023 at 4:41 AM Leonard Mada <leo.mada at syonic.eu> wrote:
>
> Dear
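A findInterval-based approach of the kind mentioned in this thread might look like the sketch below. It is not Jeff's actual code, which is not shown here. It reuses the dummyData values posted in the thread and assumes that the wanted quantity is the total Totpop over all tracts whose Pct is at or below each cutoff.

```r
# Data copied from the dummyData example posted in this thread.
dummyData <- data.frame(
  Tract  = seq(1, 10, by = 1),
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
)
cutoffs <- seq(0, 0.20, by = 0.03)

# Assumption: the target is sum(Totpop) over tracts with Pct <= cutoff.
ord        <- order(dummyData$Pct)
sorted_pct <- dummyData$Pct[ord]
# Running total in Pct order; the leading 0 covers "no tract qualifies".
running    <- c(0, cumsum(dummyData$Totpop[ord]))

# findInterval() gives, for each cutoff, how many sorted Pct values are <= it.
n_below <- findInterval(cutoffs, sorted_pct)
result  <- data.frame(Cutoff = cutoffs, Pop = running[n_below + 1])

# Brute-force check: one pass over the data per cutoff.
check <- sapply(cutoffs, function(k) sum(dummyData$Totpop[dummyData$Pct <= k]))
stopifnot(isTRUE(all.equal(result$Pop, check)))
```

The sort and cumulative sum are done once, after which each cutoff costs only a binary search, whereas the brute-force check rescans the data for every cutoff.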
2010 Aug 25
1
accessing the attr(*,label.table) after importing from spss
Dear all,
I just received a file from a colleague in spss. The read.spss could not finish the file due to an error (Unrecognized record type 7, subtype 18 encountered in system file) so instead I converted the file using stat-transfer. Looking at my data I see that most labels are in the attributes and I'd love to access them and assign the pertinent variables to factors without doing the whole
2008 Mar 19
1
[PS] Two Way ANOVA
Ben,
I would like to test the sulfur on the clover field, nitrogen on the clover field and then test for the presence of interaction.
Sorry about the last email, seems it really screwed itself over, here it is again, hopefully nicer:
           Nitrogen(0)  Nitrogen(20)
Sulfur(0)  4.54         5.73
Sulfur(3)  4.64
2023 Oct 15
1
Create new data frame with conditional sums
Dear Jason,
I do not think that the solution based on aggregate offered by GPT was
correct. That quasi-solution only aggregates for every individual level.
As I understand, you want the cumulative sum. The idea was proposed by
Bert; you need only to sort first based on the cutoff (e.g. using an
ordered factor). And then only extract the last value for each level. If
Pct is unique, then you
2005 Jun 24
5
Memory limits using read.table on Windows XP Pro
Hello,
When I try:
geno <- read.table("2500.geno.tab", header=TRUE, sep="\t", na.strings=".", quote="",
                   comment.char="", colClasses=c("factor"), nrows=2501)
I get, after hour(s) of work:
Error: cannot allocate vector of size 9 Kb
I have:
Rgui.exe --max-mem-size=3Gb
and
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft
2011 Jan 27
2
creating categorical frequency tables from continuous data
Hello,
I am working with a dataset which essentially has only one column - a
list of distances in metres, accurate to several decimal places. eg
distance
1000
6403.124
1000
1414.214
1414.214
1000
I want to organise this into a frequency table, grouping into categories
of 0 - 999, 1000 - 1999, 2000-2999 etc. I'd also like the rows where
there are no data points in that category to
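One possible way to build such a table is cut() followed by table(), sketched below with the six sample distances from this post. The upper limit of 7000 is an assumption chosen only to cover the sample, and the half-open intervals [0,1000), [1000,2000), ... stand in for the requested 0 - 999, 1000 - 1999 categories.

```r
# Sample distances copied from the post.
distance <- c(1000, 6403.124, 1000, 1414.214, 1414.214, 1000)

# Upper limit chosen to cover the sample (an assumption, not from the post).
breaks <- seq(0, 7000, by = 1000)

# right = FALSE makes the intervals closed on the left, so exactly 1000
# falls into [1000,2000) rather than into the lowest category.
bins <- cut(distance, breaks = breaks, right = FALSE, dig.lab = 4)

# table() keeps every factor level, so empty categories appear with count 0.
freq <- table(bins)
print(as.data.frame(freq))
```

Because cut() returns a factor with one level per interval, categories with no data points are kept in the table instead of being dropped.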