Displaying 20 results from an estimated 10000 matches similar to: "Organizing Large Datasets"
2010 Feb 20
1
What is your system for WorkFlow and Source Code Organizing in R ?
Hello dear R users,
Recently there have been several fascinating threads on the website
stackoverflow <http://stackoverflow.com> regarding the subject of R WorkFlow
best practices:
- What best practices do you use for programming in
R?<http://stackoverflow.com/questions/2258092/what-best-practices-do-you-use-for-programming-in-r>
- Workflow for statistical analysis and report
2012 Feb 10
1
Need to aggregate large dataset by week...
Hi all,
I have a large dataset with ~8600 observations that I want to compress to
weekly means. There are 9 variables (columns), and I have already added a
"week" column with 51 weeks. I have been looking at the functions:
aggregate, tapply, apply, etc., and I am just not savvy enough with R to
figure this out on my own, though I'm sure it's fairly easy. I also have the
Dates
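A minimal sketch of one aggregate() approach, assuming a data frame dat
holding the nine numeric columns plus the week column (all names here are
hypothetical):
# toy stand-in for the real data: 8600 rows, 9 measures, a week column
dat <- data.frame(week = rep(1:51, length.out = 8600),
                  matrix(rnorm(8600 * 9), ncol = 9))
# mean of every other column within each week -> 51 rows of weekly means
weekly <- aggregate(. ~ week, data = dat, FUN = mean)
head(weekly)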
2001 Mar 20
2
Help with large datasets
I am a new user of R (1.2.2 on Alpha), but have been using S and then
Splus heavily for about 10 years now. My problem is this. The data
I analyze comprise large sets. Typically I am analyzing 5000 observations
on 90 variables over several hundred subjects. Sometimes 500,000
subjects with 200 variables. Unfortunately, although my Alpha has
1.5 GB of RAM, R as it is configured seems to be set
2008 Sep 15
1
Help... Organizing multiple spreadsheets data into a huge R data structure!
Hello R users,
I am relatively new to the R program, and I hope some of you can offer
me some suggestions on how to organize my data in R using some of the
more advanced data structuring techniques. Here's my scenario:
I have a data set of 50 participants (each with conditions and
demographic data); each participant performed 2x16 trials, and for each
trial there was specific information about the
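One common approach to this kind of nesting is a single long-format data
frame, one row per trial, with the participant-level variables repeated on
each row. A minimal sketch with hypothetical variable names:
# participant-level table: id, condition, demographics
participants <- data.frame(id = 1:50,
                           condition = sample(c("A", "B"), 50, replace = TRUE),
                           age = sample(20:60, 50, replace = TRUE))
# trial-level table: 50 ids x 2 blocks x 16 trials = 1600 rows
trials <- expand.grid(id = 1:50, block = 1:2, trial = 1:16)
trials$rt <- rnorm(nrow(trials), mean = 500, sd = 50)  # per-trial measure
# merge() repeats each participant's data across that participant's trials
dat <- merge(participants, trials, by = "id")
str(dat)
Subsetting and aggregation then work with ordinary indexing instead of
walking a nested structure.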
2015 Jun 24
0
Organizing a Pre Astricon road trip
Hi All,
I am cross posting this to the Asterisk Users, Biz, and Dev lists at the
suggestion of David Duffet, so sorry if you see it multiple times.
As Astricon is in Orlando this year, and most of us are tech geeks in one
form or another, we are trying to organize a road trip to see NASA's
Kennedy Space Center.
I have been in touch with their group sales office and was told that there
is a
2006 Sep 19
5
Recommendations for organizing hosts into groups?
I have a few different groupings of hosts, and I am wondering what is
the best way to organize the node configuration for them.
I have a few machines that are VMware hosts, a bunch of VMware
guests, and the main admin server which runs a bunch of stuff like
puppetmasterd.
I've got a bunch of classes/*.pp which define configuration for sudo,
yum, java, etc. Right now I just have
2015 Jun 25
0
Organizing a Pre Astricon road trip (Eric Klein)
Sorry, apparently I forgot that we are looking at the Monday before Devcon
for this trip.
2000 Sep 28
2
organizing work; dump function
Dear R-users!
I am using R 1.0.0 and Windows NT 4.0.
In the past I have used several different working directories for different
projects, and during many of these projects I have written some functions
for particular purposes. Now I thought it would be nice to have all these
"personal" functions collected in one place, and to make them available in R
no matter which
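A minimal sketch of the dump()/source() route, with hypothetical function
and file names:
# write the definitions of selected functions to one file
myfun1 <- function(x) x^2
myfun2 <- function(x) sqrt(abs(x))
dump(c("myfun1", "myfun2"), file = "personal.R")
# later, from any working directory, reload them with
source("personal.R")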
2002 Apr 29
3
Organizing the help files in a package
Dear all!!
I am using R 1.4.1 on Windows 98.
I have been trying to organize a package and have already been able to
document some of the functions into .Rd (R documentation) files. From these
.Rd files I generated plain text files as well as HTML files.
I have also given the 00Index file in each of the directories:
html/
help/
data/
man/
Problem: I don't get the help using command
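For the .Rd files themselves, prompt() writes a skeleton that can be edited
and placed in the package's man/ directory; a minimal sketch with a
hypothetical function:
myfun <- function(x, na.rm = FALSE) mean(x, na.rm = na.rm)
# writes a documentation skeleton to myfun.Rd in the working directory
prompt(myfun, filename = "myfun.Rd")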
2000 Sep 29
1
SUMMARY and follow up question: organizing work; dump function
Thanks very much to Friedrich Leisch, Brian Ripley and Thomas Lumley for
their useful answers.
Basically, they all suggested creating a personal package as the easiest
and best way to hold one's own functions. This indeed is quite easy and very
useful.
In addition, Friedrich Leisch pointed out that he's planning to add an
"append" argument to the dump() function.
One follow up
2011 Jun 07
1
Packages for R-CRAN (organizing aspects)
Hello,
I have some ideas for packages that I want to provide on R-CRAN.
One package already works, but I get some warnings when compiling.
Also, I don't know whether the libraries in use work only on
Unix/Linux. So I have some general questions:
- If I'm not sure whether the code would also work on Windows
(needing certain libraries or tools),
would it be better to
2005 Apr 24
1
large dataset import, aggregation and reshape
Dear useRs
We have a data set (comma delimited) with 12 million rows and 5
columns (in fact many more, but we need only 4 of them): id, factor 'a'
(5 levels), factor 'b' (15 levels), date-stamp, numeric measurement. We
run R on SuSE Linux 9.1 with 2 GB RAM (and a 3.5 GB swap file).
On average we have 30 obs. per id. We want to aggregate (e.g. sum of the
measurements under
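A minimal sketch of one approach, with hypothetical file and column names:
read.table() can drop the unneeded column at read time via colClasses, and
aggregate() can then sum the measurement within id:
# "NULL" entries in colClasses skip those columns entirely on input
cc <- c("integer", "factor", "factor", "NULL", "numeric")  # drops date-stamp
dat <- read.table("big.csv", sep = ",", header = TRUE, colClasses = cc)
names(dat) <- c("id", "a", "b", "measurement")
sums <- aggregate(measurement ~ id + a + b, data = dat, FUN = sum)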
2008 Jan 24
3
How should I organize data to compare differences in matched pairs?
I'm just learning how to use R right now, so I'm not sure what the most
efficient way to organize these data is.
I had subjects perform the same task twice with slight changes between the
rounds. I want to analyze differences between the rounds. All of the
subjects also answered a questionnaire.
Putting all of one subject's information on one row seems sloppy.
I was thinking about
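A minimal sketch with hypothetical measures: long format (one row per
subject per round) keeps the pairing explicit, and reshape() recovers a
wide layout when a paired test needs it:
dat <- data.frame(subject = rep(1:20, each = 2),
                  round = rep(1:2, times = 20),
                  score = rnorm(40))
# wide layout: one row per subject, columns score.1 and score.2
wide <- reshape(dat, idvar = "subject", timevar = "round",
                direction = "wide")
t.test(wide$score.1, wide$score.2, paired = TRUE)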
2006 May 18
1
Patch 5091
Hi,
I submitted a patch (for the first time) a couple of days ago
(http://dev.rubyonrails.org/ticket/5091 - the description is included
below).
I've noticed some posts to rails-core notifying the world of patches;
since there's been no activity on said patch I thought I'd do the
same. Please let me know if this breaks protocol.
Cheers,
Ian White
--- patch summary
2006 Jul 29
0
SOAP for large datasets
I've been playing around with a SOAP interface to an application that
can return large datasets (up to 50 MB or so). There are also some
nested structures for which I've used ActionWebService::Struct with
2-3 nested members of other ActionWebService::Struct members. In
addition to chewing up a ton of memory, CPU utilization isn't that
great either. My development
2013 Nov 30
1
bnlearn and very large datasets (> 1 million observations)
Hi
Anyone have experience with very large datasets and the Bayesian Network
package, bnlearn? In my experience R doesn't react well to very large
datasets.
Is there a way to divide up the dataset into pieces and incrementally learn
the network with the pieces? This would also be helpful in case R crashes,
because I could save the network after learning each piece.
Thank you.
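As far as I know bnlearn has no built-in incremental mode, but one workable
sketch is to learn on a manageable subsample and checkpoint after each
piece (mydata is a hypothetical data frame; whether subsampling is
statistically adequate depends on the data):
library(bnlearn)
piece <- mydata[sample(nrow(mydata), 1e5), ]  # one random piece
dag <- hc(piece)              # hill-climbing structure learning
saveRDS(dag, "dag.rds")       # checkpoint in case R crashes
fitted <- bn.fit(dag, piece)  # parameter estimates for that structure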
2010 Jul 16
1
Question about KLdiv and large datasets
Hi all,
when running KL on a small data set, everything is fine:
require("flexmix")
n <- 20
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a,b)
KLdiv(mydata)
however, when this dataset increases
require("flexmix")
n <- 10000000
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a,b)
KLdiv(mydata)
KL seems to be undefined. Can somebody explain what is going
2011 Jul 20
0
Competing risk regression with CRR slow on large datasets?
Hi,
I posted this question on stats.stackexchange.com 3 days ago but the
answer didn't really address my question concerning the speed in
competing risk regression. I hope you don't mind me asking it in this
forum:
I'm doing a registry-based study with almost 200,000 observations, and
I want to perform a competing risk analysis. My problem is that the
crr() in the cmprsk package is
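For reference, a minimal sketch of a basic crr() call on simulated data
(all names hypothetical):
library(cmprsk)
n <- 1000
ftime <- rexp(n)                           # follow-up times
fstatus <- sample(0:2, n, replace = TRUE)  # 0 censored, 1 event, 2 competing
cov1 <- cbind(age = rnorm(n, 60, 10), male = rbinom(n, 1, 0.5))
fit <- crr(ftime, fstatus, cov1, failcode = 1, cencode = 0)
summary(fit)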
2013 Jan 19
2
importing large datasets in R
Hi Everyone,
I am a little new to R, and the first problem I am facing is the dilemma
of whether R is suitable for files of about 2 GB and slightly more than 2
million rows. When I try importing the data using read.table, it seems to
take forever and I have to cancel the command. Are there any special
techniques or methods which I can use, or some tricks of the game that I
should keep in mind in
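A minimal sketch of the usual read.table() speedups, assuming a
hypothetical five-column layout: declaring the column types and an
(over)estimate of the row count avoids type guessing and repeated
reallocation:
cc <- c("integer", "character", rep("numeric", 3))  # assumed layout
dat <- read.table("big.txt", header = TRUE, sep = "\t",
                  colClasses = cc, nrows = 2.1e6, comment.char = "")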
2009 Feb 26
0
glm with large datasets
Hi all,
I have to run a logit regression over a large dataset, and I am not sure
about the best option for doing it. The dataset is about 200000x2000 and R
runs out of memory when creating it.
After going over help archives and the mailing lists, I think there are
two main options, though I am not sure which one will be better.
Of course, any alternative will be welcome as well.
Actually, I
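One option worth knowing about is the biglm package, whose bigglm() fits
the model in chunks so the full model matrix never sits in memory at once;
a minimal sketch with hypothetical variable names:
library(biglm)
fit <- bigglm(y ~ x1 + x2, data = mydata,
              family = binomial(), chunksize = 10000)
summary(fit)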