Displaying 20 results from an estimated 10000 matches similar to: "Help with large datasets"
2005 May 07
1
Incorrect libxml2.2.dylib version on Tiger install
Hi all,
I have just installed OSX Server 10.4 and R comes up with the
incompatible libxml library message reported by Dan Kelley a few
messages ago. Xcode 2 does not ship with Tiger Server. I installed
the X-Windows code. I can report that the version of libxml2.2 that is
installed in this case is the version 8.0.0 dylib.
[6]sboker at munimula:/usr/lib % ls -l libxml2.2*
-rwxr-xr-x 1
1997 Oct 09
0
R-alpha: [sboker@calliope.psych.nd.edu: Re: S-PLUS on UNIX plans]
In case you did not realize how much this is related to R :
Return-Path: s-sender@utstat.toronto.edu
From: "Steven M. Boker" <sboker@calliope.psych.nd.edu>
Date: Wed, 8 Oct 97 16:37:05 -0500
To: s-news@utstat.toronto.edu
Subject:
2001 Apr 20
1
Mac OS-X port of R?
I'm wondering if a Mac OS-X port of R-1.2.2 has occurred.
Jan de Leeuw posted that he had made a preliminary port
using calls to X windows for the GUI portions, but that
file doesn't exist on his server.
Is someone doing this? If not, I'll get busy and see if I
can make a preliminary port. Of course, lots of work would
need to be done to get everything nicely integrated with
2012 Feb 02
0
Organizing Large Datasets
Recently I've run into memory problems while using data.frames for a
reasonably large dataset. I've solved those problems using arrays, and
that has provoked me to do a few benchmarks. I would like to share the
results.
Let us start with the data. There are N subjects classified into G
groups. These subjects are observed for T periods, and each
observation consists of M variables. So,
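A minimal sketch of the two layouts being compared, using the names from the post (N, G, T, M); the sizes and everything else are invented placeholders:
# Invented sizes: N subjects in G groups, T periods, M variables each
N <- 1000; G <- 4; T <- 20; M <- 5
# Long data.frame layout: one row per subject-period
groups <- sample(seq_len(G), N, replace = TRUE)
df <- data.frame(subject = rep(seq_len(N), each = T),
                 group   = rep(groups, each = T),
                 period  = rep(seq_len(T), times = N),
                 matrix(rnorm(N * T * M), ncol = M,
                        dimnames = list(NULL, paste0("v", seq_len(M)))))
# Array layout: subjects x periods x variables, group membership kept in a vector
arr <- array(rnorm(N * T * M), dim = c(N, T, M),
             dimnames = list(NULL, NULL, paste0("v", seq_len(M))))
# Quick comparison of the memory footprint of the two representations
object.size(df)
object.size(arr)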
2018 Oct 05
5
[Bug 13645] New: Improve efficiency when resuming transfer of large files
https://bugzilla.samba.org/show_bug.cgi?id=13645
Bug ID: 13645
Summary: Improve efficiency when resuming transfer of large
files
Product: rsync
Version: 3.0.9
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P5
Component: core
Assignee:
2012 Oct 06
2
Large subjects increase memory-usage and enlarge index-files
We have already run into this problem several times: accounts with more than
1.3 or 1.7 billion e-mails in one folder run out of memory, even with a
vsize_limit of 750 MB.
In these cases the lmtpd process has not been able to allocate more
memory to read/write/update the index files and has crashed (and the
index files end up corrupted.)
[Please -- don't discuss the need of
2024 Jan 29
1
print data.frame with a list column
On Mon, 29 Jan 2024 14:19:21 +0200
Micha Silver <tsvibar at gmail.com> wrote:
> Is there some option to force printing the full list?
> df <- data.frame("name" = "A", "bands" = I(list(1:20)))
format.AsIs is responsible for printing columns produced using I(). It
accepts a "width" argument:
format(x, width = 9999)
# name
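A self-contained sketch of the suggestion, reusing the example from the thread; the width value is just the one quoted above, and the only assumption is that format() dispatches to format.AsIs() for the I() column:
df <- data.frame(name = "A", bands = I(list(1:20)))
# Default printing truncates the AsIs list column
df
# format.AsIs() takes a width argument, so the column can be expanded on its own
format(df$bands, width = 9999)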
2003 Aug 11
3
Plea for Help with Slow Roaming Profiles
I've posted a couple of times about this problem. This is a plea for help
with Roaming Profile configuration.
The short version: logging on and logging off takes about ten
minutes, with a fresh roaming profile (~1 MB), on a 100 Mb LAN.
If anyone has any suggestions or pointers or questions, *please* pipe up.
I'm all out of ideas.
2023 Nov 20
1
Calculating volume under polygons
Dear all;
I am trying to calculate the volume under each polygon of a shapefile based
on a DEM.
When I run the code, it gives me the following error:
"
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function
'addAttrToGeom': sp supports Z dimension only for POINT and MULTIPOINT.
use `st_zm(...)` to coerce to XY dimensions
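As the error message suggests, dropping the Z/M dimension before anything is handed to sp is one way around this; a short sketch (the file name is a placeholder):
library(sf)
# Read the polygons and drop the Z (and M) coordinates so that only
# XY geometries are passed on to code that goes through sp
polys <- st_read("polygons.shp")
polys_xy <- st_zm(polys, drop = TRUE, what = "ZM")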
2006 Apr 29
1
splitting and saving a large dataframe
Hi,
I searched for this in the mailing list, but found no results.
I have a large data frame (dim(mydata) = 1297059 16, object.size(mydata) =
145280576), and I want to perform some calculations which can be done by
a factor's levels, say mydata$myfactor. So what I want is to split this
data frame into nlevels(mydata$myfactor) = 80 pieces. But I must do this
efficiently, that is, I
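A small sketch of the usual split/apply route; mydata and myfactor are the names from the post, and per_level_calc() is a placeholder for the actual calculation:
# Split once into a list of ~80 data frames, one per factor level
pieces  <- split(mydata, mydata$myfactor)
results <- lapply(pieces, per_level_calc)
# Or split only the row indices, which avoids copying the data up front
idx     <- split(seq_len(nrow(mydata)), mydata$myfactor)
results <- lapply(idx, function(i) per_level_calc(mydata[i, ]))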
2010 Feb 12
1
ffsave.image() error with large objects
Hi, I have been using ffsave.image() to save a mixture of ff and normal
objects in my workspace, e.g.
ffsave.image(file = "C:\output\saveobjects", rootpath =
"D:\fftempdir", safe = TRUE)
It works fine but once my workspace has large (~4GB) objects, I get the error:
Error in ffsave.image(file = "C:\output\savedobjects", rootpath =
"D:\fftempdir", safe =
2005 Jul 01
2
loop over large dataset
Hi All,
I'd like to ask for a few clarifications. I am doing some calculations
over some biggish datasets. One has ~ 23000 rows, and 6 columns, the
other has ~620000 rows and 6 columns.
I am using these datasets to perform a simulation of haplotype
coalescence over a pedigree (the datasets themselves are pedigree
information). I created a new dataset (same number of rows as the
pedigree
2004 Feb 11
7
large fonts on plots
Hi all,
I need to enlarge the fonts used on R plots (plots, histograms, ...) in
labels, titles etc.
I seem to be unable to figure out how to do it. The problem is that the
titles of the plots are simply unreadable when I insert them into my LaTeX
text, since they are relatively small compared to the entire plot.
I am sure it is pretty simple, can anybody give me a hint?
Please reply
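A short sketch of the usual knobs (all values arbitrary): the cex* graphical parameters scale titles, labels and axis text, and making the device itself smaller has a similar effect once the figure is scaled into the LaTeX page.
# Open a smaller device so the default text is larger relative to the plot
pdf("myplot.pdf", width = 4, height = 4, pointsize = 12)
# Scale title, axis-label and tick-label text explicitly
par(cex.main = 2, cex.lab = 1.6, cex.axis = 1.4)
hist(rnorm(100), main = "A readable title", xlab = "x")
dev.off()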
2006 Jul 29
0
SOAP for large datasets
I've been playing around with a SOAP interface to an application that
can return large datasets (up to 50 MB or so). There are also some
nested structures for which I've used ActionWebService::Struct with
2-3 nested members of other ActionWebService::Struct members. In
addition to chewing up a ton of memory, CPU utilization isn't that
great either. My development
2013 Nov 30
1
bnlearn and very large datasets (> 1 million observations)
Hi
Anyone have experience with very large datasets and the Bayesian Network
package, bnlearn? In my experience R doesn't react well to very large
datasets.
Is there a way to divide up the dataset into pieces and incrementally learn
the network with the pieces? This would also be helpful in case R crashes,
because I could save the network after learning each piece.
Thank you.
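The snippet contains no answer, but a very rough sketch of chunk-at-a-time learning with checkpoints could look like this; it assumes the bnlearn package, is not statistically equivalent to learning the structure from the full data, and bigdata and the chunk count are placeholders:
library(bnlearn)
# Split the rows into 10 chunks
chunks <- split(bigdata, rep(seq_len(10), length.out = nrow(bigdata)))
net <- NULL
for (i in seq_along(chunks)) {
  # hc() accepts a starting structure, so each chunk refines the previous result
  net <- hc(chunks[[i]], start = net)
  # Checkpoint after every chunk in case R crashes
  saveRDS(net, sprintf("network_after_chunk_%02d.rds", i))
}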
2010 Jul 16
1
Question about KLdiv and large datasets
Hi all,
when running KL on a small data set, everything is fine:
require("flexmix")
n <- 20
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a,b)
KLdiv(mydata)
however, when this dataset increases
require("flexmix")
n <- 10000000
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a,b)
KLdiv(mydata)
KL seems not to be defined. Can somebody explain what is going
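The snippet is cut off before any answer; one pragmatic workaround (only a workaround, not an explanation of the behaviour) is to run KLdiv() on a random subsample of the rows:
require("flexmix")
n <- 10000000
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a, b)
# Evaluate on a random subsample instead of all ten million rows
set.seed(1)
KLdiv(mydata[sample(n, 1e5), ])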
2011 Jul 20
0
Competing risk regression with CRR slow on large datasets?
Hi,
I posted this question on stats.stackexchange.com 3 days ago but the
answer didn't really address my question concerning the speed in
competing risk regression. I hope you don't mind me asking it in this
forum:
I'm doing a registry-based study with almost 200 000 observations and
I want to perform a competing risk analysis. My problem is that the
crr() in the cmprsk package is
2013 Jan 19
2
importing large datasets in R
Hi Everyone,
I am a little new to R, and the first problem I am facing is the dilemma of
whether R is suitable for files of about 2 GB and slightly more than 2
million rows. When I try importing the data using read.table, it seems to
take forever and I have to cancel the command. Are there any special
techniques or methods which I can use, or some tricks of the game that I
should keep in mind in
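The usual first steps for read.table() at this size, plus the common alternative, as a sketch; the file name, separator, column classes and row count are placeholders:
# Telling read.table() the column classes and switching off quote/comment
# scanning avoids a lot of the overhead on a ~2 GB file
dat <- read.table("bigfile.txt", header = TRUE, sep = "\t",
                  colClasses = c("integer", "character", rep("numeric", 4)),
                  nrows = 2100000, comment.char = "", quote = "")
# data.table::fread() is a much faster drop-in choice for files this size
# (assuming the data.table package is installed)
dat <- data.table::fread("bigfile.txt")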
2009 Feb 26
0
glm with large datasets
Hi all,
I have to run a logit regression over a large dataset and I am not sure
about the best option to do it. The dataset is about 200000x2000 and R
runs out of memory when creating it.
After going over help archives and the mailing lists, I think there are
two main options, though I am not sure about which one will be better.
Of course, any alternative will be welcome as well.
Actually, I
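One of the options usually mentioned for this situation is chunked fitting; a hedged sketch assuming the biglm package, with mydata, y and x1..x3 as placeholders for the real data and formula:
library(biglm)
# bigglm() processes the data in chunks, so the full model matrix for a
# 200000 x 2000 problem never has to be held in memory at once
fit <- bigglm(y ~ x1 + x2 + x3, data = mydata,
              family = binomial(), chunksize = 10000)
summary(fit)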
2017 Jul 18
1
Help-Multi class classification for large datasets
Hi all,
We are working on multi-class classification. Currently the ranger package
in R is able to handle up to 1.1 million records. Training time on 128 GB of
RAM is 12 days, which is not practically feasible to proceed further with.
In future we will have a dataset of 10 million records, so we are in
search of a package or framework which can handle 10 million records with
at least 12000
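For reference, a sketch of the kind of ranger call being scaled here (assuming the ranger package; train, label and newdata are placeholders; num.threads controls the built-in parallelism):
library(ranger)
fit <- ranger(dependent.variable.name = "label",
              data = train,
              num.trees = 100,
              num.threads = 32,   # spread training across CPU threads
              verbose = TRUE)
pred <- predict(fit, data = newdata)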