Similar to: Help with large datasets

Displaying 20 results from an estimated 10000 matches similar to: "Help with large datasets"

2005 May 07
1
Incorrect libxml2.2.dylib version on Tiger install
Hi all, I have just installed OSX Server 10.4 and R comes up with the incompatible libxml library message reported by Dan Kelley a few messages ago. Xcode 2 does not ship with Tiger Server. I installed the X-Windows code. I can report that the version of libxml2.2 that is installed in this case is the version 8.0.0 dylib. [6]sboker at munimula:/usr/lib % ls -l libxml2.2* -rwxr-xr-x 1
1997 Oct 09
0
R-alpha: [sboker@calliope.psych.nd.edu: Re: S-PLUS on UNIX plans]
In case you did not realize how much this is related to R: [attached message follows] Return-Path: s-sender@utstat.toronto.edu From: "Steven M. Boker" <sboker@calliope.psych.nd.edu> Date: Wed, 8 Oct 97 16:37:05 -0500 To: s-news@utstat.toronto.edu Subject:
2001 Apr 20
1
Mac OS-X port of R?
I'm wondering if a Mac OS-X port of R-1.2.2 has occurred. Jan de Leeuw posted that he had made a preliminary port using calls to X windows for the GUI portions, but that file doesn't exist on his server. Is someone doing this? If not, I'll get busy and see if I can make a preliminary port. Of course, lots of work would need to be done to get everything nicely integrated with
2012 Feb 02
0
Organizing Large Datasets
Recently I've run into memory problems while using data.frames for a reasonably large dataset. I've solved those problems using arrays, and that has provoked me to do a few benchmarks. I would like to share the results. Let us start with the data. There are N subjects classified into G groups. These subjects are observed for T periods, and each observation consists of M variables. So,
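A minimal sketch of the layout the post describes, using the N subjects / T periods / M variables structure from the text; the dimension values and random data here are made up purely for illustration:

    # hypothetical dimensions: subjects, periods, variables
    n_sub <- 1000; n_per <- 50; n_var <- 5
    # array layout: one cell per subject x period x variable
    x_arr <- array(rnorm(n_sub * n_per * n_var), dim = c(n_sub, n_per, n_var))
    # equivalent "long" data.frame layout: one row per subject x period
    x_df <- data.frame(subject = rep(seq_len(n_sub), each = n_per),
                       period  = rep(seq_len(n_per), times = n_sub),
                       matrix(rnorm(n_sub * n_per * n_var), ncol = n_var))
    object.size(x_arr); object.size(x_df)   # compare memory footprints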
2018 Oct 05
5
[Bug 13645] New: Improve efficiency when resuming transfer of large files
https://bugzilla.samba.org/show_bug.cgi?id=13645
Bug ID: 13645
Summary: Improve efficiency when resuming transfer of large files
Product: rsync
Version: 3.0.9
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P5
Component: core
Assignee:
2012 Oct 06
2
Large subjects increase memory-usage and enlarge index-files
Several times we have already had the problem that accounts with more than 1.3 or 1.7 billion e-mails in one folder run out of memory, even if vsize_limit of 750 MB is set. In this case, the lmtpd process hasn't been able to allocate more memory to read/write/update the index files and crashed (and the index files become corrupted in the end). [Please -- don't discuss about the need of
2024 Jan 29
1
print data.frame with a list column
On Mon, 29 Jan 2024 14:19:21 +0200 Micha Silver <tsvibar at gmail.com> wrote: > Is there some option to force printing the full list? > df <- data.frame("name" = "A", "bands" = I(list(1:20))) format.AsIs is responsible for printing columns produced using I(). It accepts a "width" argument: format(x, width = 9999) # name
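A small sketch of the suggestion above, using the df from the question; passing a large width through format() is what keeps format.AsIs from truncating the list column:

    df <- data.frame("name" = "A", "bands" = I(list(1:20)))
    print(df)                 # default print truncates the AsIs list column
    format(df, width = 9999)  # a large width lets format.AsIs show the full list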
2003 Aug 11
3
Plea for Help with Slow Roaming Profiles
I've posted a couple of times about this problem. This is a plea for help with Roaming Profile configuration. The short problem is that logging on and logging off takes about ten minutes, with a fresh roaming profile (~1MB), on a 100Mb LAN. If anyone has any suggestions or pointers or questions, *please* pipe up. I'm all out of ideas.
2023 Nov 20
1
Calculating volume under polygons
Dear all; I am trying to calculate the volume under each polygon of a shapefile according to a DEM. When I run the code, it gives me the following error: " Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'addAttrToGeom': sp supports Z dimension only for POINT and MULTIPOINT. use `st_zm(...)` to coerce to XY dimensions
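A hedged sketch of the fix the error message itself suggests: drop the Z/M dimensions with sf::st_zm() before handing the polygons to sp-based code. The object and file names below are assumptions, not from the original script:

    library(sf)
    polys <- st_read("polygons.shp")                  # hypothetical shapefile
    # drop Z/M dimensions so sp-based functions only see XY geometries
    polys_xy <- st_zm(polys, drop = TRUE, what = "ZM")
    polys_sp <- as(polys_xy, "Spatial")               # convert to sp if needed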
2006 Apr 29
1
splitting and saving a large dataframe
Hi, I searched for this in the mailing list, but found no results. I have a large dataframe ( dim(mydata) = 1297059 16, object.size(mydata) = 145280576 ), and I want to perform some calculations which can be done by a factor's levels, say, mydata$myfactor. So what I want is to split this dataframe into nlevels(mydata$myfactor) = 80 levels. But I must do this efficiently, that is, I
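A minimal sketch of one common approach, assuming the mydata and mydata$myfactor objects from the question: split() returns one data.frame per factor level (80 pieces here), which can then be processed with lapply(). The per-level calculation below is a placeholder:

    pieces <- split(mydata, mydata$myfactor)   # one data.frame per level
    results <- lapply(pieces, function(d) {
      # placeholder for the actual per-level calculation
      colMeans(d[sapply(d, is.numeric)])
    })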
2010 Feb 12
1
ffsave.image() error with large objects
Hi, I have been using ffsave.image() to save mixture of ff and normal objects in my workspace. e.g. ffsave.image(file = "C:\output\saveobjects", rootpath = "D:\fftempdir", safe = TRUE) It works fine but once my workspace has large (~4GB) objects, I get the error: Error in ffsave.image(file = "C:\output\savedobjects", rootpath = "D:\fftempdir", safe =
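As a side note, backslashes in R string literals have to be escaped (or replaced by forward slashes), so the paths as quoted would not even parse. A hedged re-statement of the same call from the post with portable paths; the arguments are exactly those the poster used:

    library(ff)
    # forward slashes avoid having to escape backslashes on Windows
    ffsave.image(file = "C:/output/saveobjects",
                 rootpath = "D:/fftempdir",
                 safe = TRUE)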
2005 Jul 01
2
loop over large dataset
Hi All, I'd like to ask for a few clarifications. I am doing some calculations over some biggish datasets. One has ~23000 rows and 6 columns, the other has ~620000 rows and 6 columns. I am using these datasets to perform a simulation of haplotype coalescence over a pedigree (the datasets themselves are pedigree information). I created a new dataset (same number of rows as the pedigree
2004 Feb 11
7
large fonts on plots
Hi all, I need to enlarge the fonts used on R plots (plots, histograms, ...) in labels and titles etc. I seem to be unable to figure out how to do it. The problem is that the titles of the plots are simply unreadable when I insert them into my LaTeX text, since they are relatively small compared to the entire plot. I am sure it is pretty simple; can anybody give me a hint? Please reply
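A minimal base-graphics sketch: the cex.* parameters scale the title, axis-label and tick-label fonts. The factor 1.5 is just an example value:

    # enlarge titles, axis labels and tick labels by 50%
    par(cex.main = 1.5, cex.lab = 1.5, cex.axis = 1.5)
    hist(rnorm(100), main = "Example histogram", xlab = "value")
    # the same parameters can also be passed to individual plotting calls
    plot(1:10, 1:10, main = "Example plot", cex.main = 1.5, cex.lab = 1.5)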
2006 Jul 29
0
SOAP for large datasets
I've been playing around with a SOAP interface to an application that can return large datasets (up to 50 MB or so). There are also some nested structures for which I've used ActionWebService::Struct with 2-3 nested members of other ActionWebService::Struct members. In addition to chewing up a ton of memory, CPU utilization isn't that great either. My development
2013 Nov 30
1
bnlearn and very large datasets (> 1 million observations)
Hi, does anyone have experience with very large datasets and the Bayesian network package bnlearn? In my experience R doesn't react well to very large datasets. Is there a way to divide up the dataset into pieces and incrementally learn the network with the pieces? This would also be helpful in case R crashes, because I could save the network after learning each piece. Thank you.
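I am not aware of an incremental structure-learning interface in bnlearn; a hedged workaround sketch is to learn on chunks of rows and checkpoint each intermediate result so a crash does not lose everything. Apart from bnlearn::hc() and saveRDS(), all names here (big_data, file names, 10 chunks) are assumptions; combining the per-chunk networks is a separate question:

    library(bnlearn)
    set.seed(1)
    chunk_rows <- split(seq_len(nrow(big_data)),
                        cut(seq_len(nrow(big_data)), breaks = 10))
    nets <- list()
    for (i in seq_along(chunk_rows)) {
      nets[[i]] <- hc(big_data[chunk_rows[[i]], ])          # hill-climbing on one chunk
      saveRDS(nets[[i]], sprintf("net_chunk_%02d.rds", i))  # checkpoint to disk
    }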
2010 Jul 16
1
Question about KLdiv and large datasets
Hi all, when running KL on a small data set, everything is fine:
require("flexmix")
n <- 20
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a, b)
KLdiv(mydata)
however, when this dataset increases:
require("flexmix")
n <- 10000000
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a, b)
KLdiv(mydata)
KL seems not to be defined. Can somebody explain what is going
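A hedged sketch of one way to check whether size alone is the problem, assuming the mydata matrix already built as in the post and calling KLdiv() exactly as the poster does, but on progressively larger random subsets of rows:

    require("flexmix")
    set.seed(42)
    # evaluate KLdiv on growing random subsets instead of all rows at once
    for (m in c(1e3, 1e5, 1e6)) {
      print(KLdiv(mydata[sample(nrow(mydata), m), ]))
    }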
2011 Jul 20
0
Competing risk regression with CRR slow on large datasets?
Hi, I posted this question on stats.stackexchange.com 3 days ago but the answer didn't really address my question concerning the speed in competing risk regression. I hope you don't mind me asking it in this forum: I'm doing a registry-based study with almost 200 000 observations and I want to perform a competing risk analysis. My problem is that the crr() in the cmprsk package is
2013 Jan 19
2
importing large datasets in R
Hi Everyone, I am a little new to R and the first problem I am facing is the dilemma whether R is suitable for files of size 2 GB and slightly more than 2 million rows. When I try importing the data using read.table, it seems to take forever and I have to cancel the command. Are there any special techniques or methods which I can use or some tricks of the game that I should keep in mind in
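One widely documented way to speed up read.table() on a file of this size is to tell it the column classes and an upper bound on the row count, so it does not have to guess them while reading. A hedged sketch with a hypothetical file name; the row bound mirrors the roughly 2 million rows mentioned above:

    # read a small sample first to determine the column classes
    sample_rows <- read.table("bigfile.txt", header = TRUE, nrows = 100)
    classes <- sapply(sample_rows, class)
    # then read the full file with classes fixed and an upper bound on rows
    full <- read.table("bigfile.txt", header = TRUE,
                       colClasses = classes,
                       nrows = 2.1e6, comment.char = "")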
2009 Feb 26
0
glm with large datasets
Hi all, I have to run a logit regression over a large dataset and I am not sure about the best option to do it. The dataset is about 200000x2000 and R runs out of memory when creating it. After going over the help archives and the mailing lists, I think there are two main options, though I am not sure which one will be better. Of course, any alternative will be welcome as well. Actually, I
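One option often suggested for this situation is biglm::bigglm(), which fits a GLM in chunks so the full design matrix never has to sit in memory at once. A hedged sketch; the formula and data name are made up for illustration:

    library(biglm)
    # logistic regression fitted in chunks of 10000 rows
    fit <- bigglm(y ~ x1 + x2, data = big_df,
                  family = binomial(), chunksize = 10000)
    summary(fit)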
2017 Jul 18
1
Help-Multi class classification for large datasets
Hi all, We are working on multi-class classification. Currently the ranger package in R is able to handle up to 1.1 million records. Training time on 128 GB RAM is 12 days, which is not practically feasible to proceed further. In the future we will have a dataset of 10 million records; we are in search of a package or framework which can handle 10 million records with at least 12000
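For reference, a hedged sketch of the kind of ranger call being described, with the tree and thread counts made explicit; all names and values here are assumptions, not taken from the original job, and label is assumed to be a factor column:

    library(ranger)
    # multi-class classification with explicit resource settings
    fit <- ranger(label ~ ., data = train_df,
                  num.trees = 100,     # fewer trees to cut training time
                  num.threads = 32)    # use the available cores explicitly
    pred <- predict(fit, data = test_df)$predictions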