Displaying 20 results from an estimated 10000 matches similar to: "Help with large datasets"
2005 May 07
1
Incorrect libxml2.2.dylib version on Tiger install
Hi all,
I have just installed OSX Server 10.4 and R comes up with the
incompatible libxml library message reported by Dan Kelley a few
messages ago. Xcode 2 does not ship with Tiger Server. I installed
the X-Windows code. I can report that the version of libxml2.2 that is
installed in this case is the version 8.0.0 dylib.
[6]sboker at munimula:/usr/lib % ls -l libxml2.2*
-rwxr-xr-x 1
1997 Oct 09
0
R-alpha: [sboker@calliope.psych.nd.edu: Re: S-PLUS on UNIX plans]
In case you did not realize how much this is related to R :
Return-Path: s-sender@utstat.toronto.edu
From: "Steven M. Boker" <sboker@calliope.psych.nd.edu>
Date: Wed, 8 Oct 97 16:37:05 -0500
To: s-news@utstat.toronto.edu
Subject:
2001 Apr 20
1
Mac OS-X port of R?
I'm wondering if a Mac OS-X port of R-1.2.2 has occurred.
Jan de Leeuw posted that he had made a preliminary port
using calls to X windows for the GUI portions, but that
file doesn't exist on his server.
Is someone doing this? If not, I'll get busy and see if I
can make a preliminary port. Of course, lots of work would
need to be done to get everything nicely integrated with
2012 Feb 02
0
Organizing Large Datasets
Recently I've run into memory problems while using data.frames for a
reasonably large dataset. I've solved those problems using arrays, and
that has provoked me to do a few benchmarks. I would like to share the
results.
Let us start with the data. There are N subjects classified into G
groups. These subjects are observed for T periods, and each
observation consists of M variables. So,
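A minimal sketch of the two layouts being compared, using the names from the post (N, G, T, M); the sizes and everything else are invented placeholders:
# Invented sizes: N subjects in G groups, T periods, M variables each
N <- 1000; G <- 4; T <- 20; M <- 5
# Long data.frame layout: one row per subject-period
groups <- sample(seq_len(G), N, replace = TRUE)
df <- data.frame(subject = rep(seq_len(N), each = T),
                 group   = rep(groups, each = T),
                 period  = rep(seq_len(T), times = N),
                 matrix(rnorm(N * T * M), ncol = M,
                        dimnames = list(NULL, paste0("v", seq_len(M)))))
# Array layout: subjects x periods x variables, group membership kept in a vector
arr <- array(rnorm(N * T * M), dim = c(N, T, M),
             dimnames = list(NULL, NULL, paste0("v", seq_len(M))))
# Quick comparison of the memory footprint of the two representations
object.size(df)
object.size(arr)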
2018 Oct 05
5
[Bug 13645] New: Improve efficiency when resuming transfer of large files
https://bugzilla.samba.org/show_bug.cgi?id=13645
Bug ID: 13645
Summary: Improve efficiency when resuming transfer of large
files
Product: rsync
Version: 3.0.9
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P5
Component: core
Assignee:
2012 Oct 06
2
Large subjects increase memory-usage and enlarge index-files
We have already run into this problem several times: accounts with more than
1.3 or 1.7 billion e-mails in one folder run out of memory, even with a
vsize_limit of 750 MB.
In these cases the lmtpd process has not been able to allocate more
memory to read/write/update the index files and has crashed (and the
index files end up corrupted.)
[Please -- don't discuss the need of
2024 Jan 29
1
print data.frame with a list column
On Mon, 29 Jan 2024 14:19:21 +0200
Micha Silver <tsvibar at gmail.com> wrote:
> Is there some option to force printing the full list?
> df <- data.frame("name" = "A", "bands" = I(list(1:20)))
format.AsIs is responsible for printing columns produced using I(). It
accepts a "width" argument:
format(x, width = 9999)
# name
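A self-contained sketch of the suggestion, reusing the example from the thread; the width value is just the one quoted above, and the only assumption is that format() dispatches to format.AsIs() for the I() column:
df <- data.frame(name = "A", bands = I(list(1:20)))
# Default printing truncates the AsIs list column
df
# format.AsIs() takes a width argument, so the column can be expanded on its own
format(df$bands, width = 9999)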
2003 Aug 11
3
Plea for Help with Slow Roaming Profiles
I've posted a couple of times about this problem. This is a plea for help
with Roaming Profile configuration.
The short version: logging on and logging off takes about ten
minutes, with a fresh roaming profile (~1 MB), on a 100 Mb LAN.
If anyone has any suggestions or pointers or questions, *please* pipe up.
I'm all out of ideas.
2023 Nov 20
1
Calculating volume under polygons
Dear all;
I am trying to calculate the volume under each polygon of a shapefile based
on a DEM.
When I run the code, it gives me the following error:
"
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function
'addAttrToGeom': sp supports Z dimension only for POINT and MULTIPOINT.
use `st_zm(...)` to coerce to XY dimensions
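As the error message suggests, dropping the Z/M dimension before anything is handed to sp is one way around this; a short sketch (the file name is a placeholder):
library(sf)
# Read the polygons and drop the Z (and M) coordinates so that only
# XY geometries are passed on to code that goes through sp
polys <- st_read("polygons.shp")
polys_xy <- st_zm(polys, drop = TRUE, what = "ZM")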
2006 Apr 29
1
splitting and saving a large dataframe
Hi,
I searched for this in the mailing list, but found no results.
I have a large data frame (dim(mydata) = 1297059 16, object.size(mydata) =
145280576), and I want to perform some calculations which can be done by
a factor's levels, say mydata$myfactor. So what I want is to split this
data frame into nlevels(mydata$myfactor) = 80 pieces. But I must do this
efficiently, that is, I
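A small sketch of the usual split/apply route; mydata and myfactor are the names from the post, and per_level_calc() is a placeholder for the actual calculation:
# Split once into a list of ~80 data frames, one per factor level
pieces  <- split(mydata, mydata$myfactor)
results <- lapply(pieces, per_level_calc)
# Or split only the row indices, which avoids copying the data up front
idx     <- split(seq_len(nrow(mydata)), mydata$myfactor)
results <- lapply(idx, function(i) per_level_calc(mydata[i, ]))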
2010 Feb 12
1
ffsave.image() error with large objects
Hi, I have been using ffsave.image() to save a mixture of ff and normal
objects in my workspace, e.g.
ffsave.image(file = "C:\output\saveobjects", rootpath =
"D:\fftempdir", safe = TRUE)
It works fine but once my workspace has large (~4GB) objects, I get the error:
Error in ffsave.image(file = "C:\output\savedobjects", rootpath =
"D:\fftempdir", safe =
2005 Jul 01
2
loop over large dataset
Hi All,
I'd like to ask for a few clarifications. I am doing some calculations
over some biggish datasets. One has ~ 23000 rows, and 6 columns, the
other has ~620000 rows and 6 columns.
I am using these datasets to perform a simulation of haplotype
coalescence over a pedigree (the datasets themselves are pedigree
information). I created a new dataset (same number of rows as the
pedigree
2004 Feb 11
7
large fonts on plots
Hi all,
I need to enlarge the fonts used on R plots (plots, histograms, ...) in
labels, titles etc.
I seem to be unable to figure out how to do it. The problem is that the
titles of the plots are simply unreadable when I insert them into my LaTeX
text, since they are relatively small compared to the entire plot.
I am sure it is pretty simple, can anybody give me a hint?
Please reply
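A short sketch of the usual knobs (all values arbitrary): the cex* graphical parameters scale titles, labels and axis text, and making the device itself smaller has a similar effect once the figure is scaled into the LaTeX page.
# Open a smaller device so the default text is larger relative to the plot
pdf("myplot.pdf", width = 4, height = 4, pointsize = 12)
# Scale title, axis-label and tick-label text explicitly
par(cex.main = 2, cex.lab = 1.6, cex.axis = 1.4)
hist(rnorm(100), main = "A readable title", xlab = "x")
dev.off()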
2006 Jul 29
0
SOAP for large datasets
I've been playing around with a SOAP interface to an application that
can return large datasets (up to 50 MB or so). There are also some
nested structures for which I've used ActionWebService::Struct with
2-3 nested members of other ActionWebService::Struct members. In
addition to chewing up a ton of memory, CPU utilization isn't that
great either. My development
2013 Nov 30
1
bnlearn and very large datasets (> 1 million observations)
Hi
Anyone have experience with very large datasets and the Bayesian Network
package, bnlearn? In my experience R doesn't react well to very large
datasets.
Is there a way to divide up the dataset into pieces and incrementally learn
the network with the pieces? This would also be helpful in case R crashes,
because I could save the network after learning each piece.
Thank you.
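The snippet contains no answer, but a very rough sketch of chunk-at-a-time learning with checkpoints could look like this; it assumes the bnlearn package, is not statistically equivalent to learning the structure from the full data, and bigdata and the chunk count are placeholders:
library(bnlearn)
# Split the rows into 10 chunks
chunks <- split(bigdata, rep(seq_len(10), length.out = nrow(bigdata)))
net <- NULL
for (i in seq_along(chunks)) {
  # hc() accepts a starting structure, so each chunk refines the previous result
  net <- hc(chunks[[i]], start = net)
  # Checkpoint after every chunk in case R crashes
  saveRDS(net, sprintf("network_after_chunk_%02d.rds", i))
}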
2010 Jul 16
1
Question about KLdiv and large datasets
Hi all,
when running KL on a small data set, everything is fine:
require("flexmix")
n <- 20
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a,b)
KLdiv(mydata)
however, when this dataset increases
require("flexmix")
n <- 10000000
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a,b)
KLdiv(mydata)
KL seems not to be defined. Can somebody explain what is going
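The snippet is cut off before any answer; one pragmatic workaround (only a workaround, not an explanation of the behaviour) is to run KLdiv() on a random subsample of the rows:
require("flexmix")
n <- 10000000
a <- rnorm(n)
b <- rnorm(n)
mydata <- cbind(a, b)
# Evaluate on a random subsample instead of all ten million rows
set.seed(1)
KLdiv(mydata[sample(n, 1e5), ])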
2011 Jul 20
0
Competing risk regression with CRR slow on large datasets?
Hi,
I posted this question on stats.stackexchange.com 3 days ago but the
answer didn't really address my question concerning the speed in
competing risk regression. I hope you don't mind me asking it in this
forum:
I'm doing a registry-based study with almost 200 000 observations and
I want to perform a competing risk analysis. My problem is that the
crr() in the cmprsk package is
2013 Jan 19
2
importing large datasets in R
Hi Everyone,
I am a little new to R, and the first problem I am facing is the dilemma of
whether R is suitable for files of about 2 GB and slightly more than 2
million rows. When I try importing the data using read.table, it seems to
take forever and I have to cancel the command. Are there any special
techniques or methods which I can use, or some tricks of the game that I
should keep in mind in
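The usual first steps for read.table() at this size, plus the common alternative, as a sketch; the file name, separator, column classes and row count are placeholders:
# Telling read.table() the column classes and switching off quote/comment
# scanning avoids a lot of the overhead on a ~2 GB file
dat <- read.table("bigfile.txt", header = TRUE, sep = "\t",
                  colClasses = c("integer", "character", rep("numeric", 4)),
                  nrows = 2100000, comment.char = "", quote = "")
# data.table::fread() is a much faster drop-in choice for files this size
# (assuming the data.table package is installed)
dat <- data.table::fread("bigfile.txt")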
2009 Feb 26
0
glm with large datasets
Hi all,
I have to run a logit regression over a large dataset and I am not sure
about the best option to do it. The dataset is about 200000x2000 and R
runs out of memory when creating it.
After going over help archives and the mailing lists, I think there are
two main options, though I am not sure about which one will be better.
Of course, any alternative will be welcome as well.
Actually, I
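One of the options usually mentioned for this situation is chunked fitting; a hedged sketch assuming the biglm package, with mydata, y and x1..x3 as placeholders for the real data and formula:
library(biglm)
# bigglm() processes the data in chunks, so the full model matrix for a
# 200000 x 2000 problem never has to be held in memory at once
fit <- bigglm(y ~ x1 + x2 + x3, data = mydata,
              family = binomial(), chunksize = 10000)
summary(fit)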
2017 Jul 18
1
Help-Multi class classification for large datasets
Hi all,
We are working on multi-class classification. Currently the ranger package
in R is able to handle up to 1.1 million records. Training time on 128 GB of
RAM is 12 days, which is not practically feasible to proceed further with.
In future we will have a dataset of 10 million records, so we are in
search of a package or framework which can handle 10 million records with
at least 12000
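For reference, a sketch of the kind of ranger call being scaled here (assuming the ranger package; train, label and newdata are placeholders; num.threads controls the built-in parallelism):
library(ranger)
fit <- ranger(dependent.variable.name = "label",
              data = train,
              num.trees = 100,
              num.threads = 32,   # spread training across CPU threads
              verbose = TRUE)
pred <- predict(fit, data = newdata)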