search for: 500,000

Displaying 20 results from an estimated 119 matches for "500,000".

2007 Apr 18
1
Memory increase in R
Dear All: Please help me to increase the memory in R. I am trying to make a Euclidean distance matrix. The number of rows in the data is 500,000. Therefore, the dimension of the Euclidean distance matrix is 500,000*500,000. When I run the data in R, R could not make the distance matrix because of a memory allocation problem. In order to increase memory, I read the FAQ and followed the instructions as below: You may also set the amount of avail...
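Back-of-the-envelope, no memory setting can make this matrix fit, so the practical route is to compute distances block-wise. A minimal R sketch of the arithmetic and of a cross-distance helper (cross_dist and the block idea are illustrative, not from the original thread):

    n <- 500000
    n^2 * 8 / 2^40               # full n x n double matrix: ~1.8 TiB
    n * (n - 1) / 2 * 8 / 2^40   # dist()'s lower triangle alone: ~0.9 TiB

    # Distances from one manageable block of rows to all rows,
    # via the identity |a-b|^2 = |a|^2 + |b|^2 - 2*a.b
    cross_dist <- function(a, b) {
      d2 <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
      sqrt(pmax(d2, 0))          # clamp tiny negative rounding errors
    }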
2006 Aug 21
5
lean and mean lm/glm?
...efficiently with large datasets. I'm running R on Windows XP with 1 GB RAM (so about 600-700 MB after the usual Windows overhead). I have a dataset that has 4 million observations and about 20 variables. I want to run probit regressions on this data, but can't do this with more than about 500,000 observations before I start running out of RAM (you could argue that I'm getting sufficient precision with <500,000 obs, but let's pretend otherwise). Loading 500,000 observations into a data frame only takes about 100 MB of RAM, so that isn't the problem. Instead it seems R uses huge a...
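One commonly suggested route for models too big for glm() is the biglm package, which accumulates the model cross-products in chunks instead of keeping all rows in flight. A sketch, assuming a data frame df with binary response y and a few of the 20 predictors (names are illustrative):

    library(biglm)
    fit <- bigglm(y ~ x1 + x2 + x3,
                  data      = df,
                  family    = binomial(link = "probit"),
                  chunksize = 10000)   # rows processed per pass
    summary(fit)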
2008 Nov 30
2
Snow and multi-processing
...~50,000 t.tests for a series of micro-array experiments, one gene at a time. Thus, I can easily spread the load across multiple processors and nodes. So, I have a master list object that tells me what rows to pick up for each gene to do the t.test, from a series of microarray experiments containing ~500,000 rows and x columns per experiment. While trying to optimize my function using parLapply(), I quickly realized that I was not gaining any speed, because every time a test was done on one of the items in the list, the 500,000-line by x-column matrix had to be shipped along with the item in the li...
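The usual remedy is to export the big matrix to each worker once and pass only the small per-gene indices through parLapply(). A sketch with snow (bigmat, cond, and gene_rows are illustrative names):

    library(snow)
    cl <- makeCluster(4, type = "SOCK")

    # Ship the ~500,000 x n matrix and the condition factor ONCE;
    # clusterExport() copies them by name from the global environment.
    clusterExport(cl, c("bigmat", "cond"))

    # Only the small row indices travel per call now; each worker
    # looks bigmat up locally instead of receiving it every time.
    pvals <- parLapply(cl, gene_rows, function(r) {
      g <- bigmat[r, ]
      t.test(g[cond == "a"], g[cond == "b"])$p.value
    })
    stopCluster(cl)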
2008 Sep 22
4
Manage huge database
Hello, Recently I have been trying to open a huge database with no success. It's a 4 GB csv plain text file with around 2000 rows and over 500,000 columns/variables. I have tried with The SAS System, but it reads only around 5000 columns, no more. R hangs up when opening. Is there any way to work with "parts" (a set of columns) of this database, since it's impossible to manage it all at once? Is there any way to establish a link...
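read.csv can at least pull out a chosen set of columns without storing the rest: a colClasses entry of "NULL" skips that column entirely. A sketch (file name and column range are illustrative):

    keep <- 101:200
    classes <- rep("NULL", 500000)
    classes[keep] <- NA          # NA = let R guess the type of kept columns
    part <- read.csv("huge.csv", colClasses = classes)

Every line still has to be parsed, but only the 100 kept columns are ever stored.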
2007 Feb 27
1
read.csv size limits
...the read.csv function for a while now without any problems. My files are usually 20-50 MB and they take up to a minute to import. They have all been under 50,000 rows and under 100 columns. Recently, I tried importing a file of a similar size (which means about the same amount of data), but with ~500,000 columns and ~20 rows. The process is taking forever (~1 hour so far). In Task Manager, I see the CPU is at max, but memory use grinds to a halt at around 50 MB (far below the memory limit). Is this normal? Is there a way to optimize this operation or at least check the progress? Will this take 2...
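read.csv builds one R vector per column, which is pathological at ~500,000 columns even though the data volume is modest. If the body of the file is all numeric, one workaround is to read the values flat and reshape, sketched here (file name and the all-numeric assumption are illustrative):

    v <- scan("wide.csv", sep = ",", skip = 1)   # skip the header row
    m <- matrix(v, nrow = 20, byrow = TRUE)      # 20 x 500,000 numeric matrix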
2001 May 14
5
unique and precision of long integers
Hello. I have a dataset with about 500,000 observations, most of which are not unique. The first 10 observations look like 901000000000100000010100101011002 901101101110100000010100101011002 901000000000100000010100000001002 901000000000100000010101001011002 901000000000100000010101010011002 901000000000100000010100110101002 901000000...
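Those observations are 33 digits long, well past the ~15-16 significant digits a double can hold, so distinct values can collapse to the same number if read numerically. Reading them as character strings keeps unique() exact; a sketch (file name is illustrative):

    x <- scan("codes.txt", what = character())
    length(unique(x))   # exact: string comparison loses no digits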
2013 Apr 08
3
SVD on very large data matrix
Dear All, I need to perform an SVD on a very large data matrix, of dimension ~500,000 x 1,000, and I am looking for an efficient algorithm that can perform an approximate (partial) SVD to extract on the order of the top 50 right and left singular vectors. I would be very grateful for any advice on what R packages are available to perform such a task, what the RAM requirement is,...
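One package built for exactly this is irlba, which computes a truncated SVD by Lanczos iteration without ever forming the full decomposition. A sketch, assuming the matrix is A (note that A alone is already ~3.7 GiB as a dense double matrix):

    library(irlba)
    s <- irlba(A, nv = 50, nu = 50)   # top 50 singular triplets only
    # s$d: 50 values; s$u: 500,000 x 50; s$v: 1,000 x 50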
2008 Oct 14
6
Doing a Task Without Using a For Loop
...13 has 15 entries (NinYear) for 1953. The following bit of code calculates NinYear: for (i in 1:length(data1$ID)) { data1$NinYear[i] <- length(data1[data1$Year==data1$Year[i] & data1$ID==data1$ID[i],1]) } This seems to work but is horribly slow (some files I am working with have over 500,000 lines). Can anyone suggest a faster way of doing this, perhaps a way that does not use a for loop? Thanks. Tom ID Year NinYear 209 1971 0 209 1971 0 213 1951 0 213 1951 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 213 1953 0 2...
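The loop above recomputes a group size for every row; ave() does the same thing in one grouped pass. A sketch against the data1 columns shown above:

    # For each row, the size of its (ID, Year) group:
    data1$NinYear <- ave(rep(1, nrow(data1)), data1$ID, data1$Year,
                         FUN = length)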
2019 Feb 28
3
What files to edit when changing the sdX of hard drives?
...ystem, and it's also got partition one labled boot... so? I'm not trying to boot off of it, I'm going to mount it on /mnt. No, I dislike UUIDs. I dislike, strongly, lots of extra typing that doesn't really get me anything. MAYBE, if you're in a Google or Amazon datacenter, with 500,000 physical servers (I phone interviewed with them 10 years ago)... but short of that? Nope. mark
2008 May 02
3
Loading large files in R
Hello, I'm attempting to load a ~110 MB text file with ~500,000 rows and 200 columns using read.table. R hangs and seems to give up. Can anyone tell me an efficient way to load a file of this size? Thank you! Alex -- View this message in context: http://www.nabble.com/Loading-large-files-in-R-tp17025045p17025045.html Sent from the R help mailing list arch...
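Most of read.table's time on a file like this goes into guessing column types and growing buffers; supplying both up front usually makes the difference. A sketch (file name and the all-numeric assumption are illustrative):

    dat <- read.table("big.txt", header = TRUE,
                      colClasses   = rep("numeric", 200),
                      nrows        = 500000,   # an upper bound is fine
                      comment.char = "")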
2006 May 23
7
Load Balancing
Hi, We are starting a new project, and are trying to decide the best way to proceed. We want to set up a LAMP configuration using CentOS, something we have been doing in the past with great success. The question is load balancing. We anticipate the potential for the system to receive 500,000 requests/day within the next year. We want to plan for that extra load now as we start the project. What would you suggest as setups for multiple servers for redundancy and load balancing? I have set up MySQL replication and that works fine, but what about the rest of the system? I know...
2006 Mar 06
8
RoR on a VPS
I'm looking for a virtual private server to run a RoR site accessing a database on a different machine. This is a small application - basically a form to add records, and a few summary screens. What are the minimal requirements for a VPS? Thanks. -- Posted via http://www.ruby-forum.com/.
2002 Aug 28
4
Huge data frames?
A friend of mine recently mentioned that he had painlessly imported a data file with 8 columns and 500,000 rows into matlab. When I tried the same thing in R (both Unix and Windows variants) I had little success. The Windows version hung for a very long time, until I eventually more or less ran out of virtual memory; I tried to set the proper memory allocations for the Unix version, but it never see...
2008 Sep 20
2
selecting from a series of integers with pre-determined probabilities
R 2.6, Windows XP. I need to select from the integers 1, 2, 3, 4, 5 with some pre-determined probability, e.g. the probability of selecting 5 is 80%, and the probability of selecting 1, 2, 3, or 4 is 20%. Any suggestions for how I might accomplish this? I need to do it very efficiently as I will be doing it 500,000 times. Thanks John John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please c...
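sample() handles this directly, and all 500,000 draws come from one vectorised call; a sketch with the 20% split evenly over 1-4:

    draws <- sample(1:5, size = 500000, replace = TRUE,
                    prob = c(0.05, 0.05, 0.05, 0.05, 0.80))
    table(draws) / length(draws)   # check the empirical frequencies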
2013 Oct 01
3
Fixing Timestamps
I have a user with a lot of email (A LOT of email, probably over 500,000 emails). Recently, several thousand messages of his were lost, and I pulled them out of the backup archives (zip files containing each days emails in an mbox) that are created on his account and fed them into his procmail scripts and they were all processed just fine and ended up in the right d...
2005 Dec 27
1
No performance increase from dual-core processors?
...nity to compare the performance of my 1.6 GHz Pentium M laptop, and a 2.8 GHz dual-core Pentium processor (both running WinXP Professional 32-bit). I run a lot of long simulations, so I was hoping to get something that would speed them up. I ran a few quick computation tests (e.g. generating 500,000 normals), and found the performance increase of the 2.8 dual-core over my 1.6M laptop to be negligible (and in fact sometimes slower). One thing I did notice is that if I look at the CPU usage of my laptop when it's performing the simulations, the laptop is at about 100%, while the dual-c...
2009 Jun 08
2
problem with bulk insert into a *.csv file
Hi all, I am trying to create an "index.csv" by calculating different types of calculations. In that I have to calculate on 10,000 studies and have to insert a large number of rows, more than 500,000. Right now I am inserting every row after calculating it and doing data.frame, but it's taking much time to create that index.csv. Is there anything like bulk insert into a file, keeping those rows in memory and inserting into the table? If anything is there, please give me some idea t...
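Writing the file once per row is what dominates here; accumulating the rows in a list and writing once at the end avoids it. A sketch (compute_one_study and n.studies are illustrative names):

    rows <- vector("list", n.studies)
    for (i in seq_len(n.studies)) {
      rows[[i]] <- compute_one_study(i)   # one data-frame row per study
    }
    index <- do.call(rbind, rows)         # assemble once
    write.csv(index, "index.csv", row.names = FALSE)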
2012 May 20
2
Histograms with bin proportions on the y-axis
I have what is probably a simple problem. I have a data file from an MCMC Bayes estimation problem that is a vector of 500,000 numeric values (just one variable) ranging from 100,000 to 700,000. I need to display the histogram of this data in a high quality graphic for a figure in a journal publication. I want 100 bins so as to display a reasonably complete and smooth histogram, and I need the Y-axis to display the bin...
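hist() offers counts or density but not proportions, so a standard trick is to rescale the counts of an unplotted histogram and then plot it. A sketch, with x standing for the vector of 500,000 draws:

    h <- hist(x, breaks = 100, plot = FALSE)
    h$counts <- h$counts / sum(h$counts)     # counts -> bin proportions
    plot(h, ylab = "Proportion", main = "")  # plot.histogram draws $counts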
2009 Jun 24
2
Memory issues on a 64-bit debian system (quantreg)
...true 64-bit installation (should R have access to > 4GB of RAM?) I suspect so, because I was running an rqss() (package quantreg, installed via install.packages() -- I noticed it required a compilation of the source) and watched the memory usage spike to 4.9GB (my input data contains > 500,000 samples). With this said, after 30 mins or so of processing, I got the following error: tahoe_rq <- rqss(ltbmu_4_stemsha_30m_exp.img~qss(ltbmu_eto_annual_mm.img),tau=.99,data=boundary_data) Error: cannot allocate vector of size 1.5 Gb The dataset is a bit big (300mb or so), so I...
2003 Aug 12
1
Certification (was RE: realpath(3) et al)
Just saw this from eWeek. "IBM, which paid roughly $500,000 for the testing, and SuSE (pronounced "SOOS-ah") were announcing the certification jointly." The article is here: http://www.eweek.com/article2/0,3959,1212529,00.asp --- Darren Reed <avalon@caligula.anu.edu.au> wrote: > In some mail from twig les, sie said: > >...