search for: 20,000

Displaying 20 results from an estimated 267 matches for "20,000".

2010 Nov 24
5
Performance tuning tips when working with wide datasets
Does anyone have any performance tuning tips when working with datasets that are extremely wide (e.g. 20,000 columns)? In particular, I am trying to perform a merge like below: merged_data <- merge(data1, data2, by.x="date",by.y="date",all=TRUE,sort=TRUE); This statement takes about 8 hours to execute on a pretty fast machine. The dataset data1 contains daily data going back...
2008 Mar 12
4
Distances between two datasets of x and y co-ordinates
Hi all I am trying to determine the distances between two datasets of x and y points. The number of points in dataset One is very small i.e. perhaps 5-10. The number of points in dataset Two is likely to be very large i.e. 20,000-30,000. My initial approach was to append the first dataset to the second and then carry out the calculation: dists <- as.matrix(dist(gis data from 2 * datasets)) However, the memory of the computer is not sufficient. A lot of calculations carried out in this situation are unnecessary as...
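One way to avoid the unnecessary within-set distances mentioned in this post is to compute only the cross-set distances. The following is a minimal sketch in base R, assuming plain Euclidean distance on x/y coordinates. The example data and the `cross_dist` helper are hypothetical and made up for illustration; they are not from the original thread.

```r
# Hypothetical example data: a small set and a large set of x/y points
set.seed(1)
one <- cbind(x = runif(5), y = runif(5))
two <- cbind(x = runif(20000), y = runif(20000))

# Distance from every point in `a` to every point in `b`.
# The result has nrow(a) rows and nrow(b) columns, so here it is
# 5 x 20000 rather than the 20005 x 20005 matrix that dist() on the
# combined data would produce.
cross_dist <- function(a, b) {
  dx <- outer(a[, 1], b[, 1], "-")  # pairwise differences in x
  dy <- outer(a[, 2], b[, 2], "-")  # pairwise differences in y
  sqrt(dx^2 + dy^2)
}

d <- cross_dist(one, two)
dim(d)
```

Each row of `d` then holds the distances from one point of the small set to all points of the large set.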
2002 Oct 22
1
getent passwd fails at 20,000 users
...0+ users in our NT 4 Domain and we are testing winbind and its authentication process. The Linux box is Redhat 8 and samba 2.2.5x, the default samba with RedHat 8.0. All of the users we tested were able to get a home directory, use shares and permissions assigned. All of the users over the 20k limit could log in through the PDC, but they weren't able to get a home directory or use the permissions set on the directories. When I run: wbinfo -t: secret is good; wbinfo -u: all users appear in the list; wbinfo -g: all groups are in the list; getent passwd stops to a prompt with no error at e...
2017 Sep 01
2
Asterisk bugs make a right mess of RTP
On Fri, Sep 1, 2017 at 9:13 AM, Joshua Colp <jcolp at digium.com> wrote: > On Fri, Sep 1, 2017, at 09:01 AM, Dave Topping wrote: > > http:/www.theregister.co.uk/2017/09/01/asterisk_admin_patch/ > > This specific issue exists in a lot of different implementations and > devices. Unfortunately...
2009 Jan 14
6
Removing duplicates from a list
For a list, say list1<-{1,2,3,4,5,2,1}, how do I remove the duplicates, please? My real list is 20,000 obs long of dates with many duplicates. Regards, Glenn
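A short sketch of how this is commonly done in base R, assuming the "list" is really an atomic vector (the braces in the post are not valid R, so `c()` is used here). The `dates` vector is a made-up example.

```r
# The example from the post, written as a numeric vector
list1 <- c(1, 2, 3, 4, 5, 2, 1)

# unique() keeps the first occurrence of each value, in original order
deduped <- unique(list1)

# Equivalent form using duplicated(), which flags repeats of earlier values
deduped2 <- list1[!duplicated(list1)]

# The same call works on Date vectors (hypothetical dates for illustration)
dates <- as.Date(c("2009-01-14", "2009-01-15", "2009-01-14"))
unique_dates <- unique(dates)
```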
2005 Jul 06
4
Tempfile error
...entify in terms of why the program stops at certain points. It just seems to be random as far as I can tell. I've searched the archives and of course Sweave FAQs but haven't found much that sheds light on what this error indicates and what I should do to resolve it. There are approximately 20,000 rows, meaning that about 20,000 tex files are created. If I sample 5,000 or even 10,000 and run the program, I do not encounter an error. It only occurs when I run the program on the full dataframe and even then the error is not occurring at the same point. That is, the row at which the program...
2006 Mar 28
3
fixed effects
dear R wizards: X is factor with 20,000*20=800,000 observations of 20,000 factors. I.e., each factor has 20 observations. y is 800,000 normally distributed data points. I want to see how much R^2 the X factors can provide. Easy, right? > lm ( y ~ X) and > aov( y ~ X) Error: cannot allocate vector of size 3125000 Kb is th...
2009 Apr 21
4
Asterisk Database
...ge that he chose the first time. What would be better: storing his number in the Asterisk DB and using Dbput and DBget, or storing it in MySQL from the dial plan and querying it every time to see the caller's record? How many records can AstDB handle safely? In my case the total records won't exceed 20,000 since there are many repeat callers. rgds Sriram
2006 Jun 10
3
sparse matrix, rnorm, malloc
...e with 32G memory: sparse_matrix <- function(dims,rnd,p) { ptm <- proc.time() x <- round(rnorm(dims*dims),rnd) x[((abs(x) - p) < 0)] <- 0 y <- matrix(x,nrow=dims,ncol=dims) proc.time() - ptm } When trying to generate the matrix around 20,000 rows/cols on a machine with 32G of memory, the error message I receive is: R(335) malloc: *** vm_allocate(size=3200004096) failed (error code=3) R(335) malloc: *** error: can't allocate region R(335) malloc: *** set a breakpoint in szone_error to debug R(335) malloc: *** vm_allocate(size...
2004 Oct 04
3
Working with large datafiles
Hi, I have been enjoying R for some time now, but was wondering about working with larger data files. When I try to load in big files with more than 20,000 records, the program seems unable to store all the records. Is there some way that I can increase the number of records that I can work with? Ideally I would like to work with census data which can hold a million records. Greg
2010 Jul 26
12
how to generate a random data from a empirical distribition
...an a R question. but I do want to know how to implement this in R. I have 10,000 data points. Is there any way to generate an empirical probability distribution from them (the problem is that I do not know what exactly this distribution follows: normal, beta?). My ultimate goal is to generate an additional 20,000 data points from this empirical distribution created from the existing 10,000 data points. Thank you all in advance.
2009 Dec 24
3
aggregate binary response data
Dear list, I have a response variable coded 0/1, i.e. a binary response. There are 20,000 individual responses that I would like to aggregate into numbers of each category (i.e. 0/1) by group called dn (350 different groups) and by month mth (there are several hundred responses per month). What is the simplest way to perform this operation in R?
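One possible approach, sketched in base R with `aggregate()`. The data frame `resp` and its contents are simulated for illustration; only the column names `dn` and `mth` come from the post, and the response column name `y` is an assumption.

```r
# Simulated stand-in for the poster's data (hypothetical values)
set.seed(1)
resp <- data.frame(
  dn  = sample(1:350, 20000, replace = TRUE),  # group
  mth = sample(1:12, 20000, replace = TRUE),   # month
  y   = rbinom(20000, 1, 0.3)                  # binary response
)

# Indicator columns so that summing gives a count of each category
resp$ones  <- resp$y
resp$zeros <- 1 - resp$y

# One row per dn/mth combination, with counts of 1s and 0s
counts <- aggregate(cbind(ones, zeros) ~ dn + mth, data = resp, FUN = sum)
head(counts)
```

A two-column count of this shape is also what `glm(..., family = binomial)` accepts as a response, which may be why such an aggregation is wanted.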
2004 Aug 15
3
calibration/validation sets
Hi; Does anyone know how to create a calibration and validation set from a particular dataset? I have a dataframe with nearly 20,000 rows! and I would like to select (randomly) a subset from the original dataset (...I found how to do that) to use as calibration set. However, I don't know how to remove this "calibration" set from the original dataframe in order to get my "validation" set.....Any hint w...
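A minimal sketch of the split described here, using negative indexing to drop the sampled rows. The data frame `df`, its columns, and the 70/30 proportion are assumptions for illustration only.

```r
# Hypothetical data frame standing in for the poster's ~20,000 rows
set.seed(42)
df <- data.frame(id = 1:20000, value = rnorm(20000))

# Randomly choose row indices for the calibration set (70% here, by assumption)
cal_idx <- sample(nrow(df), size = 0.7 * nrow(df))

calibration <- df[cal_idx, ]   # the sampled rows
validation  <- df[-cal_idx, ]  # everything not sampled
```

Keeping the index vector, rather than only the sampled rows, is what makes the complement easy to extract.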
2012 Jun 23
3
Hardware infrastructure for email system
...tup and have been doing some research. I have done some searches and read several threads in the areas of my questions here. While there are some that come close I haven't yet been able to get all my questions answered. I currently run a postfix, dovecot & roundcube setup and have about 2000 active accounts. I have a separate SMTP server for outbound mail and auth is done against a separate LDAP server. In front of the POP/IMAP server I have another SMTP (4 in parallel actually) server that receives and filters inbound mail through a company specific, proprietary filter before t...
2007 Dec 18
3
creating a database
useR's, I am writing a program in which the input can be multidimensional. As of now, to hold the input, I have created an n by m matrix where n is the number of observations and m is the number of variables. The data that I could potentially use can contain well over 20,000 observations. Can a simple matrix be used for this, or would it be better and more efficient to create an external database to hold the data? If so, should the database be created using C and how would I do this (seeing as I have never programmed in C)? Any help would be greatly appr...
2007 Oct 25
1
Indexes on dataframe columns?
Hi -- I'm working with some data frames with fairly high nrows (call it 8 columns, by 20,000 rows). Are there any indexes on these columns? When I do a df[df$foo == 42,] [which I think is idiomatic], am I doing a linear search or something better? If the column contents are ordered, I'd like to at least be doing a naive binary search. Thanks! Ranjan
2004 Jul 07
9
Windows 2K outperform Linux/Samba very much?
...utperform Linux/Samba very much after I compared the benchmark results. I am very confused about it; who can explain it? The computers' configurations are as follows: 1. PC Client. It runs the following VB program to measure the time taken to check files' properties. Operating System: Windows 2000 Professional // ... Set objFSO = CreateObject("Scripting.FileSystemObject") thistime = thisnow If objFSO.FileExists(fn) Then totle = totle & "Check file time " & CStr(thisnow - thistime) + " ms" + vbCrLf thistime = thisnow Set objFile = objFSO.GetF...
2011 Nov 11
3
Combining Overlapping Data
I've scoured the archives but have found no concrete answer to my question. Problem: Two data sets 1st data set(x) = 20,000 rows 2nd data set(y) = 5,000 rows Both have the same column names, the column of interest to me is a variable called strain. For example, a strain named "Chab1405" appears in x 150 times and in y 25 times... strain "Chab1999" only appears 200 times in x and none in y (so...
2005 Feb 15
1
Off topic -- large data sets. Was RE: 64 Bit R Background Question
In message <200502151112.j1FB5fZ5002722 at hypatia.math.ethz.ch>, r-help-request at stat.math.ethz.ch writes >Can someone give me an example (perhaps in a private response, since I'm off >topic here) where one actually needs all cases in a large data set ("large" >being > 1e6, say)...
2010 Feb 19
2
Best inode_ratio for maildir++ on ext4
...default setting I see this difference of space (wasted space, but more inodes): 4328633696 free 1K-blocks with mkfs's "-T news" switch = 1219493877 free inodes; 4557288800 free 1K-blocks with default mkfs settings = 304873461 free inodes. I'll be storing e-mail messages for around 20,000 accounts on that partition (average 512 MB per account). Would you consider it worth wasting about 200 GB of filesystem space in exchange for more inodes? Thanks. Rodolfo.