Displaying 20 results from an estimated 269 matches for "20,000".
2010 Nov 24
5
Performance tuning tips when working with wide datasets
Does anyone have any performance tuning tips when working with datasets that are extremely wide (e.g. 20,000 columns)?
In particular, I am trying to perform a merge like below:
merged_data <- merge(data1, data2, by = "date", all = TRUE, sort = TRUE)
This statement takes about 8 hours to execute on a pretty fast machine. The dataset data1 contains daily data going back...
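A minimal sketch of an alternative (data1/data2 here are small made-up stand-ins, not the poster's frames): when the join key is a single column with unique values on each side, match() can replace merge() and sidesteps merge()'s row-name bookkeeping, which is often the bottleneck on very wide frames.

```r
# Illustrative stand-ins; the real frames have ~20,000 columns each
data1 <- data.frame(date = 1:5, a = 1:5)
data2 <- data.frame(date = 3:7, b = 11:15)

all_keys <- sort(union(data1$date, data2$date))   # full outer set of keys
merged_data <- cbind(
  data.frame(date = all_keys),
  data1[match(all_keys, data1$date), -1, drop = FALSE],  # NA where absent
  data2[match(all_keys, data2$date), -1, drop = FALSE]
)
rownames(merged_data) <- NULL
```

This assumes unique keys per frame; with duplicated keys, merge() (or a keyed data.table join) is the safer tool.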
2008 Mar 12
4
Distances between two datasets of x and y co-ordinates
Hi all
I am trying to determine the distances between two datasets of x and y
points. The number of points in dataset One is very small i.e. perhaps
5-10. The number of points in dataset Two is likely to be very large
i.e. 20,000-30,000. My initial approach was to append the first dataset
to the second and then carry out the calculation:
dists <- as.matrix(dist(rbind(dataset_one, dataset_two)))  # all pairwise distances
However, the memory of the computer is not sufficient. A lot of
calculations carried out in this situation are unnecessary as...
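A sketch of the "only the cross-distances" idea (names and sizes here are illustrative): for 5-10 reference points against tens of thousands of targets, an n1 x n2 matrix is tiny compared with the full dist() matrix over the stacked data, so nothing unnecessary is computed.

```r
set.seed(1)
small <- cbind(x = runif(6),   y = runif(6))     # dataset One: 5-10 points
big   <- cbind(x = runif(200), y = runif(200))   # stands in for 20,000-30,000

# 6 x 200 Euclidean cross-distance matrix; row i = distances from small[i, ]
cross_dist <- sqrt(outer(small[, "x"], big[, "x"], "-")^2 +
                   outer(small[, "y"], big[, "y"], "-")^2)
```

For the real 30,000-point set this needs only n1 * n2 doubles, versus (n1 + n2)^2 for the appended approach.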
2002 Oct 22
1
getent passwd fails at 20,000 users
...0+ users in our NT 4 Domain and we are testing
winbind and it's authentication process. The Linux box is Redhat 8 and
samba 2.2.5x, the default samba with RedHat 8.0. All of the users we tested
were able to get a home directory, use shares and permissions assigned. All
of the users over the 20k limit could log in through the PDC, but they
weren't able to get a home directory or use the permissions set on the
directories.
When I run:
wbinfo -t   -> secret is good
wbinfo -u   -> all users appear in the list
wbinfo -g   -> all groups are in the list
getent passwd stops to a prompt with no error at e...
2017 Sep 01
2
Asterisk bugs make a right mess of RTP
On Fri, Sep 1, 2017 at 9:13 AM, Joshua Colp <jcolp at digium.com> wrote:
> On Fri, Sep 1, 2017, at 09:01 AM, Dave Topping wrote:
> > http://www.theregister.co.uk/2017/09/01/asterisk_admin_patch/
>
> This specific issue exists in a lot of different implementations and
> devices. Unfortunately...
2009 Jan 14
6
Removing duplicates from a list
For a vector, say
list1 <- c(1, 2, 3, 4, 5, 2, 1)
how do I remove the duplicates, please?
My real vector is 20,000 observations of dates, with many duplicates.
Regards
Glenn
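A minimal sketch of the usual answer: unique() drops repeats directly, and duplicated() gives a logical mask that is handy for subsetting data frames by the same rule.

```r
list1   <- c(1, 2, 3, 4, 5, 2, 1)
no_dups <- unique(list1)       # keeps the first occurrence of each value
mask    <- duplicated(list1)   # TRUE for the repeated entries
```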
2005 Jul 06
4
Tempfile error
...entify in terms of why the program stops at certain points. It just
seems to be random as far as I can tell. I've searched the archives and
of course Sweave FAQs but haven't found much that sheds light on what
this error indicates and what I should do to resolve it.
There are approximately 20,000 rows, meaning that about 20,000 tex files
are created. If I sample 5,000 or even 10,000 and run the program, I do
not encounter an error. It only occurs when I run the program on the
full dataframe and even then the error is not occurring at the same
point. That is, the row at which the program...
2006 Mar 28
3
fixed effects
dear R wizards:
X is a factor with 20,000 levels and 20 observations per level, i.e.
20,000*20 = 400,000 observations in all. y is a vector of 400,000 normally
distributed data points. I want to see how much R^2 the X factors can
provide. Easy, right?
> lm ( y ~ X)
and
> aov( y ~ X)
Error: cannot allocate vector of size 3125000 Kb
is th...
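One hedged workaround (scaled down here; the sizes are illustrative): for a single factor, the R^2 of y ~ X is SS_between / SS_total, which needs only group means and counts, not the enormous model matrix that lm() and aov() try to allocate.

```r
set.seed(1)
g <- 200                                  # stands in for 20,000 levels
X <- factor(rep(seq_len(g), each = 20))   # 20 observations per level
y <- rnorm(length(X)) + rep(rnorm(g), each = 20)

group_means <- tapply(y, X, mean)
ss_total    <- sum((y - mean(y))^2)
ss_between  <- sum(tabulate(X) * (group_means - mean(y))^2)
r_squared   <- ss_between / ss_total
```

At the full size this is a few vectors of length 800,000 and 20,000, nowhere near the failed 3 GB allocation.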
2009 Apr 21
4
Asterisk Database
...ge that he chose the first time. What would be better: storing his number in the Asterisk DB and using DBput and DBget, or storing it in MySQL from the dial plan and querying it every time to see the caller's record? How many records can AstDB handle safely? In my case the total records won't exceed 20,000, since there are many repeat callers.
rgds
Sriram
2006 Jun 10
3
sparse matrix, rnorm, malloc
...e with 32G memory:
sparse_matrix <- function(dims, rnd, p) {
  ptm <- proc.time()
  x <- round(rnorm(dims * dims), rnd)
  x[(abs(x) - p) < 0] <- 0
  y <- matrix(x, nrow = dims, ncol = dims)
  proc.time() - ptm
}
When trying to generate the matrix around 20,000 rows/cols on a
machine with 32G of memory, the error message I receive is:
R(335) malloc: *** vm_allocate(size=3200004096) failed (error code=3)
R(335) malloc: *** error: can't allocate region
R(335) malloc: *** set a breakpoint in szone_error to debug
R(335) malloc: *** vm_allocate(size...
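A back-of-envelope check plus a hedged sketch: a dense 20,000 x 20,000 double matrix needs 20000 * 20000 * 8 = 3.2e9 bytes in one contiguous block, which matches the failed vm_allocate(size=3200004096) up to a small header. If most entries end up zero anyway, the Matrix package (a recommended package shipped with R) stores only the nonzeros and never makes that allocation; the example below is scaled down.

```r
library(Matrix)

dense_bytes <- 20000 * 20000 * 8                 # 3.2e9 bytes for one dense copy
m   <- rsparsematrix(2000, 2000, density = 0.01) # scaled-down sparse example
nnz <- length(m@x)                               # only nonzeros are stored
```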
2004 Oct 04
3
Working with large datafiles
Hi,
I have been enjoying R for some time now, but was wondering about working
with larger data files. When I try to load big files with more than
20,000 records, the program seems unable to store all the records. Is
there some way I can increase the number of records I can work with?
Ideally I would like to work with census data which can hold a million
records.
Greg
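A hedged base-R sketch of one common answer, chunked reading (the file below is generated just for illustration): processing a connection block by block keeps only one chunk in memory at a time, and supplying colClasses also reduces read.table's memory use.

```r
# Build a small stand-in file; a real census extract would be far larger
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:100, v = rnorm(100)), tmp, row.names = FALSE)

con <- file(tmp, open = "r")
header <- read.csv(con, nrows = 1)            # first call also consumes the header line
total  <- nrow(header)
repeat {
  chunk <- tryCatch(
    read.csv(con, nrows = 25, header = FALSE, col.names = names(header)),
    error = function(e) NULL)                 # "no lines available" ends the loop
  if (is.null(chunk) || nrow(chunk) == 0) break
  total <- total + nrow(chunk)                # replace with real per-chunk work
}
close(con)
```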
2010 Jul 26
12
how to generate random data from an empirical distribution
...an a R question. but I do want to
know how to implement this in R.
I have 10,000 data points. Is there any way to generate an empirical
probability distribution from them? (The problem is that I do not know what
distribution they follow: normal? beta?) My ultimate goal is to
generate an additional 20,000 data points from the empirical distribution
created from the existing 10,000 data points.
thank you all in advance.
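Two standard options, sketched with made-up data (obs stands in for the 10,000 points): resampling with replacement draws from the empirical CDF itself, while a smoothed bootstrap adds kernel noise so the new values are not restricted to the observed ones.

```r
set.seed(1)
obs <- rgamma(10000, shape = 2)     # unknown-shape data, for illustration only

# 1) Plain resampling: draws from the empirical distribution directly
new1 <- sample(obs, 20000, replace = TRUE)

# 2) Smoothed bootstrap: resample, then jitter by the kernel bandwidth
bw   <- density(obs)$bw
new2 <- sample(obs, 20000, replace = TRUE) + rnorm(20000, sd = bw)
```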
--
View this message in context: http://r.789695.n4.nabble.com/how-to-generate-a-random-data-from-a-empirical-distribition-tp2302716p2302716.html
Sent from the R help mailing...
2009 Dec 24
3
aggregate binary response data
Dear list
I have a response variable coded 0/1, i.e. a binary response. There are
20,000 individual responses that I would like to aggregate into counts
of each category (i.e. 0/1) by group, dn (350 different groups), and by
month, mth (there are several hundred responses per month).
What is the simplest way to perform this operation in R?
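One simple answer is xtabs(); the data below are invented to match the description (columns dn, mth, and a 0/1 resp).

```r
set.seed(1)
d <- data.frame(
  resp = rbinom(1000, 1, 0.3),
  dn   = sample(paste0("g", 1:10), 1000, replace = TRUE),
  mth  = sample(month.abb[1:3],    1000, replace = TRUE)
)

counts <- xtabs(~ dn + mth + resp, data = d)   # 3-way contingency table
# counts[, , "1"] gives the number of 1-responses per group x month
```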
2004 Aug 15
3
calibration/validation sets
Hi;
Does anyone know how to create a calibration set and a validation set from a particular dataset? I have a dataframe with nearly 20,000 rows, and I would like to select (randomly) a subset from the original dataset (...I found how to do that) to use as the calibration set. However, I don't know how to remove this "calibration" set from the original dataframe in order to get my "validation" set.....Any hint w...
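A minimal sketch (illustrative sizes): the validation set is simply the rows not sampled for calibration, and negative indexing removes them in one step.

```r
set.seed(1)
df <- data.frame(id = 1:1000, v = rnorm(1000))  # stands in for ~20,000 rows

cal_idx     <- sample(nrow(df), 700)  # e.g. 70% calibration
calibration <- df[cal_idx, ]
validation  <- df[-cal_idx, ]         # everything else
```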
2012 Jun 23
3
Hardware infrastructure for email system
...tup and have been doing some
research. I have done some searches and read several threads in the
areas of my questions here. While there are some that come close I
haven't yet been able to get all my questions answered.
I currently run a postfix, dovecot & roundcube setup and have about 2000
active accounts. I have a separate SMTP server for outbound mail and
auth is done against a separate LDAP server. In front of the POP/IMAP
server I have another SMTP (4 in parallel actually) server that receives
and filters inbound mail through a company specific, proprietary filter
before t...
2007 Dec 18
3
creating a database
useR's,
I am writing a program in which the input can be multidimensional. As of
now, to hold the input, I have created an n by m matrix where n is the
number of observations and m is the number of variables. The data that I
could potentially use can contain well over 20,000 observations.
Can a simple matrix be used for this, or would it be better and more
efficient to create an external database to hold the data? If so, should
the database be created using C, and how would I do this (seeing as I
have never programmed in C)?
Any help would be greatly appr...
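Whether a plain matrix suffices is mostly arithmetic; assuming (hypothetically) 50 variables of doubles, 20,000 observations is a small object by modern standards, so no external database is needed on that count alone.

```r
# Back-of-envelope: a numeric matrix costs 8 bytes per cell
cells <- 20000 * 50
bytes <- cells * 8        # 8e6 bytes
mb    <- bytes / 1024^2   # under 8 MB for the whole matrix
```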
2007 Oct 25
1
Indexes on dataframe columns?
Hi --
I'm working with some data frames with fairly high nrows (call it 8
columns, by 20,000 rows). Are there any indexes on these columns?
When I do a df[df$foo == 42,] [which I think is idiomatic], am I doing a linear
search or something better? If the column contents are ordered, I'd like
to at least be doing a naive binary search.
Thanks!
Ranjan
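As far as base R goes, data frames have no column indexes, so df[df$foo == 42, ] scans every row. A hedged sketch of the binary-search idea on a column that is kept sorted, using findInterval() (which does an O(log n) search):

```r
set.seed(1)
df <- data.frame(foo = sort(sample(1:1000, 20000, replace = TRUE)))

lo <- findInterval(42 - 1L, df$foo) + 1L   # first row with foo == 42
hi <- findInterval(42,      df$foo)        # last row with foo == 42
hits <- if (hi >= lo) df[lo:hi, , drop = FALSE] else df[0, , drop = FALSE]
```

This only pays off if the sort order is maintained; otherwise the linear scan is already the right tool at 20,000 rows.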
2004 Jul 07
9
Windows 2K outperform Linux/Samba very much?
...utperform Linux/Samba so much after I
compared the benchmark results. I am very confused about it; can anyone
explain it?
The computers' configurations are as follows:
1. PC Client
It runs the following VB program to measure the time taken to check a file's properties
Operation System:
Windows 2000 professional
// ...
Set objFSO = CreateObject("Scripting.FileSystemObject")
thistime = thisnow
If objFSO.FileExists(fn) Then
totle = totle & "Check file time " & CStr(thisnow - thistime) + " ms" + vbCrLf
thistime = thisnow
Set objFile = objFSO.GetF...
2011 Nov 11
3
Combining Overlapping Data
I've scoured the archives but have found no concrete answer to my question.
Problem: Two data sets
1st data set(x) = 20,000 rows
2nd data set(y) = 5,000 rows
Both have the same column names, the column of interest to me is a variable
called strain.
For example, a strain named "Chab1405" appears in x 150 times and in y 25
times...
strain "Chab1999" only appears 200 times in x and none in y (so...
2005 Feb 15
1
Off topic -- large data sets. Was RE: 64 Bit R Background Question
In message <200502151112.j1FB5fZ5002722 at hypatia.math.ethz.ch>, r-help-
request at stat.math.ethz.ch writes
>Can comeone give me an example (perhaps in a private response, since I'm off
>topic here) where one actually needs all cases in a large data set ("large"
>being > 1e6, say)...
2010 Feb 19
2
Best inode_ratio for maildir++ on ext4
...default settings I see this
difference in space (wasted space, but more inodes):
4328633696 free 1K-blocks with mkfs's "-T news" switch = 1219493877 free
inodes
4557288800 free 1K-blocks with default mkfs settings = 304873461 free inodes
I'll be storing e-mail messages for around 20,000 accounts on that
partition (averaging 512 MB per account). Would you consider it worth
wasting about 200 GB of filesystem space in exchange for more inodes?
Thanks.
Rodolfo.