Displaying 20 results from an estimated 1000 matches similar to: "fast way to search for a pattern in a few million entries data frame"
2016 Apr 10
2
what is the faster way to search for a pattern in a few million entries data frame ?
Hi there,
I have a data frame DF with 40 million strings and their frequencies. I
am searching for strings that match a given pattern, and I am trying to speed
up this part of my code. I have tried many options, but so far I am not
satisfied. I tried:
- grepl and subset are equivalent in terms of processing time
grepl(paste0("^",pattern),df$Strings)
subset(df,
2016 Apr 10
0
what is the faster way to search for a pattern in a few million entries data frame ?
Hi Fabien,
I was going to send this last night, but I thought it was too simple.
Runs in about one millisecond.
df <- data.frame(freq = runif(1000),
                 strings = apply(matrix(sample(LETTERS, 10000, TRUE), ncol = 10),
                                 1, paste, collapse = ""))
match.ind <- grep("DF", df$strings)
match.ind
[1] 2 11 91 133 169 444 547 605 734 943
Jim
On Mon, Apr 11, 2016 at 5:27 AM, Fabien Tarrade
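The anchored search from the original question (grepl with a "^" prefix) can be compared against a fixed-prefix test. The sketch below is not from the thread: it assumes the pattern is a literal prefix with no regular-expression metacharacters, uses base R's startsWith() (available since R 3.3.0), and builds a small hypothetical data frame in place of the 40-million-row one. No timings are claimed here; measure on your own data.

```r
set.seed(1)

# Hypothetical stand-in for the large data frame described in the question
df <- data.frame(
  strings = apply(matrix(sample(LETTERS, 10000, TRUE), ncol = 10),
                  1, paste, collapse = ""),
  freq = runif(1000),
  stringsAsFactors = FALSE
)

prefix <- "DF"  # assumed to be a literal prefix, not a regular expression

# Approach from the question: anchored regular expression
hits_regex <- grepl(paste0("^", prefix), df$strings)

# Alternative: fixed-prefix comparison, no regex engine involved
hits_fixed <- startsWith(df$strings, prefix)

# Both give the same logical index for a literal prefix
same_result <- identical(hits_regex, hits_fixed)

# Rows whose string begins with the prefix
matches <- df[hits_fixed, ]
```

If the pattern does contain metacharacters or must match anywhere in the string, startsWith() is not a substitute and grepl() (possibly with fixed = TRUE for literal substrings) remains the tool to use.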
2016 Apr 10
5
what is the faster way to search for a pattern in a few million entries data frame ?
Hi Duncan,
> Didn't you post the same question yesterday? Perhaps nobody answered
> because your question is unanswerable.
Sorry, I got an email saying that my message was waiting for approval, and when I
looked at the forum I didn't see it, which is why I sent it
again. This time I checked that the format of my message was text
only. Sorry for the noise.
> You need to
2016 Apr 10
0
what is the faster way to search for a pattern in a few million entries data frame ?
On 10/04/2016 2:03 PM, Fabien Tarrade wrote:
> Hi there,
>
> I have a data frame DF with 40 million strings and their frequencies. I
> am searching for strings that match a given pattern, and I am trying to speed
> up this part of my code. I have tried many options, but so far I am not
> satisfied. I tried:
> - grepl and subset are equivalent in terms of processing time
>
2016 Apr 10
0
what is the faster way to search for a pattern in a few million entries data frame ?
On 04/10/2016 03:27 PM, Fabien Tarrade wrote:
> Hi Duncan,
>> Didn't you post the same question yesterday? Perhaps nobody answered
>> because your question is unanswerable.
> Sorry, I got an email saying that my message was waiting for approval, and when I
> looked at the forum I didn't see it, which is why I sent it
> again. This time I checked that the
2017 Sep 14
1
Print All Warnings that Occurr in All Parallel Nodes
Dear R Users,
I have developed the following code for importing a series of zipped CSV files in parallel.
My problems are that:
A) Some ZIP files (which contain the CSVs) are corrupted and cannot be opened.
B) After executing parRapply, I can only inspect the last.warning variable to find out which CSVs failed on each node, so I see only one warning at a time rather than all of them.
So:
*
2012 Oct 23
0
Typos/omissions/inconsistencies in man page for clusterApply
Hi,
Here are the issues I found:
Typos
-----
(a) Found: It a parallel version of 'evalq',
"is" missing.
(b) Found: 'parLapplyLB', 'parSapplyLB' are load-balancing versions,
intended for use when applying 'FUN' to
'parLapplyLB' has no 'FUN' arg (more on this below).
(c) Found: 'clusterApply' calls 'fun' on the first
2020 May 18
1
parRapply and parCapply return a list in corner cases
According to ?parCapply:
parRapply and parCapply always return a vector.
This appears not to be the case in the following minimal reproducible example:
> library(parallel)
> nslaves <- 2
> cl <- makeCluster(nslaves)
> X <- matrix(2,nrow=3,ncol=4)
> X <- rbind(c(1,1,0,1),X)
> tv <- parCapply(cl,X,FUN=function(x){
+
2012 Mar 21
1
Invitation to connect on LinkedIn
LinkedIn
------------
I would like to invite you to join my professional network online, on the LinkedIn site.
Fabien
Fabien Dupont
Unix/Linux Systems Architect at CEA
France
Please confirm that you know Fabien Dupont:
https://www.linkedin.com/e/h24oga-h02h13s2-1w/isd/6381163954/lTfQQJ_9/?hs=false&tok=2oumAa6oXER581
--
You are receiving invitations to connect
2018 Feb 27
0
Re: [PATCH] v2v: remove MAC address related information
In this case, shouldn't that be done by the upload-disk API? When you
upload a VM, oVirt knows what to do with it and handles the locking and any
operations required to make the VM an oVirt VM.
On 27 February 2018 at 13:53, Tomáš Golembiovský <tgolembi@redhat.com>
wrote:
> On Tue, 27 Feb 2018 13:43:59 +0100
> Fabien Dupont <fdupont@redhat.com> wrote:
>
> > We can
2018 Sep 25
0
Re: OpenStack output - server_id
On my current instance, the meta_data.json is the following:
{
"availability_zone": "nova",
"devices": [],
"hostname": "ims-host-1",
"keys": [
{
"data": "ssh-rsa
2003 Dec 17
0
help: samba server don't work in embeded linux
hi all,
I want to use samba as a file server in an embedded environment. Because I have only 8M of flash to hold the Linux file system, I have to
put the samba files on the hard disk.
We mount the hard disk as /mnt/c, mkdir samba in /mnt/c, and mkdir
bin, lib, log, pid, codepage in /mnt/c/samba.
We put smbd and nmbd in /mnt/c/samba/bin.
We put all the required lib files in /mnt/c/samba/lib and make all the
2003 Aug 26
2
French Translation of the Shorewall Setup Guide
Thanks to Fabien Luciole, there is now a French version of the Shorewall
Setup Guide (http://shorewall.net/shorewall_setup_guide_fr.htm).
Thanks Fabien!!!
-Tom
--
Tom Eastep \ Shorewall - iptables made easy
Shoreline, \ http://shorewall.net
Washington USA \ teastep@shorewall.net
2018 Oct 04
1
Re: OpenStack output workflow
New code tries SIGTERM first, with a grace period of 30 seconds:
https://github.com/ManageIQ/manageiq-content/pull/433.
On Wed, Sep 26, 2018 at 6:10 PM Richard W.M. Jones <rjones@redhat.com>
wrote:
> On Wed, Sep 26, 2018 at 04:57:19PM +0200, Fabien Dupont wrote:
> > It's not virt-v2v-wrapper that kills virt-v2v, it's ManageIQ. We have the
> > PID from
2020 Jan 26
2
Vacation use different SMTP server
Thanks for the idea, but it won't work for me, as 'internal domains' can be
anything, including gmail.com (and I don't know which of them are
really internal/local; this is decided by the sending SMTP server every time
something is sent, based on MX records). The problem is that Dovecot/Sieve
is using the wrong SMTP server (the one used for receiving e-mails, which
should NEVER be used for sending [and
2020 Jan 28
0
Vacation use different SMTP server
Really no more info?
Quoting azurit at pobox.sk:
> Thanks for the idea, but it won't work for me, as 'internal domains' can
> be anything, including gmail.com (and I don't know which of them are
> really internal/local; this is decided by the sending SMTP server every time
> something is sent, based on MX records). The problem is that
> Dovecot/Sieve is using the wrong SMTP server
2002 Nov 25
0
GraspeR - functions and GUI for spatial predictions written for R
Hi,
I would like to announce the first version of GraspeR. It is a port of GRASP (Generalized Regression Analysis and Spatial Predictions) written for S-Plus, to R. It serves as an "automated" method for doing spatial predictions. You can find the first testing version at http://www.fivaz.ch/grasper/index.html. For now, I provide a dump file you can read with source(). A UNIX package is
2002 Oct 24
4
To compare Linux journalised filesystem, part II.
Back,
After gathering all the information I received, I put it in a table as follows:
see the attached file.
Can the specialists tell me whether they agree with my conclusions?
Thanks for your good work.
Fabien.
--
Fabien COMBERNOUS - IT Engineer
eProcess - Parc Club du Millénaire Batiment n° 6
1025 rue Henri Becquerel - 34000 Montpellier FRANCE
http://www.eprocess.fr - +33 (0)4 67 13 84 50
2020 Jan 31
0
Vacation use different SMTP server
Is Pigeonhole really using LDA for sending? My Dovecot is using LMTP, not LDA.
Quoting Stephan Bosch <stephan at rename-it.nl>:
> On 28-1-2020 at 19:20, azurit at pobox.sk wrote:
>> Really no more info?
>
>
> You could do something with the sendmail_path or submission_host settings.
>
> Regards,
>
> Stephan.
>
2020 Jan 31
2
Vacation use different SMTP server
On 28-1-2020 at 19:20, azurit at pobox.sk wrote:
> Really no more info?
You could do something with the sendmail_path or submission_host settings.
Regards,
Stephan.
>
>
>
>
> Quoting azurit at pobox.sk:
>
>> Thanks for the idea, but it won't work for me, as 'internal domains' can be
>> anything, including gmail.com (and I don't know which of them are