similar to: fast way to search for a pattern in a few million entries data frame

Displaying 20 results from an estimated 1000 matches similar to: "fast way to search for a pattern in a few million entries data frame"

2016 Apr 10
2
what is the faster way to search for a pattern in a few million entries data frame ?
Hi there, I have a data frame DF with 40 million strings and their frequencies. I am searching for strings that match a given pattern and I am trying to speed up this part of my code. I have tried many options, but so far I am not satisfied. I tried: - grepl and subset, which are equivalent in terms of processing time: grepl(paste0("^",pattern),df$Strings) subset(df,
2016 Apr 10
0
what is the faster way to search for a pattern in a few million entries data frame ?
Hi Fabien, I was going to send this last night, but I thought it was too simple. Runs in about one millisecond. df<-data.frame(freq=runif(1000), strings=apply(matrix(sample(LETTERS,10000,TRUE),ncol=10), 1,paste,collapse="")) match.ind<-grep("DF",df$strings) match.ind [1] 2 11 91 133 169 444 547 605 734 943 Jim On Mon, Apr 11, 2016 at 5:27 AM, Fabien Tarrade
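A minimal sketch of the prefix search discussed in this thread, assuming the pattern is a literal prefix with no regex metacharacters. The toy data frame and the names `hits_regex`, `hits_prefix` and `matches` are illustrative and not from the original posts. The sketch compares the anchored `grepl` call from the question with base R's `startsWith()` (available from R 3.3.0), which tests the prefix without the regex engine. Whether it is actually faster on 40 million rows would need to be measured on the real data.

```r
# Hypothetical toy data with the same shape as the example in the thread
set.seed(1)
df <- data.frame(
  freq = runif(1000),
  strings = apply(matrix(sample(LETTERS, 10000, TRUE), ncol = 10),
                  1, paste, collapse = ""),
  stringsAsFactors = FALSE  # keep strings as character, not factor
)
pattern <- "DF"  # assumed to be a literal prefix

# Regex search anchored at the start of each string
hits_regex <- grepl(paste0("^", pattern), df$strings)

# Literal prefix test, no regex engine involved
hits_prefix <- startsWith(df$strings, pattern)

# Both approaches flag the same rows
stopifnot(identical(hits_regex, hits_prefix))

# Subset the data frame with the logical index
matches <- df[hits_prefix, ]
```

Timing both calls with `system.time()` on a realistic sample is the simplest way to check which one suits a given data set.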
2016 Apr 10
5
what is the faster way to search for a pattern in a few million entries data frame ?
Hi Duncan, > Didn't you post the same question yesterday? Perhaps nobody answered > because your question is unanswerable. Sorry, I got an email saying that my message was waiting for approval, and when I looked at the forum I didn't see my message. This is why I sent it again, and this time I checked that the format of my message was text only. Sorry for the noise. > You need to
2016 Apr 10
0
what is the faster way to search for a pattern in a few million entries data frame ?
On 10/04/2016 2:03 PM, Fabien Tarrade wrote: > Hi there, > > I have a data frame DF with 40 million strings and their frequencies. I > am searching for strings with a given pattern and I am trying to speed > up this part of my code. I tried many options but so far I am not > satisfied. I tried: > - grepl and subset are equivalent in terms of processing time >
2016 Apr 10
0
what is the faster way to search for a pattern in a few million entries data frame ?
On 04/10/2016 03:27 PM, Fabien Tarrade wrote: > Hi Duncan, >> Didn't you post the same question yesterday? Perhaps nobody answered >> because your question is unanswerable. > sorry, I got an email that my message was waiting for approval and when I > looked at the forum I didn't see my message and this is why I sent it > again and this time I did check that the
2017 Sep 14
1
Print All Warnings that Occurr in All Parallel Nodes
Dear R Users, I have developed the following code for importing a series of zipped CSV files using parallel computing. My problems are: A) Some ZIP files (which contain the CSVs) are corrupted and cannot be opened. B) After executing parRapply I can only inspect the last.warning variable to find out which CSVs failed in each node, so I cannot see all warnings, only one at a time. So: *
2012 Oct 23
0
Typos/omissions/inconsistencies in man page for clusterApply
Hi, Here are the issues I found: Typos ----- (a) Found: It a parallel version of 'evalq', "is" missing. (b) Found: 'parLapplyLB', 'parSapplyLB' are load-balancing versions, intended for use when applying 'FUN' to 'parLapplyLB' has no 'FUN' arg (more on this below). (c) Found: 'clusterApply' calls 'fun' on the first
2020 May 18
1
parRapply and parCapply return a list in corner cases
According to ?parCapply: parRapply and parCapply always return a vector. This appears not to be the case in the following minimal reproducible example: > library(parallel) > nslaves <- 2 > cl <- makeCluster(nslaves) > X <- matrix(2,nrow=3,ncol=4) > X <- rbind(c(1,1,0,1),X) > tv <- parCapply(cl,X,FUN=function(x){ +
2012 Mar 21
1
Invitation to connect on LinkedIn
LinkedIn ------------ I would like to invite you to join my professional network online, on the LinkedIn site. Fabien Fabien Dupont Unix/Linux System Architect at CEA France Please confirm that you know Fabien Dupont: https://www.linkedin.com/e/h24oga-h02h13s2-1w/isd/6381163954/lTfQQJ_9/?hs=false&tok=2oumAa6oXER581 -- You are receiving invitations to connect
2018 Feb 27
0
Re: [PATCH] v2v: remove MAC address related information
In this case, shouldn't that be done by the upload-disk API? When you upload a VM, oVirt knows what to do with it and handles the locking and any operations required to make the VM an oVirt VM. On 27 February 2018 at 13:53, Tomáš Golembiovský <tgolembi@redhat.com> wrote: > On Tue, 27 Feb 2018 13:43:59 +0100 > Fabien Dupont <fdupont@redhat.com> wrote: > > > We can
2018 Sep 25
0
Re: OpenStack output - server_id
On my current instance, the meta_data.json is the following: { "availability_zone": "nova", "devices": [], "hostname": "ims-host-1", "keys": [ { "data": "ssh-rsa
2003 Dec 17
0
help: samba server don't work in embeded linux
Hi all, I want to use samba as a file server in an embedded environment. Because I have only 8M of flash to hold the Linux file system, I have to put the samba files on the hard disk. We mount the hard disk as /mnt/c, mkdir samba in /mnt/c, and mkdir bin,lib,log,pid,codepage in /mnt/c/samba. We put smbd,nmbd in /mnt/c/samba/bin, we put all the lib files needed in /mnt/c/samba/lib and make all the
2003 Aug 26
2
French Translation of the Shorewall Setup Guide
Thanks to Fabien Luciole, there is now a French version of the Shorewall Setup Guide (http://shorewall.net/shorewall_setup_guide_fr.htm). Thanks Fabien!!! -Tom -- Tom Eastep \ Shorewall - iptables made easy Shoreline, \ http://shorewall.net Washington USA \ teastep@shorewall.net
2018 Oct 04
1
Re: OpenStack output workflow
New code tries SIGTERM first, with a grace period of 30 seconds: https://github.com/ManageIQ/manageiq-content/pull/433. On Wed, Sep 26, 2018 at 6:10 PM Richard W.M. Jones <rjones@redhat.com> wrote: > On Wed, Sep 26, 2018 at 04:57:19PM +0200, Fabien Dupont wrote: > > It's not virt-v2v-wrapper that kills virt-v2v, it's ManageIQ. We have the > > PID from
2020 Jan 26
2
Vacation use different SMTP server
Thanks for the idea, but it won't work for me as 'internal domains' can be anything, including gmail.com (and I don't know which of them are really internal/local; this is decided by the sending SMTP server every time something is sent, based on MX records). The problem is that Dovecot/Sieve is using the wrong SMTP server (the one used for receiving e-mails, which should NEVER be used for sending [and
2020 Jan 28
0
Vacation use different SMTP server
Really no more info? Citát azurit at pobox.sk: > Thanks for idea but it won't work for me as 'internal domains' can > be anything, including gmail.com (and i don't know which of them are > really internal/local, this is decided by sending SMTP everytime > something is sent, based on MX records). Problem is that > Dovecot/Sieve is using wrong SMTP server
2002 Nov 25
0
GraspeR - functions and GUI for spatial predictions written for R
Hi, I would like to announce the first version of GraspeR. It is a port of GRASP (Generalized Regression Analysis and Spatial Predictions) written for S-Plus, to R. It serves as an "automated" method for doing spatial predictions. You can find the first testing version at http://www.fivaz.ch/grasper/index.html. For now, I provide a dump file you can read with source(). A UNIX package is
2002 Oct 24
4
To compare Linux journalised filesystem, part II.
Back again. After gathering all the information I received, I put it in a table as follows: see the attached file. Can specialists tell me whether they agree with my conclusions? Thanks for your good work. Fabien. -- Fabien COMBERNOUS - IT Engineer eProcess - Parc Club du Millénaire Batiment n° 6 1025 rue Henri Becquerel - 34000 Montpellier FRANCE http://www.eprocess.fr - +33 (0)4 67 13 84 50
2020 Jan 31
0
Vacation use different SMTP server
Is Pigeonhole really using LDA for sending? My Dovecot is using LMTP, not LDA. Citát Stephan Bosch <stephan at rename-it.nl>: > Op 28-1-2020 om 19:20 schreef azurit at pobox.sk: >> Really no more info? > > > You could do something with the sendmail_path or submission_host settings. > > Regards, > > Stephan. > >> >> >> >> >>
2020 Jan 31
2
Vacation use different SMTP server
Op 28-1-2020 om 19:20 schreef azurit at pobox.sk: > Really no more info? You could do something with the sendmail_path or submission_host settings. Regards, Stephan. > > > > > Citát azurit at pobox.sk: > >> Thanks for idea but it won't work for me as 'internal domains' can be >> anything, including gmail.com (and i don't know which of them are