Hi,
I have been using R for a few months and I have this working
code. I don't see any problem with it, but it takes a long time. With
about 30000 rows it takes a few minutes. With 100000 rows it does not seem
to complete.
Original Data:
Proto Recv-Q Send-Q Local-Address          Foreign-Address       State
tcp   0      0      172.20.100.2:60255     172.20.100.3:8209     ESTABLISHED
tcp   0      0      172.20.100.2:60247     172.20.100.3:8209     ESTABLISHED
tcp   0      0      ::ffff:172.20.100.2:80 ::ffff:10.1.5.7:3185  TIME_WAIT
tcp   0      0      ::ffff:172.20.100.2:80 ::ffff:10.5.1.3:3189  TIME_WAIT
tcp   0      0      ::ffff:172.20.100.2:80 ::ffff:10.5.5.7:3445  TIME_WAIT
tcp   0      0      ::ffff:172.20.100.2:80 ::ffff:10.3.29.3:2671 TIME_WAIT
Parsed Data:
tcp   0      0      172.20.100.2:60255     172.20.100.3:8209     ESTABLISHED
tcp   0      0      172.20.100.2:60247     172.20.100.3:8209     ESTABLISHED
Here I am just splitting at the colons to get the IPs and ports. That is
all. Can this code be improved?
library(stringr)  # provides str_split()
data <- read.table("D:\\Log Analysis\\26-9-2013\\concurrentusage-node1",
                   sep="", header=TRUE, stringsAsFactors=FALSE, fill=TRUE)
var <- c("Foreign.Address")
data[,var] <- sapply(data[,var], function(x)
  ifelse(length(unlist(str_split(x,":")))==5,
         unlist(str_split(x,":"))[4],
         unlist(str_split(x,":"))[1]))
var <- c("Local.Address")
data[,var] <- sapply(data[,var], function(x)
  ifelse(length(unlist(str_split(x,":")))==5,
         paste(unlist(str_split(x,":"))[4],":",unlist(str_split(x,":"))[5]),
         paste(unlist(str_split(x,":"))[1],":",unlist(str_split(x,":"))[2])))
Thanks,
Mohan
This e-Mail may contain proprietary and confidential information and is sent for
the intended recipient(s) only. If by an addressing or transmission error this
mail has been misdirected to you, you are requested to delete this mail
immediately. You are also hereby notified that any use, any form of
reproduction, dissemination, copying, disclosure, modification, distribution
and/or publication of this e-mail message, contents or its attachment other than
by its intended recipient/s is strictly prohibited.
Visit us at http://www.polarisFT.com
Is this what you want? Please use dput when providing data. This should be faster using regular expressions:

> x <- read.table(text = "Proto Recv-Q Send-Q Local-Address Foreign-Address State
+ tcp 0 0 172.20.100.2:60255 172.20.100.3:8209 ESTABLISHED
+ tcp 0 0 172.20.100.2:60247 172.20.100.3:8209 ESTABLISHED
+ tcp 0 0 ::ffff:172.20.100.2:80 ::ffff:10.1.5.7:3185 TIME_WAIT
+ tcp 0 0 ::ffff:172.20.100.2:80 ::ffff:10.5.1.3:3189 TIME_WAIT
+ tcp 0 0 ::ffff:172.20.100.2:80 ::ffff:10.5.5.7:3445 TIME_WAIT
+ tcp 0 0 ::ffff:172.20.100.2:80 ::ffff:10.3.29.3:2671 TIME_WAIT"
+     , as.is = TRUE
+     , header = TRUE
+     , check.names = FALSE
+     )
> x  # before
  Proto Recv-Q Send-Q          Local-Address       Foreign-Address       State
1   tcp      0      0     172.20.100.2:60255     172.20.100.3:8209 ESTABLISHED
2   tcp      0      0     172.20.100.2:60247     172.20.100.3:8209 ESTABLISHED
3   tcp      0      0 ::ffff:172.20.100.2:80  ::ffff:10.1.5.7:3185   TIME_WAIT
4   tcp      0      0 ::ffff:172.20.100.2:80  ::ffff:10.5.1.3:3189   TIME_WAIT
5   tcp      0      0 ::ffff:172.20.100.2:80  ::ffff:10.5.5.7:3445   TIME_WAIT
6   tcp      0      0 ::ffff:172.20.100.2:80 ::ffff:10.3.29.3:2671   TIME_WAIT
> for (i in c("Local-Address", "Foreign-Address")){
+     x[[i]] <- sub("[^0-9]*(.*)", "\\1", x[[i]])  # ignore up to first digit
+ }
> x  # after
  Proto Recv-Q Send-Q      Local-Address   Foreign-Address       State
1   tcp      0      0 172.20.100.2:60255 172.20.100.3:8209 ESTABLISHED
2   tcp      0      0 172.20.100.2:60247 172.20.100.3:8209 ESTABLISHED
3   tcp      0      0    172.20.100.2:80     10.1.5.7:3185   TIME_WAIT
4   tcp      0      0    172.20.100.2:80     10.5.1.3:3189   TIME_WAIT
5   tcp      0      0    172.20.100.2:80     10.5.5.7:3445   TIME_WAIT
6   tcp      0      0    172.20.100.2:80    10.3.29.3:2671   TIME_WAIT
>

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
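To make the speed difference concrete, here is a minimal timing sketch. It is not a benchmark from the thread: the data are synthetic, the function names slow() and fast() are made up for illustration, and base strsplit() is substituted for stringr's str_split() so the block has no package dependency. The per-element version only approximates the original post's logic (it joins IP and port without spaces).

```r
# Synthetic addresses (made up for illustration): every second one
# carries the "::ffff:" prefix seen in the sample data.
set.seed(1)
n <- 100000
plain <- paste0("10.", sample(0:255, n, TRUE), ".", sample(0:255, n, TRUE),
                ".", sample(0:255, n, TRUE), ":", sample(1024:65535, n, TRUE))
addr <- ifelse(seq_len(n) %% 2 == 0, paste0("::ffff:", plain), plain)

# Per-element approach, similar in spirit to the original post:
# one R function call and one split per row.
slow <- function(x) {
  unname(sapply(x, function(s) {
    p <- strsplit(s, ":", fixed = TRUE)[[1]]
    if (length(p) == 5) paste0(p[4], ":", p[5]) else paste0(p[1], ":", p[2])
  }))
}

# Vectorized approach: a single sub() call over the whole column,
# dropping everything before the first digit.
fast <- function(x) sub("^[^0-9]+", "", x)

t_slow <- system.time(r_slow <- slow(addr))
t_fast <- system.time(r_fast <- fast(addr))

# Both approaches should agree on this synthetic input.
stopifnot(identical(r_slow, r_fast))

# Elapsed seconds for each approach; the numbers depend on the machine.
print(c(slow = t_slow[["elapsed"]], fast = t_fast[["elapsed"]]))
```

The original code is slower still than slow() above, because it calls str_split() up to five times per row and wraps scalar logic in ifelse().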
Hi,
Please use dput() to provide your data (see ?dput).
dat1 <- structure(list(
  Proto = c("tcp", "tcp", "tcp", "tcp", "tcp", "tcp"),
  `Recv-Q` = c(0L, 0L, 0L, 0L, 0L, 0L),
  `Send-Q` = c(0L, 0L, 0L, 0L, 0L, 0L),
  `Local-Address` = c("172.20.100.2:60255", "172.20.100.2:60247",
    "::ffff:172.20.100.2:80", "::ffff:172.20.100.2:80",
    "::ffff:172.20.100.2:80", "::ffff:172.20.100.2:80"),
  `Foreign-Address` = c("172.20.100.3:8209", "172.20.100.3:8209",
    "::ffff:10.1.5.7:3185", "::ffff:10.5.1.3:3189",
    "::ffff:10.5.5.7:3445", "::ffff:10.3.29.3:2671"),
  State = c("ESTABLISHED", "ESTABLISHED", "TIME_WAIT", "TIME_WAIT",
    "TIME_WAIT", "TIME_WAIT")),
  .Names = c("Proto", "Recv-Q", "Send-Q", "Local-Address",
    "Foreign-Address", "State"),
  class = "data.frame", row.names = c(NA, -6L))
library(stringr)
dat1[,4:5] <- lapply(dat1[,4:5], function(x) str_replace(x, "^\\D+", ""))
dat1
#  Proto Recv-Q Send-Q      Local-Address   Foreign-Address       State
#1   tcp      0      0 172.20.100.2:60255 172.20.100.3:8209 ESTABLISHED
#2   tcp      0      0 172.20.100.2:60247 172.20.100.3:8209 ESTABLISHED
#3   tcp      0      0    172.20.100.2:80     10.1.5.7:3185   TIME_WAIT
#4   tcp      0      0    172.20.100.2:80     10.5.1.3:3189   TIME_WAIT
#5   tcp      0      0    172.20.100.2:80     10.5.5.7:3445   TIME_WAIT
#6   tcp      0      0    172.20.100.2:80    10.3.29.3:2671   TIME_WAIT
A.K.
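Both replies strip everything before the first digit, which works for the sample rows shown. A slightly more explicit variant, sketched here under the assumption that the only unwanted prefix is the IPv4-mapped IPv6 marker "::ffff:" and that the port is whatever follows the last colon, removes just that prefix and then separates IP from port. The vector addr and the names clean, ip and port are hypothetical and not from the thread.

```r
# Hypothetical sample vector standing in for one address column
addr <- c("172.20.100.2:60255", "::ffff:172.20.100.2:80",
          "::ffff:10.3.29.3:2671")

# Remove the "::ffff:" prefix where present; other values are untouched
clean <- sub("^::ffff:", "", addr)

# Split at the last colon: IP on the left, port on the right
ip   <- sub(":[^:]*$", "", clean)
port <- as.integer(sub("^.*:", "", clean))

print(data.frame(address = clean, ip = ip, port = port))
```

All three calls are vectorized, so they run once over the whole column with base R only. Real IPv6 addresses without the "::ffff:" prefix would need different handling.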
----- Original Message -----
From: "mohan.radhakrishnan at polarisft.com" <mohan.radhakrishnan at polarisft.com>
To: r-help at r-project.org
Cc:
Sent: Friday, September 27, 2013 3:42 AM
Subject: [R] Locating inefficient code
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.