Satish Vadlamani
2016-Jun-17  03:42 UTC
[R] what is the best way to process the following data?
Hello,

I have multiple text files with the format shown below (see the two files pasted below). Each file is a log of multiple steps that the system has processed, and for each step it shows the start time of the process step. For example, in the data below, the filter started at |06/16/2016|03:44:16

How can I read this data so that Step 001 is one data frame, Step 002 is another, and so on? After I do this, I will compare the Step 001 times with and without parallel processing.

For example, the files pasted below, "no_parallel_process_SLS_4.txt" and "parallel_process_SLS_4.txt", will make it clear what I am trying to do. I want to compare the parallel process times taken for each step with the non-parallel process times.

If there are better ways of performing this task than what I am thinking, could you let me know? Thanks in advance.

Satish Vadlamani

>> parallel_process_file.txt

|06/16/2016|03:44:16|Step 001
|06/16/2016|03:44:16|Initialization
|06/16/2016|03:44:16|Filters
|06/16/2016|03:45:03|Split Items
|06/16/2016|03:46:20|Sort
|06/16/2016|03:46:43|Check
|06/16/2016|04:01:13|Save
|06/16/2016|04:04:35|Update preparation
|06/16/2016|04:04:36|Update comparison
|06/16/2016|04:04:38|Update
|06/16/2016|04:04:38|Update
|06/16/2016|04:06:01|Close
|06/16/2016|04:06:33|BOP processing for 7,960 items has finished
|06/16/2016|04:06:34|Step 002
|06/16/2016|04:06:35|Initialization
|06/16/2016|04:06:35|Filters
|06/16/2016|04:07:14|Split Items
|06/16/2016|04:08:57|Sort
|06/16/2016|04:09:06|Check
|06/16/2016|04:26:36|Save
|06/16/2016|04:39:29|Update preparation
|06/16/2016|04:39:31|Update comparison
|06/16/2016|04:39:43|Update
|06/16/2016|04:39:45|Update
|06/16/2016|04:44:28|Close
|06/16/2016|04:45:26|BOP processing for 8,420 items has finished
|06/16/2016|04:45:27|Step 003
|06/16/2016|04:45:27|Initialization
|06/16/2016|04:45:27|Filters
|06/16/2016|04:48:50|Split Items
|06/16/2016|04:55:15|Sort
|06/16/2016|04:55:40|Check
|06/16/2016|05:13:35|Save
|06/16/2016|05:17:34|Update preparation
|06/16/2016|05:17:34|Update comparison
|06/16/2016|05:17:36|Update
|06/16/2016|05:17:36|Update
|06/16/2016|05:19:29|Close
|06/16/2016|05:19:49|BOP processing for 8,876 items has finished
|06/16/2016|05:19:50|Step 004
|06/16/2016|05:19:50|Initialization
|06/16/2016|05:19:50|Filters
|06/16/2016|05:20:43|Split Items
|06/16/2016|05:22:14|Sort
|06/16/2016|05:22:29|Check
|06/16/2016|05:37:27|Save
|06/16/2016|05:38:43|Update preparation
|06/16/2016|05:38:44|Update comparison
|06/16/2016|05:38:45|Update
|06/16/2016|05:38:45|Update
|06/16/2016|05:39:09|Close
|06/16/2016|05:39:19|BOP processing for 5,391 items has finished
|06/16/2016|05:39:20|Step 005
|06/16/2016|05:39:20|Initialization
|06/16/2016|05:39:20|Filters
|06/16/2016|05:39:57|Split Items
|06/16/2016|05:40:21|Sort
|06/16/2016|05:40:24|Check
|06/16/2016|05:46:01|Save
|06/16/2016|05:46:54|Update preparation
|06/16/2016|05:46:54|Update comparison
|06/16/2016|05:46:54|Update
|06/16/2016|05:46:55|Update
|06/16/2016|05:47:24|Close
|06/16/2016|05:47:31|BOP processing for 3,016 items has finished
|06/16/2016|05:47:32|Step 006
|06/16/2016|05:47:32|Initialization
|06/16/2016|05:47:32|Filters
|06/16/2016|05:47:32|Update preparation
|06/16/2016|05:47:32|Update comparison
|06/16/2016|05:47:32|Update
|06/16/2016|05:47:32|Close
|06/16/2016|05:47:33|BOP processing for 0 items has finished
|06/16/2016|05:47:33|Step 007
|06/16/2016|05:47:33|Initialization
|06/16/2016|05:47:33|Filters
|06/16/2016|05:47:34|Split Items
|06/16/2016|05:47:34|Sort
|06/16/2016|05:47:34|Check
|06/16/2016|05:47:37|Save
|06/16/2016|05:47:37|Update preparation
|06/16/2016|05:47:37|Update comparison
|06/16/2016|05:47:37|Update
|06/16/2016|05:47:37|Update
|06/16/2016|05:47:37|Close
|06/16/2016|05:47:37|BOP processing for 9 items has finished
|06/16/2016|05:47:37|Step 008
|06/16/2016|05:47:37|Initialization
|06/16/2016|05:47:37|Filters
|06/16/2016|05:47:38|Update preparation
|06/16/2016|05:47:38|Update comparison
|06/16/2016|05:47:38|Update
|06/16/2016|05:47:38|Close
|06/16/2016|05:47:38|BOP processing for 0 items has finished

>> no_parallel_process_file.txt

|06/15/2016|22:52:46|Step 001
|06/15/2016|22:52:46|Initialization
|06/15/2016|22:52:46|Filters
|06/15/2016|22:54:21|Split Items
|06/15/2016|22:55:10|Sort
|06/15/2016|22:55:15|Check
|06/15/2016|23:04:43|Save
|06/15/2016|23:06:38|Update preparation
|06/15/2016|23:06:38|Update comparison
|06/15/2016|23:06:39|Update
|06/15/2016|23:06:39|Update
|06/15/2016|23:12:04|Close
|06/15/2016|23:13:16|BOP processing for 7,942 items has finished
|06/15/2016|23:13:17|Step 002
|06/15/2016|23:13:17|Initialization
|06/15/2016|23:13:17|Filters
|06/15/2016|23:16:27|Split Items
|06/15/2016|23:20:18|Sort
|06/15/2016|23:20:34|Check
|06/16/2016|00:08:08|Save
|06/16/2016|00:26:19|Update preparation
|06/16/2016|00:26:20|Update comparison
|06/16/2016|00:26:30|Update
|06/16/2016|00:26:31|Update
|06/16/2016|00:42:31|Close
|06/16/2016|00:45:09|BOP processing for 8,400 items has finished
|06/16/2016|00:45:11|Step 003
|06/16/2016|00:45:12|Initialization
|06/16/2016|00:45:12|Filters
|06/16/2016|00:53:01|Split Items
|06/16/2016|01:01:44|Sort
|06/16/2016|01:02:55|Check
|06/16/2016|01:41:40|Save
|06/16/2016|01:44:37|Update preparation
|06/16/2016|01:44:37|Update comparison
|06/16/2016|01:44:39|Update
|06/16/2016|01:44:39|Update
|06/16/2016|01:47:37|Close
|06/16/2016|01:48:07|BOP processing for 8,867 items has finished
|06/16/2016|01:48:08|Step 004
|06/16/2016|01:48:08|Initialization
|06/16/2016|01:48:08|Filters
|06/16/2016|01:49:51|Split Items
|06/16/2016|01:50:35|Sort
|06/16/2016|01:50:39|Check
|06/16/2016|01:59:12|Save
|06/16/2016|02:00:47|Update preparation
|06/16/2016|02:00:47|Update comparison
|06/16/2016|02:00:48|Update
|06/16/2016|02:00:48|Update
|06/16/2016|02:02:40|Close
|06/16/2016|02:02:55|BOP processing for 5,383 items has finished
|06/16/2016|02:02:56|Step 005
|06/16/2016|02:02:56|Initialization
|06/16/2016|02:02:56|Filters
|06/16/2016|02:03:47|Split Items
|06/16/2016|02:04:19|Sort
|06/16/2016|02:04:21|Check
|06/16/2016|02:08:08|Save
|06/16/2016|02:09:22|Update preparation
|06/16/2016|02:09:22|Update comparison
|06/16/2016|02:09:22|Update
|06/16/2016|02:09:22|Update
|06/16/2016|02:11:03|Close
|06/16/2016|02:11:14|BOP processing for 3,016 items has finished
|06/16/2016|02:11:14|Step 006
|06/16/2016|02:11:14|Initialization
|06/16/2016|02:11:14|Filters
|06/16/2016|02:11:15|Update preparation
|06/16/2016|02:11:15|Update comparison
|06/16/2016|02:11:15|Update
|06/16/2016|02:11:15|Close
|06/16/2016|02:11:15|BOP processing for 0 items has finished
|06/16/2016|02:11:15|Step 007
|06/16/2016|02:11:15|Initialization
|06/16/2016|02:11:15|Filters
|06/16/2016|02:11:17|Split Items
|06/16/2016|02:11:17|Sort
|06/16/2016|02:11:17|Check
|06/16/2016|02:11:20|Save
|06/16/2016|02:11:20|Update preparation
|06/16/2016|02:11:20|Update comparison
|06/16/2016|02:11:20|Update
|06/16/2016|02:11:20|Update
|06/16/2016|02:11:20|Close
|06/16/2016|02:11:20|BOP processing for 9 items has finished
|06/16/2016|02:11:20|Step 008
|06/16/2016|02:11:20|Initialization
|06/16/2016|02:11:21|Filters
|06/16/2016|02:11:21|Update preparation
|06/16/2016|02:11:21|Update comparison
|06/16/2016|02:11:21|Update
|06/16/2016|02:11:21|Close
|06/16/2016|02:11:21|BOP processing for 0 items has finished

--
Satish Vadlamani
William Dunlap
2016-Jun-17  15:57 UTC
[R] what is the best way to process the following data?
You can make a step-number variable with cumsum(grepl("^Step ", ...)) and
use it as the splitting variable in split().  E.g.,

> dat <- read.table(yourFile, stringsAsFactors=FALSE, sep="|",
+     colClasses=c("NULL", "character", "character", "character"),
+     col.names=c("Junk","Date","Time","Type"))
> dat <- with(dat, data.frame(DateTime=as.POSIXct(paste(Date, Time),
+     format="%m/%d/%Y %H:%M:%S"), Type=Type,
+     stringsAsFactors=FALSE))
> head(dat)
             DateTime           Type
1 2016-06-16 03:44:16       Step 001
2 2016-06-16 03:44:16 Initialization
3 2016-06-16 03:44:16        Filters
4 2016-06-16 03:45:03    Split Items
5 2016-06-16 03:46:20           Sort
6 2016-06-16 03:46:43          Check
> split(dat, cumsum(grepl("^Step ", dat$Type)))
$`1`
              DateTime                                        Type
1  2016-06-16 03:44:16                                    Step 001
2  2016-06-16 03:44:16                              Initialization
...
13 2016-06-16 04:06:33 BOP processing for 7,960 items has finished

$`2`
              DateTime                                        Type
14 2016-06-16 04:06:34                                    Step 002
15 2016-06-16 04:06:35                              Initialization
...
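
As an illustration of why the cumsum(grepl(...)) idiom works, here is a minimal sketch on a made-up six-element vector. The names type, is_start, step_id and groups are hypothetical and chosen only for this example; the values imitate the Type column of the log.

```r
# Hypothetical miniature of the Type column (values imitate the log format).
type <- c("Step 001", "Initialization", "Filters",
          "Step 002", "Initialization", "Close")

# grepl() flags the rows that open a new step (TRUE on "Step ..." rows).
is_start <- grepl("^Step ", type)

# cumsum() turns those flags into a running step number:
# every row inherits the number of the most recent "Step ..." row.
step_id <- cumsum(is_start)
print(step_id)

# split() then groups the rows by that running number.
groups <- split(type, step_id)
print(groups)
```

The same step_id vector can be kept as an ordinary column instead of being passed to split(), which is sometimes more convenient for later aggregation.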
Bill Dunlap
TIBCO Software
wdunlap tibco.com
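
Building on the reply above, the comparison the original poster asks for could be sketched roughly as follows. This is not part of the reply; it is a minimal sketch under stated assumptions: the helper names read_log() and step_seconds() are hypothetical, the elapsed time of a step is taken to be the span from its "Step NNN" line to its last logged line, timestamps are parsed as UTC to sidestep daylight-saving effects, and the two character vectors are shortened excerpts (first and last line of Steps 001 and 002) of the logs in the question. In practice readLines() on each file would supply the lines.

```r
# Shortened excerpts of the two logs: first and last line of Steps 001-002.
parallel_lines <- c(
  "|06/16/2016|03:44:16|Step 001",
  "|06/16/2016|04:06:33|BOP processing for 7,960 items has finished",
  "|06/16/2016|04:06:34|Step 002",
  "|06/16/2016|04:45:26|BOP processing for 8,420 items has finished")
no_parallel_lines <- c(
  "|06/15/2016|22:52:46|Step 001",
  "|06/15/2016|23:13:16|BOP processing for 7,942 items has finished",
  "|06/15/2016|23:13:17|Step 002",
  "|06/16/2016|00:45:09|BOP processing for 8,400 items has finished")

# Hypothetical helper: parse log lines into a data frame with a timestamp
# and a running step number (the cumsum(grepl()) idiom from the reply).
read_log <- function(lines) {
  dat <- read.table(text = lines, sep = "|", quote = "", comment.char = "",
                    stringsAsFactors = FALSE, strip.white = TRUE,
                    colClasses = c("NULL", "character", "character",
                                   "character"),
                    col.names = c("Junk", "Date", "Time", "Type"))
  dat$DateTime <- as.POSIXct(paste(dat$Date, dat$Time),
                             format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
  dat$Step <- cumsum(grepl("^Step ", dat$Type))
  dat
}

# Hypothetical helper: seconds between first and last timestamp of each step.
step_seconds <- function(dat) {
  tapply(as.numeric(dat$DateTime), dat$Step, function(t) max(t) - min(t))
}

par_secs <- step_seconds(read_log(parallel_lines))
seq_secs <- step_seconds(read_log(no_parallel_lines))

# One row per step, matched by step number.
comparison <- data.frame(
  Step             = names(par_secs),
  parallel_secs    = as.vector(par_secs),
  no_parallel_secs = as.vector(seq_secs[names(par_secs)]))
comparison$ratio <- comparison$no_parallel_secs / comparison$parallel_secs
print(comparison)
```

Matching by step number assumes both runs log the same steps in the same order; if a step could be missing from one file, merge() on the Step column would be a safer way to line the runs up.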