thr3ads.net - R help - [R] Working with "necessary" columns in R (CSV) [Nov 2010]

arturs.onzuls at gmail.com

2010-Nov-17 18:19 UTC

[R] Working with "necessary" columns in R (CSV)

Hi all. It would be great if someone could help me with my homework. The
deal: I have a .pcap file, which I convert to CSV using tcpdump
(tcpdump -tt -n -r x.pcap > x.csv).

The CSV file looks like this:

12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP, length 12
12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: TCP, length 12
12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: HTML, length 12
...

100000 rows.

Now I need to open the CSV in R and solve 5 problems, but I need to work only
with "UDP" packets (not TCP, HTML, ...). For example, I need to count how many
"UDP" packets there are, find the max and min time among the UDP packets, and
so on. I see only two options: either scan for "UDP" (but how?), or split the
CSV, keep only the rows I need, and work with those. Please help.


Petr Savicky

2010-Nov-17 19:15 UTC

You can read the file into R and extract only the UDP rows, for example:

  # assuming there is no header
  all <- read.csv("x.csv", stringsAsFactors=FALSE, header=FALSE)
  udp <- all[grep(" UDP$", all[, 2]), ]

Using a concatenation of three copies of your 3 rows as input, we get

  all
          V1                                                     V2         V3
  1 12890084  761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP  length 12
  2 12890084  761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: TCP  length 12
  3 12890084 761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: HTML  length 12
  4 12890084  761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP  length 12
  5 12890084  761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: TCP  length 12
  6 12890084 761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: HTML  length 12
  7 12890084  761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP  length 12
  8 12890084  761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: TCP  length 12
  9 12890084 761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: HTML  length 12

  udp
          V1                                                    V2         V3
  1 12890084 761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP  length 12
  4 12890084 761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP  length 12
  7 12890084 761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP  length 12
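
For the count and the time range asked about in the question, something along
these lines may work. It is only a sketch under stated assumptions: `pkts`
plays the role of `all` above (renamed here so that it does not mask base R's
all()), the sample lines are inlined instead of reading x.csv, and the comma in
"12890084,761659" is assumed to be a decimal separator, so the time is rebuilt
from column 1 plus the leading token of column 2. If the comma means something
else in your capture, the time reconstruction has to change.

```r
# Sample lines copied from the question, three copies each.
lines <- rep(c(
  "12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP, length 12",
  "12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: TCP, length 12",
  "12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: HTML, length 12"
), 3)

# Same read as above, but from the inlined text rather than from x.csv.
pkts <- read.csv(text = lines, stringsAsFactors = FALSE, header = FALSE)
udp <- pkts[grep(" UDP$", pkts[, 2]), ]

# Number of UDP packets.
n_udp <- nrow(udp)

# ASSUMPTION: V1 holds the integer part of the time and the first token of
# V2 holds the fractional part (comma used as decimal separator).
frac <- sub(" .*$", "", udp[, 2])
time <- as.numeric(paste0(udp[, 1], ".", frac))

# Earliest and latest UDP time.
time_range <- range(time)
```

With the real file, replace the `text = lines` argument with "x.csv".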

Note that there are three columns only, since your input had only three fields
per line. If you change the export to .csv so that, for example, column 2
contains only the protocol name, you could use

  table(all[, 2])

to get the number of occurrences of each protocol or

  sum(all[, 2] == "UDP")

to get the number of UDP rows or

  udp <- all[all[, 2] == "UDP", ]

to extract only UDP rows.

If you cannot change the export to .csv, you can use the function strsplit().
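
A minimal sketch of the strsplit() route, assuming every line has the same
shape as the three sample lines in the question, so that the protocol name is
the last space-separated token of column 2. The names `pkts`, `tokens` and
`proto` are made up for illustration, and the sample lines are inlined instead
of reading x.csv.

```r
# Sample lines copied from the question, three copies each.
lines <- rep(c(
  "12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: UDP, length 12",
  "12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: TCP, length 12",
  "12890084,761659 IP 10.10.20.20.47808 > 10.10.20.255.47808: HTML, length 12"
), 3)
pkts <- read.csv(text = lines, stringsAsFactors = FALSE, header = FALSE)

# Split column 2 on single spaces; strsplit() returns one character
# vector of tokens per row.
tokens <- strsplit(pkts[, 2], " ", fixed = TRUE)

# ASSUMPTION: the protocol name is the last token of each row.
proto <- vapply(tokens, function(x) x[length(x)], character(1))

# Occurrences of each protocol, number of UDP rows, and the UDP rows.
counts <- table(proto)
n_udp <- sum(proto == "UDP")
udp <- pkts[proto == "UDP", ]
```

Once `proto` exists, it can stand in for the "column 2 contains only the
protocol name" case described above.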

Petr Savicky.
