thr3ads.net - R help - [R] Alignment of data sets [Sep 2013]


Mostafavipak, Nasrin

2013-Sep-06 19:42 UTC

[R] Alignment of data sets

Hi all;

I have a data set with the format below:


Year, Day, Hour, Value

2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    2,    5.2
2010,  001,    3,    8.0
2010,  001,    4,    0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,    10,    16.1
2010,  001,    11,    18.5

That should be converted to this format:

2010,  001,    0,    15.9
2010,  001,    1,      7.3
2010,  001,    2,      5.2
2010,  001,    3,      8.0
2010,  001,    4,      0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,  10,    16.1
2010,  001,  11,    18.5

The number of spaces is important. I have tried justify, but it pads either the
start or the end of each row, depending on whether I choose right or left
alignment. I also need three digits in the second column: when I use read.csv it
gives me 1 instead of 001, so I use read.table instead. One of the problems with
read.table is that it produces row names that I don't want. I also need commas
in my output file.


So far this is the best I could do:

mydata = read.table("C:/ozone3.txt", sep = "")


capture.output( print(mydata, sep = ",", print.gap=3),
file="capture2.txt" )

and the output has all the unwanted row names and also there are no commas.


Any suggestions?

Thank you
Nasrin
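
Editorial note on the two side issues raised above (the lost leading zeros and the unwanted row names): a minimal sketch, assuming the file looks like the sample in the post. The data is held in a string here rather than read from C:/ozone3.txt, and the output goes to a temporary file; both are choices made for illustration. This only keeps "001" intact and writes commas without row names; it does not pad the columns.

```r
# Sample rows copied from the post, held in memory for illustration.
txt <- "Year, Day, Hour, Value
2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    10,    16.1"

# colClasses = "character" stops R from converting "001" to the number 1;
# strip.white = TRUE drops the padding around each field.
mydata <- read.csv(text = txt, colClasses = "character", strip.white = TRUE)

# row.names = FALSE omits the 1, 2, 3, ... labels, quote = FALSE leaves the
# fields unquoted, and sep = "," writes the commas.
out <- tempfile(fileext = ".txt")
write.table(mydata, out, sep = ",", row.names = FALSE, quote = FALSE)
readLines(out)
```

The row names in the original attempt come from print() on a data frame rather than from read.table itself, which is why switching the writer to write.table with row.names = FALSE removes them.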


arun

2013-Sep-07 02:08 UTC


[R] Alignment of data sets

Hi,

The question is not clear.

Lines1 <- readLines(textConnection("Year, Day, Hour, Value
2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    2,    5.2
2010,  001,    3,    8.0
2010,  001,    4,    0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,    10,    16.1
2010,  001,    11,    18.5"))

library(stringr)
# Looking at the spaces between each comma.

str_count(gsub("(\\d+,\\s+\\d+).*","\\1",Lines1[-1])," ")
# [1] 2 2 2 2 2 2 2 2 2 2 2 2
str_count(gsub("^\\d+,\\s+(\\d+,\\s+\\d+).*","\\1",Lines1[-1])," ")
# [1] 4 4 4 4 4 4 4 4 4 4 4 4
str_count(gsub("\\d+,\\s+\\d+,\\s+(\\d+,\\s+\\d+)","\\1",Lines1[-1])," ")
# [1] 4 4 4 4 4 4 4 4 4 4 4 4

# Remove all spaces, then put exactly three spaces after every comma.
Lines2 <- gsub(",",",   ",gsub(" ","",Lines1))[-1]
str_count(Lines2," ")
# [1] 9 9 9 9 9 9 9 9 9 9 9 9
str_count(gsub("(\\d+,\\s+\\d+).*","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3
str_count(gsub("^\\d+,\\s+(\\d+,\\s+\\d+).*","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3
str_count(gsub("\\d+,\\s+\\d+,\\s+(\\d+,\\s+\\d+)","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3

write(Lines2,"capture2.txt")

A.K.
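
Editorial note: the space counts above can also be checked without stringr. The following is only a sketch in base R, using two sample rows from the post; the variable names are made up for illustration. It splits each line on commas and counts the leading spaces of every field.

```r
# Two sample rows from the original post.
x <- c("2010,  001,    0,    15.9",
       "2010,  001,    10,    16.1")

# Split each line into its comma-separated fields.
fields <- strsplit(x, ",", fixed = TRUE)

# Leading spaces per field = field length minus length after left-trimming.
# sapply() returns one column per line, so t() puts one line per row.
lead <- t(sapply(fields, function(f) nchar(f) - nchar(sub("^ +", "", f))))
lead
```

Each row of the result corresponds to one input line and each column to one field, so uneven padding in any field shows up as a column whose values differ.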






jim holtman

2013-Sep-07 17:50 UTC


[R] Alignment of data sets

If spacing is critical, use 'sprintf' for creating the output.

> Lines1 <- read.csv(textConnection("Year, Day, Hour, Value
+ 2010,  001,    0,    15.9
+ 2010,  001,    1,    7.3
+ 2010,  001,    2,    5.2
+ 2010,  001,    3,    8.0
+ 2010,  001,    4,    0.0
+ 2010,  001,    5,    12.1
+ 2010,  001,    6,    11.6
+ 2010,  001,    7,    13.9
+ 2010,  001,    8,    11.9
+ 2010,  001,    9,    13.6
+ 2010,  001,    10,    16.1
+ 2010,  001,    11,    18.5"))
> # use sprintf for the spacing
> cat(with(Lines1, sprintf("%4d, %03d,%3d,%5.1f\n"
+ , Year, Day, Hour, Value
+ )), sep = ''
+ )
2010, 001,  0, 15.9
2010, 001,  1,  7.3
2010, 001,  2,  5.2
2010, 001,  3,  8.0
2010, 001,  4,  0.0
2010, 001,  5, 12.1
2010, 001,  6, 11.6
2010, 001,  7, 13.9
2010, 001,  8, 11.9
2010, 001,  9, 13.6
2010, 001, 10, 16.1
2010, 001, 11, 18.5
>

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
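
Editorial note: the sprintf approach above can be pointed at a file instead of the console. This is a sketch only. The field widths in fmt are an assumption, chosen so the first sample row comes out as in the original post, since the exact target widths are ambiguous there; the temporary output file is likewise just for illustration.

```r
# Sample rows from the post, held in memory for illustration.
txt <- "Year, Day, Hour, Value
2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    10,    16.1"

# read.csv turns Day into the integer 1; %03d below restores the zeros.
dat <- read.csv(text = txt)

# Assumed widths: 4-digit year, zero-padded 3-digit day, hour right-aligned
# in 3 characters, value right-aligned in 6 characters with 1 decimal.
fmt <- "%4d,  %03d,  %3d,  %6.1f"
out_lines <- with(dat, sprintf(fmt, Year, Day, Hour, Value))

# writeLines adds the newline after each row; no row names are involved.
out <- tempfile(fileext = ".txt")
writeLines(out_lines, out)
out_lines
```

Because every field has a fixed width, the commas line up in every row, and changing the spacing is a matter of editing the numbers in fmt.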


