Hi all;

I have a data set with the format below:

Year, Day, Hour, Value

2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    2,    5.2
2010,  001,    3,    8.0
2010,  001,    4,    0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,    10,    16.1
2010,  001,    11,    18.5

That should be converted to this format:

2010,  001,    0,    15.9
2010,  001,    1,     7.3
2010,  001,    2,     5.2
2010,  001,    3,     8.0
2010,  001,    4,     0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,   10,    16.1
2010,  001,   11,    18.5

The number of spaces is important. I have tried justify, but it produces spaces at the end or at the beginning of the rows depending on the choice of right, left alignment. Also I need 3 significant digits for the second column, when I use read.csv it gives me 1 instead of 001. So I use read.table, and one of the problems with read.table is that it produces row names that I don't want. Also I need commas in my output file.

So far this is the best I could do:

mydata = read.table("C:/ozone3.txt", sep = "")

capture.output( print(mydata, sep = ",", print.gap=3), file="capture2.txt" )

and the output has all the unwanted row names and also there are no commas.

Any suggestions?

Thank you
Nasrin
Hi,

The question is not clear.

Lines1<- readLines(textConnection("Year, Day, Hour, Value
2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    2,    5.2
2010,  001,    3,    8.0
2010,  001,    4,    0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,    10,    16.1
2010,  001,    11,    18.5"))

library(stringr)
# Looking at the spaces between each comma.
str_count(gsub("(\\d+,\\s+\\d+).*","\\1",Lines1[-1])," ")
# [1] 2 2 2 2 2 2 2 2 2 2 2 2
str_count(gsub("^\\d+,\\s+(\\d+,\\s+\\d+).*","\\1",Lines1[-1])," ")
# [1] 4 4 4 4 4 4 4 4 4 4 4 4
str_count(gsub("\\d+,\\s+\\d+,\\s+(\\d+,\\s+\\d+)","\\1",Lines1[-1])," ")
# [1] 4 4 4 4 4 4 4 4 4 4 4 4

# Remove all spaces, then put exactly three spaces after each comma.
Lines2<- gsub(",",",   ",gsub(" ","",Lines1))[-1]
str_count(Lines2," ")
# [1] 9 9 9 9 9 9 9 9 9 9 9 9
str_count(gsub("(\\d+,\\s+\\d+).*","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3
str_count(gsub("^\\d+,\\s+(\\d+,\\s+\\d+).*","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3
str_count(gsub("\\d+,\\s+\\d+,\\s+(\\d+,\\s+\\d+)","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3

write(Lines2,"capture2.txt")

A.K.

----- Original Message -----
From: "Mostafavipak, Nasrin" <Nasrin.Mostafavipak at stantec.com>
To: "r-help at R-project.org" <r-help at r-project.org>
Cc:
Sent: Friday, September 6, 2013 3:42 PM
Subject: [R] Alignment of data sets

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
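If the goal is right-aligned fields rather than a fixed gap after each comma, one possible variation is to work on the text lines directly, which also keeps "001" as written. This is only a sketch: the field widths (4, 5, 5, 8) are an assumption inferred from the sample rows, and `align_line` is a hypothetical helper name, not something from the thread.

```r
# Sketch: right-align comma-separated fields at assumed widths.
# `widths` and `align_line` are illustrative assumptions.
lines_in <- c("2010,  001,    0,    15.9",
              "2010,  001,    1,    7.3",
              "2010,  001,    10,    16.1")

widths <- c(4, 5, 5, 8)  # assumed field widths, counting the leading spaces

align_line <- function(x, widths) {
  # Split on commas and drop the surrounding whitespace of each field.
  fields <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  # formatC() pads character strings on the left, i.e. right-justifies them.
  padded <- mapply(function(f, w) formatC(f, width = w), fields, widths,
                   USE.NAMES = FALSE)
  paste(padded, collapse = ",")
}

lines_out <- vapply(lines_in, align_line, character(1), widths = widths,
                    USE.NAMES = FALSE)
writeLines(lines_out)
```

Because the fields stay character strings throughout, nothing is re-parsed as a number, so leading zeros and the original decimals pass through unchanged.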
If spacing is critical, use 'sprintf' for creating the output.

> Lines1<- read.csv(textConnection("Year, Day, Hour, Value
+ 2010, 001, 0, 15.9
+ 2010, 001, 1, 7.3
+ 2010, 001, 2, 5.2
+ 2010, 001, 3, 8.0
+ 2010, 001, 4, 0.0
+ 2010, 001, 5, 12.1
+ 2010, 001, 6, 11.6
+ 2010, 001, 7, 13.9
+ 2010, 001, 8, 11.9
+ 2010, 001, 9, 13.6
+ 2010, 001, 10, 16.1
+ 2010, 001, 11, 18.5"))
> # use sprintf for the spacing
> cat(with(Lines1, sprintf("%4d, %03d,%3d,%5.1f\n"
+     , Year, Day, Hour, Value
+     )), sep = ''
+     )
2010, 001,  0, 15.9
2010, 001,  1,  7.3
2010, 001,  2,  5.2
2010, 001,  3,  8.0
2010, 001,  4,  0.0
2010, 001,  5, 12.1
2010, 001,  6, 11.6
2010, 001,  7, 13.9
2010, 001,  8, 11.9
2010, 001,  9, 13.6
2010, 001, 10, 16.1
2010, 001, 11, 18.5
>

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
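To get from a file on disk to an aligned output file without row names, the same `sprintf` call can feed `writeLines` instead of `cat`. A minimal sketch, assuming the input is comma-separated with a header line; the temporary files are stand-ins for the real paths, and the three sample rows are only there to make the example self-contained.

```r
# Sketch only: temporary files stand in for the real input/output paths.
in_file  <- tempfile(fileext = ".txt")
out_file <- tempfile(fileext = ".txt")
writeLines(c("Year, Day, Hour, Value",
             "2010,  001,    0,    15.9",
             "2010,  001,    1,    7.3",
             "2010,  001,    10,    16.1"), in_file)

# read.csv() reads Day as the integer 1; %03d below pads it back to 001.
dat <- read.csv(in_file)

# One formatted string per row; writeLines() adds no row names or quotes.
out <- with(dat, sprintf("%4d, %03d,%3d,%5.1f", Year, Day, Hour, Value))
writeLines(out, out_file)

readLines(out_file)
```

Changing the widths in the format string (for example `%5d` instead of `%3d`) is the only adjustment needed if the target file expects wider columns.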