2017 Oct 05
dealing with a messy dataset
Dear R-users,
I am facing a fairly common and basic problem when dealing with
datasets, but I have not found a satisfying answer so far.
I have a messy dataset of galaxies that looks like this:
And XVIII 000214.5+450520 0.69 17 9 0.00 -8.7 26.8 6.44
6.78 < 6.65 -44 0.5 MESSIER031 0.6 1.54
PAndAS-03 000356.4+405319 0.10 17 0.00 -3.6 27.8
4.38
2017 Oct 05
dealing with a messy dataset
It looks like a fixed-width format. I took the last position of each
field to work out the widths and read it with the 'readr' package:
> input <- "And XVIII 000214.5+450520 0.69 17 9 0.00
-8.7 26.8 6.44 6.78 < 6.65 -44 0.5 MESSIER031 0.6
1.54
+ PAndAS-03 000356.4+405319 0.10 17 0.00 -3.6 27.8
4.38 2.8 MESSIER031
2017 Oct 05
dealing with a messy dataset
Is this a fixed width format?
If so, read.fwf() in base, or read_fwf() in the readr package will solve the problem. You may need to trim trailing spaces though.
B.
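A minimal sketch of the base-R route mentioned above, using read.fwf() on a small made-up sample. The two data lines, the column names, and the widths are all assumptions for illustration and are not taken from the poster's file; strip.white = TRUE is one way to handle the trailing spaces.

```r
# Hypothetical three-column fixed-width sample (not the poster's data).
sample_lines <- c(
  "And XVIII   0.69  17",
  "PAndAS-03   0.10   9"
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

# Assumed widths: 12 characters for the name, 4 for the value, 4 for the count.
galaxies <- read.fwf(path,
                     widths = c(12, 4, 4),
                     col.names = c("name", "value", "count"),
                     strip.white = TRUE,        # trim the padding around each field
                     stringsAsFactors = FALSE)  # keep names as character strings

print(galaxies)
```

Without strip.white = TRUE the name column would keep its padding spaces, which is the trimming issue raised in the reply.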
> On Oct 5, 2017, at 10:12 AM, jean-philippe <jeanphilippe.fontaine at gssi.infn.it> wrote:
>
> dear R-users,
>
>
> I am facing a quite regular and basic problem when it comes to dealing with datasets,
2017 Oct 05
dealing with a messy dataset
Dear Jim,
Thanks for your reply and your suggestion.
I forgot to provide the header of the data file; here it is:
================================================================================
Byte-by-byte Description of file: lvg_table2.dat
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
2017 Oct 05
dealing with a messy dataset
You should be able to use that header information to build the
correct arguments for the read_fwf function and read in the data.
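As a rough sketch of that idea: a byte-by-byte description gives a start and end byte for each field, and those map directly onto fwf_positions() in readr. The byte ranges, labels, and sample lines below are hypothetical placeholders; the real values would have to be copied from the header of lvg_table2.dat.

```r
library(readr)

# Hypothetical byte ranges and labels in the style of a
# "Byte-by-byte Description"; not the real ones from lvg_table2.dat.
byte_start <- c(1, 13, 18)
byte_end   <- c(11, 16, 21)
labels     <- c("Name", "Value", "Count")

# Made-up sample file laid out to match the ranges above.
path <- tempfile(fileext = ".dat")
writeLines(c("And XVIII   0.69   17",
             "PAndAS-03   0.10    9"), path)

galaxies <- read_fwf(path,
                     col_positions = fwf_positions(byte_start, byte_end, labels))

print(galaxies)
```

Using start and end positions rather than widths means bytes that belong to no field (gaps between columns) are simply skipped.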
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Thu, Oct 5, 2017 at 11:02 AM, jean-philippe
<jeanphilippe.fontaine at gssi.infn.it> wrote:
> dear Jim,
>
> Thanks
2017 Oct 05
dealing with a messy dataset
Dear Jim,
Yes, I fixed the problem. Thanks again to all of you for your contributions!
This worked:
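library(readr)
# Field boundaries; diff(start) turns the 18 boundaries into 17 column widths.
start <- c(1, 20, 35, 41, 44, 48, 53, 59, 64, 70, 76, 78, 83, 88,
           93, 114, 122, 127)
# skip = 70 skips the first 70 lines of the file before reading the data.
data1 <- read_fwf("lvg_table2.txt", skip = 70, fwf_widths(diff(start)))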
Well, now I know how to deal with fixed-width files :)
Cheers
Jean-Philippe
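For readers following along, the arithmetic behind diff(start) in the solution above can be checked on its own in base R. This is only an illustration of how the widths come out of the boundary vector; the name widths is introduced here and does not appear in the original message.

```r
# Boundary vector copied from the message above.
start <- c(1, 20, 35, 41, 44, 48, 53, 59, 64, 70, 76, 78, 83, 88,
           93, 114, 122, 127)

# Each width is the distance from one boundary to the next.
widths <- diff(start)

print(widths)
length(widths)  # 17 columns from 18 boundaries
sum(widths)     # 126, so the columns cover characters 1 through 126
```

One thing worth checking with this approach is that the last boundary (127 here) only closes the final column: anything at or beyond that character position is not read.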
On 05/10/2017 18:42, jim