thr3ads.net - R help - [R] reading fixed width format data with 2 types of lines [Aug 2010]

Denis Chabot

2010-Aug-12 17:57 UTC

[R] reading fixed width format data with 2 types of lines

Hi,

I know how to read fixed-width format data with read.fwf, but I now need to
read a large number of old fwf files that contain 2 types of lines. Lines that
begin with "3" in the first column carry one set of variables, and lines that
begin with "4" carry another set, like this:

...
3A00206546L070049016090045    99  1015002      001001008010004002004007003   001
3A00206546L070049006090030    99  1029001002001001006014002                     
3A00206546L070049002290004    99  1015            001001                        
3A00206546L070049001692559049033  1015                                 018036024
3A00206546L070049002290004    99  1001                                       002
4A00176546L068047090010111000606516400150010000001501063   065914               
4A00176546L06804709001011100040761600000000         1092   095614               
4A00196546L098000100010111001706214400005010000000051062   065914               
4A00176546L06804709001011100050591300000000         1062   065914               
4A00196546L098000100010111002604721400020010000000201042   046114               
4A00196546L098000100010111002504221400005012000000051042   046114               
4A00196546L098000100010111002903721400050012200000501032   036214               
...

I have searched for tricks to do this, but I must not have used the right
keywords, because I found nothing.

I suppose I could read the entire file as a single character variable for each
line, then subset the lines that begin with 3 and save them in an ASCII file
to be reopened with a read.fwf call, and do the same with the lines that
begin with 4. But this seems neither elegant nor efficient.
Is there a better method?

Thanks in advance,


Denis Chabot

Tim Gruene

2010-Aug-12 20:01 UTC

I don't know if it's elegant enough for you, but you could split the file into
two files with 'grep "^3" file > file_3' and 'grep "^4" file > file_4'
and then read them in separately.

Tim
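
The same split can also be done without leaving R. As a minimal sketch: read
all lines with readLines, subset on the first character, and hand each subset
to read.fwf through a textConnection, so no intermediate files have to be
written by hand. The widths below cover only the first four fields and are
hypothetical, as are the names raw_lines, widths3, widths4 and read_block;
the real column layouts are not given in this thread.

```r
# A few records copied from the question (trailing blanks trimmed).
# In practice: raw_lines <- readLines("survey.dat")  # hypothetical file name
raw_lines <- c(
  "3A00206546L070049016090045    99  1015002      001001008010004002004007003   001",
  "3A00206546L070049006090030    99  1029001002001001006014002",
  "4A00176546L068047090010111000606516400150010000001501063   065914",
  "4A00176546L06804709001011100040761600000000         1092   095614"
)

# Split on the record-type flag in column 1.
flag  <- substr(raw_lines, 1, 1)
type3 <- raw_lines[flag == "3"]
type4 <- raw_lines[flag == "4"]

# Hypothetical layouts: only the first four fields of each record type.
# Replace with the real field widths for each type.
widths3 <- c(1, 9, 1, 6)
widths4 <- c(1, 9, 1, 6)

# Parse a character vector of fixed-width lines via an in-memory connection.
read_block <- function(lines, widths) {
  con <- textConnection(lines)
  on.exit(close(con))  # read.fwf does not close an already-open connection
  # colClasses = "character" keeps leading zeros such as "070049"
  read.fwf(con, widths = widths, colClasses = "character")
}

df3 <- read_block(type3, widths3)
df4 <- read_block(type4, widths4)
```

Each record type ends up in its own data frame (df3 and df4), with default
column names V1, V2, and so on. Reading every field as character is a
deliberate choice here; convert individual columns with as.integer or
as.numeric once the real layout is known.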

--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A
