100700.3013 at compuserve.com
2007-Jul-05 11:00 UTC
[Rd] data messed up by read.table ? (PR#9779)
Full_Name: Joerg Rauh
Version: 2.5.0
OS: Windows 2000
Submission from: (NULL) (84.168.226.163)

Following Michael J. Crawley "Statistical Computing" on page 9 the worms.txt is required. After downloading it from the book's supporting website, which is http://www.bio.ic.ac.uk/research/mjcraw/statcomp/data/ I visually check the data against the book and they look identical. Then I do a read.table as suggested: worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T).

Typing "worms" to see the data, it's no longer the same: Four lines have been added to the beginning of the file. One is the header line and three lines are from further down in the file, i.e. lines 10,11 and 12 in reverse order. Please look at a copy at the end of this mail. If the first four lines weren't there, the data would be o.k. I tried different parameter settings in read.table but couldn't obtain any improvement.

Please let me know, how I can correct this.

Best regards

Joerg

> worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T)
> worms
          Field.Name Area Slope Vegetation Soil.pH Damp Worm.density
 1          Oak.Mead  3.1     2  Grassland     3.9    F            2
 2      Church.Field  3.5     3  Grassland     4.2    F            3
 3           Ashurst  2.1     0     Arable     4.8    F            4
 4        Field.Name Area Slope Vegetation Soil.pH Damp Worm.density
 5      Nash's.Field  3.6    11  Grassland     4.1    F            4
 6    Silwood.Bottom  5.1     2     Arable     5.2    F            7
 7     Nursery.Field  2.8     3  Grassland     4.3    F            2
 8       Rush.Meadow  2.4     5     Meadow     4.9    T            5
 9  Gunness'.Thicket  3.8     0      Scrub     4.2    F            6
10          Oak.Mead  3.1     2  Grassland     3.9    F            2
11      Church.Field  3.5     3  Grassland     4.2    F            3
12           Ashurst  2.1     0     Arable     4.8    F            4
13       The.Orchard  1.9     0    Orchard     5.7    F            9
14     Rookery.Slope  1.5     4  Grassland       5    T            7
15       Garden.Wood  2.9    10      Scrub     5.2    F            8
16      North.Gravel  3.3     1  Grassland     4.1    F            1
17      South.Gravel  3.7     2  Grassland       4    F            2
18 Observatory.Ridge  1.8     6  Grassland     3.8    F            0
19        Pond.Field  4.1     0     Meadow       5    T            6
20      Water.Meadow  3.9     0     Meadow     4.9    T            8
21         Cheapside  2.2     8      Scrub     4.7    T            4
22        Pound.Hill  4.4     2     Arable     4.5    F            5
23        Gravel.Pit  2.9     1  Grassland     3.5    F            1
24         Farm.Wood  0.8    10      Scrub     5.1    T            3
>
On Thu, 5 Jul 2007 100700.3013 at compuserve.com wrote:

> Following Michael J. Crawley "Statistical Computing" on page 9 the worms.txt is
> required. After downloading it from the book's supporting website, which is
> http://www.bio.ic.ac.uk/research/mjcraw/statcomp/data/ I visually check the data
> against the book and they look identical. Then I do a read.table as suggested:
> worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T).

Add the argument quote="" or quote="\"" to the call to read.table() so the apostrophes in the file are not taken to be quotes.

> Typing "worms" to see the data, it's no longer the same: Four lines have been
> added to the beginning of the file. One is the header line and three lines are
> from further down in the file, i.e. lines 10,11 and 12 in reverse order.
> Please look at a copy at the end of this mail. If the first four lines weren't
> there, the data would be o.k. I tried different parameter settings in read.table
> but couldn't obtain any improvement.
>
> Please let me know, how I can correct this.
>
> Best regards
>
> Joerg
>
> > worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T)
> > worms
>           Field.Name Area Slope Vegetation Soil.pH Damp Worm.density
>  1          Oak.Mead  3.1     2  Grassland     3.9    F            2
>  2      Church.Field  3.5     3  Grassland     4.2    F            3
>  3           Ashurst  2.1     0     Arable     4.8    F            4
>  4        Field.Name Area Slope Vegetation Soil.pH Damp Worm.density
>  5      Nash's.Field  3.6    11  Grassland     4.1    F            4
>  6    Silwood.Bottom  5.1     2     Arable     5.2    F            7
>  7     Nursery.Field  2.8     3  Grassland     4.3    F            2
>  8       Rush.Meadow  2.4     5     Meadow     4.9    T            5
>  9  Gunness'.Thicket  3.8     0      Scrub     4.2    F            6
> 10          Oak.Mead  3.1     2  Grassland     3.9    F            2
> 11      Church.Field  3.5     3  Grassland     4.2    F            3
> 12           Ashurst  2.1     0     Arable     4.8    F            4
> 13       The.Orchard  1.9     0    Orchard     5.7    F            9
> 14     Rookery.Slope  1.5     4  Grassland       5    T            7
> 15       Garden.Wood  2.9    10      Scrub     5.2    F            8
> 16      North.Gravel  3.3     1  Grassland     4.1    F            1
> 17      South.Gravel  3.7     2  Grassland       4    F            2
> 18 Observatory.Ridge  1.8     6  Grassland     3.8    F            0
> 19        Pond.Field  4.1     0     Meadow       5    T            6
> 20      Water.Meadow  3.9     0     Meadow     4.9    T            8
> 21         Cheapside  2.2     8      Scrub     4.7    T            4
> 22        Pound.Hill  4.4     2     Arable     4.5    F            5
> 23        Gravel.Pit  2.9     1  Grassland     3.5    F            1
> 24         Farm.Wood  0.8    10      Scrub     5.1    T            3

----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

"All statements in this message represent the opinions of the author and do not necessarily reflect Insightful Corporation policy or position."
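For anyone who wants to try the quote= suggestion without downloading the file, here is a minimal sketch. The lines in sample_lines are a made-up, shortened excerpt (three columns only), not the real worms.txt, and the text= argument assumes a reasonably recent R rather than 2.5.0.

```r
# Hypothetical excerpt standing in for worms.txt: whitespace-separated,
# with apostrophes in two of the field names.
sample_lines <- c(
  "Field.Name Area Slope",
  "Nash's.Field 3.6 11",
  "Silwood.Bottom 5.1 2",
  "Gunness'.Thicket 3.8 0",
  "Oak.Mead 3.1 2"
)

# quote = "" switches quote handling off completely.
no_quotes <- read.table(text = sample_lines, header = TRUE, quote = "")

# quote = "\"" keeps double quotes as quoting characters,
# but treats the apostrophe as ordinary data.
double_only <- read.table(text = sample_lines, header = TRUE, quote = "\"")

print(no_quotes)
```

Both calls should return one row per data line, with the apostrophes kept inside the field names. Which of the two to use depends on whether the file also contains fields wrapped in double quotes.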
On Thursday 05 July 2007 7:00:46 am 100700.3013 at compuserve.com wrote:

> Full_Name: Joerg Rauh
> Version: 2.5.0
> OS: Windows 2000
> Submission from: (NULL) (84.168.226.163)
>
>
> Following Michael J. Crawley "Statistical Computing" on page 9 the
> worms.txt is required. After downloading it from the book's supporting
> website, which is http://www.bio.ic.ac.uk/research/mjcraw/statcomp/data/ I
> visually check the data against the book and they look identical. Then I do
> a read.table as suggested:
> worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T).
>

I see the same effect on 2.5.0 and 2.5.1 running on Linux. However, the following line reads the data correctly:

   read.table('worms.txt', header=TRUE, quote="\"")

Thus the problem is likely caused by the single quotes in the Field.Name column; perhaps a single quote character was added to the list of default quote characters since the book was released.

                     best

                        Vladimir Dergachev
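On the question of defaults, the quote sets can be read straight off the function definitions. A small sketch follows; it only reports what the running R installation uses and says nothing about when, or whether, the default changed. The variable names are mine.

```r
# Default quote sets, taken from the argument lists of the installed functions.
table_default <- formals(read.table)$quote
delim_default <- formals(read.delim)$quote

cat("read.table default quote set:", table_default, "\n")
cat("read.delim default quote set:", delim_default, "\n")

# TRUE if the apostrophe is treated as a quoting character by default.
has_apostrophe <- grepl("'", table_default, fixed = TRUE)
```

In current R versions I would expect read.table() to list both the double and the single quote, and read.delim() only the double quote, which fits the behaviour seen in this report.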
100700.3013 at compuserve.com wrote:

> Full_Name: Joerg Rauh
> Version: 2.5.0
> OS: Windows 2000
> Submission from: (NULL) (84.168.226.163)
>
>
> Following Michael J. Crawley "Statistical Computing" on page 9 the worms.txt is
> required. After downloading it from the book's supporting website, which is
> http://www.bio.ic.ac.uk/research/mjcraw/statcomp/data/ I visually check the data
> against the book and they look identical. Then I do a read.table as suggested:
> worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T).
>
> Typing "worms" to see the data, it's no longer the same: Four lines have been
> added to the beginning of the file. One is the header line and three lines are
> from further down in the file, i.e. lines 10,11 and 12 in reverse order.
> Please look at a copy at the end of this mail. If the first four lines weren't
> there, the data would be o.k. I tried different parameter settings in read.table
> but couldn't obtain any improvement.
>
> Please let me know, how I can correct this.
>
> Best regards
>
> Joerg
>
> > worms<-read.table("C:/Programme/R/R-2.5.0/Data/Worms.txt", header = T)
> > worms
>           Field.Name Area Slope Vegetation Soil.pH Damp Worm.density
>  1          Oak.Mead  3.1     2  Grassland     3.9    F            2
>  2      Church.Field  3.5     3  Grassland     4.2    F            3
>  3           Ashurst  2.1     0     Arable     4.8    F            4
>  4        Field.Name Area Slope Vegetation Soil.pH Damp Worm.density
>  5      Nash's.Field  3.6    11  Grassland     4.1    F            4
>  6    Silwood.Bottom  5.1     2     Arable     5.2    F            7
>  7     Nursery.Field  2.8     3  Grassland     4.3    F            2
>  8       Rush.Meadow  2.4     5     Meadow     4.9    T            5
>  9  Gunness'.Thicket  3.8     0      Scrub     4.2    F            6
> 10          Oak.Mead  3.1     2  Grassland     3.9    F            2
> 11      Church.Field  3.5     3  Grassland     4.2    F            3
> 12           Ashurst  2.1     0     Arable     4.8    F            4
> 13       The.Orchard  1.9     0    Orchard     5.7    F            9
> 14     Rookery.Slope  1.5     4  Grassland       5    T            7
> 15       Garden.Wood  2.9    10      Scrub     5.2    F            8
> 16      North.Gravel  3.3     1  Grassland     4.1    F            1
> 17      South.Gravel  3.7     2  Grassland       4    F            2
> 18 Observatory.Ridge  1.8     6  Grassland     3.8    F            0
> 19        Pond.Field  4.1     0     Meadow       5    T            6
> 20      Water.Meadow  3.9     0     Meadow     4.9    T            8
> 21         Cheapside  2.2     8      Scrub     4.7    T            4
> 22        Pound.Hill  4.4     2     Arable     4.5    F            5
> 23        Gravel.Pit  2.9     1  Grassland     3.5    F            1
> 24         Farm.Wood  0.8    10      Scrub     5.1    T            3
>

Same thing happens on Linux. It appears to be the single quotes that mess things up. Using read.delim(), which is designed to read tab-delimited files like this one, works, as does setting quote="".
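The read.delim() route can be sketched the same way. Again the data are a made-up, shortened, tab-separated excerpt rather than the real worms.txt, and reading from text= instead of a file is only for illustration.

```r
# Hypothetical tab-delimited excerpt with apostrophes in the field names.
sample_lines <- c(
  "Field.Name\tArea\tSlope",
  "Nash's.Field\t3.6\t11",
  "Silwood.Bottom\t5.1\t2",
  "Gunness'.Thicket\t3.8\t0",
  "Oak.Mead\t3.1\t2"
)

# read.delim() defaults to header = TRUE, sep = "\t" and quote = "\"",
# so the apostrophes are read as plain characters.
worms_sample <- read.delim(text = sample_lines)

print(worms_sample)
```

With the real file the call would presumably just be read.delim() on the path used in the report. Note that read.delim() splits on tabs only, so it fits files where every field is separated by exactly one tab.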