Hi all,
I'm trying to put together examples in an R package, and am having trouble
reading data from the package's data directory. The data are in
comma-separated values files, so to read a file like bailey.csv, I include
in the data directory both bailey.csv and a file bailey.R which contains:
bailey <- read.csv("bailey.csv",na.strings=".");
so that typing
data(bailey)
should load the data from bailey.csv.
The problem is that the data is being read in "wrong", as a single column,
rather than a data frame with four columns. I get:
data.p1.p2.model
1 0.57,2,1,1
2 0.54,4,1,1
3 0.54,6,1,1
4 0.52,8,1,1
5 0.54,10,1,1
6 0.53,50,1,1
7 0.93,2,1,2
8 0.61,4,1,2
9 0.53,6,1,2
10 0.49,8,1,2
11 0.43,10,1,2
12 0.11,50,1,2
13 0.89,2,1,3
14 0.72,4,1,3
15 0.62,6,1,3
16 0.58,8,1,3
17 0.49,10,1,3
18 0.12,50,1,3
19 0.51,2,1,4
20 0.43,4,1,4
21 0.41,6,1,4
22 0.37,8,1,4
23 0.37,10,1,4
24 0.31,50,1,4
Which is wrong. But if I use the very same command as above, but from the
command line (without the data() wrapper), it comes in right:
> read.csv("c:/progra~1/R/library/seemc/data/bailey.csv",na.strings=".")
data p1 p2 model
1 0.57 2 1 1
2 0.54 4 1 1
3 0.54 6 1 1
4 0.52 8 1 1
5 0.54 10 1 1
6 0.53 50 1 1
7 0.93 2 1 2
8 0.61 4 1 2
9 0.53 6 1 2
10 0.49 8 1 2
11 0.43 10 1 2
12 0.11 50 1 2
13 0.89 2 1 3
14 0.72 4 1 3
15 0.62 6 1 3
16 0.58 8 1 3
17 0.49 10 1 3
18 0.12 50 1 3
19 0.51 2 1 4
20 0.43 4 1 4
21 0.41 6 1 4
22 0.37 8 1 4
23 0.37 10 1 4
24 0.31 50 1 4
What could be causing this? I just want to get the example working; in
practice one wouldn't need to use the data() command for this package, but
it seems to be the only reliable way to get stuff out of the data
directory for running examples. Any ideas?
Thanks
Chris
+-----------------------------------------------------------------+
Chris Adolph Department of Government
work: 617-496-4099 Littauer Center, North Yard
cell: 617-642-0683 Harvard University
email: cadolph at fas.harvard.edu Cambridge, MA 02138, USA
URLs: www.fas.harvard.edu/~cadolph chris.adolph.name
+-----------------------------------------------------------------+
?data says
4. files ending `.csv' are read using `read.table(..., header
= TRUE, sep = ";")', and also result in a data frame.
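data() may well find bailey.csv before bailey.R and read it that way, with
";" rather than "," as the separator, which would give the single column
you see. So you need to rename bailey.csv.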
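
The single-column result can be reproduced outside the package by reading a comma-separated file with the separator that ?data quotes for .csv files. A minimal sketch, assuming a temporary file holding the first three rows of the posted data; the temp file and the names `wrong` and `right` are made up for illustration:

```r
# Write a small comma-separated file like bailey.csv (first three rows only).
tmp <- tempfile(fileext = ".csv")
writeLines(c("data,p1,p2,model",
             "0.57,2,1,1",
             "0.54,4,1,1",
             "0.54,6,1,1"), tmp)

# The call ?data describes for .csv files: semicolon as the separator.
# No ";" occurs in the file, so each line becomes a single field.
wrong <- read.table(tmp, header = TRUE, sep = ";")
names(wrong)   # "data.p1.p2.model"
ncol(wrong)    # 1

# The call bailey.R intended: comma as the separator.
right <- read.csv(tmp, na.strings = ".")
names(right)   # "data" "p1" "p2" "model"
ncol(right)    # 4

unlink(tmp)
```

The header "data,p1,p2,model" is turned into the syntactically valid name data.p1.p2.model, which matches the heading in the output posted in the question.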
You don't tell us your OS, but it looks like Windows, where sorts often
have c before R in the default locale:

> sort(c(letters, LETTERS))
 [1] "a" "A" "b" "B" "c" "C" "d" "D" "e" "E" "f" "F" "g" "G" "h" "H" "i" "I" "j"
[20] "J" "k" "K" "l" "L" "m" "M" "n" "N" "o" "O" "p" "P" "q" "Q" "r" "R" "s" "S"
[39] "t" "T" "u" "U" "v" "V" "w" "W" "x" "X" "y" "Y" "z" "Z"
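
The same ordering question can be checked on the two file names themselves. A small sketch; `files` is just an illustrative name, the result of the first call depends on the locale in effect (so no output is shown for it), and method = "radix" assumes a reasonably recent R (3.3.0 or later):

```r
files <- c("bailey.R", "bailey.csv")

# Locale-dependent collation: in many non-C locales lower-case "c" sorts
# before upper-case "R", which would put bailey.csv first.
sort(files)

# C-locale (byte order) collation for comparison: upper-case letters come
# before lower-case ones, so bailey.R is first here.
sort(files, method = "radix")
```

If the first call lists bailey.csv ahead of bailey.R on your machine, that is consistent with the .csv file being the one that is picked up.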
On Sun, 2 Mar 2003, Christopher Adolph wrote:
> [...]
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595