I've seen a few threads about this, but none that seem to answer my problem.
I have a list of .txt files in a directory that I am reading into R and row
binding together.  I am using the following code to do so:
# Directory where files are found
my.txt.file.directory <- "C:/Jared/Data/Kenya/Wildebeest/Tracking_Data"
names.of.txt.files <- list.files(my.txt.file.directory, pattern = "all_data",
                                 ignore.case = TRUE, full.names = TRUE)
# Print names that meet criteria in directory
names.of.txt.files
# The names.of.txt.files will be a vector with the names of all the txt files
# Dataset will contain all the data in the directory
wildebeest <- NULL
# Run loop
# seq_along() is safe even if no files matched (1:length(x) would give 1, 0)
for (i in seq_along(names.of.txt.files))
{
  dat <- read.table(names.of.txt.files[i], header = FALSE)
  # Row bind all data together into a data frame called 'wildebeest'
  wildebeest <- rbind.data.frame(wildebeest, dat)
  rm(dat)
}
When I run this script, I get a warning such as:
18: In `[<-.factor`(`*tmp*`, ri, value = c(1714.36, 1711.27,  ... :
  invalid factor level, NAs generated
I think I have identified the problem: when I inspect the structure of the
files I am reading in, certain columns are stored as factors in some files,
while in other files the same columns are numeric.  Is there a way to assign
the data type of these columns as they are being read into the data frame?
Any other suggestions as to why I am getting this error are appreciated.
Jared
<snip>
> I think I have identified the problem such that when I identify the
> structure of some of the files that I am reading in, columns are labeled as
> "Factors". In other files, the same columns are labeled as numeric
> values. Is there a way to assign the data structure to these columns in the
> dataframe as they are being read in? Any other suggestions to why I am
> getting this error is appreciated.

Yes: see the colClasses argument, as described in ?read.table. However, you
should investigate why read.table wants to treat these particular columns as
factors. If they do indeed consist only of numeric values, there shouldn't
be a problem. The fact that they are coming out as factors raises a flag to
me ...
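A minimal sketch of both steps, in case it helps. The file contents, the three-column layout, and the helper name find_non_numeric are assumptions made up for illustration, not details of the actual tracking files. The first part reads everything as character and lists the entries that cannot be converted to numbers; the second part shows colClasses fixing the column types on a clean file.

```r
# Hypothetical example file: the "*" in the second column is the kind of
# entry that makes read.table() treat an otherwise numeric column as text.
dirty <- tempfile(fileext = ".txt")
writeLines(c("A1 1714.36 10",
             "A1 * 12",
             "A2 1711.27 15"), dirty)

# Read every column as character so no type conversion happens yet.
raw <- read.table(dirty, header = FALSE, colClasses = "character")

# Hypothetical helper: return the distinct entries of x that are not
# valid numbers.
find_non_numeric <- function(x) {
  x <- x[!is.na(x)]
  unique(x[is.na(suppressWarnings(as.numeric(x)))])
}

# One element per column; a genuinely numeric column gives character(0).
bad <- lapply(raw, find_non_numeric)
print(bad)

# Once the offending entries are understood, colClasses sets the types
# explicitly. Shown here on a clean hypothetical file; with the "*" entry
# still present, forcing "numeric" would stop with an error instead.
clean <- tempfile(fileext = ".txt")
writeLines(c("A1 1714.36 10",
             "A2 1711.27 15"), clean)
dat <- read.table(clean, header = FALSE,
                  colClasses = c("character", "numeric", "numeric"))
str(dat)
```

Running the helper over each of the real files should show which file and which entries are responsible for the factor columns.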
On 2010-07-16 13:38, Jared Stabach wrote:
<snip>

Jared,

I'd guess that you have invalid data in some of your files, perhaps missing
values coded as '*' or as 'N/A'. If so, set the na.strings= argument.

-Peter Ehlers
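As a rough sketch of the na.strings suggestion: the two small files below, their contents, and the helper name read_one are hypothetical stand-ins, and the missing-value codes '*' and 'N/A' are only the guesses mentioned above, so substitute whatever codes the real files turn out to use.

```r
# Hypothetical files standing in for the real tracking data.
data_dir <- tempfile("tracking")
dir.create(data_dir)
writeLines(c("A1 1714.36", "A1 1711.27"),
           file.path(data_dir, "all_data_1.txt"))
writeLines(c("A2 *", "A2 N/A", "A2 1650.10"),
           file.path(data_dir, "all_data_2.txt"))

files <- list.files(data_dir, pattern = "all_data",
                    ignore.case = TRUE, full.names = TRUE)

# Hypothetical helper: read one file, treating the assumed missing-value
# codes as NA so the second column is converted to numeric in every file.
read_one <- function(f) {
  read.table(f, header = FALSE,
             na.strings = c("NA", "*", "N/A"),
             stringsAsFactors = FALSE)
}

# Read all files, then row bind them once instead of growing the data
# frame inside a loop.
wildebeest <- do.call(rbind, lapply(files, read_one))
str(wildebeest)
```

Because each file is read with the same na.strings, the columns have the same type in every piece and the row binding no longer has to mix factors with numbers.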