2005 Aug 09
How to pre-filter large amounts of data effectively
...UT 600 Mb MEMORY.
#
# I WOULD BE HAPPY ABOUT ANY HINT HOW TO IMPROVE THIS.
# Pre-filter the big data set (more than 115,000 rows and 524 columns)
# for later class predictions.
# The big data set contains the same column names as the training set,
# but in a different order.
input.file <- 'big_data_set.txt'
filtered.file <- 'big_data_set_filtered.txt'
# Read the header and the first data row
prediction.set <- read.csv(input.file, header=TRUE, skip=0, nrow=1)
# Prepare column names by stripping the underscore and the number at the end
colnames(prediction.set) <- sub('_\\d+$', '', colnames(prediction.set))
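The rest of the post is cut off in this archive, but the setup above (reading only the header and first row with nrow=1) suggests a chunked-reading loop. A sketch of how such a loop might continue, under the assumption that `training.cols` holds the training set's column names (that vector is not shown in the original post) and that only matching columns are kept and appended to the filtered file:

```r
# Hypothetical continuation, NOT from the original post: process the
# 115,000-row file in chunks so it never has to fit in 600 Mb of memory.
chunk.size <- 10000
keep <- colnames(prediction.set) %in% training.cols  # training.cols is assumed

con <- file(input.file, open = "r")
invisible(readLines(con, n = 1))          # skip the header line once
first <- TRUE
repeat {
  # read.csv on an open connection errors at EOF, so trap that as NULL
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = chunk.size,
             col.names = colnames(prediction.set)),
    error = function(e) NULL)
  if (is.null(chunk) || nrow(chunk) == 0) break
  # append each filtered chunk; write the header only on the first pass
  write.table(chunk[, keep], filtered.file, sep = ",",
              row.names = FALSE, col.names = first,
              append = !first, quote = FALSE)
  first <- FALSE
}
close(con)
```

Reading from an open connection lets successive `read.csv` calls continue where the previous one stopped, so memory use is bounded by `chunk.size` rather than by the full file.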