Got it: the problem was the Slovenian characters. Once I replaced them with
plain (unaccented) characters, it worked fine.
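
For anyone who hits the same error, here is a minimal sketch of that kind of replacement in base R. It assumes the accented letters involved are č, š and ž (written as Unicode escapes so the script does not depend on the file's own encoding). The helper name replace_slovenian and the sample string are made up for illustration; they are not part of tm, and the string is not the actual content of test.txt.

```r
# Hypothetical helper (not part of tm): map Slovenian accented letters
# to their unaccented ASCII counterparts using base R's chartr().
replace_slovenian <- function(x) {
  # \u010d = c-caron, \u0161 = s-caron, \u017e = z-caron,
  # followed by their upper-case forms
  chartr("\u010d\u0161\u017e\u010c\u0160\u017d", "cszCSZ", x)
}

# Made-up sample text; the real file contents are not shown in this thread.
sample_text <- "PROD Z LAHKO GNETNO MELJNO GLINO, \u0160\u010d\u017e"
cleaned <- replace_slovenian(sample_text)
print(cleaned)
```

This only sidesteps the encoding problem by throwing the accents away. Declaring the file's real encoding when reading it might be the cleaner fix, but I have not tested that here.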
Thanks anyway, m
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Matevž Pavlič
Sent: Saturday, May 21, 2011 1:27 PM
To: r-help at r-project.org
Cc: feinerer at logic.at
Subject: [R] DocumentTermMatrix error
Hi all,
I tried to create a DocumentTermMatrix with the tm package, but I get this
error:
Error in tolower(txt) :
invalid input 'PROD Z LAHKO GNETNO MELJNO GLINO, ... in
'utf8towcs'
I followed the example shown in
http://www.r-project.org/doc/Rnews/Rnews_2008-2.pdf (An Introduction to Text
Mining),
with this R code:
setwd("C:/Users/mpavlic/Desktop/temp")
tekst <- Corpus(DirSource("."))
>Warning message:
>In readLines(y, encoding = x$Encoding) :
>incomplete final line found on './test.txt'
meta(tekst, "Heading", "local") <- c("test")
meta(tekst[[1]])
>Available meta data pairs are:
Author :
DateTimeStamp: 2011-05-21 11:25:21
Description :
Heading : test
ID : test.txt
Language : en
Origin :
test <- TermDocumentMatrix(tekst)
> Error in tolower(txt) :
> invalid input 'PROD Z LAHKO GNETNO MELJNO GLINO, ... in
'utf8towcs'
Attached is a small sample (test.txt) that I worked on.
Any help would be appreciated,
m