Hi,

I am trying to read the data from
https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
without any success. Below is the error I am getting:

> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
Error in make.names(col.names, unique = TRUE) :
  invalid multibyte string at '<ff><fe>t'
In addition: Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  line 2 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  line 3 appears to contain embedded nulls
4: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  line 4 appears to contain embedded nulls
5: In read.table(file = file, header = header, sep = sep, quote = quote,  :
  line 5 appears to contain embedded nulls

Is there any way to read this data directly into R?

Thanks for your time
Well, this is frankly an unsatisfactory answer, as it does not try to deal properly with the issues that you experienced, which I ran into as well. However, it's simple and it works.

As this is a small text file, simply copy it from your browser to the clipboard, and then use:

    thefile <- read.table(text = "<paste text here>", header = TRUE)

either in an editor or directly at the prompt in the console.

In fact, here's the whole thing, which you can just copy and paste from this email:

    thefile <- read.table(text = "time vendor metal
    1 322 44.2
    2 317 44.3
    3 319 44.4
    4 323 43.4
    5 327 42.8
    6 328 44.3
    7 325 44.4
    8 326 44.8
    9 330 44.4
    10 334 43.1
    11 337 42.6
    12 341 42.4
    13 322 42.2
    14 318 41.8
    15 320 40.1
    16 326 42
    17 332 42.4
    18 334 43.1
    19 335 42.4
    20 336 43.1
    21 335 43.2
    22 338 42.8
    23 342 43
    24 348 42.8
    25 330 42.5
    26 326 42.6
    27 329 42.3
    28 337 42.9
    29 345 43.6
    30 350 44.7
    31 351 44.5
    32 354 45
    33 355 44.8
    34 357 44.9
    35 362 45.2
    36 368 45.2
    37 348 45
    38 345 45.5
    39 349 46.2
    40 355 46.8
    41 362 47.5
    42 367 48.3
    43 366 48.3
    44 370 49.1
    45 371 48.9
    46 375 49.4
    47 380 50
    48 385 50
    49 361 49.6
    50 354 49.9
    51 357 49.6
    52 367 50.7
    53 376 50.7
    54 381 50.9
    55 381 50.5
    56 383 51.2
    57 384 50.7
    58 387 50.3
    59 392 49.2
    60 396 48.1", header = TRUE)

Cheers,
Bert

On Sat, Sep 7, 2024 at 12:57 PM Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
> Error in make.names(col.names, unique = TRUE) :
>   invalid multibyte string at '<ff><fe>t'
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
That looks like a UTF-16LE byte order mark. Simply open the connection with the proper encoding:

    read.delim(
      'https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
      fileEncoding = "UTF-16LE"
    )

On Sat, Sep 7, 2024 at 3:57 PM Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
> Error in make.names(col.names, unique = TRUE) :
>   invalid multibyte string at '<ff><fe>t'
On Sun, 08 Sep 2024, Christofer Bogaso writes:
> Error in make.names(col.names, unique = TRUE) :
>   invalid multibyte string at '<ff><fe>t'

The <ff><fe> looks like a byte-order mark (https://en.wikipedia.org/wiki/Byte_order_mark). Try this:

    fn <- file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
               encoding = "UTF-16LE")
    read.delim(fn)

--
Enrico Schumann
Lucerne, Switzerland
https://enricoschumann.net
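For anyone who wants to try the two encoding-based suggestions without network access, here is a minimal sketch. It assumes the remote file is tab-separated UTF-16LE with a leading byte order mark; the temporary file below is only a local stand-in built from the first three data rows posted in this thread, not the real employee.txt.

```r
# Build a tiny UTF-16LE file with a byte order mark (BOM) as a local stand-in
# for employee.txt. Assumption: the real file is tab-separated.
txt <- "time\tvendor\tmetal\n1\t322\t44.2\n2\t317\t44.3\n3\t319\t44.4\n"
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xff, 0xfe)), con)  # the <ff><fe> bytes from the error message
writeBin(iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]], con)
close(con)

# Option 1: let read.delim() open the file with the given encoding.
d1 <- read.delim(tmp, fileEncoding = "UTF-16LE")

# Option 2: create the connection yourself and hand it to read.delim(),
# which opens it, reads from it, and closes it again.
fn <- file(tmp, encoding = "UTF-16LE")
d2 <- read.delim(fn)

print(d1)
```

Both calls should give the same data frame; the only difference is whether read.delim() or the caller creates the re-encoding connection.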
Add the fileEncoding = "UTF-16" argument to the read call.

For a human explanation of why this is going on I recommend [1]. For a more R-related take, try [2].

For reference, I downloaded your file and used the "file" command line program, typically available on Linux (and possibly macOS), which will tell you what encoding is used in a particular file.

[1] https://www.youtube.com/watch?v=4mRxIgu9R70
[2] https://kevinushey.github.io/blog/2018/02/21/string-encoding-and-r/

On September 7, 2024 12:56:36 PM PDT, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>Error in make.names(col.names, unique = TRUE) :
>  invalid multibyte string at '<ff><fe>t'

--
Sent from my phone. Please excuse my brevity.
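If the "file" program is not at hand, a rough check can be done from within R by looking at the first few bytes of a local copy. The sketch below is only an illustration: guess_bom_encoding() is a hypothetical helper, not a base R function, it only recognises three common byte order marks, and a file without a BOM simply yields NA. A remote file would first have to be saved locally in binary mode, for example with download.file(url, destfile, mode = "wb").

```r
# Hypothetical helper (not part of base R): guess an encoding from a leading BOM.
guess_bom_encoding <- function(path) {
  b <- readBin(path, what = "raw", n = 3)
  starts_with <- function(sig) {
    length(b) >= length(sig) && identical(b[seq_along(sig)], as.raw(sig))
  }
  if (starts_with(c(0xef, 0xbb, 0xbf))) return("UTF-8-BOM")
  # Note: a UTF-32LE BOM also begins with ff fe; that case is not handled here.
  if (starts_with(c(0xff, 0xfe))) return("UTF-16LE")
  if (starts_with(c(0xfe, 0xff))) return("UTF-16BE")
  NA_character_
}

# Two small test files: one starting like the file in this thread, one plain ASCII.
f1 <- tempfile()
writeBin(as.raw(c(0xff, 0xfe, 0x74, 0x00)), f1)  # BOM followed by "t" in UTF-16LE
f2 <- tempfile()
writeBin(charToRaw("time\tvendor\n"), f2)        # no BOM

guess_bom_encoding(f1)
guess_bom_encoding(f2)
```

The first call reports "UTF-16LE", matching the '<ff><fe>t' in the error message; the second returns NA because there is no BOM to go on.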