2024 Apr 10
2
Exceptional slowness with read.csv
...n a two-column data.frame with columns
Col - the column processed;
Unbalanced - the rows with unbalanced double quotes.
I am assuming the quotes are double quotes. It shouldn't be difficult to
adapt it to other cases: single quotes, or both.
unbalanced_dquotes <- function(x) {
  char_cols <- sapply(x, is.character) |> which()
  lapply(char_cols, \(i) {
    y <- x[[i]]
    Unbalanced <- gregexpr('"', y) |>
      sapply(\(x) attr(x, "match.length") |> length()) |>
      {\(x) (x %% 2L) == 1L}() |>
      which()
    data.frame(Col...
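A caveat with the snippet above: gregexpr() returns -1 (not an empty vector) when a string contains no quote at all, and the "match.length" attribute of that result still has length 1, so quote-free rows would appear to be flagged as unbalanced. A self-contained sketch of the same idea that counts matches explicitly (the helper name and the sample data are mine, not from the thread):

```r
# Sketch of the technique above; count '"' occurrences per element and
# flag rows where the count is odd.  gregexpr() returns -1 when there
# is no match, so only positive match positions are counted.
count_dquotes <- function(y) {
  vapply(gregexpr('"', y), function(m) sum(m > 0L), integer(1))
}

# Invented sample data for illustration:
df <- data.frame(
  a = c("ok", "has \"balanced\" quotes", "one \" stray quote"),
  b = 1:3
)

char_cols <- which(sapply(df, is.character))
# For each character column, the row indices with an odd quote count:
unbalanced <- lapply(char_cols, function(i) {
  which(count_dquotes(df[[i]]) %% 2L == 1L)
})
# only the third row of column a has an unmatched quote
```

Counting `sum(m > 0L)` rather than `length(attr(m, "match.length"))` is what keeps the no-match case at zero.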
2024 Apr 10
1
Exceptional slowness with read.csv
...ol - the column processed;
> Unbalanced - the rows with unbalanced double quotes.
>
> I am assuming the quotes are double quotes. It shouldn't be difficult
> to adapt it to other cases: single quotes, or both.
>
> unbalanced_dquotes <- function(x) {
>   char_cols <- sapply(x, is.character) |> which()
>   lapply(char_cols, \(i) {
>     y <- x[[i]]
>     Unbalanced <- gregexpr('"', y) |>
>       sapply(\(x) attr(x, "match.length") |> length()) |>
>       {\(x) (x %% 2L) == 1L}() |>
>       whic...
2024 Apr 08
4
Exceptional slowness with read.csv
Greetings,
I have a csv file of 76 fields and about 4 million records. I know that
some of the records have errors - unmatched quotes, specifically.
Reading the file with readLines and parsing the lines with read.csv(text
= ...) is really slow. I know that the first 2459465 records are good.
So I try this:
> startTime <- Sys.time()
> first_records <- read.csv(file_name, nrows
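One way to localize the bad record along these lines is to read the file in chunks and let the parse failure itself report the offending region. Everything below — the helper name, the chunk size, and treating warnings as failures (an unbalanced quote typically surfaces as scan()'s "EOF within quoted string" warning rather than an error) — is my sketch, not the poster's code:

```r
# Sketch only: chunked scan to localize a malformed record.
# Reads the file through a connection so each readLines() call
# advances past the lines already checked.
locate_bad_chunk <- function(file_name, chunk = 10000L) {
  con <- file(file_name, "r")
  on.exit(close(con))
  header <- readLines(con, n = 1L)
  start <- 1L                        # index of the next data row
  repeat {
    block <- readLines(con, n = chunk)
    if (length(block) == 0L) return(NA_integer_)  # whole file parsed cleanly
    ok <- tryCatch({
      read.csv(text = c(header, block))
      TRUE
    }, error = function(e) FALSE, warning = function(w) FALSE)
    if (!ok) return(start)           # bad record lies in [start, start + chunk)
    start <- start + length(block)
  }
}

# Tiny demonstration with an invented file:
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,x", "2,\"broken", "3,y"), tmp)
locate_bad_chunk(tmp, chunk = 1L)    # data row 2 holds the stray quote
```

With a large chunk this costs far fewer read.csv() calls than parsing line by line, and once a chunk fails the same function can be re-run on just that chunk with chunk = 1L to pin down the exact row.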