Displaying 8 results from an estimated 8 matches for "crewjanitormakeclean".
2024 Dec 11
2
Cores hang when calling mcapply
...e/presence columns
Some include over 200M rows, with two columns that need presence/absence columns based on the strings they contain. As an example, one set has ~29k unique values and the other has ~15k unique values (no overlap across the two).
Using a combination of custom functions:
crewjanitormakeclean <- function(df, columns) {
  # all_of() selects the columns named in the character vector `columns`
  df <- df |> mutate(across(all_of(columns), ~ make_clean_names(.x, allow_dupes = TRUE)))
  return(df)
}
mass_pivot_wider <- function(df,column,prefix) {
df <- df |> distinct() |> mutate(n = 1) |> pivot_wider(names_from = glue("{column}"), value...
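The `mass_pivot_wider` snippet above is cut off, so its full body is not recoverable here. As a rough illustration of the presence/absence idea it describes, the following is a minimal sketch on toy data. The data, the `c1_` prefix, and the choice of `values_fill = 0L` are assumptions for illustration, not the poster's actual code.

```r
library(dplyr)
library(tidyr)

# Toy input (hypothetical): one row per ID/string pair, with a duplicate.
toy <- tibble(
  ID_Key  = c("a", "a", "b", "b", "b"),
  column1 = c("x", "y", "x", "x", "z")
)

# Drop duplicate pairs, mark each remaining pair with 1, then spread
# the strings into columns; missing combinations are filled with 0.
wide <- toy |>
  distinct() |>
  mutate(n = 1L) |>
  pivot_wider(
    names_from   = column1,
    values_from  = n,
    values_fill  = 0L,
    names_prefix = "c1_"
  )

print(wide)
```

With tens of thousands of unique values, a result like this becomes very wide, which may be part of why the thread moves toward other approaches.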
2024 Dec 12
1
Cores hang when calling mcapply
... "ID_Key" = string(),
            "column1" = string(),
            "column2" = string()
            )
      ) |> as_tibble()

  keeptabs <- split(temp, temp$ID_Key)
if(isTRUE(multicore)){
keeptabs <- mclapply(1:length(keeptabs), function(i) crewjanitormakeclean(keeptabs[[i]],c("column1","column2")), mc.cores = numcores)
}else{
keeptabs <- lapply(1:length(keeptabs), function(i) crewjanitormakeclean(keeptabs[[i]],c("column1","column2")))
}
keeptabs <- bind_rows(keeptabs)
out1 <- dcast(...
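The split/apply/bind pattern in this snippet can be sketched in a self-contained form as follows. This is only an illustration under assumptions: `clean_chunk` is a hypothetical stand-in for `crewjanitormakeclean` (so the example needs no extra packages), the data are toy values, and the check for `try-error` elements is an added safeguard, not something shown in the thread. It does not diagnose the hang being discussed.

```r
library(parallel)

# Hypothetical stand-in for crewjanitormakeclean: trims and lower-cases
# the named columns of one chunk.
clean_chunk <- function(df, columns) {
  df[columns] <- lapply(df[columns], function(x) tolower(trimws(x)))
  df
}

temp <- data.frame(
  ID_Key  = c("a", "a", "b"),
  column1 = c(" Foo", "Bar ", "BAZ"),
  column2 = c("One", "Two", "Three"),
  stringsAsFactors = FALSE
)

chunks <- split(temp, temp$ID_Key)

# mclapply relies on forking, which is unavailable on Windows,
# so fall back to a single core there.
numcores <- if (.Platform$OS.type == "windows") 1L else 2L

# Iterate over the list itself rather than over 1:length(chunks).
cleaned <- mclapply(chunks, clean_chunk,
                    columns = c("column1", "column2"),
                    mc.cores = numcores)

# A failed worker returns a "try-error" object instead of a data frame.
failed <- vapply(cleaned, inherits, logical(1), what = "try-error")
stopifnot(!any(failed))

out <- do.call(rbind, cleaned)
print(out)
```

Passing the list directly to `mclapply` avoids indexing by position inside the anonymous function and keeps the chunk names on the result.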
2024 Dec 12
1
Cores hang when calling mcapply
... "column1" = string(),
>             "column2" = string()
>             )
>       ) |> as_tibble()
>
>   keeptabs <- split(temp, temp$ID_Key)
>
>     if(isTRUE(multicore)){
>       keeptabs <- mclapply(1:length(keeptabs), function(i) crewjanitormakeclean(keeptabs[[i]],c("column1","column2")), mc.cores = numcores)
>     }else{
>       keeptabs <- lapply(1:length(keeptabs), function(i) crewjanitormakeclean(keeptabs[[i]],c("column1","column2")))
>     }
>
>     keeptabs <- bind_rows(keepta...
2024 Dec 11
1
Cores hang when calling mcapply
..." = string(),
> > >       "column1" = string(),
> > >       "column2" = string()
> > >     )
> > >   ) |>
> > >     collect()
> > > )
> > >
> > > # Step B: Clean names once
> > > # Assume `crewjanitormakeclean` essentially standardizes column names
> > > dt[, column1 := janitor::make_clean_names(column1, allow_dupes = TRUE)]
> > > dt[, column2 := janitor::make_clean_names(column2, allow_dupes = TRUE)]
> > >
> > >...
2024 Dec 11
1
Cores hang when calling mcapply
...'csv',
> unify_schema = TRUE,
> col_types = schema(
> "ID_Key" = string(),
> "column1" = string(),
> "column2" = string()
> )
> ) |>
> collect()
> )
>
> # Step B: Clean names once
> # Assume `crewjanitormakeclean` essentially standardizes column names
> dt[, column1 := janitor::make_clean_names(column1, allow_dupes =
> TRUE)]
> dt[, column2 := janitor::make_clean_names(column2, allow_dupes =
> TRUE)]
>
> # Step C: Create presence/absence indicators using data.table
> # Use dcast t...
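The quoted Step C is truncated, so the exact `dcast` call is not visible. As a hedged sketch of how presence/absence indicators can be built with `data.table::dcast`, assuming toy data and assuming `length` as the aggregation function (neither is confirmed by the snippet):

```r
library(data.table)

# Toy input (hypothetical): one row per ID/string pair, with a duplicate.
dt <- data.table(
  ID_Key  = c("a", "a", "b", "b"),
  column1 = c("x", "y", "x", "x")
)

# unique() removes repeated pairs so that each count is 0 or 1;
# length() then counts rows per ID/value cell, and empty cells become 0.
wide <- dcast(
  unique(dt),
  ID_Key ~ column1,
  value.var     = "column1",
  fun.aggregate = length
)

print(wide)
```

Each distinct string becomes its own column, so the same width concerns apply here as with any wide presence/absence layout.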
2024 Dec 11
1
Cores hang when calling mcapply
...= 'csv',
> unify_schema = TRUE,
> col_types = schema(
> "ID_Key" = string(),
> "column1" = string(),
> "column2" = string()
> )
> ) |>
> collect()
> )
>
> # Step B: Clean names once
> # Assume `crewjanitormakeclean` essentially standardizes column names
> dt[, column1 := janitor::make_clean_names(column1, allow_dupes =
> TRUE)]
> dt[, column2 := janitor::make_clean_names(column2, allow_dupes =
> TRUE)]
>
> # Step C: Create presence/absence indicators using data.table
> # Use dcast to p...
2024 Dec 11
1
Cores hang when calling mcapply
...l_types = schema(
> >       "ID_Key" = string(),
> >       "column1" = string(),
> >       "column2" = string()
> >     )
> >   ) |>
> >     collect()
> > )
> >
> > # Step B: Clean names once
> > # Assume `crewjanitormakeclean` essentially standardizes column names
> > dt[, column1 := janitor::make_clean_names(column1, allow_dupes = TRUE)]
> > dt[, column2 := janitor::make_clean_names(column2, allow_dupes = TRUE)]
> >
> > # Step C: Create presence/absence indica...
2024 Dec 11
1
Cores hang when calling mcapply
...= 'csv',
> unify_schema = TRUE,
> col_types = schema(
> "ID_Key" = string(),
> "column1" = string(),
> "column2" = string()
> )
> ) |>
> collect()
> )
>
> # Step B: Clean names once
> # Assume `crewjanitormakeclean` essentially standardizes column names
> dt[, column1 := janitor::make_clean_names(column1, allow_dupes =
> TRUE)]
> dt[, column2 := janitor::make_clean_names(column2, allow_dupes =
> TRUE)]
>
> # Step C: Create presence/absence indicators using data.table
> # Use dcast to p...