Displaying 8 results from an estimated 8 matches for "final_dt".
2024 Dec 11
1
Cores hang when calling mcapply
...mns with 0
> for (col in out1_missing) out1[, (col) := 0]
> for (col in out2_missing) out2[, (col) := 0]
>
> # Ensure column order alignment if needed
> setcolorder(out1, all_cols)
> setcolorder(out2, all_cols)
>
> # Combine by ID_Key (since they share same columns now)
> final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
>
> # Step E: If needed, summarize across ID_Key to sum presence indicators
> final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by =
> ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")]
>...
2024 Dec 12
1
Cores hang when calling mcapply
...ames(out1), names(out2)))
out1_missing <- setdiff(all_cols, names(out1))
out2_missing <- setdiff(all_cols, names(out2))
for (col in out1_missing) out1[, (col) := 0]
for (col in out2_missing) out2[, (col) := 0]
setcolorder(out1, all_cols)
setcolorder(out2, all_cols)
final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
final_result <- as_tibble(final_dt[, lapply(.SD, sum, na.rm = TRUE), by = ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")])
Worth noting however:
* I unfortunately had to keep the multicore parameters for th...
2024 Dec 11
1
Cores hang when calling mcapply
...mns with 0
> for (col in out1_missing) out1[, (col) := 0]
> for (col in out2_missing) out2[, (col) := 0]
>
> # Ensure column order alignment if needed
> setcolorder(out1, all_cols)
> setcolorder(out2, all_cols)
>
> # Combine by ID_Key (since they share same columns now)
> final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
>
> # Step E: If needed, summarize across ID_Key to sum presence indicators
> final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by =
> ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")]
>...
2024 Dec 11
1
Cores hang when calling mcapply
...(col in out2_missing) out2[, (col) := 0]
> > >
> > > # Ensure column order alignment if needed
> > > setcolorder(out1, all_cols)
> > > setcolorder(out2, all_cols)
> > >
> > > # Combine by ID_Key (since they share same columns now)
> > > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
> > >
> > > # Step E: If needed, summarize across ID_Key to sum presence indicators
> > > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by =
> > >...
2024 Dec 12
1
Cores hang when calling mcapply
...out1_missing <- setdiff(all_cols, names(out1))
>     out2_missing <- setdiff(all_cols, names(out2))
>     for (col in out1_missing) out1[, (col) := 0]
>     for (col in out2_missing) out2[, (col) := 0]
>     setcolorder(out1, all_cols)
>     setcolorder(out2, all_cols)
>     final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
>     final_result <- as_tibble(final_dt[, lapply(.SD, sum, na.rm = TRUE), by = ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")])
>
>
> Worth noting however:
>
>
> - I unfortunately had...
2024 Dec 11
1
Cores hang when calling mcapply
...) out1[, (col) := 0]
> > for (col in out2_missing) out2[, (col) := 0]
> >
> > # Ensure column order alignment if needed
> > setcolorder(out1, all_cols)
> > setcolorder(out2, all_cols)
> >
> > # Combine by ID_Key (since they share same columns now)
> > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
> >
> > # Step E: If needed, summarize across ID_Key to sum presence indicators
> > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by =
> > ID_Key, .SDcols = setdiff(names...
2024 Dec 11
1
Cores hang when calling mcapply
...mns with 0
> for (col in out1_missing) out1[, (col) := 0]
> for (col in out2_missing) out2[, (col) := 0]
>
> # Ensure column order alignment if needed
> setcolorder(out1, all_cols)
> setcolorder(out2, all_cols)
>
> # Combine by ID_Key (since they share same columns now)
> final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE)
>
> # Step E: If needed, summarize across ID_Key to sum presence indicators
> final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by =
> ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")]
>...
2024 Dec 11
2
Cores hang when calling mcapply
Hi R users.
Apologies for the lack of a concrete example: the dataset is large, and I believe its size is the issue.
I have multiple very large datasets for which I need to generate 0/1 absence/presence columns.
Some include over 200M rows, with two columns that need presence/absence columns based on the strings contained within them. As an example, one set has ~29k unique values and the