search for: final_dt

Displaying 8 results from an estimated 8 matches for "final_dt".

2024 Dec 11
1
Cores hang when calling mcapply
...mns with 0 > for (col in out1_missing) out1[, (col) := 0] > for (col in out2_missing) out2[, (col) := 0] > > # Ensure column order alignment if needed > setcolorder(out1, all_cols) > setcolorder(out2, all_cols) > > # Combine by ID_Key (since they share same columns now) > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) > > # Step E: If needed, summarize across ID_Key to sum presence > indicators > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by = > ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")] >...
2024 Dec 12
1
Cores hang when calling mcapply
...ames(out1), names(out2))) out1_missing <- setdiff(all_cols, names(out1)) out2_missing <- setdiff(all_cols, names(out2)) for (col in out1_missing) out1[, (col) := 0] for (col in out2_missing) out2[, (col) := 0] setcolorder(out1, all_cols) setcolorder(out2, all_cols) final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) final_result <- as_tibble(final_dt[, lapply(.SD, sum, na.rm = TRUE), by = ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")]) Worth noting however: * I unfortunately had to keep the multicore parameters for th...
2024 Dec 11
1
Cores hang when calling mcapply
...mns with 0 > for (col in out1_missing) out1[, (col) := 0] > for (col in out2_missing) out2[, (col) := 0] > > # Ensure column order alignment if needed > setcolorder(out1, all_cols) > setcolorder(out2, all_cols) > > # Combine by ID_Key (since they share same columns now) > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) > > # Step E: If needed, summarize across ID_Key to sum presence > indicators > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by = > ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")] >...
2024 Dec 11
1
Cores hang when calling mcapply
...(col in out2_missing) out2[, (col) := 0] > > > > > > # Ensure column order alignment if needed > > > setcolorder(out1, all_cols) > > > setcolorder(out2, all_cols) > > > > > > # Combine by ID_Key (since they share same columns now) > > > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) > > > > > > # Step E: If needed, summarize across ID_Key to sum presence > > > > > indicators > > > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by = > > > > >...
2024 Dec 12
1
Cores hang when calling mcapply
...out1_missing <- setdiff(all_cols, names(out1)) > out2_missing <- setdiff(all_cols, names(out2)) > for (col in out1_missing) out1[, (col) := 0] > for (col in out2_missing) out2[, (col) := 0] > setcolorder(out1, all_cols) > setcolorder(out2, all_cols) > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) > final_result <- as_tibble(final_dt[, lapply(.SD, sum, na.rm = TRUE), by = ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")]) > > > Worth noting however: > > > - I unfortunately had...
2024 Dec 11
1
Cores hang when calling mcapply
...) out1[, (col) := 0] > > for (col in out2_missing) out2[, (col) := 0] > > > > # Ensure column order alignment if needed > > setcolorder(out1, all_cols) > > setcolorder(out2, all_cols) > > > > # Combine by ID_Key (since they share same columns now) > > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) > > > > # Step E: If needed, summarize across ID_Key to sum presence > > > indicators > > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by = > > > ID_Key, .SDcols = setdiff(names...
2024 Dec 11
1
Cores hang when calling mcapply
...mns with 0 > for (col in out1_missing) out1[, (col) := 0] > for (col in out2_missing) out2[, (col) := 0] > > # Ensure column order alignment if needed > setcolorder(out1, all_cols) > setcolorder(out2, all_cols) > > # Combine by ID_Key (since they share same columns now) > final_dt <- rbindlist(list(out1, out2), use.names = TRUE, fill = TRUE) > > # Step E: If needed, summarize across ID_Key to sum presence > indicators > final_result <- final_dt[, lapply(.SD, sum, na.rm = TRUE), by = > ID_Key, .SDcols = setdiff(names(final_dt), "ID_Key")] >...
2024 Dec 11
2
Cores hang when calling mcapply
Hi R users. Apologies for the lack of concrete examples: the dataset is large, and I believe its size is the issue. I have multiple very large datasets for which I need to generate 0/1 absence/presence columns. Some include over 200M rows, with two columns that need presence/absence columns based on the strings contained within them. As an example, one set has ~29k unique values and the