Hi, I'm quite new to the R-project. I was suggested to look into it because I am trying to solve the "Subset sum" problem", which basically is: Given a set of integers and an integer s, does any non-empty subset sum to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) I have been searching the web for quite some time now (which is how I eventually discovered that my problem is called subset sum), but I can't seem to find an easily applicable implementation. I did search the list archive, the R website and used the help.search and apropos function. I'm afraid nothing obvious showed up for me. Has anybody tackled this issue before in R ? If so, I would be very grateful if you could share your solution with me. Thank you very much. Geert
The problem is NP-Complete and the real problem is how you can solve it.
According to the wiki page, you can use the bottom algorithm(Polynomial time 
approximate algorithm) to solve your problems.
If you had trouble with writing R code, you can read ``R-introduction''.
regards
                                                                             
                                                                             
                Guo-Hao Huang
--------------------------------------------------
From: "Geert Janssens" <janssens-geert at telenet.be>
Sent: Monday, December 07, 2009 10:56 PM
To: <r-help at r-project.org>
Subject: [R] Subset sum problem.
> Hi,
>
> I'm quite new to the R-project. I was suggested to look into it because
I
> am
> trying to solve the "Subset sum" problem", which basically
is:
> Given a set of integers and an integer s, does any non-empty subset sum to 
> s?
> (See http://en.wikipedia.org/wiki/Subset_sum_problem)
>
> I have been searching the web for quite some time now (which is how I
> eventually discovered that my problem is called subset sum), but I
can't
> seem
> to find an easily applicable implementation. I did search the list 
> archive,
> the R website and used the help.search and apropos function. I'm afraid
> nothing obvious showed up for me.
>
> Has anybody tackled this issue before in R ? If so, I would be very 
> grateful
> if you could share your solution with me.
>
> Thank you very much.
>
> Geert
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Geert Janssens <janssens-geert <at> telenet.be> writes:> > Hi, > > I'm quite new to the R-project. I was suggested to look into it because I am > trying to solve the "Subset sum" problem", which basically is: > Given a set of integers and an integer s, does any non-empty subset sum to s? > (See http://en.wikipedia.org/wiki/Subset_sum_problem) > > I have been searching the web for quite some time now (which is how I > eventually discovered that my problem is called subset sum), but I can't seem > to find an easily applicable implementation. I did search the list archive, > the R website and used the help.search and apropos function. I'm afraid > nothing obvious showed up for me. > > Has anybody tackled this issue before in R ? If so, I would be very grateful > if you could share your solution with me.Is it really true that you only want to see a "Yes or No" answer to this question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of "Y-or-N"): # Exact algorithm for the SUBSET SUM problem exactSubsetSum <- function(S, t) { S <- S[S <= t] if (sum(S) < t) return(FALSE) S <- sort(S, decreasing=TRUE) n <- length(S) L <- c(0) for (i in 1:n) { L <- unique(sort(c(L, L + S[i]))) L <- L[L <= t] if (max(L) == t) return(TRUE) } return(FALSE) } # Example with a set of cardinality 64 amount <- 4748652 products <- c(30500,30500,30500,30500,42000,42000,42000,42000, 42000,42000,42000,42000,42000,42000,71040,90900, 76950,35100,71190,53730,456000,70740,70740,533600, 83800,59500,27465,28000,28000,28000,28000,28000, 26140,49600,77000,123289,27000,27000,27000,27000, 27000,27000,80000,33000,33000,55000,77382,48048, 51186,40000,35000,21716,63051,15025,15025,15025, 15025,800000,1110000,59700,25908,829350,1198000,1031655) # Timing is not that bad system.time( sol <- exactSubsetSum(products, amount) ) # user system elapsed # 0.516 0.096 0.673 sol # [1] TRUE> Thank you very much. > > Geert >
On Wednesday 9 December 2009, Hans W Borchers wrote:> Geert Janssens <janssens-geert <at> telenet.be> writes: > > Hi, > > > > I'm quite new to the R-project. I was suggested to look into it because I > > am trying to solve the "Subset sum" problem", which basically is: > > Given a set of integers and an integer s, does any non-empty subset sum > > to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) > > > > I have been searching the web for quite some time now (which is how I > > eventually discovered that my problem is called subset sum), but I can't > > seem to find an easily applicable implementation. I did search the list > > archive, the R website and used the help.search and apropos function. I'm > > afraid nothing obvious showed up for me. > > > > Has anybody tackled this issue before in R ? If so, I would be very > > grateful if you could share your solution with me. > > Is it really true that you only want to see a "Yes or No" answer to this > question whether a subset sums up to s --- without learning which numbers > this subset is composed of (the pure SUBSET SUM problem)? > Then the following procedure does that in a reasonable amount of time > (returning 'TRUE' or 'FALSE' instead of "Y-or-N"): >Unfortunatly no. I do need the numbers in the subset. But thank you for presenting this code. Geert> # Exact algorithm for the SUBSET SUM problem > exactSubsetSum <- function(S, t) { > S <- S[S <= t] > if (sum(S) < t) return(FALSE) > S <- sort(S, decreasing=TRUE) > n <- length(S) > L <- c(0) > for (i in 1:n) { > L <- unique(sort(c(L, L + S[i]))) > L <- L[L <= t] > if (max(L) == t) return(TRUE) > } > return(FALSE) > } > > # Example with a set of cardinality 64 > amount <- 4748652 > products <- > c(30500,30500,30500,30500,42000,42000,42000,42000, > 42000,42000,42000,42000,42000,42000,71040,90900, > 76950,35100,71190,53730,456000,70740,70740,533600, > 83800,59500,27465,28000,28000,28000,28000,28000, > 26140,49600,77000,123289,27000,27000,27000,27000, > 27000,27000,80000,33000,33000,55000,77382,48048, > 51186,40000,35000,21716,63051,15025,15025,15025, > 15025,800000,1110000,59700,25908,829350,1198000,1031655) > > # Timing is not that bad > system.time( sol <- exactSubsetSum(products, amount) ) > # user system elapsed > # 0.516 0.096 0.673 > sol > # [1] TRUE >
Hans, you're my personal hero today ! The function seems to work fine for the tests I did already. Thank you very much ! Geert On Thursday 10 December 2009, Hans W Borchers wrote:> Geert Janssens <janssens-geert <at> telenet.be> writes: > > On Wednesday 9 December 2009, Hans W Borchers wrote: > > > Geert Janssens <janssens-geert <at> telenet.be> writes: > > > > [ ... ] > > > > Has anybody tackled this issue before in R ? If so, I would be very > > > > grateful if you could share your solution with me. > > > > > > Is it really true that you only want to see a "Yes or No" answer to > > > this question whether a subset sums up to s --- without learning which > > > numbers this subset is composed of (the pure SUBSET SUM problem)? > > > Then the following procedure does that in a reasonable amount of time > > > (returning 'TRUE' or 'FALSE' instead of "Y-or-N"): > > > > Unfortunately no. I do need the numbers in the subset. But thank you for > > presenting this code. > > > > Geert > > Okay then, here we go. But don't tell later that your requirement was to > generate _all_ subsets that add up to a certain amount. I will generate > only one (with largest elements). > > For simplicity I assume that the set is prepared s.t. it is decreasingly > ordered, has no elements larger than the amount given, and has a total sum > larger than this amount. > > # Assume S decreasing, no elements > t, total sum >= t > solveSubsetSum <- function(S, t) { > L <- c(0) > inds <- NULL > for (i in 1:length(S)) { > # L <- unique(sort(c(L, L + S[i]))) > L <- c(L, L+S[i]) > L <- L[L <= t] > if (max(L) == t) { > inds <- c(i) > t <- t - S[i] > while (t > 0) { > K <- c(0) > for (j in 1:(i-1)) { > K <- c(K, K+S[j]) > if (t %in% K) break > } > inds <- c(inds, j) > t <- t - S[j] > } > break > } > } > return(inds) > } > > # former example > amount <- 4748652 > products <- > c(30500,30500,30500,30500,42000,42000,42000,42000, > 42000,42000,42000,42000,42000,42000,71040,90900, > 76950,35100,71190,53730,456000,70740,70740,533600, > 83800,59500,27465,28000,28000,28000,28000,28000, > 26140,49600,77000,123289,27000,27000,27000,27000, > 27000,27000,80000,33000,33000,55000,77382,48048, > 51186,40000,35000,21716,63051,15025,15025,15025, > 15025,800000,1110000,59700,25908,829350,1198000,1031655) > > # prepare set > prods <- products[products <= amount] # no elements > amount > prods <- sort(prods, decreasing=TRUE) # decreasing order > > # now find one solution > system.time(is <- solveSubsetSum(prods, amount)) > # user system elapsed > # 0.320 0.032 0.359 > > prods[is] > # [1] 70740 70740 71190 76950 77382 80000 83800 > # [8] 90900 456000 533600 829350 1110000 1198000 > > sum(prods[is]) == amount > # [1] TRUE > > Note that running times and memory needs will be much higher when more > summands are needed. To mention that too: I have not tested the code > extensively. > > Regards > Hans Werner > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html and provide commented, minimal, > self-contained, reproducible code.