Unique combinations of vector elements that fulfill criteria
up vote
1
down vote
favorite
I have a vector of integers, e.g., totalVector <- c(4,2,1)
, and two variables totalResult
and totalNumber
. What I want to do is the following:
I want to to find all UNIQUE combinations of "totalNumber" elements from totalVector that add up to "totalResult". To clarify, if totalResult = 100 and totalNumber = 50, I want all combinations of 50 elements from totalVector that have a sum of 100 (repetitions are obviously allowed, but duplicate results such as 25 fours and 25 re-arranged fours should only be counted once).
I originally did this by expanding the total vector (repeating each element 50 times), getting all combinations of 50 elements with combn()
and then filtering their sums. For large values however, this proved very inefficient, and failed due to the sheer amount of data. Is there a quicker and less data-heavy way to do this?
r
add a comment |
up vote
1
down vote
favorite
I have a vector of integers, e.g., totalVector <- c(4,2,1)
, and two variables totalResult
and totalNumber
. What I want to do is the following:
I want to to find all UNIQUE combinations of "totalNumber" elements from totalVector that add up to "totalResult". To clarify, if totalResult = 100 and totalNumber = 50, I want all combinations of 50 elements from totalVector that have a sum of 100 (repetitions are obviously allowed, but duplicate results such as 25 fours and 25 re-arranged fours should only be counted once).
I originally did this by expanding the total vector (repeating each element 50 times), getting all combinations of 50 elements with combn()
and then filtering their sums. For large values however, this proved very inefficient, and failed due to the sheer amount of data. Is there a quicker and less data-heavy way to do this?
r
2
You should ask a mathematician for an algorithm other than brute force.
– Roland
Nov 19 at 11:41
you could check out thepartitions
library that has a lazy iterator for large combinations, although I don't think it has a function to directly solve your problem
– jenesaisquoi
Nov 19 at 13:40
You are essentially trying to find the positive solutions to a system of two linear Diophantine equations (one equation involvingtotalResult
and one involvingtotalNumber
). Ehrhart polynomials are relevant for simply counting the number of solutions, though there might be another approach.
– John Coleman
Nov 19 at 15:02
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
I have a vector of integers, e.g., totalVector <- c(4,2,1)
, and two variables totalResult
and totalNumber
. What I want to do is the following:
I want to to find all UNIQUE combinations of "totalNumber" elements from totalVector that add up to "totalResult". To clarify, if totalResult = 100 and totalNumber = 50, I want all combinations of 50 elements from totalVector that have a sum of 100 (repetitions are obviously allowed, but duplicate results such as 25 fours and 25 re-arranged fours should only be counted once).
I originally did this by expanding the total vector (repeating each element 50 times), getting all combinations of 50 elements with combn()
and then filtering their sums. For large values however, this proved very inefficient, and failed due to the sheer amount of data. Is there a quicker and less data-heavy way to do this?
r
I have a vector of integers, e.g., totalVector <- c(4,2,1)
, and two variables totalResult
and totalNumber
. What I want to do is the following:
I want to to find all UNIQUE combinations of "totalNumber" elements from totalVector that add up to "totalResult". To clarify, if totalResult = 100 and totalNumber = 50, I want all combinations of 50 elements from totalVector that have a sum of 100 (repetitions are obviously allowed, but duplicate results such as 25 fours and 25 re-arranged fours should only be counted once).
I originally did this by expanding the total vector (repeating each element 50 times), getting all combinations of 50 elements with combn()
and then filtering their sums. For large values however, this proved very inefficient, and failed due to the sheer amount of data. Is there a quicker and less data-heavy way to do this?
r
r
edited Nov 19 at 11:05
jogo
9,59592135
9,59592135
asked Nov 19 at 11:04
Iason
142
142
2
You should ask a mathematician for an algorithm other than brute force.
– Roland
Nov 19 at 11:41
you could check out thepartitions
library that has a lazy iterator for large combinations, although I don't think it has a function to directly solve your problem
– jenesaisquoi
Nov 19 at 13:40
You are essentially trying to find the positive solutions to a system of two linear Diophantine equations (one equation involvingtotalResult
and one involvingtotalNumber
). Ehrhart polynomials are relevant for simply counting the number of solutions, though there might be another approach.
– John Coleman
Nov 19 at 15:02
add a comment |
2
You should ask a mathematician for an algorithm other than brute force.
– Roland
Nov 19 at 11:41
you could check out thepartitions
library that has a lazy iterator for large combinations, although I don't think it has a function to directly solve your problem
– jenesaisquoi
Nov 19 at 13:40
You are essentially trying to find the positive solutions to a system of two linear Diophantine equations (one equation involvingtotalResult
and one involvingtotalNumber
). Ehrhart polynomials are relevant for simply counting the number of solutions, though there might be another approach.
– John Coleman
Nov 19 at 15:02
2
2
You should ask a mathematician for an algorithm other than brute force.
– Roland
Nov 19 at 11:41
You should ask a mathematician for an algorithm other than brute force.
– Roland
Nov 19 at 11:41
you could check out the
partitions
library that has a lazy iterator for large combinations, although I don't think it has a function to directly solve your problem– jenesaisquoi
Nov 19 at 13:40
you could check out the
partitions
library that has a lazy iterator for large combinations, although I don't think it has a function to directly solve your problem– jenesaisquoi
Nov 19 at 13:40
You are essentially trying to find the positive solutions to a system of two linear Diophantine equations (one equation involving
totalResult
and one involving totalNumber
). Ehrhart polynomials are relevant for simply counting the number of solutions, though there might be another approach.– John Coleman
Nov 19 at 15:02
You are essentially trying to find the positive solutions to a system of two linear Diophantine equations (one equation involving
totalResult
and one involving totalNumber
). Ehrhart polynomials are relevant for simply counting the number of solutions, though there might be another approach.– John Coleman
Nov 19 at 15:02
add a comment |
2 Answers
2
active
oldest
votes
up vote
1
down vote
I think the OP is looking for the combinations with repetition of a vector that sum to a particular number. This will do it:
totalVector <- c(4,2,1)
totalNumber <- 50
totalResult <- 100
library(RcppAlgos)
myAns <- comboGeneral(totalVector, totalNumber, repetition = TRUE,
constraintFun = "sum", comparisonFun = "==",
limitConstraints = totalResult)
dim(myAns)
[1] 17 50
all(apply(myAns, 1, sum) == totalResult)
[1] TRUE
Disclaimer: I am the author of RcppAlgos
1
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate thenth
combination, we can test thesum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.
– Joseph Wood
2 days ago
add a comment |
up vote
0
down vote
This would give you what you need for a small sample, but you will encounter issues with combinatorial explosion very quickly as you increase the size of the problem
tv <- sample(1:10, 10, replace = TRUE)
tn <- 5
tr <- 20
combinations <- combn(tv, tn)
equals.tr <- apply(combinations, MARGIN = 2, FUN = function(x) sum(x) == tr)
combinations[, equals.tr]
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
I think the OP is looking for the combinations with repetition of a vector that sum to a particular number. This will do it:
totalVector <- c(4,2,1)
totalNumber <- 50
totalResult <- 100
library(RcppAlgos)
myAns <- comboGeneral(totalVector, totalNumber, repetition = TRUE,
constraintFun = "sum", comparisonFun = "==",
limitConstraints = totalResult)
dim(myAns)
[1] 17 50
all(apply(myAns, 1, sum) == totalResult)
[1] TRUE
Disclaimer: I am the author of RcppAlgos
1
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate thenth
combination, we can test thesum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.
– Joseph Wood
2 days ago
add a comment |
up vote
1
down vote
I think the OP is looking for the combinations with repetition of a vector that sum to a particular number. This will do it:
totalVector <- c(4,2,1)
totalNumber <- 50
totalResult <- 100
library(RcppAlgos)
myAns <- comboGeneral(totalVector, totalNumber, repetition = TRUE,
constraintFun = "sum", comparisonFun = "==",
limitConstraints = totalResult)
dim(myAns)
[1] 17 50
all(apply(myAns, 1, sum) == totalResult)
[1] TRUE
Disclaimer: I am the author of RcppAlgos
1
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate thenth
combination, we can test thesum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.
– Joseph Wood
2 days ago
add a comment |
up vote
1
down vote
up vote
1
down vote
I think the OP is looking for the combinations with repetition of a vector that sum to a particular number. This will do it:
totalVector <- c(4,2,1)
totalNumber <- 50
totalResult <- 100
library(RcppAlgos)
myAns <- comboGeneral(totalVector, totalNumber, repetition = TRUE,
constraintFun = "sum", comparisonFun = "==",
limitConstraints = totalResult)
dim(myAns)
[1] 17 50
all(apply(myAns, 1, sum) == totalResult)
[1] TRUE
Disclaimer: I am the author of RcppAlgos
I think the OP is looking for the combinations with repetition of a vector that sum to a particular number. This will do it:
totalVector <- c(4,2,1)
totalNumber <- 50
totalResult <- 100
library(RcppAlgos)
myAns <- comboGeneral(totalVector, totalNumber, repetition = TRUE,
constraintFun = "sum", comparisonFun = "==",
limitConstraints = totalResult)
dim(myAns)
[1] 17 50
all(apply(myAns, 1, sum) == totalResult)
[1] TRUE
Disclaimer: I am the author of RcppAlgos
edited 2 days ago
answered 2 days ago
Joseph Wood
3,21921634
3,21921634
1
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate thenth
combination, we can test thesum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.
– Joseph Wood
2 days ago
add a comment |
1
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate thenth
combination, we can test thesum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.
– Joseph Wood
2 days ago
1
1
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
Nice answer (+1), but how long before it becomes computationally infeasible? It still seems like a brute-force approach (albeit one aided by a well-written package).
– John Coleman
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, first of all I am a huge fan. I have learned so much from your contributions. Secondly, you are right (as usual)... this will become computationally infeasible quickly. This very problem has aided me to sleep many many many nights (I know the general saying is "kept me up", but I find comfort in these types of problems). So far, my conclusion is that you can't do better than an optimized brute force for the general case. I have tried (and failed) to apply an integer partition approach. My current approach is to apply a sort of binary approach.. continued
– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate the
nth
combination, we can test the sum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.– Joseph Wood
2 days ago
@JohnColeman, as we are able to generate the
nth
combination, we can test the sum
(or whatever constraint function we are applying) and apply loose bounds on where we should be searching. This will (in theory) at least limit the space of results we are testing (albeit in a brute forceish manner). Addendum to the last comment: I should have said that " my current research" instead of "my current approach".. meaning that this is under active development.– Joseph Wood
2 days ago
add a comment |
up vote
0
down vote
This would give you what you need for a small sample, but you will encounter issues with combinatorial explosion very quickly as you increase the size of the problem
tv <- sample(1:10, 10, replace = TRUE)
tn <- 5
tr <- 20
combinations <- combn(tv, tn)
equals.tr <- apply(combinations, MARGIN = 2, FUN = function(x) sum(x) == tr)
combinations[, equals.tr]
add a comment |
up vote
0
down vote
This would give you what you need for a small sample, but you will encounter issues with combinatorial explosion very quickly as you increase the size of the problem
tv <- sample(1:10, 10, replace = TRUE)
tn <- 5
tr <- 20
combinations <- combn(tv, tn)
equals.tr <- apply(combinations, MARGIN = 2, FUN = function(x) sum(x) == tr)
combinations[, equals.tr]
add a comment |
up vote
0
down vote
up vote
0
down vote
This would give you what you need for a small sample, but you will encounter issues with combinatorial explosion very quickly as you increase the size of the problem
tv <- sample(1:10, 10, replace = TRUE)
tn <- 5
tr <- 20
combinations <- combn(tv, tn)
equals.tr <- apply(combinations, MARGIN = 2, FUN = function(x) sum(x) == tr)
combinations[, equals.tr]
This would give you what you need for a small sample, but you will encounter issues with combinatorial explosion very quickly as you increase the size of the problem
tv <- sample(1:10, 10, replace = TRUE)
tn <- 5
tr <- 20
combinations <- combn(tv, tn)
equals.tr <- apply(combinations, MARGIN = 2, FUN = function(x) sum(x) == tr)
combinations[, equals.tr]
answered Nov 19 at 12:04
Cleland
1456
1456
add a comment |
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53373243%2funique-combinations-of-vector-elements-that-fulfill-criteria%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
You should ask a mathematician for an algorithm other than brute force.
– Roland
Nov 19 at 11:41
you could check out the
partitions
library that has a lazy iterator for large combinations, although I don't think it has a function to directly solve your problem– jenesaisquoi
Nov 19 at 13:40
You are essentially trying to find the positive solutions to a system of two linear Diophantine equations (one equation involving
totalResult
and one involvingtotalNumber
). Ehrhart polynomials are relevant for simply counting the number of solutions, though there might be another approach.– John Coleman
Nov 19 at 15:02