optimal algorithm grouping data in javascript











up vote
10
down vote

favorite
3












The following (simplified) json type of data defines a Contact:



{
id: number;
name: string;
phone: string;
email: string
}


There is the following group of data:



+---+----------+-------------+---------------------------+ 
|id | name | phone |email |
+---+----------+-------------+---------------------------+
|1 | John | 11111111 |aaaa@test.com |
|2 | Marc | 22222222 |bbbb@test.com |
|3 | Ron | 99999999 |aaaa@test.com |
|4 | Andrew | 55555555 |dddd@test.com |
|5 | Wim | 99999999 |gggg@test.com |
|6 | Marc | 33333333 |cccc@test.com |
|7 | Dan | 44444444 |cccc@test.com |
+---+----------+-------------+---------------------------+


Goal is to find the groups that belong together using javascript(optionally in lodash, but main idea is to get the algorithm clear) according to the following constraint: a contact belongs to a group when any of the following criteria are the same: name, phone or email. The results shows the id's grouped as arrays in arrays. a contact in a group of 1 is ignored.



In the example above, it means that contacts with ids 1,3,5 belong together since 1,3 share the same email and 3 and 5 share the same phonenumber. Likewise 2,6,7: 2 and 6 have the same name and 6 and 7 have the same email. 5 does not have anything in common.
The expected result therefore is:
[[1,3,5], [2,6,7]]



Background:
One solution that works is iterating over every item and check for the remainder of the list if name, email, or phone is the same. If so, group them and take them out of the list (in the example we compare 1 with all the items in the list and only 3 is found). problem is that next items also need to be checked again to these groups, because in this case 5 is not yet detected as part of the group. This makes the algorithm complex, while I suspect there is a simple way of solving this in linear time. There might also be a name for this class of problems?
`










share|improve this question
























  • That looks like a cool problem... Just to make sure that I understood it correctly: what would it be the expected result if #2 had the following phone #: 99999999? Would it be [[1, 2, 3, 5, 6, 7]]?
    – Josep
    Nov 20 at 9:27








  • 1




    correct, we would have one group.
    – Han
    Nov 20 at 9:39










  • I posted my answer below. BTW: contact id 4 falls into its own category
    – Ahmad
    Nov 20 at 9:52










  • Nice question. Upvoted and added my take on it. Hope it helps.
    – Akrion
    Nov 21 at 4:53

















up vote
10
down vote

favorite
3












The following (simplified) json type of data defines a Contact:



{
id: number;
name: string;
phone: string;
email: string
}


There is the following group of data:



+---+----------+-------------+---------------------------+ 
|id | name | phone |email |
+---+----------+-------------+---------------------------+
|1 | John | 11111111 |aaaa@test.com |
|2 | Marc | 22222222 |bbbb@test.com |
|3 | Ron | 99999999 |aaaa@test.com |
|4 | Andrew | 55555555 |dddd@test.com |
|5 | Wim | 99999999 |gggg@test.com |
|6 | Marc | 33333333 |cccc@test.com |
|7 | Dan | 44444444 |cccc@test.com |
+---+----------+-------------+---------------------------+


Goal is to find the groups that belong together using javascript(optionally in lodash, but main idea is to get the algorithm clear) according to the following constraint: a contact belongs to a group when any of the following criteria are the same: name, phone or email. The results shows the id's grouped as arrays in arrays. a contact in a group of 1 is ignored.



In the example above, it means that contacts with ids 1,3,5 belong together since 1,3 share the same email and 3 and 5 share the same phonenumber. Likewise 2,6,7: 2 and 6 have the same name and 6 and 7 have the same email. 5 does not have anything in common.
The expected result therefore is:
[[1,3,5], [2,6,7]]



Background:
One solution that works is iterating over every item and check for the remainder of the list if name, email, or phone is the same. If so, group them and take them out of the list (in the example we compare 1 with all the items in the list and only 3 is found). problem is that next items also need to be checked again to these groups, because in this case 5 is not yet detected as part of the group. This makes the algorithm complex, while I suspect there is a simple way of solving this in linear time. There might also be a name for this class of problems?
`










share|improve this question
























  • That looks like a cool problem... Just to make sure that I understood it correctly: what would it be the expected result if #2 had the following phone #: 99999999? Would it be [[1, 2, 3, 5, 6, 7]]?
    – Josep
    Nov 20 at 9:27








  • 1




    correct, we would have one group.
    – Han
    Nov 20 at 9:39










  • I posted my answer below. BTW: contact id 4 falls into its own category
    – Ahmad
    Nov 20 at 9:52










  • Nice question. Upvoted and added my take on it. Hope it helps.
    – Akrion
    Nov 21 at 4:53















up vote
10
down vote

favorite
3









up vote
10
down vote

favorite
3






3





The following (simplified) json type of data defines a Contact:



{
id: number;
name: string;
phone: string;
email: string
}


There is the following group of data:



+---+----------+-------------+---------------------------+ 
|id | name | phone |email |
+---+----------+-------------+---------------------------+
|1 | John | 11111111 |aaaa@test.com |
|2 | Marc | 22222222 |bbbb@test.com |
|3 | Ron | 99999999 |aaaa@test.com |
|4 | Andrew | 55555555 |dddd@test.com |
|5 | Wim | 99999999 |gggg@test.com |
|6 | Marc | 33333333 |cccc@test.com |
|7 | Dan | 44444444 |cccc@test.com |
+---+----------+-------------+---------------------------+


Goal is to find the groups that belong together using javascript(optionally in lodash, but main idea is to get the algorithm clear) according to the following constraint: a contact belongs to a group when any of the following criteria are the same: name, phone or email. The results shows the id's grouped as arrays in arrays. a contact in a group of 1 is ignored.



In the example above, it means that contacts with ids 1,3,5 belong together since 1,3 share the same email and 3 and 5 share the same phonenumber. Likewise 2,6,7: 2 and 6 have the same name and 6 and 7 have the same email. 5 does not have anything in common.
The expected result therefore is:
[[1,3,5], [2,6,7]]



Background:
One solution that works is iterating over every item and check for the remainder of the list if name, email, or phone is the same. If so, group them and take them out of the list (in the example we compare 1 with all the items in the list and only 3 is found). problem is that next items also need to be checked again to these groups, because in this case 5 is not yet detected as part of the group. This makes the algorithm complex, while I suspect there is a simple way of solving this in linear time. There might also be a name for this class of problems?
`










share|improve this question















The following (simplified) json type of data defines a Contact:



{
id: number;
name: string;
phone: string;
email: string
}


There is the following group of data:



+---+----------+-------------+---------------------------+ 
|id | name | phone |email |
+---+----------+-------------+---------------------------+
|1 | John | 11111111 |aaaa@test.com |
|2 | Marc | 22222222 |bbbb@test.com |
|3 | Ron | 99999999 |aaaa@test.com |
|4 | Andrew | 55555555 |dddd@test.com |
|5 | Wim | 99999999 |gggg@test.com |
|6 | Marc | 33333333 |cccc@test.com |
|7 | Dan | 44444444 |cccc@test.com |
+---+----------+-------------+---------------------------+


Goal is to find the groups that belong together using javascript(optionally in lodash, but main idea is to get the algorithm clear) according to the following constraint: a contact belongs to a group when any of the following criteria are the same: name, phone or email. The results shows the id's grouped as arrays in arrays. a contact in a group of 1 is ignored.



In the example above, it means that contacts with ids 1,3,5 belong together since 1,3 share the same email and 3 and 5 share the same phonenumber. Likewise 2,6,7: 2 and 6 have the same name and 6 and 7 have the same email. 5 does not have anything in common.
The expected result therefore is:
[[1,3,5], [2,6,7]]



Background:
One solution that works is iterating over every item and check for the remainder of the list if name, email, or phone is the same. If so, group them and take them out of the list (in the example we compare 1 with all the items in the list and only 3 is found). problem is that next items also need to be checked again to these groups, because in this case 5 is not yet detected as part of the group. This makes the algorithm complex, while I suspect there is a simple way of solving this in linear time. There might also be a name for this class of problems?
`







javascript algorithm grouping lodash






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 20 at 12:01

























asked Nov 20 at 9:18









Han

535




535












  • That looks like a cool problem... Just to make sure that I understood it correctly: what would it be the expected result if #2 had the following phone #: 99999999? Would it be [[1, 2, 3, 5, 6, 7]]?
    – Josep
    Nov 20 at 9:27








  • 1




    correct, we would have one group.
    – Han
    Nov 20 at 9:39










  • I posted my answer below. BTW: contact id 4 falls into its own category
    – Ahmad
    Nov 20 at 9:52










  • Nice question. Upvoted and added my take on it. Hope it helps.
    – Akrion
    Nov 21 at 4:53




















  • That looks like a cool problem... Just to make sure that I understood it correctly: what would it be the expected result if #2 had the following phone #: 99999999? Would it be [[1, 2, 3, 5, 6, 7]]?
    – Josep
    Nov 20 at 9:27








  • 1




    correct, we would have one group.
    – Han
    Nov 20 at 9:39










  • I posted my answer below. BTW: contact id 4 falls into its own category
    – Ahmad
    Nov 20 at 9:52










  • Nice question. Upvoted and added my take on it. Hope it helps.
    – Akrion
    Nov 21 at 4:53


















That looks like a cool problem... Just to make sure that I understood it correctly: what would it be the expected result if #2 had the following phone #: 99999999? Would it be [[1, 2, 3, 5, 6, 7]]?
– Josep
Nov 20 at 9:27






That looks like a cool problem... Just to make sure that I understood it correctly: what would it be the expected result if #2 had the following phone #: 99999999? Would it be [[1, 2, 3, 5, 6, 7]]?
– Josep
Nov 20 at 9:27






1




1




correct, we would have one group.
– Han
Nov 20 at 9:39




correct, we would have one group.
– Han
Nov 20 at 9:39












I posted my answer below. BTW: contact id 4 falls into its own category
– Ahmad
Nov 20 at 9:52




I posted my answer below. BTW: contact id 4 falls into its own category
– Ahmad
Nov 20 at 9:52












Nice question. Upvoted and added my take on it. Hope it helps.
– Akrion
Nov 21 at 4:53






Nice question. Upvoted and added my take on it. Hope it helps.
– Akrion
Nov 21 at 4:53














4 Answers
4






active

oldest

votes

















up vote
3
down vote



accepted










Idea:




  • Start with 0 groups

  • Iterate your list of contacts

  • Check if there is a group with the contact name, or phone, or email. Merge all the members of those groups as the same group. Then add yourself to that group. If not, begin a new group with yourself and set the name, phone and email group as yourself.


Union find is an efficient structure to handle merging of disjoint sets. Code taken from here. As it uses path compression and union by rank, you can consider that the whole code is linear in the amount of contacts.






var data = [
{id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
{id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
{id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
{id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
{id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
{id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
{id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
];

// UNION-FIND structure, with path comression and union by rank

var UNIONFIND = (function () {

function _find(n)
{
if(n.parent == n) return n;
n.parent = _find(n.parent);
return n.parent;
}

return {
makeset:function(id){
var newnode = {
parent: null,
id: id,
rank: 0
};
newnode.parent = newnode;
return newnode;
},

find: _find,

combine: function(n1, n2) {
var n1 = _find(n1);
var n2 = _find(n2);

if (n1 == n2) return;

if(n1.rank < n2.rank)
{
n2.parent = n2;
return n2;
}
else if(n2.rank < n1.rank)
{
n2.parent = n1;
return n1;
}
else
{
n2.parent = n1;
n1.rank += 1;
return n1;
}
}
};
})();

var groupHash = {name: {}, phone: {}, email: {}}
var groupNodes =

data.forEach(function(contact){
var group = UNIONFIND.makeset(contact.id);
var groups = new Set();
["name", "phone", "email"].forEach(function(attr){
if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
});

groups = Array.from(groups);
groups.push(group);
groupNodes.push(group);

for(var i = 1; i < groups.length; i++) {
UNIONFIND.combine(groups[0], groups[i]);
}

["name", "phone", "email"].forEach(function(attr){
groupHash[attr][contact[attr]] = groups[0];
});

})

var contactsInGroup = {}


groupNodes.forEach(function(group){
var groupId = UNIONFIND.find(group).id;

if (contactsInGroup.hasOwnProperty(groupId) == false) {
contactsInGroup[groupId] = ;
}

contactsInGroup[groupId].push(group.id);
})

var result = Object.values(contactsInGroup).filter(function(list){
return list.length > 1
})

console.log(result)








share|improve this answer























  • Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
    – tucuxi
    Nov 21 at 11:35










  • Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
    – tucuxi
    Nov 21 at 11:39






  • 1




    @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
    – juvian
    Nov 21 at 14:19


















up vote
2
down vote













Any answer that iterates over each of n entries, and then over a growing list of m groups to match against is going to have worst-time performance of O(n*m) (found when there are no two entries that match on any term).



Any answer that iterates over each entry, and then over groups, and uses arrays to test for matching values among q options will further have to pay a penalty of O(q) per match. In the worst case, with say all e-mails the same and all phones different, this will mean O(n*m).



I believe this answer is O(n), because assuming that the number of fields to match with is a constant (in this case, 3: name, phone and email), all operations in the main loop, which is run once per entry, are O(1).



There is an extra complication to fix the fact that, late in the process, we may find a bridge between two (or even 3) groups, as entries can match on different fields with entries from different groups. This may happen several times. To avoid having to rebuild groups during the main loop, we leave merging to the very end, where we first build a map of what-group-ends-up-where, and then finally move all entry IDs to their final group. This can all be done in O(m), with m the number of groups; with an extra O(n) when actually copying entry IDs to the merged groups: overall, we are still in O(n) territory.



The last line builds arrays of ids from the merged groups, and filters out any that does not have over 1 element.



const data = [
{id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
{id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
{id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
{id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
{id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
{id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
{id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
];

const groups = function(inputs) {

let valuesToGroups = new Map(
['name', 'phone', 'email'].map(key => [key, new Map()]));
let groups = new Map();
let pendingMerges = ;
for (const entry of inputs) {
let group = undefined;
let found = ;
for (const [key, valueMap] of valuesToGroups) {
// look up value in values-index for current key
group = valueMap.get(entry[key]);
if (group !== undefined) {
found.push(group);
// not breaking allows groups to be merged
}
}
if (found.length === 0) {
// not found: create new group
group = groups.size;
groups.set(group, [entry.id]);
} else {
// found: add entry to chosen group
group = found[0];
groups.get(group).push(entry.id);
if (found.length > 1) {
pendingMerges.push(found);
}
}
// add entry's values to index, pointing to group
for (const [key, valueMap] of valuesToGroups) {
valueMap.set(entry[key], group);
}
}
// do pending merges; initially, all groups are stand-alone
let merges = new Map(Array.from(groups.keys()).map(k => [k, k]));
for (const merge of pendingMerges) {
// contents will go to the lowest-numbered group
const sorted = merge.map(groupId => merges.get(groupId)).sort();
sorted.forEach(groupId => merges.set(groupId, sorted[0]));
}
const cleanGroups = new Map();
groups.forEach((value, key) => {
const k = merges.get(key);
if ( ! cleanGroups.has(k)) {
cleanGroups.set(k, );
}
value.forEach(id => cleanGroups.get(k).push(id))
})
// return only non-empty groups
return [... cleanGroups].filter(g => g[1].length>1).map(g => [... g[1]]);
}(data);

console.log(""+JSON.stringify(groups))
// output is [[1,2,3,5,6,7]]





share|improve this answer




























    up vote
    0
    down vote













    Here is another suggestion of a route you could take. The idea is to use one Array.reduce to group by id and keep all the values (vls) and combined results (ids) in that accumulator object.



    This way you can easily compare the name/phone/email using Array.some + Array.includes (which is what the getGroupId function does).



    Once you have grouped and have the almost final result just prettify it by removing the groups with length of one and picking only the ids array of the rest:






    var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

    const getGroupId = (obj, vals) => Object.entries(obj)
    .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

    const group = d => d.reduce((r, c) => {
    let values = Object.values(c), groupID = getGroupId(r, values)[0]

    if(!groupID)
    r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
    else {
    r[groupID] = ({
    vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
    })
    }
    return r
    }, {})

    const prettify = grp => Object.values(grp).reduce((r,c) => {
    if(c.ids.length > 1)
    r.push(c.ids)
    return r
    }, )

    console.log(prettify(group(data)))





    One thing to note is that we do not care about the number of properties since we do Object.values. So you can easily add another address or fax to that list and it would still work with zero code changes.



    As per feedback here is another version which works slightly different:






    var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
    var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

    const getGroupId = (obj, vals) => Object.entries(obj)
    .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

    const group = d => d.reduce((r,c,i,a) => {
    let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

    if (!groupID) {
    let hits = a.filter(x =>
    x.id != c.id && values.some(v => Object.values(x).includes(v)))
    hits.forEach(h =>
    r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
    }
    else
    r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
    ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
    return r
    }, {})

    const prettify = grp => Object.values(grp).reduce((r, c) => {
    if (c.ids.length > 1)
    r.push(c.ids)
    return r
    }, )

    console.log(prettify(group(data))) // OP data
    console.log(prettify(group(testData))) // Test data





    The reason for this version is due to the testData provided by @Mark which has the 2nd element not matching the first but matching the 3rd which actually matches the 1st ... so they all should be hits.



    To get to that once we find a match we look for matches of that same initial match and push in the same group so we can have the maximum amount of data to match on.



    The result is that once we get the first group with the first element we then find and push the 3rd as well and from there it is much easier to match the 2nd. The logic is slightly more complex and I would imagine less performant.






    share|improve this answer























    • I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
      – Mark Meyer
      Nov 21 at 5:47












    • I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
      – Akrion
      Nov 21 at 7:50










    • The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
      – tucuxi
      Nov 21 at 11:29










    • In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
      – Akrion
      Nov 21 at 19:31


















    up vote
    -1
    down vote













    One way to accomplish what you need, is to separate the contacts into groups.
    Each group will contain a list of names, phones and emails.



    Then iterate through contacts, and see if the current contact falls into any of the groups. If not, create a new group and set its names/phones/emails so that next contacts might fall into the same on.






    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    var groups = ;

    data.forEach(function(person){
    var phone = person.phone;
    var email = person.email;
    var name = person.name;
    var id = person.id;
    var found = false;
    groups.forEach(function(g){
    if( g.names.indexOf(name) > -1
    || g.phones.indexOf(phone)>-1
    || g.emails.indexOf(email)>-1) {
    found = true;
    g.names.push(name);
    g.phones.push(phone);
    g.emails.push(email);
    g.people.push(id);
    }
    });
    if(!found) {
    groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
    }


    });
    var output=;
    groups.forEach(function(g){
    output.push(g.people);
    });
    console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]








    share|improve this answer

















    • 2




      O(n^2*log(n)) ? I thought OP was looking for an optimal solution
      – joyBlanks
      Nov 20 at 10:44










    • the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
      – Han
      Nov 20 at 13:47










    • Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
      – Akrion
      Nov 21 at 6:11











    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53389734%2foptimal-algorithm-grouping-data-in-javascript%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    4 Answers
    4






    active

    oldest

    votes








    4 Answers
    4






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    3
    down vote



    accepted










    Idea:




    • Start with 0 groups

    • Iterate your list of contacts

    • Check if there is a group with the contact name, or phone, or email. Merge all the members of those groups as the same group. Then add yourself to that group. If not, begin a new group with yourself and set the name, phone and email group as yourself.


    Union find is an efficient structure to handle merging of disjoint sets. Code taken from here. As it uses path compression and union by rank, you can consider that the whole code is linear in the amount of contacts.






    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    // UNION-FIND structure, with path comression and union by rank

    var UNIONFIND = (function () {

    function _find(n)
    {
    if(n.parent == n) return n;
    n.parent = _find(n.parent);
    return n.parent;
    }

    return {
    makeset:function(id){
    var newnode = {
    parent: null,
    id: id,
    rank: 0
    };
    newnode.parent = newnode;
    return newnode;
    },

    find: _find,

    combine: function(n1, n2) {
    var n1 = _find(n1);
    var n2 = _find(n2);

    if (n1 == n2) return;

    if(n1.rank < n2.rank)
    {
    n2.parent = n2;
    return n2;
    }
    else if(n2.rank < n1.rank)
    {
    n2.parent = n1;
    return n1;
    }
    else
    {
    n2.parent = n1;
    n1.rank += 1;
    return n1;
    }
    }
    };
    })();

    var groupHash = {name: {}, phone: {}, email: {}}
    var groupNodes =

    data.forEach(function(contact){
    var group = UNIONFIND.makeset(contact.id);
    var groups = new Set();
    ["name", "phone", "email"].forEach(function(attr){
    if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
    });

    groups = Array.from(groups);
    groups.push(group);
    groupNodes.push(group);

    for(var i = 1; i < groups.length; i++) {
    UNIONFIND.combine(groups[0], groups[i]);
    }

    ["name", "phone", "email"].forEach(function(attr){
    groupHash[attr][contact[attr]] = groups[0];
    });

    })

    var contactsInGroup = {}


    groupNodes.forEach(function(group){
    var groupId = UNIONFIND.find(group).id;

    if (contactsInGroup.hasOwnProperty(groupId) == false) {
    contactsInGroup[groupId] = ;
    }

    contactsInGroup[groupId].push(group.id);
    })

    var result = Object.values(contactsInGroup).filter(function(list){
    return list.length > 1
    })

    console.log(result)








    share|improve this answer























    • Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
      – tucuxi
      Nov 21 at 11:35










    • Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
      – tucuxi
      Nov 21 at 11:39






    • 1




      @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
      – juvian
      Nov 21 at 14:19















    up vote
    3
    down vote



    accepted










    Idea:




    • Start with 0 groups

    • Iterate your list of contacts

    • Check if there is a group with the contact name, or phone, or email. Merge all the members of those groups as the same group. Then add yourself to that group. If not, begin a new group with yourself and set the name, phone and email group as yourself.


    Union find is an efficient structure to handle merging of disjoint sets. Code taken from here. As it uses path compression and union by rank, you can consider that the whole code is linear in the amount of contacts.






    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    // UNION-FIND structure, with path comression and union by rank

    var UNIONFIND = (function () {

    function _find(n)
    {
    if(n.parent == n) return n;
    n.parent = _find(n.parent);
    return n.parent;
    }

    return {
    makeset:function(id){
    var newnode = {
    parent: null,
    id: id,
    rank: 0
    };
    newnode.parent = newnode;
    return newnode;
    },

    find: _find,

    combine: function(n1, n2) {
    var n1 = _find(n1);
    var n2 = _find(n2);

    if (n1 == n2) return;

    if(n1.rank < n2.rank)
    {
    n2.parent = n2;
    return n2;
    }
    else if(n2.rank < n1.rank)
    {
    n2.parent = n1;
    return n1;
    }
    else
    {
    n2.parent = n1;
    n1.rank += 1;
    return n1;
    }
    }
    };
    })();

    var groupHash = {name: {}, phone: {}, email: {}}
    var groupNodes =

    data.forEach(function(contact){
    var group = UNIONFIND.makeset(contact.id);
    var groups = new Set();
    ["name", "phone", "email"].forEach(function(attr){
    if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
    });

    groups = Array.from(groups);
    groups.push(group);
    groupNodes.push(group);

    for(var i = 1; i < groups.length; i++) {
    UNIONFIND.combine(groups[0], groups[i]);
    }

    ["name", "phone", "email"].forEach(function(attr){
    groupHash[attr][contact[attr]] = groups[0];
    });

    })

    var contactsInGroup = {}


    groupNodes.forEach(function(group){
    var groupId = UNIONFIND.find(group).id;

    if (contactsInGroup.hasOwnProperty(groupId) == false) {
    contactsInGroup[groupId] = ;
    }

    contactsInGroup[groupId].push(group.id);
    })

    var result = Object.values(contactsInGroup).filter(function(list){
    return list.length > 1
    })

    console.log(result)








    share|improve this answer























    • Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
      – tucuxi
      Nov 21 at 11:35










    • Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
      – tucuxi
      Nov 21 at 11:39






    • 1




      @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
      – juvian
      Nov 21 at 14:19













    up vote
    3
    down vote



    accepted







    up vote
    3
    down vote



    accepted






    Idea:




    • Start with 0 groups

    • Iterate your list of contacts

    • Check if there is a group with the contact name, or phone, or email. Merge all the members of those groups as the same group. Then add yourself to that group. If not, begin a new group with yourself and set the name, phone and email group as yourself.


    Union find is an efficient structure to handle merging of disjoint sets. Code taken from here. As it uses path compression and union by rank, you can consider that the whole code is linear in the amount of contacts.






    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    // UNION-FIND structure, with path comression and union by rank

    var UNIONFIND = (function () {

    function _find(n)
    {
    if(n.parent == n) return n;
    n.parent = _find(n.parent);
    return n.parent;
    }

    return {
    makeset:function(id){
    var newnode = {
    parent: null,
    id: id,
    rank: 0
    };
    newnode.parent = newnode;
    return newnode;
    },

    find: _find,

    combine: function(n1, n2) {
    var n1 = _find(n1);
    var n2 = _find(n2);

    if (n1 == n2) return;

    if(n1.rank < n2.rank)
    {
    n2.parent = n2;
    return n2;
    }
    else if(n2.rank < n1.rank)
    {
    n2.parent = n1;
    return n1;
    }
    else
    {
    n2.parent = n1;
    n1.rank += 1;
    return n1;
    }
    }
    };
    })();

    var groupHash = {name: {}, phone: {}, email: {}}
    var groupNodes =

    data.forEach(function(contact){
    var group = UNIONFIND.makeset(contact.id);
    var groups = new Set();
    ["name", "phone", "email"].forEach(function(attr){
    if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
    });

    groups = Array.from(groups);
    groups.push(group);
    groupNodes.push(group);

    for(var i = 1; i < groups.length; i++) {
    UNIONFIND.combine(groups[0], groups[i]);
    }

    ["name", "phone", "email"].forEach(function(attr){
    groupHash[attr][contact[attr]] = groups[0];
    });

    })

    var contactsInGroup = {}


    groupNodes.forEach(function(group){
    var groupId = UNIONFIND.find(group).id;

    if (contactsInGroup.hasOwnProperty(groupId) == false) {
    contactsInGroup[groupId] = ;
    }

    contactsInGroup[groupId].push(group.id);
    })

    var result = Object.values(contactsInGroup).filter(function(list){
    return list.length > 1
    })

    console.log(result)








    share|improve this answer














    Idea:




    • Start with 0 groups

    • Iterate your list of contacts

    • Check if there is a group with the contact name, or phone, or email. Merge all the members of those groups as the same group. Then add yourself to that group. If not, begin a new group with yourself and set the name, phone and email group as yourself.


    Union find is an efficient structure to handle merging of disjoint sets. Code taken from here. As it uses path compression and union by rank, you can consider that the whole code is linear in the amount of contacts.






    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    // UNION-FIND structure, with path comression and union by rank

    var UNIONFIND = (function () {

    function _find(n)
    {
    if(n.parent == n) return n;
    n.parent = _find(n.parent);
    return n.parent;
    }

    return {
    makeset:function(id){
    var newnode = {
    parent: null,
    id: id,
    rank: 0
    };
    newnode.parent = newnode;
    return newnode;
    },

    find: _find,

    combine: function(n1, n2) {
    var n1 = _find(n1);
    var n2 = _find(n2);

    if (n1 == n2) return;

    if(n1.rank < n2.rank)
    {
    n2.parent = n2;
    return n2;
    }
    else if(n2.rank < n1.rank)
    {
    n2.parent = n1;
    return n1;
    }
    else
    {
    n2.parent = n1;
    n1.rank += 1;
    return n1;
    }
    }
    };
    })();

    var groupHash = {name: {}, phone: {}, email: {}}
    var groupNodes =

    data.forEach(function(contact){
    var group = UNIONFIND.makeset(contact.id);
    var groups = new Set();
    ["name", "phone", "email"].forEach(function(attr){
    if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
    });

    groups = Array.from(groups);
    groups.push(group);
    groupNodes.push(group);

    for(var i = 1; i < groups.length; i++) {
    UNIONFIND.combine(groups[0], groups[i]);
    }

    ["name", "phone", "email"].forEach(function(attr){
    groupHash[attr][contact[attr]] = groups[0];
    });

    })

    var contactsInGroup = {}


    groupNodes.forEach(function(group){
    var groupId = UNIONFIND.find(group).id;

    if (contactsInGroup.hasOwnProperty(groupId) == false) {
    contactsInGroup[groupId] = ;
    }

    contactsInGroup[groupId].push(group.id);
    })

    var result = Object.values(contactsInGroup).filter(function(list){
    return list.length > 1
    })

    console.log(result)








    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    // UNION-FIND structure, with path comression and union by rank

    var UNIONFIND = (function () {

    function _find(n)
    {
    if(n.parent == n) return n;
    n.parent = _find(n.parent);
    return n.parent;
    }

    return {
    makeset:function(id){
    var newnode = {
    parent: null,
    id: id,
    rank: 0
    };
    newnode.parent = newnode;
    return newnode;
    },

    find: _find,

    combine: function(n1, n2) {
    var n1 = _find(n1);
    var n2 = _find(n2);

    if (n1 == n2) return;

    if(n1.rank < n2.rank)
    {
    n2.parent = n2;
    return n2;
    }
    else if(n2.rank < n1.rank)
    {
    n2.parent = n1;
    return n1;
    }
    else
    {
    n2.parent = n1;
    n1.rank += 1;
    return n1;
    }
    }
    };
    })();

    var groupHash = {name: {}, phone: {}, email: {}}
    var groupNodes =

    data.forEach(function(contact){
    var group = UNIONFIND.makeset(contact.id);
    var groups = new Set();
    ["name", "phone", "email"].forEach(function(attr){
    if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
    });

    groups = Array.from(groups);
    groups.push(group);
    groupNodes.push(group);

    for(var i = 1; i < groups.length; i++) {
    UNIONFIND.combine(groups[0], groups[i]);
    }

    ["name", "phone", "email"].forEach(function(attr){
    groupHash[attr][contact[attr]] = groups[0];
    });

    })

    var contactsInGroup = {}


    groupNodes.forEach(function(group){
    var groupId = UNIONFIND.find(group).id;

    if (contactsInGroup.hasOwnProperty(groupId) == false) {
    contactsInGroup[groupId] = ;
    }

    contactsInGroup[groupId].push(group.id);
    })

    var result = Object.values(contactsInGroup).filter(function(list){
    return list.length > 1
    })

    console.log(result)





    var data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    // UNION-FIND structure, with path comression and union by rank

    var UNIONFIND = (function () {

    function _find(n)
    {
    if(n.parent == n) return n;
    n.parent = _find(n.parent);
    return n.parent;
    }

    return {
    makeset:function(id){
    var newnode = {
    parent: null,
    id: id,
    rank: 0
    };
    newnode.parent = newnode;
    return newnode;
    },

    find: _find,

    combine: function(n1, n2) {
    var n1 = _find(n1);
    var n2 = _find(n2);

    if (n1 == n2) return;

    if(n1.rank < n2.rank)
    {
    n2.parent = n2;
    return n2;
    }
    else if(n2.rank < n1.rank)
    {
    n2.parent = n1;
    return n1;
    }
    else
    {
    n2.parent = n1;
    n1.rank += 1;
    return n1;
    }
    }
    };
    })();

    var groupHash = {name: {}, phone: {}, email: {}}
    var groupNodes =

    data.forEach(function(contact){
    var group = UNIONFIND.makeset(contact.id);
    var groups = new Set();
    ["name", "phone", "email"].forEach(function(attr){
    if (groupHash[attr].hasOwnProperty(contact[attr])) groups.add(groupHash[attr][contact[attr]])
    });

    groups = Array.from(groups);
    groups.push(group);
    groupNodes.push(group);

    for(var i = 1; i < groups.length; i++) {
    UNIONFIND.combine(groups[0], groups[i]);
    }

    ["name", "phone", "email"].forEach(function(attr){
    groupHash[attr][contact[attr]] = groups[0];
    });

    })

    var contactsInGroup = {}


    groupNodes.forEach(function(group){
    var groupId = UNIONFIND.find(group).id;

    if (contactsInGroup.hasOwnProperty(groupId) == false) {
    contactsInGroup[groupId] = ;
    }

    contactsInGroup[groupId].push(group.id);
    })

    var result = Object.values(contactsInGroup).filter(function(list){
    return list.length > 1
    })

    console.log(result)






    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Nov 20 at 14:59

























    answered Nov 20 at 14:30









    juvian

    13.2k22026




    13.2k22026












    • Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
      – tucuxi
      Nov 21 at 11:35










    • Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
      – tucuxi
      Nov 21 at 11:39






    • 1




      @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
      – juvian
      Nov 21 at 14:19


















    • Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
      – tucuxi
      Nov 21 at 11:35










    • Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
      – tucuxi
      Nov 21 at 11:39






    • 1




      @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
      – juvian
      Nov 21 at 14:19
















    Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
    – tucuxi
    Nov 21 at 11:35




    Using Map instead of JS objects would allow you to avoid hasOwnProperty. You already use Set, which has a similar API and advantages.
    – tucuxi
    Nov 21 at 11:35












    Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
    – tucuxi
    Nov 21 at 11:39




    Your code seems to do mostly the same as mine, but you merge your groups on-the-fly while I wait until all are built to start processing merges. Both should be linear in the number of entries.
    – tucuxi
    Nov 21 at 11:39




    1




    1




    @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
    – juvian
    Nov 21 at 14:19




    @tucuxi I usually only use map for object references, but it could be used yeah. Nice solution that doesn´t need an extra structure like union find. Tested with data replicated until 450k entries and yours takes 850ms while mine 1050ms, definitely both linear ^^
    – juvian
    Nov 21 at 14:19












    up vote
    2
    down vote













    Any answer that iterates over each of n entries, and then over a growing list of m groups to match against is going to have worst-time performance of O(n*m) (found when there are no two entries that match on any term).



    Any answer that iterates over each entry, and then over groups, and uses arrays to test for matching values among q options will further have to pay a penalty of O(q) per match. In the worst case, with say all e-mails the same and all phones different, this will mean O(n*m).



    I believe this answer is O(n), because assuming that the number of fields to match with is a constant (in this case, 3: name, phone and email), all operations in the main loop, which is run once per entry, are O(1).



    There is an extra complication to fix the fact that, late in the process, we may find a bridge between two (or even 3) groups, as entries can match on different fields with entries from different groups. This may happen several times. To avoid having to rebuild groups during the main loop, we leave merging to the very end, where we first build a map of what-group-ends-up-where, and then finally move all entry IDs to their final group. This can all be done in O(m), with m the number of groups; with an extra O(n) when actually copying entry IDs to the merged groups: overall, we are still in O(n) territory.



    The last line builds arrays of ids from the merged groups, and filters out any that does not have over 1 element.



    const data = [
    {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
    {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
    {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
    {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
    {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
    {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
    {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
    ];

    const groups = function(inputs) {

    let valuesToGroups = new Map(
    ['name', 'phone', 'email'].map(key => [key, new Map()]));
    let groups = new Map();
    let pendingMerges = ;
    for (const entry of inputs) {
    let group = undefined;
    let found = ;
    for (const [key, valueMap] of valuesToGroups) {
    // look up value in values-index for current key
    group = valueMap.get(entry[key]);
    if (group !== undefined) {
    found.push(group);
    // not breaking allows groups to be merged
    }
    }
    if (found.length === 0) {
    // not found: create new group
    group = groups.size;
    groups.set(group, [entry.id]);
    } else {
    // found: add entry to chosen group
    group = found[0];
    groups.get(group).push(entry.id);
    if (found.length > 1) {
    pendingMerges.push(found);
    }
    }
    // add entry's values to index, pointing to group
    for (const [key, valueMap] of valuesToGroups) {
    valueMap.set(entry[key], group);
    }
    }
    // do pending merges; initially, all groups are stand-alone
    let merges = new Map(Array.from(groups.keys()).map(k => [k, k]));
    for (const merge of pendingMerges) {
    // contents will go to the lowest-numbered group
    const sorted = merge.map(groupId => merges.get(groupId)).sort();
    sorted.forEach(groupId => merges.set(groupId, sorted[0]));
    }
    const cleanGroups = new Map();
    groups.forEach((value, key) => {
    const k = merges.get(key);
    if ( ! cleanGroups.has(k)) {
    cleanGroups.set(k, );
    }
    value.forEach(id => cleanGroups.get(k).push(id))
    })
    // return only non-empty groups
    return [... cleanGroups].filter(g => g[1].length>1).map(g => [... g[1]]);
    }(data);

    console.log(""+JSON.stringify(groups))
    // output is [[1,2,3,5,6,7]]





    share|improve this answer

























      up vote
      2
      down vote













      Any answer that iterates over each of n entries, and then over a growing list of m groups to match against is going to have worst-time performance of O(n*m) (found when there are no two entries that match on any term).



      Any answer that iterates over each entry, and then over groups, and uses arrays to test for matching values among q options will further have to pay a penalty of O(q) per match. In the worst case, with say all e-mails the same and all phones different, this will mean O(n*m).



      I believe this answer is O(n), because assuming that the number of fields to match with is a constant (in this case, 3: name, phone and email), all operations in the main loop, which is run once per entry, are O(1).



      There is an extra complication to fix the fact that, late in the process, we may find a bridge between two (or even 3) groups, as entries can match on different fields with entries from different groups. This may happen several times. To avoid having to rebuild groups during the main loop, we leave merging to the very end, where we first build a map of what-group-ends-up-where, and then finally move all entry IDs to their final group. This can all be done in O(m), with m the number of groups; with an extra O(n) when actually copying entry IDs to the merged groups: overall, we are still in O(n) territory.



      The last line builds arrays of ids from the merged groups, and filters out any that does not have over 1 element.



      const data = [
      {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
      {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
      {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
      {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
      {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
      {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
      {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
      ];

      const groups = function(inputs) {

      let valuesToGroups = new Map(
      ['name', 'phone', 'email'].map(key => [key, new Map()]));
      let groups = new Map();
      let pendingMerges = ;
      for (const entry of inputs) {
      let group = undefined;
      let found = ;
      for (const [key, valueMap] of valuesToGroups) {
      // look up value in values-index for current key
      group = valueMap.get(entry[key]);
      if (group !== undefined) {
      found.push(group);
      // not breaking allows groups to be merged
      }
      }
      if (found.length === 0) {
      // not found: create new group
      group = groups.size;
      groups.set(group, [entry.id]);
      } else {
      // found: add entry to chosen group
      group = found[0];
      groups.get(group).push(entry.id);
      if (found.length > 1) {
      pendingMerges.push(found);
      }
      }
      // add entry's values to index, pointing to group
      for (const [key, valueMap] of valuesToGroups) {
      valueMap.set(entry[key], group);
      }
      }
      // do pending merges; initially, all groups are stand-alone
      let merges = new Map(Array.from(groups.keys()).map(k => [k, k]));
      for (const merge of pendingMerges) {
      // contents will go to the lowest-numbered group
      const sorted = merge.map(groupId => merges.get(groupId)).sort();
      sorted.forEach(groupId => merges.set(groupId, sorted[0]));
      }
      const cleanGroups = new Map();
      groups.forEach((value, key) => {
      const k = merges.get(key);
      if ( ! cleanGroups.has(k)) {
      cleanGroups.set(k, );
      }
      value.forEach(id => cleanGroups.get(k).push(id))
      })
      // return only non-empty groups
      return [... cleanGroups].filter(g => g[1].length>1).map(g => [... g[1]]);
      }(data);

      console.log(""+JSON.stringify(groups))
      // output is [[1,2,3,5,6,7]]





      share|improve this answer























        up vote
        2
        down vote










        up vote
        2
        down vote









        Any answer that iterates over each of n entries, and then over a growing list of m groups to match against is going to have worst-time performance of O(n*m) (found when there are no two entries that match on any term).



        Any answer that iterates over each entry, and then over groups, and uses arrays to test for matching values among q options will further have to pay a penalty of O(q) per match. In the worst case, with say all e-mails the same and all phones different, this will mean O(n*m).



        I believe this answer is O(n), because assuming that the number of fields to match with is a constant (in this case, 3: name, phone and email), all operations in the main loop, which is run once per entry, are O(1).



        There is an extra complication to fix the fact that, late in the process, we may find a bridge between two (or even 3) groups, as entries can match on different fields with entries from different groups. This may happen several times. To avoid having to rebuild groups during the main loop, we leave merging to the very end, where we first build a map of what-group-ends-up-where, and then finally move all entry IDs to their final group. This can all be done in O(m), with m the number of groups; with an extra O(n) when actually copying entry IDs to the merged groups: overall, we are still in O(n) territory.



        The last line builds arrays of ids from the merged groups, and filters out any that does not have over 1 element.



        const data = [
        {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
        {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
        {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
        {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
        {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
        {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
        {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
        ];

        const groups = function(inputs) {

        let valuesToGroups = new Map(
        ['name', 'phone', 'email'].map(key => [key, new Map()]));
        let groups = new Map();
        let pendingMerges = ;
        for (const entry of inputs) {
        let group = undefined;
        let found = ;
        for (const [key, valueMap] of valuesToGroups) {
        // look up value in values-index for current key
        group = valueMap.get(entry[key]);
        if (group !== undefined) {
        found.push(group);
        // not breaking allows groups to be merged
        }
        }
        if (found.length === 0) {
        // not found: create new group
        group = groups.size;
        groups.set(group, [entry.id]);
        } else {
        // found: add entry to chosen group
        group = found[0];
        groups.get(group).push(entry.id);
        if (found.length > 1) {
        pendingMerges.push(found);
        }
        }
        // add entry's values to index, pointing to group
        for (const [key, valueMap] of valuesToGroups) {
        valueMap.set(entry[key], group);
        }
        }
        // do pending merges; initially, all groups are stand-alone
        let merges = new Map(Array.from(groups.keys()).map(k => [k, k]));
        for (const merge of pendingMerges) {
        // contents will go to the lowest-numbered group
        const sorted = merge.map(groupId => merges.get(groupId)).sort();
        sorted.forEach(groupId => merges.set(groupId, sorted[0]));
        }
        const cleanGroups = new Map();
        groups.forEach((value, key) => {
        const k = merges.get(key);
        if ( ! cleanGroups.has(k)) {
        cleanGroups.set(k, );
        }
        value.forEach(id => cleanGroups.get(k).push(id))
        })
        // return only non-empty groups
        return [... cleanGroups].filter(g => g[1].length>1).map(g => [... g[1]]);
        }(data);

        console.log(""+JSON.stringify(groups))
        // output is [[1,2,3,5,6,7]]





        share|improve this answer












        Any answer that iterates over each of n entries, and then over a growing list of m groups to match against is going to have worst-time performance of O(n*m) (found when there are no two entries that match on any term).



        Any answer that iterates over each entry, and then over groups, and uses arrays to test for matching values among q options will further have to pay a penalty of O(q) per match. In the worst case, with say all e-mails the same and all phones different, this will mean O(n*m).



        I believe this answer is O(n), because assuming that the number of fields to match with is a constant (in this case, 3: name, phone and email), all operations in the main loop, which is run once per entry, are O(1).



        There is an extra complication to fix the fact that, late in the process, we may find a bridge between two (or even 3) groups, as entries can match on different fields with entries from different groups. This may happen several times. To avoid having to rebuild groups during the main loop, we leave merging to the very end, where we first build a map of what-group-ends-up-where, and then finally move all entry IDs to their final group. This can all be done in O(m), with m the number of groups; with an extra O(n) when actually copying entry IDs to the merged groups: overall, we are still in O(n) territory.



        The last line builds arrays of ids from the merged groups, and filters out any that does not have over 1 element.



        const data = [
        {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
        {id:2,name:'Marc',phone:'99999999',email:'bbbb@test.com'},
        {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
        {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
        {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
        {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
        {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
        ];

        const groups = function(inputs) {

        let valuesToGroups = new Map(
        ['name', 'phone', 'email'].map(key => [key, new Map()]));
        let groups = new Map();
        let pendingMerges = ;
        for (const entry of inputs) {
        let group = undefined;
        let found = ;
        for (const [key, valueMap] of valuesToGroups) {
        // look up value in values-index for current key
        group = valueMap.get(entry[key]);
        if (group !== undefined) {
        found.push(group);
        // not breaking allows groups to be merged
        }
        }
        if (found.length === 0) {
        // not found: create new group
        group = groups.size;
        groups.set(group, [entry.id]);
        } else {
        // found: add entry to chosen group
        group = found[0];
        groups.get(group).push(entry.id);
        if (found.length > 1) {
        pendingMerges.push(found);
        }
        }
        // add entry's values to index, pointing to group
        for (const [key, valueMap] of valuesToGroups) {
        valueMap.set(entry[key], group);
        }
        }
        // do pending merges; initially, all groups are stand-alone
        let merges = new Map(Array.from(groups.keys()).map(k => [k, k]));
        for (const merge of pendingMerges) {
        // contents will go to the lowest-numbered group
        const sorted = merge.map(groupId => merges.get(groupId)).sort();
        sorted.forEach(groupId => merges.set(groupId, sorted[0]));
        }
        const cleanGroups = new Map();
        groups.forEach((value, key) => {
        const k = merges.get(key);
        if ( ! cleanGroups.has(k)) {
        cleanGroups.set(k, );
        }
        value.forEach(id => cleanGroups.get(k).push(id))
        })
        // return only non-empty groups
        return [... cleanGroups].filter(g => g[1].length>1).map(g => [... g[1]]);
        }(data);

        console.log(""+JSON.stringify(groups))
        // output is [[1,2,3,5,6,7]]






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 21 at 11:25









        tucuxi

        11.5k22858




        11.5k22858






















            up vote
            0
            down vote













            Here is another suggestion of a route you could take. The idea is to use one Array.reduce to group by id and keep all the values (vls) and combined results (ids) in that accumulator object.



            This way you can easily compare the name/phone/email using Array.some + Array.includes (which is what the getGroupId function does).



            Once you have grouped and have the almost final result just prettify it by removing the groups with length of one and picking only the ids array of the rest:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r, c) => {
            let values = Object.values(c), groupID = getGroupId(r, values)[0]

            if(!groupID)
            r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
            else {
            r[groupID] = ({
            vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
            })
            }
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r,c) => {
            if(c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data)))





            One thing to note is that we do not care about the number of properties since we do Object.values. So you can easily add another address or fax to that list and it would still work with zero code changes.



            As per feedback here is another version which works slightly different:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
            var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r,c,i,a) => {
            let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

            if (!groupID) {
            let hits = a.filter(x =>
            x.id != c.id && values.some(v => Object.values(x).includes(v)))
            hits.forEach(h =>
            r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
            }
            else
            r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
            ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r, c) => {
            if (c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data))) // OP data
            console.log(prettify(group(testData))) // Test data





            The reason for this version is due to the testData provided by @Mark which has the 2nd element not matching the first but matching the 3rd which actually matches the 1st ... so they all should be hits.



            To get to that once we find a match we look for matches of that same initial match and push in the same group so we can have the maximum amount of data to match on.



            The result is that once we get the first group with the first element we then find and push the 3rd as well and from there it is much easier to match the 2nd. The logic is slightly more complex and I would imagine less performant.






            share|improve this answer























            • I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
              – Mark Meyer
              Nov 21 at 5:47












            • I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
              – Akrion
              Nov 21 at 7:50










            • The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
              – tucuxi
              Nov 21 at 11:29










            • In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
              – Akrion
              Nov 21 at 19:31















            up vote
            0
            down vote













            Here is another suggestion of a route you could take. The idea is to use one Array.reduce to group by id and keep all the values (vls) and combined results (ids) in that accumulator object.



            This way you can easily compare the name/phone/email using Array.some + Array.includes (which is what the getGroupId function does).



            Once you have grouped and have the almost final result just prettify it by removing the groups with length of one and picking only the ids array of the rest:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r, c) => {
            let values = Object.values(c), groupID = getGroupId(r, values)[0]

            if(!groupID)
            r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
            else {
            r[groupID] = ({
            vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
            })
            }
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r,c) => {
            if(c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data)))





            One thing to note is that we do not care about the number of properties since we do Object.values. So you can easily add another address or fax to that list and it would still work with zero code changes.



            As per feedback here is another version which works slightly different:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
            var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r,c,i,a) => {
            let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

            if (!groupID) {
            let hits = a.filter(x =>
            x.id != c.id && values.some(v => Object.values(x).includes(v)))
            hits.forEach(h =>
            r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
            }
            else
            r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
            ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r, c) => {
            if (c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data))) // OP data
            console.log(prettify(group(testData))) // Test data





            The reason for this version is due to the testData provided by @Mark which has the 2nd element not matching the first but matching the 3rd which actually matches the 1st ... so they all should be hits.



            To get to that once we find a match we look for matches of that same initial match and push in the same group so we can have the maximum amount of data to match on.



            The result is that once we get the first group with the first element we then find and push the 3rd as well and from there it is much easier to match the 2nd. The logic is slightly more complex and I would imagine less performant.






            share|improve this answer























            • I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
              – Mark Meyer
              Nov 21 at 5:47












            • I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
              – Akrion
              Nov 21 at 7:50










            • The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
              – tucuxi
              Nov 21 at 11:29










            • In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
              – Akrion
              Nov 21 at 19:31













            up vote
            0
            down vote










            up vote
            0
            down vote









            Here is another suggestion of a route you could take. The idea is to use one Array.reduce to group by id and keep all the values (vls) and combined results (ids) in that accumulator object.



            This way you can easily compare the name/phone/email using Array.some + Array.includes (which is what the getGroupId function does).



            Once you have grouped and have the almost final result just prettify it by removing the groups with length of one and picking only the ids array of the rest:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r, c) => {
            let values = Object.values(c), groupID = getGroupId(r, values)[0]

            if(!groupID)
            r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
            else {
            r[groupID] = ({
            vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
            })
            }
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r,c) => {
            if(c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data)))





            One thing to note is that we do not care about the number of properties since we do Object.values. So you can easily add another address or fax to that list and it would still work with zero code changes.



            As per feedback here is another version which works slightly different:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
            var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r,c,i,a) => {
            let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

            if (!groupID) {
            let hits = a.filter(x =>
            x.id != c.id && values.some(v => Object.values(x).includes(v)))
            hits.forEach(h =>
            r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
            }
            else
            r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
            ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r, c) => {
            if (c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data))) // OP data
            console.log(prettify(group(testData))) // Test data





            The reason for this version is due to the testData provided by @Mark which has the 2nd element not matching the first but matching the 3rd which actually matches the 1st ... so they all should be hits.



            To get to that once we find a match we look for matches of that same initial match and push in the same group so we can have the maximum amount of data to match on.



            The result is that once we get the first group with the first element we then find and push the 3rd as well and from there it is much easier to match the 2nd. The logic is slightly more complex and I would imagine less performant.






            share|improve this answer














            Here is another suggestion of a route you could take. The idea is to use one Array.reduce to group by id and keep all the values (vls) and combined results (ids) in that accumulator object.



            This way you can easily compare the name/phone/email using Array.some + Array.includes (which is what the getGroupId function does).



            Once you have grouped and have the almost final result just prettify it by removing the groups with length of one and picking only the ids array of the rest:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r, c) => {
            let values = Object.values(c), groupID = getGroupId(r, values)[0]

            if(!groupID)
            r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
            else {
            r[groupID] = ({
            vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
            })
            }
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r,c) => {
            if(c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data)))





            One thing to note is that we do not care about the number of properties since we do Object.values. So you can easily add another address or fax to that list and it would still work with zero code changes.



            As per feedback here is another version which works slightly different:






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
            var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r,c,i,a) => {
            let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

            if (!groupID) {
            let hits = a.filter(x =>
            x.id != c.id && values.some(v => Object.values(x).includes(v)))
            hits.forEach(h =>
            r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
            }
            else
            r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
            ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r, c) => {
            if (c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data))) // OP data
            console.log(prettify(group(testData))) // Test data





            The reason for this version is due to the testData provided by @Mark which has the 2nd element not matching the first but matching the 3rd which actually matches the 1st ... so they all should be hits.



            To get to that once we find a match we look for matches of that same initial match and push in the same group so we can have the maximum amount of data to match on.



            The result is that once we get the first group with the first element we then find and push the 3rd as well and from there it is much easier to match the 2nd. The logic is slightly more complex and I would imagine less performant.






            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r, c) => {
            let values = Object.values(c), groupID = getGroupId(r, values)[0]

            if(!groupID)
            r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
            else {
            r[groupID] = ({
            vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
            })
            }
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r,c) => {
            if(c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data)))





            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r, c) => {
            let values = Object.values(c), groupID = getGroupId(r, values)[0]

            if(!groupID)
            r[c.id] = ({ vls: values, ids: [...r[c.id] || , c.id] })
            else {
            r[groupID] = ({
            vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id]
            })
            }
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r,c) => {
            if(c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data)))





            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
            var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r,c,i,a) => {
            let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

            if (!groupID) {
            let hits = a.filter(x =>
            x.id != c.id && values.some(v => Object.values(x).includes(v)))
            hits.forEach(h =>
            r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
            }
            else
            r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
            ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r, c) => {
            if (c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data))) // OP data
            console.log(prettify(group(testData))) // Test data





            var data = [ {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'}, {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'}, {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'}, {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'}, {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'}, {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'}, {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'} ];
            var testData = [{ id: 1, name: 'John', phone: '1', email: 'a' }, { id: 2, name: 'Marc', phone: '2', email: 'b' }, { id: 3, name: 'Ron', phone: '1', email: 'b' }];

            const getGroupId = (obj, vals) => Object.entries(obj)
            .find(([k,v]) => v.vls.some(x => vals.includes(x))) ||

            const group = d => d.reduce((r,c,i,a) => {
            let values = Object.values(c), groupID = !i ? i : getGroupId(r, values)[0]

            if (!groupID) {
            let hits = a.filter(x =>
            x.id != c.id && values.some(v => Object.values(x).includes(v)))
            hits.forEach(h =>
            r[c.id] = ({ vls: [...values, ...Object.values(h)], ids: [c.id, h.id] }))
            }
            else
            r[groupID] = r[groupID].ids.includes(c.id) ? r[groupID] :
            ({ vls: [...r[groupID].vls, ...values], ids: [...r[groupID].ids, c.id] })
            return r
            }, {})

            const prettify = grp => Object.values(grp).reduce((r, c) => {
            if (c.ids.length > 1)
            r.push(c.ids)
            return r
            }, )

            console.log(prettify(group(data))) // OP data
            console.log(prettify(group(testData))) // Test data






            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 21 at 7:50

























            answered Nov 21 at 4:22









            Akrion

            9,15911224




            9,15911224












            • I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
              – Mark Meyer
              Nov 21 at 5:47












            • I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
              – Akrion
              Nov 21 at 7:50










            • The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
              – tucuxi
              Nov 21 at 11:29










            • In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
              – Akrion
              Nov 21 at 19:31


















            • I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
              – Mark Meyer
              Nov 21 at 5:47












            • I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
              – Akrion
              Nov 21 at 7:50










            • The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
              – tucuxi
              Nov 21 at 11:29










            • In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
              – Akrion
              Nov 21 at 19:31
















            I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
            – Mark Meyer
            Nov 21 at 5:47






            I would expect these to all be in one group: {id:1,name:'John',phone:'1',email:'a'}, {id:2,name:'Marc',phone:'2',email:'b'}{id:3,name:'Ron', phone:'1',email:'b'}, but it seems to leave 2 out.
            – Mark Meyer
            Nov 21 at 5:47














            I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
            – Akrion
            Nov 21 at 7:50




            I did not get it the first time due to the funky formatting in the comments section but you ware correct. I updated with another version which matches as expected. Thanks.
            – Akrion
            Nov 21 at 7:50












            The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
            – tucuxi
            Nov 21 at 11:29




            The matching code seems to iterate over all values. This is wasteful, because a map could, on the right field, report on matches in O(1). Your code is brief (I have not looked into its correctness), but cannot be optimal.
            – tucuxi
            Nov 21 at 11:29












            In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
            – Akrion
            Nov 21 at 19:31




            In order to get the correctness of the groupings you have to scan for matches once you got the initial hit. I am not saying it is perfect but it is pretty generic and could be optimized more for sure.
            – Akrion
            Nov 21 at 19:31










            up vote
            -1
            down vote













            One way to accomplish what you need, is to separate the contacts into groups.
            Each group will contain a list of names, phones and emails.



            Then iterate through contacts, and see if the current contact falls into any of the groups. If not, create a new group and set its names/phones/emails so that next contacts might fall into the same on.






            var data = [
            {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
            {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
            {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
            {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
            {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
            {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
            {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
            ];

            var groups = ;

            data.forEach(function(person){
            var phone = person.phone;
            var email = person.email;
            var name = person.name;
            var id = person.id;
            var found = false;
            groups.forEach(function(g){
            if( g.names.indexOf(name) > -1
            || g.phones.indexOf(phone)>-1
            || g.emails.indexOf(email)>-1) {
            found = true;
            g.names.push(name);
            g.phones.push(phone);
            g.emails.push(email);
            g.people.push(id);
            }
            });
            if(!found) {
            groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
            }


            });
            var output=;
            groups.forEach(function(g){
            output.push(g.people);
            });
            console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]








            share|improve this answer

















            • 2




              O(n^2*log(n)) ? I thought OP was looking for an optimal solution
              – joyBlanks
              Nov 20 at 10:44










            • the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
              – Han
              Nov 20 at 13:47










            • Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
              – Akrion
              Nov 21 at 6:11















            up vote
            -1
            down vote













            One way to accomplish what you need, is to separate the contacts into groups.
            Each group will contain a list of names, phones and emails.



            Then iterate through contacts, and see if the current contact falls into any of the groups. If not, create a new group and set its names/phones/emails so that next contacts might fall into the same on.






            var data = [
            {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
            {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
            {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
            {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
            {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
            {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
            {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
            ];

            var groups = ;

            data.forEach(function(person){
            var phone = person.phone;
            var email = person.email;
            var name = person.name;
            var id = person.id;
            var found = false;
            groups.forEach(function(g){
            if( g.names.indexOf(name) > -1
            || g.phones.indexOf(phone)>-1
            || g.emails.indexOf(email)>-1) {
            found = true;
            g.names.push(name);
            g.phones.push(phone);
            g.emails.push(email);
            g.people.push(id);
            }
            });
            if(!found) {
            groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
            }


            });
            var output=;
            groups.forEach(function(g){
            output.push(g.people);
            });
            console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]








            share|improve this answer

















            • 2




              O(n^2*log(n)) ? I thought OP was looking for an optimal solution
              – joyBlanks
              Nov 20 at 10:44










            • the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
              – Han
              Nov 20 at 13:47










            • Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
              – Akrion
              Nov 21 at 6:11













            up vote
            -1
            down vote










            up vote
            -1
            down vote









            One way to accomplish what you need, is to separate the contacts into groups.
            Each group will contain a list of names, phones and emails.



            Then iterate through contacts, and see if the current contact falls into any of the groups. If not, create a new group and set its names/phones/emails so that next contacts might fall into the same on.






            var data = [
            {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
            {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
            {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
            {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
            {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
            {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
            {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
            ];

            var groups = ;

            data.forEach(function(person){
            var phone = person.phone;
            var email = person.email;
            var name = person.name;
            var id = person.id;
            var found = false;
            groups.forEach(function(g){
            if( g.names.indexOf(name) > -1
            || g.phones.indexOf(phone)>-1
            || g.emails.indexOf(email)>-1) {
            found = true;
            g.names.push(name);
            g.phones.push(phone);
            g.emails.push(email);
            g.people.push(id);
            }
            });
            if(!found) {
            groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
            }


            });
            var output=;
            groups.forEach(function(g){
            output.push(g.people);
            });
            console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]








            share|improve this answer












            One way to accomplish what you need, is to separate the contacts into groups.
            Each group will contain a list of names, phones and emails.



            Then iterate through contacts, and see if the current contact falls into any of the groups. If not, create a new group and set its names/phones/emails so that next contacts might fall into the same on.






            var data = [
            {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
            {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
            {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
            {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
            {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
            {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
            {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
            ];

            var groups = ;

            data.forEach(function(person){
            var phone = person.phone;
            var email = person.email;
            var name = person.name;
            var id = person.id;
            var found = false;
            groups.forEach(function(g){
            if( g.names.indexOf(name) > -1
            || g.phones.indexOf(phone)>-1
            || g.emails.indexOf(email)>-1) {
            found = true;
            g.names.push(name);
            g.phones.push(phone);
            g.emails.push(email);
            g.people.push(id);
            }
            });
            if(!found) {
            groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
            }


            });
            var output=;
            groups.forEach(function(g){
            output.push(g.people);
            });
            console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]








            var data = [
            {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
            {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
            {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
            {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
            {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
            {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
            {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
            ];

            var groups = ;

            data.forEach(function(person){
            var phone = person.phone;
            var email = person.email;
            var name = person.name;
            var id = person.id;
            var found = false;
            groups.forEach(function(g){
            if( g.names.indexOf(name) > -1
            || g.phones.indexOf(phone)>-1
            || g.emails.indexOf(email)>-1) {
            found = true;
            g.names.push(name);
            g.phones.push(phone);
            g.emails.push(email);
            g.people.push(id);
            }
            });
            if(!found) {
            groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
            }


            });
            var output=;
            groups.forEach(function(g){
            output.push(g.people);
            });
            console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]





            var data = [
            {id:1,name:'John',phone:'11111111',email:'aaaa@test.com'},
            {id:2,name:'Marc',phone:'22222222',email:'bbbb@test.com'},
            {id:3,name:'Ron',phone:'99999999',email:'aaaa@test.com'},
            {id:4,name:'Andrew',phone:'55555555',email:'dddd@test.com'},
            {id:5,name:'Wim',phone:'99999999',email:'gggg@test.com'},
            {id:6,name:'Marc',phone:'33333333',email:'cccc@test.com'},
            {id:7,name:'Dan',phone:'44444444',email:'cccc@test.com'}
            ];

            var groups = ;

            data.forEach(function(person){
            var phone = person.phone;
            var email = person.email;
            var name = person.name;
            var id = person.id;
            var found = false;
            groups.forEach(function(g){
            if( g.names.indexOf(name) > -1
            || g.phones.indexOf(phone)>-1
            || g.emails.indexOf(email)>-1) {
            found = true;
            g.names.push(name);
            g.phones.push(phone);
            g.emails.push(email);
            g.people.push(id);
            }
            });
            if(!found) {
            groups.push({names:[name],phones:[phone],emails:[email],people:[id]});
            }


            });
            var output=;
            groups.forEach(function(g){
            output.push(g.people);
            });
            console.log(output); //[ [1,3,5] , [2,6,7] , [4] ]






            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Nov 20 at 9:51









            Ahmad

            8,11143463




            8,11143463








            • 2




              O(n^2*log(n)) ? I thought OP was looking for an optimal solution
              – joyBlanks
              Nov 20 at 10:44










            • the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
              – Han
              Nov 20 at 13:47










            • Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
              – Akrion
              Nov 21 at 6:11














            • 2




              O(n^2*log(n)) ? I thought OP was looking for an optimal solution
              – joyBlanks
              Nov 20 at 10:44










            • the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
              – Han
              Nov 20 at 13:47










            • Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
              – Akrion
              Nov 21 at 6:11








            2




            2




            O(n^2*log(n)) ? I thought OP was looking for an optimal solution
            – joyBlanks
            Nov 20 at 10:44




            O(n^2*log(n)) ? I thought OP was looking for an optimal solution
            – joyBlanks
            Nov 20 at 10:44












            the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
            – Han
            Nov 20 at 13:47




            the log(n) part is ok, but the multiplication with quadratic part n^2 will cause problems for bigger datasets. However, do not have better solution myself
            – Han
            Nov 20 at 13:47












            Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
            – Akrion
            Nov 21 at 6:11




            Groups with only 1 element should not be part of the result as per OP. [4] should not be in the result.
            – Akrion
            Nov 21 at 6:11


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53389734%2foptimal-algorithm-grouping-data-in-javascript%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            404 Error Contact Form 7 ajax form submitting

            How to know if a Active Directory user can login interactively

            TypeError: fit_transform() missing 1 required positional argument: 'X'