Delete duplicates from hierarchy of lists

Is it possible for Mathematica to compare a number of lists of different sizes, and delete duplicates in a hierarchical manner?

For example, given the following lists:

list1 = {2, 5, 10, 11, 12, 17, 18};
list2 = {5, 8, 11, 16, 19, 21};
list3 = {5, 8, 9, 11, 16, 19, 21, 23};

I would like to output:

list1 = {2, 5, 10, 11, 12, 17, 18}
list2 = {8, 16, 19, 21}
list3 = {9, 23}

where list1 keeps all its original elements, list2 keeps only elements belonging to list2 and not list1, and list3 keeps only those elements not contained in lists 1 & 2.

If this will take an inordinately long time, I am happy with just comparing the list with the previous list and deleting the duplicates from the second list.

=================

  

 

Closely related How to make an efficient difference of sets function?
– Artes
Nov 19 ’13 at 13:01

=================

2 Answers
2

=================

This could be done with set operations:

list1
Complement[list2, list1]
Complement[list3, Union[list1, list2]]

  

 

@ Ubpdqn, thank you very much for your help on this 🙂
– martin
Nov 19 ’13 at 11:46

You can use Fold to extend ubpdqn’s use of Complement and Union. Pre-sorting the lists isn’t necessary, especially if they are already sorted, but it ran slightly faster on large random lists.

list1 = {2, 5, 10, 11, 12, 17, 18};
list2 = {5, 8, 11, 16, 19, 21};
list3 = {5, 8, 9, 11, 16, 19, 21, 23};
lists = {list1, list2, list3};

Reap[Fold[(Sow @ Complement[#2, #1]; Union[##]) &, {}, Sort /@ lists]][[-1, -1]]

(* {{2, 5, 10, 11, 12, 17, 18},
{8, 16, 19, 21},
{9, 23}} *)

Comparison with and without Sort:

Block[{lists},
SeedRandom[1];
lists = Table[RandomInteger[2 * 10^5, RandomInteger[{10^3, 10^4}]], {200}];
Reap[Fold[(Sow @ Complement[#2, #1]; Union[##]) &, {},
Sort /@ lists]][[-1, -1]] // Last // AbsoluteTiming
]

(* {0.951474,
{130, 8034, 9175, 19773, 23884, 25942, 32039, 35681, 43301,
48881, 55009, 55076, 59170, 60909, 74140, 75298, 79379, 84588,
96241, 100238, 101899, 104788, 106787, 109378, 110730, 118780,
120469, 121675, 124215, 129721, 130787, 136182, 137604, 138686,
143374, 152045, 161277, 169106, 171461, 188972}} *)

Block[{lists},
SeedRandom[1];
lists = Table[RandomInteger[2 10^5, RandomInteger[{10^3, 10^4}]], {200}];
Reap[Fold[(Sow @ Complement[#2, #1]; Union[##]) &, {},
lists]][[-1, -1]] // Last // AbsoluteTiming
]

(* {1.009209,
{130, …, 188972}} *)