Mathematics and Computer Science help us streamline our thought process. Over the decades/years, focus on these two great fields have helped humans to understand complex topics through simple processes. In this story, let us focus on the basic subsets, permutations and combinations problems from a mathematics and computer science perspective.
Subsets
Subset can be defined in terms of presence or absence of candidate elements. So, subset is defined as a possible groupings of candidate elements. Subset does not rely on the order of the elements(with an exception during implementation). Subsets is basically a collection of all possible groupings of candidate elements(collection of subset)
Lets get started with an example. The candidate elements are [1, 2]. The subset(s) of these 2 elements are
1. {} - select none
2. {1} - select only '1'
3. {2} - select only '2'
4. {1, 2} - select '1' and '2' in any order (same as {2, 1})
In the above example with 2 unique elements, there are 2²=4 sets in the subsets. Let us analyze the possible ways of presence or absence of each element.
[1, 2]
1. Start with {}
2. Add '1' to a copy of the current state ({})
2.1 Updated state is {}, {1}
3. Add '2' to a copy of the current state ({}, {1})
3.1 Updated state is {}, {1}, {2}, {1, 2}
So, for an input of size ’n’, there will be 2^n sets in the subsets. Of these 2^n sets,
- one will be empty set = {}
- one will a complete set = {complete input}
- remaining (2^n) - 2 will be partial sets
Lets start with a regular recursive code to compute subsets for any set of input elements. In the following code, each element is included and excluded in a candidate set. Subsets are generated recursively using the computed inclusion and exclusion states.
An iterative version of the same logic looks simple and follows the logic of general understanding of subset computation. The following code computes subsets iteratively
The simple explanation is, when an element arrives
- existing subsets will not contain the current element
- clone all the existing subsets and add the current element to each of the cloned subset
- add the cloned subsets back to the total subsets
- proceed with next element
The iterative approach shines and is more intuitve when used with enumeration. The following code shares the internal state and hence beware of modification to the returned result, as it might impact future subsets. Alternatively, the result can be cloned before returning to avoid this scenario.
To better understand/debug, we can enumerate and get the subsets over the stream of numbers using the following code
All the above approaches work when the input contains unique elements. If input can contain duplicates, a slight modification is required in the above solutions to produce expected subsets.
Duplicate in Input
When the input contains duplicates, subset computation has to handle duplicate occurances to produce the correct result. Lets take the example of [1, 1, 2]. Running through the above logic, we will get incorrect(duplicates) in the output
1. {} - select none
2. {1} - select only first '1'
3. {1} - select only second '1'
4. {2} - select only the '2'
5. {1, 1} - select '1' and '1' in any order (same as {1, 1})
6. {1, 2} - select first '1' and '2'
7. {1, 2} - select second 1' and '2'
8. {1, 1, 2} - select both '1'’s and '2'
In the above above example the following duplicates are invalid
3. {1} - select only second '1'
7. {1, 2} - select second 1' and '2'
The expected correct result will be
1. {} - select none
2. {1} - select only the first '1'
3. {2} - select only the '2'
4. {1, 1} - select '1' and '1' in any order (same as {1, 1})
5. {1, 2} - select first '1' and '2'
6. {1, 1, 2} - select both '1'’s and '2'
Now to proceed with correct implementation, there is a need to differentiate between the first occurance of an element and any subsequent(duplicate) occurance of the same element. This duplicate detection can be easily achieved if the same valued elements occur consequetively in the input. Sorting the input will simulate the behavior of similar elements appearing consecutively.(Note: sorting is not absolutely required if consequetive occurance of the similar elements can be guranteed).
Given that the input is sorted, lets focus on how to construct subset from the sorted input.
1. Identify duplicate elements by remembering the previous value (sorted input)
2. If current element different from previous element in the sorted input, then process the element as per the above unique solution case (clone and append to every existing set)
3. If current element is same as previous element in the sorted input, then consider only sets that has the previous element (clone and append only to those sets)The rationale behind considering only the sets that has the previous element is: don't duplicate the work done by previous element
Now, lets go over the example based on the above conditions for the input [1,2,1]
1. sort the input
2. input = [1¹, 1², 2] (either increasing or decreasing)
3. initialize result = {}
4. initialize previous = INFINITY
Lets start with current element = 1¹ (first ‘1’)
5. compare current element with previous element
6. compare 1¹ with INFINITY
7. 1¹ != INFINITY, so clone the complete result
8. cloned = {}
9. insert 1¹ to each of the cloned set (in this case only 1)
10. cloned = {1¹}
11. add cloned sets to existing result
12. result = {}, {1¹}
13. assign current element to previous
14. previous = 1¹
Now lets consider the next element 1² (duplicate element)
15. compare current element with previous element
16. compare 1² with 1¹
17. 1² == 1¹, so consider only the sets containing previous element(1¹) as candidate sets.
18. clone the candidate sets
19. cloned = {1¹}
20. insert 1² to each of the cloned set
21. cloned = {1¹, 1²}
22. add cloned sets to existing result
23. result = {}, {1¹}, {1¹, 1²}
24. assign current element to previous
25. previous = 1²
Now lets consider the next element 2
26. compare current element with previous element
27. compare 2 with 1²
28. 2 != 1², so clone the complete result
29. cloned = {}, {1¹}, {1¹, 1²}
30. insert 2 to each of the cloned set
31. cloned = {2}, {1¹, 2}, {1¹, 1², 2}
32. Add cloned sets to existing result
33. result = {}, {1¹}, {1¹, 1²}, {2}, {1¹, 2}, {1¹, 1², 2}
34. assign current element to previous
35. previous = 2
Proceed with any more numbers in the input and follow the same logic.
Now, lets try to code this logic. The recursive code includes an element if one of the following condition is met:
- current element is seen for the first time
- current element is same as previously added element
Now lets try to code the same logic iteratively. In the following code, subsets are computed iteratively for every input element based on the existing subsets.
- current element is seen for the first time, then clone the complete subsets and add current element and remember the count of cloned subsets
- current element is same as previous element, then clone only the subsets containing the previous element (use the remembered count)
To better understand/debug, we can enumerate and get the subsets over the stream of numbers using the following code:
Hope this article would have helped in understanding and coding the subset problems effectively. Applaud or comment if you like or have any questions or find any bugs. Lets continue with permutations in the next article :)