array_intersect() and set theory

Some time back, a colleague asked me when one would use a function like array_intersect(). At the time, other than asking him to read up on set theory, I filed the question away. If you think about it, when was the last time you thought about set theory while programming? :-)

This week I was presented with a simple problem. The user table had 4322 users and a new list had 1698 users. The new list was the latest list of valid users, and the user table had to be synced to it. I realized that this was a classic set theory problem and broke it down into three parts:

  1. If a user in the user table is also in the new list, do nothing
  2. If a user in the user table is not in the new list, delete it from the user table
  3. If a user in the new list is not in the user table, add it to the user table

Using a Venn diagram this can be shown as follows:

sets-img1

In this we have two sets, A and B where:
A = New list of users (latest list of users)
B = User table (existing users)

The common area where the “users in the new list” are also in the “user table” is called an “intersection” in set theory and can be shown as follows (satisfies item-1 above):

C = A ∩ B

Users in the user table but not in the new list can be shown using set difference like so (satisfies item-2 above):

D = B – C

Users in the new list but not in the user table can be shown again using difference like so (satisfies item-3 above):

E = A – C
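
As a side note, both differences can be worked out without going through C, because C is a subset of both A and B. A short derivation, written with \setminus for the difference that is shown as "–" above:

```latex
\[
\begin{aligned}
D &= B \setminus C = B \setminus (A \cap B) = (B \setminus A) \cup (B \setminus B) = B \setminus A \\
E &= A \setminus C = A \setminus (A \cap B) = (A \setminus A) \cup (A \setminus B) = A \setminus B
\end{aligned}
\]
```

So subtracting the intersection or subtracting the other set gives the same result; which one reads better is a matter of taste.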

If we were to use an example:

New list: A = {‘John’, ‘Mary’, ‘David’, ‘Jack’, ‘Jason’}

User table: B = {‘Dexter’, ‘William’, ‘Jack’, ‘John’, ‘Christine’}

Users in user table and new list: C = A ∩ B = {‘Jack’, ‘John’}.
Users in the user table but not in the new list: D = B – C = {‘Dexter’, ‘William’, ‘Christine’}.
Users in the new list but not in the user table: E = A – C = {‘Mary’, ‘David’, ‘Jason’}.

If we use Venn diagrams:

sets-img2

sets-img3

Using PHP we can write this as:

<?php
// A = new list of users, B = existing user table
$A = array('John', 'Mary', 'David', 'Jack', 'Jason');
$B = array('Dexter', 'William', 'Jack', 'John', 'Christine');

// C = A ∩ B: users in both sets (item 1)
$C = array_intersect($A, $B);

// D = B – C: users to delete (item 2); E = A – C: users to add (item 3)
$D = array_diff($B, $C);
$E = array_diff($A, $C);

// Display the results
echo 'New list A: ', print_r($A, true);
echo 'User table B: ', print_r($B, true);

echo 'Users in user table and new list (Item 1) – do nothing C: ', print_r($C, true);
echo 'Users in the user table but not in the new list (Item 2) – delete from user table D: ', print_r($D, true);

echo 'Users in the new list but not in the user table (Item 3) – add to user table E: ', print_r($E, true);

Running the script gives the following output:

Jayeshs-Mac-mini:test jwadhwani$ php test.php
New list A: Array
(
[0] => John
[1] => Mary
[2] => David
[3] => Jack
[4] => Jason
)
User table B: Array
(
[0] => Dexter
[1] => William
[2] => Jack
[3] => John
[4] => Christine
)
Users in user table and new list (Item 1) – do nothing C: Array
(
[0] => John
[3] => Jack
)
Users in the user table but not in the new list (Item 2) – delete from user table D: Array
(
[0] => Dexter
[1] => William
[4] => Christine
)
Users in the new list but not in the user table (Item 3) – add to user table E: Array
(
[1] => Mary
[2] => David
[4] => Jason
)
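
Notice in the output above that the keys are not renumbered: array_intersect() and array_diff() keep the keys of their first argument, which is why C has keys 0 and 3. If you want plain 0-based lists, a minimal sketch is to wrap the calls in array_values(). The same sketch also diffs against the other set directly instead of against C, which gives the same D and E for this data:

```php
<?php
$A = array('John', 'Mary', 'David', 'Jack', 'Jason');
$B = array('Dexter', 'William', 'Jack', 'John', 'Christine');

// array_values() reindexes the result from 0
$C = array_values(array_intersect($A, $B));

// Diff against the other set directly, skipping C
$D = array_values(array_diff($B, $A));
$E = array_values(array_diff($A, $B));

print_r($C); // John, Jack
print_r($D); // Dexter, William, Christine
print_r($E); // Mary, David, Jason
```

Whether you reindex or not makes no difference if you only loop over the values with foreach.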

Once you have your arrays, you can iterate over them and perform whatever operations are needed, such as deleting records from the database.

There are other ways to solve this problem, but none as elegant come to mind. How would you solve such a problem? I look forward to your comments.
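
To make that last step concrete, here is a rough sketch of applying D and E to a database with PDO. The schema is my assumption for illustration only: a hypothetical `users` table with a single `name` column, held in an in-memory SQLite database (this needs the pdo_sqlite extension) so the example runs on its own. A real user table will look different.

```php
<?php
// Hypothetical schema: a `users` table with a unique `name` column
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT PRIMARY KEY)');

// Seed the table with the existing users (set B)
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (?)');
foreach (array('Dexter', 'William', 'Jack', 'John', 'Christine') as $name) {
    $insert->execute(array($name));
}

// A = new list, B = current contents of the user table
$A = array('John', 'Mary', 'David', 'Jack', 'Jason');
$B = $pdo->query('SELECT name FROM users')->fetchAll(PDO::FETCH_COLUMN);

$C = array_intersect($A, $B); // in both: do nothing
$D = array_diff($B, $C);      // only in the table: delete
$E = array_diff($A, $C);      // only in the new list: add

// Apply both changes in one transaction so the table is never half synced
$delete = $pdo->prepare('DELETE FROM users WHERE name = ?');
$pdo->beginTransaction();
foreach ($D as $name) {
    $delete->execute(array($name));
}
foreach ($E as $name) {
    $insert->execute(array($name));
}
$pdo->commit();

// The table should now hold exactly the users in the new list
$synced = $pdo->query('SELECT name FROM users ORDER BY name')->fetchAll(PDO::FETCH_COLUMN);
print_r($synced);
```

Prepared statements are used here so that the user names are never concatenated into the SQL string.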

Happy Programming!

-Jayesh
