Apple push notification, PHP and XML diff

In a recent project I had to send a push notification to my iOS app based on the difference between two XML files.

The XML file basically contains a list of incidents and looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<incidents>
    <incident>
        <description><![CDATA[Server crashed.]]></description>
        <id>88</id>
        <time>1/4/2012 5:31 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server stolen.]]></description>
        <id>87</id>
        <time>1/3/2012 4:30 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server on fire.]]></description>
        <id>86</id>
        <time>1/2/2012 3:29 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server damaged.]]></description>
        <id>85</id>
        <time>1/2/2012 2:28 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server misplaced.]]></description>
        <id>84</id>
        <time>1/1/2012 1:27 PM</time>
    </incident>
</incidents>

Please note that for the purposes of this post I have pared down the structure and content of the file. Also, the XML file was sent to me from a parent application server and I had no control over its generation.

For the push notification I had to tell the app how many incidents had changed and how many new ones had been added. This meant doing an XML diff between the current XML file and the previous one.
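
As a rough illustration of what the app could receive, the counts can be packed into a notification payload. This is only a sketch under assumptions: buildPushPayload() is a hypothetical helper and not code from the project, the alert wording and the custom "incidents" key are made up, and delivering the payload to Apple is not covered here.

```php
<?php
// Hypothetical helper: builds the JSON body of a push notification
// from the diff counts. "aps", "alert" and "badge" are standard APNs
// payload keys; the "incidents" key is a custom one invented for this sketch.
function buildPushPayload($new, $changed)
{
    $payload = array(
        'aps' => array(
            'alert' => sprintf('%d new and %d changed incidents', $new, $changed),
            'badge' => $new + $changed,
        ),
        'incidents' => array('new' => $new, 'changed' => $changed),
    );
    return json_encode($payload);
}

echo buildPushPayload(1, 1), PHP_EOL;
```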

I could recurse over the two files and compare them node by node, but that would be cumbersome. It would also couple the code to the structure of the XML file, and I am against tight coupling. So I thought of comparing hashes of the old and new incidents instead.
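
To make the idea concrete, here is a minimal sketch that hashes each incident as a whole. The inline XML is a shortened stand-in for the real file, and the only element the code names is id.

```php
<?php
// Two incidents with the same structure as the sample file (shortened).
$xml = '<incidents>'
     . '<incident><description><![CDATA[Server crashed.]]></description><id>88</id></incident>'
     . '<incident><description><![CDATA[Server stolen.]]></description><id>87</id></incident>'
     . '</incidents>';

$hashes = array();
foreach (new SimpleXMLElement($xml) as $incident) {
    // asXML() serializes the whole <incident> element, so the hash
    // covers every child without the code naming any of them except <id>.
    $hashes[(int) $incident->id] = md5($incident->asXML());
}

print_r($hashes);
```

Because the hash is taken over the serialized element, it should also change when only the whitespace inside an incident changes, which may or may not be what you want.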

My strategy was to use three arrays: one holding the previous hashes for reference, a second holding the incident hashes from the current XML, and a third holding the status of each incident (new, changed, deleted or unchanged), from which the counts are derived.

Here is the code:

define('INCIDENT_HASHES_FILENAME', 'incident_hashes.txt');
define('XML_FILE', 'test.xml');

$incidentsPrev = array(); //will contain previous hashes if any 
$incidentHashes = array(); //will contain new hashes
$incidentStatus = array(); //will contain final status as id => status: 'N' = New, 'C' = Changed, 'D' = Deleted, 'NC' = No Change

//check if a reference file exists.
//if a file exists but returns corrupt information
//start anew 
if(file_exists(INCIDENT_HASHES_FILENAME)){
	$incidentsPrev = unserialize(trim(file_get_contents(INCIDENT_HASHES_FILENAME)));
	if(!is_array($incidentsPrev)){
		$incidentsPrev = array();
	}
}

//debug
echo 'Previous Hashes: ', '<pre>', print_r($incidentsPrev, true), '</pre>';

//contents of the new xml file
$xml = file_get_contents(XML_FILE);
$xmlIterator = new SimpleXMLIterator($xml);

//iterate and get hashes
foreach($xmlIterator as $incident){
	//get hash for the incident
	$hash = md5($incident->asXML());
	$id = (int)$incident->id;
	$incidentHashes[$id] = $hash;
}

//compare hashes
foreach($incidentHashes as $id => $hash){
	//check if the incident exists in the hash array
	if(array_key_exists($id, $incidentsPrev)){
		if($incidentsPrev[$id] === $hash){ //no change
			$incidentStatus[$id] = 'NC';
		}else{
			$incidentStatus[$id] = 'C'; //changed
		}
		unset($incidentsPrev[$id]); //we are done with this one
	}else{
		$incidentStatus[$id] = 'N';//new one
	}
}

//at this point what $incidentsPrev contains are incidents
//which are deleted (this must run after the loop above has finished)
foreach($incidentsPrev as $k => $v){ //all deletes
	$incidentStatus[$k] = 'D';
}

//order the statuses by id, newest first
krsort($incidentStatus);

//save away the results for next time
file_put_contents(INCIDENT_HASHES_FILENAME, serialize($incidentHashes));

//results
$new = count(array_keys($incidentStatus, 'N'));
$changed = count(array_keys($incidentStatus, 'C'));
$deleted = count(array_keys($incidentStatus, 'D'));
$noChange = count(array_keys($incidentStatus, 'NC'));

echo '<p>', 'New: ', $new, ' Changed: ', $changed, ' Deleted: ', $deleted, ' No Change: ', $noChange, '</p>';

echo 'Current Hashes: ', '<pre>', print_r($incidentHashes, true), '</pre>';
echo 'Status: ', '<pre>', print_r($incidentStatus, true), '</pre>';

First off, I check whether a file with previous hashes exists. The first time around there are no previous hashes, so all incidents are labeled as new. This is what the output looks like:

Previous Hashes: Array
(
)
New: 5 Changed: 0 Deleted: 0 No Change: 0

Current Hashes: Array
(
    [88] => e99e791d7413872d2b10b2774ae75e32
    [87] => 68e218269cc5a548d86272b4dee27015
    [86] => 71cc7681549a1590748ceef9034069b0
    [85] => 1d71598d3142a946e9d115276a4baee8
    [84] => eb2869de9544a89515604da930522298
)
Status: Array
(
    [88] => N
    [87] => N
    [86] => N
    [85] => N
    [84] => N
)

Now I will change one incident, delete another and add a new one. The XML file now looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<incidents>
    <incident>
        <description><![CDATA[Rat ate server **NEW**.]]></description>
        <id>89</id>
        <time>1/5/2012 6:32 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server crashed.]]></description>
        <id>88</id>
        <time>1/4/2012 5:31 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server stolen again. **CHANGED**]]></description>
        <id>87</id>
        <time>1/3/2012 4:30 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server on fire.]]></description>
        <id>86</id>
        <time>1/2/2012 3:29 PM</time>
    </incident>
    <incident>
        <description><![CDATA[Server misplaced.]]></description>
        <id>84</id>
        <time>1/1/2012 1:27 PM</time>
    </incident>
</incidents>

Now if I run the program again this is what I get:

Previous Hashes: Array
(
    [88] => e99e791d7413872d2b10b2774ae75e32
    [87] => 68e218269cc5a548d86272b4dee27015
    [86] => 71cc7681549a1590748ceef9034069b0
    [85] => 1d71598d3142a946e9d115276a4baee8
    [84] => eb2869de9544a89515604da930522298
)

New: 1 Changed: 1 Deleted: 1 No Change: 3

Current Hashes: Array
(
    [89] => 98b372c37454d5e0209109cc3393274a
    [88] => e99e791d7413872d2b10b2774ae75e32
    [87] => 9f72a06f5f193ec350300754419444dc
    [86] => 71cc7681549a1590748ceef9034069b0
    [84] => eb2869de9544a89515604da930522298
)


Status: Array
(
    [89] => N
    [88] => NC
    [87] => C
    [86] => NC
    [85] => D
    [84] => NC
)

The code is pretty straightforward. Here are the significant steps:

0. Get hold of previous hashes if available.
1. Iterate over the XML and get all the current hashes.
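2. Compare each current hash with the previous hash for the same incident id.
3. If the id is not found among the previous hashes, the incident must be new.
4. If it is found and the hashes do not match, the data has changed; otherwise it is unchanged.
5. The previous hashes that remain unmatched belong to incidents that have been removed.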
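
The comparison steps can also be sketched as a small function that works on plain arrays. diffHashes() is a hypothetical name used only for this illustration, and the short hash strings are made up.

```php
<?php
// Hypothetical helper: compares two id => hash arrays and returns
// id => status using the same codes as the script
// ('N' new, 'C' changed, 'NC' no change, 'D' deleted).
function diffHashes(array $prev, array $curr)
{
    $status = array();
    foreach ($curr as $id => $hash) {
        if (!array_key_exists($id, $prev)) {
            $status[$id] = 'N';
        } elseif ($prev[$id] === $hash) {
            $status[$id] = 'NC';
        } else {
            $status[$id] = 'C';
        }
        unset($prev[$id]); // done with this one
    }
    // whatever is left in $prev no longer exists in the current data
    foreach (array_keys($prev) as $id) {
        $status[$id] = 'D';
    }
    return $status;
}

// Made-up short hashes, just to exercise each branch once.
$status = diffHashes(
    array(88 => 'aaa', 87 => 'bbb', 85 => 'ccc'),
    array(89 => 'ddd', 88 => 'aaa', 87 => 'eee')
);
print_r($status);
```

Keeping the comparison free of file and XML handling makes it easy to try out with hand-written arrays like the ones above.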

There are many possibilities for something like this. For example, I could send the status array to the app as a JSON package and show the user which incidents are new, changed or deleted using icons or a change in background color.
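
A minimal sketch of the JSON idea, using the status array from the second run above. How the app would consume the result is left open.

```php
<?php
// Status array as produced by the second run above.
$incidentStatus = array(89 => 'N', 88 => 'NC', 87 => 'C', 86 => 'NC', 85 => 'D', 84 => 'NC');

// Integer keys that are not a 0..n sequence are encoded as a JSON
// object, so the incident ids survive as property names.
$json = json_encode($incidentStatus);

echo $json, PHP_EOL;
// prints {"89":"N","88":"NC","87":"C","86":"NC","85":"D","84":"NC"}
```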

Also note that you could use array_intersect_assoc to get the unchanged incidents, or array_diff_assoc to get the ones that are new or changed (array_diff_key would narrow that down to just the new ones). I will leave the rest to your imagination.
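
Here is a rough sketch of that alternative. The variable names and the short hash strings are made up for illustration.

```php
<?php
// Made-up short hashes standing in for the md5 values.
$prev = array(88 => 'aaa', 87 => 'bbb', 85 => 'ccc');
$curr = array(89 => 'ddd', 88 => 'aaa', 87 => 'eee');

$unchanged    = array_intersect_assoc($curr, $prev); // same id, same hash
$newOrChanged = array_diff_assoc($curr, $prev);      // id missing from $prev, or hash differs
$new          = array_diff_key($curr, $prev);        // id missing from $prev
$changed      = array_diff_key($newOrChanged, $new); // id in both, hash differs
$deleted      = array_diff_key($prev, $curr);        // id missing from $curr

// show only the ids in each group
print_r(array_map('array_keys', compact('unchanged', 'new', 'changed', 'deleted')));
```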

I hope this helps you and I look forward to your comments.

Happy Coding!

Jayesh
