1. 2016
    Apr
    28

    Compare Difference Between Arrays

    I was recently asked a question that got me looking into how PHP computes the difference between arrays. This can be done with any of the following functions.

    array_diff, which computes the difference of arrays by comparing values.
    array_diff_key, which computes the difference of arrays using keys for comparison.
    array_diff_assoc, which computes the difference of arrays with an additional index check.
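
    To make the distinction concrete, here is a small sketch that runs the three built-in functions on the same input. The sample arrays and variable names are made up for illustration.

    ```php
    <?php
    // Illustrative data only: same key 'a' with different values,
    // and the value 2 stored under different keys.
    $first  = array('a' => 1, 'b' => 2, 'c' => 3);
    $second = array('a' => 9, 'x' => 2);

    // Values only: 2 also occurs in $second, so 'b' is dropped.
    $by_value = array_diff($first, $second);       // array('a' => 1, 'c' => 3)

    // Keys only: key 'a' also occurs in $second, so 'a' is dropped.
    $by_key = array_diff_key($first, $second);     // array('b' => 2, 'c' => 3)

    // Key and value must both match, and no pair does, so nothing is dropped.
    $by_both = array_diff_assoc($first, $second);  // array('a' => 1, 'b' => 2, 'c' => 3)

    var_export(array($by_value, $by_key, $by_both));
    ```

    In all three cases the keys of the first array are preserved in the result.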

    Alternative ARRAY_DIFF function

    Surprisingly, it took me a while to settle on a solution. Since the arrays you are comparing can get very large, it needs to be really fast. Here is my solution to the problem.

    
    
    function array_diffs() {
        $count = func_num_args();
        if($count < 2) {
            trigger_error('Must provide at least 2 arrays for comparison.');
            return array(); // nothing to compare against
        }
        if(!is_array(func_get_arg(0))) {
            trigger_error('Parameters must be passed as arrays.');
            return array();
        }
        $out = func_get_arg(0);
        // Resolve the comparison issue: floats (and integer/float pairs) are
        // compared with a small tolerance, everything else with a loose ==.
        $equal = function($v, $c) {
            $numbers = (is_int($v) || is_float($v)) && (is_int($c) || is_float($c));
            if($numbers && (is_float($v) || is_float($c))) {
                return abs($v - $c) < 0.00001;
            }
            return $v == $c;
        };
        for($i = 1; $i < $count; $i++) {
            $compare = func_get_arg($i);
            if(!is_array($compare)) {
                trigger_error('Parameters must be passed as arrays.');
                continue; // skip arguments that cannot be iterated
            }
            foreach($compare as $compare_value) {
                // Only the items that are still in $out need to be checked.
                foreach($out as $key => $value) {
                    if($equal($value, $compare_value)) {
                        unset($out[$key]);
                    }
                }
            }
        }
        return $out;
    }
    
    

    There are a few things to keep in mind when using this function.

    The first parameter is the array being checked; the other parameters are the arrays it is compared against.
    The arrays passed as parameters can be multidimensional. If one of the items in the array you are comparing is itself an array, the function checks whether the same item occurs in any of the other arrays.

    Since the result you are looking for is the items in the first array that do not occur in the other arrays, it's simply a process of elimination.

    
    
    $results = array_diffs(array('a','b','c','g','f'), array('b','o','i', 'p'), array('a','i','k','m')); //array(2=>'c', 3=>'g', 4=>'f')
    $results = array_diffs(array('a','b',24=>130,2=>2,'c','g','f'), array('b','o','i',24=>100,'p'), array('a',2=>2,'i','k','m')); //array(24=>130, 25=>'c', 26=>'g', 27=>'f')
    $results = array_diffs(array('a'=>'s','b','c','g','f'), array('b','o','i','p'), array('a'=>'m','i','k','m')); //array('a'=>'s', 1=>'c', 2=>'g', 3=>'f')
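
    The "resolve comparison issue" step in the function exists because floating point values that should be equal often differ by a tiny rounding error. The sketch below isolates that idea. The helper name nearly_equal is hypothetical; the tolerance of 0.00001 is the one used in the function above.

    ```php
    <?php
    // Hypothetical helper: treats two numbers as equal when they differ
    // by less than a small tolerance.
    function nearly_equal($a, $b, $epsilon = 0.00001)
    {
        return abs($a - $b) < $epsilon;
    }

    $sum = 0.1 + 0.2;

    var_dump($sum == 0.3);              // bool(false): rounding error breaks ==
    var_dump(nearly_equal($sum, 0.3));  // bool(true)
    var_dump(nearly_equal(2, 2.0));     // bool(true): integer/float pairs work too
    ```

    The trade-off is that two values closer together than the tolerance can no longer be told apart, so the tolerance should be smaller than any difference that matters in your data.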
    
    

    Going Even Further

    Now what if you would like to go a step further? What if you would like to compare the differences between all the arrays, that is, find the items in any of the arrays that have no other occurrence? Here you are doing a check on all the parameter arrays.
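
    The underlying idea is to count how often each value occurs across all the arrays and keep the values that occur exactly once. Here is a minimal sketch of that idea, under the assumption that the arrays hold only strings and integers (the only types array_count_values accepts) and that PHP 5.6 or later is available for the variadic syntax. The function name values_seen_once is hypothetical.

    ```php
    <?php
    // Hypothetical helper: returns the values that occur exactly once
    // across all of the given arrays. Keys are ignored.
    function values_seen_once(array ...$arrays)
    {
        // array_values() drops the keys so that string keys cannot
        // overwrite each other when the arrays are merged.
        $merged = array_merge(...array_map('array_values', $arrays));
        $counts = array_count_values($merged);

        $once = array();
        foreach ($counts as $value => $count) {
            if ($count === 1) {
                // Note: numeric strings come back as integers here,
                // because they were used as array keys in $counts.
                $once[] = $value;
            }
        }
        return $once;
    }

    $result = values_seen_once(
        array('a', 'b', 'c', 'g', 'f'),
        array('b', 'o', 'i', 'p'),
        array('a', 'i', 'k', 'm')
    );
    var_export($result); // 'c', 'g', 'f', 'o', 'p', 'k', 'm'
    ```

    This sketch does not handle floats, nested arrays or key-aware comparison, which is where most of the extra code in the full function comes from.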

    
    
    function array_distinct() {
        $count = func_num_args();
        if($count < 2) {
            trigger_error('Must provide at least 2 arrays for comparison.');
        }
        $check = array();
        $out = array();
     	//resolve comparison issue
        $func = function($a, $b) {
    		$dbl = function($i, $d) {
    			$e = 0.00001;
    			if(abs($i-$d) < $e) {
    				return true;
    			}
    			return false;
    		};
    		if((gettype($a) == 'integer' && gettype($b['value']) == 'double') || (gettype($a) == 'double' && gettype($b['value']) == 'integer')) {
    			if((gettype($a) == 'integer' && $dbl((double) $a, $b['value'])) || (gettype($b['value']) == 'integer' && $dbl((double) $b['value'], $a))) {
    				return true;
    			}
    		} elseif((gettype($a) == 'double') && (gettype($b['value']) == 'double')) {
    			return $dbl($a,$b['value']);
    		} elseif($a == $b['value']) {
    			return true;
    		}
    		return false;
    	};
        for($i = 0; $i < $count; $i++) {
            if(!is_array(func_get_arg($i))) {
                trigger_error('Parameters must be passed as arrays.');
            }
            foreach(func_get_arg($i) as $key => $value) {
                if(is_numeric($key) && is_string($value)) {
                    if(array_key_exists($value, $check) && $func($value, $check[$value])) {
                        $check[$value]['count'] = $check[$value]['count'] + 1;
                    } else {
                        $check[$value]['value'] = $value;
                        $check[$value]['count'] = 1;
                    }
                } elseif(is_numeric($key) && (is_bool($value) || is_null($value) || is_numeric($value) || is_object($value) || is_resource($value))) {
    				$update = false;
    				foreach($check as $check_key => $check_value) {
    					if(is_numeric($key) && (is_bool($check_value['value']) || is_null($check_value['value']) || is_numeric($check_value['value']) || is_object($check_value['value']) || is_resource($check_value['value'])) && $func($value, $check_value)) {
    						$update = true;
    						$check[$check_key]['count'] = $check[$check_key]['count'] + 1;
    					} 
    				}
    				if(!$update) {
    					$check[] = array('value' => $value, 'count' => 1);
    				}
                } else {
                    if(array_key_exists($key, $check) && $func($value, $check[$key])) {
                        $check[$key]['count'] = $check[$key]['count'] + 1;
                    } else {
                        $check[$key]['value'] = $value;
                        $check[$key]['count'] = 1;
                    }
                }
            }
        }
    	$index = 0; 
    	foreach($check as $check_key => $check_value) { 
    		if($check_value['count'] == 1) { 
    			for ($i = 0; $i < $count; $i++) {
    				foreach(func_get_arg($i) as $key => $value) {
    					if(is_numeric($key) && is_string($value)) {
    						if($value == (string) $check_key) {
    							$out[$index] = $value; 
    							$index++;
    						}
    					} elseif(is_numeric($key) && (is_bool($value) || is_null($value) || is_numeric($value) || is_object($value) || is_resource($value))) { 
    						if(is_numeric($key) && (is_bool($check_value['value']) || is_null($check_value['value']) || is_numeric($check_value['value']) || is_object($check_value['value']) || is_resource($check_value['value']))) {
    							if($check_value['value'] == $value) { 
    								if(is_numeric($value)) { 
    									$out[$key] = $value;
    									if(!array_key_exists(($key+1), $out)) {
    										$index = $key;
    										$index++;
    									}
    								} else {
    									$out[$index] = $value; 
    									$index++;
    								}
    							}
    						}
    					} else {
    						if($key == $check_key) {
    							$out[$key] = $value; 
    						}
    					}
    				}
    			}
    		}
    	}
        return $out;
    }
    
    

    Keep in mind that you are no longer checking just the first array against the others, but every array in your parameters against all the rest. Your results can look like this.

    
    
    $result = array_distinct(array('a','b','c','g','f'), array('b','o','i', 'p'), array('a','i','k','m')); //array('c','g','f','o','p','k','m')
    $result = array_distinct(array('a','b',24=>130,2=>2,'c','g','f'), array('b','o','i',24=>100,'p'), array('a',2=>2,'i','k','m')); //array(24=>100,25=>'c',26=>'g',27=>'f',28=>'o',29=>'p',30=>'k',31=>'m')
    $result = array_distinct(array('a'=>'s','b','c','g','f'), array('b','o','i','p'), array('a'=>'m','i','k','m')); //array('a'=>'m','c','g','f','o','p','k','m')
    
    

    Feedback

    Thank you all, I very much appreciate your feedback on this post. I understand that the expected result for a solution that calculates the difference between all the arrays should not be a merged array, but a list of arrays with the difference for each one. Here is my update, taking this expected result into consideration.

    
    
    function array_distinct() {
        $count = func_num_args();
        if($count < 2) {
            trigger_error('Must provide at least 2 arrays for comparison.');
        }
        $check = array();
        $out = array();
     	//resolve comparison issue
        $func = function($a, $b) {
    		$dbl = function($i, $d) {
    			$e = 0.00001;
    			if(abs($i-$d) < $e) {
    				return true;
    			}
    			return false;
    		};
    		if((gettype($a) == 'integer' && gettype($b['value']) == 'double') || (gettype($a) == 'double' && gettype($b['value']) == 'integer')) {
    			if((gettype($a) == 'integer' && $dbl((double) $a, $b['value'])) || (gettype($b['value']) == 'integer' && $dbl((double) $b['value'], $a))) {
    				return true;
    			}
    		} elseif((gettype($a) == 'double') && (gettype($b['value']) == 'double')) {
    			return $dbl($a,$b['value']);
    		} elseif($a == $b['value']) {
    			return true;
    		}
    		return false;
    	};
        for($i = 0; $i < $count; $i++) {
            if(!is_array(func_get_arg($i))) {
                trigger_error('Parameters must be passed as arrays.');
            }
            foreach(func_get_arg($i) as $key => $value) {
                if(is_numeric($key) && is_string($value)) {
                    if(array_key_exists($value, $check) && $func($value, $check[$value])) {
                        $check[$value]['count'] = $check[$value]['count'] + 1;
                    } else {
                        $check[$value]['value'] = $value;
                        $check[$value]['count'] = 1;
                    }
                } elseif(is_numeric($key) && (is_bool($value) || is_null($value) || is_numeric($value) || is_object($value) || is_resource($value))) {
    				$update = false;
    				foreach($check as $check_key => $check_value) {
    					if(is_numeric($key) && (is_bool($check_value['value']) || is_null($check_value['value']) || is_numeric($check_value['value']) || is_object($check_value['value']) || is_resource($check_value['value'])) && $func($value, $check_value)) {
    						$update = true;
    						$check[$check_key]['count'] = $check[$check_key]['count'] + 1;
    					} 
    				}
    				if(!$update) {
    					$check[] = array('value' => $value, 'count' => 1);
    				}
                } else {
                    if(array_key_exists($key, $check) && $func($value, $check[$key])) {
                        $check[$key]['count'] = $check[$key]['count'] + 1;
                    } else {
                        $check[$key]['value'] = $value;
                        $check[$key]['count'] = 1;
                    }
                }
            }
        }
        foreach($check as $check_key => $check_value) {
            if($check_value['count'] == 1) {
               for ($i = 0; $i < $count; $i++) {
                    foreach(func_get_arg($i) as $key => $value) {
                            if(is_numeric($key) && is_string($value) && ($value == (string) $check_key)) {
                                $out[$i][$key] = $value;
                            } elseif(is_numeric($key) && ($check_value['value'] == $value)) {
                                $out[$i][$key] = $value;
                            } elseif(is_string($key) && ($check_value['value'] == $value)) {
                                $out[$i][$key] = $value;
                            }
                    }
                }
            }
        }
        return $out;
    }
    
    

    Here is what your results can look like.

    
    
    $result = array_distinct(array('a','b','c','g','f'), array('b','o','i', 'p'), array('a','i','k','m')); //array(0=>array('c','g','f'),1=>array('o','p'),2=>array('k','m'))
    $result = array_distinct(array('a','b',24=>130,2=>2,'c','g','f'), array('b','o','i',24=>100,'p'), array('a',2=>2,'i','k','m')); //array(0=>array(24=>130,25=>'c',26=>'g',27=>'f'),1=>array(24=>100,1=>'o',25=>'p'),2=>array('k','m'))
    $result = array_distinct(array('a'=>'s','b','c','g','f'), array('b','o','i','p'), array('a'=>'m','i','k','m')); //array(0=>array('a'=>'s','c','g','f'),2=>array('a'=>'m','k','m'),1=>array('o','p'))
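
    The grouping by source array can also be pictured with a much smaller sketch. It assumes the arrays hold only strings and integers and that PHP 5.6 or later is available; the function name distinct_per_array is hypothetical, and it compares values only, without the float tolerance or key handling of the full function.

    ```php
    <?php
    // Hypothetical helper: for each input array, keeps the items whose
    // value occurs nowhere else, indexed by the position of that array.
    function distinct_per_array(array ...$arrays)
    {
        // Count every value across all arrays, ignoring keys.
        $merged = array_merge(...array_map('array_values', $arrays));
        $counts = array_count_values($merged);

        $out = array();
        foreach ($arrays as $i => $array) {
            foreach ($array as $key => $value) {
                if ($counts[$value] === 1) {
                    $out[$i][$key] = $value; // original key is preserved
                }
            }
        }
        return $out;
    }

    $result = distinct_per_array(
        array('a', 'b', 'c', 'g', 'f'),
        array('b', 'o', 'i', 'p'),
        array('a', 'i', 'k', 'm')
    );
    var_export($result);
    // 0 => array(2 => 'c', 3 => 'g', 4 => 'f')
    // 1 => array(1 => 'o', 3 => 'p')
    // 2 => array(2 => 'k', 3 => 'm')
    ```

    An array that contributes no unique items simply has no entry in the result.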
    
    

  2. About Emeka Echeruo

    Emeka Echeruo

    I love sports (football, which I refuse to call soccer) and the outdoors, especially walks in the park. Software development is my passion; there is a beauty in creating something out of nothing but algebra that ends up becoming a part of a person's daily life. I love kids, dogs, nightlife and art because it finds you and moves you emotionally!
