Simple diff implementation in PHP

Array diff implementation

This method works, but it is inefficient: words are compared by position, so once the two texts fall out of step, even words common to both are lumped in together with the changed words. It was done this way because I don't yet fully grasp LCS.
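A minimal sketch of why that happens, using made-up sample sentences (they are not from the example below). `array_intersect_assoc()` only keeps a word when the same word sits at the same index in both arrays, so a single inserted word shifts everything after it out of alignment:

```php
<?php
// Hypothetical sample input: one word ("very") is inserted near the start.
$old = preg_split('/\s+/', 'the quick brown fox');
$new = preg_split('/\s+/', 'the very quick brown fox');

// Keeps a word only if the same word is at the same index in both arrays.
$same_position = array_intersect_assoc($old, $new);

// Ignores the index: keeps a word if it occurs anywhere in the other array.
$same_anywhere = array_intersect($old, $new);

print_r($same_position); // only index 0 ("the") survives
print_r($same_anywhere); // all four words of $old survive
```

Every word after the insertion is still present in both texts, but the position-based comparison treats all of them as changed.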

Example Output

Input

$str1 = 'It\'s very hard to try and do this diff properly. This is text. 012345.';
$str2 = 'It\'s very hard for me to try and code this diff function properly. This is different. 012345.';

Output

It's very hard <del>to try and do this diff properly. This is text. 012345.</del><ins>for me to try and code this diff function properly. This is different. 012345.</ins>

Code


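/* Folds every value whose key directly follows another key into that left-hand neighbour, joined by $glue */
function merge_adjacent(array $array, $glue) {
    /* Walk the keys from highest to lowest so each run collapses into its first key */
    $keys = array_keys($array);
    rsort($keys);
    foreach ($keys as $key) {
        $next = $key - 1;
        if (isset($array[$next])) {
            $array[$next] = $array[$next].$glue.$array[$key];
            unset($array[$key]);
        }
    }
    return $array;
}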

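/* Compares $new against $old word by word and position by position; removed runs are wrapped in <del>, added runs in <ins> */
function check($old, $new) {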
    $old_arr = preg_split('/\s+/', $old);
    $new_arr = preg_split('/\s+/', $new);
    $return = array_intersect_assoc($old_arr, $new_arr); /* Common words */

    /* merge_adjacent() will merge array values where the array keys are next to each other */
    $diff_old = merge_adjacent(array_diff_assoc($old_arr, $new_arr),' ');
    $diff_new = merge_adjacent(array_diff_assoc($new_arr, $old_arr), ' ');
    // $diff_old = array_diff_assoc($old_arr, $new_arr);
    // $diff_new = array_diff_assoc($new_arr, $old_arr);

    foreach ($diff_old as $index=>$word) {
        $add = "<del>$word</del>";
        $return[$index] = $add;
    }
    foreach ($diff_new as $index=>$word) {
        if (isset($return[$index])) {
            $return[$index] .= "<ins>$word</ins>";
        } else {
            $return[$index] = "<ins>$word</ins>";
        }
    }
    ksort($return);
    return implode(' ', $return);
}
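For comparison, here is a rough sketch of what an LCS-based version might look like. It assumes that LCS here means longest common subsequence, which is what diff tools are usually built on; the longest common substring problem listed under Resources is a related but different problem. The function name `lcs_word_diff()` is made up for this sketch and is not part of the code above.

```php
<?php
// Sketch only: lcs_word_diff() is a hypothetical name, not the function used on this page.
function lcs_word_diff($old, $new) {
    $a = preg_split('/\s+/', trim($old));
    $b = preg_split('/\s+/', trim($new));
    $n = count($a);
    $m = count($b);

    // $len[$i][$j] = length of the longest common subsequence
    // of the words $a[$i..] and $b[$j..]
    $len = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; $i--) {
        for ($j = $m - 1; $j >= 0; $j--) {
            $len[$i][$j] = ($a[$i] === $b[$j])
                ? $len[$i + 1][$j + 1] + 1
                : max($len[$i + 1][$j], $len[$i][$j + 1]);
        }
    }

    $out = array();
    $del = array();
    $ins = array();
    // Writes any pending run of deleted / inserted words to $out
    $flush = function () use (&$out, &$del, &$ins) {
        if ($del) { $out[] = '<del>'.implode(' ', $del).'</del>'; $del = array(); }
        if ($ins) { $out[] = '<ins>'.implode(' ', $ins).'</ins>'; $ins = array(); }
    };

    // Walk the table from the start: common words pass through,
    // everything else is collected as a deletion or an insertion.
    $i = 0;
    $j = 0;
    while ($i < $n && $j < $m) {
        if ($a[$i] === $b[$j]) {
            $flush();
            $out[] = $a[$i];
            $i++;
            $j++;
        } elseif ($len[$i + 1][$j] >= $len[$i][$j + 1]) {
            $del[] = $a[$i++];
        } else {
            $ins[] = $b[$j++];
        }
    }
    while ($i < $n) { $del[] = $a[$i++]; }
    while ($j < $m) { $ins[] = $b[$j++]; }
    $flush();

    return implode(' ', $out);
}

$str1 = 'It\'s very hard to try and do this diff properly. This is text. 012345.';
$str2 = 'It\'s very hard for me to try and code this diff function properly. This is different. 012345.';
echo lcs_word_diff($str1, $str2), "\n";
```

With the example input from above this prints `It's very hard <ins>for me</ins> to try and <del>do</del> <ins>code</ins> this diff <ins>function</ins> properly. This is <del>text.</del> <ins>different.</ins> 012345.`, so the common words stay outside the markup. The table takes time and memory proportional to the product of the two word counts, which is fine for short texts but would need a smarter approach for long ones.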

Resources


  1. Bitwise Operators @ PHP.net 

  2. Longest Common Substring problem @ Wikipedia 