
JojoDiff - diff utility for binary files

Copyright © 2002-2011 Joris Heirbaut
This software is hosted by: <http://sourceforge.net>


    1. Purpose

    *JDIFF * is a program that outputs the differences between two
    (binary) files.
    *JPTCH* can then be used to reconstruct the second file from the
    first file.
    For example:
      o *jdiff* archive0000.tar archive0001.tar archive0001.jdf
      o *jptch* archive0000.tar archive0001.jdf archive0001b.tar
    will create a file archive0001b.tar which is identical to
    archive0001.tar.

    Possible applications include:
      o incremental backups,
      o synchronising files between two computers over a slow network
        (see *JSYNC*). 

    *JDIFF* tries to find a minimal set of differences between two files
    using a heuristic algorithm with constant space and linear time
    complexity. This means that accuracy is traded over speed. *JDIFF*
    will therefore, in general, not always find the smallest set of
    differences, but will try to be fast and will use a fixed amount of
    memory.

    JDIFF does not compress the generated patchfile. It is recommended
    to do so with any compression tool you like. See below for an
    example using ZIP.

    Download these utilities from the Jojo's Binary Diff Download Page
    <http://sourceforge.net/projects/jojodiff/>.


    2. Version and history

    The current version of this utility is bèta 0.8 dating from
    September 2011. The modification history is as follows:
        v0.1 	June 2002 	Insert/delete algorithm.
        v0.2a 	June 2002 	Optimized patch files.
        v0.2b 	July 2002 	Bugfix on code-length of 252.
        v0.2c 	July 2002 	Bugfix on divide-by-zero in verbose mode.
        v0.3a 	July 2002 	Copy/insert algorithm.
        v0.4a 	September 2002 	Select "best" of multiple matches.
        v0.4b 	October 2002 	Optimize matches.
        v0.4c 	January 2003 	Rewrote selection algorithm between
                                multiple matches.
        v0.6 	April 2005 	Support files larger than 2GB.
        v0.7 	November 2009 	Optimizations for files larger than 2GB.
        v0.8 	September 2011 	Conversion to C++ classes that should be
        easier to reuse.


    3. Installation

    On Windows systems:
      o Compiled executables are within the "win32" directory. You can
        run them from a command prompt. 

    On GCC/Linux systems:
      o Compiled ELF binaries are within the "linux" directory.
      o You may also compile the source by running "make" within the
        "src" directory.
      o Copy the resulting binaries to your /usr/local/bin.
      o Within the bash directory, you can find an example BASH script,
        *JSYNC*, which I use for synchronizing files between two
        computers connected over a slow network. 


    4. Usage

*jdiff* [options] original_file new_file [output_file]

    *Options:*
        -v 	Verbose (greeting, results and tips).
        -vv 	Verbose (debug info).
        -h 	Help (this text).
        -l 	List byte by byte (ascii output).
        -lr 	List groups of bytes (ascii output).
        -b 	Try to be better (using more memory).
        -f 	Try to be faster: using less memory, no out of buffer compares.
        -ff 	Try to be faster: no out of buffer compares, no prescanning.
        -m size 	Size (in kB) for look-ahead buffers (default 128).
        -bs size 	Block size (in bytes) for reading from files (default
        4096).
        -s size 	Number of samples in mega (default 8 mega samples).

    *Principles:*
        *JDIFF* tries to find equal regions between two binary files
        using a heuristic hash algorithm and outputs the differences
        between both files. Heuristics are generally used for improving
        performance and memory usage, at the cost of accuracy.
        Therefore, this program may not find a minimal set of
        differences between files.
    *Notes:*
      o Options -m and -s should be used after -b, -f or -ff.
      o Accuracy may be improved by increasing the number of samples.
      o Speed may be increased with option -f or -ff (lower accuracy).
      o Sample size is always lowered to the largest n-bit prime (n < 32)
      o Original and new files must be random access files.
      o Output is sent to standard output if output file is missing.
    *Important:*
        Do not use jdiff directly on compressed files, such as zip,
        gzip, rar, because compression programs tend to increase the
        difference between files ! Instead use jdiff on uncompressed
        archives, such as tar, cpio or zip -0, and then compress the
        files afterwards, including the jdiff patch file. Afterwards, do
        not forget to uncompress the files before using jpatch. For
        example:
        *zip* -0 archive0000.zip mydir/* 	put mydir in an archive
        *zip* -0 archive0001.zip mydir/* 	some time later
        *jdiff* archive0000.zip archive0001.zip archive0001.jdf
        difference between archives
        *zip* -9 archive0001.jdf.zip 	send compressed difference file to
        a friend
        *zip* -9 archive0000.zip.zip archive0000.zip 	compress the
        archive before sending to a friend
        *...*	
        *unzip* archive0000.zip.zip 	restore uncompressed zip file
        *unzip* archive0000.jdf.zip 	restore uncompressed jdf file
        *jpatch* archive0000.zip archive0001.jdf archive0001b.zip
        recreate archive001.zip
        *unzip* archive0001b.zip	restore mydir

        You may also replace zip -0 by tar and zip -9 by gzip, or any
        other archiving and/or compression utility you like. 

*jpatch* [options] original_file patch_file [output_file]

    *Options:*
        -v 	Verbose (greeting, results and tips).
        -vv 	Verbose (debug info).
        -vvv 	Verbose (more debug info).
        -h 	Help (this text).

    *Principles:*
        *JPATCH* reapplies a diff file, generated by jdiff, to the
        original file, restoring the new file. 


    5. Contacts and remarks

    Author: 	Joris Heirbaut
    Contact me via sourceforge <http://sourceforge.net/projects/jojodiff>

    If you like this program, please let me know ! If you reuse the
    source code for your personal (open source) applications, it would
    be great to let me know too.


        6. Acknowledgements

    Earlier versions of this software have been developed within the
    Cygwin/GNU environment. More recently, development has been done in
    Eclipse/CDT using GCC and MinGW/GCC. 
