Introduction

Earlier in the year I wrote a Lua implementation of xxd -i (it’s at the bottom under the “Embedding” section). I wrote this because a few platforms I was working with didn’t support xxd. I didn’t want to attempt compiling and installing it. I had Lua installed on these platforms so it was easier to write a Lua script to do the same thing. I only needed -i output so implementing all of xxd wasn’t necessary. Plus it was fun.

Since then I’ve run into another issue with xxd -i. Specifically, I have a generated header and I don’t have the original file it was created from. In this case it’s a PNG. xxd supports a -r option for reversing its output. However, it doesn’t support reversing files generated with the -i option…
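
For reference, the header that xxd -i produces is a C array definition followed by a length variable. Here is a rough Python sketch of that shape. The function name to_c_header is hypothetical, and the 12-bytes-per-row layout and whitespace only approximate what xxd emits, so treat this as an illustration of the format rather than a drop-in replacement.

```python
import re


def to_c_header(data, name):
    """Render bytes as a C array in the general style of xxd -i output."""
    # Assumption: the variable name comes from the file name, with any
    # character that is not valid in a C identifier replaced by '_'.
    var = re.sub(r'[^0-9A-Za-z_]', '_', name)
    lines = []
    # Assumption: 12 bytes per row, which approximates xxd's default layout.
    for i in range(0, len(data), 12):
        row = ', '.join('0x%02x' % b for b in bytearray(data[i:i + 12]))
        lines.append('  ' + row)
    body = ',\n'.join(lines)
    return ('unsigned char %s[] = {\n%s\n};\n'
            'unsigned int %s_len = %d;\n' % (var, body, var, len(data)))


# The first eight bytes of any PNG file, used here as sample input.
print(to_c_header(b'\x89PNG\r\n\x1a\n', 'image.png'))
```

Everything needed to rebuild the file is in the 0x.. values between the braces, which is what makes reversing the header practical.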

Solution

I wrote a script to reverse xxd -i output and turn it back into the original file. This time I used Python and not Lua. Python made it easier, and I didn’t need to run it on the platforms that didn’t support xxd (or Python). I really should have implementations in both languages; that might be a future project.

Here is the Python file for reversing the xxd -i output. I’ve also put both scripts on GitHub as Bin-Header.

header2bin.py

#!/usr/bin/env python

# The MIT License (MIT)
#
# Copyright (c) 2015 John Schember <john@nachtimwald.com>
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

import re
import struct
import sys

def main(args):
    '''
    Recreate the binary file from a C header that was generated
    with xxd -i. xxd itself cannot reverse its output when using
    the -i (C header file output) option.
    '''
    # Check arguments
    if '-h' in args or '--help' in args:
        print('Usage: %s infile outfile' % args[0])
        print('Reverse a C header generated by xxd -i back to a binary file.')
        return 1

    if len(args) != 3:
        print('%s invalid option' % args[0])
        print("Try '%s --help' for more information." % args[0])
        return 2

    # Setup our variables
    infile = args[1]
    outfile = args[2]
    indata = ''
    hexnums = []

    # Read the file as text so the string regexes below can be applied to it
    try:
        with open(infile, 'r') as f:
            indata = f.read()
    except Exception as e:
        print("Failed to read '%s': %s" % (infile, e))
        return 3

    # Prepare the file
    # Remove all newlines from the file so we can match as if it was one string
    indata = re.sub(r'[\r\n]+', '', indata)
    # Match the start and end pulling out the part with the hex numbers
    match = re.match(r'^unsigned char.*\[\].*=.*\{(.*)\};.*unsigned int.*$', indata)
    if not match:
        print('File does not have proper format')
        return 3
    # Be safe that we actually captured the hex numbers
    try:
        indata = match.group(1)
    except Exception as e:
        print('Hex data group not found')
        return 3

    # Pull out the hex numbers. We do this instead of a split
    # so we don't have to worry about any other formatting.
    pat = re.compile('0x[0-9a-fA-F]{2}')
    for h in re.findall(pat, indata):
        hexnums.append(h)

    # Be sure we found some data
    if len(hexnums) == 0:
        print('Could not find any hex number data')
        return 3

    # Write the numbers as binary to the output file
    try:
        with open(outfile, 'wb') as f:
            for h in hexnums:
                # Pack it into binary. xxd by default outputs as big endian.
                # There is an option in some implementations (-e) that will
                # output as little endian. This doesn't matter because we're
                # writing single bytes.
                f.write(struct.pack('B', int(h, 16)))
    except Exception as e:
        print("Failed to write '%s': %s" % (outfile, e))
        return 3

    return 0

if __name__ == '__main__':
    sys.exit(main(sys.argv))
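
The core of the script, finding the array body and converting each 0x.. token to a byte, can also be sketched as a small function that is easy to check without touching the filesystem. This is an illustrative variant and not part of header2bin.py. The name header_to_bytes is hypothetical, and the comparison against the _len value is an extra sanity check that assumes the header still contains the length line xxd -i writes.

```python
import re


def header_to_bytes(text):
    """Return the bytes encoded in xxd -i style C header text."""
    # Grab everything between the first '{' and the following '};'.
    match = re.search(r'\{(.*?)\};', text, re.DOTALL)
    if not match:
        raise ValueError('no C array initializer found')
    data = bytearray(int(h, 16)
                     for h in re.findall(r'0x[0-9a-fA-F]{2}', match.group(1)))
    # If a "<name>_len = N;" line is present, use it as a sanity check.
    declared = re.search(r'_len\s*=\s*(\d+)\s*;', text)
    if declared and int(declared.group(1)) != len(data):
        raise ValueError('length mismatch: header says %s, found %d'
                         % (declared.group(1), len(data)))
    return bytes(data)


sample = '''unsigned char image_png[] = {
  0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a
};
unsigned int image_png_len = 8;
'''
# The sample decodes back to the eight-byte PNG signature.
assert header_to_bytes(sample) == b'\x89PNG\r\n\x1a\n'
```

Checking the byte count against the declared length catches a header that was truncated or hand-edited, which the plain regex extraction would silently accept.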