Hex Editor¶
Overview¶
The hex editor was designed for reverse engineering 6502 machine code, initially for Atari 8-bit computers and later expanded to the Apple ][ and other 6502-based systems. Most other 8-bit processors are also supported, but because the author has no experience with them, they have not been extensively tested.
Opening a file to edit will present the main hex edit user interface, which shows several different views of the data. Editing is supported in the hex view, the character view, and the disassembly. There is also a bitmap view, but it is presently only for viewing, not editing.
Viewing Data¶
The various views can be scrolled independently, but there is only one cursor location. Clicking on a location in one view will move the other views to show the same location. Selections are analogous; see below.
Segments¶
Binary data is parsed using code that started life as part of Omnivore but I spun it out because it’s useful as a standalone library: atrcopy. It knows a lot about Atari 8-bit files and disk images, knows some stuff about Apple ][ files and disk images, and knows almost nothing about anything else (yet). Atrcopy thinks of binary data in terms of segments, where a segment is simply a portion of the disk image.
The interesting feature of atrcopy is due to the use of numpy, and it’s this: segments can provide views of the same data in different orders. And, changing a byte in one segment also changes the value in other segments that contain that byte because there is only one copy of the data.
This turns out to be super useful. For instance, the first segment that appears in Omnivore’s list of segments will contain all the data from the disk image, in the order that the bytes appear in the file. This may or may not mean much depending on the format of the image. As an example, the catalog of an Apple DOS 3.3 disk is stored in sectors that increment downwards, so the catalog appears backwards (-ish; it’s complicated). So atrcopy goes further and breaks this disk image segment into smaller segments depending on the type of the file. In the catalog example above, it creates another segment that displays the catalog in the correct order. Changing a byte in either of those segments will change the value in the other, because it’s really the same value. It’s just two different looks into the same data.
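The shared-data idea can be sketched in a few lines of numpy. This is not atrcopy's actual API: the `Segment` class, its `order` array, and the 16-byte "disk image" are all hypothetical, made up for illustration. Because numpy fancy indexing returns a copy rather than a view, the sketch keeps an index array and routes every read and write through it to the single underlying array.

```python
import numpy as np

class Segment:
    """Hypothetical segment: an ordering of indexes into one shared byte array."""

    def __init__(self, raw, order):
        self.raw = raw                  # shared storage, never copied
        self.order = np.asarray(order)  # segment position -> position in raw

    def __len__(self):
        return len(self.order)

    def __getitem__(self, i):
        return self.raw[self.order[i]]

    def __setitem__(self, i, value):
        # Writes land in the shared array, so every segment sees them
        self.raw[self.order[i]] = value

raw = np.arange(16, dtype=np.uint8)     # stand-in for a disk image
whole = Segment(raw, np.arange(16))     # bytes in file order
# Two 4-byte "sectors" stored in descending order, presented in reading order
catalog = Segment(raw, [12, 13, 14, 15, 8, 9, 10, 11])

catalog[0] = 0xFF                       # edit through the reordered segment
print(hex(whole[12]))                   # the same byte, seen in file order
```

Position 0 of `catalog` and position 12 of `whole` are the same byte, so the edit shows up in both.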
Editing Data¶
Hex data can be edited by:
clicking on a cell and changing the hex data
selecting a region (or multiple regions; see Selections below) and using one of the operations in the Bytes Menu
cutting and pasting hex data from elsewhere in the file
cutting and pasting hex data from another file edited in a different tab or window
pasting in data from an external application
Character data can be edited by clicking on a character in the character map to set the cursor and then typing. Inverse text is supported for Atari modes. Also supported are all the selection and cut/paste methods as above.
Baseline Data¶
Omnivore automatically highlights changes to each segment as compared to the state of the data when first loaded.
Optionally, you can specify a baseline difference file to compare to a different set of data, like a canonical or reference image. This is useful to compare changes to some known state over several Omnivore editing sessions, as Omnivore will remember the path to the reference image and will reload it when the disk image is edited in the future.
As data is changed, the changes as compared to the baseline will be displayed in red.
By default, baseline data difference highlighting is turned on; you can change this with the Show Baseline Differences menu item.
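Conceptually, finding the bytes to highlight is an element-wise comparison against the baseline. The following is a minimal sketch of that idea with numpy, not Omnivore's own code; the array contents and variable names are invented for the example.

```python
import numpy as np

# Hypothetical baseline: the data as first loaded (or a reference image)
baseline = np.array([0xA9, 0x00, 0x8D, 0x00, 0xD4], dtype=np.uint8)

# Working copy after two edits
current = baseline.copy()
current[1] = 0x22
current[4] = 0xD0

changed = current != baseline            # boolean mask, True where bytes differ
changed_offsets = np.flatnonzero(changed)
print(changed_offsets)                   # offsets a view would draw in red
```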
Selections¶
Left clicking on a byte in any of the data views (hex, char, disassembly, bitmap, etc.) and dragging the mouse with the button held down will start a new selection, finished by releasing the mouse button. The selection will be shown in all views of the data, scrolling each view independently if necessary.
The selection may be extended by shift-left-click, extending from either the beginning or the end of the selection as appropriate.
Multiple selections are supported by holding the Control key (Command on Mac) while clicking and dragging as above. Extending a selection when using multiple selection is not currently supported.
Find¶
The data is searchable in multiple ways. Starting any search will display a search bar at the bottom of the main window. The basic search bar, available with the Find menu item, tries to be flexible and will show matches in any data view using an appropriate conversion for that view. For instance, the text string “00” in the search bar will find values of 0 in the hex view, strings of “00” in the character view, labels that have “00” anywhere in their text, “00” as an operand in the disassembly, or anything that has “00” in a comment.
The Find Next and Find Prev menu items (or keyboard shortcuts) will traverse the list of matches.
A more complicated search can be performed using the Find Using Expression menu item, which supports ranges of addresses or specific data values as search parameters using arbitrary boolean expressions.
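The "appropriate conversion for each view" idea can be illustrated with two of the views. In this sketch, the function names are hypothetical and the character view is assumed to be plain ASCII (Atari character modes would need a different mapping). The same search text is tried both as hex digits and as characters.

```python
def find_all(data: bytes, pattern: bytes) -> list:
    """Return every offset at which pattern occurs in data."""
    hits = []
    if not pattern:
        return hits
    pos = data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + 1)
    return hits

def find_flexible(data: bytes, text: str) -> dict:
    """Search for text as hex byte values and as ASCII characters."""
    try:
        hex_pattern = bytes.fromhex(text)   # "00" -> b"\x00"
    except ValueError:
        hex_pattern = b""                   # not valid hex: no hex matches
    char_pattern = text.encode("ascii", errors="ignore")
    return {
        "hex": find_all(data, hex_pattern),
        "char": find_all(data, char_pattern),
    }

data = bytes([0x00, 0x30, 0x30, 0x41, 0x00])
results = find_flexible(data, "00")
print(results)
```

Here the hex interpretation matches the two zero bytes, while the character interpretation matches the pair of bytes that spell "00" in ASCII.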
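The expression syntax Omnivore accepts is not described here, so the following only illustrates the underlying idea: combining an address range and a data condition into one boolean mask with numpy. The names `addr`, `data`, and `origin`, and the sample values, are assumptions made for this sketch.

```python
import numpy as np

origin = 0x2000                                   # hypothetical segment origin
data = np.array([0x10, 0x9B, 0x80, 0x7F, 0xFF, 0x9B], dtype=np.uint8)
addr = np.arange(origin, origin + len(data))      # address of each byte

# "addresses 0x2001 through 0x2004 whose value has the high bit set"
mask = (addr >= 0x2001) & (addr <= 0x2004) & (data >= 0x80)
matches = addr[mask]
print([hex(a) for a in matches])
```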
Comments¶
Comments are a hugely important part of reverse engineering, because by definition the original source has been lost (or was never available). As you figure things out, it’s important to write things down. Omnivore supports adding a comment to any byte in the file, and it will appear in any segment that views that byte.
In the sidebar is a big ol’ list of comments, and selecting one of the comments will move the data views to display the byte that is referenced by that comment. Because there may be multiple views of the same byte, the segment shown for each comment in the comments list is the first segment that contains the commented byte.
Note that segments must have a defined origin for the segment to be considered as the primary for that comment.
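One way to picture a comment that follows a byte across segments is to key comments by the byte's offset in the underlying file rather than by its position in any one segment. This is a sketch under that assumption, not a description of Omnivore's internals; the function name and the sample segments are hypothetical.

```python
# Comments keyed by offset in the underlying file (an assumption of this sketch)
comments = {12: "start of catalog"}

def comments_in_segment(order, comments):
    """Map segment positions to comment text, given the segment's file offsets."""
    return {pos: comments[raw] for pos, raw in enumerate(order) if raw in comments}

whole = list(range(16))                      # file order
catalog = [12, 13, 14, 15, 8, 9, 10, 11]     # same bytes, reordered

print(comments_in_segment(whole, comments))    # comment appears at position 12
print(comments_in_segment(catalog, comments))  # same comment, at position 0
```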
Disassembler¶
Omnivore started out as a reverse engineering tool for Atari 8-bit computers, which use the 6502 processor. After developing for a while, I found a Python disassembler called udis that supports multiple processors. Using it, Omnivore can disassemble (and assemble! See below) code for:
6502
65816
65c02
6800
6809
6811
8051
8080
z80
but the 6502 is the only processor I have direct knowledge of, and so the only one I’ve tested thoroughly. Bug reports (and patches!) for the other processors are welcome.
Mini-Assembler¶
The disassembly can be edited using a simple mini-assembler; clicking on an opcode provides a text entry box to change the command. The mini-assembler supports all CPU types, not just 6502.
Labels¶
Labels can be set on an address, and the label will be reflected in the disassembly code. Also, memory mapping files can be supplied that automatically label operating system locations.
Data Regions¶
To support reverse engineering, regions can be marked as data, code, ANTIC display lists, and other types. Each region type is highlighted in a different style and changes how the disassembly is displayed.
Static Tracing of Disassembly¶
To help identify regions, static tracing can be used. Turning on static tracing assumes that every byte is data and shows temporary highlights over the entire segment. Starting a trace at an address causes Omnivore to follow the path of execution until it hits a return, break or bad instruction, marking every byte that it traverses as code. It will also follow both code paths at any branch. This is not an emulator, however, so it is not able to tell if there is any self-modifying code. Any blocks of code that aren’t reached will require additional traces. When tracing is finished, the results can be applied to the segment to mark as data or code.
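The tracing described above is commonly called recursive-traversal disassembly. The following is a minimal sketch of that technique, not Omnivore's or udis's implementation: the `OPCODES` table is a tiny hand-picked subset of 6502 instructions, and the `trace` function and sample program are made up for the example. Every byte starts as data; each instruction reached is marked as code; branches and subroutine calls queue both paths; and a path ends at a return, a break, or an unknown opcode.

```python
# Hand-picked subset of 6502 opcodes: opcode -> (length in bytes, kind)
OPCODES = {
    0xA9: (2, "plain"),   # LDA #imm
    0xA2: (2, "plain"),   # LDX #imm
    0x8D: (3, "plain"),   # STA abs
    0xCA: (1, "plain"),   # DEX
    0xEA: (1, "plain"),   # NOP
    0xD0: (2, "branch"),  # BNE rel
    0xF0: (2, "branch"),  # BEQ rel
    0x4C: (3, "jump"),    # JMP abs
    0x20: (3, "call"),    # JSR abs
    0x60: (1, "stop"),    # RTS
    0x00: (1, "stop"),    # BRK
}

def trace(data: bytes, origin: int, start: int) -> list:
    """Return one flag per byte: True if reached as code from start."""
    is_code = [False] * len(data)        # assume everything is data
    pending = [start]                    # addresses still to be followed
    while pending:
        pc = pending.pop()
        while True:
            i = pc - origin
            if not 0 <= i < len(data) or is_code[i]:
                break                    # outside the segment or already traced
            entry = OPCODES.get(data[i])
            if entry is None:
                break                    # bad (unknown) instruction ends the path
            length, kind = entry
            if i + length > len(data):
                break                    # instruction runs off the end
            for j in range(i, i + length):
                is_code[j] = True
            if kind == "stop":
                break
            if kind == "branch":
                offset = data[i + 1]
                if offset >= 0x80:       # relative offsets are signed 8-bit
                    offset -= 0x100
                pending.append(pc + length + offset)
            elif kind in ("jump", "call"):
                pending.append(data[i + 1] | (data[i + 2] << 8))
                if kind == "jump":
                    break                # JMP never falls through
            pc += length
    return is_code

# LDX #3 / DEX / BNE back to DEX / RTS, followed by two bytes of data
program = bytes([0xA2, 0x03, 0xCA, 0xD0, 0xFD, 0x60, 0x41, 0x42])
flags = trace(program, origin=0x0600, start=0x0600)
print(flags)
```

The two trailing bytes are never reached, so they stay marked as data. As with the real feature, a static walk like this cannot see self-modifying code or computed jumps, which is why unreached blocks need additional traces from other start addresses.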