Reverse engineering — every programmer’s dirty little secret
Every day I sit down at my workstation, evaluate the tasks assigned to me, and then instantly begin coding solutions to the problems I must solve. This is all done from memory of course, without the assistance of programmers who have trodden the same road before me. I do this without hesitation, and I still always manage to create an optimized and scalable program, invariably developing the perfect piece of software. If you believe this, I have some swampland east of the Mississippi I would like to sell you. Please email me for details.
The hard truth is that I frequently find myself in situations in which I must leverage the work of other programmers. That means borrowing code that was refined in the fires long before I arrived, and that will live on in some software long after I am gone. I have to analyze the structure and operation of this code, and reverse engineer it in order to strip out superfluous components and functionality. On occasion I pass off the entire charade as the direct result of my own ingenious approach. Then I feel guilty, as if this were my dirty little secret.
Reality eventually sets in, and I see other programmers doing exactly the same thing. Very rarely is the final product of a project the result of our efforts alone. The process of discovery in just about any software engineer’s day-to-day activities involves reconstitution. Think of it this way: it is powdered code, and I just add water, and the result is edible (and useful to me) in its final form. Unfortunately, the process can also produce a gelatinous mass that is not consumable by anyone. The following tips should help prevent this from happening.
There is a method to the madness
Especially when reverse engineering a framework or an inordinately complex piece of software, the first mistake programmers often make is to dive right into the code. You should never forget that before it became a finished product, it was at some juncture a series of class diagrams and wireframes. Unless you are merely looking for a code fragment, it will be helpful to document portions of the software, attempting to recreate the original functionality on paper. In addition, get access to the API and help documentation, since all of this should give you insight into the design decisions that were made.
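If the code you are studying happens to be Python, part of that paper outline can be drafted automatically. The following is only a minimal sketch of the idea, not a substitute for real design documents: `outline_module` is a hypothetical helper name, and the standard library's `textwrap` module stands in for whatever module you are actually reverse engineering.

```python
import inspect
import textwrap  # stand-in for the module being reverse engineered


def outline_module(module):
    """Return {class name: [public method names]} for classes defined in `module`.

    Hypothetical helper for illustration; it lists plain instance methods only.
    """
    outline = {}
    for name, cls in inspect.getmembers(module, inspect.isclass):
        if cls.__module__ != module.__name__:
            continue  # skip classes that were merely imported into the module
        methods = [
            method_name
            for method_name, _ in inspect.getmembers(cls, inspect.isfunction)
            if not method_name.startswith("_")
        ]
        outline[name] = methods
    return outline


# Print a rough, class-diagram-style outline to start the paper sketch from.
for class_name, methods in outline_module(textwrap).items():
    print(class_name)
    for method_name in methods:
        print("  -", method_name)
```

An outline like this tells you what exists, not why it exists, so it works best as a starting point to annotate by hand alongside the API and help documentation.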
Rip off with responsibility
Just getting it done is never the primary goal of reverse engineering. Instead of pulling out the code you need piecemeal, and then adding back to it haphazardly, get a sense for what the previous programmer was trying to accomplish. An immediate understanding of the logic is not always necessary, but knowing the intent is crucial. Without it, your edits may degrade performance or introduce a security threat, and you will find fixing bugs to be like playing a game of Pin the Tail on the Donkey. With it, you get to the joy of reverse engineering, where you begin to uncover “trade secrets” and learn from other experienced professionals.
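One way to write down your understanding of that intent before you change anything is to pin the borrowed code's current behaviour in a few small tests. The sketch below is an assumption-laden illustration rather than a recipe: `legacy_slugify` is a hypothetical stand-in for whatever function you lifted, and the test cases simply record what it does today.

```python
import re
import unittest


def legacy_slugify(title):
    """Hypothetical stand-in for a borrowed function you did not write."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


class LegacySlugifyBehaviour(unittest.TestCase):
    """Records what the borrowed code does now, before any edits are made."""

    def test_punctuation_and_spaces_become_single_hyphens(self):
        self.assertEqual(legacy_slugify("Hello, World!"), "hello-world")

    def test_leading_and_trailing_separators_are_dropped(self):
        self.assertEqual(legacy_slugify("--Already--Slugged--"), "already-slugged")

    def test_empty_title_yields_empty_slug(self):
        self.assertEqual(legacy_slugify(""), "")
```

Saved to a file, these can be run with `python -m unittest`. If a later edit makes one of them fail, you have either broken something or found a case where your reading of the original intent was wrong, and both are worth knowing early.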
Inherit the code, and inherit the mistakes
In all likelihood, when you inherit a bit of code from another programmer, you also inherit the mistakes made within that code. They may be subtle, or abundantly clear, but they are mistakes nonetheless. Know where to look for information on these shortcomings. If it is an open-source project that you are customizing, are there forums or Google Groups? Do some research as well, and try to pinpoint the areas of the application that other programmers complain about the most. If users complain about the interface, or about a piece of functionality that constantly fails, then this is another area of (dis)interest. You want to avoid perpetuating deficient coding practices.
Give credit where credit is due
The final point to consider concerns licensing and acknowledgments. If your software is open-source, as is the software you reverse engineered, then it might not matter. However, if you plan to redistribute software under a commercial license, then take great care in how you approach licensing issues. All that may be required is written recognition of the efforts of other programmers or organizations. There is also the question of patent infringement. By duplicating interaction design elements, you might risk a lawsuit. However, if you only need to know the underpinnings, then it is unnecessary to thank every software engineer who has influenced your decisions.
2 Comments
#01, Nov 21 2007
raveman
This is a very hard topic. I personally try to ignore everything and dive into the code. It might not be the best solution, but what you described looks hard to me. I try to read the documentation first, but it doesn’t really help me.
I have a question for you: if you were put on a BIG project with a few thousand classes and BAD documentation, how would you start? I try to live from task to task, because getting to know the whole thing would be really hard.
#02, Nov 21 2007
jatal
Good overview of the issue, and a good description of how to approach the problem. All developers use “cut’n’paste” to provide solutions quickly. However, this generally causes code bloat and propagates erroneous code (as you indicate).
If I have access to modify the code in question, then I see more value in creating a subroutine/function that performs the common task. I find that this often exposes the pre-existing bugs, since I have to write unit tests (ones whose sole purpose isn’t code coverage).
Regarding raveman’s question about large projects: again, I find that properly written unit tests solve most of this problem. Unit tests don’t just expose erroneous functionality; they also serve as documentation for “how” to use a particular piece of code. Large projects without working unit tests will rarely be successful in their maintenance life cycle. Developers will have to spend too much time re-learning code written by some guru/contractor who cut’n’ran from their cut’n’paste code :)