How to modify the output of a program for which you don’t have the source code

In our company we have a small program (a .exe about 500 KB in size) that does mathematical calculations and, at the end, writes the result to an Excel spreadsheet that we use to continue our workflow.

I want to modify the columns, spacing format and add VBA logic etc. on the Excel spreadsheet, but since these parameters are not configurable in that program, it seems to me the only way to modify it is to break down/reverse engineer the .exe.

Nobody knows what language it was programmed in. The only things we know are:

  1. Developed 20+ years ago
  2. Developer retired 10 years ago
  3. GUI Application
  4. Runs standalone
  5. Size 500 KB

Any suggestions on what options I have for dealing with this kind of problem? Is reverse engineering the only option, or is there a better approach?

23

Reverse engineering can become very hard, even more so if you want not just to understand the program’s logic but to change and recompile it. So the first thing I would try is to look for a different solution.

I want to modify the columns, spacing format and add VBA logic etc. on the Excel spreadsheet

If that is the only thing you want, and the calculation done by the program is fine, why not write a program in the language of your choice (maybe an Excel macro) that calls your legacy “exe”, takes the output, and processes it further?

13

In addition to the already given answers by Doc Brown and Telastyn, I would like to suggest an alternative approach (under the assumption it’s mission critical).

If you do not know the computations it performs and the calculations are (somewhat) mission-critical: Deduce the original logic in the .exe file by any means necessary. Decode it using a decompiler/disassembler like IDA if necessary. Hire a consultant (or a batch of consultants) if necessary.

Sure, work around it for now using their solution, but do not let it be.

The reason I suggest this is as follows: You have admitted that the calculations are very complex (according to an engineer you spoke to). It’s also mission-critical. So if somehow the original .exe stops working due to changes in the platforms you have (maybe 16-bit support gets dropped?), you have just lost a mission-critical piece of knowledge.

Now, I’m not concerned about losing the .exe, but about losing the knowledge it encodes. That knowledge must be recovered.

As before: if that knowledge is already available, make sure to write it down in a format that will not be lost anytime soon. Otherwise, recover it and write it down.

13

Ask the original programmer, if possible.

A few weeks ago I was contacted by a firm I worked for 10 years ago with the very same question about an mdb file developed in the mid-90s.

6

Any suggestions on what options I have for dealing with this kind of problem?

If all you’re looking to do is modify the output, then why not simply use composition?

Instead of modifying the black box you can’t easily access, you create a new program that takes the Excel output, and does your formatting/column changes too. Then you could make a new exe/script that calls the two programs in order, so it appears to the end user that there is just one program that does all of the work – even though it’s two distinct steps under the hood.
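A minimal sketch of such a launcher in Python, assuming both steps can be started as ordinary commands and that the legacy program returns exit code 0 on success (old GUI programs do not always do so, so that check may need loosening). The function name `run_in_order` is made up for illustration.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command to completion, in order; stop at the first failure.

    Returns the number of commands that finished with exit code 0.
    """
    completed = 0
    for command in commands:
        # subprocess.run blocks until the process exits, so a GUI program
        # holds up the chain until the user closes it.
        result = subprocess.run(command)
        if result.returncode != 0:
            break
        completed += 1
    return completed
```

A call could look like `run_in_order([["legacy_calc.exe"], [sys.executable, "format_output.py"]])`, where both file names are placeholders for the real executable and for your own formatting script.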

10

There are companies that specialise in exactly this kind of problem. They use proprietary code to decompile native code into a high level language, then apply human expertise to make it useful (e.g. giving variables appropriate names).

Some years ago my employer used this to migrate some native S/390 mainframe code onto Linux servers. We gave them a binary, they gave us source code in C.

Whether this is necessary in your case is up to you. If you only care about the format of the output, you can simply massage the output after it’s been produced. However, as others have pointed out, having business logic hidden in a binary blob could be an ongoing risk.

Write a simple wrapper around the program that captures its output. This is not complex, as many languages (Java, C++, Python, .NET, for instance) have the means to do it. Parse the output and generate new output in the desired form. The user will call your new program. The old executable will stay next to it, or can even be extracted automatically from a resource before it is invoked.

Of course, this solution works well only when the output is well structured and therefore easy to parse.
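To illustrate the "parse and generate another form" step, here is a small Python sketch. It assumes the spreadsheet has already been read into a list of rows with a header row first (reading and writing the actual Excel file would need a spreadsheet library and is not shown). The function name and the column names in the usage note are hypothetical.

```python
def select_columns(rows, wanted):
    """Return the rows restricted to the `wanted` columns, in that order.

    `rows` is a list of lists whose first entry is the header row.
    Raises KeyError if a wanted column is not in the header.
    """
    if not rows:
        return []
    header = rows[0]
    # Map each header name to its position in the original layout.
    index = {name: i for i, name in enumerate(header)}
    positions = [index[name] for name in wanted]
    # The header row is transformed like any other row, so the result
    # starts with the new header.
    return [[row[p] for p in positions] for row in rows]
```

For example, with a header of `["id", "load", "stress"]`, asking for `["stress", "id"]` drops the `load` column and swaps the other two. This only works as long as the legacy program keeps its header names stable.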

That it is a GUI application is not a blocking problem. You can launch it, let it generate its output, and then automatically post-process that output when the GUI terminates.
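One practical detail with a GUI is that the user may close it without producing anything. A rough Python sketch of a check for that, assuming the program writes to a known file path (the path and the function name are assumptions for illustration), compares the file's modification time before and after the run:

```python
import os
import subprocess


def run_and_check_output(command, output_path):
    """Run `command`, wait for it to exit, and report whether
    `output_path` was created or rewritten during the run."""
    before = None
    if os.path.exists(output_path):
        before = os.stat(output_path).st_mtime_ns
    # Blocks until the GUI is closed.
    subprocess.run(command)
    if not os.path.exists(output_path):
        return False
    after = os.stat(output_path).st_mtime_ns
    # A new file, or a changed timestamp, means fresh output.
    return before is None or after != before
```

Post-processing would then run only when this returns `True`. Timestamp resolution depends on the file system, so a very fast rewrite could in principle go unnoticed.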

8

Write some tests that exercise as many cases as possible on the old code. Find corner cases, test wrong input, and test correct input.

Pin down what is correct output given various cases, and then try to write an implementation that satisfies the same tests.
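The record-then-compare idea could be sketched in Python as follows. This assumes the old program's calculation can be driven as a function from input to a numeric result; with a GUI executable that step may be manual or scripted, and `legacy` here is only a stand-in for however you obtain its results. All names are hypothetical.

```python
import math


def record_cases(legacy, inputs):
    """Run the legacy calculation on each input and keep (input, output) pairs."""
    return [(x, legacy(x)) for x in inputs]


def find_mismatches(candidate, recorded, rel_tol=1e-9):
    """Return the recorded cases that the new implementation fails to reproduce."""
    mismatches = []
    for x, expected in recorded:
        actual = candidate(x)
        # Compare with a relative tolerance, since a rewrite rarely
        # matches floating-point results bit for bit.
        if not math.isclose(actual, expected, rel_tol=rel_tol):
            mismatches.append((x, expected, actual))
    return mismatches
```

The recorded pairs are worth saving to a file, so they remain usable even if the old executable stops running. Note that a purely relative tolerance never accepts a near-miss when the expected value is exactly zero; `math.isclose` also takes an `abs_tol` argument for that case.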

I wouldn’t go down the reverse engineering route. It’s incredibly complicated to reverse machine code, and you should already know what the purpose of the exe is. Reverse engineering is a little too much work for what you’re after.

If the software was developed by one guy 20 years ago, it’s probably not something that takes a lot of modern power. A GUI program that stretched the machine 20 years ago will barely register on a modern machine, so you’re probably looking at something that’s relatively simple to reproduce.

Try to reverse engineer the exe, but only to find the computation logic, or at least to get a fair hint of what it actually does. If your reverse engineering gets you to that point, you can write a new application based on that computation logic. Apart from that, I don’t see another way out.

That is easier said than done: reverse engineering an exe created 20 years ago is a real challenge.

2
