I’m interested in whether there is such a thing as a pseudo-compiler that can create a kind of binary or bytecode version of a plaintext script file, which can then only be read by a proprietary piece of software.
I work with a proprietary software client in a Windows environment, and the software has its own script manager which can ‘compile’ a script file into a ‘binary’ file. The binary is a non-plaintext file, which can only be accessed by the proprietary software to execute custom tasks.
The reason for this, I assume, is that the software has its own library of built-in functions that the company doesn’t want to make available to the public. If the binaries aren’t true binaries, as I suspect (in the sense that they’re not compiled to CPU-specific instructions), I would also assume it is possible to revert them to plaintext script files.
I have a strong suspicion that the company that wrote the proprietary software didn’t write a bespoke compiler for their script manager, as this is not the primary feature of the software but more of an add-on. If so, they must have used a third-party client or library to compile and read these pseudo-binaries.
I’d like to find out more about this sort of practice in general, to get a better understanding. Does anyone know of standard pseudo-compiler libraries available for Windows (or other) platforms? Any information on this would be greatly appreciated!
If the point of transforming code reversibly is not to produce another form of executable code (script code or machine code), then it really amounts to plain old encryption.
In other words, yes, there are many libraries to do that, but they’re just the standard encryption libraries, and instead of inventing a proprietary transformation you’ll almost certainly be better off using a standard algorithm with a secret, proprietary key.
Is there such a thing as a ‘pseudo-compiler’ for proprietary software?
Yes.
Beyond that, it’s hard to give a definitive answer to this question as there are too many unknowns. A non-exhaustive list of what that binary file could be:
- A plain text file that’s been encrypted,
- A plain text file that’s been compressed,
- A syntax tree generated from a lexer/parser, that is therefore syntactically valid, saving the app from checking it itself,
- The output from a genuine compiler, which could just be wrapped inside the company’s own “compiler”, which invokes the real one and obfuscates the output,
- A genuinely compiled (to byte code or machine code) executable.
I’m sure there are dozens more possibilities that could be added here. The point being, without reverse-engineering those binary files (or just asking the company!), there’s no way to know for sure.
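To make two of the possibilities above concrete, here is a minimal Python sketch of how a plain-text script can become a non-plaintext file: once by compression, once by compiling to bytecode. It is only an illustration using Python’s standard library; the script contents and the function names are made up for the example, and nothing here is known about what the proprietary software in the question actually does.

```python
import marshal
import zlib

# Hypothetical plain-text script, standing in for the user's script file.
SCRIPT = "result = sum(range(5))\n"


def pack_compressed(source: str) -> bytes:
    """Turn the script into a compressed, non-plaintext blob."""
    return zlib.compress(source.encode("utf-8"))


def unpack_compressed(blob: bytes) -> str:
    """Recover the exact original text from the compressed blob."""
    return zlib.decompress(blob).decode("utf-8")


def pack_bytecode(source: str) -> bytes:
    """Compile the script to a Python code object and serialize it."""
    code = compile(source, "<script>", "exec")
    return marshal.dumps(code)


def run_bytecode(blob: bytes) -> dict:
    """Load the serialized code object and execute it in a fresh namespace."""
    namespace = {}
    exec(marshal.loads(blob), namespace)
    return namespace


if __name__ == "__main__":
    compressed = pack_compressed(SCRIPT)
    assert unpack_compressed(compressed) == SCRIPT

    bytecode = pack_bytecode(SCRIPT)
    print(run_bytecode(bytecode)["result"])
```

The compressed form reverts to the exact original text. The bytecode form does not store the source text, although it can still be inspected with a disassembler such as Python’s `dis` module, and the `marshal` format is specific to the Python version that wrote it.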