It occurs to me that there’s not a heck of a lot of difference between
$>python module.py
And:
$>javac module.java
$>java module
The former compiles to an intermediate language (Python bytecode) and executes the program in one step. The latter splits the steps up, first compiling to the intermediate language (JVM bytecode) and then executing it with a second command. In fact I can rewrite the Python invocation to break out the two steps, as in this SO question.
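To make the two-step version concrete, here is a minimal sketch using the built-in `compile()` and `exec()`. The source string and the `<demo>` filename are made up for illustration; they stand in for a real `module.py`.

```python
import dis

# Hypothetical stand-in for the contents of module.py.
SOURCE = "result = 6 * 7\n"

# Step 1: compile the source text to a code object (Python bytecode).
code = compile(SOURCE, "<demo>", "exec")

# Step 2: execute the already-compiled bytecode in a fresh namespace.
namespace = {}
exec(code, namespace)

print(namespace["result"])

# Show the bytecode instructions that the interpreter ran.
dis.dis(code)
```

From the shell, `python -m py_compile module.py` does the first step on its own and writes the bytecode to a `.pyc` file.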
It seems people make a big deal about the stark difference between compiled and interpreted languages, yet the lines seem entirely blurred. Other popular interpreted languages have similar features. PHP “compiles” to its own opcodes, which can be cached or stored for later use. Perl is also compiled to an internal representation before it runs.
So… is there really any difference between these popular interpreted languages and popular compiled languages that compile to VMs? Perhaps the VMs of the compiled languages typically stay resident in memory, whereas the “interpreted” languages typically spin up their runtimes for each run? Yet this seems like it could easily be changed.
Yet there still seems to be something of a difference. If they are more-or-less the same, then why is it that the performance of Java/C# seems to approach C++ while the “interpreted” languages are still an order of magnitude off? If it’s all truly bytecode running in a VM, and all really the same, why the big difference in performance?
There are many differences. First of all, think of the difference between a bytecode interpreter and a language interpreter. It’s easy to interpret bytecode, because all commands follow a predictable format, but interpreting a language involves parsing and lexing — operations that can be quite taxing, depending on the language.
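To illustrate why bytecode is easy to interpret, here is a toy stack machine. The opcode names and the `run` function are invented for this sketch and don't correspond to any real VM; the point is only that every instruction has the same shape, so the interpreter is a plain loop with no lexing or parsing.

```python
# Toy stack machine: each instruction is an (opcode, argument) pair.
PUSH, ADD, MUL = "PUSH", "ADD", "MUL"

def run(bytecode):
    """Interpret a list of uniform instructions with a simple loop."""
    stack = []
    for op, arg in bytecode:
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

# Bytecode for (2 + 3) * 4, as a compiler front end might emit it.
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)]
print(run(program))  # prints 20
```

Interpreting the text "(2 + 3) * 4" directly would need a tokenizer and a parser that handles parentheses and precedence before any arithmetic could happen.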
C# and Java don’t only compile to bytecode. They also use JIT compilation: the runtime translates bytecode to native machine code, in some VMs only after interpreting it a few times, and caches the result for the rest of the application’s run, instead of interpreting it every time the execution thread reaches it, which would involve lots of redundant work.
As far as I remember, Python compiles to bytecode, but the standard interpreter (CPython) has traditionally not used a JIT, and a JIT is what can drastically increase performance.
Typically, interpreted languages aren’t eligible for full-on static analysis (e.g. static type checking across multiple modules) and optimization. A separate compile-to-bytecode step, as in Java or C#, can provide this. OTOH, interpreted languages can run even if some parts (libraries, etc.) are missing, because you’re late-binding references. IOW, if you don’t actually call the code, it doesn’t matter if it’s there or not.
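A small sketch of the late-binding point, assuming a module name that is not installed (`module_that_is_not_installed` is deliberately made up): the program defines and runs fine, and the missing library only matters once the function that needs it is actually called.

```python
def rarely_used():
    # The import is only resolved when this line actually runs.
    import module_that_is_not_installed  # hypothetical missing library
    return module_that_is_not_installed.work()

def main():
    # Never touches the missing library, so it runs without complaint.
    return "finished without touching the missing library"

print(main())

try:
    rarely_used()
except ImportError as exc:
    print("failed only when called:", exc)
```

A statically compiled and linked program would typically reject the missing reference at build time instead.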