Side-effect free programming language for reproducible data transformation

  softwareengineering

Is there a usable programming language that disallows all side effects except for its input stream (aka STDIN) and its output stream (aka STDOUT)? All executable scripts in the language should be guaranteed to produce exactly the same output when given the same input (unless they don’t terminate soon enough). This excludes any other external state such as:

  • access to the file system (which sandboxes are known to restrict)
  • access to APIs, databases, network connections…
  • access to environment variables, time

By usable I mean the language should include the basic datatypes and operations familiar from other programming languages, unlike, for instance, a Turing machine emulator. How could such a language be useful? It would allow execution of arbitrary scripts for reproducible data transformation.

A defined subset of a more powerful language can also be useful if there is a practical way to actually define and enforce the subset.

P.S.: I extended the question and added an example answer with a language that is actually used for reproducible data transformation but too limited for more complex tasks.

5

Check out Total Functional Programming, which mentions Epigram and Charity.

Haskell is the first that comes to mind, as it has the best combination of purity and popularity. You can also search for “purely functional programming languages.” Evaluation is lazy by default. Even STDIN and STDOUT are handled purely: conceptually, an IO action takes a fake “state-of-the-world” parameter and returns it “modified.” This bit of trickery lets input/output appear functional to the program.
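As a minimal sketch of what this looks like in practice, assuming a line-oriented transformation: the function name transform is made up for illustration, while interact is the standard Prelude function that feeds all of STDIN to a pure String -> String function and writes the result to STDOUT.

```haskell
module Main (main) where

import Data.Char (toUpper)

-- Pure transformation: the output depends only on the input string.
-- `transform` is a hypothetical name chosen for this sketch.
transform :: String -> String
transform = unlines . map (map toUpper) . lines

-- `interact` reads STDIN lazily, applies the pure function,
-- and writes the result to STDOUT.
main :: IO ()
main = interact transform
```

Because transform has no IO in its type, it cannot touch files, the clock, or the environment; only main talks to the outside world.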

Programs in Haskell (and presumably other purely functional languages) are not written as a sequence of steps. You do not write, “First do this, then do that.” You simply list dependencies: “This depends on that.” Then the runtime figures out what has to happen first (imagine it starts with the results, and puts everything on a stack until it gets to the inputs, then pops things off the stack to process in order).
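A small sketch of the "list dependencies" idea, using made-up names (report, subtotal, tax, total) and an assumed 25% tax rate: the bindings are deliberately written in "reverse" order, and the program still works because evaluation follows the dependencies, not the order on the page.

```haskell
module Main (main) where

-- All names here are hypothetical and only illustrate binding order.
report :: [Double] -> Double
report prices = total
  where
    total    = subtotal + tax    -- refers to names defined further down
    tax      = subtotal * 0.25   -- assumed tax rate, for illustration only
    subtotal = sum prices

main :: IO ()
main = print (report [10, 20, 30])
```

Reordering the three where bindings in any way gives the same result.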

6

It’s pretty difficult to accidentally open a file in Haskell. A lot of people don’t know how to do it on purpose. Just don’t use the parts you don’t want. Since IO actions are only run when they are part of main (escape hatches such as unsafePerformIO aside), an appropriately designed and audited main can prevent unwanted side effects for your entire program.
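One way to sketch such an audited main, assuming GHC: the script is forced to have the type String -> String, main performs exactly one read and one write, and the Safe Haskell pragma rejects imports of unsafe modules such as System.IO.Unsafe. The name script is hypothetical; in a real setup the untrusted script would live in its own module compiled with the same pragma.

```haskell
{-# LANGUAGE Safe #-}
module Main (main) where

import Data.List (sort)

-- The untrusted part: its type leaves no room for IO actions.
-- `script` is a hypothetical name; here it just sorts the input lines.
script :: String -> String
script = unlines . sort . lines

-- The audited part: one read from STDIN, one write to STDOUT, nothing else.
main :: IO ()
main = do
  input <- getContents
  putStr (script input)
```

Only main needs to be reviewed; whatever script does internally, it can only compute a string from a string.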

A very limited language without side effects for reproducible data transformation is the subset of sed consisting of commands of the form:

s/regexp/replacement/

The same input will always produce the same output no matter how you choose regexp and replacement. This does not apply to extended search-and-replace statements, such as those in Perl with the /e flag.
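To illustrate why such a substitution is reproducible, here is a rough Haskell sketch of the same idea as a pure function. It is deliberately simpler than sed: it matches a literal, non-empty pattern instead of a regular expression, and the name substFirst and the "foo"/"bar" arguments are made up for the example.

```haskell
module Main (main) where

import Data.List (isPrefixOf)

-- Replace the first occurrence of a literal pattern in one line,
-- similar in spirit to s/pat/rep/ without the g flag.
-- Assumes a non-empty pattern; this is NOT a regex engine.
substFirst :: String -> String -> String -> String
substFirst pat rep = go
  where
    go [] = []
    go s@(c:cs)
      | pat `isPrefixOf` s = rep ++ drop (length pat) s
      | otherwise          = c : go cs

-- Apply the substitution to every input line.
main :: IO ()
main = interact (unlines . map (substFirst "foo" "bar") . lines)
```

The result of substFirst is determined entirely by its three arguments, which is the property the sed subset above provides.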

6
