For a while now, I have been automating tasks by writing shell scripts in bash. These have gradually become more and more complicated, and I am now finding that bash scripts are a little too simple for what I want – for example, I recently wrote a 1000-line program that includes multithreading, IPC, a CGI interface, and a fairly complex set of config options.
Perhaps my impression is wrong, but I have always thought that shell scripting was intended to be used mainly for fairly small automation and configuration tasks, and is not really a suitable language to be writing an application in.
Most of the time I am doing low-level things on a single server, e.g. starting/stopping other processes, accessing sockets, monitoring system status, that kind of thing.
I am trying to understand better what I can do as a next step instead of creating a huge bash script program. Is there a paradigm that I need to be thinking about when trying to determine when to use shell scripts vs a “proper” programming language?
Shell scripts are good for a sequence of shell commands. While Bash has a couple of more advanced features, anything that goes beyond passing strings to commands is a lot of pain. In particular, these are signs that your program has outgrown a shell script:
- you have complex control flow, e.g. loops, recursion, or parallelism that goes beyond pipes, etc.
- you need complex data structures. While Bash provides arrays, other shells do not and require you to use flat files instead. Using collections of data is difficult in any case.
- you have to validate inputs to your program. Shell scripts are difficult to write correctly with different expansion, escaping, and word splitting mechanisms. A proper language can help there, maybe by offering a type system.
- you want modularization because you have many helper functions.
- more code is spent on logic rather than invoking commands.
- you have performance requirements that make it unreasonable to spawn a new process for each little operation.
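To make the points about input validation and data structures concrete, here is a minimal Python sketch of typed, validated options. The option names (`--host`, `--port`, `--workers`), their defaults, and the `Config` class are hypothetical and only stand in for the kind of config handling mentioned in the question.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical settings; the field names are illustrative only."""
    host: str
    port: int
    workers: int


def port_number(text: str) -> int:
    """Convert text to a TCP port, rejecting anything out of range."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"not an integer: {text!r}")
    if not 1 <= value <= 65535:
        raise argparse.ArgumentTypeError(f"port out of range: {value}")
    return value


def parse_config(argv: list[str]) -> Config:
    """Parse command-line arguments into a typed Config object."""
    parser = argparse.ArgumentParser(description="hypothetical monitor")
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=port_number, default=8080)
    parser.add_argument("--workers", type=int, default=4)
    args = parser.parse_args(argv)
    return Config(host=args.host, port=args.port, workers=args.workers)
```

Calling `parse_config(["--port", "9000"])` yields a `Config` whose `port` is the integer 9000, while an out-of-range or non-numeric port makes `argparse` print an error and exit. Getting the same guarantees in a shell script usually means hand-written checks around every variable.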
Scripting languages (e.g. Perl/Python/Ruby) offer various features that make programming a lot more comfortable, e.g. stricter syntax and semantics (scoped variables!), complex data structures, libraries for common tasks, modularization features, and so on. Their advantage compared with other programming languages is that you don’t have to set up a compilation toolchain, so the development experience stays similar to that of writing shell scripts. However, invoking external tools becomes more difficult. What was a simple case of

    set -e; tool foo --bar="$baz"

in a shell script would have to be written in Perl as

    0 == system 'tool', 'foo', "--bar=$baz" or die "Couldn't invoke tool: $!"

or

    open my $result, '-|', 'tool', 'foo', "--bar=$baz" or die "Couldn't invoke tool: $!"

depending on what precisely you were trying to do (similar problems also exist for Python and Ruby, of course). This makes difficult things possible, but simple things less simple. There is a point where this is really worth it, but not every shell script is at that point just because it’s big or complicated.
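For comparison, a rough Python counterpart of the same invocation might look like the sketch below. `tool` is not a real program, so the running Python interpreter is used as a stand-in command that simply echoes its first argument; `run_tool` and the value of `baz` are made up for illustration.

```python
import subprocess
import sys


def run_tool(command: list[str]) -> str:
    """Run an external command and return its standard output.

    check=True raises CalledProcessError on a non-zero exit status,
    which plays roughly the role of `set -e` in the shell version.
    """
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


# Stand-in for `tool foo --bar="$baz"`: the current interpreter is used so the
# sketch does not depend on any particular program being installed.
baz = "some value with spaces"
output = run_tool(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", f"--bar={baz}"]
)
```

Passing the command as a list means no shell is involved, so the spaces in `baz` need no quoting. The price is the extra ceremony the answer describes: a one-line shell command becomes an import, a helper, and explicit error handling.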
If your shell script only contains certain parts that would benefit from a better programming language, it would also be possible to rewrite just these in another language, and then invoke them as external programs. However, this will only work if the main data flow is simple enough. The opposite is also possible: having a program invoke a shell script for a part that just consists of invoking other programs. Such programs that consist of multiple executable files can be quite reasonable, but also introduce a few problems of their own: is my relative path set correctly? How can I debug these interacting programs? How can I structure my data that all parts can easily deal with it?
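As one possible shape for such a helper, the sketch below reads lines from standard input and writes a JSON summary to standard output, so a shell script can keep orchestrating while the data handling lives elsewhere. The `<service> <state>` line format, the function names, and the choice of JSON as the interchange format are all assumptions made for this example.

```python
import json
import sys
from collections import Counter
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict[str, int]:
    """Count how many services are in each state.

    Each input line is assumed to look like "<service> <state>";
    blank or malformed lines are skipped.
    """
    counts: Counter[str] = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) == 2:
            counts[fields[1]] += 1
    return dict(counts)


def main() -> None:
    """Filter entry point: status lines on stdin, one JSON object on stdout."""
    json.dump(summarize(sys.stdin), sys.stdout, sort_keys=True)
    sys.stdout.write("\n")
```

Saved as a script that calls `main()` when executed, this could sit in the middle of a pipeline, with the shell side producing the status lines and consuming the JSON. Agreeing on one simple text format at each boundary is what keeps the "main data flow is simple enough" condition true.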
I think three important parameters to be considered are:
- code reuse.
  Code reuse among shell scripts tends to be difficult (scripts tend to be very problem-specific).
- frequency of execution of the program.
  For a one-time task (typically a filesystem-oriented task), shell scripts (bash, tcsh…) are enough.
  Now consider what happens when other people have to maintain the code (or you too, after a few months).
- the complexity of the data structures involved.
A language like Python is often the right choice for larger projects/scripts (and it has a syntax that is very easy to read and understand).
Anyway take a look at:
- Can I use Python as a bash replacement?
- Strengths of Shell Scripting compared to Python
for further details about specific strengths of Python vs shell scripting.
Apart from creating (complex) web interfaces, bash scripting could be the right tool for you, since you know it better than other languages.
What matters is how you arrange your scripts to maximize code reuse and to accommodate changes with minimal effort. You should be able to build complexity out of simplicity by composing existing code and libraries. That is how Unix programming works (pipes and so on).
I think bash scripting could be sufficient and proper enough for your needs.