Ankit1010 wrote: Yeah I noticed this too, shell scripts are far too fragile. Do you know of any way of really making sure that they run the same across multiple systems (like maybe bundling your versions of all the tools you use with the script)?
Write it in Python. </troll>
To me, the fact that something like that is a viable troll indicates that the situation in the shell needs to be addressed.
But to some extent, it's a fundamental problem with the usual approach taken in the shell. The tools and commands that make up the "shell programming language" are quasi-standardized in some cases (POSIX, though the versions actually in use usually carry a bunch of non-standard extensions), but there may be multiple groups writing alternate versions of a command (which would be installed under the same command name). On top of that, the output of these programs is usually made for display on a terminal or line printer, for human consumption, with machine consumption as a secondary goal. That means the exact format of the output (which we rely upon in scripts) is sensitive even to aesthetic concerns.
Python, Perl, etc. can avoid this situation because there's a namespace system to deal with name clashes, a central authority controlling the development of the core language as well as the module namespace, and code normally doesn't process "human-readable" output at all. Rather, code written for the language works with the language's data structures, producing very machine-readable answers up until the point when the data is finally displayed on-screen.
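To make the "works with the language's data structures" point concrete, here's a minimal sketch of getting a file's size in Python without touching any formatted output. The file name and the helper name file_size are made up for the demo; os.stat and its st_size field are standard.

```python
import os
import tempfile

def file_size(path):
    """Return the size of `path` in bytes, read from a named field."""
    # os.stat returns a structured result; st_size is the exact byte count,
    # so there is no column position or "15k"-style rounding to worry about.
    return os.stat(path).st_size

# Demo: create a scratch file of known length and read its size back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")  # hypothetical file name
    with open(demo_path, "wb") as fh:
        fh.write(b"\0" * 15924)
    demo_size = file_size(demo_path)

print(demo_size)  # prints 15924
```

Nothing here depends on how any tool chooses to lay out its columns, which is the whole difference from chopping up "ls -l" lines.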
I don't think anyone really has the authority to effectively standardize the environment in which the shell operates, and there's nothing currently in place to provide any kind of reliable namespace support (for instance, to write a script that says "I want to run GNU find here" and have it work even if GNU find is installed as "gfind" to avoid a clash with another version of "find" that's installed as the standard version). But one could mitigate the problem of shell script stability somewhat, using the same shells and many of the same tools, if we changed how we work with them: for instance, don't attempt to parse "pretty" program output. Parse something that's more specifically machine-readable instead.
In the case of problem 1, using "date" instead of "cal" is a pretty clear-cut solution. (cal's output is meant for human consumption, on the display or on a line printer. So it doesn't matter if there's extra whitespace at the end of the line, or even if the first column is Monday instead of Sunday: a person reading it can adjust. Scripts are going to be a lot more sensitive to such minor changes.)
More generally, using programs in such a way that their output is in a format that's consistent and extensible would be the way to go. Just as an example, if the system were populated with tools that took XML input and produced XML output, then you could reliably process the result of commands like "ls -l", even if additional columns were added to the output (for instance, more extensive security information like ACLs, or "fattr" metadata size, file forks if we were dealing with old Mac stuff, etc.). This is because instead of chopping up lines saying "give me the fifth field" or "give me the last field" or "give me characters 5-8 from that line" you'd say "give me the file size field" - and consistently get full precision on the answer (like "15924" bytes as opposed to human-readable quantities like "15k").
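Here's a rough sketch of what "give me the file size field" could look like against XML output. To be clear, no real "ls" emits this: the element names (listing, file, name, size, owner, acl) and the sample data are invented purely for illustration. Only the xml.etree.ElementTree calls are real.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of an XML-emitting "ls -l"; the element names are
# invented for illustration and do not come from any real tool.
LISTING = """\
<listing>
  <file>
    <name>report.txt</name>
    <size>15924</size>
    <owner>alice</owner>
    <acl>user::rw-,group::r--</acl>
  </file>
  <file>
    <name>notes.txt</name>
    <size>312</size>
    <owner>alice</owner>
  </file>
</listing>
"""

def sizes_by_name(xml_text):
    """Map each file name to its size in bytes, ignoring unknown elements."""
    root = ET.fromstring(xml_text)
    return {
        entry.findtext("name"): int(entry.findtext("size"))
        for entry in root.iter("file")
    }

sizes = sizes_by_name(LISTING)
print(sizes["report.txt"])  # prints 15924
```

Note that the first entry carries an extra acl element and the second doesn't, and the lookup doesn't care either way. That's the property you can't get from "give me the fifth field".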
Even if you have different versions of a program, possibly even from different sources, they would be able to target a core set of data fields that should be present in their output, while still providing their own extensions on top of that. (Pretty-printing for the user would then be a separate step. Presumably one that would have to be explicitly specified by the user, though of course having that happen automatically would be nice...)
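As a sketch of that "core fields plus extensions" idea, with pretty-printing kept as its own step: everything below is hypothetical, including the choice of core fields (name, size, mtime), the two made-up producers and the extension fields they add. It assumes the tools' output has already been parsed into plain records.

```python
# Hypothetical core contract: every producer must supply these fields.
CORE_FIELDS = {"name", "size", "mtime"}

def missing_core_fields(record):
    """Return the core fields a record lacks (empty set means conforming)."""
    return CORE_FIELDS - set(record)

def pretty(record):
    """Render a record for humans; only this step rounds or aligns anything."""
    size = record["size"]
    shown = f"{size // 1024}k" if size >= 1024 else f"{size}B"
    return f"{record['name']:<12} {shown:>6}"

# Two made-up producers: same core fields, different extensions.
from_tool_a = {"name": "report.txt", "size": 15924, "mtime": 1200000000,
               "acl": "user::rw-"}
from_tool_b = {"name": "report.txt", "size": 15924, "mtime": 1200000000,
               "forks": 2}

for rec in (from_tool_a, from_tool_b):
    # Scripts check and use the exact core values...
    assert not missing_core_fields(rec)
    # ...and the lossy, human-friendly rendering happens only at the end.
    print(pretty(rec))
```

A script built on the core fields works the same against either producer, and the "15k" rounding only ever shows up in the part a person reads.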
XML is just one possibility, obviously. The important things about such a move are that the output format has to be something extensible, there have to be adequate tools on the system for dealing with data in that format, and preferably there has to be some consensus on what format to use and how to use it. As a bonus, if programs' output format could be somehow explicitly identified, allowing for the possibility of auto-conversion, all the better. It's a little bit hard to visualize any of that actually happening.