When is a command line not a line?

Tim Waugh

Revision History
Revision 1 19 October 2001 TMW
Initial revision.
Revision 2 6 November 2001 TMW
Mention that Perl's system() can take a list.

Table of Contents

An intuitive idea
String replacements in command lines
Attempts to fix the problem
Real examples of command line manipulation
    Mailcap
    Desktop Entries
Conclusion

Abstract

This article discusses the way in which intuitive handling of ‘command lines’ can lead to bugs and security problems, and suggests a solution.

An intuitive idea

We all know what a command line is. It's what you give to the shell in order to run a program with, perhaps, some options. It's what you type in to make something happen. When you are typing in a command line, you do it character by character, separating the command's name and its options by spaces. And that's all there is to it.

At a programming level, it's the same thing: Perl has system(), so does C, and Python has os.system(). Each of them takes a command line (a string of characters) and executes it.
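
As a small illustration, here is a sketch in Python of handing a command line to os.system() as a single string. The echo command is only a stand-in chosen for this example, and the comment about the return value assumes a Unix-like system.

```python
import os

# The whole command line is one string; os.system() passes it to the shell.
command_line = "echo hello world"

# On Unix-like systems a status of 0 means the command succeeded.
status = os.system(command_line)
print("exit status:", status)
```

Nothing in this call says where one word ends and the next begins; that decision is left to whatever interprets the string.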

So what's the point of writing this article? The answer is that all is not as it seems. When it comes to actually finding the command to execute, the ‘command line’ must be broken up into an array of words. Here's why:

int execve(const char *filename, char *const argv[], char *const envp[]);

The execve() function that eventually gets called takes an array (or vector) of strings, argv, not a single command line string. This array is handed to the program when it starts up, where it appears as the argv parameter of the main() entry point function. We'll refer to it as the argument vector, as distinct from the command line, which is the string of characters we started with.
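
To see an argument vector from the receiving end, here is a minimal Python sketch, in which sys.argv plays the role of C's argv. It starts a child interpreter with an explicit list of arguments, with no shell involved, and asks the child to report what it was given. The helper name show_argv and the sample arguments are made up for this illustration.

```python
import json
import subprocess
import sys

# Child program: report the arguments it received, encoded as JSON.
CHILD = "import json, sys; print(json.dumps(sys.argv[1:]))"

def show_argv(args):
    """Run a child Python process with the given arguments and
    return the list of arguments the child actually saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],  # a list of strings, not one string
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)

# Three arguments: one containing a space, one option, one empty string.
seen = show_argv(["hello world", "-x", ""])
print(seen)  # ['hello world', '-x', '']
```

Each element arrives intact, including the embedded space and the empty string, because nothing ever had to split a line into words.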

So far we've seen two different ways of looking at command lines: a string of characters, and an array of strings. You might be thinking that really they are the same and that you can get from one to the other without any problems. You would be largely right, but it isn't as easy to get from one to the other as you might think. That's where the problems start to appear.
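
A rough sketch of the gap, using Python's shlex module as an approximation of shell-style word splitting (it is not the shell itself, and the file name notes.txt is invented for the example):

```python
import shlex

command_line = "grep 'hello world' notes.txt"

# Splitting on spaces ignores the quotes and breaks the pattern in two.
naive = command_line.split(" ")
print(naive)       # ['grep', "'hello", "world'", 'notes.txt']

# Shell-like splitting keeps the quoted words together as one argument.
shell_like = shlex.split(command_line)
print(shell_like)  # ['grep', 'hello world', 'notes.txt']

# Going the other way has the same trap.
argv = ["grep", "hello world", "notes.txt"]

# Joining with spaces loses the boundary between arguments.
joined_naive = " ".join(argv)
print(joined_naive)   # grep hello world notes.txt

# Quoting each argument first preserves it.
joined_quoted = " ".join(shlex.quote(arg) for arg in argv)
print(joined_quoted)  # grep 'hello world' notes.txt
```

In both directions the obvious approach gives a plausible-looking result that means something different from what was intended.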