Sunday 3 April 2011

Shell scripts to handle filenames with spaces

Posix/Unix/Linux was not designed to handle filenames with spaces in them. However, Linux and Windows filesystems allow them, along with many other "funny" characters. This has been brewing as a topic in Linux Journal recently, and Dave Taylor has just written an article on it in the February 2011 issue. He spots files with spaces in them using the shell pattern "*\ *" and then mucks around changing spaces into other things. It's good stuff, but overkill for some cases.


For a long time now, I've been writing scripts that handle filenames both with and without spaces. You've got to know your shell and how Posix commands work! Commonly, I want to list files in a directory and do things to them whether or not they have spaces in them.

The shell breaks the result of an unquoted expansion, such as $(ls) or $files, into "words" based on whitespace (spaces, tabs, newlines). A bare pattern such as "*" is safe on its own, but as soon as the names it produces pass through an unquoted variable or command substitution, they get split. This stuffs up a filename if it has spaces in it, since the name then gets split into separate words. But commands such as "ls" (when not directed to a terminal) list each filename on a separate line. So if you have something that distinguishes between spaces/tabs and newlines then you can get complete filenames with or without spaces.
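To see the difference between splitting on all whitespace and splitting on newlines only, here is a small sketch. The scratch directory and the file name are made up for the demonstration, and it assumes the default IFS.

```shell
# Hypothetical scratch directory holding one file with a space in its name.
dir=$(mktemp -d)
touch "$dir/two words.txt"

# Unquoted command substitution: the shell splits on any whitespace,
# so the single filename is seen as two separate words.
words=0
for w in $(ls "$dir"); do
    words=$((words + 1))
done

# Counting lines instead keeps the name whole: ls prints one name per line.
lines=$(ls "$dir" | wc -l | tr -d ' ')

echo "words=$words lines=$lines"
```

The word count and the line count disagree for the same single file, which is exactly the gap the rest of this post exploits.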

The shell command "read" reads a line and breaks it into words. So
 read a b c
with input
 a line of text
will assign
 a="a"
 b="line"
 c="of text"
Just
  read line
will read all of the line into the variable. It stops reading at end-of-line, so it makes exactly the distinction I need: a newline ends a name, while spaces and tabs inside it do not.
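The behaviour of "read" described above can be tried out with here-documents. This is only a sketch: the sample file name is invented, and the "IFS=" and "-r" on the second read are an addition not used elsewhere in this post. They stop read from trimming leading and trailing blanks and from treating backslashes specially.

```shell
# Three variables: the first two take one word each,
# the last one takes the remainder of the line.
read a b c <<'EOF'
a line of text
EOF
printf 'a=[%s] b=[%s] c=[%s]\n' "$a" "$b" "$c"

# One variable: the whole line goes into it.
# IFS= preserves leading/trailing blanks; -r keeps backslashes literal.
IFS= read -r line <<'EOF'
  my holiday photo.jpg
EOF
printf 'line=[%s]\n' "$line"
```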

But how to use it? Well, the shell while loop is just another command, and as such can have its I/O redirected. So I do this:
 ls |
 # IFS= keeps leading/trailing blanks, -r keeps backslashes literal
 while IFS= read -r filename
 do
    # process filename, e.g.
    cp "$filename" ~/backups
 done
This works for filenames with or without spaces (only a name containing a newline will defeat it). Just don't forget the quotes while processing the file!
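Here is a self-contained version of the loop that can be run without touching real files. The scratch directories and file names are hypothetical, created with mktemp just for the demonstration. The "IFS=" and "-r" are a defensive assumption so that names with leading blanks or backslashes also survive.

```shell
# Hypothetical source and backup directories for the demo.
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/plain.txt" "$src/two words.txt" "$src/three word name.txt"

# ls writes one name per line because its output is a pipe, not a terminal.
ls "$src" |
while IFS= read -r filename
do
    # The quotes keep each name as a single argument to cp.
    cp "$src/$filename" "$dest"
done

# Show what arrived in the backup directory.
ls "$dest"
```

Note that the loop body runs at the end of a pipeline, so in most shells any variables set inside it are gone once the loop finishes.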

Of course, this doesn't work for all uses: note the find and xargs combination that Dave also commented on:
 find . -print0 | xargs -0 ...
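As a rough sketch of how that combination can be filled in: the directories and file names below are invented, and the "cp" via "xargs -I{}" is just one possible command to run. "-print0" and "-0" are supported by the GNU and BSD tools but may be missing on older systems. Because names are separated by NUL bytes rather than newlines, even a name containing a newline comes through intact.

```shell
# Hypothetical working and backup directories for the demo.
work=$(mktemp -d)
backup=$(mktemp -d)

# One name with spaces in a directory with a space, one with an embedded newline.
mkdir "$work/sub dir"
touch "$work/sub dir/two words.txt"
nl_name=$(printf 'odd\nname.txt')
touch "$work/$nl_name"

# find ends each path with a NUL byte; xargs -0 splits on NUL only,
# and -I{} runs cp once per path with {} replaced by that path.
find "$work" -type f -print0 | xargs -0 -I{} cp {} "$backup"

# Count what was copied.
copied=$(find "$backup" -type f -print0 | tr -dc '\0' | wc -c | tr -d ' ')
echo "copied=$copied"
```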
