Mac OS X, the Terminal, Unicode and ls

Years ago I figured out how to configure Mac OS X’s Terminal.app and .inputrc to support UTF-8 encoded umlauts in terminal windows. But one essential piece was still missing. I could enter umlauts, vim displayed them, “ls | cat” showed correct umlauts in file names, and even bash completion handled them properly, but plain ls failed to display UTF-8 encoded umlauts correctly. Today, after probably five years, I found the first workaround: ls -w.

   -w    Force raw printing of non-printable characters.  This is the default
         when output is not to a terminal.

This workaround is ugly, but it works. It indicates that ls fails to recognize umlauts as printable characters even though the locale is set to a German UTF-8 locale. Forcing ls to ignore the fact that a character is supposedly non-printable may well cause trouble as soon as a file name contains genuinely non-printable characters, because those are then written to the terminal raw as well. Until that happens I’ll be happy with this workaround. At long last, it seems feasible to use proper German file names; this was the last missing piece for me.
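
To avoid typing -w every time, the workaround could be wired into ~/.bashrc. The following is only a minimal sketch, not a tested setup from my own configuration: the helper names effective_ctype and is_utf8_locale are made up for illustration, and the check assumes that a locale name containing “UTF-8” (in one of its common spellings) means the terminal really expects UTF-8. The alias is restricted to OS X because GNU ls uses -w for the output width instead.

```shell
# Sketch for ~/.bashrc: enable the "ls -w" workaround only where it applies.
# effective_ctype and is_utf8_locale are hypothetical helpers, not standard
# commands.

# Print the locale that governs character classification, following the
# POSIX precedence LC_ALL > LC_CTYPE > LANG (falling back to "C").
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Succeed if the given locale name mentions a UTF-8 codeset.
is_utf8_locale() {
    case "$1" in
        *UTF-8*|*utf-8*|*UTF8*|*utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only alias on OS X (Darwin); on GNU ls, -w takes a width argument.
if [ "$(uname -s)" = "Darwin" ] && is_utf8_locale "$(effective_ctype)"; then
    alias ls='ls -w'
fi
```

With this in place, a plain ls in a new terminal window would run ls -w whenever the locale looks like UTF-8, and \ls still reaches the unaliased command if raw output ever causes trouble.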

In the end it turns out there seems to be one more programmer (the author of OS X’s ls) who should have read Joel Spolsky’s blog entry “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)”. I wish there were a law forcing everyone to read it before being allowed to write a single line of code. It would probably have saved me hundreds of hours of my life.


One Response to Mac OS X, the Terminal, Unicode and ls

  1. jonasbn says:

    Great tips – thanks :-)
