msort: Unicode normalization failed [SOLVED]
Posted: Wed Jan 25, 2012 12:24 pm
I am trying to use msort in Linux Mint 12 to alphabetize files containing Unicode characters, but I get "Unicode normalization failed" no matter what arguments I use.
The puzzling thing is that the command works without any problems in Ubuntu 11.04. In Linux Mint 12, no luck in either Cinnamon/gnome-terminal or MATE/mate-terminal. In Ubuntu, msort works without specifying the Unicode normalization mode (i.e., the default NFC mode is used--whatever that means), whereas in Mint I've tried all normalization modes and all of them give me the same "Unicode normalization failed" error.
Another puzzling thing is that the plain "sort" command does not give me an error in Mint, but unfortunately it does not quite get the job done: sort makes no distinction between c and ĉ, g and ĝ, h and ĥ, etc. I specifically want c to sort before ĉ, g before ĝ, h before ĥ, and so on. msort can handle custom sort orders, whereas sort cannot.
Here's the simple command I'm trying to execute:
Code:
msort -s file-with-sort-order unsorted-list > sorted-list
The file-with-sort-order simply contains the sort order (in this case the Esperanto alphabet: abcĉdefgĝhĥijĵklmnoprsŝtuŭvz). Obviously, both the file-with-sort-order and unsorted-list contain simple text and are saved with UTF-8 encoding.
Any ideas why I get the "Unicode normalization failed" error in Linux Mint 12 but not in Ubuntu 11.04? And, of course, how do I get past this error so that I can use msort in Linux Mint? This is the last task I'd like to be able to do in Mint before I completely convert from Ubuntu. Any help would be much appreciated.
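For anyone wondering what NFC normalization and a custom sort order amount to in practice, here is a minimal Python sketch of the same idea. It is not how msort works internally and it does not explain the Mint error; it only illustrates the two concepts. The names ALPHABET, RANK, sort_key and sort_lines are made up for this example, and the word list is just sample data.

```python
import unicodedata

# Esperanto alphabet from the post, used as the custom collation order.
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def sort_key(line):
    # NFC composes "c" + combining circumflex (U+0302) into the single
    # code point "ĉ" (U+0109), so both spellings get the same rank.
    text = unicodedata.normalize("NFC", line).lower()
    # Characters outside the alphabet sort after all known letters,
    # ordered among themselves by code point.
    return [(RANK.get(ch, len(ALPHABET)), ch) for ch in text]


def sort_lines(lines):
    # Returns a new list; the original strings are left unmodified.
    return sorted(lines, key=sort_key)


if __name__ == "__main__":
    # "c\u0302apo" is "ĉapo" written in decomposed form.
    words = ["ĉevalo", "cent", "ĝojo", "gasto", "c\u0302apo", "horo", "ĥoro"]
    for word in sort_lines(words):
        print(word)
```

With this key, c sorts before ĉ, g before ĝ, and h before ĥ, which is the ordering asked for above. Normalizing first matters because the same visible letter can be stored either as one code point or as a base letter plus a combining accent.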