Not UTF-8 valid

Seff

Not UTF-8 valid

Post by Seff »

Hello. I have some bash scripts that don't run properly, namely scripts from gog.com. Trying to run them, I get the error "gtk: Locale not supported by C library." Opening them in Mousepad results in "This document was not UTF-8 valid." How can I find out what the proper locale is? Thank you. Also, I note that they seem to have been made with gtk 2.0.
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 1 time in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
rene
Level 20
Posts: 12240
Joined: Sun Mar 27, 2016 6:58 pm

Re: Not UTF-8 valid

Post by rene »

"gtk" warnings are not produced by the shell itself; will be from whichever graphical program said script launches. The question is as such perhaps not fully fitting here, but this just so as to comment on the issue in a technical sense; not to say it should be elsewhere...

It seems likely that your default LANG or LC_<foo> setting names a locale that is not in fact installed. Run locale to see your current settings and locale -a to list the installed locales. If any are missing, edit /etc/locale.gen as root, uncomment the ones you additionally need, save, and run sudo locale-gen.
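
A minimal sketch of that comparison, in case it helps. The helper names normalize_locale and locale_in_list are made up for illustration. The only assumption is that locale -a may spell the codeset differently from the variable (en_US.utf8 versus en_US.UTF-8), so both sides are normalised before comparing:

```shell
#!/bin/sh
# Hypothetical helper: lower-case a locale name and spell the codeset
# without the hyphen, so "en_US.UTF-8" and "en_US.utf8" compare equal.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/utf-8/utf8/'
}

# Hypothetical helper: succeed if locale $1 occurs in the list read
# from stdin (one locale name per line, as printed by "locale -a").
locale_in_list() {
    want=$(normalize_locale "$1")
    while IFS= read -r have; do
        [ "$(normalize_locale "$have")" = "$want" ] && return 0
    done
    return 1
}
```

Usage would be along the lines of: locale -a | locale_in_list "$LANG" && echo installed || echo missing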

As to mousepad complaining: hard to say anything without an example of a script for which it complains.
Seff

Re: Not UTF-8 valid

Post by Seff »

rene

Re: Not UTF-8 valid

Post by rene »

Seems to not be something to test stand-alone. Have you tried the things I mentioned?
xenopeek
Level 25
Posts: 29459
Joined: Wed Jul 06, 2011 3:58 am

Re: Not UTF-8 valid

Post by xenopeek »

Indeed, it would be interesting to see the output of the locale and locale -a commands.

Isn't gog using a shell script with embedded binary content? One very big .sh file that indeed won't open in a text editor.

Failing anything else, you could try running the installer from the terminal, prefixing the installer command with LC_ALL=C to make it use the default C (English) locale.
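
As a sketch of what that prefix does: an assignment placed in front of a command is put into the environment of that one command only. The wrapper name run_in_c_locale and the installer file name in the comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run any command with LC_ALL=C for that command only.
run_in_c_locale() {
    LC_ALL=C "$@"
}

# The child process sees LC_ALL=C; the calling shell's settings are untouched.
run_in_c_locale sh -c 'printf "child sees LC_ALL=%s\n" "$LC_ALL"'

# An installer would be started the same way (file name is an assumption):
#   LC_ALL=C ./gog_installer.sh
```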
Seff

Re: Not UTF-8 valid

Post by Seff »

Code: Select all

locale
LANG=en_US.UTF-8
LANGUAGE=en_US:en_GB:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=
I can run "Blocks That Matter", so call it a partial success. (rene is probably right about this topic being in the wrong forum.)

Code: Select all

locale -a
C
C.UTF-8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_IN.utf8
en_NG
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM
en_ZM.utf8
en_ZW.utf8
POSIX
rene

Re: Not UTF-8 valid

Post by rene »

Not seeing anything wrong; you have en_US.UTF-8 configured and installed. Sorry, no idea then what the issue is.
catweazel
Level 19
Posts: 9763
Joined: Fri Oct 12, 2012 9:44 pm
Location: Australian Antarctic Territory

Re: Not UTF-8 valid

Post by catweazel »

Seff wrote: Mon Oct 15, 2018 5:40 pm "This document was not UTF-8 valid."
That makes me suppose that the file is missing its two byte UTF8 header.
"There is, ultimately, only one truth -- cogito, ergo sum -- everything else is an assumption." - Me, my swansong.
rene

Re: Not UTF-8 valid

Post by rene »

Sorry to have to disagree with you again, but there is no such thing as a "two byte UTF8 header" on a shell script. xenopeek's suggestion that the script embeds a binary seems likely; random binary data will certainly not all be valid UTF-8. However, there is very little useful to add without an example of the script(s), so, well, ...
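
A rough way to check this without a text editor, as a sketch only: round-trip the file through iconv, which fails on bytes that cannot be UTF-8. The helper name is_valid_utf8 and the sample file are made up for illustration, and iconv is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the whole file decodes as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Stand-in for a self-extracting installer: a readable script header
# followed by bytes that can never occur in UTF-8 (0xFF 0xFE).
tmp=$(mktemp)
printf '#!/bin/sh\necho header\nexit 0\n\377\376\n' > "$tmp"

if is_valid_utf8 "$tmp"; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi

# The text part at the top can still be read line by line.
head -n 2 "$tmp"
rm -f "$tmp"
```

That an editor rejects such a file says nothing about whether the script itself runs.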
catweazel

Re: Not UTF-8 valid

Post by catweazel »

rene wrote: Fri Oct 26, 2018 8:50 pm Sorry having to disagree with you again
It's no skin off my nose. I was merely supposing anyway.

Cheers.
"There is, ultimately, only one truth -- cogito, ergo sum -- everything else is an assumption." - Me, my swansong.
Seff

Re: Not UTF-8 valid

Post by Seff »

There's little doubt the script runs binary code: it invokes MojoSetup, which installs a game.

Code: Select all

abel="Blocks That Matter (GOG.com)"
script="./startmojo.sh"
scriptargs=""
licensetxt=""
targetdir="binaries"
filesizes="679302"
keep="n"
quiet="n"
It's a very long script; this is just a small sample. Might be best to let it lie.
gm10

Re: Not UTF-8 valid

Post by gm10 »

Seff wrote: Thu Oct 25, 2018 12:08 pm

Code: Select all

LANGUAGE=en_US:en_GB:en
LC_ALL=
Change the first one to en_US.UTF-8 and either remove the other one or set it to en_US.UTF-8 as well and you should be good. The file to edit should be ~/.pam_environment, or if it's not in there then add it and/or find where the original line comes from and change it there.
catweazel wrote: Fri Oct 26, 2018 8:58 pm
rene wrote: Fri Oct 26, 2018 8:50 pm Sorry having to disagree with you again
It's no skin off my nose. I was merely supposing anyway.
Actually you're both wrong. There's no reason why a shell script couldn't have a UTF BOM (aka "two byte UTF8 header") but the error message comes from him loading binary data into a text editor.
Last edited by gm10 on Sat Oct 27, 2018 3:11 pm, edited 2 times in total.
rene

Re: Not UTF-8 valid

Post by rene »

gm10 wrote: Sat Oct 27, 2018 1:20 pm There's no reason why a shell script couldn't have a UTF BOM (aka "two byte UTF8 header")
There certainly is.

Code: Select all

rene@t5500:~$ cat foo.sh 
#!/bin/sh
echo foo
rene@t5500:~$ xxd foo.sh 
00000000: efbb bf23 212f 6269 6e2f 7368 0a65 6368  ...#!/bin/sh.ech
00000010: 6f20 666f 6f0a                           o foo.
rene@t5500:~$ ./foo.sh 
./foo.sh: line 1: #!/bin/sh: No such file or directory
foo
Note that 0xef 0xbb 0xbf is the UTF-8 encoding of the Unicode BOM (U+FEFF). The error shows that the first line is no longer even recognized as a shebang: as far as the shell is concerned, the leading \ufeff is just random garbage. In other words, there is no such thing as a [ ... ] header on a shell script.

Moreover, the fact that it still ends up echoing in the specific case above is a mere historical accident: the shell invokes itself on the file when the system-based shebang invocation fails. The BOM fully breaks any other kind of script. E.g.,

Code: Select all

rene@t5500:~$ cat foo.awk 
#!/usr/bin/awk -f
BEGIN { print "foo" }
rene@t5500:~$ xxd foo.awk 
00000000: efbb bf23 212f 7573 722f 6269 6e2f 6177  ...#!/usr/bin/aw
00000010: 6b20 2d66 0a42 4547 494e 207b 2070 7269  k -f.BEGIN { pri
00000020: 6e74 2022 666f 6f22 207d 0a              nt "foo" }.
rene@t5500:~$ ./foo.awk 
./foo.awk: line 1: #!/usr/bin/awk: No such file or directory
./foo.awk: line 2: BEGIN: command not found
The LANGUAGE advice is, by the way, also wrong; that one isn't an LC_ var and is syntactically fine as is.
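
To illustrate why the empty LC_ALL= in the posted output is harmless as well, here is a sketch of the usual lookup order for a locale category: a non-empty LC_ALL wins, then the category's own variable, then LANG. The function name effective_locale is hypothetical, and the final fallback to C is an assumption (POSIX leaves that default implementation-defined).

```shell
#!/bin/sh
# Hypothetical helper: print the locale that applies to one category.
# $1 is the value of that category's own variable, e.g. "$LC_MESSAGES".
# Empty variables are treated the same as unset ones.
effective_locale() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'    # assumed default when nothing is set
    fi
}
```

With LC_ALL empty and LANG=en_US.UTF-8, as in the output earlier in the thread, effective_locale "" prints en_US.UTF-8.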
gm10

Re: Not UTF-8 valid

Post by gm10 »

rene wrote: Sat Oct 27, 2018 2:26 pm Note, 0xef 0xbb 0xbf is the UTF-8 encoding of the Unicode BOM (U+FEFF); the error says that the first line is no longer even recognized as a shebang;
I stand corrected. Apparently despite UTF-8 support in the shell the first character must be ASCII. How interesting, I never knew. Thx.
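
For anyone who ends up with a BOM at the start of a script, a minimal sketch of detecting and stripping it follows. The helper names has_bom and strip_bom are made up for illustration; the byte values are the ones from the xxd output above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the bytes EF BB BF.
has_bom() {
    [ "$(od -An -tx1 -N3 "$1" | tr -d ' \n')" = "efbbbf" ]
}

# Hypothetical helper: write the file to stdout without a leading BOM.
strip_bom() {
    if has_bom "$1"; then
        tail -c +4 "$1"    # start output at byte 4, skipping the BOM
    else
        cat "$1"
    fi
}

# Demo on a throwaway file equivalent to the foo.sh example.
tmp=$(mktemp)
printf '\357\273\277#!/bin/sh\necho foo\n' > "$tmp"
has_bom "$tmp" && echo "BOM found"
strip_bom "$tmp" | sh
rm -f "$tmp"
```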