Illegal Characters on NTFS Drive

mikeflan
Level 16
Posts: 6912
Joined: Sun Apr 26, 2020 9:28 am
Location: Houston, TX

Post by mikeflan »

The "crisis" has been averted. So this is really just for documentation and discussion.

I had an interesting problem develop and I have learned a few things while addressing it.
The problem was that a USB external drive formatted as NTFS ended up with a few files on it whose names contained characters that are illegal on Windows.
Some of the file names were like this:

Code: Select all

jelly?stuff.txt
jelly:stuff.txt
jelly//_stuff.txt
Yeah, that last one is somewhat interesting.

Anyway, the files got there when FreeFileSync copied them over from an internal EXT4 drive on my computer.

Windows could not even see the drive; it refused to browse it at all.
On Linux I could see the files on the NTFS drive, but I could not rename them to remove the illegal characters, even from the terminal. It would say something like "file not found".
rm -r dir and rm -r filename would not work either. It would say something like "directory not found".

I installed Detox, but oddly that would not work either, no matter what I did. I don't think it worked on ? or :, and it definitely did not work on //_. It also gave a "file not found" error. Perhaps Detox is meant to fix the names on the EXT4 partition before you copy them over?

So what was the solution? I saw on a website to use chkdsk /f f: in Windows. Oddly enough, after about 30 minutes it worked. It changed the offending filenames, and Windows would then browse the drive. Back on Linux, FFS worked well again. chkdsk did create another directory that probably shouldn't have been created, but overall it worked well. It pains me to give MS any credit at all, but I must in this case.

I have 2 questions:

How could I have fixed this in Linux?

How can I search an EXT4 partition for all NTFS "illegal characters"? Maybe something like this:
find . -type f -iname '*[:?/]*'
I suspect there are more illegal characters than I have listed.
Moonstone Man
Level 16
Posts: 6078
Joined: Mon Aug 27, 2012 10:17 pm

Re: Illegal Characters on NTFS Drive

Post by Moonstone Man »

mikeflan wrote: Wed Apr 14, 2021 8:55 pm I suspect there are more illegal characters than I have listed.
Many more. pax is best for copying files where there may be illegal NTFS characters in names:

pax -rw -s '/[?<>\\:*|\"]/_/gp' /source/path/ /destination/path/

It will change the invalid characters in the regex to underscores.
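
To preview what that substitution would do to a given name before running the copy, the same character class can be fed to sed. This is only a sketch: sanitize_name is a made-up helper, not part of pax, and it assumes sed handles the bracket expression the same way pax does.

```shell
# Hypothetical helper (not part of pax): apply the same character class
# as the pax -s expression above to a single name and print the result.
sanitize_name() {
    printf '%s\n' "$1" | sed 's/[?<>\\:*|"]/_/g'
}

# Example names; each forbidden character becomes an underscore.
sanitize_name 'jelly?stuff.txt'
sanitize_name 'Author: Title.epub'
```

Note that the forward slash is not in the class, since it separates path components rather than appearing inside a name.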
rossdv8
Level 7
Level 7
Posts: 1737
Joined: Wed Apr 23, 2014 4:48 am
Location: Within 2,000 kilometres of Alice Springs, Australia

Re: Illegal Characters on NTFS Drive

Post by rossdv8 »

This is just an addition to the discussion part from someone who has suffered this often when making backups of things like my /home.

Often if you are copying a large number of files from one drive to another and the copy process hangs for a long time, it might be worth checking the folder it is hung on for a file with a colon in the name.
Like file : name.ext.

I've noticed it especially with off-the-shelf external USB drives, when I've been copying a collection of books from a book reader to my hard drive, then from the hard drive to an external drive, where the Author and Title are separated with :

It is becoming a very common problem for some of us who use these readers.

Bony Featherfoot's suggestion to use pax might well be worth a look !!
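
A quick way to check a folder for such names before a copy might look like this. It is a minimal sketch, and find_colon_names is a made-up helper name.

```shell
# Hypothetical helper: list every file or directory under "$1" whose name
# contains a colon, so it can be renamed before copying to the external drive.
find_colon_names() {
    find "$1" -name '*:*'
}

# Example: check the current directory tree.
find_colon_names .
```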
Current main OS: MInt 21.3 with KDE Plasma 5.27 (using Compiz as WM) - Kernel: 6.5.0-15 on Lenovo m900 Tiny, i5-6400T (intel HD 530 graphics) 16GB RAM.
Sharks usually only attack you if you are wet

Re: Illegal Characters on NTFS Drive

Post by Moonstone Man »

rossdv8 wrote: Thu Apr 15, 2021 12:26 am Featherfoot...
Ah, you have clues. Now I must sing you.

Re: Illegal Characters on NTFS Drive

Post by rossdv8 »

I used to work on the Gulf and in the Territory. I've been lucky enough to be allowed to see certain things. I also have one front tooth knocked out :D
rene
Level 20
Posts: 12240
Joined: Sun Mar 27, 2016 6:58 pm

Re: Illegal Characters on NTFS Drive

Post by rene »

mikeflan wrote: Wed Apr 14, 2021 8:55 pm How could I have fixed this in Linux? How can I search an EXT4 partition for all NTFS "illegal characters"?
A note: the issue as such is not in fact NTFS itself but Windows and/or its applications (such as CHKDSK) stumbling over those filenames. That matters because it is the reason NTFS-3G does not disallow those characters by default: by default it uses the POSIX namespace, which NTFS itself supports.

You can disallow creating names that are illegal for Windows with the windows_names mount option; see man ntfs-3g, which also details which filenames are in fact illegal in Windows:

Code: Select all

windows_names
       This option prevents files, directories and extended attributes to be created with a name not allowed by windows, because

              - it contains some not allowed character,
              - or the last character is a space or a dot,
              - or the name is reserved.

       The forbidden characters are the nine characters " * / : < > ? \ | and those whose code is less than 0x20, and the reserved names  are  CON,
       PRN, AUX, NUL, COM1..COM9, LPT1..LPT9, with no suffix or followed by a dot.
       
       Existing such files can still be read (and renamed).
There's no option like e.g. CIFS's mapchars to automatically translate illegal characters to, for example, underscores, but this will at least make an attempt to create a file with a Windows-illegal filename produce an error.
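
For illustration, a mount with that option might look like the following. The device /dev/sdX1 and the mount point /mnt/usb are placeholders, not values from this thread, and ntfs_mount_command is a made-up helper. The sketch only prints the command, so nothing gets mounted by accident.

```shell
# Placeholders (assumptions): substitute your own device and mount point.
device=/dev/sdX1
mountpoint=/mnt/usb

# Print the mount command instead of running it; run it yourself as root
# once the placeholders are filled in.
ntfs_mount_command() {
    printf 'mount -t ntfs-3g -o windows_names %s %s\n' "$1" "$2"
}

ntfs_mount_command "$device" "$mountpoint"
```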

As for the illegal names listed above, I believe this monstrosity catches them all:

Code: Select all

find /where/ever/ -type f | grep '["*:<>?\\|[:cntrl:]]\|[[:space:]\.]$\|[\^/]\(CON\|PRN\|AUX\|NUL\|COM[[:digit:]]\|LPT[[:digit:]]\)$'
[EDIT] Minor adjustment: the above would miss files with an embedded \n in their name due to grep working on \n-separated lines by default. You can catch that case as well by e.g.

Code: Select all

find /where/ever/ -type f -print0 | grep -z '["*:<>?\\|[:cntrl:]]\|[[:space:]\.]$\|[\^/]\(CON\|PRN\|AUX\|NUL\|COM[[:digit:]]\|LPT[[:digit:]]\)$' | xargs -0 ls -l
There's always some detail to overlook...
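
A different take on the same check, sketched in plain shell rather than as one regular expression. It looks only at each file's own name, not at the directory part of the path, and it also flags reserved names followed by a dot (such as CON.txt), which the man page excerpt lists. The function names are made up for this example, and the line-based loop does not handle names with embedded newlines.

```shell
# Hypothetical helper: succeed if the name in "$1" breaks one of the
# windows_names rules quoted from man ntfs-3g.
is_windows_illegal() {
    case $1 in
        *\"*|*\**|*:*|*\<*|*\>*|*\?*|*\\*|*\|*) return 0 ;;  # forbidden punctuation
        *[[:cntrl:]]*) return 0 ;;   # control characters, including codes below 0x20
        *" "|*.) return 0 ;;         # trailing space or dot
    esac
    # Reserved device names, alone or followed by a dot.
    case ${1%%.*} in
        CON|PRN|AUX|NUL|COM[1-9]|LPT[1-9]) return 0 ;;
    esac
    return 1
}

# Hypothetical wrapper: print every regular file under "$1" whose own name
# is flagged. It reads find output line by line, so embedded newlines in
# names are not handled.
find_windows_illegal() {
    find "$1" -type f | while IFS= read -r path; do
        if is_windows_illegal "${path##*/}"; then
            printf '%s\n' "$path"
        fi
    done
}
```

Calling find_windows_illegal with the directory to check prints one flagged path per line. The forward slash is left out of the check because a name on a Linux filesystem cannot contain one.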

Re: Illegal Characters on NTFS Drive

Post by mikeflan »

Thanks to all who responded.
pax -rw -s '/[?<>\\:*|\"]/_/gp' /source/path/ /destination/path/
I was going to ask why the forward slash is not included here, but then I tried to create a file with a forward slash in its name (on EXT4) and it is not allowed. It is not allowed in Windows either. But somehow I ended up with half a dozen files that included 2 forward slashes! So I guess my search script will include a search for the forward slash. Linux doesn't allow creating a file with a forward slash in its name, but thankfully it allows removing one, both in Nemo and via the terminal rename command.

My problem first manifested when I was using FreeFileSync (FFS) to back up an EXT4 drive to an NTFS drive. FFS complained that it could not delete certain files, but allowed me to ignore the problem. I thought my hard drive was going south. Glad to hear that is not the problem (I hope).

I don't always update my software. If it works I keep using it. But in this case I just updated from FFS ver 11.3 to 11.9. I don't see anything in these upgrades that would change the FFS behavior:

Code: Select all

FreeFileSync 11.9 [2021-04-01]
------------------------------
Save different layouts depending on screen resolution
Fixed large file icon scaling quality (Windows)
Fixed broken default filter excluding DocumentRevisions (macOS)
Don't immediately exit terminal when installer error is showing (Linux)
Explicitly set file permissions when installing missing directories (Linux)
Support installation using noexec temp directory (Linux)
Don't fail installation if root is the only user (Linux)
Added automatic socket close on execv (Linux, macOS)
Fixed Google Drive login hanging after authentication (Linux)
Correctly generate and parse Windows epoch time (Windows, macOS)

FreeFileSync 11.8 [2021-03-03]
------------------------------
Fixed unexpected file size error when copying to (S)FTP, and Google Drive

FreeFileSync 11.7 [2021-03-01]
------------------------------
Detect moved files on FTP (if server supports MLSD)
Allow installation only for current or all users (Linux)
Added application uninstaller: uninstall.sh (Linux)
Use login user config path when running as root (macOS, Linux)
Fixed detection of moved files with unstable device IDs (macOS, Linux)
Strict checking for duplicate file IDs
Avoid EINVAL invalid argument error when using F_PREALLOCATE (macOS)
Restore input focus after closing log panel
Double-click on file to open Google Drive web interface
Fixed alpha channel image scaling glitch
Fixed recycle bin folders being created recursively
Fixed thread count status message fluctuation
Don't quit FreeFileSync when parent terminal is closed (SIGHUP)
Fixed "Operation not supported" error when setting directory locks
Show folder picker despite SHCreateItemFromParsingName() error
Work around "OLE received a packet with an invalid header" error

FreeFileSync 11.6 [2021-02-01]
------------------------------
New FreeFileSync installer (Linux)
New auto-updater for the Donation Edition (macOS, Linux)
Support reading FTP file symlinks
Added context menu option "Edit with FreeFileSync" (Linux, KDE)
Support starting via symlink (macOS)
Command line support with "freefilesync" symlink in /usr/local/bin (macOS)
Fixed starting via symlink found by PATH (Linux)
Preserve keyboard focus when starting sync via F9
Don't show relative parent path if folder does not exist
Added high-resolution application icons (Linux, macOS)
Work around "500 'HELP' command unrecognized" FTP error
Fixed menu bar icon not being removed immediately (macOS Big Sur)
Don't allow folder names ending with dot character (Windows)
Mitigate ERROR_ALREADY_ASSIGNED: Local Device Name Already in Use [Wnetaddconnection2]
Fixed startup failure when app folder contains back quote char (macOS)
Fixed network card not found error on virtual machine (KVM Linux)
Fixed RTL layout direction in popup dialogs

FreeFileSync 11.5 [2021-01-02]
------------------------------
New configuration context menu option to delete from disk
Start auto retry delay at time of error instead of reporting
Added error details to status message before retry
Improved color scheme to better integrate with system colors
Keep partial SFTP results after network failure
Fixed incorrect panel font (macOS Big Sur)
Fixed SFTP retry not working after network drop
Fixed crash on exit with floating panels (macOS Big Sur)
Fixed auto-close option not being remembered
Fixed installer high-DPI scaling issues
Fixed mouse hover issues with grid column header
Fixed menu bar icons not showing (Linux)
Removed redundant GUI layout recalculations
Keep correct panel sizes after log panel maximize
Support modern folder picker in installer
Don't raise progress dialog after sync when resuming from systray

FreeFileSync 11.4 [2020-12-04]
------------------------------
New progress graph "this one sparks joy"
Remember progress dialog size
New config file context menu option "Show in file manager"
Work around libcurl performance bug during FTP upload
Only log modification time errors after comparing by size or content
Smaller icon size for efficient screen layout (Linux)
Use system-native recycle bin icon
Fixed DeviceIoControl(IOCTL_VOLUME_GET_VOLUME_DISK_EXTENTS): ERROR_MORE_DATA
Support MTP devices lacking a friendly name
Fix grid scrolling with small mouse rotations (macOS)
Faster mouse scrolling on high-DPI resolution displays
Keep previous windows size when maximized during auto-exit
rene wrote: the above would miss files with an embedded \n in their name
I have had that happen. When I first moved to Linux one of my Perl scripts created zip files with a newline at the end. It took me a while to figure it out because if all the files in a directory have a newline in them you don't necessarily see the problem in Nemo. And I didn't have enough experience with the terminal to see the problem at first.
find /where/ever/ -type f -print0 | grep -z '["*:<>?\\|[:cntrl:]]\|[[:space:]\.]$\|[\^/]\(CON\|PRN\|AUX\|NUL\|COM[[:digit:]]\|LPT[[:digit:]]\)$' | xargs -0 ls -l
Oddly, that works good as long as there is a file with an illegal character in the directory. But if no files contain illegal characters it prints out all files in the directory! I haven't figured out why yet.

This has some interesting info:
https://stackoverflow.com/questions/197 ... tory-names
Flemur
Level 20
Posts: 10097
Joined: Mon Aug 20, 2012 9:41 pm
Location: Potemkin Village

Re: Illegal Characters on NTFS Drive

Post by Flemur »

mikeflan wrote: Wed Apr 14, 2021 8:55 pm

Code: Select all

jelly//_stuff.txt
Yeah, that last one is somewhat interesting.
That's an invalid filename in linux because of the "/"; did it exist in the source?
Please edit your original post title to include [SOLVED] if/when it is solved!
Your data and OS are backed up....right?

Re: Illegal Characters on NTFS Drive

Post by rene »

mikeflan wrote: Thu Apr 15, 2021 9:01 am Oddly, that works good as long as there is a file with an illegal character in the directory. But if no files contain illegal characters it prints out all files in the directory! I haven't figured out why yet.
It's just the xargs -0 ls -l turning into a plain ls -l when the pipeline isn't in fact delivering it any filenames; with no arguments, ls -l lists the current directory. As said, there's always some detail to overlook on the Linux command line, but it was a very minor detail anyway. If you don't intend to be looking for files with embedded newlines in their filenames, the one before the edit will do.
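
One way to avoid that empty-input case, assuming GNU findutils (the -r flag, also spelled --no-run-if-empty, is a GNU extension to xargs), is sketched below. list_matches is a made-up wrapper name, and the pattern is passed in as an argument rather than being the full expression from earlier in the thread.

```shell
# Hypothetical wrapper around the pipeline above: "$1" is the directory to
# search and "$2" is the grep pattern. The -r flag tells xargs not to run
# ls at all when grep passes nothing through.
list_matches() {
    find "$1" -type f -print0 | grep -z -- "$2" | xargs -0 -r ls -l
}
```

With no matching files, list_matches prints nothing instead of listing the current directory.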

Re: Illegal Characters on NTFS Drive

Post by mikeflan »

Flemur wrote: That's an invalid filename in linux because of the "/"; did it exist in the source?
I'm still trying to figure out how those illegal filenames got into some of my text files. Something (probably me) changed them. The last thing I did to that whole directory structure was remove the spaces in the filenames with this:

Code: Select all

find . -type f -name "* *.txt" -exec rename "s/\s/_/g" {} \;
No, that didn't do it. I sure don't know yet how it happened.

Re: Illegal Characters on NTFS Drive

Post by Moonstone Man »

mikeflan wrote: Thu Apr 15, 2021 5:14 pm I'm still trying to figure out how those illegal filenames got into some of my text files.

[Image: fat.fingers.jpg]

Re: Illegal Characters on NTFS Drive

Post by mikeflan »

No, it definitely is not that. It happened in a directory with 59,000 text files in it. Many files named in a similar format. Say 40 files have the same format, but 8 of them (all in a vertical row; next to each other) get a colon put into them. It might be my Perl programs which copy these files from another directory, but I don't think so. I will learn more when I do the procedure again. The procedure is very simple, but I have a lot going on right now.