Wednesday, May 8, 2019

Find files with non-ASCII chars in the file name




Is there a way to find files with non-ASCII characters in their names? I could of course pipe the output of find through a perl filter, but for efficiency I'd like to do it all within find. I tried the following:



find . -type f -name '*[^[:ascii:]]*'


It doesn't work at all.



Edit:




I'm now trying to make use of



find . -type f -regex '.*[^[:ascii:]].*'


The default -regex syntax is the emacs one, and emacs regexps have an [:ascii:] class. But the expression I'm trying to use doesn't work either.



Edit 2:




LC_COLLATE=C find . -type f -regex '.*[^!-~].*'


This matches files with non-ASCII chars (complete voodoo...), but it also matches files with a space in the name, because the range !-~ starts at '!' (0x21) and so leaves out the space character (0x20).
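The space match can be checked from the byte values at the edges of the range. A minimal sketch, where `ascii_code` is a hypothetical helper name and the only mechanism used is the POSIX `printf` rule that an argument starting with a quote yields the numeric value of the following character:

```shell
# Print the decimal value of the first character of each argument.
# ascii_code is a hypothetical helper, not part of find.
ascii_code() {
    for ch in "$@"; do
        # A leading single quote makes printf output the character's numeric value.
        printf '%d\n' "'$ch"
    done
}

ascii_code ' ' '!' '~'
# prints 32, 33 and 126, one per line
```

Space (32) sits just below '!' (33), so a class negating !-~ treats it like any other out-of-range byte.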


Answer



This seems to work for me with both the default and the posix-extended regex type:



LC_COLLATE=C find . -regex '.*[^ -~].*'



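The range from space to ~ covers the printable ASCII characters, so the negated class matches any path containing a character outside that range. There could be locale-related issues, though. I don't have a large corpus of non-ASCII filenames to test it on, but it catches the ones I have.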
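For comparison, here is a hedged variant that goes back to -name instead of -regex. It is a sketch, not the command from the answer above: `find_non_ascii` is a hypothetical wrapper name, and it assumes that under `LC_ALL=C` the bracket range is compared byte by byte. `LC_ALL` is used rather than `LC_COLLATE` because an `LC_ALL` already set in the environment would override `LC_COLLATE`.

```shell
# find_non_ascii [DIR]: list regular files under DIR (default: .) whose
# base name contains a byte outside the printable ASCII range space..~.
# Hypothetical helper; assumes C-locale byte-wise range matching.
find_non_ascii() {
    # -name tests only the base name, so a non-ASCII parent directory
    # does not make every file below it match (unlike -regex, which
    # is matched against the whole path).
    LC_ALL=C find "${1:-.}" -type f -name '*[! -~]*'
}
```

Calling `find_non_ascii ~/Downloads` (any directory works) should list names such as café.txt while skipping plain names and names that contain only spaces. Like the regex in the answer, it also reports names containing control characters, since those fall outside the range as well.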

