Chris Harrington (Personal)
2007-Aug-20 12:14 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
I'm moving my Ogg Vorbis collection off of my Linux server and onto my laptop. I plan on using iTunes to play my collection, but that's a whole 'nother can of worms.

I'm having trouble moving my complete collection over because dbPowerAmp (an application I used to love) made some dumb decisions about naming my files. For example, in track/album/artist names that contained a question mark ("O Brother, Where Art Thou?") it would insert the octal byte 0277, an upside-down question mark.

It was a curiosity in Windows, but it's a real headache now. I have a LOT of files that I'm copying over, and I have a lot of these poorly-named files to deal with.

I transfer the files across using a really basic netcat pipe:

    tar -c | nc :: nc | tar -x

but tar pukes on the files with bad filenames.

As with the example I cite above, I can't seem to actually pass the correct parameter to rename to get it to remove \0277 from my filenames. I really don't want to do this manually.

    $ ls -b *Thou*18*
    Various\ -\ O\ Brother,\ Where\ Art\ Thou\277\ (18)\ -\ Fairfield\ Four\ -\ Lonesome\ Valley\ [4.07].ogg
    $ rename Thou\0277 Thou *.ogg
    $ ls -b *Thou*18*
    Various\ -\ O\ Brother,\ Where\ Art\ Thou\277\ (18)\ -\ Fairfield\ Four\ -\ Lonesome\ Valley\ [4.07].ogg

Any bright ideas? Awk, maybe? Hmm.

Thanks.
-Chris

PS Sorry for being off topic. I'm sure other Vorbis enthusiasts have come across this migration issue before.
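A likely reason the rename above changes nothing: an unquoted backslash is consumed by the shell, so rename is handed the plain text "Thou0277" rather than "Thou" followed by the byte 0277. The following is a minimal sketch, not part of the original post, that prints the bytes each spelling really produces. The helper name show_bytes is hypothetical, and only printf, od, tr and sed are assumed.

```shell
# What the shell actually passes for each spelling of the pattern.
# show_bytes is a hypothetical helper, not something from the thread.
show_bytes() {
    # Print the argument's bytes as space-separated octal values.
    printf '%s' "$1" | od -An -to1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Unquoted: the shell drops the backslash and keeps the digits as text.
unquoted=$(show_bytes Thou\0277)

# printf interprets \277 in its format string and emits the raw byte.
raw=$(show_bytes "$(printf 'Thou\277')")

echo "unquoted: $unquoted"
echo "raw byte: $raw"
```

The first line should list eight values (the four letters plus the digits 0, 2, 7, 7 as ordinary characters), the second only five, ending in 277. Only the second form matches what is really in the filename.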
Jernej Simončič
2007-Aug-20 13:16 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
On Monday, August 20, 2007, 21:07:05, Chris Harrington (Personal) wrote:
> Like the example I cite above, I can't seem to actually pass the correct
> parameter to rename to get it to remove \0277 from my filenames. I
> really don't want to do this manually.

Try temporarily changing your locale to ISO-8859-1 (or Windows-1250, depending on what was used when dbPowerAmp renamed those files); you should then be able to use rename.

-- 
< Jernej Simončič ><><><><>< http://deepthought.ena.si/ >
A memorandum is written not to inform the reader but to protect the writer.
-- Acheson's Rule of the Bureaucracy
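Which of the two charsets fits depends on what the byte 0277 was meant to be. A minimal sketch, assuming an iconv that recognises the names ISO-8859-1 and CP1250 (the usual iconv name for Windows-1250), decodes the single byte both ways so the results can be compared with what dbPowerAmp displayed:

```shell
# Decode the single byte 0277 (hex BF) under each candidate charset.
# Assumption: iconv is installed and knows both charset names.
latin1=$(printf '\277' | iconv -f ISO-8859-1 -t UTF-8)
cp1250=$(printf '\277' | iconv -f CP1250 -t UTF-8)

echo "ISO-8859-1:   $latin1"
echo "Windows-1250: $cp1250"
```

In ISO-8859-1 that byte is the upside-down question mark Chris describes, so that charset looks like the better guess here.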
Chris Harrington (Personal)
2007-Aug-20 13:17 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
Chris Harrington (Personal) wrote:
> I'm having trouble moving my complete collection over because dbPowerAmp
> (an application I used to love) made some dumb decisions about naming my
> files. For example, in track/album/artist names that contained a
> question mark ("O Brother, Where Art Thou?") it would insert the octal
> byte 0277; an upside-down question mark.
> [blah blah blah...]

So awk hadn't yet occurred to me until I wrote that email. Here's a basic awk script that takes care of the problem (it's naive, and it doesn't *solve* the problem; it just makes the problem *solvable*) to be used in conjunction with the -b option of ls.

<script language="gawk">
/\\0/ { printf("mv %s", $0); gsub(/\\0/, "OCTAL0"); printf(" %s\n", $0); }
/\\1/ { printf("mv %s", $0); gsub(/\\1/, "OCTAL1"); printf(" %s\n", $0); }
/\\2/ { printf("mv %s", $0); gsub(/\\2/, "OCTAL2"); printf(" %s\n", $0); }
/\\3/ { printf("mv %s", $0); gsub(/\\3/, "OCTAL3"); printf(" %s\n", $0); }
</script>

This way, when you're trying to fix filenames using "rename", you can type

    $ rename ThouOCTAL277 Thou *.ogg

to handle the issue intelligently.

To run, pipe in ls -b on the files you're examining, like this:

    $ ls -b *.ogg | awk -f findBadOnes.awk > killoctal.sh

The "gsub" function requires "gawk/nawk", but I don't think anybody still runs old "awk".

-Chris
Chris Harrington (Personal)
2007-Aug-20 14:36 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
Larry Fenske wrote:
> You could also have used "tr", something like this:
>
> for i in *Thou*18*
> do
>     j=`echo "$i" | tr -d "\277"`
>     if [ "$i" != "$j" ]
>     then
>         mv -i "$i" "$j"
>     fi
> done

Interesting. I saw "tr". Someone described "rename" as "tr/mv"...

Larry Fenske also wrote:
> I just now discovered that this should also work:
>
> rename Thou`echo -ne "\277"` Thou *.ogg

I had to change it to

    echo -ne "\0277"

but yeah, that works really well. Go go secret bash magic.

Thanks.
-Chris
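The \277 versus \0277 difference comes from how a given echo treats escapes, which varies between shells. A sketch that sidesteps echo entirely: printf accepts \277 in its format string in any POSIX shell. The helper replace_first is hypothetical; it only imitates, for a single name held in a variable, what "rename FROM TO" does, and touches no files.

```shell
# Build the raw byte with printf instead of relying on echo's escapes.
LC_ALL=C; export LC_ALL            # treat names as plain bytes
bad=$(printf '\277')

# replace_first NAME FROM TO: hypothetical helper that swaps the first
# occurrence of FROM in NAME, much as "rename FROM TO" would for one name.
replace_first() {
    case $1 in
        *"$2"*) printf '%s%s%s\n' "${1%%"$2"*}" "$3" "${1#*"$2"}" ;;
        *)      printf '%s\n' "$1" ;;
    esac
}

old="Where Art Thou${bad} (18).ogg"
new=$(replace_first "$old" "Thou${bad}" "Thou")
echo "$new"
```

With the real tool, the same idea would read: rename "Thou$(printf '\277')" Thou *.ogg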
vorbis@towanda.com
2007-Aug-20 14:53 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
You could also have used "tr", something like this:

    for i in *Thou*18*
    do
        j=`echo "$i" | tr -d "\277"`
        if [ "$i" != "$j" ]
        then
            mv -i "$i" "$j"
        fi
    done

I just now discovered that this should also work:

    rename Thou`echo -ne "\277"` Thou *.ogg

- Larry Fenske

> $ ls -b *Thou*18*
> Various\ -\ O\ Brother,\ Where\ Art\ Thou\277\ (18)\ -\ Fairfield\ Four\ -\ Lonesome\ Valley\ [4.07].ogg
> $ rename Thou\0277 Thou *.ogg
> $ ls -b *Thou*18*
> Various\ -\ O\ Brother,\ Where\ Art\ Thou\277\ (18)\ -\ Fairfield\ Four\ -\ Lonesome\ Valley\ [4.07].ogg
>
> Any bright ideas? Awk, maybe? Hmm.
>
> Thanks.
> -Chris
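The tr loop extends naturally to a whole directory. The sketch below is one possible variant, run against a scratch directory rather than a real collection: the directory and sample file names are made up for illustration, and printf stands in for echo so that unusual names are passed to tr unchanged.

```shell
# Variant of the tr loop applied to every .ogg file in one directory.
# Assumption: a scratch directory with made-up names stands in for the
# real collection.
LC_ALL=C; export LC_ALL
dir=$(mktemp -d)
bad=$(printf '\277')
: > "$dir/Where Art Thou${bad} (18).ogg"   # sample file with the bad byte
: > "$dir/Lonesome Valley.ogg"             # already clean, must be left alone

for i in "$dir"/*.ogg
do
    j=$(printf '%s' "$i" | tr -d '\277')   # strip the byte from the path
    if [ "$i" != "$j" ]
    then
        mv -i -- "$i" "$j"                 # -i asks before overwriting
    fi
done

ls "$dir"
```

Because tr sees the full path, this only works as intended while the directory part itself contains no 0277 bytes.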
Ulrich Windl
2007-Aug-20 23:35 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
On 20 Aug 2007 at 14:07, Chris Harrington (Personal) wrote:
> I'm moving my Ogg Vorbis collection off of my Linux server and onto my
> laptop. I plan on using iTunes to play my collection, but, that's a
> whole 'nother can of worms.
>
> I'm having trouble moving my complete collection over because dbPowerAmp
> (an application I used to love) made some dumb decisions about naming my
> files. For example, in track/album/artist names that contained a
> question mark ("O Brother, Where Art Thou?") it would insert the octal
> byte 0277; an upside-down question mark.
>
> It was a curiosity in Windows, but it's a real headache now. I have a
> LOT of files that I'm copying over, and I have a lot of these
> poorly-named files to deal with.
>
> I transfer the files across using a really basic netcat pipe:
> tar -c | nc :: nc | tar -x

In a first attempt, have you tried replacing tar with zip/unzip? Most likely you'll have to fix your filenames on Linux before transferring.

> but tar pukes on the files with bad filenames.
>
> Like the example I cite above, I can't seem to actually pass the correct
> parameter to rename to get it to remove \0277 from my filenames. I
> really don't want to do this manually.

sed -e 's/[^-A-Z, 0-9]/_/g' # very brutal, but you get the idea...

Ulrich
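The sed expression above also replaces lowercase letters, dots and parentheses, which is presumably the "brutal" part. A gentler sketch of the same whitelist idea follows; the wider character list and the helper name sanitize are assumptions rather than anything from the thread. LC_ALL=C is set because in a UTF-8 locale a stray byte may not match a bracket expression at all.

```shell
# Whitelist approach: anything outside the allowed set becomes "_".
LC_ALL=C; export LC_ALL            # make sed work on single bytes

# sanitize NAME: hypothetical helper built around the sed idea above,
# with a wider whitelist (letters of both cases, digits, " ,.()_-").
sanitize() {
    printf '%s\n' "$1" | sed -e 's/[^-A-Za-z0-9 ,.()_]/_/g'
}

name=$(printf 'O Brother, Where Art Thou\277 (18).ogg')
clean=$(sanitize "$name")
echo "$clean"
```

Here the stray byte turns into an underscore while the rest of the name survives; the whitelist would need square brackets and any other wanted characters added before use on real names.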
Daniil Kolpakov
2007-Aug-21 01:14 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
On 20 August 2007, Chris Harrington (Personal) wrote:
> I transfer the files across using a really basic netcat pipe:
> tar -c | nc :: nc | tar -x

What about scp instead of this?

-- 
/dev/brains: permission denied
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
Mark Doll
2007-Aug-21 01:52 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
There's a perl script named `convmv' (http://j3e.de/linux/convmv/), which converts filenames between character sets. Mark.
Paul Martin
2007-Aug-24 10:58 UTC
[Vorbis] [sorta offtopic] Removing bad bytes from filenames
On Mon, Aug 20, 2007 at 02:07:05PM -0500, Chris Harrington (Personal) wrote:
> Like the example I cite above, I can't seem to actually pass the
> correct parameter to rename to get it to remove \0277 from my
> filenames. I really don't want to do this manually.

Do you have Perl's rename installed?

    $ rename
    Usage: rename [-v] [-n] [-f] perlexpr [filenames]

The following will remove any non-ASCII characters from filenames in the current directory:

    rename -v 's/[^ -~]//g' ./*

Add a "-n" to the options to see what it would do before you let it rip, so that you know you'll be happy with the results.

If you have a directory tree to do this to...

    find ./dir_root -type f | rename -v 's/[^ -~]//g'

where ./dir_root is the root directory of the tree.

-- 
Paul Martin <pm@nowster.org.uk>
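One thing to watch with the find pipeline: the substitution runs over the whole path, so a bad byte in a directory name would be stripped from the target path as well. A sketch of an alternative that avoids Perl's rename and changes only the last path component, deepest entries first, is shown below. The scratch tree and its names are invented for illustration; on real data, comment out the mv for a dry run first, in the spirit of the "-n" advice above.

```shell
# Strip non-printable-ASCII bytes from names in a tree, one path
# component at a time. Assumption: an invented scratch tree is used.
LC_ALL=C; export LC_ALL
root=$(mktemp -d)
bad=$(printf '\277')
mkdir -p "$root/Various${bad}"
: > "$root/Various${bad}/Art Thou${bad} (18).ogg"

# -depth lists a directory's contents before the directory itself, so a
# parent is only renamed after everything inside it has been handled.
find "$root" -depth -name '*[! -~]*' -exec sh -c '
    for path in "$@"; do
        dir=${path%/*}                     # everything before the last /
        base=${path##*/}                   # last path component only
        clean=$(printf "%s" "$base" | tr -cd " -~")
        printf "%s -> %s\n" "$base" "$clean"
        mv -- "$path" "$dir/$clean"        # comment out for a dry run
    done' sh {} +
```

Unlike "rename -v", this does not check whether two names collapse to the same cleaned name, so a collision check would be needed before running it on a large collection.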