Greetings,

I am able to get an English word list from <file> by using the following command:

cat <file> | tr -sc A-Za-z '\012'

My question is how to specify Unicode characters as well as ASCII. Specifically, the text file contains 3-byte sequences starting with \x0e, and I want to handle those in the tr command.

I am able to see the character using:

echo -e '\xe0\xa5\xbf'

What regex incantation would make tr give the results I want?

I am new to Unicode.

Regards,
Rajagopal
>I am able to get an English word list in <file> by using the following command
>
>cat <file> | tr -sc A-Za-z '\012'
>
>My question is how to specify Unicode characters and ASCII.
>Specifically, the text file contains 3-byte sequences starting with
>\x0e in the tr command.
>
>I am able to see the character using:
>
>echo -e '\xe0\xa5\xbf'
>
>What regex incantation would make tr give the results I want?
>
>I am new to Unicode.

You don't say much about what bounds the words. Spaces? Please give more info, but http://www.regular-expressions.info/unicode.html leads to some Perl solutions.
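For what it's worth, here is a minimal sketch of one way to stay with tr, under some assumptions that are not stated in the original post: the file is UTF-8, words are bounded by ASCII non-letters (spaces, digits, ASCII punctuation), and tr works byte by byte (as GNU tr does; LC_ALL=C is set to make that explicit). Every byte of a multi-byte UTF-8 character has its high bit set, so adding the octal range \200-\377 to the kept set leaves those sequences intact. The function name split_words is made up for illustration.

```shell
# Hypothetical helper: read text on stdin, print one word per line.
# Assumes UTF-8 input and a byte-oriented tr (forced here via LC_ALL=C).
split_words() {
    # Keep ASCII letters and all bytes 0x80-0xFF (the bytes that make up
    # multi-byte UTF-8 characters); turn every other byte into a newline
    # and squeeze runs of newlines into one.
    LC_ALL=C tr -sc 'A-Za-z\200-\377' '\012'
}
```

Usage would be something like split_words < file. Note the limitation: this treats every non-ASCII character as part of a word, including non-ASCII punctuation, so if the words are bounded by such characters a Unicode-aware tool (such as the Perl approach linked above) is likely a better fit.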