Are direct bit-rate converters possible for Ogg Vorbis? Or do they already exist?

Even more, is it possible to directly convert MP3/ATRAC3 (i.e., Sony Minidisc) encoding to and from Ogg Vorbis?

There has been an earlier discussion of this question, but it centered on the fact that people couldn't see the point, and they were rather hostile to the idea.

There is a very good reason to want this. I can best illustrate it with an example case:

A while ago, a colleague of mine spent several months a year, for a few years in a row, in northern Irian Jaya, which is as remote as the name suggests. She recorded a lot of speech from local villagers on her lap-top, running on solar power. She was REALLY isolated.

Now, a new project would switch to Minidisc (ATRAC3) recordings, transferring them to the lap-top/CD-ROM in compressed form (e.g., Ogg Vorbis, 80 kbs would do). Finally this speech would end up in a huge speech corpus for small languages, which uses a different compression codec and another bit-rate for archiving, say MP3 at 192 kbs. Note that the researcher in question can influence neither the encoding in the recording device nor the final compression format of the corpus. So answers like "Don't do this" and "Let them switch to Ogg Vorbis" are not productive. In that case she would simply switch to the corpus codec, e.g., MP3 at 192 kbs.

This is not as far-fetched as it might seem. In Europe and the USA, lots of money is currently spent on building large corpora of small languages. These languages are spoken in jungles (Amazonia, South-east Asia), tundras (Northern Siberia, Kamchatka), and high mountain areas (the Himalayas). Furthermore, large corpora (>100GB) of natural speech are collected by volunteers carrying minidisc equipment. All this speech will end up in archives using some kind of compression.
I have done some studies of the effects of compression on speech acoustics and all these compression steps would, each individually, not introduce distortions large enough to matter (i.e., RMS error < 1 semitone for standard speech analysis results). However, used in cascade, the distortion explodes for whole spectrum measures. The results suggest that the problem lies in the accumulation of quantization noise, but I could be wrong.

This explosion can be prevented, I think, by NOT doing decoding->encoding steps, but by doing direct format translations, WITHOUT decoding. However, there seem to be no such translators and I do not know whether they are even possible.

Can anyone help?

Rob

--
Rob van Son
Institute of Phonetic Sciences/ACLC
University of Amsterdam

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
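Editorial illustration: the hypothesis in the message above, that quantization noise accumulates across cascaded lossy steps, can be made concrete with a toy model. The sketch below is not how ATRAC3, Vorbis, or MP3 work (real codecs quantize transform coefficients under a psychoacoustic model); it only applies plain uniform quantizers with different, made-up step sizes one after another and compares the RMS error against a single quantization at the final step size. All names and step values are hypothetical.

```python
import math
import random


def quantize(samples, step):
    """Round every sample to the nearest multiple of `step` (uniform quantizer)."""
    return [round(s / step) * step for s in samples]


def rms_error(reference, processed):
    """Root-mean-square difference between two equally long sample lists."""
    total = sum((r - p) ** 2 for r, p in zip(reference, processed))
    return math.sqrt(total / len(reference))


def cascade(samples, steps):
    """Apply several quantizers one after another, as a lossy chain would."""
    out = samples
    for step in steps:
        out = quantize(out, step)
    return out


random.seed(0)  # fixed seed so the comparison is repeatable
original = [random.uniform(-1.0, 1.0) for _ in range(20000)]

# Hypothetical step sizes standing in for three lossy stages
# (recorder, transfer format, archive format). They are NOT codec parameters.
STEP_RECORDER, STEP_TRANSFER, STEP_ARCHIVE = 0.10, 0.13, 0.07

direct = quantize(original, STEP_ARCHIVE)
chained = cascade(original, [STEP_RECORDER, STEP_TRANSFER, STEP_ARCHIVE])

direct_rms = rms_error(original, direct)
chained_rms = rms_error(original, chained)
print(f"direct to archive step : RMS error {direct_rms:.4f}")
print(f"three stages in cascade: RMS error {chained_rms:.4f}")
```

In this toy setting the chained error is larger than the direct one even though the final step size is identical, because each stage rounds onto a grid that does not line up with the next. Whether the same mechanism dominates in real codec cascades is exactly the open question raised in the message.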
It is possible to lower bitrate without decoding on Ogg Vorbis files, but the tools do not currently exist.

It is not possible to directly convert from any compressed format to a different compressed format without decoding->encoding. It is a logical impossibility. The data is of a completely different type.

Mark

On Mon, 19 Aug 2002 19:37, R.J.J.H. van Son wrote:
> Are direct bit-rate converters possible for Ogg Vorbis? Or do they
> already exist?
>
> Even more, is it possible to directly convert MP3/ATRAC3 (i.e., Sony
> Minidisc) encoding to and from Ogg Vorbis?
R.J.J.H. wrote:
> Now, a new project would switch to Minidisc (ATRAC3) recordings,
> transferring them to the lap-top/CD-ROM in compressed form (e.g.,
> Ogg Vorbis, 80 kbs would do). Finally this speech would end up in a
> huge speech corpus for small languages, which uses a different
> compression codec and another bit-rate for archiving, say MP3 at 192
> kbs.

I suggest going straight to that final format if you can't simply store the original data files (which will be the best quality you have left). Regardless of which encoding scheme you use for later stages, there will be losses in the conversion. AFAIK you'll lose somewhat less by re-encoding to MP3, as the artefacts will be the same in both cases. But there's no advantage in converting from low bitrate mp3 ATRAC3 to high bitrate mp3, unless you have to because ATRAC3 can't be played by any of the software codecs you have available.

So if at all possible, just grab the files off the minidiscs as digital data, and store those.

Moz
"R.J.J.H. van Son" wrote:
> Are direct bit-rate converters possible for Ogg Vorbis? Or do they
> already exist?

Eventually, it will be possible to convert high-bitrate .ogg files to lower bitrate versions without having to reencode. That's called "peeling", and the tools to do so don't exist yet. However, the .ogg files you create now will be perfectly peelable in the future, once those tools exist.

> Now, a new project would switch to Minidisc (ATRAC3) recordings,
> transferring them to the lap-top/CD-ROM in compressed form (e.g., Ogg
> Vorbis, 80 kbs would do). Finally this speech would end up in a huge
> speech corpus for small languages, which uses a different compression
> codec and another bit-rate for archiving, say MP3 at 192 kbs.

Afaik, ATRAC is a pretty high-bitrate codec, so I don't see any serious problems re-encoding that stuff to something much smaller. However, re-encoding those -q 2 Ogg Vorbis files to anything higher, like MP3 @ 192kbps, makes absolutely no sense - that's a waste of space. The files will be bigger, while sounding either equal or worse than the q2 .ogg files.

The less re-encoding the better. Go directly from ATRAC to MP3@192kbps and from ATRAC to Vorbis@q2, etc.

Alternatively, since you say that there's a notebook involved, record the speech (or at least as much of it as possible, with that notebook & solar gear) directly to hard disk in .wav format and then encode that to Ogg Vorbis at some insane quality. You can peel those files later on and you keep enough quality to have a decent result when re-encoding that to your corpus delicti MP3 @ 192kbps. This might work pretty well - 44.1kHz/16bit/mono takes about 5MB/min, so quite a few hours of continuous recording are possible on today's notebook hard disks.

IF you can't get around re-encoding (it is not possible to "translate" the audio files from one format (e.g. Vorbis) to another (MP3) without loss), reencode as few times as possible.
This applies to *all* lossy codecs - none that I know of can be transcoded into each other (when I say "transcode" I mean the lossless translation from one format into the other ... others use it for "re-encoding", so there might be some confusion).

Moritz
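Editorial illustration: the "5MB/min" figure for 44.1kHz/16bit/mono in the message above follows directly from the PCM parameters. The short sketch below redoes that arithmetic and estimates how many hours fit on a disk. The 10 GB of free space is a made-up value for illustration, not something stated in the thread, and WAV header overhead is ignored.

```python
def pcm_bytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate in bytes per minute (WAV header overhead ignored)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * 60


def recording_hours(free_bytes, bytes_per_minute):
    """Hours of continuous recording that fit into `free_bytes`."""
    return free_bytes / bytes_per_minute / 60


rate = pcm_bytes_per_minute(44_100, 16, 1)  # 44.1 kHz / 16 bit / mono
FREE_DISK_BYTES = 10 * 10**9                # hypothetical: 10 GB of free space

print(f"{rate / 1e6:.2f} MB per minute")
print(f"{recording_hours(FREE_DISK_BYTES, rate):.1f} hours in 10 GB")
```

The rate works out to 5,292,000 bytes per minute, which matches the rounded figure quoted in the message.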
>> IF you can't get around re-encoding (it is not possible to "translate"
>> the audio files from one format (e.g. Vorbis) to another (MP3) without
>> loss), reencode as few times as possible. This applies to *all* lossy
>> codecs - none that I know of can be transcoded into each other (when I
>> say "transcode" i mean the lossless translation from one format into the
>> other ... others use it for "re-encoding", so there might be some
>> confusion).
>>
>
> On a totally unconnected note....is it not possible
> to convert from mp3 to vorbis without leaving the
> frequency domain? That would imply much less loss
> than converting mp3->wav->ogg.
>

It may be possible to transcode SOME of the information entirely within the frequency domain; however, when you consider transient sounds, it may not be possible, because transients that exist in the time domain appear continuous in the frequency domain and vice versa. Add in the short-block to long-block detection in MP3 with its poorly overlapped frames, and I do not even want to consider attempting to use it as a pure frequency domain source.

Myles
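Editorial illustration: the point about transients in the message above can be seen with a very small experiment. The sketch below uses a naive DFT on a 16-sample block as a stand-in; this is a simplification, since MP3 and Vorbis use MDCT-based transforms with windowing and overlapping blocks. It compares a single click (a transient) with a steady tone. The block length and signal shapes are arbitrary choices made for illustration.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform, O(n^2), fine for a short demo block."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


BLOCK = 16  # hypothetical block length, far shorter than real codec blocks

# A transient: one click at sample 5, silence elsewhere.
click = [0.0] * BLOCK
click[5] = 1.0

# A steady tone: exactly two cycles of a cosine across the block.
tone = [math.cos(2 * math.pi * 2 * t / BLOCK) for t in range(BLOCK)]

click_mag = [abs(c) for c in dft(click)]
tone_mag = [abs(c) for c in dft(tone)]

print("click:", [round(m, 3) for m in click_mag])
print("tone :", [round(m, 3) for m in tone_mag])
```

The click, which is sharply localized in time, has the same magnitude in every frequency bin, while the tone, which is spread over the whole block in time, is concentrated in two bins. A transform block therefore says little about where inside the block a transient sits, which is one reason the block boundaries and block-size decisions of one codec do not map cleanly onto another's.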