ian lutz
2007-May-01 02:22 UTC
[Vorbis-dev] contstant bittrate mode - block size - packet size
Hello,

I am trying to implement a real-time encode-and-stream solution using the Vorbis codec (it would be wrapped in a "7F" type for minimal framing overhead) for sending audio over a low-bit-rate wireless link. Since we need a constant bitrate (i.e. constant packet size) for this solution, we want to run the algorithm in constant bitrate mode. To evaluate the codec I followed this procedure:

- encode and decode using the examples provided (still wrapped in Ogg)
- remove the Ogg wrapping code, encode a file in Vorbis, decode it again

The problem is that constant bitrate does not seem to be constant: the packet size varies greatly on encode (it should be constant if using CBR). My assumption is that your reservoir algorithm provides an average bitrate over multiple frames of audio data, rather than encoding each frame to the same number of bits. In my tests, no matter what encoding settings I use (at least those I can think of), I never get a constant packet size out of the encoder.

Also, when getting a block to encode from the vd structure, the encoder seems to require multiple reads of the input before it has enough data to produce a block.

    #define READ 882 /* 20 ms at 44.1 kHz */

    struct ovectl_ratemanage2_arg rma;

    fp_infile = fopen(infile,"rb");
    fp_outfile = fopen(outfile,"wb");
    fp_enc_info = fopen("enc_info.txt","w");
    printf("%s %s\n",infile,outfile);

    num_blocks = 0;
    num_bytes = 0;
    num_packets = 0;
    //num_mseconds = 0;

    rma.management_active = 1;
    rma.bitrate_limit_max_kbps = 3*128000;
    rma.bitrate_limit_min_kbps = 3*128000;
    rma.bitrate_limit_reservoir_bits = 0;
    rma.bitrate_limit_reservoir_bias = 0;
    rma.bitrate_average_kbps = 3*128000;

    /* skip over header */
    readbuffer[0] = '\0';
    for (i=0, founddata=0; i<30 && ! feof(fp_infile) && ! ferror(fp_infile); i++)
    {
      fread(readbuffer,1,2,fp_infile);
      if ( ! strncmp((char*)readbuffer, "da", 2) )
      {
        founddata = 1;
        fread(readbuffer,1,6,fp_infile);
        break;
      }
    }

    vorbis_info_init(&vi);

    //ret = vorbis_encode_init(&vi,2,44100,3*128000,3*128000,3*128000);//setup for constant bitrate
    //vorbis_encode_ctl(&vi,OV_ECTL_RATEMANAGE_HARD,&rma);
    //ret=vorbis_encode_init_vbr(&vi,2,44100,0.1);
    //ret = vorbis_encode_init(&vi,2,44100,-1,128000,-1);
    // ret = ( vorbis_encode_setup_managed(&vi,2,44100,128000,128000,128000) ||
    //         vorbis_encode_ctl(&vi,OV_ECTL_RATEMANAGE2_SET,&rma) ||
    //         vorbis_encode_setup_init(&vi));

    ret = vorbis_encode_setup_managed(&vi,2,44100,3*128000,3*128000,3*128000);
    ret = ret||vorbis_encode_ctl(&vi,OV_ECTL_RATEMANAGE_HARD,&rma);
    ret = ret||vorbis_encode_setup_init(&vi);
    if(ret)exit(1);

    /* set up the analysis state and auxiliary encoding storage */
    vorbis_analysis_init(&vd,&vi);
    vorbis_block_init(&vd,&vb);

    //encode loop
    total_samples_consumed = 0;
    num_reads=0;
    while(!feof(fp_infile)){
      // num_mseconds = 0;
      long i;
      long bytes=fread(readbuffer,1,READ*4,fp_infile); /* stereo hardwired here */

      /* expose the buffer to submit data */
      float **buffer=vorbis_analysis_buffer(&vd,READ);

      /* uninterleave samples */
      for(i=0;i<bytes/4;i++){
        buffer[0][i]=((readbuffer[i*4+1]<<8)|
                      (0x00ff&(int)readbuffer[i*4]))/32768.f;
        buffer[1][i]=((readbuffer[i*4+3]<<8)|
                      (0x00ff&(int)readbuffer[i*4+2]))/32768.f;
      }

      /* tell the library how much we actually submitted */
      vorbis_analysis_wrote(&vd,i);
      total_samples_consumed = total_samples_consumed+i;

      /* vorbis does some data preanalysis, then divvies up blocks for
         more involved (potentially parallel) processing. Get a single
         block for encoding now */
      num_reads++;
      while(vorbis_analysis_blockout(&vd,&vb)==1){

        /* analysis, assume we want to use bitrate management */
        vorbis_analysis(&vb,NULL);
        vorbis_bitrate_addblock(&vb);
        num_blocks++;
        fprintf(fp_enc_info,"num_reads per blockout:%d\t",num_reads);
        num_reads=0;
        num_packets = 0;

        while(vorbis_bitrate_flushpacket(&vd,&op)){
          /* push packet into file */
          fwrite(op.packet,op.bytes,1,fp_outfile);
          num_packets++;
          /* op.bytes is a long, so print it with %ld */
          fprintf(fp_enc_info,"samples consumed:%d\tnum blocks:%d\tbytes written:%ld\tpackets written for block:%d\n",total_samples_consumed,num_blocks,op.bytes,num_packets);
        }
      }
    }

This produces the following debug output:

    num_reads per blockout:3  samples consumed:2646  num blocks:1   bytes written:138   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:2   bytes written:1115  packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:3   bytes written:152   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:4   bytes written:138   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:5   bytes written:144   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:6   bytes written:135   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:7   bytes written:137   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:8   bytes written:139   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:9   bytes written:144   packets written for block:1
    num_reads per blockout:0  samples consumed:2646  num blocks:10  bytes written:141   packets written for block:1
    num_reads per blockout:1  samples consumed:3528  num blocks:11  bytes written:133   packets written for block:1
    num_reads per blockout:0  samples consumed:3528  num blocks:12  bytes written:145   packets written for block:1
    num_reads per blockout:0  samples consumed:3528  num blocks:13  bytes written:138   packets written for block:1
    num_reads per blockout:0  samples consumed:3528  num blocks:14  bytes written:136   packets written for block:1
    num_reads per blockout:0  samples consumed:3528  num blocks:15  bytes written:145   packets written for block:1
    ... plus more

The expected behaviour of CBR is that we put in, say, 20 ms of audio (882 samples) and CBR compression to 384 kbps produces 960-byte packets of compressed audio.

Any help on setting up CBR correctly would be greatly appreciated.

Thanks,
Ian
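
For reference, the 960-byte figure in the message above follows from bitrate times frame duration, divided by eight. The following is only a minimal sketch of that arithmetic; the helper name cbr_frame_bytes is hypothetical and is not part of libvorbis.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libvorbis call): bytes per frame for a
   strictly constant-rate stream. Returns -1 when the frame does not
   map to a whole number of bytes. */
static long cbr_frame_bytes(long bitrate_bps, long frame_samples,
                            long sample_rate_hz)
{
    int64_t bits_times_rate, divisor;

    assert(bitrate_bps > 0 && frame_samples > 0 && sample_rate_hz > 0);

    /* bits per frame = bitrate * samples / sample rate; keep the
       division until the end so the check below stays exact */
    bits_times_rate = (int64_t)bitrate_bps * frame_samples;
    divisor = (int64_t)sample_rate_hz * 8;   /* 8 bits per byte */

    if (bits_times_rate % divisor != 0)
        return -1;
    return (long)(bits_times_rate / divisor);
}
```

With the numbers from the message, cbr_frame_bytes(384000, 882, 44100) evaluates to 960, whereas the packets in the log range from 133 to 1115 bytes.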
xiphmont@xiph.org
2007-May-02 02:14 UTC
[Vorbis-dev] contstant bittrate mode - block size - packet size
On 5/1/07, ian lutz <ianlutz@gmail.com> wrote:
> since we need a constant bitrate (i.e. constant packet size)

You will never get constant packet size out of the reference libs, and you would have some amount of trouble doing that within spec (i.e., doing a good job) even if you wrote your own algorithms. For a start, Vorbis does not group 'short blocks' together into a single packet equal in length to a long block, so that alone will give you fits.

Leaving that aside, both Vorbis and mp3 'cheat' to get CBR. In mp3, a single mp3 packet does not always equal an integral mp3 frame (the frame size is kept the same, but the amount of audio it holds sloshes around using a 'bit reservoir'). I believe the rule of thumb in mp3 is that you need up to nine frames to be certain of reconstructing a desired frame of audio (it was either nine or seven, I can't remember which).

In Vorbis, a single packet always yields a single frame of audio. However, the packet size varies significantly. CBR is implemented with a 'bit reservoir' like in mp3, but the reservoir is not hidden behind constant-sized frames. However, like mp3, the reservoir limits will not be violated.

> my tests on changing the different parameters for encoding shows that no
> matter what settings i use (at least i can think of) do I get a constant
> packet size out of the encode.

Correct.

> also when getting a block to encode from the vd structure the encode seems
> to require multiple reads of the input to get enough data to get a block for
> encode

Correct. The blocking algorithm must work ahead by a minimum amount to do short/long block detection. Unlike most mp3 decoders, it is not always working ahead by the maximum amount; the amount appears to vary because of lazy evaluation. It will not blindly buffer blocks if it already knows it does not need to, and if it suddenly needs blocks to work ahead, it can ask for them all at once.

> the expected behaviour of CBR is that we put in say 20ms of audio (882)
> samples and for a CBR compression to 384kbs should produce 960 Byte packets
> of compressed audio.

That is likely impossible in the current Vorbis->Ogg mapping. You'd need to write a layer, like in mp3, where the varying packet sizes are hidden by the framing algorithm (thus making the framing provided by Ogg unnecessary, which was part of the impetus for doing that in mp3: if you provide your own framing, the container doesn't need to, and many containers can't).

Monty
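
One way to picture the kind of framing layer described above is a byte FIFO between the encoder and the link: variable-size packets go in, fixed-size frames come out, and a packet may straddle two frames. The sketch below is an illustration under stated assumptions, not anything provided by libvorbis or Ogg. Every name in it (byte_fifo, tx_push_packet, tx_pull_frame, rx_push_frame, rx_pop_packet), the 2-byte length prefixes, the 2-byte frame header, and the FIFO_CAP size are hypothetical choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIFO_CAP 8192   /* hypothetical buffer size, chosen for the example */

typedef struct {
    unsigned char data[FIFO_CAP];
    size_t len;                     /* bytes currently queued */
} byte_fifo;

static void fifo_init(byte_fifo *f) { f->len = 0; }

/* Append n raw bytes; returns 0 on success, -1 if the FIFO would overflow. */
static int fifo_append(byte_fifo *f, const unsigned char *src, size_t n)
{
    if (n > FIFO_CAP - f->len) return -1;
    memcpy(f->data + f->len, src, n);
    f->len += n;
    return 0;
}

/* Discard the first n queued bytes. */
static void fifo_drop(byte_fifo *f, size_t n)
{
    assert(n <= f->len);
    memmove(f->data, f->data + n, f->len - n);
    f->len -= n;
}

/* Sender: queue one codec packet behind a 2-byte big-endian length prefix.
   Empty packets are rejected so that a length of 0 never appears. */
static int tx_push_packet(byte_fifo *f, const unsigned char *pkt, size_t n)
{
    unsigned char hdr[2];
    if (n == 0 || n > 0xFFFF || n + 2 > FIFO_CAP - f->len) return -1;
    hdr[0] = (unsigned char)(n >> 8);
    hdr[1] = (unsigned char)(n & 0xFF);
    fifo_append(f, hdr, 2);
    return fifo_append(f, pkt, n);
}

/* Sender: fill one fixed-size frame: a 2-byte count of valid payload bytes,
   the payload itself, then zero padding. Returns the valid payload count. */
static size_t tx_pull_frame(byte_fifo *f, unsigned char *frame, size_t frame_bytes)
{
    size_t room, take;
    assert(frame_bytes > 2 && frame_bytes - 2 <= 0xFFFF);
    room = frame_bytes - 2;
    take = f->len < room ? f->len : room;
    frame[0] = (unsigned char)(take >> 8);
    frame[1] = (unsigned char)(take & 0xFF);
    memcpy(frame + 2, f->data, take);
    memset(frame + 2 + take, 0, room - take);
    fifo_drop(f, take);
    return take;
}

/* Receiver: copy one frame's valid payload into the reassembly FIFO. */
static int rx_push_frame(byte_fifo *f, const unsigned char *frame, size_t frame_bytes)
{
    size_t valid;
    if (frame_bytes < 2) return -1;
    valid = ((size_t)frame[0] << 8) | frame[1];
    if (valid > frame_bytes - 2) return -1;     /* corrupt header */
    return fifo_append(f, frame + 2, valid);
}

/* Receiver: pop one complete packet into out. Returns its length,
   0 if no complete packet is queued yet, or -1 if out is too small. */
static long rx_pop_packet(byte_fifo *f, unsigned char *out, size_t cap)
{
    size_t n;
    if (f->len < 2) return 0;
    n = ((size_t)f->data[0] << 8) | f->data[1];
    if (f->len < 2 + n) return 0;
    if (n > cap) return -1;
    memcpy(out, f->data + 2, n);
    fifo_drop(f, 2 + n);
    return (long)n;
}
```

In this sketch the link always carries frames of the same size, while the receiver may have to wait for a later frame before a packet is complete, which is the same trade of latency for constant frame size described for mp3. Whether the sender's FIFO stays within its bounds depends on the encoder's rate management, which the sketch does not model.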