"speech/music" mode starts kicking in when the encoder thinks it's advantageous to do so, and down below that where it tends to go all-speech-mode (if the encoder thinks the input is probably mostly speech). Above that the bit savings from including the special speech mode isn't worth the tradeoff in quality (you could just reduce the bitrate/quality setting that much and still end up with higher-quality results). Subjectively for me, it seems like there's really no point in encoding opus above about 128kbps at most for 2-channel audio. In the default 96kbps to 128kbps range the sound quality is already top-quality to my ears, and doesn't really leave more than negligible room to hear any more improvement no matter how many bits you give it to use (from my personal perspective, I think if you're willing to throw 256+kbps at stereo audio, you might as well switch over to FLAC instead.) (I have no experience with >2-channel audio, so while I assume more bits would be useful there I can't really make any predictions). tl;dr: the special speech mode is for preserving speech quality at lower (<32kbps) bitrates rather than preserving bits at higher bitrates, and you can probably chop the file you're encoding down to half or less of what you're using and probably not notice any difference in sound quality with opus.