Yeah, no matter how good the microphone actually is on a headset, it uses an ancient codec, so until we get Bluetooth 5.3 everywhere with the LC3 codec we won't actually have good mic input from headphones and headsets. I predict that this is all going to change this year and next year. But the full stack has to support it, from headphones to Bluetooth chips to the OS.
Headsets can and do use other codecs already. This is especially true for enterprise headsets with dongles - these still use Bluetooth, but by controlling both sides of the link they can pick the codec.
LE Audio is great though - and it's already here, as "the full stack" has had support for quite a while... Assuming you don't happen to get your equipment from a certain fruit supplier that is notoriously slow at implementing open standards, almost as if they don't want to give you a choice outside of buying their own proprietary solutions...
I cannot wait to stop taking an audio quality hit on the AirPods while the mic is on.
Especially since OSX is terrible at input/output preferences.
If you alt(opt)+click the sound icon in the menu bar you can easily select your inputs and outputs. I really just want AirPods with a mic and no audio quality hit so I can use them for sim racing without needing an external mic arm.
It switches back on a whim for the most arbitrary things, though. In Windows the same can happen but I can at least temporarily disable an input if it is doing that.
Doing some things, like disabling an input/output device, an internal keyboard, or a webcam, is almost impossible. Even if there are some ways, they change so often. Let's say you have two cameras and an application that always picks the internal one. I couldn't find a way to disable the internal camera so that this app would pick the only other available one.
I "fixed" this with a Hammerspoon snippet that monitors input changes and reverts them:
-- Re-assert the MacBook Pro microphone as the default input whenever its
-- state changes and it is not currently in use.
mic = hs.audiodevice.findInputByName("MacBook Pro Microphone")

function handle_deselected(_, event)
  -- "gone" fires when the device's in-use status changes
  if event == "gone" and not mic:inUse() then
    mic:setDefaultInputDevice()
  end
end

mic:watcherCallback(handle_deselected)
mic:watcherStart()
Ah yeah you're right. Does the "Audio MIDI Setup" Mac utility app help you here at all?
It gets close, but there's still no way to truly pin it. It effectively does the same thing that System Settings > Sound > Output & Input does, but with a better UI that makes it clearer you are changing the primary device. The change is still just as unpinned as it would be from the other location, though.
It's so strange (and frustrating) to me that "Bluetooth audio" means "you pass the Bluetooth hardware PCM samples, and it encodes them itself in hardware; or the Bluetooth driver decodes packets in hardware to PCM samples, and then passes them to userspace."
It reminds me of the telephone network, where even though the whole thing is just another packet-switched network these days, the abstraction exposed to the handset is an analogue baseband audio signal.
---
Why can't we get another type of "Bluetooth audio", that works like VoIP does between handsets and their PBXes — where the two devices will:
1. do a little handshake to negotiate a set of hardware-accelerated audio codecs the devices (not the Bluetooth transceivers!) both support, in descending order of quality, constrained by link throughput + noise; and then
2. open a (lossy, in-order) realtime "dumb pipe" data carrier channel, into which both sides shove frames pre-encoded by their separate audio codec chip?
Is this just AVDTP? No — AVDTP does do a capabilities negotiation, sure, but it's a capabilities negotiation about the audio codecs the Bluetooth transceiver chip itself has been extended with support for — support where, as above, userland and even the OS kernel both just see a dumb PCM-sample pipe.
What I'm talking about here is taking audio-codec handling out of the Bluetooth transceiver's hands — instead just telling the transceiver "we're doing lossy realtime data signalling now" and then spraying whatever packets you, the device, want to spray, encoded through whatever audio-codec DSP you want to use. No need to run through a Bluetooth SIG standardization process for each new codec.
(Heck, presuming a PC/smartphone on the send side, and a sufficiently-powerful smart speaker/TV/sound bar on the receive side, both sides could actually support new codecs the moment they're released, via software updates, with no hardware-acceleration required, doing the codec part entirely on CPU.)
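A rough sketch of the idea in Python. None of these message types or calls are a real Bluetooth API; the names, fields, and the "pipe" object are all made up just to illustrate the shape of the negotiation and the dumb-pipe part:

# Hypothetical sketch: codec negotiation between two *devices* (not their
# Bluetooth transceivers), followed by a dumb lossy data pipe. None of these
# types exist in any real Bluetooth stack; they only illustrate the idea.
from dataclasses import dataclass

@dataclass
class CodecOffer:
    name: str          # e.g. "opus", "aac-eld", "lc3plus"
    bitrate_kbps: int  # bitrate this side wants to run the codec at
    quality_rank: int  # preference order, lower = better

def negotiate(local: list[CodecOffer], remote: list[CodecOffer],
              link_budget_kbps: int) -> CodecOffer | None:
    """Pick the highest-quality codec both sides support that fits the link."""
    remote_names = {o.name for o in remote}
    candidates = [o for o in local
                  if o.name in remote_names and o.bitrate_kbps <= link_budget_kbps]
    return min(candidates, key=lambda o: o.quality_rank) if candidates else None

# After negotiation, the transceiver is just a dumb pipe: both sides shove
# frames that were already encoded by their own codec DSP (or CPU) into it.
def send_stream(pipe, encoder, pcm_frames):
    for seq, frame in enumerate(pcm_frames):
        payload = encoder.encode(frame)              # done outside the BT chip
        pipe.send(seq.to_bytes(2, "big") + payload)  # lossy, in-order channel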
---
Or, if we're talking pie-in-the-sky ideas, how about a completely different type of "Bluetooth audio", not for bidirectional audio streaming at all? One that works less like VoIP, and more like streaming VOD video (e.g. YouTube) does?
Imagine a protocol where the audio source says "hey, I have this 40MB audio file, it's natively in container format X and encoding Y, can you buffer and decode that yourself?" — and then, if the receiver says "yeah, sure", the source just blasts that audio file out over a reliable stream data carrier channel; the receiver buffers it; and then the receiver does an internal streaming decode from its own local buffer from that point forward — with no audio channel open, only a control channel.
Given the "race to sleep" argument, I presume that for the average use-case of "headphones streaming pre-buffered M4As from your phone", this third approach would actually be a lot less battery-draining than pinging the receiver with new frames of audio every few-hundred milliseconds. You'd get a few seconds of intensive streaming, but then the transcievers on both ends could both just go to sleep until the next song is about to play.
Of course, back when the Bluetooth Audio spec was written, something the size of AirPods couldn't have had room to support a 40MB DRAM buffer + external hardware parse-and-decode of M4A/ALAC/etc. But they certainly could today!
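A minimal sketch of what that offer/accept handshake could look like, again with invented message shapes rather than any real profile:

# Hypothetical "send the whole file, let the receiver decode locally" handshake.
# Message shapes, channel objects, and the chunk size are invented for illustration.
from dataclasses import dataclass

@dataclass
class TransferOffer:
    container: str      # e.g. "m4a"
    codec: str          # e.g. "alac" or "aac-lc"
    size_bytes: int     # receiver needs enough buffer to hold this

@dataclass
class TransferAccept:
    ok: bool
    reason: str = ""

def offer_file(control_channel, bulk_channel, offer: TransferOffer, data: bytes):
    control_channel.send(offer)
    reply: TransferAccept = control_channel.receive()
    if not reply.ok:
        return False  # fall back to ordinary frame-by-frame streaming
    # Reliable bulk transfer; the radio can sleep once this finishes, since the
    # receiver decodes from its own local buffer from here on.
    for i in range(0, len(data), 512):
        bulk_channel.send(data[i:i + 512])
    control_channel.send({"cmd": "play", "at_ms": 0})
    return True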
While we're at it, it'd be great if we could avoid remuxing e.g. FaceTime audio, which is AAC, and notification sounds into a single stream before sending it to Bluetooth. It would be nice to avoid the latency and just shove the raw AAC from FaceTime into the headset, and when a notification ping arrives, send that as a separate audio stream, maybe with a different codec.
Yeah, the ultimate Bluetooth audio protocol would probably be a meta-protocol, combining the two ideas I mentioned with a MIDI-like timecoded sequencing protocol. You'd pre-buffer one or more notification sound effects onto the receiver, registering them with audio-session-specific IDs; begin streaming the live audio (the stream gets an ID); and then use the control sequencing stream to say "mix in a copy of registered-stream N [the ping sfx] at time T." (And the sender could then cut the sfx off early with another such command, if it wanted.)
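That control stream could be as simple as a handful of timecoded commands. A made-up sketch of what those might look like (stream IDs, command names, and fields are all invented):

# Hypothetical timecoded control commands for the "meta-protocol" idea.
from dataclasses import dataclass

@dataclass
class RegisterClip:        # pre-buffer a sound effect on the receiver
    stream_id: int
    payload: bytes         # the encoded notification sfx

@dataclass
class MixIn:               # "mix a copy of registered-stream N in at time T"
    stream_id: int
    start_at_ms: int
    gain_db: float = 0.0

@dataclass
class StopMix:             # cut the sfx off early if the sender changes its mind
    stream_id: int
    at_ms: int

def notify(control, ping_sfx: bytes, when_ms: int):
    control.send(RegisterClip(stream_id=7, payload=ping_sfx))
    control.send(MixIn(stream_id=7, start_at_ms=when_ms))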
We basically did that in an old project. We needed to transfer audio from our device to a phone, but our device is BLE only and LE Audio was not mature enough.
So we defined a custom BLE service and blasted the audio file through it.
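The receiving end of something like that can be pretty small. Here's a rough sketch of a central subscribing to a custom characteristic with the bleak library and reassembling chunks; the UUID and the end-of-transfer marker are invented, not the actual project's details:

# Rough sketch of the receiving (central) side: subscribe to a custom
# characteristic and reassemble audio chunks. The UUID and "EOF" marker are
# made up for illustration.
import asyncio
from bleak import BleakClient

AUDIO_CHAR_UUID = "0000beef-0000-1000-8000-00805f9b34fb"  # made-up UUID

async def receive_audio(address: str) -> bytes:
    chunks: list[bytes] = []
    done = asyncio.Event()

    def on_chunk(_sender, data: bytearray):
        if data == b"EOF":          # invented end-of-transfer marker
            done.set()
        else:
            chunks.append(bytes(data))

    async with BleakClient(address) as client:
        await client.start_notify(AUDIO_CHAR_UUID, on_chunk)
        await done.wait()
        await client.stop_notify(AUDIO_CHAR_UUID)
    return b"".join(chunks)

# audio = asyncio.run(receive_audio("AA:BB:CC:DD:EE:FF"))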
I think the bigger issue might be the microphone placement. Humans tend to prefer the sound of microphones that are closer over microphones which are further away (this is one reason headsets w/ a boom arm usually sound better than a built-in microphone). Having the microphone behind you / to the side (as in the case of an AirPod) is not great either. Of course, audio processing can fix a lot of this.
Are AirPods limited to the Bluetooth spec though? I think they extend it.
I don't know the details, but AirPods Pro sound noticeably terrible and Bluetooth-y. It's almost shocking.
They extend it in some ways, but I'm not sure if they do in this way. They do sound kind of terrible, but I always assumed it was due to the microphones being way back by your ears. I'm not sure, though.