I'm curious: why did you use PDM rather than TDM I2S microphones for your array?
I understand that the ICS-52000 is relatively low cost ($2 each at 100 pcs) and there are even breakout boards available with 4 microphones, which can be chained to 8 or 16, like https://www.cdiweb.com/datasheets/notwired/ds-nw-aud-ics5200...
Then you can take a Jetson (or any I2S-capable hardware with a DSP or GPU on it) and chain 16 microphones per I2S port. That would seem a lot easier to assemble and program than an FPGA setup.
(OP here) tverbeure hit most of the main points, but it was mostly cost ($2/mic vs $0.5/mic adds up when there are 192 microphones) and the difficulty of finding things with enough I2S interfaces (even with 16-way daisy chaining, that's still more than most/all things will have). The FPGA/custom hardware was part of the fun as well!
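For anyone who wants the back-of-the-envelope numbers behind that, here is a minimal sketch using only the figures quoted in this thread (192 mics, 16 per daisy chain, $2 vs $0.5 per mic). The function names are made up for illustration, and the prices are the thread's rough figures, not current quotes.

```python
import math

# Figures quoted in the thread (rough, not current pricing).
MICS = 192          # microphones in the array
CHAIN_LEN = 16      # mics per daisy-chained I2S/TDM port
PRICE_I2S = 2.00    # $/mic for the I2S/TDM part
PRICE_PDM = 0.50    # $/mic for a PDM part


def ports_needed(mics, chain_len):
    """Number of I2S ports required if each port carries chain_len mics."""
    return math.ceil(mics / chain_len)


def mic_cost(mics, unit_price):
    """Total microphone cost, ignoring PCB, FPGA, and everything else."""
    return mics * unit_price


print(ports_needed(MICS, CHAIN_LEN))                          # 12
print(mic_cost(MICS, PRICE_I2S) - mic_cost(MICS, PRICE_PDM))  # 288.0
```

So the full array would need 12 I2S ports, and the microphones alone differ by roughly $288 at those prices.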
Yeah, I've also had difficulty finding something with enough I2S. It was a while back, and I used a Sprocket carrier for the Jetson TX2 - it had 6 lanes, so up to 96 microphones. It was for a SODAR application, so the sampling frequency was not that critical, and to me it felt like the perfect trick to make an array with off-the-shelf hardware. So I was just curious whether this was something you'd considered.
For something indoors, yes, I can see how low sampling frequency gets very limiting. And 192 microphones, that's really pushing it. Love it.
The $2/mic vs $0.5/mic argument is a fun one. You've obviously poured an enormous amount of engineering in there, involving PCB design, FPGA and network programming, writing custom CUDA kernels, signal processing, PyTorch, the list goes on. And you had a 4090 plugged into your PC in 2023. Classic hobbit in a mithril vest ;)
Not OP, but I looked into this a few years ago. It was more expensive then, and only went to 20 kHz. Higher frequencies are helpful if you're listening for the hiss of leaking gas, or the corona discharge of an electric arc.
The Orin has 6xI2S ports internally, so that would work up to 16*6 = 96 microphones, which is a good number. But it looks like maybe only 3 are brought out & on different dev board connectors [1]? As with a lot of design, the devil is in the details. An FPGA could be easier to configure if you need more than 96 microphones.
My notes:
ICS-52000 $3.50, 20 kHz
ICS-41350 $1.05, 40 kHz
SPH0641LU4H-1 $1.45, 80 kHz+
[1] https://docs.nvidia.com/jetson/archives/r34.1/DeveloperGuide...
I've considered making a phased array myself, but never got around to sending out the PCB. But here are two reasons why I2S is not the best option:
* I2S requires 3 instead of the 2 pins of PDM. However, in the datasheet that you provided, it shows how you can daisy-chain microphones which is really cool (even if not standard I2S.) So that argument goes away.
* PDM gives you access to way higher sample rates, which in turn gives you more flexibility in choosing the delay for a delay-and-sum operation. For example, if the PDM clock is 2 MHz, you could theoretically delay with a precision of 0.5 us. In practice, you'll do that with lower precision, but with I2S, the sample rate will typically max out at 192 kHz.
* PDM microphones tend to be cheaper.
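To put numbers on the delay-precision point, here is a rough sketch that computes the smallest whole-sample delay step at a given rate and the error left over when a desired steering delay is rounded to whole samples. The helper names and the 7.3 us example delay are made up for illustration; only the 2 MHz and 192 kHz rates come from the comment above.

```python
def delay_step_us(sample_rate_hz):
    """Smallest whole-sample delay step, in microseconds."""
    return 1e6 / sample_rate_hz


def quantize_delay(delay_s, sample_rate_hz):
    """Round a desired delay to whole samples.

    Returns (n_samples, error_s), where error_s is the achieved delay
    minus the requested delay.
    """
    n_samples = round(delay_s * sample_rate_hz)
    return n_samples, n_samples / sample_rate_hz - delay_s


PDM_CLOCK_HZ = 2_000_000   # PDM bit clock from the comment above
I2S_RATE_HZ = 192_000      # typical I2S ceiling from the comment above
WANTED_DELAY_S = 7.3e-6    # hypothetical steering delay for one microphone

for name, rate in (("PDM", PDM_CLOCK_HZ), ("I2S", I2S_RATE_HZ)):
    n, err = quantize_delay(WANTED_DELAY_S, rate)
    print(f"{name}: step {delay_step_us(rate):.2f} us, "
          f"{n} samples, error {err * 1e6:+.2f} us")
```

At 2 MHz the step is 0.5 us; at 192 kHz it is about 5.2 us, so integer-sample delays alone are roughly ten times coarser.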
1) and 3) are valid, but 2) isn't really. In that sort of pipeline, you usually do IQ sampling which allows you to phase-shift by any arbitrary value with a complex multiplication.
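A minimal sketch of that idea, assuming a single narrowband tone already in complex (IQ) form: delaying the tone by tau is the same as multiplying every sample by exp(-j*2*pi*f*tau), so the delay does not have to be a whole number of samples. The function names and the 10 kHz / 48 kHz / 3.7 us values are made up for illustration. For a broadband signal the phase would differ per frequency, so you would apply this per FFT bin rather than once.

```python
import cmath
import math


def complex_tone(freq_hz, sample_rate_hz, n_samples, delay_s=0.0):
    """Complex (IQ) tone exp(j*2*pi*f*(t - delay)) sampled at sample_rate_hz."""
    return [
        cmath.exp(2j * math.pi * freq_hz * (n / sample_rate_hz - delay_s))
        for n in range(n_samples)
    ]


def phase_shift(iq_samples, delay_s, freq_hz):
    """Apply a delay to a narrowband IQ signal with one complex multiply per sample."""
    rotation = cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return [s * rotation for s in iq_samples]


FREQ_HZ = 10_000      # hypothetical tone frequency
RATE_HZ = 48_000      # hypothetical sample rate, well below a PDM clock
DELAY_S = 3.7e-6      # a small fraction of one 20.8 us sample period

shifted = phase_shift(complex_tone(FREQ_HZ, RATE_HZ, 8), DELAY_S, FREQ_HZ)
reference = complex_tone(FREQ_HZ, RATE_HZ, 8, delay_s=DELAY_S)
max_err = max(abs(a - b) for a, b in zip(shifted, reference))
print(f"max difference vs. truly delayed tone: {max_err:.2e}")
```

The phase-rotated samples match the truly delayed tone to within floating-point rounding, even though the delay is far smaller than one sample period.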