My (never finished) master's thesis was about something similar - taking advantage of the fact that (almost) all smartphones have at least two microphones I wanted to locate and separate a speaker in 3D.
A few takeaways:
-The sampling rate is slightly off between devices - approximately ±1 sample per second - not a lot, but you need to take that into account.
-Spectral characteristics in consumer microphones are all over the place - two phones of the same model, right out of the box, will have not only measurable, but also audible differences.
-Sound bounces off of everything, particularly concrete walls.
-A car is the closest thing to an anechoic chamber you can readily access.
-The Fourier transform of a Gaussian is a Gaussian, which is very helpful when you need to estimate the frequency of a harmonic signal (like speech) with a wavelength shorter than half your window, but just barely.
> - A car is the closest thing to an anechoic chamber you can readily access.
I recall a youtuber solving the anechoic chamber problem by finding a big empty field - nothing to reflect off of except the ground - and maybe putting some foam below the experiment.
It doesn't kill environmental noise, of course, but it apparently did a very good job of killing reflections from his own instruments.
In my case wind noise disturbed the signal too much. Normally there's additional processing which deals with it, but I was working with (next to) raw data.
Surely a carpeted closet full of clothes is better than a car
I didn't have such a place at the time, but I found one and results weren't as good as in a car.
Sound deadening generally requires mass to work for lower frequencies and the seats absorbed them all nicely. I got some reflections from - I assume - the windows, but they were manageable in comparison. Didn't even produce much of a standing wave when I tried.
>The Fourier transform of a Gaussian is a Gaussian, which is very helpful when you need to estimate the frequency of a harmonic signal (like speech) with a wavelength shorter than half your window, but just barely.
I get the gaussian link. But, can you explain your point with more detail?
The log of a Gaussian is a parabola, which makes finding where exactly a peak in the spectrum lies a question of solving a quadratic equation.
My plan was to detect the frequency of the speaker by counting (with weights) which distance between peaks is the most common. I wanted to avoid calculating the power cepstrum as I felt that I was running out of computing power on the devices already[0] - a mistaken belief in the long run, but I was too proud of my little algorithm and how stable it was to let go.
[0] Sending raw sample data to a more powerful machine was out of the question as I wanted to remain within the bandwidth offered by Bluetooth at the time due to power consumption considerations.