@Charley oh, that’s just awesome. I was not expecting this kind of answer! I’ll probably give it a try on my next Banggood/Ali order
I also made some progress since yesterday but… I warn you, this is waaay cheaper x)
First, I found a way to get the “current noise level” with any mic and ffmpeg. It works great and should be pretty cross-platform (I was not able to make it work on my Rasp tho). The great thing with this solution is that I can use any kind of microphone, so I recycled an old BT speaker with an integrated mic. Therefore I can place it wherever I want (it just needs a USB power socket) and keep my capturing computer where it already is. I could even use more than one mic if I need to.
With:
./ffmpeg.exe -f dshow \
-i audio="Casque (SHARKK Hands-Free AG Audio)" \
-t 2 \
-c:a libmp3lame -ar 44100 -b:a 320k -ac 1 \
-af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level \
-f null NUL
Line 1: I run ffmpeg from Windows using dshow (alsa should be a working alternative on Linux)
Line 2: using my old BT mic as input
Line 3: capturing sound for 2 sec. I could capture more or less, but 2 sec seems fine to have the data updated fast enough.
Line 4: ??? dunno, I found this somewhere on the Internet ¯\_(ツ)_/¯ These look like encoder settings (mp3 codec, 44100 Hz sample rate, 320k bitrate, 1 channel).
Line 5: this seems to make ffmpeg display a little report about the sound
Line 6: -f null to tell ffmpeg that I don’t want to record this, and NUL is the Windows equivalent of /dev/null, i.e. it points to nothing
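Since only the input part (Line 1 and Line 2) is platform specific, the same capture could probably be wrapped in a small function. This is a minimal sketch and not tested on a Pi: the function name capture_stats and the FFMPEG variable are made up for illustration, the ALSA device name "default" is an assumption that may need to be something like hw:1, and the encoder options from Line 4 are left out on the assumption that they do not matter when nothing is saved.

```shell
#!/usr/bin/env bash
# Capture a few seconds of audio and print ffmpeg's astats report on stdout.
# $1: capture device name, $2: duration in seconds (default 2),
# $3: OS name as printed by `uname -s` (default: detected).
# FFMPEG may point at another binary, e.g. ./ffmpeg.exe (hypothetical variable).
capture_stats() {
  local device="$1" seconds="${2:-2}" os="${3:-$(uname -s)}"
  # Only the input options differ between platforms.
  case "$os" in
    Linux) set -- -f alsa -i "${device:-default}" ;;  # "default" is an assumed ALSA device
    *)     set -- -f dshow -i "audio=$device" ;;      # DirectShow, as in the Windows command
  esac
  # "-f null -" discards the output on any platform; 2>&1 makes the report pipeable.
  "${FFMPEG:-ffmpeg}" "$@" -t "$seconds" \
    -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level \
    -f null - 2>&1
}
```

On Windows this would be called as FFMPEG=./ffmpeg.exe capture_stats "Casque (SHARKK Hands-Free AG Audio)". The exact device names can be listed with ffmpeg -list_devices true -f dshow -i dummy on Windows, or arecord -L on Linux.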
This will log many things, looking like this:
built with gcc 10.2.1 (GCC) 20200726
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libsrt --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libgsm --enable-librav1e --disable-w32threads --enable-libmfx --enable-ffnvcodec --enable-cuda-llvm --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, dshow, from 'audio=Casque (SHARKK Hands-Free AG Audio)':
Duration: N/A, start: 450603.230000, bitrate: 1411 kb/s
Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
[Parsed_ametadata_1 @ 000002d2170b3580] frame:0 pts:0 pts_time:0
[Parsed_ametadata_1 @ 000002d2170b3580] lavfi.astats.Overall.RMS_level=-58.525207
Output #0, null, to 'NUL':
Metadata:
encoder : Lavf58.45.100
Stream #0:0: Audio: mp3 (libmp3lame), 44100 Hz, mono, s16p, 320 kb/s
Metadata:
encoder : Lavc58.91.100 libmp3lame
[Parsed_ametadata_1 @ 000002d2170b3580] frame:1 pts:22535 pts_time:0.510998
[Parsed_ametadata_1 @ 000002d2170b3580] lavfi.astats.Overall.RMS_level=-65.373106
[Parsed_ametadata_1 @ 000002d2170b3580] frame:2 pts:44365 pts_time:1.00601
[Parsed_ametadata_1 @ 000002d2170b3580] lavfi.astats.Overall.RMS_level=-61.159115
[Parsed_ametadata_1 @ 000002d2170b3580] frame:3 pts:66591 pts_time:1.51
[Parsed_ametadata_1 @ 000002d2170b3580] lavfi.astats.Overall.RMS_level=-62.354711
[Parsed_ametadata_1 @ 000002d2170b3580] frame:4 pts:88421 pts_time:2.00501
[Parsed_ametadata_1 @ 000002d2170b3580] lavfi.astats.Overall.RMS_level=-66.096016
size=N/A time=00:00:02.02 bitrate=N/A speed=1.01x
video:0kB audio:80kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_astats_0 @ 000002d215446f80] Channel: 1
[Parsed_astats_0 @ 000002d215446f80] DC offset: 0.000415
[Parsed_astats_0 @ 000002d215446f80] Min level: -6.000000
[Parsed_astats_0 @ 000002d215446f80] Max level: 34.000000
[Parsed_astats_0 @ 000002d215446f80] Min difference: 0.000000
[Parsed_astats_0 @ 000002d215446f80] Max difference: 4.000000
[Parsed_astats_0 @ 000002d215446f80] Mean difference: 0.600073
[Parsed_astats_0 @ 000002d215446f80] RMS difference: 0.877633
[Parsed_astats_0 @ 000002d215446f80] Peak level dB: -59.679155
[Parsed_astats_0 @ 000002d215446f80] RMS level dB: -66.094560
[Parsed_astats_0 @ 000002d215446f80] RMS peak dB: -65.123101
[Parsed_astats_0 @ 000002d215446f80] RMS trough dB: -67.363859
[Parsed_astats_0 @ 000002d215446f80] Crest factor: 2.093005
[Parsed_astats_0 @ 000002d215446f80] Flat factor: 0.000000
[Parsed_astats_0 @ 000002d215446f80] Peak count: 3
[Parsed_astats_0 @ 000002d215446f80] Noise floor dB: -60.204939
[Parsed_astats_0 @ 000002d215446f80] Noise floor count: 11026
[Parsed_astats_0 @ 000002d215446f80] Bit depth: 16/16
[Parsed_astats_0 @ 000002d215446f80] Dynamic range: 36.650178
[Parsed_astats_0 @ 000002d215446f80] Zero crossings: 138
[Parsed_astats_0 @ 000002d215446f80] Zero crossings rate: 0.006259
[Parsed_astats_0 @ 000002d215446f80] Channel: 2
[Parsed_astats_0 @ 000002d215446f80] DC offset: 0.000415
[Parsed_astats_0 @ 000002d215446f80] Min level: -6.000000
[Parsed_astats_0 @ 000002d215446f80] Max level: 34.000000
[Parsed_astats_0 @ 000002d215446f80] Min difference: 0.000000
[Parsed_astats_0 @ 000002d215446f80] Max difference: 4.000000
[Parsed_astats_0 @ 000002d215446f80] Mean difference: 0.604427
[Parsed_astats_0 @ 000002d215446f80] RMS difference: 0.880367
[Parsed_astats_0 @ 000002d215446f80] Peak level dB: -59.679155
[Parsed_astats_0 @ 000002d215446f80] RMS level dB: -66.097472
[Parsed_astats_0 @ 000002d215446f80] RMS peak dB: -65.125332
[Parsed_astats_0 @ 000002d215446f80] RMS trough dB: -67.370146
[Parsed_astats_0 @ 000002d215446f80] Crest factor: 2.093707
[Parsed_astats_0 @ 000002d215446f80] Flat factor: 0.000000
[Parsed_astats_0 @ 000002d215446f80] Peak count: 3
[Parsed_astats_0 @ 000002d215446f80] Noise floor dB: -60.204939
[Parsed_astats_0 @ 000002d215446f80] Noise floor count: 11026
[Parsed_astats_0 @ 000002d215446f80] Bit depth: 16/16
[Parsed_astats_0 @ 000002d215446f80] Dynamic range: 36.650178
[Parsed_astats_0 @ 000002d215446f80] Zero crossings: 136
[Parsed_astats_0 @ 000002d215446f80] Zero crossings rate: 0.006168
[Parsed_astats_0 @ 000002d215446f80] Overall
[Parsed_astats_0 @ 000002d215446f80] DC offset: 0.000415
[Parsed_astats_0 @ 000002d215446f80] Min level: -6.000000
[Parsed_astats_0 @ 000002d215446f80] Max level: 34.000000
[Parsed_astats_0 @ 000002d215446f80] Min difference: 0.000000
[Parsed_astats_0 @ 000002d215446f80] Max difference: 4.000000
[Parsed_astats_0 @ 000002d215446f80] Mean difference: 0.602250
[Parsed_astats_0 @ 000002d215446f80] RMS difference: 0.879001
[Parsed_astats_0 @ 000002d215446f80] Peak level dB: -59.679155
[Parsed_astats_0 @ 000002d215446f80] RMS level dB: -66.096016
[Parsed_astats_0 @ 000002d215446f80] RMS peak dB: -65.123101
[Parsed_astats_0 @ 000002d215446f80] RMS trough dB: -67.370146
[Parsed_astats_0 @ 000002d215446f80] Flat factor: 0.000000
[Parsed_astats_0 @ 000002d215446f80] Peak count: 3.000000
[Parsed_astats_0 @ 000002d215446f80] Noise floor dB: -60.204939
[Parsed_astats_0 @ 000002d215446f80] Noise floor count: 11026.000000
[Parsed_astats_0 @ 000002d215446f80] Bit depth: 16/16
[Parsed_astats_0 @ 000002d215446f80] Number of samples: 22050
We can see 3 “Max level” values there: one for channel 1, one for channel 2 and one for the overall
I then used my hardcore Linux skills to extract this with the help of
2>&1 | grep "Max level:" | tail -n1 | sed -E "s/.* ([0-9]+\.[0-9]+)/\1/"
I could probably avoid using grep, tail and sed together, but at least this combo is easy for me to read.
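For comparison, the grep, tail and sed combo can be collapsed into a single awk call. This is only a sketch: the function names last_max_level and last_rms_level are made up, and both assume the log format shown above, read from stdin.

```shell
#!/usr/bin/env bash
# Print the last "Max level" value from ffmpeg's astats summary (the "Overall" one).
last_max_level() {
  awk '/Max level:/ { value = $NF } END { if (value != "") print value }'
}

# Print the last per-frame RMS level printed by the ametadata filter.
last_rms_level() {
  awk -F= '/lavfi\.astats\.Overall\.RMS_level=/ { value = $2 } END { if (value != "") print value }'
}
```

Either function goes at the end of the pipe, e.g. ... -f null NUL 2>&1 | last_max_level. Unlike the sed expression, taking the last field also keeps a leading minus sign. If the Windows build emits carriage returns, piping through tr -d '\r' first may be needed.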
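And voilà, I got a one-liner that returns the “Max level” from the 2 sec capture (judging by “Number of samples: 22050” in the log, the final summary may only cover the last frame of about 0.5 sec, because of reset=1).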
I can now just put this into a curl command like
curl -X POST --header "Content-Type: text/plain" --header "Accept: application/json" -d "$(./ffmpeg.exe -f dshow -i audio="Casque (SHARKK Hands-Free AG Audio)" -t 2 -c:a libmp3lame -ar 44100 -b:a 320k -ac 1 -af astats=metadata=1:reset=1,ametadata=print:key=lavfi.astats.Overall.RMS_level -f null NUL 2>&1 | grep "Max level:" | tail -n1 | sed -E "s/.* ([0-9]+\.[0-9]+)/\1/")" "http://192.168.1.50:8080/rest/items/Kitchen_Loudness"
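One thing the one-liner cannot do is notice when ffmpeg fails and the extracted value is empty. A possible way around that is to split the POST into its own function and check the value first. This is a sketch under assumptions: post_level and the CURL override variable are hypothetical names, and the headers are simply the ones from the command above.

```shell
#!/usr/bin/env bash
# Send one value to an openHAB item through the REST API.
# $1: value to send, $2: item URL.
# CURL may be overridden, e.g. CURL=echo for a dry run (hypothetical variable).
post_level() {
  local value="$1" url="$2"
  # Skip empty values and anything containing characters other than digits, dot or minus.
  case "$value" in
    ''|*[!0-9.-]*) echo "skipping invalid value: '$value'" >&2; return 1 ;;
  esac
  "${CURL:-curl}" -sS -X POST \
    --header "Content-Type: text/plain" \
    --header "Accept: application/json" \
    -d "$value" "$url"
}
```

Usage would look like post_level "$level" "http://192.168.1.50:8080/rest/items/Kitchen_Loudness", where level holds the output of the extraction pipeline.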
Put all of this into a simple script
And call it in a while loop (I will probably switch to a cron-based thing to avoid a deadly crash)
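The while loop itself can be made a bit more crash tolerant by logging a failed round and carrying on. A minimal sketch, assuming the measurement and the POST each live in their own function; monitor_loop and its parameters are made-up names, not part of any tool.

```shell
#!/usr/bin/env bash
# Run a measurement command repeatedly and hand each result to a handler.
# $1: pause between rounds in seconds, $2: number of rounds (0 = run forever),
# $3: command printing one value, $4: command taking that value as its argument.
monitor_loop() {
  local pause="$1" rounds="$2" measure="$3" handle="$4" count=0 value
  while [ "$rounds" -eq 0 ] || [ "$count" -lt "$rounds" ]; do
    # A failed or empty measurement is logged and skipped instead of ending the loop.
    if value="$("$measure")" && [ -n "$value" ]; then
      "$handle" "$value" || echo "handler failed for '$value'" >&2
    else
      echo "measurement failed, retrying" >&2
    fi
    count=$((count + 1))
    sleep "$pause"
  done
}
```

For example, monitor_loop 1 0 measure_kitchen post_kitchen would run forever with a one second pause, where measure_kitchen and post_kitchen are hypothetical wrappers around the ffmpeg and curl commands.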
Now I just have to write rules for this on the openHAB side
I know this is a pretty unclean solution, but it’s “hardware free” for me, and it seems to work fine. I’ll now have to collect some loudness values to calibrate my rules, and I think I’ll have a working solution for a while
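For the calibration step, one option is to append each measured value to a text file for a day or so and then look at the spread. This is a sketch only: summarize_levels is a made-up helper, and it assumes one numeric value per line on stdin.

```shell
#!/usr/bin/env bash
# Print count, min, max and mean of the numeric values read from stdin (one per line).
summarize_levels() {
  awk '
    NF == 0 { next }                      # ignore blank lines
    {
      v = $1 + 0                          # force numeric interpretation
      if (n == 0 || v < min) min = v
      if (n == 0 || v > max) max = v
      sum += v
      n++
    }
    END { if (n > 0) printf "n=%d min=%.2f max=%.2f mean=%.2f\n", n, min, max, sum / n }
  '
}
```

With a log file called levels.log (hypothetical name), summarize_levels < levels.log gives a rough idea of where a quiet or loud threshold could sit in the rules.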