Raspberry Pi 4 hard freeze, possibly overheating

Hello everyone!
I have been seeing this issue occasionally with my older openHAB 2 installation, and recently I upgraded it to v3. The problem is that openHAB just completely freezes: no SSH, no ping, no response of any kind. Not even the heartbeat LED blinks.
After the upgrade I started experiencing it quite often, sometimes even twice a day, so I decided it was time to do something about it. After all, the problem now happens frequently, which makes it easier to debug.
First of all I connected a laptop to the console port and waited for the system to crash. It finally did. I expected to see a kernel panic, but to my surprise there was nothing at all. It was just frozen solid. So it looks like a hardware issue.
My hardware setup is:

  1. SATA 2.5" hard drive, connected via USB
  2. Raspberry Pi 4, 4 GB
  3. Custom-made HAT, adding 32 I/O ports and a 1-Wire interface via I2C
  4. 3x RS-485 ports: Modbus, Herzborg bus, energy meter
  5. 4 relays on GPIOs, not used yet

The hard drive is mounted on a plate; the Raspberry Pi with the HAT is mounted on top of the drive case, and the rest is connected by wires.
I have the sysinfo binding; the average temperature of the RPi core was 52…53 °C.
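
To cross-check the binding's reading outside openHAB, the core temperature can also be read straight from sysfs. This is only a minimal sketch: the path `/sys/class/thermal/thermal_zone0/temp` and the 80 °C throttling threshold are assumptions (neither is mentioned in this thread), and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumed sysfs path (usual on Raspberry Pi OS, not taken from this thread).
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")
# Assumed throttling threshold for the Pi 4 SoC, in degrees Celsius.
THROTTLE_C = 80.0


def parse_millidegrees(raw: str) -> float:
    """Convert the sysfs value (millidegrees Celsius) to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def headroom(temp_c: float, limit_c: float = THROTTLE_C) -> float:
    """Degrees left before the assumed throttling threshold is reached."""
    return limit_c - temp_c


def read_core_temp(path: Path = THERMAL_ZONE) -> float:
    """Read the current core temperature from sysfs."""
    return parse_millidegrees(path.read_text())


if __name__ == "__main__":
    temp = read_core_temp()
    print(f"core: {temp:.1f} C, headroom: {headroom(temp):.1f} C")
```

Under that assumed threshold, a reading of 52…53 °C would leave well over 20 degrees of headroom.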

In the meantime I am trying to bisect the bug, so I’ve disabled persistence and physically disconnected the HDD. The core temperature fell to 48…49 °C. It is running now. If there’s no crash for one month, I’ll consider the problem found and will redo the mounting so that the drive sits separately and doesn’t add heat to the RPi. If it does crash, I’ll have to disconnect the HAT (and lose the 1-Wire temperature monitoring).

So my questions are:

  1. Has anyone successfully solved a similar problem?
  2. What’s the normal RPi operating temperature?
  3. I’ve seen some cooling HATs available on the market, but can they work with other hardware add-ons? At the very least, mounting something on top would severely restrict the air flow.

The temperatures are definitely OK. I had issues with a Pi 4 (4 GB) when adding a MySensors gateway on the GPIO pins and an SSD via USB. Looking into the logs, there were low-voltage warnings (even when using the official Pi 4 power supply). I solved it by moving the GPIO add-on to a dedicated RPi 3 without an SSD (SD card only).
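
Besides the logs, the firmware keeps under-voltage and throttling flags that `vcgencmd get_throttled` reports as a hex bitmask (for example `throttled=0x50005`). The following is a rough sketch for decoding that value. The bit meanings follow the Raspberry Pi documentation as I understand it, and the function names are invented for illustration.

```python
import subprocess
from typing import List

# Bit positions in the value reported by `vcgencmd get_throttled`.
FLAGS = {
    0: "under-voltage detected (now)",
    1: "ARM frequency capped (now)",
    2: "currently throttled",
    3: "soft temperature limit active (now)",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def parse_throttled(output: str) -> int:
    """Extract the integer from a line such as 'throttled=0x50005'."""
    return int(output.strip().split("=", 1)[1], 16)


def decode(value: int) -> List[str]:
    """Return the descriptions of all flags set in the bitmask."""
    return [text for bit, text in sorted(FLAGS.items()) if value & (1 << bit)]


def read_throttled() -> int:
    """Run vcgencmd on the Pi itself and parse its output."""
    result = subprocess.run(
        ["vcgencmd", "get_throttled"],
        capture_output=True, text=True, check=True,
    )
    return parse_throttled(result.stdout)


if __name__ == "__main__":
    for line in decode(read_throttled()) or ["no flags set"]:
        print(line)
```

A value with bit 16 set would mean the supply dipped at some point since boot, even if everything looks fine at the moment of the check.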

Powering the board through a USB-C connector is a crappy idea. I have seen it too. I ended up powering the board through my HAT from a centralized +5 V supply from a Mean Well unit.
Okay, I’ll try to build a simple USB power injector and run a dedicated wire to the HDD.
In fact I even tried to complain about this kind of power input on the official RPi forum, as it’s very DIY-unfriendly, but they basically told me to GTFO: I am lame and I must use their power supply. Looks like they’re satisfied with selling a kids’ toy and don’t care about other uses.

Meanwhile, it crashed again yesterday. So not solved yet.

My 4 GB RPi 4 has been running constantly at about 55 °C for a year or so.
I have an SSD attached and a Z-Wave HAT, which usually consumes less power.
Everything is powered from the official power supply.
So I would follow Daniel’s suggestion.

Yes, I will also try to disconnect the add-on board.
I also suspect my +5 V grid. At the moment it’s a 5 A power brick, to which the Raspberry Pi and an 8-port Ethernet hub are connected.
Can anyone tell me how many amps that holy blessed official PSU is rated for? I remember reading “at least 3 A” somewhere.
I also remember that my previous board, an Orange Pi 2E 2 GB, also experienced kernel panics out of the blue. That was one of my reasons for replacing it, and that’s why it’s so hard for me to believe in having two faulty boards in a row.

Well, I’ve unplugged the HAT and left only the UARTs, because it makes no sense at all to run without them. I also disconnected my 1-Wire network from the same +5 V supply (forgot to mention that).
The switch is rated at 800 mA, which is next to nothing. And I’ve checked one of the local stores: the official RPi supply is rated at 3 A. So my 5 A PSU should be enough.
Let’s see…
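
The budget behind that conclusion can be sketched as simple arithmetic. This assumes the Pi never draws more than the 3 A the official supply is rated for. It covers only the two loads with figures quoted above, so the HDD and the 1-Wire network are not included. The names are made up for illustration.

```python
# Ratings quoted in the thread, in milliamps (integers avoid float rounding).
PSU_RATING_MA = 5000          # the 5 A power brick
LOADS_MA = {
    "raspberry_pi": 3000,     # assumption: worst case equals the official PSU rating
    "ethernet_switch": 800,   # rating printed on the switch
}


def total_load_ma(loads: dict) -> int:
    """Sum the rated current of all listed loads."""
    return sum(loads.values())


def headroom_ma(psu_ma: int, loads: dict) -> int:
    """Remaining PSU capacity; a negative value means the budget is exceeded."""
    return psu_ma - total_load_ma(loads)


if __name__ == "__main__":
    print(f"total load: {total_load_ma(LOADS_MA)} mA")
    print(f"headroom:   {headroom_ma(PSU_RATING_MA, LOADS_MA)} mA")
```

Rated current says nothing about voltage drop on long or thin wires, so a budget that adds up on paper does not rule out brief dips at the board.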

Hello there! In case someone is still reading this: a long time has passed, but I’ve finally made some progress with this problem.

It’s not the power supply; I am currently powering the Raspberry Pi from a dedicated PSU, separate from the rest of the system.

But it does have to do with the network. Probably it’s some low-level bug. The Raspberry Pi is connected to a Netis 8-port switch, like this one: https://tienda.siliceo.es/2502-thickbox_default/netis-st3108s-switch-8-ports-10-100-mbps-small.jpg . While frozen, the Raspberry Pi isn’t sitting completely idle; it does something weird to the switch. The Pi’s Ethernet activity LED blinks approximately 4 times per second, and the switch stops forwarding traffic.

I tried capturing anything with Wireshark. In promiscuous mode it sees absolutely nothing when my laptop is connected directly to the Raspberry Pi. If I monitor what’s happening on the switch instead, I see that all traffic stops except broadcast and multicast. It looks like the hung Raspberry Pi trashes the switch’s MAC address table.
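
To put numbers on the "only broadcast and multicast survive" observation, captured frames can be classified by destination MAC address. This is a minimal sketch, not what was actually run here. It assumes raw Ethernet frames are available as bytes, and the Linux-only `AF_PACKET` capture helper and the interface name `eth0` are illustrative assumptions.

```python
import socket
from collections import Counter
from typing import Iterable, Iterator

BROADCAST = b"\xff" * 6
ETH_HEADER_LEN = 14  # destination MAC + source MAC + EtherType


def classify_frame(frame: bytes) -> str:
    """Classify a raw Ethernet frame by its destination MAC address."""
    if len(frame) < ETH_HEADER_LEN:
        return "truncated"
    dst = frame[:6]
    if dst == BROADCAST:
        return "broadcast"
    if dst[0] & 0x01:  # group bit set in the first octet means multicast
        return "multicast"
    return "unicast"


def summarize(frames: Iterable[bytes]) -> Counter:
    """Count frames per destination class."""
    return Counter(classify_frame(f) for f in frames)


def capture(interface: str, count: int) -> Iterator[bytes]:
    """Yield raw frames from an interface (Linux only, needs root)."""
    eth_p_all = 0x0003  # receive every protocol
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.ntohs(eth_p_all)) as sock:
        sock.bind((interface, 0))
        for _ in range(count):
            yield sock.recv(65535)


if __name__ == "__main__":
    # "eth0" is a hypothetical interface name.
    print(summarize(capture("eth0", 200)))
```

Run once while the network is healthy and once during a freeze; a unicast count that drops to zero in the second run would match what Wireshark showed.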

Physically unplugging the Raspberry Pi from the switch immediately restores the network to normal.