I want to give an update. Our MongoDB has grown very large and is toppling over on itself, with item updates and access logs being pushed constantly by the 20k+ OH instances. My guess is there is a lot of tuning we can (and need to) do to make this better, but that’s going to take some time.
Until we can get that figured out, I have made a few changes that I hope will dramatically improve the performance and availability of the service, with the tradeoff that IFTTT triggers will not work during this time.
One major performance issue we are seeing is the update of item events, which is mainly used for IFTTT triggering. This is a two-step call: we query the item out of Mongo, compare the old value to the one just received, and if it has changed, write the item back to the DB.
This is generating a mind-boggling amount of traffic on our Node and MongoDB servers, and I believe we have hit a tipping point. I am going to re-evaluate this part of the code and see if there’s a better way to do this. Until then, we may need to live without this for a little while.
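For anyone trying to picture the two-step item update described above, here is a rough sketch in Python. It is not the actual service code: `FakeItemStore` and `handle_item_update` are made-up names, and a plain in-memory dict stands in for MongoDB so the example runs on its own. The point it tries to show is that every incoming event costs a read, even when the value has not changed.

```python
# Hypothetical in-memory stand-in for the items collection (not the real service code).
class FakeItemStore:
    def __init__(self):
        self.items = {}   # item id -> last stored state
        self.reads = 0    # number of lookups performed
        self.writes = 0   # number of write-backs performed

    def find(self, item_id):
        self.reads += 1
        return self.items.get(item_id)

    def save(self, item_id, state):
        self.writes += 1
        self.items[item_id] = state


def handle_item_update(store, item_id, new_state):
    """Read the stored state, compare, and write back only on change.

    Returns True when the state changed, which is the point where a
    trigger would be evaluated.
    """
    old_state = store.find(item_id)   # step 1: query the item
    if old_state == new_state:
        return False                  # unchanged: no write, but the read still happened
    store.save(item_id, new_state)    # step 2: write the item back
    return True


store = FakeItemStore()
events = [("Light", "ON"), ("Light", "ON"), ("Light", "OFF"), ("Light", "OFF")]
changed = [handle_item_update(store, item_id, state) for item_id, state in events]
print(changed, store.reads, store.writes)  # [True, False, True, False] 4 2
```

With four events, two of them repeats, the sketch performs four reads and two writes. Multiply that pattern by 20k+ instances reporting constantly and the load adds up quickly.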
I have just made these changes and restarted all services, so OH instances should be connecting over the next hour or so. I will also say that more investigative work still needs to be done here, and much of this is still speculative on my part until I can monitor the system as it comes back up.
If there are any MongoDB experts who know how to tune for performance (beyond adding indexes and changing ulimits), I would appreciate any assistance.
Long story short, hopefully the service is coming up now, and I will continue to monitor.