Pi Garage V2
It’s been a while since Pi Garage V2 was released, and I have not mentioned it on my socials. Since this is a major release, I thought I would go into a bit of detail on why I decided to bump the major version, as well as the nitty-gritty of what changed.
V1 Problems
Although V1 worked, it had a pretty big flaw in my opinion, related to how it handled sequences. To rewind: in Pi Garage, a sequence is the set of actual hardware “actions” that take place when a door is “opened”, “closed” or “toggled”.
In V1 the state was updated within a single request. This meant that if you wanted the UI in the mobile app to reflect what was happening, you may have needed some pretty long delays. For example, if your door took 20 seconds to open, you needed a 20-second HTTP request to change the “state”.
Although long-lived requests are not a problem in themselves, I did experience some undesirable behaviour in poor Wi-Fi scenarios. When a request could not immediately connect, there was a chance the mobile app would keep retrying it. You could press the button in the mobile app and nothing would happen; you would then walk towards the Wi-Fi and the request would go through without you pressing the button again. More than once I was shocked that something happened without me “actioning” it.
Solution Design
The underlying fix is to make the requests “complete” much faster, i.e. respond with something along the lines of “I have received your request and will action it for you” without waiting for the underlying action (opening the door, for example) to complete.
As most of you will have already picked up, this means converting the current synchronous workflow into an asynchronous one.
However, this has some complexities, as none of the current processes are asynchronous. Asynchronous workflows typically require more infrastructure and code than synchronous ones. This is why V1 did not do this initially: being synchronous allowed a quick “time to market”.
Another complexity is the locking code that prevents two requests from trying to perform an action on a given door at the same time. It was added because relays were locking up when multiple requests arrived in quick succession for the same door. The locks would need to be added to the asynchronous AND API flows at the same time, i.e. a request should fail if it cannot be added to a processing queue, but the door also needs to be locked/unlocked in the asynchronous processing workflow.
There are also two parts of the workflow that may or may not need to be in the same process: the state update (“opening” -> “open”) and the actual sequence processing (the relay clicking on/off).
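To make the "accept now, process later" idea concrete, here is a minimal in-memory sketch. It is not Pi Garage's actual code: `DoorJobQueue`, `accept`, `drain` and the `SequenceRunner` callback are hypothetical names, and a plain array stands in for a real queue.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not Pi Garage's API.
type DoorAction = "open" | "close" | "toggle";

interface DoorJob {
  doorId: string;
  action: DoorAction;
}

// Stand-in for the slow hardware sequence (relay on, wait, relay off).
type SequenceRunner = (job: DoorJob) => Promise<void>;

class DoorJobQueue {
  private readonly pending: DoorJob[] = [];

  constructor(private readonly runSequence: SequenceRunner) {}

  // Called from the HTTP handler: records the job and returns straight away,
  // without touching the hardware.
  accept(job: DoorJob): { accepted: true; queued: number } {
    this.pending.push(job);
    return { accepted: true, queued: this.pending.length };
  }

  // Called from a background worker: runs the queued jobs one at a time
  // and resolves with the number of jobs processed.
  async drain(): Promise<number> {
    let processed = 0;
    while (this.pending.length > 0) {
      const job = this.pending.shift()!;
      await this.runSequence(job);
      processed++;
    }
    return processed;
  }
}
```

The HTTP request only pays for `accept`, so it stays short however long the door takes to move; the slow part happens in `drain`, outside the request.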
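A rough sketch of how the lock could span both flows, assuming the API and the worker share one process. `DoorLocks`, `handleToggleRequest` and `processJob` are hypothetical names, and the 202/409 status codes are an assumption rather than what Pi Garage returns.

```typescript
// Hypothetical sketch: an in-memory, per-door lock shared by API and worker.
class DoorLocks {
  private readonly busy = new Set<string>();

  // Returns true if the caller now holds the lock, false if the door is busy.
  tryAcquire(doorId: string): boolean {
    if (this.busy.has(doorId)) return false;
    this.busy.add(doorId);
    return true;
  }

  release(doorId: string): void {
    this.busy.delete(doorId);
  }
}

// API side: take the lock before queueing, and fail fast if the door is busy.
function handleToggleRequest(
  locks: DoorLocks,
  doorId: string,
  enqueue: (doorId: string) => void,
): number {
  if (!locks.tryAcquire(doorId)) return 409; // assumed status for "door busy"
  enqueue(doorId);
  return 202; // assumed status for "accepted, will be processed"
}

// Worker side: always release the lock, even if the sequence throws.
async function processJob(
  locks: DoorLocks,
  doorId: string,
  runSequence: (doorId: string) => Promise<void>,
): Promise<void> {
  try {
    await runSequence(doorId);
  } finally {
    locks.release(doorId);
  }
}
```

If the API and the worker ran in separate processes, the `Set` would have to be replaced by a shared store, which is where Redis could come in.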
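The state half of that split can be sketched as two small pure functions. The state names other than “opening”, “closing” and “open”, and both function names, are assumptions made for illustration.

```typescript
// Hypothetical sketch: "closed" and the function names are assumptions.
type DoorState = "open" | "opening" | "closed" | "closing";
type DoorAction = "open" | "close" | "toggle";

// State to show as soon as a request is accepted.
function transitionalState(current: DoorState, action: DoorAction): DoorState {
  if (action === "open") return "opening";
  if (action === "close") return "closing";
  // Toggle: head away from wherever the door is (or is currently heading).
  return current === "open" || current === "opening" ? "closing" : "opening";
}

// State to show once the hardware sequence has finished.
function settledState(state: DoorState): DoorState {
  if (state === "opening") return "open";
  if (state === "closing") return "closed";
  return state;
}
```

Keeping these free of hardware calls means they could run in the API process, the worker process, or both, whichever way the split ends up.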
Redis + BullMQ
A typical pattern I have used before for this sort of thing is a message broker such as RabbitMQ or equivalent. However, I thought this was pretty heavy-handed for such a simple application.
I also knew that I would need locking in the future and thought that Redis would be good for that. In another project I had used the Node package Bull for simple queueing and to avoid circular dependencies.
However, Bull is marked as being in maintenance mode, with a recommendation to use BullMQ instead. Although the name makes you think it is linked with RabbitMQ, it is not; it is still dependent on Redis.
To be honest when using BullMQ with NestJS I struggled for a multitude of reasons.
- The NestJS queues documentation still lists Bull not BullMQ as the default recipe.
- The BullMQ documentation for NestJS leaves much to be desired when it comes to explaining what things are and how they interact.
- It is not possible to check that the Redis connection is up before publishing a message, so the message just waits until Redis comes back and is then published, even though it may no longer be relevant.
There was much trial and error to find a pattern that would work, and in my opinion the library’s flexibility left a little to be desired. I still haven’t found a valid solution for the last point, which will silently “eat” messages without any response to the user. For example, say you start the backend service and then Redis dies for some reason. When you tap “toggle” for a door in the mobile app, instead of receiving an error (so you immediately know something has gone wrong), the backend will publish the message and return a 200 success, and the UI won’t change (i.e. it won’t switch to “opening” or “closing”).
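One generic way to at least surface this kind of failure to the caller is to race the publish against a timer. The sketch below is library-agnostic and has not been verified against BullMQ; `withTimeout` is a hypothetical helper, and whatever promise the publish call returns would be passed in as `work`.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`no reply within ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time, so cancel the timer
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

This only lets the API return an error instead of a misleading success. It does not cancel the underlying publish, so a message that is already waiting could still be delivered once Redis comes back.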
Current Status
Although Pi Garage is at v2.2.0 (as of writing), not all of this work has been completed. The initial work of making the requests asynchronous is done, but there are two parts that are not.
The “lockout” for preventing relays from locking up has not been re-introduced yet. It was originally introduced to try to prevent relays from being locked up in software when requests were issued too quickly. This was slated for v2.2.0, but the Material 3 design issue has had to take priority (a story on this is coming in the near future).
The other issue is in the mobile app, where a request timeout has not been added to the state-change API call (and possibly others as well). I haven’t looked into this one yet (the Flutter HTTP request library), but most libraries support some way of limiting a request’s duration before failing.