IT System Design 101: Design for Deltas

In the previous article about system design I briefly hinted at a beneficial byproduct of correctly partitioned subsystems and well-defined APIs. For lack of a better expression, I call this property “delta capability” or “delta mode”. What does it mean? It means that a subsystem communicates only the data changes it sees. Let’s work through an example.

Let’s say we have a subsystem that continuously reads measurement values from a device. It reads 2 things: a voltage and a status word that tells whether the device is in an alarm or fault state or is operating just fine. If the situation is reasonably stable, we can expect only minor fluctuation in the voltage value and no changes to the status word. The question is how to communicate the voltage and status to the application core. As we learned in the previous article, there are 2 fundamental modes of operation for this.

The first mode is “full passthrough”. In this mode ALL the data is dumbly sent along. Even if neither the voltage nor the status word has changed, the subsystem still passes the data on. This is the “dumb pipes” approach discussed earlier.

The second mode, which we could call “delta mode”, communicates only the changed data to the application core. For this to work the subsystem needs to keep a record of previously received data. If the data has changed, then that data and that data only should be sent to the application core.
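As a minimal sketch of such a subsystem-side filter, assuming readings arrive as a dictionary of named fields (the class name `DeltaFilter`, the field names `voltage_mv` and `status`, and the sample values are all hypothetical), the record-keeping could look like this. It is written in Python for brevity; a real subsystem would more likely use a lower-level language.

```python
from typing import Dict, List

Reading = Dict[str, int]


class DeltaFilter:
    """Remembers the last value seen per field and reports only changes."""

    def __init__(self) -> None:
        self._last: Reading = {}

    def update(self, reading: Reading) -> Reading:
        """Return only the fields whose value differs from the previous read."""
        delta = {
            key: value
            for key, value in reading.items()
            if key not in self._last or self._last[key] != value
        }
        self._last.update(reading)
        return delta


# Hypothetical reads: voltage in millivolts plus a status word.
reads: List[Reading] = [
    {"voltage_mv": 54500, "status": 0},  # first read: everything is new
    {"voltage_mv": 54500, "status": 0},  # nothing changed
    {"voltage_mv": 54600, "status": 0},  # only the voltage changed
    {"voltage_mv": 54600, "status": 1},  # only the status changed
]

delta_filter = DeltaFilter()
deltas = [delta_filter.update(reading) for reading in reads]
# deltas[1] is empty, so nothing would be sent for the second read.
```

An empty delta means the subsystem has nothing to put on the bus for that read. In full passthrough mode all four readings would have been sent in full.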

Of these 2 modes of operation, I argue that delta mode is the better design. As described in the example in the previous article, in this mode the subsystem passes only a limited amount of data, dramatically reducing the stress on the IPC bus. In full passthrough mode, by comparison, the subsystem passes all the data to the IPC bus. And to add insult to injury, it does so every single time it reads data from the device. My estimate is that, depending a bit on the situation, on average about 93% of the IPC bus capacity can be spared with the delta approach. The following picture hopefully demonstrates the principle involved.

If we want to further limit the amount of data travelling between subsystems, we can also introduce something called hysteresis. It means that a change is sent to the application core only if it exceeds a chosen threshold. Let’s look at another example. Let’s say our hypothetical subsystem reads 54.5V from the device, then 54.6V, then 54.5V again and 54.6V again, continuing this cycle. Even in delta mode, we would still send every one of these voltage values during the 4 rounds of reading, because each value differs from the previous one by 0.1V. But now, let’s introduce a hysteresis value of 0.1V. It means the change must be more than 0.1V to trigger a send to the application core. Because the change was always exactly 0.1V, no data is sent. If, however, the subsystem reads 54.5V and then 54.7V, this step triggers a send because the change is 0.2V, i.e. greater than our hysteresis level of 0.1V. In this case it sends 54.7V.
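The hysteresis rule can be sketched as follows. This is an illustration under some assumptions that the text does not spell out: the first reading is always sent as a baseline, each new reading is compared against the last value actually sent (not the last value read), and voltages are held as integer millivolts so that a change of exactly 0.1V compares cleanly against a 0.1V threshold without floating-point rounding. The names `HysteresisFilter` and `sent_values` are hypothetical.

```python
from typing import List, Optional


class HysteresisFilter:
    """Passes a value on only when it moved more than `threshold` from the last sent value."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self._last_sent: Optional[int] = None

    def update(self, value: int) -> Optional[int]:
        """Return the value if it should be sent, otherwise None."""
        if self._last_sent is None or abs(value - self._last_sent) > self.threshold:
            self._last_sent = value
            return value
        return None


def sent_values(readings_mv: List[int], threshold_mv: int) -> List[int]:
    """Run a series of readings through the filter and collect what gets sent."""
    hysteresis = HysteresisFilter(threshold_mv)
    results = (hysteresis.update(reading) for reading in readings_mv)
    return [value for value in results if value is not None]


# 54.5V, 54.6V, 54.5V, 54.6V with 0.1V hysteresis: only the baseline goes out.
stable = sent_values([54500, 54600, 54500, 54600], threshold_mv=100)  # [54500]

# 54.5V then 54.7V: the 0.2V step exceeds the threshold, so 54.7V is sent.
jump = sent_values([54500, 54700], threshold_mv=100)  # [54500, 54700]
```

Comparing against the last sent value rather than the last read value is a design choice: it means a slow drift in small steps is still reported once it has accumulated past the threshold.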

So, again, when should we use delta mode? Basically, almost always. I support a design paradigm where the subsystems are written in a low-level language, resulting in fast execution. By outsourcing the delta calculation to this fast code, we relieve the application core of checking for data changes every time. Remember, the workload of the application core can already be quite heavy because it is doing every kind of limit and concurrency checking and persistence handling, as per Alistair Cockburn.

Perhaps the only case where delta mode should NOT be used is when the subsystem code is incredibly slow and the application core code is quite fast. But then we would miss one key benefit: the “push capability”.

In a push capability design, every data-emitting subsystem works in delta mode. During system startup, the subsystem sends the full dataset once to the application core to establish a baseline (or vice versa, or they do mutual baselining). After this, the subsystem only sends and receives deltas. This enables the application core to treat all incoming delta values as changes that need to be communicated to other subsystems. In a system operating on the push principle, the subsystems no longer need to do resource-consuming polling. They establish the baseline once and then only react to changes, with very low resource utilization because a minimum amount of data is passed. This is especially handy in UI subsystems, and many UI-related frameworks actually expect new data as pushed values instead of poll results.
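The baseline-then-deltas flow can be sketched like this. It is only an illustration: the `ApplicationCore` class, its method names and the callback-based subscription are assumptions made for the example, not an API from the previous article, and a real system would deliver the pushes over the IPC bus rather than through in-process callbacks.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Listener = Callable[[State], None]


class ApplicationCore:
    """Holds the current state and pushes every real change to its subscribers."""

    def __init__(self) -> None:
        self._state: State = {}
        self._listeners: List[Listener] = []

    def set_baseline(self, full_dataset: State) -> None:
        """Called once at startup with the full dataset from the emitting subsystem."""
        self._state = dict(full_dataset)

    def subscribe(self, listener: Listener) -> State:
        """Register a consumer and return the current state as its one-time baseline."""
        self._listeners.append(listener)
        return dict(self._state)

    def on_delta(self, delta: State) -> None:
        """Apply a delta and push only the fields that actually changed."""
        changed = {
            key: value
            for key, value in delta.items()
            if key not in self._state or self._state[key] != value
        }
        if not changed:
            return
        self._state.update(changed)
        for listener in self._listeners:
            listener(dict(changed))


core = ApplicationCore()
core.set_baseline({"voltage_mv": 54500, "status": 0})

pushed: List[State] = []                 # stands in for a UI subsystem's inbox
ui_state = core.subscribe(pushed.append)  # baseline is fetched exactly once

core.on_delta({"voltage_mv": 54700})     # pushed to the UI
core.on_delta({"voltage_mv": 54700})     # no change, nothing is pushed
for update in pushed:
    ui_state.update(update)              # the UI only reacts, it never polls
```

After these calls the UI-side copy holds the new voltage and the unchanged status, even though only a single one-field message travelled after the baseline.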

Another key paradigm for building a working delta mode subsystem comes into play with self-addressing device buses. This paradigm is called the decoupling of address and entity data. But that deserves an article of its own.
