Node-RED is flow-based visual programming, which requires a new way of thinking when it comes to refactoring and programming generally.
Node-RED models data flows between computational nodes. The lines represent the flow of data and the large colourful rectangles are the computational components, i.e., nodes.
Important to note: data flows from left to right, so the left side of a node is the input side and the right side is the output side. I have indicated this by using arrows on the input side; the Node-RED editor does not have arrows, so these are here only for presentational purposes.
Data may be altered by the nodes and all alterations are passed down the flow. A node may only have zero or one input but many outputs. Each output, i.e. line out of a node, is a duplication of the data object returned by the node.
This means that each node has its own copy of the data, which implies that data objects should not be excessive in size, since duplication, i.e. cloning, costs time and memory.
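To make the cost of duplication concrete, here is a minimal sketch in plain JavaScript of what fanning a message out over several lines amounts to. It is not how the Node-RED runtime is implemented: `fanOut` is a hypothetical helper, and the JSON round-trip is only a simple stand-in for whatever cloning the runtime actually performs.

```javascript
// Hypothetical helper: hand an independent copy of msg to every outgoing line.
// The JSON round-trip is a simple deep copy, used here only for illustration.
function fanOut(msg, wireCount) {
  const copies = [];
  for (let i = 0; i < wireCount; i++) {
    copies.push(JSON.parse(JSON.stringify(msg)));
  }
  return copies;
}

const original = { payload: { readings: [1, 2, 3] } };
const [top, bottom] = fanOut(original, 2);

// Changing one copy affects neither the other copy nor the original.
top.payload.readings.push(4);
console.log(top.payload.readings.length);      // 4
console.log(bottom.payload.readings.length);   // 3
console.log(original.payload.readings.length); // 3
```

Every extra line means another full copy, which is why large data objects become expensive.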
Note: I use Node-RED version 3.0.2 for this; your mileage might vary.
I am going to deal with something that has often happened to me: copy & pasting of nodes leads to duplication of computational code, i.e. nodes.
How to deal with duplication? Visual programming also aims to apply the Don’t Repeat Yourself (DRY) principle, but how to handle that in a visual manner?
Starting with this flow:
Here we have two yellow nodes that represent two different states that both flow to a purple node. Between them are two orange nodes respectively. The orange nodes are duplicates of the same computation.
How to handle the duplication?
An initial idea would be to do something like this:
(Note: dashed nodes are disabled nodes and can be removed in this case.)
Node-RED does not allow us to specify which output flow to take: all output is passed to all connected nodes.
Here the problem becomes: how to select the correct purple node after doing the common computation?
Diverting flows is possible using nodes with two or more output connectors. Nodes have one input but can have multiple output connectors (not to be confused with output connections).
One solution could be to set a flag somewhere further up the flow and then have a switch node divert the flow according to that flag:
(Note: setting a flag in Node-RED terms means adding an attribute to the data object passing along the flow. This modified data object arrives at the switch node.)
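As a rough illustration of setting such a flag, here is a sketch of what a function node further up the flow might do. The attribute name `flag` follows the text; the wrapper function `setFlag` is hypothetical and exists only so that the snippet runs on its own (inside a function node only the body would be used). A change node could likely achieve the same without any code.

```javascript
// Hypothetical wrapper around the body of a function node.
function setFlag(msg) {
  msg.flag = 1; // attribute the switch node will later inspect
  return msg;   // pass the modified data object down the flow
}

const out = setFlag({ payload: "hello" });
console.log(out.flag);    // 1
console.log(out.payload); // hello
```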
A switch node has multiple output connectors and represents an if .. then .. else .. programmatic statement, in this case: if flag == 1 then take top flow else bottom flow.
A switch has one output connector per if-clause. So if we had a third value for the flag, then our switch would have three output connectors. Something like this pseudo code:
if flag == 1 then output-connector-1
else if flag == 2 then output-connector-2
else output-connector-3
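The same routing can be sketched in JavaScript. This is not the switch node itself but a hypothetical function, `route`, that follows the convention function nodes use for multiple outputs: return an array with one slot per output connector, where `null` means nothing is sent on that connector.

```javascript
// Hypothetical stand-in for a switch node with three output connectors.
// One array slot per connector; null means "send nothing here".
function route(msg) {
  if (msg.flag === 1) return [msg, null, null]; // output-connector-1
  if (msg.flag === 2) return [null, msg, null]; // output-connector-2
  return [null, null, msg];                     // output-connector-3
}

// Report which connector (1-based) a message leaves on.
function connectorOf(msg) {
  return route(msg).findIndex((slot) => slot !== null) + 1;
}

console.log(connectorOf({ flag: 1 })); // 1
console.log(connectorOf({ flag: 2 })); // 2
console.log(connectorOf({ flag: 7 })); // 3
```

Each message leaves on exactly one connector, which is what lets the flow diverge after the common computation.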
Then the switch would look like this:
This is one approach to solving computational duplication; there are others.
An alternative approach is sometimes to perform the common computation before the specifics:
Here the common computation is done first and then the flows diverge. This could be a case of some computation that is independent of the two yellow nodes and can therefore be done beforehand. This approach also follows the fail early, fail fast pattern.
The flag for the switch (lots of hand-waving here) is set somewhere further up the flow!
This approach to duplication is fine if there is only one area of duplication, but what if we had two different and disconnected flows that use the same code?
In this case we have two different flows that simply cannot be brought together. How do we refactor this? Enter Subflows! Subflows are a concept within Node-RED for bundling together nodes into a mini-flow, i.e., a subflow.
Subflows can be created by highlighting the common nodes and going to the Node-RED menu and selecting Subflows --> Selection to Subflow.
The subflow then becomes:
In the subflow editor, one can specify the number of outputs (one in this case) and whether the subflow has zero or one input (also one in this case).
Our flows become this:
The advantage of Subflows is that any modification made to them is reflected in all usages of the subflow. This means that if I changed the subflow to contain three nodes, then that would be reflected in both flows above.
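For readers who think in code, a subflow is loosely comparable to a shared function. The sketch below is only an analogy in plain JavaScript, not a Node-RED API: `commonComputation`, `firstFlow` and `secondFlow` are hypothetical names, and doubling the payload is an arbitrary stand-in for the common computation.

```javascript
// Hypothetical shared computation, playing the role of the subflow.
function commonComputation(msg) {
  return { ...msg, payload: msg.payload * 2 };
}

// Two unrelated "flows" that both reuse it, each followed by its own step.
function firstFlow(msg) {
  const out = commonComputation(msg);
  return { ...out, topic: "first" };
}

function secondFlow(msg) {
  const out = commonComputation(msg);
  return { ...out, topic: "second" };
}

console.log(firstFlow({ payload: 2 }).payload);  // 4
console.log(secondFlow({ payload: 5 }).payload); // 10
```

Changing the body of `commonComputation` changes the behaviour of both flows at once, which is the property described above.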
Subflows can have multiple output connectors (not to be confused with connections) and if we look at the top flow, we have the switch node:
Perhaps that would be a candidate for putting into our subflow and having multiple output connectors? Let’s try it.
Our subflow becomes this:
And our flows become this:
But wait! The second flow now has a dangling output connector on the subflow? What to do with it? Does the second output flow into the red node? What does the second output semantically mean?
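One way to see the problem is to model it in plain JavaScript. The sketch below is an assumption-laden analogy, not Node-RED code: `subflowWithSwitch` is a hypothetical stand-in for the subflow with the switch moved inside, and `secondFlowOutput` models a flow that has only wired up the first output connector. A message sent to a connector with nothing attached simply goes nowhere.

```javascript
// Hypothetical subflow: common computation followed by a two-way switch.
// Returns one slot per output connector; null means nothing is sent.
function subflowWithSwitch(msg) {
  const result = { ...msg, payload: msg.payload * 2 };
  return result.flag === 1 ? [result, null] : [null, result];
}

// The second flow has only wired up the first output connector.
function secondFlowOutput(msg) {
  const [first] = subflowWithSwitch(msg);
  return first; // null when the message left via the dangling connector
}

console.log(secondFlowOutput({ flag: 1, payload: 2 }).payload); // 4
console.log(secondFlowOutput({ flag: 2, payload: 2 }));         // null
```

Any message routed to the second connector is lost for the second flow, so that flow now depends on a flag it never needed before.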
And herein lies the disadvantage of Subflows sharing change. The challenge of finding the right balance of code commonality remains, even with a visual programming environment.
There is another approach that I describe over here, and that is using link in/out/call nodes. That is probably the approach that you will use most often, but it is also a little tricky to get one's head around.
I am sure there are more approaches (including creating nodes); if you have any ideas, please send me an email.
Also, this page has been created using Node-RED. Click on the [Source Flow] link below to view the code behind this page!