Why is flow based programming so hard?

FBP as described by J. Paul Rodker Morrison:

… [Flow Based Programming] uses a “data processing factory” metaphor for designing and building applications. FBP defines applications as networks of “black box” processes, which communicate via data chunks (called Information Packets) travelling across predefined connections (think “conveyor belts”), where the connections are specified externally to the processes. These black box processes can be reconnected endlessly to form different applications without having to be changed internally. FBP is thus naturally component-oriented. - source

Conveyor belts or assembly lines are a good analogy for FBP: as the data flows through the program/flow, each node along the way alters, consumes or adds to that data. Eventually the modified data packet comes out the other end.
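A minimal Python sketch of that idea, under loud assumptions: the node names (trim, count_words, drop_trace), the packet fields and the run_flow helper are all hypothetical and belong to no FBP framework. Each node only sees the packet handed to it, and the wiring lives in a separate list, outside the nodes.

```python
# Hypothetical nodes: each takes a packet (a dict) and returns a new packet.

def trim(packet):
    # Alters existing data: strips surrounding whitespace from "text".
    return {**packet, "text": packet["text"].strip()}

def count_words(packet):
    # Adds to the data: a new "words" field.
    return {**packet, "words": len(packet["text"].split())}

def drop_trace(packet):
    # Consumes data: the "trace" field does not travel any further.
    return {key: value for key, value in packet.items() if key != "trace"}

# The connections are specified outside the nodes (an assumption of this
# sketch: a flow is just an ordered list of nodes).
FLOW = [trim, count_words, drop_trace]

def run_flow(flow, packet):
    # Move the packet along the "conveyor belt", one node at a time.
    for node in flow:
        packet = node(packet)
    return packet

result = run_flow(FLOW, {"text": "  flow based programming ", "trace": "tmp"})
print(result)
```

Reordering or shortening FLOW gives a different application without touching any node, for example run_flow([count_words], {"text": "a b"}).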

Assembly lines work on the same principle. As a product moves through the assembly line, each station/point along the way applies its one alteration to that product, the same alteration for every product that passes.

Assembly: construct, put together

Line: a connection of points forming a path

Thus an assembly line becomes a series of points, along which a product is constructed.

The hidden difficulty in the assembly line concept (and in Flow Based Programming) is deciding how focussed each station should be. What task is performed at each station and how complex is that task?

In theory an assembly line could consist of a single station where the product is put together completely, or it could consist of a hundred steps that each apply a single alteration to the product. What is the optimal number of steps to construct the product?

The answer is there is no answer. There are various factors that influence the answer, but there is no general answer for every product.

Assembly lines that have humans at each station can quantify the work done at each station by the human working at that station. For the assembly of a doll, for example, there would be a station where a worker adds the eyes to the doll. That worker repeats the eye-adding alteration over and over again. Having that worker additionally attach the feet would imply that, along with a collection of eyes, the worker would also have a collection of feet at their station.

This would make the station take up more space and would require that two parts are delivered to the station. In addition, the worker would constantly be context switching between attaching feet and attaching eyes, which could lead to mistakes since the worker now has two tasks to perform.

Hence the amount of work done at a station has inter-related dependencies: a) resupplying the station, b) space taken up by the station, c) context switching of workers between tasks, d) overall speed of the assembly line if workers take too long to perform their alterations, and e) reusability: can the tasks be applied to different forms of the product? The worker attaching eyes can attach eyes to any doll, but are the feet the same for all dolls?

Therefore creating an assembly line using humans includes the limits of the humans. This is not the case for an assembly line built with software: there are no such limits. Software does not have difficulty context switching, software takes no breaks and software, once it is working, makes no mistakes repeating the same task over and over again.

Flow based programming has individual components that perform single tasks upon the data flowing through these components but what limitations do these components face?

A computer does not get tired, a computer does not need to be resupplied, a computer does not make mistakes when context switching. A computer is a better robot than a human.

What limits remain? Decomposability. How to decompose the product or problem into smaller tasks that can also be reused.

Creating a single node that does everything with the data is the same as creating an algorithmic textual program. This does not take advantage of flow based programming. If flows consist of a single node, then these flows should be done using textual programming methods, and the visual component is simply overhead.

Along which lines should these algorithms be broken down to make them suitable for node-based programming? How much “work” should an individual node do?

The Unix philosophy points out that a program should do one task and one task only, but do that task exceptionally well.

This philosophy is easily applied to well-defined concepts such as sorting: a sorting node would be able to sort alphabetically or numerically, either ascending or descending. That is all the sorting node does. Should the sort node also be able to count elements? No, of course not: sorting is the specific and sole task of the sorting node.

Should the sort node be able to distinguish between upper and lower case when sorting, or must the data be transformed before being sorted? Yes, the sorting node should be aware of case because case has an influence on sorting. If case had no influence on the sort algorithm, then the sort node should not be aware of the case of the data. But here the grey zone begins: should the sorting node be case aware?
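The grey zone can be made concrete with a small Python sketch. Everything in it is an assumption for illustration: sort_node, lowercase_node and their parameters are hypothetical names, not the API of any real flow tool. It compares a case-aware sort node against transforming the data upstream and then sorting.

```python
def sort_node(items, numeric=False, descending=False, ignore_case=False):
    """Hypothetical sort node: it sorts, and does nothing else."""
    if numeric:
        key = float            # compare "10" and "9" as numbers
    elif ignore_case:
        key = str.casefold     # compare without regard to case
    else:
        key = None             # plain character-code ordering
    return sorted(items, key=key, reverse=descending)

def lowercase_node(items):
    """Hypothetical upstream node: transforms the data before sorting."""
    return [item.lower() for item in items]

words = ["pear", "apple", "Banana"]

case_sensitive = sort_node(words)                     # capitals sort first
case_aware = sort_node(words, ignore_case=True)       # original spelling kept
transformed_first = sort_node(lowercase_node(words))  # original spelling lost

print(case_sensitive)
print(case_aware)
print(transformed_first)
```

The last two results are ordered the same way but are not the same data: transforming first changes "Banana" to "banana", while the case-aware node leaves the values untouched. That difference is one argument for letting the sort node know about case.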

The application of the Unix philosophy is why Linux systems have a large collection of programs (sort, wc, time, cut, col, cat) which can be chained together into an assembly line using the pipe | character. It takes a certain amount of imagination to find the right combination of programs in Unix to complete a specific task. And that is the underlying, unspoken difficulty of flow based programming.
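As a rough illustration of the pipe idea, here is a Python sketch in which generators stand in for Unix programs. The functions cat, cut, sort and wc_l are hypothetical stand-ins named after the tools, not wrappers around them, and the sample data is invented. Note one deliberate difference: the field index here is zero-based, whereas the real cut counts fields from 1.

```python
# Invented sample data: name,city per line.
DATA = "alice,berlin\nbob,amsterdam\ncarol,cairo"

def cat(text):
    # Emit the input one line at a time.
    yield from text.splitlines()

def cut(lines, field, delimiter=","):
    # Keep a single field of each line (zero-based, unlike the real cut).
    for line in lines:
        yield line.split(delimiter)[field]

def sort(lines):
    # Sorting needs the whole stream before it can emit anything.
    yield from sorted(lines)

def wc_l(lines):
    # Count the lines flowing past.
    return sum(1 for _ in lines)

# Roughly the shape of: cat file | cut -d, -f2 | sort
cities = list(sort(cut(cat(DATA), field=1)))
print(cities)

# The same components, recombined for a different task.
line_count = wc_l(cat(DATA))
print(line_count)
```

None of the four functions knows about the others. The work lies in choosing which ones to chain, and in what order.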

Decomposability

Flow Based Programming, be it visual or textual, is hard since:

Question 1: How to break down a task into smaller components?
Question 2: How to combine existing components to fulfil the requirements of a specific task?

Why break down tasks into smaller components in the first place?

Reuse of components is the goal. Reusing code that has been written, tested and proven is an advantage over writing everything from scratch.

Single responsibility

Avoid combining multiple responsibilities into a single component.

Unix tools have been developed by many people over many years and used by many more, which gives good reason to trust that they work. Using the sort command in Unix makes more sense than writing your own code to do sorting.

Often it is better to transform data into formats that are compatible with existing tools than to rewrite the required functionality to work with the original data.
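A small Python sketch of that trade-off, with invented data: the dates below are written day.month.year, so sorting them as plain text gives the wrong order. Rather than writing a new sort that understands this format, a hypothetical to_date helper transforms each value into something the existing sorted function already handles.

```python
from datetime import datetime

# Invented input: dates as day.month.year strings.
raw = ["29.02.2024", "01.12.2023", "15.01.2024"]

def to_date(text):
    # Transform the original format into one the existing tool can compare.
    return datetime.strptime(text, "%d.%m.%Y").date()

# Sorting the text as-is compares character by character: wrong order.
naive = sorted(raw)

# Reuse the existing sort and only supply the transformation.
ordered = sorted(raw, key=to_date)

print(naive)
print(ordered)
```

The sorting itself is never rewritten. Only the small adapter is new code, and it can be reused wherever the same date format appears.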

Flow based programming centres around code reuse and recombining what already exists. This also exists in other paradigms of programming. The difference is that reuse is required in flow based programming for it to make sense.

Last updated: 2024-02-29T08:46:38.372Z

