Currently, there is a panic around packages and dependencies within the software development environment because of AI. Basically, what is happening is that AI is being used to find zero-days in existing packages, in particular in the npm and Python communities. These zero-days are being exploited faster than the packages can be patched. A lot of developers are worried about dependencies being corrupted downstream by malicious actors in the package supply chain. Folks are losing trust in package maintenance and package installers. As a result, dependencies are being scrutinised and potentially being replaced by home-made code.
This is particularly worrying in the Node.js community because they use many, many packages, and many of those packages have interdependencies. Consequently, you have many attack vectors and a large attack surface for potential problems. This has now come to the point where developers are saying, “Well, I’d rather get Claude to generate the code I need from that package and incorporate that into my software rather than taking the risk of package corruption from non-trusted package sources.”
Which is kind of idiotic because Claude[1] has been trained on the packages that have the zero-days. So there’s no guarantee that Claude won’t come up with code that also has zero-days in it.
So the same developers that claim that AI isn’t conscious because it’s a probability engine then claim, “Okay, if I take the code from Claude, then I know it won’t be zero-day’ed.” That is a false conclusion because Claude is not thinking; it’s generating characters by probability. Those characters are based on what it has been trained on. If it’s been trained on packages that do have these zero-days, the code it generates could also have zero-days.
Now, unless you argue that Claude is conscious and is able to pick up on these things and fix that, then I think you’re kind of not really solving the problem by recoding stuff that you need using Claude[2]. I think you’re going to end up with the same problems, just that these problems are going to be happening down the road.
So, that’s the issue at the moment: you can’t trust dependencies and you can’t really trust what Claude comes up with either.
How do dependencies work in flow-based programming? Flow-based programming, which is similar to Unix pipes, uses small modular functionality and combines those parts into a longer chain of computation. Each part has inputs and generates outputs which are passed along the chain of computation until a final result is generated. In Node-RED, this is done using nodes that are connected via wires. Nodes do the computation and wires transport the data between the computation units.
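To make the nodes-and-wires idea concrete, here is a minimal sketch in Python. It is not Node-RED code and does not use Node-RED's API; the node names (`to_upper`, `split_words`, `count_items`) and the `run_flow` helper are made up for illustration. The only thing borrowed from Node-RED is the idea of a message carrying a `payload` property.

```python
from typing import Any, Callable, Dict, List

# A message is a plain dictionary; "payload" mirrors the Node-RED convention.
Message = Dict[str, Any]
# A node takes a message and returns a new message.
Node = Callable[[Message], Message]


def to_upper(msg: Message) -> Message:
    """Hypothetical node: upper-case the payload."""
    return {**msg, "payload": str(msg["payload"]).upper()}


def split_words(msg: Message) -> Message:
    """Hypothetical node: split the payload into a list of words."""
    return {**msg, "payload": str(msg["payload"]).split()}


def count_items(msg: Message) -> Message:
    """Hypothetical node: replace the payload with its length."""
    return {**msg, "payload": len(msg["payload"])}


def run_flow(nodes: List[Node], msg: Message) -> Message:
    """Pass the message along the chain; the list order plays the role of the wires."""
    for node in nodes:
        msg = node(msg)
    return msg


flow = [to_upper, split_words, count_items]
result = run_flow(flow, {"payload": "small parts loosely joined"})
print(result["payload"])
```

Each function only knows about the message it receives, much like a command in a Unix pipeline only knows about its standard input.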
Basically, what is happening in flow-based programming is the abstraction of certain activities to a higher level. Instead of dealing with if-then-else-while kinds of constructs, you’re using HTTP requests, JSON parsing, sending messages to a message bus and similar. So you’ve moved up one level and have these nodes that are doing higher-level tasks, much like a Unix pipeline combines commands. Each of those commands is a high-level representation of part of the overall task that you wish to perform. And so it is with flow-based programming.
This means that each node in flow-based programming has a specific task that it performs. And that task is the only thing it does. So you have an HTTP request node. And all it does is make an HTTP request. It doesn’t do an API request. It doesn’t maintain state. It doesn’t do a specific REST request. It does an HTTP request. And it doesn’t handle receiving requests. It’s just doing a request. It’s pulling in data. That’s all it does. Within that scope, it also supports SSL certificates because that’s part of HTTPS. So it does everything around that functionality and nothing more.
Should it be doing JSON parsing? In Node-RED, for example, the HTTP request node does have JSON parser functionality. But there is also a JSON parser node which should be dealing with it. The cleaner approach would be to have an HTTP request node that generates content, i.e., text or binary, and then have a JSON parser node parsing that. The HTTP request node’s return value is either a UTF-8 string, a binary buffer, or a parsed JSON object. Strictly speaking, the JSON parsing goes beyond what an HTTP request node should be doing.
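The cleaner split could look roughly like the following Python sketch. It is an illustration only, not how the Node-RED nodes are implemented: `make_http_request_node`, `json_parser_node` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real HTTP library so that the example runs without a network.

```python
import json
from typing import Any, Callable, Dict

Message = Dict[str, Any]


def make_http_request_node(fetch: Callable[[str], str]) -> Callable[[Message], Message]:
    """Build a node that only performs the request and emits the raw body as text."""
    def node(msg: Message) -> Message:
        body = fetch(msg["url"])
        return {**msg, "payload": body}
    return node


def json_parser_node(msg: Message) -> Message:
    """Separate node: turn the text payload into a Python object."""
    return {**msg, "payload": json.loads(msg["payload"])}


def fake_fetch(url: str) -> str:
    """Assumption: a stand-in for a real HTTP client, returning a canned JSON body."""
    return json.dumps({"status": "ok", "url": url})


http_request_node = make_http_request_node(fake_fetch)
raw = http_request_node({"url": "https://example.invalid/data"})
parsed = json_parser_node(raw)

print(type(raw["payload"]).__name__)
print(parsed["payload"]["status"])
```

With this split, the request node never needs to know what format the body is in, and the parser node never needs to know where the text came from.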
There are no definitions or standardisation. There are many grey zones here of what makes a node and what functionality a node covers. Things like CSV parsing or HTML parsing or JSON parsing are simple things because these are very well-defined features. So we’ve got a definition language, HTML, and we want to parse it, and HTML is well-specified. The node for parsing it knows exactly what to do. And likewise for CSV and JSON.
With things like HTTP requests, MQTT, WebSockets and this kind of stuff, you start having feature creep, perhaps. Because you have this thing: do I parse HTML or don’t I parse HTML? Do I parse JSON or don’t I parse JSON? Strictly speaking, you don’t want to be parsing this stuff because the aim is separation of concerns.
The point here is that each node then has a set of dependencies for doing this stuff. These nodes don’t implement an HTTP request engine; they use libraries that provide this functionality. And so there is a set of dependencies. So if you have an HTTP request node, it’s going to have a dependency that somewhere along the line uses or implements HTTP requests. So anything related to HTTP requests will be isolated. The dependency will be isolated to the HTTP request node. The same goes for an MQTT node. The MQTT node won’t have a dependency that has an HTTP request somewhere along the line; it will have a library that does MQTT communication.
And so what I’m getting at here is that you have these dependencies within flow-based programming, but they’re isolated around the nodes. Node-RED itself has a set of dependencies, but these aren’t related to HTTP requests or MQTT features. Node-RED only has dependencies related to providing a flow-based-programming-inspired engine. That means it knows how to do message parsing, it knows how to connect nodes, it knows how to visualise that. That’s all Node-RED does and those are the only dependencies it requires.
So by doing that, then, of course, you start to be able to go: “Okay, well, if I have a broken dependency, then that dependency is required by that node. So I’ve got to deactivate that specific node. Or replace that node with something else. Everything else stays the same”.
So here you have this dependency management built into flow-based programming because it’s part of the paradigm. Each node has a different functionality, and it is the same when you look at Unix pipes. So if you build a Unix pipeline with a bunch of commands, and one of those commands has a zero-day, you just replace that command. Or you upgrade that command. The rest of the commands in that pipeline aren’t affected.
That’s a great way of limiting how far dependency problems and zero-days reach into your codebase. Potential problems are limited to a set of nodes that you can easily identify, deactivate and work around.
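As a rough illustration of that isolation, the Python sketch below keeps a mapping from node types to the libraries they depend on and asks which nodes are touched by a compromised library. The node types and library names are placeholders invented for this example, and the sketch assumes each node lists all of its libraries directly; real packages also have transitive dependencies, which this ignores.

```python
from typing import Dict, List, Set

# Hypothetical node-to-library mapping; the library names are placeholders.
NODE_DEPENDENCIES: Dict[str, Set[str]] = {
    "http-request": {"http-client-lib"},
    "mqtt-in": {"mqtt-client-lib"},
    "json-parser": set(),
    "csv-parser": {"csv-lib"},
}


def affected_nodes(compromised: str, deps: Dict[str, Set[str]]) -> List[str]:
    """Return the node types that depend on the compromised library."""
    return sorted(name for name, libs in deps.items() if compromised in libs)


def unaffected_nodes(compromised: str, deps: Dict[str, Set[str]]) -> List[str]:
    """Return the node types that can stay in the flow untouched."""
    return sorted(name for name, libs in deps.items() if compromised not in libs)


to_disable = affected_nodes("mqtt-client-lib", NODE_DEPENDENCIES)
to_keep = unaffected_nodes("mqtt-client-lib", NODE_DEPENDENCIES)
print(to_disable)
print(to_keep)
```

Only the nodes in the first list need to be deactivated or replaced; everything in the second list stays the same.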
The advantage of using flow-based programming is clean dependency management: dependencies are managed by separate nodes, rather than the entire application having one set of dependencies. But can nodes be interchanged as simply as the advertising says?
The aim is to replace zero-day’ed libraries with alternatives that have not yet had any discovered or exploited zero-days. In flow-based programming, this means that the set of nodes that depend on the corrupted library must be replaced. What is involved in a drop-in replacement of nodes?
In flow-based programming, nodes have the same API: they get a message with a bunch of properties, and then they do something. Sure, the definition of which properties the message must have differs from node to node. So it is quite possible to have two, three or four different HTTP request nodes that each require different properties on the message. But that’s clearly defined.
And the convention within Node-RED is, for example, that the payload property on the message represents the result of the node. So if you do have four different HTTP request nodes, by convention, once they do the request and get the return value, that result will be in msg.payload. So downstream from that node, there probably won’t be any changes to be made.
Upstream from the node, you might have to set a different property for the URL you might be requesting, or whatever you’re doing with the HTTP request node. Ideally, you wouldn’t have to change the API, but the point is that there is no clear definition for nodes to say, “Okay, well, I’m going to be taking this property, this property, and this property” on the message. That’s not standardised, and it can’t be standardised because each node does different things. But certain things are standardised and conventionalised, and that makes it easier to swap out nodes that potentially have issues.
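A swap like that might look as follows in a small Python sketch. Everything here is hypothetical: `http_node_a` and `http_node_b` are two made-up request nodes that read the address from different message properties (`url` and `target`), and they return canned text instead of making a real request. The only convention taken from the text is that the result ends up in the `payload` property.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
Node = Callable[[Message], Message]


def http_node_a(msg: Message) -> Message:
    """Hypothetical original node: reads the address from msg["url"]."""
    return {**msg, "payload": f"response from {msg['url']}"}


def http_node_b(msg: Message) -> Message:
    """Hypothetical replacement node: reads the address from msg["target"]."""
    return {**msg, "payload": f"response from {msg['target']}"}


def rename_property(old: str, new: str) -> Node:
    """Upstream adapter: copy msg[old] to msg[new] so the replacement node finds it."""
    def node(msg: Message) -> Message:
        return {**msg, new: msg[old]}
    return node


def downstream(msg: Message) -> Message:
    """Downstream node: relies only on the msg["payload"] convention."""
    return {**msg, "payload": msg["payload"].upper()}


def run_flow(nodes: List[Node], msg: Message) -> Message:
    """Pass the message through each node in order."""
    for node in nodes:
        msg = node(msg)
    return msg


start = {"url": "https://example.invalid/data"}
before = run_flow([http_node_a, downstream], start)
after = run_flow([rename_property("url", "target"), http_node_b, downstream], start)
print(before["payload"] == after["payload"])
```

The downstream node is untouched by the swap; the only change is one small adapter upstream of the replaced node.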
Flow-based programming promises better dependency management by encapsulating dependencies within the nodes that provide functionality: a modularisation of the architecture that also provides a modularisation of the dependencies required to fulfil the task at hand.
Unfortunately, Node-RED isn’t as modularised as this text perhaps suggests. Node-RED in fact has a long list of dependencies because the core nodes are included with Node-RED, the flow-based-programming-inspired engine. A separation of those core nodes into their own packages would solve this.
In pure flow-based programming terms, however, the encapsulation of dependencies presented here does hold.
Content by human, formatting by electrified silicon.