Pier Paolo Fumagalli

Tales from the Toilet: how Javascript helps the production of tissue papers

Producing tissue paper, kitchen rolls, folded napkins or toilet paper is not for the faint of heart. Gigantic machines rewind huge rolls of tissue paper weighing almost a ton, processing them at a speed of 40 km/h, and a single minute of downtime cuts into the slim margins of the paper industry.

The asynchronous nature of JavaScript and Node.js allows telemetry data to be harvested from the ancient PLCs controlling production and analyzed in real time in the cloud, enabling operators and factories to raise production quality, improve performance and reduce waste.

Join me on a journey to understand how modern programming techniques make IIoT and Industry 4.0 a reality today in the toilet paper world!

Wow. Being the last one on the last day of the presentations in the Side Track. I'm impressed that there are still people here that are not sleeping. Professionally, we call this slot the twilight zone. And I know that you guys are tired after two days of conferencing and learning and JavaScripting.

So, I hope that this talk is going to be a little fun for you. That said, you know, like I've added a few jokes here and there. But my wife tells me that my dad jokes are absolutely terrible. So, we'll see how that works out for you guys.

Nobody laughs. That's a good start.

Hi. My name is Pier. You might remember me from such amazing projects as: The Java Servlet API. The Java API for XML.

Java 2 SE 1.5. And a lot more Java goodness. I'm sure right now you're actually wondering whether I am at the right conference here.

Well, I am. You see, a few years back when I was living in Tokyo, I met this fine gentleman. His name is Jed Schmidt. We were working together. And he's one of the creators of, for example, BrooklynJS.

Quite a famous name over there. And thanks to the beatings of these fine gentlemen I kind of, like, praise the Lord, I saw the light.

I abandoned the dark side. And converted to this wonderful world of Node.js. Let me tell you, life since then has been all ponies and rainbows. But I want to point out one thing. I know that I'm gonna anger some of you here.

I am not a, you know, like I'm definitely not into Star Wars. I'm more of a Trekkie myself. But enough about me. Let's get into the nitty gritty details of this talk.

So, today we're talking about the almighty toilet paper roll. And now I'm sure that you're really wondering, am I at the right conference? How many of you are familiar with this wonderful object?

[ Laughter ]

I mean, you know, if not this one, we can get the black one. It's so Berlin. You know? Amazing. For your health, I really hope that you had a good use of this today.

You know? But what about JavaScript? And toilet paper? Well, let's take a step back. You see, a couple of years back I took a job at a company called Körber, a big giant in industrial manufacturing in Germany.

They are one of the largest manufacturers of industrial machines. Amongst a thousand other things, like tobacco machines and palletizing equipment, we actually do produce machines that produce toilet paper and kitchen rolls. I am embedded with their digital lab, Körber Digital. And we develop digital applications for our customers, right?

More specifically, I am in a team that develops this wonderful app. What we're building is called KEdge. KEdge is an app that has been designed to offer shift support for the operators of toilet paper machines, right? On KEdge, as you can see there, operators of these machines can actually see their production stats, like how many logs they have produced, the average speed at which the machine is running, downtimes and whatnot. They can see the telemetry from the machine itself. You can actually see the graph over there.

We take that graph, we analyze the speed, we create yellow segments or red segments. Yellow segments are reduced productivity, so that's when we are below that green line you see over there. And the red segments are when the machine is actually stopped.
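The segment logic described here can be sketched in a few lines of JavaScript. This is only an illustration, not KEdge's actual code: the sample shape (`time`, `speed`), the `targetSpeed` parameter standing in for the green line, and the rule that a speed of zero means "stopped" are all assumptions.

```javascript
// Hypothetical classification rule: zero speed is a stop (red),
// anything below the target line is reduced productivity (yellow).
function classify(speed, targetSpeed) {
  if (speed <= 0) return 'red';
  if (speed < targetSpeed) return 'yellow';
  return 'green';
}

// Collapse consecutive samples with the same status into one segment.
function toSegments(samples, targetSpeed) {
  const segments = [];
  for (const { time, speed } of samples) {
    const status = classify(speed, targetSpeed);
    const last = segments[segments.length - 1];
    if (last && last.status === status) {
      last.end = time; // extend the current segment
    } else {
      segments.push({ status, start: time, end: time });
    }
  }
  return segments;
}

const samples = [
  { time: 0, speed: 600 },
  { time: 1, speed: 600 },
  { time: 2, speed: 300 },
  { time: 3, speed: 0 },
  { time: 4, speed: 0 },
  { time: 5, speed: 600 },
];
const segments = toSegments(samples, 500);
```

With a target of 500, the six samples collapse into four segments: green, yellow, red, green.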

So, what happens: the operator at that point has the ability to create a digital report of their shift, right? So, KEdge is deployed on a tablet that goes alongside the traditional HMI. The HMI is that big computer you see there that actually controls the machine itself. And the operators use the tablet to create a digital journal of their shift. And, you know, replace their old paper-based trail of stuff. Right?

So, let's look a little bit at how KEdge is built. So, this is a quick outline of our architecture. The frontend is a React app. So, lots of JavaScript.

It's deployed statically on Amazon S3, served through CloudFront, and login will be handled by Cognito really soon. And back-end communication goes through Redux to a bunch of JavaScript microservices which have been deployed as Amazon Lambda functions and are therefore accessible via the AWS gateway. Interesting, isn't it? No.

It's 2019. Nobody cares about another god damned React app. You know, if we were to talk about this, you might just as well go out to the beach and catch some sun while you can. You know, enjoy the heat and so on and so forth. And so, what's actually interesting about what we do?

Well, to figure that out, we have to see how toilet paper is actually produced, right? So, thanks to the National Geographic, this is how a modern toilet paper factory looks on the inside. This has been shot at one of the biggest producers of toilet paper here in the European market. And you can actually hear how loud this place is. Right? [it's loud]

Giant factories, just to make toilet paper. But, easier on the ears, let's look a little bit at how our production line is actually configured. This is a pretty machine, one of ours. Wonderful piece of equipment that runs at around 50 kilometers an hour. And how is toilet paper produced? We start from the top left with some giant jumbo rolls, as we call them.

Those are three tons of paper. One-ply paper. To put that into perspective, one of those rolls is enough for at least three of you to make toilet paper for the rest of your life kind of thing.

So, pretty big. We unwind those ones, right? And we unwind one, two, three or four of them depending on whether we want one, two, three or four-ply toilet paper. After the unwinders over here, we have what is called the embosser. It basically takes the plies of toilet paper, pushes them together and embosses this nice pattern over here.

And in the process, it injects air into the paper. So, it makes it thicker, it makes it fluffier, softer, gentler on your rear end, maybe. And after the paper is embossed, basically what happens is that the cardboard core here gets produced by the machine that you see in the middle over there. The cardboard core slides in and we actually start rewinding the embossed paper around the cardboard core in giant logs.

It's like this, but it's 3 meters long. Those go into the thing that looks like a cage. It's an accumulator, buffering the unwinding part at the top from the cutting and packaging at the rear end of the line. You see at the very top over there, we have a log saw, which basically takes this log and cuts it and makes these things.

And then quickly, at the left, no, sorry, at the right of your screen we see the packager. That basically, you know, takes four, eight, 12 toilet paper rolls, puts them together, wraps them up. Nice package at the supermarket. And at the bottom, the palletizer. There they are stacked up and put on a pallet and, boom, ready for shipping.

But now you will be asking, is all of this controlled with JavaScript? No. Production lines, even the modern production lines, are controlled by something called a PLC, or programmable logic controller. I like to call them legacy hardware from the last century. This is a latest-generation Siemens S7-1500, top of the line.

Great PLC. There is one little problem with these beasts: they are still programmed using a thing called ladder logic. This is an example of a program.

But, you know, in the PLCs themselves, we don't have variables. In most of the PLCs we address variables by their location in memory. We don't know exactly what's stored here and there. This is definitely not for the faint of heart.

Programming one of those things makes COBOL look so 2019. You know?

And if that's not bad enough, we actually don't work on these beautiful brand new machines most of the time. We work on machines that have been in the field for like 10, 15 years. This is normally how we find a PLC. This is Inga. It's a machine that I had to connect to in order to extract data for our application to work correctly, right? And it's a jungle of wires.

And how do we extract data from this medieval piece of kit? Well, we add more wires. We add, you know, more wires and an industrial PC in there which we call the gateway. And the gateway is connected on one side to the PLC with a beautiful Ethernet cable. And thanks to our friends at Twilio, we have a dedicated connection that pumps the data up to the cloud, right? And this, the gateway, is where we run all this JavaScript goodness that we wrote. But before we get into the JavaScript part, I wanted to show you a little bit of the connectivity scenario.

So, on the left we have the little factory, which is our PLC, while in the middle the chip is our gateway, running Node. It bridges, kind of, the time gap between the last century and now. And obviously we have the cloud. So, the data flow here is interesting. Because, being stuck in the past, that PLC does not know anything about encryption or security, not even a password, right?

If I can read from a PLC, I can write to it, I can reprogram it, I can do whatever I want with it. And the only thing that I need to do that is an IP address which I can connect to. Now, this is very insecure. That has been exploited many times in the past.

Probably the most famous case is when all the centrifuges in Iran for processing uranium were disabled by malicious code targeting the PLCs that spun up the centrifuges, and boom, they're gone. But when we push the data to the cloud, we want everything to be safe. So, we started looking at the physical boundaries of security.

And so, we install our gateway into the cabinet where the PLC is. We only have one cable. Nothing gets in, nothing gets out. Nobody contacts the PLC from outside of the cabinet of the PLC itself, right? And outside, when we want to reach the cloud, you know, we have a nice, secure, modern Linux-based VPN on top of an LTE link and yada, yada, yada, right? So, we actually define a security perimeter around the PLC so that we can protect it from attacks.

And we do this by installing the gateway with the PLC itself, right? The gateway itself also serves another bunch of purposes. We want to consolidate data.

Because as much as our friends at Twilio advertise their 4G, in most cases factories are, you know, not very good in terms of reception. Most of the time we get 2G, 40 kilobytes a second. That's not quite enough to actually have a full stream of data coming out. So we actually do a lot of the data consolidation on the gateway itself. That's why we need the processing power of JavaScript to do that.

And so, we send only, for example, the data that changes in the PLC. And we also do a lot of caching. So, we cache everything that we read onto the device itself because, well, it's 2G. You know, connectivity comes and goes and so on and so forth. In case of connectivity failures, when some hiccup occurs, we can quickly, or as quickly as we can, re-ingest all the data that we've read while that link was down and then continue normal operation, right?
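The caching behaviour described here is often called store-and-forward. The following is a minimal sketch of the idea, not the gateway's real implementation: the class name, the `send` callback and the in-memory backlog are assumptions (the talk says the real cache lives on the device itself, so it would survive restarts).

```javascript
// Hypothetical store-and-forward buffer: messages go straight out while
// the link is up and are kept in order while it is down.
class StoreAndForward {
  constructor(send) {
    this.send = send;   // function that actually transmits one message
    this.online = true;
    this.backlog = [];  // in-memory only; a real gateway would persist this
  }

  publish(message) {
    if (this.online) {
      this.send(message);
    } else {
      this.backlog.push(message);
    }
  }

  // Called when the link state changes; drains the backlog on reconnect.
  setOnline(online) {
    this.online = online;
    if (!online) return;
    while (this.backlog.length > 0) {
      this.send(this.backlog.shift());
    }
  }
}

const sent = [];
const link = new StoreAndForward((message) => sent.push(message));
link.publish('a');     // link is up: sent immediately
link.setOnline(false);
link.publish('b');     // link is down: buffered
link.publish('c');
link.setOnline(true);  // reconnect: 'b' and 'c' are re-sent in order
```

The point of draining in order is that the cloud side sees the same sequence of readings it would have seen without the outage, just later.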

So, we started this thing with a wonderful program by IBM Research that is called Node-RED. It's a wonderful tool. It has libraries that allow us to connect to the PLC. And you see here a very easy configuration: every 10 seconds we have the speed, and we have the first alarm. Shift it down to the right.

Send it over to Amazon, right? This is great. Nice going. Can do things very quickly.

It's all based on JavaScript, and it's amazing, but it gets really complicated when we start reading a bunch of variables. This is actually a real configuration file we had in production when we started adding more variables to the thing. And it really gets to be a mess. It becomes like a bunch of wires.

You never know who is changing what. We cannot manage these files and so on and so forth.

So, a year ago roughly it dawned on me, you know what? We have the technology. We're going to rebuild it. And we came up with a thing that is called the PLC reader. PLC reader is a wonderful little Node.js application that we wrote in order to clean up the mess that was Node-RED.

JavaScript, why? Because Node-RED was built in Node. We continued with Node because we had the reliability over there. But we managed to bring some automation into the world of point and click, right?

It does a very simple job. It reads values from various sources, which we call drivers. And it processes them in various ways through a pipeline of what we call processors. Right? On top of that, it's easily deployed as a Debian package. We built a tool, Debianize.

It's not point and click, but it's YAML files. Easy to manage: Git, central, we can track who changes what, and deploy with CircleCI and Ansible anywhere we want. All right?

So, let's look at one of those configuration files. Very, very easily here, we have one driver. A driver that, guess what, connects to a PLC. A Siemens S7 PLC. We have the IP address and the port which we connect to, read and write. Rack and slot are particular parameters for the RFC 1006 protocol that we're using to talk to these things.

And what's important is that every 2 seconds we read these variables, or better, we read these memory addresses at DB 5102. And at offset 12, that's a word.

And that's basically a 16-bit integer. That's what we call speed. That's where we find the speed at which the machine is running. At offset 36, we see the code of the first alarm and so on and so forth. That's one of the drivers.
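To make the idea of "a word at offset 12" concrete, here is a small sketch that decodes 16-bit words out of a raw data block using Node's `Buffer`. It does not talk to a PLC: the bytes are simulated, the field names are hypothetical, and treating both words as unsigned is an assumption. What is not an assumption is the byte order, since S7 PLCs store multi-byte values big-endian.

```javascript
// Hypothetical field map mirroring the talk: offsets 12 and 36 in the data block.
const FIELDS = [
  { name: 'speed', offset: 12 },
  { name: 'firstAlarm', offset: 36 },
];

// Decode 16-bit words from a raw data block buffer.
// S7 PLCs are big-endian, hence readUInt16BE.
function decodeDataBlock(buffer, fields = FIELDS) {
  const values = {};
  for (const { name, offset } of fields) {
    values[name] = buffer.readUInt16BE(offset);
  }
  return values;
}

// Simulated block contents; a real driver would fetch these bytes from the PLC.
const block = Buffer.alloc(64);
block.writeUInt16BE(6000, 12);
block.writeUInt16BE(17, 36);
const decoded = decodeDataBlock(block);
console.log(decoded); // { speed: 6000, firstAlarm: 17 }
```

This is why the address list matters so much: without it, the block is just 64 anonymous bytes.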

We have many drivers which we wrote, including some to monitor the performance of the gateway. Here I just listed a few, like load average.

Every 10 seconds we read the load average. CPU percentages: every 10 seconds, we calculate the percentage of the CPU used. And we have latency to monitor the connection. And this is how we start getting data into the system, right? Doing that is very easy: we have processors organized one after the other. Each processor can subscribe to one, two, many, or all of the values that are published by the drivers or are republished by the processors.
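The driver/processor arrangement can be pictured as a tiny publish/subscribe chain. The sketch below is an assumption about how such a pipeline could work, not the PLC reader's actual API: the `Pipeline` class, its `add` and `publish` methods and the `'*'` wildcard are all made up for illustration.

```javascript
class Pipeline {
  constructor() {
    this.processors = [];
  }

  // `topics` is an array of value names, or '*' to subscribe to everything.
  add(topics, handler) {
    this.processors.push({ topics, handler });
    return this;
  }

  // Offer a value to every processor from index `from` onwards.
  publish(name, value, from = 0) {
    for (let i = from; i < this.processors.length; i++) {
      const { topics, handler } = this.processors[i];
      if (topics === '*' || topics.includes(name)) {
        // Republished values only reach processors further down the line.
        handler(name, value, (n, v) => this.publish(n, v, i + 1));
      }
    }
  }
}

const received = [];
const pipeline = new Pipeline()
  // First processor: turn a CPU fraction into a percentage and republish it.
  .add(['cpu'], (name, value, republish) => republish('cpuPercent', Math.round(value * 100)))
  // Last processor: a sink subscribed to everything.
  .add('*', (name, value) => received.push([name, value]));

// A driver would call publish() on its own schedule, e.g. every 10 seconds.
pipeline.publish('cpu', 0.5);
pipeline.publish('loadAverage', 1.2);
```

In this sketch the sink sees both the raw value and the republished one; a real pipeline might let a processor consume the original instead.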

And here we have a very simple example of three processors that we are actually using. The first one is a function. The only line of JavaScript that probably you will see in my talk.

It takes this speed, which in the machine is published in decimeters per minute. I don't know why. But we divide by ten: meters per minute. And the second one is a string: we read alarms as unsigned 16-bit integers. But they're not numbers; we don't make an average on an alarm code.

We convert them into strings and consolidate. Like I said before, if the alarm changes, we publish it. Otherwise, if the alarm has not changed, we swallow that message, right? But probably the most important processors that we have are the last two that we use at the end of the pipeline. The batcher and the MQTT.
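The first processors described here are small enough to sketch directly. The function names below are hypothetical and this is not the PLC reader's code; it only illustrates the three behaviours from the talk: dividing the speed by ten, carrying alarm codes as strings, and swallowing values that have not changed.

```javascript
// Speed arrives in decimeters per minute; convert to meters per minute.
const toMetersPerMinute = (decimetersPerMinute) => decimetersPerMinute / 10;

// Alarm codes are identifiers, not quantities, so carry them as strings.
const alarmToString = (code) => String(code);

// Returns a filter that passes a value only when it differs from the previous one.
function onlyChanges() {
  let previous; // undefined until the first value arrives
  return (value) => {
    if (value === previous) return undefined; // swallow the repeat
    previous = value;
    return value;
  };
}

const alarmFilter = onlyChanges();
const published = [0, 0, 17, 17, 0]
  .map(alarmToString)
  .map(alarmFilter)
  .filter((value) => value !== undefined);
console.log(published); // [ '0', '17', '0' ]
```

Five readings become three messages, which is exactly the kind of saving that matters on a 2G link.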

The batcher is very simply something that subscribes to all the messages, accumulates them, and every 30 seconds pushes out that thing that you see over there. It pushes out a message, which is no more, no less than an array of the timestamp at which the data was read and the value of the data point. Giant array. We also compress it so it's nice and tiny when we send it over our 2G link.
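A batcher along these lines could look like the sketch below. The class, the `[timestamp, name, value]` layout and the choice of gzip are assumptions, since the talk only says the batch is an array of timestamps and values and that it is compressed. The 30-second timer is left as a comment so the example stays a pure function of its inputs.

```javascript
const zlib = require('zlib');

class Batcher {
  constructor() {
    this.points = [];
  }

  // Record one reading as [timestamp, name, value].
  add(timestamp, name, value) {
    this.points.push([timestamp, name, value]);
  }

  // Serialize and compress everything accumulated so far, then start over.
  // Returns null when there is nothing to send.
  flush() {
    if (this.points.length === 0) return null;
    const payload = zlib.gzipSync(JSON.stringify(this.points));
    this.points = [];
    return payload;
  }
}

const batcher = new Batcher();
batcher.add(1559500000000, 'speed', 600);
batcher.add(1559500002000, 'speed', 598);
const payload = batcher.flush(); // a Buffer ready to hand to the MQTT processor

// In production a timer would drive this, for example:
// setInterval(() => { const p = batcher.flush(); if (p) send(p); }, 30000);
```

The receiving side reverses the steps with `zlib.gunzipSync` and `JSON.parse`.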

And then the MQTT processor does what it's supposed to be doing. Sends it over to MQTT, secure, blah, blah, blah, blah. Very nice, right?

This is how we use JavaScript. Everything behind this YAML file, every processor, every driver, the entire infrastructure to drive this thing is written in JavaScript. And, like, it's been a labor of love. We are extremely proud that it's running in production at real customers' sites.

It's feeding KEdge with live real data. It has been working out great.

If you are a dad, you know, maybe, how proud you are. The other thing that I mentioned: there were a couple of extra things. Enough with the kit. YAML. We wrote a bunch of YAML extensions which we needed for these configuration files.

YAML is great, but it's also terrible. It doesn't allow including, it doesn't allow a bunch of things. So, we wrote a little extension over there to merge arrays over arrays. So, if we define a list of drivers in one file and another list of drivers in another file, we can just merge them together.

It works great, obviously, with the include extension. So, we include a snippet of YAML into another file. Making sure that all the variables are visible to the included files and whatnot. So, there is a little bit of logic over there.

And then the last one, which we use a lot for configuration, is the join, which takes a lot of strings and concatenates them into a giant string. It's great for variables, like a certificate ID that repeats 25 million times in our configuration files. Right?
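The talk does not show the actual YAML syntax of these extensions, so the sketch below only mirrors their behaviour in plain JavaScript. The function names, the driver names and the certificate ID are all hypothetical.

```javascript
// Merge several lists (for example driver lists from different files) into one.
const merge = (...lists) => lists.flat();

// Concatenate fragments into a single string.
const join = (...parts) => parts.map(String).join('');

const certificateId = 'abc123'; // hypothetical value defined once, reused everywhere

const drivers = merge(
  [{ name: 'plc' }],                              // list from one file
  [{ name: 'loadAverage' }, { name: 'latency' }]  // list from another file
);
const topic = join('gateways/', certificateId, '/telemetry');
```

Defining the certificate ID once and joining it into every string that needs it is what keeps the repeated value from drifting between files.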

The other thing that I talked about was Debianize. It's a tool we wrote in order to take your npm package and have it as a Deb. Why? We use Ubuntu on our gateways. It was easy to get it out of the way. You just slap it in your package.json, and then run npm. This is the configuration of the PLC reader itself.

It's not very complex. It has sensible defaults, so it's a one-line addition. Makes a Debian package for you.

So, what's the state for us? What's the state in terms of all the stuff we have written so far? We have PLC reader, which is open source. You can just check it out on npm. We have Debianize, which is out for everybody to use. YAML extensions, open source, get them, download them even if you don't care about PLCs or industrial machines.

They're there. They work.

We also throw in, because you're such a good audience, a freebie: the IRONdb persister. That's the time series database we use for telemetry data. It's a fantastic database created by a friend of mine, Theo. But they had no JavaScript drivers for it, so we wrote them, and they're also available for you guys to use if you use IRONdb.

And that's pretty much it for me, because I think that I have two minutes, and I was wondering if you guys have any questions? Sure.

AUDIENCE: [ Away from microphone ]

PIER: Because, that's right. Because actually it is readable enough that it can be handled by pre-sales support. Because in most cases, for the inclusions, that's why we wrote the YAML extensions.

There are parts that people don't touch. But, for example, the list of addresses in the memory, in the PLC, that comes from an Excel file that the machine manufacturer actually makes.

So, that one is handled by a different team. When they change the PLC programming, they can just change that part. And then, you know, like all the rest flows together. So, that's why we chose to use YAML rather than JavaScript for that. Yeah.

Anything else? There was a question over there at the bottom. But, you know, like I guess, nobody picks it up.

AUDIENCE: [ Away from microphone ]

PIER: Wow. Such an amazing question, right? I spent the first 15 years of my career building Java systems. I'm one of the cofounders of Jakarta and the XML Apache projects and I have been doing XML since 1997. For me it's like PLC reader, it's a labor of love.

The problem over there is that we have seen that there is an aggressive actor in the Java world that is actually starting to copyright the APIs and push their copyright out. And we're talking about Oracle over there. That didn't give us confidence in what we were building. Especially in this case, when these gateways are over a 2G link, we didn't have the confidence that we could update the Java virtual machine every six months like Oracle wants us to do, kind of thing. And I know that Amazon came out with some wonderful things on their side. But I wonder how long those are going to last in a court.

So, JavaScript, on the other hand, you know, ECMA, pretty much every implementation out there is open source. And it gives us the stability that we need. Thank you very much.

[ Applause ]