Our engineers here at Endian Technologies AB have decades of experience with real-time embedded systems. During the past couple of years we have become experts on Zephyr, a real-time operating system for secure IoT devices. In this article I’d like to highlight a major improvement in its networking drivers and what it means for our future projects.
Say you’re building an IoT device. How does your device get on the Internet? What technology you choose depends on your application. Maybe your application is coupled with a gateway and you can use 6LoWPAN via BLE. Perhaps you even need cabled Internet and you use an Ethernet controller.
These technologies are good, but they are not always right. Sometimes you need cellular or Wi-Fi. Suppose you decide to use a cellular modem with LTE. If your project uses a general purpose OS like Debian then things are pretty easy to get working.
What if your device is so small that it can’t run a general purpose OS? Generally this has meant that your project would be a second-class citizen when it comes to Internet access. You have had to rely on vendor-specific AT commands.
To be very concrete, let us say your project is using a Nordic nRF52840 SoC and a cellular NB-IoT modem. Your firmware uses Zephyr and runs on the SoC, which communicates with the modem through a UART. How do you get on the Internet with this combination?
Let’s say you’re pretty green to this cellular modem thing. You would open up the documentation for your modem and find three things within easy reach:
You do some experiments in minicom and the commands appear to work. With this in mind you get going designing how your device will communicate with the network. You wanted to use CoAP over DTLS but it turns out the modem doesn’t support that. You need data to be encrypted on the network, so you grudgingly decide to use classic HTTP and TLS. Besides, the server guys are already familiar with it, so everyone is happy.
This appears to be a good solution and you start writing the library that will talk with the AT vendor commands for HTTP. After working way too long on this, you finally get the right contact person for your cellular operator and they tell you that you shouldn’t use TCP on NB-IoT. They don’t recommend it at all. Oops!
Your mind is looking for a way out and finds it: Zephyr has an Internet stack and you can adapt it to work with the AT vendor commands for sockets. Luckily for you, you find that Zephyr already has drivers like these, called socket offloading drivers. You decide to use DTLS with a socket offloading driver. It quickly turns out that this combination is not supported since the modem doesn’t support DTLS. But we were going to use Zephyr’s network stack? Turns out the offloading drivers recently went through a change where they hook into the network stack in such a way that Zephyr’s own DTLS/TLS support is not used. Oops!
Things are looking pretty desperate for you. You need encryption, but the modem doesn’t support DTLS, which your RTOS does support, but your RTOS doesn’t have a driver that supports DTLS in combination with your modem. TCP is not recommended on NB-IoT, but you’ve meanwhile found out that even if it worked it would mean that your data is sent in clear text on the modem’s UART.
Why are things not this bad on Linux? If your application was using Linux then you wouldn’t have to go anywhere near AT vendor commands. Why not? Because you would use PPP. PPP is like having running water in your house. Networking with AT vendor commands is like bringing water in buckets.
Zephyr 2.0 (September 2019) added support for PPP. Zephyr 2.2 (March 2020) added a GSM modem driver that uses PPP. Zephyr 2.3, which is not yet released, adds GSM 07.10 multiplexing. Much of this work was done by Jukka Rissanen at Intel. It is difficult to convey how significant and important this work has been.
PPP in combination with GSM 07.10 lets Zephyr use cellular modems in just the same way that Linux uses them. The GSM 07.10 protocol provides multiple virtual UARTs over a single physical UART and makes it possible to use AT commands while PPP is up and running. PPP is a full duplex serial protocol with framing and checksumming. It has a link control protocol (LCP) for the serial line itself and network control protocols (NCPs) for negotiating Internet addresses and DNS (IPCP and IPV6CP).
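To make the framing and checksumming concrete, here is a minimal Python sketch of PPP's HDLC-like framing as described in RFC 1662: a 0x7E flag around each frame, byte stuffing with 0x7D, and a 16-bit frame check sequence (FCS). The function names (fcs16, ppp_encode, ppp_decode) are my own for illustration and are not Zephyr APIs. The sketch assumes default link settings, meaning no header compression and all control characters escaped.

```python
FLAG = 0x7E        # marks the start and end of every frame
ESCAPE = 0x7D      # prefix for a byte that had to be stuffed
GOOD_FCS = 0xF0B8  # residue when the FCS is run over data plus checksum


def fcs16(data: bytes, fcs: int = 0xFFFF) -> int:
    """Bitwise 16-bit FCS (reflected polynomial 0x8408, initial value 0xFFFF)."""
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs


def ppp_encode(protocol: int, info: bytes) -> bytes:
    """Wrap one packet in a PPP frame: header, checksum, then byte stuffing."""
    body = bytes([0xFF, 0x03]) + protocol.to_bytes(2, "big") + info
    # The complemented FCS is appended least significant byte first.
    body += (fcs16(body) ^ 0xFFFF).to_bytes(2, "little")
    out = bytearray([FLAG])
    for byte in body:
        if byte in (FLAG, ESCAPE) or byte < 0x20:
            out += bytes([ESCAPE, byte ^ 0x20])
        else:
            out.append(byte)
    out.append(FLAG)
    return bytes(out)


def ppp_decode(frame: bytes) -> tuple:
    """Undo the stuffing, verify the checksum, return (protocol, info)."""
    if len(frame) < 2 or frame[0] != FLAG or frame[-1] != FLAG:
        raise ValueError("missing frame flag")
    body = bytearray()
    escaped = False
    for byte in frame[1:-1]:
        if escaped:
            body.append(byte ^ 0x20)
            escaped = False
        elif byte == ESCAPE:
            escaped = True
        else:
            body.append(byte)
    if escaped or len(body) < 6 or fcs16(bytes(body)) != GOOD_FCS:
        raise ValueError("damaged frame")
    return int.from_bytes(body[2:4], "big"), bytes(body[4:-2])
```

Calling ppp_decode(ppp_encode(0x0021, b"hello")) gives the packet back (0x0021 is the PPP protocol number for IPv4). If a single bit of the frame is flipped on the wire, ppp_decode raises ValueError, so the damaged frame is dropped instead of being handed to the IP stack.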
Here is a comparison:
PPP is better in every way. There is of course some place in the world for networking via AT commands. If you’re using a PIC processor that can’t run Zephyr then they might be your only option. But if you have any chance at all to use Zephyr, then you’re better off with PPP.
I’ve built applications in the past using AT vendor commands, but those days are gone now that Zephyr has PPP support.
Before finishing I would like to point out one particularly bad combination of protocols. Here it is: TCP over AT commands.
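Real TCP has built-in checksums, retransmissions and flow control that run end to end. With AT commands the TCP connection ends inside the modem, and none of that protection covers the UART between the modem and your firmware. TCP over AT commands is just not TCP. It might be used to simulate a remote serial port, but it can’t carry any serious amount of data or communicate reliably with a real Internet server.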
Bit errors and dropped bytes on serial lines are a common occurrence. With AT commands, a dropped byte on the UART results in a dropped byte on the TCP connection. No applications are written to work when a byte is lost on a TCP connection. When using a real network stack those types of errors are corrected by the network layer before they reach the application. But with AT commands there is no chance to correct the error: by the time the byte has gone missing, the modem has already sent an acknowledgement to the server.
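The difference can be shown with a toy model in Python. It is a sketch under loud assumptions: it is neither real TCP nor real PPP, the frame layout (a length field plus a CRC-32 from zlib) is a stand-in I made up, and all function names are hypothetical. The faults list says which byte index the simulated UART drops on each transfer, with None meaning a clean transfer.

```python
import zlib


def drop_byte(data: bytes, index: int) -> bytes:
    """Simulate the UART losing the byte at the given index."""
    return data[:index] + data[index + 1:]


def frame(payload: bytes) -> bytes:
    """Stand-in framing: 2-byte length, payload, 4-byte CRC-32."""
    crc = zlib.crc32(payload).to_bytes(4, "big")
    return len(payload).to_bytes(2, "big") + payload + crc


def unframe(raw: bytes):
    """Return the payload, or None when the frame is damaged."""
    if len(raw) < 6:
        return None
    length = int.from_bytes(raw[:2], "big")
    payload, crc = raw[2:-4], raw[-4:]
    if len(payload) != length or zlib.crc32(payload).to_bytes(4, "big") != crc:
        return None
    return payload


def send_passthrough(payload: bytes, faults: list) -> bytes:
    """AT-command style: the modem has already acknowledged the data,
    so the bytes cross the UART exactly once, with no check."""
    fault = faults[0] if faults else None
    return payload if fault is None else drop_byte(payload, fault)


def send_with_retransmit(payload: bytes, faults: list, max_tries: int = 5):
    """Network-stack style: the receiver rejects damaged frames and the
    sender retransmits until one arrives intact."""
    for attempt in range(1, max_tries + 1):
        raw = frame(payload)
        fault = faults[attempt - 1] if attempt - 1 < len(faults) else None
        if fault is not None:
            raw = drop_byte(raw, fault)
        received = unframe(raw)
        if received is not None:
            return received, attempt
    raise TimeoutError("link too lossy")
```

With the message b'{"temp": 21.5}' and a byte dropped at index 4 on the first transfer, send_passthrough hands the application b'{"tep": 21.5}' without any warning. send_with_retransmit rejects the damaged frame and delivers the original message on the second attempt.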
TLS over such a lossy TCP connection is just not viable. Any error on the UART results in a fatal error that breaks the TLS connection. There is some theoretical hope for DTLS over UDP over AT commands, which would work because DTLS does its own checksumming and handles lost packets. But TCP over AT? Don’t even bother trying.
In a contest between how we used to do it (AT commands) and how we’re doing it now (PPP and GSM 07.10), the new way wins every time. Where we’ve been using PPP, the number of network problems experienced during development has not just diminished by some factor; they have completely disappeared.
I visited FOSDEM at the beginning of February. For those who don’t know, FOSDEM is the largest free software conference in Europe, attracting more than 8000 enthusiasts and hackers from all over the world. The conference requires no registration and is held on a university campus in Brussels.
This year I didn’t have a clear strategy or focus for the talks I wanted to see. The number of talks and development rooms usually requires a strategy - popular rooms become crowded really fast and there is usually a wait outside. So to see a talk that begins at 13:00, one sometimes has to be outside the room at 11:00. The upside of this is that one gets to see talks one didn’t plan to, which is usually a refreshing experience.
I usually tell people going to FOSDEM for the first time that if they are unwilling to figure out a strategy, they should just go to one of the two largest rooms when they have nothing else planned. Those rooms (Janson and K105) are always home to relatively general topics and/or keynote tracks. They are also large enough to provide both enough room and enough oxygen for everyone. As I didn’t have a strategy myself, that is how I ended up spending my FOSDEM, outside of food, visiting various stands, buying t-shirts and talking to people I don’t usually meet outside of FOSDEM.
Apart from the more general talks I saw in the large halls, I particularly enjoyed a talk about the uselessness of end-to-end encryption in messaging apps, from the user’s perspective, given by an XMPP developer on Saturday. On Sunday, there was a big talk where the Matrix developers bragged about how great end-to-end encryption is in Matrix. While the Matrix developers acknowledged the weaknesses highlighted by the XMPP developer, their enthusiasm felt rather unwarranted, given that the main point in the earlier talk was that end-to-end encryption only benefits the server operator.
In the end, FOSDEM was as enjoyable as always, and I got myself a new pair of GNOME socks and a t-shirt, which is all I really wanted from the trip :-)
Building and bringing connectivity to embedded Linux devices is kind of Endian’s thing - we’ve been doing it since the company started back in 2003.
Linux is a great choice for any device that can support it - it’s free, has great hardware support, it’s incredibly feature-rich and has an enormous, dedicated developer community that is constantly optimizing performance, adding features and squashing bugs. If your device can run Linux, it probably should.
But tiny IoT devices that want to operate for years on a coin-cell battery can’t. Linux can be made quite tiny, but it won’t ever be tiny enough to run on an MCU with 16 kB RAM and 128 kB flash.
On these devices, the traditional solution means choosing some proprietary kernel which you then customize to your use case with homebrewed or copy-pasted code. But the requirements on modern IoT devices have made this approach unfeasible. If you want your connected device to be secure, robust, performant and power efficient, you need modern development principles: community collaboration, well-designed abstractions, modularity.
Enter Zephyr RTOS - a project that aims to do for tiny IoT devices what Linux has done for the rest of the embedded world. The permissive licensing model, active community, feature richness, focus on security and robustness and wide hardware support make Zephyr a great choice for modern IoT development.
We have already used Zephyr for several projects, including (the world’s first?) 6LoWPAN-connected EV charging station; a solar-powered, camera-equipped recycling bin; an award-winning self-powered flow sensor and a smart lock that uses NFC and BLE.
These products are incredibly complex, but also have strict requirements regarding encrypted communication, ultra-low power consumption and device-to-cloud interoperability. It’s true that these constraints can be satisfied using just about any RTOS, but the time-to-market likely wouldn’t even be of the same order of magnitude as with Zephyr.
If you’re curious about what Zephyr can do for your company - drop us a line or give us a call. We’re always happy to help.
At the end of September (2019) I visited the All Systems Go conference. The official slogan is “The open source community #conference focused on foundational user-space #Linux technologies”, in other words what is often referred to as “Linux Plumbing”.
My impression is that it doesn’t focus on a specific industry (unlike, for example, the Embedded Linux Conference, where you can also find a lot of plumbing-related discussions), but of course things might get a bit skewed if presenters are working on similar things. Companies that are heavily invested in this are, for example, Facebook and Kinvolk, both with a cloud and container focus. That might not be my biggest personal interest, but it’s always interesting to get a different perspective and to think about which parts can be reused for other purposes. The general feeling was that this year can be summarized as “containers, containers, containers and (e)BPF”.
Listening in on anything related to systemd is of course always interesting, and this year the most interesting and potentially controversial talk was probably the one about systemd-homed and the visions for it.
Every time I go to a conference I try to go to one “wildcard” talk. I look for an unallocated slot in my personal schedule and pick a talk about something that I might not be particularly interested in but that has a crazy enough description to make me wonder how it even fits in. This year my wildcard choice was the GNU Poke talk, which was awesome. It has since been hyped by people like GregKH and described as the most impressive presentation ever seen! It’s always a great feeling listening to a talk where the presenter is humble but also very excited about their work.
There was also a great presentation by the same person on BPF in the GNU toolchain, and another talk from Facebook also related to BPF (and systemd): https://media.ccc.de/v/ASG2019-144-custom-cgroup-bpf-programs-in-systemd
The talk I brought home to my fellow embedded-interested colleagues was the great one about bringing up an STM32 with only free tools, which was both a good introduction to the tools and a look at some deep details of how STM32 works.
Lots of work still seems to be needed to bring Docker kicking and screaming into the brave new cgroups2 world.
Some other notable talks I went to that you might want to look at if you want to know the latest state of things in that area: