Boyd Multerer, CEO of Kry10 and Xbox’s father of invention, joins Ryan Chacon on the IoT For All Podcast to discuss mission-critical devices and formal methods. They talk about what games consoles teach us about secure devices, the changing regulatory landscape of critical software, how to think about digital transformation, and what companies can do to ensure secure software and devices.
About Boyd Multerer
Boyd Multerer has been building software and devices for over 30 years. He spent 18 years at Microsoft, 15 of which were on the Xbox team, where he led the development of Xbox Live, XNA, and the Xbox One operating system. Today, he is the CEO of Kry10 and has radically re-imagined what it means to build an operating system for mission-critical devices. Boyd has applied lessons in cyber security from the game console world and combined them with the latest in hardcore mathematics-based software techniques to build an operating system that takes a true security-first approach to the devices we depend on.
Interested in connecting with Boyd? Reach out on LinkedIn!
Kry10 delivers a modern platform, tools, and management services to help businesses realize the full potential of IoT and high value connected devices. The Kry10 platform is built on the most secure foundation while enabling the highest level of resilience and manageability to meet mission critical needs. Kry10’s platform approach can be encapsulated in one simple phrase: Trust but Isolate®. Kry10 leverages the formal verification of the seL4 microkernel to bring you an operating system that is secure, self-healing, and dynamic with minimal downtime, even during upgrades. This approach builds on the concept of zero trust architectures by limiting the code that can run in privileged mode and isolating as many non-core capabilities as possible.
Key Questions and Topics from this Episode:
(20:51) Learn more and follow up
– [Ryan] Welcome, Boyd, to the IoT For All Podcast. Thanks for being here this week.
– [Boyd] It’s good to be here.
– [Ryan] Yeah, it’s great to have you. Let’s kick this off by having you do just a quick introduction about yourself and the company to our audience.
– [Boyd] Let’s see. I’ve been doing software for a long time now. I was previously at Microsoft, where I worked on Xbox for about 15 years. I got to work on Xbox Live and XNA and then the Xbox One operating system. And eventually you realize that game consoles are big, fat, hot, plugged-in industrial devices. They’re not actually PCs. They’ve got security issues that are more akin to industrial systems. What I’ve been doing for the last 10 years or so is taking some of the lessons learned from consoles and combining them with some of the absolute latest techniques in securing software, and we are now building an operating system aimed at IoT and industrial systems and critical infrastructure, trying to raise the bar on what it means to build secure devices.
– [Ryan] So tell me a little bit more about what you learned through your experience with consoles, and how the technologies available today got you to where you are now, working on these mission critical devices. What’s so unique about applying those learnings to mission critical devices, and what sets that apart?
– [Boyd] There are two big lessons from console land. One of them is how do you package up an update? How do you do remote delivery? How do you do key management and know that an update being sent to devices is actually correct and is authorized by the right people? That’s a little bit more straightforward. The harder part of what console land teaches is that there are cases when you don’t have physical control of the device that’s running the software you care about. In the normal PC world, the defender is trying to protect the user against some unknown attacker out across the internet who’s trying to anonymously come in and take over your computer.
In the console, the attacker owns the console, and they have a soldering iron. And it’s a very different set of attack scenarios, and you have to go down into, all the way down to the bottom, keys and chips, thinking through boot process, thinking through low level architectures, all the way up to the top.
Now when you think about industrial systems, or cars, or things that are in the field, there’s no administrator anywhere near it. If you want to go to it, you have to get in a truck, you have to go into space, you have to go to the physical device, and that can be difficult, it can be expensive, and sometimes the adversary’s been there first.
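The update-authorization check Boyd describes, knowing an update is correct and signed off by the right people before installing it, can be sketched with the Python standard library. This is an illustrative sketch only: real devices use asymmetric signatures with hardware-protected keys, while this stdlib-only version uses a symmetric HMAC to show the shape of the check.

```python
import hashlib
import hmac

# Illustrative only: a real device keeps keys in secure storage, not source code.
SIGNING_KEY = b"device-provisioned-secret"

def sign_update(payload: bytes, key: bytes = SIGNING_KEY) -> bytes:
    # Build side: attach an authentication tag to the update image.
    return hmac.new(key, payload, hashlib.sha256).digest()

def verify_update(payload: bytes, tag: bytes, key: bytes = SIGNING_KEY) -> bool:
    # Device side: recompute and compare in constant time before installing.
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

firmware = b"v2.1 firmware image"
tag = sign_update(firmware)
ok = verify_update(firmware, tag)                 # True: authorized update
tampered = verify_update(firmware + b"x", tag)    # False: modified in transit
```

The key point is that the device rejects anything whose tag does not verify, regardless of where it came from, which is the "authorized by the right people" property rather than trust in the delivery channel.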
– [Ryan] And when we’re talking about mission critical devices, I’m sure most of our audience understands what that is. But before we dive into this further, for those who might not know: which applications would be considered to have mission critical devices, and how do you classify them? What is a mission critical device?
– [Boyd] I’ll give you the really broad definition and then I’ll give you more of a narrow. So the really broad definition is here’s a device that’s doing a job that somebody depends on, right? Different people get to define what their mission is and whether or not it’s critical. But a more narrow definition, mission critical is often used to mean infrastructure. What’s the device that is controlling the substation that brings electricity to your house? Or brings water to your house? It’s services that keep people alive. It could be the computer in your car that prevents it from crashing or does automatic braking.
But really, if your business depends on it, it’s mission critical for you. So it could still be factory controllers. It could be things in your house that allow it to function. So everyone’s got a mission. It’s just a matter of how important and how critical is that mission.
– [Ryan] One of the things that was mentioned ahead of time was that we’re at a crossroads in software for devices right now. What does that mean? Can you elaborate on that and where that’s coming from?
– [Boyd] Okay, so how do we go about building devices today? We have this luxury of having bigger computers that are driven by advances coming out of cellphone tech. We’ve got big chips. We’ve got fairly large amounts of memory. This is a luxury compared to 10 years ago. And what are we putting on it? We’re putting on operating systems that were designed for PCs in the 1990s. Right? It kind of blew my mind when I realized, oh, we’re using software that was from the 1900s on modern devices. That means monolithic kernels. That means architectures where drivers and critical systems of a computer are all sitting at the same privileged level, and attacks in one of those can spread into others, right? It’s easy to do because that’s what we’re used to. We’re used to monolithic kernels such as Linux and others, which were a great design in the nineties, especially when Pentiums were nice and slow. And it’s fine for a PC where you’re sitting there, and you can deal with an error, and you can reboot it. It’s not okay for that device that may be a thousand miles away that you can’t afford to go to to actually administer.
– [Ryan] And one of the things that’s interesting and I’m sure kind of brings in unique elements to all of this is just the fact that the physical world is relatively at times insecure. There’s a lot of vulnerabilities out there that are different. And I know there’s, because of that, the whole kind of way you approach the development of not only the software, but also the hardware is unique and governments are starting to take action. So can you talk a little bit about what you’re seeing happen from your perspective to help address the physical world insecurities and vulnerabilities that are out there?
– [Boyd] You mentioned government. So I’ve got a kind of a rule of thumb I’ve been using lately. If you want to know what’s going to be the thing everyone’s worried about in 10 years, go look at what DARPA and the military agencies are putting research money into now. And 10 years ago, there was a whole lot of work on AI. There was a whole lot of trying to understand automated systems. And right now, you look at RFPs and you look at calls, and there’s just a whole lot of formal methods and mathematics. In other words, there’s some new techniques. They’re not really new, they’ve been known about for a long time. The new thing is that they’re scaling for the first time. There’s some new techniques that have come into play where you can take things like advanced mathematics and use it to prove that software has been built correctly. And it changes the game. It means you can take very small pieces of software and you know they’re right versus you think they’re right because you tested it. Knowing they’re right means you’ve used math to prove that they’re correct. And if you choose the right little bits, then you can leverage that into systems that have got properties that you know those properties are good. So that’s like a big stepwise change in how you think about building software. And that’s at the software level. Down at the hardware level, there’s other stepwise changes. But you have to look at each layer, looking for things that eliminate entire classes of attacks. And sometimes that means going back to basics and rethinking what are your core principles.
– [Ryan] How would you describe the digital transformation that we need, as just where we are now with technologies and what businesses are looking for and so forth?
– [Boyd] It’s going to be at a low level, right? Because our fundamentals are built from the PC era, where we can assume administrators. That means the transformation that’s happening is going to be at a layer below what most programmers and most users actually think about, right? It’s sort of like switching from a gas car to an electric car. You still got a steering wheel, you still got brake and gas pedals, you still got blinker sticks and all that. So driving the car feels the same, but the fundamental technology sitting below it changed. The engine went away, and now it’s batteries and motors. That was a fundamental change that didn’t really affect the upper layers. I think that’s what this digital transformation is going to look like. There will still be applications, there will still be drivers, and a lot of APIs will look familiar. But the entire bottom end of the stack has to get swapped out for something with a stronger footing. That will take a little bit of time, and it’s a really fundamental change, even if it doesn’t feel like that much of a change to the people who are writing code and building devices.
– [Ryan] Let me ask this. If I’m a business out there listening to this: cyber security has been a topic we’ve talked about many times in the past, and we’ve gone over the vulnerabilities in IoT that you need to be thinking about. But it doesn’t always seem to me, and I’m sure this matches your experience, that companies really understand the risks that are present, or that potentially become present, when they adopt an IoT solution. So if you were talking to a company, what would you recommend they be thinking about and planning for? And just generally, from what’s happening in the space and how you see people come to you, what are they missing?
– [Boyd] Yeah, great, this is spot on. This is really the challenge everybody’s got when you’re trying to explain where devices are going. Why should anyone make a change when the older systems are what you know, they’re easily available, and they seem to work? Why should anyone care? When I talk to a lot of companies, that’s the headspace they’re in. They’re worried about their product. They’re worried about the thing they’re building. Why should I take on this extra worry? Or to put it another way, security just feels like a cost. It doesn’t add a feature to my product. It just raises the cost. So companies are going to be a little slower to wrap their heads around the risks that they take. And you really have to use that word: this is about risk management. The governments are already there. They’re already there from a military perspective, and they’re already there from a society risk perspective. So the first thing I would tell everyone is to go look at the European Cyber Resilience Act. What they are doing is legislating that if your product has a software flaw in it, and you cause damage, you cause someone to die, then your board is liable, right?
So they’re changing the rules around what it means to be liable for software. In the US, typically, if you build a piece of hardware and it fails, your company’s liable, and then software gets its free pass. Software? We don’t understand software. If a software error happens and a car crashes, it’s no one’s fault. That is changing. The governments are changing the rules to say no, software faults should be treated like hardware faults. How come the software people get a free pass?
And that means that companies have to start thinking about the risk that they have. Can they buy insurance? Can they mitigate those risks? How do they contain the risks on the products that they build? Including software just like they’ve had to do in the past with warranties on the hardware, right? This is a reset in thinking that they have to go to.
– [Ryan] And what do you think this will do, looking out five or 10 years into the future? This is a pretty big shift, not only in the way the software guys have to think about things, but for companies who are adopting the technologies. Like you mentioned, there’s the insurance element; it seems like there’s an endless number of things that need to happen or that are going to change because of this. So where do you see the biggest impacts? Do you think it’s going to result in more careful creation of code? Do you think it’s going to cause adoption to slow down? What do you see as the biggest impacts of something like what’s happening in Europe potentially happening here in the States?
– [Boyd] Oh, and by the way, it is happening in the States. There’s a bunch of cyber resilience acts, and different states are looking at their rules of liability and changing them. It is absolutely happening in the US. It’s just that the European Cyber Resilience Act is a little more concrete, and you can point to it and read it.
Look at what the national labs are worried about. They are worried about this exact subject. So there’s a whole new national lab called CYMANII, C Y M A N I I, which is all about protecting industrial systems from cyber vulnerabilities. Huge amount of research going into what does it mean to have a stronger footing in industrial systems. That will end up eventually translating into regulations. But what I’m seeing is the companies that are a little more forward in building devices where if they fail, people die, they’re closer to already being freaked out.
– [Ryan] I’m just thinking about like the story like with Tesla and stuff, right?
– [Boyd] Yeah, exactly. So car manufacturers, yeah, you’d be surprised that a lot of bigger companies who are, who build devices that we consider critical, they’re not quite there yet. So there’s going to be a transition, but they’re going to have to do it. Once it’s the governments realize that there’s a societal impact, then the story’s over. It’s just a matter of time.
– [Ryan] So how can companies that are listening to this, that work on the software piece for hardware, prepare? What should they be thinking about or doing to start to put this on their roadmap of things they need to be worried about now?
– [Boyd] They need to think about how they manage the risk on their devices. The things I would worry about are isolation, risk, and updating and deployment. In other words, if I have software on a device, I have a bunch of different packages, I have device drivers, I’ve got applications, I really want to think hard about how they’re connected to each other and how they’re separated from each other, because if one piece goes down, it can’t be allowed to take the rest down. I would be worried about how I’m going to update these devices. And if you’re big enough, and you’ve got some tolerance for dealing with new software, which is always a fun thing, then you could look at systems like what we’re building, where we’ve got that built in at the bottom layer. But even from the beginning, you’ve got to be thinking about isolating your components, thinking about resilience and restarting, and thinking about how you’re going to update, because those are the tools at our disposal.
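The three concerns Boyd lists, isolation, restart-on-failure, and updating, can be sketched in a few lines of Python. The names here are purely illustrative (this is not Kry10's API): a supervisor that contains a component's failure, restarts only that component, and can swap one component for a new version while the rest keep running.

```python
class Supervisor:
    """Toy supervisor: isolates components so one failure stays contained."""

    def __init__(self):
        self.components = {}  # name -> callable standing in for an isolated task
        self.restarts = {}    # name -> how many times it had to be restarted

    def register(self, name, component):
        self.components[name] = component
        self.restarts[name] = 0

    def update(self, name, new_component):
        # "Update" here means atomically replacing one isolated component;
        # every other component keeps running untouched.
        self.components[name] = new_component

    def run_all(self):
        results = {}
        for name, comp in self.components.items():
            try:
                results[name] = comp()
            except Exception:
                # Failure is contained: restart only this component.
                self.restarts[name] += 1
                results[name] = comp()  # naive single retry for the sketch
        return results

# Usage: one component fails transiently; the supervisor contains the fault
# and restarts it, and the healthy component is unaffected.
sup = Supervisor()
sup.register("sensor", lambda: "ok")
attempts = {"n": 0}
def flaky_radio():
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("transient fault")
    return "recovered"
sup.register("radio", flaky_radio)
results = sup.run_all()
```

A real system enforces this with hardware and kernel boundaries rather than a try/except, but the design question is the same one Boyd raises: how are components connected, how are they separated, and what happens when one of them dies.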
– [Ryan] So this is a lot more than just getting insurance and writing better terms and conditions and things like that, there’s…
– [Boyd] Yeah, I think what’ll happen, though, is that insurance companies are going to start requiring it. If you can get cyber insurance at all, which is a big question right now, then they’re going to be requiring better techniques. So you need to be on top of what’s happening.
– [Ryan] I wanted to ask you, as it relates to this, about a topic that you brought up when we first spoke, which was new to me. Obviously, I’m not an engineer, but you were talking about its importance in this whole cyber risk space, and that’s formal methods. Our audience spans from technical engineers to non-technical people. So how would you explain what formal methods are and why they’re important in this realm?
– [Boyd] Formal methods is using core mathematics to model the logic of your software and then to prove that it has certain properties. Okay, that’s a mouthful, so here’s what it actually means. Normally when we build software, we write a bunch of code, and then we write some tests, and then we run those tests, and hey, it passed the tests, so we think it’s probably going to work.
But you don’t know that it’s going to work in every case, because your tests didn’t try every possible input. In other words, the way we do testing today is probabilistic testing. Hey, we tried a whole bunch of inputs, we tried some failure cases, we tried some success cases, they all passed, they worked the way we thought, so it’s probably gonna work. And what you didn’t know is that in some weird edge case, there’s a value you could pass in where your function fails. What formal methods does is use a mathematical model of your function to effectively test all possible inputs at the same time. It’s a little like magic. I can’t fully explain how it works, but it tests all, and this is really important, all possible inputs. So you can prove that there are no cases where it will fail. And that’s fundamentally different, and also really hard.
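The "all possible inputs" point Boyd makes can be made concrete with a toy proof in a theorem prover like Lean. A test suite can only check some natural numbers; the theorem below is a statement quantified over every natural number, and the proof checker accepts it only if it holds for all of them.

```lean
-- Testing samples a few inputs; a proof covers every input at once.
-- Toy example: for ALL natural numbers n, n + 0 = n. No test suite can
-- enumerate every n, but this single proof quantifies over all of them.
-- (In Lean 4, `n + 0` reduces to `n` by definition, so `rfl` closes it.)
theorem add_zero_all (n : Nat) : n + 0 = n := rfl
```

Proving real kernel properties is enormously harder than this one-liner, but the character of the guarantee is the same: a theorem about every input, not a sample of them.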
– [Ryan] So how do formal methods get brought into the way companies do things now?
– [Boyd] Most companies should never do it. Okay, so here’s the way it’s taught in university; for 50 years, it’s been taught the same way. It’s, hey, today we’re going to do formal methods day in computer science class. Here’s what it is. We’re going to prove two or three lines of code, and you’ll never use this. It’s too hard. There have been some breakthroughs. We’ve gone from five to 10 to maybe a hundred lines of code being modeled in math and proven. At the University of New South Wales in Sydney, they’ve now gotten their techniques to where they can prove about 10,000 lines of code. And that was a big, decade-long effort. Once they got that, though, then if you choose the right 10,000 lines of code, you can build a kernel, which means you can build isolation, and you can build the tools that you need to support applications out of the formally proven code. The applications are not going to be proven, so don’t worry about that. That would be too expensive and too hard. What you want to prove is the bucket that the isolation’s in. You want to prove that the container the application lives in doesn’t have an error, so that when your app fails, or your app is attacked, it can’t break out of that container and take down the app next to it.
– [Ryan] So this really connects well to what we’ve been talking about with these mission critical devices and the software that’s in them and touching all the different pieces of that.
– [Boyd] Yeah. So formal methods, it’s very hard, it’s very expensive, and most people should never do it, although if it’s appropriate, it’s worth it. Amazon has got 600 formal methods people now. The chip companies have got tens of thousands of people doing formal methods, proving the logic in the chips. What hasn’t been done is taking it up to the OS level, which is new, and that was what was cracked in Sydney, which is why I’m in New Zealand and why our company is mostly in Sydney, right? So you let small, dedicated groups of people do the formal methods, but you do it in a way where it can leverage everyone else’s code, right? Build good containers, build proofs of communication lines, so that you know that your device has separated the risk into smaller pools, because that’s really what this is about. Imagine a drone flying through the air; the mapping is talking to the radio, an attack comes in and takes over the mapping, and their goal is to crash the avionics app to get it to hit the ground. If they’re formally separated from each other, you may be able to crash the mapping app, but you can’t cause the avionics to fail.
– [Ryan] If you’re saying that most companies shouldn’t do this because it’s very hard, I guess a good way to wrap this all up is what do you, what should companies that are engaging with or using or building mission critical devices do to really ensure security on these devices?
– [Boyd] Yeah, okay. So let me get it right. Most companies will never do formal methods. Most companies should demand that the systems they’re building on have used formal methods. There’s a few groups who really get this, but the main group that gets it is government, because they’ve been forced to through defense. That stuff is always under attack. Everybody else has to play catch up, and they need to do it fast. So they need to educate themselves about the changing legislation. They need to educate themselves about the tools that are coming online, because within a year or two, there are going to be multiple really good tool sets that they can leverage to build these devices. And they have to understand that doing the same old same old, using monolithic kernels and Linux and all this stuff on these devices, isn’t going to be viable in a very short period of time. So if I’m listening to this and I’m trying to decide what I’m going to do, the first thing is educational. You need to pick your CTO and have them go learn about the options that are coming, because they’re going to be forced to make a change.
– [Ryan] Boyd, this has been a great conversation. We’ve talked about cyber security in the past, but not to the level of detail that we dove into today, really getting into the mission critical use cases and devices and what’s happening there. And obviously formal methods is the first time we’ve ever spoken about it. So I really appreciate you taking the time and breaking this down in a way that the non-technical members of our audience are going to be able to understand. I wasn’t sure beforehand how much of it would go over my head, but you did a great job. Where can our audience learn more about what you all are doing and follow up on this topic and conversation if they have any questions?
– [Boyd] Obviously you can go to our website. It’s kry10.com, k r y 1 0 dot com. There’s a bunch of videos that we’ve put up on YouTube, but they’re a little bit more on the technical side. The kernel that we use, the fully formally proven kernel, is called seL4, which is out of Sydney. There’s a bunch of videos that came out of the seL4 Summit in Munich last year, and we’re about to go to Minneapolis and do a bunch more videos there. So there’s plenty of discussion of what we’re doing, again for a slightly more technical audience. Feel free to reach out to us if you think you have an application that fits this type of space.
The other thing I would go to, frankly, is on YouTube: look up HACMS, H A C M S. This was a US military DARPA program which really showed for the first time that you can take formal methods and kernels built this way and use them to protect devices. They had an autonomous helicopter flying around that they were red teaming and trying to take over, and there were a lot of lessons that came out of it. For the fundamentals, I would look at what DARPA has put up on YouTube. There’s some really good material there.
– [Ryan] We’ll definitely make sure we link that up for our audience so they can check it out. But other than that, Boyd, thanks again for taking the time, and we’d love to have you back sometime in the future to continue this conversation and talk about the advances that are happening in this space as governments start to build more regulations and guidelines around the stuff we were talking about today. It’s going to be fascinating to see how software companies adapt.
– [Boyd] Thank you. It’s been great.