How has your experience as a DARPA PM on the HACMs program shaped your interests within security?
My background is in programming language design and implementation with a focus on domain-specific languages. I went to DARPA being aware of the work on the (seL4) microkernel and the Compcert C compiler. I had the idea of taking the mathematical approaches from those two projects and applying them to operating systems software to produce software that provably doesn’t have many of the kinds of low-hanging fruit vulnerabilities that hackers like to exploit. At DARPA, I became aware of the work of Tadayoshi Kohno and Stefan Savage in breaking into an automobile using five different attack vectors that didn’t require them to physically touch the car and used the networking interfaces that existed to gain access. It occurred to me that a vehicle platform was even more compelling than a cell phone platform in terms of applying these tools and techniques. It also brought into mind the ecosystem of the internet of things and networked computers that have the ability to physically affect the world: cars, robotics, quad-copters, medical devices, pacemakers, insulin pumps, control systems for doors in prisons, and so on. There are tons of papers in Black Hat, for example, where white hat hackers showed that they could break into these systems. A lot of them have the same characteristics in that they’re not necessarily running a ton of software or they’re running software where only a small part is important for proving security guarantees. Right now formal methods are good for verifying a hundred thousand lines of code but beyond that it’s outside the scope of what the techniques could do. In those settings, a hundred thousand lines of code gets you a long way toward having a much more secure system. In knowing about seL4, the Compcert C compiler, and looking at the work of Stefan Savage and Tadayoshi Kohno in breaking into cars and extrapolating from cars to other cyber-physical systems, it looked like there was confluence of software and techniques that were ready to be applied and a compelling need to apply them to get better security. That’s what led me to HACMs and what led me to do a much deeper dive in security than I had done previously. I’m not a security researcher, I’m a programming languages researcher.
What are novel ways that the field of programming languages is shaping how we think about security and how do you picture that intersection coming together more closely in the future?
C is the bane of security. The C programming language is used in a lot of low-level code because it directly reflects the underlying machine. When you’re writing low-level code, having direct access to the machine is very important but what we’ve empirically discovered after 50 years is that humans are incapable of writing C code that doesn’t have security-relevant flaws in it. Theoretically it’s possible to write perfect C code and verified systems do produce C code that is flawless in terms of memory vulnerabilities but people seem unable to do it without the machine checking. There are three ways in which programming language technology is relevant. One is the formal verification-based approaches where you have plain raw C code and prove that it doesn’t have vulnerabilities, which is what happens with the seL4 project. Another approach is writing in some higher-level language that has a richer proof-type structure built into it. Rust is an example of this approach. The type system can track some of the relevant memory safety properties. It’s still possible to have bugs in the code and harder to write because you must satisfy the type system but it’s a big step forward compared to C. At an even higher level, you provide a domain specific language where the programmers are writing at a significantly higher level of abstraction and then the compiler is generating the lower-level C code along with a proof that the generated code has desired security properties. In the HACMs program, the Ivory and Tower languages from Galois were able to generate low-level C code and schedule it into tasks. Right now we have case studies that show how to design domain specific languages that can generate code in low-level languages along with correctness proofs but we don’t have yet have a fully worked out story for the general case.
What do you think are the current biggest barriers to making systems more secure?
One of them is that it is just fundamentally a really hard problem. You have to get everything right and there are many, many, many things included in that. Things like the original design, how you formalize that design, how you write the specs, how you build the implementation, how you configure it, the hardware, the security of that hardware, the trustworthiness of the human operators and their ability to fend off phishing attacks, the security of their passwords, etc. You have to get all of those things right in order to build a system that is secure and that’s a super hard problem beyond the scope of any one person, group, or company. It requires a large ecosystem with lots of participants and that means you have independent components that need to fit together and fitting together independent components is a notoriously hard problem for security. The second issue is that we don’t properly account for the cost of security vulnerabilities. People build software and throw it out into the world and the user uses it at their own risk. When a security breach happens, it is completely the responsibility of the customer to pay for all of the negative ramifications that happen as the result of the security breach. The people who would have to pay to invest in better security aren’t paying the costs of the bad security and therefore they’re not doing as much as they could to produce software that is as secure as we know how to make it today. It’s also partially that we honestly don’t know how to measure what the cost is. You could have a latent vulnerability in your code and if nobody ever finds it then not fixing it is fine. But if someone finds it and uses it to exfiltrate information or cause physical damage, it could cost a fortune. It is very imprecise and the people who make decisions about what to invest in want to have a quantifiable risk metric. The economic incentives are not aligned to solve the problem.
What do you wish computer science students knew about programming languages and security?
People don’t fully appreciate the extent to which the implementation language impacts the security of the final system in terms of how easy it is to write code that doesn’t have certain kinds of flaws in the first place or how easy it is to analyze the code to determine whether it has security vulnerabilities. If people were writing in pure functional code, it would make the security much higher. It wouldn’t solve all the problems but it would solve certain classes of problems. Those classes, such as memory vulnerabilities, are a major source of exploits. Of course, there comes a performance overhead with a higher level language but if we did a better job of accounting for costs, I hypothesize that those upfront costs of choosing a language that supports better security hygiene would pay for themselves very quickly.