We presented a paper that basically shows you that there is a potential for quantum optimisation on these problems. We use a D-Wave — not the universal quantum computer. We use a D-Wave because they’re a lot bigger, there’s more qubits, and the kind of problem we’re trying to look at is one that fits very well on to a quantum.
The first stage was to go, “Let’s go and try for that.” The next stage was to then go through and recode the data. The problem with vulnerability data in particular is, there’s no real publicly available data set. I had to make a few little guesses there, but I went with my experience and I formulated an idea that you have two main data points: You have vulnerabilities, and you have hosts.
For anyone who’s done some graph theory, they might go, “That might be a bipartite graph,” which in simple terms is just a graph with two sides. You take one side as the host, and then for every host, you attach it to the other side, which is representing vulnerabilities. The data you get is usually host first, and your questions are vulnerability first. There’s a validation in that you should use graph structures to get your vulnerability data stored and utilised, because you tend to get, say, “This host has these issues” and “This host has these issues” and “This host has all of these” and then move on from there, whereas the questions are, “We just read a headline that says that this issue is a major problem. Where are the hosts with that?”
If you don’t use a graph, you have to go through every single host record every time you want to ask a question and look for the thing you’re looking for, which means that you have to go to the database every single time, which means it’s very slow. Even with well-indexed databases, it just doesn’t help all that much. Moving to graphs is very natural, but the bipartite graphs don’t fit too easily into the problem we want to solve, which is a very nice, neat quadratic unconstrained binary optimisation, QUBO, which is what we wanted to use. That’s where we went for any of that.
The idea of getting to that point was to do a transformation, and that’s where there’s the interesting idea from a mathematical point of view in the paper, which is, there’s this nice way of coding these things that, if it applies to your problem, allows you to solve certain cuts on the bipartite graph by finding a total minimum vector cover on the dual graph — the graph you generate from the bipartite graph. That doesn’t take much more processing overhead. You just iterate around the whole graph and construct the second graph, and then you solve that.
We almost organically grew toward this idea of being able to go, “Here’s a gnarly problem. Let’s look at this subset, vulnerabilities and hosts, and we’re going to try and remove kill-chains.” What’s a kill-chain there? A kill-chain is when two vulnerabilities are connected by a host. Then, we took that problem and went, “We’ll make another graph that connects vulnerabilities if, and only if, they’re connected via a host.” That was easy enough to do.
Then, when you solve that, we wote a short proof that says, “If you remove all the edges here, you totally disconnect all of the vulnerabilities from each other on your original bipartite graph.” With that, you’re then able to say, “I haven’t fixed every patch, because that’s a hard problem, but what I have done is, I have removed all of the well-connected patches.” There’s no kill-chain here because literally no vulnerability can see any other vulnerability, and that’s the kind of totality you’re able to gain. That’s an exponentially hard problem as the graphs grow, but with a quantum computer, the lines stayed pretty flat when we were measuring time.