Lea Kissner (co-designer of Google Zanzibar and Head of Privacy Engineering for Twitter) on Zanzibar’s design decisions, user-friendly ways to structure access controls, and when (not) to write your own Zanzibar.
Developer Den is a series of interviews with notable developers in our community to learn more about their journey into engineering. We sat down with Lea Kissner, co-designer of Google Zanzibar and Head of Privacy Engineering for Twitter.
Tip: Oso cofounder/CTO Sam Scott recently wrote a post on Building Zanzibar from Scratch. In it, he builds up the relationship tuple data model, a model to query the data, a configuration interface to compose queries, and the logic to evaluate them. If Zanzibar interests you, bookmark that post for later.
How did you get interested in computers?
When I was six or seven, my dad pulled out this calculator that you could program in BASIC and handed it to me. I proceeded to make incredibly elaborate password routines that laughed at you when you got the password wrong. It was a security system only — the passwords weren't protecting anything. You know, maybe it's not that surprising that I ended up in security and privacy.
From there, how'd you get into academia?
When I was five, I decided I wanted to get a Ph.D. in physics because the physicists are the ones with the liquid nitrogen. I was like, "that is where I want to be." I went to college and figured I was going to go the physics route — I double-majored in CS, because, why not? Then I ended up doing CS because I liked the math better.
At first, I was working on robotics. In high school, I had been in the FIRST Robotics Competition, a competition where you build 130-pound robots. The summer after high school, I did an internship at NASA working on a Mars Rover. They even tested the robot I worked on! It did not end up going to Mars, but they did test it in the Atacama desert. They called me after to tell me, "Hey, your code was the only code that didn't burn out any motors." I thought: this is not a super terrible thing to be working on.
When I was in college, I was doing internships at Xerox PARC, spending my weekends, summers, and evenings working on reconfigurable modular robotics. Eventually I looked around and thought, "you know what? These solder fumes are going to eat my brain, and someday my brain might come in handy." When I took my first physics class in college, which was H7A at Berkeley, I immediately realized: it was math that I liked. I decided to just take math and theory classes.
And how did you get into cryptography?
I thought about what I could do with my math and theory classes! I went and got a Ph.D. in crypto to support my math habit. It was pretty obvious to all of my CS professors that I was going to go get a Ph.D., because I was saying things like, "I don't want to take classes about networking. Can I talk about whether this particular decision problem is decidable?" They were all like, "Yeah, you are clearly going to grad school."
What was your path from graduate school back to industry work?
I thought I would want to go into academia because I really liked teaching. But a lot of professors looked really unhappy! I was doing the work because I liked it. I wanted to continue to like it. I know there are a lot of people that are happy in academia, but the particular samples I saw looked not-thrilled.
I took a job at a company called BBN, which is a government contracting shop — they do a lot of work for DARPA. I was there maybe nine months and the only thing I could really do was read RFCs all day. I made it about halfway through the IPsec RFC and I quit to go to Google, because Google promised me that I wasn't going to be bored. In fact it was sometimes a little overly exciting, but I was not bored. I do poorly with boredom.
2021 has been the year of people deciding that they like Zanzibar, which you developed. [Airbnb and Carta are both writing their own Zanzibar clones, and there are several open-source implementations]. Is there any advice you'd like to give those developers, or anything you’d like to explain to them?
The semantics in Zanzibar are very carefully designed to try and make it very difficult for you to shoot yourself in the foot.
In actual Zanzibar-like system implementations, there are a few common mistakes. One is, not being able to reverse-index the ACL [access control list]. You end up with systems where you can ask, "can this user see this document?" but you can't ask, "what can this user see?" "What can this user see?" is very important — that's the reverse indexing.
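To make the reverse-indexing point concrete, here is a minimal sketch in Python. It is not Zanzibar's implementation; the class name `AclIndex`, the method names, and the `doc:...` identifiers are all made up for illustration. It only shows the idea of maintaining two indexes over the same grants, so that both "can this user see this document?" and "what can this user see?" are cheap lookups.

```python
from collections import defaultdict


class AclIndex:
    """Toy in-memory ACL store (hypothetical, for illustration only)."""

    def __init__(self):
        # (object, relation) -> users: answers "can this user see this object?"
        self._forward = defaultdict(set)
        # (user, relation) -> objects: answers "what can this user see?"
        self._reverse = defaultdict(set)

    def grant(self, obj, relation, user):
        # Every write updates both indexes so they never disagree.
        self._forward[(obj, relation)].add(user)
        self._reverse[(user, relation)].add(obj)

    def revoke(self, obj, relation, user):
        self._forward[(obj, relation)].discard(user)
        self._reverse[(user, relation)].discard(obj)

    def check(self, obj, relation, user):
        return user in self._forward.get((obj, relation), set())

    def list_objects(self, user, relation):
        return sorted(self._reverse.get((user, relation), set()))


acl = AclIndex()
acl.grant("doc:readme", "viewer", "alice")
acl.grant("doc:roadmap", "viewer", "alice")
acl.grant("doc:readme", "viewer", "bob")
acl.revoke("doc:roadmap", "viewer", "alice")

print(acl.check("doc:readme", "viewer", "alice"))  # forward question
print(acl.list_objects("alice", "viewer"))         # reverse question
```

A system that keeps only the forward index can still answer the reverse question, but only by scanning every object, which is the situation the answer above warns about.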
Policy languages in general have a hard time with being understandable, especially once you have a complicated system with many rules. Business logic often makes its way into the access control systems and the access control systems make their way into the business logic. The Zanzibar API is carefully designed to make that better. It's also designed so that the API results are understandable to the end users. Every time a user looks at an ACL, they need to know — who can see this? Why? And, how do you make it stop? The Zanzibar semantics are designed so that you can build that, and if you're implementing Zanzibar, you should be aware of that.
It also lets you deal with things like new enemy problems, which crop up in many places you have editable object contents or metadata. For example — you have access to my document and I decide I want to remove you. Once I've removed you, if I then add something into my document, you're not supposed to see it. In a large distributed system, it's really easy to slip up somewhere, especially when your indexing system is separate from your storage system. There's also a bunch of system considerations in there. If you ever have ACLs that touch each other, you'll have to think very seriously about system load.
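The new enemy scenario can be sketched with a version counter. This is a simplified illustration, not Zanzibar's actual mechanism (the published Zanzibar paper describes a related consistency token it calls a zookie). The names `VersionedAcl`, `can_read`, `content_token`, and `replica_version` are assumptions made up for this example. The idea is that content remembers the ACL version that was current when it was written, and a check is never evaluated against an ACL snapshot older than that.

```python
class VersionedAcl:
    """Toy ACL for one document that keeps every past version (illustrative only)."""

    def __init__(self):
        self.version = 0
        self._history = [frozenset()]  # _history[v] = viewers at version v

    def _commit(self, viewers):
        self._history.append(frozenset(viewers))
        self.version += 1
        return self.version

    def add(self, user):
        return self._commit(self._history[-1] | {user})

    def remove(self, user):
        return self._commit(self._history[-1] - {user})

    def check(self, user, at_version):
        return user in self._history[at_version]


def can_read(acl, user, content_token, replica_version):
    # Never evaluate at a snapshot older than the content's token. A real
    # replica would have to catch up or forward the request; taking max()
    # here just simulates that.
    snapshot = max(replica_version, content_token)
    return acl.check(user, snapshot)


acl = VersionedAcl()
acl.add("owner")
acl.add("guest")
stale_replica = acl.version   # a replica that will miss the next change
acl.remove("guest")           # the owner removes the guest...
content_token = acl.version   # ...and then writes new content

leaks = acl.check("guest", stale_replica)                       # stale check
safe = can_read(acl, "guest", content_token, stale_replica)     # token-aware check
owner_ok = can_read(acl, "owner", content_token, stale_replica)
print(leaks, safe, owner_ok)
```

The stale check is the slip-up described above: an index or replica that has not yet seen the removal would still show the new content to the removed user.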
For instance, a system where you have comments on your YouTube video — comments and video aren't the same thing, so you'll put them in different storages. There are three different ways of handling this. The first is you make live calls for the different ACLs when you need them. Google found that this was a bad idea! There was a thing where Google Plus let you share music with people, and the recipient could play it once. This had the worst load characteristics ever. You can do some caching there, but it's particularly bad if you add more services, which might themselves fail. The math isn't good on that.
The second way is to make copies of the access control list — one copy for the video and one copy for the comments. What happens when you want to remove somebody from that ACL? What you have built yourself is a giant privacy incident manufacturing system! Or what you have built yourself is something that needs a system to consistently modify the ACL across multiple other systems, which is not a good idea from a scalability point of view. It's another one of those things that gets worse and worse and worse.
The third way of building it is that you stick the ACLs somewhere else and allow them to touch each other in a way that's consistent, which is how you get Zanzibar.
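As a rough sketch of that third option, the following Python keeps one shared store in which the comments' ACL points at the video's ACL instead of copying it. Everything here (`AclStore`, `set_parent`, the `video:1` and `comments:1` identifiers) is hypothetical and far simpler than Zanzibar's real configuration language. It only illustrates that with a single source of truth, one revocation takes effect everywhere.

```python
class AclStore:
    """Toy shared ACL store where one object's ACL can point at another's."""

    def __init__(self):
        self._direct = {}  # (object, relation) -> set of users granted directly
        self._parent = {}  # object -> object whose ACL it inherits

    def grant(self, obj, relation, user):
        self._direct.setdefault((obj, relation), set()).add(user)

    def revoke(self, obj, relation, user):
        self._direct.get((obj, relation), set()).discard(user)

    def set_parent(self, child, parent):
        self._parent[child] = parent

    def check(self, obj, relation, user):
        seen = set()  # guards against accidental cycles in parent pointers
        while obj is not None and obj not in seen:
            seen.add(obj)
            if user in self._direct.get((obj, relation), set()):
                return True
            obj = self._parent.get(obj)
        return False


store = AclStore()
store.grant("video:1", "viewer", "alice")
store.set_parent("comments:1", "video:1")  # comments inherit the video's ACL

before = store.check("comments:1", "viewer", "alice")
store.revoke("video:1", "viewer", "alice")  # one write, no copies to chase
after = store.check("comments:1", "viewer", "alice")
print(before, after)
```

Contrast this with the second approach: if the comments held their own copy of the viewer list, the revocation would need a second, separately consistent write.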
When we built Zanzibar, there was a certain amount of doubt about whether we could make it actually work, and actually scale. The semantics are very carefully designed — both from a security perspective, and to make sure the system can scale and be highly replicated.
[Instead of going into the details of Zanzibar here, our CTO Sam and Lea make tentative plans to collaborate on a blog post about why the semantics are the way they are.]
When should companies build their own Zanzibar? When shouldn't they?
It doesn't make sense to build your own Zanzibar if you don't have people who really know how to do it. It's an extremely load-bearing piece of infrastructure. If you're not sure you can do it well, just don't even try it. That's true of most things in the security and privacy space! This one is a little less subtle and quick to anger than rolling your own crypto or anonymization, but there are still a bunch of cases you can mess up.
When you need to change ACLs a bunch, and especially when ACLs have semantics that cross multiple objects — that's what Zanzibar is for. "Across multiple objects" can even mean one object served in several forms, like an indexing system. When you start building that kind of system, Zanzibar starts to make a lot of sense.
For instance, I was Chief Privacy Officer at Humu, a company that builds software to try and help people be happier and more productive at work. We made extremely serious security and privacy promises. Our access controls were based around the structure of the companies we were working with. Like, "Here's an org chart, you know what to do with that." Of course we built our own ACL system! If you screw that up and give someone access that they shouldn't have, it's completely unacceptable. Also, the semantics were complex enough that we wanted to make sure they were done solidly through the entire pipeline.
Is there anything you'd like to change about Zanzibar, in terms of usability?
I think the structure of the ACLs is very good! We solved a real problem: when you have firewall-style ACLs that cannot be reverse indexed, everyone gets confused. We have 30 years of UX studies showing that people can't understand more than about eight firewall rules. If you look up "firewall rules" on Google image search, you'll see pages and pages of people screenshotting their rules because they're confused. I felt very strongly that we should skip that kind of allow/deny semantics — what we want is a canonical, positive representation of ACLs. That's what Zanzibar did!
We almost entirely avoided intersections and negations in Zanzibar because they are one giant foot-gun. They are very hard to explain and hard to understand, and people tend to build confusing systems with those tools.
Also, we've found that the syntax for inter-verb pointers is a lot more prone to errors than we would like — I’d like to go back in time to rewrite that. However, Zanzibar got very popular very quickly inside Google — once we knew we wanted to change the syntax, there was already a lot of it there.
Where does the name Zanzibar come from?
The original name of the project was Spice — I really like Dune. All the names in the project were Dune themed, but my SVP force-renamed the project. He thought the name Spice had some inappropriate connotations. But out of all of the complaints he could have had about the project, "please change the name" was not super worrying. We named it Zanzibar after one of the spice islands.
If you enjoyed this foray into Zanzibar and want to see an example-driven explanation of the model, read Building Zanzibar from Scratch. In this post, Oso cofounder/CTO Sam Scott builds up the relationship tuple data model, a model to query the data, a configuration interface to compose queries, and the logic to evaluate them.