We’re in a transitional period in which humans still write software.
A long-term view of security, design, and fairness.
This week, I ran an “AI Fairness mini-bootcamp” in Josh Blumenstock’s Applied Machine Learning class with some of our MLFailures labs.
While I was doing the lectures, I realized: we’re in an interesting transitional period in which humans still write software.
All machine learning is a kind of “program synthesis”—people let the computer write a program instead of writing it themselves. The resulting program isn’t human-readable; it’s an opaque model that maps inputs to outputs. But those models can, in turn, write programs in human-readable languages:
Just as GPT-3 can generate text, modern AI models can, and eventually will, write most programs. As modern compilers moved humans from punchcards to Assembly to C to Python, machine learning will provide an even higher-level abstraction: one where you can just describe what you want the program to do.
In the future, machines will mostly do the work we think of today as software engineering. In that future, software engineering as a practice will have mostly to do with specification and evaluation. Specifying the problem domain, seeing how the program works—making the thing, cleaning up the mess.
Broadly, design and security.
Algorithmic justice is about to become central to security
Machine learning fairness, accountability, explainability, and transparency (to borrow Joy Buolamwini’s term, algorithmic justice) is about to become central to security, even when security is defined narrowly as the confidentiality, integrity, and availability of data. Once you expand that definition to include the rich, social contexts of technology use and non-use, it’s clear that communities like FAccT, CSCW, CHI, and DIS are going to start playing an enormous role in every aspect of (what we today call) “cybersecurity.”
The good news: the communities I listed above are, broadly speaking, way ahead of traditional security communities in centering concepts of diversity, equity, inclusion, and belonging. As these communities bring their rich, situated notions of technical harm to bear on issues in security, I expect it will usher in a new era of socio-technical concern (concerns that span the technological and the social) in the discipline. That will be a wonderful thing.
Also, design and security are deeply entwined. They always have been. Shifts to software engineering may make that more apparent. Work like Diane Freed’s and the Reconfigure Network’s testifies to these communities’ readiness to weave together the threads of specification and evaluation, design and security.
The bad news is: we have a lot of work to do when it comes to training students.
Amplifying AI justice in pedagogy
For students and teachers, we need to radically increase the prominence of machine learning fairness, transparency, and accountability in our curricula. Many technical students learn about ML bias, but how many of them can tell you whether the algorithm in front of them exhibits an anti-Black or anti-AAPI bias? If we expect students to “clean up the mess” from AI-generated programs, they had better understand these issues: not just that they exist, but also what to do about them.
MLFailures is a tiny piece of this puzzle. These labs teach students how to identify and address bias in supervised learning algorithms. Soon-to-be-posted lectures will help contextualize the labs with an introduction to fairness. All of these labs are freely available and Creative Commons licensed; teachers worldwide are welcome to use and modify them.
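To make “identifying bias” a little more concrete, here is a minimal sketch of one common check: comparing false positive rates across groups. It is not taken from the MLFailures labs. The function name, the toy labels, and the group names "a" and "b" are all hypothetical, and a real audit would look at more than one metric.

```python
from collections import defaultdict


def false_positive_rates(y_true, y_pred, groups):
    """Return the false positive rate for each group.

    FPR = false positives / actual negatives, computed per group.
    Groups with no actual negatives are left out of the result.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 0:
            negatives[group] += 1
            if pred == 1:
                false_pos[group] += 1
    return {g: false_pos[g] / n for g, n in negatives.items()}


# Hypothetical toy data: 1 = flagged by the model, 0 = not flagged.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0, 0, 1]
groups = ["a"] * 5 + ["b"] * 5

rates = false_positive_rates(y_true, y_pred, groups)
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap between groups does not settle whether a model is unfair, but it is the kind of signal a student should know how to compute and then question.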
But there are blind spots in our curriculum. We don’t teach issues like reward function hacking, privacy leakage, adversarial ML, or existential risk. I certainly hope to make labs that cover those topics in the future. But it’s going to take an expansive, multi-disciplinary effort, one that will need to adapt constantly to an ever-changing field.
A chance to start again
For the cybersecurity community, there’s a silver lining in all of this hard work. To the public—the almost-everyone who lives with and within computational technologies—we have a chance to introduce ourselves again. What do we do? Why are we important? What can you do to protect yourself and your community? Right now, cybersecurity is fighting an uphill battle in explaining itself—what it is, whom it affects, and why it matters. Work like Joy Buolamwini’s makes me think that the future there is bright:
The “call to action” part
If you’re interested in using or building on the MLFailures curriculum, please reach out.
If you have an idea for an art project that speaks to one of these issues, apply to CLTC’s Cybersecurity Arts Contest. The winner gets some cash to do an art project.
I’ll post a complete course plan for the mini-bootcamp soon.
Just as ML will automate the kinds of programs regular developers write, it will also automate the kinds of programs that security researchers write. Imagine saying to an automated penetration testing assistant, “look for buffer overflow vulnerabilities in this code.”
Maybe what I’m describing here is “really just a compiler”?
Here’s a piece by Andrew Critch and Tom Gilbert on how code-generation is the most probable path to existential risk in AI.