Re: More Details
Doesn't this indicate that there's probably a crisis in security at the moment? It's almost inconceivable that this is the first ever attempt at dependency poisoning. How many others have been perpetrated unnoticed?
The way in which the Linux world is divided into myriad different projects doesn't help. Some projects are claiming to be the best thing for system security since the invention of sliced bread (cough systemD cough). But they may also pass the buck on the security of their dependencies whilst mandating minimum version numbers of those same dependencies. Did they vet those versions carefully as part of their claim to bring security to systems?
The Linux and OSS environment is ripe for more patient attackers to get a foothold on all systems.
Build Systems Are Not Helping, and Developers Have Been Hypocritical
The build systems these days seem to be a major part of the problem. The whole autotools / M4 macros build system is hideously awful, and that seems to have played a big part in aiding obfuscation in this case. There is enthusiasm for cmake, yet that too seems littered with a lot of complexity.
Clearly something is very, very wrong when tools like Visual Studio Code consider it necessary to warn that merely opening a subdirectory and doing "normal" build things can potentially compromise the security of your system. It really shouldn't be like that.
One always needs some sort of "program" to convert a collection of source code into an executable, and in principle that program is always a potential threat. However, the development world has totally and utterly ignored the lessons learned by other purveyors of execution environments despite having often been critical of them. Javascript engine developers have had to work very hard to prevent escapes to arbitrary code execution. Adobe Reader was famously and repeatedly breached until they got some proper sandbox tech. Flash Player was a catastrophic execution environment to the end. And so on. Yet the way that OSS build systems work these days basically invites, nay, demands arbitrary code execution as part of the software build process.
Unless build systems retreat towards being nothing other than a list of source code files to compile and link in an exactly specified and independently obtainable IDE / build environment, attacks on developers / the development process are going to succeed more and more. These attacks are clearly aided by the division of responsibility between multiple project teams.
Secure systems start, and can end, with secure development, and no one seems to be attending to that at the moment. Rather, the opposite.
How About This For An Idea?
One very obvious thing about how OSS source is distributed and built is that projects all conflate "development build" with "distribution build".
When developing code, it's generally convenient to break it up into separate files, to use various other tools to generate / process source code (things like protoc, or the C preprocessor). Building that code involves a lengthy script relying on a variety of tools to process all those files. Anyway, after much pizza and late nights, the developer(s) generously upload their entire code base to some repo for the enjoyment / benefit of others.
And what that looks like is simply their collection of source files and build scripts, some of which no doubt call out to other repos of other stuff or include submodules. So what you get as a distributee is a large collection of files, plus scripts that you have to run to reproduce the executable on your system, and that you must therefore either review or simply trust.
<u>Single File</u>
However, in principle, there is absolutely no fundamental reason why a distributee needs to get the same collection of files and scripts as the developer was using during development. If all they're going to do is build and run it, none of that structure / scriptage is of any use to the distributee. It's very commonly a pain in the rear end to deal with.
Instead, distribution could be of a single file. For example, any C project can be processed down to a single source code file devoid even of preprocessor statements. Building a project from that certainly doesn't need a script, you'd just run gcc (or whatever) against it. You'd also need to install any library dependencies, but that's not hard (it's just a list).
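To give a rough feel for the merging step, here is a minimal Python sketch. It is not full preprocessing (no macro expansion, no conditionals); it only inlines quoted `#include "..."` lines from an in-memory set of files, and it assumes every header should be included at most once. The names `amalgamate`, `main.c` and `util.h` are made up for illustration.

```python
import re

# Matches local includes such as: #include "util.h"
LOCAL_INCLUDE = re.compile(r'^\s*#\s*include\s+"([^"]+)"\s*$')


def amalgamate(files, entry, _seen=None):
    """Return one source text with quoted includes inlined.

    files: dict mapping file name -> file contents (hypothetical project).
    entry: name of the file to start from.
    Assumption: each file is inlined at most once, like an include guard.
    """
    seen = set() if _seen is None else _seen
    if entry in seen:
        return ""  # already inlined earlier, so emit nothing
    seen.add(entry)

    out = []
    for line in files[entry].splitlines():
        match = LOCAL_INCLUDE.match(line)
        if match and match.group(1) in files:
            # Replace the include line with the included file's text.
            out.append(amalgamate(files, match.group(1), seen))
        else:
            # System includes and ordinary code pass through untouched.
            out.append(line)
    return "\n".join(out)


project = {
    "main.c": '#include "util.h"\n#include <stdio.h>\n'
              "int main(void) { return add(1, 2); }",
    "util.h": "int add(int a, int b) { return a + b; }",
}
single_file = amalgamate(project, "main.c")
```

In this sketch `single_file` holds the body of `util.h` followed by the rest of `main.c`, with the `<stdio.h>` include left for the compiler to resolve.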
In short, the distributee could fetch code and build it knowing that they only have to trust the developer when they run the code (assuming the lack of an exploit in gcc...). And if you are the kind of distributee that is going to review code before trusting it, you don't have to reverse engineer a myriad of complex build scripts and work out what they're actually going to do on your particular system.
If you want to do your own development on the code, fine, go get the whole file tree as we currently do.
<u>How?</u>
Achieving this could be quite simple. A project often also releases binaries of their code, perhaps even for different systems. It'd not be so hard to release the intermediate, fully pre-processed and generated source as a single file too. It'd be a piece of cake for your average CI/CD system to output a single source file for many different systems, certainly so if those systems were broadly similar (e.g. Linux distros).
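As a sketch of what such a CI step might look like, the Python below only builds the command lines and does not run them. It relies on gcc's `-E` (stop after preprocessing) and `-P` (omit line markers) options. The compiler names, the `merged.c` input and the output file names are assumptions for illustration, and the input is assumed to be an already merged source file.

```python
def preprocess_command(compiler, source, output):
    """Build a command that writes fully preprocessed C source to output.

    -E stops after preprocessing; -P omits the '# <line>' markers.
    """
    return [compiler, "-E", "-P", source, "-o", output]


# Hypothetical targets: label -> compiler executable assumed to be installed.
targets = {
    "x86_64-linux": "gcc",
    "aarch64-linux": "aarch64-linux-gnu-gcc",
}

commands = {
    label: preprocess_command(cc, "merged.c", f"release-{label}.c")
    for label, cc in targets.items()
}
```

Each command could then be handed to `subprocess.run(cmd, check=True)` on a build machine that has the relevant compiler and headers. Because `-E` pulls in the system headers, the output is specific to the system it was generated for, which is why there would be one file per target.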
<u>Benefits</u>
Developers could use whatever build systems they wanted, and all their distributees would need is gcc (or language / platform relevant equivalent) and the single source file right for their system.
It also strikes me that getting rid of that build complexity would make it more likely that distributees would review what's changed between versions, if there's just one single file to look at and no build system to comprehend. Most changes in software are modest, incremental, without major structural changes, and a tool like meld or Beyond Compare would make it easy to spot what has actually been changed. It'd probably also help code review within a development project.
I suspect that the substitutions made in this attack would have stood out like a sore thumb, with this distribution approach. Indeed, if a version change was supposed to be minor but the structure / content of the merged source code file had radically changed, one might get suspicious...
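The "get suspicious" check could even be roughly automated. A minimal sketch using Python's standard `difflib`, where the function name `change_fraction` and the sample lines are purely illustrative and any alert threshold would have to be picked by whoever uses it:

```python
import difflib


def change_fraction(old_text, new_text):
    """Return a rough 0.0..1.0 measure of how much the file changed.

    0.0 means the line sequences are identical, 1.0 means nothing matches.
    """
    matcher = difflib.SequenceMatcher(
        None,
        old_text.splitlines(),
        new_text.splitlines(),
        autojunk=False,  # disable the popularity heuristic for long files
    )
    return 1.0 - matcher.ratio()


v1 = "int a;\nint b;\nint c;\nint d;"
v2_minor = "int a;\nint b;\nint c;\nlong d;"
v2_major = "void w(void);\nvoid x(void);\nvoid y(void);\nvoid z(void);"

minor = change_fraction(v1, v2_minor)  # one line out of four differs
major = change_fraction(v1, v2_major)  # no lines in common
```

A release described as a minor bug fix that scores closer to `major` than to `minor` would be worth a closer look in meld before building.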