3 areas where gen AI improves productivity — until its limits are exceeded
It’s a collaborator that never tires and always has the bandwidth to help, he adds.
In February, global management and strategy consulting firm Zinnov and digital services transformation company Ness Digital Engineering released an in-depth study of over 100 software engineers in live engineering environments and concluded that when engineers used gen AI, they reduced time to update existing code by 70%, test code by 41%, and write new code by 32%.
For code updates, gen AI was good at using functions already present in the code base, and was invaluable in suggesting improvements for code performance, researchers say. In testing, gen AI was also particularly good at generating test cases and creating dummy data for testing.
But in new code development, impact was hampered by the limited availability of training data and the AI’s limited understanding of the project context.
Overall, there was a 38% reduction in task completion time with gen AI, with the biggest gains seen by senior engineers. Experienced professionals better understood the AI’s suggestions and were able to fix mistakes before adding them to the code. They also required fewer prompts to get the results they needed, and were able to provide more context in their prompts because they had a better understanding of the existing codebase and project requirements.
But it’s not just big software development projects where gen AI can help. It can also be used to write small scripts and queries, empowering non-technical employees.
Management consulting firm AArete has been using gen AI to do just that for about a year.
“We like to believe we’re client zero,” says Priya Iragavarapu, VP of digital technology services. “We apply new technologies on ourselves and identify the growing pains right away.”
One immediate productivity benefit is that employees no longer have to turn to the company’s center of data excellence team to create complex queries and special scripts. That team, which was composed of 20 people, was no longer needed for those requests.
“They’re still busy,” she adds. “But they’re doing different tasks. We integrated the team that was previously doing all these data pulls into a larger delivery team that was doing more complex work.”
Meanwhile, there’s also a downside to using AI for coding. According to developer tool company GitClear, four years of data analysis, spanning more than 150 million changed lines of code, has revealed an uptick in code churn and a decrease in code reuse, two signs of falling code quality.
The company expects the churn rate to be 7% this year, twice as high as before gen AI. And this isn’t the only warning sign where productivity is concerned.
Every year, Google surveys tens of thousands of developers for its annual state of DevOps report, and this year AI was a major topic.
Respondents said AI was already showing value when it comes to writing and optimizing code, analyzing security, learning new skills, identifying bugs, writing tests, creating documentation, and more. But, according to the report’s authors, the survey data also shows AI has a neutral or even negative effect on team performance and software delivery performance.
“A lot of people talk about programmer productivity,” says Mike Loukides, VP of emerging tech content at O’Reilly Media. “But the gains may be smaller than the hype curve would have us believe. I suspect we’re not seeing significant productivity gains on the big corporate scale.”
In his own survey, released in November, the most common use of gen AI was in software development; 34% of companies are experimenting with it, and 44% are using it in their work. Eventually, he says, everyone will be using it, with AI tools being integral and reliable. Still, he urges companies to look beyond measurements of coding speed.
“What if writing code wasn’t the real issue?” he asks. “What if the real issue is understanding what the customers’ problems were? Maybe we can spend less time writing code and put that time into understanding the customer and how to build something that works for them.”
Gen AI and knowledge work
Gen AI is particularly good at generating text, which makes it invaluable for knowledge workers. But the first big public breakthrough in gen AI creativity was actually with images.
Dall-E 2 and Midjourney created images that fooled humans and even won awards as early as 2022. So it’s no surprise creative professionals use these tools to create marketing and sales materials, to supplement internal communications, and for other purposes.
And according to a study released in February by Adobe research, based on a survey of over 2,500 creative professionals, 83% say they use gen AI tools, 66% say they make better content, and 58% say they’ve increased the quantity of content they produce.
But gen AI is also capable of helping knowledge workers in other creative areas.
“When generative AI first came, we saw a massive opportunity to increase the productivity of our resources,” says Sam Masri, SVP and head of digital in the customer success area at enterprise software giant SAP. “We got 600 people together to test gen AI in a sandbox to try different use cases in 54 different categories.” Some of the most successful were in industry and customer research.
“It resulted in 46% savings in time to do the same work they did before gen AI at the same quality or better,” he says. “That was an eye-opener. We realized it was something the rest of the organization can use.”
It’s not about reducing headcount, he adds. It’s about increasing coverage and productivity so an individual can support more customers at a faster pace.
Today, SAP has dozens of different gen AI tools, hosted in a company-wide platform, that allow employees to create content, graphics, and more. They’re used by developers, marketers, and many other job functions.
“Within our customer-facing organization of about 30,000 people, every person uses a number of these tools,” he says. “In my Digital Hub organization, with about a thousand people, everyone uses a minimum of four or five tools they didn’t a year ago.” With the use of these tools, SAP is seeing average productivity increases of 20 to 30%.
“That number varies widely across roles,” he adds. “Some were already highly digitized, and some had improvements that were higher because we started with a lower baseline.”
The most value, he says, was in tasks that required the analysis of large volumes of information, such as market research, industry research, and account research, as well as in prospecting, customer support, and creating content.
In customer, industry, and market research, productivity increases were between 40 and 50%, he says. And in content creation and delivery, productivity increases ranged from 20 to 30%, based on weighted averages across roles and use cases.
“All of these were labor intensive and gen AI has helped us improve exponentially,” he says. “That’s where we saw the most value.”
SAP isn’t alone in seeing productivity gains for knowledge workers with gen AI.
In late February, The Harris Poll conducted a survey of over 1,000 knowledge workers and over 250 business leaders on behalf of Grammarly about the state of business communication. Of those who use gen AI, 80% said it improves the overall quality of their work, and that using gen AI saves them 7.8 hours per week.
That would translate into a total average annual savings of $16,455 per worker if they all began using gen AI to help them with communication — an annual savings of $16.5 million for a 1,000-employee company, or $1.6 trillion a year in total for US productivity.
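The scaling behind those figures can be checked with a few lines of Python. This is only a rough sketch: the constants are the numbers quoted above, while the 52-week year and the derived hourly value and worker count are assumptions made here for illustration, not figures from the survey.

```python
# Figures quoted from the Grammarly / Harris Poll survey.
HOURS_SAVED_PER_WEEK = 7.8
SAVINGS_PER_WORKER_USD = 16_455
US_TOTAL_SAVINGS_USD = 1.6e12

# Assumption for illustration only: 52 working weeks per year.
WEEKS_PER_YEAR = 52


def company_savings(employees: int) -> int:
    """Annual savings if every employee realizes the per-worker figure."""
    return employees * SAVINGS_PER_WORKER_USD


def implied_hourly_value() -> float:
    """Dollar value per saved hour implied by the quoted figures."""
    return SAVINGS_PER_WORKER_USD / (HOURS_SAVED_PER_WEEK * WEEKS_PER_YEAR)


def implied_worker_count() -> float:
    """Number of workers implied by the quoted US-wide total."""
    return US_TOTAL_SAVINGS_USD / SAVINGS_PER_WORKER_USD


if __name__ == "__main__":
    print(f"1,000-employee company: ${company_savings(1_000):,}")
    print(f"Implied value per saved hour: ${implied_hourly_value():.2f}")
    print(f"Implied workers in US total: {implied_worker_count() / 1e6:.1f} million")
```

Under these assumptions, the 1,000-employee figure comes to $16,455,000, which the survey rounds to $16.5 million, and the per-worker savings work out to roughly $40 for each hour saved.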
Of course, Grammarly has a vested interest in these results since it makes AI-powered grammar and writing tools. But similar results were found in an MIT study released in July. Researchers conducted an experiment with 453 experienced professionals and found that those who used ChatGPT reduced the time their writing tasks took by 40%, while quality improved by 18%.
The biggest gains were seen in people who received the lowest grades on their first task. After the experiment ended, those participants who used ChatGPT were twice as likely to use it in their real job two weeks later.
At Gallagher, a global insurance brokerage, risk management, and consulting services firm, executive search practice managing director Tom Wilson says his team uses gen AI for research and written communications.
For example, it used to take about an hour to create a job candidate profile based on a standard format. Today, it takes half the time.
“You do have to go back and put in your own voice,” he adds. “You have to personalize.”
Generative AI is also used to write emails, he says, but they still have to be personalized so they don’t sound like they were written by AI. “But it’s much easier than writing something from scratch,” he says. “We typically had our own templates we developed for emails, but generative AI has been helpful to freshen things up. Now it doesn’t sound like I just pulled a template from my shared drive.”
He estimates that gen AI has reduced the time it takes to write these standard emails by 30 to 40%.
“I’ve been recruiting people my whole career,” he says. “And this is the fastest-moving transition around a technology I’ve seen. It’s like putting a PC or a Mac at everyone’s desk 30 years ago.”
When productivity drops using gen AI
Productivity increases are particularly notable in areas where the AI is proficient, but when it’s used for tasks beyond its ability, productivity can fall dramatically, according to a recent study by researchers at Harvard, Wharton, Warwick, MIT, and the Boston Consulting Group.
The experiment involved more than 700 consultants working at BCG, and found that consultants working with the help of OpenAI’s GPT-4 completed, on average, 12% more tasks, finished them 25% quicker, and produced results of 40% higher quality than consultants who didn’t use the AI.
Below-average consultants benefitted the most, with their performance increasing by 43%. Meanwhile, above-average consultants saw only a 17% benefit. But that was for tasks that the AI was good at. For tasks beyond the capabilities of gen AI, consultants saw a 19% decline in performance.
And the frontier between tasks the AI handles well and those it doesn’t wasn’t where people might expect it to be, according to the report’s authors. For example, and counter-intuitively, gen AI is good at coming up with new ideas but bad at basic math.
The experiment included tasks normally part of the day-to-day jobs of BCG consultants, such as developing new product ideas and solving business problems. Consultants needed to use creativity, analytical skills, persuasiveness, and writing skills to carry out these tasks.
“You want to have an understanding of when to question the output of AI,” says Forrester analyst J.P. Gownder. “That’s not a simple thing to solve. It requires a lot of context, judgment, and not only an understanding of AI but also of your environment.”
He expects that many companies will under-invest in the training dimension of gen AI this year.
“They’ll think one hour in a classroom is going to get people up to speed,” he says. “There are hard things, like prompt engineering, you can’t learn in a day. Gen AI has incredible potential to be transformative, but it’s not an easy journey and a lot of that journey flows right through the people who are going to need those skills.”