What Can It Do and Where Are Its Limits?

Debunking GenAI

After the AI boom kicked off by the emergence of ChatGPT at the end of 2022, it’s now safe to say that we have officially reached the age of AI (or, more specifically, GenAI). That means: AI is all the rage – everybody wants it, and it’s the ultimate marketing buzzword. Unfortunately, that also means that more often than not, it’s used incorrectly and assumed to be able to do much more than it really can. But what’s left when you strip off all the promotional promises? Let’s find out!

Here at storywise, we know our GenAI. We know its limits and we know where it can actually add value – which is why we’ve only included Artificial Intelligence in certain workflows of our requirements engineering software. For us, AI is more than an overworked marketing term: we use it where it makes sense and don’t slap it on everything just because it’s cool.

Unfortunately, that’s not really how the majority of the world is dealing with AI – the impressive capabilities of Generative AI engines (paired with smart marketing) have led to a colossal, global overestimation of what it can do for us and even the idea that it can replace humans.

So let’s have a look at what AI actually can do for us and where its limits are.

First Things First: There’s Nothing Intelligent About AI

Despite the name, AI is not intelligent – that’s just a brilliant marketing gimmick. At the time of writing (November 2024), what the public calls Artificial Intelligence, stripped of all its pretty marketing make-up, is nothing but a Large Language Model (LLM). And LLMs don’t run on some alien intelligence, consciousness or even logic.

Instead, an LLM uses statistical probability to create content: it adds the token that’s most likely to appear next, based on the millions and millions of texts it has been trained on. And while that sometimes results in quite impressive output, there’s nothing intelligent about it – so if you’re looking for logic, reason or real understanding when using an LLM, you’re in the wrong place.
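To make that concrete, here’s a toy sketch of the principle in Python – the “probabilities” below are made up purely for illustration; a real LLM computes such a distribution over tens of thousands of possible tokens with a neural network.

```python
import random

# Made-up probabilities for what might follow the prompt "The cat sat on the".
# A real LLM would compute such a distribution over its entire vocabulary.
next_token_probs = {"mat": 0.55, "sofa": 0.25, "roof": 0.15, "moon": 0.05}

prompt = "The cat sat on the"

# The next token is picked purely by weight – no understanding of cats involved.
next_token = random.choices(
    population=list(next_token_probs.keys()),
    weights=list(next_token_probs.values()),
    k=1,
)[0]

print(prompt, next_token)  # e.g. "The cat sat on the mat"
```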

Take this picture, for example: it was created with Fooocus AI and shows a fox that, at first glance, looks quite realistic. If you look at it more closely, though, you’ll notice that the shadow beneath its tail looks suspiciously like a fifth paw.

Created with Fooocus, February 2024

Things like that happen because, in the data the model has seen, a shadow like that in the image of a fox was “most likely” a paw – but the model doesn’t understand that the fox already has four paws and doesn’t need a fifth one.

Nevertheless, LLMs can be really, really impressive too – for example, when they show what are called emergent abilities: unexpected skills like doing simple calculations or following basic instructions without being specifically trained to do so. Even in those cases, however, they are not actually being smart: they don’t understand how things work, they’re merely crunching numbers and probabilities.

How Does AI Know What It Knows?

Now you’re probably wondering where the LLM gets those numbers and probabilities from, and the answer is: text. Lots and lots and lots of text – that’s why they are called Large Language Models. The text these models are usually trained on comes from the internet, so they have a wide-ranging vocabulary and “know” about all kinds of different topics.

And just to give you an idea of what exactly we mean when we say “lots and lots and lots of text”, here’s a fun fact about the volume: the entirety of Wikipedia in all languages makes up about 3% of the training data of GPT-4.

Due to the massive amount of data LLMs are trained on, they are usually quite good at things like rephrasing text or mimicking specific styles – so there are definitely tasks they can do for us. But at the moment, if we’re being honest, the applications are somewhat limited, since there is no intelligence in AI – it always needs a Human in the Loop (HITL).

A great example of the limits of GenAI is the following: if you ask an LLM to create a database for a food delivery service, it will yield a comprehensive schema that most likely contains everything you need. If, however, you ask it for a database on a very technical or very niche topic, the output will most likely be lacking – unless you feed it detailed info first.

(Give the GenAI something to read, and it will “know”! Created with Fooocus, February 2024)
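For illustration, here’s a minimal sketch of the kind of schema an LLM will typically propose for the food delivery example – the table and column names are purely hypothetical and not the output of any particular model:

```python
import sqlite3

# Hypothetical tables an LLM might suggest for "a database for a food delivery
# service" – a fuller answer would usually also cover menus, couriers, payments etc.
schema = """
CREATE TABLE customers   (id INTEGER PRIMARY KEY, name TEXT NOT NULL, address TEXT);
CREATE TABLE restaurants (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id            INTEGER PRIMARY KEY,
    customer_id   INTEGER REFERENCES customers(id),
    restaurant_id INTEGER REFERENCES restaurants(id),
    placed_at     TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)
print([row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")])
# ['customers', 'restaurants', 'orders']
```

Ask for the same thing in a narrow, highly technical domain, and the result will be far thinner – which is exactly the limit described above.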

If It’s Not Smart … What CAN Our AI Do for Us?

The good news first – there are many things you can use your AI for where it’s actually helpful. At the moment, it has a limited role, but it’s quite powerful at the things it can do well. Think of your GenAI less like a smart entity that understands you and more like a supercharged autocomplete.

With its vast training data, there’s a myriad of things an LLM can help you with, but it’s important to keep in mind that everything it does, it does because statistics say it’s the most likely thing to come next, not because it understands what you want. That is especially noticeable when you let GenAI write an article for you: the results usually lack reasoning, don’t get their facts right and more often than not contradict themselves at least once throughout the article.

So things your LLM can help with are, for example:

  • brainstorming ideas
  • rephrasing things you wrote
  • mimicking specific styles, like more colloquial or more formal wordings
  • structuring messy documents like meeting minutes
  • formatting text
  • writing code based on your requirements

So, a lot of what AI can do for us at the moment (and do well) is grunt work: repetitive tasks that don’t really need a lot of brain power but do eat up a lot of time. And even at those tasks, you always need a human with an actual brain to check the output and see whether it makes sense – because the LLM doesn’t understand context, can’t reason and lacks implicit knowledge. It’s a tool for humans to use, not a smart entity that replaces us.

Another cute clever fox, generated by Fooocus

Use Case: AI in Requirements Engineering

In requirements engineering (RE), GenAI can be a tremendous help in repetitive tasks, especially at the start of a project. That’s why we have included AI as a feature in certain workflows in our RE software storywise.

Some of the tasks we use GenAI for at the moment are:

  • Generating user stories, epics and personas: Since LLMs are trained on humongous text corpora, they are more than capable of turning a sentence of your brief into a user story and suggesting epics and personas related to it (see the sketch after this list). Whether these user stories, epics and personas actually make sense for your real-life project still has to be checked by somebody with human smarts, though.
  • Suggesting additional requirements: Again, the huge amount of text LLMs are trained on means that they’ve seen all kinds of combinations of requirements. Based on that, they can suggest requirements you may have forgotten – but don’t expect anything new or ground-breaking and always, always check if a suggestion makes sense instead of blindly accepting it.
  • Structuring raw notes into a usable format: This ability can be hugely helpful when you want to make sense of raw meeting minutes quickly, turn a bunch of user stories into a structured user story map or even create a requirements document. As the output is based solely on probabilities, though, be prepared to double-check it.
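To give you an idea of what the user story generation above can look like under the hood, here’s a minimal sketch using the OpenAI Python SDK – the model name and prompt are illustrative assumptions, not storywise’s actual implementation:

```python
from openai import OpenAI  # assumes the openai Python SDK (v1+) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

brief = "Customers should be able to track their delivery in real time."

# Illustrative prompt and model choice – not storywise's actual setup.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": (
                "Turn the brief into a user story of the form "
                "'As a <persona>, I want <goal>, so that <benefit>' "
                "and suggest one matching epic."
            ),
        },
        {"role": "user", "content": brief},
    ],
)

# Always have a human check this – the model doesn't know your real project.
print(response.choices[0].message.content)
```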

As you can see, AI is more of a sidekick than a superhero at storywise. This is also important here in Europe because of the EU AI Act, which specifically states that humans have to keep final control over the completeness and correctness of content. While most of its obligations don’t apply yet, we’re prepared for when they do.

In a Nutshell

There are things AI can do and there are things AI can’t do – let’s not overestimate it. Here at storywise, we evaluate its usefulness for a specific task carefully and only implement it where it makes sense. If you’d like to find out more about our requirements engineering software, you can have a look at its features here, try it for free or book a demo to see it in action.