The Trouble with AI: What it Can and Can't Do

posted 46 days ago by Lupinate +140 / -0

I'm gonna try (and fail) to keep this brief, but a lot of people are starting to see that the AI bubble is real and that AI isn't the cure-all for business that its creators said it would be. I wanted to give my thoughts on this, as I've been working in large volume data systems (aka Big Data) my whole career, and AI has only come about in the past 3 years or so as some kind of game changer in my industry (when in reality it is anything but).

What AI is good at

Let's start with the positives. AI, as much as it winds me up sometimes, is very, very good at specific use cases. Let's hit them up:

Brainstorming & ideation: I use ai for this myself in my writing, and it has made a lot of my process streamlined in a way I couldn't do myself without weeks of ideation. I can be creative, but AI has helped me create entire characters conceptually, and ideas that I can then take and evolve to be my own. The same is true in business - ai is very good at coming up with a concept that can then be tested by people to see its viability. This is one of two great use cases for it in R&D.
Data quality in large datasets: ai can be good at data analysis and finding duplicates or anomalies in large datasets that no human can reasonably infer. It can get data quality up to 80% better with nominal effort, letting a person do the rest of the clean up using more nuance and accuracy. This isn't a silver bullet, but it is a massive workflow improvement for data quality, something which large companies are notoriously bad at.
Summarising CLEAN data: if your data is good but there is lots of it (say in research papers for instance), ai is excellent in summarising the findings in a way people can easily digest them. This is the second big R&D use case.
Copilot activity: whether it is coding, writing, or analysis, ai can generate results faster than a person. Provided a person is there to validate the result, ai can provide some serious boosts to efficiency across businesses when used well. It has the potential to maximise all of your existing staff, but that isn't an excuse to replace them.
Unstructured data handling: if you have a lot of your data buried in emails, pdfs, or messy texts and tickets, ai is great at extracting the value and critical info from them. People aren't good with unstructured data, but LLMs are designed for it.
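
The "first pass by machine, final call by a person" workflow from the data quality point can be sketched as follows. This is only an illustration: plain string similarity from Python's standard library stands in for whatever model does the first pass, and the record names and the 0.85 threshold are invented for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def flag_possible_duplicates(names, threshold=0.85):
    """Return (a, b, score) for pairs similar enough to deserve a human look.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    flagged = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
        if score >= threshold:
            flagged.append((a, b, round(score, 2)))
    return flagged


# Hypothetical records; nothing here comes from a real dataset.
records = [
    "Acme Trading Ltd",
    "ACME  Trading Ltd.",
    "Northwind Foods",
    "Milk 1.0L",
    "Milk 1.5L",
]

for a, b, score in flag_possible_duplicates(records):
    print(f"review: {a!r} vs {b!r} (similarity {score})")
```

The output is a review queue rather than an automatic merge. Product names that differ only in pack size also land in the queue, which is exactly the kind of pair a person has to rule on.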

Depending on the use case, this can dramatically accelerate the company growth and value granted by R&D teams, analysts, developers and product managers. It can also ease the friction between the little guys who are the doers and the executive teams by making the minutiae of what is being done easily digested by ceos and executives. You don't need to have a business degree to communicate complex concepts around computing to a board if an ai can summarise your findings into an easily digested blurb.

Where AI goes wrong

This is a huge issue, and businesses are now finding out that these are seriously non-trivial problems. A lot of what ai is good at at a topic level is also where it can struggle at a more operational and fundamental level. Fred Brooks said it best - "There is no silver bullet" when it comes to software engineering. The trouble is, everyone seems to have forgotten this and has been trying to create said silver bullet despite his essay to the contrary. So let's go over where AI fails and why.

Data quality: there is a lot AI can do here as I said, but there is also a lot it can fuck up if left unsupervised. I mentioned it can find 80% of duplicates and identify anomalies well, but it is shit at edge cases. It also has a nasty habit of producing false positives (milk 1.0L vs milk 1.5L can be seen as the same item by an ai without proper context). This is because an LLM works in probabilities rather than exact matching & semantic understanding. Although some semantics are encoded, understanding all semantic nuances across untold languages is a huge ask. Couple this with abbreviations and product names having similar iterations, and you can have a serious issue squeezing the last drop of quality out of your data with just an LLM doing the graft alone.
R&D operations: although good at ideation, AI is too costly to use for actually running R&D. The test-to-fail, iterative nature of that kind of work often can and does cause budgets to get wiped fast by the people trying to see if an idea is feasible or valuable. Great for ideation and research summarization, yes, but for actual experimentation? Fuck no. The volume of testing R&D teams tend to do in software development is insane, and those types of tests are exactly the use cases that ramp up costs for an ai based solution. It is better to test as a person, take the results of those tests, feed them to the ai and get a summary of what happened to show the business your R&D is not wasting budget. Getting the ai to do the graft here is a guaranteed way to wipe the department's funding out fast.
Temporal management: time is not an ai friendly concept. Be it semantic understanding of dates like ordering vs shipping vs delivery, or ideas like timezones and leap years, computers have always had a devil of a time understanding wtf we are doing with the calendar. This is not a trivial issue that you can throw an ai at to resolve either. Humans have built entire date management code libraries and data warehousing solutions for this one issue, just so a computer can do things with time. We also inherently understand things like "a refund must occur after a sale", or "pregnancy happens before birth", while an ai needs to be told this (and it got that second example wrong when I asked it whether it is any good at time). This is a nightmare if you are trying to use it for a forecasting or attribution modelling solution.
Knowledge retention: ai has a very bad habit of summarising data into the ground when exposed to more and more of it. Critical info often gets obliterated in the process of feeding an ai data. I have hit this a number of times in my writing as I use ai as an auditing and editing tool, and often it starts forgetting what I told it to check for as I throw more chapters at it. If you seed it well, it can manage, but if you don't? Expect it to forget the critical email you fed it as a ruleset after feeding it another 100 to assess. The larger the ruleset you need for an ai, the more likely it is to try and crunch that ruleset down to save on compute, leading it to forget what you told it.
Temporal leakage: this is a fun piece of ai specific nonsense related to how all computers suck at the human understanding of time, coupled with the knowledge issue I just mentioned. An ai forecasting model can, and often does, start treating future events it predicted as events it can use as part of its model. This is basically the ai becoming so confident it "knows" the future that it uses it as if it were part of the past that it is building the data from.
Slowly Changing Dimensions (SCDs): this is a database / data concept where a unique id is only unique inside a specific timeframe, and can be overwritten due to business logic. Say you have a barcode that is being used by supply one month to represent raspberries and another month to represent a blouse (yes this does happen for cost saving purposes, I can attest to this personally). An ai will look at the data, see the unique id as unreliable and code its way around it, potentially using compute-heavy methods rather than building a different unique id specifically to manage the issue (which is how SCDs are supposed to be handled long term). There are also different types of SCD which add further complexity to the problem (4 main types with 3 hybrids), and ai is really bad at knowing how to handle these effectively.
Infosec: to build certain products like forecasting tools, an ai will need almost unfettered access to a host of internal systems. This is a security officer's and CDO's nightmare. In business, most people are given the least amount of access they need to do their job. With an AI, you need to give it a lot more access, and we have seen instances where businesses have lost almost everything because the ai had unfettered access to everything. This paradox of needing an ai to be unrestricted while having it limited to what it needs access to is still something that security management teams and platforms haven't solved.
Repeatability: we have probably all seen the photo of Dwayne "The Rock" Johnson being turned into an abstract Picasso painting over 100 iterations by chatgpt by now. That problem is not limited to images. Because of the more probabilistic nature of ai, you can ask it to do a task multiple times and it will provide slightly different results each time. This is fine if the issue is a real time problem and new data is technically coming in, but often it is just the ai doing what it naturally does - behaving probabilistically. If the data is unchanged, the solution should not change. However, because of the inherent issues in ai, it can give different answers for the same problem. If you need a consistent result, ai ain't the best way forward.
Hallucinations & "Garbage in, Garbage out": I've had this happen and it's become a bit of a meme. An ai will not reason like a person. It isn't "question > research data > answer" that an ai does. It's "context > pattern inference from data > probabilistic outcome". This may seem to be the same, but humans are better at seeing a fact based on nuances than a computer is. An LLM is only as good as the data it is trained on, and if some of that data is logically impossible to be true alongside other data, the ai won't see that and say "hang on something is off here" as it can only work on what it is fed. This always happens with the old "garbage in, garbage out" issue in data, but ai can take it further by trying to bridge the gaps in data it is missing. The LLM doesn't know fact from fiction, nor does it even know what is fact unless it is told. Even then, if said facts are rendered out of existence due to the knowledge retention problem, the ai will be prone to seeing a mirage.
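
The barcode reuse described under Slowly Changing Dimensions is usually handled by giving each version of the product its own warehouse-generated key plus a validity window (commonly called a Type 2 SCD). Here is a minimal sketch of that idea, assuming a hypothetical in-memory product table; the field names, dates, and barcode value are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ProductVersion:
    surrogate_key: int          # warehouse-generated id, never reused
    barcode: str                # business key, which may be recycled
    description: str
    valid_from: date            # inclusive
    valid_to: Optional[date]    # exclusive; None means "still current"


# Hypothetical dimension rows: one barcode, two different products over time.
PRODUCT_DIM = [
    ProductVersion(1, "5000000000017", "Raspberries 250g", date(2024, 1, 1), date(2024, 2, 1)),
    ProductVersion(2, "5000000000017", "Blouse, size M", date(2024, 2, 1), None),
]


def resolve(barcode: str, on: date) -> ProductVersion:
    """Find the product version a barcode referred to on a given date."""
    for row in PRODUCT_DIM:
        if row.barcode != barcode:
            continue
        if row.valid_from <= on and (row.valid_to is None or on < row.valid_to):
            return row
    raise LookupError(f"no product for barcode {barcode} on {on}")


january_sale = resolve("5000000000017", date(2024, 1, 15))
february_sale = resolve("5000000000017", date(2024, 2, 10))
print(january_sale.surrogate_key, january_sale.description)
print(february_sale.surrogate_key, february_sale.description)
```

Sales facts then store the surrogate key rather than the barcode, so a January raspberry sale never gets reported as a blouse.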

When Humans fail at AI

There are a lot of foibles I've already highlighted that are ai specific, but the following are more human centric issues. Things we don't consider, or things we aren't aware of at all, have an effect on why ai becomes an issue rather than a fix.

Hidden Costs: ai is not just a token x price model for costing. There are about 17 other variable costs under the hood that also are impacted by its usage. One moment people think it's a cheap automation solution, the next they realise they have built an entire platform in parallel to their original product.
Wrong tool for a problem: ai is actively being used as a replacement for a lot of tools we already had, even when it is more expensive than the original option. It's like buying a Thermomix and using it just to heat water for a cup of tea. A lot of tools in data existed before ai that did the same thing ai does, and those tools are often cheaper than burning a token or 12 to find the same answer.
Amplifier vs Replacement: using an ai to amplify the efficiency of your current staff is a good use of ai. Using ai to replace your staff and assuming it will retain their business knowledge and understanding of systems? Not so much. The first gives you a massive productivity boost. The second gives you massive budgets, high risks, and trust fails.
80/20 rule: ai is great at getting 80% of the way through a problem, especially issues like data quality as I said earlier. However, the last 20% will be made up of edge cases, nuances that often have a temporal component, business exceptions that humans grasp instantly while an ai won't get them without patient prompting (which may cause it to forget other exceptions), or good old fashioned legal situations. These can lead to huge cost risks if an ai is left to try to do those things alone. Those are human centric problems that an ai is just shit at solving, but a person can (with the appropriate training) solve these problems cheaper than an LLM ever could.
Cost Scaling: similar to hidden costs, but different. When you ask an engineer to solve a problem, he isn't going to charge you extra on top of his contract. When you ask an ai, the provider will charge you for what it does (in the form of tokens + compute). Businesses have yet to realise the cost of an ai isn't linear, and the ways of reducing those costs do not include replacing whole teams with agents.
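
One way to see why the cost isn't linear: in a chat-style workflow the whole conversation so far is typically sent again with every new request, so the input tokens you pay for grow roughly with the square of the number of turns. The sketch below only models that one effect. The tokens per turn and the price are made-up numbers, not any vendor's pricing, and it ignores output tokens and any caching discounts.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens billed when every request resends the full history.

    Request k carries k turns of context, so the total is
    tokens_per_turn * (1 + 2 + ... + turns).
    """
    return tokens_per_turn * turns * (turns + 1) // 2


def cost(turns: int, tokens_per_turn: int = 500, price_per_million: float = 3.0) -> float:
    """Illustrative cost in currency units; both defaults are invented values."""
    return cumulative_input_tokens(turns, tokens_per_turn) * price_per_million / 1_000_000


for turns in (10, 100, 1000):
    tokens = cumulative_input_tokens(turns, 500)
    print(f"{turns:>5} turns -> {tokens:>12,} input tokens, cost {cost(turns):.2f}")
```

Under these assumptions, ten times as many turns costs close to a hundred times as much, which is the kind of curve that catches out a budget built on "tokens x price".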

So is AI useless in the modern business?

No. Absolutely not. I say this as someone who swears at AI responses a lot too. AI has its use cases. You do not want to pay a data scientist a six figure salary to work on:

repetitive tasks
probabilistic outcomes
narrow use cases
unstructured data
first pass analysis
classifying datasets
transcribing and summarising

These things are time heavy for a high value asset to work on, while an AI can do them at speed with low compute costs and high rates of return.

However, you don't want an ai to be responsible for:

architecture with weak data governance
multi-month R&D projects or high scale iteration projects
high accuracy solutions
autonomous systems
systems with huge context requirements
cross system orchestration
source of truth development
any projects where trust collapse is a business risk to avoid
frontier model dependent work
problems with significant temporal & governance requirements

Giving AI free rein over these is asking for pain. It leads to creating parallel systems of operation that both need to be maintained, or situations where a single prompt can generate hundreds or thousands of calculations under the hood.

TLDR

AI is great for getting rid of the menial effort of business. However it cannot be used to replace your workforce, especially anyone with in depth business knowledge. It isn't good at providing facts/truth, but if you need what is probably true (give or take 20%) it is a decent alternative to a person.

103 comments

– NotAgainTwo 3 points 45 days ago +3 / -0

For the fiction writers out there who are open to using AI, using it to check for continuity is one of the best uses I've found for it, especially if you're writing a series.

Most writers have things like character bibles, event ledgers, relationship progression ledgers, etc.

Using AI to help keep track of all this has saved me literally days and weeks of work. You can have an AI audit your chapters against your bibles and ledgers and such for you.

You'll get an idea of how much time you can save checking for continuity issues by my personal list of what I have AI audit per book and then the entire series: (I'm currently writing Romantasy (romance x fantasy) so I have to keep track of things like magic systems along with everything else)

EDITED TO ADD: I wanted to add that it's usually better to do this with a local AI like LM Studio or Ollama, not cloud based AI like ChatGPT or Claude. Using local keeps your IP secure because it's all run right on your own computer. You're not uploading your entire IP into the cloud where it can be used for AI training and God only knows what else.

Master Continuity Tracking Checklist

Character Identity

Basic Identity

Full name

Nicknames

Titles

Honorifics

Aliases

Secret identities

Birth name

Married name

Demographics

Age

Birth date

Birth year

Species

Race/ethnicity

Nationality

Social class

Occupation

Appearance

Height

Weight

Build

Eye color

Hair color

Hair length

Hairstyle

Skin tone

Distinguishing features

Scars

Tattoos

Birthmarks

Piercings

Disabilities

Missing limbs

Prosthetics

Physical Traits

Dominant hand

Posture

Gait

Athletic ability

Strength

Flexibility

Endurance

Character Psychology

Personality

Core personality traits

Temperament

Moral alignment

Sense of humor

Introvert/extrovert tendencies

Preferences

Likes

Dislikes

Favorite foods

Favorite colors

Hobbies

Music preferences

Fears

Phobias

Traumas

Triggers

Motivations

Goals

Ambitions

Needs

Desires

Beliefs

Political beliefs

Religious beliefs

Cultural beliefs

Ethical code

Character History

Family

Parents

Siblings

Extended family

Guardians

Relationships

Friends

Enemies

Mentors

Lovers

Exes

Life Events

Birth

Education

Military service

Marriages

Divorces

Deaths

Major traumas

Character Knowledge Tracking

Track exactly what each character knows.

Knowledge State

Secrets known

Secrets unknown

Lies believed

Discoveries made

Misunderstandings

Timing

When learned

How learned

Who told them

Character Voice

Speech Patterns

Vocabulary

Accent

Dialect

Favorite phrases

Curse habits

Formality level

Communication Style

Direct

Indirect

Sarcastic

Blunt

Diplomatic

Relationship Continuity

Relationship Status

Stranger

Acquaintance

Friend

Lover

Enemy

Rival

Relationship Milestones

First meeting

First touch

First kiss

First sex

First "I love you"

First fight

Breakup

Reconciliation

Relationship State

Trust level

Attraction level

Loyalty level

Emotional intimacy

Physical Condition

Health

Illnesses

Chronic conditions

Allergies

Disabilities

Injuries

Cuts

Bruises

Burns

Broken bones

Concussions

Magical injuries

Recovery

Treatment

Healing progress

Permanent damage

Clothing Continuity

Current Outfit

Shirt

Pants

Dress

Coat

Shoes

Accessories

Jewelry

Watches

Belts

Bags

Glasses

Inventory Tracking

Personal Possessions

Weapons

Keys

Phones

Wallets

Documents

Story-Critical Objects

Artifacts

Maps

Rings

Letters

Magical items

Track:

Owner

Current location

Last seen

Timeline Continuity

Calendar

Year

Month

Day

Weekday

Time

Hour

Time of day

Duration

Travel time

Recovery time

Training time

Pregnancy timeline

Geography

World Map

Countries

Regions

Cities

Villages

Travel

Distances

Routes

Transportation methods

Locations

Building layouts

Room layouts

Hidden passages

Worldbuilding

Government

Political systems

Laws

Leaders

Economy

Currency

Trade

Resources

Religion

Gods

Rituals

Beliefs

Culture

Customs

Holidays

Taboos

Magic System

Rules

What magic can do

What magic cannot do

Costs

Energy

Resources

Consequences

Limitations

Range

Duration

Restrictions

Special Cases

Rare powers

Forbidden powers

Technology Continuity

Technology Level

Weapons

Transportation

Communication

Availability

Rare technology

Common technology

Creature Continuity

Species Rules

Lifespan

Reproduction

Abilities

Weaknesses

Individual Creatures

Names

Ownership

Status

Political Continuity

Factions

Alliances

Enemies

Neutral parties

Leadership

Rulers

Successions

Coups

Military Continuity

Forces

Army sizes

Fleet sizes

Unit names

Battles

Casualties

Outcomes

Strategic consequences

Economic Continuity

Wealth

Character wealth

National wealth

Resources

Food

Fuel

Magic resources

Legal Continuity

Laws

Criminal laws

Civil laws

Consequences

Arrests

Sentences

Pardons

Mystery Continuity

Clues

Introduced clues

Revealed clues

Suspects

Known suspects

Eliminated suspects

Plot Continuity

Main Plot

Objectives

Obstacles

Turning points

Subplots

Introduction

Development

Resolution

Foreshadowing Continuity

Track every:

Prophecy

Vision

Hint

Omen

Setup

Track whether it:

Paid off

Has not paid off

Was intentionally subverted

Series Canon

Immutable Facts

Birth dates

Family trees

Historical events

Magic laws

Geography

These should never change unless formally retconned.

Reader Promise Continuity

Track every promise made to the reader.

Examples:

Prophecy

Hidden identity

Mystery setup

Romance setup

Revenge setup

Readers expect payoff.

Romance Continuity

Attraction Progression

Initial attraction

Sexual tension

Emotional attachment

Intimacy Progression

Hand holding

Touching

Kissing

Sexual activity

Emotional Progression

Trust

Vulnerability

Commitment

Relationship State

Exclusive?

Bonded?

Married?

Mated?

Separated?

Series-Level Franchise Continuity

Track across all books:

Character Ages

Family Trees

Timeline of Events

Death Registry

Power Progression

Relationship History

Political Changes

Territorial Changes

Canon Quotes

Recurring Symbols

Running Jokes

Recurring Objects

– Lupinate [S] 1 point 44 days ago +1 / -0

This. 10000000 times this. That is a huge piece of what ai helps manage, and I can tell it is a big problem for series writers. I keep a living bible of my stuff, and I feed that to the ai as a checking tool.

I'm literally listening to a series where whole characters, items, and concepts were basically forgotten about or treated poorly because of the need to shift the plot significantly. One item got mentioned only twice, a single sentence each time, in 2 books that were 4 books apart in the series. Whole concepts have been abandoned, and "interludes" seemed to be forgotten about.

– HonestBobbin 1 point 45 days ago +1 / -0

Found it as a comment! This is awesome, thank you again!

edit: With LM Studio or Ollama, how long does it take to run your audits locally?

When do you use each of those and why choose it over the other?

I need to look into doing something locally, maybe use the Claude 3.5 engine or one of the ones you suggested.

– NotAgainTwo 2 points 45 days ago +2 / -0

Claude 3.5 is not a local model. It runs on the cloud. You can use Claude through Claude.ai or the Anthropic API, but not fully locally.

For local continuity audits, I’d start with LM Studio because it’s easier to install and test.

Timing depends on how powerful your computer is (how much RAM, etc) and how big what you're auditing is. For example, if I'm doing a search for all the times an object shows up in a book, it will only take a few seconds. If I'm auditing an entire chapter against my Story Bible, it can take up to 10 minutes. The longest is doing a comprehensive audit of an entire series against the full Story Bible, which can take several hours.

– HonestBobbin 1 point 45 days ago +1 / -0

Thank you for that. I had a tech buddy show me a Claude Download he had installed locally & said only the 3.5 was available to install like that.

I will need to tinker with these. I have machines with either 32 or 64 GB of RAM, but they are older & I built them for family to play Minecraft together & run a local server for modded MC. I will need to see how they hold up with LM Studio.

For which use cases would you suggest Ollama? I don't mind tinkering with tech.

Thank you again for all of this advice.

– NotAgainTwo 2 points 45 days ago +2 / -0

You're very welcome.

With 32–64 GB RAM, even if the hardware is older, you're actually in a pretty good position to experiment. A lot will depend on CPU, GPU, and VRAM, but that's enough memory to run several useful local models.

For Ollama specifically, I use it when I want repeatable workflows rather than a chat window. For example:

Automatically scan chapters for continuity issues
Build and update character databases
Extract timelines from books
Generate relationship ledgers
Compare a new chapter against established canon

If I just want to sit down and chat with a local model, test prompts, or load a document and ask questions about it, I'd start with LM Studio because it's simpler.
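
For anyone wondering what a "repeatable workflow" with Ollama might look like, here is a rough sketch that builds a continuity-check request for Ollama's local HTTP endpoint. It assumes Ollama is running on its default address. The model name is a placeholder for whatever you have pulled, and the prompt wording and the tiny bible and chapter are invented for the example, not my actual setup.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_audit_prompt(bible: str, chapter: str) -> str:
    """Combine the story bible and a chapter into one continuity-check prompt."""
    return (
        "You are checking a novel chapter for continuity errors.\n"
        "Treat the STORY BIBLE as canon. List every place where the CHAPTER "
        "contradicts it, quoting the conflicting sentence. "
        "If there are none, reply with exactly: NO ISSUES.\n\n"
        f"STORY BIBLE:\n{bible}\n\nCHAPTER:\n{chapter}\n"
    )


def build_payload(bible: str, chapter: str, model: str = "llama3.1") -> dict:
    """Request body for /api/generate; the model name is a placeholder."""
    return {
        "model": model,
        "prompt": build_audit_prompt(bible, chapter),
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": {"temperature": 0, "seed": 42},  # aims to reduce run-to-run variation
    }


def audit_chapter(bible: str, chapter: str, model: str = "llama3.1") -> str:
    """Send one chapter to the local model and return its findings as text."""
    body = json.dumps(build_payload(bible, chapter, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]


# Invented sample data, just to show the shape of the request.
bible = "Mara: green eyes, left-handed, born in Year 412."
chapter = "Mara narrowed her blue eyes and drew the blade with her right hand."
payload = build_payload(bible, chapter)
print(payload["prompt"])
# With Ollama running locally: print(audit_chapter(bible, chapter))
```

Looping `audit_chapter` over a folder of chapter files is what turns this into a batch job instead of a chat session. The fixed seed and zero temperature should cut down on the answer changing between runs, though I wouldn't count on identical output across different machines or model versions.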

– NotAgainTwo 2 points 45 days ago +2 / -0

I also meant to ask, when you say they had installed Claude locally, do you know more about that? Because from what I understand, that can't be done.

I'm wondering if they have configured a local AI to call Claude's API. If that's the case, Claude is still being run in the cloud. Being routed through a local AI doesn't change that. So the issue with exposing your IP is the same as if you were just using Claude as normal.

It sounds like an interesting setup and I'd love to know more about it. The only other thing I can think of is if it's some sort of enterprise or internal access for an employee or something similar.
