
Views from the Lab

Our take on just about everything in the world of work!

Troublesome Chatbots

20/02/2024

With the release of ChatGPT and the widespread adoption of large language models (LLMs), companies have been rushing to build this new technology into their products and services as quickly as possible. However, as everyone knows by now, these models are prone to making mistakes. At large scale, even a small error rate can lead to problems, and when features are rushed into production, the problems can be even greater.

In the news last week, a man took Air Canada to court over what its AI chatbot had told him. The chatbot had given him incorrect information about the company's bereavement refund policy, and Air Canada had refused to follow through on what the bot had offered.

We have seen this many times before. Last year at the start of the AI hype train, Microsoft added AI chat features to Bing search and released it to the world. As you might expect, it didn’t take long for the world to break it by, among other unsettling things, getting the bot to admit to “wanting to destroy everything”.  

In December last year, a car dealership implemented ChatGPT in its chatbot but hadn't added proper safeguards. This resulted in numerous funny screenshots circulating the internet, including one of the bot agreeing to sell a car for $1. There are many more examples like this, but the worst outcome has only ever been bad press for the companies involved.

What makes last week’s chatbot story different is that it went to court, with Air Canada claiming it “cannot be held liable for information provided by one of its agents, servants, or representatives”, which is a strong statement to make. The judge presiding over the case, however, ruled against Air Canada, awarding a partial refund and legal fees to the claimant and stating:

“It should be obvious to Air Canada that it is responsible for all the information on its website. It makes no difference whether the information comes from a static page or a chatbot.” 

Our Take

Even though this is only a single case in a small claims court, it sets an interesting precedent (note: I am not a lawyer) that companies cannot shirk responsibility for the AI products they are putting out there.

This may also feed into a larger theme for the year to come. 2023 was the wild west of exploring what could be done with these new tools, without much care (or at least not as much as there should have been) for safety or controls. With AI regulations beginning to take shape (see my previous blog on the EU’s AI Act) and several high-profile AI-related copyright cases on the horizon, 2024 may be shaping up to be the year of reining in AI.

By Chris Judd, Data Scientist.

Is AI good enough to take my job?

14/12/2023

One year on from the launch of ChatGPT, it’s fascinating to see the variety of views on the impact of AI on the workforce and predictions for the future.

AI augmentation and AI workers 

One UK study, for example, explored the possibility of augmenting workers’ roles with AI, specifically large language models (LLMs) - like the one that powers ChatGPT - to reduce their working week while maintaining their productivity.

The study found that 88% of the UK workforce (that’s 27.9 million workers) could see at least a 10% reduction in work time due to AI-led productivity gains, with 8.8 million workers potentially moving to a four-day work week.

I love the idea that this technology could enable a better work-life balance, but only time will tell if this productivity gain will actually be passed on to workers. 

Another article discusses the possibility of ‘AI-powered digital colleagues’. In production lines and factories, robots have been working alongside workers for some time now. But the rise of large language models has changed this dynamic, and knowledge workers could be affected much sooner than previously thought.

The article introduces ‘Ava’, an AI-powered digital sales representative who can become part of a sales team, making suggestions, editing campaigns, joining meetings, or taking notes. This is a good example of how an AI tool could boost a team’s productivity, and it raises the question: could this lead to needing fewer workers overall?

AI is not intelligent 

If you’re worried that AI might take your job, an alternative viewpoint, that AI is not yet good enough, is just as popular. This article shares Meta's latest intelligence test, made up of questions that “are conceptually simple for humans yet challenging for most advanced AIs”. Whilst they do not publish results for their own LLM, they do publicly assess OpenAI’s GPT-4, which scores a feeble 15%.

And in recent news, after the announcement of Google’s new model ‘Gemini’, the company admitted to editing the demo video to make the model look more capable than it is. This fuels the hype that AI can do more than it actually can, and it is worrying that Google thought it acceptable to embellish reality.

Our take 

It is still very early days for assessing what is possible with this new generation of AI tools. In fact, most people aren’t using tools created specifically to support their jobs; they are using generic tools like ChatGPT that have not been tested for accuracy in their particular scenarios.

With new regulation still being worked on, the safeguards are not all in place to keep new tools in check. I still maintain that new technology is only good if you apply it carefully and deliberately to valuable use cases: not using technology for the sake of it, and assessing the impact of each application on each party involved to ensure it does not cause harm.

As for the impact on workers, I am hopeful that, as with many evolutions of new technology, it will bring benefits of better work and working conditions to all. This may take some time – much longer than many are predicting. 

Hannah Jeacock, Research Director

Initial impact of AI on work

16/05/2023

ChatGPT and other large language models have been in the public eye for a few months now, but what impact have they actually had? 

A year-long study has been published that measures the impact of providing generative AI tools to technical customer service reps. Some 5,000 workers were given access to a chat-based generative AI platform based on GPT and fine-tuned using a large dataset of customer service interactions.

The study found that worker efficiency improved on average by 14%, a very significant improvement.

From the report itself: “we show that AI assistance improves customer sentiment, reduces requests for managerial intervention, and improves employee retention.” Interestingly, researchers found that the AI model enabled junior reps (2 months experience) to perform at the same level as more senior employees (>6 months experience), who comparatively saw fewer benefits. 

This is one of the first studies to measure the benefits AI tools can bring to workers. It should be noted that it began before the release of newer tools like ChatGPT. With these more powerful tools now distributed to workers on a much larger scale, it will be interesting to see what the next studies like this find.

On the other hand, we are starting to see layoffs in the tech world attributed to these new AI tools. IBM recently announced plans to pause hiring in all areas that could be replaced by AI, expecting to replace 7,800 jobs over five years through attrition.

Meanwhile, Dropbox announced layoffs of 500 employees (16% of its staff), citing a shift of resources towards developing new features around these AI tools. It is worth noting, however, that layoffs in the tech world have been happening for the past year now, since before ChatGPT started causing a stir.

These job losses may have happened anyway, and AI is just a convenient excuse. 

Chris Judd, Data Scientist

Inside McHire – using AI in the recruitment process

09/05/2023

Reading this article about how McDonald’s have transformed their recruitment process with artificial intelligence, you can’t help but be impressed with the headline improvements:

“Following the introduction of McHire, 1,450 McDonald’s restaurants in the UK and Ireland have seen: 

  • Time-to-hire reduced by nearly 65% 

  • A 20% increase in the number of candidates completing the process compared to the previous system 

  • 85.9% of surveyed candidates rated their experience as four out of five or five out of five. (December 2022) 

  • Three to five hours of time saved per recruiter per week” 

The McHire platform is available to candidates 24 hours a day and has a conversational interface, with AI assistant ‘Olivia’ helping candidates navigate the application process. Olivia asks questions, gathering information to populate the application form.

I like the idea of a conversational interface for job applications, especially for roles where the requirements are relatively simple or where employers want to attract candidates quickly and easily.  

So far so good. 

However, the sentence that worries me is this one: 

“The system assesses their suitability for the role, and if successful, they are automatically set up for an interview at the restaurant they are applying for.” 

Of course, whether applicant assessments are undertaken by AI or a human, not all applicants will get an interview. And maybe the automatic assessment referred to here only covers clear-cut yes-or-no questions regarding must-have qualifications or other legislative requirements.

But I can’t help thinking of the multiple examples (for example as reported here) where people have been unfairly rejected from jobs by an AI assessment tool, because biases have been built in.  

Done correctly, an AI tool should help assess candidates more fairly, as every application has to pass the same criteria to be accepted. But there is a very fine line between speeding up the recruitment process and ignoring swathes of people who don’t fit the algorithm.

If you are interested in learning more about biased algorithms, I highly recommend the books ‘Weapons of Math Destruction’ by Cathy O’Neil and ‘Hello World’ by Hannah Fry. 

Hannah Jeacock, Research Director

Concerns over the rapid development of AI

25/04/2023

Lately, there has been a lot of discussion around the rapid development of AI systems, what this means for our individual privacy and how their existence could impact humanity as a whole.

For instance, the Italian data protection authority has recently decided to block access to ChatGPT in the country, citing potential breaches of GDPR legislation. Its concerns include the collection of individuals’ data to train the AI model without properly informing them.

They also have concerns over ChatGPT’s tendency to “hallucinate” information not in its training data and return inaccurate responses. Furthermore, the lack of age verification on the platform could potentially expose young children to inappropriate content.

This news comes off the back of a recently published open letter calling for a pause in AI development. The letter warns that we are likely not ready for the effects that powerful AI systems will have on society and suggests that these should only be developed “once we are confident that their effects will be positive and their risks will be manageable.”

In the meantime, the letter argues, AI researchers should investigate how best to regulate these systems and come up with safety measures to ensure they are used properly.

 

Our take

AI systems have been developing rapidly in the last six months and regulation is just starting to catch up. It will be interesting to see where AI legislation goes in Europe, with ChatGPT potentially causing GDPR problems, as well as the upcoming Artificial Intelligence Act adding further regulation to the area.

As for the open letter, the pessimistic view would be that it's an excuse to give other companies a chance to catch up with OpenAI. However, I do agree that something needs to be done to ensure these systems are used safely, without stifling the opportunities they bring.

 

By Chris Judd, Data Scientist

Using ChatGPT as your Second Brain 

20/03/2023

ChatGPT and similar tools have become notorious recently for allowing students to cheat, judges to be lazy, and writers to become redundant. But here is a nicer, more honest use case for the AI system: the second brain.

The concept of a "second brain" isn't new. For centuries, scholars and academics have used the Zettelkasten (traditionally a card filing system) to hold information they might otherwise forget. These days, of course, people tend to keep their second brain in documents online (or on a laptop at least). However, accessing this information in a substantial document hoard can be tricky.

Enter ChatGPT (or one of the free alternatives). 

In this (slightly technical) article, the author does just that, training GPT-3 on his own knowledge base. So, when he asks a question, it can return the most relevant articles through semantic similarity rather than lexical similarity (i.e. using the meaning rather than matching exact text).

In the research team we’re trying something similar with job titles. Using other language models (in our case BERT, or more specifically BERTopic), we’re experimenting with training the model on a huge list of job titles along with information about each job (where available). The idea is that whenever a new job title is detected, we can align it with something similar that isn’t necessarily lexically similar. For example, take a company in the health sector with a job title of “Imaging lead”, which would probably match up to something like “Radiographer supervisor”. As humans we can spot the similar meaning, but an ordinary text search has no chance. Having this information to hand can help us with several projects going forward, such as career paths, people analytics, and recruitment.
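To make the idea concrete, here is a minimal sketch of that kind of semantic matching. It uses the sentence-transformers library (the same family of embedding models that BERTopic builds on); the model name and the small list of known titles are illustrative only, not our actual setup.

```python
# A minimal sketch of semantic job-title matching using sentence embeddings.
# Assumes the sentence-transformers package; model and titles are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# A small "knowledge base" of job titles we already understand.
known_titles = [
    "Radiographer supervisor",
    "Payroll administrator",
    "Software engineer",
    "Ward sister",
]

# Embed the known titles once, then compare any new title against them.
known_embeddings = model.encode(known_titles, convert_to_tensor=True)

new_title = "Imaging lead"
new_embedding = model.encode(new_title, convert_to_tensor=True)

# Cosine similarity compares meaning rather than exact word overlap.
scores = util.cos_sim(new_embedding, known_embeddings)[0]
best = int(scores.argmax())
print(f"'{new_title}' best matches '{known_titles[best]}' "
      f"(similarity {float(scores[best]):.2f})")
```

The important point is that the comparison happens in embedding space, so “Imaging lead” can land closest to “Radiographer supervisor” even though the two titles share no words.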

Watch this space for more on this topic, and while you’re here, why not read our article on ChatGPT here.

 

By Neil Stenton, Research Engineer 

Hybrid Working - The Pros and Cons

20/03/2023

I was browsing Hacker News and reading the reactions to GitHub's office closures (warning: this is a comments section!), and it was interesting to see the problem mulled over from lots of viewpoints:

  • Those that make friends at work 

  • Those that think remote-only workers will have less job security

  • Those that have become contractors and say there never was any job security 

  • Those that think this is the new offshoring/outsourcing initiative 

  • Those saying they now have more employment options

  • Those saying there is going to be higher demand for good talent

  • Those that say it improves retention 

  • Those that say it damages retention 

  • Those that say it’s going to settle on hybrid 

  • Those that say it’s going to settle on fully remote 

  • Those that say it doesn't help junior roles learn the ropes quickly 

  • Those that say supporting juniors remotely over long periods is easier now

  • Those worried this is reminiscent of the .com bubble bursting 

  • Those laughing at people that weren't working during a "real recession" 

I was watching a couple of Gartner talks about remote/hybrid management the other day, and two things stood out for me. The first was about how people tend to bias toward those they can see working, and the potential damage this does to the performance evaluations of remote workers, whom they don’t see working. The second showed that remote workers wanted more feedback and recognition from a wider variety of people, including their manager’s manager and peers from other teams.

So honestly, I have no idea what to think, but I'm still leaning toward being in an office. I know I’m personally more engaged and receptive in offline meetings than online ones. I've yet to find something online that replaces people talking around a whiteboard, a notebook, or the back of a napkin. And I love chewing over ideas and assumptions with other people.

 

 By Joe Norley, Research Engineer 

Four-day working week trial success

20/03/2023

Results from a UK-wide trial of a four-day week have recently been released. The six-month scheme, which involved 61 companies and 2,900 employees from across the UK, took place between June and December 2022.

The idea behind the trial is that our current working week is outdated, with many roles simply filling time. Through incentives and efficiency improvements, such as removing pointless meetings, it is thought that the length of the working week can be significantly reduced.

Each company in the trial implemented its own version of a four-day working week that had to follow one key principle: significantly reduce the amount of time employees spent working while maintaining the same level of productivity with no reduction in pay. 

 

The trial results 

In general, employers saw no decrease in output or income, with many reporting overall increases in productivity. The key benefit for employers, however, appears to come from employee wellbeing. This is seen particularly in a 57% reduction in attrition between the start and end of the trial period. 92% of companies plan to keep these new working practices in place.

Employees saw an average reduction of four hours in their working week (from 38 hours to 34). This is not quite the desired four-day week but is still a significant reduction in time spent working. Interestingly, a minority of employees (15%) reported an increase in their work time. However, employees did see significant improvements in wellbeing, reporting reductions in stress and burnout and an increase in job satisfaction. Unsurprisingly, 96% of employees wish to continue with the scheme.

 

The report states: 

“The benefits of a shorter working week for no reduction in pay are now both well-known and well-evidenced: employees are happier and healthier, and the organisations they work for are often more productive, more efficient, and retain their staff more readily.” 

 

Our take 

We are very interested to see how the four-day week trend continues to grow. These results show clear, measurable benefits to both employer and employee. I particularly like that the study let each company try its own version of a four-day week; this shows its potential to work in a variety of settings, especially considering the wide range of industries in the trial. How these results would transfer to other companies remains to be seen: only 22% of the companies involved had more than 50 employees, and larger companies may find it harder to implement such a radical change. Overall, at a time when retaining employees is so important and attracting new talent can be a challenge, radical changes such as this may become a more enticing option for employers.

 

By Chris Judd, Data Scientist 
