Why many companies have a hard time making use of machine learning…

…and what they can do to get on the right path

Burkhard Schwab
DataDrivenInvestor

Over the last year, I have been working intensively with stakeholders in companies that are interested in using their data to improve operations, make sounder decisions or even enter a market and monetize their data. That’s great, because leaders in these fields are often 3 to 5 years ahead of the more conservative enterprises that are only now waking up to the opportunities presented by becoming more data-driven.

In these environments, machine learning and AI are often understood as silver bullets that will make things more profitable or solve some previously unmanageable situation. Of course, this is exactly how it has been sold in recent years, so it’s hard to blame people for believing it. What few people say, however, is that there is a significant amount of work involved before it is possible to use AI in a meaningful, value-generating way.

Many enterprises simply can’t do that right now, and it has been an important part of my work to explain why they won’t be able to do so for a while and what a path to AI readiness can look like.

Let’s look at the three most common reasons why AI can only be the last step, not the first:

#1 Unclear goals

When we try to identify the business priorities of a client, we tend to start with a couple of guiding questions. Often, the question

“What is the core issue that you want to solve with machine learning?”

simply can’t be answered at this point. This indicates that the goals of the project haven’t been properly established. It also indicates that there isn’t a solid data strategy on which the business case for using machine learning can be built.

The lack of a strategy and of a defined set of business questions to address are the main reasons why most companies set themselves up for failure with their data-centered projects. It’s hard to emphasize this enough: Becoming data-driven and using machine learning is a large-scale transformation project. It entails changing business processes, business technologies and applications, and the very way people deal with data and solve their daily tasks. Beginning the journey with clear questions and goals helps to keep the effort focused from the start.

In the first step, it’s therefore important to sit down and find out what the business priorities are. Questions can often quickly be found and sharpened in a workshop with stakeholders. The most important factor is to bring together participants who are able to straddle the worlds of business and technology. Usually, it is simple to find people from either side of the aisle. Business leaders are usually experts in their field. Technical stakeholders know their way around the most important technologies. But finding a way to make them talk with each other in a meaningful manner and with enough perspective for strategic business questions is often a hard problem.

Once this issue is solved, the ideation process should lead to a set of questions. Prioritizing these by business impact yields a short list of the most important ones. Often, questions revolve around customers, products and production methods. Addressing them means finding data, cleaning it and transforming it into information, which brings us to our next point.

#2 Low data quality and availability

Low data quality was one of the main issues for all companies I have visited so far. While these companies had been collecting data for years — at least since the big data hype — their data was just not able to answer the questions that were most pressing. One large problem was that their data often wasn’t structured in the right way to do analysis with it. And as data scientists, statisticians and machine learning engineers around the world know: If you cannot analyze and understand your data, you surely won’t be able to build models on top of it.

For these companies, the first steps were fairly simple ones, even though they needed considerable effort: Find data adapted to your business questions, restructure the data, get it out of silos, focus collection on your business questions and start analyzing it.

Getting the data out of the silos also addresses the second most common problem under the headline of data quality and availability. Legacy systems around the world have been great at storing and collecting data, but not very good at linking valuable data to generate insights or even making it available to the company as a whole. Generally, that’s understandable as it wasn’t the main focus. Usually, data was used to understand the past, i.e. for controlling and monitoring purposes, and data from one business unit wasn’t used directly by another business unit.

Data availability comes with a whole set of new questions most company leaders do not anticipate when they ask for a machine learning solution for their problems. What, for example, about data governance, i.e. who is responsible for the collection, the quality checks, the continuous availability and the possible deletion (for example for regulatory reasons) of data? Are the right processes in place to ensure continuous data quality? Are your collection efforts targeted? Are they automated? Truly tech-savvy companies have pipelines in place that take data from the point of generation through a cleaning and homogenization process into a data warehouse that is available to all interested parties within the company.
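To make the idea of such a pipeline a bit more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records, the field names (`customer_id`, `order_date`, `revenue`), the two date formats and the `orders` table are hypothetical, and an in-memory SQLite database merely stands in for a real data warehouse. It only shows the general shape of "clean, homogenize, load, and count what was rejected", not how any particular company does it.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw records, e.g. exported from two business units
# that use different date and number formats.
RAW_RECORDS = [
    {"customer_id": " 17 ", "order_date": "2019-03-01", "revenue": "120.50"},
    {"customer_id": "18", "order_date": "01.03.2019", "revenue": "99,90"},
    {"customer_id": "", "order_date": "2019-03-02", "revenue": "10.00"},    # missing id
    {"customer_id": "19", "order_date": "not a date", "revenue": "45.00"},  # unparseable date
]

# Date formats we assume to occur in the source systems.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y")


def parse_date(text):
    """Return the date as an ISO string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(record):
    """Homogenize one record; return None if it fails the quality checks."""
    customer_id = record["customer_id"].strip()
    order_date = parse_date(record["order_date"])
    if not customer_id.isdigit() or order_date is None:
        return None
    try:
        # Accept both "99.90" and "99,90" as decimal notation.
        revenue = float(record["revenue"].replace(",", "."))
    except ValueError:
        return None
    return (int(customer_id), order_date, revenue)


def run_pipeline(records, connection):
    """Clean all records, load the valid ones and report simple quality counts."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(customer_id INTEGER, order_date TEXT, revenue REAL)"
    )
    cleaned = [row for row in map(clean, records) if row is not None]
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", cleaned)
    connection.commit()
    return {"loaded": len(cleaned), "rejected": len(records) - len(cleaned)}


# SQLite in memory stands in for the warehouse in this sketch.
connection = sqlite3.connect(":memory:")
report = run_pipeline(RAW_RECORDS, connection)
print(report)
```

The count of rejected records is the interesting part from a governance perspective: someone has to own that number, decide what an acceptable level is, and follow up with the source system when it rises.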

A well-formulated data strategy can help with these issues and can enable companies to get their data up to speed. But the truly decisive factor here is business leader awareness. Without the awareness that machine learning needs a supply chain, just like any other product, it cannot get off the ground.

A few companies simply had very little data to work with. These were often smaller players that hadn’t yet understood the need to collect data. In some ways, these companies are in a good situation, as they essentially start with a blank slate. They can start with pain points and unanswered business questions and make a targeted effort to collect the data they need to answer these questions. Along the way they can build competencies and a culture that values data-driven decision-making.

#3 Missing competencies and underdeveloped culture

Everyone knows that machine learning is all the rage right now. Businesses are tripping over each other to hire people with fancy-sounding titles. What they don’t know is that machine learning and even data analysis aren’t magic pills for business problems, but that they require a careful realignment of business culture and decision-making processes.

Companies that operate top-down, with strategic decisions being handed down from management based on “intuition” and limited use of data (caged in presentations and spreadsheets) have a hard time truly making use of data. A culture change is one of the first steps I recommend when starting a data analysis and machine learning effort. This is usually slow and painful.

For some leaders, it’s very hard to accept that to get truly data-driven decisions, they will have to let the data speak for itself. A lack of willingness to participate in such a decision-making process can and will disrupt their efforts to use data.

That isn’t just a management problem, however. “Historical” rivalries between business units, change aversion, and simple information gaps can all lead to a disruption of efforts to use data. Having a sound change management strategy is important to make sure people understand why decisions should be handled differently, why data has to be collected, stored and maintained and why cooperation between business units is key to making good business decisions.

Building such a culture takes time, but can be very rewarding. Customers notice when businesses operate on a sound information basis, and so do new employees. Since analysis and machine learning are largely driven by people with very specialized competency portfolios that are in very high demand, you want to be able to attract and retain these people. This won’t happen in a business that doesn’t value data as an asset — incidentally a reason for the large churn in new data science departments. Culture generates and helps to retain important competencies.

Focusing on strategy, culture and the right data will help lead businesses from their historical operating model to becoming data-driven and eventually AI-driven. Business leaders should make sure that they have defined business priorities before they start a technical transformation to address these priorities. Only then can success be guaranteed. Leaders in data-driven decision-making and the use of machine learning and AI showcase the benefits that can be gained, but they also show the path that must be taken to get to this point.

The advantage of being in the second row is that one doesn’t have to make all the same mistakes as the trailblazers.

Ph.D. in theoretical physics — Data Scientist — technology enthusiast — voracious reader — staunch citizen of the world (with a European bias…)