Imperative 2: Define your matching dimensions
Common wisdom dictates that use cases should start with a business problem/opportunity and work back to the data required to solve it. With AI, it’s more “chicken and egg”: sometimes you start with a business problem/opportunity and sometimes with a dataset. A common and elastic technology backbone is important but should never be the starting point. The content of a business problem/opportunity is often narrow, and the context matters. Good business problem definitions need to be specific, relevant, objective, and quantifiable (because, with AI, data will be at the core of the solution).
For example, a healthcare executive described a use case where the early business problem was defined as: “We want to leverage AI to make our hospital admission process more efficient.” The executive admitted that such a definition was unlikely to get the company far, as it had no specific problem area, context, success metrics, or indication of the data source. The company iterated on the definition and restated it as: “We want to lower the rate of patient readmission by identifying individuals at high risk of returning within 30 days and ensuring proper follow-up care, with the objective of reducing readmission rates by 10% and improving patient outcomes.” This redefinition started a fruitful matching exercise. The team began by looking at electronic health records, patient demographics, treatment plans, and historical readmission data, and then applied an AI model on top of those datasets.
Existing or accessible datasets can also be a good starting point. AI can uncover useful patterns in those datasets, and these patterns can lead to hypotheses or insights into a business problem/opportunity.
For instance, a credit card company was looking at potential applications of AI in credit card fraud. The company applied unsupervised machine learning to large volumes of transaction logs without a pre-defined question (context). The AI system uncovered a cluster of transactions originating from different merchant categories and regions that consistently showed suspicious timing anomalies and unusual card usage sequences (pattern identification). The pattern suggested the potential existence of a coordinated fraud ring operating across multiple merchants (hypothesis/insight). From this data insight, the company was able to define a use case and develop a targeted fraud detection model to proactively flag and block these sophisticated attack vectors.
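To make the idea of exploring data without a pre-defined question more concrete, here is a minimal sketch in Python. The source does not say which technique the credit card company used, so this swaps in a simple robust outlier test (a modified z-score based on the median absolute deviation) in place of whatever unsupervised model was actually applied. The function name, the 3.5 threshold, and the sample data are all hypothetical and purely illustrative.

```python
from statistics import median


def flag_timing_anomalies(gaps_in_seconds, threshold=3.5):
    """Return the indices of time gaps that look unusual.

    Uses a modified z-score built on the median absolute deviation (MAD).
    This is an illustrative stand-in, not the company's actual method.
    """
    med = median(gaps_in_seconds)
    mad = median(abs(g - med) for g in gaps_in_seconds)
    if mad == 0:
        # All gaps are (nearly) identical, so nothing stands out.
        return []
    return [
        i
        for i, g in enumerate(gaps_in_seconds)
        if 0.6745 * abs(g - med) / mad > threshold
    ]


# Hypothetical data: seconds between consecutive uses of the same card.
gaps = [3600, 4200, 3900, 5, 4100, 3800, 4, 4000]
suspicious = flag_timing_anomalies(gaps)
print(suspicious)  # indices 3 and 6: two uses only seconds apart
```

No fraud question is encoded in the function. The unusual timing emerges from the data itself, and it is then up to the business side to turn that pattern into a hypothesis and, eventually, a use case.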
Unfortunately, matching datasets and business problems/opportunities does not work like matching individuals on a dating site. Both the business side and the data side are dynamic and evolve. Datasets are not static: they exhibit complementarities, where the value of the data increases when it is combined with other data that has complementary attributes (e.g., the weather conditions in which a machine is used). Equally, business problems/opportunities evolve as economic conditions, markets, competition, and customer needs and behaviors change (e.g., a growing number of health-conscious consumers seeking organic foods, transparency in sourcing, nutritional data, and sustainable production methods).
In addition, both datasets and business problems/opportunities have knowns and unknowns that will need to be identified. For example, a dataset covers a specific timeframe, but data patterns might change if we look further back into our historical archives. A business opportunity might be based on today’s privacy regulatory environment, but regulatory changes may affect its feasibility.
So, under those circumstances, how do we start the matching exercise?
First, the business problem and the data need to be assessed and qualified regardless of the starting point. The key criteria for a business problem/opportunity are its feasibility (Can we deliver the outcome?) and its impact (Will the outcome substantially affect performance?). The key criteria for the data side are findability (Can we find the reliable data needed to effectively inform the decision/action?) and accessibility (Can we economically access the data?). Second, you will need to iterate to properly qualify your matching dimensions. For example, could the feasibility of a business problem be improved if we were to change or adapt our workflow? Or could we find proxy data or public data that will inform the decision with a sufficiently high level of confidence?
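The two-step approach above (qualify, then iterate) can be sketched as a simple scorecard. This is only an illustration under stated assumptions: the 1-to-5 scale, the minimum score of 3, the class name `MatchCandidate`, and the example scores are all hypothetical and do not come from the healthcare case.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MatchCandidate:
    """One business problem/dataset pairing, scored 1 (weak) to 5 (strong)."""
    name: str
    feasibility: int    # business side: can we deliver the outcome?
    impact: int         # business side: will it substantially affect performance?
    findability: int    # data side: can we find reliable data?
    accessibility: int  # data side: can we economically access the data?

    def scores(self):
        return {
            "feasibility": self.feasibility,
            "impact": self.impact,
            "findability": self.findability,
            "accessibility": self.accessibility,
        }

    def is_qualified(self, minimum=3):
        # Hypothetical rule: every dimension must reach the minimum score.
        return all(score >= minimum for score in self.scores().values())

    def weakest_dimension(self):
        # The dimension to work on in the next iteration.
        s = self.scores()
        return min(s, key=s.get)


# Step 1: assess and qualify (illustrative scores only).
candidate = MatchCandidate(
    "30-day readmission risk",
    feasibility=4, impact=5, findability=4, accessibility=2,
)
print(candidate.is_qualified(), candidate.weakest_dimension())

# Step 2: iterate, e.g. after finding proxy or public data that is cheaper to access.
revised = replace(candidate, accessibility=3)
print(revised.is_qualified())
```

Requiring every dimension to clear the bar, rather than averaging the scores, reflects the point that a high-impact problem is of little use if the data cannot be found or accessed economically. Other weighting schemes are equally plausible.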
A deep understanding of your matching dimensions is critical to setting up the use case and will increase your chances of success.