The cold-start problem is commonplace amongst Applied AI startups
There are four key strategies that can help tackle this
The outcome is the flywheel effect, which drives value to your product, your customers and ultimately your business
What is the cold-start problem?
The cold-start problem is a typical challenge amongst Applied AI startups. As the disruptive potential of Applied AI becomes widely recognised, startups face the challenge of building their value proposition with a limited amount of data.
The more data that is collected, the more useful the service becomes - this is known as the “data network effect”. More data gives any Applied AI business stronger powers of prediction, recommendation, translation and decision analysis.
Take SwiftKey - an intelligent predictive keyboard - as an example: the more words and sentences that are written, the smarter the technology becomes. This in turn creates a better product and drives wider adoption.
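To make the idea concrete, here is a minimal sketch of a next-word predictor that gets better as it sees more text. It is a toy bigram counter written for illustration only; the class name NextWordPredictor and the sample sentences are assumptions, and it is not a description of how SwiftKey actually works.

```python
from collections import Counter, defaultdict


class NextWordPredictor:
    """Toy bigram model (hypothetical): suggests the word seen most often after a given word."""

    def __init__(self):
        # Maps each word to a count of the words observed directly after it.
        self.following = defaultdict(Counter)

    def learn(self, sentence):
        # Count every adjacent word pair in the sentence.
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            self.following[current][nxt] += 1

    def predict(self, word):
        # Return the most frequent follower, or None if the word has never been seen.
        counts = self.following.get(word.lower())
        if not counts:
            return None
        return counts.most_common(1)[0][0]


predictor = NextWordPredictor()
print(predictor.predict("good"))  # None: the cold start, no data yet

predictor.learn("good morning team")
print(predictor.predict("good"))  # morning

for sentence in ["good luck today", "good luck with the launch"]:
    predictor.learn(sentence)
print(predictor.predict("good"))  # luck
```

With no data the model has nothing to suggest; each additional sentence shifts its suggestion towards what users actually type, which is the data network effect in miniature.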
It’s not easy to get started, but there are four strategies for building a data network effect, which can all be used simultaneously. Each one tackles the cold-start problem differently:
Using Dummy Data
Create theoretical scenarios by feeding dummy data into the system. A good example would be virtual assistant apps such as Magic: the company created a few data scenarios that would interest its customer base to jump-start its growth. While this represents a ‘quick win’ for a lot of startups, it is important not to mistake these scenarios for reality. The sooner you can start to test your models on live customer data, the better.
Crowdsourcing from customers
Another approach is to actively collect data from your customers as a function of your product or service offering. As a data-savvy startup, you could be ingesting millions of unstructured documents from client records. It is therefore important to set up your operating model and data/analytics platform to collect this data efficiently and accurately, so data capture and architecture are key. Facebook is a great example of crowdsourcing: after a decade of adding data fields to user profiles, it can use that data to build Applied AI applications (such as targeted advertising). It is important here to provide incentives for users/customers to share their data and to make data entry as enjoyable as possible. For instance, ScentBird in the US collects consumer preferences using a fun, visual survey so that it can recommend exciting new fragrances.
Building strategic partnerships
Another approach is to partner with businesses that have access to proprietary data but don’t have the skills or know-how to generate insights. For example, startups working on drug discovery are building strategic partnerships with pharmaceutical companies to analyse those companies’ data and recommend the antigens required for disease prevention, on the basis that they can do so faster, cheaper and with greater accuracy thanks to more robust machine learning algorithms.
Scraping publicly available data
While it may be less defensible as a startup strategy, scraping public forums, the internet and public databases to gather data can be a powerful tool in a startup’s data armoury. A good example could be collecting data on airline pricing to build smarter technology for price predictions in the airline industry. Similarly, SwiftKey crawled the web for words and sentences to better predict user typing behaviour.
It is worth bearing in mind that recent compliance changes resulting from the EU General Data Protection Regulation (GDPR), which comes into force next year, are making scraping increasingly difficult. Services such as Amazon’s Mechanical Turk offer a way around this: rather than scraping the data yourself, you can put Amazon’s platform to work on similar tasks, provided they have signed the necessary compliance documents.
What happens once you’ve solved it?
Once you’ve solved the cold-start problem using one or a combination of these strategies, you can tap into the “data network effect”. This is a similar concept to the traditional “network effect” that we’ve seen with countless platform, e-commerce and marketplace companies that we use every day.
Just as rapid growth in the number of users adopting a platform makes that platform more valuable, an increase in the number of data inputs into an Applied AI startup can lead to a virtuous circle. The ‘freshness’ and accuracy of these data assets are also crucial. For instance, Applied AI “lead-gen” startups are increasingly able to take on incumbents such as InsideView and Data.com, who have fallen down on data acquisition and allowed their data assets to degrade to the point of redundancy.
As you increase the volume and quality of data, your models can improve, offering a more robust and effective product/service and thus enhancing customer experience. Although tough to crack from a cold start, this “flywheel” effect generates its own momentum and growth for your startup.
Further reading: Matt Turck’s post on the power of data networks, and Moritz Mueller-Freitag’s detailed post on data acquisition strategy.