Ensuring your AI Projects are Successful

Achieving enterprise-wide AI is only possible if you eliminate barriers to data availability.
Ten years ago we were talking about how to cope with the four or five ‘V’s of Big Data; about how 80% of Big Data projects fail, most of them at the data onboarding stage. Many of Gemini’s staff lived through that era and were actively part of fixing the problem. But that was yesterday’s problem.
Today’s problem is how we leverage that data to get valuable insights. After all, a large repository of data is useless unless we can get some answers from it.
Enter Artificial Intelligence (AI). Some are calling it the Fourth Industrial Revolution, so large is its expected impact. Yet you rarely hear companies shouting about their successful AI initiatives, and even organisations that say they are using AI are often not getting tangible benefits.
So what are the barriers to success with AI?
Infrastructure
AI today is mostly machine learning (ML), and ML needs to do a great deal of extra computing over large amounts of data to learn patterns. The mathematics of machine learning and of image manipulation both reduce to the same large matrix operations, which is why this extra computing has traditionally been done on GPUs. That, in turn, means choosing different hardware, with more GPUs. More recently, the emergence of application-specific integrated circuits (ASICs), especially AI accelerators, has moved things along even further. In summary: the old hardware you are familiar with is no longer suitable, and to succeed with AI you need to understand how to specify the newer hardware. The cloud alone doesn’t fix this challenge, since you still need a fundamental understanding of what the underlying hardware should look like.
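To make this concrete, here is a minimal sketch of why the hardware choice surfaces in everyday ML code. It assumes PyTorch, which the article does not name; it is one of many frameworks that follow this pattern. The same training step runs on a CPU or a GPU, but throughput differs enormously, and the code must explicitly place the model and data on the chosen device:

```python
# Minimal sketch (assumes PyTorch; the article names no framework).
# The same training step runs on CPU or GPU -- the hardware choice
# determines throughput, which is why it matters for ML workloads.
import torch
import torch.nn as nn

# Pick the best available accelerator: a CUDA GPU if present, else the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# One training step on synthetic data; real workloads repeat this loop
# millions of times, which is where GPU/ASIC throughput pays off.
x = torch.randn(64, 512, device=device)
y = torch.randint(0, 10, (64,), device=device)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(f"device={device}, loss={loss.item():.4f}")
```

The point is not the framework but the pattern: the work is dominated by matrix multiplications, so the accelerator you provision dictates how long training takes.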
Data Availability
However you use ML, you need data. Training an ML algorithm typically needs a lot of it, and for smaller ML projects the blocker is often exactly that: training data simply isn’t available. The opposite problem is having massive amounts of data stuck in a silo and difficult to reach in any useful way. This issue affects nearly every organisation: disparate silos full of data that isn’t usable. The revolutionary problems AI could solve are held back by the lack of simple access to the relevant data sets.
Replication
The answer to these first two problems, and to the subsequent one of actually performing ML, is often to adopt Vendor “X”’s proprietary solution: “copy all of your data from your other silos into my silo and we can help you do better things”. The trouble with this approach is that it is itself part of the Big Data problem. How many copies of the same data exist, in how many different silos, each promising to add incremental or distinct value? More data silos mean more storage and licence costs. This is surely not the answer to the problem?
Gemini Enterprise
At Gemini Data, we took a different approach to the problem. We already have a strong track record of fixing the infrastructure problems described above with our Manage solution. We combined this with years of experience around Big Data platforms and cloud offerings and created something new that we call Autonomous Data Cloud (ADC). ADC is at the heart of Gemini Enterprise.
Gemini Manage takes care of all the infrastructure needs, choosing the right type of hardware for the job and then hosting it so you don’t have to.
Gemini Care ensures consistent uptime for the platform, guaranteeing reliability. ADC is our zero-copy data pipeline: it converges data from all of your silos and makes it available for query when you need it, whether that’s using SQL, R, Python, or your favourite AI technology.
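As an illustration only, the sketch below shows the kind of access this describes: one standard SQL query spanning data that physically lives in separate silos, with no copies landing in a new silo first. The connection string, schema names, and tables are hypothetical; the article does not document ADC’s actual interface, so this uses the generic pandas/SQLAlchemy pattern as a stand-in:

```python
# Illustrative sketch only: the endpoint, schemas, and table names are
# hypothetical -- the article does not document ADC's actual interface.
# The pattern (plain SQL over one connection, no data copied into a new
# silo) is the style of access the paragraph above describes.
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical endpoint for a converged data layer.
engine = create_engine("postgresql://analyst@adc.example.com/converged")

# One SQL query joining data that lives in two separate silos.
df = pd.read_sql(
    """
    SELECT c.region, SUM(o.amount) AS revenue
    FROM crm_silo.customers AS c
    JOIN erp_silo.orders AS o ON o.customer_id = c.id
    GROUP BY c.region
    """,
    engine,
)
print(df.head())
```

The design point is that the analyst writes ordinary SQL (or R, or Python) against a single logical view, rather than first replicating each silo’s contents into yet another repository.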
In summary, the Gemini Enterprise platform provides everything you need to do AI-enabled analysis and gain that competitive edge. It does this by leveraging your existing data silos whilst providing hosted data solutions where needed, and then enabling query and analysis across all of your data.
Written by Ian Tinney on behalf of Gemini Data. Gemini Data will be exhibiting at the AI & Big Data London Expo on 25-26 April 2019, and is also a sponsor of the ‘Data Analytics for AI & IoT’ track.