Big Data and Technology Today: Are We Ready for AI Yet?

 

The State of Data

Successful AI requires two basic components: 1) massive amounts of data and 2) data science, i.e., algorithms that can extract knowledge from the data. So where do we stand today with these two requirements?

Requirement #1: We Have Massive Data

In contrast to internal or proprietary company data, Open Data initiatives are organized worldwide by private, public, and government entities to make specific datasets available for anyone to use, reuse, or redistribute. The goal of an Open Data project is for developers to build new and valuable products on the data; those products create demand for more data, which sets up a feedback loop that improves the data’s quantity, quality, and use. Essentially, if you can say it, do it, photograph it, or measure it, someone is likely collecting data about it. With so much knowledge to be gleaned from all the data in the world, it might be surprising to hear that, on average, only 1% of a company’s data is accessible for use.

[Figure: Data today and in the future]

Examples of Data Categories

 

Data Types

Structured: Quantitative, tabular data with clearly defined columns and rows. Ex: names, dates, geolocation, credit card numbers, stock information
Unstructured: Raw data in any format that could contain anything. Ex: images, videos, .pdf documents, social media comments, transcriptions
Semi-structured: Data that is not organized into strictly defined rows and columns but still conforms to some structure. Ex: email, HTML, NoSQL databases, CSV, XML, JSON documents
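
To make the three data types concrete, here is a minimal Python sketch. All sample values, the `purchases` table, and the variable names are hypothetical and chosen only for illustration. The sketch assumes an in-memory SQLite table stands in for structured data, a JSON document for semi-structured data, and a free-text comment for unstructured data.

```python
import json
import sqlite3

# Structured: rows and columns are defined up front (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (name TEXT, purchase_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [("Ada", "2021-03-01", 19.99), ("Grace", "2021-03-02", 5.00)],
)
# Because the schema is known, aggregation is a single query.
total = conn.execute("SELECT SUM(amount) FROM purchases").fetchone()[0]

# Semi-structured: records are self-describing, but fields may differ per record.
semi_structured = (
    '[{"user": "ada", "tags": ["ai"]},'
    ' {"user": "grace", "location": {"city": "NYC"}}]'
)
records = json.loads(semi_structured)
# Collect every top-level key that appears in at least one record.
keys_seen = sorted({key for record in records for key in record})

# Unstructured: free text with no schema; any meaning must be extracted.
comment = "Loved the product! Shipping was slow, though."
word_count = len(comment.split())

print(total, keys_seen, word_count)
```

The structured data can be aggregated directly, the semi-structured records must first be inspected to learn which fields exist, and the unstructured comment yields only surface features such as a word count until further analysis is applied.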

Data collection methods

Batch, streaming, and real time

Data collection sources

[Figure: types of data collection sources]

Requirement #2: We Have Good Data Science

In the 1960s, mathematicians formally recognized the significant value of accessing, understanding, and extracting meaning from data. The field of data science is rooted in this realization and began as data analytics. Over the years, analytics has grown and evolved, and AI was born from this evolution. The critical difference is that, unlike traditional data analytics, the machine learning algorithms behind AI can test, iterate, and learn autonomously.

Evolution of Data Analytics:

[Figure: evolution of data analytics]

The data science industry is brimming with thousands of companies that offer enterprise customers AI capabilities, ranging from niche open-source technologies to one-stop AI shops. As a result, AI is the fastest-growing software market in history and will facilitate more than $13 trillion annually, according to McKinsey’s “The State of AI in 2020” report.

[Figure: a screenshot featuring hundreds of AI company logos]

These logos are a random selection of only a fraction of the companies offering products and services in the data and AI industry. For a truly overwhelming view of hundreds more, broken down by category and accompanied by analysis, see Matt Turck’s yearly data and AI ecosystem assessment in his “State of the Union” Report.

 

Get our complete Feature-First Field Guide to continue reading. 

 
