Unit.1 | Big Data

The Technologies Behind Big Data Analytics

Chapter 03/07

Snapshot

Discover how big data works, with a focus on storing, aggregating, combining and analyzing collected data. Learn how big data powers our everyday lives, from cloud computing to machine learning.

Key Terms:

  • Big data
  • Cloud computing
  • Distributed computing
  • Deep learning
  • Natural language processing
  • Machine learning

Your Discover Weekly playlist from Spotify, your mobile phone’s autocorrect feature, and the traffic prediction service on your preferred mapping app share a common thread: They’re all powered by big data analytics.

One way experts define big data analytics is as the ability to store, aggregate, and combine data, and then use the results to perform deep analysis.

In a nutshell, big data is focused on:

  • Storing data
  • Aggregating data
  • Combining data
  • Analyzing data

How Big Data Works

Big data’s power lies in the analysis of this stored and combined data to uncover useful insights. Because of big data technologies, your favorite mapping app can tell you how many cars are on the road right now, which streets are under construction and how long it will take you to drive from your house to the grocery store. This is useful information that makes life easier.
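To make the four steps concrete, here is a minimal sketch in Python of a toy traffic pipeline. The street names, speed readings, roadwork reports and the 30 km/h threshold are all made-up assumptions for illustration; a real mapping service works at a vastly larger scale and with far more sophisticated methods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records from two made-up sources (illustrative only).
speed_readings = [
    {"street": "Main St", "speed_kmh": 30},
    {"street": "Main St", "speed_kmh": 20},
    {"street": "Oak Ave", "speed_kmh": 50},
]
roadwork_reports = [{"street": "Main St", "under_construction": True}]

# 1. Store: keep the raw records in a simple in-memory "database".
store = {"speeds": list(speed_readings), "roadwork": list(roadwork_reports)}

# 2. Aggregate: summarize many readings into one average per street.
by_street = defaultdict(list)
for reading in store["speeds"]:
    by_street[reading["street"]].append(reading["speed_kmh"])
average_speed = {street: mean(values) for street, values in by_street.items()}

# 3. Combine: join the averages with the roadwork reports.
construction = {r["street"] for r in store["roadwork"] if r["under_construction"]}
combined = {
    street: {"avg_speed_kmh": avg, "under_construction": street in construction}
    for street, avg in average_speed.items()
}

# 4. Analyze: flag streets that are slow (assumed threshold) or under construction.
streets_to_avoid = sorted(
    street
    for street, info in combined.items()
    if info["avg_speed_kmh"] < 30 or info["under_construction"]
)
print(streets_to_avoid)
```

Each step feeds the next: the analysis at the end is only possible because the data was first stored, summarized and joined.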

Two technicians handling massive coils of fiber optic cables.

Fiber-optic cables are similar to electrical cables, but they carry light instead of electricity.

The tools and technologies that have made the benefits of big data accessible to everyone include both hardware and software innovations. On the hardware side, communities and individuals around the world are connecting devices and services via fiber-optic, copper, wireless and even low-speed “mesh” networks that use radio signals. High-capacity solid-state drives, which have no moving parts, enable computers to store and retrieve digital data far faster than traditional spinning hard disks.

With these hardware advances, we can now cheaply and easily store massive amounts of data. On the software side of the equation, data analytics programs perform the work of harvesting, curating and interpreting massive amounts of data. This is what makes the analysis of huge data sets possible.

A Closer Look at Five Technology Innovations

The growth and adoption of certain technologies have helped solidify big data’s presence in everyday life. Cloud computing, distributed computing, machine learning, natural language processing and deep learning are a few of the byproducts of the big data revolution. These technologies have advanced business and industry, increased social and personal engagement, and established new and exciting ways for people and organizations to connect and operate at every scale, from the local to the national and international. Although we may not even realize it, we encounter applications of these innovations constantly.

Distributed Computing

A distributed computing system uses software to coordinate tasks that are performed on multiple computers simultaneously. The computers in a distributed system interact to achieve a common goal. For example, when you use the Google search engine to find information online, Google draws upon a vast network of computers located all over the world to locate and provide relevant answers within milliseconds. To take advantage of the processing power and storage capacity offered by distributed computing, scientists have developed innovative databases that can be easily scaled up to handle even the largest and most complex streams of data.
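The idea of coordinating many computers toward a common goal can be sketched in a few lines of Python. In this simplified illustration, threads on one machine stand in for separate computers, and the function names (count_words, distributed_word_count) and sample text are hypothetical. It is not how any particular search engine works; it only shows the pattern of splitting work, running it in parallel and merging the results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Work done by one 'computer': count the words in its share of the data."""
    return Counter(chunk.lower().split())


def distributed_word_count(chunks, workers=3):
    """Coordinate the workers, then merge their partial results."""
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical data, already split into three shares.
chunks = ["big data is big", "data is everywhere", "big ideas need data"]
totals = distributed_word_count(chunks)
print(totals["big"], totals["data"])
```

Because each worker only needs its own share of the data, adding more workers lets the same program handle more data, which is the sense in which such systems "scale up."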

Cloud Computing


Cloud computing makes it possible to store information on a distant hard drive (i.e., "in the cloud") and then retrieve it on demand from any internet-connected computer, tablet, smartphone or other digital device. Google Drive, Dropbox and Office 365 are some of the most common consumer products that leverage cloud computing, but the technology has become vital to all aspects of our connected lives. Cloud computing and distributed computing go hand in hand, and the two innovations have collectively revolutionized data storage and large-scale data processing.
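As a rough illustration of "store it once, retrieve it anywhere," the Python sketch below uses an in-memory object as a stand-in for remote storage. The CloudDrive and Device classes are hypothetical and do not correspond to the API of Google Drive, Dropbox or any other real service; a real cloud service would sit behind a network connection and require authentication.

```python
class CloudDrive:
    """Hypothetical stand-in for a remote storage service (not a real API)."""

    def __init__(self):
        # The files live "in the cloud", not on any one device.
        self._files = {}

    def upload(self, name, content):
        self._files[name] = content

    def download(self, name):
        return self._files[name]


class Device:
    """Any internet-connected device that can reach the shared drive."""

    def __init__(self, label, drive):
        self.label = label
        self.drive = drive


drive = CloudDrive()
laptop = Device("laptop", drive)
phone = Device("phone", drive)

# Save a file from one device, then read it back from another.
laptop.drive.upload("notes.txt", "Buy milk")
print(phone.drive.download("notes.txt"))
```

Neither device holds the file itself; both simply talk to the same shared store.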

Machine Learning

It’s hard to overlook how personalized our digital experiences are becoming. Amazon knows what you want to buy, Spotify knows what you want to hear and Gmail knows what you want to say. These services know these things because, increasingly, they know you through machine learning, a branch of artificial intelligence that enables computers to use algorithms to acquire skills usually associated with humans, such as recognizing objects and faces, identifying patterns and understanding language. Algorithms are sets of rules that help computers solve problems. As the rules have become more sophisticated, computers have become able to make sense of your human behavior and deliver a personalized output.
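One very simple way a program can find patterns in behavior is to count which items tend to be bought together. The Python sketch below does exactly that with made-up shopping baskets. It is a toy counting approach chosen for illustration, not the method any real retailer or streaming service uses, and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories (illustrative data only).
baskets = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"novel", "bookmark"},
    {"tent", "sleeping bag", "novel"},
]


def learn_co_purchases(baskets):
    """Count how often each pair of items appears in the same basket."""
    pair_counts = {}
    for basket in baskets:
        for item, other in permutations(basket, 2):
            pair_counts.setdefault(item, Counter())[other] += 1
    return pair_counts


def recommend(model, item):
    """Suggest the item most often bought together with `item`."""
    if item not in model:
        return None  # nothing learned about this item yet
    return model[item].most_common(1)[0][0]


model = learn_co_purchases(baskets)
print(recommend(model, "tent"))
```

The rules here are not written by hand; they come from the data. Feed the program different baskets and its recommendations change accordingly.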

Natural Language Processing

Boy and Grandfather speaking to a digital assistant.

Devices like this one use external servers to process language.

From binary to C++, HTML and JavaScript, humans have traditionally had to interact with computers in computers’ own languages. Natural language processing, a field of computer science that uses algorithms to interpret text and audio, is turning that conversation on its head. Natural language processing makes it possible for computers to understand your language and to speak it, too. When you ask Alexa to play a song by The Police, a request is sent to a remote server, where algorithms determine that you want to hear a song, not report a crime. The same technologies that power virtual assistants like Alexa also help run online chatbots and automatically create keyword tags from a string of text.
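To see how a program might tell a music request from a crime report, consider this deliberately simple Python sketch. It scores a request against hand-written lists of cue words, which are assumptions made up for this example. Real virtual assistants rely on statistical models trained on large amounts of speech and text, not a short keyword list like this one.

```python
import re

# Hypothetical cue words for two intents; real systems learn these from data.
INTENT_CUES = {
    "play_music": {"play", "song", "music", "listen"},
    "report_crime": {"report", "crime", "emergency", "stolen"},
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def detect_intent(text):
    """Pick the intent whose cue words overlap most with the request."""
    words = set(tokenize(text))
    scores = {intent: len(words & cues) for intent, cues in INTENT_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


print(detect_intent("Play a song by The Police"))
```

The word "police" appears in the request, but the surrounding words "play" and "song" are what settle the meaning. Context, not any single word, drives the decision.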

Deep Learning

It may be hard to believe, but the same technology that suggests which friends to tag in your latest Facebook image upload also allows a driverless car to navigate a street system without hitting a curb. Deep learning is an extension of machine learning that uses artificial neural networks with many layers, loosely modeled after the human brain, to accomplish complex tasks such as facial recognition and real-time language translation. Thanks to the network’s highly flexible architecture, computers can learn directly from raw data and become more accurate as they’re provided with more data.
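The layered structure of a neural network can be shown with a tiny Python sketch. This network has just one hidden layer, so it is not truly "deep," and its weights are hand-picked for illustration instead of being learned from data as they would be in practice. It computes the "exclusive or" of two inputs (the answer is 1 when exactly one input is 1), a task that a single layer cannot solve on its own.

```python
def relu(x):
    """A common activation function: pass positives through, zero out negatives."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One layer: weighted sums of the inputs, then an activation function."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (not learned) weights, chosen so the network computes XOR.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]


def network(x1, x2):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    output = dense(hidden, OUTPUT_W, OUTPUT_B, lambda v: v)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, network(a, b))
```

Deep learning systems stack many such layers and adjust millions of weights automatically during training, but each layer performs the same basic operation shown in dense.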

Big Data and What’s Next

Big data analytics is changing the world. The rapid collection and analysis of vast quantities of data is producing an extraordinary and rapidly growing body of knowledge. But without the innovations in software and hardware that support big data, the value of this new knowledge would be limited. Technologies like distributed computing, machine learning and natural language processing have the potential to unlock life-saving scientific discoveries, improve business and social engagement through innovation and empathy, and help us respond more aptly to human needs and our global natural environment.
