media update’s Aisling McCarthy analyses what Big Data is, and some of the challenges that have come with it.

The history of Big Data

Big Data refers to extremely large data sets, which can be analysed by computers to reveal patterns, trends, and associations. This analysis especially relates to human behaviour and interactions.

The term “Big Data” was added to the Oxford English Dictionary in 2013, but it does not describe something entirely new. As early as 1944, Wesleyan University librarian Fremont Rider referred to an “information explosion” in the Yale Library.

The idea of large data sets is nothing new, but as digital technology’s capabilities have grown, the amount of data being produced has grown exponentially.

In an article for Forbes, Lisa Arthur says that Big Data reflects the changing world we live in.

“The more things change, the more the changes are captured and recorded as data … Big data is a collection of data from traditional and digital sources inside, and outside, your company that represents a source for ongoing discovery and analysis.”

Big Data, big problems?

The problem with Big Data is that current systems are not built to deal with so many factors and so much information. This makes data capture and analysis harder than ever before, as the volume of information has increased tenfold.

According to Ernest Davis, professor of computer science at New York University, writing for the World Economic Forum, Big Data poses three major problems:

  1. Having more data is no substitute for having high-quality data;
  2. When people know that a data set is being used to make important decisions that will affect them, they have an incentive to tip the scales in their favour; and
  3. Privacy violations, because so much of the data now available contains personal information.

Davis suggests that even companies that can handle massive amounts of data face the problem of separating the useful data from the useless. Consequently, various forms of artificial intelligence have to be used to deal with these large data sets efficiently.

However, he says that these data sets can sometimes be skewed because proportionally representative data is hard to find.

Further, Davis says that using data results to gauge people’s performance can drive them to manipulate the system.

“When people know that a data set is being used to make important decisions that will affect them, they have an incentive to tip the scales in their favour. For example, teachers who are judged according to their students’ test scores may be more likely to ‘teach to the test’, or even to cheat.”

Privacy violations are also rampant because so much personal information can be gathered from online data.

“In recent years, enormous collections of confidential data have been stolen from commercial and government sites. Researchers have shown how people’s political opinions can be accurately gleaned from seemingly innocuous online postings, such as movie reviews – even when they are published pseudonymously.”

The good news is that the hazards of Big Data can be largely avoided. However, that is only the case if those who work with the data zealously protect people’s privacy, correct unfairness, and take algorithmic recommendations with a pinch of salt.

Big Data has some incredible applications, offering deep insights that cannot be accessed through smaller data pools. Although it does present some problems, mining the data can offer a solution. Read more in our article, “How data mining can solve the problems of Big Data”.