What is Big Data?
Big data is an expression that is thrown around liberally, yet many people do not understand what the term means. It essentially refers to large quantities of information, usually generated by digital and automated systems. Although such information has been around for decades, its volume has increased rapidly in recent years, while cataloguing and analysing it has become more feasible.
Just how big is Big Data?
Traditionally, data was structured and organised in databases, but this changed with the advent of the Internet and the digitally dominated world that followed. Now virtually everything that people do online can be recorded, stored, and analysed. Huge tech companies around the world do this in order to understand human behavior and to gain a competitive advantage in the market.
To give you the big picture, we generate 2.5 quintillion bytes of data every day, and the pace is accelerating with the growth of the Internet of Things (IoT).
People around the world perform 40,000 search queries every second on Google alone, which amounts to more than 1.2 trillion searches per year. Some 90 percent of the data in the world was generated over the last two years alone.
On social media, according to Domo’s Data Never Sleeps 5.0 report, every minute of the day users watch 4,146,600 YouTube videos, send 456,000 tweets on Twitter, and post 46,740 photos on Instagram. Every single minute.
More than a quarter of the world’s 7 billion humans are active on the largest social media network, Facebook, with 1.5 billion of them active daily. Five new Facebook profiles are created every second, and more than 300 million photos are uploaded every day.
These are staggering numbers, more than our minds can easily grasp. And these already mind-blowing stats are growing at an ever-increasing rate as more of our world becomes digitized.
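To put the rates quoted above on a common footing, the per-second and per-minute figures can be converted to daily and yearly totals with simple arithmetic. The following is only a back-of-the-envelope sketch: it assumes each rate is constant and that a year has 365 days, and the helper names are made up for illustration.

```python
# Back-of-the-envelope conversions for the rates quoted in this section.
# Assumptions (not from the source): rates are constant, a year has 365 days.

MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = MINUTES_PER_DAY * 60
DAYS_PER_YEAR = 365


def per_second_to_per_year(rate):
    """Scale a per-second rate to a yearly total."""
    return rate * SECONDS_PER_DAY * DAYS_PER_YEAR


def per_minute_to_per_day(rate):
    """Scale a per-minute rate to a daily total."""
    return rate * MINUTES_PER_DAY


# Google: 40,000 searches per second
google_searches_per_year = per_second_to_per_year(40_000)

# Domo per-minute figures
youtube_videos_per_day = per_minute_to_per_day(4_146_600)
tweets_per_day = per_minute_to_per_day(456_000)
instagram_photos_per_day = per_minute_to_per_day(46_740)

# Facebook: five new profiles per second
new_facebook_profiles_per_day = 5 * SECONDS_PER_DAY

# 2.5 quintillion bytes per day, expressed in exabytes (1 EB = 10**18 bytes)
daily_bytes = 25 * 10**17
daily_exabytes = daily_bytes / 10**18

print(f"Google searches per year:      {google_searches_per_year:,}")
print(f"YouTube videos watched per day: {youtube_videos_per_day:,}")
print(f"Tweets per day:                 {tweets_per_day:,}")
print(f"Instagram photos per day:       {instagram_photos_per_day:,}")
print(f"New Facebook profiles per day:  {new_facebook_profiles_per_day:,}")
print(f"Data generated per day:         {daily_exabytes} exabytes")
```

Under these assumptions, 40,000 searches per second works out to about 1.26 trillion per year, which is consistent with the rounded figure quoted above, and 2.5 quintillion bytes is 2.5 exabytes.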
Are there any challenges in dealing with this data?
There are several, but one of the biggest is that approximately 75% of this data is completely unstructured. It comes from sources such as text, voice, and video, and understanding and analysing it poses major challenges, even for the world’s brightest technicians and engineers.
Why is big data important?
Organizations are already gaining massive advantages over their competitors simply by collecting and analysing big data. For example, a Bain & Co report found that, out of 400 large companies analysed, those that had a coherent big data policy gained a significant commercial advantage.
And with data analytics becoming increasingly available to even small businesses, it is clear that big data will be a major part of commerce, and ultimately society, going forward.
Who owns big data?
This is undoubtedly an important question. One of the complex aspects of big data is that it has legal implications. The ownership of this information depends on the service provider that holds the data, the jurisdiction in which it is stored, and how it was originally generated. As you can see, there is potential for conflict between these competing factors.
Another aspect of this dilemma is that big data can vary considerably in form. It can be written, recorded as audio, or captured in video format, and it can come from an incredibly diverse range of sources. This will certainly pose legal issues both now and in the future, but it won’t prevent companies from attempting to gain a competitive advantage from this vast and valuable source of information.
“It has become profitable to build a database containing the entire world’s knowledge. The few for-profit companies that own the data and the tools to mine it – the data infrastructure – possess great power to understand and predict the world.” – Michael Nielsen, MIT Technology Review Author
According to MIT Technology Review, a long-standing fantasy, at least for as long as the Internet has been around, is to create a database that contains all the world’s knowledge. This fantasy is now becoming reality, step by step. Facebook, for example, has already stored and analyzed the social connections of billions of people, and Google has aspired since 2002 to digitize all the books in the world. By 2017, Google had digitized around 25 million books from major university libraries, and although the project has run into some epic legal battles with authors and publishers, the process is still going on.
“We are continuing to digitize and add books to this world-changing index, improving the quality of our image-processing algorithms and the effectiveness of search, and plan to carry on doing so for years to come.” – Satyajeet Salgar, product manager for Google Books
Will big data continue to grow?
Yes. Once the Internet of Things (a much-mooted form of technology that will see household objects connected to the Internet) goes live, the volume of big data will increase dramatically. With privacy and intellectual property laws having failed to keep up with the pace of technological change, this will pose further legal and ethical issues, while also being a massive opportunity for society in general.
Big Data Future
In this climate, deep learning and machine learning will play a major role in understanding the almost unimaginable quantities of data being produced. Companies will increasingly seek to collect, quantify and understand such data in order to gain a competitive advantage. In most industries, established competitors and new entrants alike will leverage data-driven strategies to innovate, compete, and capture value.
Machine learning will simply be required in order to sift through and understand this enormous quantity of information, as it is only this sophisticated technology that will be able to analyse the information effectively. Machine learning algorithms are already capable of speech and image recognition, and with major investment from massive corporations such as Google, it is certain that the capabilities of this technology will expand in the coming years.
Many experts on machine learning and artificial intelligence believe that the technology will play a collaborative role with humans in many industries.
Some big data and machine learning initiatives have been controversial, with the combination of the two having profound implications. For example, Google’s London-based artificial intelligence research lab, DeepMind, has already announced a partnership with the Royal Free Hospital in London, and machine learning will very much form part of this agreement.
A Memorandum of Understanding between DeepMind and the Royal Free indicates that the organisations envisage a “broad ranging, mutually beneficial partnership, engaging in high levels of collaborative activity and maximizing the potential to work on genuinely innovative and transformational projects.”
DeepMind apparently hopes to gain access to “data for machine learning research under appropriate regulatory and ethical approvals” within the next five years, according to the document. While many have been critical of the Google collaboration, it is nonetheless hoped that this will eventually result in speedier diagnoses and the optimisation of treatment.
As big data, AI and machine learning all develop, we can no doubt expect many similar agreements across multiple industries.