Sunday, April 26, 2015

Final exam: What is Big Data? And why is it so hard to define?

Big data is a Mack truck with the speed of a NASCAR stock car. This sounds completely awesome until you think about it for a second. The amount of data generated on Earth is growing exponentially. Retailers are compiling databases of customer information with every swipe of a card. Organizations in logistics, finance, health care, and many other sectors are also capturing more data. Social media has the public at large creating vast quantities of digital material. Government and private organizations collect data for visual recognition, along with many other kinds of databases. Not to mention that even our appliances have Wi-Fi now. In addition to all of this, technology is also unearthing large and complex data sets such as the human genome.
The flow of data is so fast and furious at this point that some companies have to sit back and watch valuable data go unmined, because their systems simply cannot keep up. And herein lies the big problem with big data: our ability to gather data far outstrips our ability to process it.
What is Big Data? I think it is more correct to ask what big data are: new techniques that generate value from very large data sets that cannot be analyzed with traditional methods. These techniques have opened a virtual Pandora's box of information. Much like so many other complex ideas, it is easier to say what big data are like than what they are. And the reason for that is, well... we are just not really sure.
What we are sure about when it comes to big data is this: it's big. I mean really big. The sheer volume of data is its greatest challenge, as well as its greatest opportunity. There is too much of it for any traditional relational database. Global internet traffic is approaching the five-zettabyte mark per year. There is no doubt that with the ability to harness all of that information, we could be a far more productive and reliable society.
In addition to being big, big data is also fast. The velocity of data flowing across a modern server is often too great for it to be anywhere close to 100% useful, especially for a growing consumer base accustomed to real-time information.
Another V that describes what big data is like also describes why it is so hard to define. The variety of big data is far greater than anything we could have imagined just a few decades ago. Whereas traditional data consisted of things like documents, financial reports, stock information, and personnel files, today's data consists mostly of photos, audio/video, 3D models, simulations, and location data. This unstructured data is difficult even to categorize, let alone process.
On my personal journey through the world of big data this semester, I think the most impressive software I found was Hadoop. Hadoop is an open-source software library for reliable, scalable, distributed computing. It was the first reliable platform for big data analytics. One of the most well-known users of Hadoop is LinkedIn, which has taken advantage of the speed afforded by distributed computing. This is where the data is sent to multiple servers, or clusters of servers, for processing before the results are sent back.
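To make the distributed idea concrete, here is a toy sketch of the map/shuffle/reduce pattern that Hadoop popularized, written in plain Python rather than real Hadoop (which runs on the JVM across many machines). The two "splits" standing in for separate servers, and the function names, are my own illustration, not Hadoop's actual API.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

# Two "splits" standing in for data distributed across two servers;
# each map_phase call could run on a different machine independently.
split_a = ["big data is big"]
split_b = ["big data is fast"]

pairs = list(map_phase(split_a)) + list(map_phase(split_b))
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 3, 'data': 2, 'is': 2, 'fast': 1}
```

The point of the pattern is that the map step needs no coordination, so it scales out to as many servers as you have; only the shuffle and reduce steps bring the pieces back together.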
If there is one thing I have learned about this semester that has me optimistic about the future of computing, it is the idea of quantum computing. Though only theoretical at this point, quantum computers should excel at processing massive unstructured data sets, because they would compute using quantum mechanical states.

One may be inclined to wonder what the big deal is with big data. Well, the ability to accurately analyze data could make a world of difference for a large portion of the world's population. Big data provides opportunity and insight into new and emerging types of data. In the past, the human body gave off more information than any human could compute; in the future, this may no longer be the case.
Oracle has stated that "big data holds the promise of giving enterprises deeper insight into their customers, partners and business." And this is a good thing for everyone involved: it means less overhead and thus lower prices. The McKinsey Global Institute states that "the US health care sector could achieve an estimated saving of 300 billion dollars a year, by way of increasing efficiency and quality."
Big data may in fact be able to provide help to sectors far from Silicon Valley or Wall Street. Big data could help farmers accurately forecast weather and prevent crop failures. It could help prevent a pandemic, or maybe even a war.
