At the recent Infoscale 2014 conference held in Seoul, South Korea, Pavel Zezula from Masaryk University presented his paper on big data challenges. Zezula's paper makes a highly relevant inquiry into the big data problem, namely the difficulty of providing efficient access to unstructured data.
It is no secret that as data grows, many organizations struggle to filter it and extract value from it. What complicates matters are the key issues of the big data challenge: heterogeneity, timeliness, complexity, and privacy. Conventional technologies cannot handle the processing demands of such data, and creating new ones is far from easy. Why? Zezula identifies two main challenges in finding a new solution: increasing the descriptive knowledge of raw data to improve findability, and applying that knowledge for efficient multimodal and similarity search.
Here is something interesting: we can set clear objectives to move a step closer to developing the right technologies. Far-fetched? Not at all. Here are the four objectives proposed by Zezula: 1) establish a framework of algorithms for automatically determining various domain-specific models; 2) process heterogeneous datasets by combining batch and stream data processing; 3) generalize the idea of geo-textual indices to similarity search; and 4) preserve privacy in the search.
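The paper itself does not include code, but to make the third objective concrete, here is a minimal sketch of what similarity search means in practice: a brute-force k-nearest-neighbor lookup over feature vectors. The file names and vectors below are invented for illustration; real systems would use learned descriptors and an index structure rather than a linear scan.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two equal-length feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, dataset, k=2):
    """Return the k (name, vector) items closest to `query`."""
    return sorted(dataset, key=lambda item: euclidean(query, item[1]))[:k]

# Hypothetical descriptors for three images (invented example data)
images = [
    ("sunset.jpg", (0.9, 0.1, 0.2)),
    ("beach.jpg",  (0.8, 0.2, 0.3)),
    ("forest.jpg", (0.1, 0.9, 0.4)),
]

# Query with a vector close to the sunset/beach descriptors
print(knn((0.9, 0.1, 0.2), images, k=2))
```

The point of objectives 1 and 3 is precisely to replace the two hand-crafted parts of this sketch: the feature vectors (via automatically determined domain-specific models) and the linear scan (via generalized index structures that scale beyond toy datasets).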
Unfortunately, the current state of research does not yet offer solutions that meet the requirements of complex big data analysis, which leaves a large research area open. However, setting clear objectives is always a good starting point for finding a cost-effective solution.