Big Data Privacy by Brijesh Mehta

Big Data refers to datasets so large and complex that existing data processing tools cannot handle them. Because Big Data is collected and processed using many different sources and tools, it raises privacy issues. Privacy-preserving data publishing techniques such as k-anonymity, l-diversity, and t-closeness are used to de-identify data, but re-identification remains possible because data is collected from multiple sources. Big Data is characterized by the 3Vs (Volume, Velocity, and Variety), which make de-identification difficult.
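As a concrete illustration of one of these techniques, the sketch below checks whether a small table satisfies k-anonymity, meaning every combination of quasi-identifier values is shared by at least k records. The function name `is_k_anonymous`, the column names, and the toy records are all hypothetical and chosen only for illustration. The check covers a single table and does not model the multi-source linkage risk described above.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k records."""
    # Group records by their tuple of quasi-identifier values and count each group.
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical toy data: "age" and "zip" are already generalized,
# "disease" is the sensitive attribute.
records = [
    {"age": "20-29", "zip": "380**", "disease": "flu"},
    {"age": "20-29", "zip": "380**", "disease": "asthma"},
    {"age": "30-39", "zip": "395**", "disease": "flu"},
    {"age": "30-39", "zip": "395**", "disease": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))  # each group has 2 records
print(is_k_anonymous(records, ["age", "zip"], k=3))  # no group reaches 3 records
```

With this toy data the first call reports that the table is 2-anonymous and the second reports that it is not 3-anonymous. Stronger notions such as l-diversity and t-closeness additionally constrain the sensitive attribute within each group, which this sketch does not check.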

We have collected research papers and articles from various journals on privacy issues in big data, existing privacy-preserving data publishing techniques, and privacy-preserving big data publishing techniques. The main objective of this collection is to help readers learn the basics of privacy-preserving techniques and their application to big data analytics or big data publication.