
Essay: Data privacy

Essay details and download:

  • Subject area(s): Computer science essays
  • Reading time: 17 minutes
  • Price: Free download
  • Published: 25 August 2014*
  • File format: Text
  • Words: 3,601 (approx)
  • Number of pages: 15 (approx)

5.1 Introduction
Each of the techniques presented so far adds noise either to a numerical attribute or to a categorical attribute. We now present a Model that combines these techniques to add noise to all the attributes of a data set. The Model adds noise in such a way that the patterns of the original data set are preserved. Additionally, our Model can be extended so as to preserve the correlation among the attributes as well. This extension makes the Model applicable to a wider range of data sets, both those to be used for classification and those used for statistical analysis. Our experimental results indicate that the data are very well preserved.
Data privacy, also called information privacy, deals with the ability of an organization or individual to determine what data in a computer system can be shared with third parties. Data protection is important for business record keeping. Much information, such as financial and employee records, is irreplaceable in case of theft, fire or flood, so backing up all important data is essential. To keep intruders out of important electronic documents, the data must be protected. Data privacy concerns the relationship between the collection and dissemination of data, technology, the public expectation of privacy, and the legal and political issues surrounding them.
Privacy concerns exist wherever personally identifiable information is collected and stored, in digital form or otherwise. Improper or non-existent disclosure control can be the root cause of privacy issues. Data privacy issues can arise in response to information from a wide range of sources, such as:
  • Healthcare records
  • Criminal justice investigations and proceedings
  • Financial institutions and transactions
  • Biological traits, such as genetic material
  • Residence and geographic records
  • Ethnicity
  • Privacy breach
  • Location-based service and geolocation
The challenge in data privacy is to share data while protecting personally identifiable information. The fields of data security and information security design and utilize software, hardware and human resources to address this issue. As the laws and regulations related to data protection are constantly changing, it is important to keep abreast of any changes in the law and to continually reassess your compliance with data privacy and security regulations. The ability to control the information one reveals about oneself over the Internet, and who can access that information, has become a growing concern. These concerns include whether email can be stored or read by third parties without consent, and whether third parties can continue to track the web sites someone has visited. Another concern is that the web sites which are visited collect, store, and possibly share personally identifiable information about users.
The advent of various search engines and the use of data mining created a capability for data about individuals to be collected and combined from a wide variety of sources very easily.[107][108][111] The FTC has provided a set of guidelines, called the Fair Information Practice Principles, that represent widely accepted concepts concerning fair information practices in an electronic marketplace. In order not to give away too much personal information, e-mails should be encrypted, and browsing of web pages as well as other online activities should be done without leaving traces, via anonymizers or, in cases where those are not trusted, via open source distributed anonymizers, so-called mix nets.
Email isn’t the only internet use that raises privacy concerns. Nearly everything is accessible over the internet nowadays, and a major privacy issue relates to social networking. For example, there are millions of users on Facebook, and its regulations have changed over time. People may be tagged in photos or have valuable information about themselves exposed, either by choice or, most of the time, unexpectedly by others. It is important to be cautious about what is said over the internet and about what information and photos are displayed, because all of this can be searched across the web and used to access private databases, making it easy for anyone to go online and quickly profile a person.
5.2 Privacy Preserving Data Mining:
Due to the enormous benefits of data mining, and the high public concern regarding individual privacy, the implementation of privacy preserving data mining techniques has become a demand of the moment. A privacy preserving data mining technique provides individual privacy while allowing extraction of useful knowledge from data. There are several different methods that can be used to enable privacy preserving data mining. One particular class of such techniques modifies the collected data set before its release, in an attempt to protect individual records from being re-identified. When the data set has been modified, an intruder, even with supplementary knowledge, cannot be certain about the correctness of a re-identification. This class of privacy preserving techniques relies on the fact that the data sets used for data mining purposes do not necessarily need to contain 100% accurate data. In fact, that is almost never the case, due to the existence of natural noise in data sets. In the context of data mining it is important to maintain the patterns in the data set. Additionally, maintenance of statistical parameters, namely the means, variances and covariances of attributes, is important in the context of statistical databases. High data quality and privacy/security are two important requirements that a good privacy preserving technique needs to satisfy. We need to evaluate the data quality and the degree of privacy of a perturbed data set. The data quality of a perturbed data set can be evaluated through a few quality indicators, such as the extent to which the original patterns are preserved and the maintenance of statistical parameters. There is no single agreed-upon definition of privacy. Therefore, measuring privacy/security is a challenging task.
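The idea of perturbing values while maintaining statistical parameters can be illustrated with a minimal sketch. It assumes a single numerical attribute and independent zero-mean Gaussian noise; the class name NoiseAdditionSketch, the noise level sigma and the synthetic attribute values are illustrative assumptions, not the Model described in this chapter. Under these assumptions the mean is preserved on average, while the variance grows by roughly sigma squared.

```java
import java.util.Random;

// Hypothetical sketch: additive zero-mean noise on one numerical attribute.
class NoiseAdditionSketch {

    /** Returns a perturbed copy: each value plus Gaussian noise with standard deviation sigma. */
    static double[] perturb(double[] values, double sigma, Random rng) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] + sigma * rng.nextGaussian();
        }
        return out;
    }

    static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x;
        }
        return sum / v.length;
    }

    /** Population variance of the values. */
    static double variance(double[] v) {
        double m = mean(v);
        double sum = 0.0;
        for (double x : v) {
            sum += (x - m) * (x - m);
        }
        return sum / v.length;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // Synthetic attribute values in [20, 80), purely for illustration.
        double[] original = new double[10000];
        for (int i = 0; i < original.length; i++) {
            original[i] = 20 + 60 * rng.nextDouble();
        }
        double[] perturbed = perturb(original, 2.0, rng);
        System.out.println("mean:     " + mean(original) + " -> " + mean(perturbed));
        System.out.println("variance: " + variance(original) + " -> " + variance(perturbed));
    }
}
```

Individual values change, which hinders exact matching of a record, while the aggregate parameters stay close to those of the original attribute when sigma is small relative to the spread of the data.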
5.3 Data Security
Data security means protecting a database from destructive forces and the unwanted actions of unauthorized users. It is critical for most businesses and even home computer users. Client information, payment information, personal files, bank account details: all of this information can be hard to replace and potentially dangerous if it falls into the wrong hands. Data lost due to disasters such as a flood or fire is devastating, but losing it to hackers or a malware infection can have much greater consequences.
Thorough data security begins with an overall strategy and risk assessment. This will enable you to identify the risks you are faced with and what could happen if valuable data is lost through theft, malware infection or a system crash. Other potential threats you want to identify include the following:
  • Physical threats such as a fire, power outage, theft or malicious damage
  • Human error such as the mistaken processing of information, unintended disposal of data or input errors
  • Exploits from corporate espionage and other malicious activity
You can then identify areas of vulnerability and develop strategies for securing your data and information systems. Here are several aspects that need to be considered:
  • Who has access to what data
  • Who uses the internet and email systems, and how they access them
  • Who will be allowed access and who will be restricted
  • Whether or not to use passwords and how they will be maintained
  • What type of firewalls and anti-malware solutions to put in place
  • How to train the staff properly and enforce data security
After the above analysis, you can prioritize specific data along with your more critical systems and determine those that require additional security measures. It is also a good idea to lay out a BCP (Business Continuity Plan) so that your staff is still able to work effectively if the systems happen to fail. Company risks and security implementations should be reviewed frequently to keep up with changes such as the growth of your business and other circumstances.
5.4 Securing Data
Once you have drawn up a plan and assessed your risks, it is time to put your data security system into action. Since data can be compromised in many ways, the best security against misuse or theft involves a combination of technical measures, physical security and a well-educated staff. You should implement clearly defined policies in your infrastructure and present them effectively to the staff. Here are things that you may do:
  • Protect your office or data center with alarms and monitoring systems
  • Keep computers and associated components out of public view
  • Enforce restrictions on internet access
  • Ensure that your anti-malware solution is up to date
  • Ensure that your operating system is up to date
  • Fight off hacking attacks with intrusion detection technology
  • Utilize a protected power supply and backup energy sources
Measuring the Risk Factor of the Modified Data
Before we can proceed to security analysis, we need to produce a definition of disclosure. We note that an exposure of any sensitive information can be considered as disclosure. For example, sometimes a set of rules obtained from a data set is considered sensitive and therefore the exposure of the rules is regarded as disclosure [4,5]. A system proposed in [4,5] alerts a data miner to sensitive rules, since the miner may not be aware of the sensitivity level of a rule. In another scenario an exposure of any single value is considered as disclosure [118]. Generally, revealing a sensitive attribute value belonging to an individual is considered as disclosure [121]. Such disclosure usually occurs through a re-identification of the record. However, even a re-identification that does not cause an exposure of a sensitive attribute value can still be considered as disclosure [122]. Similarly, an exposure of a sensitive attribute value without a re-identification is also regarded as disclosure, known as attribute disclosure [122]. If an intruder can narrow down the list of all possible records that could have originated from the “target record” to a set of records having the same sensitive attribute value, then the attribute value is considered to be disclosed, even without any exact record re-identification. By “target record” we mean an original record in which the intruder is interested.
Due to the varying definitions of disclosure it is not trivial to measure disclosure risk. Moreover, disclosure risk depends on various other factors, such as the supplementary knowledge of an intruder and the approach taken by an intruder. However, effective measuring of disclosure risk is important, as the effectiveness of a data perturbation technique is evaluated by the disclosure risk and the data quality of a perturbed data set. In order to measure disclosure risk, Lambert considered the re-identification of a record as disclosure and the resultant exposure of a sensitive attribute value as the harm of the disclosure [122]. Lambert assumes that for every perturbed record an intruder estimates the probability of the record coming from a target record. The intruder then obtains a sensitive attribute value from the perturbed record having the maximum probability. When an intruder believes that he/she has worked out sensitive information then, regardless of whether or not the information obtained is correct, Lambert considers this as “perceived disclosure”. However, when an intruder obtains true information, this is considered as “true disclosure”. In what follows we consider both re-identification of the target record and disclosure of a confidential class as disclosure, and we evaluate each of them separately.
The main aim of our noise addition technique is twofold:
1. To prevent disclosure of confidential individual class values contained in the data set; we achieve this not only by perturbing the values of the class attribute itself but also by introducing perturbation to other (non-confidential) attributes, in order to make re-identification of the records difficult and in some instances even impossible.
2. To preserve not only statistical parameters of the data set (means, variances, etc.), but also the patterns discovered by the decision tree builder prior to perturbing the data set. We note that there has been a research report published in the literature where the goal has been to hide confidential patterns [37, 90], but that is beyond the scope of our study.
What we consider the most valuable property of our noise addition technique is the fact that we can always adjust the level of noise we are adding to the attributes so as to achieve the desired level of security. As an extreme case, we can add noise uniformly distributed over the whole domain to all innocent attributes and still preserve the patterns. Then, from the point of view of an intruder who knows only the values of innocent attributes, all perturbed records will be equally likely and the entropy will be maximum. The only attributes we cannot perturb beyond the limits given by the leaves are the leaf influential attributes.
However, we argue that if the leaf influential attributes are known to the intruder, then he/she can run the record through the decision tree to learn the class rather than attempt to re-identify the record. He/she will not, however, be able to learn other attributes in addition to the class attribute, as the re-identification entropy remains high regardless of the intruder's knowledge of influential attributes.
For a perturbed data set to be useful beyond classification, it is desirable to keep the noise level low. In that case our method works better for dense data sets, such as WBC, than for very sparse data sets, where preventing re-identification would require a higher level of noise simply because the records are very diverse.
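The statement that equally likely records give maximum entropy can be checked with a short sketch. It computes the Shannon entropy, in bits, of the probabilities an intruder assigns to candidate records; the class name ReidentificationEntropy and the example probabilities are assumptions made for illustration, not values from our experiments.

```java
// Hypothetical sketch: Shannon entropy of an intruder's re-identification probabilities.
class ReidentificationEntropy {

    /** Entropy in bits of a probability distribution over candidate records. */
    static double entropyBits(double[] probabilities) {
        double h = 0.0;
        for (double p : probabilities) {
            if (p > 0.0) {
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Eight candidate records, all equally likely: entropy is log2(8).
        double[] uniform = {0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125};
        // The intruder strongly favours one record: entropy is lower.
        double[] skewed = {0.93, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01};
        System.out.println("uniform: " + entropyBits(uniform) + " bits");
        System.out.println("skewed:  " + entropyBits(skewed) + " bits");
    }
}
```

For n candidate records the entropy cannot exceed log2(n) bits, and it reaches that bound only when every record is equally likely, which is the extreme case described above.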
5.5 Fusion Approach Framework:
Figure 5.1 Fusion Approach Framework
5.5.1 Fusion Approach Algorithm:
Let us consider a database which is a combination of both numerical and categorical type of data.
ENCRYPT-DATABASE (database)
    file ← null
    field ← null
    expr ← null
    FOREACH field in database
        field ← CONVERT-TO-ASCII (field)
        field ← CONVERT-TO-BINARY (field)
    END FOREACH
    file ← COMPRESS (database)
    FOREACH value in file
        value ← CONVERT-TO-ASCII (value)
        value ← CALCULATE-ROOT (value)
    END FOREACH
    expr ← CREATE-POLYNOMIAL-EXPRESSION (file)
    RETURN expr
CONVERT-TO-ASCII ( ): converts each character to its ASCII value
CONVERT-TO-BINARY ( ): converts each number to its binary equivalent
CALCULATE-ROOT ( ): calculates a root by substituting the value of each row for x in the expression x^2 + 2x + 1
CREATE-POLYNOMIAL-EXPRESSION ( ): creates a polynomial expression from the root values as (x - root1)(x - root2) ... (x - root(n)) = 0
5.5.2 Calculation and Result Analysis:
Suppose a database with fields named NAME and AGE holds the following data:
Name Age
A 1
B 2
C 3
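Now, FOREACH field in the database, convert each character to its ASCII equivalent and then to binary form, as given below:
A = 65 = 1000001    1 = 48 = 0110000
B = 66 = 1000010    2 = 49 = 0110001
C = 67 = 1000011    3 = 50 = 0110010
Note that 48, 49 and 50 are the standard ASCII codes of the characters '0', '1' and '2' (the characters '1', '2' and '3' have the codes 49, 50 and 51); the rest of this worked example keeps the codes exactly as listed here.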
In the file, with each code padded to 8 bits, the data can be represented as
Name Age
01000001 00110000
01000010 00110001
01000011 00110010
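The conversion of characters to 8-bit binary strings, and back, can be sketched as follows. The class name BitStringSketch and its two helper methods are hypothetical, and the sketch assumes that every character has a code below 256 so that 8 bits are enough.

```java
// Hypothetical sketch: characters to 8-bit binary strings and back.
class BitStringSketch {

    /** Encodes each character as its code, written in binary and left-padded to 8 bits. */
    static String toBits(String text) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            String b = Integer.toBinaryString(c);
            for (int k = b.length(); k < 8; k++) {
                bits.append('0'); // left padding
            }
            bits.append(b);
        }
        return bits.toString();
    }

    /** Decodes a bit string by reading it in groups of 8 bits. */
    static String fromBits(String bits) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i + 8 <= bits.length(); i += 8) {
            text.append((char) Integer.parseInt(bits.substring(i, i + 8), 2));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String bits = toBits("ABC");
        System.out.println(bits);
        System.out.println(fromBits(bits));
    }
}
```

Because every character occupies exactly 8 bits, the field boundaries can be recovered later simply by counting bits.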
In computer science and information theory, data compression, source coding or bit-rate reduction involves encoding information using fewer bits than the original representation. Compression can be either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy; no information is lost in lossless compression. Lossy compression reduces bits by identifying unnecessary information and removing it. The process of reducing the size of a data file is popularly referred to as data compression, although its formal name is source coding (coding done at the source of the data before it is stored or transmitted).
Compression is useful because it helps reduce resource usage, such as data storage space or transmission capacity. Because compressed data must be decompressed before use, this extra processing imposes computational or other costs; compression is far from being a free lunch. Data compression is subject to a space-time complexity trade-off. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed, and the option to decompress the video in full before watching it may be inconvenient or require additional storage. The design of data compression schemes involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (e.g., when using lossy data compression), and the computational resources required to compress and uncompress the data.
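The lossless property can be demonstrated with a minimal round trip. The sketch below uses the standard java.util.zip Deflater and Inflater classes only as a readily available lossless compressor; it is not the compression step of the fusion approach, and the class name LosslessRoundTrip is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: lossless compression round trip with java.util.zip.
class LosslessRoundTrip {

    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated input: stop instead of looping forever
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("01000001"); // highly redundant input
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.US_ASCII);
        byte[] packed = compress(original);
        byte[] restored = decompress(packed);
        System.out.println(original.length + " bytes -> " + packed.length + " bytes");
        System.out.println("identical after round trip: " + Arrays.equals(original, restored));
    }
}
```

The redundant input shrinks considerably, and decompression restores it byte for byte, at the cost of the extra processing discussed above.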
To compress the file, we first write the whole of the above data as a single string:
010000010011000001000010001100010100001100110010
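Using the LZ77 algorithm for compression, compress the above data. LZ77 algorithms achieve compression by replacing repeated occurrences of data with references to a single copy of that data existing earlier in the input (uncompressed) data stream. A match is encoded by a pair of numbers called a length-distance pair, which is equivalent to the statement “each of the next length characters is equal to the characters exactly distance characters behind it in the uncompressed stream”. (The “distance” is sometimes called the “offset” instead.) The table below parses the string into phrases: at each step the longest previously seen phrase is looked up (findAt is its step number, or 0 if there is none) and extended by one new bit. This dictionary-style output of (findAt, bit) pairs corresponds to the closely related LZ78 scheme rather than to LZ77 length-distance pairs.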
 
Step    Input (bits)    Output (findAt, bit)
1       0               0,0
2       1               0,1
3       00              1,0
4       000             3,0
5       10              2,0
6       01              1,1
7       100             5,0
8       0001            4,1
9       0000            4,0
10      1000            7,0
11      11              2,1
12      00010           8,0
13      10000           10,0
14      110             11,0
15      011             6,1
16      001             3,1
17      0               1
LZ77 is probably the most straightforward of these schemes. It tries to replace recurring patterns in the data with a short code. The code tells the decompressor how many symbols to copy and from where in the output to copy them. To compress the data, LZ77 maintains a history buffer, which contains the data that has already been processed, and tries to match the next part of the message to it. If there is no match, the next symbol is output as a literal.
After compressing the data we obtain the output column of the table, which contains the pattern of the compressed data.
Our compressed data is in the form:
0,0,0,1,1,0,3,0,2,0,1,1,5,0,4,1,4,0,7,0,2,1,8,0,10,0,11,0,6,1,3,1,1
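The parsing steps of the table can be reproduced with a short sketch. It is a minimal dictionary-based parse written for this worked example, not the implementation given in Section 5.6; the class name DictionaryParseSketch is an assumption, and the output is written as comma-separated values so that it can be compared directly with the compressed data above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch: dictionary-based parse producing (findAt, bit) pairs.
class DictionaryParseSketch {

    static String compress(String bits) {
        Map<String, Integer> dictionary = new HashMap<>(); // phrase -> 1-based step number
        StringJoiner out = new StringJoiner(",");
        String phrase = "";
        for (int i = 0; i < bits.length(); i++) {
            String extended = phrase + bits.charAt(i);
            if (dictionary.containsKey(extended)) {
                phrase = extended; // still a known phrase: keep extending it
            } else {
                int findAt = phrase.isEmpty() ? 0 : dictionary.get(phrase);
                out.add(findAt + "," + bits.charAt(i));
                dictionary.put(extended, dictionary.size() + 1);
                phrase = "";
            }
        }
        if (!phrase.isEmpty()) {
            // The input ended inside a known phrase: emit its index only.
            out.add(String.valueOf(dictionary.get(phrase)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("010000010011000001000010001100010100001100110010"));
    }
}
```

A hash map keeps each lookup of a phrase cheap; for the 48-bit example string the parse yields 17 steps, the last of which is an index without a new bit.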
Now we calculate the roots by substituting each value of the compressed data (the values separated by commas) for x in the equation f(x) = x^2 + 2x + 1:
x^2 + 2x + 1 = value
Which will give:
(0)^2 + 2 x 0 + 1 = 1
(1)^2 + 2 x 1 + 1 = 4
(3)^2 + 2 x 3 + 1 = 16
(2)^2 + 2 x 2 + 1 = 9
(5)^2 + 2 x 5 + 1 = 36
(4)^2 + 2 x 4 + 1 = 25
(7)^2 + 2 x 7 + 1 = 64
(8)^2 + 2 x 8 + 1 = 81
(10)^2 + 2 x 10 + 1 = 121
(11)^2 + 2 x 11 + 1 = 144
(6)^2 + 2 x 6 + 1 = 49
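Since x^2 + 2x + 1 = (x + 1)^2, each value maps to a perfect square, and for non-negative x the mapping can be undone with a square root. The following sketch shows both directions; the class and method names are assumptions made for illustration.

```java
// Hypothetical sketch: the mapping f(x) = x^2 + 2x + 1 = (x + 1)^2 and its inverse.
class SquareMapSketch {

    /** Substitutes x into x^2 + 2x + 1. */
    static long encode(long x) {
        return x * x + 2 * x + 1;
    }

    /** Recovers a non-negative x from a root value y = (x + 1)^2. */
    static long decode(long y) {
        return Math.round(Math.sqrt((double) y)) - 1;
    }

    public static void main(String[] args) {
        long[] values = {0, 1, 3, 2, 5, 4, 7, 8, 10, 11, 6};
        for (long x : values) {
            long y = encode(x);
            System.out.println(x + " -> " + y + " -> " + decode(y));
        }
    }
}
```

The inverse is unambiguous only because the compressed values are never negative; over all integers, x and -x - 2 would produce the same square.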
Now our data is
1,1,1,4,4,1,16,1,9,1,4,4,36,1,25,4,25,1,64,1,9,4,81,1,121,1,144,1,49,4,16,4,4
Now, if we assume these values to be the roots of a polynomial equation, we can create a polynomial expression as
(x-1)(x-1)(x-1)(x-4)(x-4)(x-1)(x-16)(x-1)(x-9)(x-1)(x-4)(x-4)(x-36)(x-1)(x-25)(x-4)(x-25)(x-1)(x-64)(x-1)(x-9)(x-4)(x-81)(x-1)(x-121)(x-1)(x-144)(x-1)(x-49)(x-4)(x-16)(x-4)(x-4) = 0
The above equation represents our data in a form intended to maintain data integrity and security. The risk of losing the original data is minimal, because the original data can be retrieved by reversing the steps, as the decryption that follows shows.
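How a product of factors (x - root) turns into polynomial coefficients can be sketched as follows. The sketch is an illustration under our own assumptions, not the createpoly class of Section 5.6: the class name PolynomialFromRoots is hypothetical, and BigInteger is used because the coefficients of a product of many factors quickly exceed the range of int.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical sketch: expanding (x - r1)(x - r2)...(x - rn) into coefficients.
class PolynomialFromRoots {

    /** Returns the coefficients of the product, lowest degree first. */
    static BigInteger[] expand(long[] roots) {
        BigInteger[] coeffs = {BigInteger.ONE}; // the constant polynomial 1
        for (long r : roots) {
            BigInteger[] next = new BigInteger[coeffs.length + 1];
            Arrays.fill(next, BigInteger.ZERO);
            BigInteger minusR = BigInteger.valueOf(-r);
            for (int k = 0; k < coeffs.length; k++) {
                next[k + 1] = next[k + 1].add(coeffs[k]);           // contribution of x * term
                next[k] = next[k].add(coeffs[k].multiply(minusR));  // contribution of -r * term
            }
            coeffs = next;
        }
        return coeffs;
    }

    public static void main(String[] args) {
        // (x - 1)(x - 4) = x^2 - 5x + 4
        System.out.println(Arrays.toString(expand(new long[] {1, 4})));
    }
}
```

Note that the expanded coefficients do not depend on the order in which the factors are multiplied, so it is the factored form, with its roots listed in sequence, that retains the order needed to rebuild the compressed data.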
Now, for decryption, reverse the whole process.
From the polynomial expression, take out all the roots in order and separate them as
1,1,1,4,4,1,16,1,9,1,4,4,36,1,25,4,25,1,64,1,9,4,81,1,121,1,144,1,49,4,16,4,4
For each root, find the value of x that satisfies the equation x^2 + 2x + 1 = root, that is, x = sqrt(root) - 1.
On solving:
(0)^2 + 2 x 0 + 1 = 1
(1)^2 + 2 x 1 + 1 = 4
(3)^2 + 2 x 3 + 1 = 16
(2)^2 + 2 x 2 + 1 = 9
(5)^2 + 2 x 5 + 1 = 36
(4)^2 + 2 x 4 + 1 = 25
(7)^2 + 2 x 7 + 1 = 64
(8)^2 + 2 x 8 + 1 = 81
(10)^2 + 2 x 10 + 1 = 121
(11)^2 + 2 x 11 + 1 = 144
(6)^2 + 2 x 6 + 1 = 49
This gives back our compressed data:
0,0,0,1,1,0,3,0,2,0,1,1,5,0,4,1,4,0,7,0,2,1,8,0,10,0,11,0,6,1,3,1,1
Now arrange the data in tabular form, as pairs of values separated by a comma:
0,0 0,1 1,0 3,0 2,0 1,1 5,0 4,1 4,0 7,0 2,1 8,0 10,0 11,0 6,1 3,1 1
Using the decompression algorithm, resolve each pair against the phrases already rebuilt (each findAt index refers to an earlier row of the table, and 0 means no earlier phrase). This gives back our data as an 8-bit ASCII string:
010000010011000001000010001100010100001100110010
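The decompression step can be sketched in the same spirit. The class name DictionaryDecodeSketch is hypothetical, and the sketch assumes the comma-separated (findAt, bit) format used in this worked example, where a final value without a partner is an index with no new bit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rebuilding the bit string from (findAt, bit) pairs.
class DictionaryDecodeSketch {

    static String decompress(String csv) {
        String[] tokens = csv.split(",");
        List<String> phrases = new ArrayList<>(); // phrases.get(k - 1) is the phrase of step k
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < tokens.length; i += 2) {
            int findAt = Integer.parseInt(tokens[i].trim());
            String prefix = (findAt == 0) ? "" : phrases.get(findAt - 1);
            // A trailing index without a bit reproduces the earlier phrase unchanged.
            String phrase = (i + 1 < tokens.length) ? prefix + tokens[i + 1].trim() : prefix;
            phrases.add(phrase);
            bits.append(phrase);
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        String compressed = "0,0,0,1,1,0,3,0,2,0,1,1,5,0,4,1,4,0,7,0,2,1,8,0,10,0,11,0,6,1,3,1,1";
        System.out.println(decompress(compressed));
    }
}
```

Each pair refers only to rows that come before it, so a single pass from the first pair to the last is enough to rebuild the whole string.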
Now extract 8 bits at a time from this string and place them in the respective fields; our database is recreated from the polynomial expression:
Name Age
01000001 00110000
01000010 00110001
01000011 00110010
Converting the binary values to their decimal codes and then to the corresponding characters, we get back the original data.
Name Age
A 1
B 2
C 3
5.6 Implementation of the Fusion Model
The model has been successfully implemented and used in maintaining the privacy of the sensitive data in the database.
Database Encryption with Compression (using the LZ77 algorithm).
package database_enc_com;

import java.util.ArrayList;

// class creating polynomials
class createpoly {

    int pol1[][], pol2[][], result[][];
    int l1, l2; // number of x terms in each polynomial string
    int i, j, flag, pointer, n1, n2, nn1, nn2;
    String eq1, eq2, part1, part2;

    /* Function to create a new polynomial by multiplying two polynomials */
    public void create(String p1, String p2) {
        n1 = 0;
        n2 = 0;
        flag = 0;
        // reset the term counts so that repeated calls do not accumulate stale counts
        l1 = 0;
        l2 = 0;
        /* determining the number of terms in the first polynomial */
        for (i = 0; i < p1.length(); i++) {
            if (p1.charAt(i) == 'x') {
                l1++;
            }
        }
        /* determining the number of terms in the second polynomial */
        for (i = 0; i < p2.length(); i++) {
            if (p2.charAt(i) == 'x') {
                l2++;
            }
        }
        // creating arrays for both strings
        pol1 = new int[l1][2];
        pol2 = new int[l2][2];
        result = new int[l1 * l2][2];
        // a trailing '+' terminates the last term for the parser
        eq1 = p1 + "+";
        eq2 = p2 + "+";
        /* retrieving polynomials */
        pol1 = getPoly(eq1, pol1, l1);
        pol2 = getPoly(eq2, pol2, l2);
        multiply(l1, l2);
    } // create
    // separating number and power part
    /* for each term, column 0 holds the coefficient and column 1 holds the degree */
    int[][] getPoly(String s, int arr[][], int len) {
        part1 = "";
        part2 = "";
        arr = new int[len][2];
        flag = 0;
        pointer = 0;
        /* loop iterates over the length of the polynomial string */
        for (i = 0; i < s.length(); i++) {
            /* on finding x, the text collected so far is stored as the coefficient;
               on finding the next sign, the text after x is stored as the degree */
            if (s.charAt(i) == 'x') {
                part1 = part1.trim();
                arr[pointer][0] = Integer.parseInt(part1);
                part1 = "";
                flag = 1;
            } else if (s.charAt(i) == '+' || (s.charAt(i) == '-' && i > 0)) {
                part2 = part2.replace('x', ' ');
                part2 = part2.trim();
                arr[pointer][1] = Integer.parseInt(part2);
                part2 = "";
                flag = 0;
                pointer++;
            }
            if (flag == 0) {
                part1 = part1 + s.charAt(i);
            } else {
                part2 = part2 + s.charAt(i);
            }
        } // for
        return arr;
    }
    /* multiplying two polynomials: coefficients are multiplied for each pair of terms, and degrees are added */
    void multiply(int len1, int len2) {
        pointer = 0;
        for (i = 0; i < len2; i++) {
            n2 = pol2[i][0];
            nn2 = pol2[i][1];
            for (j = 0; j < len1; j++) {
                n1 = pol1[j][0];
                nn1 = pol1[j][1];
                result[pointer][0] = (n1 * n2);
                result[pointer][1] = (nn1 + nn2);
                pointer++;
            } // for2
        } // for1
    } // multiply

    /* returning the resultant polynomial */
    String display() {
        String str = "";
        /* adding coefficients of terms that have the same degree of x */
        for (i = 0; i < result.length; i++) {
            for (j = i + 1; j < result.length; j++) {
                if (result[i][1] == result[j][1]) {
                    result[i][0] += result[j][0];
                    result[j][0] = 0;
                }
            }
        }
        /* combining the coefficient and degree parts of every non-zero term into a string */
        for (i = 0; i < result.length; i++) {
            if (!(result[i][0] == 0)) {
                str = str + " +(" + result[i][0] + "x" + result[i][1] + ")";
            }
        }
        return str; /* resultant polynomial will be returned */
    } // display
}
// class for compressing data
class compress {

    /* ArrayLists index, value and newbit store the intermediate data of the Lz7 algorithm */
    public ArrayList<Integer> index = new ArrayList<Integer>();
    public ArrayList<String> value = new ArrayList<String>();
    public ArrayList<String> newbit = new ArrayList<String>();
    public int last, i, j;
    public String temp, temp2 = "", compressdata = "";

    /* Applying compression of data */
    public String compressDatabase(String str) {
        temp = str;
        /* iterating over each bit in the data */
        for (i = 0; i < temp.length(); i++) {
            for (j = i + 1; j <= temp.length(); j++) {
                temp2 = temp.substring(i, j);
                /* if the substring already exists, remember its 1-based index in the variable last */
                if (value.contains(temp2)) {
                    last = value.indexOf(temp2);
                    last++;
                } else {
                    break;
                }
            }
            String s1 = temp2;
            s1 = s1.substring(temp2.length() - 1);
            if (value.contains(temp2)) {
                /* the input ended inside a known substring: an 'S' marker is prefixed so that
                   decryption can drop the duplicated last bit and keep the data consistent */
                compressdata = "S" + compressdata;
            }
            /* adding values to their corresponding lists */
            value.add(temp2);
            index.add(last);
            newbit.add(s1);
            if (!(temp2.length() == 0)) {
                temp = temp.substring(temp2.length() - 1);
            }
            last = 0;
            temp2 = "";
        }
        System.out.println("Lz7 Application\nValue\t FindAt/NewBit");
        for (i = 0; i < value.size(); i++) {
            System.out.println(value.get(i) + "\t" + index.get(i) + "," + newbit.get(i));
            compressdata += "" + index.get(i) + "" + newbit.get(i);
        }
        return compressdata;
    }
    public void decryptDatabase(ArrayList<String> rootval) {
        /* from the roots, extracting the index and bit parts of the algorithm */
        ArrayList<Integer> index = new ArrayList<Integer>();
        ArrayList<Integer> bit = new ArrayList<Integer>();
        /* if the roots start with S, extraction begins from index 1, else from index 0 */
        boolean marked = "S".equals(rootval.get(0));
        int start = marked ? 1 : 0;
        for (int i = start; i < rootval.size(); i = i + 2) {
            // inverting (x + 1)^2: x = sqrt(root) - 1
            index.add(((int) Math.sqrt(Double.parseDouble(rootval.get(i)))) - 1);
            bit.add(((int) Math.sqrt(Double.parseDouble(rootval.get(i + 1)))) - 1);
        }
        System.out.println("Lz7 intermediate decompressing data");
        System.out.println(index.toString());
        System.out.println(bit.toString());
        String str = "";
        ArrayList<String> value = new ArrayList<String>();
        /* adding the values to the value list of the Lz7 algorithm */
        for (int i = 0; i < bit.size(); i++) {
            if (index.get(i) == 0) {
                str += bit.get(i);
                value.add(String.valueOf(bit.get(i)));
            } else {
                String phrase = value.get(index.get(i) - 1) + "" + bit.get(i);
                str += phrase;
                value.add(phrase);
            }
        }
        /* if the roots start with S, the last bit of the resultant value is removed to maintain the consistency of the data */
        if (marked) {
            str = str.substring(0, str.length() - 1);
        }
        System.out.println("Decrypted data in Binary Form : " + str);
        /* converting binary data to string and printing it */
        String output = "";
        for (int i = 0; i <= str.length() - 8; i += 8) {
            int k = Integer.parseInt(str.substring(i, i + 8), 2);
            output += (char) k;
        }
        System.out.println("Decrypted data in String Form : " + output);
    }
}
class Database_Enc_Com {

    public static void main(String[] args) throws Exception {
        /* creating objects of classes */
        createpoly d = new createpoly();
        compress cd = new compress();
        System.out.println("Data in String Form: A0");
        /* converting String data to its corresponding binary */
        byte[] infoBin = "A0".getBytes("UTF-8");
        String bin = "", binary = "";
        for (byte b : infoBin) {
            bin = Integer.toBinaryString(b);
            // left-pad each byte to 8 bits
            for (int j = 8 - bin.length(); j > 0; j--) {
                bin = "0" + bin;
            }
            binary += bin;
        }
        System.out.println("Data in binary Form : " + binary);
        String cdata = cd.compressDatabase(binary); // sending binary data to compression method
        ArrayList<String> rootval = new ArrayList<String>();
        /* each returned digit x is passed through x^2 + 2x + 1 = (x + 1)^2 for encryption;
           a leading "S" marker is carried over unchanged */
        boolean marked = cdata.startsWith("S");
        int start = marked ? 1 : 0;
        if (marked) {
            rootval.add("S");
        }
        for (int i = start; i < cdata.length(); i++) {
            int num = Integer.parseInt(String.valueOf(cdata.charAt(i))) + 1;
            rootval.add(Integer.toString(num * num));
        }
        if (marked) {
            System.out.println("Roots of Polynomial: " + rootval.toString().replace("S,", ""));
        } else {
            System.out.println("Roots of Polynomial: " + rootval);
        }
        /* from the encrypted roots, making polynomials by calling the create() method;
           each factor is written as "1x1+<root>x0", i.e. coefficient, x, degree */
        String p1 = "1x1+" + rootval.get(start) + "x0";
        for (int i = start + 1; i < rootval.size(); i++) {
            String p2 = "1x1+" + rootval.get(i) + "x0";
            d.create(p1, p2);
            p1 = d.display();
            p1 = p1.replace("(", "");
            p1 = p1.replace(")", "");
            p1 = p1.replace(" ", "");
            p1 = p1.substring(1);
        }
        System.out.println("Compressed Data in Polynomial: " + p1);
        cd.decryptDatabase(rootval); // calling function to decrypt the encrypted roots and print the original data
    }
}
OUTPUT :
5.7 Conclusion
The approach taken in this paper integrates both categorical and numeric data types. The noise addition methods used are effective in preserving the privacy of the data and in producing prediction accuracies on par with the original data set. Crucial properties of a noise addition technique are the ability to maintain good data quality and to ensure individual privacy. More experiments are to be conducted on data quality and security level measurements. In the context of various data mining tasks, our approach addresses the issue of Privacy Preserving Data Mining almost completely.
 

About this essay:

If you use part of this page in your own work, you need to provide a citation, as follows:

Essay Sauce, Data privacy. Available from:<https://www.essaysauce.com/computer-science-essays/essay-data-privacy/> [Accessed 27-02-24].

These Computer science essays have been submitted to us by students in order to help you with your studies.

* This essay may have been previously published on Essay.uk.com at an earlier date.