A Novel Approach to an Efficient and Reliable Data Hosting in Multi Cloud Environment
Abstract
Nowadays, more and more organizations are interested in hosting their data in the cloud to reduce maintenance costs and enhance data reliability. There are several cloud service providers, exhibiting great variation in their pricing policies and performance. Due to the enormous growth of heterogeneous cloud data centers, users can now place their data in any of them. Usually customers put their data in a single cloud and simply trust to luck. The problem, however, is choosing a reliable cloud data center that performs well, and understanding which hosting strategy is better. To ensure reliability, cloud service providers use replicas, and load balancing among cloud data centers is essential for good performance. Based on a commercial analysis, we propose a novel data hosting scheme that selects several suitable clouds and an appropriate redundancy strategy to store data with minimum cost and high availability. Our performance evaluation shows that the scheme exhibits sound adaptability to price variations.
Introduction
In recent years there has been a rapid movement of people towards online data hosting services, so many cloud service providers now offer them. Data hosting means storing data on a server or other computer so that it can be accessed over the Internet. When companies require particular resources for a limited period of time, they need not purchase those resources; they can use resources over a network on a pay-per-use basis.
Cloud computing provides different types of services to users over the network. It enables companies to consume resources as a utility, just like electricity. Data hosting services provide users with an efficient and reliable way to store data, and this stored data can be accessed from anywhere, on any device, at any time. Cloud computing is Internet-based computing which provides on-demand access to a shared pool of resources and data on a pay-per-use basis. It also provides a distributed environment, which is essential for developing large-scale applications rapidly.
As data hosting services have become more popular, many cloud service providers now offer them. In most cases, companies move towards hosting their data in a single cloud. However, several options are now available in the market from various cloud vendors.
Heterogeneous clouds:
There are various cloud vendors exhibiting variations in performance and pricing policies. They are designed with different system architectures and apply various techniques to provide better services, so customers are unable to tell which clouds are suitable for hosting their data. Committing to a single vendor under this uncertainty creates the risk of vendor lock-in. It is inefficient for an organization to host all its data in a single cloud, and doing so does not guarantee availability.
Multi-cloud data hosting:
Multi-cloud data hosting distributes data across multiple clouds to gain higher data availability and to minimize the risk of data loss or system failure due to a centralized component failure in a cloud computing environment. Such a failure can occur in hardware, software, or infrastructure. This strategy also improves overall enterprise performance by avoiding potential risks such as "vendor lock-in".
EXISTING SYSTEM
In existing cloud data hosting systems, data availability is usually guaranteed by replication or erasure coding. In a multi-cloud environment the same two mechanisms are used to meet different availability requirements, but their implementations differ. Replication achieves redundancy by placing full replicas in several clouds; a read is served by the "cheapest" cloud, i.e., the one that charges the least for out-going bandwidth and GET operations, unless that cloud is unavailable. Replication is suitable for systems with distributed applications. With erasure coding, data is divided into m data blocks and encoded into n blocks; the m data blocks and n-m coding blocks are placed in n different clouds. Compared with replication, erasure coding guarantees availability with less storage space, but reading requires accessing the multiple clouds that store the corresponding data blocks, so erasure coding cannot make full use of the cheapest cloud as replication can. In the multi-cloud scenario, bandwidth is generally (much) more expensive than storage space. The two problems related to multi-cloud hosting are:
• How to choose appropriate clouds in the presence of heterogeneous pricing policies which provides minimum monetary cost.
• How to meet different cloud availability requirements of different hosting services.
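The availability trade-off between the two redundancy mechanisms above can be sketched numerically. Assuming each cloud fails independently with the same probability (a simplifying assumption of ours, not a claim from this report), replication survives unless all n replicas are down, while an (n, m) erasure code survives whenever at least m of the n blocks remain reachable:

```java
/** Sketch: availability of n-way replication vs. (n, m) erasure coding,
 *  assuming independent, identical per-cloud availability. Illustrative
 *  only; the report's actual availability model may differ. */
public class AvailabilitySketch {

    /** Replication: data is unavailable only if all n replicas are down. */
    static double replication(double a, int n) {
        return 1.0 - Math.pow(1.0 - a, n);
    }

    /** Erasure coding: a read succeeds if at least m of the n blocks
     *  are reachable (binomial tail probability). */
    static double erasureCoding(double a, int n, int m) {
        double p = 0.0;
        for (int k = m; k <= n; k++) {
            p += binomial(n, k) * Math.pow(a, k) * Math.pow(1.0 - a, n - k);
        }
        return p;
    }

    /** n-choose-k computed iteratively in floating point. */
    static double binomial(int n, int k) {
        double c = 1.0;
        for (int i = 1; i <= k; i++) c = c * (n - i + 1) / i;
        return c;
    }

    public static void main(String[] args) {
        double a = 0.99; // per-cloud availability (assumed)
        System.out.printf("3 replicas:         %.8f%n", replication(a, 3));
        System.out.printf("(5,3) erasure code: %.8f%n", erasureCoding(a, 5, 3));
    }
}
```

Both schemes push availability well above a single cloud's 0.99, which is why the choice between them comes down to cost rather than availability alone.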
PROBLEM STATEMENT
To host data in a multi-cloud environment, people encounter two critical problems:
How to choose appropriate clouds, in the presence of heterogeneous pricing policies, so as to minimize monetary cost?
How to achieve different availability requirements to provide different services?
Monetary cost mainly depends on data usage, particularly storage capacity consumption and network bandwidth consumption.
For availability requirement, consideration is which redundancy mechanism (i.e., replication or erasure coding) is more economical based on specific data access patterns.
The main consideration here is how to combine the two mechanisms efficiently so as to reduce monetary cost while guaranteeing the required availability.
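The monetary cost described above can be sketched as storage charges plus out-going bandwidth charges plus per-operation charges. The prices and field names below are illustrative assumptions of ours, not taken from any real provider or from this report:

```java
/** Sketch of the monthly monetary cost of hosting one object in a cloud:
 *  storage + out-going bandwidth + per-operation charges.
 *  All prices here are hypothetical. */
public class CostModel {
    final double storagePricePerGB;   // $/GB-month (assumed)
    final double bandwidthPricePerGB; // $/GB out-going (assumed)
    final double getPrice;            // $ per GET operation (assumed)

    CostModel(double s, double b, double g) {
        storagePricePerGB = s;
        bandwidthPricePerGB = b;
        getPrice = g;
    }

    /** sizeGB: stored size including redundancy; readGB: data read out;
     *  reads: number of GET operations per month. */
    double monthlyCost(double sizeGB, double readGB, long reads) {
        return sizeGB * storagePricePerGB
             + readGB * bandwidthPricePerGB
             + reads * getPrice;
    }

    public static void main(String[] args) {
        CostModel cloud = new CostModel(0.02, 0.09, 0.0000004);
        // 3-way replication of a 1 GB object, read 100 times a month:
        double cost = cloud.monthlyCost(3.0, 100.0, 100);
        System.out.printf("monthly cost: $%.5f%n", cost);
    }
}
```

With these sample prices the bandwidth term dominates, which illustrates the report's observation that in the multi-cloud scenario bandwidth is generally much more expensive than storage.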
Proposed system
We propose a new allocation strategy for data storage in cloud data centers, using load balancing together with the two redundancy mechanisms, i.e., replication and erasure coding. It intelligently puts data into multiple clouds with minimized monetary cost and guaranteed availability. We combine the two frequently used redundancy mechanisms into a uniform model to meet the desired availability for different data access patterns. We then propose an efficient heuristic data distribution algorithm that chooses reliable clouds and proper storage modes for hosting data across multiple clouds. The algorithm decides which storage mechanism is better based on the size and access frequency of the data, and it also performs load balancing to improve performance.
ADVANTAGES:
• Scalable
• Reliable
• Low-cost data hosting functionality
• Minimized monetary cost
• Guaranteed availability
SYSTEM ARCHITECTURE
MODULES
1) MULTI-CLOUD
2) DATA HOSTING
3) FILE UPLOAD
4) CLOUD STORAGE SECURITY
5) FILE DOWNLOAD
6) COST PROCESS
MODULE DESCRIPTION
1) MULTI-CLOUD
The basic principle of multi-cloud (data hosting) is to distribute data across multiple clouds to gain enhanced redundancy and to minimize the risk of data loss. There are several cloud data centers belonging to the same or different cloud providers. All the data centers can be accessed by users in a particular region, but the users experience different performance from each. From all the available clouds, a set that meets the performance requirement is chosen for storing data: some data centers' latency is very low, while others' may be intolerably high. The algorithm chooses clouds that offer acceptable throughput and latency when they are not in outage. It also performs load balancing among the clouds when some cloud has a low workload.
2) DATA HOSTING
Data Hosting stores data among multiple clouds using the two widely used redundancy mechanisms . Data hosting depends upon the size and access frequency of the data.
Replication – Replicas are placed in several clouds, and a read access is served by the cloud that charges minimal bandwidth and operation costs.
Erasure coding – Data is divided into m data blocks and encoded into n blocks. The m data blocks and the remaining n-m coding blocks are placed into n different clouds. A read access of a file therefore needs to access the multiple clouds that store the corresponding data blocks.
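Since data hosting depends on the size and access frequency of the data, the choice between the two mechanisms can be sketched as a cost comparison: replication serves every read from the single cheapest cloud, while erasure coding saves storage but spreads reads across m clouds. The replication factor, coding parameters, and prices below are assumptions for illustration:

```java
/** Sketch of the storage-mode decision: hot (frequently read) data favours
 *  replication, cold data favours erasure coding. The replication factor,
 *  (n, m) code, and prices are hypothetical. */
public class StorageModeSwitch {
    enum Mode { REPLICATION, ERASURE_CODING }

    /** Estimate the monthly cost of both modes and pick the cheaper one. */
    static Mode choose(double sizeGB, long readsPerMonth,
                       double storagePrice, double cheapestBw, double avgBw) {
        int replicas = 3;  // assumed replication factor
        int n = 5, m = 3;  // assumed (n, m) erasure code
        // Replication: n full copies; every read served by the cheapest cloud.
        double repl = replicas * sizeGB * storagePrice
                    + readsPerMonth * sizeGB * cheapestBw;
        // Erasure coding: n/m storage overhead; reads hit m clouds, so the
        // effective bandwidth price is an average rather than the minimum.
        double ec = (n / (double) m) * sizeGB * storagePrice
                  + readsPerMonth * sizeGB * avgBw;
        return repl <= ec ? Mode.REPLICATION : Mode.ERASURE_CODING;
    }

    public static void main(String[] args) {
        // hot file: many reads, cheapest-cloud bandwidth wins
        System.out.println(choose(1.0, 1000, 0.02, 0.05, 0.09));
        // cold file: no reads, smaller storage footprint wins
        System.out.println(choose(1.0, 0, 0.02, 0.05, 0.09));
    }
}
```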
The architecture is shown in Figure 3. The whole model is located in the nearby server proxy in Figure 1. There are four main components:
Data Hosting: The data hosting component stores data on multiple clouds using either replication or erasure coding.
Storage Mode Switching (SMS): SMS decides the storage mode of data based on its size and access frequency. The mode may be changed from replication to erasure coding or vice versa; the decision is taken according to the output of the Predictor.
Workload Statistic: This component keeps collecting and analysing access logs, which are used to guide the placement of data. The statistics are sent to the Predictor, which guides storage mode switching.
Predictor: The Predictor forecasts the future access frequency of data stored in the cloud. The prediction interval is set to one month; that is, we use the previous months to predict the access frequency of data in the next month.
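One simple way to realize such a Predictor is an exponentially weighted moving average over the monthly access counts collected by the Workload Statistic. The smoothing factor is our assumption; the report only states that previous months are used to predict the next one:

```java
/** Sketch of a Predictor: exponentially weighted moving average of
 *  monthly access counts. alpha (smoothing factor) is an assumption. */
public class AccessPredictor {

    /** Smooth the history left-to-right; the final value is the forecast
     *  for next month. */
    static double predictNextMonth(long[] monthlyCounts, double alpha) {
        double forecast = monthlyCounts[0];
        for (int i = 1; i < monthlyCounts.length; i++) {
            forecast = alpha * monthlyCounts[i] + (1 - alpha) * forecast;
        }
        return forecast;
    }

    public static void main(String[] args) {
        long[] history = {120, 100, 80, 60}; // a file whose popularity is cooling down
        System.out.printf("%.1f%n", predictNextMonth(history, 0.5));
    }
}
```

A falling forecast like this would prompt the SMS component to switch the file from replication to erasure coding.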
3) CLOUD STORAGE
Cloud storage services have become more and more popular. Because of the importance of storing private data, many cloud storage encryption schemes have been proposed to protect data from unauthorized access. All such schemes assume that cloud storage providers are secure and cannot be hacked. In practice, however, some outside authorities (i.e., coercers) may force cloud storage providers to reveal private or confidential user data, thereby defeating those encryption schemes. Here, we propose a new cloud storage encryption scheme, called deniable encryption, that enables cloud storage providers to create duplicate or fake user secrets to protect user data. Since the authorities cannot tell whether the given secrets are true or not, the cloud storage providers ensure that user privacy is protected. Most encryption schemes assume that cloud storage service providers, or the trusted authorities handling key management, are trusted and cannot be hacked; in practice, however, someone may intercept communications between users and cloud storage providers and then force the providers to release user secrets. In that case the providers must release the secrets, and the data must be assumed known. We therefore build an encryption scheme that helps cloud storage providers avoid this situation: we allow them to create fake user secrets. When such fake secrets are given to outside coercers, the coercers obtain only forged data from a user's stored ciphertext. As long as the outside authorities believe the received secrets are real, they will be satisfied, and the cloud storage providers will not have revealed any real secrets. Therefore, user privacy is preserved. This concept is derived from a special kind of encryption scheme called deniable encryption.
4) FILE UPLOAD
• In the file uploading process, the user selects the file to be uploaded to the cloud and the number of copies to be stored, i.e., how much replication is required to achieve the desired availability.
• The owner uploads files under some access policy; every company has its own. The owner first obtains the public key for uploading a file, and then requests the secret (private) key. Using that key pair, the owner can upload the file.
• While the file is being uploaded, the server reads its size and then selects the best cloud based on storage availability, pricing, the Predictor's output, and the file size.
5) FILE DOWNLOAD
This module helps the client search for a file stored on a particular cloud. Searching is done using the file id and file name. If the file id or name is incorrect, this is treated as unauthorized access and the file is not returned; otherwise the server asks the user for the public key and the user receives the encrypted file. To decrypt the file, the user must hold the secret key; using it, the user can decrypt the file.
ALGORITHM
The key idea of this heuristic data placement algorithm can be described as follows:
Initially, each cloud is assigned a value calculated from four factors:
• Availability
• Storage price
• Bandwidth price
• Operation price
These four factors indicate the preference of a cloud. The algorithm chooses the n most preferred clouds, and then heuristically exchanges clouds in the selected set with clouds in the complementary set to reach a better solution. This concept is similar to the Kernighan-Lin heuristic, which partitions graphs so as to minimize the sum of the costs of all cut edges. The preference of a cloud is influenced by the four factors with different weights: the higher the availability, and the lower the prices, the more preferred the cloud.
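The selection step described above can be sketched as follows. The scoring weights are assumptions of ours; note also that when scores are independent per cloud the initial sort is already optimal, so the exchange pass matters only in the real algorithm, where cost depends on the whole selected set:

```java
import java.util.*;

/** Sketch of the Kernighan-Lin-style selection: start from the n clouds
 *  with the best preference score, then swap a selected cloud with an
 *  unselected one whenever the swap improves the score. The scoring
 *  function and its weights are hypothetical. */
public class CloudSelector {

    /** Weighted preference: higher availability and lower price are better. */
    static double score(double availability, double price) {
        return 2.0 * availability - 1.0 * price; // assumed weights
    }

    static List<Integer> select(double[] avail, double[] price, int n) {
        Integer[] idx = new Integer[avail.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // sort cloud indices by descending preference score
        Arrays.sort(idx, (a, b) ->
            Double.compare(score(avail[b], price[b]), score(avail[a], price[a])));
        List<Integer> selected = new ArrayList<>(Arrays.asList(idx).subList(0, n));
        List<Integer> rest = new ArrayList<>(Arrays.asList(idx).subList(n, idx.length));
        // exchange pass: move a better-scoring unselected cloud into the set
        boolean improved = true;
        while (improved) {
            improved = false;
            for (int i = 0; i < selected.size() && !improved; i++) {
                for (int j = 0; j < rest.size() && !improved; j++) {
                    int s = selected.get(i), r = rest.get(j);
                    if (score(avail[r], price[r]) > score(avail[s], price[s])) {
                        selected.set(i, r);
                        rest.set(j, s);
                        improved = true;
                    }
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        double[] a = {0.99, 0.95, 0.999, 0.90};
        double[] p = {0.09, 0.02, 0.12, 0.01};
        System.out.println(select(a, p, 2));
    }
}
```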
Data placement Algorithm
Setup (n data centers):
    Allocate m blocks to each data center
Schedule:
    Choose a data center based on load
    for k = 1 to n
        check whether the availability of the k-th data center suits µ
        if µ = sflag then
            allocate to data center k
        else
            Ealloc(n, µ)
    end

Ealloc(n, µ)
// Output: the minimum cost C and the set of selected clouds H.
// S – storage
1. C ← ∞
2. H ← {}                     // initially empty
3. Sort the clouds by S + µ   // accessibility
4. for m = 1 to n do
       A ← the availability of G
       if A ≤ Amax then
           Mcost ← minimal cost
           if Mcost < C then
               H ← G
   end
SYSTEM CONFIGURATION
HARDWARE REQUIREMENTS
Hardware – dual core
Speed – 1.3 GHz
RAM – 2GB
Hard Disk – 160 GB
Key Board – Standard Windows Keyboard
Mouse – Two or Three Button Mouse
Monitor – LCD
SOFTWARE REQUIREMENTS
Operating System : Windows
Technology : Java and J2EE
Cloud Server : CloudSim
Java Version : JDK1.7
SEQUENCE DIAGRAM
CLASS DIAGRAM
ACTIVITY DIAGRAM
COLLABORATION DIAGRAM
DATA FLOW DIAGRAM
TEST CASES
1) Login credentials check
Input: Username and Password
If the user provides correct credentials, redirect to the home screen; otherwise redirect back to the login screen.
Output: redirect to home screen
2) File upload
Input: File upload to multi server
If the user uploads a file to the multi-server, the file is stored in the configured server path.
Output: file uploaded successfully
3) File download
Input: File download from multi server
If the user provides the correct secret key, the file is decrypted and then downloaded from the server.
Output: file downloaded after decryption with the secret key
SOFTWARE FEASIBILITY
FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase and business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited, so the expenditures must be justified. The developed system was well within the budget, which was achieved because most of the technologies used are freely available; only the customized products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system must have modest requirements; only minimal or no changes are required to implement it.
SOCIAL FEASIBILITY
The aspect of study is to check the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, instead must accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make him familiar with it. His level of confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system.
FUNCTIONAL REQUIREMENTS:
Functional requirements specify which outputs should be produced from the given inputs; they describe the relationship between the input and output of the system. For each functional requirement, a detailed description of all data inputs, their sources, and the range of valid inputs must be specified.
NON FUNCTIONAL REQUIREMENTS:
Non-functional requirements describe user-visible aspects of the system that are not directly related to its functional behavior. They include quantitative constraints such as response time (i.e., how fast the system reacts to user commands) or accuracy (i.e., how precise the system's numerical answers are).
PSEUDO REQUIREMENTS:
These requirements are imposed by the client and restrict the implementation of the system. Typical pseudo requirements are the implementation language and the platform on which the system is to be implemented. They usually have no direct effect on the user's view of the system.
SYSTEM TESTING
Introduction:
After the development of any computer-based system is finished, the next complicated and time-consuming process is system testing. Only during testing can the development company know how far the user requirements have been met.
Following are the some of the testing methods applied to this effective project:
SOURCE CODE TESTING:
This examines the logic of the system. If we get the output that is required by the user, then we can say that the logic is correct.
SPECIFICATION TESTING:
Specification testing checks what the program should do and how it should perform under various conditions. It is a comparative study of system performance against the system requirements.
MODULE LEVEL TESTING:
Here errors are found at each individual module; this encourages the programmer to find and rectify errors without affecting the other modules.
UNIT TESTING:
Unit testing focuses verification effort on the smallest unit of software, the module. The local data structure is examined to ensure that the data stored temporarily maintains its integrity during all steps in the algorithm's execution. Boundary conditions are tested to ensure that the module operates properly at boundaries established to limit or restrict processing.
INTEGRATION TESTING:
Data can be tested across an interface: one module can have an inadvertent, adverse effect on another. Integration testing is a systematic technique for constructing the program structure while conducting tests to uncover errors associated with interfacing.
VALIDATION TESTING:
It begins after the integration testing is successfully assembled. Validation succeeds when the software functions in a manner that can be reasonably accepted by the client. In this the majority of the validation is done during the data entry operation where there is a maximum possibility of entering wrong data. Other validation will be performed in all process where correct details and data should be entered to get the required results.
RECOVERY TESTING:
Recovery testing forces the software to fail in a variety of ways and verifies that recovery is properly performed. If recovery is automatic, re-initialization and data recovery are each evaluated for correctness.
SECURITY TESTING:
Security testing attempts to verify that the protection mechanisms built into the system will in fact protect it from improper penetration. The tester may attempt to acquire passwords through external clerical means, may attack the system with custom software designed to break down its defenses, and may purposely cause errors.
PERFORMANCE TESTING:
Performance testing is used to test the runtime performance of software within the context of an integrated system. Performance tests are often coupled with stress testing and usually require both hardware and software instrumentation.
BLACKBOX TESTING:
Black-box testing focuses on the functional requirements of software. It enables the tester to derive sets of input conditions that will fully exercise all functional requirements for a program.
Black-box testing attempts to find errors in the following categories:
• Incorrect or missing function
• Interface errors
• Errors in data structures or external database access and performance errors.
OUTPUT TESTING:
After performing the validation testing, the next step is output testing of the proposed system since no system would be termed as useful until it does produce the required output in the specified format. Output format is considered in two ways, the screen format and the printer format.
USER ACCEPTANCE TESTING:
User Acceptance Testing is the key factor for the success of any system. The system under consideration is tested for user acceptance by constantly keeping in touch with prospective system users at the time of developing and making changes whenever required.
SOFTWARE ENVIRONMENT
Java Technology
Java technology is both a programming language and a platform.
The Java Programming Language
The Java programming language is a high-level language that can be characterized by all of the following buzzwords:
Simple
Architecture neutral
Object oriented
Portable
Distributed
High performance
Interpreted
Multithreaded
Robust
Dynamic
Secure
With most programming languages, you either compile or interpret a program so that you can run it on your computer. The Java programming language is unusual in that a program is both compiled and interpreted. With the compiler, first you translate a program into an intermediate language called Java byte codes —the platform-independent codes interpreted by the interpreter on the Java platform. The interpreter parses and runs each Java byte code instruction on the computer. Compilation happens just once; interpretation occurs each time the program is executed. The following figure illustrates how this works.
You can think of Java byte codes as the machine code instructions for the Java Virtual Machine (Java VM). Every Java interpreter, whether it’s a development tool or a Web browser that can run applets, is an implementation of the Java VM. Java byte codes help make “write once, run anywhere” possible. You can compile your program into byte codes on any platform that has a Java compiler. The byte codes can then be run on any implementation of the Java VM. That means that as long as a computer has a Java VM, the same program written in the Java programming language can run on Windows 2000, a Solaris workstation, or on an iMac.
The Java Platform
A platform is the hardware or software environment in which a program runs. We’ve already mentioned some of the most popular platforms like Windows 2000, Linux, Solaris, and MacOS. Most platforms can be described as a combination of the operating system and hardware. The Java platform differs from most other platforms in that it’s a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
• The Java Virtual Machine (Java VM)
• The Java Application Programming Interface (Java API)
You’ve already been introduced to the Java VM. It’s the base for the Java platform and is ported onto various hardware-based platforms.
The Java API is a large collection of ready-made software components that provide many useful capabilities, such as graphical user interface (GUI) widgets. The Java API is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do? Highlights what functionality some of the packages in the Java API provide.
The following figure depicts a program that’s running on the Java platform. As the figure shows, the Java API and the virtual machine insulate the program from the hardware.
Native code is code that after you compile it, the compiled code runs on a specific hardware platform. As a platform-independent environment, the Java platform can be a bit slower than native code. However, smart compilers, well-tuned interpreters, and just-in-time byte code compilers can bring performance close to that of native code without threatening portability.
What Can Java Technology Do?
The most common types of programs written in the Java programming language are applets and applications. If you’ve surfed the Web, you’re probably already familiar with applets. An applet is a program that adheres to certain conventions that allow it to run within a Java-enabled browser.
However, the Java programming language is not just for writing cute, entertaining applets for the Web. The general-purpose, high-level Java programming language is also a powerful software platform. Using the generous API, you can write many types of programs.
An application is a standalone program that runs directly on the Java platform. A special kind of application known as a server serves and supports clients on a network. Examples of servers are Web servers, proxy servers, mail servers, and print servers. Another specialized program is a servlet. A servlet can almost be thought of as an applet that runs on the server side. Java Servlets are a popular choice for building interactive web applications, replacing the use of CGI scripts. Servlets are similar to applets in that they are runtime extensions of applications.
Instead of working in browsers, though, Servlets run within Java Web servers, configuring or tailoring the server.
How does the API support all these kinds of programs? It does so with packages of software components that provides a wide range of functionality. Every full implementation of the Java platform gives you the following features:
• The essentials: Objects, strings, threads, numbers, input and output, data structures, system properties, date and time, and so on.
• Applets: The set of conventions used by applets.
• Networking: URLs, TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) sockets, and IP (Internet Protocol) addresses.
• Internationalization: Help for writing programs that can be localized for users worldwide. Programs can automatically adapt to specific locales and be displayed in the appropriate language.
• Security: Both low level and high level, including electronic signatures, public and private key management, access control, and certificates.
• Software components: Known as JavaBeansTM, can plug into existing component architectures.
• Object serialization: Allows lightweight persistence and communication via Remote Method Invocation (RMI).
• Java Database Connectivity (JDBCTM): Provides uniform access to a wide range of relational databases.
The Java platform also has APIs for 2D and 3D graphics, accessibility, servers, collaboration, telephony, speech, animation, and more. The following figure depicts what is included in the Java 2 SDK.
How Will Java Technology Change My Life?
We can’t promise you fame, fortune, or even a job if you learn the Java programming language. Still, it is likely to make your programs better and requires less effort than other languages. We believe that Java technology will help you do the following:
• Get started quickly: Although the Java programming language is a powerful object-oriented language, it’s easy to learn, especially for programmers already familiar with C or C++.
• Write less code: Comparisons of program metrics (class counts, method counts, and so on) suggest that a program written in the Java programming language can be four times smaller than the same program in C++.
• Write better code: The Java programming language encourages good coding practices, and its garbage collection helps you avoid memory leaks. Its object orientation, its JavaBeans component architecture, and its wide-ranging, easily extendible API let you reuse other people’s tested code and introduce fewer bugs.
• Develop programs more quickly: Your development time may be as much as twice as fast versus writing the same program in C++. Why? You write fewer lines of code and it is a simpler programming language than C++.
• Avoid platform dependencies with 100% Pure Java: You can keep your program portable by avoiding the use of libraries written in other languages. The 100% Pure JavaTM Product Certification Program has a repository of historical process manuals, white papers, brochures, and similar materials online.
• Write once, run anywhere: Because 100% Pure Java programs are compiled into machine-independent byte codes, they run consistently on any Java platform.
• Distribute software more easily: You can upgrade applets easily from a central server. Applets take advantage of the feature of allowing new classes to be loaded “on the fly,” without recompiling the entire program.
SOURCE CODING
/*
* Title: CloudSim Toolkit
* Description: CloudSim (Cloud Simulation) Toolkit for Modeling and Simulation
* of Clouds
* Licence: GPL – http://www.gnu.org/copyleft/gpl.html
*
* Copyright (c) 2009, The University of Melbourne, Australia
*/
package examples.org.cloudbus.cloudsim.examples;
import com.app.Database.DatabaseFile;
import java.sql.ResultSet;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Log;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModel;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerTimeShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.core.SimEntity;
import org.cloudbus.cloudsim.core.SimEvent;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;
/**
* An example showing how to create simulation entities
* (a DatacenterBroker in this example) in run-time using
* a global manager entity (GlobalBroker).
*/
public class CloudSimTest {
public CloudSimTest()
{
}
/** The cloudlet list. */
private static List<Cloudlet> cloudletList;
/** The vmList. */
private static List<Vm> vmList;
private static List<Vm> createVM(int userId, int vms, int idShift,
String vmm, String size1, String ram1, String mips1, String bw1, String pesNumber1) {
//Creates a container to store VMs. This list is passed to the broker later
LinkedList<Vm> list = new LinkedList<Vm>();
//VM Parameters
long size = Long.parseLong(size1); //image size (MB)
int ram = Integer.parseInt(ram1); //vm memory (MB)
int mips = Integer.parseInt(mips1);
long bw = Long.parseLong(bw1);
int pesNumber = Integer.parseInt(pesNumber1); //number of cpus
//create VMs
Vm[] vm = new Vm[vms];
for(int i=0;i<vms;i++){
vm[i] = new Vm(idShift + i, userId, mips, pesNumber, ram, bw, size, vmm, new CloudletSchedulerTimeShared());
list.add(vm[i]);
}
return list;
}
private static List<Cloudlet> createCloudlet(int userId, int cloudlets, int idShift,
String name, String len, String size, String out, String pesnumber){
// Creates a container to store Cloudlets
LinkedList<Cloudlet> list = new LinkedList<Cloudlet>();
//cloudlet parameters
long length = Long.parseLong(len);
long fileSize = Long.parseLong(size);
long outputSize = Long.parseLong(out);
int pesNumber = Integer.parseInt(pesnumber);
UtilizationModel utilizationModel = new UtilizationModelFull();
Cloudlet[] cloudlet = new Cloudlet[cloudlets];
for(int i=0;i<cloudlets;i++){
cloudlet[i] = new Cloudlet(idShift + i, length, pesNumber, fileSize, outputSize, utilizationModel, utilizationModel, utilizationModel);
// setting the owner of these Cloudlets
cloudlet[i].setUserId(userId);
list.add(cloudlet[i]);
}
return list;
}
private static Datacenter createDatacenter(String name){
// Here are the steps needed to create a PowerDatacenter:
// 1. We need to create a list to store one or more
// Machines
List<Host> hostList = new ArrayList<Host>();
// 2. A Machine contains one or more PEs or CPUs/Cores. Therefore, we should
// create a list to store these PEs before creating
// a Machine.
List<Pe> peList1 = new ArrayList<Pe>();
int mips = 1000;
// 3. Create PEs and add these into the list.
//for a quad-core machine, a list of 4 PEs is required:
peList1.add(new Pe(0, new PeProvisionerSimple(mips))); // need to store Pe id and MIPS Rating
peList1.add(new Pe(1, new PeProvisionerSimple(mips)));
peList1.add(new Pe(2, new PeProvisionerSimple(mips)));
peList1.add(new Pe(3, new PeProvisionerSimple(mips)));
//Another list, for a dual-core machine
List<Pe> peList2 = new ArrayList<Pe>();
peList2.add(new Pe(0, new PeProvisionerSimple(mips)));
peList2.add(new Pe(1, new PeProvisionerSimple(mips)));
//4. Create Hosts with its id and list of PEs and add them to the list of machines
int hostId=0;
int ram = 16384; //host memory (MB)
long storage = 1000000; //host storage
int bw = 10000;
hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList1,
new VmSchedulerTimeShared(peList1)
)
); // This is our first machine
hostId++;
hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList2,
new VmSchedulerTimeShared(peList2)
)
); // Second machine
// 5. Create a DatacenterCharacteristics object that stores the
// properties of a data center: architecture, OS, list of
// Machines, allocation policy: time- or space-shared, time zone
// and its price (G$/Pe time unit).
String arch = "x86"; // system architecture
String os = "Linux"; // operating system
String vmm = "Xen"; // virtual machine monitor
double time_zone = 10.0; // time zone this resource located
double cost = 3.0; // the cost of using processing in this resource
double costPerMem = 0.05; // the cost of using memory in this resource
double costPerStorage = 0.1; // the cost of using storage in this resource
double costPerBw = 0.1; // the cost of using bw in this resource
LinkedList<Storage> storageList = new LinkedList<Storage>(); //we are not adding SAN devices by now
DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
arch, os, vmm, hostList, time_zone, cost, costPerMem, costPerStorage, costPerBw);
// 6. Finally, we need to create a PowerDatacenter object.
Datacenter datacenter = null;
try {
datacenter = new Datacenter(name, characteristics, new VmAllocationPolicySimple(hostList), storageList, 0);
} catch (Exception e) {
e.printStackTrace();
}
return datacenter;
}
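The unit prices configured in step 5 (cost per Pe time unit, per MB of RAM, per MB of storage, and per MB of bandwidth) combine additively into a total usage charge. As an illustrative sketch only, using the same numeric values as the `DatacenterCharacteristics` above, a hypothetical helper class (not part of CloudSim) might compute that charge like this:

```java
// Illustrative only: a hypothetical helper (not a CloudSim class) showing how
// the unit prices configured above combine into a total usage charge.
public class UsageCost {
    // Unit prices taken from the DatacenterCharacteristics above.
    static final double COST_PER_PE_TIME = 3.0;  // G$ per Pe time unit
    static final double COST_PER_MEM     = 0.05; // G$ per MB of RAM
    static final double COST_PER_STORAGE = 0.1;  // G$ per MB of storage
    static final double COST_PER_BW      = 0.1;  // G$ per MB transferred

    // Total charge for a given amount of consumed resources.
    static double charge(double peTime, double ramMb, double storageMb, double bwMb) {
        return COST_PER_PE_TIME * peTime
             + COST_PER_MEM * ramMb
             + COST_PER_STORAGE * storageMb
             + COST_PER_BW * bwMb;
    }

    public static void main(String[] args) {
        // e.g. 10 Pe time units, 100 MB RAM, 50 MB storage, 20 MB bandwidth:
        // 3.0*10 + 0.05*100 + 0.1*50 + 0.1*20 = 42.0
        System.out.println(UsageCost.charge(10, 100, 50, 20));
    }
}
```

This mirrors how CloudSim charges for a datacenter's resources and is useful for sanity-checking the pricing parameters before running a simulation.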
//We strongly encourage users to develop their own broker policies, to submit vms and cloudlets according
//to the specific rules of the simulated scenario
private static DatacenterBroker createBroker(String name){
DatacenterBroker broker = null;
try {
broker = new DatacenterBroker(name);
} catch (Exception e) {
e.printStackTrace();
return null;
}
return broker;
}
/**
* Prints the Cloudlet objects
* @param list list of Cloudlets
*/
private static void printCloudletList(List<Cloudlet> list) {
int size = list.size();
Cloudlet cloudlet;
String indent = " ";
Log.printLine();
Log.printLine("========== OUTPUT ==========");
Log.printLine("Cloudlet ID" + indent + "STATUS" + indent +
"Data center ID" + indent + "VM ID" + indent + indent + "Time" + indent + "Start Time" + indent + "Finish Time");
DecimalFormat dft = new DecimalFormat("###.##");
for (int i = 0; i < size; i++) {
cloudlet = list.get(i);
Log.print(indent + cloudlet.getCloudletId() + indent + indent);
if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS){
Log.print("SUCCESS");
Log.printLine( indent + indent + cloudlet.getResourceId() + indent + indent + indent + cloudlet.getVmId() +
indent + indent + indent + dft.format(cloudlet.getActualCPUTime()) +
indent + indent + dft.format(cloudlet.getExecStartTime()) + indent + indent + indent + dft.format(cloudlet.getFinishTime()));
}
}
}
public void createCloudSim()
{
Log.printLine("Starting CloudSim...");
DatabaseFile objDatabaseFile = new DatabaseFile();
try
{
// First step: Initialize the CloudSim package. It should be called
// before creating any entities.
String sql = " Select count(*) as val "
+ " FROM consumerdetails ";
ResultSet rs = objDatabaseFile.codeselect(sql);
int num_user = 0; // number of grid users
while(rs.next())
{
num_user = rs.getInt("val");
}
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false; // trace events
// Initialize the CloudSim library
CloudSim.init(num_user, calendar, trace_flag);
GlobalBroker globalBroker = new GlobalBroker("GlobalBroker");
// Second step: Create Datacenters
// Datacenters are the resource providers in CloudSim. We need at least one of them to run a CloudSim simulation
String sql1 = " SELECT DataCenterId, DataCenterName "
+ " FROM datacenterdetails ";
ResultSet rs5 = objDatabaseFile.codeselect(sql1);
while(rs5.next())
{
@SuppressWarnings("unused")
Datacenter datacenter_0 = createDatacenter(rs5.getString("DataCenterName"));
}
String sql2 = " SELECT name "
+ " FROM brokerdetails ";
ResultSet rs2 = objDatabaseFile.codeselect(sql2);
DatacenterBroker broker = null;
while(rs2.next())
{
//Third step: Create Broker
broker = createBroker(rs2.getString(1));
int brokerId = broker.getId();
String sql3 = " SELECT VM_Id, VM_Name, VM_Image_Size, VM_Ram, VM_mips, VM_bw, VM_pesNumber "
+ " FROM vmdetails ";
ResultSet rs3 = objDatabaseFile.codeselect(sql3);
while(rs3.next())
{
//Fourth step: Create VMs and Cloudlets and send them to broker
vmList = createVM(
brokerId,
rs3.getInt("VM_Id"),
rs3.getInt("VM_Id")-1,
rs3.getString("VM_Name"),
rs3.getString("VM_Image_Size"),
rs3.getString("VM_Ram"),
rs3.getString("VM_mips"),
rs3.getString("VM_bw"),
rs3.getString("VM_pesNumber")
); // create the VMs described in the database
}
String sql6 = " SELECT Cloudlets_Id, Cloudlets_Name, Cloudlets_Length, Cloudlets_FileSize, Cloudlets_OutputSize, Cloudlets_pesNumber "
+ " FROM cloudlets ";
ResultSet rs6 = objDatabaseFile.codeselect(sql6);
while(rs6.next())
{
cloudletList = createCloudlet(
brokerId,
rs6.getInt("Cloudlets_Id"),
rs6.getInt("Cloudlets_Id")-2,
rs6.getString("Cloudlets_Name"),
rs6.getString("Cloudlets_Length"),
rs6.getString("Cloudlets_FileSize"),
rs6.getString("Cloudlets_OutputSize"),
rs6.getString("Cloudlets_pesNumber")
); // create the cloudlets described in the database
}
broker.submitVmList(vmList);
broker.submitCloudletList(cloudletList);
}
// Fifth step: Starts the simulation
CloudSim.startSimulation();
// Final step: Print results when simulation is over
List<Cloudlet> newList = broker.getCloudletReceivedList();
newList.addAll(globalBroker.getBroker().getCloudletReceivedList());
CloudSim.stopSimulation();
printCloudletList(newList);
Log.printLine("CloudSim finished!");
}
catch (Exception e)
{
e.printStackTrace();
Log.printLine("The simulation has been terminated due to an unexpected error");
}
Log.printLine("--------------------------------------------------------------------------------");
}
public static class GlobalBroker extends SimEntity {
private static final int CREATE_BROKER = 0;
private List<Vm> vmList;
private List<Cloudlet> cloudletList;
private DatacenterBroker broker;
public GlobalBroker(String name) {
super(name);
}
@Override
public void processEvent(SimEvent ev) {
switch (ev.getTag()) {
case CREATE_BROKER:
setBroker(createBroker(super.getName()+"_"));
try
{
String sql3 = " SELECT VM_Id, VM_Name, VM_Image_Size, VM_Ram, VM_mips, VM_bw, VM_pesNumber "
+ " FROM vmdetails ";
DatabaseFile objDatabaseFile = new DatabaseFile();
ResultSet rs4 = objDatabaseFile.codeselect(sql3);
while(rs4.next())
{
//Create VMs and Cloudlets and send them to broker
setVmList(createVM( getBroker().getId(),
rs4.getInt("VM_Id"),
rs4.getInt("VM_Id")-1,
rs4.getString("VM_Name"),
rs4.getString("VM_Image_Size"),
rs4.getString("VM_Ram"),
rs4.getString("VM_mips"),
rs4.getString("VM_bw"),
rs4.getString("VM_pesNumber")
)
); // create the VMs described in the database
}
String sql6 = " SELECT Cloudlets_Id, Cloudlets_Name, Cloudlets_Length, Cloudlets_FileSize, Cloudlets_OutputSize, Cloudlets_pesNumber "
+ " FROM cloudlets ";
ResultSet rs6 = objDatabaseFile.codeselect(sql6);
while(rs6.next())
{
setCloudletList(createCloudlet(
getBroker().getId(),
rs6.getInt("Cloudlets_Id"),
rs6.getInt("Cloudlets_Id")-2,
rs6.getString("Cloudlets_Name"),
rs6.getString("Cloudlets_Length"),
rs6.getString("Cloudlets_FileSize"),
rs6.getString("Cloudlets_OutputSize"),
rs6.getString("Cloudlets_pesNumber")
)
); // create the cloudlets described in the database
}
}
catch (Exception ex)
{
ex.printStackTrace();
}
broker.submitVmList(getVmList());
broker.submitCloudletList(getCloudletList());
CloudSim.resumeSimulation();
break;
default:
Log.printLine(getName() + ": unknown event type");
break;
}
}
@Override
public void startEntity() {
Log.printLine(super.getName() + " is starting...");
schedule(getId(), 200, CREATE_BROKER);
}
@Override
public void shutdownEntity() {
}
public List<Vm> getVmList() {
return vmList;
}
protected void setVmList(List<Vm> vmList) {
this.vmList = vmList;
}
public List<Cloudlet> getCloudletList() {
return cloudletList;
}
protected void setCloudletList(List<Cloudlet> cloudletList) {
this.cloudletList = cloudletList;
}
public DatacenterBroker getBroker() {
return broker;
}
protected void setBroker(DatacenterBroker broker) {
this.broker = broker;
}
}
}
Screenshot
Conclusion
Cloud services are experiencing rapid development, and services based on multiple clouds are also becoming prevalent. One of the major concerns when moving services into clouds is capital expenditure. In this paper, we therefore design a novel storage scheme, CHARM, which guides customers in distributing data among clouds cost-effectively. CHARM makes fine-grained decisions about which storage mode to use and which clouds to place data in. Our evaluation demonstrates the efficiency of CHARM.
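CHARM's selection algorithm is not listed in this section. As an illustrative sketch only, and not CHARM itself, the underlying idea of replicating across the cheapest clouds until a target availability is met can be expressed as a toy greedy placement. All cloud names, prices, and availability figures below are hypothetical; with independent failures, placing one replica in each chosen cloud yields a combined availability of 1 - prod(1 - a_i):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a toy greedy placement in the spirit of multi-cloud
// replication. All names and numbers are hypothetical; this is not CHARM.
public class GreedyPlacement {
    // A cloud with a hypothetical per-GB storage price and availability.
    record Cloud(String name, double pricePerGb, double availability) {}

    // Pick the cheapest clouds, one replica each, until the combined
    // availability 1 - prod(1 - a_i) reaches the target.
    static List<Cloud> choose(List<Cloud> clouds, double targetAvailability) {
        List<Cloud> sorted = new ArrayList<>(clouds);
        sorted.sort(Comparator.comparingDouble(Cloud::pricePerGb));
        List<Cloud> chosen = new ArrayList<>();
        double unavailability = 1.0;
        for (Cloud c : sorted) {
            chosen.add(c);
            unavailability *= (1.0 - c.availability());
            if (1.0 - unavailability >= targetAvailability) break;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Cloud> clouds = List.of(
            new Cloud("cloudA", 0.020, 0.99),
            new Cloud("cloudB", 0.023, 0.99),
            new Cloud("cloudC", 0.030, 0.999));
        // Two replicas at 0.99 each give about 1 - 0.01^2 = 0.9999 combined,
        // exceeding the 0.999 target, so only the two cheapest clouds are used.
        for (Cloud c : choose(clouds, 0.999)) System.out.println(c.name());
    }
}
```

A full scheme would additionally weigh read/write traffic and erasure coding against plain replication when choosing the storage mode; this sketch covers only the replication case.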