VERIFICATION OF PROPERTIES IN WEB SERVICE MINING

CHAPTER – 1
FUNDAMENTALS OF WEB SERVICE MINING
1.1 INTRODUCTION
Web services today are created and updated on the fly. The growing popularity of the Web and of Web services has opened a relatively new area of research: Web service mining. It is a process aimed at discovering interesting and useful compositions of existing Web services. This approach is most useful where the primary goals of a composition are unavailable or unknown. It differs from conventional top-down approaches, which are driven by specific criteria. Web service mining does not assume any prior knowledge; instead of searching for specific compositions, it relies on the component services to explore the data, and the result may vary from a simple composition to a complex one. Although the Web was invented in the early 1990s by Tim Berners-Lee for information sharing among scientists [1], it became available for public information sharing only after 1993, owing to the Web browser Mosaic. Mosaic quickly captured the imagination of the public through its graphical user interface. Governmental organizations and business establishments realized the potential of the Web and took advantage of its popularity by sharing their data and applications on it. The Web, which started as a repository of information in the form of text and images, evolved into a host for multimedia content and for service-providing applications such as map finding, weather reporting and e-commerce. Real-time applications involved hardware devices such as temperature sensors and traffic monitoring cameras.
Businesses and governments integrated existing Web applications to provide new value-added services. Applications required customized interfaces to access this data, and the lack of semantics in the data made integrating the applications an overwhelming challenge. Attempts to solve these problems resulted in the underlying enabling technologies. Much of the research is driven by cross-enterprise workflow and AI planning. This thesis presents a novel approach to Web service mining by introducing a Web service. A set of algorithms and a conceptual but novel framework for service compositions are also introduced. The effectiveness of the framework and the challenges it addresses are discussed. The thesis also discusses transactional approaches [2].
1.2 WEB SERVICE MINING
Definition of Web Service Mining: “Given a set of Web services WS, a set of composability rules CR, and a set of interestingness measures IM for a user UR, a composite Web service is derived by applying a subset of the rules in CR to a subset of the Web services in WS, such that it exhibits values for the subset of interestingness measures in IM selected by the user UR.” The W3C defines a Web service as an application or software component identified by a URI, whose interfaces and bindings are described in XML, whose definition can be discovered by other Web services, and which can interact directly with other Web services using XML and Internet protocols. Three major standardization initiatives were proposed to support interactions between Web services:
SOAP (a transport protocol allowing the exchange of XML documents in a distributed and decentralized environment), UDDI (a specification defining mechanisms that make it possible to publish services and to discover and interact with other services on the Web) and WSDL (a document describing a Web service, its location and the way to call the service). Web service mining requires that the Web services and the subject of mining be clearly defined, so that unambiguous items are targeted. A Web service mining model provides an effective means of describing related Web service concepts and their relationships. The key concepts of Web service mining can be categorized as follows:
• Message. Web services communicate with each other by exchanging XML messages. A message may contain one or more parameters, and each parameter has a value of a certain data type. Messages are used to send parameters to a service operation or to return results from a service operation.
• Condition. A condition specifies what must hold for an operation to be activated. Conditions also describe the state of the information after execution and can be related to the parameters used for retrieving information.
• Domain. Web service operations can be categorized by areas of interest called domains. A domain carries a description of its purpose and its functionalities, for example travel, entertainment, healthcare or drug design.
• Locale of Interest. A locale of interest is a semantic restriction describing the applicability of an interface, for example a restriction to a regional or geographical boundary.
• Quality of Web Service. Quality of Web service evaluates an operation provided by a Web service against quality attributes. The quality attributes of a Web service operation can be organized into three categories: runtime, business and security.
• Operation Interface. An operation interface specifies a Web service capability with a name, purpose, domain, locale of interest, quality attributes and conditions. There are four operation modes:
o One-way – the operation receives a message and returns no results.
o Notification – the operation sends results without receiving a message.
o Request-response – the operation receives a message and returns results.
o Solicit-response – the operation sends results and waits for a message.
• Operation. An operation is a specific implementation of an operation interface by a Web service. An operation can also exhibit values for the quality attributes.
• Web Service. A Web service is defined by a tuple (Name, Description, Operations, NFP), where:
o Name: the name of the Web service;
o Description: a text summary of the service capabilities;
o Operations: the set of capabilities provided by the Web service;
o NFP: the non-functional properties that describe the Web service.
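The concepts listed above can be made concrete with a small data model. The following is only a minimal sketch: the class and field names (OperationMode, Message, Operation, WebService, operations_in_domain) and the sample weather service are assumptions chosen for illustration and are not part of any Web service standard or of the framework proposed in this thesis.

```python
from dataclasses import dataclass, field
from enum import Enum


class OperationMode(Enum):
    """The four operation modes of an operation interface."""
    ONE_WAY = "one-way"
    NOTIFICATION = "notification"
    REQUEST_RESPONSE = "request-response"
    SOLICIT_RESPONSE = "solicit-response"


@dataclass
class Message:
    # Maps a parameter name to the name of its data type.
    parameters: dict = field(default_factory=dict)


@dataclass
class Operation:
    name: str
    mode: OperationMode
    domain: str
    input: Message = field(default_factory=Message)
    output: Message = field(default_factory=Message)
    quality: dict = field(default_factory=dict)  # e.g. runtime, business, security


@dataclass
class WebService:
    """The tuple (Name, Description, Operations, NFP)."""
    name: str
    description: str
    operations: list = field(default_factory=list)
    nfp: dict = field(default_factory=dict)

    def operations_in_domain(self, domain):
        """Return the operations that belong to the given domain."""
        return [op for op in self.operations if op.domain == domain]


# Hypothetical example service with a single request-response operation.
weather = WebService(
    name="WeatherReport",
    description="Reports the current weather for a city",
    operations=[
        Operation(
            name="getWeather",
            mode=OperationMode.REQUEST_RESPONSE,
            domain="weather",
            input=Message({"city": "string"}),
            output=Message({"temperature": "float"}),
        )
    ],
    nfp={"provider": "example"},
)
```

Grouping operations by domain, as operations_in_domain does, is one simple way in which a mining process could narrow the set of candidate operations before checking composability.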
A simple web service model is shown in Fig. 1
Fig. 1 A simple web service model
1.3 WEB SERVICE MINING ARCHITECTURE
Web services are self-describing, self-contained, modular applications that are published and invoked on the Web. Many organizations have implemented their core business applications on the Internet. The ability to select or integrate heterogeneous services on the Web efficiently and effectively at runtime is an important step towards the development of any Web service application. The ease of application integration and the use of ontologies have contributed to the popularity of Web service composition. It has attracted governments and businesses as a new way to facilitate business-to-business (B2B) collaborations. A Web service composition aims at providing value-added services: the application takes advantage of existing services and assembles them to meet specific service requirements. Two different types of approaches helped standardize compositions. The business world targeted the earlier Web service paradigm and developed a number of XML-based standards (WSFL [3], WSCI [4], XLANG [5] and BPEL4WS [6]) that focus on formalizing Web service specifications, flow composition and execution based on syntactical characteristics [7]. More recently, the Semantic Web research community focused on the concept of Semantic Web services and developed complementary standards such as DAML-S and OWL-S [8]. The standards complement each other and take a top-down approach: the user provides search criteria and defines the exact service functionality required. For example, the composition of a travel service may require flight or train bookings, car hire and hotel reservations, and the search may reflect the user's interest in and knowledge of the service. A simple Web service architecture is shown in Fig. 2.
Fig. 2. A Simple Web Service Architecture
Another technology is the use of ontologies to describe the content of the application processes [9]. Ontologies have their roots in the AI community, where they were used to facilitate knowledge sharing and reuse. An ontology allows a shared conceptualization to be specified in an explicit, readable format, resulting in a standard interpretation. Ontologies help build a vocabulary of concepts that applications can use to describe the content they generate unambiguously and to interpret the content they receive. Empowering Web services with semantics through ontologies has brought about the next generation of Web services, called Semantic Web services. The semantic deployment of independently developed applications allows them to interoperate at both the application interface level and the information semantics level. Integration for value-added services becomes an easier and less expensive task than with earlier approaches relying on technologies such as EDI, CORBA and EJB. With the advent of Semantic Web services, the W3C had to revise its standard Web service architecture [10].
The discovery service first obtains the Web Service Description (WSD) of the provider in WSDL [11], together with its associated Functional Description (FD). The functional description describes the functionality of the provider service in a machine-processable form (such as RDF [12], DAML-S or OWL-S [13]). The requester supplies the discovery service with criteria for selecting a WSD based on its associated FD. The next step ensures, through the use of ontologies, that both the requester and the provider agree on the semantics of the interaction, as shown in Fig. 3.
Fig. 3. Semantic Web Service Architecture
1.4 WEB SERVICE COMPOSITION APPROACHES
Data mining is the extraction of interesting information from data in databases. It involves sifting through databases with computer programs to find patterns in the data. Data mining receives attention because of the voluminous data now available: enterprises maintain terabytes of data owing to the availability of cheap storage media [14]. Data mining techniques classify data, detect anomalies and predict data. They search for consistent patterns and relationships between variables [15]. The results are then applied to new data sets for validation. Data mining techniques have been used for weather forecasting [16], stock prediction [17], sports analysis [18], medical diagnosis [19] and fraud detection [20]. Data mining is similar to Web service mining in that both extract previously unknown and useful information. However, Web services encapsulate behaviors through a set of dynamic operations, and traditional data mining techniques cannot be used to determine the composability of two services. Service composition can follow two fundamentally different strategies. The first is the top-down strategy, which, like traditional Web service composition approaches, requires specific goal-based search criteria to start the composition process. Since the user provides the goal, the type of composition is anticipated by the user, and evaluating the interestingness of a composition is not a major concern in such approaches. The second is the bottom-up strategy, which is driven by the need to find interesting and potentially useful compositions of existing Web services without using specific search criteria (Web service mining). Web service mining techniques therefore need to address the interestingness of service compositions. Semantics-based composition approaches use ontologies to advantage when describing Web services.
The use of ontologies allows a computer to interpret the semantics of a Web service unambiguously. The Web Ontology Language for Services (OWL-S) [21] is a service ontology based on the DARPA Agent Markup Language plus Ontology Inference Layer (DAML+OIL). The service ontology aims at enabling Web service composition planning, reasoning and the automatic use of services by software agents. Fig. 4 shows the elements of a Web composition model.
Fig. 4. Web Service Composition Model
A service profile is an abstract description of what a service can do. It includes the inputs required by the service, the outputs of the service, the preconditions and the effects of the execution, collectively known as the IOPEs of the service. The service model describes the behavior of the service as a set of sub-processes and uses a process graph. Atomic processes have no sub-processes and are directly invocable. Simple processes also have no sub-processes but are not directly invocable. Composite processes have sub-processes linked by control constructs such as sequence, split, choice, iteration and if-then-else. The service grounding describes how a Web service is accessed, including descriptions of message formatting, transport mechanisms, protocols and serialization. The more recent Web Service Modeling Language (WSML) [22] is based on the Web Service Modeling Ontology (WSMO) [23]. The core language uses description logics and logic programming. WSML caters for users who are not familiar with formal logic by distinguishing between conceptual and logical modeling. Like OWL-S, WSMO declares the inputs, outputs, pre-conditions and results associated with services, but it does not provide a notation for building composite processes in terms of control flow and data flow. Instead, it focuses on the specification of internal and external choreography using an approach based on abstract state machines. Web service mining involves several levels, owing to the complexity and richness of the composition model. For example, the Web service WSDL interface level is the set of operations offered by the Web service, and the Web service abstract process level is the execution order logic between the interfaces. The Web service choreography level defines the interactions exchanged within a given choreography, and the orchestration model of a composite Web service is the executable process that implements it.
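The IOPEs of a service profile suggest a simple syntactic test for whether one service can be composed directly after another. The sketch below is an illustration only, under the assumption that inputs, outputs, preconditions and effects are plain string labels compared by equality; OWL-S itself relies on ontology-based reasoning rather than string matching. The names ServiceProfile and can_follow and the two travel services are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    """IOPEs of a service: inputs, outputs, preconditions and effects."""
    name: str
    inputs: frozenset = frozenset()
    outputs: frozenset = frozenset()
    preconditions: frozenset = frozenset()
    effects: frozenset = frozenset()


def can_follow(first, second, known=frozenset(), state=frozenset()):
    """Return True if `second` can be executed directly after `first`.

    `known` holds data already available to the requester and `state`
    holds facts that already hold before `first` is executed.
    """
    available = known | first.outputs   # data on hand once `first` has run
    facts = state | first.effects       # facts that hold once `first` has run
    return second.inputs <= available and second.preconditions <= facts


# Hypothetical travel services.
book_flight = ServiceProfile(
    name="BookFlight",
    inputs=frozenset({"origin", "destination", "date"}),
    outputs=frozenset({"arrival_city", "arrival_date"}),
    effects=frozenset({"flight_booked"}),
)
book_hotel = ServiceProfile(
    name="BookHotel",
    inputs=frozenset({"arrival_city", "arrival_date"}),
    outputs=frozenset({"hotel_confirmation"}),
    preconditions=frozenset({"flight_booked"}),
)

flight_then_hotel = can_follow(book_flight, book_hotel)
hotel_then_flight = can_follow(book_hotel, book_flight)
```

Here the hotel booking can follow the flight booking because every input and precondition of the hotel service is supplied by the flight service, whereas the reverse order fails because the hotel service produces none of the inputs of the flight service.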
Dustdar [24] relied on analyzing log data of Web service executions to discover process workflow instances in these services. Identifying interesting workflows is difficult when such logs are absent or when the component Web services are at an introductory stage. Web service logging is the gathering of relevant Web data to be analyzed for useful information about Web service behaviors. Logging can produce the rich data needed for implementing additional features. Web service logging can be trivial or advanced. Trivial logging relies on a set of existing solutions for capturing Web service logs. Web service logs can be collected from two main sources, the Web server and the Web client, where they correspond to the data recorded by the software systems on each side. Collection is achieved by enabling the logging facilities of the Web server. Web Usage Mining (WUM) research [25] describes this as the most common means of Web log collection, since server logs are stored in the Common Log Format [26] or the more recent Combined Log Format [27]. Most Web servers support the Common Log Format as a default option. The log tracks different elements of the Web transactions: each request is recorded as a line of text, with the elements of the request separated by spaces and items that were not sent represented by a hyphen. The log, conceived originally for administrative purposes, stores data as sequential strings containing the requestor's IP address, the user identification, the authenticated user name, the timestamp, the request method, the request status code, the size of the data sent (number of bytes) and the user agent. For example:
127.0.0.1 - - [02/Apr/2013:9:50:11 +0100] "POST /MMS-Server/services/Document Delivery by DHL HTTP/1.0" 500 819 "-" "MMS-Server/1.1"
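A log line of this kind can be split into its elements with a regular expression. The sketch below is a minimal illustration rather than a complete log parser: the pattern covers the Common Log Format fields and treats the two Combined Log Format fields (referrer and user agent) as optional. The sample line is a simplified, hypothetical variant of the entry above, and the names LOG_PATTERN and parse_log_line are assumptions.

```python
import re

# One request per line; fields are separated by spaces.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>.+?) (?P<protocol>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return the fields of a log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A hyphen means that no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical sample line (service path shortened for illustration).
sample = (
    '127.0.0.1 - - [02/Apr/2013:09:50:11 +0100] '
    '"POST /MMS-Server/services/DocumentDelivery HTTP/1.0" 500 819 '
    '"-" "MMS-Server/1.1"'
)
entry = parse_log_line(sample)
failed = entry["status"] >= 500  # server-side failure of the service call
```

Once the lines are parsed into records, failed invocations (status codes of 500 and above) and per-service request counts can be extracted as input for mining service behavior.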
1.5 BACKGROUND LITERATURE SURVEY
Researchers have addressed the problem of interleaving Web service discovery and composition by considering simple workflows of Web services with one input and one output parameter [28]. The Web service composition is restricted to a sequence of a limited number of Web services, corresponding to a linear workflow. The suggested solution retrieves a sequence of causal links between Web services. Aiming to generate a composite service plan from existing services, a composition path was proposed in [29]. The path consists of a sequence of operators that compute data and connectors that transport data between operators. The search for operators to construct a sequence is based on the shortest path algorithm. Only two kinds of services, operators and connectors, each with one input and one output parameter, are considered, in contrast to the models proposed with more than one input and output parameter [30, 31]. A composition of services as a directed graph, with nodes linked by matching compatibility between input and output parameters, is considered in [32]. The shortest sequence of Web services is derived from the graph. The sequence corresponds to an ordered set of Web services that produces all the expected output parameters from the inputs given by a user. Semantic Web service compositions are performed by pre-computing the causal link matrix in [33]. The composition strategy is based on AI planning: it performs a regression-based search and returns a set of correct, complete and consistent plans. The services are actions semantically linked by causal links.
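The idea of deriving a shortest sequence of Web services by matching output parameters to input parameters can be illustrated with a small search. The sketch below uses a plain breadth-first search over sets of available parameters; it is an assumption made for illustration and not the algorithm of [29] or [32], whose details are not reproduced here. The function name shortest_composition and the sample services are hypothetical.

```python
from collections import deque


def shortest_composition(services, provided, wanted):
    """Find a shortest sequence of services that yields all wanted parameters.

    services: dict mapping a service name to (input names, output names).
    provided: parameter names supplied by the user.
    wanted:   parameter names the user expects as output.
    Returns a list of service names, or None if no composition exists.
    """
    start = frozenset(provided)
    goal = frozenset(wanted)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        available, plan = queue.popleft()
        if goal <= available:
            return plan
        for name, (inputs, outputs) in services.items():
            # A service is applicable only when all its inputs are available.
            if name in plan or not set(inputs) <= available:
                continue
            reached = available | frozenset(outputs)
            if reached not in seen:
                seen.add(reached)
                queue.append((reached, plan + [name]))
    return None


# Hypothetical service descriptions: name -> (inputs, outputs).
services = {
    "geocode": ({"address"}, {"coordinates"}),
    "weather": ({"coordinates"}, {"forecast"}),
    "traffic": ({"coordinates"}, {"traffic_report"}),
    "zip_lookup": ({"address"}, {"zip"}),
}

plan = shortest_composition(services, {"address"}, {"forecast"})
no_plan = shortest_composition(services, {"address"}, {"stock_price"})
```

Because breadth-first search explores plans in order of increasing length, the first plan that covers every wanted parameter is a shortest one. Note that this matching is purely syntactic and ignores non-functional properties.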
However, the graph-based and causal-link approaches compute the best composition from the semantic similarity of the output and input parameters of Web services and do not consider any non-functional properties [34]. A modelling tool called interface automata was introduced to represent Web services and perform compositions; atomic services are stored as a graph in which each node represents input and output parameters while the edges represent Web services. Each service has a description of its inputs, outputs and dependencies on other Web services. The service descriptions and the graph are used to discover composition results that satisfy a service request. A composer that supports the end user in selecting Web services for each activity in the composition and in creating flow specifications to link them is introduced in [35]. After a Web service is selected, the Web services whose outputs can be fed as input to the selected service are presented, based on their profile descriptions. The user can manually select the service that fits a particular activity, and the system generates a composite process in DAML-S. The composition is executed by calling each service separately and passing the results between services according to the flow specifications. Several standardization and prototyping efforts have been undertaken in Web service composition [36]. Composition-related approaches can be grouped into two categories: business process-oriented and semantics-based [37]. The Petri-net approach [38] graphically represents a process as a connected graph in which one kind of node (places) represents states and the other kind (transitions) represents operations. One token in every place connected to an operation enables the operation.
The operation may then remove one token from every input place and deposit a token in every output place. At any given time, a service can be in one of the following states: not instantiated, ready, running, suspended or completed. After each service is defined, a variety of compositions can be defined using sequence, alternatives, iteration and so on. A process can also be analyzed in many ways using Petri-nets, owing to the abundance of analysis techniques [39, 40, 41]. For instance, Petri-nets can be used to determine the presence of livelocks. Algebraic process composition models processes on the basis of the π-calculus [42], in which the basic entity can be one of the following: an empty process, a choice between I/O operations, a parallel composition, a recursive definition or an invocation. I/O operations can receive or send. IBM’s Web Services Flow Language (WSFL) [43] is an XML-based language for describing Web service compositions. WSFL is based on Petri-nets and provides two models, the flow model and the global model. The flow model aims at specifying the logic of a business process. It uses a directed graph to model the sequence of the functionality provided by a composed service and to control the flow of control and data between component services. Each node in the graph is an activity and represents a step towards the business goal the composition tries to achieve. Two types of links are used. The first type, control links, connects activities and prescribes their order. The second type, data links, represents the flow of data between activities. The global model aims at defining the mutual exploitation of Web services in decentralized or distributed business processes. Since no execution sequence is specified, it relies on plug links to represent interactions. WSFL also aims at supporting recursive compositions of services.
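The enabling and firing rule of the Petri-net approach described at the start of this passage can be sketched in a few lines. This is a minimal illustration that assumes an ordinary net (every arc carries exactly one token); the class PetriNet and the transition names instantiate, start and finish are hypothetical, while the place names follow the service states listed above.

```python
class PetriNet:
    """A minimal place/transition net in which every arc has weight one."""

    def __init__(self, marking):
        self.marking = dict(marking)  # place name -> number of tokens
        self.transitions = {}         # transition name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds a token."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) >= 1 for place in inputs)

    def fire(self, name):
        """Remove one token from each input place and add one to each output place."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1


# Life cycle of a single service modelled as places.
net = PetriNet({"not_instantiated": 1})
net.add_transition("instantiate", ["not_instantiated"], ["ready"])
net.add_transition("start", ["ready"], ["running"])
net.add_transition("finish", ["running"], ["completed"])

net.fire("instantiate")
net.fire("start")
```

After the two firings the single token sits in the place "running", so only the transition "finish" is enabled. Analysis techniques for Petri-nets operate on exactly this kind of marking and firing structure.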
In WSFL, every Web service composition (flow or global) can itself become a new Web service and serve as a component of new compositions. BEA Systems’ Web Service Choreography Interface (WSCI) [44] is an XML-based interface description language that describes the choreography of Web service operations in the context of a message exchange between participating Web services. WSCI describes how the choreography should expose relevant information such as message correlation, exception handling and descriptions of transactions. Message correlation is achieved by associating exchanged messages with correlation properties that identify a conversation. Exceptions occur on receipt of an out-of-context message, on a fault or on a timeout. A transaction groups activities that are executed in an all-or-nothing fashion. Activities can be atomic or complex (recursively composed of other activities). The choreography describes the logical dependencies between activities. Microsoft’s XLANG [45] is based on the π-calculus [46] and, at the intra-service level, extends a WSDL service description with a behavioral element. A behavior defines the list of actions belonging to the service and the order in which the actions are performed. XLANG supports two types of actions: WSDL operations and XLANG-specific actions, which include exceptions and deadline- or duration-based timeouts. Transactions are also supported in XLANG. At the inter-service level, XLANG details the connections between the service ports used to join individual service descriptions. The incompatibility between WSFL and WSCI/XLANG resulted in the development of the Business Process Execution Language for Web Services (BPEL4WS) [47]. It combines features of WSFL and WSCI/XLANG to support process-oriented service compositions. A process is composed of activities, and the execution of a process might encounter exceptions.
Message correlation and transactions (as in WSCI and XLANG) are supported. BPEL4WS has several implementations for both the J2EE and .NET platforms, including IBM WebSphere [48], Oracle BPEL Process Manager [49], Microsoft BizTalk [50], OpenStorm Service Orchestrator [51] and ActiveBPEL [52]. The Business Transaction Protocol (BTP) [53] is designed to support interactions that cross application and administrative boundaries. The Business Process Modeling Language (BPML) [54] shares the same roots as WSCI and BPEL4WS. It uses WSCI for expressing public interfaces and choreographies and provides advanced process model semantics such as nested processes and complex compensated transactions. Electronic Business XML (ebXML) [55] is an international initiative by the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) and the Organization for the Advancement of Structured Information Standards (OASIS). ebXML defines standard business processes and trading agreements among different organizations. Its vocabulary consists of a process specification document describing the activities of the parties in an ebXML interaction, a collaboration protocol profile describing an organization’s profile, and a collaborative partner agreement representing an agreement between partners. It includes an ebXML registry that stores important information about businesses along with the products and services they offer. ebXML registries have an advantage over UDDI registries in that they allow SQL-based queries on keywords. A framework composed of a multilayered architecture and a transactional model was presented in [56]. The standards BTP [57], WS-AtomicTransaction [57] and WS-BusinessActivity [58] define transaction protocols between composed services. A transaction management model based on the concepts of tentative hold and compensation was presented in [59].
Web service initiatives were announced by IBM, HP, Sun and Microsoft in the year 2000. The initiatives included IBM’s Web services, Sun’s Open Network Environment (ONE), HP’s e-speak and Microsoft’s .NET. The World Wide Web Consortium (W3C) published the specification of a Web service [60], which defines a Web service as a Web application whose functionalities can be accessed programmatically through a set of homogeneous interfaces. Table 1 compares different Web composition technologies.
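The all-or-nothing behavior of a transaction over composed services is commonly approximated by compensation: each completed activity registers an action that undoes it, and a failure triggers the registered actions in reverse order. The sketch below illustrates only this general idea under simplifying assumptions; it does not implement BTP, WS-AtomicTransaction, WS-BusinessActivity or the model of [59], and the function run_with_compensation and the booking activities are hypothetical.

```python
def run_with_compensation(steps):
    """Run activities in order; on failure, compensate the completed ones.

    steps: list of (action, compensation) pairs of zero-argument callables.
    Returns True if every action succeeded, otherwise False.
    """
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        # Undo the completed activities in reverse order.
        for compensate in reversed(completed):
            compensate()
        return False
    return True


# Hypothetical activities that record what happened in a list.
events = []


def book_flight():
    events.append("flight booked")


def cancel_flight():
    events.append("flight cancelled")


def book_hotel():
    raise RuntimeError("no rooms available")


def cancel_hotel():
    events.append("hotel cancelled")


succeeded = run_with_compensation([
    (book_flight, cancel_flight),
    (book_hotel, cancel_hotel),
])
```

In this run the hotel booking fails, so the flight booking is compensated and the composition as a whole reports failure. Unlike a rollback in a database transaction, compensation is a new activity that may itself have visible effects.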
Table 1. Comparison of BPEL4WS, BPML, WS-CDL, WSCI and DAML-S
Criterion                      | BPEL4WS          | BPML             | WS-CDL           | WSCI             | DAML-S
Modeling the collaboration     | Strong support   | Indirect support | Strong support   | Strong support   | Strong support
Modeling the execution control | Strong support   | Strong support   | No support       | No support       | Strong support
Representation of the role     | Weak support     | No support       | Strong support   | Strong support   | No support
Transaction and compensation   | Indirect support | Strong support   | Indirect support | Strong support   | Indirect support
Exception handling             | Strong support   | Strong support   | Support          | Strong support   | Strong support
Semantic support               | No support       | No support       | No support       | No support       | Strong support
Business agreement support     | No support       | No support       | No support       | No support       | No support
Software vendor support        | Many             | Few              | No               | Few              | Few
1.6 OBJECTIVE OF THE STUDY
The objectives are to identify the activities in the mining process, to propose a new mining framework and to suggest efficient algorithms that automate these activities. The study also suggests measures for objectively evaluating the interestingness and usefulness of the mining results and determines strategies for evaluating that usefulness.
1.7 MOTIVATION OF THE THESIS
Web services are in transition from data-based to semantics-based services. Web services will become primary objects on the Web, with increasing opportunities to compose new, useful and interesting Web services from existing resources. The collective opportunities for composing services will lead to applications that many do not expect. Further, the indefinite scope of search queries and the ability to discover compositions make the area motivating, and success can be equated with gaining a competitive business advantage. Semantics can help government agencies provide citizens with useful and potentially life-saving or life-enhancing services in advance. The ability to discover useful composite services proactively, even when the goals are unspecified, is also a challenging area of research. Web service mining can be a key to realizing the full potential of Semantic Web services. An effective framework for Web service mining would generate interesting and useful composite Web services.
1.8 ISSUES IN WEB COMPOSITIONS
Unclear or unstandardized dependencies in Web compositions can complicate the design of a composition. Agreement might not be reached for reasons such as conflicting interests or lack of co-operation, resulting in actor-oriented problems. Unclear decisions carry heavy risks that can lead to the failure of a composition. Thus, the characteristics of a Web composition form an important set of requirements for creating a composition. The requirements can also be viewed as criteria for evaluating composition methods. Such problems determine the extent of the success or failure of a composition. The problems can be categorized into four groups, as detailed below.
• MULTI-ACTOR PERSPECTIVE: Decisions regarding the composition of a service are made within a network of stakeholders with different views on the service. One stakeholder may focus on user friendliness, another on scalability, another on cost, and so on. The actors may have different interests but depend on each other. A solution is a compromise that has to combine several contradictory goals and interests. The actors in the network need to cooperate to a certain extent in order to realize a solution and the common goal. The decision-making process culminates in a set of rules with which every actor involved will comply.
• NON-FUNCTIONAL REQUIREMENTS: Requirements are commonly divided into functional and non-functional requirements. Functional requirements relate to the tasks to be accomplished, and non-functional requirements relate to QoS aspects. Compositions are designed to help users solve a problem, such as checking a license number. When starting a new service that is not yet defined, the user invokes the corresponding function in the composition. A number of compositions may be tried, through iteration and tracing, before the result is finally obtained.
• ALTERNATIVES INSIGHT: The performance of a composition is affected when one of its services fails. An alternative service should be made available or specified as a failsafe option. Any composition should support failure analysis and report the reasons for failure.
• PLANNING SUPPORT: The absence of a shared view among actors hinders communication and can block further development cycles of a composition. A lack of iterations also limits the number of alternatives and alternative compositions that can be identified and evaluated. The composition has to indicate possible services that can be substituted when the intended services are unavailable or awaiting implementation. The composition should provide partners with a plan for its realization, since the objectives relate both to the composition process and to the resulting composition.
Processes should facilitate reuse, dependability and planning. The result should reuse services, contain an overview of functional and non-functional specifications and planning.
• PRIVACY ISSUES: A major factor limiting Web mining is the need to protect users' privacy. Although Web mining uses data that is available on the Web anyway, it can also induce general patterns from personalized data and inductively infer data that should remain private or be accessed only in a controlled manner. Privacy is a major concern because such data contains a great deal of personal information that is open to misuse. One major misuse is identity theft, which gives access to all private information. Personalization gives users a better experience and improves their efficiency, but it relies on personal information such as names, addresses, telephone numbers and other identifying data. Gathering personal information for marketing purposes without permission is another form of privacy infringement. The abundance of spyware blockers shows that users are aware of attempts by Web sites with pop-up advertisements to steal data from their systems. Developers of mining services that use open Web APIs, such as the Google API, can address privacy issues by providing a privacy policy, since many users look for a privacy policy and may not visit a site more than once if none is provided.
1.9 CHALLENGES IN WEB SERVICE MINING
As computing becomes dependent on Web services, the services themselves grow more complex and depend on results provided by other Web services. The quality and correctness of any Web service depend on the other services it uses, and conventional data mining techniques for finding patterns become inapplicable. Challenges can relate to the collection, preparation and processing of data. This thesis addresses these problems together with solutions proposed in other research papers. In the era of cloud computing, cloud services can be used to provide an integrated infrastructure for data mining [71] with which systems can be analyzed. Event logs can be used to verify the behavioral properties of a Web service composition [72]. Knowledge discovery may no longer be a challenging task in itself, but combining data mining techniques on Web service logs to discover knowledge about the Web services can be. Web services also change over a period of time [73]. Another challenging aspect of service mining is to build semantic relations between Web services and to construct a semantic relation-based Web service registry with complementary functions [74]. Further, discovered Web services need to be accurate and reliable; trust in social networks has been combined with service mining, using K optimal trust paths to select trustworthy services in a service-oriented online social network [75]. Business flows are an important mechanism for finding or creating enterprise Web services, and understanding a business control flow is again a challenging task. Process mining has its own share of challenges in determining the scope of a process in the presence of disturbances, which make existing algorithms less useful.
Designing a business intelligence solution with a Web service mining architecture is another major challenge [76]. Each aspect of Web service mining is a candidate challenge: composability with its syntactic operation structures, the semantics of messages, and the quality that distinguishes services. Web service composition is sometimes beyond what humans can handle manually. The complexity stems from the number of services available on the Web and from the fact that Web services are easily updated, which makes it important to keep compositions up to date. Different organizations acting as actors in a composition do not share a single language for defining and evaluating Web services. Although this new area can help governmental or public organizations reduce overheads and deploy solutions more quickly, it is a concept that has to be acknowledged, accepted and standardized over a longer period of time.
1.10 CONTRIBUTIONS
Web compositions incorporate information from various sources and domains to achieve a task in a Service-Oriented Architecture (SOA). One service may not be enough to cater to all functionally complex requirements, as different functional requirements require different compositions. Compositions that combine services into a coherent task may not be readily available, and it is difficult to find a single web service that produces the desired output from the given inputs. Semi-automatic service composition has also evolved over the years, catering to single users or to a community. Tools and interfaces that facilitate this kind of web composition include mashups and BPEL tools. Community compositions target the knowledge produced by communities.
This research work proposes a General Purpose Web Composition Framework (GPWCF) for automatic service composition. GPWCF offers a different perspective on web compositions and can be implemented by organizations. The user selects a service and sets constraints. The requested service is fetched from the services repository. A scope definition is constructed, which determines the search space and the information sets returned to the user. When a corresponding service is not found, an error message is sent to the user and recorded in the error log. All errors, such as errors in the execution of a service, errors in fetching results and network failures, are logged in the error log, enabling improvement of the compositions in future versions. The GPWCF can be used by organizations as a starting point for web service compositions and developed further for future requirements.
For semi-automatic composition, the researcher proposes a novel Semi-Automatic Web Composition for Users' Social Interactions (SAWCUSI). Any social network composition should cater to the interaction between a user and the services that the user requests. The SAWCUSI composition offers a complete solution to users' requests. If a requested service is not found, SAWCUSI connects to other social network compositions, searches for the service, updates its services list with the service's availability and serves the request on receiving the service from the other framework. The error log created at run time is reviewed periodically to identify services that are unavailable in the framework's list, and such entries can lead to the creation of a new service. The error log is checked manually, and the decision to create a new service is taken with the help of a domain expert. The framework can also be used to extract useful knowledge about these social interactions.
Web services that link applications need to address security policies in the development of web service security systems. Ensuring confidentiality and security in the web services security model is critical to organizations and customers. The systems have to work on any platform by adopting a neutral language and accepting popular security mechanisms. The framework needs to extend to existing security infrastructure, while allowing web service providers and requesters to develop solutions that meet the individual security requirements of applications. WS-Security defines the core facilities for protecting the integrity and confidentiality of a message by providing a model with security functions and components for Web services. Security also demands solutions from both technological and business perspectives, and the effort requires coordination between vendors, developers, service providers and customers. For example, a customer making an online purchase should not be affected by the instrument used in the transaction. The goal has to be building interoperable solutions in heterogeneous environments. WS-Security also describes enhancements to SOAP messaging that provide protection through integrity, confidentiality and single message authentication. If a Web service provider does not accept requests from a specific IP address, an alternative needs to be offered in compositions to overcome this constraint. This thesis discusses a security model with constraints that is compliant with existing standards and can thus be adapted to varying business needs. It exploits a syntactic approach to model the security requirements of a Web service and considers the security requirements of both Web service requestors and the Web services taking part in the composition.
This thesis proposes a security model called the SWS-Broker, which is itself a Web service. The SWS-Broker consists of four main components, namely WF-Modeler, WSs-Locator, Security Matchmaker and WSBPELgenerator. It first creates an appropriate workflow (WF) to model the business process, with the help of libraries of business processes, before generating the required service. The Broker then generates the WSBPEL document representing a secure composition.
1.11 LIMITATIONS
Web services run complex applications encompassing several Web service calls, and until a composition is tested it needs to be controlled manually. The semantics of Web services and compositions are rather limited. The proposals for enriching Web services are quite restrictive, since several Web service calls are considered to be one singleton Web service call that is treated independently. Companies implementing composite Web services would find the integration of existing services costly. Interoperability certification will again play an important role in business process integration.
The results are analogous to those of previous research, with certain variations that may be due to advancements in current technology. Though the framework is proposed for faster and more efficient composition, it has to be implemented in full by organizations to reap the benefits. Also, fully automatic compositions are not yet at a stage where the user can fully trust the composition and expect or predict the execution results.
1.12 ORGANIZATION OF THE THESIS
The rest of the document is organized as follows. Chapter 2 describes web service mining techniques. Chapter 3 discusses Web service compositions. Chapter 4 discusses the proposed Automatic Web Service Composition Framework For User Dependent Web Mining. Chapter 5 details a semi-automatic web service composition for social interactions. Chapter 6 discusses web service implementations, while Chapter 7 proposes a generic model for web service security with a service broker. Chapter 8 discusses the results and evaluates the proposed architectures and models. The thesis concludes in Chapter 9.
CHAPTER – 2
WEB SERVICE MINING TECHNIQUES
2.1 INTRODUCTION
Web services are being deployed at an accelerating rate. New services are created by extending or combining existing services. Process mining is considered an extension of service mining, owing to the interdependence between business processes and web services. Specific queries for discovering patterns that give a competitive advantage in a business environment may be unavailable. Service mining is therefore a bottom-up search process that proactively targets potentially interesting and useful Web services composed from existing ones. Systems demand a sophisticated design to answer user needs and requirements with reliable executions. It is important to track the utilization of Web services. Business opportunities can be discovered by analyzing different logs and tracking Web service interactions with autonomous parties. This chapter compares different web mining approaches for the discovery, conformance and extension of examined properties and behaviors, as well as algorithms that extract process trace data from process logs to develop a novel model [61]. Such models aim to improve the processes used in business process mining. The service mining challenges and the solutions proposed in different service mining studies are discussed below.
2.2 COMPOSITE PATTERNS EXTRACTION FROM EXECUTION LOGS
Re-using service composition patterns provides an efficient way to improve the quality of new applications. A service pattern is identified by locating associated services that are commonly used by different applications and understanding the control flow within the set of associated services. The application infrastructure facilitates the monitoring of service-oriented applications through execution logs. Composite service patterns capture the requirements of multiple organizations with a specific control flow and reflect best practices. Re-using service patterns yields good-quality composite web services. Service patterns can be built in two ways.
• Top-Down: The business processes of different organizations are reviewed to identify patterns.
• Bottom-Up: Execution logs of applications are analyzed to mine business patterns such as frequently executed service patterns. This pattern mining of execution logs can be broken into three sub-tasks:
o Pre-processing: A service-oriented application is executed as multiple instances, where each instance is identified by a unique identifier. Different types of events of the instances are logged, such as resource adaptor events, business rule events and service invocation events. The entry and exit points of a process are logged in the application logs along with the instance identifier and a time stamp. The logs are processed to filter events.
o Identifying frequently associated services: Services that occur together frequently are considered for a service pattern. The number of services to be analyzed is pre-defined.
o Recovering the control flow: The control flow of the services in a service pattern is reusable. The execution flow is extracted from an executing instance of a service, and similar execution flows are extracted for all services. The common execution flows across these services are then considered.
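The pre-processing and association steps above can be sketched in a few lines. The following is only an illustration under stated assumptions: the log is taken to be a list of (instance identifier, event type, service name) tuples, and the function name `frequent_service_sets`, the event type label `service_invocation` and the service names are hypothetical, not taken from any cited tool.

```python
from collections import Counter
from itertools import combinations


def frequent_service_sets(log_events, size=2, min_instances=2):
    """Return service sets of `size` that co-occur in at least `min_instances` instances."""
    # Pre-processing: keep only service invocation events, grouped by instance id.
    services_by_instance = {}
    for instance_id, event_type, service in log_events:
        if event_type == "service_invocation":
            services_by_instance.setdefault(instance_id, set()).add(service)

    # Association: count how many instances invoke each combination of services.
    counts = Counter()
    for services in services_by_instance.values():
        for combo in combinations(sorted(services), size):
            counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_instances}


# Hypothetical execution log with three process instances.
events = [
    ("i1", "service_invocation", "CheckCredit"),
    ("i1", "business_rule", "RuleA"),
    ("i1", "service_invocation", "BookOrder"),
    ("i2", "service_invocation", "CheckCredit"),
    ("i2", "service_invocation", "BookOrder"),
    ("i2", "service_invocation", "ShipOrder"),
    ("i3", "service_invocation", "ShipOrder"),
]
print(frequent_service_sets(events))
```

Here `size` plays the role of the pre-defined number of services to analyze. Recovering the control flow would additionally require the time stamps, which this sketch ignores.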
An approach based on the Event Calculus (EC), which uses a time structure to model event-based interactions independently of any sequence of events, was presented in [62]. The time structure keeps the modeled interactions independent of the input events and keeps the description of system behavior close to the EC ontology. In log-based verification, the focus of a process mining approach is to verify the specified properties and detect discrepancies between the process model and its related instances. Formulating properties involves checking the footprints left in system event logs to verify that the properties hold. Web service logging is the first step in any mining process: it gathers relevant data from the Web for analyzing Web service behavior from two main sources, data on the client and data on the server. The Web service logging facility retrieves the Web data log. More advanced logging solutions for identifying web services can be obtained by using SOAP messages.
The Event Calculus approach specifies which properties are to be verified. Specifications are expressed as events, which occur during interactions and are retrieved from the execution logs. The technique is applied to the discovered fields in process mining.
2.4 WEB SERVICE INTERACTION MINING
Business Process Execution Language (BPEL) standardizes the composition of web services into business processes. BPEL defines and monitors workflows. Web Service Interaction Mining (WSIM) has been proposed as an extension to BPEL [62]. WSIM proposes three levels of abstraction on performance. The event log is mined to obtain information about web service behavior, reducing the amount of data. Information on the interactions between web services reveals critical dependencies. Operations are categorized into four types: one-way, notification, request-response and solicit-response. One-way and notification operations are single messages from a sender to a receiver. The sender knows the receiver, but the receiver does not have knowledge about the sender. Request-response and solicit-response operations are message exchanges between the sender and receiver: the initiator sends a message and the called service replies to it.
2.5 LOG BASED WEB SERVICE MINING
A web service is an application accessible to customers, identified by a URI, whose interfaces and bindings are described in an XML document. The service can be discovered by other web services, and interactions between web services happen through SOAP, UDDI and WSDL. The execution log is mined and analyzed for Web service behavior, which can be considered at several levels owing to the complexity and richness of composition in the Web services model. At Level 1, the set of operations offered by a web service is gathered. Level 2 defines additions to this set of operations, and Level 3 covers the set of interactions exchanged within a given choreography. Level 4 defines the executable process implementing a composite Web service. The mining combines data mining techniques on web service logs to discover knowledge about the web service model [63]. The first step is to gather data for analysis and obtain useful information about web service behavior. Logging is done at two levels, trivial and advanced.
2.6 WEB SERVICES USAGE MINING
Learning the sequence in which users invoke web services can give important information. The information can be used to find better web services by applying web mining techniques to analyze patterns in the behavior of web services. Frequently occurring web services can be mined using the AprioriAll algorithm. The mining algorithm can be optimized for faster execution. The correlations between operations and web services can be discovered if a proper log format is used.
The log must contain the start time, connection time, disconnection time, session-id, user-id and the service with its operations. The sequential pattern from the log is fed in as input together with a defined threshold (the occurrence count of a web service). The k-item sets of operations are generated, and speed is improved by reducing the size of the candidate set, for example by filtering unrelated service logs out of the data set.
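The level-wise generation of k-item sets with candidate pruning can be illustrated as follows. This is a simplified sketch and not the AprioriAll algorithm itself: it ignores the order of operations within a session and mines unordered operation sets in the style of plain Apriori. The function name `frequent_operation_sets`, the session contents and the threshold are assumptions chosen for illustration.

```python
from itertools import combinations


def frequent_operation_sets(sessions, min_count):
    """Level-wise search for operation sets that appear in at least `min_count` sessions."""
    sessions = [frozenset(s) for s in sessions]
    # Level 1 candidates: every individual operation seen in the log.
    current = {frozenset([op]) for s in sessions for op in s}
    frequent = {}
    k = 1
    while current:
        # Count the sessions that contain each candidate set.
        counts = {c: sum(1 for s in sessions if c <= s) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent sets into candidates with k operations.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


# Hypothetical sessions, one list of invoked operations per session-id.
sessions = [
    ["login", "search", "buy"],
    ["login", "search"],
    ["login", "buy"],
    ["search"],
]
for ops, count in frequent_operation_sets(sessions, min_count=2).items():
    print(sorted(ops), count)
```

The prune step is what keeps the candidate set small: a set of three operations is never counted if one of its pairs already fell below the threshold.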
2.7 PROCESS MINING
Process mining is the discovery of processes, or the verification of their conformance, based on event logs, particularly when Web services are distributed among different parties. Process mining helps to monitor the exact execution of processes, determine bottlenecks and unused paths, and verify deviations. The sequence of events recorded for a process instance is called a trace. The event log contains a set of process instances together with various associated properties of the processes. When the events are not correlated to the process instances, the execution and monitoring of processes and the measurement of Key Performance Indicators go wrong. Process mining analysis is useful when the web services are distributed over autonomous parties and show emergent behavior. The mining can be of three types. Discovery is used when no prior model exists and a new model needs to be constructed from event logs. Conformance checking is done when a prior model exists. Extension applies when a prior model exists but is extended with a new perspective to enrich the model. IBM WebSphere Business Monitor supports the three types of process mining, and thus process mining techniques can be used to discover, check and extend existing web services.
2.8 CLUSTERING EVENT LOGS FOR PROCESS MINING
Process mining can be used to discover, monitor and improve real processes by extracting knowledge from event logs. Though the algorithms perform well on structured processes, it is not easy to determine the scope of a process. Event logs can be clustered iteratively to form sets of similar cases that can be adequately represented in a process model. The observed executions in a cluster can be used to discover new models or to check conformance to a model. The process is represented as a Petri net, which uses two types of nodes, namely places and transitions. Places indicate states and transitions represent actions. The logs can be split iteratively into clusters until a precise model is formed, while keeping the number of partitions to a minimum. The clusters are split into smaller clusters using the k-means method, which finds centroids around which a set of vectors is clustered. Relevance is based on the frequency of occurrence in the log.
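A minimal sketch of trace construction, discovery and conformance checking is given below. It assumes that each log event is an (instance identifier, time stamp, activity) tuple and represents a model simply as the set of directly-follows pairs, which is a deliberate simplification of real discovery algorithms. All function names and the sample activities are hypothetical.

```python
from collections import defaultdict


def build_traces(event_log):
    """Group (instance_id, timestamp, activity) events into one ordered trace per instance."""
    grouped = defaultdict(list)
    for instance_id, timestamp, activity in event_log:
        grouped[instance_id].append((timestamp, activity))
    return {iid: [a for _, a in sorted(evts)] for iid, evts in grouped.items()}


def directly_follows(traces):
    """Discovery: collect every pair (a, b) where b immediately follows a in some trace."""
    pairs = set()
    for trace in traces.values():
        pairs.update(zip(trace, trace[1:]))
    return pairs


def deviations(trace, allowed_pairs):
    """Conformance: list the steps of a trace that the prior model does not allow."""
    return [pair for pair in zip(trace, trace[1:]) if pair not in allowed_pairs]


# Hypothetical event log with two process instances.
log = [
    ("c1", 1, "receive"), ("c1", 2, "check"), ("c1", 3, "ship"),
    ("c2", 1, "receive"), ("c2", 2, "ship"),
]
traces = build_traces(log)
model = directly_follows({"c1": traces["c1"]})  # model discovered from instance c1 only
print(deviations(traces["c2"], model))          # steps of c2 not covered by that model
```

Extension, the third type, would enrich such a model with a further perspective, for example by attaching the time stamps to each pair to measure waiting times.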
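The roles of places and transitions can be made concrete with a small replay sketch. It assumes a Petri net encoded as a dictionary that maps each transition to its input and output places, with one token consumed or produced per arc. The encoding, the function names and the example net are illustrative assumptions, not the representation used by any particular process mining tool.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking.get(place, 0) >= 1 for place in transition["inputs"])


def fire(marking, transition):
    """Consume one token from each input place and produce one on each output place."""
    new_marking = dict(marking)
    for place in transition["inputs"]:
        new_marking[place] -= 1
    for place in transition["outputs"]:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


def replay(trace, net, initial_marking):
    """Return True if every action of the trace can fire in order from the initial marking."""
    marking = dict(initial_marking)
    for action in trace:
        transition = net.get(action)
        if transition is None or not enabled(marking, transition):
            return False
        marking = fire(marking, transition)
    return True


# Hypothetical net: start -> [receive] -> p1 -> [ship] -> end
net = {
    "receive": {"inputs": ["start"], "outputs": ["p1"]},
    "ship": {"inputs": ["p1"], "outputs": ["end"]},
}
print(replay(["receive", "ship"], net, {"start": 1}))
print(replay(["ship"], net, {"start": 1}))
```

A cluster whose traces all replay successfully on its model is adequately represented; clusters with many failing traces are candidates for a further split.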
2.9 WEB SERVICES DOCUMENT BY CLUSTERING WSDL DOCUMENT
The WSDL document has six major components, including types, portType, message, binding and service. When mined, WSDL documents reveal features that describe the semantics and behavior of the services. By integrating these features, clusters of web services can be derived. Search engines can be used to discover WSDL documents and extract their components. A query on the clustered web services returns semantically relevant web services.
A. Extraction: The WSDL document is parsed to produce tokens of its content.
B. Word stemming: Base words are extracted from the vector created in step A.
C. Function word removal: The function words are identified using a Poisson distribution and removed from the vector.
D. Content word recognition: The k-means algorithm is used to identify the most frequently occurring general words, such as 'data', 'web' and 'port', which are then removed.
E. The WSDL types, such as complexType, are extracted.
F. The messages of the web services are extracted from the WSDL.
G. The WSDL portType is extracted, representing the combination and sequence of message operations.
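Steps A, E, F and G can be sketched with the Python standard library. This is a rough illustration assuming WSDL 1.1 namespaces. The function names and the sample document are hypothetical, and stemming, function word removal and content word recognition (steps B to D) are not implemented here.

```python
import re
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def split_identifier(name):
    """Split an identifier such as 'getWeatherReport' into lower-case word tokens."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)]


def extract_features(wsdl_text):
    """Collect type, message and portType names from a WSDL document plus their tokens."""
    root = ET.fromstring(wsdl_text)

    def names(tag, ns=WSDL_NS):
        return [el.get("name") for el in root.iter(f"{{{ns}}}{tag}") if el.get("name")]

    features = {
        "types": names("complexType", XSD_NS),  # step E
        "messages": names("message"),           # step F
        "port_types": names("portType"),        # step G
    }
    # Step A: tokens of the content, here taken from the extracted names only.
    features["tokens"] = sorted({
        token
        for group in ("types", "messages", "port_types")
        for name in features[group]
        for token in split_identifier(name)
    })
    return features


# Hypothetical, heavily shortened WSDL document.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema" name="Weather">
  <types><xsd:schema><xsd:complexType name="WeatherInfo"/></xsd:schema></types>
  <message name="GetWeatherRequest"/>
  <portType name="WeatherPortType"><operation name="getWeather"/></portType>
</definitions>
""".strip()

print(extract_features(SAMPLE_WSDL))
```

The resulting token list is the kind of vector on which stemming and word filtering would then operate before clustering.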
The extracted features are integrated and clustered using the Quality Threshold clustering algorithm, which places the web services into clusters [64]. Two criteria are used to evaluate performance: precision, which measures correctness, and recall, which measures completeness.
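For a single cluster, the two measures can be computed as shown below. The sketch assumes the usual set-based definitions, where a cluster is compared against the set of services that truly belong to the category. The function name and the service identifiers are made up for illustration.

```python
def precision_recall(cluster, relevant):
    """Precision: share of the cluster that is relevant. Recall: share of relevant items found."""
    cluster, relevant = set(cluster), set(relevant)
    hits = len(cluster & relevant)
    precision = hits / len(cluster) if cluster else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical cluster of four services, two of which belong to the target category.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(precision, recall)
```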
2.10 COMMON LOG FORMATS
2.10.1 NCSA Common (access log)
The NCSA Common log format contains only basic HTTP access information. The NCSA Common Log, sometimes referred to as the Access Log, is the first of three logs in the NCSA Separate log format.
The Common log format can also be thought of as the NCSA Combined log format without the referral and user agent. The Common log contains the requested resource and a few other pieces of information, but does not contain referral, user agent, or cookie information. The information is contained in a single file. The fields in the Common log file format are host, rfc931, username, date:time, request, statuscode and bytes. For example:
125.125.125.125 - username [02/Apr/2013:21:15:05 +0500] "GET /index.html HTTP/1.0" 200 1043
Description of the fields in the Common log format:
• Host: The IP address or host/subdomain name of the HTTP client that made the HTTP resource request (“125.125.125.125”).
• rfc931: The identifier used to identify the client making the HTTP request. If no value is present, a "-" is substituted ("-").
• username: The username (or user ID) used by the client for authentication. If no value is present, a "-" is substituted.
• date:time timezone: The date and time stamp of the HTTP request ("02/Apr/2013:21:15:05 +0500"), where dd is the day of the month, MMM is the month, yyyy is the year, hh is the hour, mm is the minute, ss is the seconds and +-hhmm is the time zone.
• request: The HTTP request, with the HTTP method, the requested resource and the protocol version ("GET /index.html HTTP/1.0").
• statuscode: The numeric code indicating the success or failure of the HTTP request ("200").
• bytes: A numeric field containing the number of bytes of data transferred as part of the HTTP request, not including the HTTP header ("1043").
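The field layout described above maps naturally onto a regular expression. The sketch below is one possible parser, assuming that log lines use ASCII hyphens and straight double quotes as a web server would write them, and an English locale for the month abbreviation. The pattern name `COMMON_LOG` and the function `parse_common` are hypothetical.

```python
import re
from datetime import datetime

# One named group per field of the Common log format.
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<rfc931>\S+) (?P<username>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<statuscode>\d{3}) (?P<bytes>\d+|-)'
)


def parse_common(line):
    """Parse one Common log line into a dict, or return None if it does not match."""
    match = COMMON_LOG.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # dd/MMM/yyyy:hh:mm:ss +-hhmm
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    entry["statuscode"] = int(entry["statuscode"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry


line = '125.125.125.125 - username [02/Apr/2013:21:15:05 +0500] "GET /index.html HTTP/1.0" 200 1043'
entry = parse_common(line)
print(entry["host"], entry["request"], entry["statuscode"], entry["bytes"])
```

Lines that do not match the pattern return `None`, so a caller can count and inspect malformed entries instead of failing on them.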
2.10.2 NCSA COMBINED LOG FORMAT
The NCSA Combined log format is an extension of the NCSA Common log format. The Combined format has the same information as the Common log format plus three (optional) additional fields: the referral field, the user_agent field, and the cookie field. The fields in the Combined log format are host, rfc931, username, date:time, request, statuscode, bytes, referrer, user_agent and cookie. For example:
125.125.125.125 - dsmith [02/Apr/2013:21:15:05 +0500] "GET /index.html HTTP/1.0" 200 1043 "http://www.ibm.com/" "Mozilla/4.05 [en] (WinNT; I)" "USERID=CustomerA;IMPID=01234"
The following are descriptions of the three additional fields:
• referrer: "http://www.ibm.com/" is the URL that linked the user to the website.
• user_agent: "Mozilla/4.05 [en] (WinNT; I)" is the Web browser and platform used by the user to visit the website.
• cookie: "USERID=CustomerA;IMPID=01234" are the pieces of information that the HTTP server sends back to the client along with the requested resources. A client's browser stores this information and subsequently sends it back to the HTTP server when making additional resource requests. An HTTP server can establish multiple cookies per HTTP request. Cookies take the form KEY=VALUE. Multiple cookie key-value pairs are delimited by semicolons (;).
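The three optional fields and the KEY=VALUE cookie syntax can be handled as in the following sketch. It makes the same assumptions as before (ASCII hyphens and straight quotes in the log line), and the names `COMBINED_LOG`, `parse_cookies` and `parse_combined` are hypothetical.

```python
import re

# Common log fields followed by the optional referrer, user agent and cookie fields.
COMBINED_LOG = re.compile(
    r'(?P<host>\S+) (?P<rfc931>\S+) (?P<username>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<statuscode>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"(?: "(?P<cookie>[^"]*)")?)?'
)


def parse_cookies(cookie_field):
    """Split 'KEY=VALUE;KEY=VALUE' into a dict of cookie names and values."""
    cookies = {}
    for pair in cookie_field.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            cookies[key.strip()] = value.strip()
    return cookies


def parse_combined(line):
    """Parse one Combined log line; optional fields that are absent come back as None."""
    match = COMBINED_LOG.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["cookies"] = parse_cookies(entry.pop("cookie") or "")
    return entry


line = ('125.125.125.125 - dsmith [02/Apr/2013:21:15:05 +0500] '
        '"GET /index.html HTTP/1.0" 200 1043 "http://www.ibm.com/" '
        '"Mozilla/4.05 [en] (WinNT; I)" "USERID=CustomerA;IMPID=01234"')
entry = parse_combined(line)
print(entry["referrer"], entry["user_agent"], entry["cookies"])
```

Because the additional fields are optional in the pattern, the same function also accepts plain Common log lines.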
2.10.3 NCSA SEPARATE (THREE-LOG FORMAT)
The NCSA Separate log format, sometimes called the three-log format, refers to a log format in which the information gathered is separated into three files (or logs) rather than a single file. The three logs are often referred to as the Common log or access log, the Referral log and the Agent log. The three-log format contains the basic information of the NCSA Common log format in one file, and the referral and user agent information in the other two files. However, no cookie information is recorded in this log format.
• Common or access log: The first of the three logs is Common log, sometimes referred to as the access log, which is identical in format and syntax to the NCSA Common log format.
• Referral log: The referral log is the second of the three logs. The referral log contains a corresponding entry for each entry in the common log.
The fields in the Referral log are date:time and referrer. For example:
[02/Apr/2013:21:15:05 +0500] "http://www.ibm.com/index.html"
The following is a description of the fields in the Referral log:
• date:time timezone: 02/Apr/2013:21:15:05 +0500 is the date and time stamp of HTTP request. The date and time of an entry logged in the referral log corresponds to the resource access entry in the common log. As a result, the date and time of corresponding records from each of these logs will be the same. The syntax of the date stamp is identical to the date stamp in the common log.
• referrer: "http://www.ibm.com/index.html" is the referrer, that is, the URL of the HTTP resource that referred the user to the resource requested. For example, if a user is browsing a Web page such as http://www.ibm.com/index.html and clicks on a link to a secondary page, then the initial page has referred the user to the secondary page. The entry in the referral log for the secondary page will list the URL of the first page (http://www.ibm.com/index.html) as its referral.
• Agent log: The Agent log is the third of the three logs making up the three-log format. Like the referral log, the agent log contains a corresponding entry for each entry in the common log. The fields in the Agent log are date:time and agent. For example:
[02/Apr/2013:21:15:05 +0500] "Microsoft Internet Explorer - 5.0"
The following is a description of the fields in the Agent log:
• date:time timezone: [02/Apr/2013:21:15:05 +0500] is the date and time stamp of the HTTP request. The date and time of an entry logged in the agent log corresponds to the resource access entry in the common log. Because information logged in the agent log supplements information logged in the common log, the date and time of corresponding records from each of these logs will be the same. The syntax of the date stamp is identical to the date stamp in the Common log.
• agent: "Microsoft Internet Explorer - 5.0" is the name by which the Web browser identifies itself. It is customary for an HTTP client to identify itself by name when making an HTTP request. This is not required, but most HTTP clients do so. The Web server writes this name in the agent log.
2.10.4 W3C Extended Log Format
This log file format is used by Microsoft Internet Information Server (IIS). A log file in the extended format contains a sequence of lines containing ASCII characters. Each line may contain either a directive or an entry. Entries consist of a sequence of fields relating to a single HTTP transaction. Fields are separated by white space. If a field is unused in a particular entry, a dash "-" marks the omitted field. Directives record information about the logging process itself. Lines beginning with the # character contain directives. The following directives are defined in the W3C Extended format:
• Version: <integer>.<integer>: The version of the extended log file format used. This draft defines version 1.0.
• Fields: [<specifier>...]: Lists a sequence of field identifiers specifying the information recorded in each entry.
• Software: string: Identifies the software which generated the log.
• Start-Date: <date> <time>: The date and time at which the log was started.
• End-Date: <date> <time>: The date and time at which the log was finished.
• Date: <date> <time>: The date and time at which the entry was added.
• Remark: <text>: Comment information. Data recorded in this field should be ignored by analysis tools.
The directives Version and Fields are required and should precede all entries in the log. The Fields directive specifies the data recorded in the fields of each entry. For example:
#Software: Microsoft Internet Information Server 4.0
#Version: 1.0
#Date: 1998-11-19 22:48:39
#Fields: date time c-ip cs-username s-ip cs-method cs-uri-stem cs-uri-query sc-status sc-bytes cs-bytes time-taken cs-version cs(User-Agent) cs(Cookie) cs(Referrer)
2015-06-02 22:48:39 206.175.82.5 - 208.201.133.173 GET /global/images/navlineboards.gif - 200 540 324 157 HTTP/1.0 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+95) USERID=CustomerA;+IMPID=01234 http://xxx.yyyy.com/webx?98@@webx1.html
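Because the Fields directive names the columns, a reader for this format can be driven by the log file itself. The following sketch assumes whitespace-separated fields and ASCII dashes for omitted values, as described above. The function name `parse_extended_log` and the shortened sample log are hypothetical.

```python
def parse_extended_log(lines):
    """Parse W3C Extended log lines into dicts keyed by the names in the #Fields directive."""
    fields = []
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive line, e.g. "#Fields: date time c-ip ..."
            name, _, value = line[1:].partition(":")
            if name.strip() == "Fields":
                fields = value.split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that do not match the declared fields
        # A dash marks an omitted field.
        entries.append({f: (None if v == "-" else v) for f, v in zip(fields, values)})
    return entries


# Hypothetical log with fewer fields than the example above, for brevity.
sample = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2015-06-02 22:48:39 206.175.82.5 GET /index.html 200",
    "2015-06-02 22:48:40 - GET /missing.html 404",
]
entries = parse_extended_log(sample)
print(entries)
```

Entries that appear before any Fields directive are skipped, which matches the requirement that the directive precede all entries.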
CHAPTER – 3
WEB SERVICE COMPOSITIONS IN WEB MINING
3.1 INTRODUCTION
The Web today is a collection of Web services. A growing number of companies have projects involving Web services. Web services are built on open standards with widespread support for universal access on a neutral platform. For the learner, Web services are a promising technology that allows individual and autonomous software to be shared. Surveys show that application integration, or Web service composition, is gaining primary importance. Application integration costs consume twenty-five percent of the total IT budget in many companies. New systems can also be designed by composing existing software components that are accessible as web services. This reuse of components lowers maintenance costs. Web service composition is a challenging problem as the number of service providers increases. Further, many applications may overlap in functionality and content while remaining disjointed. With web services, the functionality and content can be shared and remotely accessed. Any modular approach in information systems engineering breaks a system down into parts that are individually designed, constructed and tested, and often multiple service providers need to coordinate to create a new service. This service-oriented paradigm enables the creation of services that are well described, independently implemented and interoperable.
The role of developers also changes, as new systems are created by combining available web services offered by service providers, resulting in web compositions. Even if a single service cannot serve user needs, a combination of services can solve the problem. The term Semantic Web is applied to an extension of the current Web in which information is given a well-defined meaning. Semantic Web services are designed to support automatic discovery, composition, invocation and interoperation. Service interfaces, discovery and service invocation are handled using the XML-based standards WSDL, UDDI and SOAP [65]. Web service compositions are presently workflow-based, XML-based or ontology-based. The Web service model consists of three parts: the service provider, the registry and the consumer [66]. The use of the standard protocols SOAP, WSDL and UDDI helps integrate a composite service irrespective of platforms or execution speeds. Web service composition has also received increasing attention from the research community [67].
3.2 WEB SERVICE COMPOSITIONS
3.2.1 WORKFLOW-BASED WSC APPROACH
A workflow based composition can be classified as static or dynamic in generation. In a static composition the requester builds an abstract process model before planning. In a dynamic composition the process model and atomic services are selected automatically. The requester specifies user’s preference and other constraints.
3.2.2 E-FLOW
E-services, which are usually point-to-point, create an opportunity for providing value-added or integrated services. Compositions that include existing e-services help organizations pursue their business opportunities [68]. eFlow is a static workflow-based approach, a system supporting the specification, enactment and management of composite services. Composite services are modeled as processes that are enacted by a service process engine. A composite service in eFlow can also be visualized as a process schema that composes other services, modeled by a graph that defines the order of execution among the nodes in the process. The eFlow platform has the functionality to satisfy the needs of Internet-based service providers.
3.2.3 POLYMORPHIC PROCESS MODEL (PPM)
The Polymorphic Process Model (PPM) [69] combines static and dynamic service composition. The static part of the composition references multi-enterprise processes with abstract subprocesses that are described functionally without an implementation. These abstract processes are implemented at runtime. The dynamic part of PPM consists of service-based processes modeled by a state machine that specifies the states a service can attain based on service activity, invocations or internal service transitions. The dynamic composition is enabled by reasoning.
3.2.4 WEB SERVICE ORCHESTRATION
Web service compositions can also be coordinated using Web service orchestration. The orchestration controls the order of flow when different Web services are invoked [70]. Additionally, the conditions that control invocation are also specified [70]. The orchestrated Web service invokes one or more Web services from multiple service providers in a coordinated way. It is the means by which the service is automated, removing the need for manual coordination. Orchestration languages include the Web Service Choreography Interface (WSCI) and the Business Process Execution Language for Web Services (BPEL4WS or BPEL). WSCI is an XML-based interface description language describing the flow of messages exchanged by a Web service during an interaction. WS-BPEL 2.0 is an orchestration language. It is a revision of BPEL4WS 1.0 and 1.1 and is also XML-based. It enables users to describe business process activities as Web services and defines their connections to accomplish specific tasks. WS-BPEL is designed to specify business processes that are both composed of, and exposed as, Web services.
3.2.5 ONTOLOGY-BASED WSC APPROACH
Ontologies are fast becoming a key enabling technology for the Semantic Web, since they combine human understanding with the processing ability of machines.
Ontologies are popular since users can translate their domain knowledge into application systems. Ontology tools offer improved knowledge management capabilities and are used in large organizations for tasks such as querying and browsing semantically enriched information resources. Ontologies are also used for data modeling.
3.3 QOS BASED METHODS
QoS-based methods for service composition are based on non-evolutionary algorithms. The algorithms use a QoS broker to collect information about servers and take decisions. The goal is to maximize a utility function while satisfying constraints, where constraints are client conditions that must hold for the end-to-end service composition, for example a cost of not more than 1000. The business processes can be modeled as sequential, parallel, loop, etc. Several QoS attributes such as response time, reliability and cost are used. Table 2 lists the QoS attributes. Two different approaches are used to solve the problem. The first, the combinatorial approach, models selection as a multiple-choice knapsack problem. Each service class corresponds to a class of items, and each candidate service in it is an item with, for example, a profit and a weight.
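The algorithm selects one item from each service class so as to maximize a QoS attribute and find the best match, as given below:

$$\max \sum_{i} \sum_{j} F_{ij} \, x_{ij}$$

subject to

$$\sum_{i} \sum_{j} r_{ij} \, x_{ij} \le R, \qquad \sum_{j} x_{ij} = 1 \ \text{for each } i, \qquad x_{ij} \in \{0, 1\}$$

F_{ij} : Utility value at step i for candidate j
r_{ij} : Response time of candidate j at step i
R : Total response time
x_{ij} : 1 if candidate j is selected at step i, 0 otherwise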
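The selection problem described above (one candidate per service class, maximum total utility, bounded total response time) can be sketched in Python as follows. This sketch uses exhaustive search for readability and is not Pisinger's algorithm. The function name, the dictionary keys and the sample data are assumptions made for illustration only.

```python
from itertools import product


def select_services(service_classes, max_response_time):
    """Pick exactly one candidate per class, maximizing total utility
    while keeping the summed response time within the bound."""
    best_utility, best_choice = None, None
    for choice in product(*service_classes):
        total_time = sum(c["response_time"] for c in choice)
        if total_time > max_response_time:
            continue  # violates the end-to-end constraint
        total_utility = sum(c["utility"] for c in choice)
        if best_utility is None or total_utility > best_utility:
            best_utility, best_choice = total_utility, choice
    return best_choice  # None when no combination is feasible


# Hypothetical data: two steps, each with two candidate services.
classes = [
    [{"name": "a1", "utility": 5, "response_time": 40},
     {"name": "a2", "utility": 7, "response_time": 90}],
    [{"name": "b1", "utility": 4, "response_time": 30},
     {"name": "b2", "utility": 7, "response_time": 70}],
]
chosen = select_services(classes, max_response_time=120)
```

Exhaustive search grows exponentially with the number of steps, which is why dynamic programming or heuristics are preferred for realistic problem sizes.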
Table 2. QoS Attributes
| Attribute Group | Attribute | Definition |
|---|---|---|
| Run-time | Response Time | Time between an invocation request and results |
| Run-time | Reliability | Number of successful operations against the number of invocations |
| Run-time | Availability | UpTime(op) / TotalTime, where UpTime is the time op was accessible during the total measurement time TotalTime |
| Business | Cost | Dollar amount to execute the operation |
| Business | Reputation | Ranking by user u, where n is the number of times op has been ranked |
| Business | Regulatory | Compliance with government regulations, 1 ≤ Reputation ≤ 10 |
| Security | Encryption | A boolean equal to true if messages are encrypted |
| Security | Authentication | A boolean equal to true if consumers are authenticated |
| Security | Non-repudiation | A boolean equal to true if participants cannot deny requesting or delivering the service |
| Security | Confidentiality | List of parameters that are not divulged to external parties |
Dynamic programming is used to solve the problem using Pisinger’s algorithm. In the graph-theory approach, the composition is treated as a constrained shortest path problem. Each service in each service class represents a node, and the QoS parameters are moved from the nodes to their corresponding edges.
Algorithms such as Constrained Bellman-Ford (CBF) and Constrained Shortest Path (CSP) are applied to find a solution. Different workflows are used in business processes, and efficient heuristic algorithms are applied to tackle the problem. Several workflow forms such as AND/OR and loops are used. Service selection is approached in two ways: local optimization and global planning. Local optimization computes the overall QoS of each web service in order to select the optimal one, and the score of a web service is computed as follows.
$$Score_i = \sum_{j} W_j \cdot Q_{i,j}$$

Q_{i,j} : Matrix of QoS attributes for services
W_j : User-defined weight for each QoS attribute
i : Index over the tasks
j : Index over the QoS attributes
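A weighted-sum score of this kind can be sketched in Python as follows. The sketch assumes the QoS values have already been normalized to the range 0 to 1 with higher values being better. The function name and the sample numbers are illustrative only.

```python
def score_services(qos_matrix, weights):
    """Return one weighted-sum score per row of the QoS matrix."""
    return [sum(w * q for w, q in zip(weights, row)) for row in qos_matrix]


# Hypothetical normalized values: rows are candidates, columns are QoS attributes.
qos = [
    [0.9, 0.5],
    [0.6, 0.8],
]
weights = [0.7, 0.3]  # user-defined weights, assumed to sum to 1

scores = score_services(qos, weights)
best = max(range(len(scores)), key=scores.__getitem__)  # index of the top score
```

Because each candidate is scored independently, this local step is cheap, but it cannot enforce constraints that span the whole composition.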
Global planning considers global constraints and uses Integer Programming to solve the problem. The running time is considerable. For selection, a greedy algorithm is used to select the web services with the highest score. Each candidate service is represented as a point whose x-coordinate is the constraint value and whose y-coordinate is the QoS score, so that each task is a set of points. For each task a convex hull is constructed using the Graham scan or Quickhull algorithm.
The frontier segments are sorted in descending order of gradient, and the segments with the greatest gradient that still meet the constraint are selected. Evaluations show better algorithmic performance in Integer Programming with high levels of complexity. Table 3 lists the differences between compositions.
Table 3. Comparison between Web Service Compositions
| Approach | Connectivity | Exception Handling | Scalability | Correctness | QoS |
|---|---|---|---|---|---|
| eFlow | Low | Low | Low | Low | Low |
| PPM | Low | Low | Low | Low | Low |
| BPEL4WS | Good | Good | Average | Low | Average |
| OWL-S | Good | Average | Good | Low | Good |
| WSMO | Good | Good | Good | Low | Good |
3.4 EVOLUTIONARY ALGORITHMS IN WEB COMPOSITIONS
Web compositions based on evolutionary algorithms use a genetic algorithm to tackle composition problems and fitness functions to compare solutions. Constraints become part of the fitness function. The algorithms consider all forms of workflow in a business process. The computation of overall QoS is similar to that of Cardoso, except for the way loops are handled. The problem is encoded into a genome on which genetic operations are performed. Fitness functions maximize some QoS attributes while minimizing others.
The fitness function is defined in terms of the following:
W_i : User-defined weight for each QoS attribute
D(g) : A penalty applied when genome g does not meet the constraints
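A penalty-based fitness function of this general shape can be sketched in Python as follows. This is not the formula used in the work surveyed here: the choice of attributes (reliability to maximize, cost to minimize), the sequential workflow, the single cost constraint and the `penalty_weight` parameter are all assumptions made for illustration.

```python
def fitness(genome, candidates, weights, max_cost, penalty_weight=10.0):
    """Score a genome; genome[i] is the index of the candidate chosen for task i."""
    chosen = [candidates[i][g] for i, g in enumerate(genome)]
    reliability = 1.0
    for c in chosen:
        reliability *= c["reliability"]  # sequential workflow: reliabilities multiply
    cost = sum(c["cost"] for c in chosen)
    violation = max(0.0, cost - max_cost)  # penalty term: zero when the constraint is met
    return (weights["reliability"] * reliability
            - weights["cost"] * cost
            - penalty_weight * violation)


# Hypothetical candidates: task 0 has two options, task 1 has one.
candidates = [
    [{"cost": 100, "reliability": 0.90}, {"cost": 300, "reliability": 0.99}],
    [{"cost": 200, "reliability": 0.95}],
]
weights = {"reliability": 1.0, "cost": 0.001}

feasible = fitness([0, 0], candidates, weights, max_cost=400)
infeasible = fitness([1, 0], candidates, weights, max_cost=400)
```

The penalty lets infeasible genomes stay in the population with a low score instead of being discarded outright, which is a common way of folding constraints into a genetic algorithm.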
3.5 REQUIREMENTS FOR WEB SERVICE COMPOSITIONS
The main aim of a service composition is to support the process of connecting different services from vendors to execute a specific task. Decisions about service selection and deployment are taken during the composition process. Each stage of a composition places different requirements on the methods used. In the orientation phase the cooperation is described in general terms, and an overview of the available services may not yet exist. In the negotiation phase the services needed are decided in cooperation with the parties offering the services and their specifications. The compositions agreed in the negotiation phase are then realized in the usage phase. The usage phase starts the composition and ends when one of the actors or services ends the cooperation. Reaching an agreement regarding a composition is the crucial step in the process. The parties also decide on the functions to be executed in case of a disaster. Fig. 5 depicts the stages of a composition process and Table 4 lists the differences between composition standards.
Fig. 5. Stages of the composition process using services delivered by A, B & C
3.6 DESIGN OF A COMPOSITE SERVICE
The basic SOA is an interaction and exchange of messages between service requesters (clients) and service providers. Providers publish descriptions of the services they provide. Clients find the descriptions of the services they require and bind to them. The design of a composite service is depicted in Fig. 6, and the five-step process is depicted in Fig. 7.
Fig. 6. Design of a composite service.
• Service provider publishes a service in a UDDI
• Requester searches the UDDI for a required service
• Descriptions of candidate services are returned
• Requester decides and invokes the chosen service of the provider
• The service gets executed and results are sent back to the requester.
Fig. 7. Service Discovery, Selection and Enactment.
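The publish, find and bind interaction can be sketched in Python with an in-memory registry. The `Registry` class is a hypothetical stand-in and does not reproduce the UDDI API. The service names and the lambda "endpoints" are placeholders for real remote services.

```python
class Registry:
    """In-memory stand-in for a UDDI-style service registry (hypothetical API)."""

    def __init__(self):
        self._services = []

    def publish(self, name, category, endpoint):
        # Provider side: register a service description.
        self._services.append(
            {"name": name, "category": category, "endpoint": endpoint}
        )

    def find(self, category):
        # Requester side: return the descriptions of all candidate services.
        return [s for s in self._services if s["category"] == category]


registry = Registry()
registry.publish("FastWeather", "weather", lambda city: f"Sunny in {city}")
registry.publish("CityMaps", "maps", lambda city: f"Map of {city}")

candidates = registry.find("weather")    # search the registry, get descriptions back
chosen = candidates[0]                   # requester decides on one candidate
result = chosen["endpoint"]("Chennai")   # invoke it and receive the result
```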
An approach for a composition is the core part of a web composition method. Several evaluations of compositions can be found in literature. Milanovic and Malek compared these approaches using Service connectivity, Non-functional properties, Correctness, Automatic composition and Composition scalability.
3.7 EVALUATING A WEB SERVICE COMPOSITION
Several parameters need to be considered while evaluating a Web composition, such as connectivity, exception handling, scalability, correctness and the QoS attributes of the composition.
• Connectivity: Reliable connectivity is the basic requirement of a composition, and in its absence even the best composition cannot function.
• Exception handling: The composition must deal with exceptional cases of failure in transactions and roll back the transaction to its previous state.
• Scalability: Scalability is the ability of the Web service to process multiple requests within a specified time interval. It is measured by the number of requests resolved within a time span, and scalability increases with handling capacity.
• Correctness: Correctness concerns the behavior of a composition when the service compositions become concurrent, large or complex in nature.
• QoS: QoS covers the requirement parameters supplied by users that are used to produce the end results of the service.
All compositions offer connectivity, without which it would be impossible to integrate a set of services, but the ability of a service to handle non-functional properties (QoS) is also desirable. For example, a customer should not wait at an ATM for three whole minutes to get a response. A composition that neglects non-functional properties such as response time, security, cost, reliability and scalability is incomplete. For correctness, the service and composition specifications are used to assess and verify the behavior of the composition under different circumstances. Automated compositions, treated as a part of the Semantic Web vision, are intelligent processes driven by specifications of services and requirements. This design leads to a composition generated without human involvement. All well-defined services should allow an unambiguous selection against the requirements, which is the clear and undisputed goal for performance. Table 4 lists a comparison of composition standards.
Table 4. Comparison of Composition Standards
| | WSFL | XLANG | BPEL4WS | DAML-S |
|---|---|---|---|---|
| Process Modeling | Supports graph based process modeling | Supports construct based process modeling | Supports a blend of graph/construct based process modeling | Supports construct based process modeling |
| Support for Semantics | Does not support semantics | Does not support semantics | Does not support semantics | Supports semantic description and discovery of services using ontologies |
| Support for QoS specification | Uses WSEL extensibility elements to specify QoS | Does not support QoS specification | Does not support QoS specification yet | Supports QoS specification through its Service Profile class |
| Relationship to WSDL | Layered on top of WSDL providing composition | Layered on top of WSDL providing composition | Layered on top of WSDL providing composition | Models a language which provides more features than that supported by WSDL |
CHAPTER – 4
A NOVEL AUTOMATIC WEB SERVICE COMPOSITION FRAMEWORK FOR USER DEPENDENT WEB MINING
4.1 INTRODUCTION
Compositions incorporate information from various sources and domains to achieve a task in a Service-Oriented Architecture (SOA). A single service is the basic unit of operation and may be unable to cater to all functional requirements. Services, when connected, can satisfy complex requirements, and different functional requirements call for different compositions. Traditionally, web service composition has been performed manually, which makes it a time-consuming and error-prone task, yet automatic methods for composing services into a coherent task may not be readily available. It is difficult to find a single web service that produces the desired output from the given inputs. A composite service with process logic can instead be decomposed into subtasks that correspond to web services. The task is divided into subtasks, and each subtask is bound to a single web service based on the logic of the workflow [77]. This paper examines various approaches and proposes a general-purpose framework for automatic service composition.
4.2 MANUAL WEB SERVICE COMPOSITION PROCESS
In a manual composition a user creates an abstract specification of a high-level task.
User manipulation provides autonomy and control over execution, but the user needs enough domain knowledge to decompose the specification and create a workflow. Users utilize standard web service languages like WS-BPEL [78] or [79]. The user then binds web services to the workflow, typically by searching and discovering services in a web service repository such as UDDI [80]. The services then have to be monitored to check that each service remains bound to its task, and faults have to be handled as they arise. User domain knowledge thus plays a primary role in such compositions. Manual composition is a time-consuming and error-prone process with no guarantees, which makes it imperative to automate compositions. Fig. 8 is an example of a workflow derived from high-level specifications.
Fig. 8 – Abstract Web Composition Workflow.
4.3 AUTOMATIC WEB SERVICE COMPOSITION
Automatic web composition can be defined as a five-phase process: specification, planning, validation, discovery and execution. In the specification phase the user specifies goals, requirements and constraints. The planning phase provides an automatic way to compose an abstract workflow. The validation phase applies techniques to ensure that the composite process can be realized. The discovery phase finds services that satisfy the task specifications, and the request is finally executed in the execution phase. Fig. 9 shows a composition framework with each stage of the composition process.
Fig. 9. Web service composition framework with inputs and outputs
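The five phases can be sketched in Python as a small pipeline. This is a deliberately simplified illustration: the function names, the dictionary-based specification and the repository of lambda "services" are assumptions and do not correspond to any particular composition engine.

```python
def plan(spec):
    # Planning: here the abstract workflow is simply the ordered list of goals.
    return [{"task": goal} for goal in spec["goals"]]


def validate(workflow, repository):
    # Validation: every abstract task must be realizable by a known service.
    return all(step["task"] in repository for step in workflow)


def discover(workflow, repository):
    # Discovery: bind each abstract task to a concrete callable service.
    return [repository[step["task"]] for step in workflow]


def execute(bound_services, data):
    # Execution: run the bound services in workflow order, passing data along.
    for service in bound_services:
        data = service(data)
    return data


# Specification: goals and input supplied by the user (hypothetical example).
spec = {"goals": ["translate", "send"], "input": "hello"}
repository = {
    "translate": lambda text: text.upper(),   # placeholder for a translation service
    "send": lambda text: f"sent:{text}",      # placeholder for a delivery service
}

workflow = plan(spec)
if not validate(workflow, repository):
    raise ValueError("workflow cannot be realized with the known services")
result = execute(discover(workflow, repository), spec["input"])
```

Keeping each phase as a separate function reflects the point that validation may send the process back to specification or planning before any service is bound.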
4.3.1. SPECIFICATION PHASE
In this phase, the user specifies the goal to be achieved through the composition of web services, producing an abstract specification. The specification should be detailed enough to capture both functional and non-functional requirements. Functional requirements are the high-level goals of the task, and constraints are the non-functional aspects. Tasks are described in terms of the goal to achieve. The specification can be given at different levels depending on the user’s domain knowledge. It has to be decomposed into smaller subtasks to create an abstract workflow. When the user has a set of constraints and preferences, tasks are described by their preconditions. This approach is used in many AI planning schemes and does not require knowledge of a predefined workflow. Languages for the abstract specification range from OWL-S and WS-BPEL to formal service specification languages [80].
4.3.2. PLANNING PHASE
The planning phase takes the abstract specification produced in the specification phase and produces an abstract composite workflow. Workflow-based compositions require domain knowledge and some level of manual implementation by developers. Template-based workflows outline the activities needed to solve a problem and can be parameterized based on user needs. Once generated, they can be reused and extended.
AI planning techniques are a way to generate a process automatically based on the specification of a problem. Planning tools use WS-BPEL with WSDL descriptions, OWL-S, and proprietary languages like PDDL, CSSL that provide logic validation.
4.3.3. VALIDATION PHASE
This phase addresses validation of the abstract composite workflow produced in the planning phase. Syntactic validation checks that the workflow is well-formed and structurally correct, and is typically done with the help of tools. Semantic validation checks the workflow against the goals and requirements given in the specification phase. Most algorithms for semantic validation are based on utility theory, and some on shortest path heuristics [81] or model checking [82]. Assumptions are made about the subtasks within the workflow, which is modeled in a formal language. The process may be iterative, returning to the specification or planning phase. The majority of tools are based on WS-BPEL and OWL-S.
4.3.4. DISCOVERY PHASE
The discovery phase finds suitable web services for each task of the composite workflow by querying service repositories. Discovered services are bound to the corresponding subtasks in the workflow. The design-time binding may change if the status of the service changes while it is executing.
If services are not bound in the discovery phase, alternatives need to be found at run time. A query may return many results because the web services or constraints are insufficiently described, or because the decomposition is insufficient, in which case the workflow or specification needs additional adjustment. The most widely used tool is UDDI, a web service repository and an XML-based open industry initiative. The ebXML registry is an alternative. The repositories differ in the way they store information.
4.3.5. EXECUTION PHASE
The execution phase deploys and executes the created composite services. It includes the control flows specified in the workflow and the recovery mechanisms that ensure proper execution of the composition. Design-time binding is static. Run-time binding increases the lifetime of a workflow, since alternate services can be selected when the first choice is unavailable at the time of execution. Monitoring the execution is an extension of the validation phase. Monitoring tools also examine the input/output messages passed between services to check conformance to constraints. If a service fails during execution, recovery can be done via web service substitution. In a substitution the new service should support the functionalities of the original service being replaced, or alter the context of the composition [83]. Many execution engines built on WS-BPEL or OWL-S are available for composite services. The engines are capable of specification, validation, deployment and management of composite workflows [84]. Monitoring tools use AI planning algorithms for run-time validation and for the formulation of properties.
Fig. 10. Composition process with different phases
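Run-time binding with service substitution can be sketched in Python as follows. The function and service names are hypothetical, and the candidates are assumed to be functionally equivalent, which a real engine would have to establish during discovery.

```python
def invoke_with_substitution(candidates, payload):
    """Try functionally equivalent services in order until one succeeds."""
    errors = []
    for name, service in candidates:
        try:
            return name, service(payload)
        except Exception as exc:  # a real engine would catch narrower fault types
            errors.append((name, str(exc)))
    raise RuntimeError(f"all candidate services failed: {errors}")


def primary(text):
    # Simulates a first-choice service that is unavailable at execution time.
    raise ConnectionError("primary unavailable")


def backup(text):
    # Placeholder substitute offering the same functionality.
    return text[::-1]


used, output = invoke_with_substitution(
    [("primary", primary), ("backup", backup)], "abc"
)
```

Collecting the failures instead of discarding them gives a monitoring component the information it needs to report which bindings had to be replaced.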
4.4 DYNAMIC WEB SERVICES COMPOSITION
One great benefit of SOA is that it enables dynamic service binding, allowing services to be discovered, selected and invoked at runtime with Web services technologies [85]. Web services are programmatically accessible over the Internet, interoperate independently, and offer the flexibility and scalability that compositions need in order to profit from SOA. Automated service discovery, selection and composition are expected to enrich the experience of service end-users through value-added services, and to allow automated processes to interact with minimal human intervention [86]. Dynamic composition of web services is an approach for service-oriented applications. It exploits semantic matchmaking between the outputs and inputs of service parameters to enable interconnections and interactions. Both functional and non-functional properties of services are considered when computing suitable service compositions for a service request. For example, suppose a service developer has to develop a new service that receives text for translation and sends the translated text to a given place. When no such service is available, the developer creates it manually by connecting available single web services. In a dynamic web service composition, a Composition Factory component creates service compositions based on a service request. The Composition Factory queries the service repository to retrieve an unordered set of services required for the service composition. Fig. 11 gives an overview of the steps performed by a dynamic service composition framework. However, some work still has to be done to automate the dynamic discovery and composition of services with current Web technologies, since the automation depends on knowledge about the services.
Fig. 11 Composition Factory
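The matching of service outputs to service inputs performed by a composition component can be sketched in Python as a simple forward-chaining loop. This is only an illustration: concepts are matched by exact name, whereas a semantic matchmaker would also accept equivalent or subsumed ontology concepts. The `compose` function, the service descriptions and the concept names are hypothetical.

```python
def compose(services, available, goal):
    """Chain services whose inputs are already available until `goal` is produced."""
    available = set(available)
    plan, remaining = [], list(services)
    while goal not in available:
        usable = [s for s in remaining if set(s["inputs"]) <= available]
        if not usable:
            return None  # no composition satisfies the request
        step = usable[0]  # greedy choice; may also pick services not strictly needed
        plan.append(step["name"])
        available |= set(step["outputs"])
        remaining.remove(step)
    return plan


# Unordered set of services, as it might be returned by a repository query.
services = [
    {"name": "Sender", "inputs": ["TranslatedText", "Location"],
     "outputs": ["Acknowledgment"]},
    {"name": "Translator", "inputs": ["Text", "Language"],
     "outputs": ["TranslatedText"]},
]
order = compose(services, ["Text", "Language", "Location"], "Acknowledgment")
```

The loop turns the unordered set into an execution order purely from the declared inputs and outputs, which is why the quality of the service descriptions matters so much for automation.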
4.5. WEB COMPOSITION SERVICE REQUEST
Developers specify service requests by annotations that define the requested service goals, inputs, outputs or preconditions. The annotations refer to elements defined in ontologies described in OWL [87]. An example of an annotated service request is:

<"LanguageOnt#Language" name="srcLang">
<"LanguageOnt#English" name="trgtLang">
<"LanguageOnt#Text" name="txtToTrans">
<"Target#Location" name="TargetLocr">

<" Target#Location " name="AcknowledgmentInformation">

<"GoalOnt#translate">
<"GoalOnt#sendTranslatedText">

4.6 GENERAL PURPOSE WEB COMPOSITION FRAMEWORK (GPWCF)
GPWCF takes a different angle on web compositions. It is a framework with five stages that can be implemented by organizations. The user selects a service and sets constraints. The required service is searched for in the services repository and selected. If a corresponding service is not found or not listed, an error message is sent to the user and the error is recorded in the error log.
A scope definition is constructed from the selected service and the user-defined constraints. The scope definition also determines the search space and the information sets returned to the user. All errors, such as errors in the execution of a service, errors in fetching results and network failures, are logged in the error log, enabling improvement of the compositions in future versions. Error logging is a simple and useful utility not found in many compositions. The GPWCF can be used by organizations as a starting point for web service compositions and extended for future requirements. Fig. 12 depicts the GPWCF framework.
Fig. 12. GPWCF Architecture
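The flow described above (service lookup, error logging, and a scope built from user constraints) can be sketched in Python as follows. This is a minimal illustration and not the GPWCF implementation: the function name, the list-based error log, the travel data and the budget constraint are all assumptions.

```python
error_log = []


def run_request(repository, service_name, constraints):
    """Select a service, execute it, and keep only results within the user's scope."""
    service = repository.get(service_name)
    if service is None:
        error_log.append(f"service not found: {service_name}")
        return None
    try:
        results = service()
    except Exception as exc:
        error_log.append(f"execution failed: {service_name}: {exc}")
        return None
    # Scope definition: a result is returned only if it satisfies every constraint.
    return [r for r in results if all(check(r) for check in constraints)]


def within_budget(offer):
    # Hypothetical user constraint: an upper limit of 1000 on the price.
    return offer["price"] <= 1000


repository = {
    "travel": lambda: [
        {"mode": "bus", "price": 450},
        {"mode": "train", "price": 900},
        {"mode": "flight", "price": 4200},
    ],
}

offers = run_request(repository, "travel", [within_budget])
missing = run_request(repository, "hotel", [within_budget])
```

Logging failures in one place, instead of only reporting them to the user, is what allows the log to be reviewed later when the composition is revised.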
4.6.1 GPWCF SCOPE DEFINITION
The mining process begins with an evaluation of the user request. User constraints are considered for the specified request and defined as a list of functional activities in the scope specification of the service. For example, a user request for cheap travel with an upper limit of 1000 is a constraint to be considered. The scope definition would build a constraint and filter all modes of travel costing up to 1000 rupees. Similarly, each constraint or request may pertain to a different domain with separate functional parameters. The mining perspective MP can be formally defined as.
