Recommendation systems can be considered a valuable extension of the traditional information systems used in industries such as travel and hospitality. However, recommendation systems have mathematical roots and are more akin to artificial intelligence (AI) than to any other IT discipline. A recommendation system learns from a customer’s behavior and recommends products in which that customer may be interested. At the heart of recommendation systems are machine-learning constructs. Leading e-commerce players use recommendation engines that analyze users’ past purchase histories to recommend products such as magazine articles, books, goods, etc. Online companies that leverage recommendation systems can increase sales by 8% to 12%. Companies that succeed with recommendation engines are those that can quickly and efficiently turn vast amounts of data into actionable information.
Anatomy of a Recommendation Engine
The key component of a recommendation system is data. This data may be garnered by a variety of means such as customer ratings of products, feedback/reviews from purchasers, etc. This data will serve as the basis for recommendations to users. After data collection, recommendation systems use machine-learning algorithms to find similarities and affinities between products and users. Recommender logic programs are then used to build suggestions for specific user profiles. This technique of filtering the input data and giving recommendations to users is also known as ‘collaborative filtering.’
Along with collaborative filtering, recommendation systems also use other machine-learning techniques such as clustering and classification of data. Clustering is a technique that is used to bundle large amounts of data together into similar categories. It is also used to see data patterns and render huge amounts of data simpler to manage. For instance, Google News creates clusters of similar news information when grouping diverse arrays of news articles. Many other search engines use clustering to group results for similar search terms.
Classification is a technique used to decide whether new input or a search term matches a previously observed pattern. It is also used to detect suspicious network activity. Yahoo! Mail uses classification to decide if an incoming message is spam. Image-sharing sites like Picasa use classification techniques to determine whether photos contain human faces. They then offer recommendations of people that are identified in the user contacts list.
Approaches to Personalization
From an architectural and algorithmic point of view, personalization systems fall into three basic categories: rule-based systems, content-based filtering systems, and collaborative filtering systems.
Rule-Based Personalization Systems
Rule-based filtering systems rely on manually or automatically generated decision rules that are used to recommend items to users. Many existing e-commerce websites that employ personalization or recommendation technologies use manual rule-based systems. Such systems allow website administrators to specify rules, often based on demographic, psychographic, or other personal characteristics of users. In some cases, the rules may be highly domain-dependent and reflect particular business objectives of the website. The rules determine the content served to a user whose profile satisfies one or more rule conditions. Like most rule-based systems, this type of personalization relies heavily on knowledge engineering by system designers, who construct a rule base in accordance with the specific characteristics of the domain or with market research. The user profiles are generally obtained through explicit interactions with users. Some research has focused on machine-learning techniques that classify users into one of several categories based on their demographic attributes and thereby automatically derive decision rules that can be used for personalization.
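As an illustration of how such decision rules can drive the content served to a user, the following is a minimal sketch rather than a description of any particular product. The `Rule` class, the profile fields (`occupation`, `trips_per_year`), and the item names are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # predicate evaluated against a user profile
    items: list                        # content served when the rule fires


def recommend_by_rules(profile: dict, rules: list) -> list:
    """Collect the items of every rule whose condition the profile satisfies."""
    recommended = []
    for rule in rules:
        if rule.condition(profile):
            for item in rule.items:
                if item not in recommended:  # avoid duplicates across rules
                    recommended.append(item)
    return recommended


# Hypothetical rules written by a site administrator (demographic and behavioral).
RULES = [
    Rule("student_offer", lambda p: p.get("occupation") == "student",
         ["textbook_bundle"]),
    Rule("frequent_traveler", lambda p: p.get("trips_per_year", 0) >= 5,
         ["lounge_pass", "travel_insurance"]),
]

profile = {"occupation": "student", "trips_per_year": 6}
print(recommend_by_rules(profile, RULES))
```

Because the rules are plain data, an administrator can add or remove them without touching the matching logic, but every rule still has to be authored and maintained by hand.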
The primary drawbacks of rule-based filtering techniques, in addition to the usual knowledge engineering bottleneck problem, emanate from the methods used for the generation of user profiles. The input is usually the subjective description of users or their interests by the users themselves, and thus is prone to bias. Furthermore, the profiles are often static, and thus the system performance degrades over time as the profiles age.
Content-Based Filtering Systems
Content-based filtering approaches, also called item-to-item correlation, make recommendations to customers based on historical customer records. This kind of recommendation relies mainly on association rules and patterns between goods. Associations may be based on co-purchase data, co-visit data, content similarities, preference by common customers, or other measures. The association between item A and item B is measured by inspecting past purchase records: if the two products are frequently purchased together, they have a higher association; otherwise, they have a lower association. In such recommendation systems, if items A and B are strongly associated, item B may be recommended to customers who have purchased item A.
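The co-purchase association described above can be sketched with simple pair counting. This is only an illustrative assumption about how the association might be measured (raw co-occurrence counts); the basket data and the function names `co_purchase_counts` and `associated_items` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of products appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def associated_items(item, counts, top_n=3):
    """Return the products most frequently bought together with `item`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]


# Hypothetical purchase history: each inner list is one transaction.
baskets = [["A", "B"], ["A", "B", "C"], ["A", "C"], ["B", "D"], ["A", "B"]]
counts = co_purchase_counts(baskets)
print(associated_items("A", counts))
```

A production system would normally normalize these counts (for example by the popularity of each product) so that best-sellers do not dominate every recommendation.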
In content-based filtering systems, a user profile represents the content descriptions of items in which that user has previously expressed interest. The content descriptions of items are represented by a set of features or attributes that characterize that item. The recommendation generation task in such systems usually involves the comparison of extracted features from unseen or unrated items with content descriptions in the user profile. Items that are considered sufficiently similar to the user profile are recommended to the user.
In most content-based filtering systems, particularly those used on the Web and in e-commerce applications, the content descriptions are textual features extracted from Web pages or product descriptions. As such, these systems often rely on well-known document modeling techniques with roots in information retrieval and information filtering research. Both user profiles, as well as items themselves, are represented as weighted term vectors (e.g., based on the TF-IDF term-weighting model). Predictions of user interest in a particular item can be derived based on the computation of vector similarities (e.g., using the Cosine similarity measure) or using probabilistic approaches such as Bayesian classification. Furthermore, in contrast with approaches based on collaborative filtering, the profiles are individual in nature, built only from features associated with items previously seen or rated by the active user.
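To make the term-vector comparison concrete, here is a small pure-Python sketch of TF-IDF weighting and cosine similarity. It assumes one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency); the product names and descriptions are invented for illustration, and the user profile is simply the sum of the vectors of liked items.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """docs maps an item name to its token list; returns name -> {term: weight}."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency of each term
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        vectors[name] = {t: (c / len(tokens)) * math.log(n / df[t])
                         for t, c in tf.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(liked, vectors, top_n=2):
    """Rank unseen items by similarity to a profile built from liked items."""
    profile = Counter()
    for name in liked:
        for term, weight in vectors[name].items():
            profile[term] += weight
    scores = {name: cosine(profile, vec)
              for name, vec in vectors.items() if name not in liked}
    return sorted(scores, key=lambda name: -scores[name])[:top_n]


# Hypothetical product descriptions, already tokenized.
items = {
    "hiking_boots": "leather hiking boots trail".split(),
    "trail_shoes": "lightweight trail running shoes".split(),
    "office_chair": "ergonomic office chair".split(),
    "trekking_poles": "trail hiking poles".split(),
}
vectors = tf_idf_vectors(items)
print(recommend(["hiking_boots"], vectors))
```

The profile here depends only on the active user's own history, which is exactly the property that distinguishes this approach from collaborative filtering.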
The primary drawback of content-based filtering systems is their tendency to over-specialize the item selection since profiles are solely based on the user’s previous ratings of items. User studies have shown that users find online recommenders most useful when they recommend unexpected items, suggesting that using content similarity alone may result in missing important ‘pragmatic’ relationships among Web objects, such as their common or complementary utility in the context of a particular task. Furthermore, content-based filtering requires that items can be represented effectively using extracted textual features, which is not always practical given the heterogeneous nature of Web data.
Collaborative Filtering Systems
Collaborative filtering approaches, also called customer-to-customer correlation, recommend products to specific customers based on the similarity of their preferences to those of other customers. This allows the system to discover new content of interest to the customer. Collaborative filtering identifies a customer’s neighbors from historical information and, by analyzing these neighbors, predicts which content the customer is likely to enjoy. It differs from content-based filtering, where recommendations are based on products similar to those the customer liked in the past. In collaborative filtering, only the similarities between customers are calculated, not those between products.
There are three steps to provide suggestions for new customers using collaborative filtering:
- A profile is created for the new customer from the items the customer selects on the website.
- The new profile is compared with other customers’ profiles to find similar ones.
- Products that the new customer has not yet selected are recommended based on the profiles of similar customers.
Collaborative filtering has tried to address some of the shortcomings of other approaches mentioned above. Particularly in the context of e-commerce, recommender systems based on collaborative filtering have achieved notable successes. These techniques generally involve matching the ratings of a current user for objects (e.g., movies or products) with those of similar users (nearest neighbors) to produce recommendations for objects not yet rated or seen by an active user.
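The nearest-neighbor matching described above can be sketched as follows. This is a minimal illustration, assuming cosine similarity between users' rating vectors and a similarity-weighted average for the prediction; the users, items, and ratings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated items count as zero."""
    common = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item, k=2):
    """Predict `user`'s rating of `item` from the k most similar users who rated it."""
    candidates = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors = sorted(candidates, reverse=True)[:k]  # most similar first
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no usable neighbors for this item
    return sum(sim * rating for sim, rating in neighbors) / total


# Hypothetical rating data on a 1-5 scale; "dave" is the active user.
ratings = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 5, "m2": 5, "m4": 5},
    "carol": {"m1": 1, "m3": 5, "m4": 1},
    "dave": {"m1": 4, "m2": 4},
}
print(predict_rating(ratings, "dave", "m4"))
```

Note that the neighborhood is computed at prediction time, over all other users, which is the online cost that the scalability discussion in this section refers to.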
However, collaborative filtering techniques have their own potentially serious limitations. The most important of these is their lack of scalability. Essentially, the k-nearest-neighbor (kNN) approach described above requires that the neighborhood formation phase be performed as an online process, in contrast to model-based approaches, where model learning is performed offline from training data. As the numbers of users and items increase, this approach may lead to unacceptable latency in providing recommendations or dynamic content during user interaction.
Another limitation of kNN-based techniques arises from the sparse nature of the dataset. As the number of items in the database increases, the density of each user record with respect to these items will decrease. This, in turn, decreases the likelihood of a significant overlap of visited or rated items among pairs of users, resulting in less reliable computed correlations. Furthermore, collaborative filtering usually performs best when explicit non-binary user ratings for similar objects are available. In many websites, however, it may be desirable to integrate personalization actions throughout the site, involving different types of objects, including navigational and content pages, as well as implicit product-oriented user events such as shopping cart changes or product information requests.
A number of optimization strategies have been proposed and employed to remedy these shortcomings. These strategies include similarity indexing and dimensionality reduction to reduce real-time search costs and remedy the sparsity problems, as well as offline clustering of user records, which allows the online component of the system to search only within a matching cluster.
A model-based variant of collaborative filtering is known as item-based collaborative filtering. Starting from the same user-rating profile databases, an item-item similarity matrix is built offline and used in the prediction phase to generate recommendations. Rather than basing item similarity on content descriptions of the items, similarity between items is based on user ratings of these items. Each item is represented by a vector, and the similarities are computed using metrics such as cosine similarity and correlation-based similarity. The recommendation process predicts the rating for items not previously seen or rated by an active user using a weighted sum of the ratings by that user of items in the item neighborhood of the target item. Evaluations have shown that item-based collaborative filtering can provide recommendations that are, in general, of similar quality to those of memory-based collaborative approaches.
Most data mining approaches to personalization can be viewed as extensions of collaborative filtering. In these approaches, the pattern discovery algorithms take as input the historical rating or navigational profiles of past users and generate aggregate user models. The user models, in turn, can be used in conjunction with the profile of an active user to predict future user behavior or generate recommendations.
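The item-based variant described above can be sketched in two phases: an offline step that builds the item-item similarity matrix from user ratings, and an online step that predicts a rating as a similarity-weighted sum. The sketch assumes cosine similarity and uses the whole set of rated items as the neighborhood; the rating data and function names are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two sparse vectors keyed by user."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_item_similarity(ratings):
    """Offline phase: represent each item by its user ratings and compare all pairs."""
    item_vectors = defaultdict(dict)
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            item_vectors[item][user] = rating
    items = sorted(item_vectors)
    sim = {i: {} for i in items}
    for idx, i in enumerate(items):
        for j in items[idx + 1:]:
            s = cosine(item_vectors[i], item_vectors[j])
            sim[i][j] = s
            sim[j][i] = s
    return sim


def predict(user_ratings, target, sim):
    """Online phase: weighted sum of the user's own ratings of similar items."""
    numerator = denominator = 0.0
    for item, rating in user_ratings.items():
        s = sim.get(target, {}).get(item, 0.0)
        numerator += s * rating
        denominator += abs(s)
    return numerator / denominator if denominator else None


# Hypothetical rating data on a 1-5 scale.
ratings = {
    "alice": {"m1": 5, "m2": 4, "m3": 1},
    "bob": {"m1": 5, "m2": 5, "m4": 5},
    "carol": {"m1": 1, "m3": 5, "m4": 1},
    "dave": {"m1": 4, "m2": 4},
}
similarity = build_item_similarity(ratings)
print(predict(ratings["dave"], "m4", similarity))
```

Because the similarity matrix can be precomputed and refreshed periodically, the online step only touches the items the active user has already rated.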
Service-Oriented Architecture (SOA) to Implement E-Business
Software development has passed through several paradigms. The Service-Oriented Architecture (SOA), as the next-generation software architecture, uses web services, XML, and other related technologies to provide a viable working solution for implementing dynamic e-business. In this paper, we develop a framework with implementation based on SOA.
A Service-Oriented Architecture is essentially a collection of services that communicate with one another. The communication can involve simple data passing, or it can involve two or more services coordinating some activity, which requires a means of connecting services to each other. Early service-oriented architectures were built with DCOM or with Object Request Brokers (ORBs) based on the CORBA specification.
To understand service-oriented architecture, one must begin with a clear understanding of the term ‘service.’ A service is a function that is well-defined, self-contained, and does not depend on the context or state of other services. Web services are the most likely connection technology for service-oriented architectures. Web services essentially use XML to create a robust connection. A service consumer sends a service request message to a service provider. The service provider returns a response message to the service consumer. The request and subsequent response connections are defined in a way that is understandable to both the service consumer and the service provider. How those connections are defined is the concern of the web services technology itself.
As a distributed software model, an SOA usually comprises three primary parties: Producer (of services), Consumer (of services), and Directory (of services). Web Services are considered an example of Service-Oriented Architecture. Service Networks take on the properties of an SOA.
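The relationship among the three parties can be illustrated with a small in-process sketch. It is only an analogy under stated assumptions: plain Python dictionaries stand in for XML request and response messages, a function stands in for a remote service, and the `Directory` class and `greeting_service` are hypothetical names.

```python
from typing import Callable, Dict

Message = Dict[str, str]
Service = Callable[[Message], Message]


class Directory:
    """Registry where producers publish services and consumers look them up."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def publish(self, name: str, service: Service) -> None:
        self._services[name] = service

    def lookup(self, name: str) -> Service:
        return self._services[name]


def greeting_service(request: Message) -> Message:
    """Producer side: well-defined, self-contained, no state kept between calls."""
    return {"status": "ok", "greeting": f"Hello, {request['name']}"}


# Producer publishes the service in the directory.
directory = Directory()
directory.publish("greeting", greeting_service)

# Consumer discovers the service, sends a request message, receives a response.
service = directory.lookup("greeting")
response = service({"name": "Ada"})
print(response)
```

In a real deployment the lookup and the call would cross a network boundary, but the contract is the same: the consumer depends only on the message format, not on how the producer is implemented.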
Related Work
Bachus et al. hold a U.S. patent for a Healthcare Provider Recommendation System (HPRS). A consumer who has experienced good healthcare service can register the provider of that service in the system. Information about the registered services accumulates and is made available to other users; a user seeking treatment can query the system using provider contexts such as location, specialty, and reputation. The system can encourage information sharing among consumers, and its ratings of healthcare services are likely to be honest and genuinely helpful from the consumer’s point of view. However, because services are registered by laypeople, the information may not be sufficient to reflect professional quality. Moreover, the system cannot take a user’s own context into account, especially health status, so novice users may not receive useful recommendations.
ABC Homeopathy is a healthcare website that proposes remedies or medicines according to users’ symptoms. A user can select multiple symptoms for each body part and retrieve information about remedies for the selected symptoms. This system can therefore be regarded as taking health status into account when recommending remedies. However, its results are simple and limited to remedies only. Moreover, it does not capture health status in enough detail for fine-grained recommendations, and it is of little use when users are not aware of their specific symptoms.
Bahram Amini, Roliana Ibrahim, and Mohd Shahizan Othman of the Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia, in ‘A Framework for Personalized Information Integration in Higher Education Institutes’ (International Journal of Computer Applications (0975-8887), Vol. 23, No. 4, June 2011), propose a service-oriented framework that augments recommendation approaches with components for semantic-based information integration and provides interactive, context-based information integration for decision makers in Higher Education Institutes. The underlying semantic web technology facilitates on-demand integration of information from internal sources as well as the Web and provides web service discovery and invocation for effective information analysis. In addition, the framework enables users to analyze instances of students’ information and to receive recommendations of new information sources as well as appropriate analytical services based on the students’ status. The service orientation paradigm provides a dynamic and flexible means of communication for service interoperability among the framework components.
Proposed Framework
The System Service Recommendation Framework (SSRF) is a computerized system that recommends suitable services to service consumers based on their various interests. In other words, the framework acts as a mediator for business or non-business interactions between system service providers and consumers. A service provider is a network node that provides a service interface for a software asset that manages a specific set of tasks. A service provider node can represent the services of a business entity, or it can simply represent the service interface for a reusable subsystem. Such a framework is therefore essential functionality for career web software such as e-career portals or search engines for system services.
For more personalized recommendations, SSRF applies the user’s interest to its recommendation process. Interest is information about the user’s current state or condition, and it is the most important key for determining which specific services are suitable for the user. However, for interest to be used without technical obstacles, it must be measurable and standardized. As a mediator, SSRF manages complex interactions among system service providers, consumers, and system administrators. System service providers such as professors, directors, engineers, and technical advisors can describe and register their own services on SSRF. Even if a service is not registered, SSRF can provide it on demand by incorporating new sources of data as required. Multiple recommendation mechanisms developed by the system administrators then search for and recommend those registered or unregistered services to users. Users can retrieve information about recommended services and evaluate them. The web portal system provides various services and information about the system and also acts as the interface to SSRF for service consumers. The web portal triggers a recommendation process by automatically delivering users’ interests to SSRF; SSRF performs the recommendation process using the given user contexts, analyzes the data at runtime with runtime analysis tools, and sends the recommended results back to the portal.
Framework Architecture
We designed a flexible architecture for SSRF with the extensibility and scalability of the framework in mind. Because brand-new types of services and interests can emerge at any time after the system is published, SSRF should require little effort to adapt to such changes. SSRF must also be able to handle large numbers of services and consumers. A consumer should be able to receive high-quality recommendations with low delay even when there are many services or many requests from other consumers. To meet these requirements, we adopted the SOA (Service-Oriented Architecture) design paradigm for SSRF. System services can be implemented using Web Services technology and registered easily at runtime. The core logic for recommendation can also be realized as web services. For instance, we can imagine that several web services are available and that a recommendation web service that gathers and arranges them is deployed on the system. Likewise, there can be recommendation web services that are each in charge of their own category, with all their results reorganized for users.
The architecture of SSRF consists of three types of modules: façade, core logic, and data access object. Façade modules are the outer interfaces of SSRF. Each is connected to the core logic module that performs the corresponding main functionality of the system. Core logic modules may need to use the service repository to access information about system services; data access object modules are wholly responsible for all transactions with the System Service Repository and provide convenient interfaces to it.
There are three façade modules in SSRF: interfaces for service providers, for users, and for administrators. Each supports the SSRF operations relevant to its type of system user and, at the same time, protects the framework from incorrect operations. The interface module for service providers interacts with the System Service Manager, the interface for administrators with the Web Service Manager, and the users’ interface with the Web Services Pool, a set of web services for system service recommendation.
The data access object is a well-known pattern for handling databases. SSRF stores all information about system services in a database called the System Service Repository; if data is not available there, it is obtained by searching and stored in a database called Data Provided on-Demand. All transactions with the repository must be performed through the data access objects only. This is a common design pattern that protects data from unintended database operations.
SSRF has two core logic modules for managing the system, the Web Service Manager and the System Service Manager, together with numerous web services for the recommendation logic in the Web Services Pool. The Web Service Manager manages the Web Services Pool, enabling recommendation web services to be deployed and managed at runtime by the system administrators. The only parameter required to deploy a new recommendation web service is the URL (Uniform Resource Locator) of its WSDL (Web Services Description Language) document, and the system administrator can manage deployed web services using that URL as a primary key.
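The idea of managing the pool with the WSDL URL as the primary key can be sketched as below. This is an assumption-laden illustration, not the framework's actual implementation: the class name, its methods, and the example URLs are hypothetical, and the sketch only records the URL without fetching or parsing any WSDL document.

```python
class WebServiceManager:
    """Keeps the Web Services Pool as a dictionary keyed by WSDL URL."""

    def __init__(self):
        self._pool = {}

    def deploy(self, wsdl_url):
        """Deploy a recommendation web service; the URL is the only input needed."""
        if wsdl_url in self._pool:
            raise ValueError(f"already deployed: {wsdl_url}")
        self._pool[wsdl_url] = {"wsdl_url": wsdl_url, "enabled": True}

    def disable(self, wsdl_url):
        """Temporarily exclude a deployed service from recommendation requests."""
        self._pool[wsdl_url]["enabled"] = False

    def undeploy(self, wsdl_url):
        """Remove a service from the pool at runtime."""
        del self._pool[wsdl_url]

    def active_services(self):
        """Return the WSDL URLs of services currently taking requests."""
        return [url for url, entry in self._pool.items() if entry["enabled"]]


# Hypothetical WSDL locations used only for illustration.
manager = WebServiceManager()
manager.deploy("http://example.org/recommenders/by-interest?wsdl")
manager.deploy("http://example.org/recommenders/by-rating?wsdl")
manager.disable("http://example.org/recommenders/by-rating?wsdl")
print(manager.active_services())
```

Using the URL as the key means a second deployment of the same service is detected immediately, and no separate identifier has to be issued by the administrator.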
The System Service Manager provides various functionalities to manage the System Service Repository: registering a new system service, modifying properties of the services, inquiring recommendation statistics, activating or deactivating the service, deregistering the services, and so on.
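The operations listed above map naturally onto a small repository-style class. The following is a sketch under assumptions: an in-memory dictionary stands in for the System Service Repository, and the class, field, and service names are hypothetical rather than taken from the framework.

```python
from dataclasses import dataclass, field


@dataclass
class SystemService:
    service_id: str
    properties: dict = field(default_factory=dict)
    active: bool = True
    times_recommended: int = 0  # simple recommendation statistic


class SystemServiceManager:
    """In-memory stand-in for operations on the System Service Repository."""

    def __init__(self):
        self._services = {}

    def register(self, service_id, **properties):
        self._services[service_id] = SystemService(service_id, dict(properties))

    def modify(self, service_id, **properties):
        self._services[service_id].properties.update(properties)

    def set_active(self, service_id, active):
        self._services[service_id].active = active

    def record_recommendation(self, service_id):
        self._services[service_id].times_recommended += 1

    def statistics(self):
        """Return how many times each registered service has been recommended."""
        return {sid: s.times_recommended for sid, s in self._services.items()}

    def deregister(self, service_id):
        del self._services[service_id]


# Hypothetical service registered by a provider.
ssm = SystemServiceManager()
ssm.register("thesis_advising", provider="Prof. X", topic="research")
ssm.modify("thesis_advising", topic="research methods")
ssm.record_recommendation("thesis_advising")
ssm.set_active("thesis_advising", False)
print(ssm.statistics())
```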
The framework can also analyze the results through runtime analysis tools. These tools analyze the results using software such as MS Excel or SPSS before the results are provided to the user, so that the user can choose the best service.
Implementation
We have implemented the developed framework in a case study of Consultancy Services (CS).
Implementation of Framework for CS
To evaluate the functionality and feasibility of SSRF, we present an example of the consultancy service recommendation process for service consumers, which is likely to account for most of the framework’s transactions. As mentioned before, a recommendation process starts from the e-Career Web Portal system. The Web Portal system requests consultancy service recommendations for a consumer by sending the consumer’s interest to the Consultancy Service Recommendation Framework (CSRF). The process starts automatically while service consumers are using the e-Career Web Portal System and need advice from consultancy services. To start the recommendation process, the e-Career Web Portal triggers a recommendation request to CSRF using the CSRF Invoker module. As input to CSRF, the request message contains the user’s contexts, including interest. CSRF receives the request and, passing it through the façade module, propagates it simultaneously to multiple recommendation web services in the Web Services Pool. After those web services have completed their tasks in parallel, the framework gathers and rearranges their results and analyzes them using runtime analysis tools. Finally, CSRF returns a composed recommendation result to the Web Portal system, which displays information about the recommended services, together with the analysis of the result, to the consumer. From this implementation, we have confirmed that the CSRF recommendation process works as intended and that the resulting consultancy recommendations are valuable and reasonable to consumers. However, a more reliable evaluation of the framework would be a statistical investigation of service consumers’ satisfaction with the recommendation results. Such further evaluation of CSRF, which was difficult without a large experimental group and a long observation period, will become possible when the e-Career Web Portal is used widely and frequently by service consumers.
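The fan-out and merge steps of this process can be sketched with a thread pool. This is an illustrative assumption about how the simultaneous propagation might be realized: local functions stand in for the recommendation web services, scores are combined by simple summation, and the catalog, function names, and scoring scheme are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical catalog of consultancy services.
CATALOG = {
    "resume_review": {"topic": "careers", "rating": 4.2},
    "interview_coaching": {"topic": "careers", "rating": 4.8},
    "tax_advice": {"topic": "finance", "rating": 4.5},
}


def recommend_by_interest(context):
    """Stand-in recommender: services whose topic matches the user's interest."""
    return [(name, 1.0) for name, svc in CATALOG.items()
            if svc["topic"] == context["interest"]]


def recommend_by_rating(context):
    """Stand-in recommender: every service, scored by normalized rating."""
    return [(name, svc["rating"] / 5.0) for name, svc in CATALOG.items()]


def recommend(context, recommenders):
    """Send the same request to all recommenders in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(recommenders)) as pool:
        partial_results = list(pool.map(lambda r: r(context), recommenders))
    combined = {}
    for results in partial_results:
        for service, score in results:
            combined[service] = combined.get(service, 0.0) + score
    # Highest combined score first; ties are broken alphabetically.
    return sorted(combined, key=lambda s: (-combined[s], s))


context = {"interest": "careers"}  # request message carrying the user's interest
ranked = recommend(context, [recommend_by_interest, recommend_by_rating])
print(ranked)
```

Because each recommender receives the same request and returns an independent result list, new recommenders can be added to the pool without changing the merge step.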
Conclusion
In this chapter, we proposed a personalized consultancy service recommendation framework that considers consumers’ interests in order to find and analyze suitable services for them. Our framework gathers information about a service consumer’s interest, automatically calculates the similarities between consumers and consultancy services, and performs runtime analysis of those services. Based on these similarities, the framework ranks and recommends appropriate consultancy services for each consumer. We also implemented CSRF and evaluated its functionality and feasibility. Although the evaluation was not sufficient to validate every approach in this paper, we conclude that our framework is capable of providing better consultancy service recommendations to novice users.