and valuable knowledge and may be an important resource for both consumers and
firms. Consumers often look for quality information in online reviews before
buying a product, and firms can use these reviews as feedback for improving
product quality, managing customer relationships, and developing new
marketing strategies.
A product may have a large number of aspects. Some product aspects are
more important than others and have a strong impact on consumers' decision
making as well as on firms' product development strategies. Identifying the important
product aspects is therefore necessary, as it benefits both consumers and firms.
Consumers can make purchasing decisions more easily by paying attention to
the important aspects, and firms can focus on improving the quality of these aspects
and thus enhance product reputation efficiently.
A simple solution for important aspect identification is a frequency-based approach,
in which the aspects most frequently commented on in consumer reviews are selected
as the important aspects. However, consumers' opinions on the frequent aspects may not
influence their overall opinions on the product, and thus may not influence their
purchase decisions.
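The frequency-based baseline described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the function name `rank_aspects_by_frequency`, the sample reviews, and the candidate aspect list are all hypothetical, and aspects are matched by plain substring search, which is an assumption made to keep the example short.

```python
from collections import Counter


def rank_aspects_by_frequency(reviews, aspects):
    """Rank candidate aspects by the number of reviews that mention them.

    Matching is a naive, case-insensitive substring test (an assumption
    made for this sketch; a real system would extract aspects properly).
    """
    counts = Counter()
    for review in reviews:
        text = review.lower()
        for aspect in aspects:
            if aspect in text:
                counts[aspect] += 1  # count each review at most once per aspect
    return counts.most_common()


# Hypothetical sample data, for illustration only.
reviews = [
    "The battery lasts all day and the screen is sharp.",
    "Battery life is poor.",
    "Great screen, average camera.",
    "The battery drains quickly.",
]
aspects = ["battery", "screen", "camera"]

ranking = rank_aspects_by_frequency(reviews, aspects)
print(ranking)
```

Such a ranking reflects only how often an aspect is mentioned. It says nothing about whether opinions on that aspect affect the overall opinion of the product, which is the limitation noted above.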
Motivated by the above observations, this paper presents a product aspect ranking
system that identifies important product aspects from consumer reviews and
ranks them by taking into account both aspect frequency and consumers' opinions
on the frequent aspects.
3.1 PROJECT IDEA
The use of the Internet and e-commerce is growing very rapidly. Numerous consumer
reviews of products are now available on the Internet. Consumer reviews contain rich
VPCOE,Baramati, Department of Computer Engineering 2015 12
and valuable knowledge for both firms and users. The reviews are often unordered,
leaving consumers unsure which aspects they should give importance to.
Consumers need to know which aspects to consider when purchasing the
product.
Hence, there is a need to develop an intelligent application that removes this
confusion for consumers and also helps firms improve product quality.
A product aspect ranking framework automatically identifies the important aspects
of products from online consumer reviews, aiming to improve the usability
of the numerous reviews.
The important aspects are identified based on two observations:
1. The important aspects are usually commented on by a large number of consumers.
2. Consumers' opinions on the important aspects influence their overall opinions
on the product.
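One way to combine these two signals is sketched below. It is only an illustration under stated assumptions, not the ranking method of the system: each review is assumed to carry an overall rating plus a per-aspect opinion score (+1 or -1), and an aspect's importance is taken to be its mention frequency multiplied by the absolute Pearson correlation between its opinion scores and the overall ratings. The names `pearson` and `rank_aspects` and the sample data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if undefined)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_aspects(reviews):
    """reviews: list of (overall_rating, {aspect: opinion_score}) pairs.

    Assumed scoring rule for this sketch:
    importance = mention frequency * |correlation(opinion, overall rating)|.
    """
    total = len(reviews)
    mentions = {}
    for overall, opinions in reviews:
        for aspect, opinion in opinions.items():
            mentions.setdefault(aspect, []).append((opinion, overall))

    scores = {}
    for aspect, pairs in mentions.items():
        frequency = len(pairs) / total
        influence = abs(pearson([p[0] for p in pairs], [p[1] for p in pairs]))
        scores[aspect] = frequency * influence
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: both aspects are mentioned in every review, but only
# the opinion on "battery" moves together with the overall rating.
reviews = [
    (5, {"battery": 1, "color": -1}),
    (1, {"battery": -1, "color": -1}),
    (4, {"battery": 1, "color": 1}),
    (2, {"battery": -1, "color": 1}),
]
ranking = rank_aspects(reviews)
print(ranking)
```

In this sample both aspects are equally frequent, so frequency alone cannot separate them. The second observation is what places "battery" above "color".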
3.2 MOTIVATION OF THE PROJECT
Consumers can conveniently make purchasing decisions by paying more attention
to the important aspects, while firms can focus on improving the quality of these aspects
and thus enhance product quality effectively. However, it is difficult for people to
manually identify the important aspects of products from numerous reviews. Therefore,
an approach to automatically identify the important aspects is required.
Motivated by the above observations, the system proposes a product aspect ranking
framework that automatically recognizes the important aspects of a product from
online consumer reviews, aiming to improve the usability of the numerous reviews.
3.3 LITERATURE SURVEY
1. Overview of Text Summarization Extractive Techniques[2]
Text summarization plays an important role in the area of natural language processing and text mining. Text summarization aims to create a compressed summary
while retaining the main characteristics of the original set of documents. It uses
linguistic methods to examine and interpret the text, and then finds new concepts
and expressions that best describe it, generating a new, shorter text that conveys
the most important information from the original text document. In this paper,
extractive methods such as query-based extractive text summarization and
multi-document extractive summarization are used.
A. Extractive Methods:
The approach naturally integrates linguistic features, such as part-of-speech and
the surrounding contextual clues of words, into automatic learning.
1. Term Frequency-Inverse Document Frequency (TF-IDF) Method:
In this method, to generate a generic summary, the non-stop-words that occur most
frequently in the document(s) may be taken as the query words. Since these words
represent the theme of the document, they generate generic summaries.
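The idea above can be sketched as follows: take the most frequent non-stop-words as query words, then extract the sentences that contain the most query words. This is a minimal illustration under stated assumptions, not the surveyed paper's implementation. The stop-word list is a tiny illustrative set, the sentence splitter is a simple regular expression, and the names `tokenize` and `generic_summary` and their parameters are hypothetical.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "it"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def generic_summary(document, num_query_words=2, num_sentences=2):
    """Pick frequent non-stop-words as query words, then extract the
    sentences containing the most query-word occurrences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document)
                 if s.strip()]
    words = [w for w in tokenize(document) if w not in STOP_WORDS]
    query_words = {w for w, _ in Counter(words).most_common(num_query_words)}

    # Score each sentence by how many query-word occurrences it contains.
    scored = [(sum(1 for w in tokenize(s) if w in query_words), i)
              for i, s in enumerate(sentences)]
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:num_sentences]
    chosen = sorted(i for _, i in top)  # restore original sentence order
    return query_words, [sentences[i] for i in chosen]


# Hypothetical sample document, for illustration only.
document = ("The battery is excellent. "
            "The battery charges fast and the battery lasts long. "
            "The box is blue. "
            "The screen is bright and the screen is large.")
query_words, summary = generic_summary(document)
print(query_words)
print(summary)
```

Note that this sketch uses raw term frequency only. Weighting the query words by inverse document frequency would require a collection of documents rather than a single one.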
2. Cluster-based method:
Document clustering becomes almost essential for generating a meaningful summary.
Documents are represented using term frequency-inverse document frequency (TF-IDF)
scores of words. The term frequency used in this context is the average number of
occurrences (per document) over the cluster.
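The cluster-level weighting described above can be sketched as follows. This is a hedged illustration, not the surveyed method's exact formula: the term frequency is averaged over the documents of a cluster as stated, while the inverse document frequency is assumed here to be log(N / df) computed over all documents in all clusters. The function name `cluster_tfidf` and the sample clusters are hypothetical, and the clusters are taken as given rather than computed.

```python
import math
from collections import Counter


def cluster_tfidf(clusters):
    """clusters: dict mapping a cluster name to a list of tokenized documents.

    Returns, per cluster, a dict of term weights where
    weight = (average occurrences per document in the cluster) * idf,
    with idf = log(N / df) over the whole collection (an assumption).
    """
    all_docs = [doc for docs in clusters.values() for doc in docs]
    n_docs = len(all_docs)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in all_docs:
        doc_freq.update(set(doc))

    weights = {}
    for name, docs in clusters.items():
        totals = Counter()
        for doc in docs:
            totals.update(doc)
        weights[name] = {
            term: (count / len(docs)) * math.log(n_docs / doc_freq[term])
            for term, count in totals.items()
        }
    return weights


# Hypothetical, already-clustered toy documents.
clusters = {
    "battery": [["battery", "life", "good"], ["battery", "battery", "weak"]],
    "screen": [["screen", "good"], ["screen", "bright"]],
}
weights = cluster_tfidf(clusters)
for name, terms in weights.items():
    print(name, sorted(terms.items(), key=lambda kv: -kv[1]))
```

Sentences in a cluster could then be scored by summing the weights of their terms, so that each cluster contributes its most representative sentences to the summary.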
Drawbacks
1. Extracted sentences usually tend to be longer than average. As a result,
segments that are not essential to the summary also get included, consuming
space.
2. Important or relevant information is usually spread across sentences, and extractive summaries cannot capture this (unless the summary is long enough to
hold all those sentences).
3. Conflicting information may not be presented accurately.
4. Pure extraction often leads to problems in overall coherence of the summary.