Optimising search queries through recognition

Updates posted on social media platforms such as Twitter and Facebook have opened a new area for research. Tweets are up to date and cover a broad range of topics, which can be a great help to companies that want to target a particular type of client. Because the volume of tweets grows every day, we consider named entity recognition, information extraction, and text mining over tweets. Named entity recognition (NER) is the task of extracting and classifying mentions of entity classes such as location, person, company, and time. The task is now also being used to optimise search queries so that the results returned better satisfy the user.
A search query consists of linguistic units, typically words, that a user submits to a search engine to specify an information need. Recent research has concentrated on analysing the constituent units of search queries. But because search queries are often unstructured, ambiguous, and short, techniques to detect and classify the named entities in them are required. Using NER, search engines can make sense of the content on the Web as interconnected entities rather than as strings of information to be retrieved for every keyword match in a search query.
Google first patented a technique for rewriting queries in which named entity recognition was introduced. The following year, Microsoft patented a method for augmenting ‘blocks’, that is, passages of web pages, with linguistic features that include mentions of named entities. In 2009, Yahoo! also obtained a patent for using named entities for search query disambiguation. Google acquired Metaweb, which operated the Freebase directory, in 2010 and released the Google Knowledge Graph in 2012. The objective was to provide direct answers to users' search queries without the need for further navigation to other resources. In 2013, Microsoft also patented an ‘entity-based search system’ that detects named entities on the Web and organises search results according to the search query.
Different NER techniques have been used to exploit the named entities present on the Web, and progress in this area has shown that search logs are rich in named entities. A search log is a repository maintained by a search engine to record its users' activities along with their submitted search queries. The advantage of extracting named entities from a search log is that its mentions of named entities are written from the users' perspective. Thus, a novel approach to named entity recognition and classification in search queries is required, one that satisfies the user's need for information. Nevertheless, named entity recognition and classification is a challenging task given the linguistic nature of search queries.
2 Problem Definition and Contributions
Consider the query ‘Bill Clinton work’. If the user submits this query following English spelling rules, such as capitalising proper nouns, then ‘Bill Clinton’ can be detected as a named entity, while the word ‘work’ can be used as a contextual clue to identify other named entities of the class Person in search queries whose last keyword is ‘work’. However, this is not always the case. Some users express their information need in only a few keywords, so their queries do not have enough context to identify or classify a named entity. Even the form in which query keywords are written, such as capitalisation, cannot be relied upon, because users often do not follow orthographic rules when entering search queries. Furthermore, search queries tend to be highly ambiguous, as are the contextual clues embedded within them, and resolving this ambiguity is important for optimising the query results. Finally, the Web changes constantly, and so do search queries, so relying solely on supervised named entity recognition models customised for specific search queries may not give the desired results.
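The ‘Bill Clinton work’ example can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not a proposed method: the function names split_entity_and_context and apply_context_clue are hypothetical, a run of capitalised words is assumed to mark a candidate entity, and the contextual clue is assumed to be the last keyword of the query.

```python
import re
from typing import Optional, Tuple

# Assumption: a candidate named entity is a run of one or more
# capitalised words, e.g. "Bill Clinton".
CAPITALISED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*")


def split_entity_and_context(query: str) -> Optional[Tuple[str, str]]:
    """Return (candidate entity, remaining keywords), or None when the
    query contains no capitalised run (hypothetical helper)."""
    match = CAPITALISED_RUN.search(query)
    if match is None:
        return None
    entity = match.group(0)
    # Everything outside the capitalised run is kept as context.
    remainder = query[:match.start()] + " " + query[match.end():]
    return entity, " ".join(remainder.split())


def apply_context_clue(query: str, clue: str) -> Optional[str]:
    """If the last keyword of the query equals `clue`, return the
    preceding keywords as a candidate entity (hypothetical helper)."""
    tokens = query.split()
    if len(tokens) > 1 and tokens[-1].lower() == clue:
        return " ".join(tokens[:-1])
    return None


if __name__ == "__main__":
    print(split_entity_and_context("Bill Clinton work"))
    print(split_entity_and_context("bill clinton work"))
    print(apply_context_clue("bill clinton work", "work"))
```

With these assumptions, the capitalised query yields the pair ("Bill Clinton", "work"), whereas the lowercase variant yields nothing from the capitalisation rule and is recovered only through the clue ‘work’. The same clue also matches a query such as ‘remote work’ and returns ‘remote’, which is not a person, so the sketch exhibits both the unreliable orthography and the ambiguous context described above.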
Research on named entity recognition in search queries has to address the following questions:
– Given the brevity of search queries and their lack of grammatical structure, how can named entities be identified and classified so that search queries give optimised results?
– How can the boundaries between named entities be identified when a search query contains more than one named entity?
– Is it possible to classify queries according to their target entity classes using contextual features that are domain-independent, grammar-independent, and adaptable to the dynamic nature of the Web and of search queries?
– What percentage of search queries have no contextual clues to the named entity, and how often do users of search engines further specify their entity-centric queries by adding keywords related to
