Academic field of study information

I. Introduction

As an academic field of study, Information Retrieval might be defined thus: “Information Retrieval is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers)” []. Furthermore, in the case of textual documents, which are our interest here, a significant part of the difficulty lies in the ambiguity inherent in human languages [1] and in the carelessness of the user. As a consequence, we observe that in order to build a more robust Information Retrieval System (IRS) able to interact naturally with humans, we should involve the user. This paper begins with an overview of IR. In section III, we present the reasons for the emergence of personalized IR. In section IV, we present our work in progress, proposing some key ideas for improving information retrieval. Finally, we conclude this paper in section V.

II. Information Retrieval

[Rijsbergen] defines Information Retrieval as follows: “The user expresses his information need in the form of a request for information, Information Retrieval is concerned with retrieving those documents that are likely to be relevant to his information need as expressed by his request”. This definition contains two important notions that we should define: the document and the request.

The document: we call a document any unit of information that can constitute an answer to a user’s request. It can be a text, a part of a text, a picture or a video.

The request: it constitutes the expression of the user’s information need, which the user submits to the search engine. Diverse types of query languages have been proposed in the literature:

List of keywords: SMART system and OKAPI []
Natural language: SMART system and SPIRIT []
Boolean language: DIALOG system []
Graphic language: NEURODOC []

The principal goal of an information retrieval system is to select the documents that best answer a user request. For that purpose, the information retrieval system groups together a set of methods and procedures for managing collections of documents stored in the form of an intermediate representation.

Thus, interrogating the collection of documents by means of a request requires representing the request in a unified form compatible with that of the documents. These features are represented in a global IR process, known as the U-shaped process, illustrated in figure 1.

This process consists of two main phases: indexing and interrogation.

The indexing mechanism: this is a very important step in the Information Retrieval process. It consists of determining and extracting the representative terms of a document or request. The result of the indexing process is called a descriptor: it can be a list of terms, or a set of significant terms.
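The two phases can be illustrated with a minimal sketch. It assumes a very simple notion of "representative term" (a lowercase word that is not in a small, hypothetical stop-word list) and a naive matching score; the function names `index_text` and `score` are illustrative and do not come from any of the systems cited above.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on"}


def index_text(text: str) -> Counter:
    """Return a descriptor: representative terms mapped to their counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def score(request: Counter, document: Counter) -> int:
    """Count how many request terms also occur in the document descriptor."""
    return sum(1 for term in request if term in document)


documents = {
    "d1": "Indexing extracts the representative terms of a document.",
    "d2": "A picture or a video can also be a document.",
}

# Indexing phase: every document gets a descriptor.
index = {doc_id: index_text(text) for doc_id, text in documents.items()}

# Interrogation phase: the request is indexed the same way, then matched.
request = index_text("representative terms of a document")
ranking = sorted(index, key=lambda d: score(request, index[d]), reverse=True)
```

Because the request goes through the same `index_text` function as the documents, both end up in the same representation, which is what makes the comparison possible.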

According to [5], these could be classified into three main , test pattern.

This paper focuses on the first classification category (result versus process patterns). Result patterns are widely used in several works, the most significant of which are analysis patterns, design patterns, architectural patterns and implementation patterns. Process patterns, by contrast, are still an emerging topic that has not been fully investigated.

Result patterns describe what the solution to the problem looks like (the solution is the result), whereas process patterns describe which process leads to the desired result (the solution is the process).

Regardless of the application domain and the pattern type, the main benefits of patterns lie in proven and helpful knowledge representation, communication and understanding [6], as well as in the capitalization and enhancement of best practices.

Process Patterns Description Models and Languages: A Theoretical Survey

This section deals with the study that we carried out in order to assess process pattern representation and reuse within software development communities. First, the principal works found in the literature are cited. Second, the evaluation criteria proposed for this study are described. Third, comparative tables are presented showing the study’s results. Finally, the main open issues are addressed in a discussion of our survey’s findings.

A. Survey’s Focus

Different works have been carried out concerning process pattern’s modeling and formalization. These works could be divided into two main categories, namely:

1) Process Patterns’ Description Models: This category groups works that provide a terminology for process pattern description. The main process pattern description models include:

AMBLER (1998)
RHODES (2000)
GNATZ (2001)
P-SIGMA (2001)
STÖRRLE (2001)

2) Process Patterns’ Description Languages: This category groups works that provide a language for process pattern description. The main process pattern description languages include:

PROMENADE (2002)
PPDL (2002)
PROPEL (2004)
PLMLx (2004)
UML-PP (2007)

Because of space limitations, we will not present details about the studied works. References are provided to give further information concerning these works.

B. Survey’s Criteria

In order to discuss possible improvements to process pattern capitalization, management and reuse during software development processes, we propose the following criteria to evaluate the aforementioned works.

Pattern’s Formalization Degree: It aims to classify a given work depending on the formalization level used for presenting pattern’s knowledge.
Pattern’s Context Scope: It identifies the different types of context covered by the pattern, i.e. whether the pattern provides a description of the initial context (preconditions) and/or the resulting context (postconditions).
Pattern’s Coverage: It serves to identify the software development phase(s) concerned by the pattern, namely: analysis, design, implementation and testing.
Pattern’s Domain: It identifies the domain addressed by the pattern.
Pattern’s Artifact Referencing: It indicates whether the described process pattern provides any reference to artifact(s) developed and/or used during the pattern’s application.
Pattern’s Referencing: It identifies whether a pattern maintains relationships with one or several pattern(s) and, if so, the relationship type.
Experience’s Referencing: It identifies if the process pattern refers to cases in which the pattern was applied.
Pattern’s Guidance Level: It identifies the level of assistance provided by the pattern’s formalism through the form of the guidelines offered, e.g. examples of use or application, adaptation parameters, adaptation scenarios.
Role’s Implication: It specifies if the pattern’s formalism takes into consideration roles participating in a pattern.
Work Purpose: It identifies the context in which the pattern model or language appeared (i.e. if the work focuses on patterns’ formalization and management issues or other issues related to or using the concept of patterns).
Pattern’s Meta-modeling: It identifies if the proposed format is supported by a meta-model.
Pattern’s Support Tool: It identifies if the proposed pattern’s format is supported by a process pattern based prototype, environment, tool or system.
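One possible way to record an evaluation against these criteria is a small typed record. The sketch below is illustrative only: the class names, field names and the three-level `Support` rating are assumptions made for the example, not part of the surveyed works. The sample values are those reported for AMBLER in Table II.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Support(Enum):
    """Three-level rating used in the comparative tables."""
    YES = "+"        # very well supported
    PARTIAL = "+-"   # not very well supported
    NO = "-"         # not supported


@dataclass(frozen=True)
class Evaluation:
    """One studied work rated against the survey criteria."""
    work: str
    formalization: str            # "Informal", "Semi formal" or "Formal"
    context: str                  # "I.", "R." or "I. and R."
    coverage: Optional[str]       # covered phases, None if not stated
    domain: str                   # e.g. "S.D." for software development
    artifact: Support
    relationships: Optional[str]  # formality of pattern relationships, if any
    experience: Support
    guidance: Support
    roles: Support
    purpose: str
    meta_modeling: Support
    support_tool: Optional[str]   # name of the supporting tool, if any


ambler = Evaluation(
    work="AMBLER", formalization="Informal", context="I. and R.",
    coverage="Software life cycle", domain="S.D.", artifact=Support.NO,
    relationships=None, experience=Support.NO, guidance=Support.PARTIAL,
    roles=Support.NO, purpose="Process patterns description",
    meta_modeling=Support.NO, support_tool=None,
)
```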

C. Survey’s Results

Table 2 and Table 3 show the results of the comparative study of the presented works according to the agreed evaluation criteria. Table 1 presents the legend of the comparative tables.

Tables’ legend

Notation	Description
I.	Initial context
R.	Resulting context
I. and R.	Initial and Resulting context
S.D.	Software Development
H.C.I.	Human Computer Interaction
S.E.	Software Engineering
I.S.E.	Information System Engineering
+	Very well supported criterion
—	Not supported criterion
+-	Not very well supported criterion

Table II

Evaluation results for process patterns’ description models

Criterion	Process Pattern Description Model
Criterion	AMBLER	RHODES	P-SIGMA	GNATZ	STÖRRLE
Formalization	Informal	Formal	Semi formal	Semi formal	Semi formal
Context	I. and R.	I.	I.	I. and R.	I.
Coverage	Software life cycle	Software life cycle	—	—	—
Domain	S.D.	S.D.	I.S.E.	S.D.	S.D.
Artifact	—	—	+	—	—
Relationships	—	—	Informal	Informal	Informal
Experience	—	—	—	+	+
Guidance	+-	+-	+-	+	+
Roles	—	+	—	—	+
Purpose	Process patterns description	Process modeling	Process patterns management and reuse	Software development process management	Process patterns reuse
Meta-modeling	—	—	+	+-	—
Support tool	—	RHODES	AGAP	LiSA-PRO	—

Table III

Evaluation results for process patterns’ description languages

Criterion	Process Pattern Description Language
Criterion	PROMENADE	PPDL	PROPEL	PLMLx	UML-PP
Formalization	Semi formal	Semi formal	Semi formal	Informal	Formal
Context	I. and R.	I. and R.	I. and R.	I. and R.	I. and R.
Coverage	—	Software life cycle	—	Design, Implementation	Software life cycle
Domain	S.D.	S.D.	S.D.	H.C.I.	S.D.
Artifact	—	+	—	+	+
Relationships	—	Semi formal	Semi formal	Semi formal	Formal
Experience	—	—	—	+	—
Guidance	+-	+-	+	+	+-
Roles	+	+	+	—	+
Purpose	Process modeling and reuse	Process patterns’ description and management	Process patterns management	Pattern languages description	Software process modeling
Meta-modeling	+	+	+	—	+
Support tool	—	—	Process Pattern Workbench	PLML PET	PATPRO-MOD

D. Survey Results’ Discussion

The theoretical study that we carried out reveals the following main issues:

Lack of Formalization: RHODES (using PBOOL [14]) and UML-PP (using OCL 2.0 constraints [15]) are the only works that attempt to formalize the description of process patterns. Formalizing the relevant parts of a process pattern description (e.g. problem, context, solution and pattern relationships) would improve its identification and selection, and consequently its reuse. We do not reject informal (natural-language text) or semi-formal (e.g. UML diagrams) descriptions; rather, we encourage the use of all three description levels.
Weak Pattern’s Consideration and Apprehension: Among the studied works, AMBLER, P-SIGMA, STÖRRLE, PPDL, PROPEL and PLMLx are the only ones dedicated to pattern matters, i.e. they deal directly and explicitly with process patterns. The others use this concept to tackle other issues related to processes.
Lack of Process Pattern Meta Modeling: Among the studied works, only a few are supported by a process pattern meta model, namely PROMENADE, P-SIGMA, PPDL, PROPEL and UML-PP. GNATZ does not emphasize process patterns, as it principally focuses on process meta modeling using the pattern concept.
Guidance Insufficiency: Only four works (GNATZ, STÖRRLE, PROPEL and PLMLx) attach importance to, and provide means for, pattern comprehension, application and reuse by giving examples, remarks, guidelines, application constraints, adaptation parameters, and known uses.
Lack of Software Life Cycle Coverage: Only four works consider the importance of patterns during the whole software life cycle, namely AMBLER and RHODES among the process pattern description models and PPDL and UML-PP among the process pattern description languages (cf. Table 2 and Table 3). We assume that patterns are of different types covering the different phases, steps and activities of a software development process. This principle aims to provide pattern support for the whole software life cycle.
Lack of Terminological Consensus: With the exception of the pattern triplet (problem, context and solution), which the majority of works share, most of them suffer from synonymy, where different terms are used for the same meaning, as well as polysemy, where the same term is used for multiple meanings. For example, to refer to the pattern’s intention (goal), STÖRRLE, GNATZ and PROMENADE use the term “intent”, while PLMLx and P-SIGMA use the term “force”. In RHODES, however, the term “intention” refers to the “problem” resolved by the pattern. Furthermore, to describe the pattern’s context, some works (i.e. AMBLER, PPDL, PROPEL, UML-PP) consider both “initial context” and “resulting context” (for PROMENADE, “result context”), while other works use “context” to refer to the initial context and “consequences” (GNATZ) or “resulting context” (PLMLx) for the resulting context; others (i.e. RHODES) use only “context” and do not distinguish between the two types.
Lack of Architectural Consensus: Different process pattern description formats have been proposed, each with its own architecture that best fits the model designer’s needs. Consequently, some works added new facets and included new concepts in their models, such as the roles involved in the pattern (or participants), the artifacts used or produced by the pattern, pattern management parameters, and many other features.
Missing Process Patterns’ Support Approaches and Tools: Even though some works such as P-SIGMA, PROPEL, PLMLx and UML-PP deal with this issue, some important challenges remain, such as unified pattern description, semantic annotation of patterns, and pattern classification, search and mining. Indeed, the majority of the studied works provide process support methods and tools based on patterns (e.g. RHODES, GNATZ, cf. Table 2) and do not focus entirely on improving pattern support.
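Several of the counts cited in this discussion can be recomputed directly from the comparative tables. The following sketch transcribes four columns (formalization, coverage, guidance, meta-modeling) from the two tables of results; the dictionary layout and the helper name `works_where` are assumptions made for the example, and "-" stands for a criterion that is not supported.

```python
# Selected columns transcribed from the two comparative tables:
# work -> (formalization, coverage, guidance, meta-modeling).
RESULTS = {
    "AMBLER":    ("Informal",    "Software life cycle",    "+-", "-"),
    "RHODES":    ("Formal",      "Software life cycle",    "+-", "-"),
    "P-SIGMA":   ("Semi formal", "-",                      "+-", "+"),
    "GNATZ":     ("Semi formal", "-",                      "+",  "+-"),
    "STÖRRLE":   ("Semi formal", "-",                      "+",  "-"),
    "PROMENADE": ("Semi formal", "-",                      "+-", "+"),
    "PPDL":      ("Semi formal", "Software life cycle",    "+-", "+"),
    "PROPEL":    ("Semi formal", "-",                      "+",  "+"),
    "PLMLx":     ("Informal",    "Design, Implementation", "+",  "-"),
    "UML-PP":    ("Formal",      "Software life cycle",    "+-", "+"),
}


def works_where(column: int, value: str) -> list[str]:
    """Names of the works whose given column equals value, in table order."""
    return [work for work, row in RESULTS.items() if row[column] == value]


formal = works_where(0, "Formal")
full_coverage = works_where(1, "Software life cycle")
strong_guidance = works_where(2, "+")
meta_modeled = works_where(3, "+")
```

Run as written, the four lists name the same works as the formalization, life cycle coverage, guidance and meta modeling items above.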

In conclusion, we notice that different levels of heterogeneity emerged from this study, namely:

The architectural level which refers to patterns’ structure.
The terminological level which refers to terms used as labels to describe process patterns.
The knowledge level which refers to the different facets (e.g. guidance, roles, artifacts, management, organization, evaluation) that are considered by the process patterns.

These levels imply a further one, the “semantic level”, which stems from different interpretations and considerations of the process pattern concept.

IV. Work in Progress

To overcome these deficiencies and to benefit from large and different process patterns’ collections, unification and mediation efforts are needed.

In fact, we think that if we abstract away from the above-mentioned disparities, we can learn more from different process patterns and consequently capitalize and reuse more knowledge.

To reach this goal, we are now focusing on the construction of a meta pattern’s ontology providing two unification levels, as shown in Fig. 1:

A. Architectural Unification

The architectural unification level is ensured by means of an Architectural Core (i.e. a native pattern) which contains a set of normalized Terms (NT) concerning common concepts supported by the studied works. Each NT is annotated by a set of terms matching those used in the studied works.

In addition to the Core, we include Derived Architectures (i.e. derived patterns) containing specific concepts used by the different proposed works. Furthermore, Shared concepts are added corresponding to concepts that are supported by some works but not all of them. The whole is semantically and/or hierarchically linked to form an architectural mediation graph.

This unification aims to normalize the description of any given process pattern so as to better reuse it later.
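As a rough illustration of how such a core might be used, the sketch below maps the field labels of a pattern description onto normalized terms. The `CORE` dictionary is a hypothetical fragment built from the labels mentioned in the terminological discussion above; the sample pattern content is invented for the example. The sketch ignores polysemy: a label such as “context” is mapped the same way regardless of which work it comes from.

```python
# Hypothetical fragment of an Architectural Core: each Normalized Term (NT)
# is annotated with labels reported for the studied works.
CORE = {
    "initial context": {"initial context", "context"},
    "resulting context": {"resulting context", "result context", "consequences"},
    "intention": {"intent", "force"},
}

# Reverse lookup: label used by a work -> normalized term.
LABEL_TO_NT = {label: nt for nt, labels in CORE.items() for label in labels}


def normalize(pattern: dict[str, str]) -> dict[str, str]:
    """Rename the fields of a pattern description to their normalized terms.

    Fields without a matching NT keep their original label, so that a
    derived architecture could still pick them up.
    """
    return {LABEL_TO_NT.get(label.lower(), label): text
            for label, text in pattern.items()}


# Invented sample description using labels of the kind GNATZ uses.
sample = {"Intent": "Review a design", "Consequences": "Reviewed design"}
normalized = normalize(sample)
```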

B. Semantic Unification

This level concerns the content of the different process pattern fields. This unification consists of a text mining process that extracts the terms and/or concepts most representative of the contents of the different pattern fields. These terms and concepts are weighted according to their number of occurrences in the content and then sorted by weight, thus providing a semantic annotation for the fields concerned. The field labels might also be enriched with synonyms extracted from the WordNet ontology, further enriching these semantic annotations.
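The weighting step can be sketched in a few lines. This is a minimal illustration that assumes plain word counts as weights and a tiny hypothetical stop-word list; the `SYNONYMS` dictionary is only a stand-in for a real WordNet lookup, and the sample field content is invented.

```python
import re
from collections import Counter

# Hypothetical stop-word list for the example.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is"}

# Stand-in for a WordNet lookup; a real implementation would query WordNet.
SYNONYMS = {"intent": {"purpose", "aim"}}


def annotate(field_content: str, top_n: int = 2) -> list[tuple[str, int]]:
    """Return the top_n terms of a field, weighted by occurrence count."""
    tokens = re.findall(r"[a-z]+", field_content.lower())
    weights = Counter(t for t in tokens if t not in STOP_WORDS)
    # most_common sorts by decreasing weight.
    return weights.most_common(top_n)


def enrich_label(label: str) -> set[str]:
    """Return the label together with its known synonyms."""
    return {label} | SYNONYMS.get(label, set())


content = "Review the design. The design review checks the design against requirements."
annotation = annotate(content)
```

Here `annotation` holds the two heaviest terms of the field, which would serve as its semantic annotation.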

The objective of our work is to provide a framework for software process pattern capitalization and reuse. To achieve this, we adopt a Process Pattern Warehousing and Pattern Mining approach to manage effectively the best practices and process implementation traces that are embedded in process patterns.

Process Pattern Warehousing consists of integrating different process pattern collections in a Process Pattern Warehouse via a unification schema. This unification is performed by the ontology described above, which consists of concepts and their semantic annotations, relations, and the respective unification axioms.

Process Patterns’ Mining, on the other hand, consists of a reasoning process over the unified patterns in the warehouse, based on the proposed ontology. Consequently, we could better search for patterns similar to a given problem by clustering them according to their level of similarity to the problem. Hence, the process pattern mining process will improve process pattern capitalization and the quality of process reuse.
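To make the idea of grouping patterns by their level of similarity to a problem more concrete, here is a minimal sketch. It does not implement ontology-based reasoning: as a simple stand-in, it uses cosine similarity between bags of words, with a single hypothetical threshold separating two similarity levels. The pattern names and texts are invented for the example.

```python
import math
import re
from collections import Counter


def bag(text: str) -> Counter:
    """Bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def group_by_similarity(problem: str, patterns: dict[str, str],
                        threshold: float = 0.5) -> dict[str, list[str]]:
    """Split patterns into two groups by their similarity to the problem."""
    query = bag(problem)
    groups: dict[str, list[str]] = {"similar": [], "dissimilar": []}
    for name, text in patterns.items():
        level = "similar" if cosine(query, bag(text)) >= threshold else "dissimilar"
        groups[level].append(name)
    return groups


# Invented warehouse content: pattern name -> problem field.
warehouse = {"P1": "review the design", "P2": "plan the release"}
groups = group_by_similarity("design review", warehouse)
```

A finer-grained version could use several thresholds, or a proper clustering algorithm, to obtain more than two similarity levels.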

As a consequence of these two unification levels, different patterns would be capitalized in a unique, general format in the warehouse, then used for mining and finally, if necessary, converted to the desired format for reuse during the whole software development process.

V. Conclusion

Although a considerable number of works have focused on process patterns, some important challenges for the research community remain. This paper provides some reflections on issues arising from the study that we carried out concerning process pattern description models and languages. These reflections point to the need for software process pattern capitalization to improve pattern reuse during software development processes. To this end, a meta pattern ontology is being constructed, providing architectural and semantic mediation between different process pattern descriptions.
