
Essay: Organize Files Efficiently: What Is File Organization?

Published: 1 April 2019*



File organization is the methodology used to structure computer files. A file contains records, which may be documents or other information, stored in a defined way so that they can be retrieved later.

File organization refers primarily to the logical arrangement of data (which can itself be organized in a system of records with correlation between the fields/columns) in a file system.

It should not be confused with the physical storage of the file on a particular storage medium.

The basic types of computer file include files stored as blocks of data and files read as streams of data, in which the information streams out of the file as it is read until the end of the file is encountered.

Internal File Structure :

Methods and Design Paradigm :

The most important considerations in choosing a file organization are:

1. Rapid access to a record or a number of records which are related to each other.

2. Addition, modification, or deletion of records.

3. Efficiency of storage and retrieval of records.

4. Redundancy, as a means of ensuring data integrity.

A file should be organized so that its records are available for processing with minimal delay. The organization should match the activity of the file (how large a share of its records is processed in a typical run) and its volatility (how often records are added or deleted).

Types of File Organization :

The three techniques of file organization are:

1. Heap (unordered)

2. Sorted

(i) Sequential (SAM)

(ii) Line Sequential (LSAM)

(iii) Indexed Sequential (ISAM).

3. Hashed or Direct

Within these three techniques, there are five methods of organizing files: sequential, line-sequential, indexed-sequential, inverted list, and direct or hashed access.

Sequential Organization

A sequential file contains records organized in the order in which they were entered. The order of the records is fixed. The records are stored in physically contiguous blocks, and within each block the records are in sequence.

Records in these files can only be read or written sequentially. Once stored in the file, the record cannot be made shorter or longer or deleted. However, the record can be updated if the length does not change.
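The rules above can be sketched in Python. This is a minimal illustration, not a real access method: the 16-byte record layout, the function names, and the use of an in-memory buffer in place of a disk file are all assumptions made for the example. The in-place update seeks to the record directly for brevity; a strictly sequential system would reach it by reading forward.

```python
import io
import struct

# Hypothetical fixed-length layout: 4-byte integer id + 12-byte name.
RECORD = struct.Struct("<i12s")

def append_record(f, rec_id, name):
    """Add a record at the end of the file, preserving entry order."""
    f.seek(0, io.SEEK_END)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_all(f):
    """Read every record from the start of the file, one after another."""
    f.seek(0)
    out = []
    while True:
        chunk = f.read(RECORD.size)
        if len(chunk) < RECORD.size:  # end of file reached
            break
        rec_id, raw = RECORD.unpack(chunk)
        out.append((rec_id, raw.rstrip(b"\x00").decode("ascii")))
    return out

def update_in_place(f, position, rec_id, name):
    """Overwrite one record; allowed because its length does not change."""
    f.seek(position * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))

f = io.BytesIO()  # stands in for a file on disk
append_record(f, 1, "Ada")
append_record(f, 2, "Grace")
update_in_place(f, 0, 1, "Alan")
records = read_all(f)
```

Because every record occupies exactly RECORD.size bytes, an update never shifts the records that follow it, which is why a same-length update is safe while lengthening, shortening, or deleting a record is not.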

Line-Sequential Organization :

Line-sequential files are like sequential files, except that the records can contain only characters as data. Line-sequential files are maintained by the native byte stream files of the operating system.

In the COBOL environment, line-sequential files that are created with WRITE statements with the ADVANCING phrase can be directed to a printer as well as to a disk.

Indexed-Sequential Organization :

Indexed-sequential organization improves key searches by adding an index to a sequentially ordered file.

The simplest structure is single-level indexing, in which the index is a file whose records are (key, pointer) pairs.

The pointer is the position in the data file of the record with the given key. Only a subset of the records, evenly spaced along the data file, is indexed, in order to mark intervals of data records.

There are three areas in disk storage:

• Primary Area: Contains file records stored by key or ID number.

• Overflow Area: Contains records that cannot be placed in the primary area.

• Index Area: Contains the keys of records and their locations on the disk.
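A single-level index over evenly spaced records can be sketched as follows. The data, the spacing of four records per interval, and the names data_file, index and lookup are assumptions for illustration; the overflow area is not modeled.

```python
from bisect import bisect_right

# Hypothetical primary area: records kept in key order (keys 10, 20, ..., 190).
data_file = [(k, f"record-{k}") for k in range(10, 200, 10)]

INTERVAL = 4  # assumed spacing: index every 4th record

# Index area: (key, position) pairs for the first record of each interval.
index = [(data_file[i][0], i) for i in range(0, len(data_file), INTERVAL)]
index_keys = [k for k, _ in index]

def lookup(key):
    """Find the interval through the index, then scan only that interval."""
    slot = bisect_right(index_keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every indexed key
    start = index[slot][1]
    for k, rec in data_file[start:start + INTERVAL]:
        if k == key:
            return rec
    return None

found = lookup(70)
missing = lookup(75)
```

The index holds only 5 entries for 19 records, so a search inspects a few index keys and at most INTERVAL data records rather than scanning the whole file.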

Inverted List :

In file organization, an inverted list is a file that is indexed on many of the attributes of the data itself.

The inverted list method has a single index for each key type. The records are not necessarily stored in a sequence.

They are placed in the data storage area, and the indexes are updated with the record keys and locations.
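A small sketch of this arrangement, assuming two indexed attributes (dept and city) and made-up records; the function names insert and find are hypothetical.

```python
from collections import defaultdict

records = []  # data storage area; records are kept in no particular sequence

# One index per key type, mapping an attribute value to record locations.
indexes = {"dept": defaultdict(list), "city": defaultdict(list)}

def insert(record):
    """Place the record in the data area, then update every index."""
    location = len(records)
    records.append(record)
    for attr, idx in indexes.items():
        idx[record[attr]].append(location)
    return location

def find(attr, value):
    """Use the index for `attr` to fetch matching records directly."""
    return [records[loc] for loc in indexes[attr].get(value, [])]

insert({"name": "Ann", "dept": "Research", "city": "Houston"})
insert({"name": "Bob", "dept": "Sales", "city": "Houston"})
insert({"name": "Cy", "dept": "Research", "city": "Austin"})

research_names = [r["name"] for r in find("dept", "Research")]
houston_names = [r["name"] for r in find("city", "Houston")]
```

Each additional index speeds up searches on that attribute at the cost of extra work on every insertion.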

Direct or Hashed Access :

With direct or hashed access, a portion of disk space is reserved and a "hashing" algorithm computes the address of each record from its key.

Because the space is reserved in advance, this kind of file requires additional storage.

Records are scattered throughout the file rather than stored in key order, and they are accessed by addresses that specify their disk location.
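A minimal sketch of hashed access, assuming integer keys, a reserved area of seven buckets, and a modulo hash function. Keeping colliding records together in one bucket is a simplification chosen for the example; the text does not specify how collisions are handled.

```python
NUM_BUCKETS = 7  # assumed size of the reserved area

# Reserved space: every bucket exists up front, even while empty.
buckets = [[] for _ in range(NUM_BUCKETS)]

def address(key):
    """Hashing algorithm: compute the bucket address from the key."""
    return key % NUM_BUCKETS

def store(key, record):
    bucket = buckets[address(key)]
    for i, (k, _) in enumerate(bucket):
        if k == key:  # key already present: replace its record
            bucket[i] = (key, record)
            return
    bucket.append((key, record))

def fetch(key):
    """Go straight to the computed address; no scan of the whole file."""
    for k, rec in buckets[address(key)]:
        if k == key:
            return rec
    return None

store(1001, "Ann")
store(1008, "Bob")  # hashes to the same address as 1001 (a collision)
store(23, "Cy")
```

Keys 1001 and 1008 are both multiples of 7, so they land at address 0, while most of the reserved buckets stay empty. That unused space is the storage overhead this organization trades for fast access.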

Let us discuss how a DBMS represents a relational query evaluation plan :

An execution plan for a relational algebra expression represented as a query tree includes information about the access methods available for each relation as well as the algorithm to be used in computing the relational operators represented in the tree.

Example: Retrieve the name and address of all employees who work for the "Research" department.

To convert this into an execution plan, the optimizer might choose an index search for the SELECT operation, a table scan as the access method for EMPLOYEE, a nested-loop algorithm for the join, and a scan of the join result for the PROJECT operation.
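The plan described above can be sketched with in-memory tables. Everything here is illustrative: the rows, the column names (dnumber, dname, dno), and the function names are assumptions, and a real optimizer would choose among alternatives using cost estimates.

```python
# Made-up sample data standing in for the two relations.
DEPARTMENT = [
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 4, "dname": "Administration"},
]
EMPLOYEE = [
    {"name": "Employee A", "address": "1 First St", "dno": 5},
    {"name": "Employee B", "address": "2 Second St", "dno": 4},
    {"name": "Employee C", "address": "3 Third St", "dno": 5},
]

# Access method for the SELECT: an index on the department name.
dname_index = {}
for row in DEPARTMENT:
    dname_index.setdefault(row["dname"], []).append(row)

def index_search(dname):
    """SELECT via the index instead of scanning DEPARTMENT."""
    return dname_index.get(dname, [])

def nested_loop_join(outer, inner):
    """JOIN: for each outer row, do a full table scan of the inner relation."""
    for d in outer:
        for e in inner:
            if e["dno"] == d["dnumber"]:
                yield {**d, **e}

def project(rows, columns):
    """PROJECT: scan the join result and keep only the requested columns."""
    return [tuple(r[c] for c in columns) for r in rows]

result = project(
    nested_loop_join(index_search("Research"), EMPLOYEE),
    ("name", "address"),
)
```

Each function corresponds to one node of the query tree, and the nesting of the calls mirrors the order in which the tree is evaluated from the leaves upward.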

The query may be executed by:

1. Materialized evaluation,

2. Pipelined evaluation.

1. Materialized evaluation :

The result of each operation is stored as a temporary relation. For example, the join can be computed and its entire result stored as a temporary relation, which is then read as input by the algorithm that computes the PROJECT operation and produces the query result table.

2. Pipelined evaluation :

In pipelined evaluation, the resulting tuples of an operation are forwarded directly to the next operation in the query sequence as they are produced.

The advantage of pipelining is the cost saving from not having to write intermediate results to disk and read them back for the next operation.
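The difference between the two strategies can be sketched with Python generators. The tuples, the function names, and the use of a list as a stand-in for a temporary relation on disk are assumptions made for the example.

```python
def join(depts, emps):
    """Yield each employee tuple whose department number matches."""
    for dnumber, _dname in depts:
        for emp in emps:
            if emp[2] == dnumber:
                yield emp

def project(rows):
    """Yield only the name and address of each input tuple."""
    for name, address, _dno in rows:
        yield (name, address)

def materialized(depts, emps):
    # The whole join result is stored first (a list stands in for a
    # temporary relation on disk), then read back by the projection.
    temp = list(join(depts, emps))
    return list(project(temp)), len(temp)

def pipelined(depts, emps):
    # Each joined tuple flows straight into the projection;
    # no intermediate result is stored.
    return list(project(join(depts, emps))), 0

depts = [(5, "Research")]
emps = [
    ("Employee A", "1 First St", 5),
    ("Employee B", "2 Second St", 4),
    ("Employee C", "3 Third St", 5),
]

mat_rows, mat_stored = materialized(depts, emps)
pipe_rows, pipe_stored = pipelined(depts, emps)
```

Both strategies return the same rows. The second value in each pair counts the intermediate tuples that had to be stored, which is the cost that pipelining avoids.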

4.6 Organization of Records in Files

Let us look briefly at shadow paging :

Shadow paging considers the database to be made up of a number of fixed-size disk pages (or disk blocks), say n.

A directory with n entries is constructed, where the ith entry points to the ith database page on disk. The directory is kept in main memory if it is not too large.

When a transaction begins executing, the current directory, whose entries point to the most recent (current) database pages on disk, is copied into a shadow directory.

The shadow directory is then saved on disk while the current directory is used by the transaction.

An example of shadow paging

• During transaction execution, the shadow directory is never modified.

• When a write-item operation is performed, a new copy of the modified database page is created, but the old copy of that page is not overwritten.

• Instead, the new page is written to some previously unused disk block.

• The current directory entry is modified to point to the new disk block, whereas the shadow directory is not modified and continues to point to the old, unmodified disk block.

• To recover from a failure during transaction execution, it is sufficient to free the modified database pages and to discard the current directory.

• The state of the database before transaction execution is available through the shadow directory, and that state is recovered by reinstating the shadow directory.

• The database is thus returned to its state prior to the transaction that was executing when the crash occurred, and any modified pages are discarded.
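The steps above can be sketched as a small in-memory model. The class name, the method names, and the use of a Python list as the disk are assumptions for illustration; commit handling is left out because the text does not describe it.

```python
class ShadowPagedDB:
    """Toy model: `disk` is a list of blocks; a directory maps page -> block."""

    def __init__(self, pages):
        self.disk = list(pages)
        self.current = list(range(len(pages)))  # entry i points to page i
        self.shadow = None
        self.first_new_block = len(self.disk)

    def begin(self):
        # Copy the current directory into the shadow directory.
        self.shadow = list(self.current)
        self.first_new_block = len(self.disk)

    def write_item(self, page_no, value):
        # Write the new copy to a previously unused block; the old block
        # is left untouched and only the current directory is repointed.
        self.disk.append(value)
        self.current[page_no] = len(self.disk) - 1

    def read(self, page_no):
        return self.disk[self.current[page_no]]

    def recover(self):
        # Free the modified pages, discard the current directory,
        # and reinstate the shadow directory.
        del self.disk[self.first_new_block:]
        self.current = self.shadow
        self.shadow = None

db = ShadowPagedDB(["a", "b", "c"])
db.begin()
db.write_item(1, "B")
value_during_txn = db.read(1)          # the transaction sees its own write
old_block_intact = db.disk[db.shadow[1]]  # the shadow still reaches the old page
db.recover()                           # simulate a failure mid-transaction
value_after_recovery = db.read(1)
```

Recovery needs neither undo nor redo of individual writes here, because the old pages were never overwritten.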

Disadvantage:

The updated database pages change location on disk. This makes it difficult to keep related database pages close together on disk without complex storage management strategies.


About this essay:

If you use part of this page in your own work, you need to provide a citation, as follows:

Essay Sauce, Organize Files Efficiently: What Is File Organization?. Available from:<https://www.essaysauce.com/sample-essays/2015-10-14-1444821199/> [Accessed 29-03-24].

These Sample essays have been submitted to us by students in order to help you with your studies.

* This essay may have been previously published on Essay.uk.com at an earlier date.