Advantages to using databases for storing information
More accurately, the question should ask, “What kind of database should you use for ___ information?”
Structure of my essay = 5 paragraphs, short intro and conclusion, 500 words for 3 body paragraphs
Introduction: Since the beginning of human existence, humans have been collecting and storing information. As technology has evolved, so has the way we store our collective knowledge: from cave paintings to papyrus, parchment, manuscripts, books, and finally computers. Now that we have more information at our fingertips than ever before, we have to differentiate our data, and different types of information call for different organizational systems: statistical graphs for raw numerical data, libraries for literary works, encyclopedias for scientific records. More recently, people have needed to organize all this data so that the public can easily access it. For this reason, E. F. Codd proposed a digital upgrade to the archaic archival system in 1970. His proposal introduced the “relational database model… In his model, the database’s schema, or logical organization, is disconnected from physical information storage, and this became the standard principle for database systems.” This revolutionary idea transformed the way data is organized and accessed in the digital age.
Definitions:
Table: graphical organizer for data
Queries:
Forms:
Reports:
Relational
Network
Hierarchical
Object oriented
SQL: Structured Query Language, used to define, query, and modify data in relational databases
Macro: set of actions
Mining: finding patterns in large sets of data
Database architecture:
2 body paragraphs on relational databases
“The relational model's central idea is to describe a database as a collection of predicates over a finite set of predicate variables, describing constraints on the possible values and combinations of values. The content of the database at any given time is a finite (logical) model of the database, i.e. a set of relations, one per predicate variable, such that all predicates are satisfied. A request for information from the database (a database query) is also a predicate.”- Wikipedia
“1970s: Two major relational database system prototypes were created between the years 1974 and 1977, and they were the Ingres, which was developed at [UC Berkeley], and System R, created at IBM San Jose. Ingres used a query language known as QUEL, and it led to the creation of systems such as Ingres Corp., MS SQL Server, Sybase, Wang’s PACE, and Britton-Lee. On the other hand, System R used the SEQUEL query language, and it contributed to the development of SQL/DS, DB2, Allbase, Oracle, and Non-Stop SQL. It was also in this decade that Relational Database Management System, or RDBMS, became a recognized term.”
“Among the critical technologies developed for System R were:
Structured Query Language (SQL), developed by Chamberlin and Ray Boyce, for expressing queries.
Pat Selinger’s cost-based optimizer, which automatically translates a high-level query into an efficient plan for executing the query.
Raymond Lorie’s query compiler, which saved query plans for future use.
Ad hoc query formulation and execution allowing rapid development and testing.
Online data definition in support of new applications without shutting down the system.” -IBM
1 body paragraph on non relational databases
“Other models are the hierarchical model and network model. Some systems using these older architectures are still in use today in data centers with high data volume needs, or where existing systems are so complex and abstract that it would be cost-prohibitive to migrate to systems employing the relational model. Also of note are newer object-oriented databases.”
Wikipedia
Conclusion:
Queries made against the relational database, and the derived relvars in the database, are expressed in a relational calculus or a relational algebra. In his original relational algebra, Codd introduced eight relational operators in two groups of four operators each. The first four operators were based on the traditional mathematical set operations: union, intersection, difference, and Cartesian product.
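The four set-based operators (union, intersection, difference, and Cartesian product) can be illustrated with plain Python sets. This is only a rough sketch: each relation is modelled as a set of tuples, and the relation names and rows are made up for illustration.

```python
from itertools import product

# Two hypothetical relations with the same attributes: (name, city).
customers = {("Avery", "London"), ("Blake", "New York")}
suppliers = {("Blake", "New York"), ("Casey", "Helsinki")}

# Union: tuples that appear in either relation.
union = customers | suppliers

# Intersection: tuples that appear in both relations.
intersection = customers & suppliers

# Difference: tuples in customers that are not in suppliers.
difference = customers - suppliers

# Cartesian product: every customer tuple paired with every supplier tuple,
# giving tuples with four attributes.
cartesian = {c + s for c, s in product(customers, suppliers)}
```

Because relations are sets, duplicate tuples collapse automatically, which is why the union of two two-row relations that share one row has three rows rather than four.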
Relational Databases
Relational databases have been the workhorse of software applications since the 80s, and continue to be so to this day. They store highly structured data in tables with predetermined columns of certain types and many rows of the same type of information, and, thanks in part to the rigidity of their organization, require developers and applications to strictly structure the data used in their applications. In relational databases, references to other rows and tables are indicated by referring to their (primary-)key attributes via foreign-key columns. This can be enforced with constraints; an optional reference has to be modelled with a nullable foreign-key column. Joins are computed at query time by matching primary- and foreign-keys of the many (potentially indexed) rows of the to-be-joined tables. These operations are compute- and memory-intensive, and their cost grows quickly as more and larger tables are joined. If you use many-to-many relationships, you have to introduce a JOIN table (or junction table) that holds foreign keys of both participating tables, which further increases join operation costs. Those costly join operations are usually addressed by denormalising data to reduce the number of joins necessary. Although not every use-case is a good fit for this type of stringent data model, in the past, the lack of viable alternatives and the great support for relational databases have made it difficult for alternative models to break into the mainstream. Meet graph databases.
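To make the primary-key and foreign-key mechanics concrete, here is a minimal sketch using Python's built-in sqlite3 module. The authors and books tables and their rows are hypothetical, and SQLite is used only because it needs no setup; other systems differ in detail (SQLite, for example, enforces foreign keys only after the PRAGMA shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
    CREATE TABLE authors (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        -- foreign-key column pointing at the primary key of authors
        author_id INTEGER NOT NULL REFERENCES authors(author_id)
    );
""")

conn.execute("INSERT INTO authors (author_id, name) VALUES (1, 'Sample Author')")
conn.execute("INSERT INTO books (title, author_id) VALUES ('Sample Title', 1)")

# A row that references a non-existent author violates the constraint.
try:
    conn.execute("INSERT INTO books (title, author_id) VALUES ('Orphan Title', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The join is computed at query time by matching foreign key to primary key.
rows = conn.execute("""
    SELECT books.title, authors.name
    FROM books
    JOIN authors ON books.author_id = authors.author_id
""").fetchall()
```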
Let's illustrate with a "product sales" database. We begin with two tables: Products and Orders. The Products table contains information about the products (such as name, description and quantityInStock) with productID as its primary key. The Orders table contains customers' orders (customerID, dateOrdered, dateRequired and status). We cannot store the items ordered inside the Orders table, as we do not know how many columns to reserve for the items. We also cannot store the order information in the Products table.
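One common way out is a junction table with one row per ordered item. The sketch below uses Python's sqlite3 module and the column names from the notes; the order_details table, the orderID and quantity columns, and all sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        productID       INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        description     TEXT,
        quantityInStock INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE orders (
        orderID      INTEGER PRIMARY KEY,   -- assumed key, not named in the notes
        customerID   INTEGER NOT NULL,
        dateOrdered  TEXT,
        dateRequired TEXT,
        status       TEXT
    );
    -- Hypothetical junction table: one row per (order, product) pair,
    -- so an order can hold any number of items.
    CREATE TABLE order_details (
        orderID   INTEGER NOT NULL REFERENCES orders(orderID),
        productID INTEGER NOT NULL REFERENCES products(productID),
        quantity  INTEGER NOT NULL,
        PRIMARY KEY (orderID, productID)
    );
""")

conn.executemany(
    "INSERT INTO products (productID, name, quantityInStock) VALUES (?, ?, ?)",
    [(1, "Widget", 10), (2, "Gadget", 5)],
)
conn.execute(
    "INSERT INTO orders (orderID, customerID, dateOrdered, status) "
    "VALUES (1, 42, '2024-01-15', 'open')"
)
conn.executemany(
    "INSERT INTO order_details (orderID, productID, quantity) VALUES (?, ?, ?)",
    [(1, 1, 3), (1, 2, 1)],
)

# List the items in order 1 by joining through the junction table.
items = conn.execute("""
    SELECT p.name, d.quantity
    FROM order_details AS d
    JOIN products AS p ON p.productID = d.productID
    WHERE d.orderID = ?
    ORDER BY p.name
""", (1,)).fetchall()
```

The composite primary key on order_details keeps the same product from being listed twice in one order; a design that allowed repeated lines would need its own surrogate key instead.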
Relationships are first-class citizens in graph databases; most of the value of graph databases is derived from the relationships. Relationships don't only have a type, a start node, and an end node, but can have properties of their own. Using these properties on the relationships, we can add intelligence to the relationship—for example, since when did they become friends, what is the distance between the nodes, or what aspects are shared between the nodes. These properties on the relationships can be used to query the graph.
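The idea of relationships carrying their own properties can be sketched in a few lines of Python. This is not how any particular graph database is implemented; the Relationship class, the FRIEND type, the since property and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    type: str                  # e.g. "FRIEND"
    start: str                 # start node
    end: str                   # end node
    properties: dict = field(default_factory=dict)  # data on the relationship itself


relationships = [
    Relationship("FRIEND", "alice", "bob", {"since": 2015}),
    Relationship("FRIEND", "alice", "carol", {"since": 2021}),
    Relationship("FRIEND", "bob", "carol", {"since": 2010}),
]


def friends_since_before(person, year):
    """Follow outgoing FRIEND relationships from `person`, keeping only
    those whose `since` property is earlier than `year`."""
    return [
        r.end
        for r in relationships
        if r.type == "FRIEND"
        and r.start == person
        and "since" in r.properties
        and r.properties["since"] < year
    ]


old_friends = friends_since_before("alice", 2020)
```

Here the query filters on a property of the relationship rather than on a property of either node, which is the point the paragraph above makes.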
Documentation – All product-related materials, specifications, technical manuals, user manuals, flow diagrams, file descriptions, or other written information either included with products or otherwise. Raima's documentation is online.
SQL makes some things easy, but other things more difficult. Some people find SQL easy to work with, others find it horribly cryptic. The team's personal comfort is a big issue here. I would suggest that if you go the route of putting a lot of logic in SQL, don't expect to be portable – use all of your vendor's extensions and cheerfully bind yourself to their technology. If you want portability, keep logic out of SQL.
Protocol – A specific method in which messages are formulated, formatted, and passed between computers in a network. Internet messages are passed between computers using the TCP/IP protocol.
Now, we can specify the columns for the entities. As with identifying the entities, you need to think carefully about which columns you need to store in each entity. Do not add columns for data that has no value to the system.
Part of this processing involves consistently being able to select or modify one and only one row in a table. Therefore, most physical implementations have a unique primary key (PK) for each table. When a new row is written to the table, a new unique value for the primary key is generated; this is the key that the system uses primarily for accessing the table. System performance is optimized for PKs. Other, more natural keys may also be identified and defined as alternate keys (AK). Often several columns are needed to form an AK (this is one reason why a single integer column is usually made the PK). Both PKs and AKs have the ability to uniquely identify a row within a table. Additional technology may be applied to ensure a unique ID across the world, a globally unique identifier, when there are broader system requirements.
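As a rough sketch of these three kinds of identifier, the example below uses Python's sqlite3 and uuid modules. The employees table, its columns and the sample rows are hypothetical: employee_id is the generated single-integer primary key, the three-column UNIQUE constraint plays the role of an alternate key, and global_id stores a UUID as the globally unique identifier.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,      -- PK: generated for each new row
        global_id   TEXT NOT NULL UNIQUE,     -- globally unique identifier (UUID)
        first_name  TEXT NOT NULL,
        last_name   TEXT NOT NULL,
        birth_date  TEXT NOT NULL,
        UNIQUE (first_name, last_name, birth_date)  -- AK: several columns together
    )
""")


def add_employee(first_name, last_name, birth_date):
    """Insert a row and return the primary key the database generated."""
    cur = conn.execute(
        "INSERT INTO employees (global_id, first_name, last_name, birth_date) "
        "VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), first_name, last_name, birth_date),
    )
    return cur.lastrowid


first_id = add_employee("Sam", "Example", "1990-01-01")
second_id = add_employee("Alex", "Sample", "1985-06-15")

# Re-inserting the same natural key violates the alternate key,
# even though a fresh primary key and UUID would have been generated.
try:
    add_employee("Sam", "Example", "1990-01-01")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```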