Data mining (the analysis step of the Knowledge Discovery in Databases process, or KDD), a relatively young and interdisciplinary field of computer science, is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management.
With recent technical advances in processing power, storage capacity, and inter-connectivity of computer technology, data mining is seen by modern businesses as an increasingly important tool for transforming unprecedented quantities of digital data into business intelligence that gives an informational advantage. It is currently used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery. The growing consensus that data mining can bring real value has led to an explosion in demand for novel data mining technologies.
Data Mining Process
• Data Cleaning: Noise and inconsistent data are removed.
• Data Integration: Multiple data sources may be combined.
• Data Selection: The data relevant to the analysis task are retrieved.
• Data Transformation: The data are transformed or consolidated into forms appropriate for mining by performing summary and aggregation operations.
• Data Mining: The essential process in which intelligent methods are applied in order to extract data patterns.
• Pattern Evaluation: The truly interesting patterns representing knowledge are identified, based on some interestingness measures.
• Knowledge Presentation: Visualization and knowledge representation techniques are used to present the mined knowledge to the user.
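The preprocessing steps listed above can be illustrated with a small sketch. The records, field names and the choice of a per-sector mean are all made up for illustration; a real project would apply the same steps to its own sources.

```python
from statistics import mean

# Two hypothetical data sources describing the same students.
marks = [
    {"id": 1, "score": 72.0},
    {"id": 2, "score": None},   # missing value (noise)
    {"id": 3, "score": 88.0},
    {"id": 3, "score": 88.0},   # duplicate record
]
profiles = {1: "technical", 2: "technical", 3: "non-technical"}

# Data cleaning: drop records with missing scores and duplicate ids.
seen = set()
cleaned = []
for row in marks:
    if row["score"] is None or row["id"] in seen:
        continue
    seen.add(row["id"])
    cleaned.append(row)

# Data integration: combine the two sources on the shared id.
integrated = [{**row, "sector": profiles[row["id"]]} for row in cleaned]

# Data selection: keep only the fields relevant to the analysis.
selected = [(row["sector"], row["score"]) for row in integrated]

# Data transformation: aggregate the scores into one summary value per sector.
by_sector = {}
for sector, score in selected:
    by_sector.setdefault(sector, []).append(score)
summary = {sector: mean(scores) for sector, scores in by_sector.items()}

print(summary)
```

The mining, evaluation and presentation steps would then operate on a structure like `summary` rather than on the raw records.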
Knowledge discovery process
The term Knowledge Discovery in Databases refers to the broad process of discovering knowledge in data, and emphasizes the “high-level” application of particular data mining methods. It is of interest to researchers in machine learning, pattern recognition, databases, statistics, artificial intelligence, knowledge acquisition for expert systems, and data visualization.
The unifying goal of the KDD process is to extract knowledge from data in the context of large databases. It does this by using data mining methods (algorithms) to extract (identify) what is deemed knowledge, according to the specifications of measures and thresholds, using a database along with any required preprocessing, subsampling, and transformations of that database.
Data mining Model
Techniques of data mining
Several major data mining techniques have been developed and used in data mining projects in recent years, including association, classification, clustering, regression, prediction and sequential patterns.
Association rule learning searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes called market basket analysis.
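As a minimal sketch of market basket analysis, the two standard measures behind association rules, support and confidence, can be computed by counting. The baskets and the function names `support` and `confidence` are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, made up for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

# Count how many baskets contain each item and each pair of items.
item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)

def support(pair):
    """Fraction of all baskets that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(baskets)

def confidence(antecedent, consequent):
    """Share of baskets with the antecedent that also contain the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]

print(support(("bread", "milk")))
print(confidence("butter", "bread"))
```

In this toy data every basket with butter also contains bread, so the rule "butter implies bread" has the highest possible confidence, while bread and milk appear together in half of the baskets.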
Classification is the task of generalizing known structure to apply to new data. For example, an email program might attempt to classify an email as legitimate or spam. Common algorithms include decision tree learning, nearest neighbor, neural networks and support vector machines.
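The spam example can be sketched with the nearest neighbor approach, one of the algorithms named above. The two features (number of links and occurrences of the word "free") and the training examples are hypothetical; real spam filters use many more features.

```python
import math

# Hypothetical training emails: (number of links, count of the word "free") -> label.
training = [
    ((0, 0), "legitimate"),
    ((1, 0), "legitimate"),
    ((6, 3), "spam"),
    ((8, 5), "spam"),
]

def classify(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training, key=lambda example: math.dist(example[0], features))
    return nearest[1]

print(classify((7, 4)))
print(classify((0, 1)))
```

The known structure (labelled emails) is generalized to new data simply by copying the label of the most similar known email.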
Clustering is the assignment of a set of observations into subsets so that observations in the same cluster are similar in some sense. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis used in many fields, including machine learning, data mining, pattern recognition, image analysis, information retrieval and bioinformatics.
Regression is a data mining function that predicts a number. Profit, sales, house value, square footage, temperature, or distance could all be predicted using regression techniques. For example, a regression model could be used to predict the value of a house based on location, number of rooms, lot size, and other factors.
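One common clustering method is k-means, shown here as a rough one-dimensional sketch. The text does not name a specific algorithm, so k-means is an assumption, as are the scores and the hand-picked starting centers.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (simple k-means)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical exam scores; starting centers chosen by hand for the example.
scores = [35, 38, 41, 78, 82, 85]
centers, clusters = k_means_1d(scores, centers=[30.0, 90.0])
print(clusters)
```

No labels are supplied: the two groups of low and high scores emerge only from the similarity between observations, which is what makes the method unsupervised.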
Regression models are tested by computing various statistics that measure the difference between the predicted and the expected values. The historical data for a regression project are typically divided into two data sets: one for building the model, the other for testing the model.
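The build-and-test split can be sketched as follows. The data points are invented, the model is assumed to be a straight line fitted by ordinary least squares, and root mean squared error (RMSE) is used as one possible test statistic among several.

```python
import math

# Hypothetical (number of rooms, price) pairs; not real data.
data = [(2, 110.0), (3, 150.0), (4, 190.0), (5, 230.0), (6, 275.0), (7, 305.0)]

# Split the historical data: one set for building the model, one for testing it.
train, test = data[:4], data[4:]

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = fit_line(train)

# Root mean squared error between predicted and actual values on the test set.
rmse = math.sqrt(
    sum((intercept + slope * x - y) ** 2 for x, y in test) / len(test)
)
print(intercept, slope, rmse)
```

Because the error is measured only on records the model never saw while it was being built, it gives a fairer picture of how the model would behave on new data.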
Types of regression:
A regression models the past relationship between variables to predict their future behavior.
1. Simple regression: When one independent variable is used in a regression, it is called a simple regression.
2. Multiple regression: When two or more independent variables are used, it is called a multiple regression.
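Assuming a linear model (the text does not restrict regression to the linear case), the two types can be written as follows, where y is the dependent variable, the x terms are independent variables, the β terms are coefficients to be estimated, and ε is the error term:

```latex
% Simple regression: one independent variable x
y = \beta_0 + \beta_1 x + \varepsilon

% Multiple regression: k independent variables x_1, \dots, x_k with k \ge 2
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon
```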
5. Sequential Pattern
Sequence analysis is concerned with the subsequent purchase of a product or products given a previous purchase. For instance, buying an extended warranty is more likely to follow the purchase of a TV or other electrical appliance. There is a wide range of applications for sequence analysis in many areas of industry, including customer shopping patterns, phone call patterns and web log streams.
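A very small sketch of this idea counts how often one purchase is followed by another in time-ordered histories. The histories and the helper `follow_rate` are hypothetical; dedicated sequential pattern algorithms handle longer patterns and larger data.

```python
# Hypothetical purchase histories, each ordered by time.
histories = [
    ["tv", "extended warranty"],
    ["tv", "hdmi cable", "extended warranty"],
    ["laptop", "mouse"],
    ["tv", "hdmi cable"],
]

def follow_rate(first, later):
    """Share of histories containing `first` in which `later` appears afterwards."""
    with_first = [h for h in histories if first in h]
    if not with_first:
        return 0.0
    followed = [h for h in with_first if later in h[h.index(first) + 1:]]
    return len(followed) / len(with_first)

print(follow_rate("tv", "extended warranty"))
```

Unlike association rules, the order matters here: a warranty bought before the TV would not count towards the pattern.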
Management Information System
MIS is a system that provides the information needed to manage organizations effectively. Management information systems are regarded as a subset of the overall internal control procedures in a business, which cover the application of people, documents, technologies and procedures used by management accountants to solve business problems such as costing a product, service or a business-wide strategy.
Characteristics of Management Information System
• MIS support structured decisions at the operational and management control levels.
• MIS have little analytical capability.
• MIS generally aid in decision making using past and present data.
• MIS are relatively inflexible.
• MIS have an internal rather than an external orientation.
Technical versus non-technical education in colleges
A technical college, also known as a professional college, is an educational institution that prepares students for a career in a particular field. For example, students who attend a technical college hoping to become engineers will focus on engineering only. A degree, diploma or certificate in a technical major gives you a special skill or practical knowledge in a particular field such as engineering, health and medicine, computers and many other technically oriented fields.
The Technical College System was designed in 1961 to help prepare people for the kinds of jobs that companies need employees for. In other words, technical colleges offer programs that will help you get a job or career.
In the 21st century, the rate of change in social and technical systems is accelerating, and the mechanization of society and work requires that all individuals reach new levels of educational accomplishment. Career and Technical Education is defined as providing a context for learning and applying educational skills.
Education systems in developing countries struggle with many issues, such as grade repetition, students leaving college, teacher absenteeism, and less learning than the curriculum standards suggest.
A good education system is important for the advancement of a developing country. Education systems in developing countries face similar issues of inefficient budgets, the building of unnecessary non-technical colleges, paying inadequate teachers, and purchasing for colleges. The predicted academic performance of students in technical colleges is higher than that of students attending non-technical colleges. Poor households that may struggle to afford non-technical education consider many factors, such as lower fees, proximity, and quality. Despite the cost of technical education, poor families value the high-quality education that it provides to students and strive to afford this opportunity, though at lower fees. Non-technical colleges in developing countries face the issue of teacher absenteeism. Teacher absence rates are higher in non-technical colleges than in technical colleges. Teachers at non-technical colleges earn less money than those at technical colleges.
India has seen an increase in technical colleges opening in recent years. Non-technical colleges need to focus on improving quality, because academic development influences earnings, productivity, and economic growth. A lack of basic resources does not create an environment conducive to learning. Additionally, furniture, teaching aids, books, libraries, computer labs and sports clubs are not provided to non-technical colleges.
There is a common belief that non-technical colleges offer a lower quality of education than technical colleges. Non-technical colleges are not as well provided in terms of resources and infrastructure, which inhibits the learning experience for students. Technical education is not always worth the investment for children’s households on tight budgets. Many of the children come from backgrounds of poor families, rented homes, and tragedy, where survival was valued over technical education.
Technical college is exclusive and is not an option for households with a small budget, especially when the non-technical alternatives have lower fees or may even be free.
A survey helps the researcher to build background on the research problem. This type of research is necessary when the research problem is new or when the information available about the problem is inadequate. I used a number of surveys with different audiences. I conducted a questionnaire-cum-interview based survey developed from references, books and prior research related to this topic. I also conducted a survey on a sample of teachers and educational and administrative staff at technical and non-technical colleges.
Samira Talebi.Ali, Asghar Sayficar, “Using Educational Data Mining (EDM) to Prediction and Classify Students”, IJECS Volume 3 Issue 12 December, 2014 Page No.9395-9398. The aim of this paper is to predict students’ academic performance. It is useful for identifying weak students at an earlier stage. In this study, we used the WEKA open source data mining tool to analyze attributes for predicting students’ academic performance. The data set comprised 180 student records and 21 attributes of students registered between 2010 and 2013. We chose them from Ferdowsi University of Mashhad. A student’s academic performance can be predicted by using past experience and knowledge discovered from the existing database. A 10-fold cross-validation was used to evaluate the prediction accuracy.
Umamaheswari.K, “A Study on Student Data Analysis Using Data Mining Techniques”, Volume 3, Issue 8, August 2013. Data mining methodology makes a tremendous contribution by helping researchers to extract the hidden knowledge and information contained in the data they use. It is a procedure for extracting credible, novel, effective and understandable patterns from a database. This paper categorizes students into grade order across all their education studies, which helps in interview situations. The study explores socio-demographic variables (age, gender, name, lower class grade, higher class grade, degree proficiency and extra knowledge or skill, etc.). It examines to what extent these factors help to categorize students in rank order for the recruitment process. As a result, all students benefit and the short-listing effort is reduced. Here, clustering, association rules, classification and outlier detection have been used to evaluate the students’ performance. Keywords: Data mining, Clustering, Classification, Association rule, Outlier detection, Preprocessing
P. Ajith, “ROLE OF DATA MINING: A TECHNICAL EDUCATION PERSPECTIVE”, IJECST | Sept – Oct 2012. The Internet has made the world a real global village, which places educational organizations in a highly competitive environment. Educational organizations should improve the quality of their services to stakeholders such as students, faculty and future employers to gain a competitive advantage over their competitors. Every educational organization, small or big, needs to make use of the large-scale data available and hopefully turn it into a prediction/analytical model that supports the decision making process. Data mining is a young interdisciplinary field that extracts patterns/knowledge from large amounts of data stored in databases, data warehouses or other information repositories. In this paper, an attempt has been made to identify the recent status of the higher education system of India and to explore the capabilities of data mining to revamp the core functional areas so that growth in this crucial sector becomes qualitative and sustainable.
Brijesh Kumar Bhardwaj, “Data Mining: A prediction for performance improvement using classification”, Vol. 9, No. 4, April 2011. These databases contain hidden information for the improvement of students’ performance. Performance in higher education in India is a turning point in academics for all students. This academic performance is influenced by many factors; therefore it is essential to develop a predictive data mining model for students’ performance so as to identify the difference between high learners and slow learners. In the present investigation, an experimental methodology was adopted to generate a database. The raw data was preprocessed by filling in missing values, transforming values from one form into another, and selecting relevant attributes/variables. As a result, we had 300 student records, which were used to construct a Bayes classification prediction model.
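The 10-fold cross-validation mentioned in the first study above can be sketched as a way of splitting record indices. The helper `k_fold_indices` is hypothetical and does no shuffling or stratification, which a tool such as WEKA may apply; only the figure of 180 records comes from the study.

```python
def k_fold_indices(n_records, k=10):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_records))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_records // k + (1 if i < n_records % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 180 student records, as in the study; each fold tests on 18 and trains on 162.
folds = list(k_fold_indices(180, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

Each record is used for testing exactly once, and the prediction accuracy is then averaged over the ten folds.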
• The mechanism adopted in the management of workshop facilities in technical colleges, not in non-technical studies.
• Review of studies related to technical teaching methods, technical students’ achievements, and the quality of students and teachers in both education sectors.
• Sessions with those who have educational and scientific knowledge in the field of study and those associated with technical and non-technical education.
• Selection of a study sample from those occupied with teaching methods and students’ achievements in both education sectors.
• The workshop management practices adopted in the planning of training facilities in technical colleges, not in non-technical colleges.
• Technical teachers participate in the planning of education, but facilities are not provided to non-technical teachers.
• Improper planning of training in non-technical facilities leads to disappointment.
• Learning objectives influence the planning of workshop facilities.
• The organization does not help to provide enhanced counseling services or create a healthy environment for improving performance.
• To assess the current scenario of technical and non-technical teachers and students with the t-test.
• To examine teaching skills to anticipate the education level.
• To predict the level of education according to various factors, such as academic quality, career options, parents spots to choose education career, etc., with the regression technique.
• To plan effective trainings for entities in every field for improving performance.
The research will involve the following steps:
• Selection of a study sample on educational culture differences between technical and non-technical organizational performance.
• To develop a questionnaire according to various questions/parameters.
• To interact with teachers and students to acquire their views based on the questionnaire.
• To analyse the collected responses by using various statistical tests and data mining techniques.
• Analyse the primary influential factors by using data mining tools such as SPSS and AMOS.
• Formulate results and conclusions.
RESULTS AND DISCUSSION
The distribution of respondents according to various socio-economic characteristics is described below:
• Annual income
• Education Type
Age Demographic Profile
The highest proportion of teachers belongs to the age group of 20-40 years, followed by 40-60 years. The lowest proportion is older than 60 years. The highest proportion of students belongs to the age group of 20-40 years, and the lowest proportion to the age group of 10-20 years.
Gender Demographic Profile
Majority of teachers as well as students are female.
Locale Demographic Profile
Majority of teachers as well as students are living in urban areas.
Occupation Demographic Profile
All teachers are in service, and the majority of students are unemployed. Only a few are in service/business or are agriculturists.
Annual Income Demographic Profile
The highest proportion of teachers belongs to the 3 to 5 lacs group, whereas the lowest earns more than 7 lacs.
Education Type Demographic Profile
Responses were collected from 50 teachers and 50 students, drawn equally from the technical and non-technical education sectors (50% each).
PERCEPTION OF TEACHERS
According to the teachers, there is a significant difference between technical and non-technical education, and they prefer technical education over non-technical education on many parameters.
Parameters such as academic quality, workshops, training, conferences/seminars, and parents spots to choose education career have been found highly significant for technical education, whereas teachers making a private home business of teaching students has been found highly significant for non-technical education, as indicated by “**” in the t-values.
Parameters such as career options, “buildings, infrastructure, sports club & library”, “team work, management and coordination”, toughness of course, transport facilities, and facility of technological & multimedia instructional resources have been found to be significant, as indicated by “*” in the t-values.
Other factors, such as parental academic interest, availability of resources, satisfactory financial rewards for faculty, availability of experienced teachers, student enrolment and placement, financial constraints to join non-technical, faculty calibre, extra-curricular activities, and adequacy of financial resources, have been found to be at par.
PERCEPTION OF TEACHERS
PARAMETER TECH NON-TECH T-TEST
Academic quality 4.24 3.56 3.108**
Satisfactory financial rewards for faculty 4.32 4.28 0.19
Parental academic interest 4.28 3.88 1.741
Availability of Resources (e.g. paper, pencils, books) 3.92 4.12 0.834
Career Options 4.24 3.68 2.69*
Workshops, training, Conferences/seminars facilities for faculty & students 4.12 3.28 3.777**
Availability of experienced teachers 4.12 3.88 0.99
Buildings, infrastructure, sports club & library 3.68 4.32 2.452*
Team work, management and Coordination 4.36 3.88 2.288*
Student enrolment and placements 4.12 3.92 0.834
Financial constraints to join non-technical 4.12 4.28 0.696
Toughness of course 4.24 3.64 2.315*
Transport facilities 4.16 3.64 2.115*
Co-curricular/ Extra-curricular activities 3.72 3.96 1.111
Encouragement of inter-disciplinary initiatives 3.8 2.88 1.033
Teachers make private home business to teach students 2.88 3.76 4.133**
Parents spots to choose educational career 4.24 3.56 3.108**
Faculty caliber 4.36 3.94 0.463
Facility of technological & multimedia instructional resources 4.04 3.44 2.554*
Adequacy of financial resources for instruction 4.08 4.32 0.963
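The t-values in the table above compare the mean ratings of the two groups. As a rough sketch of how such a value is obtained, the following computes Student's t statistic for two independent samples with a pooled variance. The source does not state which t-test variant was used, and the ratings below are invented, not the survey's responses.

```python
import math
from statistics import mean, variance

def two_sample_t(group_a, group_b):
    """Student's t statistic for two independent samples (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error

# Hypothetical 1-5 ratings for one parameter; not the survey's data.
tech = [5, 4, 4, 5, 4]
non_tech = [3, 4, 3, 4, 3]
t_value = two_sample_t(tech, non_tech)
print(round(t_value, 3))
```

The resulting value is then compared with the critical value for the chosen significance level and degrees of freedom to decide whether the difference between the two means is significant.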
All the significant values are represented in Figure 7-1; the non-significant values have been removed from the figure. It gives a clear picture of the perception of teachers.
PERCEPTION OF TEACHERS