What is data mining?
Data mining is a process used by companies to turn raw data into useful information. By using software to look for patterns in large batches of data, businesses can learn more about their customers and develop more effective marketing strategies as well as increase sales and decrease costs. Data mining depends on effective data collection and warehousing as well as computer processing.
The phrase data mining is commonly misused to describe software that presents data in new ways. True data mining software doesn’t just change the presentation, but actually discovers previously unknown relationships among the data.
Data mining is well established in science and mathematics, and marketers increasingly use it to distill useful consumer data from websites.
The drop in the price of data storage has given companies willing to make the investment a tremendous resource: data about their customers and potential customers, stored in data warehouses. Data warehouses are becoming a standard part of business technology. A data warehouse consolidates data located in disparate databases and stores large quantities of it by specific categories, so that users can more easily retrieve, interpret, and sort it. Warehouses enable executives and managers to work with vast stores of transactional or other data, respond faster to markets, and make more informed business decisions. It has been predicted that every business will have a data warehouse within ten years.
But merely storing data in a warehouse does a company little good. The company benefits only when meaningful trends and patterns are extracted from the data and used to improve its knowledge of customers and markets.
Data Mining Technologies
The analytical techniques used in data mining are often well-known mathematical algorithms. What is new is the application of those techniques to general business problems, made possible by the increased availability of data and by inexpensive storage and processing power. Graphical interfaces have also produced tools that business experts can use easily.
Some of the tools used for data mining are:
Artificial neural networks – Non-linear predictive models that learn through training and resemble biological neural networks in structure.
Decision trees – Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.
Rule induction – The extraction of useful if-then rules from data based on statistical significance.
Genetic algorithms – Optimization techniques based on the concepts of genetic combination, mutation, and natural selection.
Nearest neighbor – A classification technique that classifies each record based on the records most similar to it in a historical database.
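As an illustration of the last technique in the list, here is a minimal sketch of nearest-neighbor classification in Python. The function name knn_classify, the two customer features, and the segment labels are all hypothetical, chosen only for this example. Euclidean distance and a majority vote among the k closest records are one common choice among several, and real tools usually scale the features first.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_classify(history, query, k=3):
    """Classify `query` by majority vote among its k nearest historical records.

    `history` is a list of (features, label) pairs; features are numeric tuples.
    """
    # Sort historical records by distance to the query and keep the k closest.
    neighbors = sorted(history, key=lambda record: dist(record[0], query))[:k]
    # Count the labels of those neighbors and return the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical historical database:
# (visits per month, average basket value) -> customer segment
history = [
    ((2, 15.0), "occasional"),
    ((3, 22.0), "occasional"),
    ((1, 10.0), "occasional"),
    ((12, 80.0), "loyal"),
    ((10, 95.0), "loyal"),
    ((14, 70.0), "loyal"),
]

new_customer = (11, 85.0)
segment = knn_classify(history, new_customer, k=3)
print(segment)  # prints "loyal"
```

The new record is labeled "loyal" because its three closest historical records all carry that label. Nothing is trained in advance: the historical database itself acts as the model.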
Applications of Data Mining
Data mining has many applications, including:
- Fraud Detection
- Intrusion Detection
- Banking and Finance
- Research Analysis
- Bioinformatics