Data mining is the process of uncovering patterns in large sets of structured data in order to predict future outcomes. Structured data is organized into rows and columns so that it can be accessed and modified efficiently. A wide range of machine learning algorithms can be applied to data mining problems across many use cases, with the aim of increasing revenue, reducing costs, and avoiding risks.
If you are looking to analyze unstructured data (e.g., data from essays, articles, or computer log files), see text mining.
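To make the idea of structured data concrete, here is a minimal sketch in Python using only the standard library. The sample table, its column names (`customer_id`, `region`, `monthly_spend`), and the helper functions are hypothetical and exist only for illustration. It shows how a row-and-column layout lets you pull out one column and summarize it by another, which is the kind of simple pattern that data mining builds on.

```python
import csv
import io

# Hypothetical sample data: column names and values are made up for illustration.
RAW = """customer_id,region,monthly_spend
1,north,120.50
2,south,80.00
3,north,99.50
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one dict per row."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "customer_id": int(r["customer_id"]),
            "region": r["region"],
            "monthly_spend": float(r["monthly_spend"]),
        }
        for r in reader
    ]


def average_spend_by_region(rows):
    """Group rows by the region column and average the monthly_spend column."""
    totals = {}
    for row in rows:
        total, count = totals.get(row["region"], (0.0, 0))
        totals[row["region"]] = (total + row["monthly_spend"], count + 1)
    return {region: total / count for region, (total, count) in totals.items()}


rows = load_rows(RAW)
averages = average_spend_by_region(rows)
print(averages)
```

Because every row has the same named columns, the grouping step needs no parsing of free text. Unstructured sources such as essays or log files would first need that structure extracted.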
Data mining process and tools
The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a conceptual framework that serves as a standard approach to data mining. The process outlines six phases:
- Business understanding
- Data understanding
- Data preparation