Data mining and Knowledge Discovery in Databases: two key tools in data analysis
By: Diana González-Bravo
We have already explored the term data mining and how the information extraction process yields useful insights from a mass of information. But are data mining and Knowledge Discovery in Databases the same thing? And what is done with the information obtained from data mining?
Are Data Mining and Knowledge Discovery in Databases (KDD) the same?
We might think that these terms are complicated and academic. The truth is that data mining is a rigorous but everyday process in research of any kind.
Although data mining and Knowledge Discovery in Databases (KDD) are often treated as synonyms, they are distinct but complementary terms.
The common goal of data mining and knowledge discovery is to derive expressions of common characteristics from a data set.
Data mining is the process of searching for information within a large amount of data. It is the stage of the KDD process in which the variables to be analyzed are identified.
Knowledge Discovery in Databases (KDD), on the other hand, is the overall process of extracting and analyzing information that was previously unknown and is potentially useful.
How do you use data mining and KDD?
Several algorithms and techniques are used for the data mining process and its subsequent analysis. Among them are:
classification, clustering (grouping), regression, association rules and decision trees, among others.
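To make one of these techniques concrete, here is a minimal sketch of association rules in plain Python. The transactions, the function names and the two thresholds are all hypothetical and chosen only for illustration; real projects would normally rely on a dedicated library and a far larger data set.

```python
from itertools import combinations

# Hypothetical transactions: each set lists items that were recorded together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of both sides divided by support of the antecedent."""
    both = support(set(antecedent) | set(consequent), transactions)
    base = support(antecedent, transactions)
    return both / base if base else 0.0

def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """List rules A -> B between single items that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        # Each pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules

for lhs, rhs, s, c in pair_rules(transactions):
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Support measures how often the items appear together, and confidence measures how often the right-hand item appears when the left-hand item is present. A rule that passes both thresholds is a candidate pattern, which still has to be interpreted by the analyst.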
With the enormous amount of data stored in files, databases and other repositories, it is increasingly important, if not necessary, to develop powerful means of analysis and techniques that facilitate the interpretation of this information and the extraction of knowledge. These techniques are sophisticated algorithms that are applied to a data set to obtain useful results, which can help in research and decision-making.
The most representative techniques are:
- Decision trees: a predictive model that represents and categorizes conditions that occur in succession, in order to solve a problem.
- Statistical models: equations used in research designs to indicate how different factors modify the variables of interest.
- Neural networks: the name is inspired by the remarkable functioning of the nervous system. A neural network is a system of interconnected units whose purpose is to generate a robust output from a highly complex analysis.
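The idea of a decision tree as a chain of successive conditions can be sketched in a few lines of Python. The tree below is written by hand rather than learned from data, and its questions, answers and categories are hypothetical, so it only illustrates how a record travels from the root to a category.

```python
# Hypothetical hand-written tree: every dict asks one question about the
# record, and every plain string is a final category (a leaf).
tree = {
    "question": "outlook",
    "branches": {
        "sunny": {
            "question": "humidity",
            "branches": {"high": "stay in", "normal": "go out"},
        },
        "overcast": "go out",
        "rainy": {
            "question": "windy",
            "branches": {"yes": "stay in", "no": "go out"},
        },
    },
}

def classify(node, record):
    """Answer one condition per level until a leaf (a string) is reached."""
    while isinstance(node, dict):
        answer = record[node["question"]]
        node = node["branches"][answer]
    return node

day = {"outlook": "sunny", "humidity": "normal", "windy": "no"}
print(classify(tree, day))
```

In practice the tree is not written by hand: an algorithm chooses the questions and their order from historical data, but the resulting model is read in exactly this way.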
A free online course on the subject, with a training certificate, is offered by the National Programme on Technology Enhanced Learning (NPTEL).
In our next blog post on tools for analysis, we will show you applicable forms and examples of how these methodologies can be used in the health sector.