A large and rapidly increasing amount of data is being generated and collected in connection with every human activity. A large part of this data is stored and can be accessed on the Internet. The data stores may be very large and may contain very large datasets. The ability to analyze these datasets, converting their information content into knowledge, understanding and insight necessary to make decisions and implement the decisions by corresponding actions is becoming increasingly important. This ability has until recently been limited to large organizations and institutions using large computers – even supercomputers. The development of relatively inexpensive personal computers with increasing computing power and the development of a number of relatively inexpensive and effective software packages has made it possible for ordinary people to analyze large datasets.
The availability of a large amount of data and effective software is not enough to derive reliable knowledge from the information in the data and to make reliable decisions as well as to take effective actions. In order to do this it is necessary to adhere to the stages in an orderly data analytic process and execute the process in a competent manner.