A large and rapidly growing amount of data is generated and collected in connection with almost every human activity. Much of this data is stored and can be accessed on the Internet, often in very large datasets. The ability to analyze these datasets, converting their information content into the knowledge, understanding, and insight needed to make decisions and act on them, is becoming increasingly important. Until recently this ability was limited to large organizations and institutions using large computers, even supercomputers. Relatively inexpensive personal computers with increasing computing power, together with a number of relatively inexpensive and effective software packages, have made it possible for ordinary people to analyze large datasets.
The availability of large amounts of data and effective software is not enough to derive reliable knowledge from the data, make reliable decisions, and take effective action. To do this it is necessary to follow the stages of an orderly data analytic process and to execute that process competently.
The Rising Data Wave – Page
The recent tidal wave of data has given rise to a large number of software programs for analyzing it. From a long list of programs I have chosen the following for my own use:
- BestView – Add-on to Mathematica
Some of these programs are preexisting programs that have been adapted to the requirements of big data; others, such as Tableau and Ayasdi, are new. The programs I have chosen are not necessarily the best for everyone, but they are the best for my present needs.
Data Analysis Software – Page
The data and decision analytic process is a path leading from the larva of data to the butterfly of knowledge, understanding, and insight.
To ensure the quality of the results, before starting work on data analytic and associated decision analytic projects it is necessary to
- define an orderly data analytic and decision analytic process
- select methods for executing the process
- select software packages for implementing the methods
To ensure the reliability of the answers/solutions and the quality of the decisions made and actions taken, it is necessary to follow the analytic process in an orderly manner and to apply the methods and software packages competently.
Before starting an analytic process it is necessary to state the question/problem under consideration and ask the following preliminary questions:
- Is the answer/solution considered known?
- Is the answer/solution based on sufficiently recent/reliable data?
- Was the analysis performed in a competent/reliable manner?
- Are the results of the analysis presented/visualized in such a way that they sufficiently increase the understanding and insight of the target group?
- Do the results of the analysis, their presentation/visualization, and the resulting understanding and insight form a sufficiently firm basis for decision making and action?
If any answer is no, there may be reason to go ahead with the data analytic and decision analytic process. If all the answers are yes, it is unnecessary to go ahead unless you are confident that you can improve the results materially or introduce your particular results to a new or wider audience. But beware of hubris.
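The rule of thumb above can be sketched as a small check. This is only an illustration: the class name, field names, and the two optional flags are hypothetical labels for the five preliminary questions and the two exceptions mentioned in the text.

```python
from dataclasses import astuple, dataclass


@dataclass
class PreliminaryAnswers:
    """Yes/no answers to the five preliminary questions (True means yes).

    The field names are hypothetical labels chosen for this sketch.
    """
    answer_known: bool
    data_recent_and_reliable: bool
    analysis_competent: bool
    presentation_adequate: bool
    firm_basis_for_decisions: bool


def should_start_analysis(answers: PreliminaryAnswers,
                          can_improve_materially: bool = False,
                          reaches_new_audience: bool = False) -> bool:
    """Return True when the rule of thumb suggests going ahead."""
    if not all(astuple(answers)):
        # At least one "no": there may be reason to go ahead.
        return True
    # All "yes": go ahead only for a material improvement or a new audience.
    return can_improve_materially or reaches_new_audience


# Example: everything is already answered well, so no new analysis is needed.
settled = PreliminaryAnswers(True, True, True, True, True)
print(should_start_analysis(settled))
```

The function cannot guard against hubris; whether an improvement is really "material" remains a judgment call for the analyst.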
The main stages of a combined data analytic and decision analytic process are:
- State an important question/problem
- Data analysis
  - Select data relevant to answering the questions or solving the problems
  - Prepare the data for analysis. Employ visualization during preparation
  - Analyze the data to increase knowledge about the past, present, and future state of the system generating the data, and about individual variables and the relationships between them. Employ visualization extensively during analysis
    - Descriptive data analysis
    - Exploratory data analysis
    - Confirmatory data analysis
    - Predictive data analysis
  - Present/visualize the results of the analysis
  - Evaluate the results of the analysis – Have the original questions been answered?
- Decision analysis
  - Make decisions based on the results of the analysis
  - Implement decisions – Act
  - Present/visualize the results of the actions
  - Evaluate the results of the actions – Have the original problems been solved?
- Reiterate the process or its individual stages as necessary
Data and Decision Analytic Process – Page
Decision analysis is a systematic, quantitative, and visual approach to addressing and evaluating the important choices that decision makers face. It uses a variety of tools to evaluate all relevant information in support of the decision-making process.
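One simple quantitative tool of this kind is a weighted scoring matrix, sketched below. It is only one possible technique and is not prescribed by the text. The alternatives, criteria, weights, and scores are all made up for illustration, and the function name `rank_alternatives` is hypothetical.

```python
def rank_alternatives(scores, weights):
    """Rank alternatives by the weighted sum of their criterion scores.

    scores:  {alternative: {criterion: score}}
    weights: {criterion: weight}
    Returns a list of (alternative, total) pairs, highest total first.
    """
    totals = {
        name: sum(weights[criterion] * score
                  for criterion, score in by_criterion.items())
        for name, by_criterion in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs: higher scores are better on every criterion.
weights = {"cost": 5, "speed": 3, "risk": 2}
scores = {
    "option_a": {"cost": 8, "speed": 5, "risk": 6},
    "option_b": {"cost": 6, "speed": 9, "risk": 7},
}

ranking = rank_alternatives(scores, weights)
print(ranking)  # [('option_b', 71), ('option_a', 67)]
```

The ranking is only as good as the weights, so it is worth rerunning the calculation with different weights to see how sensitive the preferred alternative is to them.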
After all the alternatives have been analyzed and a final decision has been reached, certain steps should be taken to implement that decision. Three essential actions are creating an implementation plan, informing stakeholders, and adjusting the decision to make compromises as necessary.
Decision Analysis – Page
Data analysis proper is a process consisting of a sequence of stages beginning with data and ending in knowledge derived from the information in the data.
This knowledge may then be used in making and implementing decisions.
There are many different kinds of data analysis:
- Descriptive data analysis – describe the features of the data
- Exploratory data analysis – discover new features in the data
- Confirmatory data analysis – confirm or disconfirm/falsify existing hypotheses
- Predictive data analysis – apply statistical or structural models for predictive forecasting or classification
Data Analysis – Page