An Amazingly Elaborate Explanation of Data Analysis Methods

Data Analysis Methods
Data analysis is the process of extracting useful information from a given data set so that it can support important decisions. As job opportunities for data analysts are on the rise, knowledge of data analysis methods is becoming essential.
Data analysis methods help us understand facts, observe patterns, formulate explanations, and test hypotheses. They are used not only in all kinds of science and business processes, but also in administration and policy-making.

Data analysis can be carried out in all domains, including medicine and the social sciences. Whatever analysis is carried out should be documented well for future use.
Data Analysis Explained
Data analysis is defined as a practice in which unorganized or unprocessed data is ordered and organized so that useful information can be highlighted. It involves processing and working on data in order to understand what is present in the data and what is not.
To understand what is involved in data analysis, take a look at this example:

Between 1800 and 2000, the population of the United States increased from 5 million to 255 million people, i.e., a growth of 250 million. These figures are the facts. But to conclude that the population rose at an average rate of 1.25 million people per year (250 million divided by 200 years) would be wrong. The information would be correct and so would the arithmetic, but the interpretation, "an average growth rate of 1.25 million people per year", would be dead wrong. It would not correctly interpret the facts, because the population of the US did not grow in that fashion, not even approximately: the yearly increase was far smaller than 1.25 million in the early 1800s and far larger in the late 1900s.
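To make the difference between the two readings concrete, here is a minimal Python sketch using the figures from the example. It compares the naive linear average with a constant percentage growth rate. The constant-rate model is only an illustrative assumption, not the actual history of US population growth, and the variable names are made up for this example.

```python
# Figures from the example above, in millions of people.
start_pop, end_pop, years = 5.0, 255.0, 200

# Naive reading: spread the total growth evenly over every year.
linear_increase_per_year = (end_pop - start_pop) / years

# Alternative reading (an assumption): the same percentage growth every year.
compound_rate = (end_pop / start_pop) ** (1 / years) - 1

# Under that assumption, the absolute yearly increase changes over time.
first_year_increase = start_pop * compound_rate
last_year_increase = end_pop - end_pop / (1 + compound_rate)

print(f"linear average: {linear_increase_per_year:.2f} million per year")
print(f"compound rate: {compound_rate:.2%} per year")
print(f"increase in the first year: {first_year_increase:.2f} million")
print(f"increase in the last year: {last_year_increase:.2f} million")
```

Under the compounding assumption, the first-year increase is well below the linear average and the last-year increase is well above it, which is why a single "per year" figure misrepresents the data.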

Here's where correct data analysis methods and procedures come into the picture. Charts, graphs, and written summaries are among the ways to present analyzed data. These methods are designed to polish and refine the data so that end users can obtain interesting or useful information without having to go through the entire data set themselves.
Qualitative Data Analysis
Qualitative research analysts define 15 types of data analysis methods. Let's go through each one of them:
Typology
It's basically a classification system derived from patterns, themes, or other kinds of groups in the data. Ideally, the categories should be mutually exclusive and, if possible, exhaustive. Examples of categories include acts, activities, meanings, participation, relationships, and settings.
Analytic Induction
This is one of the oldest and most appreciated methods. Here, an event is studied and a hypothetical statement is developed about what happened. Then, other similar events are studied and checked to see whether they fit the hypothesis. If they don't, the hypothesis is revised. The analyst deliberately looks for exceptions to the hypothesis and revises it each time to fit all the examples encountered. Eventually, a hypothesis is developed that accounts for all the observed cases.
Taxonomy
This method is a complex classification containing multiple levels of concepts or abstractions. Higher levels include lower levels, forming superordinate and subordinate categories.
Domain Analysis
This type of analysis is mostly used to describe social and cultural situations and the patterns within them. The method starts by emphasizing what the social situation means to the participants, and then relates it to cultural meanings.
Logical Analysis/Matrix Analysis
It is basically an outline of generalized causation, the logical reasoning process, etc. It mostly uses flow charts and diagrams to represent these graphically, along with written descriptions.
Constant Comparison/Grounded Theory
This method was developed in the 1960s, and has the following steps:
  • Look at the document to be analyzed, such as a field note.
  • Identify parameters to categorize events and behavior, which will be named and coded on the document.
  • Compare the codes to find consistencies and deviations. This is done until the categories saturate and no new codes related to them are formed.
  • Finally, certain categories become centrally-focused categories, more commonly known as core categories. These core categories are made subjects of case study.
Quasi-statistics
More often than not, counting is used in this method to provide evidence for the categories formed, or to determine whether observations hold up.
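As a rough illustration of such counting, the following Python sketch tallies how often each code was assigned. The list of coded observations and the code names are hypothetical.

```python
from collections import Counter

# Hypothetical codes assigned to five observed events.
coded_observations = ["cooperation", "conflict", "cooperation",
                      "cooperation", "withdrawal"]

# Count how often each category occurs.
counts = Counter(coded_observations)

# Express each count as a share of all observations.
total = sum(counts.values())
shares = {code: count / total for code, count in counts.items()}

for code, count in counts.most_common():
    print(f"{code}: {count} ({shares[code]:.0%})")
```

A category that turns up only once may need more supporting observations before it is treated as a pattern.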
Event Analysis/Microanalysis
In this method, importance is given to finding the precise beginnings and endings of events by determining the specific points that mark their boundaries. It is particularly suited to film and video recordings. After the end points are determined, repeated viewing can help find phases within the event.
Metaphorical Analysis
Here, the analyst tries out various metaphors and checks how well they correspond with what is being observed. Participants may also be asked for metaphors and for their interpretation of them. For example, "Hallway as a highway": participants may map the highway and its components in different ways, such as students as traffic and teachers as police.
Hermeneutical Analysis
Hermeneutical analysis does not look for an objective meaning of a text, but interprets the text from the point of view of the people involved in the situation. The analyst avoids overemphasizing himself or herself in the analysis and instead retells the people's story. The meaning of any content resides in the author's intent, the context, and the reader; this method involves finding themes and relating them to these three.
Discourse Analysis
This method usually involves videotaping events, so that they can be played over and over again for deeper analysis.
Semiotics
Here, we determine how the meanings of signs and symbols are constructed. The analysis assumes that meaning is not inherent in a symbol, but comes from the other things related to it.
Content Analysis
This method is not usually applied to video, and it is qualitative only in the development of categories. Standard rules of categorization in content analysis include:
  • The chunk of data to be analyzed at a time (a line, a sentence, a phrase, or a paragraph) must be identified
  • Categories must be inclusive and mutually exclusive
  • Categories must have precisely defined properties
  • All data must fit some category, i.e., categorization must be exhaustive
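The rules above can be checked mechanically once a coding scheme exists. The sketch below is one possible way to do so, assuming a simplified scheme in which each category is defined by a set of keywords. The categories, keywords, and text units are all hypothetical.

```python
# Hypothetical coding scheme: each category is defined by a set of keywords.
CATEGORIES = {
    "activities": {"meeting", "lesson", "game"},
    "settings": {"classroom", "hallway", "office"},
}

def categorize(units, categories):
    """Assign each unit to exactly one category, or record why that fails."""
    assigned, problems = {}, []
    for unit in units:
        words = set(unit.lower().split())
        matches = [name for name, keywords in categories.items()
                   if words & keywords]
        if len(matches) == 1:
            assigned[unit] = matches[0]
        elif not matches:
            problems.append((unit, "fits no category: scheme is not exhaustive"))
        else:
            problems.append((unit, f"fits {matches}: categories overlap"))
    return assigned, problems

# The unit of analysis here is a short phrase.
units = ["Weekly staff meeting", "Crowded hallway",
         "Lesson in the classroom", "Lunch break"]
assigned, problems = categorize(units, CATEGORIES)

for unit, reason in problems:
    print(f"{unit!r}: {reason}")
```

Each reported problem points to a rule that the scheme breaks, so the categories can be redefined before the final coding pass.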
Phenomenology/Heuristic Analysis
The emphasis is on how individuals experience the world and on the meaning that events hold for each individual. This method also emphasizes the effect of the research on the researcher's personal experience. The term "phenomenology" is sometimes used to describe the researcher's experience.
Narrative Analysis
Also known as 'Discourse analysis', this method gives more importance to interaction. How narrators choose to frame what they tell decides how they will be perceived. Narrators tend to compare ideas about themselves while avoiding the revelation of negatives about themselves. This analysis can involve the study of literature, journals, or folklore.
Quantitative Data Analysis
For any data analysis, it is necessary to calculate the size of the sample to be drawn from the population under consideration. The formula to calculate this is:

$$ n = \frac{N}{1 + N e^2} $$

where;
N - the population size
e - the margin of error
n - the sample size
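As a minimal sketch, assuming the formula intended here is Slovin's formula, n = N / (1 + N·e²), the sample size can be computed as follows. The function name and the example figures are illustrative only.

```python
import math

def sample_size(population_size: int, margin_of_error: float) -> int:
    """Slovin's formula, n = N / (1 + N * e^2), rounded up to a whole unit."""
    if population_size <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("need N > 0 and 0 < e < 1")
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(raw)

# Example: a population of 10,000 with a 5% margin of error.
n = sample_size(10_000, 0.05)
print(n)
```

Rounding up is a design choice here: a fractional respondent cannot be surveyed, and rounding down would slightly exceed the chosen margin of error.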
Mean
$$ \bar{X} = \frac{\sum X}{N} $$
where;
N - the total number of observations
X - the observations

The mean is simply the average of the observations in a sample. It is obtained by adding all the observations and then dividing the sum by the number of observations. The mean indicates the central value of the sample data; it is not necessarily the value that occurs most often.
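The calculation of the mean can be sketched in a few lines of Python. The observations are hypothetical, and the standard library's `statistics.mean` is shown alongside the manual sum-and-divide version.

```python
from statistics import mean

# Hypothetical observations.
observations = [4, 8, 8, 5, 10]

# Add all observations, then divide by how many there are.
manual_mean = sum(observations) / len(observations)

# The standard library gives the same result.
library_mean = mean(observations)

print(manual_mean, library_mean)
```

In this sample the mean differs from the most frequent observation (8), which shows that the two are separate measures.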
Median
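$$ \text{Median} = L + \frac{\frac{N}{2} - f_0}{f_w} \times w $$
where;
L - Lower boundary of the median class
N - Total number of observations
f0 - Cumulative frequency of the classes before the median class
fw - Frequency of the median class
w - Width of the median class

The median is the middle value of a data series when the data is arranged in ascending order, i.e., from the smallest value to the largest. Half of the observations lie below it and half above it. The formula above applies to data grouped into classes; for ungrouped data, the median is simply the middle observation (or the average of the two middle observations when their number is even).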
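The sketch below computes the median in both situations. It assumes the usual grouped-data formula, Median = L + ((N/2 - f0) / fw) × w, where L is the lower boundary and w the width of the median class. The class intervals and frequencies are hypothetical.

```python
from statistics import median

def grouped_median(classes):
    """Median of grouped data.

    classes: list of (lower_bound, upper_bound, frequency) tuples,
    sorted in ascending order.
    """
    total = sum(freq for _, _, freq in classes)
    if total <= 0:
        raise ValueError("need at least one observation")
    cumulative = 0  # f0: cumulative frequency before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= total / 2:
            # This is the median class; interpolate inside it.
            return lower + (total / 2 - cumulative) / freq * (upper - lower)
        cumulative += freq

# Ungrouped data: sort and take the middle value.
raw_median = median([7, 1, 3])

# Grouped data: four classes with their frequencies.
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]
class_median = grouped_median(classes)
print(raw_median, round(class_median, 2))
```

The grouped version interpolates within the median class, so its result is an estimate rather than an actual observation.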
Mode
In a histogram, the mode corresponds to the tallest bar. It is the value that occurs most often in the given sample or population. Because it only requires counting, this concept is also useful when dealing with non-numeric data.
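Because the mode only requires counting, it works on labels as well as numbers. Here is a small sketch with hypothetical survey responses:

```python
from collections import Counter
from statistics import mode

# Hypothetical non-numeric data: answers to a survey question.
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]

# The most frequently occurring value.
most_common_response = mode(responses)

# The full frequency table, most frequent value first.
frequency_table = Counter(responses).most_common()

print(most_common_response)
print(frequency_table)
```

A mean or median of these responses would be meaningless, while the mode is still well defined.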
Standard Deviation
$$ \sigma = \sqrt{\frac{\sum F_i (x_i - \bar{X})^2}{\sum F_i}} $$
where;
xi - class mark
X̄ - mean value
Fi - frequency

Standard deviation measures how much the values deviate from the mean. A low standard deviation indicates that the data values lie close to the mean, while a high one indicates that they are spread over a wider range.
Variance
$$ \sigma^2 = \frac{\sum F_i (x_i - \bar{X})^2}{\sum F_i} $$
where;
xi - class mark
X̄ - mean value
Fi - frequency

Variance indicates how scattered the values are around the mean. It is the average of the squared differences between each value and the mean. Standard deviation is the square root of variance.
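The following sketch computes the variance and the standard deviation for grouped data from class marks and frequencies. It assumes the population form of the formula (dividing by the total frequency, not by one less), and the data is hypothetical.

```python
import math

# Hypothetical grouped data.
class_marks = [5, 15, 25, 35]   # x_i: midpoint of each class
frequencies = [5, 8, 12, 5]     # F_i: number of observations in each class

total = sum(frequencies)

# Frequency-weighted mean of the class marks.
mean = sum(f * x for x, f in zip(class_marks, frequencies)) / total

# Variance: frequency-weighted average of squared deviations from the mean.
variance = sum(f * (x - mean) ** 2
               for x, f in zip(class_marks, frequencies)) / total

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

For a sample rather than a whole population, many texts divide by the total frequency minus one instead; which form applies depends on the data at hand.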
Range
Range = Highest Value - Smallest Value

The difference between the highest value and the smallest value is known as the range. It gives a quick picture of the spread of our data. Because it depends only on the two extreme values, the range is very sensitive to outliers.
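The effect of a single extreme value on the range can be seen in a short sketch. The values are hypothetical.

```python
# Hypothetical measurements.
values = [12, 15, 11, 14, 13]
data_range = max(values) - min(values)

# The same data with one extreme value added.
with_outlier = values + [95]
range_with_outlier = max(with_outlier) - min(with_outlier)

print(data_range, range_with_outlier)
```

One added observation changes the range dramatically, even though the other five values are untouched.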
Coefficient of Variation
$$ CV = \frac{\sigma}{\mu} $$
where;
σ - standard deviation
μ - mean value

The coefficient of variation expresses the dispersion of the values of a data series relative to their mean. It is also known as unitized risk. In investing, it is used to compare the risk of assets relative to their expected return.
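As an illustration, the sketch below compares two hypothetical assets by the ratio of standard deviation to mean. It assumes the population standard deviation (`statistics.pstdev`); the return figures are invented for the example.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Ratio of the (population) standard deviation to the mean."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("undefined when the mean is zero")
    return pstdev(values) / mu

# Hypothetical yearly returns, in percent, for two assets.
asset_a = [4, 5, 6]
asset_b = [2, 10, 18]

cv_a = coefficient_of_variation(asset_a)
cv_b = coefficient_of_variation(asset_b)
print(round(cv_a, 3), round(cv_b, 3))
```

Asset B has the higher average return, but also far more variation per unit of return, which is what the coefficient of variation makes visible.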
Standard Error
$$ SE = \frac{s}{\sqrt{n}} $$
where;
s - standard deviation
n - number of observations

It is the standard deviation of a sampling distribution, most commonly that of the sample mean. It indicates how accurately a sample represents the entire population: the smaller the standard error, the closer the sample mean is likely to be to the actual population mean.
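A minimal sketch of the standard error of the mean, assuming the usual formula SE = s / √n with s taken as the sample standard deviation (`statistics.stdev`). The sample is hypothetical.

```python
import math
from statistics import stdev

def standard_error(sample):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    return stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample of eight observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(sample)
print(round(se, 4))
```

Because n appears under a square root, quadrupling the sample size only halves the standard error.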
Pearson Product-Moment Correlation
$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} $$
where;
X, Y - Paired observations of the two variables
X̄, Ȳ - Mean values of X and Y

This value gives the strength of the linear relationship between two variables. It is named after its developer, Karl Pearson. Conceptually, it fits the best straight line through the data of the two variables and then measures how far the data points lie from this line. The coefficient ranges between -1 and +1. A value greater than zero indicates a positive association, a value below zero indicates a negative association, and a value of zero indicates no linear association.
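The coefficient can be sketched in plain Python, assuming the standard definition: the sum of the products of deviations from the means, divided by the square root of the product of the summed squared deviations. The function name and data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]   # deviations of X from its mean
    dy = [y - mean_y for y in ys]   # deviations of Y from its mean
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("undefined when one variable is constant")
    return numerator / denominator

# Hypothetical data: one perfectly rising and one perfectly falling relation.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(rising, falling)
```

The two examples sit at the extremes of the scale; real data usually falls somewhere in between.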
Regression Analysis
$$ Y = a + bX $$
$$ b = \frac{N \sum XY - \sum X \sum Y}{N \sum X^2 - (\sum X)^2} $$
$$ a = \frac{\sum Y - b \sum X}{N} $$
where;
N - Number of paired observations
ΣX - Sum of all values of X
ΣY - Sum of all values of Y
ΣXY - Sum of the product of X and Y
ΣX² - Sum of squared values of X

Regression analysis helps determine the relationship between two variables, one of which is dependent and the other independent. The analysis describes how the dependent variable behaves when the independent variable is varied; when there are several independent variables, one is varied while the others are kept fixed.
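For the simple case of one independent variable, the line Y = a + bX can be fitted from the sums listed above. The sketch below assumes the ordinary least-squares formulas, b = (N·ΣXY - ΣX·ΣY) / (N·ΣX² - (ΣX)²) and a = (ΣY - b·ΣX) / N. The function name and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + b*X."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = (sum_y - b * sum_x) / n
    return a, b

# Hypothetical data lying exactly on Y = 1 + 2X.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)

# Use the fitted line to predict Y for a new X.
predicted = a + b * 5
```

With real, noisy data the points will not lie exactly on the line, and a and b describe the line that minimizes the squared vertical distances to the points.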
Thus, it can be observed that data analysis has multiple aspects and approaches, along with diverse techniques and a variety of names. Its methods are used in different domains like business, science, and social science. This field of statistics is a very complex one, and its many methods aren't easy to learn without training and practice under expert guidance.