Classification of Data:




Introduction:

Data contains information about a certain group of individuals. The entire group of individuals that we want information about is called the population or target population. Usually we study only a part of the population and then draw inferences about the whole population. Such a part of the population is called a sample.



Example:

Suppose we want to study the BMI (Body Mass Index) of Indian citizens. Here the population consists of all the citizens of India. Now let's assume that we want to find the average BMI of Indian citizens. The standard approach is to select a part of the entire population, i.e., to randomly select a few individuals from different parts of India. This gives us a sample, which is a part of the population of Indian citizens.
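
The sampling idea in this example can be sketched in Python. This is only an illustration, not real data: the population of BMI values is synthetic (drawn from a normal distribution with an arbitrarily chosen mean of 23 and standard deviation of 4), and the names `population` and `sample` as well as the sizes used are assumptions made for the sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 synthetic BMI values (not real data).
population = [random.gauss(23, 4) for _ in range(100_000)]

# Draw a simple random sample of 500 individuals without replacement.
sample = random.sample(population, 500)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean BMI: {population_mean:.2f}")
print(f"Sample mean BMI:     {sample_mean:.2f}")
```

In practice the population mean is unknown; it is computed here only to show that the mean of a random sample tends to lie close to it.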

Variable:

To compute BMI in the earlier example, we need the weight and height of each individual. Weight and height vary from subject to subject, so Weight and Height are the variables here. The data set is termed bivariate data since it contains two variables.

  • A dataset containing one variable is termed Univariate data.
  • A dataset containing two variables is termed Bivariate data.
  • A dataset containing two or more variables is termed Multivariate data, so Bivariate data is a special case of Multivariate data.
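
As a rough sketch of these terms, the snippet below counts the variables in a small dataset and labels it accordingly. The function `describe_dataset`, the field names, and the two records are hypothetical and chosen only for illustration. The function returns the most specific label, so a two-variable dataset is reported as bivariate. The BMI line uses the usual formula, weight in kilograms divided by the square of height in metres.

```python
def describe_dataset(records):
    """Label a dataset by how many variables each record contains."""
    n_variables = len(records[0])
    if n_variables == 1:
        return "univariate"
    if n_variables == 2:
        return "bivariate"
    return "multivariate"

# Hypothetical records: one dict per individual, one key per variable.
records = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 58.5, "height_m": 1.62},
]

print(describe_dataset(records))  # bivariate

# BMI = weight (kg) / height (m) squared, computed from the two variables.
bmi = [r["weight_kg"] / r["height_m"] ** 2 for r in records]
print([round(b, 1) for b in bmi])
```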

Classification of Variables:

Variables are mainly classified into two categories, viz., Qualitative (Categorical) and Quantitative (Numerical).

[Figure: Classification of data]


Qualitative (Categorical):  

A variable is qualitative when its realizations are not numbers but describe a quality of the individuals in a population. Such a variable creates classes inside the population, hence it is also termed a categorical variable.

Example:

Variable: Hair Color (Black, Brown, White, etc.)

Qualitative variables are further classified into two classes, viz., Ordinal and Non-Ordinal (Nominal) variables. An ordinal variable has categories with a natural order (for example, education level), while a non-ordinal variable such as hair color does not.

Quantitative (Numerical): 

When a variable takes numerical values, we call it a Quantitative variable or Numerical variable. Quantitative variables are further classified into two categories, viz., Discrete and Continuous variables.

Discrete Variable:

Discrete variables can take values either from a finite set or a countably infinite set. A set is said to be countably infinite if and only if there exists a bijection from the set onto the set of natural numbers.

Example:

Number of heads in a sequence of coin tosses. It can only take values from the set A = {0, 1, 2, 3, ...}. A is countably infinite: taking the natural numbers to be {1, 2, 3, ...}, the map k → k + 1 is a bijection from A onto them.
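
A small simulation can make this concrete. The sketch below assumes a fair coin and a hypothetical helper `count_heads`; the number of tosses (10) and repetitions (5) are arbitrary choices. Whatever the outcome, each count is a whole number, never a fraction.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def count_heads(n_tosses):
    """Toss a fair coin n_tosses times and return the number of heads."""
    return sum(random.random() < 0.5 for _ in range(n_tosses))

# Repeat the experiment five times, with 10 tosses each time.
counts = [count_heads(10) for _ in range(5)]
print(counts)  # every entry is an integer between 0 and 10
```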


Continuous Variable:

Continuous variables can take values from an interval on the real line. Any interval on the real line that contains more than one point is uncountable.

Example:

Weights of students in a class.

Data can be classified in the same manner as variables.
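
The classification described in this post can be sketched as a small Python helper. This is a heuristic based only on the Python type of the observed values, which is an assumption of the sketch: text is treated as qualitative, whole numbers as discrete, and decimal numbers as continuous. The function name `classify_variable` is hypothetical. The true classification depends on what the variable measures, so numeric codes used for categories, for instance, would be mislabelled by this approach.

```python
def classify_variable(values):
    """Heuristically classify a variable from the types of its values."""
    if all(isinstance(v, str) for v in values):
        return "qualitative"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "quantitative (discrete)"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative (continuous)"
    raise ValueError("mixed or unsupported value types")

# Illustrative values for the three examples used in this post.
hair_color = ["Black", "Brown", "White"]
number_of_heads = [0, 3, 5]
weights_kg = [52.4, 61.0, 70.3]

print(classify_variable(hair_color))       # qualitative
print(classify_variable(number_of_heads))  # quantitative (discrete)
print(classify_variable(weights_kg))       # quantitative (continuous)
```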

------------------------------        Exercises:         ----------------------------------

A. Differentiate between a qualitative and a quantitative variable. Indicate which of the following variables are qualitative and which are quantitative; for the quantitative ones, also mention which are discrete and which are continuous.
  1. Number of persons in a family
  2. Colour of cars in a parking lot
  3. Marital status of research scholars
  4. Number of students in colleges
  5. Brand of mobile phone
  6. Monthly income of the workers of a factory
B. Give an example of quantitative discrete data.
C. Give an example of quantitative continuous data.
D. Give an example of categorical data.




