What is a dimension?
You might have heard the word “dimension,” or heard people use the term “high-dimensional data.” Let us discuss what the term “dimension” means.
Here is the tabular data from the previous lesson.
Name | Salary ($) | Age (Years) |
Jane | 90000 | 52 |
John | 85000 | 48 |
Delilah | 75000 | 32 |
Dave | 90000 | 53 |
Ellen | 82000 | 44 |
We said that the actual data part in the table above is:
90000 | 52 |
85000 | 48 |
75000 | 32 |
90000 | 53 |
82000 | 44 |
In this running example, we have two features or two columns, as explained in the previous lesson. We have five objects or five rows.
We call the data of our running example a two-dimensional dataset. That is, the number of features is equal to the number of dimensions of the dataset. The table above is a two-dimensional dataset because it has two features or columns.
That is:
Number of features = number of dimensions
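As a minimal sketch of this rule, the data part of the running example can be written as a Python list of rows, assuming one inner list per object and one entry per feature. The variable names here are chosen only for illustration.

```python
# Data part of the running example: one inner list per object (row),
# one entry per feature (column): salary in dollars, then age in years.
data = [
    [90000, 52],
    [85000, 48],
    [75000, 32],
    [90000, 53],
    [82000, 44],
]

num_objects = len(data)        # number of rows
num_features = len(data[0])    # number of columns
num_dimensions = num_features  # the rule: features = dimensions

print(num_objects, "objects,", num_features, "features,",
      f"{num_dimensions}-dimensional dataset")
```

Adding a third entry to every row would make `num_features`, and therefore `num_dimensions`, equal to three.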
If we had three features or three columns, we would call it a three-dimensional dataset. An example is provided below. The table has three features and five objects.
90000 | 52 | 10 |
85000 | 48 | 20 |
75000 | 32 | 30 |
90000 | 53 | 40 |
82000 | 44 | 20 |
If we had four features or four columns, we would call it a four-dimensional dataset. An example is below.
90000 | 52 | 10 | 50 |
85000 | 48 | 20 | 60 |
75000 | 32 | 30 | 30 |
90000 | 53 | 40 | 35 |
82000 | 44 | 20 | 40 |
I am sure the idea is clear by now. If the dataset has 1 feature, it is called 1-dimensional; with 2 features, it is called 2-dimensional, and so on. With n features or n columns, the data is called n-dimensional.
Feature 1 | Feature 2 | Feature 3 | Feature 4 | ... | Feature n |
90000 | 52 | 10 | 50 | ... | 43 |
85000 | 48 | 20 | 60 | ... | 2 |
75000 | 32 | 30 | 30 | ... | 73 |
90000 | 53 | 40 | 35 | ... | 36 |
82000 | 44 | 20 | 40 | ... | 90 |
Notice one thing here: regardless of the number of features (that is, the number of columns or dimensions), the data table can be stored in a two-dimensional array. Even a one-hundred-dimensional dataset can be kept in a 2D array or 2D matrix.
In programming, the “dimension” of an array is the number of indices needed to locate a cell; a table needs two, a row index and a column index. In data science, the word “dimension” has a different meaning: it refers to the mathematical space, such as the Euclidean space, in which each object is a point and each feature is an axis.
As an example, the following data table has three columns or three features. There are five objects or five rows.
90000 | 52 | 10 |
85000 | 48 | 20 |
75000 | 32 | 30 |
90000 | 53 | 40 |
82000 | 44 | 20 |
In programming, we say that this table can be stored in a 2-dimensional array of size 5 × 3. That is, it has five rows and three columns.
In data science, this table is called a three-dimensional dataset because each of its rows is a point in a mathematical space of three dimensions.
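The two meanings can be put side by side in a short sketch. It assumes the table is stored as a non-empty Python list of lists; `array_dimensions` is a hypothetical helper written for this illustration, not a standard library function.

```python
def array_dimensions(array):
    """Count how many levels of lists are nested inside each other.

    Assumes every level is a non-empty list.
    """
    depth = 0
    while isinstance(array, list):
        depth += 1
        array = array[0]
    return depth


# The three-feature table above: five rows, three columns.
table = [
    [90000, 52, 10],
    [85000, 48, 20],
    [75000, 32, 30],
    [90000, 53, 40],
    [82000, 44, 20],
]

shape = (len(table), len(table[0]))      # size of the array: rows x columns
programming_dims = array_dimensions(table)  # indices needed to reach a cell
dataset_dims = len(table[0])             # number of features

print("array size:", shape)
print("programming view:", programming_dims, "dimensional array")
print("data science view:", dataset_dims, "dimensional dataset")
```

The programming count comes from how the table is nested (rows, then cells), while the data science count comes from how many features each row holds.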
Similarly, a data table with four columns, such as the following one, is referred to as a four-dimensional dataset even though we store it in a two-dimensional array.
90000 | 52 | 10 | 50 |
85000 | 48 | 20 | 60 |
75000 | 32 | 30 | 30 |
90000 | 53 | 40 | 35 |
82000 | 44 | 20 | 40 |
That is, a higher number of features means a mathematical space with a higher number of dimensions. The physical memory space is the memory occupied by the corresponding two-dimensional array. Storage is a programming concept, and a dataset with any number of dimensions can always be stored in a 2-dimensional array.
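To make the idea of a mathematical space concrete, here is a hedged sketch that treats the first two rows of the four-feature table as points and measures the Euclidean distance between them using 2, 3, and then 4 features. It assumes Python 3.8 or later for `math.dist`; the names `jane` and `john` come from the first table, and attaching the third and fourth feature values to them is an assumption made only for illustration.

```python
import math

# First two rows of the four-feature table, treated as points.
jane = [90000, 52, 10, 50]
john = [85000, 48, 20, 60]

# Use the first n features only: each n gives a point in n-dimensional space.
distances = {}
for n in (2, 3, 4):
    distances[n] = math.dist(jane[:n], john[:n])
    print(f"{n}-dimensional space: distance = {distances[n]:.4f}")
```

Each extra feature adds one more axis to the space, yet both points are still ordinary rows of a two-dimensional table.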