Objects and features of a data table
The table below is from the previous lesson. Real-life data is usually much more complex than this example. Many datasets have hundreds or even several thousand columns, and the number of rows can run into the millions. For example, a data table covering the citizens of a country can have millions of rows.
Name | Salary ($) | Age (Years) |
Jane | 90000 | 52 |
John | 85000 | 48 |
Delilah | 75000 | 32 |
Dave | 90000 | 53 |
Ellen | 82000 | 44 |
Data is generally described using two things — objects and features. Let us speak in terms of the data table above, which we are using as our running example.
In the table above, the two columns — salary and age — are features. Notice that there are five people, whose names are placed in five rows. Each row is an object. Generally, in tabular form, we place objects in rows and features in columns.
An object is characterized (explained, defined) by its features. For example, in the data table above, Jane is described as a person with a salary of 90000 and an age of 52. That is, two features, salary and age, explain Jane. Similarly, John is described by the salary and age combination of 85000 and 48.
Notice that Jane could just as well be called Person1, John could be Person2, Delilah could be Person3, and so forth. The name column is not a feature of this dataset; rather, it is an object identifier. The name column helps us understand which row belongs to whom.
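The point about identifiers can be illustrated with a small Python sketch. This is only one possible representation, not part of the lesson itself: the variable names `people` and `relabeled` and the feature keys `salary` and `age` are assumptions chosen for illustration. Each object is stored under its identifier, and swapping the names for generic labels leaves the features untouched.

```python
# Each object (row) is stored under its identifier; the values are its features.
# The keys "salary" and "age" are illustrative names for the two feature columns.
people = {
    "Jane": {"salary": 90000, "age": 52},
    "John": {"salary": 85000, "age": 48},
    "Delilah": {"salary": 75000, "age": 32},
    "Dave": {"salary": 90000, "age": 53},
    "Ellen": {"salary": 82000, "age": 44},
}

# Replace the names with generic identifiers: Person1, Person2, and so forth.
# Dictionaries keep insertion order, so Person1 corresponds to Jane.
relabeled = {
    f"Person{i}": features
    for i, features in enumerate(people.values(), start=1)
}

# The features of the first object are the same under either identifier.
print(people["Jane"])
print(relabeled["Person1"])
```

Both print statements show the same salary and age, which is why the name acts as a label for the row rather than as part of the data.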
Just as the name column is an identifier, the header row, which contains the text “Name”, “Salary ($)”, and “Age (Years)”, is nothing but a descriptor that states what to call the columns. Therefore, the actual data part of the table above is composed of numbers only, as shown below.
90000 | 52 |
85000 | 48 |
75000 | 32 |
90000 | 53 |
82000 | 44 |
In practice, we use the data content of the table as a matrix. This matrix has five rows and two columns. That is, we have five objects and two features in the table.
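As a minimal sketch of this matrix view, the numeric part of the table can be written in Python as a list of rows. The names `data`, `n_objects`, `n_features`, `salaries`, and `ages` are assumptions made for illustration; a numerical library would normally be used for larger data, but plain lists are enough to show the idea.

```python
# The data matrix: one inner list per object (row), one entry per feature (column).
# Column 0 is salary in dollars, column 1 is age in years.
data = [
    [90000, 52],
    [85000, 48],
    [75000, 32],
    [90000, 53],
    [82000, 44],
]

# The number of rows is the number of objects;
# the number of columns is the number of features.
n_objects = len(data)
n_features = len(data[0])
print(n_objects, n_features)  # 5 2

# Reading down a column gives all the values of a single feature.
salaries = [row[0] for row in data]
ages = [row[1] for row in data]
print(salaries)  # [90000, 85000, 75000, 90000, 82000]
print(ages)      # [52, 48, 32, 53, 44]
```

Reading across a row gives one object, and reading down a column gives one feature, which matches the convention of placing objects in rows and features in columns.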
22 Comments
Good explanations
Thanks a lot
Thank you so much. I am really enjoying the insights from this lecture.
Thank you for this. I have never had a grasp of objects and features like I have today. God bless you.
I like it very much, and thanks for the step-by-step lesson.
I am glad to know that you are enjoying the lessons. Thank you.
I am from Ghana. I just chanced on this and had to join quickly. I am to do a presentation on data science as part of my KPI. Reading it, I find the material very, very helpful.
It is my passion to learn it all. My background is in Statistics and Computer Science.
Good work you are doing; keep it up.
Thank you for the kind words. I am glad to hear that the materials are helpful.
Sir, is every row an object, or are only the entries in the first column objects?
Can one say that an entry in the first column is an object with the features age and salary?
thanks
An entire row is an object. Thank you for your question.
Thank you, it is easy to understand as a basic starting point for learning more about data science.
I am glad to know that you liked the content. Thank you for your feedback. Have a wonderful time.
Good Lecture for beginners.
You are saying that the table we used has two features: one is “salary” and the other is “age”. So my question is:
“Why is ‘name’ not a feature?”
Thank you for visiting and commenting.
Yes. Name can be a feature too. In this example, we used the name as an identifier, especially because the name has nothing to do with features like salary and age. For some data analytics problems, the name might be important, and one can use the name as a feature. For instance, one might look for clusters of people with similar names.
Thank you once again for your comment.
I’m from Ghana
I registered yesterday and I’m ok with the explanation
I pray that it will be simple like this throughout the course
Thanks
I try my best to keep the topics I teach as simple as possible. I hope you will enjoy the course. We will create more videos in the coming months. Please stay tuned.
Very lucid explanation of numerical puzzles. I’m in Dhaka; how can I complete the course? I’ve previously completed courses in Real Analysis, Calculus Basics, Statistics (with Mathematical Stat and Advanced Stat), and Advanced Econometrics. I feel an urge to comprehend Data Science.
Dhaka is one of my favorite towns on earth. Great to hear from someone from Dhaka!
Since you have a good background in mathematics, statistics, and econometrics, I believe you will enjoy a data science course. The data science course on our site is currently introductory, and we are still developing it. We are planning to build this introductory course over this year. If you registered for the course, you would receive emails from us when there is a new video or a new lesson.
Thanks again for your interest in the course.
Thanks, excellent explanation. It makes it easy to understand the importance of data science today. I hope to move forward and go deeper into the subject.
Regards from Dominican Republic
I am glad to know that you liked it. Best regards to you as well.
Interesting literature. Do I have to watch the video at all? Data harmonization in Liberia
Thank you for visiting. We made the videos to clarify the text further. Many times, the text already covers the content of the video. You can skip the video if you find that the topic is clear by reading the text. Many of the visitors prefer to watch the video lectures.
Many thanks again.