An Introduction to Data
Data science is a field of study that focuses on techniques and algorithms for extracting knowledge from data. The field combines data mining and machine learning with data-specific domains. This section defines "data" before moving on to more complicated topics.
Data Dimensionality and Space
This section defines the common terminology used in data science. The video lectures cover terms such as objects, data points, features, dimensions, vectors, high-dimensional data, and mathematical space.
Proximity in Data Science Context
Many data mining and machine learning algorithms rely on distance or similarity between objects/data points. Video lectures in this section focus on standard proximity measures used in data science. The section also explains how to use proximity measures to examine the neighborhood of a given point.
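As a rough illustration of what a proximity measure looks like in practice, the sketch below implements two widely used ones, Euclidean distance and cosine similarity, and uses the former to look up the neighborhood of a point. The function names and the sample points are made up for this example and are not taken from the lectures.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, points, k):
    """Return the k points closest to `query` by Euclidean distance."""
    return sorted(points, key=lambda p: euclidean_distance(query, p))[:k]


# Hypothetical 2-dimensional data points, invented for illustration.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]

# The two points that form the neighborhood of (0.9, 0.9).
neighbors = nearest_neighbors((0.9, 0.9), points, k=2)
```

Note that (3.0, 4.0) and (6.0, 8.0) point in the same direction, so their cosine similarity is close to 1 even though their Euclidean distance is 5. Which measure fits depends on whether magnitude matters for the data at hand.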
A large portion of data science focuses on exploratory analysis. Scientists and practitioners use statistical techniques to understand the data. One way to explore the data is to check whether it contains clusters of data points. A cluster is a group of data points that have similar features. This section explains clustering algorithms.
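To make the idea of a cluster concrete, here is a minimal sketch of k-means, one common clustering algorithm. The text above does not say which algorithms the lectures cover, so the choice of k-means is an assumption, as are the function name, the fixed starting centroids, and the sample points.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Naive k-means: alternate assignment and centroid-update steps."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Hypothetical points forming two visually separate groups.
points = [
    (1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
    (8.0, 8.0), (9.0, 8.5), (8.5, 9.0),
]

# Start from one point in each group so the result is deterministic.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
```

With these starting centroids the first three points end up in one cluster and the last three in the other. Real implementations usually pick the starting centroids at random and stop once the assignments no longer change.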
Can data speak?
Given the following table of employee data, we will try to extract interesting information from it.
|Name|Salary ($)|Age (Years)|
Look closely at the table for several minutes. Then, write down anything interesting you can find.
Here is what I found in the table above. Compare your findings with the list below to see which ones match and which ones are missing from it. Please feel free to share any additional findings in the Comments section.
- Jane and Dave earn the highest salary.
- Delilah earns the least.
- Jane and Dave are the oldest people in the group.
- Delilah is the youngest among all the employees in the table.
These are all interesting findings. What else?
Is the following statement correct, based on the information provided in the table?
Older people earn more in the company from where the data was collected.
Based on the data provided in the table, the statement is indeed correct.
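One way to check a statement like this numerically is to compute the correlation between age and salary. The sketch below uses the Pearson correlation coefficient. The rows are placeholders, not the values from the table above: the salaries and ages are invented to match the listed findings, and "Employee X" is a hypothetical fourth row.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder rows (name, salary in $, age in years).
# These numbers are invented for illustration only.
employees = [
    ("Jane", 90000, 55),
    ("Dave", 90000, 55),
    ("Employee X", 60000, 38),  # hypothetical row
    ("Delilah", 40000, 25),
]

ages = [age for _, _, age in employees]
salaries = [salary for _, salary, _ in employees]

# A value near +1 means salary tends to rise with age in this sample.
r = pearson_correlation(ages, salaries)
```

A coefficient close to +1 supports the statement for the sample in hand. It describes only the rows that were collected and says nothing about why the pattern exists.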
Now, let us go back to the definition: Data refers to “facts and statistics collected for reference or analysis.”
This table contains facts. It was collected from a company. We used it for analysis, and the analysis revealed that the company appreciates experienced employees. In short, the data reflects a general trend:
Experience, wisdom, (and money, which is the salary in this case) come with age.
The conclusion can be debated, but the main point is this: data speaks. Data gives us insights. Data gives us those light-bulb moments.