- How many people work here?
- How long has Bob worked here?
- Where does Nancy work, in Home Office or Regional Office One?
- What's our best-selling line of shoes?
- What is our profit margin on men's socks?
Here's an example of two ways to design tables, depending which questions need to be answered.
History vs Current StatusThere are two ways to store Salary information for employees. One would be to store only the Current Salary paid to employees. The other would be to store the entire Salary History for those employees. The first requires only one table, the second requires two tables.
So, before you even start building out the tables for this employee tracking Relational Database Application, you need to know which question(s) the organization wants to be able to answer. This is a partial list of the kinds of questions Human Resources might ask about employees' salaries.
- How much does Jose, in Sales, make each year?
- How much did Anastasia, in IT, make in her previous position in the Shipping Department?
- When did Mark, in Customer Support, last get a raise?
Current StatusThe first question can be answered from a single "Current Status" field in one table. It's a point in time question, and it refers only to THIS point in time. So we need only one value for Current Salary at a time.*
Dynamic vs Static StatusNote, however, that this does NOT mean the value of Current Salary is static. It can change from time to time. But regardless of those changes over time, the only question a Current Status field can answer is "How much is it when I ask the question?"
A static value, such as the employee's Date Of Birth, doesn't change. It too can be stored in a single "Current Status" field in one table.
If the organization decides, as a matter of policy, that they are concerned only with that Current Status, this "Current Status" field in a single table design is adequate. As a matter of fact, one can think of a number of data points for which we need to store only Current Status. Phone number comes to mind. If a customer changes their phone number, for example, we need to update our records to that new number, but there's no point to retaining the old, obsolete, phone number. You couldn't call it anyway!
HistoryMany other kinds of questions we typically ask in a Relational Database Application, though, require knowledge of the history beyond the current status. "When did this happen last?", "How Long has this been the case?", "What was it like Before and what is it like After?" questions all require us to keep track of the current status as well as each of the previous statuses along with the dates when those statuses change.
In order to answer the question about Anastasia's previous salary, we actually have to keep that data point in a table, along with the beginning and ending dates for that salary and her tenure in the Shipping Department.
Operational and Technical DecisionsAnswering questions about "Current Status" requires only one data point, the current salary. That means a single field in a single table can store that data point. So, when it comes time to design the employee table for our organization, we can add a field called "Salary" to the Employee table and update it from time to time as required.
Because answering "History" questions requires a different approach to storing data, we also have to take a different approach to designing the tables to support it. We can't put a new field in a table for each salary change that occurs for every employee. Think about how awkward that would become. Let's say our organization has ten employees on day one. We store their "StartingSalary" in the employee table along with other pertinent data points.
One month later, Anika gets promoted and receives a pay raise. If we want to keep the salary history, and if we're trying to store that history in the same table, we have to add a new field for Anika's new salary and call it something "Salary_2020_03_22" to reflect the date it happened. But that means the other nine employees now have the same field, "Salary_2020_03_22", but nothing to put in it; they didn't yet get a raise.
Then, six months later, four other employees get raises and that means another new field, "Salary_2020_09_12". So, now Anika has values under "StartingSalary" and "Salary_2020_03_22". Four employees also have "StartingSalaries", but they have "Salary_2020_09_12" salaries. And five other employees have only "StartingSalaries" and nothing in the other two salary fields.
That scenario sounds far-fetched, of course, and it's obvious (I hope!) that it would be a terrible way to design and maintain a Relational Database Application**. Unfortunately, we see that kind thing all too often. It's not always as easy as this to see how ridiculous it is, but it always leads to problems.
Because we want to answer a different kind of question, we have to design our tables differently.
In this case, we have a one-to-many relationship between Employees and Salary (history). One employee has one or more salaries OVER TIME.
The proper design, therefore, is an employee table and a related EmployeeSalary table.
These screenshots illustrate how it has to work:
|One Table Current Status of Salary vs Two Table of Employee Salary History|
|History of Salaries for One Employee|
Look Before You LeapBefore you leap into creating a new Relational Database Application, make sure you know which questions the organization will need to answer, this year, next year and beyond.
* The third question MIGHT be answered from a "Current Status" field in one table if we also include the "As Of Date" for that status.
**On the other hand, it's pretty much the only way you can do it in an Excel Worksheet, isn't it? 😀