Friday, March 27, 2020

I'm Not a Manager, I Just Play One in the EmployeeRoles Table

Last time out I outlined a couple of different ways to understand tables (the names of things) and fields (attributes of those things) in terms of our natural language understanding of common rhetorical structures.

I explained two of the X and Y rhetorical structures at work.
X is ONE KIND OF Y.
X is AN ATTRIBUTE OF Y. 
I offered the examples of departments within an organization and employees working for the organization. However, we need to drill further down in order to get a more complete and accurate picture of the real world we're trying to model in a relational database application. Let's do that next.

First, it is true that "Human Resources" is one kind of department, along with Information Technology, Marketing and so on. It is also true that "size" or "number of employees" is an attribute of departments.

However, I also went on to describe the different roles played by employees as attributes of those employees, although that formulation is not quite precise enough. I think it's more useful to propose a third X and Y rhetorical structure to account for the changeable nature of roles.
X CURRENTLY FILLS THE ROLE OF Y.
In my defense, it's not totally inaccurate to say that Role is an attribute of an employee of importance to the organization. On the other hand, I think it's more useful to recognize that employees can and do move more or less fluidly between roles.  And that means it's not simply the same thing as the immutable attributes of a person, like their birthdate.

Current Role is fundamentally different from some of the other attributes we can identify for employees. For example, the original Date of Hire for an employee doesn't change. That's an immutable attribute of the employee's relationship to the organization. People's names--for the most part--are also immutable attributes of people whether they work for the organization or not, and so on.

So, rhetorically speaking, we need to analyze candidates for inclusion in a relational database application as tables and as fields in those tables in at least the following three ways.

X is ONE KIND OF Y.
X is AN IMMUTABLE ATTRIBUTE OF Y.
X is currently IN THE ROLE OF Y. 
 I was initially a bit hesitant to include "immutable" as part of the second X and Y formula, because it's always possible that many such attributes can actually change. Women often change their last names after marriage, for example. Even a person's gender can be changed. So, although I am sure that it is useful to think of immutable and changeable as differentiators between attributes and roles, I do recognize that it's not totally unambiguous.

In relational database application design terms, we need a better way to think about the way we create tables and fields to handle these three structures.

As we saw last time, a table is the name of one thing: a noun, in other words.

At a general level, we can identify different attributes of those nouns: adjectives, in other words.

Now, though, we have a better, more granular, way to differentiate between those attributes, using the X and Y rhetorical structures.

X is ONE KIND OF Y.
  • Example: Human Resources is a Department at Company ABC. 
X is AN IMMUTABLE ATTRIBUTE OF Y.
  • Example: John Smith was hired in the Human Resources Department at Company ABC on June 1, 2010. 
X is CURRENTLY IN THE ROLE OF Y. 
  • Example: As of September 12, 2018, John Smith is Head of the Compensation Team in the Human Resources Department at Company ABC.
Because there is a fundamental difference between the unchangeable initial hire date (an immutable attribute) and the current Role Date (a current role), we may need to structure our tables differently. In other words, the important thing here is NOT the simple fact that we have to store "a date". The important thing is that different kinds of dates can have different implications about how we need to store the data.

Whether the current role is set up as an field in the Employee table depends on how Company ABC uses their data.
  • Is it important to know which roles John Smith has held and when he held them? 
  • Or do we only need to know John Smith's current role? 
 In most cases I would imagine that the historical record would be important here. And that, in turn, calls for a separate table: EmployeeRole. In this table, we store information about the employee, the role or roles they have held in the organization, and the begin and end dates of each of those roles.

I suppose that there are cases where only a current role or current attribute is important. For example, it could be argued that the employee's current name is all we ever need. The fact that Darlene changed her last name from Jones to Barker when she got married may not be relevant for our purposes. All we need to know here at Company ABC is what name to print on her paycheck.

Decisions of that sort can only be made within the context of the specific project. On the other hand, knowing that it is a decision to be made certainly is important.

If the history of an employee's different roles within the company is important, we need to create a table to hold it. That history is a thing, an entity separate from the employee.

Summing it all up.

For this hypothetical relational database application, I now see a Department Table, with at least one field for "DepartmentName".

I also see an Employee table, with fields for employee First, Middle and Last Names and HireDate. As noted, those four attributes are essentially immutable and, therefore, qualify to be attached to the Employee in that table.

However, "Current Role" is not an immutable attribute of an employee. Partly for that reason and partly because of decision about the purpose of the relational database application, it becomes a candidate for another table entirely: EmployeeRoleHistory. I like to call such tables "History" tables.

For now, we can sum up by identifying three X and Y rhetorical structures and their implications in designing a relational database application.

X is one kind of Y.
  • These are implemented as tables (Y) and fields (X) in those tables to hold the names of those kinds of things. 
X is an immutable attribute of Y.
  •  These are also implemented as tables (Y) and fields (X) in those tables to hold the domain of possible values of the attribute.
X is a current role held by Y.
  • These can be implemented as tables (Y) and fields (X) in those tables if, and only if, the CURRENT role is the only role of importance.
  • If the HISTORY of roles is important, these must be implemented as two tables (Y1 for Employee and Y2 for Role) in order to link up the role (X2) and the holder of that role (X1).

In my next installment, I'll dig deeper into the nature of the "X is a current role held by Y" structure and of the EmployeeRoleHistory table we need to properly support it in a relational database application.