Thursday, January 2, 2020

All At Once or One At a Time?

Like most Access developers, I pay a lot of attention to the technical aspects of creating tables in a Relational Database Application. Making sure all of the relevant entities are identified and tables are built for them. Identifying relationships between those entities and defining Primary and Foreign Keys to support and enforce referential integrity within those relationships. Sometimes, though, it's not enough to identify relationship as one-to-one, one-to-many or many-to-many. You also have to take into account the logic, or business rules, that apply to the business itself. That's what I want to talk about today.

Concurrent Status Vs Sequential Status

When designing and building a new Relational Database Application it's rather important that you understand whether a relationship is "All at Once" (i.e. concurrent) or "One at a Time" (i.e. sequential). Initially, they can look very much the same, but they're not. And that has implications for the design of the tables and the relationships between them. Let's look at a couple of examples, and see if we can sort out a workable solution.

One at a Time

These may sound like dumb questions, but I'm trying to make a point I'll explore later.

How many employees can drive the same company car at the same time? 
How many company cars can the same employee drive at the same time?

Pretty obviously ( I hope), only one person can physically drive a car at the same time, and one person can only drive one car at a time. Those logical restriction are based on the nature of "driving a car". Over time many different people can take turns driving different cars, of course. (If you happen to have a teen-age driver in your household and only one car, I suppose that could seem like more of a hypothetical proposition đŸ˜‰.)

It also makes sense that an employee can drive different cars on different days. And different employees can drive one specific company car on different days. This can be thought of as a "One at a Time" scenario.

All at Once

Here's another set of questions that might help understand why it's important to know how it works before you start building tables in a Relational Database Application.

How many classes is Yoshiko enrolled in this semester?
How many students are enrolled in the Math 101 class this semester?

Again, it should be obvious that a person can enroll in one or more classes each semester. There's an upper limit to the course load any one person can handle, but the exact number varies from person to person and from semester to semester. Moreover, because a Class can enroll one or more persons, there is no logical restriction on the number of concurrent enrollments by different students. The only restriction here is that one student can't take the same class more than once in a given semester. This can be thought of as an "All at Once" scenario.

Concurrent Vs Sequential History

At first glance, one might assume that both of these scenarios represent Many-to-Many relationships, which means, in turn, that they require a junction table like this one, which illustrates a concurrent status for students and classes.

Students Concurrently Enrolled in Classes
Each student can concurrently enroll in one or more classes each semester. Each class can concurrently be taken by one or more students each Semester that class is offered by the school.

Our junction table for students and classes includes two foreign keys, one for each student and one for each class. In addition, to enforce the requirement that students can enroll in a class only once each semester, the StudentClass junction table includes the SemesterStartDate as part of the three field composite Primary Key. That allows students to enroll in a class more than once, sequentially. That might happen if a student failed the class on the first attempt and need to retake it.

For Company Cars assigned to Employees, we initially might want to design our tables the same way because we know that one or more employees can check out a company car, and that a company car can be assigned to one or more employees.
Company Cars Assigned to Employees
Our first attempt at a junction table for employees and company cars includes two foreign keys, one for employees and one for company cars. In addition, the CheckOutDate is included in the composite key. That means an employee can only check out a Company Car once on any given date. That is, in fact, what we want.

However, it turns out that this table design actually allows us to create records that would violate our rule about Sequential, or One at a Time, assignments. It is possible, with this design, to assign an Employee to two or more different Company Cars on the same date, or to assign the same Company Car to two or more different Employees on the same date. See the sample records in this query.
One Company Car Assigned to Two Employees on the Same Day

Both Bezalel Simmel and Georgi Facello end up being assigned the Ford Focus on the 2nd of January, 2020. And that is fully supported by the Junction Table. Obviously, this is not logically acceptable. We need a different kind of table design. Here's how the first attempt at resolving this problem might look.
Composite Primary Key on Company Car and Checkout Date
 This approach seems like it might work. The Ford Focus can only be assigned once per day, which means that Giorgio gets it on the 2nd, Bezalel gets it on the 3rd, and Parto gets it on the 5th. 

However, there's still another problem. Parto gets two different Company Cars on the 5th. The Primary Key restraint doesn't prevent that; it only prevents assigning the same car twice on the same day.

The same problem would appear, though, if we tried to enforce the Primary Key restraint on EmployeeID and CheckoutDate. Each employee could get one car per day, but the same car could be assigned twice (or more) on that same day.

Our next step in resolving this problem will be to add an additional unique index to the junction table, like this.
Unique Index on Two Fields Not in Composite Primary Key

By adding a second unique index on the other two fields we need to restrict, we can now prevent duplicate assignments to the same employee on the same date, in addition to preventing duplicate assignments of the same company car on the same date. 

Note: the desired restriction is enforced by the Unique property. The Primary Key is unique by default. The second index we created, on EmployeeID and CheckoutDate, could be applied without the unique constraint. However, because it is possible to create that second index and make it unique, we can achieve the desired restriction on Employees and Company Cars as "One at a Time".

There Is More than One Way to Get the Right Answer

It's worth pointing out that there are other ways to go about handling this situation. For example, the composite primary key could be on EmployeeID and Checkout date while the second, unique index could be set up on CompanyCarID and CheckoutDate. 

Not only that, we could assign a new AutoNumber Primary Key to this table, and make two other unique indexes which would achieve the same result.
Surrogate Primary Key and Two Unique Indexes
The key features of this approach to designing the tables for a Sequential Type of Many-to-Many Relationship include the following:

  1. It limits records for each combination to "One at a Time"
  2. It supports a history of transactions in which two entities (in this case, employees and company cars) are involved over time.
  3. It enforces all of the restrictions at the table level. That means no additional wads of code have to be written.


Wednesday, January 1, 2020

Singing (and Dancing) With Access

Dancing with the Stars, Singing with the Idols

I like to think I can make MS Access sing and dance. Give me a task and most of the time I'll find a way to accomplish it. Once in a while it takes longer to navigate through a new kind of process, but so far I've never failed (well there was that once when I couldn't mimic Excel exactly in a datasheet view in a form, but it was on the clients "it would be nice..." list anyway). However, there are limits to what even Access can do. When I find myself stumped, I tend to want to blame Access. But that's just an excuse, on several levels. It shames me when I resort to blaming someone else for my shortcomings. That makes me human, I guess. It doesn't make me right.

Que SerĂ¡ SerĂ¡.

 Access is what it is. How trite and tautological, right? But one of the things that alternately frustrates and amuses me is that Microsoft designed Access to do exactly what it does, no more and no less. I refuse to admit defeat until I've exhausted all of the options. Sometimes I'm quite delighted to learn there is a way to do something I would have thought impossible. Often those tasks relate to working with URLs or other things webbie. It's something I know little about. However, I've learned never to say never, except when I've spent enough time Bingoogling to conclude I'm not going to find an answer. Then it's time for a long walk and maybe a fresh cuppa Costa Rican dark roast.

The corollary to that, of course, is that Access isn't what it isn't. VBA is a rich, powerful coding language that works under Windows on the desktop. It's not a web coding language. It never was, and despite all the Sturm und Drang about it over recent years, there's little chance it ever will be (I'd say zero chance, but, you know, never say never). You want to make web pages, go learn a web coding language. There are lots of them and you are smart enough to figure it out, if you must. Just don't waste my time and yours whining about something that nunca serĂ¡.

 Mommas Don't Let Your Babies Grow Up to Be Cowboy Coders

I run into this from time to time on the on-line forums. The question comes in the form of "How do I write a function (or maybe some SQL) to do X, Y, and Z? And then, when the answer is along the lines of "Use the built-in Access feature that does X, Y and Z, the response comes back, "But I want to do it this way. How do I do that?"

One common example is form design. Bound forms are the heart and soul of nearly every Access Relational Database Application. Writing a lot of code to handle unbound forms seems like a real waste of time and resources in all but a few very specific situations. If you want to ride off into the sunset on a white mustang, be my guest. But don't lose sight of the fact that the end of the road for most cowboys includes a lot of isolation.

If you don't want to do it "the Access Way", that's your call, of course. But  don't complain when it turns out to be harder, less reliable and less flexible than "the Access Way".

Know Your Limitations

 Access, for example, doesn't do a great job with data security. It has other strengths. Keeping your data absolutely protected is not one of them. If that's a requirement for your Relational Database Application, find a different tool. Access is, and always be, my favored tool for creating Relational Database Applications. On the other hand, just like I wouldn't haul pigs to market in a Cadillac I am pretty sure there are better tools for a lot of tasks.

Access and I are going to be dancing partners for a long time to come, I hope.We just have to avoid stepping on each others toes.

How Long Has This Been Going On?

It is axiomatic that an organization invests time and other resources in acquiring a well designed Relational Database Application because it needs and wants answers to common questions about their organization and its operations.

Things like:
  • How many people work here?
  • How long has Bob worked here?
  • Where does Nancy work, in Home Office or Regional Office One?
  • What's our best-selling line of shoes?
  • What is our profit margin on men's socks?
Dozens of other questions can be asked about nearly every organization. The job of your Relational Database Application is to gather, store, protect and analyze the data needed to answer all of those questions. And that means the  tables in that Relational Database Application have to be designed accordingly. It's the "Relational" part of the Database Application.

Here's an example of two ways to design tables, depending which questions need to be answered.

History vs Current Status

There are two ways to store Salary information for employees. One would be to store only the Current Salary paid to employees. The other would be to store the entire Salary History for those employees. The first requires only one table, the second requires two tables.

So, before you even start building out the tables for this employee tracking Relational Database Application, you need to know which question(s) the organization wants to be able to answer. This is a partial list of the kinds of questions Human Resources might ask about employees' salaries.

  1. How much does Jose, in Sales, make each year?
  2. How much did Anastasia, in IT, make in her previous position in the Shipping Department?
  3. When did Mark, in Customer Support, last get a raise?

Current Status

The first question can be answered from a single "Current Status" field in one table. It's a point in time question, and it refers only to THIS point in time. So we need only one value for Current Salary at a time.*

Dynamic vs Static Status

Note, however, that this does NOT mean the value of Current Salary is static. It can change from time to time. But regardless of those changes over time, the only question a Current Status field can answer is "How much is it when I ask the question?"

A static value, such as the employee's Date Of Birth, doesn't change. It too can be stored in a single "Current Status" field in one table.

If the organization decides, as a matter of policy, that they are concerned only with that Current Status, this "Current Status" field in a single table design is adequate. As a matter of fact, one can think of a number of data points for which we need to store only Current Status. Phone number comes to mind. If a customer changes their phone number, for example, we need to update our records to that new number, but there's no point to retaining the old, obsolete, phone number. You couldn't call it anyway!


Many other kinds of questions we typically ask in a Relational Database Application, though, require knowledge of the history beyond the current status. "When did this happen last?", "How Long has this been the case?", "What was it like Before and what is it like After?" questions all require us to keep track of the current status as well as each of the previous statuses along with the dates when those statuses change.

In order to answer the question about Anastasia's previous salary, we actually have to keep that data point in a table, along with the beginning and ending dates for that salary and her tenure in the Shipping Department.

Operational and Technical Decisions

Answering questions about "Current Status" requires only one data point, the current salary. That means a single field in a single table can store that data point. So, when it comes time to design the employee table for our organization, we can add a field called "Salary" to the Employee table and update it from time to time as required.

Because answering "History" questions requires a different approach to storing data, we also have to take a different approach to designing the tables to support it. We can't put a new field in a table for each salary change that occurs for every employee.  Think about how awkward that would become. Let's say our organization has ten employees on day one. We store their "StartingSalary" in the employee table along with other pertinent data points.

One month later, Anika gets promoted and receives a pay raise. If we want to keep the salary history, and if we're trying to store that history in the same table, we have to add a new field for Anika's new salary and call it something "Salary_2020_03_22" to reflect the date it happened. But that means the other nine employees now have the same field, "Salary_2020_03_22",  but nothing to put in it; they didn't yet get a raise.

Then, six months later, four other employees get raises and that means another new field, "Salary_2020_09_12". So, now Anika has values under "StartingSalary" and "Salary_2020_03_22". Four employees also have "StartingSalaries", but they have "Salary_2020_09_12" salaries. And five other employees have only "StartingSalaries" and nothing in the other two salary fields.

That scenario sounds far-fetched, of course, and it's obvious (I hope!) that it would be a terrible way to design and maintain a Relational Database Application**. Unfortunately, we see that kind thing all too often. It's not always as easy as this to see how ridiculous it is, but it always leads to problems.

Because we want to answer a different kind of question, we have to design our tables differently.

In this case, we have a one-to-many relationship between Employees and Salary (history). One employee has one or more salaries OVER TIME.

The proper design, therefore, is an employee table and a related EmployeeSalary table.

These screenshots illustrate how it has to work:
One Table Current Status of Salary vs Two Table of Employee Salary History
Note that Salary History requires at least a "from_date" in order to establish the proper sequence of values.
History of Salaries for One Employee
The query on the left is based on the table called "employee_salary". The query on the right is based on the two related tables, "employee" and "employee_salary_history."

 Look Before You Leap

Before you leap into creating a new Relational Database Application, make sure you know which questions the organization will need to answer, this year, next year and beyond.

* The third question MIGHT be answered from a "Current Status" field in one table if we also include the "As Of Date" for that status.

**On the other hand, it's pretty much the only way you can do it in an Excel Worksheet, isn't it? đŸ˜€