Tuesday, October 29, 2019

Pouring a Concrete Foundation After the Roof Goes On

Sometimes, when I'm cruising Access forums for questions I might be able to answer, I get the feeling that Access is simultaneously a really fantastic application-development tool and a really subtle trap for the unaware.

I'm talking about valid, proper, appropriate, NORMALIZED Relational Table Design, of course. Think, for example, of color options or size options for articles of clothing. Shirts come in Small, Medium, Large and Extra Large. Socks come in Blue, Black, Argyle or white. And so on. You've all seen "tables" in Excel worksheets, with one column for each option. In Excel they are the only option. You have to have different columns for each option.  And in the case of two options, you might even have combinations like "Small Blue", "Medium Blue", "Large Blue" and "Extra Large Blue", along with all of the other possible combinations. The inevitable result is multiple columns stretching out to the right as far as the eye can scroll.

Unfortunately, those columns make their way into Access tables as fields as well.

But in Access forums, the problem table design is seldom broached in the initial post, although I have seen a few direct questions about it.

The question often presents itself as a problem with designing an interface to handle multiple fields  Should you use a check box to select one of those options, or some other interface element that addresses each of those fields somehow?

Sometimes the question asks for help in constraining data when two or more fields are available, only one of which should be filled and the others left blank . If you pick a Large size shirt, you can't also have it be Small. Black socks can't also be White socks.

Sometimes it shows up as a problem getting results out of a query with that kind of inappropriate table design. How do I write a query to get the Shirt Size we sell most often?

There are many variations, but they all boil down to lack of Normalization, and I compare that to trying to go back and putting in a foundation under a house when you've already put up walls and a roof. It's possible, but it's not going to be easy and it all might collapse on you.

If you plan to create an Access Relational Database Application, and if you don't know what I am talking about here, I think you owe it to yourself to invest a few hours in your peace of mind by learning as much as you can about the Rules of Normalization BEFORE you try to put up walls and a a roof.  The nicest roof in the world won't keep the rain out if the building collapses because it lacks a foundation.

The reason this is a problem is that, as I said at the beginning, Access is really open and easy to get started. It's tempting to import a working spreadsheet and throw an interface on. And, unfortunately, that can be done without understanding what the consequences will be.

Be safe, stay dry. Normalize first.

Monday, August 5, 2019

Why Choose Min() and Max() Over First() and Last() ?

Humans are much better than computers at resolving ambiguity. We're trained from birth to do that.

People can look at a series, or list, of values and decide which one is "the first" and which one is "the last" by referencing both the values themselves and the context in which the question was asked.

Perhaps such a  question might be "Of this list of people and their birthdates, which birthdate came first?"

We don't need to have the birthdays sorted in date order to answer that question because our minds are trained to recognize the context as well as the values. For example, if the list of birthdates is presented to us with the names of the people sorted alphabetically, we don't go to the item at the top of the list and pick it because know that's not the right context; we scan the entire list looking for the earliest one relative to other birthdates.

That kind of disambiguation is a skill computers are still learning, and one which Relational Database Applications are not set up to handle directly. We still need to make explicit the Sort Order the computer must use in order to find "the first".

Put it a different way.

If the question is phrased as it is above, the sort must be "earliest" birthdate to "most recent" birthdate so that the "earliest" is also "first" on the resulting set of records.

If the question is phrased differently, the sort must be different. "Of this list of people and their birthdates, which one is the birthdate of the  first person on the list?"

To answer that question, of course, we have to know which sort order to apply to the list to get the result expected by the questioner. Do they want the "first" person alphabetically? Alphabetically by FirstName? Alphabetically by LastName? Or does "first" refer to some other criteria like "tallest", "top producer", etc?

When the question is asked explicitly, it's easy to see why and how the answer will be different. We're asking about birthdates in different contexts, and that means  first doesn't always return the same answer.

Because of that ambiguity, many seasoned Access developers prefer to use Min() and Max() instead of First() and Last(). Min() and Max() are absolutes within the values themselves, regardless of context. The Minimum birthdate is the same regardless of any sort applied any of the other fields in that query. The same is true for Maximum birthdate. Access doesn't need to disambiguate the contextual reference for the question, "What is the minimum (or earliest) birthdate in this list of people and their birthdates?"

While it is possible to get accurate results when using First() and Last(), doing so depends on knowing both the values and the context in which the comparison is being made.

If you want to continue to use First() and Last() instead of Min() and Max(), continue to do so. Just keep in mind the possibility of potential subtle errors and make sure you use the appropriate ORDER BY clause.

First and Last refer to position in a list. Min and Max refer to values in one field.

Tuesday, July 9, 2019

Green is for Go, Yellow is for Slow, Red is for No.

Calculated Values Stored in Fields Don't Pay, They Cost

Most people who spend a lot of time around relational database applications develop a phobia for calculated values stored in tables. I am aware of arguments for special cases that justify doing it anyway. However, the general rule is a sound one: Don't calculate and store values in tables.

I have repeated this rule innumerable times myself. At a recent family gathering, though, I heard a story that explains just how awful the consequences of ignoring this rule can be. I thought I'd share the story here, subject to the usual disclaimers about changing or concealing details to protect privacy of those involved.

This company is in the communications business in a general sense. They install hardware and software along with other supporting infrastructure in order to provide their particular services to both commercial and residential customers. They maintain a number of very large databases, one of which details existing customer accounts, another of which details existing infrastructure previously installed in homes and business locations. In other words, their technical people track where they've installed hardware and other equipment in their area of operations in one database. Their accounting  people track customers and the locations where they receive services. Two different databases built around similar, but different data.

Salespersons contact potential customers to negotiate contracts for those services. Part of that process involves requesting information from their technical side about what infrastructure might already be installed at a location, perhaps for a previous occupant of that location. That communication takes some time, slowing down the negotiation. And, as we all know, time is money, especially for a busy salesperson. Somebody came up with the great idea of creating a third system that draws data from both of the other databases, synthesized, of course, for the salesperson. That way, they could provide a simple dashboard style look at any given location that would compare existing hardware and software at a site to the requirement specified in the potential contract at that site and flash one of three or four statuses. Let's say, for the sake of discussion, that "green" means the location needs no modifications to work for THAT contract, "yellow" means minor upgrades or changes, while "red' means the costs of making the required changes would be too great for the potential revenue that would be generated by that contract (unless the customer is willing to pick up the tab).

With that tool in place, a salesperson (or more likely the staff tasked with data entry) can enter the location and the services requested on a potential contract and get a status back right away. They have the option at that point, to sign the contract if it's "green", renegotiate based on a "yellow" status, or inform the potential customer right away that they can't meet their request on "red". Great, saving time means the salesperson moves quicker and earns more. Everyone's happier. 

Unfortunately, that status is a calculated value and it is STORED ONLY in the salesperson's system . Not in the technical database and not in the existing accounts database (it's irrelevant to both of them). And of all of the people who use the system, the marketing and sales people are, perhaps, the least likely to understand why that's a problem for them.

In fact, that's where things went off the tracks one day. 

A salesperson negotiated a new contract with a potential customer moving into a new space. The proposed contract went back to the office where a data entry person entered it into their sales system. The location and services requested flashed "yellow", meaning it would be possible to accept the contract, but the location would need some modifications. The company was even willing to pay for those minor modifications in anticipation of the revenue the contract could be expected to generate. 

The salesperson goes back, explains the situation, gets a new agreement to delay installation until a crew could come and modify infrastructure. In this case, the customer would also have to accept a crew interrupting normal  operations. Agreed. That must be a good salesperson, right?

So, the company sends out a sub-contractor to make the modifications. They submit an invoice for the work done and get paid. 

Now, the data entry person tries to reenter the new contract, knowing the location now meets the "green" status. Uh, no. The SALESPERSON's system still shows "yellow". What? It seems that the subcontractor was paid, but there was no proof they actually DID the work. Either the infrastructure database was not updated properly, or the subcontractor scammed them. Hm.

So, another delay has to be negotiated with the potential customer while still another crew is sent out to inspect. And YES! the work was done. Good to go, except the keeper of the infrastructure database refuses to update it until they get the subcontractor's written summary of the work. Another delay while they contact the subcontractor to get the required paperwork. The subcontractor tells them it was already submitted along with the invoice, but they'll be happy to go out again to fill out the paperwork. After all, they can't be expected to blindly fill out the required document without an inspection by their own people. And, by the way, they gave the company the only copy of their document. Hm. Double-dipping, perhaps? Super cautious, maybe?

Anyway, the salesperson has a problem. Still another negotiated delay while the paperwork dance is working it's way through the bureaucracy, OR (sly wink), simply change the CALCULATED  value from "yellow" to "green", allowing the system to accept the contract so he can get paid. And that's what happened. All happy now, right?

Well, not quite yet. The salesperson sets a date for the final install, gets his commission check, and moves on to the next sale.

Not so fast. When the company's install crew arrives on site to do the final work, THEIR database still shows deficiencies. And they refuse to install. Everyone insists it's all good, but if you get in the habit of overriding your system, sooner or later you're going to be sorry. No install until they get the document that shows the required work was done.

The salesperson has no choice but to ask the potential customer to delay once more and allow another crew to come out and verify their first crew did their work. At this point, the customer said, "enough is enough" and the deal was cancelled.  The commission had to be clawed back in the salesperson's next check.

So, this story involves a number of problems, and the storage of a calculated value is just one. The subcontractor may, or may not, have been playing fair. The bright bulb who adjusted the status from "yellow" to "green" probably didn't even know the trap they had just set. A combination of rigid rules and the need for accountability combined to sabotage things.  But none of it would have happened except for that short-cut that allowed the calculated value to be stored--and updated manually.

Saturday, June 15, 2019

Your Proposed Relational Database Table Design is Inappropriate

During many years of participation at UtterAccess.com, I've had a chance to review many, many "First-Timer databases". After a promising beginning, these new comers run into a blocker like a query that doesn't return the right records, or a form that won't accept new records, or a report that doesn't work the way they want. They run into a puzzling or challenging problem which seems like it ought to be simple. Instead they get stuck and have to turn to online forums like ours for help. We actually love that, because it gives us a chance to help them clean things up and get off to a better start. But all too often, I have to tell new members, "You don't have a query problem. You have an inappropriate table design." Or "You don't have a report problem or a form problem.Your tables are not appropriately designed. Put the forms on hold while you fix your tables."

I admit "inappropriate table design" is a euphemism for the more blunt assessments I'm tempted to state. I favor it over "you have created non-normalized tables", or worse, "Your table design is wrong." Here's why.

Often, the limited information we get as part of the question reveals the underlying tables are not designed according to the Rules of Normalization, but that's not the question originally asked by the newcomer. Instead of offering SQL or VBA to work around the problem, we have to start out questioning the very basis of the database application itself--The tables.

Ouch. No one likes to hear that they broke the rules, especially when they didn't even know those rules exist. Moreover, Access readily, and even gleefully, aids and abets the creation of those inappropriate, non-normalized tables. Sometimes I can hear the newcomer asking their computer screen, "Who is this George Hepworth guy" they wonder, "to second-guess Microsoft?  They let me do it this way, so it can't be wrong, can it?"

Well, yes, there are many ways a table design can be wrong, and all too often it is wrong the first time. It's my responsibility, I think, to let the new comer know when their first step leads down the path to the swamp, not the garden.

Anyway, recently I've been wondering if maybe my approach is on the right level.

Am I too harsh in telling new members their query doesn't work because it's based on inappropriate, i.e. poorly designed, tables that need to be changed?

Or is calling it "an inappropriate table design" too subtle ? Should I be stating outright that the table design is wrong and doesn't make sense? Should the new comer have a dose of shock therapy the first day? Or maybe I should ignore the problems at first and try to offer a query that works with the flawed tables  in hopes we can fix the real problem later? Maybe this is really just a minor thing that doesn't matter all that much.

Time after time I have come to the same conclusion.

In the long run, ignoring flaws in a table design means the problems are only going to get bigger, more intractable and even become risky. Members at UtterAccess and other forums come up with all sorts of metaphors, usually trite and well-worn, to explain that concern.
  • It's like putting a band-aid on a broken leg. 
  • It's like asking your doctor to give you aspirin for a broken leg. 
  • It's like putting up walls on bare dirt without a solid foundation. 
 And on and on. They all boil down to the same thing: work-arounds don't work in the long run. In the short-run, sure, you can force Access to work with a poorly designed table architecture. The result is fragile, inflexible and inefficient. I like to point out that there's nothing heroic about creating massively complex queries or wads of code to compensate for an inadequate table design. Sort of like putting in flying buttresses on a garage wall to make up for leaving out the foundation.

As one of my favorite people used to say, "Let's face it. We have a problem.  I know what it is, and it's not a query problem. Your table design is inappropriate."

Let's fix that.

Wednesday, January 16, 2019

It's About Time

My YouTube Training Videos Line-Up is Growing.

I've been adding videos to the line up. I'm also getting better at it; I think. I published Numbers Five and Six yesterday.

So far, the series includes
  • An Overview of  Display Formatting options for date/time fields in Access tables. It emphasizes the point that the VALUE stored doesn't change regardless of the DISPLAY FORMAT applied to a date/time field.
    •  That video includes an introduction to Display Formatting for dates. The key take-away from that one is that ALL dates in Access are stored as decimals with both a date component and a time component. See the video for details.
  •  A More Detailed View of Display Formatting options for date/time fields in Access tables. It includes examples of date comparisons illustrating the underlying values are not always apparent from the display format applied. See the video for details.
  • Improving Appearance and Effectiveness of Forms by using custom display formats for dates and dates with times. The key take-away from that one is that custom display formats are an easy, effective way to make your Access forms better. See the video for details.
  • Effective Date and Time Filtering in Access Forms uses Date Display Formats along with basic date logic to filter records returned to Access forms.  See the video for details.
  •  Default Dates and Date Formatting & Grouping in Access Reports  carries the discussion on to Access reports. In this video we see how date logic and formatted dates can support grouping within an Access report. It also introduces simple date logic functions necessary to select Default Values for controls on Menus. See the video for details.
  • Date Formatting and Filtering in Access Reports Filter Criteria for filtering, grouping and sorting in Access reports. In this video I show you how to use TempVars as criteria in the query which provides the records to be displayed in an Access report. See the video for details.
I'd love to start getting feedback from readers and viewers.
  • Are the videos useful to beginners?
    • I realize that experienced Access users and developers should know much of this information already, but my target audience is really the beginners. Am I on target?
  •  Are the videos too long (or too short)?
    • How long should videos like this actually be? Is ten minutes enough? Twenty minutes?
  • What do you want to see next?
    • I will continue the current series on Dates and Date Logic because it is such a fascinating and extensive topic. Where would similar videos be helpful?
Thanks. Looking forward to hearing from you.