Tuesday, September 25, 2018

Did I Mention I Have a Thing Against Lookup Fields in Tables?

Where do Lookup Fields in Tables Come From Anyway?

Whether you consider them to be a useful idea or not, lookup fields in Access tables came from somewhere and we have to deal with them. My candidate for their dubious origin is SharePoint, but I have only a hunch. I know they were already in Access 2003, but I no longer have earlier versions of Access available and can't recall anything prior to that. (I'd appreciate being set straight on this point, if anyone knows more about it.) Suffice it to say they made their way past the door wardens and are now firmly ensconced in MS Access tables, alongside the other dodgy characters cadging drinks from the regular patrons and whistling at the wait staff as they squeeze past to deliver the next round to the table behind.

We're not talking about combo boxes, or drop downs, on forms

I want to be clear before we get into this in depth.

We are talking about lookup fields in tables. Combo boxes on forms work almost exactly the same way in some regards. They can also be abused in some of the same ways. It's just that the more pernicious implementations I've seen were born in tables.

 What do Lookup Fields in Tables Do?

The basic concept of a lookup field sounds pretty good -- at first look. Make it easy for users to see the values, not the foreign keys, for fields. You have probably had the experience of opening a table and seeing columns of foreign key values in one or more fields. Trying to equate what you see there, e.g. CompanyID of 210, with the value to which that key value relates, i.e. "The Big Box Company", which is CompanyID 210 in tblCompany, is tedious at best.

By converting the foreign key field to a lookup field, you magically allow users to see "The Big Box Company" even though the table stores Foreign Key 210.
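What the lookup field displays is, in effect, the result of a join the user never sees. Here is a minimal sketch of that idea, using SQLite through Python as a stand-in for Access. The child table tblOrder and the column CompanyName are hypothetical names chosen for illustration; only tblCompany and CompanyID 210 come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCompany (
        CompanyID   INTEGER PRIMARY KEY,
        CompanyName TEXT NOT NULL          -- hypothetical column name
    );
    CREATE TABLE tblOrder (                -- hypothetical child table
        OrderID   INTEGER PRIMARY KEY,
        CompanyID INTEGER NOT NULL REFERENCES tblCompany (CompanyID)
    );
    INSERT INTO tblCompany VALUES (210, 'The Big Box Company');
    INSERT INTO tblOrder VALUES (1, 210);
""")

# What the child table actually stores: the foreign key.
stored = conn.execute(
    "SELECT CompanyID FROM tblOrder WHERE OrderID = 1"
).fetchone()[0]

# What a lookup field shows instead: the related value, found via a join.
displayed = conn.execute("""
    SELECT c.CompanyName
    FROM tblOrder AS o
    INNER JOIN tblCompany AS c ON c.CompanyID = o.CompanyID
    WHERE o.OrderID = 1
""").fetchone()[0]

print(stored, "is stored;", displayed, "is displayed")
```

The stored value never changes; only the presentation does, which is why the table and a query built on it can appear to disagree.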

Done right, and fully understood, they can be benign and even handy for a developer. So far, so good. But like so many of life's little problems, it doesn't end there.


 What Could POSSIBLY GO Wrong?

Actually, more than you might expect.

Here's an example of one of my least favorite attempts to use lookup fields in tables.

Storing the Value, not the Key. 

This is the most common mistake I see. The best way to get to the point is with some images. So, here's a typical Static Lookup Table. This table has a Primary Key and a Value field. Many Lookup Tables are like this.
Lookup Table of Months

It is a Static List; there are twelve months in a year. Many Lookup Tables are more dynamic. Consider a list of approved vendors, for example. Vendors are added and removed from the list as the organization's needs and priorities change. Such dynamic lists might have a number of other fields and be used in other ways, in their own right. In fact, many history tables can double as lookup tables. Here's another example, a query (based on a table with a Date field) in which all of the previously entered dates serve as a lookup list for events in a combo box on a form. You could do this in a lookup field in a table too, I suppose, but, please, please, please, don't.

SELECT DateValue([WD].[WorkDate]) AS WD
FROM tblWorkDetail AS WD
GROUP BY DateValue([WD].[WorkDate])
ORDER BY DateValue([WD].[WorkDate]) DESC;


tblWorkDetail contains several other fields, including the start and stop times for the work, a description of the work, and the WorkID itself to which that detail relates. To select all work by date completed for an invoice, this list needs only the relevant dates, grouped on Date. Selecting one date from the combo box adds all of the work for that date to an invoice. Double duty. But I digress. Back to lookup tables which have been infected with Lookup Fields.

Different considerations apply to static and dynamic lookup tables. With a Static list, like Months, it's fine to dispense with the surrogate key. I've seen many tables like this that work just fine.


Static Lookup Table with a Single field


Because it's static, you won't have to revisit the values in this table. You can also count on any of these values stored as a related field in a different table being stable. Many developers follow this practice with static lookups, despite the prevalence of Surrogates (i.e., AutoNumbers) as Primary Keys in most tables.

With a Dynamic list, like Vendors, the issues are different. What happens, for example, if you misspell a name as "The Doctar Company" instead of "The Doctor Company"? After a week of data entry someone finally notices the misspelling. By now there are multiple purchases made from the company. How hard is it going to be to correct the spelling mistake?

It depends.

If your Vendors Lookup table is like the first example, of course, all of those related records were stored in the purchase table under a surrogate key (i.e., the Primary Key of the Vendor table). No problem, then: correct the Value field in the vendor table and you're done. The related records are not impacted at all.
Surrogate Primary Key to Foreign Key 
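To make the surrogate key case concrete, here is a minimal sketch, again using SQLite through Python as a stand-in for Access. The table and column names (tblVendor, tblPurchase, VendorID, VendorName) are hypothetical. One UPDATE against the lookup table fixes the spelling everywhere, and the purchase records are never touched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblVendor (
        VendorID   INTEGER PRIMARY KEY,    -- surrogate key
        VendorName TEXT NOT NULL           -- the value users see
    );
    CREATE TABLE tblPurchase (
        PurchaseID INTEGER PRIMARY KEY,
        VendorID   INTEGER NOT NULL REFERENCES tblVendor (VendorID)
    );
    INSERT INTO tblVendor VALUES (1, 'The Doctar Company');
    INSERT INTO tblPurchase (VendorID) VALUES (1), (1), (1);
""")

# Correct the spelling in exactly one place: the lookup table.
conn.execute(
    "UPDATE tblVendor SET VendorName = 'The Doctor Company' WHERE VendorID = 1"
)

# The purchases still hold the same key...
purchase_keys = [row[0] for row in conn.execute("SELECT VendorID FROM tblPurchase")]

# ...and every one of them now resolves to the corrected name.
names = [row[0] for row in conn.execute("""
    SELECT v.VendorName
    FROM tblPurchase AS p
    INNER JOIN tblVendor AS v ON v.VendorID = p.VendorID
""")]
print(purchase_keys, names)
```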


If your Vendors table is like the second example, it gets a bit stickier. First, you may or may not have defined the parent-child relationship with Cascade Updates. If so, changing the spelling in the lookup table should cascade the correction to all of the related child records. Quite doable and safe enough, if not really ideal.



Cascade Update Saves The Day With a Single Value Lookup Table
If you neglected to check the Cascade Update Related Fields property, though, Access won't let you correct the spelling because changing the Primary Key would orphan any child table records with the original value as the Foreign Key. This is not an end of the world kind of problem for an experienced developer, but for the newcomer, who doesn't know how to work around the problem, it could be pretty upsetting. (There are a couple of ways around this problem, neither of which I want to show you. I'd rather you stayed on the straight and narrow to begin with, eh?) 
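The two outcomes can be sketched side by side. This is an illustration in SQLite through Python, not Access itself: SQLite's ON UPDATE CASCADE and ON UPDATE RESTRICT clauses stand in for checking or not checking Cascade Update Related Fields, and the table names and the helper try_rename are hypothetical.

```python
import sqlite3

def try_rename(on_update):
    """Rename a vendor whose name IS the primary key; report what happens."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only on request
    conn.executescript(f"""
        CREATE TABLE tblVendor (VendorName TEXT PRIMARY KEY);
        CREATE TABLE tblPurchase (
            PurchaseID INTEGER PRIMARY KEY,
            VendorName TEXT NOT NULL
                REFERENCES tblVendor (VendorName) ON UPDATE {on_update}
        );
        INSERT INTO tblVendor VALUES ('The Doctar Company');
        INSERT INTO tblPurchase (VendorName) VALUES ('The Doctar Company');
    """)
    try:
        conn.execute("UPDATE tblVendor SET VendorName = 'The Doctor Company'")
    except sqlite3.IntegrityError:
        return "blocked"                       # the change would orphan child rows
    # The update went through; return what the child row holds now.
    return conn.execute("SELECT VendorName FROM tblPurchase").fetchone()[0]

print(try_rename("CASCADE"))    # correction flows down to the child record
print(try_rename("RESTRICT"))   # correction is refused
```

Either way, the value is stored redundantly in every child row, which is the part that is "not really ideal".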

The long way round to the point, but here we are at last

With that background on the basics of relationships and lookup tables, let's go back to the one in our original table. It looks like this in design view.
Defining a Lookup Field in a Table

 And it looks like this in datasheet view.
Resulting Lookup List

The Foreign Keys for the Month names are hidden and the user sees only the Month names. Many newcomers are pleased by this, but unaware of the trap they just laid for themselves.

In fact, on the surface, there's no way to differentiate the two in datasheet view. Consider these four screenshots.
Datasheet View of Lookup fields in tables, closed and dropped
Can you spot the difference? I can't, not without looking at them in design view. (No fair using the table names as a clue.)

It's not surprising, then, that newcomers are often confused when they try to work with such tables in their queries. "Why do I see the Names when I look at the table, but when I use them in a query, I get the numbers instead?"


 So, now, here are some queries based on these tables.
Query Using Tables with Lookup Field based on Surrogate Key
 Access very helpfully includes the lookup field, hiding the fact that ParentTableLookupID is, in fact, a number. Not clear? Try this.
Valid Filtering on Lookup Field using Foreign Key

 Now try this.
Invalid Filtering on Lookup Field using Visible Value Field
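The same contrast can be sketched without the screenshots. This uses SQLite through Python as a stand-in, with hypothetical tables tblMonth and tblChild; only the field name ParentTableLookupID comes from the queries above. Note one difference in behavior: SQLite quietly returns no rows when a number field is compared to text, whereas Access, as far as I know, typically complains about a data type mismatch. The underlying problem is the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMonth (                  -- hypothetical lookup table
        MonthID   INTEGER PRIMARY KEY,
        MonthName TEXT NOT NULL
    );
    CREATE TABLE tblChild (                  -- hypothetical child table
        ChildID             INTEGER PRIMARY KEY,
        ParentTableLookupID INTEGER NOT NULL REFERENCES tblMonth (MonthID)
    );
    INSERT INTO tblMonth VALUES (1, 'January'), (2, 'February'), (3, 'March');
    INSERT INTO tblChild (ParentTableLookupID) VALUES (3), (3), (1);
""")

def count(sql):
    return conn.execute(sql).fetchone()[0]

# Filtering on the stored foreign key works.
by_key = count("SELECT COUNT(*) FROM tblChild WHERE ParentTableLookupID = 3")

# Filtering on the text the datasheet displays does not: the field holds numbers.
by_visible_text = count(
    "SELECT COUNT(*) FROM tblChild WHERE ParentTableLookupID = 'March'"
)

# To filter by name, join to the lookup table and filter on its value field.
by_join = count("""
    SELECT COUNT(*)
    FROM tblChild AS c
    INNER JOIN tblMonth AS m ON m.MonthID = c.ParentTableLookupID
    WHERE m.MonthName = 'March'
""")
print(by_key, by_visible_text, by_join)
```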

 Is it any wonder that so many newcomers flounder when they find that these sweet, sweet lookup fields in their tables are laden with hidden pitfalls? 

And it gets worse from here. I promise.

Come back later for a follow-up on other problems I've seen.




Thursday, September 6, 2018

Dry Cleaners, Canoes, and Pigs in a Cadillac


I have a handful of analogies that explain my database design philosophy. Some came from colleagues and mentors, some are my own. Here are three of my favorites.


Three Functions of a Dry Cleaners — Three Functions of a Database Application

A properly designed, three-tier Access database application bears a remarkable resemblance to a dry cleaners.

First, you need a pleasant, efficiently laid out, user-friendly interface. In a dry cleaners, that is the reception area and front counter. Customers are welcomed in by chrome, glass and potted plants. They interact with the counter person, dropping off new batches of dirty items and picking up clean ones. They usually get a receipt for their transaction, and use it again later to identify their dry cleaning when they come back to pick it up.

That's pretty much how an Access interface works too. It facilitates data entry and reporting, i.e. the interactions with the data in the back end.

In the back of the dry cleaners you find the equipment and storage tubs, baskets, and bins, and the noisy, dirty cleaning machines you never want your customers to see.

And that's exactly how the tables and queries in an Access database work as well. You never want users to have to see them, but nothing works without them.

And between them, you find a transitional area where each customer's items are grouped and sorted and moved from one container to another according to the rules established to manage it all.

In a well-designed Access database application, that's the job of the logic layer—the VBA and macros.

Yes, I know. It's not an ideal scaffold on which to hang the complexities of a properly designed database application. It is, nonetheless, a reasonably colorful picture of an Access database properly split into a Front End and Back End, along with the logic layer that makes it all work.

So, if you will, an Access database application has a lot in common with a dry cleaners.

Paddling vs Floating in a Canoe

I got the canoe analogy from a fellow MS Access MVP. It’s a good way to explain why it's so important to do things "the Access Way". One of the most common problems we run across with Access is the misguided application of Excel spreadsheet experience to relational databases like Access.

Access is remarkably flexible and forgiving. It’s possible, for better or worse, to make it perform amazingly complex feats using “spreadsheet style” tables and sticky wads of Macros or VBA. Things like Repeating Groups of fields in a table, or even multiple tables containing segmented data (e.g. “tblSales2017”, “tblSales2018”, “tblSales2019”) are not only possible, but even, with enough effort and ingenuity, quite workable.

As the saying goes, just because you can do something, that doesn't mean you should do it. And that leads to the analogy of a trip in a canoe.

If you go upstream in a canoe, against the current, you’ll spend all of your time paddling.

If you go downstream in a canoe, with the current, you only need the paddle to steer. 

Access, of course, is the canoe in this analogy and the development tools—tables, queries, VBA and reports—are the paddles.  If you want to work less, you’ll learn and follow the best design principles. Normalized relational tables, forms with subforms, and so on. Use them to steer, not to paddle against the current.

Pigs in a Cadillac

One of my favorite stories concerns pigs, Cadillacs and the surprising rarity of common sense. And not just in the design of Access database applications. This story goes back to the very start of my career with Access.

The original version of this story involved the wisdom of buying a Cadillac to transport pigs.

I was a member of a team tasked with evaluating software applications for a large financial enterprise. Two main contenders emerged in the search. One was a modest Windows-based package that came with a mid-five-figure license fee. The other one had, as my friend Armen likes to say, an additional zero on the right end of the price tag. One of the analysts responsible for the evaluation feared we were going to choose the Cadillac version, so she offered this little story to encourage the common sense choice.

Here's the story.

You have raised a herd of pigs which you need to get to market. It's time to acquire a vehicle to haul them there. A visit to the local auto dealer, though, presents a bit of a puzzle. On the dealer's lot you find two vehicles big enough to do the job. One is a used pickup truck with a stake bed suited almost perfectly to hauling farm animals. The other is a brand-new full size Cadillac Escalade with plenty of room for a handful of pigs—after a few modifications of course. The price tags of those two vehicle options also differ by a zero, as you probably already guessed.


So, the question for you: Do you want to have the prestige that goes with being able to haul pigs in the back of a brand-new Cadillac? Or should you humble yourself and buy a used pickup truck because it's better suited to the job (and cheaper to boot)?

Well, in that particular situation, the enterprise took ownership of a very nice Cadillac, and had it retrofitted with an appropriate pig holding enclosure, which came at a substantial additional premium over the original license fee 😁.

Unfortunately, in that case, common sense did not prevail.

Over the years, I've had more than one occasion to apply the moral of that story to other situations. Despite the temptation to haul my own pigs in a Cadillac, I have made a concerted effort to stick to the common sense choice as much as possible. It’s saved me a lot of embarrassment and effort, not to mention money.

Lately, answering questions on UtterAccess, I've been thinking about Pigs in a Cadillac a lot. It’s seductively easy to look for a clever way to write wads of code to do something that would be dead-simple, but boring, if you do it “the Access Way”. And that leads me to my final thought. There’s nothing heroic about writing wads of code to compensate for a poorly designed table schema or an elaborate interface.