C. J. Date was the keynote speaker at this year’s Hotsos Symposium. The speech was entitled “Foundation Matters”. I couldn’t agree more. Fondations are very, very important. As I wrote at the beginning of part 1 (which, by the way, is entitled “Fondations”) of my book:
He who has not first laid his foundations may be able with great ability to lay them afterwards, but they will be laid with trouble to the architect and danger to the building.
—Niccoló Machiavelli, Il principe. 1532.
I found Chris’ speech really interesting. As a result, I decided to read his newest book: SQL and Relational Theory: How to Write Accurate SQL Code. Even if I must admit that it was not always an easy read…, I would advise every developer who uses SQL on a regular (daily) basis to read it.
There are only two things that disappointed me:
- Chris is known for being against NULLs in relational databases. Hence, I was expecting to have full coverage of this topic in such a book. Instead, there are only few pages in chapter 4 (No Duplicates, No Nulls) about it. And, sadly, the chapter ends with the following paragraph:
Of course, if nulls are prohibited, then missing information will have to be handled by some others means. Unfortunately, those other means are much too complex to be discussed in detail here. The SQL mechanism of (nonnull) default values can be used in simple cases; but for a more comprehensive examination of the issues involved, including in particular an explanation of how you can still get “don’t know” answers when you want them, even from a database without nulls, I’m afraid I’ll have to refer you to some of the publications listed in Appendix D.
- I did not understand why there is so much Tutorial D in the book. I mean, since this is a book for database practitioners, it makes not much sense, in my opinion, to spend so much time explaining its syntax and its features.
All in all, a very interesting reading!
I do agree that the basic foundation is the key for success. I was reading Dan Tow’s book and came across the classic quote “Well begun is half done”.
Interesting to read these comments, as I was one of the O’Reilly reviewers for this text book. Your comments on Tutorial D reflected mine exactly. Indeed, I had suggested to Chris that he cut out all that Tutorial D stuff, and write another book entitled ‘Tutorial D and Relational Theory’. He was not persuaded…
The issue of nulls is much more important. Its a pity that more space was not devoted to the subject, because, ultimately, if nulls are adopted within a relational model (and thus one departs from two-valued logic), the fundamental integrity of the database is compromised. With 2VL, one can demonstrate categorically the integrity of the results obtained from any query. Departing from 2VL, one cannot. Now transpose that uncertainty into a database system providing answers to life-critical systems, like designing aircraft. Need one say more?
Hi Peter
Thank you very much for your thoughts! It’s really nice to know that I’m not the only one feeling like that ;-)
Cheers,
Chris
He does give another example on p.239 on how nulls can be problematic
My problem is that I find them useful in some circumstances. With an effective end date, for example (slowly changing dimensions), is it better to allow nulls (any current record will have an undefined end date?), or to use a dummy value like 31-Dec-4000?
Jonathan Lewis seems to prefer allowing nulls, in p. 124 of CBO Fundamentals, but suggests histograms if nulls aren’t allowed, on p. 178, so seems to allow for avoiding nulls — but the implication is that nulls aren’t a bad thing (at least from a performance perspective).
Your thoughts? Would it be better to disallow nulls and just use the dummy value with a histogram?
Hi James
> Your thoughts? Would it be better to disallow nulls and just use the dummy value with a histogram?
IMHO there are many situations were NULLs are useful. In fact, I think that using “dummy values” like 21-Dec-4000 is really not much better. Hence, the only other possibility is to design the tables to avoid such cases. But, hey, how we would model one (or several) table(s) to store “first name”, “last name” and “middle name” (knowing that the latter might not be available)? Of course I can think of a comple of solutions… But, honestly, I don’t really like them.
In summary, I think that working without NULLs is not the way to go (at least with a database engine like Oracle). That said, the developers must be aware of the problems related to them.
Cheers,
Chris