I read a 45-tips-database-performance-tips-for-developers document from a famous commercial vendor of SQL tools today, and one tip confused me:
If possible, avoid NULL values in your database. If not, use the
appropriate IS NULL and IS NOT NULL code.
I like having NULL values because, to me, there is a difference between a value that was never set and one that is 0 or an empty string. Databases have NULL for a purpose.
So is this tip nonsense, or should I take action to avoid NULL values in my database tables altogether? Does having a NULL instead of a filled-in number or string value affect performance much?
Besides the reasons mentioned in other answers, we can look at NULLs from a different angle.
Regarding duplicate rows, Codd said
If something is true, saying it twice doesn’t make it any more true.
Similarly, you can say
If something is not known, saying it is unknown doesn't make it known.
Databases are used to record facts. The facts (truths) serve as axioms from which we can deduce other facts.
From this perspective, unknown things should not be recorded - they are not useful facts.
Anyway, anything that is not recorded is unknown. So why bother recording them?
Not to mention that their existence complicates deduction.
The NULL question is not simple... Every professional has a personal opinion about it.
Relational theory based on Two-Valued Logic (2VL: TRUE and FALSE) rejects NULL, and Chris Date is one of the staunchest opponents of NULLs. Ted Codd, by contrast, also accepted Three-Valued Logic (TRUE, FALSE and UNKNOWN).
Just a few things to note for Oracle:
Single column B*Tree Indexes don't contain NULL entries. So the Optimizer can't use an Index if you code "WHERE XXX IS NULL".
Oracle treats an empty string the same as NULL, so:
WHERE SOME_FIELD = ''
is the same as:
WHERE SOME_FIELD = NULL
and neither matches any row; you have to write WHERE SOME_FIELD IS NULL instead.
Moreover, with NULLs you must be careful in your queries, because every comparison with NULL yields NULL (UNKNOWN), never TRUE.
And sometimes NULLs are insidious. Think for a moment about a WHERE condition like the following:
WHERE SOME_FIELD NOT IN (SELECT C FROM SOME_TABLE)
If the subquery returns one or more NULLs, you get the empty recordset!
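The NOT IN trap can be reproduced in a few lines. This is only a sketch using Python's built-in sqlite3 module rather than Oracle; the table name main_table and the sample rows are made up for illustration. It also shows NOT EXISTS as one common way to sidestep the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- main_table and the sample data are hypothetical
    CREATE TABLE main_table (some_field INTEGER);
    CREATE TABLE some_table (c INTEGER);
    INSERT INTO main_table VALUES (1), (2), (3);
    INSERT INTO some_table VALUES (1), (NULL);
""")

# 2 NOT IN (1, NULL) evaluates to UNKNOWN, so no row qualifies.
not_in_rows = conn.execute(
    "SELECT some_field FROM main_table "
    "WHERE some_field NOT IN (SELECT c FROM some_table) "
    "ORDER BY some_field"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute(
    "SELECT some_field FROM main_table m "
    "WHERE NOT EXISTS (SELECT 1 FROM some_table s WHERE s.c = m.some_field) "
    "ORDER BY some_field"
).fetchall()

print(not_in_rows)      # []
print(not_exists_rows)  # [(2,), (3,)]
```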
These are just the first few cases that come to mind, but we could talk about NULLs for a long time...
It's usually good practice to avoid or minimise the use of nulls. Nulls cause some queries to return results that are "incorrect" (i.e. the results won't correspond with the intended meaning of the database). Unfortunately SQL and SQL-style databases can make nulls difficult, though not necessarily impossible, to avoid. It's a very real problem and even experts often have trouble spotting flaws in query logic caused by nulls.
Since there is nothing like nulls in the real world, using them means making some compromises in the way your database represents reality. In fact there is no single consistent "meaning" of nulls and little general agreement on what they are for. In practice, nulls get used to represent all sorts of different situations. If you do use them it's a good idea to document exactly what a null means for any given attribute.
Here's an excellent lecture about the "null problem" by Chris Date:
http://www.youtube.com/watch?v=kU-MXf2TsPE
There are various downsides to NULLs that can make them more difficult to use than actual values. For example:
In some cases they are not indexed.
They make join syntax more difficult.
They need special treatment for comparisons.
For string columns it might be appropriate to use "N/A" or "N/K" as a special value that helps distinguish between different classes of what could otherwise be NULL, but that's tricky to do for numerics or dates -- special values are generally tricky to use, and it may be better to add an extra column (e.g. for date_of_birth you might have a column called "reason_for_no_date_of_birth"), which can help the application be more useful.
For many cases where data values are genuinely unknown or not relevant they can be entirely appropriate of course -- date_of_death is a good example, or date_of_account_termination.
Sometimes even these examples can be rendered irrelevant by normalising events out to a different table, so you have a table for "ACCOUNT_DATES" with DATE_TYPES of "Open", "Close", etc.
I think using NULL values in the database is feasible as long as your application has proper logic to handle them, but according to this post there may be some problems, as discussed here:
http://databases.aspfaq.com/general/why-should-i-avoid-nulls-in-my-database.html
I'm still relatively new to database design, and I'm making a table with SQLite. I thought I was taught that it's best to use NULL in place of empty strings, so that's what I've been doing. I'm building an address table with the line:
CREATE TABLE addresses (
    addressID INTEGER PRIMARY KEY,
    officeName TEXT,
    address TEXT NOT NULL CHECK(address<>''),
    UNIQUE (officeName, address)
);
And adding addresses to the database (through PHP PDO) using the line
INSERT OR IGNORE INTO addresses (officeName,address) VALUES (?,?)
That line should check to see if the officeName/address is already in the database, and ignore it if it is, or add it if it isn't. "Address" is always a non-null string, but sometimes the officeName is blank. And if I make it NULL, it keeps getting added as if each NULL was distinct (it works fine if it's just an empty string). I did find this article saying that yes, NULLs are treated as distinct in a unique column. That now makes me wonder… should I always just use an empty string instead of NULL? Is there ever a case where it's "best practice" to use NULL instead? I thought it was always best practice, but now I'm thinking it might never be best practice.
NULL and the empty string are semantically different, just as NULL and 0 are semantically different.
NULL means "no value". In your case, that would be "no office name".
The empty string is a string value of zero length. In your case, that would be an office name that is the empty string.
Whether or not to use NULL or the empty string depends on the semantics of the situation, just like the decision of whether to use NULL or 0.
However, NULLs are a bit of a mess when it comes to comparison, IN, indexes, DISTINCT, and GROUP BY. Everyone seems to do things a little differently (FYI, this link doesn't cover SQL Server, which does it yet another way), so unfortunately, compromises are often made to accommodate particular desired behavior, depending on the DBMS.
In your case, you will have to use empty strings if you want to use the SQLite functionality you are interested in.
SQLite was originally coded in such a way that [NULLs are never distinct]. But the experiments run
on other SQL engines showed that none of them worked this way. So
SQLite was modified to work the same as Oracle, PostgreSQL, and DB2.
This involved making NULLs indistinct for the purposes of the SELECT
DISTINCT statement and for the UNION operator in a SELECT. NULLs are
still distinct in a UNIQUE column. This seems somewhat arbitrary, but
the desire to be compatible with other engines outweighed that
objection.
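The behaviour described in the question can be checked directly. The sketch below uses Python's sqlite3 module with the schema from the question; the two sample addresses are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE addresses (
        addressID INTEGER PRIMARY KEY,
        officeName TEXT,
        address TEXT NOT NULL CHECK(address <> ''),
        UNIQUE (officeName, address)
    )
""")

insert = "INSERT OR IGNORE INTO addresses (officeName, address) VALUES (?, ?)"
for _ in range(2):
    conn.execute(insert, (None, "1 Main St"))  # NULL office name (sample data)
    conn.execute(insert, ("", "2 Side St"))    # empty-string office name

null_count = conn.execute(
    "SELECT COUNT(*) FROM addresses WHERE officeName IS NULL"
).fetchone()[0]
empty_count = conn.execute(
    "SELECT COUNT(*) FROM addresses WHERE officeName = ''"
).fetchone()[0]

print(null_count)   # 2: each NULL counts as distinct, so nothing is ignored
print(empty_count)  # 1: the duplicate empty-string row was ignored
```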
Know, however, that INSERT OR IGNORE is unique to SQLite; for no other DBMS would you be asking about using that statement.
Best practice is to base your decision on what you mean: no value, or the value with no characters. (Of course, you may always choose to forgo best practice for your own personal reasons.)
The data in my database comes from an external source, and wherever no data is passed, I store NULL. Can anyone tell me if there are any implications of using NULL to represent an empty value?
Should I follow some other convention like 'data_not_available'? or something like that?
Can anyone suggest?
I've always used NULL. I've seen arguments saying NULL was a hack and should never have been put into mainstream use and it's now out of control, but I can't think of a better way to treat something as having "no value."
After all, how can you represent a number as having no value? 0 is a value. -1 is a value. -9999999 is a value.
Also foreign keys depend on a NULL value to signify there is no related record.
Wikipedia's entry on NULL in SQL is actually very informative.
NULL is acceptable, but can often indicate database design problems, where data should reside in a separate table with a FK.
Conceptually, NULL means "a missing unknown value" and it is treated somewhat differently from other values. For example, to test for NULL in MySQL, you cannot use the arithmetic comparison operators such as =, <, or <>.
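A quick way to see this for yourself is the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL (the comparison rules shown are the same standard SQL behaviour), and the readings table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table with one missing value
    CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO readings (value) VALUES (10), (NULL), (0);
""")

# "= NULL" is never true, so it matches nothing.
equals_null = conn.execute(
    "SELECT id FROM readings WHERE value = NULL"
).fetchall()

# IS NULL is the predicate that actually finds the missing value.
is_null = conn.execute(
    "SELECT id FROM readings WHERE value IS NULL"
).fetchall()

# Comparisons against NULL evaluate to NULL (None in Python).
comparison = conn.execute("SELECT NULL = NULL, NULL <> 1, NULL < 1").fetchone()

print(equals_null)  # []
print(is_null)      # [(2,)]
print(comparison)   # (None, None, None)
```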
Since you will have columns that may have "missing or unknown" values, you have to set them to accept NULL. On the other hand, a table with many NULL columns may indicate that it needs to be refactored into smaller tables that better describe the entities they represent.
Note that in general using a convention like 'data_not_available' is not recommended. Using NULLs is the convention, and your DBMS already knows about it.
I'm just stepping into a project and it has a fairly large database backend. I've started digging through this database and 95% of the fields are nullable.
Is this normal practice in the database world? I'm just a lowly programmer, not a DBA but I would think you would want to keep nullable fields to a minimum, only where they make sense.
Is it a "code smell" if most columns are nullable?
Default values are typically the exception and NULLs are the norm, in my experience.
True, nulls are annoying.
It's also extremely useful because null is the best indicator of "NO VALUE". A concrete default value is very misleading, and you can lose information or introduce confusion down the road.
Anyone who has developed a data entry application knows how common it is for some of the fields to be unknown at the time of entry -- even for columns that are business-critical, to address @Chris McCall's answer.
However, a "code smell" is merely an indicator that something might be coded in a sloppy way. You use smells to identify things that need more investigation, not necessarily things that must be changed.
So yes, if you see nullable columns so consistently, you're right to be suspicious. It might indicate that someone was being lazy, or afraid to declare NOT NULL columns unequivocally. You can justify doing your own analysis.
I'm of the Extreme NO camp: I avoid NULLs all the time. Putting aside fundamental considerations about what they actually mean (talk to different people and you'll get different answers, such as "no value", "unknown value", "missing", or "my ginger cat called Null"), the worst problem NULLs cause is that they often ruin your queries in mysterious ways.
I've lost count of the number of times I've had to debug someone's query (okay, maybe 9) and traced the problem to a join against a NULL. If your code needs ISNULL to repair joins then the chances are you've also lost index applicability and performance with it.
If you do have to store a "missing/unknown/null/cat" value (and it's something I prefer to avoid), it is better to be explicit about it.
Those skilled at NULLs may disagree. NULL use tends to split SQL crowds down the middle.
In my experience, heavy NULL use has been positively correlated with database abuse but I wouldn't carve this into stone tablets as some Law of Nature. My experience is just my experience.
EDIT: Additional thought. It is possible that those who are anti-null racists like myself are more excited by normalization than those who are pro-NULL. I don't think rabid normalizers would be too happy with ragged edges on their tables that can take NULLs. Lots of nulls may indicate that the database developers are not into heavy normalisation. So rather than NULLs suggesting the code is "bad", they may instead suggest the developers' philosophical position on normalisation. Maybe this is reaching. Just a thought.
Don't know if I consider it always a bad thing, but if the columns are being added because a single record (or maybe a few) need to have values while most don't, then it indicates a pretty flat table structure. If you're seeing column names like "addr1", "addr2", "addr3", then it stinks!
I would bet that most of the columns you have could be removed and represented in other tables. You could find the "non-null" ones through a foreign key relationship. This will increase the number of joins you'll be doing, but it could be more performant than doing a "where not col1 is null".
I think nullable columns should be avoided. Wherever the semantics of the domain make it possible to use a value that clearly indicates missing data, it should be used instead of NULL.
For instance, let's imagine a table that contains a Comment field. Most developers would place a NULL here to indicate that there's no data in the column. (And, hopefully, a check constraint that disallows zero-length strings so that we have a well-known "value" to indicate the lack of a value.) My approach is usually the opposite. The Comment column is NOT NULL and a zero-length string indicates the lack of a value. (I use a check constraint to ensure that the zero-length string is really a zero-length string, and not whitespace.)
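As a rough sketch of that approach (written against SQLite through Python's sqlite3 module, not the original poster's DBMS; the table name Foo is borrowed from the queries further down and the exact constraint wording is my assumption), the Comment column can be declared NOT NULL with a check that rejects whitespace-only strings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Foo (
        id INTEGER PRIMARY KEY,
        -- '' means "no comment"; whitespace-only strings are rejected
        Comment TEXT NOT NULL DEFAULT ''
            CHECK (Comment = '' OR TRIM(Comment) <> '')
    )
""")

conn.execute("INSERT INTO Foo (Comment) VALUES ('')")            # allowed
conn.execute("INSERT INTO Foo (Comment) VALUES ('looks fine')")  # allowed

rejected = []
for bad in (None, "   "):  # NULL and a whitespace-only string
    try:
        conn.execute("INSERT INTO Foo (Comment) VALUES (?)", (bad,))
    except sqlite3.IntegrityError:
        rejected.append(bad)

# With the constraint in place, one simple predicate finds every missing comment.
missing = conn.execute(
    "SELECT COUNT(*) FROM Foo WHERE Comment = ''"
).fetchone()[0]

print(rejected)  # [None, '   ']
print(missing)   # 1
```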
So, why would I do this? Two reasons:
NULLs require special logic in SQL, and this technique avoids that.
Many client-side libraries have special values to indicate NULL. For instance, if you use Microsoft's ADO.NET, the constant DBNull.Value indicates a NULL, and you have to test for that. Using a zero-length string on a NOT NULL column obviates the need.
Despite all of this, there are many circumstances in which NULLs are fine. In fact, I have no objection to their use in the scenario above, although it wouldn't be my preferred way.
Whatever you do, be kind to those who will use your tables. Be consistent. Allow them to SELECT with confidence. Let me explain what I mean by this. I recently worked on a project whose database was not designed by me. Nearly every column was nullable and had no constraints. There was no consistency about what represented the absence of a value. It could be NULL, a zero-length string, or even a bunch of spaces, and often was. (How that soup of values got there, I don't know.)
Imagine the ugly code a developer has to write to find all of those records with a missing Comment field in this scenario:
SELECT * FROM Foo WHERE LEN(ISNULL(Comment, '')) = 0
Amazingly there are developers who regard this as perfectly acceptable, even normal, despite possible performance implications. Better would be:
SELECT * FROM Foo WHERE Comment IS NULL
Or
SELECT * FROM Foo WHERE Comment = ''
If your table is properly designed, the above two SQL statements can be relied upon to produce quality data.
In short, I would say yes, this is probably a code smell.
Whether a column is nullable or not is very important and should be determined carefully. The question should be assessed for every column. I am not a believer in a single "best practices" default for NULL. The "best practice" for me is to address the nullability thoroughly during the design and/or refactoring of the table.
To start with, none of your primary key columns are going to be nullable. Then, I strongly lean towards NOT NULL for anything which is a foreign key.
Some other things I consider:
Criteria where NULL should be strongly avoided:
money columns - is there really a possibility that this amount will be unknown?
Criteria where NULL can be justified most frequently:
datetime columns - there are no reserved dates, so NULL is effectively your best option
Other data types:
char/varchar columns - for codes/identifiers - NOT NULL almost exclusively
int columns - mostly NOT NULL unless it's something like "number of children" where you want to distinguish an unknown response.
No, whether or not a field should be nullable is a data concept and can't be a code smell. Whether or not NULLs are annoying to code has nothing to do with the usefulness of having nullable data fields.
They are a (very common) smell, I'm afraid. Look up C.J. Date's writings on the topic.
As a best practice, if a column shouldn't be nullable, then it should be marked as such. However, I don't believe in going completely insane with things like this.
I think so. If you don't need the data, then it's not important to your business. If it is important to your business, it should be required.
This is all completely dependent on the scope and requirements of the project. I wouldn't use the number of nullable fields alone as a metric for poorly written or designed code. Have a look at the business domain: if there are many non-nullable fields represented there that are nullable in the database, then you have some issues.
In my experience, it is a problem when Null and Not Null don't match up to the required field /not required field.
It is in the realm of possibility that those really are all optional fields. If you find in the business tier or the UI tier that those fields are required, then I think this means the data model has drifted away from the business object model and is a sign of overly conservative DB change policies, or oversight.
If you run a sample data generator on your data, and then try to load the data that is valid according to SQL, you would find out right away if the rules match up.
That seems like a lot; it probably means you should at least investigate. Note that if this is a mature product with a lot of data, convincing anyone to change the structure may be difficult. The earlier in the design phase you catch something like this, the easier it is to fix up all the related code to adjust for the change.
Whether it is bad that they used nulls depends on whether the columns allowing nulls look as if they should be related tables (home phone, cell phone, business phone, etc., which should be in a separate phone table), or look like things that might not be applicable to all records (possibly could be a related table with a one-to-one relationship), or might not be known at the time of data entry (probably OK). I would also check whether they in fact always have a value (then you might be able to change them to NOT NULL if the information is genuinely required by the business logic). If you have a few records with null
In my experience, having a lot of nullable fields in a large database like yours is very normal, considering that it is perhaps used by a lot of applications written by different people. Making columns nullable is annoying, but it is perhaps the best way to keep the applications robust.
One of the many ways to map inheritance (e.g. c# objects) to a database is to create a table for the class at the top of the hierarchy, then add the columns for all the other classes. The columns have to be nullable for when an object of a different subclass is stored in the database. This is called Single-table inheritance mapping (or Map Hierarchy To A Single Table) and is a standard design pattern.
A side effect of Single-table inheritance mapping is that most columns are nullable.
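To make the pattern concrete, here is a minimal sketch in SQLite via Python's sqlite3 module. The Vehicle/Truck/Bus hierarchy and every column name are invented for illustration; the point is only that subclass-specific columns must accept NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table for a hypothetical Vehicle hierarchy (Truck, Bus)
    CREATE TABLE vehicles (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL,          -- discriminator: which subclass the row holds
        wheels INTEGER NOT NULL,     -- base-class attribute, always present
        cargo_capacity_kg INTEGER,   -- Truck only, NULL for other subclasses
        seat_count INTEGER           -- Bus only, NULL for other subclasses
    );
    INSERT INTO vehicles (kind, wheels, cargo_capacity_kg) VALUES ('Truck', 6, 12000);
    INSERT INTO vehicles (kind, wheels, seat_count) VALUES ('Bus', 4, 52);
""")

rows = conn.execute(
    "SELECT kind, cargo_capacity_kg, seat_count FROM vehicles ORDER BY id"
).fetchall()

print(rows)  # [('Truck', 12000, None), ('Bus', None, 52)]
```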
Also, in Oracle an empty string (zero length) is considered to be null, so in some companies all string columns are made nullable even on SQL Server. (Just because the first customer wants the software on SQL Server does not mean the second customer won't have an Oracle DBA who will not let SQL Server onto their network.)
To throw the opposite opinion out there: every single field in a database should be nullable. There is nothing more frustrating than working with a database that throws an exception about required this or required that on every single insert. Nothing should be required.
There is one exception to that, keys. Obviously all primary and foreign keys should be enforced to exist.
It should be the application's job to validate data and the database to simply store and retrieve what you give it. Having it process validation logic even as simple as null or not null makes a project way more complex to maintain for having different rules spread over everything.
As mentioned by others, front-facing data entry should allow omission of many fields. This is complicated by how people interpret the trinary nature of NULL (e.g. empty versus missing).
As such, I am only answering about one facet of database design: foreign keys.
In general, foreign keys do not suffer from the arbitrary nature of business logic, therefore seeing these columns allowing NULL is definitely a code smell.
For example, if you had a [Person] table, in no situation would you ever have a [Person].[FatherID] value that was NULL intentionally.
For a large database, an attempt to save NULL to such a column is likely to occur at some point due to the inevitability of bugs, which would have been brought to light much sooner by having a NOT NULL constraint. So for version 1 of a table, you should never allow nullable columns without justification.
But things get much trickier in an evolving code base, especially one that is staying online and thus requires migration scripting to upgrade. In particular, you may find nullable columns added to tables later on, because properly adding them as non-nullable can be quite hard depending on your integration process.
Furthermore, visual table designers (such as in SQL Server Management Studio and Visual Studio) default to allowing NULL so it could simply be a matter of inadequate code review.
I don't want to attempt a proper answer for flag (i.e. boolean) columns, but I strongly suggest considering how they can be implemented without allowing NULL, since I have usually found ways to avoid nullability even under the constraints of business logic.
I'm reading CJ Date's SQL and Relational Theory: How to Write Accurate SQL Code, and he makes the case that positional queries are bad — for example, this INSERT:
INSERT INTO t VALUES (1, 2, 3)
Instead, you should use attribute-based queries like this:
INSERT INTO t (one, two, three) VALUES (1, 2, 3)
Now, I understand that the first query is out of line with the relational model since tuples (rows) are unordered sets of attributes (columns). I'm having trouble understanding where the harm is in the first query. Can someone explain this to me?
The first query breaks pretty much any time the table schema changes. The second query accommodates any schema change that leaves its columns intact and doesn't add defaultless columns.
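Here is a small sketch of that breakage, using SQLite through Python's sqlite3 module. The table t and its columns come from the question; the added column four is a made-up schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (one INTEGER, two INTEGER, three INTEGER)")

# Both forms work against the original three-column schema.
conn.execute("INSERT INTO t VALUES (1, 2, 3)")
conn.execute("INSERT INTO t (one, two, three) VALUES (1, 2, 3)")

# Hypothetical schema change: a new column that has a default.
conn.execute("ALTER TABLE t ADD COLUMN four INTEGER DEFAULT 0")

try:
    conn.execute("INSERT INTO t VALUES (1, 2, 3)")  # positional: now 3 values for 4 columns
    positional_ok = True
except sqlite3.OperationalError:
    positional_ok = False

# The named-column form keeps working; "four" takes its default.
conn.execute("INSERT INTO t (one, two, three) VALUES (1, 2, 3)")
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print(positional_ok)  # False
print(row_count)      # 3
```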
People who do SELECT * queries and then rely on positional notation for extracting the values they're concerned about are software maintenance supervillains for the same reason.
While the order of columns is defined in the schema, it should generally not be regarded as important because it's not conceptually important.
Also, it means that anyone reading the first version has to consult the schema to find out what the values are meant to mean. Admittedly this is just like using positional arguments in most programming languages, but somehow SQL feels slightly different in this respect - I'd certainly understand the second version much more easily (assuming the column names are sensible).
I don't really care about theoretical concepts in this regard (as in practice, a table does have a defined column order). The primary reason I would prefer the second one to the first is an added layer of abstraction. You can modify columns in a table without screwing up your queries.
You should try to make your SQL queries depend on the exact layout of the table as little as possible.
The first query relies on the table only having three fields, and in that exact order. Any change at all to the table will break the query.
The second query only relies on there being those three fields in the table, and the order of the fields is irrelevant. You can change the order of fields in the table without breaking the query, and you can even add fields as long as they allow null values or have a default value.
Although you don't rearrange the table layout very often, adding more fields to a table is quite common.
Also, the second query is more readable. You can tell from the query itself what the values put in the record mean.
Something that hasn't been mentioned yet is that you will often be having a surrogate key as your PK, with auto_increment (or something similar) to assign a value. With the first one, you'd have to specify something there — but what value can you specify if it isn't to be used? NULL might be an option, but that doesn't really fit in considering the PK would be set to NOT NULL.
But apart from that, the whole "locked to a specific schema" is a much more important reason, IMO.
SQL gives you syntax for specifying the name of the column for both INSERT and SELECT statements. You should use this because:
Your queries are stable to changes in the column ordering, so that maintenance takes less work.
Column names map better to how people think, so it's more readable. It's clearer to think of a column as the "Name" column rather than the 2nd column.
I prefer to use the UPDATE-like syntax (a MySQL extension):
INSERT t SET one = 1 , two = 2 , three = 3
Which is far easier to read and maintain than both the examples.
Long term, if you add one more column to your table, your INSERT will not work unless you explicitly specify the list of columns. If someone changes the order of the columns, your INSERT may silently succeed while inserting values into the wrong columns.
I'm going to add one more thing: the second query is less prone to error to begin with, even before the table is changed. Why do I say that? Because with the second form you can (and should, when you write the query) visually check that the columns in the insert list and the data in the VALUES clause or SELECT clause are in fact in the right order. Otherwise you may end up putting the Social Security Number in the Honoraria field by accident and paying speakers their SSN instead of the amount they should make for a speech (example not chosen at random, except we did catch it before it actually happened thanks to that visual check!).
I know that logically, there are some cases where NULL values make sense in a DB schema, for example if some values plain haven't been specified. That said, working around DBNull in code tends to be a royal pain. For example, if I'm rendering a view, and I want to see a string, I would expect no value to be a blank string, not "Null", and I hate having to code around that scenario.
Additionally, it makes querying easier. Admittedly, you can do "foo is not null" very easily, but for junior SQL devs, it's counter intuitive to not be able to use "foo != null" (and yes, I know about options to turn off ANSI nulls, etc, but that's definitely NOT simpler, and I don't like working away from the standard).
What good reason is there for having/allowing nulls in a database schema?
The most significant reason for allowing NULLS is that there is no reasonable alternative. Logically, a NULL value represents "undefined". For lack of NULLS, you'll end up trying to specify a "dummy" value wherever the result is undefined, and then you'll have to account for said "dummy" value in ALL of your application logic.
I wrote a blog article on the reasons for including NULL values in your database. You can find it here. In short, I DO believe that NULL values are an integral part of database design, and should be used where appropriate.
C.J. Date in his book "SQL and Relational Theory" (2009: O'Reilly; ISBN 978-0-596-52306-0) takes a very strong stand against NULLs. He demonstrates that the presence of NULLs in SQL gives wrong answers to certain queries. (The argument does not apply to the relational model itself because the relational model does not allow NULLs.)
I'll try to summarize his example in words. He presents a table S with attributes SNO (Supplier Number) and City (City where supplier is located) and one row: (S1, London). Also a table P with attributes PNO (Part Number) and City (City where part is produced) and one row: (P1, NULL). Now he does the query "Get (SNO,PNO) pairs where either the supplier and part cities are different or the part city isn't Paris (or both)."
In the real world, P1 is produced in a city that either is or is not Paris, so the query should return (S1, P1) because the part city either is Paris or is not Paris. (The mere presence of P1 in table P means that the part has a city associated with it, even if unknown.) If it is Paris, then supplier and part cities are different. If it is not Paris, then the part city is not Paris. However, by the rules of three-valued logic, ('London' <> NULL) evaluates to UNKNOWN, (NULL <> 'Paris') evaluates to UNKNOWN, and UNKNOWN OR UNKNOWN reduces to UNKNOWN, which is not TRUE (and not FALSE either), and so the row isn't returned. The result of the query "SELECT S.SNO, P.PNO FROM S, P WHERE S.CITY <> P.CITY OR P.CITY <> 'Paris'" is an empty table, which is the wrong answer.
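The example is easy to replay. The sketch below runs the same tables and query in SQLite via Python's sqlite3 module (the book does not use SQLite; any engine with standard three-valued logic should behave the same way here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (SNO TEXT, CITY TEXT);
    CREATE TABLE P (PNO TEXT, CITY TEXT);
    INSERT INTO S VALUES ('S1', 'London');
    INSERT INTO P VALUES ('P1', NULL);   -- part city unknown
""")

pairs = conn.execute(
    "SELECT S.SNO, P.PNO FROM S, P "
    "WHERE S.CITY <> P.CITY OR P.CITY <> 'Paris'"
).fetchall()

# The WHERE condition itself: UNKNOWN OR UNKNOWN is UNKNOWN (None in Python).
truth = conn.execute(
    "SELECT ('London' <> NULL) OR (NULL <> 'Paris')"
).fetchone()[0]

print(pairs)  # []
print(truth)  # None
```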
I'm not an expert and not currently equipped to take the pro or con here. I do consider C.J. Date to be one of the foremost authorities on relational theory.
P.S. It is also true that you can use SQL as something other than a relational database. It can do many things.
What good reason is there for having/allowing nulls in a database schema?
From the theory's point of view, having a NULL means that the value is not defined for a column.
Use it wherever you need to say "I don't know / I don't care" to answer the question "What is the value of this column?"
And here are some tips from performance's point of view:
In Oracle, NULL's are not indexed. You can save the index space and speed up the queries by using NULL's for the values you don't need to index.
In Oracle, trailing NULL's occupy no space.
Unlike zeroes, NULL's can be safely divided by.
NULL's do contribute into COUNT(*), but don't contribute into COUNT(column)
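The COUNT difference in the last point is worth seeing once. This is a sketch in SQLite via Python's sqlite3 module, not Oracle, with a hypothetical orders table; the Oracle-specific storage and indexing points are not something it can demonstrate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table: two of three orders have not shipped yet
    CREATE TABLE orders (id INTEGER PRIMARY KEY, ship_date TEXT);
    INSERT INTO orders (ship_date) VALUES ('2024-01-05'), (NULL), (NULL);
""")

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
count_all, count_shipped = conn.execute(
    "SELECT COUNT(*), COUNT(ship_date) FROM orders"
).fetchone()

# Dividing by NULL raises no error; the result is simply NULL.
divided = conn.execute("SELECT 10 / NULL").fetchone()[0]

print(count_all, count_shipped)  # 3 1
print(divided)                   # None
```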
Nulls are good when your column can really have an unknown value which has no default.
We can't answer if your column applies to that rule.
For example, if you have an end date you might be tempted to put datetime.maxvalue in as the default instead of null. It's completely valid, but you have to take into account the reporting being done on that column and things like that.
In theory, there is no difference between theory and practice. In practice, there is.
In theory, you can design a database that never needs a NULL in it, because it's fully normalized. Whenever a value is to be omitted, the entire row containing it can be omitted, so there's no need for any NULL.
However, the extent of table decomposition you have to go through in order to get this result is just simply not worth the gain from the aspect of theoretical esthetics. It's often best to let some columns contain NULLS.
Good candidates for nullable columns are ones where, in addition to the data being optional, you are never using the column in a comparison condition in a WHERE or HAVING clause. Believe it or not, foreign keys often work OK with NULLS in them, to indicate an instance of a relationship that is not present. INNER JOINS will drop the NULLS out along with the rows that contain them.
When a value is often used in boolean conditions, it's best to design so that NULLS won't happen. Otherwise you are apt to end up with the mysterious result that, in SQL, the value of "NOT UNKNOWN" is "UNKNOWN". This has caused bugs for a number of people before you.
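The "NOT UNKNOWN" surprise looks like this in practice. A sketch with SQLite via Python's sqlite3 module; the accounts table and its active flag are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical nullable flag column
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, active INTEGER);
    INSERT INTO accounts (active) VALUES (1), (0), (NULL);
""")

active = conn.execute("SELECT id FROM accounts WHERE active").fetchall()
inactive = conn.execute("SELECT id FROM accounts WHERE NOT active").fetchall()
not_unknown = conn.execute("SELECT NOT NULL").fetchone()[0]

# Row 3 appears in neither result: NOT applied to NULL is still NULL.
print(active)       # [(1,)]
print(inactive)     # [(2,)]
print(not_unknown)  # None
```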
Generally, if you allow NULL for a column in a database, that NULL value has some separate meaning with regards to the structure of the database itself. For example, in the StackOverflow database schema, NULL for the ParentId or Tags column in the Post table indicates whether the post is a question or an answer. Just make sure that in each case, the meaning is well documented.
Now your particular complaint is about handling these values in client code. There are two ways to mitigate the issue:
Most cases with a meaning like the one described above should never come back to the client in the first place. Use the NULL in your queries to gather the correct results, but don't return the NULL column itself.
For the remaining cases, you can generally use functions like COALESCE() or ISNULL() to return something that's easier to process.
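For the second option, a minimal sketch using COALESCE() in SQLite via Python's sqlite3 module (the people table and nickname column are hypothetical): the client receives an empty string instead of a NULL it has to special-case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table with an optional text column
    CREATE TABLE people (id INTEGER PRIMARY KEY, nickname TEXT);
    INSERT INTO people (nickname) VALUES ('Bob'), (NULL);
""")

# COALESCE returns its first non-NULL argument.
names = conn.execute(
    "SELECT COALESCE(nickname, '') FROM people ORDER BY id"
).fetchall()

print(names)  # [('Bob',), ('',)]
```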
A null is useful whenever you need to specify that there is no value at all.
You could use a magic number instead, but it's more intuitive to handle nulls than to handle magic values, and it's easier to remember which value to handle. (Hm... was it -1 or 99999 or 999999 that was the magic value...?)
Also, magic values don't have any real magic; there is no fail-safe to keep you from using the value anyway. The computer doesn't know that you can't multiply 42 by -1 because -1 happens to be an unreasonable value in this situation, but it knows that you can't multiply 42 by null.
For a textual value an empty string can work as "no value", but there are some drawbacks even there. If you for example have three spaces in a field it's not always possible to visually distinguish from the empty string, but they are different values.
Nulls should and must be used anytime the information may not be available at the time the original data is entered (Example, ship date on an order).
Certainly there are situations where nulls may indicate the need to redesign (a table consisting of mostly null entries in most fields is probably not properly normalized; a field that contains all null values is probably not needed).
To not use nulls because your jr developers don't properly understand them indicates that you have a bigger problem than the nulls. Any developer who doesn't understand how to access data that includes nulls, needs to be given basic training in SQL. This is as silly as not using triggers to enforce data integrity rules because the devs forget to look at them when there is a problem or not using joins because the devs don't understand them or using select * because the devs are too lazy to add the field names.
In addition to the great reasons mentioned in other answers NULL can be very important for new releases of existing products.
Adding a new Nullable column to an already existing table has relatively low impact. Adding a new non-Nullable column is a much more involved process because of data migration. If you or your customers have lots of data the time and complexity of the migration can become a significant problem.
Reasons for having nulls
It's an accepted practice, and everyone who does database work knows how nulls function.
It clearly shows that there is an absence of a value.
For what it's worth, SQL-99 defines a predicate IS [NOT] DISTINCT FROM which returns true or false, even if the operands are NULL.
foo IS DISTINCT FROM 1234
Is equivalent to:
foo <> 1234 OR foo IS NULL
PostgreSQL, IBM DB2, and Firebird support IS DISTINCT FROM.
Oracle and Microsoft SQL Server don't (yet).
MySQL has their own operator <=>, which works like IS NOT DISTINCT FROM.
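For engines without the predicate, the expanded form above can be used directly. The sketch below runs it in SQLite via Python's sqlite3 module on a made-up table t, next to the plain <> comparison, to show which row the NULL-safe version adds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical sample data
    CREATE TABLE t (foo INTEGER);
    INSERT INTO t VALUES (1234), (5), (NULL);
""")

# Plain inequality silently drops the NULL row.
plain = conn.execute(
    "SELECT foo FROM t WHERE foo <> 1234 ORDER BY rowid"
).fetchall()

# The expanded form behaves like "foo IS DISTINCT FROM 1234".
null_safe = conn.execute(
    "SELECT foo FROM t WHERE foo <> 1234 OR foo IS NULL ORDER BY rowid"
).fetchall()

print(plain)      # [(5,)]
print(null_safe)  # [(5,), (None,)]
```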
A database is corrupt to the extent that it contains null.
There is NEVER a case where NULL makes sense logically. NULL is not a part of the relational model, and relational theory does not have such a concept as NULL.
NULL is "useful", in the sense that crappy DBMS's leave you no other choice but to use it, at the PHYSICAL level, which those very same crappy DBMS's themselves gravely confuse with the logical level, and more or less force their users to do the same.
I agree with most of the answers on here, but to phrase it a different way, "you can't have a value that means two things". It's just confusing. Does 0 actually mean 0? Or does it mean we don't know yet? Etc.
When an entity has no value for one of its attributes, we use a null value. A null value is not 0; it is the absence of a value. For example, most Korean names have no middle name. If a name is stored as first name, middle name and last name attributes, the middle name should be given the special value null.