I have a question about NULL in databases. I recall my mentor telling me that if a database is indexed and we put the NULL keyword in a field, the index is no longer used when the database is searched, and the search becomes sequential, which affects performance significantly (if I am wrong about this, please correct me). I am currently working with a DB2 database, and I notice the string 'NULL' stored instead of the keyword NULL. I tried searching the data with the string 'NULL' and then with NULL, and the results are different.
I wanted to know the difference between the string 'null' and the NULL keyword in SQL (DB2).
Is one better than the other?
Does the string 'NULL' affect indexing? This goes back to the concept I mentioned above.
Thanks
I wanted to know the difference between the String 'null' vs NULL keyword in SQL (DB2)?
'NULL' is not NULL; it is a string with the value 'NULL', whereas NULL signifies the absence of a value. So if you write a condition like
where column = null -- evaluates to unknown, so no rows are returned
where column = 'null' -- returns all rows that contain the string 'null'
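To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for DB2. The table t and column col are hypothetical, and other engines may differ in details, but the comparison rules shown are standard SQL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "NULL"), (2, None), (3, "abc")],  # Python None is stored as SQL NULL
)

# "= NULL" evaluates to unknown for every row, so nothing matches.
eq_null = conn.execute("SELECT id FROM t WHERE col = NULL").fetchall()

# Comparing with the string 'NULL' matches only the row holding that text.
eq_string = conn.execute("SELECT id FROM t WHERE col = 'NULL'").fetchall()

# IS NULL is the predicate that finds rows with no value.
is_null = conn.execute("SELECT id FROM t WHERE col IS NULL").fetchall()

print(eq_null, eq_string, is_null)
```

The three queries return different row sets, which is why searching with 'NULL' and with NULL gives different results.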
is one better than the other?
It depends on your requirements. However, storing NULL is not the same as storing the string 'NULL'.
Does string 'NULL' affect indexing, this goes back to the concept i mentioned above?
I don't think 'NULL' affects performance. You can cross-check it via the stats which you get. See here:
As far as an index is concerned, a NULL is treated like any other value. (Yeah, I know NULL is really the absence of a value, but index manager failed the basic relational concepts course ;-)
Given your stats, you won't be able to create a unique index on the columns since a unique index will only allow one NULL. With a non-unique index, all the NULLs will be gathered into a single index key (with lots of associated RIDs).
I have a database in which data is imported from another table. Where the source data was empty, it became NULL on import. When I query for names that do not start with 'a', I expect all records whose name doesn't start with 'a', including those with a NULL or empty name. The query returns the empty records but not the NULL ones, and I need the NULL fields as well. I am using Hibernate and SQL Server 2005. How can I achieve this? Please help.
Thanks
Null and Empty are different things.
When you say "Retrieve all the entries that do not start with a" it means that it will retrieve all the entries with something that is not a. Null is not something. Null is nothing. Empty is something.
You should modify your query to add OR IS NULL, to retrieve also the null fields.
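As a rough sketch of that change (written against SQLite through Python's sqlite3 module rather than SQL Server or Hibernate, with a hypothetical person table), the extra OR ... IS NULL branch brings the NULL rows back:

```python
import sqlite3

# Hypothetical table; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?)",
    [("alice",), ("bob",), ("",), (None,)],
)

# NOT LIKE alone: the NULL row is dropped because NULL NOT LIKE 'a%' is unknown.
without_nulls = conn.execute(
    "SELECT name FROM person WHERE name NOT LIKE 'a%' ORDER BY rowid"
).fetchall()

# Adding OR name IS NULL keeps the NULL row as well.
with_nulls = conn.execute(
    "SELECT name FROM person "
    "WHERE name NOT LIKE 'a%' OR name IS NULL ORDER BY rowid"
).fetchall()

print(without_nulls)
print(with_nulls)
```

The empty string passes the NOT LIKE test on its own because it is a real value; only the NULL row needs the extra condition.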
From Wiki:
Null is a special marker used in Structured Query Language (SQL) to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model...
...Since Null is not a member of any data domain, it is not considered a "value", but rather a marker (or placeholder) indicating the absence of value. Because of this, comparisons with Null can never result in either True or False, but always in a third logical result, Unknown.
Check out this discussion.
I have a table with several columns.
Sometimes some of these column fields may be empty (ie. I won't use them in some cases).
My questions:
Would it be smart to set them to NULL in phpmyadmin?
What does the "NULL" property actually do?
Would I gain anything at all by setting them to NULL?
Is it possible to use a NULL field the same way even though it is set to null?
The concept of the NULL value is a common source of confusion for newcomers to SQL, who often think that NULL is the same as an empty string '', or a value of zero.
This is not the case. Conceptually, NULL means "a missing unknown value" and it is treated somewhat differently from other values. For example, to test for NULL, you cannot use the arithmetic comparison operators such as =, <, or <>.
If you have columns that may contain "a missing unknown value", you have to set them to accept NULLs.
On the other hand, a table with many NULL columns may be indicating that this table needs to be refactored into smaller tables that better describe the entities they represent.
I recommend you read Problems with NULL Values.
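A quick way to see why the comparison operators are no use for testing NULL is to evaluate them directly. This sketch uses SQLite through Python's sqlite3 module as an assumption; MySQL evaluates these expressions to NULL in the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each comparison against NULL yields NULL (unknown), returned to Python as None.
# Only IS NULL gives a definite answer (1 means true in SQLite).
row = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, NULL < 1, NULL IS NULL"
).fetchone()

print(row)
```

Because a WHERE clause keeps only rows for which the condition is true, a condition that evaluates to unknown filters the row out.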
1- Would it be smart to set them to NULL in phpmyadmin?
Nullable fields are NULL by default unless you specify a default value for them or insert some value into them. No need to do this...
2- What does the "NULL" property actually do?
Null means that you have not assigned any value to it.
3- Would I gain anything at all by setting them to NULL?
As said before, nullable fields are NULL by default unless you specify a default value for them or insert some value into them. I don't think you are going to gain anything.
4- Is it possible to use a NULL field the same way even though it is set to null?
What would you gain out of a field having a value of NULL? No need for this either.
Going to try to answer your questions all at once here.
NULL represents something along the lines of "Unknown"/"No value" or "Not applicable". So yes, if there are columns that are unused in certain circumstances, it would be appropriate to set them to NULL when not used (as no other value is appropriate).
It is possible to constrain a column to NOT NULL, meaning that the column must have a value for each row. An example would be the "name" column of a "person" record. It doesn't make sense for a name to be NULL, as everybody has a name.
You can "use" a NULL column, just keep in mind you have to be careful when doing comparisons. A NULL field is never equal to another field. Check for "IS NULL" or "IS NOT NULL".
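Here is a small sketch of the NOT NULL constraint in action, using SQLite via Python's sqlite3 module as a stand-in for MySQL. The person table and its nickname column are hypothetical, and the exact error raised will differ between engines and drivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# name must always have a value; nickname is allowed to be NULL.
conn.execute("CREATE TABLE person (name TEXT NOT NULL, nickname TEXT)")

# A NULL in the nullable column is accepted.
conn.execute("INSERT INTO person (name, nickname) VALUES ('Ann', NULL)")

# A NULL in the NOT NULL column is rejected by the database itself.
try:
    conn.execute("INSERT INTO person (name, nickname) VALUES (NULL, 'x')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, row_count)
```

Only the first insert succeeds, so the table ends up with a single row.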
Brief answers to your questions:
Yes, NULL means that the field contains nothing at all. If that's the true state of affairs, that's what the data should say. An example would be the shipped_date for an order which has not yet shipped. In this case, NULL would accurately represent the value until the order ships out, since until it does there isn't a valid time at which it did (and in this case, checking for the NULL value might be quite a valuable tool in determining which orders do still need to be shipped).
NULL means that the field contains nothing. "Nothing" is different from, say, the value 0 or the string "", as these are values. NULL means roughly the same thing as "N/A" or "I decline to answer". What exactly it would mean is context dependent on the column. Of course, some columns should never be NULL, and you can enforce that with your table design.
If most of the fields in a column are NULL, you should rethink exactly how you're using that column. Generally speaking, a large number of NULL values indicates you could design your tables better. As to defaulting, you can always set a nullable value to default to NULL.
The same way as what? NULL is a special marker. It's not equivalent to 0, or "", or anything else like that. In a query, you must check for IS NULL or IS NOT NULL, and if a null is pulled into a dataset, you must check for it specifically there too. Asking whether a column set to NULL is equal to 0, or "", or what have you, will never evaluate to true.
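The shipped_date case from the first point could look roughly like this. It is a sketch with a hypothetical orders table and made-up dates, run on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# shipped_date stays NULL until the order actually ships (hypothetical schema).
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, shipped_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-05"), (2, None), (3, None)],
)

def pending_orders(connection):
    """Return the ids of orders that have not shipped yet."""
    rows = connection.execute(
        "SELECT id FROM orders WHERE shipped_date IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]

pending_before = pending_orders(conn)

# Shipping an order replaces the NULL with a real date.
conn.execute("UPDATE orders SET shipped_date = '2024-01-09' WHERE id = 2")
pending_after = pending_orders(conn)

print(pending_before, pending_after)
```

Here the NULL carries real information: it is exactly the set of orders that still need to go out.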
Now sometimes some of these column fields may be empty (ie. I wont use them in some cases).
Would it be smart to set them to NULL in phpmyadmin?
Yes, that's what it's for.
What does the "NULL" property actually do?
It makes the database allow NULL as a value stored in the column. "NOT NULL" means a column must have a value that is not NULL.
Would I gain anything at all by setting them to NULL?
No. If your logic requires that a column never contains NULL as a value, it's better to set it to "NOT NULL". Think of it as an assertion: it is safe to assume the column value will never be NULL, so you don't have to test for it. The database takes care of that assertion.
Is it possible to use a NULL field the same way even though it is set to null?
I'm not sure what you mean by that... Anyway, NULL and NOT NULL columns are identical in every way, except that NULL columns can contain NULL.
And NULL is a strange value. val = NULL is never true, even if val is NULL. For that you have to test with "IsNull()", "IS NULL" or "IS NOT NULL". See Reference Manual: Comparison Functions and Operators.
For some time I've been debating whether columns for which I don't know if data will be passed in should be set to an empty string ('') or should just allow NULL.
I would like to hear what the recommended practice is here.
If it makes a difference, I'm using C# as the consuming application.
I'm afraid that...
it depends!
There is no single answer to this question.
As indicated in other responses, at the level of SQL, NULL and the empty string have very different semantics: the former indicates that the value is unknown, while the latter indicates that the value is this "invisible thing" (in displays and reports) but is nonetheless a "known value". An example commonly given in this context is that of the middle name. A NULL in the "middle_name" column would indicate that we do not know whether the underlying person has a middle name or not, and if so what this name is; an empty string would indicate that we "know" that this person does not have a middle name.
This said, two other kinds of factors may help you choose between these options, for a given column.
The very semantics of the underlying data, at the level of the application.
Some considerations in the way SQL works with null values
Data semantics
For example, it is important to know whether the empty string is a valid value for the underlying data. If that is the case, we may lose information if we also use the empty string for "unknown info". Another consideration is whether some alternate value may be used when we do not have info for the column; maybe 'n/a' or 'unspecified' or 'tbd' are better values.
SQL behavior and utilities
Considering SQL behavior, the choice of using or not using NULL, may be driven by space consideration, by the desire to create a filtered index, or also by the convenience of the COALESCE() function (which can be emulated with CASE statements, but in a more verbose fashion). Another consideration is whether any query may attempt to query multiple columns to append them (as in SELECT name + ', ' + middle_name AS LongName etc.).
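To illustrate the concatenation and COALESCE() points, here is a sketch on SQLite via Python's sqlite3 module. Note the assumptions: SQLite spells string concatenation as || rather than the + used in the SELECT above, and the person table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, middle_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", "Lee"), ("Bob", None), ("Cy", "")],
)

# Concatenating a NULL makes the whole result NULL.
raw = [r[0] for r in conn.execute(
    "SELECT name || ', ' || middle_name FROM person ORDER BY rowid"
)]

# COALESCE substitutes a fallback for NULL, so the name survives.
safe = [r[0] for r in conn.execute(
    "SELECT name || ', ' || COALESCE(middle_name, '') FROM person ORDER BY rowid"
)]

print(raw)
print(safe)
```

With an empty string stored, the concatenation works without any wrapping; with NULL stored, every such query needs the COALESCE() (or an equivalent CASE).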
Beyond the validity of the choice of NULL vs. empty string in a given situation, a general consideration is to try to be as consistent as possible, i.e. to stick to ONE particular way, and to depart from it only purposely and explicitly, for good reasons and in few cases.
Don't use empty string if there is no value. If you need to know if a value is unknown, have a flag for it. But 9 times out of 10, if the information is not provided, it's unknown, and that's fine.
NULL means unknown value. An empty string means a known value - a string with length zero. These are totally different things.
empty when I want a valid default value that may or may not be changed, for example, a user's middle name.
NULL when it is an error if the ensuing code does not set the value explicitly.
However, by initializing strings with the Empty value instead of null, you can reduce the chances of a NullReferenceException occurring.
Theory aside, I tend to view:
Empty string as a known value
NULL as unknown
In this case, I'd probably use NULL.
One important thing is to be consistent: mixing NULLs and empty strings will end in tears.
On a practical implementation level, an empty string takes 2 bytes in SQL Server, whereas NULLs are bitmapped. In some conditions and for wide or large tables it makes a difference in performance because there is more data to shift around.
Due to a weird request, I can't put NULL in a database when there is no value. I'm wondering what I can put in the stored procedure for nothing instead of NULL.
For example:
insert into blah (blah1) values (null)
Is there something like nothing or empty for "blah1" instead using null?
I would push back on this bizarre request. That's exactly what NULL is for in SQL, to denote a missing or inapplicable value in a column.
Is the requester experiencing grief over SQL logic with NULL?
edit: Okay, I've read your reply with the extra detail about this job assignment (btw, generally you should edit your original question instead of posting more information in an answer).
You'll have to declare all columns as NOT NULL and designate a special value in the domain of that column's data type to signify "no value." The appropriate value to choose might be different on a case by case basis, i.e. zero may signify nothing in a person_age column, but it might have significance in an items_in_stock column.
You should document the no-value value for each column. But I suppose they don't believe in documentation either. :-(
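One thing to watch for with a special "no value" value is that the database treats it as real data. The sketch below assumes a hypothetical sentinel of -1 for an age column and uses SQLite via Python's sqlite3 module; it compares averaging over a nullable column with averaging over a sentinel-filled one.

```python
import sqlite3

SENTINEL = -1  # hypothetical "no value" marker chosen for this example

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_nullable (age INTEGER)")
conn.execute("CREATE TABLE person_sentinel (age INTEGER NOT NULL)")

ages = [30, 40, None]  # the third person's age is not known
conn.executemany("INSERT INTO person_nullable VALUES (?)", [(a,) for a in ages])
conn.executemany(
    "INSERT INTO person_sentinel VALUES (?)",
    [(SENTINEL if a is None else a,) for a in ages],
)

# Aggregates skip NULLs automatically.
avg_nullable = conn.execute("SELECT AVG(age) FROM person_nullable").fetchone()[0]

# The sentinel is counted as a real age unless every query filters it out.
avg_naive = conn.execute("SELECT AVG(age) FROM person_sentinel").fetchone()[0]
avg_filtered = conn.execute(
    "SELECT AVG(age) FROM person_sentinel WHERE age <> ?", (SENTINEL,)
).fetchone()[0]

print(avg_nullable, avg_naive, avg_filtered)
```

So if the sentinel route is forced on you, every aggregate and comparison has to know about the sentinel, which is one more reason to document it per column.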
Depends on the data type of the column. For numbers (integers, etc) it could be zero (0) but if varchar then it can be an empty string ("").
I agree with other responses that NULL is best suited for this because it transcends all data types denoting the absence of a value. Therefore, zero and empty string might serve as a workaround/hack but they are fundamentally still actual values themselves that might have business domain meaning other than "not a value".
(If only the SQL language supported a "Not Applicable" (N/A) value type that would serve as an alternative to NULL...)
Is null a valid value for whatever you're storing?
Use a sentry value like INT32.MaxValue, empty string, or "XXXXXXXXXX" and assume it will never be a legitimate value
Add a bit column 'Exists' that you populate with true at the same time you insert.
Edit: But yeah, I'll agree with the other answers that trying to change the requirements might be better than trying to solve the problem.
If you're using a varchar or equivalent field, then use the empty string.
If you're using a numeric field such as int then you'll have to force the user to enter data, else come up with a value that means NULL.
I don't envy you your situation.
There's a difference between NULLs as assigned values (e.g. inserted into a column) and NULLs as a SQL artifact (as for a field in a missing record for an OUTER JOIN), which might be a foreign concept to these users. Lots of people use Access, or any database, just to maintain single-table lists. I wouldn't be surprised if naive users would prefer to use an alternative for assignments; and though repugnant, it should work OK. Just let them use whatever they want.
There is some validity to the requirement to not use NULL values. NULL values can cause a lot of headache when they are in a field that will be included in a JOIN or a WHERE clause or in a field that will be aggregated.
Some SQL implementations restrict where NULLable fields may be used. MSSQL, for example, does not allow NULLable columns in a primary key, although they can be included in ordinary indexes.
MSSQL especially behaves in unexpected ways when NULL is evaluated for equality. Does a NULL value in a PaymentDue field mean the same as zero when we search for records that are up to date? What if we have names in a table and somebody has no middle name. It is conceivable that either an empty string or a NULL could be stored, but how do we then get a comprehensive list of people that have no middle name?
In general I prefer to avoid NULL values. If you cannot represent what you want to store using either a number (including zero) or a string (including the empty string as mentioned before) then you should probably look closer into what you are trying to store. Perhaps you are trying to communicate more than one piece of data in a single field.
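On the middle-name question above, when both representations have crept into the same column, the comprehensive list needs both tests. This is a sketch with a hypothetical person table on SQLite via Python's sqlite3 module; the same predicates are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, middle_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", "Lee"), ("Bob", None), ("Cy", "")],
)

def names_where(condition):
    """Run a fixed query with the given WHERE condition (trusted literal only)."""
    sql = f"SELECT name FROM person WHERE {condition} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]

only_null = names_where("middle_name IS NULL")
only_empty = names_where("middle_name = ''")
no_middle_name = names_where("middle_name IS NULL OR middle_name = ''")

print(only_null, only_empty, no_middle_name)
```

Each single test finds only half of the people, which is the practical cost of mixing the two conventions in one column.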
I know that Oracle does consider '' as NULL, but that doesn't do much to tell me why this is the case. As I understand the SQL specifications, '' is not the same as NULL -- one is a valid datum, and the other is indicating the absence of that same information.
Feel free to speculate, but please indicate if that's the case. If there's anyone from Oracle who can comment on it, that'd be fantastic!
I believe the answer is that Oracle is very, very old.
Back in the olden days before there was a SQL standard, Oracle made the design decision that empty strings in VARCHAR/VARCHAR2 columns were NULL and that there was only one sense of NULL (there are relational theorists that would differentiate between data that has never been prompted for, data where the answer exists but is not known by the user, data where there is no answer, etc. all of which constitute some sense of NULL).
By the time the SQL standard came around and agreed that NULL and the empty string were distinct entities, there were already Oracle users with code that assumed the two were equivalent. So Oracle was basically left with the options of breaking existing code, violating the SQL standard, or introducing some sort of initialization parameter that would change the functionality of a potentially large number of queries. Violating the SQL standard (IMHO) was the least disruptive of these three options.
Oracle has left open the possibility that the VARCHAR data type would change in a future release to adhere to the SQL standard (which is why everyone uses VARCHAR2 in Oracle since that data type's behavior is guaranteed to remain the same going forward).
Tom Kyte VP of Oracle:
A ZERO length varchar is treated as NULL.
'' is not treated as NULL.
'' when assigned to a char(1) becomes ' ' (char types are blank padded strings).
'' when assigned to a varchar2(1) becomes '' which is a zero length string and a zero length string is NULL in Oracle (it is no longer '')
Oracle documentation alerts developers to this problem, going back at least as far as version 7.
Oracle chose to represent NULLS by the "impossible value" technique. For example, a NULL in a numeric location will be stored as "minus zero", an impossible value. Any minus zeroes that result from computations will be converted to positive zero before being stored.
Oracle also chose, erroneously, to consider the VARCHAR string of length zero (the empty string) to be an impossible value, and a suitable choice for representing NULL. It turns out that the empty string is far from an impossible value. It's even the identity under the operation of string concatenation!
Oracle documentation warns database designers and developers that some future version of Oracle might break this association between the empty string and NULL, and break any code that depends on that association.
There are techniques to flag NULLS other than impossible values, but Oracle didn't use them.
(I'm using the word "location" above to mean the intersection of a row and a column.)
I suspect this makes a lot more sense if you think of Oracle the way earlier developers probably did -- as a glorified backend for a data entry system. Every field in the database corresponded to a field in a form that a data entry operator saw on his screen. If the operator didn't type anything into a field, whether that's "birthdate" or "address" then the data for that field is "unknown". There's no way for an operator to indicate that someone's address is really an empty string, and that doesn't really make much sense anyways.
According to official 11g docs
Oracle Database currently treats a character value with a length of zero as null. However, this may not continue to be true in future releases, and Oracle recommends that you do not treat empty strings the same as nulls.
Possible reasons
val IS NOT NULL is more readable than val != ''
No need to check both conditions val != '' and val IS NOT NULL
Empty string is the same as NULL simply because it's the "lesser evil" when compared to the situation where the two (empty string and null) are not the same.
In languages where NULL and empty String are not the same, one has to always check both conditions.
Example from book
set serveroutput on;
DECLARE
  empty_varchar2 VARCHAR2(10) := '';
  empty_char CHAR(10) := '';
BEGIN
  IF empty_varchar2 IS NULL THEN
    DBMS_OUTPUT.PUT_LINE('empty_varchar2 is NULL');
  END IF;
  IF '' IS NULL THEN
    DBMS_OUTPUT.PUT_LINE(''''' is NULL');
  END IF;
  IF empty_char IS NULL THEN
    DBMS_OUTPUT.PUT_LINE('empty_char is NULL');
  ELSIF empty_char IS NOT NULL THEN
    DBMS_OUTPUT.PUT_LINE('empty_char is NOT NULL');
  END IF;
END;
/
Because not treating it as NULL isn't particularly helpful, either.
If you make a mistake in this area on Oracle, you usually notice right away. In SQL Server, however, it will appear to work, and the problem only appears when someone enters an empty string instead of NULL (perhaps from a .NET client library, where null is different from "", but you usually treat them the same).
I'm not saying Oracle is right, but it seems to me that both ways are approximately equally bad.
Indeed, I have had nothing but difficulties in dealing with Oracle, including invalid datetime values (cannot be printed, converted or anything, just looked at with the DUMP() function) which are allowed to be inserted into the database, apparently through some buggy version of the client as a binary column! So much for protecting database integrity!
Oracle handling of NULLs links:
http://digitalbush.com/2007/10/27/oracle-9i-null-behavior/
http://jeffkemponoracle.com/2006/02/empty-string-andor-null.html
First of all, null and null string were not always treated as the same by Oracle. A null string is, by definition, a string containing no characters. This is not at all the same as a null. NULL is, by definition, the absence of data.
Five or six years or so ago, null string was treated differently from null by Oracle. While, like null, null string was equal to everything and different from everything (which I think is fine for null, but totally WRONG for null string), at least length(null string) would return 0, as it should since null string is a string of zero length.
Currently in Oracle, length(null) returns null which I guess is O.K., but length(null string) also returns null which is totally WRONG.
I do not understand why they decided to start treating these 2 distinct "values" the same. They mean different things and the programmer should have the capability of acting on each in different ways. The fact that they have changed their methodology tells me that they really don't have a clue as to how these values should be treated.
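For contrast, here is how an engine that keeps the two apart behaves. This is a sketch on SQLite through Python's sqlite3 module, not Oracle, so it only shows the "distinct values" side of the argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# In SQLite the empty string is a real value: it is not NULL and has length 0,
# while the length of NULL is itself NULL (None in Python).
empty_is_null, empty_length, null_length = conn.execute(
    "SELECT '' IS NULL, length(''), length(NULL)"
).fetchone()

print(empty_is_null, empty_length, null_length)
```

This matches the behaviour described above as the sensible one: length of the null string is 0, and only length(NULL) is NULL.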