SQL Server NULLABLE column vs SQL COUNT() function - sql

Could someone help me understand something? When I can, I usually avoid (*) in an SQL statement. Well, today was payback. Here is a scenario:
CREATE TABLE Tbl (Id INT IDENTITY(1, 1) PRIMARY KEY, Name NVARCHAR(16))
INSERT INTO Tbl VALUES (N'John')
INSERT INTO Tbl VALUES (N'Brett')
INSERT INTO Tbl VALUES (NULL)
I could count the number of records where Name is NULL as follows:
SELECT COUNT(*) FROM Tbl WHERE Name IS NULL
While avoiding the (*), I discovered that the following two statements give me two different results:
SELECT COUNT(Id) FROM Tbl WHERE Name IS NULL
SELECT COUNT(Name) FROM Tbl WHERE Name IS NULL
The first statement correctly returns 1 while the second statement yields 0. Why, and how?

That's because
The COUNT(column_name) function returns the number of values (NULL values will not be counted) of the specified column
So when you count Id you get the result you expected, while counting Name you do not; the answer returned by the query is nevertheless correct.

Everything is described in COUNT (Transact-SQL).
COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )
ALL - is default
COUNT(*) returns the number of items in a group. This includes NULL values and duplicates.
COUNT(ALL expression) evaluates expression for each row in a group and returns the number of nonnull values.

"COUNT(expression)" does not count NULL values. So basically:
SELECT COUNT(Id) FROM Tbl WHERE Name IS NULL
will return the number of rows where ("ID" IS NOT NULL) AND ("Name" IS NULL); result is "1"
While:
SELECT COUNT(Name) FROM Tbl WHERE Name IS NULL
will count the rows where ("Name" IS NOT NULL) AND ("Name" IS NULL); result will always be 0
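The difference between the three forms of COUNT can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, chosen so the example runs anywhere), and the helper name count_variants is hypothetical. SQLite's AUTOINCREMENT column replaces IDENTITY(1, 1) here.

```python
import sqlite3


def count_variants():
    """Return (COUNT(*), COUNT(Id), COUNT(Name)) over the rows where Name IS NULL."""
    conn = sqlite3.connect(":memory:")
    # SQLite stand-in for: Id INT IDENTITY(1, 1) PRIMARY KEY, Name NVARCHAR(16)
    conn.execute("CREATE TABLE Tbl (Id INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT)")
    conn.executemany(
        "INSERT INTO Tbl (Name) VALUES (?)",
        [("John",), ("Brett",), (None,)],  # None is stored as NULL
    )
    row = conn.execute(
        "SELECT COUNT(*), COUNT(Id), COUNT(Name) FROM Tbl WHERE Name IS NULL"
    ).fetchone()
    conn.close()
    return row


# COUNT(*) counts rows, COUNT(Id) counts non-NULL Ids, COUNT(Name) counts non-NULL Names.
print(count_variants())
```

Only COUNT(Name) comes back as 0, because the WHERE clause leaves exactly the rows in which Name has no value to count.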

As has been said, COUNT(column_name) doesn't count NULL values.
If you don't want to use COUNT(*) then use COUNT(1), but you will not actually see a difference in performance.

"Always avoid using *" is one of those blanket statements that people blindly follow. If you knew the reasons why you were avoiding * then you would know that none of those reasons apply when doing count(*).

The * in COUNT(*) is not the same * in SELECT * FROM...
SELECT COUNT(*) FROM T; very specifically means the cardinality of the table expression T.
SELECT COUNT(1) FROM T; will generate the same results as COUNT(*) but if the contents of the parentheses is not * then it must be parsed.
SELECT COUNT(c) FROM T; where c is a nullable column in table T will count the non-null values.
P.S. I'm comfortable with using SELECT * FROM... in the right circumstances.
P.P.S. Your 'table' has no key: consider INSERT INTO Tbl VALUES ('John', 'John', 'John', NULL, NULL, NULL); it would be allowed, but the results would be nonsense.

Related

How is the below query evaluated by the compiler?

I was trying to create alternate queries to get the required result and came up with the code below. It gives Robert as the result; I want to understand how the compiler evaluates it.
CREATE TABLE NAMES(Id integer PRIMARY KEY, Name text);
/* Create few records in this table */
INSERT INTO NAMES VALUES(1,'Tom');
INSERT INTO NAMES VALUES(2,'Lucy');
INSERT INTO NAMES VALUES(3,'Frank');
INSERT INTO NAMES VALUES(4,'Jane');
INSERT INTO NAMES VALUES(5,'Robert');
COMMIT;
/* Display all the records from the table */
SELECT * FROM NAMES a where 0 = (select count(Name) from NAMES b where b.Id > a.Id);
For each row in names it is checked whether the number of rows with a higher ID is 0. Only in that case is the row selected. This means only the row with the highest ID gets selected.
It is unnecessary to count, though. More typical would be a NOT EXISTS query:
SELECT *
FROM names
WHERE NOT EXISTS
(
SELECT NULL
FROM names name_with_higher_id
WHERE name_with_higher_id.id > names.id
);
Or a MAX query:
SELECT *
FROM names
WHERE id = (SELECT max(id) FROM names);
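To see that the three formulations above select the same row, they can be run side by side. The sketch below assumes SQLite through Python's sqlite3 module rather than the asker's database, and the names QUERIES and run_all are hypothetical.

```python
import sqlite3

# The asker's COUNT query and the two alternatives suggested above.
QUERIES = {
    "count": (
        "SELECT Name FROM NAMES a "
        "WHERE 0 = (SELECT COUNT(Name) FROM NAMES b WHERE b.Id > a.Id)"
    ),
    "not_exists": (
        "SELECT Name FROM NAMES a "
        "WHERE NOT EXISTS (SELECT NULL FROM NAMES b WHERE b.Id > a.Id)"
    ),
    "max": "SELECT Name FROM NAMES WHERE Id = (SELECT MAX(Id) FROM NAMES)",
}


def run_all():
    """Run every query against the sample data and return {label: rows}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE NAMES (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.executemany(
        "INSERT INTO NAMES VALUES (?, ?)",
        [(1, "Tom"), (2, "Lucy"), (3, "Frank"), (4, "Jane"), (5, "Robert")],
    )
    results = {label: conn.execute(sql).fetchall() for label, sql in QUERIES.items()}
    conn.close()
    return results


for label, rows in run_all().items():
    print(label, rows)
```

Each query keeps only the row for which no row with a higher Id exists, which is the row with Id 5.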

SQL Select-IN query

I have a numeric column named id in my table.
I want to select the rows which have an id in 1,2,3 and the ones which have NULL in it.
I don't want to use a query like:
SELECT * FROM MYTABLE WHERE ID IN (1,2,3) OR ID IS NULL
Can I use something like :
SELECT * FROM MYTABLE WHERE ID IN (1,2,3,null)
Is this possible? The above query returns me the same result as for
SELECT * FROM MYTABLE WHERE ID IN (1,2,3)
Short answer? No. You must use the IS NULL predicate. IN compares with =, and any = comparison involving NULL (even NULL = NULL) evaluates to UNKNOWN rather than TRUE, so IN (..., NULL) never matches a NULL id.
If you are using Oracle, this may be a solution (provided -1 never occurs as a real ID):
SELECT * FROM MYTABLE WHERE NVL(ID,-1) IN (1,2,3,-1)
You must use:
SELECT * FROM MYTABLE WHERE ID IN (1,2,3) OR ID IS NULL
NULL always requires the special handling of IS NULL.
In SQL Server (provided 0 never occurs as a real ID):
SELECT * FROM MYTABLE WHERE isnull(ID,0) IN (1,2,3,0)
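The behaviour of IN with a NULL in the list can be reproduced with a short script. This sketch assumes SQLite via Python's sqlite3 module, and uses the standard COALESCE in place of the vendor-specific NVL and isnull; the helper ids_matching and the sample rows are hypothetical. The WHERE clause is interpolated into the SQL text purely for illustration and should not be done with untrusted input.

```python
import sqlite3


def ids_matching(where_clause):
    """Return the ID values of MYTABLE rows that satisfy the given WHERE clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MYTABLE (ID INTEGER)")
    conn.executemany(
        "INSERT INTO MYTABLE VALUES (?)",
        [(1,), (2,), (3,), (4,), (None,)],  # None is stored as NULL
    )
    # Illustration only: never build SQL from untrusted strings like this.
    rows = conn.execute(f"SELECT ID FROM MYTABLE WHERE {where_clause}").fetchall()
    conn.close()
    return [row[0] for row in rows]


# NULL inside the IN list never matches, so the NULL row is missing here.
print(ids_matching("ID IN (1,2,3,NULL)"))
# The explicit IS NULL predicate brings the NULL row back.
print(ids_matching("ID IN (1,2,3) OR ID IS NULL"))
# Sentinel variant: only safe if -1 can never be a real ID.
print(ids_matching("COALESCE(ID, -1) IN (1,2,3,-1)"))
```

The sentinel form is shorter to type, but it silently matches any real row whose ID equals the sentinel, which is why the explicit OR ID IS NULL is usually the safer choice.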

Return id if a row exists, INSERT otherwise

I'm writing a function in node.js to query a PostgreSQL table.
If the row exists, I want to return the id column from the row.
If it doesn't exist, I want to insert it and return the id (insert into ... returning id).
I've been trying variations of case and if else statements and can't seem to get it to work.
A solution in a single SQL statement. Requires PostgreSQL 9.1 or later though, since it relies on a data-modifying CTE.
Consider the following demo:
Test setup:
CREATE TEMP TABLE tbl (
id serial PRIMARY KEY
,txt text UNIQUE -- obviously there is unique column (or set of columns)
);
INSERT INTO tbl(txt) VALUES ('one'), ('two');
INSERT / SELECT command:
WITH v AS (SELECT 'three'::text AS txt)
,s AS (SELECT id FROM tbl JOIN v USING (txt))
,i AS (
INSERT INTO tbl (txt)
SELECT txt
FROM v
WHERE NOT EXISTS (SELECT * FROM s)
RETURNING id
)
SELECT id, 'i'::text AS src FROM i
UNION ALL
SELECT id, 's' FROM s;
The first CTE v is not strictly necessary, but it means you have to enter your values only once.
The second CTE s selects the id from tbl if the "row" exists.
The third CTE i inserts the "row" into tbl if (and only if) it does not exist, returning id.
The final SELECT returns the id. I added a column src indicating the "source" - whether the "row" pre-existed and id comes from a SELECT, or the "row" was new and so is the id.
This version should be as fast as possible as it does not need an additional SELECT from tbl and uses the CTEs instead.
To make this safe against possible race conditions in a multi-user environment, and for updated techniques using the new UPSERT in Postgres 9.5 or later, see:
Is SELECT or INSERT in a function prone to race conditions?
I would suggest doing the checking on the database side and just returning the id to nodejs.
Example:
CREATE OR REPLACE FUNCTION foo(p_param1 tableFoo.attr1%TYPE, p_param2 tableFoo.attr2%TYPE) RETURNS tableFoo.id%TYPE AS $$
DECLARE
    v_id tableFoo.id%TYPE;
BEGIN
    -- look for an existing row first
    SELECT id
    INTO v_id
    FROM tableFoo
    WHERE attr1 = p_param1
    AND attr2 = p_param2;
    -- no match: insert the row and capture the generated id
    IF v_id IS NULL THEN
        INSERT INTO tableFoo(id, attr1, attr2) VALUES (DEFAULT, p_param1, p_param2)
        RETURNING id INTO v_id;
    END IF;
    RETURN v_id;
END;
$$ LANGUAGE plpgsql;
And then on the Node.js side (I'm using node-postgres in this example):
var pg = require('pg');
pg.connect('someConnectionString', function(connErr, client){
//do some errorchecking here
// foo() returns a single value, so alias the result column as id
client.query('SELECT foo($1, $2) AS id;', ['foo', 'bar'], function(queryErr, result){
//errorchecking
var id = result.rows[0].id;
});
});
Something like this, if you are on PostgreSQL 9.1 or later:
with test_insert as (
insert into foo (id, col1, col2)
select 42, 'Foo', 'Bar'
where not exists (select * from foo where id = 42)
returning foo.id, foo.col1, foo.col2
)
select id, col1, col2
from test_insert
union
select id, col1, col2
from foo
where id = 42;
It's a bit longish and you need to repeat the id to test for several times, but I can't think of a different solution that involves a single SQL statement.
If a row with id=42 exists, the writeable CTE will not insert anything and thus the existing row will be returned by the second union part.
When testing this I actually thought the new row would be returned twice (therefore a union, not a union all), but it turns out that the second select statement does not see the newly inserted row: all parts of the statement are evaluated against the same snapshot, so the effect of the writeable CTE is visible only through its "returning" part. So in case a new row is inserted, it will be taken from the "returning" part.
create table t (
id serial primary key,
a integer
)
;
insert into t (a)
select 2
from (
select count(*) as s
from t
where a = 2
) s
where s.s = 0
;
select id
from t
where a = 2
;
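The two statements above (a conditional insert followed by a plain select) can be wrapped in a small helper. The sketch below assumes SQLite through Python's sqlite3 module instead of PostgreSQL and Node.js, with AUTOINCREMENT standing in for serial; the names make_db and get_or_create_id are hypothetical. Like the SQL it wraps, it makes no attempt to guard against concurrent writers.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the table t from the answer above."""
    conn = sqlite3.connect(":memory:")
    # SQLite stand-in for: id serial primary key, a integer
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, a INTEGER)")
    return conn


def get_or_create_id(conn, a):
    """Insert a row with the given value only if none exists, then return its id."""
    conn.execute(
        """
        INSERT INTO t (a)
        SELECT ?
        FROM (SELECT COUNT(*) AS s FROM t WHERE a = ?) AS c
        WHERE c.s = 0
        """,
        (a, a),
    )
    return conn.execute("SELECT id FROM t WHERE a = ?", (a,)).fetchone()[0]


conn = make_db()
first = get_or_create_id(conn, 2)   # no row yet, so one is inserted
second = get_or_create_id(conn, 2)  # row exists, so nothing is inserted
print(first == second)
```

Calling the helper twice with the same value returns the same id and leaves a single row in the table, because the second insert selects from an empty result.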

Count(*) with 0 for boolean field

Let's say I have a boolean field in a database table and I want to get a tally of how many are 1 and how many are 0. Currently I am doing:
SELECT 'yes' AS result, COUNT( * ) AS num
FROM `table`
WHERE field = 1
UNION
SELECT 'no' AS result, COUNT( * ) AS num
FROM `table`
WHERE field = 0;
Is there an easier way to get the result so that even if there are no false values I will still get:
----------
|yes | 3 |
|no | 0 |
----------
One way would be to outer join onto a lookup table. So, create a lookup table that maps field values to names:
create table field_lookup (
field int,
description varchar(3)
)
and populate it
insert into field_lookup values (0, 'no')
insert into field_lookup values (1, 'yes')
now the next bit depends on your SQL vendor, the following has some Sybase (or SQL Server) specific bits (the outer join syntax and isnull to convert nulls to zero):
select description, isnull(num,0)
from (select field, count(*) num from `table` group by field) d, field_lookup fl
where d.field =* fl.field
You are on the right track, but the first answer will not be correct. Here is a solution that will give you Yes and No even if there is no "No" in the table:
SELECT 'Yes', (SELECT COUNT(*) FROM Tablename WHERE Field <> 0)
UNION ALL
SELECT 'No', (SELECT COUNT(*) FROM tablename WHERE Field = 0)
Be aware that I've checked Yes as <> 0 because some front-end systems that use SQL Server as a backend server use -1 and 1 as yes.
This will result in two columns:
SELECT SUM(field) AS yes, COUNT(*) - SUM(field) AS no FROM table
Because there aren't any existing values for false, if you want to see a summary value for it - you need to LEFT JOIN to a table or derived table/inline view that does. Assuming there's no TYPE_CODES table to lookup the values, use:
SELECT x.desc_value AS result,
COALESCE(COUNT(t.field), 0) AS num
FROM (SELECT 1 AS value, 'yes' AS desc_value
UNION ALL
SELECT 0, 'no') x
LEFT JOIN TABLE t ON t.field = x.value
GROUP BY x.desc_value
SELECT COUNT(*) count, field FROM table GROUP BY field;
Not exactly same output format, but it's the same data you get back.
If one of them has none, you won't get that row back, but that should be easy enough to check for in your code.
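The outer-join idea from the answers above, where a small derived table supplies both labels so that a missing value still shows up with a count of 0, can be sketched as follows. It assumes SQLite via Python's sqlite3 module; the table name flags and the helper tally are hypothetical, and false is taken to be stored as 0 as in the question.

```python
import sqlite3


def tally(values):
    """Return [('yes', n_true), ('no', n_false)] for a list of 0/1 field values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flags (field INTEGER)")
    conn.executemany("INSERT INTO flags VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT x.result, COUNT(f.field) AS num
        FROM (SELECT 1 AS value, 'yes' AS result
              UNION ALL
              SELECT 0, 'no') AS x
        LEFT JOIN flags AS f ON f.field = x.value
        GROUP BY x.value, x.result
        ORDER BY x.value DESC
        """
    ).fetchall()
    conn.close()
    return rows


# Three true values and no false ones: the 'no' row is still reported.
print(tally([1, 1, 1]))
```

COUNT(f.field) rather than COUNT(*) matters here: an unmatched label still produces one joined row, but its f.field is NULL and therefore not counted.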

PostgreSQL: Select a single-row x amount of times

A single row in a table has a column with an integer value >= 1 and must be selected however many times the column says. So if the column had '2', I'd like the select query to return the single-row 2 times.
How can this be accomplished?
Don't know why you would want to do such a thing, but...
CREATE TABLE testy (a int,b text);
INSERT INTO testy VALUES (3,'test');
SELECT testy.*,generate_series(1,a) from testy; --returns 3 rows
You could make a table that is just full of numbers, like this:
CREATE TABLE numbers
(
num INT NOT NULL
, CONSTRAINT numbers_pk PRIMARY KEY (num)
);
and populate it with as many numbers as you need, starting from one:
INSERT INTO numbers VALUES(1);
INSERT INTO numbers VALUES(2);
INSERT INTO numbers VALUES(3);
...
Then, if you had the table "mydata" whose rows have to be repeated based on the column "repeat_count", you would query it like so:
SELECT mydata.*
FROM mydata
JOIN numbers
ON numbers.num <= mydata.repeat_count
WHERE ...
Of course you need to know the maximum repeat count up front, and have your numbers table go that high.
No idea why you would want to do this though. Care to share?
You can do it with a recursive query; check out the examples in the PostgreSQL docs.
Something like:
WITH RECURSIVE t(cnt, id, field2, field3) AS (
SELECT 1, id, field2, field3
FROM foo
UNION ALL
SELECT t.cnt+1, t.id, t.field2, t.field3
FROM t, foo f
WHERE t.id = f.id and t.cnt < f.repeat_cnt
)
SELECT id, field2, field3 FROM t;
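The recursive query above can be tried without a PostgreSQL server. This sketch assumes SQLite through Python's sqlite3 module, which also supports WITH RECURSIVE; the column label and the helper repeat_rows are hypothetical and stand in for field2 and field3.

```python
import sqlite3


def repeat_rows(rows):
    """Return each (id, label) pair as many times as its repeat_cnt says."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE foo (id INTEGER PRIMARY KEY, label TEXT, repeat_cnt INTEGER)"
    )
    conn.executemany("INSERT INTO foo VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE t(cnt, id, label) AS (
            -- first copy of every row
            SELECT 1, id, label FROM foo
            UNION ALL
            -- one more copy while the counter is below repeat_cnt
            SELECT t.cnt + 1, t.id, t.label
            FROM t JOIN foo f ON f.id = t.id
            WHERE t.cnt < f.repeat_cnt
        )
        SELECT id, label FROM t ORDER BY id, cnt
        """
    ).fetchall()
    conn.close()
    return result


print(repeat_rows([(1, "a", 2), (2, "b", 3)]))
```

The base case emits every row once regardless of its counter, so this matches the question's assumption that the column is always >= 1.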
The simplest way is making a simple select, like this:
SELECT generate_series(1,{xTimes}), a.field1, a.field2 FROM my_table a;