T-SQL 2005: Counting All Rows and Rows Meeting Criteria - sql

Here's the scenario:
I have a table with 3 columns: 'KeyColumn', 'SubKeyColumn' and 'BooleanColumn', where the first two together form the table's primary key.
For my query, I'd like to count the number of rows there are for any given value in 'KeyColumn', and I'd also like to know how many of those rows have a value of true for 'BooleanColumn'. My initial thought was to create a query like this:
SELECT
COUNT(*)
,COUNT(CASE WHEN BooleanColumn = 1 THEN 1 ELSE 0 END)
FROM
MyTable
GROUP BY
KeyColumn
However, the 2nd part does not work (I'm not entirely sure why I thought it would to begin with). Is it possible to do something like this in one query? Or am I going to need to do multiple queries to make this happen?

Change COUNT to SUM in the 2nd part. ;)

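... CASE WHEN BooleanColumn = 1 THEN 1 ELSE NULL END ...
COUNT counts only the non-NULL values, so with ELSE 0 every row is still counted. Returning NULL for the rows you want to skip fixes that.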

You could also do SUM(CAST(BooleanColumn AS TINYINT))
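Putting the answers above together, a minimal sketch on a throwaway table variable might look like this. The table variable @MyTable and its sample rows are assumptions made up for illustration; only the column names come from the question. The INSERT uses SELECT ... UNION ALL because multi-row VALUES lists are not available on SQL Server 2005.

```sql
-- Hypothetical sample table; only the column names come from the question.
DECLARE @MyTable TABLE (
    KeyColumn     int NOT NULL,
    SubKeyColumn  int NOT NULL,
    BooleanColumn bit NOT NULL,
    PRIMARY KEY (KeyColumn, SubKeyColumn)
);

INSERT INTO @MyTable (KeyColumn, SubKeyColumn, BooleanColumn)
SELECT 1, 1, 1 UNION ALL
SELECT 1, 2, 0 UNION ALL
SELECT 1, 3, 1 UNION ALL
SELECT 2, 1, 0;

SELECT
    KeyColumn,
    COUNT(*) AS TotalRows,
    -- SUM adds 1 for each true row and 0 for every other row.
    SUM(CASE WHEN BooleanColumn = 1 THEN 1 ELSE 0 END) AS TrueRowsBySum,
    -- COUNT skips NULLs, so only the true rows are counted.
    COUNT(CASE WHEN BooleanColumn = 1 THEN 1 END) AS TrueRowsByCount,
    -- bit cannot be summed directly, hence the cast.
    SUM(CAST(BooleanColumn AS tinyint)) AS TrueRowsByCast
FROM @MyTable
GROUP BY KeyColumn;
```

With this sample data, KeyColumn 1 has 3 rows of which 2 are true, and KeyColumn 2 has 1 row of which 0 are true. All three conditional columns agree.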

Related

Check a lot of columns for at least one 'true'

I have a table with a lot of columns (say 200), all of them boolean. I want to know which of those columns have at least one record set to true. I have come up with the following query, which works fine:
SELECT sum(Case When [column1] = 1 Then 1 Else 0 End) as column1,
sum(Case When [column2] = 1 Then 1 Else 0 End) as column2,
sum(Case When [column3] = 1 Then 1 Else 0 End) as column3
FROM [tablename];
It returns the number of rows that are 'true' for each column. However, this is more information than I need, and therefore maybe a more expensive query than needed. The query keeps scanning all fields for all records even though that would not be necessary.
I just learned something about CHECKSUM(*) that might be useful. Try the following code:
DECLARE @T TABLE (
b1 bit
,b2 bit
,b3 bit
);
DECLARE @T2 TABLE (
b1 bit
,b2 bit
,b3 bit
,b4 bit
,b5 bit
);
INSERT INTO @T VALUES (0,0,0),(1,1,1);
INSERT INTO @T2 VALUES (0,0,0,0,0),(1,1,1,1,1);
SELECT CHECKSUM(*) FROM @T;
SELECT CHECKSUM(*) FROM @T2;
You will see from the results that no matter how many columns are in a row, if they are all bit columns with a value of 0, the result of CHECKSUM(*) is always 0.
This means that you could use WHERE CHECKSUM(*)<>0 in your query to save the engine the trouble of summing rows where all the values are 0. Might improve performance.
And even if it doesn't, it's a neat thing to know.
EDIT:
You could do an EXISTS() function on each column. I understand that the EXISTS() function stops scanning when it finds a value that exists. If you have more rows than columns, it might be more performant. If you have more columns than rows, then your current query using SUM() on every column is probably the fastest thing you can do.
If you just want to know which rows have at least one boolean field set to true, you will need to test every one of them.
Something like this (maybe):
SELECT ROW.*
FROM TABLE ROW
WHERE ROW.COLUMN_1 = 1
OR ROW.COLUMN_2 = 1
OR ROW.COLUMN_3 = 1
OR ...
OR ROW.COLUMN_N = 1;
If you actually have 200 boolean columns in one table, then something like the following should work. Note that SQL Server does not allow + on bit values, so each column is cast to int first.
SELECT CASE WHEN CAST(column1 AS int) + CAST(column2 AS int) + CAST(column3 AS int) + ... + CAST(column200 AS int) >= 1 THEN 'Something was true for this record' ELSE NULL END AS My_Big_Field_Test
FROM [TableName];
I'm not in front of my machine, but you could also try the bitwise or operator:
SELECT * FROM [table name] WHERE column1 | column2 | column3 = 1
The OR answer from Arthur is the other suggestion I would offer. Try a few different suggestions and look at the query plans. Also take a look at disk reads and CPU usage (SET STATISTICS IO ON and SET STATISTICS TIME ON).
See which method gives the desired results and the best performance... and then let us know :-)
You can use a query of the form
SELECT
CASE WHEN EXISTS (SELECT * FROM [Table] WHERE [Column1] = 1) THEN 1 ELSE 0 END AS [Column1],
CASE WHEN EXISTS (SELECT * FROM [Table] WHERE [Column2] = 1) THEN 1 ELSE 0 END AS [Column2],
...
The efficiency of this critically depends on how sparse your table is. If there are columns where every single row has a 0 value, then any query that searches for a 1 value will require a full table scan, unless an index is in place. A really good choice for this scenario (millions of rows and hundreds of columns) is a columnstore index. These are supported from SQL Server 2012 onwards; from SQL Server 2014 onwards they don't cause the table to be read-only (which is a major barrier to their adoption).
With a columnstore index in place, each subquery should require constant time, and so should the query as a whole (in fact, with hundreds of columns, this query gets so big that you might run into trouble with the input buffer and need to split it up into smaller queries). Without indexes, this query can still be effective as long as the table isn't sparse -- if it "quickly" runs into a row with a 1 value, it stops.
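As a rough illustration of the per-column EXISTS approach, here is a small sketch. The table variable @Flags and its rows are hypothetical, and in this sketch 1 means "at least one row is true for this column". Each subquery can stop at the first matching row it finds.

```sql
-- Hypothetical three-column stand-in for the wide boolean table.
DECLARE @Flags TABLE (
    column1 bit NOT NULL,
    column2 bit NOT NULL,
    column3 bit NOT NULL
);

INSERT INTO @Flags (column1, column2, column3)
SELECT 0, 0, 0 UNION ALL
SELECT 1, 0, 0 UNION ALL
SELECT 0, 0, 1;

-- One row comes back; each value is 1 if any row has that column set.
SELECT
    CASE WHEN EXISTS (SELECT 1 FROM @Flags WHERE column1 = 1) THEN 1 ELSE 0 END AS column1,
    CASE WHEN EXISTS (SELECT 1 FROM @Flags WHERE column2 = 1) THEN 1 ELSE 0 END AS column2,
    CASE WHEN EXISTS (SELECT 1 FROM @Flags WHERE column3 = 1) THEN 1 ELSE 0 END AS column3;
```

For the sample rows this returns 1, 0, 1. For a real table with hundreds of columns, the statement would most likely be generated rather than typed by hand.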

SQL SELECT statement within an IF statement

I have a trigger in SQL Server that, on an update, needs to check the number of rows with a value in a certain range and act accordingly. My current code is something like this:
IF EXISTS(SELECT COUNT(id) as NumberOfRows
FROM database
WHERE id = 3 AND value <= 20 and value > 2
GROUP BY id
HAVING COUNT(id) > 18)
-- if true, do something
From what I can tell, the select statement should find the number of rows with a value between 2 and 20 and if there are more than 18 rows, the EXISTS function should return 1 and the query will execute the code within the IF statement.
However, what is happening is that it is always executing the code within the IF statement regardless of the number of rows with a value between 2 and 20.
Any ideas on why this might be? I can post more complete code if it might help.
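EXISTS checks only whether the subquery returns any rows, not what those rows contain. A COUNT without GROUP BY always returns exactly one row (with the value 0 when nothing matches), so EXISTS over such a query is always true. With GROUP BY and HAVING the subquery can come back empty, but comparing the count directly is still easier to read and reason about.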
Try to store the resulting count in a local variable, like in this question:
Using IF ELSE statement based on Count to execute different Insert statements
DECLARE @retVal int
SELECT @retVal = COUNT(*)
FROM TABLE
WHERE COLUMN = 'Some Value'
IF (@retVal > 0)
BEGIN
--INSERT SOMETHING
END
ELSE
BEGIN
--INSERT SOMETHING ELSE
END
I would do it like so (single line):
IF ((SELECT COUNT(id) FROM table WHERE ....) > 18) BEGIN
...do something
END
You can even use BETWEEN in a single line:
IF ((SELECT COUNT(id) FROM table WHERE ....) BETWEEN 2 AND 20) BEGIN
...do something
END
Your subquery is looking for matches in the entire table. It does not limit the results only to those that are related to the rows affected by the update. Therefore, if the table already has rows matching your condition, the condition will be true on any update that affects other rows.
In order to count only the relevant rows, you should either join the database table to the inserted pseudo-table or use just the inserted table (there is not enough information in your question to be sure which is better).
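A sketch of what restricting the check to the updated rows could look like is shown here. The table dbo.Readings, the trigger name and the PRINT action are all hypothetical placeholders, since the question does not show the real schema; only the range (value > 2 and value <= 20) and the threshold of 18 come from the question.

```sql
-- Hypothetical table standing in for the one in the question.
CREATE TABLE dbo.Readings (
    id    int NOT NULL,
    value int NOT NULL
);
GO

-- Hypothetical trigger: counts in-range rows only for ids touched by the UPDATE.
CREATE TRIGGER dbo.trg_Readings_Update
ON dbo.Readings
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    IF EXISTS (
        SELECT r.id
        FROM dbo.Readings AS r
        WHERE r.id IN (SELECT i.id FROM inserted AS i)  -- only updated ids
          AND r.value > 2
          AND r.value <= 20
        GROUP BY r.id
        HAVING COUNT(*) > 18
    )
    BEGIN
        -- Placeholder for the real action.
        PRINT 'More than 18 rows in range for an updated id';
    END
END;
GO
```

Whether to filter by id, as here, or to look only at the rows in inserted depends on what the trigger is really meant to enforce.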

SQL query to get null count from a column

I want to generate a SQL query to get null count in a particular column like
SELECT COUNT (column) AS count
FROM table
WHERE column = null ;
This is returning 0, but I want to know how many null values are present in that column, just like
SELECT COUNT (column) AS count
FROM table
WHERE column = 'some value';
which returns the count of the matched records
NULL value is special in that you cannot use = with it; you must use IS NULL instead:
SELECT COUNT (*) AS count FROM table where column IS null ;
This is because NULL in SQL does not evaluate as equal to anything, including other NULL values. Also note the use of * as the argument of COUNT: COUNT(column) skips NULL values, so it would return 0 here.
You can use a conditional sum()
SELECT sum(case when column is null
then 1
else 0
end) AS count
FROM table
A different query that gives exactly the same answer:
select count(*)-count(column) from table
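If empty strings should be counted as missing too, you can use the command below:
SELECT COUNT (*) AS count FROM table where column IS null OR column='';
This is needed because in many databases (SQL Server included) an empty string '' is a value, not NULL, so IS NULL alone does not count it.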
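To compare the approaches above side by side, here is a small sketch. The table variable @Items and its rows are invented for illustration.

```sql
-- Hypothetical sample data: two NULLs, one empty string, one real value.
DECLARE @Items TABLE (
    id   int         NOT NULL,
    note varchar(20) NULL
);

INSERT INTO @Items (id, note)
SELECT 1, 'a'  UNION ALL
SELECT 2, NULL UNION ALL
SELECT 3, ''   UNION ALL
SELECT 4, NULL;

SELECT
    -- COUNT(*) counts every row, COUNT(note) only the non-NULL notes.
    COUNT(*) - COUNT(note) AS NullsByDifference,
    SUM(CASE WHEN note IS NULL THEN 1 ELSE 0 END) AS NullsBySum,
    -- Treats empty strings as missing as well.
    SUM(CASE WHEN note IS NULL OR note = '' THEN 1 ELSE 0 END) AS NullOrEmpty
FROM @Items;
```

For these four rows the first two columns return 2 and the third returns 3.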

SQL Combine 2 rows into 2 columns

I have a table that contains entries like this:
I would like to transform it to something like this:
Can't find how to do so with a group by only. Am I missing anything?
Thanks in advance for your help
SELECT Entity,
MAX(CASE WHEN Type = 'Auto' THEN Value ELSE NULL END) AS ValueAuto,
MAX(CASE WHEN Type = 'Manual' THEN Value ELSE NULL END) AS ValueManual
FROM tableName
GROUP BY Entity
The query above returns the right values if there are only two types. If I don't know how many types there are in the table, how can I build the CASE expressions dynamically?
PIVOT is the best option for this requirement. When the set of types is not known in advance, the column list has to be built with dynamic SQL.
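Since the example tables from the question are not shown, here is a sketch of the conditional aggregation on made-up data. The table variable @Entries, its rows, and the assumption that Type holds the strings 'Auto' and 'Manual' are all hypothetical.

```sql
-- Hypothetical rows: one entity with both types, one with only 'Auto'.
DECLARE @Entries TABLE (
    Entity  varchar(10) NOT NULL,
    [Type]  varchar(10) NOT NULL,
    [Value] int         NOT NULL
);

INSERT INTO @Entries (Entity, [Type], [Value])
SELECT 'A', 'Auto',   10 UNION ALL
SELECT 'A', 'Manual', 12 UNION ALL
SELECT 'B', 'Auto',    7;

-- MAX ignores NULLs, so each column picks up the single matching value.
SELECT
    Entity,
    MAX(CASE WHEN [Type] = 'Auto'   THEN [Value] END) AS ValueAuto,
    MAX(CASE WHEN [Type] = 'Manual' THEN [Value] END) AS ValueManual
FROM @Entries
GROUP BY Entity;
```

Entity A comes back with 10 and 12, while entity B comes back with 7 and NULL because it has no 'Manual' row.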

Check whether a table contains rows or not sql server 2005

How to Check whether a table contains rows or not sql server 2005?
For what purpose?
Quickest for an IF would be IF EXISTS (SELECT * FROM Table)...
For a result set, SELECT TOP 1 1 FROM Table returns either zero or one rows
For exactly one row with a count (0 or non-zero), SELECT COUNT(*) FROM Table
Also, you can use exists
select case when exists (select 1 from table)
then 'contains rows'
else 'doesnt contain rows'
end
or to check if there are child rows for a particular record :
select * from Table t1
where exists(
select 1 from ChildTable t2
where t1.id = t2.parentid)
or in a procedure
if exists(select 1 from table)
begin
-- do stuff
end
As others have said, you can use something like this:
IF NOT EXISTS (SELECT 1 FROM Table)
BEGIN
--Do Something
END
ELSE
BEGIN
--Do Another Thing
END
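To see the EXISTS check behave on both an empty and a non-empty table, a minimal sketch follows. The table variable @Orders is a hypothetical stand-in for the real table.

```sql
-- Hypothetical table variable, empty at first.
DECLARE @Orders TABLE (id int NOT NULL);

IF EXISTS (SELECT 1 FROM @Orders)
    PRINT 'contains rows';
ELSE
    PRINT 'does not contain rows';   -- this branch runs: the table is empty

INSERT INTO @Orders (id) VALUES (1);

IF EXISTS (SELECT 1 FROM @Orders)
    PRINT 'contains rows';           -- this branch runs after the insert
ELSE
    PRINT 'does not contain rows';
```

EXISTS can stop at the first row it finds, which is why it is usually preferred over COUNT(*) when only a yes/no answer is needed.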
For the best performance, use a specific column name instead of *, for example:
SELECT TOP 1 <columnName>
FROM <tableName>
This is optimal because it returns just one column instead of the whole list of columns, which can save some time.
Returning just the first row, if there is one, makes it even faster. You get a single value as the result if there are any rows, or no rows at all if the table is empty.
If you use the table in a distributed manner, which is most probably the case, then transporting just one value from the server to the client is much faster.
You should also choose the column wisely, picking one that takes as few resources as possible.
Can't you just count the rows using select count(*) from table (or an indexed column instead of * if speed is important)?
If not then maybe this article can point you in the right direction.
Fast:
-- TOP (1) on an empty table returns no row at all, so the check is wrapped in EXISTS.
SELECT CASE
WHEN EXISTS (SELECT TOP (1) 1 FROM <TABLE_NAME>)
THEN 'not empty table'
ELSE 'empty table'
END AS info