Case expression with multiple results - sql

I am looking for a way to deliberately create some duplication in my results in MS SQL Server. I understand that you typically look for ways to avoid duplication, but in this example I need all the individual rows returned.
I am working with a table with about 10 million rows and 33 columns. The table consists of an ID in the first column and the remainder of the columns have either 'Y' or 'NULL' in them - a HUGE majority of the columns are NULL.
Of the 10 million rows 8 million of them only have a single 'Y' per row with the remaining 2 million rows having more than one column with a 'Y' in a row.
For the rows with a single 'Y', a basic case expression works perfectly fine to create a single column of results.
Here is my problem, though: when a row has more than one 'Y', I want one result row for each 'Y'.
Below is a small de-identified sample.
ID   FLAG1  FLAG2  FLAG3  FLAG4  FLAG5
188  NULL   NULL   NULL   NULL   NULL
194  Y      NULL   NULL   NULL   NULL
200  Y      NULL   NULL   NULL   Y
I am attempting to use a Case Expression like this.
Select
ID
,Case
When [FLAG1] = 'Y'
Then 'FLAG1'
When [FLAG2] = 'Y'
Then 'FLAG2'
End as 'Service_Line'
What I want is a result that looks like this.
ID Service_Line
194 FLAG1
200 FLAG1
200 FLAG5
My problem is that the Case expression only returns the first result so I end up with this.
ID Service_Line
194 FLAG1
200 FLAG1
Is a Case Expression appropriate for what I am trying to accomplish or should I be trying to go about this some other way?

A case is not appropriate. A general SQL approach would be:
select id, 'FLAG1' as flag
from t
where flag1 = 'Y'
union all
select id, 'FLAG2' as flag
from t
where flag2 = 'Y'
union all
select id, 'FLAG3' as flag
from t
where flag3 = 'Y'
. . .
There are other (more efficient) methods, but those depend on the database you are using.
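The UNION ALL pattern above can be tried against the question's sample rows. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name t, the reduced column list (flag1, flag2, flag5), and the service_line alias are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table t mirroring the question's sample (only three flags kept).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, flag1 TEXT, flag2 TEXT, flag5 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(188, None, None, None), (194, "Y", None, None), (200, "Y", None, "Y")],
)

# One SELECT per flag column; UNION ALL keeps every match as its own row.
unpivot_sql = """
SELECT id, 'FLAG1' AS service_line FROM t WHERE flag1 = 'Y'
UNION ALL
SELECT id, 'FLAG2' FROM t WHERE flag2 = 'Y'
UNION ALL
SELECT id, 'FLAG5' FROM t WHERE flag5 = 'Y'
ORDER BY id, service_line
"""
rows = conn.execute(unpivot_sql).fetchall()
for row in rows:
    print(row)
```

ID 200 appears twice, once per flag, and ID 188 does not appear at all because none of its flags is 'Y'.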
I should note: case is the right approach if you want the values in a single row:
select id,
concat( case when flag1 = 'Y' then 'FLAG1 ' else '' end,
case when flag2 = 'Y' then 'FLAG2 ' else '' end,
case when flag3 = 'Y' then 'FLAG3 ' else '' end,
. . .
) as flags
from t;
Of course, the syntax for concat() can vary among databases (concat() itself is ANSI standard). You can also trim off the last space.
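The single-row variant can be sketched the same way. This sketch assumes SQLite through Python's sqlite3 module, where the || operator and RTRIM() stand in for concat() and for trimming the last space; the table t and its three flag columns are illustrative, not the real 33-column table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, flag1 TEXT, flag2 TEXT, flag5 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(188, None, None, None), (194, "Y", None, None), (200, "Y", None, "Y")],
)

# Each CASE contributes either the flag name or an empty string, so the
# concatenation never becomes NULL; RTRIM drops the trailing space.
flags_sql = """
SELECT id,
       RTRIM(CASE WHEN flag1 = 'Y' THEN 'FLAG1 ' ELSE '' END ||
             CASE WHEN flag2 = 'Y' THEN 'FLAG2 ' ELSE '' END ||
             CASE WHEN flag5 = 'Y' THEN 'FLAG5 ' ELSE '' END) AS flags
FROM t
ORDER BY id
"""
flag_rows = conn.execute(flags_sql).fetchall()
for row in flag_rows:
    print(row)
```

Here every ID keeps exactly one row, and a row with no flags set gets an empty string.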

Related

Count number of values in SQL query / bigquery

I have a query which extracts some data from a JSON document, and a second query that displays an overall count based on how many of those values are present. I can't seem to work out how to combine these into a single query. I assume I need to use a subquery, but I'm not sure where to go from here.
SELECT
JSON_EXTRACT_SCALAR(data, '$.cat.name') as cat_name,
JSON_EXTRACT_SCALAR(data, '$.dog.name') as dog_name
FROM table
SELECT
CASE WHEN cat_name IS NOT NULL THEN 1 ELSE 0 END +
CASE WHEN dog_name IS NOT NULL THEN 1 ELSE 0 END AS cat_dog_total
FROM table
You can use a subquery to maintain readability:
SELECT (CASE WHEN cat_name IS NOT NULL THEN 1 ELSE 0 END +
CASE WHEN dog_name IS NOT NULL THEN 1 ELSE 0 END
) AS cat_dog_total
from (select JSON_EXTRACT_SCALAR(data, '$.cat.name') as cat_name,
JSON_EXTRACT_SCALAR(data, '$.dog.name') as dog_name
from table
) t
Of course, you can substitute in the JSON_EXTRACT_SCALAR() expressions as well, but this is more readable.
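As a rough, runnable illustration of the subquery shape, here is a sketch using Python's sqlite3 module. It assumes an SQLite build with JSON support, where json_extract() stands in for BigQuery's JSON_EXTRACT_SCALAR(); the pets table, its id column, and the sample documents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (id INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?)",
    [
        (1, '{"cat": {"name": "Tom"}, "dog": {"name": "Rex"}}'),
        (2, '{"cat": {"name": "Tom"}}'),
        (3, '{}'),
    ],
)

# The inner query names the extracted values once; the outer query counts
# how many of them are present on each row.
count_sql = """
SELECT id,
       (CASE WHEN cat_name IS NOT NULL THEN 1 ELSE 0 END +
        CASE WHEN dog_name IS NOT NULL THEN 1 ELSE 0 END) AS cat_dog_total
FROM (SELECT id,
             json_extract(data, '$.cat.name') AS cat_name,
             json_extract(data, '$.dog.name') AS dog_name
      FROM pets) AS t
ORDER BY id
"""
totals = conn.execute(count_sql).fetchall()
for row in totals:
    print(row)
```

A missing JSON path yields NULL, which is what lets the IS NOT NULL tests do the counting.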

how do you check for nulls in any column in an entire table in SQL

I would like to check whether any of the columns in a table contain NULL values, and I am sure there is a quicker way than what I am doing at the moment. I just want to see if there is a NULL in ANY column. However, my table has a lot of columns. Is there a simple and quick way?
The approach I have written so far works, but it takes a long time to write out for every column (hence the etc etc):
select
sum(case when id is null then 1 else 0 end) as id,
sum(case when name is null then 1 else 0 end) as name,
sum(case when review_count is null then 1 else 0 end) as review_count,
sum(case when positive_review is null then 1 else 0 end) as positive_review,
sum(etc etc
from user
I don't know if this will work for your scenario, but it's an option. You can CAST all your columns as a string and then concatenate them together. If you concatenate a NULL value with a string, it will return NULL.
SELECT 'Y'
WHERE EXISTS( -- Check if any row has a NULL in any column
SELECT 1
FROM MyTable -- placeholder: substitute your table name
WHERE CAST(c1 AS CHAR(1)) ||
CAST(c2 AS CHAR(1)) ||
... -- one CAST per remaining column
IS NULL
)
;
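The concatenation trick can be checked with a small sketch. It assumes SQLite via Python's sqlite3 module, where || returns NULL if any operand is NULL; not every database behaves this way, so treat it as an illustration rather than a portable recipe. The table my_table and columns c1 to c3 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (c1 TEXT, c2 INTEGER, c3 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [("a", 1, "x"), ("b", None, "y"), ("c", 3, "z")],
)

# If any operand of || is NULL the whole concatenation is NULL,
# so a single IS NULL test covers every column in the list.
any_null_sql = """
SELECT EXISTS (
    SELECT 1
    FROM my_table
    WHERE CAST(c1 AS TEXT) || CAST(c2 AS TEXT) || CAST(c3 AS TEXT) IS NULL
)
"""
has_null = conn.execute(any_null_sql).fetchone()[0]
print(has_null)  # 1 means at least one row contains a NULL
```

The query only answers yes or no. It does not say which column holds the NULL, which the per-column sums in the question do.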

Impala SQL, return value if a string exists within a subset of values

I have a table where the id field (not a primary key) contains either 1 or null. Over the past several years, any given part could have been entered multiple times with one, or both of these possible options.
I'm trying to write a statement that will return some value if there is ever a 1 associated with the select statement. There are lots of semi-duplicate rows, some with 1 and some with null, but if there is ever a 1, I want to return true, and if there are only null values, I want to return false. I'm not sure how to code this though.
If this is the output of my statement SELECT part, id from table where part = "ABC1234":
part id
ABC1234 1
ABC1234 null
ABC1234 null
ABC1234 null
ABC1234 1
I want to write a statement that returns true, because 1 exists in at least one of these rows.
The closest I've come to this is by using a CASE statement, but I'm not quite there yet:
SELECT
a1.part part,
CASE WHEN a2.id is not null
THEN
'true'
ELSE
'false'
END AS id
from table.parts a1, table.ids a2 where a1.part = "ABC1234" and a1.key = a2.key;
I also tried the following case:
CASE WHEN exists
(SELECT id from table.ids where id = 1)
THEN
but I got the error "subqueries are not supported in the select list".
For the above SELECT statement, how do I return 1 single line that reads:
part id
ABC1234 true
You can use conditional aggregation to check whether a part has at least one row with id = 1.
SELECT part,'True' id
from parts
group by part
having count(case when id = 1 then 1 end) >= 1
To return false when the ids are all null, use:
select part, case when id_true>=1 then 'True'
when id_false>=1 and id_true=0 then 'False' end id
from (
SELECT part,
count(case when id = 1 then 1 end) id_true,
count(case when id is null then 1 end) id_false
from parts
group by part) t
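A compact variation on the conditional aggregation above, sketched with Python's sqlite3 module rather than Impala. It folds the two steps into one CASE around the COUNT; the second part XYZ0001 and the has_one alias are assumptions added to show the false branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [
        ("ABC1234", 1), ("ABC1234", None), ("ABC1234", None),
        ("ABC1234", None), ("ABC1234", 1),
        ("XYZ0001", None), ("XYZ0001", None),  # hypothetical part with no 1
    ],
)

# COUNT ignores NULLs, and the inner CASE yields NULL unless id = 1,
# so the count is the number of rows per part where id = 1.
flag_sql = """
SELECT part,
       CASE WHEN COUNT(CASE WHEN id = 1 THEN 1 END) >= 1
            THEN 'true' ELSE 'false' END AS has_one
FROM parts
GROUP BY part
ORDER BY part
"""
result = conn.execute(flag_sql).fetchall()
for row in result:
    print(row)
```

Each part collapses to a single row no matter how many semi-duplicate rows it has.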

Where Clause 'drops' more rows than expected

I have a SQL Server 2008 R2 query that was returning "hypothetically" 100 rows. I'm actually working with 7k - 8k rows.
The Where clause is something like this:
Where Col_a = 'Y'
And Col_b = 'N'
And Col_c = 'X'
and 25 of the rows had 'P' in Col_d.
I added:
And Col_d = 'P'
and the query returned the expected 25 rows.
Then I changed to
And Col_d <> 'P'
I expected to get 75 rows but I got only 50.
I thought adding "And Col_d <> 'P'" would only restrict the rows in which there is a 'P' in Col_d.
Why is that not the case and how do I figure out what else is getting dropped when I say And Col_d <> 'P'?
As I said - I am actually working with larger numbers so it is not that easy to eyeball it.
I'd appreciate any help.
Thanks!
As stated in the comment, null is a special case when it comes to comparisons.
Assume the following data.
id someVal
----
0 null
1 1
2 2
With a query:
select id
from table
where someVal = 1
would return id 1
select id
from table
where someVal <> 1
would return id 2
select id
from table
where someVal is null
would return id 0
select id
from table
where someVal is not null
would return both ids 1 and 2.
If you want NULLs to be treated as values in an = or <> comparison, replace them with a non-NULL value first, for example:
select id
from table
where isNull(someVal, -1) <> 1
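would return ids 0 and 2.
Or you can change your ANSI_NULLS setting, though SET ANSI_NULLS OFF is deprecated in SQL Server, so handling NULL explicitly in the query is the safer route.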
What I want to do is only exclude the rows that have 'P' in Col_d
So in your specific case, because you want to treat null in Col_D as a non P row, your query could look like this:
select *
from someTable
Where Col_a = 'Y'
And Col_b = 'N'
And Col_c = 'X'
And isNull(Col_D, 'someArbitraryValue') <> 'P'
You have to do the above because, as pointed out throughout the answer and in the links, NULL does not compare the way ordinary values do. You either need to turn the NULL into something that is not NULL (which isNull(Col_D, 'someArbitraryValue') does) or change the ANSI NULL setting before it can compare as equal or not equal to some value.
Or, as @Andrew pointed out, magic values such as someArbitraryValue are bad, so you could instead do:
select *
from someTable
Where Col_a = 'Y'
And Col_b = 'N'
And Col_c = 'X'
And (Col_D <> 'P' OR Col_D is null)
Normally I would use the query directly above; I showed the other way first mostly to point out how NULL comparisons differ from comparisons with a value.
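To see where the "missing" rows go, here is a small sketch using Python's sqlite3 module. SQLite follows the same three-valued logic for these comparisons; COALESCE stands in for SQL Server's isNull, and the table some_table with its four sample rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER, col_d TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, "P"), (2, "Q"), (3, None), (4, None)],
)

def ids(where):
    """Return the ids matched by a WHERE clause, in id order."""
    sql = f"SELECT id FROM some_table WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

equal_p = ids("col_d = 'P'")                        # only the 'P' row
not_p = ids("col_d <> 'P'")                         # NULL rows drop out too
not_p_or_null = ids("col_d <> 'P' OR col_d IS NULL")
not_p_coalesce = ids("COALESCE(col_d, '') <> 'P'")

print(equal_p, not_p, not_p_or_null, not_p_coalesce)
```

The = and <> results together cover only two of the four rows. The two NULL rows match neither predicate, which is the same gap as the 25 rows missing in the question.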

How to check if there is any Zeros in Column Fields using SQL

I am trying to check whether there are any zeros in the Flag1 or Flag2 fields, but I get the wrong result when I run the code below. In this case I know there is one zero in the Flag1 field, but my count is zero when I run the SQL. If there is a zero in either field, I expect the count to be >= 1. If there is no zero in either field, I expect a count of zero. How can I get that? Thanks.
Here is my code:
select count(*) from myTable
where FLAG1 in(0) and FLAG2 in(0)
and ID = 202
Here is an example of what I have:
FLAG1 FLAG2
1 1
1 1
1 1
0 1
Instead of AND, use OR (and don't use IN unless you provide a list of values or a subquery):
select count(*) from myTable
where (FLAG1=0 or FLAG2=0)
and ID = 202
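Don't drop the parentheses, though. AND binds more tightly than OR, so the following version applies the ID filter only to the FLAG2 test and counts rows with FLAG1 = 0 for every ID: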
select count(*) from myTable
where FLAG1=0 or FLAG2 =0
and ID = 202
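The effect of the parentheses is easy to check with a sketch. It uses Python's sqlite3 module, and SQLite shares the standard rule that AND binds more tightly than OR. The table mirrors the question's sample, plus one hypothetical row for a different ID (999).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER, FLAG1 INTEGER, FLAG2 INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (202, 1, 1), (202, 1, 1), (202, 1, 1), (202, 0, 1),
        (999, 0, 1),  # hypothetical row belonging to another ID
    ],
)

def count(where):
    """Count the rows matching a WHERE clause."""
    sql = f"SELECT COUNT(*) FROM myTable WHERE {where}"
    return conn.execute(sql).fetchone()[0]

with_and = count("FLAG1 = 0 AND FLAG2 = 0 AND ID = 202")      # original query
with_parens = count("(FLAG1 = 0 OR FLAG2 = 0) AND ID = 202")  # intended logic
# Parsed as FLAG1 = 0 OR (FLAG2 = 0 AND ID = 202):
no_parens = count("FLAG1 = 0 OR FLAG2 = 0 AND ID = 202")

print(with_and, with_parens, no_parens)
```

Without the parentheses, the row for ID 999 is counted as well, because its FLAG1 = 0 satisfies the OR on its own.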