Subtract 2 case statements - sql

I am trying to subtract the 2 case statements like this:
CASE
WHEN fct.measure IN ('A')
THEN fct.month_value
ELSE NULL
END
- CASE
WHEN fct.measure IN ('B')
THEN fct.month_value
ELSE NULL
END AS discounts
This query doesn't throw a syntax error, but it returns all NULL.
The month_value corresponding to A is 3173.100000 and the month value corresponding to B is 8043.000000.
Any suggestions on how this could return the correct result instead of all NULL?

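Each row holds only one measure, so on any given row one of the two CASE expressions is NULL, and subtracting with NULL yields NULL. I presume that you need some kind of conditional aggregation approach here: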
SELECT
col,
MAX(CASE WHEN measure = 'A' THEN month_value END) -
MAX(CASE WHEN measure = 'B' THEN month_value END) AS discounts
FROM yourTable
GROUP BY col;
This assumes that your table structure looks something like the following:
col | measure | month_value
1 | A | 3173.10
1 | B | 8043.00
We aggregate by each col value, and then use conditional aggregation to isolate the various month values based on the value of the measure column.
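
A minimal sketch of the difference, run through Python's built-in sqlite3 module so it can be tried without a database server. The table and column names (yourTable, col) are the same assumed structure as above, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (col INTEGER, measure TEXT, month_value REAL)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, "A", 3173.10), (1, "B", 8043.00)],
)

# Row by row: each row matches only one CASE, so the other operand is NULL
# and the subtraction is NULL (None in Python) on every row.
per_row = conn.execute(
    """
    SELECT CASE WHEN measure = 'A' THEN month_value END
         - CASE WHEN measure = 'B' THEN month_value END
    FROM yourTable
    """
).fetchall()

# Grouped: MAX() collapses the A row and the B row into one row per col
# before the subtraction happens.
grouped = conn.execute(
    """
    SELECT col,
           MAX(CASE WHEN measure = 'A' THEN month_value END)
         - MAX(CASE WHEN measure = 'B' THEN month_value END) AS discounts
    FROM yourTable
    GROUP BY col
    """
).fetchall()

print(per_row)
print(grouped)
```

The first query prints only NULLs, which matches the behaviour described in the question; the second returns one row per col with the difference A minus B.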

Related

Coalesce in duplicated values

I have a table like this:
And I want to transform for each value a column, to become something like this:
If I do a query like this:
Select "_sdc_source_key_id",
COALESCE(value='Integrity',null) as cia_security
,COALESCE (value='Confidentiality',null) as cia_conf
,COALESCE (value='Availability',null) as cia_availability
FROM
staging_jira.issues__fields__customfield_10420
where _sdc_source_key_id='201496'
That is my result, I have duplicated rows:
What should be the best solution to achieve my transformation?
Thanks a lot!
You can GROUP BY "_sdc_source_key_id" and take the MAX of your values:
Select "_sdc_source_key_id",
MAX(COALESCE(value='Integrity',null)) as cia_security
,MAX(COALESCE (value='Confidentiality',null)) as cia_conf
,MAX(COALESCE (value='Availability',null)) as cia_availability
FROM
staging_jira.issues__fields__customfield_10420
where _sdc_source_key_id='201496'
GROUP BY "_sdc_source_key_id"
If your database doesn't support MAX on a boolean, switch to an integer:
Select "_sdc_source_key_id",
MAX(CASE WHEN value='Integrity' THEN 1 ELSE null END) as cia_security
,MAX(CASE WHEN value='Confidentiality' THEN 1 ELSE null END) as cia_conf
,MAX(CASE WHEN value='Availability' THEN 1 ELSE null END) as cia_availability
FROM
staging_jira.issues__fields__customfield_10420
where _sdc_source_key_id='201496'
GROUP BY "_sdc_source_key_id"
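
As a rough illustration of the integer variant, here is a sketch using Python's sqlite3 module. The table name custom_field and the two sample rows are made up for the example; only the column names and the key value come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for staging_jira.issues__fields__customfield_10420.
conn.execute("CREATE TABLE custom_field (_sdc_source_key_id TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO custom_field VALUES (?, ?)",
    [("201496", "Integrity"), ("201496", "Confidentiality")],
)

# Without GROUP BY there would be one output row per input row.
# MAX() over the group folds them into a single row per key.
pivoted = conn.execute(
    """
    SELECT _sdc_source_key_id,
           MAX(CASE WHEN value = 'Integrity' THEN 1 END)       AS cia_security,
           MAX(CASE WHEN value = 'Confidentiality' THEN 1 END) AS cia_conf,
           MAX(CASE WHEN value = 'Availability' THEN 1 END)    AS cia_availability
    FROM custom_field
    WHERE _sdc_source_key_id = '201496'
    GROUP BY _sdc_source_key_id
    """
).fetchall()

print(pivoted)
```

A value that never occurs for the key (Availability in this sample) stays NULL, because MAX over only NULLs is NULL.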

How can I use pivot to find the records with the most columns populated?

I have a problem where I have 5 columns.
What I want to do is add a count on the end with the number of columns where there is no null value.
I am trying to use pivot as this seems to be the most logical SQL clause. Any ideas on this? I haven't used Pivot in many instances so this is new for me.
An inline pivot/conditional aggregate and a COUNT seems to be what you want here. As all your columns have different data types, you need to also use some CASE expressions. Something like this:
SELECT ID,
a,
...
(SELECT COUNT(V.C)
FROM (VALUES(CASE WHEN a IS NOT NULL THEN 1 END),
(CASE WHEN b IS NOT NULL THEN 1 END),
(CASE WHEN c IS NOT NULL THEN 1 END),
(CASE WHEN d IS NOT NULL THEN 1 END),
(CASE WHEN e IS NOT NULL THEN 1 END),
(CASE WHEN f IS NOT NULL THEN 1 END))V(C)) AS NonNullColumns
FROM dbo.YourTable;
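
The query above uses a VALUES table constructor in a correlated subquery, which is SQL Server syntax. For a quick experiment elsewhere, the sketch below swaps in a different technique: in SQLite, "x IS NOT NULL" evaluates to 0 or 1, so the flags can simply be added up. The table name, the five columns a to e and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with five nullable columns of mixed types.
conn.execute(
    "CREATE TABLE your_table (id INTEGER, a TEXT, b INTEGER, c TEXT, d INTEGER, e REAL)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "x", 2, None, None, 1.5),
        (2, None, None, None, None, None),
        (3, "y", 1, "z", 4, 2.0),
    ],
)

# Each IS NOT NULL test yields 1 or 0 in SQLite, so the sum is the number
# of populated columns, regardless of each column's data type.
counts = conn.execute(
    """
    SELECT id,
           (a IS NOT NULL) + (b IS NOT NULL) + (c IS NOT NULL)
         + (d IS NOT NULL) + (e IS NOT NULL) AS non_null_columns
    FROM your_table
    ORDER BY non_null_columns DESC
    """
).fetchall()

print(counts)
```

Ordering by the count in descending order puts the records with the most populated columns first.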

Combine row aggregate data with individual rows

I have a table like the one below (base_data):
session_id | event_type | player_guess | correct_answer
1 | guess | 'python' | NULL
1 | guess | 'javascript' | NULL
1 | guess | 'scala' | NULL
1 | all_answered | NULL | ['python','javascript','hadoop']
2 | guess | 'triangle' | NULL
2 | guess | 'square' | NULL
2 | all_answered | NULL | ['triangle','square']
I am trying to add a new column called was_guess_correct, defined as follows:
For each session_id, match the player_guess values against the data in correct_answer. The correct answer for a session_id is available in the row where event_type = 'all_answered'.
The result would look like this:
session_id | event_type | player_guess | correct_answer | was_guess_correct
1 | guess | 'python' | NULL | 1
1 | guess | 'javascript' | NULL | 1
1 | guess | 'scala' | NULL | 0
1 | all_answered | NULL | ['python','javascript','hadoop'] | 1
2 | guess | 'triangle' | NULL | 1
2 | guess | 'square' | NULL | 1
2 | all_answered | NULL | ['triangle','square'] | 1
The values in the all_answered row are unique and sorted (the order can be used, or just checking with an IN clause might also work).
For rows with event_type all_answered, the value of was_guess_correct does not matter. It can be 1 or 0, whichever makes the query easier.
How can I compute this column in SQL/Presto?
I would like to see how to compute it using JOIN/UNNEST and also inline (without a JOIN) if possible.
You can use window functions to get the correct answers on each row. Then how you manage the result depends on the type of the column. If it is a string, you can just use like:
select t.*,
(case when event_type = 'all_answered' or
max(correct_answer) over (partition by session_id) like '%''' || player_guess || '''%'
then 1 else 0
end) as was_guess_correct
from t;
Note that correct_answer is NULL in the "guess" rows, so max() works (assuming there is one correct answer row per session).
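
A small sketch of the window-function idea, run on SQLite (3.25 or later, which has window functions) through Python's sqlite3 module rather than on Presto. It assumes correct_answer is stored as a plain string and player_guess is stored without the surrounding quotes; the id column is added only to keep the output order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, session_id INTEGER, event_type TEXT,"
    " player_guess TEXT, correct_answer TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "guess", "python", None),
        (2, 1, "guess", "javascript", None),
        (3, 1, "guess", "scala", None),
        (4, 1, "all_answered", None, "['python','javascript','hadoop']"),
        (5, 2, "guess", "triangle", None),
        (6, 2, "guess", "square", None),
        (7, 2, "all_answered", None, "['triangle','square']"),
    ],
)

# MAX(...) OVER (PARTITION BY session_id) copies the single non-NULL
# correct_answer onto every row of the session; LIKE then looks for the
# quoted guess inside that string.
rows = conn.execute(
    """
    SELECT session_id, event_type, player_guess,
           CASE WHEN event_type = 'all_answered'
                  OR MAX(correct_answer) OVER (PARTITION BY session_id)
                     LIKE '%''' || player_guess || '''%'
                THEN 1 ELSE 0
           END AS was_guess_correct
    FROM t
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Wrapping the guess in quotes inside the pattern avoids a partial match such as 'java' matching 'javascript'.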

Impala SQL, return value if a string exists within a subset of values

I have a table where the id field (not a primary key) contains either 1 or null. Over the past several years, any given part could have been entered multiple times with one, or both of these possible options.
I'm trying to write a statement that will return some value if there is ever a 1 associated with the select statement. There are lots of semi-duplicate rows, some with 1 and some with null, but if there is ever a 1, I want to return true, and if there are only null values, I want to return false. I'm not sure how to code this though.
Suppose the statement SELECT part, id FROM table WHERE part = "ABC1234" returns:
part id
ABC1234 1
ABC1234 null
ABC1234 null
ABC1234 null
ABC1234 1
I want to write a statement that returns true, because 1 exists in at least one of these rows.
The closest I've come to this is by using a CASE statement, but I'm not quite there yet:
SELECT
a1.part part,
CASE WHEN a2.id is not null
THEN
'true'
ELSE
'false'
END AS id
from table.parts a1, table.ids a2 where a1.part = "ABC1234" and a1.key = a2.key;
I also tried the following case:
CASE WHEN exists
(SELECT id from table.ids where id = 1)
THEN
but I got the error subqueries are not supported in the select list
For the above SELECT statement, how do I return 1 single line that reads:
part id
ABC1234 true
You can use conditional aggregation to check whether a part has at least one row with id = 1.
SELECT part,'True' id
from parts
group by part
having count(case when id = 1 then 1 end) >= 1
To return False when the ids are all NULL, use:
select part, case when id_true>=1 then 'True'
when id_false>=1 and id_true=0 then 'False' end id
from (
SELECT part,
count(case when id = 1 then 1 end) id_true,
count(case when id is null then 1 end) id_false
from parts
group by part) t
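
For illustration, here is a condensed version of the same idea on SQLite via Python's sqlite3 module, not Impala. It folds the two counts into a single CASE with an ELSE branch, which is a simplification of the query above; the second part number XYZ0001 is invented to show the False case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [
        ("ABC1234", 1),
        ("ABC1234", None),
        ("ABC1234", None),
        ("ABC1234", None),
        ("ABC1234", 1),
        ("XYZ0001", None),  # hypothetical part that only ever has NULL ids
        ("XYZ0001", None),
    ],
)

# COUNT ignores NULLs, so the inner CASE counts only rows where id = 1.
# One row per part comes back, however many duplicates exist.
flags = conn.execute(
    """
    SELECT part,
           CASE WHEN COUNT(CASE WHEN id = 1 THEN 1 END) >= 1
                THEN 'True' ELSE 'False'
           END AS id
    FROM parts
    GROUP BY part
    ORDER BY part
    """
).fetchall()

print(flags)
```

Because the aggregate sits in a GROUP BY query rather than in a subquery in the select list, it sidesteps the "subqueries are not supported in the select list" error from the question.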

SQL Case Statement or Different Method?

I will be using my output to place into an Excel pivot table. The data is dealing with credit accounts that have either charged off or not.
EDIT: If chargeoffs is checked in the pivot table I want the totalaccounts column to be a count of total accounts regardless of the chargeoffdate value. If chargeoffs is left unchecked I want totalaccounts to be a count of all accounts when chargeoffdate is NULL.
Here is my SQL syntax so far:
SELECT
c.brand,
CASE WHEN a.chargeoffdate IS NULL THEN 'No Chargeoffs'
-- Below here should not be only chargeoffs, it should be chargeoffs + the column above ^^^
WHEN a.chargeoffdate IS NOT NULL THEN 'Chargeoffs'
ELSE 'Unknown' END AS chargeoffs,
COUNT(*) AS totalaccounts
FROM accounts
GROUP BY brand, chargeoffs
You can see the comment in my SQL to understand what I am going for, but I can't figure out how to accomplish this.
I tried:
CASE WHEN a.chargeoffdate IS NULL THEN 'No Chargeoffs'
-- Below here should not be only chargeoffs, it should be chargeoffs + the column above ^^^
WHEN (a.chargeoffdate IS NOT NULL OR a.chargeoffdate IS NULL) THEN 'Chargeoffs Included'
ELSE 'Unknown' END AS chargeoffs
But got the same results as the top query for some reason. Thanks.
ANOTHER EDIT: OUTPUT DESIRED
BRAND 1 | WITH CHARGEOFFS | COUNT(TOTALACCOUNTS)
BRAND 1 | WITHOUT CHARGEOFFS | COUNT(TOTALACCOUNTS)
BRAND 2 | WITH CHARGEOFFS | COUNT(TOTALACCOUNTS)
BRAND 2 | WITHOUT CHARGEOFFS | COUNT(TOTALACCOUNTS)
Updated:
Chargeoffs = Count of all accounts whether chargeoffdate is null or not
No Chargeoffs = Count of all accounts where chargeoffdate is null (they haven't charged off)
SELECT
brand,
count(*) as "Chargeoffs",
sum(CASE WHEN a.chargeoffdate IS NULL THEN 1 ELSE 0 END) as "No Chargeoffs"
FROM accounts a
GROUP BY brand
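
A quick sketch of what this returns, using Python's sqlite3 module. The accounts table here is a made-up two-column stand-in (brand, chargeoffdate) with invented rows; the real schema in the question has more to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal accounts table.
conn.execute("CREATE TABLE accounts (brand TEXT, chargeoffdate TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [
        ("Brand 1", None),
        ("Brand 1", None),
        ("Brand 1", "2015-01-01"),
        ("Brand 2", "2015-02-01"),
        ("Brand 2", None),
    ],
)

# COUNT(*) counts every account; the SUM adds 1 only for accounts that
# have not charged off (chargeoffdate IS NULL).
totals = conn.execute(
    """
    SELECT brand,
           COUNT(*) AS "Chargeoffs",
           SUM(CASE WHEN a.chargeoffdate IS NULL THEN 1 ELSE 0 END) AS "No Chargeoffs"
    FROM accounts a
    GROUP BY brand
    ORDER BY brand
    """
).fetchall()

print(totals)
```

This gives one row per brand with both counts as columns, rather than the two rows per brand shown in the desired output.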
UPDATE: I'm tired, but I came up with this longer SQL, which is close to what you want:
SELECT brand,
tp,
CASE WHEN tp = 1 THEN sum(cnt) END as "No Chargeoffs",
sum(cnt) as "Chargeoffs"
FROM (
SELECT
brand,
CASE WHEN a.chargeoffdate IS NULL THEN 1 ELSE 0 END as tp,
count(*) as cnt
FROM accounts a
GROUP BY brand, CASE WHEN a.chargeoffdate IS NULL THEN 1 ELSE 0 END
) t
GROUP BY brand, tp
That was kind of stupid and my solution was easy. I just left it how I had it and the pivot table added them together for me.
I found this out after I had created 2 separate queries and did some data manipulation with SAS to get what I wanted. Ouch.