How to use a new column as a flag to show Null row? - sql

I have table A:
id
1
2
3
4
5
and table B:
id
2
3
4
I left join A and B:
id id
1 NULL
2 2
3 3
4 4
5 NULL
And how can I get a new column like this:
id id flag
1 NULL 0
2 2 1
3 3 1
4 4 1
5 NULL 0
Generally speaking, I want all rows in A but not in B to be flagged as 0, and all rows present in both tables to be flagged as 1. How can I achieve that? Preferably without a CTE.

This is just a CASE expression:
CASE WHEN B.id IS NULL THEN 0 ELSE 1 END AS flag
Alternatively, on dialects that support it (such as SQL Server), you could use IIF, which is shorthand for a CASE expression:
IIF(b.id IS NULL, 0,1)
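To see the CASE expression in context, here is a minimal sketch that runs the full left join against the question's sample data. It uses Python's built-in sqlite3 module with an in-memory database purely as a convenient test bed; the table names a and b and the helper name flag_rows are assumptions for illustration.

```python
import sqlite3


def flag_rows():
    """Left join a to b and flag each row of a by whether b has a match."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (id INTEGER);
        CREATE TABLE b (id INTEGER);
        INSERT INTO a VALUES (1), (2), (3), (4), (5);
        INSERT INTO b VALUES (2), (3), (4);
    """)
    rows = conn.execute("""
        SELECT a.id, b.id,
               CASE WHEN b.id IS NULL THEN 0 ELSE 1 END AS flag
        FROM a
        LEFT JOIN b ON b.id = a.id
        ORDER BY a.id
    """).fetchall()
    conn.close()
    return rows


flagged = flag_rows()
for row in flagged:
    print(row)
```

Unmatched rows come back with None (SQL NULL) in the second column and 0 in the flag column.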

I would recommend using exists:
select a.*,
(case when exists (select 1 from b where b.id = a.id)
then 1 else 0
end) as flag
from a;
The purpose of using exists instead of left join is that you are guaranteed to not get duplicate rows -- even if ids are duplicated in b. That is a nice guarantee.
From a performance perspective, the two should be similar, but it is possible that the case is an iota faster.
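The duplicate-row point can be checked with a small sketch. It assumes a hypothetical duplicate of id 2 in b (not part of the question's data) and compares the row counts of the two approaches, again using Python's sqlite3 module as a test bed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3), (4), (5);
    -- id 2 appears twice in b: a hypothetical duplicate for illustration
    INSERT INTO b VALUES (2), (2), (3), (4);
""")

# LEFT JOIN: every matching row of b produces an output row.
join_rows = conn.execute("""
    SELECT a.id, CASE WHEN b.id IS NULL THEN 0 ELSE 1 END AS flag
    FROM a
    LEFT JOIN b ON b.id = a.id
    ORDER BY a.id
""").fetchall()

# EXISTS: exactly one output row per row of a, whatever b contains.
exists_rows = conn.execute("""
    SELECT a.id,
           CASE WHEN EXISTS (SELECT 1 FROM b WHERE b.id = a.id)
                THEN 1 ELSE 0 END AS flag
    FROM a
    ORDER BY a.id
""").fetchall()
conn.close()

print(len(join_rows), len(exists_rows))
```

With this data the join version returns id 2 twice, while the EXISTS version returns each id of a once.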

Related

How to check the count of each values repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
. .
goes on goes on
These are my both tables.
I want to find the frequency of individual users doing 'status_id'
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate count but in a single row without mentioning which status_id each column represents
This is called a pivot, and it works in two steps:
1. extract the data for each specific field value using a CASE expression
2. aggregate the data per user, so that every field value lies on the same record for each user
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1, -- ELSE 0 yields 0 instead of NULL when there is no match
SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t2
INNER JOIN t1
ON t2.user_id = t1._ID
GROUP BY Username
ORDER BY Username
Check the demo here.
Note: This solution assumes that there are exactly 3 status_id values. To generalize to an arbitrary number of status ids, you would need a dynamic query. In any case, it's better to avoid dynamic queries if you can.
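As a runnable sketch of the pivot, the following loads only the rows visible in the question into an in-memory SQLite database (via Python's sqlite3 module) and applies conditional aggregation. The table names t1 and t2 follow the answer; the column name ID and the use of a LEFT JOIN from the user table (so users without any status rows would still appear with zeros) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER PRIMARY KEY, Username TEXT);
    CREATE TABLE t2 (user_id INTEGER, status_id INTEGER);
    INSERT INTO t1 VALUES (1, 'Dan'), (2, 'Eli'), (3, 'Sean'), (4, 'John');
    -- only the rows shown in the question; the real table "goes on"
    INSERT INTO t2 VALUES (1, 2), (1, 3), (4, 1), (3, 2), (2, 3),
                          (1, 1), (3, 3), (3, 3), (3, 3);
""")

pivot = conn.execute("""
    SELECT t1.Username,
           SUM(CASE WHEN t2.status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
           SUM(CASE WHEN t2.status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
           SUM(CASE WHEN t2.status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
    FROM t1
    LEFT JOIN t2 ON t2.user_id = t1.ID
    GROUP BY t1.ID, t1.Username
    ORDER BY t1.Username
""").fetchall()
conn.close()

for row in pivot:
    print(row)
```

Each output row holds one user followed by the three per-status counts, computed from the visible sample rows only.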

To update a column by checking the value from another column in a different table

I am trying to update a flag in my main table based on the flag in another common table. The two tables are related through a foreign key. The flag in the other table is either 0 or 1, and the flag in the main table should be updated to 1 only if all the values for a particular FK are 1.
Suppose that there are 2 tables listed below. XYZ and ABC. Both are related to each other through Foreign Key.
XYZ:
XYZID Posted
1 0
2 0
3 0
4 0
ABC:
ABCID XYZID IsPosted
1 1 1
2 1 1
3 2 0
4 2 0
5 2 0
6 3 1
7 3 0
8 4 0
9 4 0
10 4 1
In the ABC table, both rows for XYZID 1 have IsPosted = 1, so Posted should be updated to 1 for XYZID 1 in the XYZ main table. For XYZID 3, however, the IsPosted values are 0 and 1, so Posted should not be updated to 1 for XYZID 3. In general, Posted in XYZ should be set to 1 only if every ABC row for that foreign key has IsPosted = 1. If the values are a mix of 0 and 1, XYZ should not be updated.
I thought of using group by or cursor. But don't know how to start on this.
If anyone can help me in this then would be helpful. It is pretty simple but I am not getting the idea to start on this. Any help would be appreciated.
Update the table by joining it to a subquery that groups abc by xyzid and applies the condition in the HAVING clause:
update t
set posted = 1
from xyz t inner join (
select xyzid from abc
group by xyzid
having sum(case when isposted = 0 then 1 else 0 end) = 0
) a on a.xyzid = t.xyzid
The condition in the having clause could also be written:
having sum(abs(isposted - 1)) = 0
See the demo.
Results:
> XYZID | Posted
> ----: | -----:
> 1 | 1
> 2 | 0
> 3 | 0
> 4 | 0
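For readers who want to reproduce this result without a SQL Server instance, here is a sketch using Python's sqlite3 module. Because UPDATE ... FROM is not available in every SQLite version, the sketch swaps the join for a WHERE ... IN filter over the same GROUP BY / HAVING subquery; that substitution is an assumption of this example, not the answer's exact statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE xyz (xyzid INTEGER PRIMARY KEY, posted INTEGER);
    CREATE TABLE abc (abcid INTEGER PRIMARY KEY, xyzid INTEGER, isposted INTEGER);
    INSERT INTO xyz VALUES (1, 0), (2, 0), (3, 0), (4, 0);
    INSERT INTO abc VALUES
        (1, 1, 1), (2, 1, 1),
        (3, 2, 0), (4, 2, 0), (5, 2, 0),
        (6, 3, 1), (7, 3, 0),
        (8, 4, 0), (9, 4, 0), (10, 4, 1);
""")

# Keep only the xyzid groups that contain no row with isposted = 0.
conn.execute("""
    UPDATE xyz
    SET posted = 1
    WHERE xyzid IN (
        SELECT xyzid
        FROM abc
        GROUP BY xyzid
        HAVING SUM(CASE WHEN isposted = 0 THEN 1 ELSE 0 END) = 0
    )
""")

posted = conn.execute("SELECT xyzid, posted FROM xyz ORDER BY xyzid").fetchall()
conn.close()
print(posted)
```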
Assuming there can only be 0 or 1, one way is to use a correlated subquery getting the minimum isposted for an xyzid.
UPDATE main_table
SET posted = (SELECT min(another_common_table.isposted)
FROM another_common_table
WHERE another_common_table.xyzid = main_table.xyzid);
If there is a 0 the minimum will be 0. If there's only 1s it'll be 1.
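One caveat worth checking: MIN over an empty set is NULL, so a parent row with no child rows would have its flag set to NULL by the statement above. The sketch below, run through Python's sqlite3 module, adds a hypothetical parent (xyzid 5) with no children and an EXISTS guard so that such rows are left untouched. The table names xyz and abc come from the question; the guard and the extra row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE xyz (xyzid INTEGER PRIMARY KEY, posted INTEGER);
    CREATE TABLE abc (abcid INTEGER PRIMARY KEY, xyzid INTEGER, isposted INTEGER);
    -- xyzid 5 is a hypothetical parent with no child rows
    INSERT INTO xyz VALUES (1, 0), (2, 0), (3, 0), (4, 0), (5, 0);
    INSERT INTO abc VALUES
        (1, 1, 1), (2, 1, 1),
        (3, 2, 0), (4, 2, 0), (5, 2, 0),
        (6, 3, 1), (7, 3, 0),
        (8, 4, 0), (9, 4, 0), (10, 4, 1);
""")

# The minimum is 1 only when every child row is 1; the EXISTS guard
# skips parents without children, which would otherwise become NULL.
conn.execute("""
    UPDATE xyz
    SET posted = (SELECT MIN(abc.isposted)
                  FROM abc
                  WHERE abc.xyzid = xyz.xyzid)
    WHERE EXISTS (SELECT 1 FROM abc WHERE abc.xyzid = xyz.xyzid)
""")

min_posted = conn.execute("SELECT xyzid, posted FROM xyz ORDER BY xyzid").fetchall()
conn.close()
print(min_posted)
```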
Try the following:
UPDATE [a]
SET a.[Posted] = [b].[IsPosted]
FROM [a]
INNER JOIN (SELECT [xyzid],
[IsPosted] = MIN(Cast([IsPosted] AS INT))
FROM
[b]
GROUP BY
[xyzid]
HAVING
MIN(Cast([IsPosted] AS INT)) = 1) [b]
ON [a].[xyzid] = [b].[xyzid]
Essentially, the inner query returns only those entries from table B with all 1 values and then updates the A table based on the FK join.
There may be more efficient queries AND this will re-update previously updated A.Posted values and will NOT un-update A.Posted if anything in table B is marked as IsPosted = 0.

Get row counts for different lookup values

A temp table has 700+ records with a PK. 12 columns contain Id values from lookup tables. Each lookup table has 4-8 records in it. How can I get a record count for each Id value in table LookupA that has a relationship via the PK to Id values in every other lookup table? Each lookup value in each lookup table needs to be compared for a record count to every other lookup table and value.
I can write a SQL statement to get specific values for specific columns, but that's a long exercise and will slow down the proc.
Here's a sample of the data.
PK LookupA LookupB LookupC
1 1 1 3
2 1 2 3
3 1 3 2
4 2 4 2
5 4 1 1
6 3 2 1
7 2 3 3
8 4 4 3
9 4 3 2
10 1 1 2
The results need to compare LookupA with LookupB and LookupC to get a row count.
Table Value LookupB 1 2 3 4 LookupC 1 2 3
LookupA 1 2 1 1 0 0 2 2
2 0 0 1 1 0 1 1
3 0 1 0 0 1 0 0
4 1 0 1 1 1 1 1
Then LookupB would be compared to LookupA and LookupC.
And LookupC would be compared to LookupA and LookupB.
With this code you can get the numbers for all combinations of A,B and C in pairs:
select 'A-B' as Combination, LookupA, LookupB, count(*) as NumRecords
from table
group by Combination,LookupA, LookupB
UNION
select 'A-C' as Combination, LookupA, LookupC, count(*) as NumRecords
from table
group by Combination,LookupA, LookupC
UNION
select 'B-C' as Combination, LookupB, LookupC, count(*) as NumRecords
from table
group by Combination,LookupB, LookupC
After this, if you want to see all the values for LookupA comparing to B and C just
look for Combinations A-B and A-C
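A runnable sketch of the pairwise counts, using the question's ten sample rows in an in-memory SQLite database via Python's sqlite3 module. The table name lookups and the output column names left_value and right_value are hypothetical. UNION ALL is used instead of UNION since the Combination label already keeps the three result sets distinct, and the grouping is done on the lookup columns only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lookups (PK INTEGER PRIMARY KEY,
                          LookupA INTEGER, LookupB INTEGER, LookupC INTEGER);
    INSERT INTO lookups VALUES
        (1, 1, 1, 3), (2, 1, 2, 3), (3, 1, 3, 2), (4, 2, 4, 2), (5, 4, 1, 1),
        (6, 3, 2, 1), (7, 2, 3, 3), (8, 4, 4, 3), (9, 4, 3, 2), (10, 1, 1, 2);
""")

pair_rows = conn.execute("""
    SELECT 'A-B' AS Combination, LookupA AS left_value, LookupB AS right_value,
           COUNT(*) AS NumRecords
    FROM lookups GROUP BY LookupA, LookupB
    UNION ALL
    SELECT 'A-C', LookupA, LookupC, COUNT(*)
    FROM lookups GROUP BY LookupA, LookupC
    UNION ALL
    SELECT 'B-C', LookupB, LookupC, COUNT(*)
    FROM lookups GROUP BY LookupB, LookupC
""").fetchall()
conn.close()

# Index the counts by (combination, left value, right value) for lookup.
pair_counts = {(combo, left, right): n for combo, left, right, n in pair_rows}
print(pair_counts[("A-B", 1, 1)], pair_counts[("A-C", 1, 3)])
```

Pairs that never occur together simply have no row, so a missing key in pair_counts corresponds to a 0 in the question's grid.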
If I understand correctly, your temp table contains foreign keys to other tables, so why not simply use joins? Something like this.
SELECT COUNT(DISTINCT a.id) as CountA
, COUNT(DISTINCT b.id) as CountB
-- , ...one COUNT(DISTINCT ...) per remaining lookup table
FROM #temp_table t
LEFT OUTER JOIN lookupA a on a.id = t.lookupA
LEFT OUTER JOIN lookupB b on b.id = t.lookupB
-- ...one LEFT OUTER JOIN per remaining lookup table
I would suggest reviewing the design if possible. Having so many small tables complicates things, is it not possible to consolidate this and just have one lookup table? You could have an additional field "LookupType" and all the lookups could be in the same place which would make retrieval much simpler.
I used a slight derivative of the statement below without any UNIONs to get me where I wanted to go.
/*
select 'A-B' as Combination, LookupA, LookupB, count(*) as NumRecords
from table
group by Combination, LookupA, LookupB
*/
I used a variable and a WHILE loop to place the various summaries where they need to be.

Best way to rank by column and aggregation on another column

I want to create a rank column using existing rank and binary columns. Suppose, for example, a table with ID, RISK, CONTACT, DATE. The existing rank is RISK, say 1, 2, 3, NULL, with 3 being the highest. The binary-valued column is CONTACT, with values 0/1 or FAILURE/SUCCESS. I want to create a new RANK that will order by RISK once a certain number of successful contacts has been reached.
For example, suppose the constraint is a minimum of 2 successful contacts. Then the rank should be created as follows in the two instances below:
Instance 1. Three IDs, all with a minimum of two successful contacts. In that case the rank mirrors the risk:
ID risk contact date rank
1 3 S 1 3
1 3 S 2 3
1 3 F 3 3
1 3 F 4 3
2 2 S 1 2
2 2 S 2 2
2 2 F 3 2
2 2 F 4 2
3 1 S 1 1
3 1 S 2 1
3 1 S 3 1
Instance 2. Suppose ID=1 has only one successful contact. In that case it is relegated to the lowest rank, rank=1, while ID=2 gets the highest value, rank=3, and ID=3 maps to rank=2 because it satisfies the constraint but has a lower risk value than ID=2:
ID risk contact date rank
1 3 S 1 1
1 3 F 2 1
1 3 F 3 1
1 3 F 4 1
2 2 S 1 3
2 2 S 2 3
2 2 F 3 3
2 2 F 4 3
3 1 S 1 2
3 1 S 2 2
3 1 S 3 2
This is SQL, specifically Hive. Thanks in advance.
Edit - I think Gordon Linoff's code does it correctly. In the end, I used three interim tables. The code looks like this:
First,
--numerize risk, contact
select A.* ,
case when A.risk = 'H' then 3
when A.risk = 'M' then 2
when A.risk = 'L' then 1
when A.risk is NULL then NULL
when A.risk = 'NULL' then NULL
else -999 end as RISK_RANK,
case when A.contact = 'Successful' then 1
else NULL end as success
from T as A
Second,
-- sum_successes_by_risk
select A.* ,
B.sum_successes_by_risk
from T as A
inner join
(select A.person, A.program, A.risk, sum(a.success) as sum_successes_by_risk
from T as A
group by A.person, A.program, A.risk
) as B
on A.program = B.program
and A.person = B.person
and A.risk = B.risk
Third,
--Create table that contains only max risk category
select A.* ,
B.max_risk_rank
from T as A
inner join
(select A.person, max(A.risk_rank) as max_risk_rank
from T as A
group by A.person
) as B
on A.person = B.person
and A.risk_rank = B.max_risk_rank
This is hard to follow, but I think you just want window functions:
select t.*,
(case when sum(case when contact = 'S' then 1 else 0 end) over (partition by id) >= 2
then risk
else 1
end) as new_risk
from t;
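The window-function expression can be tried on the question's Instance 2 data with the sketch below, which uses Python's sqlite3 module (window functions need SQLite 3.25 or newer, an assumption about the bundled library). The table name t follows the answer. Note that the expression reproduces the answer as written: an ID below the threshold falls back to 1 and the other IDs keep their original risk, so it does not re-rank the remaining IDs the way the question's Instance 2 table does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, risk INTEGER, contact TEXT, date INTEGER);
    INSERT INTO t VALUES
        (1, 3, 'S', 1), (1, 3, 'F', 2), (1, 3, 'F', 3), (1, 3, 'F', 4),
        (2, 2, 'S', 1), (2, 2, 'S', 2), (2, 2, 'F', 3), (2, 2, 'F', 4),
        (3, 1, 'S', 1), (3, 1, 'S', 2), (3, 1, 'S', 3);
""")

# Count successful contacts per id with a window sum, then keep the risk
# only for ids that reach the minimum of 2 successes.
rows = conn.execute("""
    SELECT id, risk, contact, date,
           CASE WHEN SUM(CASE WHEN contact = 'S' THEN 1 ELSE 0 END)
                     OVER (PARTITION BY id) >= 2
                THEN risk
                ELSE 1
           END AS new_risk
    FROM t
    ORDER BY id, date
""").fetchall()
conn.close()

# One (id, new_risk) pair per id, since new_risk is constant within an id.
new_risk_by_id = sorted({(r[0], r[4]) for r in rows})
print(new_risk_by_id)
```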

How to update one table based on aggregate query form another table

Say I have two tables.
Table A
id
A_status
parent_id_B
Table B
id
B_status
Each id in B can have many records in A.
My question: I need to set B_status to 1 when all child entries in Table A with the same parent_id_B have A_status = 1, and otherwise set B_status = 2.
Ex:
Table A:
id A_status parent_id_B
1 1 1
2 1 1
3 1 2
4 1 3
5 1 3
Table B:
id B_status
1 0
2 0
3 0
Expected result:
Table B:
id B_status
1 1
2 1
3 1
Now consider another scenario
Table A:
id A_status parent_id_B
1 1 1
2 1 1
3 2 2
4 2 3
5 1 3
Table B:
id B_status
1 0
2 0
3 0
Expected result:
Table B:
id B_status
1 1
2 2
3 2
I need this to work only on sqlite. Thanks
I believe this can be done like so:
UPDATE TableB
SET B_Status =
(SELECT MAX(A_Status) FROM TableA WHERE TableA.Parent_ID_B = TableB.ID);
SqlFiddle with your second case here
In a more general case (without relying on a direct mapping of A's status), you can also use a CASE ... WHEN in the mapping:
UPDATE TableB
SET B_Status =
CASE WHEN (SELECT MAX(A_Status)
FROM TableA
WHERE TableA.Parent_ID_B = TableB.ID) = 1
THEN 1
ELSE 2
END;
Edit (in the case where there are more than the original number of states):
I believe you'll need to determine 2 facts about each row, e.g.
Whether there are any rows in table A with a status other than 1 for each B
And there must at least be one row for the same B
Or, whether the count of rows of A in state 1 = the count of all rows in A for the B.
Here's the first option:
UPDATE TableB
SET B_Status =
CASE WHEN
EXISTS
(SELECT 1
FROM TableA
WHERE TableA.Parent_ID_B = TableB.ID
AND TableA.A_Status <> 1)
OR NOT EXISTS(SELECT 1
FROM TableA
WHERE TableA.Parent_ID_B = TableB.ID)
THEN 2
ELSE 1
END;
Updated Fiddle
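Since the question targets SQLite, the EXISTS / NOT EXISTS variant can be run directly with Python's sqlite3 module. The sketch below uses the question's second scenario and adds one hypothetical parent (id 4) with no child rows to exercise the NOT EXISTS branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, A_status INTEGER, parent_id_B INTEGER);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, B_status INTEGER);
    INSERT INTO TableA VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2), (4, 2, 3), (5, 1, 3);
    -- id 4 is a hypothetical parent with no child rows in TableA
    INSERT INTO TableB VALUES (1, 0), (2, 0), (3, 0), (4, 0);
""")

# B_status becomes 2 if any child has a status other than 1, or if
# there are no children at all; otherwise it becomes 1.
conn.execute("""
    UPDATE TableB
    SET B_status =
        CASE WHEN EXISTS (SELECT 1
                          FROM TableA
                          WHERE TableA.parent_id_B = TableB.id
                            AND TableA.A_status <> 1)
               OR NOT EXISTS (SELECT 1
                              FROM TableA
                              WHERE TableA.parent_id_B = TableB.id)
             THEN 2
             ELSE 1
        END
""")

b_status = conn.execute("SELECT id, B_status FROM TableB ORDER BY id").fetchall()
conn.close()
print(b_status)
```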