SQL Search for missing record, then insert value

Below is a very simplified version of the problem I am trying to solve.
I have the following tables:
quiz:
id title
1 first
2 second
3 third
4 fourth
5 fifth
quiz_status:
id status user_id quiz_id
1 0 1 1
2 0 1 2
3 0 1 3
If I run the following:
select *
from quiz as q
left join quiz_status as qs
ON q.id = qs.quiz_id
AND qs.user_id = 1
I'd get:
id title id status user_id quiz_id
1 first 1 0 1 1
2 second 2 0 1 2
3 third 3 0 1 3
4 fourth null null null null
5 fifth null null null null
I would like to insert rows into the quiz_status table wherever they are missing (the null rows above),
so that the final outcome would be:
id title id status user_id quiz_id
1 first 1 0 1 1
2 second 2 0 1 2
3 third 3 0 1 3
4 fourth 4 0 1 4
5 fifth 5 0 1 5
What would be the insert statement for that?

Consider the insert ... select syntax:
insert into quiz_status(status, user_id, quiz_id)
select 0, u.user_id, q.id
from (select distinct user_id from quiz_status) u
cross join quiz q
left join quiz_status qz on q.id = qz.quiz_id and u.user_id = qz.user_id
where qz.quiz_id is null
This works by generating all combinations of users and quizzes, and then left joining the status table to keep only the combinations that have no status record yet. In real life, you would likely have a users table that you can use in place of the select distinct subquery.
If you need just one user it's simpler:
insert into quiz_status(status, user_id, quiz_id)
select 0, 1, q.id
from quiz q
left join quiz_status qz on q.id = qz.quiz_id and qz.user_id = 1
where qz.quiz_id is null
Note: presumably, id is a serial (auto-generated) column, so I left it out of the inserts.
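
To try the single-user insert against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption (the question does not name a database), and the INTEGER PRIMARY KEY column stands in for the serial id mentioned in the note.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE quiz (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE quiz_status (
        id INTEGER PRIMARY KEY,  -- assigned automatically when omitted
        status INTEGER, user_id INTEGER, quiz_id INTEGER
    );
    INSERT INTO quiz (id, title) VALUES
        (1, 'first'), (2, 'second'), (3, 'third'), (4, 'fourth'), (5, 'fifth');
    INSERT INTO quiz_status (status, user_id, quiz_id) VALUES
        (0, 1, 1), (0, 1, 2), (0, 1, 3);
""")

# Add a status row for every quiz that user 1 has no status row for yet.
conn.execute("""
    INSERT INTO quiz_status (status, user_id, quiz_id)
    SELECT 0, 1, q.id
    FROM quiz q
    LEFT JOIN quiz_status qz ON q.id = qz.quiz_id AND qz.user_id = 1
    WHERE qz.quiz_id IS NULL
""")

rows = conn.execute(
    "SELECT status, user_id, quiz_id FROM quiz_status ORDER BY quiz_id"
).fetchall()
for row in rows:
    print(row)
```

Because the left join only keeps quizzes without a matching status row, running the same insert a second time should add nothing.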


How to check the count of each value repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
(and so on)
These are both of my tables.
I want to find how often each user has each status_id.
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate count but in a single row without mentioning which status_id each column represents
This is called a pivot, and it works in two steps (here t1 is the users table and t2 is the status table):
it extracts the rows for each specific status_id value using a CASE expression, where ELSE 0 keeps the count at 0 rather than NULL when a user has no rows for that status
it aggregates the data per user, so that every status count lies on the same record for each user
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t1
LEFT JOIN t2
ON t2.user_id = t1.ID
GROUP BY Username
ORDER BY Username
Note: This solution assumes that there are exactly 3 status_id values. If you need to generalize to an arbitrary number of status ids, you would need a dynamic query. In any case, it's better to avoid dynamic queries if you can.
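
The dynamic-query idea from the note can be sketched as follows, assuming SQLite through Python's sqlite3 module (the question does not name a database). The table names t1 and t2 follow the answer's query, and the rows are only the ones listed in the question. The query text is generated from the status ids found in the data, so a new status id would get its own column without editing the SQL by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE t2 (user_id INTEGER, status_id INTEGER);
    INSERT INTO t1 VALUES (1, 'Dan'), (2, 'Eli'), (3, 'Sean'), (4, 'John');
    INSERT INTO t2 VALUES
        (1, 2), (1, 3), (4, 1), (3, 2), (2, 3), (1, 1), (3, 3), (3, 3), (3, 3);
""")

# Status ids present in the data; int() keeps the generated SQL text safe.
status_ids = [int(row[0]) for row in conn.execute(
    "SELECT DISTINCT status_id FROM t2 ORDER BY status_id")]

# One conditional SUM per status id, exactly as in the hand-written pivot.
count_columns = ",\n       ".join(
    f"SUM(CASE WHEN t2.status_id = {s} THEN 1 ELSE 0 END) AS status_id_{s}"
    for s in status_ids
)
query = f"""
SELECT t1.username,
       {count_columns}
FROM t1
LEFT JOIN t2 ON t2.user_id = t1.id
GROUP BY t1.username
ORDER BY t1.username
"""

pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

The trade-off is the one the note hints at: the column list is no longer visible in the query text, so any code that consumes the result has to discover the columns at run time.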

List rows by count when some rows refer to another row

I'm attempting to write a query that will list items from Table 2 in order of how many rows reference it in Table 1. The difficulty is that some rows in Table 2 are actually based on another row in Table 2, in which case the row it refers to should be counted instead.
My structure looks like this:
Table 1
itemID templateID
1 1
2 2
3 3
4 4
5 5
Table 2
templateName templateID basedOnTemplateID
Foo 1 null
Bar 2 null
Tree 3 1
Dog 4 2
Bird 5 null
Desired Results
templateName templateID itemCount
Foo 1 2
Bar 2 2
Bird 5 1
Tree 3 0
Dog 4 0
What I have so far:
SELECT
Max(table2.templateName) 'templateName',
Max(table2.templateID) 'templateID',
Count([itemID]) 'itemCount'
FROM table1
JOIN table2 on table1.templateid = (
CASE
WHEN table2.basedOnTemplateID is not null
THEN table2.basedOnTemplateID
ELSE table2.templateID
END)
GROUP BY table2.templateid
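You can left join and aggregate as follows. The self-join on table2 maps every template to its base template, so items that use a derived template are counted for the template it is based on, and derived templates themselves get 0:
select t2.templateName, t2.templateID,
count(t1.templateID) itemCount
from table2 t2
left join table2 src
on coalesce(src.basedOnTemplateID, src.templateID) = t2.templateID
left join table1 t1
on t1.templateID = src.templateID
group by t2.templateName, t2.templateID
order by itemCount desc, t2.templateID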
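
To check the counting logic against the sample data, here is a minimal sketch using Python's sqlite3 module (SQLite is an assumption, since the question's bracket syntax suggests another database). It resolves each template to its base template with COALESCE(basedOnTemplateID, templateID) through a self-join on table2; the alias src is a hypothetical name chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (itemID INTEGER, templateID INTEGER);
    CREATE TABLE table2 (
        templateName TEXT, templateID INTEGER, basedOnTemplateID INTEGER
    );
    INSERT INTO table1 VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
    INSERT INTO table2 VALUES
        ('Foo', 1, NULL), ('Bar', 2, NULL), ('Tree', 3, 1),
        ('Dog', 4, 2), ('Bird', 5, NULL);
""")

rows = conn.execute("""
    SELECT t2.templateName, t2.templateID, COUNT(t1.itemID) AS itemCount
    FROM table2 t2
    -- src: every template whose effective (base) template is t2
    LEFT JOIN table2 src
           ON COALESCE(src.basedOnTemplateID, src.templateID) = t2.templateID
    -- items that reference any of those templates
    LEFT JOIN table1 t1
           ON t1.templateID = src.templateID
    GROUP BY t2.templateName, t2.templateID
    ORDER BY itemCount DESC, t2.templateID
""").fetchall()

for row in rows:
    print(row)
```

This handles one level of basedOnTemplateID only; a template based on another derived template would need a recursive query, which is outside this sketch.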

How to add extra value to select query result

Id Subject
1 English
2 History
3 Maths
Id Userid SubjectId
1 1 1
2 1 3
3 2 2
Id Userid SubjectId Examdate Percentage
1 1 1 02/20/2020 50
2 1 0 Null Null
3 2 1 02/20/2020 70
4 2 2 02/20/2020 60
5 3 0 Null Null
6 4 3 02/18/2020 56
These are my sample tables.
I want to show the records from the logs table whose subject id is 0, as well as the records for all subjects assigned to user 1.
Suppose user 1 has two subjects, 1 and 3.
Then show the records from logs where subjectid is 0, 1, or 3.
Required Output :
Id Userid SubjectId Examdate Percentage
1 1 1 02/20/2020 50
2 1 0 Null Null
3 2 1 02/20/2020 70
4 3 0 Null Null
5 4 3 02/18/2020 56
Query :
select * from logs where rdatetime >= '' and subjectid in (select id from subjectmaster where userid = 1)
Using 'or' did not work; it gave the wrong output. How should I handle this?
If I understand correctly, you want a correlated subquery and condition like this:
select l.*
from logs l
where l.subjectid = 0 or
exists (select 1
from subjectmaster sm
where sm.userid = l.userid and
sm.subjectid = l.subjectid);
You can also use a left join:
select l.*
from logs l left join
subjectmaster sm
on sm.userid = l.userid and
sm.subjectid = l.subjectid
where not (l.subjectid <> 0 and sm.subjectid is null);
This query will give you the desired result:
select *
from logs l
where subjectid = 0
or subjectid IN (select subjectid
from UserSubjectAssociation
where Userid = 1)
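
The last query can be checked against the sample rows with a minimal sketch in Python's sqlite3 module. SQLite is an assumption, the table name UserSubjectAssociation follows that answer, and the dates are stored as plain text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserSubjectAssociation (
        Id INTEGER, Userid INTEGER, SubjectId INTEGER
    );
    CREATE TABLE logs (
        Id INTEGER, Userid INTEGER, SubjectId INTEGER,
        Examdate TEXT, Percentage INTEGER
    );
    INSERT INTO UserSubjectAssociation VALUES (1, 1, 1), (2, 1, 3), (3, 2, 2);
    INSERT INTO logs VALUES
        (1, 1, 1, '02/20/2020', 50), (2, 1, 0, NULL, NULL),
        (3, 2, 1, '02/20/2020', 70), (4, 2, 2, '02/20/2020', 60),
        (5, 3, 0, NULL, NULL),       (6, 4, 3, '02/18/2020', 56);
""")

# Keep log rows with subject 0, plus rows for any subject assigned to the user.
rows = conn.execute("""
    SELECT *
    FROM logs l
    WHERE l.SubjectId = 0
       OR l.SubjectId IN (SELECT SubjectId
                          FROM UserSubjectAssociation
                          WHERE Userid = ?)
    ORDER BY l.Id
""", (1,)).fetchall()

kept_ids = [row[0] for row in rows]
print(kept_ids)
```

If this filter is combined with another condition through AND, such as the date filter in the question, the whole OR part needs its own parentheses, because AND binds more tightly than OR.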

Best way to rank by column and aggregation on another column

I want to create a rank column using an existing rank column and a binary column. Suppose for example a table with ID, RISK, CONTACT, DATE. The existing rank is RISK, say 1,2,3,NULL, with 3 being the highest. The binary-valued column is CONTACT, with values 0/1 or FAILURE/SUCCESS. I want to create a new RANK that orders by RISK only once a certain number of successful contacts has been reached.
For example, suppose the constraint is a minimum of 2 successful contacts. Then the rank should be created as follows in the two instances below:
Instance 1. Three IDs, all with at least two successful contacts. In that case the rank mirrors the risk:
ID risk contact date rank
1 3 S 1 3
1 3 S 2 3
1 3 F 3 3
1 3 F 4 3
2 2 S 1 2
2 2 S 2 2
2 2 F 3 2
2 2 F 4 2
3 1 S 1 1
3 1 S 2 1
3 1 S 3 1
Instance 2. Suppose ID=1 has only one successful contact. In that case it is relegated to the lowest rank, rank=1, while ID=2 gets the highest value, rank=3, and ID=3 maps to rank=2 because it satisfies the constraint but has a lower risk value than ID=2:
ID risk contact date rank
1 3 S 1 1
1 3 F 2 1
1 3 F 3 1
1 3 F 4 1
2 2 S 1 3
2 2 S 2 3
2 2 F 3 3
2 2 F 4 3
3 1 S 1 2
3 1 S 2 2
3 1 S 3 2
This is SQL, specifically Hive. Thanks in advance.
Edit - I think Gordon Linoff's code does it correctly. In the end, I used three interim tables. The code looks like this:
--numerize risk, contact
select A.* ,
case when A.risk = 'H' then 3
when A.risk = 'M' then 2
when A.risk = 'L' then 1
when A.risk is NULL then NULL
when A.risk = 'NULL' then NULL
else -999 end as RISK_RANK,
case when A.contact = 'Successful' then 1
else NULL end as success
from T as A
-- sum_successes_by_risk
select A.*, B.sum_successes_by_risk
from T as A
inner join
(select A.person, A.program, A.risk, sum(a.success) as sum_successes_by_risk
from T as A
group by A.person, A.program, A.risk
) as B
on A.program = B.program
and A.person = B.person
and A.risk = B.risk
--Create table that contains only max risk category
select A.*, B.max_risk_rank
from T as A
inner join
(select A.person, max(A.risk_rank) as max_risk_rank
from T as A
group by A.person
) as B
on A.person = B.person
and A.risk_rank = B.max_risk_rank
This is hard to follow, but I think you just want window functions:
select t.*,
(case when sum(case when contact = 'S' then 1 else 0 end) over (partition by id) >= 2
then risk
else 1
end) as new_risk
from t;
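
To see what the window-function query returns on the Instance 2 data, here is a minimal sketch using Python's sqlite3 module. SQLite stands in for Hive here, which is an assumption, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, risk INTEGER, contact TEXT, date INTEGER);
    INSERT INTO t VALUES
        (1, 3, 'S', 1), (1, 3, 'F', 2), (1, 3, 'F', 3), (1, 3, 'F', 4),
        (2, 2, 'S', 1), (2, 2, 'S', 2), (2, 2, 'F', 3), (2, 2, 'F', 4),
        (3, 1, 'S', 1), (3, 1, 'S', 2), (3, 1, 'S', 3);
""")

# Count successful contacts per id with a window SUM, then keep the risk
# only for ids that reach the minimum of 2; all other ids fall back to 1.
rows = conn.execute("""
    SELECT t.id, t.date,
           CASE WHEN SUM(CASE WHEN contact = 'S' THEN 1 ELSE 0 END)
                     OVER (PARTITION BY id) >= 2
                THEN risk
                ELSE 1
           END AS new_risk
    FROM t
    ORDER BY t.id, t.date
""").fetchall()

new_risk_by_id = {row[0]: row[2] for row in rows}
print(new_risk_by_id)
```

The window SUM repeats the per-id total on every row, so no join back to an aggregated table is needed. The result keeps the original risk values for qualifying ids; it does not renumber them into consecutive ranks as the Instance 2 table does.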

How to apply a single query that sum column for individual values

I have 2 tables named user and statistics
user table has 3 columns: id, name and category
statistics table has 3 columns: id, idUser (relational), cal
something like this:
Id name category
1 name1 1
2 name2 2
3 name3 3
Id idUser cal
1 1 1
2 1 1
3 1 1
4 2 1
5 2 1
How can I write a query that sums the cal column for each category of users and gives me something like this:
category totalcal
1 3
2 2
3 0
You want to do a left join to keep all the categories. The rest is just aggregation:
select u.category, coalesce(sum(s.cal), 0) as cal
from users u left join
statistics s
on u.id = s.idUser
group by u.category;
Use LEFT JOIN so that category=3 still appears in the result:
SELECT user.category, SUM(statistics.cal) AS totalcal
FROM user
LEFT JOIN statistics ON statistics.idUser = user.Id
GROUP BY user.category
Here SUM would return NULL for category=3. To get 0 instead of NULL you can use COALESCE(SUM(statistics.cal), 0).
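
The difference between the plain SUM and the COALESCE version can be seen in a minimal sketch with Python's sqlite3 module (SQLite is an assumption; the table and column names follow the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (Id INTEGER PRIMARY KEY, name TEXT, category INTEGER);
    CREATE TABLE statistics (Id INTEGER PRIMARY KEY, idUser INTEGER, cal INTEGER);
    INSERT INTO user VALUES (1, 'name1', 1), (2, 'name2', 2), (3, 'name3', 3);
    INSERT INTO statistics VALUES
        (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 2, 1), (5, 2, 1);
""")

base_query = """
    SELECT user.category, {total} AS totalcal
    FROM user
    LEFT JOIN statistics ON statistics.idUser = user.Id
    GROUP BY user.category
    ORDER BY user.category
"""

# Plain SUM: a category with no statistics rows sums to NULL (None in Python).
raw_totals = conn.execute(
    base_query.format(total="SUM(statistics.cal)")).fetchall()

# COALESCE turns that NULL into 0.
totals = conn.execute(
    base_query.format(total="COALESCE(SUM(statistics.cal), 0)")).fetchall()

print(raw_totals)
print(totals)
```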