Fill future date for null group by in SQL (Presto)

This might be easier than I'm thinking, but essentially want to fill in values that would be null for ID 2. Example below. Thanks.
Given Table:
|ID| food category | time |
|:--:|:----------:|:-------:|
|1 |italian | 2021-10-01|
|1 | indian | 2021-10-23|
|1 | american| 2021-10-05|
|1 | mexican | 2021-10-07|
|1 | Chinese | 2021-10-09|
|1 | vietnamese| 2021-10-11|
|1 | thai | 2021-10-12|
|1 | Moroccan| 2021-9-01|
|1 | russian | 2021-7-01|
|1 | korean | 2021-4-30|
|1 | canadian| 2021-7-01|
|2 |italian | 2020-10-11|
|2 | indian | 2021-04-23|
|2 | american| 2021-10-25|
|2 | mexican | 2021-10-27|
I'd like to transform the table above by grouping by ID and food category, but with the time for ID 2 replaced by a future date (date_add('year', 1, now())) wherever it would be null. Since ID 2 has no record for the food categories Chinese, Vietnamese, Thai, Moroccan, Russian, Korean, and Canadian, these would be null, but I'd like them to still appear in the grouped table with a date 1 year from now. Example of the desired result below. Thank you for the help.
Desired Table:
|ID| food category | time |
|:--:|:----------:|:-------:|
|1 |italian | 2021-10-01|
|1 | indian | 2021-10-23|
|1 | american| 2021-10-05|
|1 | mexican | 2021-10-07|
|1 | Chinese | 2021-10-09|
|1 | vietnamese| 2021-10-11|
|1 | thai | 2021-10-12|
|1 | Moroccan| 2021-9-01|
|1 | russian | 2021-7-01|
|1 | korean | 2021-4-30|
|1 | canadian| 2021-7-01|
|2 |italian | 2020-10-11|
|2 | indian | 2021-04-23|
|2 | american| 2021-10-25|
|2 | mexican | 2021-10-27|
|2 | Chinese | 2022-11-23|
|2 | vietnamese| 2022-11-23|
|2 | thai | 2022-11-23|
|2 | Moroccan| 2022-11-23|
|2 | russian | 2022-11-23|
|2 | korean | 2022-11-23|
|2 | canadian| 2022-11-23|

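You can use the following query (written in SQL Server syntax; in Presto, the equivalent of dateadd(year, 1, getdate()) is the date_add('year', 1, now()) expression from the question):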
SELECT COALESCE(t1.ID,t2.ID) as ID,
COALESCE(t1.foodcategory,t2.foodcategory) as foodcategory,
CAST(COALESCE(t2.time,dateadd(year, 1, getdate())) AS DATE) time
FROM
(SELECT *
FROM
(SELECT foodcategory
FROM testTB
GROUP BY foodcategory) t1
JOIN
(SELECT id
FROM testTB
GROUP BY id) t2 on 1=1) t1
LEFT JOIN testTB t2 on t1.ID = t2.ID and t1.foodcategory = t2.foodcategory
Or, with a CTE:
WITH cte AS (
select distinct foodcategory from testTB
)
SELECT t2.ID,t1.foodcategory,CAST(COALESCE(t3.time,dateadd(year, 1, getdate())) AS DATE) time
FROM cte t1
FULL OUTER JOIN (
select distinct [ID] from testTB
) t2 on 1=1
left join testTB t3 on t2.ID = t3.ID and t1.foodcategory = t3.foodcategory
order by t2.id
demo in db<>fiddle

Use a CTE to gather the list of food categories first. Then gather the list of IDs.
WITH cteCat AS (
select distinct [food category] from table
)
, cteID AS (
select distinct [ID] from table
)
SELECT id.[ID], cat.[food category],
COALESCE(t.[time], dateadd(year, 1, getdate())) as [time]
FROM cteCat cat
CROSS JOIN cteID id
LEFT OUTER JOIN table t
ON t.[ID] = id.[ID]
AND t.[food category] = cat.[food category]
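The pattern shared by these answers (build every ID/category pair, LEFT JOIN the real rows, then COALESCE the missing dates) can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module as a stand-in engine, not Presto. The table name `food`, the column name `food_category`, the trimmed sample rows, and the fixed fallback date are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical stand-in for the question's table (names and rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food (id INTEGER, food_category TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO food VALUES (?, ?, ?)",
    [
        (1, "italian", "2021-10-01"),
        (1, "thai", "2021-10-12"),
        (2, "italian", "2020-10-11"),
    ],
)

# A fixed fallback keeps the output reproducible. In Presto, the question's
# own expression date_add('year', 1, now()) would take its place.
fallback = "2022-11-23"

query = """
SELECT i.id, c.food_category, COALESCE(f.time, ?) AS time
FROM (SELECT DISTINCT id FROM food) AS i
CROSS JOIN (SELECT DISTINCT food_category FROM food) AS c
LEFT JOIN food AS f
       ON f.id = i.id AND f.food_category = c.food_category
ORDER BY i.id, c.food_category
"""
rows = conn.execute(query, (fallback,)).fetchall()
for row in rows:
    print(row)
```

ID 2 has no "thai" row in this sample, so that pair is the only one that receives the fallback date.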

Related

How to join two queries with count?

I have 3 tables in database:
Table 1: violation
| violation_id | violation_name |
|:-------------:|:--------------:|
|1 | No Parking |
|2 | Speed Contest |
|3 | No Helmet |
Table 2: violators
| violator_id | violation_id |
|:-------------:|:--------------:|
|1 |1 |
|2 |1 |
|3 |3 |
Table 3: previous_violator
| prev_violator_id| violation_id |
|:---------------:|:--------------:|
|1 |1 |
|2 |2 |
|3 |2 |
This view that I want:
| violation_name | Total |
|:-------------:|:--------------:|
|No Parking | 3 |
|Speed Contest | 2 |
|No Helmet | 1 |
I perform this code that joins the violator table and violation:
SELECT *,count(violators.violation_id) as vid
FROM violators
LEFT JOIN violation ON violation.violation_id = violators.violation_id
LEFT JOIN previous_violator ON previous_violator.violator_id = violators.violator_id
WHERE date_apphrehend BETWEEN '$from' AND '$to'
GROUP BY violators.violation_id
My problem is that I also want the rows in the previous_violator table counted toward the total for each violation_name.
You can first UNION ALL the two tables into a single result and then count(*) it, and finally join with violation to get the names. For example:
select violation_name, count(*) as cnt
from (select violation_id from violators
union all
select violation_id from previous_violator) tmp
inner join violation on tmp.violation_id = violation.violation_id
group by violation.violation_id, violation_name;
Sample DBFiddle demo.
PS: Sample is in postgreSQL but it would be the same for most backends. You didn't tag your backend.
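As a quick check of the UNION ALL approach against the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only an assumed stand-in, since the question does not name its backend. The table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE violation (violation_id INTEGER, violation_name TEXT);
CREATE TABLE violators (violator_id INTEGER, violation_id INTEGER);
CREATE TABLE previous_violator (prev_violator_id INTEGER, violation_id INTEGER);
INSERT INTO violation VALUES (1, 'No Parking'), (2, 'Speed Contest'), (3, 'No Helmet');
INSERT INTO violators VALUES (1, 1), (2, 1), (3, 3);
INSERT INTO previous_violator VALUES (1, 1), (2, 2), (3, 2);
""")

# Stack both violator tables, then count rows per violation.
query = """
SELECT v.violation_name, COUNT(*) AS total
FROM (
    SELECT violation_id FROM violators
    UNION ALL
    SELECT violation_id FROM previous_violator
) AS tmp
JOIN violation AS v ON v.violation_id = tmp.violation_id
GROUP BY v.violation_id, v.violation_name
ORDER BY v.violation_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

UNION ALL (rather than UNION) matters here: it keeps duplicate violation_id values, which are exactly what gets counted.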

Update a column value within a SELECT query

I have a complicated SQL question.
Can we update a column within a SELECT query? Example:
Consider this table:
|ID |SeenAt |
----------------
|1 |20 |
|1 |21 |
|1 |22 |
|2 |70 |
|2 |80 |
I want a SELECT query that gives, for each ID, when it was seen for the first time and when it was seen 'again':
|ID |Start |End |
---------------------
|1 |20 |21 |
|1 |20 |22 |
|1 |20 |22 |
|2 |70 |80 |
|2 |70 |80 |
At first, both columns Start and End would have the same value, but when a second row with the same ID is seen, we need to update its predecessor so that End takes the new SeenAt value.
I succeeded in creating the Start column by giving the minimum SeenAt value per ID to all rows with that ID. But I can't find a way to update the End column every time.
Don't mind the duplicates; I have other columns that change in every new row.
Also, I am working in Impala but I can use Oracle.
I hope that I have been clear enough. Thank you
You could use lead() and nvl():
select id, min(seenat) over (partition by id) seen_start,
nvl(lead(seenat) over (partition by id order by seenat), seenat) seen_end
from t
demo
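The lead() approach can be reproduced with a small sketch in Python's built-in sqlite3 module. This assumes the bundled SQLite is version 3.25 or newer, since older versions lack window functions. COALESCE stands in for nvl(), which SQLite does not provide, and the table name `t` is taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, seenat INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 20), (1, 21), (1, 22), (2, 70), (2, 80)],
)

# seen_start: earliest seenat per id.
# seen_end: the next seenat for the same id, or the row's own value on the last row.
query = """
SELECT id,
       MIN(seenat) OVER (PARTITION BY id) AS seen_start,
       COALESCE(LEAD(seenat) OVER (PARTITION BY id ORDER BY seenat), seenat) AS seen_end
FROM t
ORDER BY id, seenat
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```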
Start is easy: it is just the MIN of the group.
For End, you need to find the MIN SeenAt that comes after the current SeenAt, and if there is none, fall back to the current SeenAt.
SQL DEMO
SELECT "ID",
(SELECT MIN("SeenAt")
FROM Table1 t2
WHERE t1."ID" = t2."ID") as "Start",
COALESCE(
(SELECT MIN("SeenAt")
FROM Table1 t2
WHERE t1."ID" = t2."ID"
AND t1."SeenAt" < t2."SeenAt")
, t1."SeenAt"
) as "End"
FROM Table1 t1
OUTPUT
| ID | START | END |
|----|-------|-----|
| 1 | 20 | 21 |
| 1 | 20 | 22 |
| 1 | 20 | 22 |
| 2 | 70 | 80 |
| 2 | 70 | 80 |
You seem to need the min() analytic function with a self-join:
select distinct t1.ID,
min(t1.SeenAt) over (partition by t1.ID order by t1.ID) as "Start",
t2.SeenAt as "End"
from tab t1
join tab t2 on t1.ID=t2.ID and t1.SeenAt<=t2.SeenAt
order by t2.SeenAt;
Demo

Max value from joined table

I have two tables:
Operations (op_id,super,name,last)
Orders (or_id,number)
Operations:
+--------------------------------+
|op_id| super| name | last|
+--------------------------------+
|1 1 OperationXX 1 |
|2 1 OperationXY 2 |
|3 1 OperationXC 4 |
|4 1 OperationXZ 3 |
|5 2 OperationXX 1 |
|6 3 OperationXY 2 |
|7 4 OperationXC 1 |
|8 4 OperationXZ 2 |
+--------------------------------+
Orders:
+--------------+
|or_id | number|
+--------------+
|1 2UY |
|2 23X |
|3 xx2 |
|4 121 |
+--------------+
I need query to get table:
+-------------------------------------+
|or_id |number |max(last)| name |
|1 2UY 4 OperationXC|
|2 23X 1 OperationXX|
|3 xx2 2 OperationXY|
|4 121 2 OperationXZ|
+-------------------------------------+
Use a correlated subquery and a join:
select o.*,a.last,a.name from
(
select super,name,last from operations t
where last = (select max(last) from operations t2 where t2.super=t.super)
) a join orders o on a.super =o.or_id
You can use row_number as well:
with cte as
(
select * from
(
select * , row_number() over(partition by super order by last desc) rn
from operations
) tt where rn=1
) select o.*,cte.last,cte.name from Orders o join cte on o.or_id=cte.super
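A minimal sketch of the row_number() variant against the question's sample data, using Python's built-in sqlite3 module. This assumes SQLite 3.25+ for window functions and, as the answers do, that operations.super refers to orders.or_id. The column `last` is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE operations (op_id INTEGER, super INTEGER, name TEXT, "last" INTEGER);
CREATE TABLE orders (or_id INTEGER, number TEXT);
INSERT INTO operations VALUES
    (1, 1, 'OperationXX', 1), (2, 1, 'OperationXY', 2),
    (3, 1, 'OperationXC', 4), (4, 1, 'OperationXZ', 3),
    (5, 2, 'OperationXX', 1), (6, 3, 'OperationXY', 2),
    (7, 4, 'OperationXC', 1), (8, 4, 'OperationXZ', 2);
INSERT INTO orders VALUES (1, '2UY'), (2, '23X'), (3, 'xx2'), (4, '121');
""")

# Rank each order's operations by "last" descending and keep the top one.
query = """
WITH ranked AS (
    SELECT super, name, "last",
           ROW_NUMBER() OVER (PARTITION BY super ORDER BY "last" DESC) AS rn
    FROM operations
)
SELECT o.or_id, o.number, r."last" AS max_last, r.name
FROM orders AS o
JOIN ranked AS r ON r.super = o.or_id
WHERE r.rn = 1
ORDER BY o.or_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two operations of the same order tie on `last`, ROW_NUMBER() keeps an arbitrary one of them; the correlated-subquery version would return both.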
SELECT Orders.or_id, Orders.number, Operations.name, Operations.last AS max
FROM Orders
INNER JOIN Operations on Operations.super = Orders.or_id
GROUP BY Orders.or_id, Orders.number, Operations.name;
I don't have a way of testing this right now, but I think this is it.
Also, you didn't specify the foreign key, so the join might be wrong.

Better way of writing my SQL query with conditional group by

Here's my data
|vendorname |total|
---------------------
|Najla |10 |
|Disney |20 |
|Disney |10 |
|ToysRus |5 |
|ToysRus |1 |
|Gap |1 |
|Gap |2 |
|Gap |3 |
|Najla |2 |
Here's the resultset I want
|vendorname |grandtotal|
---------------------
|Disney |30 |
|Gap |6 |
|ToysRus |6 |
|Najla |2 |
|Najla |10 |
If the vendorname = 'Najla' I want individual rows with their respective total otherwise I would like to group them and return a sum of their totals.
This is my query--
select *
from
(
select vendorname, sum(total) grandtotal
from vendor
where vendorname<>'Najla'
group by vendorname
union all
select vendorname, total grandtotal
from vendor
where vendorname='Najla'
) A
I was wondering if there's a better way to write this query instead of repeating it twice and performing a union. Is there a condensed way to group some rows "conditionally".
Honestly, I think the union all version is going to be the best performing and easiest to read option if it has appropriate indexes.
You could, however, do something like this (assuming you have a unique id on your table):
select vendorname, sum(total) grandtotal
from t
group by
vendorname
, case when vendorname = 'Najla' then id else null end
rextester demo: http://rextester.com/OGZQ33364
returns
+------------+------------+
| vendorname | grandtotal |
+------------+------------+
| Disney | 30 |
| Gap | 6 |
| ToysRus | 6 |
| Najla | 10 |
| Najla | 2 |
+------------+------------+
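The conditional GROUP BY can be checked with a short sketch in Python's built-in sqlite3 module. The `id` primary key is the assumed unique column the answer mentions, and the table is called `vendor` as in the question's own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is the assumed unique column the answer relies on.
conn.execute(
    "CREATE TABLE vendor (id INTEGER PRIMARY KEY, vendorname TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO vendor (vendorname, total) VALUES (?, ?)",
    [
        ("Najla", 10), ("Disney", 20), ("Disney", 10),
        ("ToysRus", 5), ("ToysRus", 1),
        ("Gap", 1), ("Gap", 2), ("Gap", 3),
        ("Najla", 2),
    ],
)

# For 'Najla' the CASE yields the row's own id, so each row is its own group.
# For every other vendor it yields NULL, so rows collapse per vendorname.
query = """
SELECT vendorname, SUM(total) AS grandtotal
FROM vendor
GROUP BY vendorname,
         CASE WHEN vendorname = 'Najla' THEN id END
"""
rows = sorted(conn.execute(query).fetchall())
for row in rows:
    print(row)
```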

Complicated min/max multi-table query

I need to get the min and max score of group ids, but only if they are enabled:
cdu_group_sl: cdu_group_cc: cdu_group_ph:
-------------------- -------------------- --------------------
|id |name |enabled | |id |name |enabled | |id |name |enabled |
-------------------- -------------------- --------------------
|1 |sl_1 |1 | |1 |cc_1 |1 | |1 |ph_1 |0 |
|2 |sl_3 |1 | |2 |cc_2 |0 | |2 |ph_2 |1 |
|3 |sl_4 |1 | |3 |cc_3 |1 | |3 |ph_3 |1 |
-------------------- -------------------- --------------------
Scores are found in a separate table:
cdu_user_progress
----------------------------------
|id |group_type |group_id |score |
----------------------------------
|1 |sl |1 |50 |
|1 |cc |1 |10 |
|1 |ph |1 |20 |
|1 |sl |2 |80 |
|1 |sl |3 |20 |
|1 |cc |3 |30 |
|1 |sl |1 |40 |
|1 |ph |1 |50 |
|1 |cc |1 |40 |
|1 |ph |2 |90 |
----------------------------------
I need to get a max and min score for each type of group for only enabled groups (for each type):
---------------------------------------------
|group_type |group_id |min_score |max_score |
---------------------------------------------
|sl |1 |40 |50 |
|sl |2 |80 |80 |
|sl |3 |20 |20 |
|cc |1 |10 |40 |
|cc |3 |30 |30 |
|ph |1 |20 |50 |
|ph |2 |90 |90 |
---------------------------------------------
Any idea what the query might be??? So far I have:
SELECT * FROM cdu_user_progress
JOIN cdu_group_sl ON (cdu_group_sl.id = cdu_user_progress.group_id AND cdu_user_progress.group_type = 'sl')
JOIN cdu_group_cc ON (cdu_group_cc.id = cdu_user_progress.group_id AND cdu_user_progress.group_type = 'cc')
JOIN cdu_group_ph ON (cdu_group_ph.id = cdu_user_progress.group_id AND cdu_user_progress.group_type = 'ph')
WHERE cdu_user_progress.uid = $student->uid
AND (cdu_user_progress.group_type = 'sl' AND cdu_group_sl.enabled = 1)
AND (cdu_user_progress.group_type = 'cc' AND cdu_group_cc.enabled = 1)
AND (cdu_user_progress.group_type = 'ph' AND cdu_group_ph.enabled = 1)
Probably completely wrong...
What about using a union to pick the groups you are interested in? Something like:
select group_type, group_id, min(score) min_score, max(score) max_score
from (
select id, 'sl' grp from cdu_group_sl where enabled = 1
union all
select id, 'cc' from cdu_group_cc where enabled = 1
union all
select id, 'ph' from cdu_group_ph where enabled = 1
) grps join cdu_user_progress scr
on grps.id = scr.group_id and grps.grp = scr.group_type
group by scr.group_type, scr.group_id
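To see what the union approach returns on the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumed stand-in engine, and the `uid` filter from the asker's query is left out because the sample tables do not show that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cdu_group_sl (id INTEGER, name TEXT, enabled INTEGER);
CREATE TABLE cdu_group_cc (id INTEGER, name TEXT, enabled INTEGER);
CREATE TABLE cdu_group_ph (id INTEGER, name TEXT, enabled INTEGER);
CREATE TABLE cdu_user_progress (id INTEGER, group_type TEXT, group_id INTEGER, score INTEGER);
INSERT INTO cdu_group_sl VALUES (1, 'sl_1', 1), (2, 'sl_3', 1), (3, 'sl_4', 1);
INSERT INTO cdu_group_cc VALUES (1, 'cc_1', 1), (2, 'cc_2', 0), (3, 'cc_3', 1);
INSERT INTO cdu_group_ph VALUES (1, 'ph_1', 0), (2, 'ph_2', 1), (3, 'ph_3', 1);
INSERT INTO cdu_user_progress VALUES
    (1, 'sl', 1, 50), (1, 'cc', 1, 10), (1, 'ph', 1, 20), (1, 'sl', 2, 80),
    (1, 'sl', 3, 20), (1, 'cc', 3, 30), (1, 'sl', 1, 40), (1, 'ph', 1, 50),
    (1, 'cc', 1, 40), (1, 'ph', 2, 90);
""")

# Collect the enabled groups of all three types, then aggregate their scores.
query = """
SELECT scr.group_type, scr.group_id,
       MIN(scr.score) AS min_score, MAX(scr.score) AS max_score
FROM (
    SELECT id, 'sl' AS grp FROM cdu_group_sl WHERE enabled = 1
    UNION ALL
    SELECT id, 'cc' FROM cdu_group_cc WHERE enabled = 1
    UNION ALL
    SELECT id, 'ph' FROM cdu_group_ph WHERE enabled = 1
) AS grps
JOIN cdu_user_progress AS scr
  ON grps.id = scr.group_id AND grps.grp = scr.group_type
GROUP BY scr.group_type, scr.group_id
ORDER BY scr.group_type, scr.group_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data, ph group 1 does not appear because ph_1 has enabled = 0, and ph group 3 does not appear because it has no scores.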
The following is probably the fastest way to do this query. To optimize this, you should have an index on (id, enabled) on each of the three "sl", "cc", and "ph" tables:
select cup.*
from cdu_user_progress cup
where (cup.group_type = 'sl' and
exists (select 1
from cdu_group_sl sl
where sl.id = cup.group_id and
sl.enabled = 1
)
) or
(cup.group_type = 'cc' and
exists (select 1
from cdu_group_cc cc
where cc.id = cup.group_id and
cc.enabled = 1
)
) or
(cup.group_type = 'ph' and
exists (select 1
from cdu_group_ph ph
where ph.id = cup.group_id and
ph.enabled = 1
)
)
As a note, having three tables with the same structure is usually a sign of a poor database schema. These three tables should probably be combined into a single table, which would make this query much easier to write.
If you are just starting up this project, I would recommend refining your data structure. Based on what you showed, you could benefit from only one cdu_groups table with a reference to a new cdu_group_types table, and removing the group_type column from cdu_user_progress.
If this is an established project, where changing the structure would be too disruptive... then one of the other answers showing a query would be a better/easier fit.
Otherwise, you could simplify things with restructured tables and end up with a query like:
SELECT group_type,
group_id,
MIN(score) as min_score,
MAX(score) as max_score
FROM cdu_user_progress c
INNER JOIN cdu_groups g
ON c.group_id=g.id
INNER JOIN cdu_group_types t
ON g.group_type_id=t.id
WHERE enabled=1
GROUP BY group_type, group_id
This is shown, with expected results, in this SQLFiddle. With this structure you can add new group types as you want (and also cut down on amount of tables and joins). Tables would be (simplified in this code below, no FKs or anything):
CREATE TABLE cdu_user_progress
(id INT, group_id INT, score INT);
CREATE TABLE cdu_group_types
(id INT, group_type VARCHAR(3));
CREATE TABLE cdu_groups
(id INT, group_type_id INT, name VARCHAR(10), enabled BIT NOT NULL DEFAULT 1);
Granted moving data to a new structure may be a pain or not reasonable... but wanted to throw this out there as a possibility or just something to chew on.