DB2 JOIN, UNION, and pull max value from each group - sql

I am having a difficult time wrapping my head around the path for solving a problem in DB2. I have three tables that look like this...
PARENT
id | label
--------------
1 | One
2 | Two
3 | Three
TABLE1
id | parentid | eventdate
-------------------------
1 | 1 | 2015-11-01
2 | 1 | 2015-12-01
3 | 2 | 2015-10-01
4 | 2 | 2015-09-01
5 | 3 | 2015-08-01
6 | 3 | 2015-07-01
TABLE2
id | parentid | eventdate
-------------------------
1 | 1 | 2015-11-15
2 | 1 | 2015-12-15
3 | 2 | 2015-07-15
4 | 2 | 2015-09-15
5 | 3 | 2015-08-15
6 | 3 | 2015-05-15
Ultimately, I need to find the max date from either table for each parent id. My thought is to UNION two SELECTs, each being JOINed to PARENT, but I am at a complete loss as to how to only pull back a single row for each parent that consists of the max date from either TABLE1 or TABLE2 like this:
One: 2015-12-15
Two: 2015-10-01
Three: 2015-08-15
If anyone could offer some guidance I would be extremely grateful.

You are on the right track. Use a union in a subquery and then join PARENT and GROUP BY label to get the MAX date.
SELECT label, MAX(eventdate) AS maxeventdate FROM (
SELECT parentid, eventdate FROM TABLE1
UNION ALL
SELECT parentid, eventdate FROM TABLE2) AS u
JOIN PARENT ON (id = parentid)
GROUP BY label
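To check this approach against the sample data from the question, here is a minimal sketch that runs the same UNION ALL query through Python's built-in sqlite3 module. SQLite is only a stand-in for DB2 here, the dates are stored as ISO text (an assumption that keeps MAX comparisons correct), and the subquery alias `u` is an added name.

```python
import sqlite3

# In-memory SQLite stands in for DB2; the query itself is plain SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE table1 (id INTEGER, parentid INTEGER, eventdate TEXT);
    CREATE TABLE table2 (id INTEGER, parentid INTEGER, eventdate TEXT);
    INSERT INTO parent VALUES (1, 'One'), (2, 'Two'), (3, 'Three');
    INSERT INTO table1 VALUES
        (1, 1, '2015-11-01'), (2, 1, '2015-12-01'),
        (3, 2, '2015-10-01'), (4, 2, '2015-09-01'),
        (5, 3, '2015-08-01'), (6, 3, '2015-07-01');
    INSERT INTO table2 VALUES
        (1, 1, '2015-11-15'), (2, 1, '2015-12-15'),
        (3, 2, '2015-07-15'), (4, 2, '2015-09-15'),
        (5, 3, '2015-08-15'), (6, 3, '2015-05-15');
""")

# Stack both event tables, then take the latest date per parent.
rows = conn.execute("""
    SELECT p.label, MAX(u.eventdate) AS maxeventdate
    FROM (
        SELECT parentid, eventdate FROM table1
        UNION ALL
        SELECT parentid, eventdate FROM table2
    ) AS u
    JOIN parent p ON p.id = u.parentid
    GROUP BY p.id, p.label
    ORDER BY p.id
""").fetchall()

for label, maxeventdate in rows:
    print(f"{label}: {maxeventdate}")
```

With the sample rows this prints the three lines the question asks for (One: 2015-12-15, Two: 2015-10-01, Three: 2015-08-15).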

SELECT p.label,
CASE WHEN max(t1.eventdate) > max(t2.eventdate)
THEN max(t1.eventdate)
ELSE max(t2.eventdate)
END as eventdate
FROM PARENT p
JOIN TABLE1 t1
ON p.id = t1.parentid
JOIN TABLE2 t2
ON p.id = t2.parentid
GROUP BY p.label

One method is to use union all followed by aggregation. The following does this with a twist, which is to pre-aggregate the results on each table:
select p.label, max(maxed) as max_eventdate
from ((select parentid, max(eventdate) as maxed
from table1
group by parentid
) union all
(select parentid, max(eventdate)
from table2
group by parentid
)
) t12 join
parent p
on t12.parentid = p.id
group by p.label;

There's a function GREATEST() just for this purpose, so you can adjust the solution proposed by @JuanCarlosOropeza:
SELECT p.label, GREATEST(max(t1.eventdate), max(t2.eventdate)) eventdate
FROM PARENT p
JOIN TABLE1 t1 ON p.id = t1.parentid
JOIN TABLE2 t2 ON p.id = t2.parentid
GROUP BY p.label
You may want to use LEFT JOIN if a parent may have events in only one of the two event tables. In that case, check how GREATEST treats a NULL argument, and wrap the arguments in COALESCE if needed.
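The LEFT JOIN case is worth trying out, because a parent with events in only one table produces a NULL on the other side. The sketch below uses Python's sqlite3 module, where the two-argument scalar max() plays the role of GREATEST (SQLite has no GREATEST, and whether DB2's GREATEST handles NULL the same way is an assumption to verify against the DB2 documentation). The data is a hypothetical cut-down of the question's tables in which parent 2 has no rows in TABLE2, and the aliases `m1` and `m2` are added names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE table1 (id INTEGER, parentid INTEGER, eventdate TEXT);
    CREATE TABLE table2 (id INTEGER, parentid INTEGER, eventdate TEXT);
    INSERT INTO parent VALUES (1, 'One'), (2, 'Two');
    INSERT INTO table1 VALUES (1, 1, '2015-11-01'), (2, 2, '2015-10-01');
    -- Hypothetical gap: parent 2 has no rows in table2.
    INSERT INTO table2 VALUES (1, 1, '2015-12-15');
""")

# naive:   scalar max() of the two per-table maxima (NULL if either is NULL)
# guarded: each side falls back to the other side before comparing
rows = conn.execute("""
    SELECT p.label,
           MAX(m1.maxed, m2.maxed) AS naive,
           MAX(COALESCE(m1.maxed, m2.maxed),
               COALESCE(m2.maxed, m1.maxed)) AS guarded
    FROM parent p
    LEFT JOIN (SELECT parentid, MAX(eventdate) AS maxed
               FROM table1 GROUP BY parentid) m1 ON m1.parentid = p.id
    LEFT JOIN (SELECT parentid, MAX(eventdate) AS maxed
               FROM table2 GROUP BY parentid) m2 ON m2.parentid = p.id
    ORDER BY p.id
""").fetchall()

for label, naive, guarded in rows:
    print(label, naive, guarded)
```

In SQLite the unguarded column comes back as NULL for parent Two, while the COALESCE version still reports 2015-10-01.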

Related

How to do an outer join with full result between two tables

I have two tables:
TABLE1
id_attr
-------
1
2
3
TABLE2
id | id_attr | val
----------------------
10 | 1 | A
10 | 2 | B
As a result I want a table that show:
RESULT
id | id_attr | val
----------------------
10 | 1 | A
10 | 2 | B
10 | 3 | NULL
So I want the row with id=10 and id_attr=3 even though id_attr=3 is missing from TABLE2 (I can tell it is missing because the val column of RESULT holds NULL, or some other marker).
NB: TABLE2 could contain other ids. For example, after inserting the row {11,1,A} into TABLE2, I want this RESULT:
id | id_attr | val
----------------------
10 | 1 | A
10 | 2 | B
10 | 3 | NULL
11 | 1 | A
11 | 2 | NULL
11 | 3 | NULL
So, for every id, I always want one row for each id_attr.
Your specific example only has one id, so you can use the following:
select t2.id, t2.id_attr, t2.val
from table2 t2
union all
select 10, t1.id_attr, NULL
from table1 t1
where not exists (select 1 from table2 t2 where t2.id_attr = t1.id_attr);
EDIT:
You can get all combinations of attributes and ids in the following way. Use a cross join to create all the rows you want and then a left join to bring in the data you want:
select i.id, t1.id_attr, t2.val
from (select distinct id from table2) i cross join
table1 t1 left join
table2 t2
on t2.id = i.id and t2.id_attr = t1.id_attr;
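As a quick check of the cross join plus left join idea, the following sketch loads the question's rows (including the extra {11,1,A} row) into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption here, since the question does not name a database, and the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id_attr INTEGER);
    CREATE TABLE table2 (id INTEGER, id_attr INTEGER, val TEXT);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (10, 1, 'A'), (10, 2, 'B'), (11, 1, 'A');
""")

# Build every (id, id_attr) pair first, then look up val where it exists.
rows = conn.execute("""
    SELECT i.id, t1.id_attr, t2.val
    FROM (SELECT DISTINCT id FROM table2) i
    CROSS JOIN table1 t1
    LEFT JOIN table2 t2
           ON t2.id = i.id AND t2.id_attr = t1.id_attr
    ORDER BY i.id, t1.id_attr
""").fetchall()

for row in rows:
    print(row)
```

Each id gets three rows, one per id_attr, and val is None (NULL) wherever TABLE2 has no matching row.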
It sounds like you want to do just an outer join on id_attr instead of id.
select * from table2 t2
left outer join table1 t1 on t2.id_attr = t1.id_attr;

SQL query for two tables, ignoring resultsets from second table

I want to query two tables so that the result set shows every row from the first table together with the linked information from the second table. A row in the first table can have many linked rows, and I only want the most recent one. For example:
table 1
id_t1 | number | type
1 | 555 | file
2 | 666 | img
table 2
id_t2 | id_table1_fk | date_in | description
1 | 1 | 04/07 | aaaaaaa
2 | 1 | 05/07 | bbbbbbb
query
id_t1 | number | type | date_in | description
1 | 555 | file | 05/07 | bbbbbbb
2 | 666 | img | null | null
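Try this (the correlated subquery sits in the join condition, and the LEFT JOIN keeps table 1 rows that have no linked row):
SELECT
t1.*,
new_t2.date_in,
new_t2.description
FROM
table1 t1
LEFT JOIN table2 new_t2
ON new_t2.id_t2 = (
SELECT id_t2
FROM table2
WHERE id_table1_fk = t1.id_t1
ORDER BY id_t2 DESC
LIMIT 1
)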
select table1.*, table2.date_in, table2.description
from table1 left outer join table2 on table1.id_t1 = table2.id_table1_fk
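To see the difference between a plain left join and a join restricted to the latest linked row, here is a small sketch using Python's sqlite3 module with the sample rows from the question. It assumes that the highest id_t2 marks the most recent linked row (the question does not say how "last" is defined), and SQLite itself is an assumption since no database is named.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id_t1 INTEGER, number INTEGER, type TEXT);
    CREATE TABLE table2 (id_t2 INTEGER, id_table1_fk INTEGER,
                         date_in TEXT, description TEXT);
    INSERT INTO table1 VALUES (1, 555, 'file'), (2, 666, 'img');
    INSERT INTO table2 VALUES (1, 1, '04/07', 'aaaaaaa'),
                              (2, 1, '05/07', 'bbbbbbb');
""")

# Plain left join: one output row per linked row, so id_t1 = 1 appears twice.
all_links = conn.execute("""
    SELECT t1.id_t1, t2.description
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.id_t1 = t2.id_table1_fk
    ORDER BY t1.id_t1, t2.id_t2
""").fetchall()

# Latest link only: join to the single row holding the highest id_t2.
# Assumption: a higher id_t2 means a more recent linked row.
latest = conn.execute("""
    SELECT t1.id_t1, t1.number, t1.type, t2.date_in, t2.description
    FROM table1 t1
    LEFT JOIN table2 t2
           ON t2.id_t2 = (SELECT MAX(x.id_t2)
                          FROM table2 x
                          WHERE x.id_table1_fk = t1.id_t1)
    ORDER BY t1.id_t1
""").fetchall()

print(len(all_links), "rows from the plain left join")
for row in latest:
    print(row)
```

The plain join yields three rows, while the restricted join yields exactly one row per table 1 row, with None (NULL) in the linked columns for id_t1 = 2.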

Select row with max value saving distinct column

I have values
- id|type|Status|Comment
- 1 | P | 1 | AAA
- 2 | P | 2 | BBB
- 3 | P | 3 | CCC
- 4 | S | 1 | DDD
- 5 | S | 2 | EEE
I want to get, for each type, the row with the max status, including the comment from that same row:
- id|type|Status|Comment
- 3 | P | 3 | CCC
- 5 | S | 2 | EEE
All the existing questions on SO do not care about the right correspondence of Max type and value.
This gives you one row per type, the one with the max status:
select * from (
select your_table.*, row_number() over(partition by type order by Status desc) as rn from your_table
) tt
where rn = 1
Corrected: The below will use a subquery to figure out each type and what the max status is, then it joins that onto the original table and uses the where clause to only select those rows where the status equals the max status. Of note, if you have multiple records with the same max status, you will get both of them to come up.
WITH T1 AS (SELECT type, MAX(STATUS) AS max_status FROM table_name GROUP BY type)
SELECT t2.id, t2.type, t2.status, t2.comment
FROM T1 LEFT JOIN table_name t2 ON t2.type= T1.type
WHERE t2.status = T1.max_status
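The note about ties can be checked with a short sketch. It runs the join-on-max query through Python's sqlite3 module against the question's rows plus one hypothetical extra row (id 6, type S, status 2, comment FFF) that ties with row 5. SQLite and the table name `table_name` are assumptions carried over for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (id INTEGER, type TEXT, status INTEGER, comment TEXT);
    INSERT INTO table_name VALUES
        (1, 'P', 1, 'AAA'), (2, 'P', 2, 'BBB'), (3, 'P', 3, 'CCC'),
        (4, 'S', 1, 'DDD'), (5, 'S', 2, 'EEE'),
        (6, 'S', 2, 'FFF');  -- hypothetical row that ties with id 5
""")

# Find the max status per type, then keep the rows that carry that status.
rows = conn.execute("""
    WITH t1 AS (
        SELECT type, MAX(status) AS max_status
        FROM table_name
        GROUP BY type
    )
    SELECT t2.id, t2.type, t2.status, t2.comment
    FROM t1
    JOIN table_name t2 ON t2.type = t1.type
    WHERE t2.status = t1.max_status
    ORDER BY t2.id
""").fetchall()

for row in rows:
    print(row)
```

Type S comes back twice (ids 5 and 6) because both rows hold the maximum status; a ROW_NUMBER() based query would keep only one of them.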

Creating sql view where the select is dependent on the value of two columns

I want to create a view in my database, based on these three tables:
Among rows in table3 that share the same value in Count, I would like to select the one with the highest value in Weight.
Then I want them grouped by Category_ID and ordered by Date, so that if two rows in table3 are otherwise identical, I get the newest.
Let me give you an example:
Table1
ID | Date | UserId
1 | 2015-01-01 | 1
2 | 2015-01-02 | 1
Table2
ID | table1_ID | Category_ID
1 | 1 | 1
2 | 2 | 1
Table3
ID | table2_ID | Count | Weight
1 | 1 | 5 | 10
2 | 1 | 5 | 20 <-- count is 5 and weight is highest
3 | 1 | 3 | 40
4 | 2 | 5 | 10
5 | 2 | 3 | 40 <-- newest of the two equal rows
Then the result should be row 2 and 5 from table 3.
PS I'm doing this in mssql.
PPS I'm sorry if the title is not appropriate, but I did not know how to formulate a good one.
SELECT
*
FROM
(
SELECT
t3.*
,RANK() OVER (PARTITION BY [Count] ORDER BY [Weight] DESC, Date DESC) highest
FROM TABLE3 t3
INNER JOIN TABLE2 t2 ON t2.Id = t3.Table2_Id
INNER JOIN TABLE1 t1 ON t1.Id = t2.Table1_Id
) t
WHERE t.Highest = 1
This partitions the rows by Count, so rows with the same Count compete with each other, and then ranks them by Weight, highest first. If two or more of them share the highest weight, the one with the most recent date comes first.
You can use the RANK() analytic function here: give the rows a rank and then choose the first rank for each ID.
Something like
select *
from
(select
ID, table2_ID, Count, Weight,
RANK() OVER (PARTITION BY ID ORDER BY Count, Weight DESC) as Highest
from table3)
where Highest = 1;
This is Oracle syntax; if you are not using Oracle, look up the equivalent for your database, which should be almost the same.

SQL Group By Issue with same item ID

I am trying to track the total number of sales a rep has along with the amount of time he was clocked into work.
I have the following two tables:
table1:
employeeID | item | price | timeID
----------------------------------------
1 | 1 | 12.92 | 123
1 | 2 | 10.00 | 123
1 | 2 | 10.00 | 456
table2:
ID | minutes_in_shift
--------------------------
123 | 45
456 | 15
I would join these two tables with the following SQL:
SELECT
t1.employeeID, t1.item, t1.price, t1.timeID, t2.minutes_in_shift
FROM table1 t1
JOIN table2 t2 ON (t2.ID = t1.timeID)
Which would return the following table:
employeeID | item | price | timeID | minutes_in_shift
---------------------------------------------------
1 | 1 | 12.92 | 123 | 45
1 | 2 | 10.00 | 123 | 45
1 | 2 | 10.00 | 456 | 15
However, I would like the consolidated results to look like this:
employeeID | itemsSold | priceTotals | totaltimeworked
-----------------------------------------------------------------
1 | 3 | 32.92 | 60
I could use COUNT and SUM for the items and price but I cannot figure out how to properly show the total time worked in the manner it appears above.
Note: I am only having trouble with calculating the time worked. In shift 123 - employee 1 was working 45 minutes, regardless of how many items he sold.
Any suggestions?
If you wish to use the sample data as they are you will need to extract the shifts and sum the minutes, like this:
with a as (
select employeeID, count(*) itemsSold, sum(price) priceTotals
from Sampletable1
group by employeeID),
b as (
select employeeID, shiftID, max(minutes_in_shift) minutes_in_shift
from Sampletable1
group by employeeID, shiftID),
c as (
select employeeID, sum(minutes_in_shift) totaltimeworked
from b
group by employeeID)
select a.employeeID, a.itemsSold, a.priceTotals, c.totaltimeworked
from a inner join c on a.employeeID = c.employeeID
However, with your existing tables the select statement will be much easier:
with a as (
select employeeID, timeID, count(*) itemsSold, sum(price) priceTotals
from table1
group by employeeID, timeID)
select a.employeeID, sum(a.itemsSold), sum(a.priceTotals), sum(table2.minutes_in_shift) totaltimeworked
from a inner join table2 on a.timeID = table2.ID
group by a.employeeID
I think this query should do what you want:
SELECT t1.employeeID,
count(t1.item) AS itemsSold,
sum(t1.price) AS priceTotals,
sum(DISTINCT t2.minutes_in_shift) AS totaltimeworked
FROM table1 t1
JOIN table2 t2 ON (t2.ID = t1.timeID)
GROUP BY t1.employeeID;
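One thing to watch with SUM(DISTINCT ...) is that it removes duplicate values, not duplicate shifts, so two different shifts of equal length count only once. The sketch below illustrates this with Python's sqlite3 module. It uses the question's rows plus a hypothetical third shift (timeID 789, also 45 minutes) and compares the DISTINCT query with a version that aggregates per shift first; the CTE name `per_shift` is an added name and SQLite is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (employeeID INTEGER, item INTEGER, price REAL, timeID INTEGER);
    CREATE TABLE table2 (ID INTEGER, minutes_in_shift INTEGER);
    INSERT INTO table1 VALUES
        (1, 1, 12.92, 123), (1, 2, 10.00, 123), (1, 2, 10.00, 456),
        (1, 1, 12.92, 789);   -- hypothetical sale in a third shift
    INSERT INTO table2 VALUES
        (123, 45), (456, 15),
        (789, 45);            -- hypothetical shift, same length as shift 123
""")

# SUM(DISTINCT ...) collapses the two 45-minute shifts into one.
with_distinct = conn.execute("""
    SELECT t1.employeeID,
           COUNT(t1.item) AS itemsSold,
           SUM(DISTINCT t2.minutes_in_shift) AS totaltimeworked
    FROM table1 t1
    JOIN table2 t2 ON t2.ID = t1.timeID
    GROUP BY t1.employeeID
""").fetchall()

# Aggregating per shift first counts every shift exactly once.
per_shift_first = conn.execute("""
    WITH per_shift AS (
        SELECT employeeID, timeID, COUNT(*) AS itemsSold
        FROM table1
        GROUP BY employeeID, timeID
    )
    SELECT a.employeeID,
           SUM(a.itemsSold) AS itemsSold,
           SUM(t2.minutes_in_shift) AS totaltimeworked
    FROM per_shift a
    JOIN table2 t2 ON t2.ID = a.timeID
    GROUP BY a.employeeID
""").fetchall()

print("SUM(DISTINCT):  ", with_distinct)
print("per shift first:", per_shift_first)
```

With the extra shift, the DISTINCT version reports 60 minutes while the per-shift version reports 105, so the DISTINCT shortcut is only safe when no two shifts of an employee have the same length.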