Join with other table timestamp and sum of the columns - sql

I'm trying to generate an aggregate table.
Lets say this is my tblA.
| type | name | timestap |
|------|------|---------------------|
| prod | t1 | 2020-06-01 01:00:00 |
| prod | t2 | 2020-06-01 01:00:02 |
| prod | t3 | 2020-06-01 01:00:03 |
| test | t4 | 2020-06-01 02:20:02 |
| test | t5 | 2020-06-01 02:20:03 |
and tblB
| tid | starttime | name | subtask | maintask |
|-----|---------------------|------|---------|----------|
| 1 | 2020-06-01 01:10:00 | t1 | 5 | 10 |
| 1 | 2020-06-01 01:10:00 | t1 | 6 | 10 |
| 1 | 2020-06-01 01:10:00 | t1 | 7 | 10 |
| 1 | 2020-06-01 01:10:00 | t1 | 8 | 10 |
| 2 | 2020-06-01 00:01:00 | t1 | 3 | 10 |
| 2 | 2020-05-01 00:02:00 | t1 | 5 | 15 |
| 4 | 2020-06-01 01:00:00 | t2 | 10 | 10 |
| 5 | 2020-06-01 11:00:10 | t2 | 10 | 20 |
| 5 | 2020-06-01 11:00:10 | t2 | 11 | 20 |
| 5 | 2020-06-01 11:00:10 | t2 | 12 | 20 |
Now I need to create a report table with the sum of subtask and main task. But there is where condition, we need to pick the tid,subtask, maintask where the starttime is greater than than the tblA's timestamp for each name.Then do the SUM.
Expected output:
| type | name | sum_of_subtask | sum_of_maintask | diff |
|------|------|----------------|-----------------|------|
| prod | t1 | 26 | 40 | 14 |
| prod | t2 | 33 | 60 | 27 |
For t1, the tid would be 1, because its starttime is > tblA.timestamp
for t2, the tid is 5, tblB.starttime > tblA.timestamp
Also the other condition the rows Im going to pick is the MAX(tid)
where starttime is > tblA.timestamp.
Then get the rows and do the sum find the difference between sum_of_subtask,sum_of_maintask on diff column.
I'm not sure how to write the logic for this.

You need simple join and sum for aggregation. here is the demo.
select
type,
ta.name,
sum(subtask) as sum_of_subtask,
sum(maintask) as sum_of_maintask,
sum(maintask - subtask) as diff
from tblA ta
join tblB tb
on ta.name = tb.name
where starttime > timestap
group by
type,
ta.name;
output:
| type | name | sum_of_subtask | sum_of_maintask | diff |
| ---- | ---- | -------------- | --------------- | ---- |
| prod | t1 | 26 | 40 | 14 |
| prod | t2 | 33 | 60 | 27 |

Related

mySQL : Populate a column with the result of a query

I have a table ORDERS with a column named cache_total_price, spent by each client.
+----+-----------+------------+-----------+------------------+
| id | client_id | date | reference | cache_total_price|
+----+-----------+------------+-----------+------------------+
| 1 | 20 | 2019-01-01 | 004214 | 0 |
| 2 | 3 | 2019-01-03 | 007120 | 0 |
| 3 | 11 | 2019-01-04 | 002957 | 0 |
| 4 | 6 | 2019-01-07 | 003425 | 0 |
I have another table ORDERS_REFS where there is the price total spent for each orders id
+-----+-------------+------------+----------+---------------+------------+
| id | order_id | name | quantity | unit_price | total_price|
+-----+-------------+------------+----------+---------------+------------+
| 1 | 1 | Produit 19 | 3 | 49.57 | 148.71 |
| 2 | 1 | Produit 92 | 4 | 81.24 | 324.96 |
| 3 | 1 | Produit 68 | 2 | 17.48 | 34.96 |
| 4 | 2 | Produit 53 | 4 | 83.69 | 334.76 |
| 5 | 2 | Produit 78 | 6 | 5.99 | 35.94 |
I want to had to column cache_total_price the result of my query :
select sum(total_price) from orders_refs group by order_id;
result :
+--------------------+
| sum(total_price) |
+--------------------+
| 508.6299819946289 |
| 370.700008392334 |
| 132.3699951171875 |
| 2090.1800079345703 |
I've tried some queries with Insert into select or update set, but didn't worked :(
If you're wanting to update all the orders at once, in MySQL a subquery like this should do the trick. You could always add a WHERE clause to do one at a time.
UPDATE ORDERS
SET cache_total_price = (
SELECT sum(total_price) from ORDERS_REF where order_id = ORDERS.id
)
update t
set t.cache_total_price=x.sum_total_price
from ORDERS as t
join
(
select order_id,sum(total_price)as sum_total_price
from orders_refs group by order_id
)x on t.id=x.order_id
In SQL Server you can try this one.

How to return the maximum and minimum values for specific ID SQL

Given the following SQL tables: https://imgur.com/a/NI8VrC7. For each specific ID_t I need to return the MAX() and MIN() value of Cena_c(total price) column of a given ID_t.
| ID_t | Nazwa |
| ---- | ----- |
| 1 | T1 |
| 2 | T2 |
| 3 | T3 |
| 4 | T4 |
| 5 | T5 |
| 6 | T6 |
| 7 | T7 |
| ID | ID_t | Ilosc | Cena_j | Cena_c | ID_p |
| ---- | ---- | ----- | ------ | ------ | ---- |
| 100 | 1 | 1 | 10 | 10 | 1 |
| 101 | 2 | 3 | 20 | 60 | 2 |
| 102 | 4 | 5 | 10 | 50 | 7 |
| 103 | 2 | 2 | 20 | 40 | 5 |
| 104 | 5 | 1 | 30 | 30 | 5 |
| 105 | 7 | 6 | 80 | 480 | 1 |
| 106 | 6 | 7 | 15 | 105 | 2 |
| 107 | 6 | 5 | 15 | 75 | 1 |
| 108 | 3 | 3 | 25 | 75 | 7 |
| 109 | 7 | 1 | 80 | 80 | 5 |
| 110 | 4 | 1 | 10 | 10 | 2 |
| 111 | 2 | 9 | 20 | 180 | 2 |
Based on provided tables the correct result should look like this:
| ID_t | Cena_c_max | Cena_c_min |
| ----- | ---------- | ---------- |
| T1 | 10 | 10 |
| T2 | 180 | 60 |
| T3 | 75 | 75 |
| T4 | 50 | 10 |
| T5 | 30 | 30 |
| T6 | 105 | 75 |
| T7 | 480 | 80 |
Is this even possible?
I haven't found anything yet that I could use to implement my solution.
SELECT concat('T',ID_t), max(Cena_c) as Cena_c_max, min(Cena_c) as Cena_c_min
FROM table
GROUP BY ID_t
Better is to solve it with joins of tables, because it will be avoided in the future if the prefix T is changed to another letter.
Hardcoding should be avoided.
select b.nazva as "Nazva", max(a.cena.c) as "Cena_c_max", min(a.cena.c) as "Cena_c_min"
from table1 as a
left join table2 as b on (
a.id_t = b.id_t
)
group by id_t

Incremental/Update in hive

I have a hive external table with data say, (version less than 0.14)
+--------+------+------+------+
| id | A | B | C |
+--------+------+------+------+
| 10011 | 10 | 3 | 0 |
| 10012 | 9 | 0 | 40 |
| 10015 | 10 | 3 | 0 |
| 10017 | 9 | 0 | 40 |
+--------+------+------+------+
And I have a delta file having data given below.
+--------+------+------+------+
| id | A | B | C |
+--------+------+------+------+
| 10012 | 50 | 3 | 10 | --> update
| 10013 | 29 | 0 | 40 | --> insert
| 10014 | 10 | 3 | 0 | --> update
| 10013 | 19 | 0 | 40 | --> update
| 10015 | 70 | 3 | 0 | --> update
| 10016 | 17 | 0 | 40 | --> insert
+--------+------+------+------+
How can I update my hive table with the delta file, without using sqoop. Any help on how to proceed will be great! Thanks.
This is because there is duplicates in the file. How do you know which you should keep? The last one?
In that case you can use, for example, the row_number and then get the maximum value. Something like that.
SELECT coalesce(tmp.id,initial.id) as id,
coalesce(tmp.A, initial.A) as A,
coalesce(tmp.B,initial.B) as B,
coalesce(tmp.C, initial.C) as C
FROM
table_a initial
FULL OUTER JOIN
( SELECT *, row_number() over( partition by id ) as row_num
,COUNT(*) OVER (PARTITION BY id) AS cnt
FROM temp_table
) tmp
ON initial.id=tmp.id
WHERE row_num=cnt
OR row_num IS NULL;
Output:
+--------+-----+----+-----+--+
| id | a | b | c |
+--------+-----+----+-----+--+
| 10011 | 10 | 3 | 0 |
| 10012 | 50 | 3 | 10 |
| 10013 | 19 | 0 | 40 |
| 10014 | 10 | 3 | 0 |
| 10015 | 70 | 3 | 0 |
| 10016 | 17 | 0 | 40 |
| 10017 | 9 | 0 | 40 |
+--------+-----+----+-----+--+
You can load the file to a temporary table in hive and then execute a FULL OUTER JOIN between the two tables.
Query Example:
SELECT coalesce(tmp.id,initial.id) as id,
coalesce(tmp.A, initial.A) as A,
coalesce(tmp.B,initial.B) as B,
coalesce(tmp.C, initial.C) as C
FROM
table_a initial
FULL OUTER JOIN
temp_table tmp on initial.id=tmp.id;
Output
+--------+-----+----+-----+--+
| id | a | b | c |
+--------+-----+----+-----+--+
| 10011 | 10 | 3 | 0 |
| 10012 | 50 | 3 | 10 |
| 10013 | 29 | 0 | 40 |
| 10013 | 19 | 0 | 40 |
| 10014 | 10 | 3 | 0 |
| 10015 | 70 | 3 | 0 |
| 10016 | 17 | 0 | 40 |
| 10017 | 9 | 0 | 40 |
+--------+-----+----+-----+--+

Sum data from two tables with different number of rows

There are 3 Tables (SorMaster, SorDetail, and InvWarehouse):
SorMaster:
+------------+
| SalesOrder |
+------------+
| 100 |
| 101 |
| 102 |
+------------+
SorDetail:
+------------+------------+---------------+
| SalesOrder | MStockCode | MBackOrderQty |
+------------+------------+---------------+
| 100 | PN-1 | 4 |
| 100 | PN-2 | 9 |
| 100 | PN-3 | 1 |
| 100 | PN-4 | 6 |
| 101 | PN-1 | 6 |
| 101 | PN-3 | 2 |
| 102 | PN-2 | 19 |
| 102 | PN-3 | 14 |
| 102 | PN-4 | 6 |
| 102 | PN-5 | 4 |
+------------+------------+---------------+
InvWarehouse:
+------------+-----------+-----------+
| MStockCode | Warehouse | QtyOnHand |
+------------+-----------+-----------+
| PN-1 | A | 1 |
| PN-2 | B | 9 |
| PN-3 | A | 0 |
| PN-4 | B | 1 |
| PN-1 | A | 0 |
| PN-3 | B | 5 |
| PN-2 | A | 9 |
| PN-3 | B | 4 |
| PN-4 | A | 6 |
| PN-5 | B | 0 |
+------------+-----------+-----------+
Desired Results:
+------------+-----------------+--------------+
| MStockCode | SumBackOrderQty | SumQtyOnHand |
+------------+-----------------+--------------+
| PN-1 | 10 | 10 |
| PN-2 | 28 | 1 |
| PN-3 | 17 | 5 |
| PN-4 | 12 | 13 |
| PN-5 | 11 | 6 |
+------------+-----------------+--------------+
I have been going around in circles with no end in sight. Seems like it should be simple but just can't wrap my head around it. The SumBackOrderQty obviously getting counted twice as the SumQtyOnHand is evaluated. To this point I have been doing the calculations in the PHP instead of the select statement but would like to clean things up a bit where possible.
Current query statement is:
SELECT SorDetail.MStockCode,
SUM(SorDetail.MBackOrderQty) AS 'SumMBackOrderQty',
SUM(InvWarehouse.QtyOnHand) AS 'SumQtyOnHand'
FROM SysproCompanyJ.dbo.SorMaster SorMaster,
SysproCompanyJ.dbo.SorDetail SorDetail LEFT OUTER JOIN SysproCompanyJ.dbo.InvWarehouse InvWarehouse
ON SorDetail.MStockCode = InvWarehouse.StockCode
WHERE SorMaster.SalesOrder = SorDetail.SalesOrder
AND SorMaster.ActiveFlag != 'N'
AND SorDetail.MBackOrderQty > '0'
AND SorDetail.MPrice > '0'
GROUP BY SorDetail.MStockCode
ORDER BY SorDetail.MStockCode ASC
Without providing the complete picture, in terms of your RDBMS, database schema, a description of the problem you're trying to solve and sample data that matches the aforementioned, the following is just an illustration of what a solution based on Barmar's comment could look like:
SELECT SD.MStockCode,
SD.SumBackOrderQty,
IW.SumQtyOnHand
FROM (SELECT MStockCode,
SUM(MBackOrderQty) AS `SumBackOrderQty`
FROM SorDetail
JOIN SorMaster ON SorDetail.SalesOrder=SorMaster.SalesOrder
WHERE SorMaster.ActiveFlag != 'N'
AND SorDetail.MBackOrderQty > 0
AND SorDetail.MPrice > 0
GROUP BY MStockCode) AS SD
LEFT JOIN (SELECT MStockCode,
SUM(QtyOnHand) AS `SumQtyOnHand`
FROM InvWarehouse
GROUP BY MStockCode) AS IW ON SD.MStockCode=IW.MStockCode
ORDER BY SD.MStockCode;
Here's one approach:
select MStockCode,
(select sum(MBackOrderQty) from sorDetail as T2
where T2.MStockCode = T1.MStockCode ) as SumBackOrderQty,
(select sum(QtyOnHand) from invWarehouse as T3
where T3.MStockCode = T1.MStockCode ) as SumQtyOnHand
from
(
select mstockcode from sorDetail
union
select mstockcode from invWarehouse
) as T1
In a fiddle here: http://sqlfiddle.com/#!9/fdaca/6
Though my SumQtyOnHand values don't match yours (as #Gordon pointed out).

Joining two tables and calculating divide-SUM from the resulting table in SQL Server

I have one table that looks like this:
+---------------+---------------+-----------+-------+------+
| id_instrument | id_data_label | Date | Value | Note |
+---------------+---------------+-----------+-------+------+
| 1 | 57 | 1.10.2010 | 200 | NULL |
| 1 | 57 | 2.10.2010 | 190 | NULL |
| 1 | 57 | 3.10.2010 | 202 | NULL |
| | | | | |
+---------------+---------------+-----------+-------+------+
And the other that looks like this:
+----------------+---------------+---------------+--------------+-------+-----------+------+
| id_fundamental | id_instrument | id_data_label | quarter_code | value | AnnDate | Note |
+----------------+---------------+---------------+--------------+-------+-----------+------+
| 1 | 1 | 20 | 20101 | 3 | 28.2.2010 | NULL |
| 2 | 1 | 20 | 20102 | 4 | 1.8.2010 | NULL |
| 3 | 1 | 20 | 20103 | 5 | 2.11.2010 | NULL |
| | | | | | | |
+----------------+---------------+---------------+--------------+-------+-----------+------+
What I would like to do is to merge/join these two tables in one in a way that I get something like this:
+------------+--------------+--------------+----------+--------------+
| Date | Table1.Value | Table2.Value | AnnDate | quarter_code |
+------------+--------------+--------------+----------+--------------+
| 1.10.2010. | 200 | 3 | 1.8.2010 | 20102 |
| 2.10.2010. | 190 | 3 | 1.8.2010 | 20102 |
| 3.10.2010. | 202 | 3 | 1.8.2010 | 20102 |
| | | | | |
+------------+--------------+--------------+----------+--------------+
So the idea is to order them by Date from Table1 and since Table2 Values only change on the change of AnnDate we populate the Resulting table with same values from Table2.
After that I would like to go through the resulting table and create another (Final table) with the following.
On Date 1.10.2010. take last 4 AnnDates (so it would be 1.8.2010. and f.e. 20.3.2010. 30.1.2010. 15.11.2009) and Table2 values on those AnnDate. Make SUM of those 4 values and then divide the Table1 Value with that SUM.
So we would get something like:
+-----------+---------------------------------------------------------------+
| Date | FinalValue |
+-----------+---------------------------------------------------------------+
| 1.10.2010 | 200/(Table2.Value on 1.8.2010+Table2.Value on 20.3.2010 +...) |
| | |
+-----------+---------------------------------------------------------------+
Is there any way this can be done?
EDIT:
Hmm yes now I see that I really didn't do a good job explaining it.
What I wanted to say is
I try INNER JOIN like this:
SELECT TableOne.Date, TableOne.Value, TableTwo.Value, TableTwo.AnnDate, TableTwo.quarter_code
FROM TableOne
INNER JOIN TableTwo ON TableOne.id_intrument=TableTwo.id_instrument WHERE TableOne.id_data_label = somevalue AND TableTwo.id_data_label = somevalue AND date > xxx AND date < yyy
And this inner join returns 2620*40 rows which means for every AnnDate from table2 it returns all Date from table1.
What I want is to return 2620 values with Dates from Table1
Values from table1 on that date and Values from table2 that respond to that period of dates
f.e.
Table1:
+-------+-------+
| Date | Value |
+-------+-------+
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
+-------+-------+
Table2
+-------+---------+
| Value | AnnDate |
+-------+---------+
| x | 1 |
| y | 4 |
+-------+---------+
Resulting table:
+-------+---------+---------+
| Date | ValueT1 | ValueT2 |
+-------+---------+---------+
| 1 | a | x |
| 2 | b | x |
| 3 | c | x |
| 4 | d | y |
+-------+---------+---------+
You need a JOIN statement for your first query. Try:
SELECT TableOne.Date, TableOne.Value, TableTwo.Value, TableTwo.AnnDate, TableTwo.quarter_code FROM TableOne
INNER JOIN TableTwo
ON TableOne.id_intrument=TableTwo.id_instrument;