Table: Project Details
+-----+------------------+------------+--------------+------------+
| GPN | EmployeePosition | Project.No | ChargedHours | PayPerHour |
+-----+------------------+------------+--------------+------------+
| 2 | B | 101 | 50 | 57 |
| 3 | C | 100 | 75 | 44 |
| 4 | D | 100 | 100 | 24.75 |
| 5 | E | 103 | 125 | 19.25 |
| 6 | F | 101 | 150 | 16 |
| 7 | C | 100 | 175 | 44 |
+-----+------------------+------------+--------------+------------+
I need to find the total pay for each project. So first I have to compute the total pay per employee, and then group it by Project.No.
The table below shows the total pay per employee, which is calculated from two existing columns (ChargedHours and PayPerHour):
+-----+-------------+---------+------------+----------+----------------+
| GPN | EmpPosition | Proj.No | ChargedHrs | PayPerHr | TotalPayPerEmp |
+-----+-------------+---------+------------+----------+----------------+
| 2 | B | 101 | 50 | 57 | 993.75 |
| 3 | C | 100 | 75 | 44 | 2850 |
| 4 | D | 100 | 100 | 24.75 | 3300 |
| 5 | E | 103 | 125 | 19.25 | 2406.25 |
| 6 | F | 101 | 150 | 16 | 2400 |
| 7 | C | 100 | 175 | 44 | 7700 |
+-----+-------------+---------+------------+----------+----------------+
My Query:
Select EngNumber, SUM([CharHrs])[SumOfChargedHours], Levell, CostPH,
SUM([CharHrs])*CostPH [TotalPayPerEmployee]
FROM data1.dbo.PayedPerHour
GROUP BY EngNumber, Levell, TotalPayPerEmployee, CostPH
ORDER BY EngNumber;
Update data1.dbo.PayedPerHour
SET CostPH = CASE Levell
WHEN 'Associate Director' THEN '79.75'
WHEN 'Senior Manager' THEN '57'
WHEN 'Manager' THEN '44'
WHEN 'Senior' THEN '24.75'
WHEN 'Staff 2, 3 & 4' THEN '19.25'
WHEN 'Staff 1' THEN '16'
ELSE 'NULL'
END
WHERE Levell IN('Associate Director', 'Senior Manager','Manager', 'Senior',
'Staff 2, 3 & 4', 'Staff 1');
I want to group TotalPayPerEmp by Proj.No, but I can't get it to work.
I have probably made some silly mistakes in the query since I'm very new to SQL, so please excuse them.
Expected table:
+---------+--------------------+
| Proj.No | TotalPayPerProject |
+---------+--------------------+
| 100 | 14093.75 |
| 101 | 5250 |
| 103 | 4881.25 |
+---------+--------------------+
I think this can be done with the same calculation you already use, but aggregated at the ProjectNo level:
SELECT ProjectNo
,SUM(ChargedHours*PayPerHour) [TotalPayPerProject]
FROM ProjectDetails
GROUP BY ProjectNo
This gives output:
ProjectNo TotalPayPerProject
100 13475
101 5250
103 2406.25
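This differs from your expected output for projects 100 and 103. The likely reason is that the TotalPayPerEmp values in your second table are not all equal to ChargedHrs * PayPerHr (for example, 50 * 57 = 2850, not 993.75).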
Here's a SQL fiddle: http://sqlfiddle.com/#!6/21a33/2/0
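The per-project total can also be checked locally without the fiddle. The following is only a sketch: it loads the question's six rows into an in-memory SQLite database through Python's sqlite3 module and runs the same aggregation. SQLite, the REAL column types, and the ProjectNo spelling are assumptions here; the original runs on SQL Server.

```python
import sqlite3

# Sample rows copied from the question's Project Details table.
# Columns: GPN, EmployeePosition, ProjectNo, ChargedHours, PayPerHour
project_details = [
    (2, "B", 101, 50, 57),
    (3, "C", 100, 75, 44),
    (4, "D", 100, 100, 24.75),
    (5, "E", 103, 125, 19.25),
    (6, "F", 101, 150, 16),
    (7, "C", 100, 175, 44),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProjectDetails ("
    "GPN INTEGER, EmployeePosition TEXT, ProjectNo INTEGER, "
    "ChargedHours REAL, PayPerHour REAL)"
)
conn.executemany(
    "INSERT INTO ProjectDetails VALUES (?, ?, ?, ?, ?)", project_details
)

# Multiply per row first, then sum the products within each project.
totals = conn.execute(
    "SELECT ProjectNo, SUM(ChargedHours * PayPerHour) AS TotalPayPerProject "
    "FROM ProjectDetails "
    "GROUP BY ProjectNo "
    "ORDER BY ProjectNo"
).fetchall()

for project_no, total in totals:
    print(project_no, total)
```

The multiplication happens per row before the sum, so no intermediate per-employee table is needed.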
TABLE 2 : trip_delivery_sales_lines
+-------+---------------------+------------+----------+------------+-------------+--------+
| Sl no | Order_date          | Partner_id | Route_id | Product_id | Product qty | amount |
+-------+---------------------+------------+----------+------------+-------------+--------+
| 1     | 2020-08-01 04:25:35 | 34567      | 152      | 432        | 2           | 100    |
| 2     | 2021-09-11 02:25:35 | 34572      | 130      | 312        | 4           | 150    |
| 3     | 2020-05-10 04:25:35 | 34567      | 152      | 432        | 3           | 123    |
| 4     | 2021-02-16 01:10:35 | 34572      | 130      | 432        | 5           | 123    |
| 5     | 2020-02-19 01:10:35 | 34567      | 152      | 432        | 2           | 600    |
| 6     | 2021-03-20 01:10:35 | 34569      | 152      | 123        | 1           | 123    |
| 7     | 2021-04-23 01:10:35 | 34570      | 152      | 432        | 4           | 200    |
| 8     | 2021-07-08 01:10:35 | 34567      | 152      | 432        | 3           | 32     |
| 9     | 2019-06-28 01:10:35 | 34570      | 152      | 432        | 2           | 100    |
| 10    | 2018-11-14 01:10:35 | 34570      | 152      | 432        | 5           | 20     |
+-------+---------------------+------------+----------+------------+-------------+--------+
From table 2 we have to find the partners in route 152 and, for each of them, the sum of product_qty over the last 2 sales (these can be selected by ordering on order_date descending). The result is shown in table 3.
The rows used for each partner are:
34567 - serial numbers [1, 8]
34570 - serial numbers [7, 9]
34569 - serial number [6]
TABLE 3 : RESULT OBTAINED FROM TABLE 1,2
+------------+-------+
| Partner_id | count |
+------------+-------+
| 34567 | 5 |
| 34569 | 1 |
| 34570 | 6 |
+------------+-------+
From table 4 we want to find the leaf count for each of the above partner_ids.
TABLE 4 :coupon_leaf
+------------+-------+
| Partner_id | Leaf |
+------------+-------+
| 34567 | XYZ1 |
| 34569 | XYZ2 |
| 34569 | DDHC |
| 34567 | DVDV |
| 34570 | DVFDV |
| 34576 | FVFV |
| 34567 | FVV |
+------------+-------+
From that we can find the result as:
34567 - 3
34569 - 2
34570 - 1
TABLE 5: result obtained from TABLE 4
+------------+-------+
| Partner_id | count |
+------------+-------+
| 34567 | 3 |
| 34569 | 2 |
| 34570 | 1 |
+------------+-------+
Now we want to compare tables 3 and 5:
If partner_id count [table 3] > partner_id count [table 5]
Print partner_id
I want a single query that does all of these operations.
The distinct partner_id values can be found from table 1 with:
SELECT DISTINCT partner_id
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
GROUP BY ts.partner_id
This answers the original version of the problem.
You seem to want to compare totals after aggregating tables 2 and 3. I don't know what table1 is for. It doesn't seem to do anything.
So:
select *
from (select tdsl.partner_id, sum(tdsl.product_qty) as sum_quantity
      from (select tdsl.*,
                   -- number each partner's sales from newest to oldest
                   row_number() over (partition by tdsl.partner_id order by tdsl.order_date desc) as seqnum
            from trip_delivery_sales_lines tdsl
            where tdsl.route_id = 152
           ) tdsl
      where seqnum <= 2  -- keep only the last two sales per partner
      group by tdsl.partner_id
     ) tdsl left join
     (select cl.partner_id, count(*) as leaf_cnt
      from coupon_leaf cl
      group by cl.partner_id
     ) cl
     on cl.partner_id = tdsl.partner_id
-- partners with no leaves at all are also returned
where leaf_cnt is null or sum_quantity > leaf_cnt
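To try this approach against the sample data from the question, here is a rough sketch using Python's sqlite3 module. Several things are assumptions: the snake_case column names (sl_no, product_qty), the newest-first ordering and the route_id = 152 filter (both taken from the question's description), and an SQLite build of 3.25 or later, which is needed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trip_delivery_sales_lines (
    sl_no INTEGER, order_date TEXT, partner_id INTEGER,
    route_id INTEGER, product_id INTEGER, product_qty INTEGER, amount INTEGER
);
CREATE TABLE coupon_leaf (partner_id INTEGER, leaf TEXT);
""")

# Rows copied from tables 2 and 4 of the question.
conn.executemany(
    "INSERT INTO trip_delivery_sales_lines VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2020-08-01 04:25:35", 34567, 152, 432, 2, 100),
        (2, "2021-09-11 02:25:35", 34572, 130, 312, 4, 150),
        (3, "2020-05-10 04:25:35", 34567, 152, 432, 3, 123),
        (4, "2021-02-16 01:10:35", 34572, 130, 432, 5, 123),
        (5, "2020-02-19 01:10:35", 34567, 152, 432, 2, 600),
        (6, "2021-03-20 01:10:35", 34569, 152, 123, 1, 123),
        (7, "2021-04-23 01:10:35", 34570, 152, 432, 4, 200),
        (8, "2021-07-08 01:10:35", 34567, 152, 432, 3, 32),
        (9, "2019-06-28 01:10:35", 34570, 152, 432, 2, 100),
        (10, "2018-11-14 01:10:35", 34570, 152, 432, 5, 20),
    ],
)
conn.executemany(
    "INSERT INTO coupon_leaf VALUES (?, ?)",
    [
        (34567, "XYZ1"), (34569, "XYZ2"), (34569, "DDHC"), (34567, "DVDV"),
        (34570, "DVFDV"), (34576, "FVFV"), (34567, "FVV"),
    ],
)

query = """
SELECT s.partner_id
FROM (
    -- sum the quantity of the two most recent sales per partner on route 152
    SELECT partner_id, SUM(product_qty) AS sum_quantity
    FROM (
        SELECT partner_id, product_qty,
               ROW_NUMBER() OVER (
                   PARTITION BY partner_id ORDER BY order_date DESC
               ) AS seqnum
        FROM trip_delivery_sales_lines
        WHERE route_id = 152
    ) AS ranked
    WHERE seqnum <= 2
    GROUP BY partner_id
) AS s
LEFT JOIN (
    -- number of coupon leaves per partner
    SELECT partner_id, COUNT(*) AS leaf_cnt
    FROM coupon_leaf
    GROUP BY partner_id
) AS cl ON cl.partner_id = s.partner_id
WHERE cl.leaf_cnt IS NULL OR s.sum_quantity > cl.leaf_cnt
ORDER BY s.partner_id
"""

partners = [row[0] for row in conn.execute(query)]
print(partners)
```

With the sample rows, the quantity sums are 5, 1 and 6 and the leaf counts are 3, 2 and 1, so only the partners whose quantity sum exceeds their leaf count are listed.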
Given the following SQL tables: https://imgur.com/a/NI8VrC7. For each specific ID_t I need to return the MAX() and MIN() value of Cena_c(total price) column of a given ID_t.
| ID_t | Nazwa |
| ---- | ----- |
| 1 | T1 |
| 2 | T2 |
| 3 | T3 |
| 4 | T4 |
| 5 | T5 |
| 6 | T6 |
| 7 | T7 |
| ID | ID_t | Ilosc | Cena_j | Cena_c | ID_p |
| ---- | ---- | ----- | ------ | ------ | ---- |
| 100 | 1 | 1 | 10 | 10 | 1 |
| 101 | 2 | 3 | 20 | 60 | 2 |
| 102 | 4 | 5 | 10 | 50 | 7 |
| 103 | 2 | 2 | 20 | 40 | 5 |
| 104 | 5 | 1 | 30 | 30 | 5 |
| 105 | 7 | 6 | 80 | 480 | 1 |
| 106 | 6 | 7 | 15 | 105 | 2 |
| 107 | 6 | 5 | 15 | 75 | 1 |
| 108 | 3 | 3 | 25 | 75 | 7 |
| 109 | 7 | 1 | 80 | 80 | 5 |
| 110 | 4 | 1 | 10 | 10 | 2 |
| 111 | 2 | 9 | 20 | 180 | 2 |
Based on provided tables the correct result should look like this:
| ID_t | Cena_c_max | Cena_c_min |
| ----- | ---------- | ---------- |
| T1 | 10 | 10 |
| T2 | 180 | 60 |
| T3 | 75 | 75 |
| T4 | 50 | 10 |
| T5 | 30 | 30 |
| T6 | 105 | 75 |
| T7 | 480 | 80 |
Is this even possible?
I haven't found anything yet that I could use to implement my solution.
SELECT concat('T',ID_t), max(Cena_c) as Cena_c_max, min(Cena_c) as Cena_c_min
FROM table
GROUP BY ID_t
It is better to solve this by joining the two tables, so that the query keeps working if the prefix T is later changed to another letter.
Hardcoding should be avoided.
-- table1 holds the order rows (ID_t, Cena_c); table2 holds the names (ID_t, Nazwa)
select b.Nazwa as "Nazwa", max(a.Cena_c) as "Cena_c_max", min(a.Cena_c) as "Cena_c_min"
from table1 as a
left join table2 as b on (
a.ID_t = b.ID_t
)
group by b.Nazwa
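As a quick check of the join-based approach, the sketch below loads the question's two tables into an in-memory SQLite database via Python's sqlite3 module. The table names table1 (order rows) and table2 (names) are placeholders, and SQLite itself is an assumption, since the question does not say which database is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table2 (ID_t INTEGER, Nazwa TEXT);
CREATE TABLE table1 (
    ID INTEGER, ID_t INTEGER, Ilosc INTEGER,
    Cena_j INTEGER, Cena_c INTEGER, ID_p INTEGER
);
""")

# Rows copied from the question.
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(i, f"T{i}") for i in range(1, 8)],
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)",
    [
        (100, 1, 1, 10, 10, 1), (101, 2, 3, 20, 60, 2),
        (102, 4, 5, 10, 50, 7), (103, 2, 2, 20, 40, 5),
        (104, 5, 1, 30, 30, 5), (105, 7, 6, 80, 480, 1),
        (106, 6, 7, 15, 105, 2), (107, 6, 5, 15, 75, 1),
        (108, 3, 3, 25, 75, 7), (109, 7, 1, 80, 80, 5),
        (110, 4, 1, 10, 10, 2), (111, 2, 9, 20, 180, 2),
    ],
)

# The name comes from table2, so no prefix is hardcoded in the query.
price_range = conn.execute(
    "SELECT b.Nazwa, MAX(a.Cena_c) AS Cena_c_max, MIN(a.Cena_c) AS Cena_c_min "
    "FROM table1 AS a "
    "LEFT JOIN table2 AS b ON a.ID_t = b.ID_t "
    "GROUP BY b.Nazwa "
    "ORDER BY b.Nazwa"
).fetchall()

for name, cena_max, cena_min in price_range:
    print(name, cena_max, cena_min)
```

Note that for T2 the minimum comes out as 40 (row 103), not 60 as in the question's expected table; the other rows match.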
I have a Hive external table (Hive version lower than 0.14) with data like this:
+--------+------+------+------+
| id | A | B | C |
+--------+------+------+------+
| 10011 | 10 | 3 | 0 |
| 10012 | 9 | 0 | 40 |
| 10015 | 10 | 3 | 0 |
| 10017 | 9 | 0 | 40 |
+--------+------+------+------+
And I have a delta file having data given below.
+--------+------+------+------+
| id | A | B | C |
+--------+------+------+------+
| 10012 | 50 | 3 | 10 | --> update
| 10013 | 29 | 0 | 40 | --> insert
| 10014 | 10 | 3 | 0 | --> update
| 10013 | 19 | 0 | 40 | --> update
| 10015 | 70 | 3 | 0 | --> update
| 10016 | 17 | 0 | 40 | --> insert
+--------+------+------+------+
How can I update my Hive table with the delta file, without using Sqoop? Any help on how to proceed would be great! Thanks.
This is because there are duplicates in the file. How do you know which one you should keep? The last one?
In that case you can use, for example, row_number and then keep only the row with the maximum row number for each id. Something like this:
SELECT coalesce(tmp.id,initial.id) as id,
coalesce(tmp.A, initial.A) as A,
coalesce(tmp.B,initial.B) as B,
coalesce(tmp.C, initial.C) as C
FROM
table_a initial
FULL OUTER JOIN
( SELECT *, row_number() over( partition by id ) as row_num
,COUNT(*) OVER (PARTITION BY id) AS cnt
FROM temp_table
) tmp
ON initial.id=tmp.id
WHERE row_num=cnt
OR row_num IS NULL;
Output:
+--------+-----+----+-----+--+
| id | a | b | c |
+--------+-----+----+-----+--+
| 10011 | 10 | 3 | 0 |
| 10012 | 50 | 3 | 10 |
| 10013 | 19 | 0 | 40 |
| 10014 | 10 | 3 | 0 |
| 10015 | 70 | 3 | 0 |
| 10016 | 17 | 0 | 40 |
| 10017 | 9 | 0 | 40 |
+--------+-----+----+-----+--+
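The intended merge rule (a delta row replaces the base row with the same id, new ids are inserted, and among duplicate ids in the delta the last one wins) can be illustrated outside Hive. This is only a plain-Python sketch of the semantics using the sample rows above, not a replacement for the Hive query, and it assumes that "last" means last in file order.

```python
# Rows are (id, A, B, C), copied from the base table and the delta file.
base_rows = [
    (10011, 10, 3, 0),
    (10012, 9, 0, 40),
    (10015, 10, 3, 0),
    (10017, 9, 0, 40),
]
delta_rows = [
    (10012, 50, 3, 10),
    (10013, 29, 0, 40),
    (10014, 10, 3, 0),
    (10013, 19, 0, 40),  # duplicate id: this later row should win
    (10015, 70, 3, 0),
    (10016, 17, 0, 40),
]

# Start from the base table, keyed by id.
merged = {row[0]: row for row in base_rows}

# Apply the delta in file order; a later row overwrites an earlier one.
for row in delta_rows:
    merged[row[0]] = row

result = [merged[key] for key in sorted(merged)]
for row in result:
    print(row)
```

The printed rows correspond to the output table above: id 10013 keeps the values from its last delta row, and ids 10011 and 10017 are carried over unchanged from the base table.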
You can load the file into a temporary table in Hive and then execute a FULL OUTER JOIN between the two tables.
Query Example:
SELECT coalesce(tmp.id,initial.id) as id,
coalesce(tmp.A, initial.A) as A,
coalesce(tmp.B,initial.B) as B,
coalesce(tmp.C, initial.C) as C
FROM
table_a initial
FULL OUTER JOIN
temp_table tmp on initial.id=tmp.id;
Output
+--------+-----+----+-----+--+
| id | a | b | c |
+--------+-----+----+-----+--+
| 10011 | 10 | 3 | 0 |
| 10012 | 50 | 3 | 10 |
| 10013 | 29 | 0 | 40 |
| 10013 | 19 | 0 | 40 |
| 10014 | 10 | 3 | 0 |
| 10015 | 70 | 3 | 0 |
| 10016 | 17 | 0 | 40 |
| 10017 | 9 | 0 | 40 |
+--------+-----+----+-----+--+
I have the following four tables:
1) mls_user
2) mls_category
3) bonus_point
4) mls_entry
In mls_user table values are like below:
*-------------------------*
| id | store_id | name |
*-------------------------*
| 1 | 101 | sandeep |
| 2 | 101 | gagan |
| 3 | 102 | santosh |
| 4 | 103 | manu |
| 5 | 101 | jagveer |
*-------------------------*
In mls_category table values are like below:
*---------------------------------*
| cat_no | store_id | cat_value |
*---------------------------------*
| 20 | 101 | 1 |
| 21 | 101 | 4 |
| 30 | 102 | 1 |
| 31 | 102 | 2 |
| 40 | 103 | 1 |
| 41 | 103 | 1 |
*---------------------------------*
In bonus_point table values are like below:
*-----------------------------------*
| user_id | store_id | bonus_point |
| 1 | 101 | 10 |
| 4 | 101 | 5 |
*-----------------------------------*
In mls_entry table values are like below:
*-------------------------------------------------------*
| user_id | store_id | category | distance | status |
*-------------------------------------------------------*
| 1 | 101 | 20 | 10 | Approved |
| 1 | 101 | 21 | 40 | Approved |
| 1 | 101 | 20 | 10 | Approved |
| 2 | 101 | 20 | 5 | Approved |
| 3 | 102 | 30 | 10 | Approved |
| 3 | 102 | 31 | 80 | Approved |
| 4 | 101 | 20 | 15 | Approved |
*-------------------------------------------------------*
And I want below output:
*--------------------------------------------------*
| user name | Points | bonus Point | Total Point |
*--------------------------------------------------*
| Sandeep | 30 | 10 | 40 |
| Santosh | 30 | 0 | 30 |
| Manu | 15 | 5 | 20 |
| Gagan | 5 | 0 | 5 |
| Jagveer | 0 | 0 | 0 |
*--------------------------------------------------*
Here is how the points are calculated for user Sandeep:
Points = ((10+10)/1 + 40/4) = 30
Here 1 and 4 are the cat_value values that come from mls_category.
I am using below code for a particular user but when i
SELECT sum(t1.totald/c.cat_value) as total_distance
FROM mls_category c
join (
select sum(distance) totald, user_id, category
FROM mls_entry
WHERE user_id=1 AND store_id='101' AND status='approved'
group by user_id, category) t1 on c.cat_no = t1.category
I have created the tables online for checking:
DEMO
Computing the points (other than the bonus points) requires a separate join between the mls_entry and mls_category tables. I would do this in a separate subquery, and then join this to the larger query.
Here is one approach:
SELECT
u.name,
COALESCE(t1.points, 0) AS points,
COALESCE(b.bonus_point, 0) AS bonus_points,
COALESCE(t1.points, 0) + COALESCE(b.bonus_point, 0) AS total_points
FROM mls_user u
LEFT JOIN
(
SELECT e.user_id, SUM(e.distance / c.cat_value) AS points
FROM mls_entry e
INNER JOIN mls_category c
ON e.store_id = c.store_id AND e.category = c.cat_no
GROUP BY e.user_id
) t1
ON u.id = t1.user_id
LEFT JOIN bonus_point b
ON u.id = b.user_id
ORDER BY
total_points DESC;
Running the above query against the demo you set up gives output that does not match your expected table exactly. For Santosh, the entries in your question work out to 10/1 + 80/2 = 50 points, not 30, so either Santosh's data in your question has a typo or the expected output does.
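For anyone who wants to reproduce this without the demo, here is a rough sketch that loads the four sample tables into an in-memory SQLite database through Python's sqlite3 module and runs the same query. SQLite is an assumption (the question does not name the database), and the `1.0 *` factor is added only because SQLite would otherwise use integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mls_user (id INTEGER, store_id INTEGER, name TEXT);
CREATE TABLE mls_category (cat_no INTEGER, store_id INTEGER, cat_value INTEGER);
CREATE TABLE bonus_point (user_id INTEGER, store_id INTEGER, bonus_point INTEGER);
CREATE TABLE mls_entry (
    user_id INTEGER, store_id INTEGER, category INTEGER,
    distance INTEGER, status TEXT
);
""")

# Rows copied from the question.
conn.executemany("INSERT INTO mls_user VALUES (?, ?, ?)", [
    (1, 101, "sandeep"), (2, 101, "gagan"), (3, 102, "santosh"),
    (4, 103, "manu"), (5, 101, "jagveer"),
])
conn.executemany("INSERT INTO mls_category VALUES (?, ?, ?)", [
    (20, 101, 1), (21, 101, 4), (30, 102, 1),
    (31, 102, 2), (40, 103, 1), (41, 103, 1),
])
conn.executemany("INSERT INTO bonus_point VALUES (?, ?, ?)", [
    (1, 101, 10), (4, 101, 5),
])
conn.executemany("INSERT INTO mls_entry VALUES (?, ?, ?, ?, ?)", [
    (1, 101, 20, 10, "Approved"), (1, 101, 21, 40, "Approved"),
    (1, 101, 20, 10, "Approved"), (2, 101, 20, 5, "Approved"),
    (3, 102, 30, 10, "Approved"), (3, 102, 31, 80, "Approved"),
    (4, 101, 20, 15, "Approved"),
])

query = """
SELECT
    u.name,
    COALESCE(t1.points, 0) AS points,
    COALESCE(b.bonus_point, 0) AS bonus_points,
    COALESCE(t1.points, 0) + COALESCE(b.bonus_point, 0) AS total_points
FROM mls_user u
LEFT JOIN (
    -- 1.0 * forces real division in SQLite (assumption, see lead-in)
    SELECT e.user_id, SUM(1.0 * e.distance / c.cat_value) AS points
    FROM mls_entry e
    INNER JOIN mls_category c
        ON e.store_id = c.store_id AND e.category = c.cat_no
    GROUP BY e.user_id
) t1 ON u.id = t1.user_id
LEFT JOIN bonus_point b ON u.id = b.user_id
ORDER BY total_points DESC
"""

leaderboard = conn.execute(query).fetchall()
for row in leaderboard:
    print(row)
```

Users without entries or bonus rows still appear, with zeros, because both joins are LEFT JOINs from mls_user.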
I have 2 tables
Table A:
+------------+----------+
| Entry From | Entry To |
+------------+----------+
| 100 | 103 |
| 104 | 105 |
| 106 | 109 |
+------------+----------+
Table B:
+-------+-------+
| Entry | Value |
+-------+-------+
| 100 | 10 |
| 101 | 3 |
| 102 | 7 |
| 103 | 2 |
| 104 | 9 |
| 105 | 17 |
| 106 | 3 |
| 107 | 3 |
| 108 | 6 |
| 109 | 5 |
+-------+-------+
Desired result:
+------------+----------+-------------+
| Entry From | Entry To | Total Value |
+------------+----------+-------------+
| 100 | 103 | 22 |
| 104 | 105 | 26 |
| 106 | 109 | 17 |
+------------+----------+-------------+
Any solutions/advice is welcome.
Thanks in advance for any help!
Please try:
Select
a.EntryFrom, a.EntryTo, sum(Value) TotalValue
From TableA a INNER JOIN TableB b ON b.Entry between a.EntryFrom and a.EntryTo
Group by a.EntryFrom, a.EntryTo
What you're looking for may be a correlated subquery:
SELECT
A.Entry_From, A.Entry_To,
(SELECT SUM(B.Value) FROM B
WHERE B.Entry BETWEEN A.Entry_From AND A.Entry_To) AS Total_Value
FROM A
It also depends on which SQL dialect and version you are using, so YMMV :)
Here is a working fiddle: http://www.sqlfiddle.com/#!2/afbac/2 using this query:
select a.idxFrom, a.idxTo, sum(b.value) as total
from a inner join b on b.idx >= a.idxFrom and b.idx <= a.idxTo
group by a.idxFrom, a.idxTo
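The range join can also be checked locally. Below is a minimal sketch using Python's sqlite3 module with the sample data from the question; the table and column names (a, b, idxFrom, idxTo, idx) follow the fiddle query, and SQLite is an assumption since the question does not name a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (idxFrom INTEGER, idxTo INTEGER);
CREATE TABLE b (idx INTEGER, value INTEGER);
""")

# Rows copied from Table A and Table B in the question.
conn.executemany("INSERT INTO a VALUES (?, ?)", [
    (100, 103), (104, 105), (106, 109),
])
conn.executemany("INSERT INTO b VALUES (?, ?)", [
    (100, 10), (101, 3), (102, 7), (103, 2), (104, 9),
    (105, 17), (106, 3), (107, 3), (108, 6), (109, 5),
])

# Each b row joins to the range that contains its idx, then values are
# summed per range.
range_totals = conn.execute(
    "SELECT a.idxFrom, a.idxTo, SUM(b.value) AS total "
    "FROM a INNER JOIN b ON b.idx >= a.idxFrom AND b.idx <= a.idxTo "
    "GROUP BY a.idxFrom, a.idxTo "
    "ORDER BY a.idxFrom"
).fetchall()

for row in range_totals:
    print(row)
```

Because the join is an INNER JOIN, a range with no matching entries would be dropped from the result; the subquery variant keeps such ranges with a NULL total.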