SQL Group By Issue with same item ID - sql

I am trying to track the total number of sales a rep has along with the amount of time he was clocked into work.
I have the following two tables:
table1:
employeeID | item | price | timeID
----------------------------------------
1 | 1 | 12.92 | 123
1 | 2 | 10.00 | 123
1 | 2 | 10.00 | 456
table2:
ID | minutes_in_shift
--------------------------
123 | 45
456 | 15
I would join these two tables with the following SQL:
SELECT
t1.employeeID, t1.item, t1.price, t1.timeID, t2.minutes_in_shift
FROM table1 t1
JOIN table2 t2 ON (t2.ID = t1.timeID)
Which would return the following table:
employeeID | item | price | timeID | minutes_in_shift
---------------------------------------------------
1 | 1 | 12.92 | 123 | 45
1 | 2 | 10.00 | 123 | 45
1 | 2 | 10.00 | 456 | 15
I would like the consolidated results, however, to look like this:
employeeID | itemsSold | priceTotals | totaltimeworked
-----------------------------------------------------------------
1 | 3 | 32.92 | 60
I could use COUNT and SUM for the items and price but I cannot figure out how to properly show the total time worked in the manner it appears above.
Note: I am only having trouble with calculating the time worked. In shift 123 - employee 1 was working 45 minutes, regardless of how many items he sold.
Any suggestions?

If you wish to use the sample data as they are you will need to extract the shifts and sum the minutes, like this:
with a as (
select employeeID, count(*) itemsSold, sum(price) priceTotals
from Sampletable1
group by employeeID),
b as (
select employeeID, shiftID, max(minutes_in_shift) minutes_in_shift
from Sampletable1
group by employeeID, shiftID),
c as (
select employeeID, sum(minutes_in_shift) totaltimeworked
from b
group by employeeID)
select a.employeeID, a.itemsSold, a.priceTotals, c.totaltimeworked
from a inner join c on a.employeeID = c.employeeID
However, with your existing tables the select statement will be much easier:
with a as (
select employeeID, timeID, count(*) itemsSold, sum(price) priceTotals
from table1
group by employeeID, timeID)
select a.employeeID, sum(a.itemsSold), sum(a.priceTotals), sum(table2.minutes_in_shift) totaltimeworked
from a inner join table2 on a.timeID = table2.ID
group by a.employeeID
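For anyone who wants to try the second query without a database server, here is a minimal sketch that runs it against the question's sample rows in an in-memory SQLite database. The Python wrapper, the function name, the `per_shift` CTE name, and the `ROUND` call are my own additions for illustration; the idea (aggregate per shift first, then sum the shift minutes once per shift) is the one described above.

```python
import sqlite3

def sales_and_time_per_employee():
    # In-memory database loaded with the sample rows from the question.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (employeeID INTEGER, item INTEGER, price REAL, timeID INTEGER);
        CREATE TABLE table2 (ID INTEGER PRIMARY KEY, minutes_in_shift INTEGER);
        INSERT INTO table1 VALUES (1, 1, 12.92, 123), (1, 2, 10.00, 123), (1, 2, 10.00, 456);
        INSERT INTO table2 VALUES (123, 45), (456, 15);
    """)
    rows = con.execute("""
        WITH per_shift AS (
            -- One row per employee and shift, so each shift's minutes are counted once.
            SELECT employeeID, timeID, COUNT(*) AS itemsSold, SUM(price) AS priceTotals
            FROM table1
            GROUP BY employeeID, timeID
        )
        SELECT p.employeeID,
               SUM(p.itemsSold)            AS itemsSold,
               ROUND(SUM(p.priceTotals), 2) AS priceTotals,
               SUM(t2.minutes_in_shift)    AS totaltimeworked
        FROM per_shift p
        JOIN table2 t2 ON t2.ID = p.timeID
        GROUP BY p.employeeID
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in sales_and_time_per_employee():
        print(row)
```

With the sample data this should give a single row for employee 1 with 3 items, a price total of 32.92 and 60 minutes, which matches the outcome asked for in the question.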

I think this query should do what you want:
SELECT t1.employeeID,
count(t1.item) AS itemsSold,
sum(t1.price) AS priceTotals,
sum(DISTINCT t2.minutes_in_shift) AS totaltimeworked
FROM table1 t1
JOIN table2 t2 ON (t2.ID = t1.timeID)
GROUP BY t1.employeeID;
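One caveat worth checking before relying on SUM(DISTINCT ...): it removes duplicate values, not duplicate shifts. The sketch below uses made-up data (a second shift, ID 789, that also lasted 45 minutes) in an in-memory SQLite database to compare the two totals. The function name and the extra shift are hypothetical.

```python
import sqlite3

def compare_distinct_vs_per_shift():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE table1 (employeeID INTEGER, item INTEGER, price REAL, timeID INTEGER);
        CREATE TABLE table2 (ID INTEGER PRIMARY KEY, minutes_in_shift INTEGER);
        -- Hypothetical data: two different shifts that both lasted 45 minutes.
        INSERT INTO table1 VALUES (1, 1, 12.92, 123), (1, 2, 10.00, 123), (1, 2, 10.00, 789);
        INSERT INTO table2 VALUES (123, 45), (789, 45);
    """)
    # SUM(DISTINCT ...) collapses the two equal 45-minute values into one.
    distinct_total = con.execute("""
        SELECT SUM(DISTINCT t2.minutes_in_shift)
        FROM table1 t1
        JOIN table2 t2 ON t2.ID = t1.timeID
        GROUP BY t1.employeeID
    """).fetchone()[0]
    # Summing each shift once, looked up by its ID, keeps both shifts.
    per_shift_total = con.execute("""
        SELECT SUM(minutes_in_shift)
        FROM table2
        WHERE ID IN (SELECT timeID FROM table1 WHERE employeeID = 1)
    """).fetchone()[0]
    con.close()
    return distinct_total, per_shift_total

if __name__ == "__main__":
    print(compare_distinct_vs_per_shift())  # (45, 90)
```

So the DISTINCT version is fine for the sample data in the question, but it would under-count as soon as two shifts of the same employee have the same length.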

Related

How do I join to another table and return only the most recent matching row?

I have a table that stores the lines on a contract. Each contract line has its own unique ID as well as the ID of its parent contract. Example:
+-------------+---------+
| contract_id | line_id |
+-------------+---------+
| 1111 | 100 |
| 1111 | 101 |
| 1111 | 102 |
+-------------+---------+
I have another table that stores the historical changes to contract lines. For example, every time the number of units on a contract line is changed a new row is added to the table. Example:
+-------------+---------+--------------+-------+
| contract_id | line_id | date_changed | units |
+-------------+---------+--------------+-------+
| 1111 | 100 | 2016-01-01 | 1 |
| 1111 | 100 | 2016-02-01 | 2 |
| 1111 | 100 | 2016-03-01 | 3 |
+-------------+---------+--------------+-------+
As you can see the contract line with ID 100 belonging to the contract with ID 1111 has been edited 3 times over 3 months. The current value is 3 units.
I'm running a query against the contract lines table to select all data. I want to join to the historical data table and select the most recent row for each contract line and show the units in my results. How do I do this?
Expected results (there would be single rows for 101 and 102 as well):
+-------------+---------+-------+
| contract_id | line_id | units |
+-------------+---------+-------+
| 1111 | 100 | 3 |
+-------------+---------+-------+
I've tried the query below with a left join but it returns 3 rows instead of 1.
Query:
SELECT *, T1.units
FROM contract_lines
LEFT JOIN (
SELECT contract_id, line_id, units, MAX(date_changed) AS maxdate
FROM contract_history
GROUP BY contract_id, line_id, units) AS T1
ON contract_lines.contract_id = T1.contract_id
AND contract_lines.line_id = T1.line_id
Actual results:
+-------------+---------+-------+
| contract_id | line_id | units |
+-------------+---------+-------+
| 1111 | 100 | 1 |
| 1111 | 100 | 2 |
| 1111 | 100 | 3 |
+-------------+---------+-------+
An extra join to contract_history along with maxdate will work
SELECT contract_lines.*,T2.units
FROM contract_lines
LEFT JOIN (
SELECT contract_id, line_id, MAX(date_changed) AS maxdate
FROM contract_history
GROUP BY contract_id, line_id) AS T1
JOIN contract_history T2 ON
T1.contract_id=T2.contract_id and
T1.line_id= T2.line_id and
T1.maxdate=T2.date_changed
ON contract_lines.contract_id = T1.contract_id
AND contract_lines.line_id = T1.line_id
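To see why grouping matters here, a small sketch in SQLite (run through Python) may help. The original attempt grouped by units as well, so every distinct units value formed its own group and MAX(date_changed) was computed per value. Grouping by contract_id and line_id only, then joining back to the history on the max date, leaves one row per line. The wrapper function and the `cl`, `latest` and `h` aliases are my own; the query is written as two plain LEFT JOINs, which is an equivalent rearrangement rather than the exact statement above.

```python
import sqlite3

def latest_units_per_line():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE contract_lines (contract_id INTEGER, line_id INTEGER);
        CREATE TABLE contract_history (
            contract_id INTEGER, line_id INTEGER, date_changed TEXT, units INTEGER);
        INSERT INTO contract_lines VALUES (1111, 100), (1111, 101), (1111, 102);
        INSERT INTO contract_history VALUES
            (1111, 100, '2016-01-01', 1),
            (1111, 100, '2016-02-01', 2),
            (1111, 100, '2016-03-01', 3);
    """)
    rows = con.execute("""
        SELECT cl.contract_id, cl.line_id, h.units
        FROM contract_lines cl
        LEFT JOIN (
            -- Latest change date per contract line (units is NOT part of the grouping).
            SELECT contract_id, line_id, MAX(date_changed) AS maxdate
            FROM contract_history
            GROUP BY contract_id, line_id
        ) AS latest
          ON latest.contract_id = cl.contract_id
         AND latest.line_id = cl.line_id
        LEFT JOIN contract_history h
          ON h.contract_id = latest.contract_id
         AND h.line_id = latest.line_id
         AND h.date_changed = latest.maxdate
        ORDER BY cl.line_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in latest_units_per_line():
        print(row)
```

Line 100 comes back once with 3 units. Lines 101 and 102 have no history rows in this sample, so they come back with NULL units because of the LEFT JOINs.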
This is my preferred style because it doesn't require self joining and cleanly expresses your intent. Also, it competes very well with the ROW_NUMBER() method in terms of performance.
select a.*
, b.units
from contract_lines as a
join (
select a.contract_id
, a.line_id
, a.units
, a.date_changed
, Max(a.date_changed) over(partition by a.contract_id, a.line_id) as max_date_changed
from contract_history as a
) as b
on a.contract_id = b.contract_id
and a.line_id = b.line_id
and b.date_changed = b.max_date_changed;
Another possible solution to this. This uses RANK to sort and filter the history rows. Similar to what you did, just a different tack.
SELECT contract_lines.*, T1.units
FROM contract_lines
LEFT JOIN (
SELECT contract_id, line_id, units,
RANK() OVER (PARTITION BY contract_id, line_id ORDER BY date_changed DESC) AS [rank]
FROM contract_history) AS T1
ON contract_lines.contract_id = T1.contract_id
AND contract_lines.line_id = T1.line_id
AND T1.rank = 1
WHERE T1.units IS NOT NULL
You could change this to an INNER JOIN and remove the IS NOT NULL in the WHERE clause if you expect data to be present all the time.
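One thing to keep in mind with RANK: if two history rows for the same line share the latest date, both get rank 1 and both are returned. ROW_NUMBER picks exactly one of them. The sketch below shows the difference in SQLite (window functions need SQLite 3.25 or newer, which I am assuming here). The two tied rows for line 101 are invented test data, and the function name is hypothetical.

```python
import sqlite3

def latest_rows(function_name):
    # Only these two window functions are allowed into the SQL text.
    if function_name not in ("RANK", "ROW_NUMBER"):
        raise ValueError("expected RANK or ROW_NUMBER")
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE contract_history (
            contract_id INTEGER, line_id INTEGER, date_changed TEXT, units INTEGER);
        INSERT INTO contract_history VALUES
            (1111, 100, '2016-01-01', 1),
            (1111, 100, '2016-02-01', 2),
            (1111, 100, '2016-03-01', 3),
            -- Invented tie: line 101 was changed twice on the same date.
            (1111, 101, '2016-01-01', 5),
            (1111, 101, '2016-01-01', 7);
    """)
    rows = con.execute(f"""
        SELECT contract_id, line_id, units
        FROM (
            SELECT contract_id, line_id, units,
                   {function_name}() OVER (
                       PARTITION BY contract_id, line_id
                       ORDER BY date_changed DESC) AS rn
            FROM contract_history
        ) AS ranked
        WHERE rn = 1
        ORDER BY line_id, units
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    print(len(latest_rows("RANK")))        # 3: both tied rows for line 101 survive
    print(len(latest_rows("ROW_NUMBER")))  # 2: one row per line
```

Which of the tied rows ROW_NUMBER keeps is not defined unless you add a tie-breaker column to the ORDER BY.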
Glad you figured it out!
Try this simple query:
SELECT TOP 1 T1.*
FROM contract_lines T0
INNER JOIN contract_history T1
ON T0.contract_id = T1.contract_id and
T0.line_id = T1.line_id
ORDER BY date_changed DESC
As always seems to be the way, after spending an hour looking at it and shouting at StackOverflow for having a rare period of maintenance, I solved my own problem not long after posting the question.
In an effort to help anyone else who's stuck I'll show what I found. It might not be an efficient way to achieve this so if someone has a better suggestion I'm all ears.
I adapted the answer from here: T-SQL Subquery Max(Date) and Joins
SELECT *,
Units = (SELECT TOP 1 units
FROM contract_history
WHERE contract_lines.contract_id = contract_history.contract_id
AND contract_lines.line_id = contract_history.line_id
ORDER BY date_changed DESC
)
FROM ....

SQL Add hours for employees

I have a table for employees signing in and out. It has a date and time field for in and out, and a PersonID number that links to the employee's name etc.
I need to work out the difference between the 2 dates and times then add them all together for each employee.
select a.*,
b.timein,
b.timeout,
datediff(mi,b.timein,b.timeout) as total_mins
from tbl_people a
left join tbl_register b on a.id=b.personid
Output:
+----+-----------+----------+-------------------------+-------------------------+------------+
| ID | FirstName | LastName | TimeIn | TimeOut | Total_Mins |
+----+-----------+----------+-------------------------+-------------------------+------------+
| 1 | David | Test | 2015-05-12 12:11:00.000 | 2015-05-12 12:13:00.000 | 2 |
| 2 | David | Test | 2015-05-12 12:15:00.000 | 2015-05-12 12:18:00.000 | 3 |
+----+-----------+----------+-------------------------+-------------------------+------------+
This is what I'm currently getting. I would like it to show one record for each person with the total number of minutes worked.
Thanks in anticipation!
Basically you have at least 2 options:
Option 1 - Use DISTINCT and SUM with OVER clause:
SELECT DISTINCT a.*,
SUM(DATEDIFF(mi, b.timein, b.timeout)) OVER(PARTITION BY a.id) AS total_mins
FROM tbl_people a
LEFT JOIN tbl_register b ON a.id=b.personid
Option 2 - Use a derived table for the GROUP BY part:
SELECT a.*,
total_mins
from tbl_people a
left join (
SELECT personid,
SUM(DATEDIFF(mi, timein, timeout)) AS total_mins
FROM tbl_register
GROUP BY personid
) b ON a.id=b.personid
select
ppl.FirstName + ' ' + ppl.LastName as 'Person',
sum( datediff(mi, reg.timein, reg.timeout)) as 'total_mins'
from
tbl_people ppl
left join tbl_register reg on ppl.id = reg.personid
group by
ppl.FirstName + ' ' + ppl.LastName
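If you want to experiment with the derived-table approach outside SQL Server, here is a rough SQLite equivalent run through Python. SQLite has no DATEDIFF, so this sketch subtracts epoch seconds from strftime('%s', ...) and divides by 60, which measures whole elapsed minutes rather than minute boundaries crossed. I am assuming both sample sign-ins belong to person 1; the second person (Anna Example) is invented to show what the LEFT JOIN does for someone with no sign-ins.

```python
import sqlite3

def total_minutes_per_person():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tbl_people (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
        CREATE TABLE tbl_register (personid INTEGER, timein TEXT, timeout TEXT);
        INSERT INTO tbl_people VALUES (1, 'David', 'Test'), (2, 'Anna', 'Example');
        INSERT INTO tbl_register VALUES
            (1, '2015-05-12 12:11:00.000', '2015-05-12 12:13:00.000'),
            (1, '2015-05-12 12:15:00.000', '2015-05-12 12:18:00.000');
    """)
    rows = con.execute("""
        SELECT p.id, p.firstname, p.lastname,
               COALESCE(r.total_mins, 0) AS total_mins
        FROM tbl_people p
        LEFT JOIN (
            -- Aggregate the register first, so the outer query stays one row per person.
            SELECT personid,
                   SUM((strftime('%s', timeout) - strftime('%s', timein)) / 60) AS total_mins
            FROM tbl_register
            GROUP BY personid
        ) r ON r.personid = p.id
        ORDER BY p.id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in total_minutes_per_person():
        print(row)
```

Aggregating inside the derived table means you can still select any column from tbl_people without listing it in a GROUP BY.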

Select columns with and without group by

Having Table1
id | productname | store | price
-----------------------------------
1 | name a | store 1 | 4
2 | name a | store 2 | 3
3 | name b | store 3 | 6
4 | name a | store 3 | 4
5 | name b | store 1 | 7
6 | name a | store 4 | 5
7 | name c | store 3 | 2
8 | name b | store 6 | 5
9 | name c | store 2 | 1
I need to get all columns, but only the row with the lowest price for each product.
Result needed:
id | productname | store | price
-----------------------------------
2 | name a | store 2 | 3
8 | name b | store 6 | 5
9 | name c | store 2 | 1
My best try is:
SELECT ProductName, MIN(Price) AS minPrice
FROM Table1
GROUP BY ProductName
But then I need the ID and STORE for each row.
Try this
select p.* from Table1 as p inner join
(SELECT ProductName, MIN(Price) AS minPrice FROM Table1 GROUP BY ProductName) t
on p.productname = t.ProductName and p.price = t.minPrice
Select t1.ID, t1.ProductName, t1.Store, t.minPrice
from
(
SELECT ProductName, MIN(Price) AS minPrice
FROM Table1
GROUP BY ProductName
) t
join Table1 t1 on t.ProductName = t1.ProductName and t.minPrice = t1.Price
You didn't mention your SQL dialect, but most DBMSes support Standard SQL's "Windowed Aggregate Functions":
select *
from
( select t.*,
RANK() OVER (PARTITION BY ProductName ORDER BY Price) as rnk
from table1 as t
) as dt
where rnk = 1
If multiple stores got the same lowest price all of them will be returned. If you want only a single shop you have to switch to ROW_NUMBER instead of RANK or add column(s) to the ORDER BY.
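Here is a quick way to try the RANK query against the sample data from the question, using SQLite through Python (assuming SQLite 3.25 or newer for window function support). The wrapper function is just scaffolding I added; the query itself is the one above with an ORDER BY for stable output.

```python
import sqlite3

def cheapest_rows():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Table1 (id INTEGER PRIMARY KEY, productname TEXT, store TEXT, price INTEGER);
        INSERT INTO Table1 VALUES
            (1, 'name a', 'store 1', 4),
            (2, 'name a', 'store 2', 3),
            (3, 'name b', 'store 3', 6),
            (4, 'name a', 'store 3', 4),
            (5, 'name b', 'store 1', 7),
            (6, 'name a', 'store 4', 5),
            (7, 'name c', 'store 3', 2),
            (8, 'name b', 'store 6', 5),
            (9, 'name c', 'store 2', 1);
    """)
    rows = con.execute("""
        SELECT id, productname, store, price
        FROM (
            -- Rank each product's rows from cheapest to most expensive.
            SELECT t.*,
                   RANK() OVER (PARTITION BY productname ORDER BY price) AS rnk
            FROM Table1 AS t
        ) AS dt
        WHERE rnk = 1
        ORDER BY id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in cheapest_rows():
        print(row)  # rows with id 2, 8 and 9
```

With this data no product has a tie for the lowest price, so RANK and ROW_NUMBER would return the same three rows.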
I think this query should do:
select min(t.id) id
, t.productname
, t.price
from table1 t
join
( select min(price) min_price
, productname
from table1
group
by productname
) v
on v.productname = t.productname
and t.price = v.min_price
group
by t.productname
, t.price
It determines the lowest price per product and fetches every line in the base table (t). This avoids duplicates by grouping on the productname and selecting the lowest id.
This should work for you:
SELECT * FROM `Table1` AS `t1`
WHERE (
SELECT count(*) FROM `Table1` AS `t2` WHERE `t1`.`productName` = `t2`.`productName` AND `t2`.`price` < `t1`.`price`) < 1
But if the same product has the same minimum price in two stores, you will get both rows in the output.

Find Min Value and value of a corresponding column for that result

I have a table of user data in my SQL Server database and I am attempting to summarize the data. Basically, I need some min, max, and sum values and to group by some columns
Here is a sample table:
Member ID | Name | DateJoined | DateQuit | PointsEarned | Address
00001 | Leyth | 1/1/2013 | 9/30/2013 | 57 | 123 FirstAddress Way
00002 | James | 2/1/2013 | 7/21/2013 | 34 | 4 street road
00001 | Leyth | 2/1/2013 | 10/15/2013| 32 | 456 LastAddress Way
00003 | Eric | 2/23/2013 | 4/14/2013 | 15 | 5 street road
I'd like the summarized table to show the results like this:
Member ID | Name | DateJoined | DateQuit | PointsEarned | Address
00001 | Leyth | 1/1/2013 | 10/15/2013 | 89 | 123 FirstAddress Way
00002 | James | 2/1/2013 | 7/21/2013 | 34 | 4 street road
00003 | Eric | 2/23/2013 | 4/14/2013 | 15 | 5 street road
Here is my query so far:
Select MemberID, Name, Min(DateJoined), Max(DateQuit), SUM(PointsEarned), Min(Address)
From Table
Group By MemberID
Min(Address) happens to work this time: it retrieves the address that corresponds to the earliest DateJoined. However, if we swapped the two addresses in the original table, we would still retrieve "123 FirstAddress Way", which would no longer correspond to the 1/1/2013 date joined.
For almost everything you can use a simple GROUP BY. The tricky part is that you need the address from the same row that holds the minimum DateJoined. You can solve that in several ways; one is a correlated subquery that looks up the address for each member:
SELECT
X.*,
(select Address
from #tmp t2
where t2.MemberID = X.memberID and
t2.DateJoined = (select MIN(DateJoined)
from #tmp t3
where t3.memberID = X.MemberID))
FROM
(select MemberID,
Name,
MIN(DateJoined) as DateJoined,
MAX(DateQuit) as DateQuit,
SUM(PointsEarned) as PointEarned
from #tmp t1
group by MemberID,Name
) AS X
Another option is a subquery with a join:
SELECT
X.*,
J.Address
FROM
(select
MemberID,
Name,
MIN(DateJoined) as DateJoined,
MAX(DateQuit) as DateQuit,
SUM(PointsEarned) as PointEarned
from #tmp t1
group by MemberID,Name
) AS X
JOIN #tmp J ON J.MemberID = X.MemberID AND J.DateJoined = X.DateJoined
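A runnable sketch of the join variant, using SQLite through Python. I have made a few assumptions: the table is called members (a hypothetical name), dates are stored as ISO strings so that MIN and MAX sort them correctly, and the two addresses for member 00001 are swapped compared with the question so that MIN(Address) would give the wrong answer.

```python
import sqlite3

def member_summary():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE members (
            member_id TEXT, name TEXT, date_joined TEXT, date_quit TEXT,
            points_earned INTEGER, address TEXT);
        -- Addresses for 00001 are swapped relative to the question on purpose.
        INSERT INTO members VALUES
            ('00001', 'Leyth', '2013-01-01', '2013-09-30', 57, '456 LastAddress Way'),
            ('00002', 'James', '2013-02-01', '2013-07-21', 34, '4 street road'),
            ('00001', 'Leyth', '2013-02-01', '2013-10-15', 32, '123 FirstAddress Way'),
            ('00003', 'Eric',  '2013-02-23', '2013-04-14', 15, '5 street road');
    """)
    rows = con.execute("""
        SELECT x.member_id, x.name, x.date_joined, x.date_quit, x.points_earned, m.address
        FROM (
            SELECT member_id, name,
                   MIN(date_joined)   AS date_joined,
                   MAX(date_quit)     AS date_quit,
                   SUM(points_earned) AS points_earned
            FROM members
            GROUP BY member_id, name
        ) AS x
        -- Join back to the row that carries the earliest join date to pick up its address.
        JOIN members m
          ON m.member_id = x.member_id
         AND m.date_joined = x.date_joined
        ORDER BY x.member_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in member_summary():
        print(row)
```

Member 00001 comes back with the address from the 2013-01-01 row even though it is not the alphabetically smallest one. If a member could have two rows with the same earliest date, this join would return both, so you would need a tie-breaker in that case.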
You could rank your rows according to the date, and select the minimal one:
SELECT t.member_id,
name,
date_joined,
date_quit,
points_earned,
address
FROM (SELECT member_id,
name,
MIN (date_joined) AS date_joined,
MAX (date_quit) AS date_quit,
SUM (points_earned) AS points_earned
FROM my_table
GROUP BY member_id, name) t
JOIN (SELECT member_id,
address,
RANK() OVER (PARTITION BY member_id ORDER BY date_joined) AS rk
FROM my_table) addr ON addr.member_id = t.member_id AND rk = 1
SELECT st.memberid, st.name, m1.datejoined, m2.datequit, m2.pointsearned, st.Address
from SAMPLEtable st
JOIN ( SELECT memberid
, MIN(datejoined) AS datejoined
FROM sampletable
GROUP BY memberid
) m1 ON st.memberid = m1.memberid AND st.datejoined = m1.datejoined
JOIN ( SELECT memberid
, MAX(datequit) AS datequit
, SUM(pointsearned) AS pointsearned
FROM sampletable
GROUP BY memberid
) m2 ON m1.memberid = m2.memberid

SQL Total Sale calculation

I have a sales table that includes purchases, returns and exchanges.
Sample:
--------**saleTbl**------------
CustID | DOP | SKU | Price
111 | 11/05/12 | 001 | 45.99
222 | 11/20/12 | 001 | 45.99
111 | 11/06/12 | 002 | 40.95
111 | 11/06/12 | 001 | -45.99
111 | 11/19/12 | 004 | 50.00
222 | 11/25/12 | 003 | 20.99
111 | 12/01/12 | 002 | -40.95
111 | 12/01/12 | 003 | 20.99
The criteria are: find the total for each customer for purchases made during 11/05/12 - 11/20/12. If a customer later exchanges an item that was purchased during that period, the return and any purchase made on the same day as the return are counted as well.
The expected result is:
CustID | DOP | Price
222 | 11/20/12 | 45.99
111 | 12/01/12 | 70.99
I have tried to get the total but of course it is not right:
SELECT DISTINCT [num_cp] AS 'Member Id'
,MAX([dop]) AS 'Date'
,SUM([price]) AS 'Point'
FROM [Mailing_List].[dbo].[UGG_DoublePoint]
WHERE [num_cp] IN
(
SELECT [num_cp]
FROM [Mailing_List].[dbo].[UGG_DoublePoint]
GROUP BY [num_cp]
HAVING SUM([price]) >0
)
--AND
AND [dop] BETWEEN '11/05/12' AND '11/20/12'
GROUP BY [num_cp]
Please help! Thanks everyone.
I think you need to change your query to this
; WITH CTE AS
(
SELECT *,
COUNT(*) OVER (PARTITION BY CustID, DOP) Row_Cnt
FROM TEST
), CTE2 AS
(
SELECT * FROM CTE
WHERE [dop] BETWEEN '11/05/12' AND '11/20/12'
), CTE3 AS
(
SELECT * FROM CTE2 WHERE price > 0
UNION
SELECT * FROM CTE2 WHERE price < 0
and SKU IN (SELECT SKU FROM CTE2 WHERE Price > 0)
UNION
SELECT * FROM CTE
WHERE row_cnt > 1 and DOP IN (
SELECT max(A.dop) d FROM CTE A
INNER JOIN CTE2 B ON A.CustID = B.CustID AND A.SKU = B.SKU
)
)
SELECT Custid, max(dop) dateid, sum(price) Price
from cte3
group by custid;
I think this will work. It basically filters out any returns from the result set by use of a left join.
NOTE: There would be an issue with this in the case that someone purchased/returned multiple SKUs of the same thing on the same day.
select pur.CustId, sum(pur.price) TotalPrice
from test pur
left join test ret
on pur.custid = ret.custid
and pur.dop = ret.dop
and pur.sku = ret.sku
and pur.price = (-1 * ret.price)
where pur.dop between '11/05/2012' AND '11/20/2012'
and ret.price is null
group by pur.CustId
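For what it's worth, here is how I read the criteria, as a sketch in SQLite run through Python: count every row inside the date window, and additionally count all of a customer's rows on any later day on which they returned an SKU they had bought inside the window. That reading is my assumption, as are the table name sale, the column names, the CTE names, and the ISO date strings (used so that BETWEEN compares dates correctly as text).

```python
import sqlite3

def totals_with_exchanges(start="2012-11-05", end="2012-11-20"):
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE sale (cust_id INTEGER, dop TEXT, sku TEXT, price REAL);
        INSERT INTO sale VALUES
            (111, '2012-11-05', '001',  45.99),
            (222, '2012-11-20', '001',  45.99),
            (111, '2012-11-06', '002',  40.95),
            (111, '2012-11-06', '001', -45.99),
            (111, '2012-11-19', '004',  50.00),
            (222, '2012-11-25', '003',  20.99),
            (111, '2012-12-01', '002', -40.95),
            (111, '2012-12-01', '003',  20.99);
    """)
    rows = con.execute("""
        WITH in_window AS (
            SELECT * FROM sale WHERE dop BETWEEN :start AND :end
        ),
        exchange_days AS (
            -- Days after the window on which a customer returned an SKU bought inside it.
            SELECT DISTINCT r.cust_id, r.dop
            FROM sale r
            JOIN in_window p
              ON p.cust_id = r.cust_id AND p.sku = r.sku AND p.price > 0
            WHERE r.price < 0 AND r.dop > :end
        ),
        counted AS (
            SELECT cust_id, dop, price FROM in_window
            UNION ALL
            SELECT s.cust_id, s.dop, s.price
            FROM sale s
            JOIN exchange_days e ON e.cust_id = s.cust_id AND e.dop = s.dop
        )
        SELECT cust_id, MAX(dop) AS dop, ROUND(SUM(price), 2) AS price
        FROM counted
        GROUP BY cust_id
        ORDER BY cust_id
    """, {"start": start, "end": end}).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in totals_with_exchanges():
        print(row)
```

On the sample rows this should reproduce the expected totals (70.99 for customer 111 with 12/01 as the last date, 45.99 for customer 222). It has not been checked against other edge cases, such as a return that falls on the same day as an unrelated purchase inside the window.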