Using GROUP BY and ORDER BY on an INNER JOIN SQL query - sql

I am using the following query to group work times and expenses for clients from three tables, one for clients, one for work times and one for expenses:
SELECT a.*,
COALESCE(b.totalCount, 0) AS CountWork,
COALESCE(b.totalAmount, 0) AS WorkTotal,
COALESCE(c.totalCount, 0) AS CountExpense,
COALESCE(c.totalAmount, 0) AS ExpenseTotal
FROM clients A
LEFT JOIN
(
SELECT Client,
COUNT(*) totalCount,
SUM(Amount) totalAmount
FROM work_times
WHERE DATE BETWEEN '2013-01-01' AND '2013-02-01'
GROUP BY Client
) b ON a.Client = b.Client
LEFT JOIN
(
SELECT Client,
COUNT(*) totalCount,
SUM(Amount) totalAmount
FROM expenses
WHERE DATE BETWEEN '2013-01-01' AND '2013-02-01'
GROUP BY Client
) c ON a.Client = c.Client
WHERE b.Client IS NOT NULL OR
c.Client IS NOT NULL
You can see the query working in a fiddle here.
I am trying to amend this query so that there is a row for each client for each month sorted by month and then client. I am trying to do so with the following amended query:
SELECT a.*,
COALESCE(b.totalCount, 0) AS CountWork,
COALESCE(b.totalAmount, 0) AS WorkTotal,
COALESCE(c.totalCount, 0) AS CountExpense,
COALESCE(c.totalAmount, 0) AS ExpenseTotal
FROM clients A
LEFT JOIN
(
SELECT Client,
COUNT(*) totalCount,
SUM(Amount) totalAmount,
SUBSTR(Date, 1, 7) as Month
FROM work_times
GROUP BY Month,Client
ORDER BY Month
) b ON a.Client = b.Client
LEFT JOIN
(
SELECT Client,
COUNT(*) totalCount,
SUM(Amount) totalAmount,
SUBSTR(Date, 1, 7) as Month
FROM expenses
GROUP BY Month,Client
ORDER BY Month,Client
) c ON a.Client = c.Client
WHERE b.Client IS NOT NULL OR
c.Client IS NOT NULL
You can see the amended query in action here.
It's not working quite right though. Only one row is returned for Client B even though there is a work time in January 2013 and an expense in February 2013 (so there should be 2 rows) and it appears that the rows are being ordered by Client as opposed to Month. Could someone suggest how to amend the query to get the desired output which for the example on the second fiddle would be:
╔════════╦═══════════╦═══════════╦══════════════╦══════════════╗
║ CLIENT ║ COUNTWORK ║ WORKTOTAL ║ COUNTEXPENSE ║ EXPENSETOTAL ║
╠════════╬═══════════╬═══════════╬══════════════╬══════════════╣
║ A ║ 1 ║ 10 ║ 1 ║ 10 ║
║ B ║ 1 ║ 20 ║ 0 ║ 0 ║
║ A ║ 1 ║ 15 ║ 0 ║ 0 ║
║ B ║ 0 ║ 0 ║ 1 ║ 10 ║
║ C ║ 1 ║ 10 ║ 0 ║ 0 ║
╚════════╩═══════════╩═══════════╩══════════════╩══════════════╝

Unless I am missing something in the requirements, what you need to do is get a list of the clients and the dates and then join that to your subqueries. So your query will be:
SELECT a.*,
COALESCE(b.totalCount, 0) AS CountWork,
COALESCE(b.totalAmount, 0) AS WorkTotal,
COALESCE(c.totalCount, 0) AS CountExpense,
COALESCE(c.totalAmount, 0) AS ExpenseTotal
FROM
(
select distinct c.Client, d.Month
from clients c
cross join
(
select SUBSTR(Date, 1, 7) as Month
from work_times
union
select SUBSTR(Date, 1, 7) as Month
from expenses
) d
) A
LEFT JOIN
(
SELECT Client,
COUNT(*) totalCount,
SUM(Amount) totalAmount,
SUBSTR(Date, 1, 7) as Month
FROM work_times
GROUP BY Month,Client
ORDER BY Month,Client
) b
ON a.Client = b.Client
and a.month = b.month
LEFT JOIN
(
SELECT Client,
COUNT(*) totalCount,
SUM(Amount) totalAmount,
SUBSTR(Date, 1, 7) as Month
FROM expenses
GROUP BY Month,Client
ORDER BY Month,Client
) c
ON a.Client = c.Client
and a.month = c.month
WHERE b.Client IS NOT NULL OR
c.Client IS NOT NULL
order by a.month, a.client
See SQL Fiddle with Demo.
The result is:
| CLIENT | MONTH | COUNTWORK | WORKTOTAL | COUNTEXPENSE | EXPENSETOTAL |
--------------------------------------------------------------------------
| A | 2013-01 | 1 | 10 | 1 | 10 |
| B | 2013-01 | 1 | 20 | 0 | 0 |
| A | 2013-02 | 1 | 15 | 0 | 0 |
| B | 2013-02 | 0 | 0 | 1 | 20 |
| C | 2013-02 | 1 | 10 | 0 | 0 |
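The idea above (build a client-by-month "spine" first, then left join each aggregate on both Client and Month) can be tried locally. This is only a sketch: it uses Python's sqlite3 module as a stand-in for the fiddle's database, and the sample rows are invented to resemble the question's expected output, not copied from the fiddle.

```python
import sqlite3

# Hypothetical sample data (an assumption, not the fiddle's actual rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients    (Client TEXT);
CREATE TABLE work_times (Client TEXT, Date TEXT, Amount INTEGER);
CREATE TABLE expenses   (Client TEXT, Date TEXT, Amount INTEGER);
INSERT INTO clients VALUES ('A'), ('B'), ('C');
INSERT INTO work_times VALUES
  ('A', '2013-01-10', 10), ('B', '2013-01-15', 20),
  ('A', '2013-02-05', 15), ('C', '2013-02-20', 10);
INSERT INTO expenses VALUES
  ('A', '2013-01-12', 10), ('B', '2013-02-03', 10);
""")

query = """
SELECT a.Client, a.Month,
       COALESCE(b.totalCount, 0)  AS CountWork,
       COALESCE(b.totalAmount, 0) AS WorkTotal,
       COALESCE(c.totalCount, 0)  AS CountExpense,
       COALESCE(c.totalAmount, 0) AS ExpenseTotal
FROM (
    -- spine: every client paired with every month seen in either table
    SELECT cl.Client, d.Month
    FROM clients cl
    CROSS JOIN (
        SELECT SUBSTR(Date, 1, 7) AS Month FROM work_times
        UNION
        SELECT SUBSTR(Date, 1, 7) AS Month FROM expenses
    ) d
) a
LEFT JOIN (
    SELECT Client, SUBSTR(Date, 1, 7) AS Month,
           COUNT(*) AS totalCount, SUM(Amount) AS totalAmount
    FROM work_times
    GROUP BY Client, Month
) b ON a.Client = b.Client AND a.Month = b.Month
LEFT JOIN (
    SELECT Client, SUBSTR(Date, 1, 7) AS Month,
           COUNT(*) AS totalCount, SUM(Amount) AS totalAmount
    FROM expenses
    GROUP BY Client, Month
) c ON a.Client = c.Client AND a.Month = c.Month
-- drop spine rows that have neither work nor expenses
WHERE b.Client IS NOT NULL OR c.Client IS NOT NULL
-- the ordering belongs on the outer query, not in the sub-queries
ORDER BY a.Month, a.Client
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Joining on Month as well as Client is what keeps a client's January work from being paired with its February expenses, which was why the original attempt collapsed Client B into one row.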

If you do an order by in a sub-query, it doesn't matter, because the outer query may (and may need to) re-order the results. You want to add an order by to the outer query.
Your problem is that you are trying to order by the month and client of the B table, and also order by the month and client of the C table. You need to define the order of B.month, B.client, and C.month and put it into an order by for the outer query.
BTW, if you only group by month in the sub-query for the C table, then client is not meaningful. Some databases, like DB2, do not allow you to put an unaggregated field in a select if it is not in the group by.

Related

Group chronologically

I'm trying to get the top result from each status from a table grouped by customer id and status, ordered by time.
The data given:
CustNo Date Status
1 2016-03-24 C
1 2016-02-08 C
1 2016-01-17 A
1 2015-12-04 C
2 2016-04-28 B
2 2016-03-25 C
2 2016-02-13 C
2 2016-01-04 C
3 2016-02-02 A
3 2016-01-09 A
3 2015-12-12 A
3 2015-11-30 A
I want the output to look like this:
CustNo Date Status
1 2016-03-24 C
1 2016-01-17 A
1 2015-12-04 C
2 2016-04-28 B
2 2016-03-25 C
3 2016-02-02 A
As you can see I want the top date for each status change (if any) within each customer. I solved it for customer 2 and 3 where there is no change of status or the status never changes back, but as for customer 1 the status has changed from C to A and back to C and this is the tricky part (for me at least). I always seem to get the C status grouped all together.
You can try this.
With CTE as
(
select CustNo,
DATE,
Status,
ROW_NUMBER () over (partition by CustNo order by date) as ord
from tab
)
SELECT CustNo,
DATE,
Status
FROM CTE
EXCEPT
SELECT C1.CustNo,
C1.DATE,
C1.Status
FROM CTE C1
INNER JOIN CTE C2 ON C1.CustNo = C2.CustNo
AND C1.Status = C2.Status
AND C1.ord + 1 = C2.ord
You could use the window function ROW_NUMBER to calculate groups:
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY CustNo ORDER BY Date) -
ROW_NUMBER() OVER(PARTITION BY CustNo, Status ORDER BY Date) AS grp
FROM mytable
)
SELECT CustNo, Status, MAX(Date) AS Date
FROM cte
GROUP BY CustNo, Status, grp
ORDER BY CustNo, Date DESC;
LiveDemo
Output:
╔════════╦════════╦═════════════════════╗
║ CustNo ║ Status ║ Date ║
╠════════╬════════╬═════════════════════╣
║ 1 ║ C ║ 24.03.2016 00:00:00 ║
║ 1 ║ A ║ 17.01.2016 00:00:00 ║
║ 1 ║ C ║ 04.12.2015 00:00:00 ║
║ 2 ║ B ║ 28.04.2016 00:00:00 ║
║ 2 ║ C ║ 25.03.2016 00:00:00 ║
║ 3 ║ A ║ 02.02.2016 00:00:00 ║
╚════════╩════════╩═════════════════════╝
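Why the difference of the two ROW_NUMBER values works: inside an unbroken run of the same status both counters advance together, so their difference stays constant, and it changes as soon as another status interrupts the run. Here is a small sketch of that using Python's sqlite3 module (an assumption on my part; it needs an SQLite build with window functions, 3.25 or later) and the data from the question. The alias LastDate is mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (CustNo INTEGER, Date TEXT, Status TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, "2016-03-24", "C"), (1, "2016-02-08", "C"),
    (1, "2016-01-17", "A"), (1, "2015-12-04", "C"),
    (2, "2016-04-28", "B"), (2, "2016-03-25", "C"),
    (2, "2016-02-13", "C"), (2, "2016-01-04", "C"),
    (3, "2016-02-02", "A"), (3, "2016-01-09", "A"),
    (3, "2015-12-12", "A"), (3, "2015-11-30", "A"),
])

# Step 1: look at the island number (grp) for customer 1 only.
islands = conn.execute("""
    SELECT Date, Status,
           ROW_NUMBER() OVER (ORDER BY Date)
         - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY Date) AS grp
    FROM mytable
    WHERE CustNo = 1
    ORDER BY Date
""").fetchall()

# Step 2: keep the latest date of every (customer, status, island).
result = conn.execute("""
    WITH cte AS (
        SELECT CustNo, Date, Status,
               ROW_NUMBER() OVER (PARTITION BY CustNo ORDER BY Date)
             - ROW_NUMBER() OVER (PARTITION BY CustNo, Status ORDER BY Date) AS grp
        FROM mytable
    )
    SELECT CustNo, MAX(Date) AS LastDate, Status
    FROM cte
    GROUP BY CustNo, Status, grp
    ORDER BY CustNo, LastDate DESC
""").fetchall()

for row in islands:
    print(row)
for row in result:
    print(row)
```

Note that grp alone is not unique (for customer 1 the A run and the second C run get the same number), which is why Status has to stay in the GROUP BY next to grp.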

Return results from multiple tables

I am doing analysis on the Stack Overflow dump.
Problem statement: I have 4 tables and require result in the format given.
Table 1: UserID Year QuestionsOnTopicA
Table 2: UserID Year AnswersOnTopicA
Table 3: UserID Year QuestionsOnTopicB
Table 4: UserID Year AnswersOnTopicB
Desired Output:
UserID Year QuestionsOnTopicA AnswersOnTopicA QuestionsOnTopicB AnswersOnTopicB
UserID column should have entries from all the 4 tables.
I tried performing inner and outer join on the tables but the results were incorrect.
Inner join (returns userid present only in first table 1)
Outer join (returns other columns only for userid in table 1)
Not sure if union will make sense in this scenario.
Queries are being executed on data.stackexchange.com/stackoverflow
Example
Table 1: 1001, 2010, 5 || 1001, 2011, 3 || 1002, 2010, 4
Table 2: 1001, 2010, 10 || 1001, 2011, 7 || 1002, 2010, 5
Table 3: 1002, 2010, 5
Table 4: 1001, 2010, 10 || 1004, 2011, 5
Output:
1001, 2010, 5 , 10, 0, 10
1001, 2011, 3, 7, 0, 0
1002, 2010, 4, 5, 5, 0
1004, 2011, 0, 0, 0, 5
Ok, this works as intended:
SELECT COALESCE(A.UserID,B.UserID,C.UserID,D.UserID) UserID,
COALESCE(A.[Year],B.[Year],C.[Year],D.[Year]) [Year],
ISNULL(A.QuestionsOnTopicA,0) QuestionsOnTopicA,
ISNULL(B.AnswersOnTopicA,0) AnswersOnTopicA,
ISNULL(C.QuestionsOnTopicB,0) QuestionsOnTopicB,
ISNULL(D.AnswersOnTopicB,0) AnswersOnTopicB
FROM Table1 A
FULL JOIN Table2 B
ON A.UserID = B.UserID
AND A.[Year] = B.[Year]
FULL JOIN Table3 C
ON COALESCE(A.UserID,B.UserID) = C.UserID
AND COALESCE(A.[Year],B.[Year]) = C.[Year]
FULL JOIN Table4 D
ON COALESCE(A.UserID,B.UserID,C.UserID) = D.UserID
AND COALESCE(A.[Year],B.[Year],C.[Year]) = D.[Year]
Here is a sqlfiddle with a demo of this.
And the results are:
╔════════╦══════╦═══════════════════╦═════════════════╦═══════════════════╦═════════════════╗
║ UserID ║ Year ║ QuestionsOnTopicA ║ AnswersOnTopicA ║ QuestionsOnTopicB ║ AnswersOnTopicB ║
╠════════╬══════╬═══════════════════╬═════════════════╬═══════════════════╬═════════════════╣
║ 1001 ║ 2010 ║ 5 ║ 10 ║ 0 ║ 10 ║
║ 1001 ║ 2011 ║ 3 ║ 7 ║ 0 ║ 0 ║
║ 1002 ║ 2010 ║ 4 ║ 5 ║ 5 ║ 0 ║
║ 1004 ║ 2011 ║ 0 ║ 0 ║ 0 ║ 5 ║
╚════════╩══════╩═══════════════════╩═════════════════╩═══════════════════╩═════════════════╝
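Since the question asks whether a union would make sense: it can, as an alternative to the chained FULL JOINs. This is a different technique from the query above, sketched with Python's sqlite3 module (my choice of engine, not the asker's). Each table is padded with zeros for the columns it lacks, the four are stacked with UNION ALL, and the result is summed per UserID and Year. The short aliases QA, AA, QB and AB are made up for the example; the data is the example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (UserID INTEGER, Year INTEGER, QuestionsOnTopicA INTEGER);
CREATE TABLE Table2 (UserID INTEGER, Year INTEGER, AnswersOnTopicA INTEGER);
CREATE TABLE Table3 (UserID INTEGER, Year INTEGER, QuestionsOnTopicB INTEGER);
CREATE TABLE Table4 (UserID INTEGER, Year INTEGER, AnswersOnTopicB INTEGER);
INSERT INTO Table1 VALUES (1001, 2010, 5), (1001, 2011, 3), (1002, 2010, 4);
INSERT INTO Table2 VALUES (1001, 2010, 10), (1001, 2011, 7), (1002, 2010, 5);
INSERT INTO Table3 VALUES (1002, 2010, 5);
INSERT INTO Table4 VALUES (1001, 2010, 10), (1004, 2011, 5);
""")

query = """
SELECT UserID, Year,
       SUM(QA) AS QuestionsOnTopicA,
       SUM(AA) AS AnswersOnTopicA,
       SUM(QB) AS QuestionsOnTopicB,
       SUM(AB) AS AnswersOnTopicB
FROM (
    -- each branch fills only its own column and pads the rest with 0
    SELECT UserID, Year, QuestionsOnTopicA AS QA, 0 AS AA, 0 AS QB, 0 AS AB
    FROM Table1
    UNION ALL
    SELECT UserID, Year, 0, AnswersOnTopicA, 0, 0 FROM Table2
    UNION ALL
    SELECT UserID, Year, 0, 0, QuestionsOnTopicB, 0 FROM Table3
    UNION ALL
    SELECT UserID, Year, 0, 0, 0, AnswersOnTopicB FROM Table4
) u
GROUP BY UserID, Year
ORDER BY UserID, Year
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because no join is involved, a UserID that exists in only one of the four tables (1004 here) still shows up, and there is no need for the nested COALESCE in the join conditions.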
Maybe use this SQL?
SELECT a.UserID, a.Year,
a.QuestionsOnTopicA,
b.AnswersOnTopicA,
c.QuestionsOnTopicB,
d.AnswersOnTopicB
FROM Table1 a,
Table2 b,
Table3 c,
Table4 d
WHERE a.UserID = b.UserID
AND b.UserID = c.UserID
AND c.UserID = d.UserID
AND d.UserID = a.UserID
select coalesce(a.UserID, b.UserID, c.UserID, d.UserID),
coalesce(a.Year, b.Year, c.Year, d.Year),
a.QuestionsOnTopicA, b.AnswersOnTopicA,
c.QuestionsOnTopicB, d.AnswersOnTopicB
from Table1 a full outer join Table2 b on a.UserID = b.UserID and a.Year = b.Year
full outer join Table3 c on (c.UserID = b.UserID or c.UserID = a.UserID)
and (c.Year = b.Year or c.Year = a.Year)
full outer join Table4 d on (d.UserID = c.UserID or d.UserID = b.UserID or d.UserID = a.UserID)
and (d.Year = c.Year or d.Year = b.Year or d.Year = a.Year);
First of all you should retrieve the data from the tables using inner join.
Then you should use SQL Server Pivot as shown in this link.

SQL Statement To Group Different Date Ranges as New Columns

SQL beginner here. Looking at a table of items in an Oracle DB and wanted to export items by year (each in a separate column), group them by a userid, and then sum a total field.
I can export them individually with date ranges like
WHERE DATE > '01-JAN-13'
AND DATE < '31-DEC-13'
My table 'CUSTOMER_ORDERS' looks like this:
Customer Name | Customer ID | Date | Sale
_________________________________________
Customer 1 | CUS01 | 05-JAN-13 | 110.00
Customer 2 | CUS02 | 06-JAN-11 | 110.00
Customer 3 | CUS03 | 07-JAN-12 | 70.00
Customer 1 | CUS01 | 05-JAN-12 | 10.00
Customer 2 | CUS02 | 05-JAN-11 | 210.00
Ideally I want to export something like this
Customer Name | Customer ID | 2011 Total | 2012 Total | 2013 Total
_________________________________________
Customer 1 | CUS01 | 0 | 10 | 110
Customer 2 | CUS02 | 320 | 0 | 0
Customer 3 | CUS03 | 0 | 70 | 0
I'm sure this is super simple, I just can't figure out the right way to do it.
You can use an aggregate function with a CASE expression to PIVOT the data from rows into columns:
select
CustomerName,
CustomerID,
sum(case when to_char(dt, 'YYYY') = '2011' then Sale else 0 end) Total_2011,
sum(case when to_char(dt, 'YYYY') = '2012' then Sale else 0 end) Total_2012,
sum(case when to_char(dt, 'YYYY') = '2013' then Sale else 0 end) Total_2013
from CUSTOMER_ORDERS
group by CustomerName, CustomerID;
See SQL Fiddle with Demo.
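If you want to try the CASE-based pivot without an Oracle instance, here is a rough equivalent using Python's sqlite3 module. Two things in it are assumptions of mine: the dates are stored as ISO strings (SQLite has no date type), and strftime('%Y', dt) stands in for Oracle's to_char(dt, 'YYYY'). The rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CUSTOMER_ORDERS
        (CustomerName TEXT, CustomerID TEXT, dt TEXT, Sale REAL)
""")
conn.executemany("INSERT INTO CUSTOMER_ORDERS VALUES (?, ?, ?, ?)", [
    ("Customer 1", "CUS01", "2013-01-05", 110.00),
    ("Customer 2", "CUS02", "2011-01-06", 110.00),
    ("Customer 3", "CUS03", "2012-01-07", 70.00),
    ("Customer 1", "CUS01", "2012-01-05", 10.00),
    ("Customer 2", "CUS02", "2011-01-05", 210.00),
])

query = """
SELECT CustomerName, CustomerID,
       -- each CASE passes Sale through only for its own year, else 0
       SUM(CASE WHEN strftime('%Y', dt) = '2011' THEN Sale ELSE 0 END) AS Total_2011,
       SUM(CASE WHEN strftime('%Y', dt) = '2012' THEN Sale ELSE 0 END) AS Total_2012,
       SUM(CASE WHEN strftime('%Y', dt) = '2013' THEN Sale ELSE 0 END) AS Total_2013
FROM CUSTOMER_ORDERS
GROUP BY CustomerName, CustomerID
ORDER BY CustomerID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The table is read once and each order lands in exactly one of the three sums, so a customer with several orders in one year is not counted twice.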
Depending on your version of Oracle, you might be able to use the PIVOT function if you are using Oracle 11g+:
select *
from
(
select CustomerName, CustomerId,
'Total_'||to_char(dt, 'YYYY') year, sale
from CUSTOMER_ORDERS
)
pivot
(
sum(sale)
for year in ('Total_2011', 'Total_2012', 'Total_2013')
);
See SQL Fiddle with Demo
Use the power of self-joins to subset the data in the way you need. Try something like
select c.ID , c.Name , sum(c2011.Sale) , sum(c2012.Sale) , sum( c2013.Sale )
from ( select distinct ID , Name from customer_order ) c
left join customer_order c2011 on c2011.id = c.id and year(c2011.Date) = 2011
left join customer_order c2012 on c2012.id = c.id and year(c2012.Date) = 2012
left join customer_order c2013 on c2013.id = c.id and year(c2013.Date) = 2013
group by c.ID , c.Name
order by c.ID , c.Name
To get the desired result. Alternatively...
select c.ID , c.Name ,
sum(case when year(c.Date) = 2011 then c.Sale else 0 end) ,
sum(case when year(c.Date) = 2012 then c.Sale else 0 end) ,
sum(case when year(c.Date) = 2013 then c.Sale else 0 end)
from customer_order c
group by c.ID , c.Name
order by c.ID , c.Name

SQL sum field and select a column (with condition) and sum another column

I have a select statement:
SELECT ID, A, B, C, D
FROM MyTable
GROUP BY ID, A, B, C, D
HAVING D >= '14/06/2013'
AND D <= '17/06/2013'
It shows this:
ID | A | B | C | D
--------------------------------------------
11 | 1370 | 0 | 0 | 14/06/2013
11 | 1370 | 100 | 0 | 15/06/2013
11 | 1470 | 400 | 0 | 16/06/2013
11 | 1870 | 0 | 300 | 17/06/2013
I want the result to be:
ID | min of D| Sum(B) | Sum(C) | max of D| MIN(D)
11 | 1370 | 500 | 300 | 1870 | 14/06/2013
How do I do that on SQL Server?
Here is a way (assuming SQL Server 2005+):
;WITH CTE AS
(
SELECT *,
RN1 = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY D DESC),
RN2 = ROW_NUMBER() OVER(PARTITION BY ID ORDER BY D)
FROM YourTable
WHERE D >= '20130614'
AND D <= '20130617'
)
SELECT ID,
MIN(CASE WHEN RN2 = 1 THEN A END) [min of D],
SUM(B) [Sum(B)],
SUM(C) [Sum(C)],
MIN(CASE WHEN RN1 = 1 THEN A END) [max of D],
MIN(D) [Min(D)]
FROM CTE
GROUP BY ID
Results:
╔════╦══════════╦════════╦════════╦══════════╦════════════╗
║ ID ║ MIN OF D ║ SUM(B) ║ SUM(C) ║ MAX OF D ║ MIN(D) ║
╠════╬══════════╬════════╬════════╬══════════╬════════════╣
║ 11 ║ 1370 ║ 500 ║ 300 ║ 1870 ║ 2013-06-14 ║
╚════╩══════════╩════════╩════════╩══════════╩════════════╝
And here is an sqlfiddle with a demo of this.
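The trick in the CTE is to number each ID's rows by D in both directions, so the row with RN2 = 1 carries the value of A at the earliest date and the row with RN1 = 1 carries it at the latest. A minimal sketch of the same idea with Python's sqlite3 module follows (my assumption; it needs SQLite 3.25 or later for window functions). The dates are stored as ISO strings, the aliases are renamed for readability, and the row for 2013-06-20 is invented to show the date filter doing its job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (ID INTEGER, A INTEGER, B INTEGER, C INTEGER, D TEXT)"
)
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)", [
    (11, 1370, 0, 0, "2013-06-14"),
    (11, 1370, 100, 0, "2013-06-15"),
    (11, 1470, 400, 0, "2013-06-16"),
    (11, 1870, 0, 300, "2013-06-17"),
    (11, 9999, 5, 5, "2013-06-20"),  # hypothetical row outside the range
])

query = """
WITH cte AS (
    SELECT ID, A, B, C, D,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY D)      AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY D DESC) AS rn_last
    FROM YourTable
    WHERE D >= '2013-06-14' AND D <= '2013-06-17'
)
SELECT ID,
       MIN(CASE WHEN rn_first = 1 THEN A END) AS a_at_min_d,
       SUM(B) AS sum_b,
       SUM(C) AS sum_c,
       MIN(CASE WHEN rn_last = 1 THEN A END)  AS a_at_max_d,
       MIN(D) AS min_d
FROM cte
GROUP BY ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CASE expressions return NULL for every row except the first or last one, and MIN ignores NULLs, so each of those two columns ends up holding a single value of A per ID.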
You can do that by a JOIN
SELECT T.ID ,
MAX(G.B) AS [SUM(B)],
MAX(G.C) AS [SUM(C)],
MAX(MINI)AS [MIN(D)] ,
MAX(CASE WHEN T.D = G.MINI THEN T.A ELSE NULL END ) AS [MIN OF D],
MAX(CASE WHEN T.D = G.MAXI THEN T.A ELSE NULL END ) AS [MAX OF D]
FROM TEST T
JOIN ( SELECT ID , SUM(B) B ,SUM(C) C ,MIN(D) AS MINI ,MAX(D) AS MAXI
FROM test
WHERE D >= '06/14/2013'
AND D <= '06/17/2013'
GROUP BY ID ) G ON G.ID = T.ID
GROUP BY T.ID
SQL Fiddle demo HERE

select data that has at least P and R

I have a table named Table1 as shown below:
ID AccountNo Trn_cd
1 123456 P
2 123456 R
3 123456 P
4 12345 P
5 111 R
6 111 R
7 5625 P
I would like to display the records whose AccountNo appears more than once (duplicates) and whose Trn_cd values include both P and R.
In this case the output should look like this:
ID AccountNo Trn_cd
1 123456 P
2 123456 R
3 123456 P
I have written this SQL, but it does not give the result I want:
select * from Table1
where AccountNo IN
(select accountno from table1
where trn_cd = 'P' or trn_cd = 'R'
group by AccountNo having count(*) > 1)
The result is below; AccountNo 111 shouldn't appear because there is no trn_cd P for 111:
ID AccountNo Trn_cd
1 123456 P
2 123456 R
3 123456 P
5 111 R
6 111 R
Any idea?
Use aggregation for this. To get the account numbers:
select accountNo
from table1
group by accountNo
having count(*) > 1 and
sum(case when trn_cd = 'P' then 1 else 0 end) > 0 and
sum(case when trn_cd = 'R' then 1 else 0 end) > 0
To get the account information, use a join or in statement:
select t.*
from table1 t
where t.accountno in (select accountNo
from table1
group by accountNo
having count(*) > 1 and
sum(case when trn_cd = 'P' then 1 else 0 end) > 0 and
sum(case when trn_cd = 'R' then 1 else 0 end) > 0
)
This problem is called Relational Division.
This can be solved by filtering the records that contain P or R, grouping them by AccountNo, and keeping only the accounts where COUNT(DISTINCT Trn_CD) = 2.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT AccountNo
FROM TableName
WHERE Trn_CD IN ('P','R')
GROUP BY AccountNo
HAVING COUNT(DISTINCT Trn_CD) = 2
) b ON a.AccountNO = b.AccountNo
SQLFiddle Demo
SQL of Relational Division
OUTPUT
╔════╦═══════════╦════════╗
║ ID ║ ACCOUNTNO ║ TRN_CD ║
╠════╬═══════════╬════════╣
║ 1 ║ 123456 ║ P ║
║ 2 ║ 123456 ║ R ║
║ 3 ║ 123456 ║ P ║
╚════╩═══════════╩════════╝
For faster performance, add an INDEX on column AccountNo.
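To check the COUNT(DISTINCT ...) approach against the data from the question, here is a small sketch using Python's sqlite3 module (my choice for a quick local test, not something the asker mentioned). AccountNo is stored as text here, which is an assumption, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, AccountNo TEXT, Trn_cd TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", [
    (1, "123456", "P"), (2, "123456", "R"), (3, "123456", "P"),
    (4, "12345", "P"),
    (5, "111", "R"), (6, "111", "R"),
    (7, "5625", "P"),
])

query = """
SELECT a.ID, a.AccountNo, a.Trn_cd
FROM Table1 a
INNER JOIN (
    -- accounts that have both codes: two distinct values among P and R
    SELECT AccountNo
    FROM Table1
    WHERE Trn_cd IN ('P', 'R')
    GROUP BY AccountNo
    HAVING COUNT(DISTINCT Trn_cd) = 2
) b ON a.AccountNo = b.AccountNo
ORDER BY a.ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Account 111 has two rows, but both are R, so its distinct count is 1 and it is filtered out. An account with both a P and an R necessarily has at least two rows, so no separate COUNT(*) > 1 test is needed.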