I'm trying to aggregate quarterly records by unique customer id. I would then like to filter the results by the dollar amount spent on X.
Below is what I've coded:
select year, count(*), sum($spentX), customerid
from table A
where year = 2017
group by year, customerid
having sum($spentX)<=1000
Is this correct? Also, how do I sum the results to get the 2017 total? This query lists every customerid, so I currently have to aggregate in other software.
2nd question:
How do I use the above when I join two tables?
thank you so much in advance
Your code looks correct. As written, the output would look something like the table below. It is already filtered down to 2017 and grouped by year and customer.
year | count(*) | sum($spentX) | customerid
2017 | 5 | 500.00 | 56
2017 | 8 | 800.00 | 43
2017 | 3 | 300.00 | 85
2017 | 2 | 200.00 | 61
2017 | 5 | 500.00 | 25
If you want a single total for all customers who spent less than or equal to $1000, you need to put the entire query in a subquery and sum it again:
SELECT SUB.YEAR, SUM(SUB.TOTAL_COUNT), SUM(SUB.TOTAL_SPENT)
FROM (
SELECT YEAR, COUNT(*) AS TOTAL_COUNT, SUM($SPENTX) AS TOTAL_SPENT, CUSTOMERID
FROM TABLE A
WHERE YEAR = 2017
GROUP BY YEAR, CUSTOMERID
HAVING SUM($SPENTX)<=1000
) SUB
GROUP BY SUB.YEAR
OUTPUT:
YEAR | SUM(SUB.TOTAL_COUNT) | SUM(SUB.TOTAL_SPENT)
2017 | 23 | 2300.00
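To try the two-level aggregation end to end, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. The table name purchases, the column spent, and the sample rows are made up for illustration; they stand in for TABLE A and $SPENTX.

```python
import sqlite3

# Hypothetical sample data; "purchases" and "spent" are stand-in names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (year INTEGER, customerid INTEGER, spent REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (2017, 56, 200.0), (2017, 56, 300.0),  # customer 56: 500 in total
        (2017, 43, 800.0),                     # customer 43: 800 in total
        (2017, 99, 900.0), (2017, 99, 600.0),  # customer 99: 1500, removed by HAVING
        (2016, 56, 100.0),                     # other year, removed by WHERE
    ],
)

# Inner query: one row per customer, kept only if the customer's sum <= 1000.
# Outer query: collapse those per-customer rows into one row per year.
totals = conn.execute(
    """
    SELECT sub.year, SUM(sub.total_count), SUM(sub.total_spent)
    FROM (
        SELECT year, customerid,
               COUNT(*)   AS total_count,
               SUM(spent) AS total_spent
        FROM purchases
        WHERE year = 2017
        GROUP BY year, customerid
        HAVING SUM(spent) <= 1000
    ) AS sub
    GROUP BY sub.year
    """
).fetchall()

print(totals)
```

The HAVING filter is applied per customer before the outer SUM, so customers above the threshold never contribute to the yearly total.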
If you wanted to join that to another table, say one with customer information, it would look like this:
SELECT SUB.* , C.CUSTOMER_NAME
FROM (
SELECT YEAR, COUNT(*) AS TOTAL_COUNT, SUM($SPENTX) AS TOTAL_SPENT, CUSTOMERID
FROM TABLE A
WHERE YEAR = 2017
GROUP BY YEAR, CUSTOMERID
HAVING SUM($SPENTX)<=1000
) SUB
INNER JOIN TABLE_CUSTOMERS C ON C.CUSTOMERID = SUB.CUSTOMERID
The output for this query would look something like this:
YEAR | TOTAL_COUNT | TOTAL_SPENT | customerid | CUSTOMER_NAME
2017 | 5 | 500.00 | 56 | LARRY
2017 | 8 | 800.00 | 43 | MARGE
2017 | 3 | 300.00 | 85 | JOHN
2017 | 2 | 200.00 | 61 | RICK
2017 | 5 | 500.00 | 25 | SAM
First of all, I wish to say hi to the community here. The posts here have been a great help with VBA but this is my first question ever. I have a task that I need to solve in SQL (in MS Access) but it's sort of new to me and the task seems to be too complex.
I have a table in Access with the following structure(let's call it Tinvoices):
invoice | year | date | company | step | agent
5110001 | 2019 | 15/01/2019 | 1201 | 0 | John
5110001 | 2019 | 15/01/2019 | 1201 | 1 | Jack
5110002 | 2019 | 10/02/2019 | 1202 | 0 | John
5110002 | 2019 | 10/02/2019 | 1202 | 1 | Jack
5110002 | 2019 | 10/02/2019 | 1202 | 2 | Daniel
5110002 | 2019 | 10/02/2019 | 1202 | 3 | John
5110003 | 2019 | 12/03/2019 | 1205 | 0 | Jack
5110003 | 2019 | 12/03/2019 | 1205 | 1 | Daniel
5110003 | 2019 | 12/03/2019 | 1205 | 2 | David
This table relates to actions on invoices. Invoices and their related data are repeated with each step.
There is another table, which contains agents belonging to a certain department (let's call it Tdeptusers):
agent
John
Jack
What I need to do is the following: have one distinct line per invoice (the unique key is the combination of invoice, year and company), counting separately how many steps have been done by users in the Tdeptusers table and how many by users who are not in Tdeptusers. Something like this:
invoice | year | month | company | actionsByOurDept | actionsByOthers
5110001 | 2019 | 1 | 1201 | 2 | 0
5110002 | 2019 | 2 | 1202 | 3 | 1
5110003 | 2019 | 3 | 1205 | 1 | 2
I'm a complete beginner, so please excuse me if the code below isn't very usable. I got stuck after the absolute basics. I have something like this:
SELECT
invoice,
year,
DatePart("m", Date) AS month,
company,
Sum(IIf(i.agent IN(d.agent), 1, 0)) AS actionsByOurDept,
Sum(IIf(i.agent IN(d.agent), 0, 1)) AS actionsByOthers
FROM Tinvoices AS i, Tdeptusers AS d
GROUP BY invoice, year, DatePart("m", Date), company;
This doesn't give back the desired result, especially in actionsByOthers, where I get huge numbers instead. Maybe something similar to this solution might work, but I haven't been able to do it.
Much appreciation for the help, folks.
Use proper, standard, explicit JOIN syntax:
SELECT i.invoice, year, DatePart("m", i.Date) AS month, i.company,
SUM( IIF(d.agent IS NOT NULL, 1, 0) ) AS actionsByOurDept,
SUM( IIf(d.agent IS NULL, 1, 0) ) AS actionsByOthers
FROM Tinvoices AS i LEFT JOIN
Tdeptusers AS d
ON d.agent = i.agent
GROUP BY i.invoice, i.year, DatePart("m", i.Date), i.company;
Use left join:
SELECT invoice, year, DatePart("m", Date) AS month, company,
COUNT(d.agent) AS actionsByOurDept,
SUM(IIF(d.agent IS NULL, 1, 0)) AS actionsByOthers
FROM Tinvoices AS i LEFT JOIN
Tdeptusers AS d
ON d.agent = i.agent
GROUP BY invoice, year, DatePart("m", Date), company;
You can directly count your department's users using COUNT().
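To see why the LEFT JOIN fixes the huge numbers, here is a small sketch using SQLite from Python with the sample rows from the question. Two things are assumptions made for portability: the month is stored as a plain column instead of calling DatePart, and CASE is used in place of Access's IIf.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified copy of Tinvoices: month is stored directly (assumption).
conn.execute(
    "CREATE TABLE Tinvoices (invoice INTEGER, year INTEGER, month INTEGER, "
    "company INTEGER, step INTEGER, agent TEXT)"
)
conn.execute("CREATE TABLE Tdeptusers (agent TEXT)")
conn.executemany(
    "INSERT INTO Tinvoices VALUES (?, ?, ?, ?, ?, ?)",
    [
        (5110001, 2019, 1, 1201, 0, "John"),
        (5110001, 2019, 1, 1201, 1, "Jack"),
        (5110002, 2019, 2, 1202, 0, "John"),
        (5110002, 2019, 2, 1202, 1, "Jack"),
        (5110002, 2019, 2, 1202, 2, "Daniel"),
        (5110002, 2019, 2, 1202, 3, "John"),
        (5110003, 2019, 3, 1205, 0, "Jack"),
        (5110003, 2019, 3, 1205, 1, "Daniel"),
        (5110003, 2019, 3, 1205, 2, "David"),
    ],
)
conn.executemany("INSERT INTO Tdeptusers VALUES (?)", [("John",), ("Jack",)])

# Each invoice row matches at most one department user, so no rows are
# multiplied. Unmatched rows get d.agent = NULL, which COUNT() skips.
result = conn.execute(
    """
    SELECT i.invoice, i.year, i.month, i.company,
           COUNT(d.agent) AS actionsByOurDept,
           SUM(CASE WHEN d.agent IS NULL THEN 1 ELSE 0 END) AS actionsByOthers
    FROM Tinvoices AS i
    LEFT JOIN Tdeptusers AS d ON d.agent = i.agent
    GROUP BY i.invoice, i.year, i.month, i.company
    ORDER BY i.invoice
    """
).fetchall()

for row in result:
    print(row)
```

The original comma join (FROM Tinvoices AS i, Tdeptusers AS d) pairs every invoice row with every department user, which is what inflated the counts.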
I have created a calculation in Microsoft SQL Server Management Studio that creates a running total per company and quarter, but at a monthly level and this part works fine.
So if company X sold 40 apples, hypothetically, in Jan and then 60 in Feb, then the running total in Feb would be 100 and if they sold 30 in March, then March's running total would be 130 and then in April it would reset for the new quarter.
What I need now is to find the MAX of these values, per month across all companies. So if Company 'X' sold 100 in Feb, but Company 'Y' sold 150, I want to return 150.
The calculation I use to get the rolling values per quarter calls on two functions to calculate the quarter each month falls into, as well as the relevant Fiscal Period / year ('GetQuarter' and 'GetFiscalPeriod' being the functions).
So my question is, is there any way to find the max at a different level of detail (in this case across ALL Companies) when the value you are looking at is already aggregated at Company level?
I'm told Stored Procedures would make this a lot simpler but the software I use can't call on Stored Procedures, only views and tables.
SELECT
cm.Company_Code,
cm.[Date],
cm.Measure,
SUM(cm.Actual) OVER (
PARTITION BY (
SELECT dbo.GetQuarter(SUBSTRING(cm.[Date], 5, 2))),
cm.Measure,
cm.Company_Code,
(LEFT((SELECT dbo.GetFiscalPeriod(cm.[Date])), 4))
ORDER BY cm.[Date]
) AS Current_QTD_Actual
FROM mytable cm
Desired Output would look like the "MAX" field below:
+--------------+--------+-----+-----+----------+---------+-----+------------+
| Company_Code | Actual | QTD | MAX | Date | Measure | QTR | FiscalYear |
| AAA | 40 | 40 | 40 | 20180701 | Bananas | Q1 | 2019 |
| BBB | 35 | 35 | 40 | 20180701 | Bananas | Q1 | 2019 |
| AAA | 60 | 100 | 105 | 20180801 | Bananas | Q1 | 2019 |
| BBB | 70 | 105 | 105 | 20180801 | Bananas | Q1 | 2019 |
| AAA | 30 | 130 | 150 | 20180901 | Bananas | Q1 | 2019 |
| BBB | 45 | 150 | 150 | 20180901 | Bananas | Q1 | 2019 |
| AAA | 25 | 25 | 45 | 20181001 | Bananas | Q2 | 2019 |
| BBB | 45 | 45 | 45 | 20181001 | Bananas | Q2 | 2019 |
| AAA | 30 | 55 | 85 | 20181101 | Bananas | Q2 | 2019 |
| BBB | 40 | 85 | 85 | 20181101 | Bananas | Q2 | 2019 |
+--------------+--------+-----+-----+----------+---------+-----+------------+
As the QTD calculation I currently have is already a rolled up SUM, simply wrapping this in a MAX function does not work for obvious reasons.
I tried creating a temporary table within the calculation using examples I've seen online, which I then call back into the original table and max that value but I think my syntax is wrong because it never comes out right (I'm still a novice so temporary table syntaxes still elude me quite a bit).
You seem to want the cumulative sum of the maximum values for each month. If this is correct, you can use two levels of window functions:
select measure, fiscalyear, qtr, date, actual,
sum(actual) over (partition by measure, fiscalyear, qtr order by date) as running_actual
from (select t.*,
row_number() over (partition by measure, date order by actual desc) as seqnum
from t
) t
where seqnum = 1;
You can't stack aggregates together in the same SELECT, with the only exception of applying a windowed aggregate (with an OVER clause) over a regular aggregate. For example:
SELECT
T.GroupedColumn,
RowsByGroup = COUNT(*), -- Regular aggregate
SumOfAllRows = SUM(COUNT(*)) OVER () -- Windowed aggregate of a regular one
FROM
MyTable AS T
GROUP BY
T.GroupedColumn
You can, however, apply them if you wrap the former in a subquery or CTE, which also makes the query more readable IMO. I believe you are looking for something like the following:
;WITH RunningSumPerQuarterPerCompany AS
(
SELECT
cm.Company_Code,
cm.[Date],
cm.Measure,
Current_QTD_Actual = SUM(cm.Actual) OVER (
PARTITION BY
dbo.GetQuarter(SUBSTRING(cm.[Date], 5, 2)),
cm.Measure,
cm.Company_Code,
LEFT(dbo.GetFiscalPeriod(cm.[Date]), 4)
ORDER BY
cm.[Date]),
-- Add additional PARTITION BY columns for the GROUP BY later on
Quarter = dbo.GetQuarter(SUBSTRING(cm.[Date], 5, 2)),
FiscalPeriod = LEFT(dbo.GetFiscalPeriod(cm.[Date]), 4)
FROM
mytable cm
),
MaxRunningSumPerQuarter AS
(
SELECT
R.Quarter,
R.FiscalPeriod,
Max_Current_QTD_Actual = MAX(R.Current_QTD_Actual)
FROM
RunningSumPerQuarterPerCompany AS R
GROUP BY
R.Quarter,
R.FiscalPeriod -- GROUP BY whichever dimension you need
)
SELECT
R.*,
M.Max_Current_QTD_Actual
FROM
RunningSumPerQuarterPerCompany AS R
LEFT JOIN MaxRunningSumPerQuarter AS M ON
R.Quarter = M.Quarter AND
R.FiscalPeriod = M.FiscalPeriod -- Join by the GROUP BY columns to display the MAX
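Note that grouping by Quarter and FiscalPeriod gives one maximum per quarter, while the desired output in the question shows a maximum per month. As a hedged sketch of that per-month variant, the following runs in SQLite from Python (window functions need SQLite 3.25 or newer). The table name sales and the stored qtr and fiscal_year columns are assumptions standing in for mytable and the dbo.GetQuarter / dbo.GetFiscalPeriod calls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# qtr and fiscal_year are stored directly here (assumption); in the original
# they come from the dbo.GetQuarter / dbo.GetFiscalPeriod functions.
conn.execute(
    "CREATE TABLE sales (company TEXT, date TEXT, qtr TEXT, "
    "fiscal_year INTEGER, measure TEXT, actual INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("AAA", "20180701", "Q1", 2019, "Bananas", 40),
        ("BBB", "20180701", "Q1", 2019, "Bananas", 35),
        ("AAA", "20180801", "Q1", 2019, "Bananas", 60),
        ("BBB", "20180801", "Q1", 2019, "Bananas", 70),
        ("AAA", "20180901", "Q1", 2019, "Bananas", 30),
        ("BBB", "20180901", "Q1", 2019, "Bananas", 45),
        ("AAA", "20181001", "Q2", 2019, "Bananas", 25),
        ("BBB", "20181001", "Q2", 2019, "Bananas", 45),
        ("AAA", "20181101", "Q2", 2019, "Bananas", 30),
        ("BBB", "20181101", "Q2", 2019, "Bananas", 40),
    ],
)

# Step 1 (CTE): running total per company within each quarter.
# Step 2: a second window takes the MAX of that running total across
# all companies for the same measure and month.
rows = conn.execute(
    """
    WITH running AS (
        SELECT company, date, measure, actual,
               SUM(actual) OVER (
                   PARTITION BY company, measure, fiscal_year, qtr
                   ORDER BY date
               ) AS qtd
        FROM sales
    )
    SELECT company, date, qtd,
           MAX(qtd) OVER (PARTITION BY measure, date) AS max_qtd
    FROM running
    ORDER BY date, company
    """
).fetchall()

for row in rows:
    print(row)
```

Because the second window is computed over the CTE's output, no join back is needed, and the whole thing can live in a view.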
I have a table (dataset_final) that contains the number of sales (field quantity) of goods in a particular store for a particular week of the year. There are about 200 thousand unique goods and about 50 stores, over a period of 6 years.
dataset_final
+---------+-------------+---------+----------+----------+
| year_id | week_number | good_id | store_id | quantity |
+---------+-------------+---------+----------+----------+
| 2017 | 37 | 137233 | 9 | 1 |
+---------+-------------+---------+----------+----------+
| 2017 | 38 | 137233 | 9 | 4 |
+---------+-------------+---------+----------+----------+
| 2017 | 40 | 137233 | 9 | 3 |
+---------+-------------+---------+----------+----------+
| 2016 | 35 | 152501 | 23 | 6 |
+---------+-------------+---------+----------+----------+
| 2016 | 37 | 152501 | 23 | 3 |
+---------+-------------+---------+----------+----------+
I would like to fill in the missing values with zero, i.e. when a combination of good and store had no sales in a certain week of the year. For example:
+---------+-------------+---------+----------+----------+
| year_id | week_number | good_id | store_id | quantity |
+---------+-------------+---------+----------+----------+
| 2017 | 37 | 137233 | 9 | 1 |
+---------+-------------+---------+----------+----------+
| 2017 | 38 | 137233 | 9 | 4 |
+---------+-------------+---------+----------+----------+
| 2017 | 40 | 137233 | 9 | 3 |
+---------+-------------+---------+----------+----------+
| 2016 | 35 | 152501 | 23 | 6 |
+---------+-------------+---------+----------+----------+
| 2016 | 37 | 152501 | 23 | 3 |
+---------+-------------+---------+----------+----------+
| 2017 | 39 | 137233 | 9 | 0 |
+---------+-------------+---------+----------+----------+
| 2016 | 36 | 152501 | 23 | 0 |
+---------+-------------+---------+----------+----------+
I wanted to do this: find all unique combinations of year_id, week_number, good_id, store_id and add only those that are not in the dataset_final table. My query:
WITH t1 AS (SELECT DISTINCT
[year_id]
,[week_number]
,[good_id]
,[store_id]
FROM [fs_db].[dbo].[ds_dataset_final]),
t2 AS (SELECT DISTINCT [year_id], [week_number] FROM [fs_db].[dbo].[ds_dataset_final])
SELECT t2.[year_id], t2.[week_number], t1.[good_id], t1. [store_id] FROM t1
full join t2 ON t2.[year_id]=t1.[year_id] AND t2.[week_number]=t2.[week_number]
This query produces about 1.2 billion unique combinations, which seems like too many.
Also, I only want combinations from the point where a good started selling. For example, if the table has sales of a particular product only from 2017 onwards, then I do not need to fill in earlier data.
The basic idea is to generate all the rows using cross join and then use left join to bring in the values.
Assuming you have all year/week combinations in your original table and have all the goods and stores in the table, you can use:
select yw.year_id, yw.week_number,
g.good_id, s.store_id,
coalesce(d.quantity, 0) as quantity
from (select distinct year_id, week_number
from fs_db..ds_dataset_final
) yw cross join
(select distinct good_id
from fs_db..ds_dataset_final
) g cross join
(select distinct store_id
from fs_db..ds_dataset_final
) s left join
fs_db..ds_dataset_final d
on d.year_id = yw.year_id and
d.week_number = yw.week_number and
d.good_id = g.good_id and
d.store_id = s.store_id;
You may have other sources for each of the dimensions (such as a proper dimension table). If so, don't use select distinct but use the reference tables.
EDIT:
Just add this as the last line of the query:
where yw.year_id >= 2015 and yw.year_id < 2019
if you want the years 2015, 2016, 2017, and 2018.
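For a quick check of the cross join plus left join idea on a tiny data set, here is a sketch using SQLite from Python. The four sample rows are made up, and the table is created locally without the fs_db prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ds_dataset_final (year_id INTEGER, week_number INTEGER, "
    "good_id INTEGER, store_id INTEGER, quantity INTEGER)"
)
# Made-up sample: 4 distinct weeks, 2 goods, 2 stores.
conn.executemany(
    "INSERT INTO ds_dataset_final VALUES (?, ?, ?, ?, ?)",
    [
        (2017, 37, 137233, 9, 1),
        (2017, 38, 137233, 9, 4),
        (2017, 40, 137233, 9, 3),
        (2017, 39, 152501, 23, 6),
    ],
)

# Build every week x good x store combination, then pull in the real
# quantity where a row exists and default to 0 where it does not.
filled = conn.execute(
    """
    SELECT yw.year_id, yw.week_number, g.good_id, s.store_id,
           COALESCE(d.quantity, 0) AS quantity
    FROM (SELECT DISTINCT year_id, week_number FROM ds_dataset_final) AS yw
    CROSS JOIN (SELECT DISTINCT good_id FROM ds_dataset_final) AS g
    CROSS JOIN (SELECT DISTINCT store_id FROM ds_dataset_final) AS s
    LEFT JOIN ds_dataset_final AS d
           ON d.year_id = yw.year_id
          AND d.week_number = yw.week_number
          AND d.good_id = g.good_id
          AND d.store_id = s.store_id
    """
).fetchall()

# Index by (year, week, good, store) for easy lookups.
by_key = {row[:4]: row[4] for row in filled}
print(len(filled), by_key[(2017, 39, 137233, 9)])
```

The row count is always weeks x goods x stores, so on the real data it is worth estimating that product before running the query.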
This is very much pseudo-SQL, since I don't know what your actual database looks like, but it should get you on the right path. You'll need to replace objects like dbo.Store with your actual objects, and I suggest creating a proper calendar table:
--This should really be a full calendar table, but we'll make a sample here
CREATE TABLE dbo.Weeks (Year int,
Week int);
INSERT INTO dbo.Weeks (Year, Week)
SELECT Y.Year,
W.Week
FROM (VALUES(2016),(2017),(2018),(2019))Y(Year)
CROSS APPLY (SELECT TOP 52 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Week
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N1(N),
(VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N2(N)) W
GO
WITH CTE AS(
SELECT W.Year,
W.Week,
S.StoreID,
G.GoodsID
FROM dbo.Weeks W
CROSS JOIN dbo.Store S
CROSS JOIN dbo.Goods G
WHERE EXISTS (SELECT 1
FROM dbo.YourTable YT
WHERE YT.year_id <= W.Year
AND YT.store_id = S.StoreID
AND YT.good_id = G.GoodsID)) --only years from this good's first sale in this store
SELECT C.Year,
C.Week,
C.StoreID,
C.GoodsID,
ISNULL(YT.quantity,0) AS quantity
FROM CTE C
LEFT JOIN YourTable YT ON C.Year = YT.year_id
AND C.Week = YT.week_number
AND C.StoreID = YT.store_id
AND C.GoodsID = YT.good_id
--WHERE?
The table structure is as follows:
+---------------+---------+---------+
| customer_name | date | balance |
+---------------+---------+---------+
| 123 | june 14 | 20 |
| 123 | june 15 | 30 |
| 1234 | june 14 | 30 |
| 12345 | june 16 | 50 |
+---------------+---------+---------+
I would like to join the table to itself, keeping 2014 as my base data set, and analyse trends to see which customers' balances don't change from 2014.
For example, I would like to show the below:
+-----------+-----------+-----------+
| customer  | june14bal | june15bal |
+-----------+-----------+-----------+
| 1234 | 30 | null |
| 123 | 20 | 30 |
+-----------+-----------+-----------+
I have tried multiple left joins but can't seem to get it working. The most important thing is that my sample starts with records from 2014 only.
Current script:
with TABLE_DATA as
(
select Customer ,DATE, Balance
from table
where dATE in ('30-JUN-2014','30-juN-2015')
)
SELECT
sum(inv1.balance) as year1bal,
sum(inv2.balance) as year2bal,
customer,
date
from table_datA inv1
left join TABLE_DATA inv2
on inv1.customer= inv2.customer and inv2.as_of_Date = '30-June-2015'
group by date, customer
You can add a HAVING clause after the GROUP BY, like:
having sum(inv1.balance) != sum(inv2.balance)
Or try the query below:
with table2014 as
(
select Customer ,sum(Balance) Balance2014
from tableName
where dATE ='30-JUN-2014' group by Customer
)
,Table2015 as
( select Customer ,sum( Balance) Balance2015
from tableName
where dATE ='30-juN-2015' group by Customer
)
SELECT
inv1.customer,Balance2014, Balance2015
from table2014 inv1
left join Table2015 inv2
on inv1.customer= inv2.customer
--where Balance2014 !=Balance2015
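Here is a small runnable sketch of the two-CTE approach, using SQLite from Python with the four sample rows from the question. The table name balances, the column as_of, and the ISO date strings are assumptions; the sample table only says "june 14" and so on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "balances" and "as_of" are hypothetical names; dates are ISO strings.
conn.execute("CREATE TABLE balances (customer INTEGER, as_of TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [
        (123, "2014-06-30", 20),
        (123, "2015-06-30", 30),
        (1234, "2014-06-30", 30),
        (12345, "2016-06-30", 50),
    ],
)

# The 2014 CTE is the left side, so only customers present in 2014 appear.
# Customers with no 2015 row keep a NULL (None in Python) for that year.
trend = conn.execute(
    """
    WITH b2014 AS (
        SELECT customer, SUM(balance) AS balance2014
        FROM balances
        WHERE as_of = '2014-06-30'
        GROUP BY customer
    ),
    b2015 AS (
        SELECT customer, SUM(balance) AS balance2015
        FROM balances
        WHERE as_of = '2015-06-30'
        GROUP BY customer
    )
    SELECT b2014.customer, balance2014, balance2015
    FROM b2014
    LEFT JOIN b2015 ON b2015.customer = b2014.customer
    ORDER BY b2014.customer
    """
).fetchall()

print(trend)
```

One thing to keep in mind if you enable the commented-out where clause: a comparison against NULL is never true, so customers without a 2015 row would drop out of the result.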
I have Data that looks like:
ID | Year | State | Cost
----+-----------+-----------+-----------
1 | 2012 | CA | 10
2 | 2009 | FL | 90
3 | 2005 | MA | 50
2 | 2009 | FL | 75
1 | 2012 | CA | 110
I need it to look like:
ID | Year | State | Cost
----+-----------+-----------+-----------
1 | 2012 | CA | 120
2 | 2009 | FL | 165
3 | 2005 | MA | 50
So I need the year to remain the same, the state to remain the same, but the cost to be summed for each ID.
I know how to do the summing, but I don't know how to make the year and state stay the same.
Use a GROUP BY (Totals) query. Something like:
SELECT
ID,
[Year],
State,
Sum(Cost) As TotalCost
FROM
yourTable
GROUP BY
ID,
[Year],
State;
The GROUP BY clause groups the records that share the same ID, Year and State. Sum then adds up the Cost column within each group, so Year and State stay the same while Cost is totalled.
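If you want to try this outside Access, here is a minimal sketch with the sample data in SQLite from Python. The table name costs is a stand-in for yourTable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "costs" is a hypothetical table name holding the sample rows.
conn.execute("CREATE TABLE costs (ID INTEGER, Year INTEGER, State TEXT, Cost INTEGER)")
conn.executemany(
    "INSERT INTO costs VALUES (?, ?, ?, ?)",
    [
        (1, 2012, "CA", 10),
        (2, 2009, "FL", 90),
        (3, 2005, "MA", 50),
        (2, 2009, "FL", 75),
        (1, 2012, "CA", 110),
    ],
)

# Every non-aggregated column in the SELECT list also appears in GROUP BY.
summed = conn.execute(
    """
    SELECT ID, Year, State, SUM(Cost) AS TotalCost
    FROM costs
    GROUP BY ID, Year, State
    ORDER BY ID
    """
).fetchall()

print(summed)
```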