Database query sql - sql

Company would like to find out if there is any correlation between different months of a year
and demerit codes so you have been assigned to generate a report that shows for ALL the
demerits, the code, description, total number of offences committed for the demerit code
so far in any month (of any year) and then the total of offences committed for the demerit
code in each month (of any year). The column headings in your output should be renamed
as Demerit Code, Demerit Description, Total Offences (All Months), and then the first three
letters of each month (with the first letter in uppercase). The output must be sorted by
Total Offences (All Months) column in descending format and where there is more than
one demerit code with the same total, sort them by demerit code in ascending format. Your
output must have the form shown below. Your output can clearly be different from the
following output.
select d.dem_code as "Demerit Code", d.dem_description as "Demerit Description", count(o.off_no) as " Total Offences (All Months)",
select off_datetime from offence where to_char(off_datetime, 'Mon') == 'Jan'
from demerit d join offence o
on d.dem_code = o.dem_code
group by d.dem_code, dem_description
Can anyone tell me what's wrong with my query, and what the solution would be? I've attached a copy of how the output should look.
(Note: offence and demerit are different tables.)

select
[Demerit Code],[Demerit Description],[Total Offences (All Months)],
isnull(Jan,0) [Jan],isnull(Feb,0) [Feb],isnull(Mar,0) [Mar],isnull(Apr,0) [Apr],isnull(May,0) [May],
isnull(Jun,0) [Jun],isnull(Jul,0) [Jul],isnull(Aug,0) [Aug],isnull(Sep,0) [Sep],isnull(Oct,0) [Oct],
isnull(Nov,0) [Nov],isnull(Dec,0) [Dec]
from (
-- one row per offence; the left join keeps demerits that have no offences
select d.dem_code as [Demerit Code],
d.dem_description as [Demerit Description],
o.off_no,
-- count the offences per demerit code (summing off_no would add up the ids)
count(o.off_no) over (partition by d.dem_code) as [Total Offences (All Months)],
FORMAT(o.off_datetime, 'MMM', 'en-US') as DateExp
from demerit d
left join offence o
on d.dem_code = o.dem_code
) as demerits
pivot (
count(demerits.off_no)
FOR DateExp IN (
[Jan],
[Feb],
[Mar],
[Apr],
[May],
[Jun],
[Jul],
[Aug],
[Sep],
[Oct],
[Nov],
[Dec])
) as pivotdemerits
order by [Total Offences (All Months)] desc, [Demerit Code] asc
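For engines without PIVOT, the same report can be sketched with conditional aggregation. The example below is only an illustration: it uses Python's sqlite3 module as a stand-in database, the sample rows are invented, and strftime('%m', ...) replaces the month formatting of the original DBMS. The table and column names are taken from the question.

```python
import sqlite3

# In-memory stand-in database; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demerit (dem_code INTEGER PRIMARY KEY, dem_description TEXT);
CREATE TABLE offence (off_no INTEGER PRIMARY KEY, off_datetime TEXT, dem_code INTEGER);
INSERT INTO demerit VALUES (1, 'Speeding'), (2, 'No seatbelt'), (3, 'Unused code');
INSERT INTO offence VALUES
    (10, '2018-01-05 09:00:00', 1),
    (11, '2019-01-20 14:30:00', 1),
    (12, '2019-02-11 08:15:00', 1),
    (13, '2019-02-02 17:45:00', 2);
""")

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# One SUM(CASE ...) column per month, counted across all years.
template = ("SUM(CASE WHEN strftime('%m', o.off_datetime) = '{num:02d}' "
            "THEN 1 ELSE 0 END) AS \"{name}\"")
month_cols = ",\n       ".join(
    template.format(num=i, name=m) for i, m in enumerate(months, start=1)
)

query = f"""
SELECT d.dem_code        AS "Demerit Code",
       d.dem_description AS "Demerit Description",
       COUNT(o.off_no)   AS "Total Offences (All Months)",
       {month_cols}
FROM demerit d
LEFT JOIN offence o ON o.dem_code = d.dem_code  -- keep demerits with no offences
GROUP BY d.dem_code, d.dem_description
ORDER BY COUNT(o.off_no) DESC, d.dem_code ASC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN is what keeps a demerit code with no offences in the report, with 0 in every column.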

Related

RFM Analysis Not Outputting All Customer IDs

So I'm working on an RFM analysis, and with lots of help, was able to put together the following query that outputs the customer_id, r score, f score, m score, and lastly a combined rfm score:
--This will first create quintiles using the ntile function
--Then factor in the conditions
--Then combine the score
--Then the substrings will separate each score's individual points
SELECT *,
SUBSTRING(rfm_combined,1,1) AS recency_score,
SUBSTRING(rfm_combined,2,1) AS frequency_score,
SUBSTRING(rfm_combined,3,1) AS monetary_score
FROM (
SELECT
customer_id,
rfm_recency*100 + rfm_frequency*10 + rfm_monetary AS rfm_combined
FROM
(SELECT
customer_id,
ntile(5) over (order by last_order_date) AS rfm_recency,
ntile(5) over (order by count_order) AS rfm_frequency,
ntile(5) over (order by total_spent) AS rfm_monetary
FROM
(SELECT
customer_id,
MAX(oms_order_date) AS last_order_date,
COUNT(*) AS count_order,
SUM(quantity_ordered * unit_price_amount) AS total_spent
FROM
l_dmw_order_report
WHERE
order_type NOT IN ('Sales Return', 'Sales Price Adjustment')
AND item_description_1 NOT IN ('freight', 'FREIGHT', 'Freight')
AND line_status NOT IN ('CANCELLED', 'HOLD')
AND oms_order_date BETWEEN '2019-01-01' AND CURRENT_DATE
AND customer_id = 'US621111112234061'
GROUP BY customer_id))
ORDER BY customer_id desc)
In the above, you will notice that I am forcing it to only output on a particular customer_id. That is because I wanted to test to see if this query is accounting for when a customer_id appears in multiple YearMonth categories (because they could have bought in Jan, then again in Feb, then again in Nov).
The issue here is that, although the query outputs the right scores, it only seems to be accounting for the customer_id once, regardless of if it appears in multiple months. For this particular customer ID, I see that they appear in Jan 2019, Feb 2019, and Nov 2019, so it should be giving me 3 rows instead of just 1. Been testing for a few hours and can't seem to find the cause, but I suspect that my grouping may be wrong.
Thank you for your help and let me know if you have any questions!!
Best,
Z
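The grouping suspicion can be checked on a small scale. The sketch below assumes the goal is one row per customer per year-month; it uses Python's sqlite3 module as a stand-in database with invented rows, and only the table and column names come from the query above. Grouping by customer_id alone collapses all months into a single row, while adding the year-month to the GROUP BY yields one row per month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE l_dmw_order_report (
    customer_id TEXT, oms_order_date TEXT,
    quantity_ordered INTEGER, unit_price_amount REAL
);
-- invented sample orders for a single hypothetical customer
INSERT INTO l_dmw_order_report VALUES
    ('C1', '2019-01-10', 1, 10.0),
    ('C1', '2019-02-03', 2, 5.0),
    ('C1', '2019-11-21', 1, 7.5),
    ('C1', '2019-11-25', 1, 2.5);
""")

# Grouping by customer only: every month is folded into one row.
per_customer = conn.execute("""
    SELECT customer_id,
           COUNT(*) AS count_order,
           SUM(quantity_ordered * unit_price_amount) AS total_spent
    FROM l_dmw_order_report
    GROUP BY customer_id
""").fetchall()

# Grouping by customer and year-month: one row per month with orders.
per_month = conn.execute("""
    SELECT customer_id,
           strftime('%Y-%m', oms_order_date) AS year_month,
           COUNT(*) AS count_order,
           SUM(quantity_ordered * unit_price_amount) AS total_spent
    FROM l_dmw_order_report
    GROUP BY customer_id, strftime('%Y-%m', oms_order_date)
    ORDER BY year_month
""").fetchall()

print(per_customer)
print(per_month)
```

If scores are wanted per month, the ntile windows would then also need to be computed over the per-month rows rather than the per-customer rows.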

getting a distinct count from with a date field

I have a piece of code that should return the distinct count of kegs per week, plus the counts of distinct kegs that are tagged and untagged. What I have so far is:
with CTE as
(select UID_KEG, IS_TAGGED, movement_date
from MOVEMENT M
inner join Keg on M.UID_Keg = Keg.Unique_ID
where DATEPART(year,Movement_date) = '2019'
and UID_MOVEMENT_TYPE = 1
)
select COUNT(Distinct CTE.UID_KEG) as 'Kegs', datepart(week,movement_date)
as 'Week number',
SUM(case when Is_Tagged = 1 then 1 end) as 'tagged',
SUM(case when Is_Tagged = 0 then 1 end) as 'untagged'
from CTE
group by datepart(week,movement_date)
order by [Week number] asc
It correctly returns a distinct count of the kegs, but the figures for tagged and untagged are incorrect, and I can only assume that is because it's counting duplicate kegs.
Can anyone advise how I can get around this, or how to count just the distinct kegs?
You want conditional aggregation using COUNT(DISTINCT). That would be:
SELECT COUNT(DISTINCT CTE.UID_KEG) as Kegs,
datepart(week, movement_date) as Week_number,
COUNT(DISTINCT CASE WHEN Is_Tagged = 1 THEN CTE.UID_KEG END) as tagged,
COUNT(DISTINCT CASE WHEN Is_Tagged = 0 THEN CTE.UID_KEG END) as untagged
FROM CTE
GROUP BY datepart(week, movement_date)
ORDER BY MIN(movement_date);
Notes:
The tagged and untagged counts may still add up to more than the total count, assuming that kegs can be both tagged and untagged in a single week.
You should include the year() as well as the week, especially because you are not selecting data from a single year.
Only use single quotes for string and date constants. Do not use them for column aliases; that can lead to hard-to-debug errors.
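The difference between counting rows and counting distinct kegs can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in database with invented rows, a single hypothetical movement table instead of the MOVEMENT/Keg join, and strftime('%W', ...) in place of DATEPART(week, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movement (uid_keg INTEGER, is_tagged INTEGER, movement_date TEXT);
-- keg 1 is scanned twice in the same week, keg 2 once (invented data)
INSERT INTO movement VALUES
    (1, 1, '2019-01-07'),
    (1, 1, '2019-01-08'),
    (2, 0, '2019-01-09');
""")

row = conn.execute("""
    SELECT strftime('%W', movement_date) AS week_number,
           COUNT(DISTINCT uid_keg) AS kegs,
           -- counts movement rows, so keg 1 is counted twice
           SUM(CASE WHEN is_tagged = 1 THEN 1 ELSE 0 END) AS tagged_rows,
           -- counts each keg once; the CASE yields NULL for non-matching rows
           COUNT(DISTINCT CASE WHEN is_tagged = 1 THEN uid_keg END) AS tagged,
           COUNT(DISTINCT CASE WHEN is_tagged = 0 THEN uid_keg END) AS untagged
    FROM movement
    GROUP BY strftime('%W', movement_date)
""").fetchone()

print(row)
```

COUNT ignores NULLs, which is why leaving out the ELSE branch restricts each distinct count to the matching kegs.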
If you remove the DISTINCT from your count, the sum of untagged and tagged should equal your total (if Is_Tagged is a binary 0 or 1). This indicates that you have duplicate UID_KEG values. Take some time to understand why. Part of your problem is that you don't seem to know the shape of your dataset very well.
Look at the data to understand whether there are duplicates (and why: are they caused by the join, or are they in the base data?), and check whether the same keg can appear as both tagged and untagged.
EDIT: In response to your comment. If kegs can be scanned twice, you will have to assume that if Is_Tagged = 1 for any row of a UID_KEG in that day, then all rows with that UID_KEG are tagged.
In that case you will have to adapt the code to use this assumption.
WITH CTE
AS (
SELECT UID_KEG
,IS_TAGGED
,movement_date
FROM MOVEMENT M
INNER JOIN Keg ON M.UID_Keg = Keg.Unique_ID
WHERE DATEPART(year, Movement_date) = '2019'
AND UID_MOVEMENT_TYPE = 1
)
SELECT CTE.UID_KEG AS 'Kegs'
,datepart(week, movement_date) AS 'Week number'
,MAX(Is_Tagged) AS 'tagged'
FROM CTE
GROUP BY CTE.UID_KEG
,datepart(week, movement_date)
ORDER BY [Week number] ASC
This code might not be perfect, since I couldn't test it, but it should give you a complete list of each keg in each period, showing whether that keg was marked as tagged at least once or not at all.
The most important thing here is removing the duplication of kegs within each period; after that it is possible to calculate the counts.
I'm not great with CTEs, but you will need to aggregate one level up from this keg-level result. Then you will be able to count the distinct number of kegs and how many were tagged and untagged.
Hope that makes sense.
EDIT: here is a subquery that should work
SELECT [Week number]
,count(1) [numKegs]
,sum(tagged) [numTagged]
FROM (
SELECT UID_KEG AS 'Kegs'
,datepart(week, movement_date) AS 'Week number'
,MAX(IS_TAGGED) AS 'tagged'
FROM MOVEMENT M
INNER JOIN Keg ON M.UID_Keg = Keg.Unique_ID
WHERE DATEPART(year, Movement_date) = '2019'
AND UID_MOVEMENT_TYPE = 1
GROUP BY UID_KEG
,datepart(week, movement_date)
) kegdailylevel
GROUP BY [Week number]
ORDER BY [Week number] ASC
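The two-level aggregation above can be tried on a small scale. This is a sketch only: it uses Python's sqlite3 module as a stand-in database with invented rows, a single hypothetical movement table instead of the join, and strftime('%W', ...) in place of DATEPART(week, ...). It also derives the untagged count, on the stated assumption that a keg counts as tagged if any of its scans in the week is tagged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movement (uid_keg INTEGER, is_tagged INTEGER, movement_date TEXT);
-- invented data: keg 3 is scanned once untagged and once tagged in the same week
INSERT INTO movement VALUES
    (1, 1, '2019-01-07'),
    (1, 1, '2019-01-08'),
    (2, 0, '2019-01-09'),
    (3, 0, '2019-01-07'),
    (3, 1, '2019-01-10');
""")

weekly = conn.execute("""
    SELECT week_number,
           COUNT(*)                AS num_kegs,
           SUM(tagged)             AS num_tagged,
           COUNT(*) - SUM(tagged)  AS num_untagged
    FROM (
        -- inner level: one row per keg per week, tagged if any scan was tagged
        SELECT uid_keg,
               strftime('%W', movement_date) AS week_number,
               MAX(is_tagged) AS tagged
        FROM movement
        GROUP BY uid_keg, strftime('%W', movement_date)
    ) AS keg_week_level
    GROUP BY week_number
    ORDER BY week_number
""").fetchall()

print(weekly)
```

Because each keg appears exactly once per week in the inner result, tagged and untagged always add up to the number of kegs.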

Running Totals for the year

I'm trying to create running totals by year in my query. It shows the last 3 years of sales and commissions, and I want running yearly totals of those for each salesperson listed.
I have tried various ways to get the data to do this but haven't been able to.
SELECT TOP (100) PERCENT 'abc' AS CompanyCode, abc.AR_Salesperson.SalespersonName, abc.AR_SalespersonCommission.SalespersonDivisionNo, abc.AR_SalespersonCommission.SalespersonNo,
SUM(abc.AR_SalespersonCommission.InvoiceTotal) AS InvoiceTotalSum, SUM(abc.AR_SalespersonCommission.CommissionAmt) AS CommissionAmtSum, DATENAME(month, abc.AR_SalespersonCommission.InvoiceDate)
AS Month, DATENAME(year, abc.AR_SalespersonCommission.InvoiceDate) AS Year, DATEPART(m, abc.AR_SalespersonCommission.InvoiceDate) AS MonthNumber
FROM abc.AR_Customer INNER JOIN
abc.AR_SalespersonCommission ON abc.AR_Customer.ARDivisionNo = abc.AR_SalespersonCommission.ARDivisionNo AND abc.AR_Customer.CustomerNo = abc.AR_SalespersonCommission.CustomerNo INNER JOIN
abc.AR_Salesperson ON abc.AR_SalespersonCommission.SalespersonDivisionNo = abc.AR_Salesperson.SalespersonDivisionNo AND
abc.AR_SalespersonCommission.SalespersonNo = abc.AR_Salesperson.SalespersonNo
GROUP BY abc.AR_Salesperson.SalespersonName, abc.AR_SalespersonCommission.SalespersonDivisionNo, abc.AR_SalespersonCommission.SalespersonNo, DATENAME(month, abc.AR_SalespersonCommission.InvoiceDate),
DATENAME(year, abc.AR_SalespersonCommission.InvoiceDate), DATEPART(m, abc.AR_SalespersonCommission.InvoiceDate)
HAVING (DATENAME(year, abc.AR_SalespersonCommission.InvoiceDate) > DATEADD(year, - 3, GETDATE()))
UNION
SELECT TOP (100) PERCENT 'XYZ' AS CompanyCode, xyz.AR_Salesperson.SalespersonName, xyz.AR_SalespersonCommission.SalespersonDivisionNo, xyz.AR_SalespersonCommission.SalespersonNo,
SUM(xyz.AR_SalespersonCommission.InvoiceTotal) AS InvoiceTotalSum, SUM(xyz.AR_SalespersonCommission.CommissionAmt) AS CommissionAmtSum, DATENAME(month, xyz.AR_SalespersonCommission.InvoiceDate)
AS Month, DATENAME(year, xyz.AR_SalespersonCommission.InvoiceDate) AS Year, DATEPART(m, xyz.AR_SalespersonCommission.InvoiceDate) AS MonthNumber
FROM xyz.AR_Customer INNER JOIN
xyz.AR_SalespersonCommission ON xyz.AR_Customer.ARDivisionNo = xyz.AR_SalespersonCommission.ARDivisionNo AND xyz.AR_Customer.CustomerNo = xyz.AR_SalespersonCommission.CustomerNo INNER JOIN
xyz.AR_Salesperson ON xyz.AR_SalespersonCommission.SalespersonDivisionNo = xyz.AR_Salesperson.SalespersonDivisionNo AND
xyz.AR_SalespersonCommission.SalespersonNo = xyz.AR_Salesperson.SalespersonNo
GROUP BY xyz.AR_Salesperson.SalespersonName, xyz.AR_SalespersonCommission.SalespersonDivisionNo, xyz.AR_SalespersonCommission.SalespersonNo, DATENAME(month, xyz.AR_SalespersonCommission.InvoiceDate),
DATENAME(year, xyz.AR_SalespersonCommission.InvoiceDate), DATEPART(m, xyz.AR_SalespersonCommission.InvoiceDate)
HAVING (DATENAME(year, xyz.AR_SalespersonCommission.InvoiceDate) > DATEADD(year, - 3, GETDATE()))
I expect the output to have running totals for the InvoiceTotalSum and CommissionAmt for each salesperson for the last 3 years. So of course January will be 0 for each person but Feb through December will have a running total
Sample data and desired results below. Desired results are the highlighted columns
Sample Data and Desired Results
Two things before I get to the solution.
First, I am not sure why you need a UNION in your query. I can see the difference between abc and xyz, but it still looks strange.
Your query can quite possibly be shortened or simplified, but I would need more information to say how.
Second, I do not see a valid reason why the running total should be 0 for January.
Explanation about that:
February (2nd month of the year): running total in your expected result contains the amount for 2 months
March: 3 months
April: 4 months
...
So January should contain the running total for 1 month (January itself).
Try the query below:
WITH MyData AS (
<Please paste your query here>
)
SELECT CompanyCode, SalesPersonName, SalesPersonDivisionNo, InvoiceTotalSum,
SUM(InvoiceTotalSum) OVER (PARTITION BY SalesPersonDivisionNo, SalesPersonNo, SalesPersonName, Year ORDER BY MonthNumber) AS InvoiceTotalRunningSum,
CommissionAmtSum,
SUM(CommissionAmtSum) OVER (PARTITION BY SalesPersonDivisionNo, SalesPersonNo, SalesPersonName, Year ORDER BY MonthNumber) AS CommissionAmtRunningSum,
Month, Year, MonthNumber
FROM MyData
ORDER BY CompanyCode, SalesPersonDivisionNo, SalesPersonNo, SalesPersonName, Year, MonthNumber
The magic takes place in the PARTITION BY/ORDER BY.
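A minimal sketch of how PARTITION BY and ORDER BY shape the running total, using Python's sqlite3 module as a stand-in database (window functions need SQLite 3.25 or later). The monthly_sales table and its rows are invented and represent already aggregated monthly figures like those produced by the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_sales (
    salesperson_no TEXT, year INTEGER, month_number INTEGER, invoice_total_sum REAL
);
-- invented monthly totals for one hypothetical salesperson over two years
INSERT INTO monthly_sales VALUES
    ('S1', 2018, 1, 100.0), ('S1', 2018, 2, 50.0), ('S1', 2018, 3, 25.0),
    ('S1', 2019, 1, 10.0),  ('S1', 2019, 2, 20.0);
""")

running = conn.execute("""
    SELECT year, month_number, invoice_total_sum,
           SUM(invoice_total_sum) OVER (
               PARTITION BY salesperson_no, year  -- restart per salesperson and year
               ORDER BY month_number              -- accumulate month by month
               ROWS UNBOUNDED PRECEDING
           ) AS invoice_total_running_sum
    FROM monthly_sales
    ORDER BY salesperson_no, year, month_number
""").fetchall()

for r in running:
    print(r)
```

The total restarts in January 2019 because year is part of the partition, and January carries its own amount rather than 0.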
I think you need to review your query and simplify it.
A few notes:
If the CompanyCode already exists in the database, join its table and link it to the current records instead of hard-coding it.
DATENAME(year, ...) can be written as YEAR(...), which returns an integer instead of a string.
DATEPART(m, ...) can be written as MONTH(...).
I encourage you to use table aliases.
(DATENAME(year, xyz.AR_SalespersonCommission.InvoiceDate) > DATEADD(year, - 3, GETDATE())) will exclude the first year and include the current one. So, with 2019 - 3 = 2016, yours will get 2017, 2018, and 2019, while it should get 2016, 2017, and 2018.
For your InvoiceTotalRunningSum use:
SUM(InvoiceTotalSum) OVER (PARTITION BY SalespersonNo ORDER BY MonthNumber ROWS UNBOUNDED PRECEDING)
This computes a cumulative sum of InvoiceTotalSum for each SalespersonNo. You can restart the total for each year, month, etc. simply by adding more columns to PARTITION BY. I used your current query as a subquery and applied the window functions on top of it.
Read more about SELECT - OVER Clause (Transact-SQL).
Try it out:
SELECT
'abc' AS CompanyCode
, SalespersonName
, SalespersonDivisionNo
, SalespersonNo
, InvoiceTotalSum
, SUM(InvoiceTotalSum) OVER (PARTITION BY SalespersonNo ORDER BY MonthNumber ROWS UNBOUNDED PRECEDING) InvoiceTotalRunningSum
, CommissionAmtSum
, SUM(CommissionAmtSum) OVER (PARTITION BY SalespersonNo ORDER BY MonthNumber ROWS UNBOUNDED PRECEDING) CommissionAmtRunningSum
, [Month]
, [Year]
, MonthNumber
FROM (
SELECT
'abc' AS CompanyCode
, ars.SalespersonName
, arsc.SalespersonDivisionNo
, arsc.SalespersonNo
, SUM(arsc.InvoiceTotal) InvoiceTotalSum
, SUM(arsc.CommissionAmt) CommissionAmtSum
, DATENAME(month, arsc.InvoiceDate) [Month]
, YEAR(arsc.InvoiceDate) [Year]
, MONTH(arsc.InvoiceDate) MonthNumber
FROM
abc.AR_Customer arc
INNER JOIN abc.AR_SalespersonCommission arsc ON arc.ARDivisionNo = arsc.ARDivisionNo AND arc.CustomerNo = arsc.CustomerNo
INNER JOIN abc.AR_Salesperson ars ON arsc.SalespersonDivisionNo = ars.SalespersonDivisionNo AND arsc.SalespersonNo = ars.SalespersonNo
GROUP BY
ars.SalespersonName
, arsc.SalespersonDivisionNo
, arsc.SalespersonNo
, YEAR(arsc.InvoiceDate)
, MONTH(arsc.InvoiceDate)
, DATENAME(month, arsc.InvoiceDate)
HAVING
YEAR(arsc.InvoiceDate) BETWEEN YEAR(GETDATE()) - 3 AND YEAR(GETDATE()) - 1 -- Only include the last three years (excluding current year)
) D

Getting percentages of counts in SQL Server

I am building a SQL Server query that gets the number of leads generated from a certain source by month. The query below gives me the monthly count. I also want to add a column that shows those leads as a proportion of all leads for that month. I'm not clear on how to do this. Any help?
SELECT FORMAT([ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Created Date]
, 'yyyy-MM') AS 'YYYY-MM'
, 'Kiosk-Mall' AS 'Lead Source'
, COUNT(*) AS 'Monthly Total From That Lead Source'
FROM [ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08]
WHERE [ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Lead Source] =
'Kiosk-Mall'
GROUP BY FORMAT([ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Created Date], 'yyyy-MM')
ORDER BY FORMAT([ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Created Date], 'yyyy-MM');
You can use conditional aggregation: move the WHERE condition into a CASE expression in the argument of an aggregation function:
SELECT FORMAT(l.[Created Date], 'yyyy-MM') AS YYYYMM,
'Kiosk-Mall' AS Lead_Source,
SUM(CASE WHEN l.[Lead Source] = 'Kiosk-Mall' THEN 1 ELSE 0 END) AS [Monthly Total From That Lead Source],
AVG(CASE WHEN l.[Lead Source] = 'Kiosk-Mall' THEN 1.0 ELSE 0 END) AS proportion_of_total
FROM [ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08] l
GROUP BY FORMAT(l.[Created Date], 'yyyy-MM')
ORDER BY YYYYMM
Notes:
Table aliases make the query easier to write and to read.
It is better to choose column aliases that do not need to be escaped (i.e. no spaces, no punctuation).
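To see why AVG over a 1.0/0 CASE expression gives the proportion, here is a small sketch using Python's sqlite3 module as a stand-in database. The leads table and its rows are invented, and strftime('%Y-%m', ...) replaces FORMAT(..., 'yyyy-MM').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads (created_date TEXT, lead_source TEXT);
-- invented data: 2 of 4 leads in July and 1 of 4 in August come from the kiosk
INSERT INTO leads VALUES
    ('2018-07-02', 'Kiosk-Mall'), ('2018-07-09', 'Web'),
    ('2018-07-15', 'Web'),        ('2018-07-28', 'Kiosk-Mall'),
    ('2018-08-01', 'Kiosk-Mall'), ('2018-08-03', 'Web'),
    ('2018-08-04', 'Web'),        ('2018-08-06', 'Web');
""")

rows = conn.execute("""
    SELECT strftime('%Y-%m', created_date) AS yyyymm,
           -- every row stays in the group; only matching rows contribute 1
           SUM(CASE WHEN lead_source = 'Kiosk-Mall' THEN 1 ELSE 0 END) AS kiosk_leads,
           -- the average of 1.0/0 flags is the matching share of the month
           AVG(CASE WHEN lead_source = 'Kiosk-Mall' THEN 1.0 ELSE 0 END) AS proportion_of_total
    FROM leads
    GROUP BY strftime('%Y-%m', created_date)
    ORDER BY yyyymm
""").fetchall()

print(rows)
```

Keeping the filter out of the WHERE clause is what lets the same group serve as the denominator.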

SQL query for displaying sales for two different months

I was wondering what method is most used when creating a query that displays sales for two different months (selected in the parameter).
My database looks something like this:
Posting Date Company Size Sales
01/01/2011 Microsoft 1000 900
I already have a parameter where "year month" is selected.
What I want is to have two parameters so that I can compare the sales in "year month" side by side in Microsoft Visual Studio.
So the query should have two parameters, #PostingDate1 and #PostingDate2
Thanks for any help!
--UPDATE--
Trying to make this more understandable.
The two parameters to select from will be "year month"
The result table should look like this when parameter 1 is January 2011 and parameter 2 is February 2011 (it doesn't matter which months are selected, just that the results show the different months):
Company Size Sales1 Sales2
Microsoft 1000 100 200
That is if sales for january 2011 was 100
and sales for february 2011 was 200
I think that you want to do a CROSS JOIN, but I'm not completely sure that I understood your question. If each of your queries returns only ONE row, then I recommend a CROSS JOIN; if not, then it's probably better not to use it. It should be something like this:
SELECT A.[Posting Date] [Posting Date 1], A.Company Company1, A.Size Size1, A.Sales Sales1,
B.[Posting Date] [Posting Date 2], B.Company Company2, B.Size Size2, B.Sales Sales2
FROM (SELECT [Posting Date], Company, Size, Sales
FROM YourTable
WHERE [Posting Date] = #PostingDate1) A
CROSS JOIN (SELECT [Posting Date], Company, Size, Sales
FROM YourTable
WHERE [Posting Date] = #PostingDate2) B
My answer:
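I ended up using "UNION". I don't know whether this is more appropriate, but it got rid of the redundant data that "CROSS JOIN" produced.
SELECT
U.Company, U.Size, SUM(U.Sales) AS Sales1, SUM(U.Sales2) AS Sales2
FROM
(
-- rows for the first month carry their amount in Sales and 0 in Sales2
SELECT Company, Size, Sales AS Sales, 0 AS Sales2
FROM Sales
WHERE [Posting Date] = #PostingDate1
-- UNION ALL keeps identical sales rows that a plain UNION would merge
UNION ALL
-- rows for the second month carry 0 in Sales and their amount in Sales2
SELECT Company, Size, 0 AS Sales, Sales AS Sales2
FROM Sales
WHERE [Posting Date] = #PostingDate2
) AS U
GROUP BY
U.Company, U.Size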
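A different technique that gives the same side-by-side layout is conditional aggregation in a single pass, without UNION or CROSS JOIN. The sketch below uses Python's sqlite3 module as a stand-in database; the table layout, the snake_case column names, the :m1 and :m2 parameter names, and the rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (posting_date TEXT, company TEXT, size INTEGER, sales INTEGER);
-- invented rows matching the example figures in the question
INSERT INTO sales VALUES
    ('2011-01-01', 'Microsoft', 1000, 100),
    ('2011-02-01', 'Microsoft', 1000, 200),
    ('2011-03-01', 'Microsoft', 1000, 999);
""")

# Hypothetical parameters: the two "year month" values to compare.
params = {"m1": "2011-01", "m2": "2011-02"}

comparison = conn.execute("""
    SELECT company, size,
           SUM(CASE WHEN strftime('%Y-%m', posting_date) = :m1 THEN sales ELSE 0 END) AS sales1,
           SUM(CASE WHEN strftime('%Y-%m', posting_date) = :m2 THEN sales ELSE 0 END) AS sales2
    FROM sales
    WHERE strftime('%Y-%m', posting_date) IN (:m1, :m2)  -- skip all other months
    GROUP BY company, size
""", params).fetchall()

print(comparison)
```

Each company and size appears once, with the two selected months as separate columns.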