Getting percentages of counts in SQL Server - sql

I am building a SQL Server query that gets the number of leads generated from a certain source, by month. This is the query that gives me the monthly count. I want to add a column that shows those leads as a percentage of all leads for that month, but I'm not clear on how to do this. Any help?
SELECT FORMAT([ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Created Date]
, 'yyyy-MM') AS 'YYYY-MM'
, 'Kiosk-Mall' AS 'Lead Source'
, COUNT(*) AS 'Monthly Total From That Lead Source'
FROM [ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08]
WHERE [ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Lead Source] =
'Kiosk-Mall'
GROUP BY FORMAT([ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Created Date], 'yyyy-MM')
ORDER BY FORMAT([ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08].[Created Date], 'yyyy-MM');

You can use conditional aggregation, which basically moves the WHERE condition into a CASE expression in the argument to an aggregation function:
SELECT FORMAT(l.[Created Date], 'yyyy-MM') AS YYYYMM,
'Kiosk-Mall' AS Lead_Source,
SUM(CASE WHEN l.[Lead Source] = 'Kiosk-Mall' THEN 1 ELSE 0 END) AS [Monthly Total From That Lead Source],
AVG(CASE WHEN l.[Lead Source] = 'Kiosk-Mall' THEN 1.0 ELSE 0 END) AS proportion_of_total
FROM [ProspectData].[dbo].[Real Estate.KPC.Leads.2018-08-08] l
GROUP BY FORMAT(l.[Created Date], 'yyyy-MM')
ORDER BY YYYYMM
Notes:
Table aliases make the query easier to write and to read.
It is better to choose column aliases that do not need to be escaped (i.e. no spaces, no punctuation).
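As a minimal sketch of turning that proportion into a percentage, the query below runs the same conditional aggregation over a handful of made-up rows. The derived table `l`, its column names and its values are invented for illustration and stand in for the real leads table; FORMAT requires SQL Server 2012 or later.

```sql
-- Hypothetical sample rows standing in for the real leads table.
SELECT FORMAT(l.created_date, 'yyyy-MM') AS yyyymm,
       SUM(CASE WHEN l.lead_source = 'Kiosk-Mall' THEN 1 ELSE 0 END) AS kiosk_leads,
       COUNT(*) AS all_leads,
       -- AVG over a 1.0/0 flag is the share of matching rows; * 100 makes it a percentage.
       CAST(100.0 * AVG(CASE WHEN l.lead_source = 'Kiosk-Mall' THEN 1.0 ELSE 0 END)
            AS DECIMAL(5, 2)) AS pct_of_month
FROM (VALUES
        (CAST('2018-06-03' AS DATE), 'Kiosk-Mall'),
        (CAST('2018-06-10' AS DATE), 'Web'),
        (CAST('2018-06-21' AS DATE), 'Kiosk-Mall'),
        (CAST('2018-06-28' AS DATE), 'Referral'),
        (CAST('2018-07-02' AS DATE), 'Kiosk-Mall'),
        (CAST('2018-07-15' AS DATE), 'Web'),
        (CAST('2018-07-30' AS DATE), 'Web')
     ) AS l (created_date, lead_source)
GROUP BY FORMAT(l.created_date, 'yyyy-MM')
ORDER BY yyyymm;
```

With these rows, June has 2 of 4 leads from the kiosk (50.00) and July has 1 of 3 (33.33).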

Related

SQL create column for every week (Loop?)

I need to make a report of weekly changes.
This is the code for today's amount:
SELECT
[Entry No_],
[Customer No_],
[Posting Date],
[Description],
[Currency Code],
Trans_type = case when [Deposit]=1 then 'Deposit'
when [Imprest]=1 then 'Imprest'
else 'Other' end,
A.Amount
FROM Table1
LEFT JOIN
(
SELECT Distinct [Cust_ Ledger Entry No_],
SUM ([Amount EUR]) as 'amount'
FROM Table2
group by [Cust_ Ledger Entry No_]
having
SUM ([Amount EUR]) <> '0'
)A
on [Entry No_] = A.[Cust_ Ledger Entry No_]
Where
A.Amount is not NULL
Code to generate data for previous week is here (adding only where clause):
SELECT
[Entry No_],
[Customer No_],
[Posting Date],
[Description],
[Currency Code],
Trans_type = case when [Deposit]=1 then 'Deposit'
when [Imprest]=1 then 'Imprest'
else 'Other' end,
A.Amount
FROM Table1
LEFT JOIN
(
SELECT Distinct [Cust_ Ledger Entry No_],
SUM ([Amount EUR]) as 'amount'
FROM Table2
where [posting Date] < '2020-11-23'
group by [Cust_ Ledger Entry No_]
having
SUM ([Amount EUR]) <> '0'
)A
on [Entry No_] = A.[Cust_ Ledger Entry No_]
Where
A.Amount is not NULL
It would be enough to union both queries, export to Excel and build a pivot table, but the problem is that I need the results for the last 50 weeks. Is there a smart way to avoid a union of 50 queries and run one simple query to generate the weekly report?
Thanks
It might be easier with a sample, but I don't know how to paste a table here.
Maybe it is true that I don't need a union here and a GROUP BY would be enough, but it still sounds difficult to me :)
OK. Let's say the table has these headers: Project | Country | date | amount
The code below returns the amount as of today's date:
Select
Project,
SUM(amount)
From Table
Group by Project
I actually need today's figure and also the results as of previous weeks (what the result was on November 22 (week 47), November 15 (week 46) and so on, 50 weeks in total counting back from today's date).
Code for the previous week's amount is here:
Select
Project,
SUM(amount)
From Table
Where Date < '2020.11.23'
Group by Project
So my idea was to create 50 queries and join the results together, but I am sure there is a better way to do this. Besides, I don't want to edit this query every week to add a new date.
So, any ideas to make my life easier?
If I have understood your requirement correctly, all you need to do is extract the week from the date, e.g.
Select
Project,
datepart(week, date),
SUM(amount)
From Table
Where Date < '2020.11.23'
Group by Project, datepart(week, date)
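The query above returns the amount posted within each week. If what is wanted is instead the balance as of the end of each week (which is what the question's `Date < ...` filter computes), one possible sketch is a running total over the weekly sums. The derived table `t`, its column names and its values are made up for illustration; the windowed SUM with ORDER BY needs SQL Server 2012 or later, and the year is included so that weeks from different years do not merge.

```sql
-- Hypothetical rows; the real table and its column names will differ.
SELECT t.project,
       DATEPART(year, t.entry_date) AS yr,
       DATEPART(week, t.entry_date) AS wk,
       SUM(t.amount) AS week_amount,
       -- Window SUM over the weekly sums: everything posted up to and including that week.
       SUM(SUM(t.amount)) OVER (PARTITION BY t.project
                                ORDER BY DATEPART(year, t.entry_date),
                                         DATEPART(week, t.entry_date)
                                ROWS UNBOUNDED PRECEDING) AS balance_to_date
FROM (VALUES
        ('A', CAST('2020-11-10' AS DATE), 100.00),
        ('A', CAST('2020-11-12' AS DATE),  50.00),
        ('A', CAST('2020-11-18' AS DATE), -30.00),
        ('B', CAST('2020-11-11' AS DATE), 200.00),
        ('B', CAST('2020-11-25' AS DATE),  40.00)
     ) AS t (project, entry_date, amount)
GROUP BY t.project, DATEPART(year, t.entry_date), DATEPART(week, t.entry_date)
ORDER BY t.project, yr, wk;
```

Two caveats on this sketch: a week with no entries produces no row, and limiting the output to the last 50 weeks would have to happen in an outer query, because a WHERE clause here would also drop the earlier history from the running total.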

How to select max date over the year function

I am trying to select the max date over the year, but it is not working. Any ideas on what to do?
SELECT a.tkinit [TK ID],
YEAR(a.tkeffdate) [Rate Year],
max(a.tkeffdate) [Max Date],
tkrt03 [Standard Rate]
FROM stageElite.dbo.timerate a
join stageElite.dbo.timekeep b ON b.tkinit = a.tkinit
WHERE a.tkinit = '02672'
and tkeffdate BETWEEN '2014-01-01' and '12-31-2014'
GROUP BY a.tkinit,
tkrt03,
a.tkeffdate
Perhaps you only want it by year and not rolled up by calendar date. For SQL Server you can try this:
SELECT
…
MaxDate = MAX(a.tkeffdate) OVER (PARTITION BY a.tkinit, YEAR(a.tkeffdate))
…
Or you could modify the query above to group by the year instead of the date:
GROUP BY a.tkinit,
tkrt03,
YEAR(a.tkeffdate)
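To make the windowed variant concrete, here is a small self-contained sketch. The `tkinit` and `tkeffdate` names mirror the question, while the `rate` column and all the values are invented. The window MAX repeats the latest date on every row of its partition, so an outer filter is used to keep only the latest row per timekeeper and year.

```sql
-- Hypothetical rate rows; only tkinit and tkeffdate come from the question.
SELECT x.tkinit, x.rate_year, x.tkeffdate AS max_date, x.rate
FROM (
    SELECT r.tkinit,
           YEAR(r.tkeffdate) AS rate_year,
           r.tkeffdate,
           r.rate,
           -- Latest effective date per timekeeper and year, repeated on each row of the partition.
           MAX(r.tkeffdate) OVER (PARTITION BY r.tkinit, YEAR(r.tkeffdate)) AS max_date_in_year
    FROM (VALUES
            ('02672', CAST('2013-07-01' AS DATE), 280.00),
            ('02672', CAST('2014-01-01' AS DATE), 300.00),
            ('02672', CAST('2014-09-01' AS DATE), 325.00)
         ) AS r (tkinit, tkeffdate, rate)
) AS x
WHERE x.tkeffdate = x.max_date_in_year   -- keep only the latest row of each year
ORDER BY x.rate_year;
```

Unlike adding the rate to GROUP BY, this keeps the rate that belongs to the latest date instead of producing one row per distinct rate.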
You seem to want only one row and all the columns. Use ORDER BY and TOP:
SELECT TOP (1) tr.tkinit as [TK ID],
YEAR(tr.tkeffdate) as [Rate Year],
tr.tkeffdate as [Max Date],
tkrt03 as [Standard Rate]
FROM stageElite.dbo.timerate tr JOIN
stageElite.dbo.timekeep tk
ON tk.tkinit = tr.tkinit
WHERE tr.tkinit = '02672' AND
tr.tkeffdate >= '2014-01-01' AND
tr.tkeffdate < '2015-01-01'
ORDER BY tr.tkeffdate DESC;
Note that I also fixed your date comparisons and table aliases.

getting a distinct count from with a date field

I have a piece of code that looks for the distinct count of kegs, plus the counts of distinct kegs that are tagged and untagged. What I have so far is:
with CTE as
(select UID_KEG, IS_TAGGED, movement_date
from MOVEMENT M
inner join Keg on M.UID_Keg = Keg.Unique_ID
where DATEPART(year,Movement_date) = '2019'
and UID_MOVEMENT_TYPE = 1
)
select COUNT(Distinct CTE.UID_KEG) as 'Kegs', datepart(week,movement_date)
as 'Week number',
SUM(case when Is_Tagged = 1 then 1 end) as 'tagged',
SUM(case when Is_Tagged = 0 then 1 end) as 'untagged'
from CTE
group by datepart(week,movement_date)
order by [Week number] asc
It correctly returns a distinct count of the kegs, but the figures for tagged and untagged are incorrect, and I can only assume that is because it's counting duplicate kegs.
Can anyone advise how I can get around this, or do a count on just the distinct kegs?
You want conditional aggregation using COUNT(DISTINCT). That would be:
SELECT COUNT(DISTINCT CTE.UID_KEG) as Kegs,
datepart(week, movement_date) as Week_number,
COUNT(DISTINCT CASE WHEN Is_Tagged = 1 THEN CTE.UID_KEG END) as tagged,
COUNT(DISTINCT CASE WHEN Is_Tagged = 0 THEN CTE.UID_KEG END) as untagged
FROM CTE
GROUP BY datepart(week, movement_date)
ORDER BY MIN(movement_date);
Notes:
The tagged and untagged counts may still add up to more than the total count, assuming that kegs can be both tagged and untagged in a single week.
You should group by the year as well as the week; this matters as soon as the data spans more than one year (the CTE above is filtered to 2019, but that filter may change).
Only use single quotes for string and date constants. Do not use them for column aliases; that can lead to hard-to-debug errors.
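The first two notes can be seen on a few made-up rows. In this sketch the derived table `m` and its values are invented for illustration; the column names follow the question. Keg 103 is scanned as untagged and later as tagged within the same week, so it is counted in both columns.

```sql
-- Hypothetical movement rows, all falling in the same calendar week.
SELECT YEAR(m.movement_date) AS yr,
       DATEPART(week, m.movement_date) AS wk,
       COUNT(DISTINCT m.uid_keg) AS kegs,
       -- CASE yields NULL for non-matching rows, and COUNT(DISTINCT ...) ignores NULLs.
       COUNT(DISTINCT CASE WHEN m.is_tagged = 1 THEN m.uid_keg END) AS tagged,
       COUNT(DISTINCT CASE WHEN m.is_tagged = 0 THEN m.uid_keg END) AS untagged
FROM (VALUES
        (101, 1, CAST('2019-01-07' AS DATE)),
        (101, 1, CAST('2019-01-08' AS DATE)),  -- same keg scanned twice in one week
        (102, 0, CAST('2019-01-08' AS DATE)),
        (103, 0, CAST('2019-01-09' AS DATE)),
        (103, 1, CAST('2019-01-10' AS DATE))   -- untagged, then tagged, in the same week
     ) AS m (uid_keg, is_tagged, movement_date)
GROUP BY YEAR(m.movement_date), DATEPART(week, m.movement_date)
ORDER BY yr, wk;
```

This returns a single row with 3 kegs, 2 tagged and 2 untagged: the two conditional counts add up to 4 because keg 103 appears in both.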
If you remove the DISTINCT from your count, the sum of untagged and tagged should equal your total (if Is_Tagged is a binary 0 or 1). This indicates that you have duplicate UID_KEG values. Take some time to understand why. Part of your problem is that it seems you don't understand the shape of your dataset very well.
Take some time to look at the data to understand whether there are duplicates (and why: are they caused by the join, or are they in the base data?), and check whether a keg can appear as both tagged and untagged.
EDIT: In response to your comment: if kegs can be scanned twice, you will have to assume that if Is_Tagged = 1 for any row of a UID_KEG in a given week, then that keg counts as tagged for the week.
In that case you will have to adapt the code to use this assumption.
WITH CTE
AS (
SELECT UID_KEG
,IS_TAGGED
,movement_date
FROM MOVEMENT M
INNER JOIN Keg ON M.UID_Keg = Keg.Unique_ID
WHERE DATEPART(year, Movement_date) = '2019'
AND UID_MOVEMENT_TYPE = 1
)
SELECT CTE.UID_KEG AS 'Kegs'
,datepart(week, movement_date) AS 'Week number'
,MAX(Is_Tagged) AS 'tagged'
FROM CTE
GROUP BY CTE.UID_KEG
,datepart(week, movement_date)
ORDER BY [Week number] ASC
This code might not be perfect (I couldn't test it), but it should give you a complete list of each keg in each week, with a flag showing whether that keg was marked as tagged at least once or not at all.
The most important thing here is removing the duplication of kegs within each week; then it is possible to calculate the totals.
I'm not great with CTEs, but you will need to aggregate one level up from this per-keg weekly level; then you will be able to count the distinct number of kegs and which ones were tagged and untagged.
Hope that makes sense.
EDIT: here is a subquery that should work
SELECT [Week number]
,count(1) [numKegs]
,sum(tagged) [numTagged]
FROM (
SELECT UID_KEG AS 'Kegs'
,datepart(week, movement_date) AS 'Week number'
,MAX(IS_TAGGED) AS 'tagged'
FROM MOVEMENT M
INNER JOIN Keg ON M.UID_Keg = Keg.Unique_ID
WHERE DATEPART(year, Movement_date) = '2019'
AND UID_MOVEMENT_TYPE = 1
GROUP BY UID_KEG
,datepart(week, movement_date)
) kegdailylevel
GROUP BY [Week number]
ORDER BY [Week number] ASC

New to SQL. Would like to convert an IF(COUNTIFS()) Excel formula to SQL code and have SQL calculate it instead of Excel

I am running SQL Server 2008 R2 (RTM).
I have a SQL query that pulls Dates, Products, Customers and Units:
select
[Transaction Date] as Date,
[SKU] as Product,
[Customer Name] as Customer,
sum(Qty) as Units
from dataset
where [Transaction Date] < '2019-03-01' and [Transaction Date] >= '2016-01-01'
group by [Transaction Date], [SKU], [Customer Name]
order by [Transaction Date]
This pulls hundreds of thousands of records and I wanted to determine if a certain transaction was a new order or reorder based on the following logic:
Reorder: That specific Customer has ordered that specific product in the last 6 months
New Order: That specific Customer hasn’t ordered that specific product in the last 6 months
For that I have this formula in Excel that seems to be working:
=IF(COUNTIFS(A$1:A1,">="&DATE(YEAR(A2),MONTH(A2)-6,DAY(A2)),C$1:C1,C2,B$1:B1,B2),"Reorder","New Order")
The formula works when I paste it individually or in a smaller dataset, but when I try to copy paste it to all 500K+ rows, Excel gives up because it loops for each calculation.
This could probably be done in SQL, but I don’t have the knowledge on how to convert this excel formula to SQL, I just started studying it.
You're doing pretty well with the start of your query there. There are three additional functions you're looking to add to your query.
The first thing you'll need is the easiest. GETDATE() simply returns the current date. You'll need that when you're comparing the current date to the transaction date.
The second function is DATEDIFF, which will give you a unit of time between two dates (months, days, years, quarters, etc). Using DATEDIFF, you can say "is this date within the last 6 months". The format for this is pretty easy. It's DATEDIFF(interval, date1, date2).
The third function you're looking for is CASE, which allows you to tell SQL to give you one answer if one condition is met, but a different answer if a different condition is met. For your example, you can say "if the difference in months is <= 6, return 'Reorder'; if not, give me 'New Order'".
Putting it all together:
SELECT CASE
WHEN DATEDIFF(MONTH, [Transaction Date], GETDATE()) <= 6
THEN 'Reorder'
ELSE 'New Order'
END as ORDER_TYPE
,[Transaction Date] AS DATE
,[SKU] AS PRODUCT
,[Customer Name] AS CUSTOMER
,Qty AS UNITS
FROM DATASET
For additional examples of CASE, take a look at this site: https://www.w3schools.com/sql/sql_ref_case.asp
For additional examples of DATEDIFF, and a chance to try it out, see:
https://www.w3schools.com/sql/func_sqlserver_datediff.asp
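One caveat when using DATEDIFF for a "last 6 months" test: it counts the number of datepart boundaries crossed between the two dates, not elapsed time. The short sketch below uses arbitrary illustrative dates to show this, and contrasts it with a DATEADD comparison, which tests against an actual cutoff date.

```sql
SELECT DATEDIFF(month, '20190131', '20190201') AS one_day_apart,    -- 1: a month boundary is crossed
       DATEDIFF(month, '20190101', '20190131') AS thirty_days,      -- 0: same calendar month
       DATEDIFF(month, '20190115', '20190731') AS six_boundaries,   -- 6, so a "<= 6" test passes
       -- Cutoff comparison: 6 months before 2019-07-31 is 2019-01-31, and 2019-01-15 is earlier.
       CASE WHEN CAST('20190115' AS DATE) >= DATEADD(month, -6, CAST('20190731' AS DATE))
            THEN 'within 6 months'
            ELSE 'older'
       END AS by_dateadd;                                           -- 'older'
```

Which behaviour is right depends on how "the last 6 months" is defined for the report; the point is only that the two tests can disagree near the edges.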
SELECT CASE
WHEN Datediff(day, [transaction date], Getdate()) <= 180 THEN 'reorder'
ELSE 'Neworder'
END,
[transaction date] AS Date,
[sku] AS Product,
[customer name] AS Customer,
qty AS Units
FROM dataset
If I understand correctly, you want to peek at the previous date and make a comparison. This suggests lag():
select (case when lag([Transaction Date]) over (partition by SKU, [Customer Name] order by [Transaction Date]) >
dateadd(month, -6, [Transaction Date])
then 'Reorder'
else 'New Order'
end) as Order_Type,
[Transaction Date] as Date,
[SKU] as Product,
[Customer Name] as Customer,
sum(Qty) as Units
from dataset d
group by [Transaction Date], [SKU], [Customer Name];
EDIT:
In SQL Server 2008, you can emulate the LAG() using OUTER APPLY:
select (case when dprev.[Transaction Date] >
dateadd(month, -6, d.[Transaction Date])
then 'Reorder'
else 'New Order'
end) as Order_Type,
d.[Transaction Date] as Date,
d.[SKU] as Product,
d.[Customer Name] as Customer,
sum(d.Qty) as Units
from dataset d outer apply
(select top (1) dprev.*
from dataset dprev
where dprev.SKU = d.SKU and
dprev.[Customer Name] = d.[Customer Name] and
dprev.[Transaction Date] < d.[Transaction Date]
order by dprev.[Transaction Date] desc
) dprev
group by d.[Transaction Date], d.[SKU], d.[Customer Name], dprev.[Transaction Date];

Displaying different date periods on income data

I have the below query which displays data like so:
Income Type | This Month | Last Month | This Year | Last Year
1 | 179640.00 | 179640.00 | 179640.00 | 179640.00
2 | 12424440.00 | 12424440.00 | 12424440.00 | 12424440.00
Select
Income_Type As [Income Type],
Sum(Income_Amount) As [This Month],
Sum(Income_Amount) As [Last Month],
Sum(Income_Amount) As [This Year],
Sum(Income_Amount) As [Last Year]
From Income I
Left Join Finance_Types FT On I.Income_Type = FT.Type_ID
Group By
Income_Type
The Income table has a Income_Date which is a datetime column.
I'm struggling to get my head around how I would pull out the data for 'This Month', 'Last Month', 'This Year', 'Last Year' with the correct Sums in one query if possible?
Use date functions:
SUM(CASE WHEN YEAR(yourdatefield) = YEAR(GetDate()) - 1 THEN Income_Amount ELSE 0 END) AS 'Last Year'
That CASE expression returns Income_Amount only for rows dated last year (and 0 otherwise), so the SUM adds up only those amounts.
If you're not using SQL Server, the syntax might be a bit different.
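Extending the same idea to all four columns might look like the sketch below. The column names follow the question, but the variables `@today` and `@month_start` and the three sample rows are assumptions made for illustration; in practice the derived table would be replaced by the Income table. The month columns use range comparisons so that the time portion of a datetime column does not matter.

```sql
DECLARE @today DATE = CAST(GETDATE() AS DATE);
-- First day of the current month: step back (day of month - 1) days.
DECLARE @month_start DATE = DATEADD(day, 1 - DAY(@today), @today);

SELECT i.Income_Type AS [Income Type],
       SUM(CASE WHEN i.Income_Date >= @month_start
                 AND i.Income_Date <  DATEADD(month, 1, @month_start)
                THEN i.Income_Amount ELSE 0 END) AS [This Month],
       SUM(CASE WHEN i.Income_Date >= DATEADD(month, -1, @month_start)
                 AND i.Income_Date <  @month_start
                THEN i.Income_Amount ELSE 0 END) AS [Last Month],
       SUM(CASE WHEN YEAR(i.Income_Date) = YEAR(@today)
                THEN i.Income_Amount ELSE 0 END) AS [This Year],
       SUM(CASE WHEN YEAR(i.Income_Date) = YEAR(@today) - 1
                THEN i.Income_Amount ELSE 0 END) AS [Last Year]
FROM (VALUES
        -- Hypothetical rows dated relative to today so each period has data.
        (1, @today, 100.00),
        (1, DATEADD(month, -1, @today), 50.00),
        (2, DATEADD(year, -1, @today), 25.00)
     ) AS i (Income_Type, Income_Date, Income_Amount)
GROUP BY i.Income_Type;
```

Each column sums the same Income_Amount, but only over the rows whose date falls in that column's period, which is why the four figures no longer come out identical.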