Different count amount using grouping - sql

I want to modify this SQL query:
SELECT Count(DISTINCT contacts) AS Totaly
FROM clicks
WHERE clicks.campaign_id = 1234
AND clicks.type IN ( 1, 2, 3 )
To get clicks per day. I created this:
SELECT Cast(clicks.time AS DATE) AS 'date',
----- THE SAME CODE -----
Count(DISTINCT contacts) AS Totaly
FROM clicks
WHERE clicks.campaign_id = 1234
AND clicks.type IN ( 1, 2, 3 )
----- THE SAME CODE -----
GROUP BY Cast(clicks.time AS DATE)
The problem is that the counts from the second query do not add up to the count from the first query. I can check that with this:
WITH cte
AS (SELECT Cast(clicks.time AS DATE) AS 'date',
----- THE SAME CODE -----
Count(DISTINCT contacts) AS Totaly
FROM clicks
WHERE clicks.campaign_id = 1234
AND clicks.type IN ( 1, 2, 3 )
----- THE SAME CODE -----
GROUP BY Cast(clicks.time AS DATE))
SELECT Sum(totaly)
FROM cte
So my question is: why am I not getting the same sum?

The numbers are not comparable. COUNT(DISTINCT) counts the number of distinct values within each group, so when you change the grouping you change the overall total: the queries are no longer counting the same distinct items.
Your first query simply COUNTs the DISTINCT values of contacts. There is no GROUP BY, so the count is taken over the whole data set. Your second query COUNTs the number of DISTINCT values of contacts per date. This means that the same value of contacts can (and will) be counted more than once if it appears on different dates. As a result, when you SUM the COUNTs from your second query you get the number of distinct combinations of contacts and date.
Let's take this very simplified example data:
CREATE TABLE dbo.SomeTable (ContactID int,
SomeDate date);
GO
INSERT INTO dbo.SomeTable (ContactID,
SomeDate)
VALUES(1,'20230101'),
(1,'20230102'),
(2,'20230102'),
(3,'20230103'),
(3,'20230104');
Now, very clearly, we can see that there are 3 different values for ContactID; we can verify this:
SELECT COUNT(DISTINCT ContactID)
FROM dbo.SomeTable;
Now let's get a DISTINCT COUNT of ContactID for each value of SomeDate. This will give us the following result:
SomeDate    DistinctContacts
----------  ----------------
2023-01-01  1
2023-01-02  2
2023-01-03  1
2023-01-04  1
And to verify:
SELECT SomeDate,
COUNT(DISTINCT ContactID) AS DistinctContacts
FROM dbo.SomeTable
GROUP BY SomeDate;
So we can clearly see that the total here, 5, is quite different, but it is still correct. The total now represents the number of DISTINCT combinations of ContactID and SomeDate.
If you want both the DISTINCT count by SomeDate and the overall total, you can use ROLLUP or GROUPING SETS. I demonstrate both; because the example is so simple they return the same result, but for other queries they might not (ROLLUP can include more groups):
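To make the "distinct combinations" point concrete, here is a minimal sketch that runs the same sample data through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption made so the example is runnable anywhere), and the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeTable (ContactID INTEGER, SomeDate TEXT)")
conn.executemany(
    "INSERT INTO SomeTable (ContactID, SomeDate) VALUES (?, ?)",
    [(1, "2023-01-01"), (1, "2023-01-02"), (2, "2023-01-02"),
     (3, "2023-01-03"), (3, "2023-01-04")],
)

# Distinct contacts over the whole table.
overall = conn.execute(
    "SELECT COUNT(DISTINCT ContactID) FROM SomeTable"
).fetchone()[0]

# Sum of the per-date distinct counts.
summed = conn.execute(
    """
    SELECT SUM(DistinctContacts)
    FROM (SELECT COUNT(DISTINCT ContactID) AS DistinctContacts
          FROM SomeTable
          GROUP BY SomeDate)
    """
).fetchone()[0]

# Number of distinct (ContactID, SomeDate) combinations.
pairs = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ContactID, SomeDate FROM SomeTable)"
).fetchone()[0]

print(overall, summed, pairs)  # 3 5 5
```

The summed per-date figure matches the pair count, not the overall distinct count, which is why the two queries in the question disagree.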
SELECT SomeDate,
COUNT(DISTINCT ContactID) AS DistinctContacts
FROM dbo.SomeTable
GROUP BY SomeDate
WITH ROLLUP;
SELECT SomeDate,
COUNT(DISTINCT ContactID) AS DistinctContacts
FROM dbo.SomeTable
GROUP BY GROUPING SETS((SomeDate),());
This returns the following:
SomeDate    DistinctContacts
----------  ----------------
2023-01-01  1
2023-01-02  2
2023-01-03  1
2023-01-04  1
NULL        3
Note that for the row where SomeDate has the value NULL, 3 is returned, not 5.

Related

How to determine difference in month between rows in a way like crossing

I'm creating a report where I need to calculate the difference between two dates in different rows, crossing from one row to the next. How can I achieve this result with the following data:
CREATE TABLE #Customers (
customerid INT,
issuedate DATE,
statusdate date
)
INSERT INTO #Customers
SELECT 928, '2017-07-24', '2018-01-22'
union
SELECT 928, '2018-04-05', '2018-10-05'
union
SELECT 928, '2019-02-21', '2019-01-21'
--The result should display the difference between '2018-01-22' and '2018-04-05'
--and the difference between '2018-10-05' and '2019-02-21'
DROP TABLE #Customers
I expect the query to display the difference in months between the 'statusdate' of one row and the 'issuedate' of the next row, giving an output of 3 and 4:
SELECT DATEDIFF(MONTH,'2018-01-22','2018-04-05')
UNION
SELECT DATEDIFF(MONTH,'2018-10-05','2019-02-21')
Use the LAG() function to get the value from the previous row, and keep only the rows where that value is not null. We use a sub-select to fetch the values, because window functions such as LAG() cannot be used in a WHERE clause.
SELECT DATEDIFF(MONTH, T.PreviousStatusDate, T.issuedate) as difference
FROM (
SELECT LAG(c.statusdate) OVER(ORDER BY c.statusdate) as PreviousStatusDate,
c.issuedate
FROM Customers c
) AS T
WHERE T.PreviousStatusDate IS NOT NULL
Result:
difference
----------------
3
4
See live demo at http://www.sqlfiddle.com/#!18/d1169/10
If I understand correctly and you want the difference between values from the current and the subsequent row, the following approach using LEAD() may help:
SELECT DATEDIFF(MONTH, statusdate, nextissuedate) AS [Difference]
FROM (
SELECT
customerid,
issuedate,
statusdate,
LEAD(issuedate) OVER (PARTITION BY customerid ORDER BY issuedate) AS nextissuedate
FROM #Customers
) t
WHERE nextissuedate IS NOT NULL
Output:
----------------
Difference
----------------
3
4
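For readers who want to see the pairing spelled out, here is a small Python sketch of what the LEAD() query computes. It is not SQL Server code; the helper name `month_boundaries` is made up for illustration, and it mirrors the fact that DATEDIFF(MONTH, ...) counts month boundaries crossed rather than full elapsed months.

```python
from datetime import date

# Rows from the question: (customerid, issuedate, statusdate).
rows = [
    (928, date(2017, 7, 24), date(2018, 1, 22)),
    (928, date(2018, 4, 5), date(2018, 10, 5)),
    (928, date(2019, 2, 21), date(2019, 1, 21)),
]

def month_boundaries(start: date, end: date) -> int:
    """Count month boundaries crossed between two dates (hypothetical helper)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

# ORDER BY issuedate; all rows share one customerid, so no partitioning is needed here.
rows.sort(key=lambda r: r[1])

# Pair each row's statusdate with the next row's issuedate, which is what LEAD() supplies.
differences = [
    month_boundaries(current[2], following[1])
    for current, following in zip(rows, rows[1:])
]

print(differences)  # [3, 4]
```

The last row has no following row, so it produces no pair, just as the WHERE nextissuedate IS NOT NULL filter drops it in the query.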

How to get count of records satisfying multiple criteria in SQL Server?

I have an SQL table with the columns below:
OrderNo, GroupNum, ShipMethod, TrackingNo
I want to find the number of orders that have multiple 'ShipMethod' values for the same GroupNum.
Sample records would be:
Order123 1 DHL
Order123 2 DHL1
Order123 2 Fedex
Then I need to get a result of 2 or, if possible, output like this:
OrderNo   GroupNum  Count
--------  --------  -----
Order123  2         2 (because 2 ship methods)
Group by the columns you want to be unique, use count() to get each group's count, and use having to limit the output to only the relevant groups:
select OrderNo, GroupNum, count(*) as cnt
from your_table
group by OrderNo, GroupNum
having count(*) > 1
If I understand correctly:
select OrderNo, GroupNum, count(*)
from t
group by OrderNo, GroupNum
having count(*) > 1;
You may also want count(distinct shipmethod), if you want to count the distinct values rather than the number of rows.
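The difference between the two counts only shows up when a group contains repeated values. Below is a rough sketch using Python's sqlite3 module as a stand-in for SQL Server; the table name `orders` and the two `Order456` rows are hypothetical additions that are not in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (OrderNo TEXT, GroupNum INTEGER, ShipMethod TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Order123", 1, "DHL"),
        ("Order123", 2, "DHL1"),
        ("Order123", 2, "Fedex"),
        # Hypothetical rows, not in the question: the same method listed twice.
        ("Order456", 1, "UPS"),
        ("Order456", 1, "UPS"),
    ],
)

# Counts rows per group: a repeated ship method still counts twice.
by_rows = conn.execute(
    """
    SELECT OrderNo, GroupNum, COUNT(*) AS cnt
    FROM orders
    GROUP BY OrderNo, GroupNum
    HAVING COUNT(*) > 1
    ORDER BY OrderNo, GroupNum
    """
).fetchall()

# Counts distinct ship methods per group.
by_methods = conn.execute(
    """
    SELECT OrderNo, GroupNum, COUNT(DISTINCT ShipMethod) AS cnt
    FROM orders
    GROUP BY OrderNo, GroupNum
    HAVING COUNT(DISTINCT ShipMethod) > 1
    ORDER BY OrderNo, GroupNum
    """
).fetchall()

print(by_rows)     # [('Order123', 2, 2), ('Order456', 1, 2)]
print(by_methods)  # [('Order123', 2, 2)]
```

If the table can never hold the same ShipMethod twice for one OrderNo and GroupNum, both queries return the same rows and count(*) is enough.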

Select a, min(b) but also want C from that same record

Select A, min(b) from TableX
group by a
This works but I want one more piece of information. The output will be one row for each A with A and the min(b) for that A.
But I also want C from that row.
I cannot figure out how to do it!
MS SQL Server 2012
"C" is the sysident of the row.
So table has
Sysident  ID   Date
1         100  2014-01-01
2         100  2014-01-02
3         200  2014-02-01
4         200  2014-02-05
etc
I want output of
Sysident  ID   Date
1         100  2014-01-01
3         200  2014-02-01
I can get the ID and min date with a simple Select ID, Min(date) group by ID but don't know how to get the Sysident for each of the rows.
When I write/edit this, my sample table looks like a table but when it displays it is all run together. I have searched HELP for formatting so it will look like a table but cannot find anything.
The question is very clear (to me). For every unique A, I want the sysident of the row with the oldest date and what that date is.
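If you want the first date, you can use min. Note that each min is computed independently, so min(sysident) comes from the same row as min(date) only if sysident increases with the date: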
select id
, min(sysident)
, min(date)
from YourTable
group by
id
If you want the sysident of a specific row, say the first one ordered by the date column, you can use SQL Server's row_number():
select *
from (
select row_number() over (
partition by id
order by [date]) as rn
, min([date]) over (partition by id) as min_date
, id
, sysident
from YourTable
) as SubQueryAlias
where rn = 1 -- Only oldest row per value of id
For more answers, check out the greatest-n-per-group tag.
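To see why the two approaches can differ, here is a hedged sketch run against SQLite through Python's sqlite3 module (a stand-in for SQL Server; window functions need SQLite 3.25 or later). The rows for id 200 are deliberately rearranged so that the earlier date has the higher sysident, which is an assumption for illustration and not the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (sysident INTEGER, id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        (1, 100, "2014-01-01"),
        (2, 100, "2014-01-02"),
        # Hypothetical ordering: the earlier date carries the higher sysident.
        (3, 200, "2014-02-05"),
        (4, 200, "2014-02-01"),
    ],
)

# Independent aggregates: each MIN is taken over its own column.
independent = conn.execute(
    "SELECT id, MIN(sysident), MIN(date) FROM YourTable GROUP BY id ORDER BY id"
).fetchall()

# ROW_NUMBER keeps sysident and date from the same row.
same_row = conn.execute(
    """
    SELECT id, sysident, date
    FROM (SELECT id, sysident, date,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS rn
          FROM YourTable)
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(independent)  # [(100, 1, '2014-01-01'), (200, 3, '2014-02-01')]
print(same_row)     # [(100, 1, '2014-01-01'), (200, 4, '2014-02-01')]
```

For id 200 the plain aggregates pair sysident 3 with a date that actually belongs to sysident 4, while the row_number() version returns the real oldest row.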

Return min date and corresponding amount to that distinct ID

Afternoon
I am trying to return the min value/ max values in SQL Server 2005 when I have multiple dates that are the same but the values in the Owed column are all different. I've already filtered the table down by my select statement into a temp table for a different query, when I've then tried to mirror I have all the duplicated dates that you can see below.
I now have a table that looks like:
ID| Date |Owes
-----------------
1 20110901 89
1 20110901 179
1 20110901 101
1 20110901 197
1 20110901 510
2 20111001 10
2 20111001 211
2 20111001 214
2 20111001 669
My current query:
Drop Table #Temp
Select Distinct Convert(Varchar(8), DateAdd(dd, Datediff(DD,0,DateDue),0),112)as Date
,ID
,Paid
Into #Temp
From Table
Where Paid <> '0'
Select ,Id
,Date
,Max(Owed)
,Min(Owed)
From #Temp
Group by ID, Date, Paid
Order By ID, Date, Paid
This doesn't strip out any of my dates that are the same. I'm presuming it's because my owed column has different values. I basically want to be able to pull back the first record, as this will always be my minimum paid, and my last record, as this will always be my maximum owed, to work out my total owed by ID.
I'm new to SQL, so I would like to understand what I've done wrong, for my future knowledge of structuring queries.
Many Thanks
In your "select into" statement, you don't have an Owed column?
GROUP BY is the normal way you "strip out values that are the same". If you group by ID and Date, you will get one row in your result for each distinct pair of values in those two columns. Each row in the results represents ALL the rows in the underlying table, and aggregate functions like MIN, MAX, etc. can pull out values.
SELECT id, date, MAX(owes) as MaxOwes, MIN(owes) as minOwes
FROM myFavoriteTable
GROUP BY id, date
In SQL Server 2005 there are "windowing functions" that allow you to use aggregate functions on groups of records, without grouping. An example below. You will get one row for each row in the table:
SELECT id, date, owes,
MAX(Owes) over (PARTITION BY id, date) AS MaxOwes,
MIN(Owes) over (PARTITION BY id, date) AS MinOwes
FROM myfavoriteTable
If you name a column "MinOwes" it might sound like you're just fishing tho.
If you want to group by date, you can't also group by ID, because ID is probably unique. Try:
Select Date
,Min(Owed) AS min_owed
,Max(Owed) AS max_owed
From #Temp
Group by Date
Order By Date
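To get additional values from the row (your question is a bit vague there), you could use window functions. Note that first_value() and last_value() require SQL Server 2012 or later, and that last_value() needs an explicit frame, because the default frame ends at the current row:
SELECT DISTINCT
Date
,first_value(ID) OVER (PARTITION BY Date ORDER BY Owed) AS min_owed_ID
,last_value(ID) OVER (PARTITION BY Date ORDER BY Owed
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_owed_ID
,first_value(Owed) OVER (PARTITION BY Date ORDER BY Owed) AS min_owed
,last_value(Owed) OVER (PARTITION BY Date ORDER BY Owed
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_owed
FROM #Temp
ORDER BY Date;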

SQL Server : SUM() of multiple rows including where clauses

I have a table that looks something like the following :
PropertyID Amount Type EndDate
--------------------------------------------
1 100 RENT null
1 50 WATER null
1 60 ELEC null
1 10 OTHER null
2 70 RENT null
2 10 WATER null
There will be multiple items billed to a property, and each can be billed multiple times. For example, RENT could be billed to property #1 12 times (over a year); however, the only ones I'm interested in are those with an EndDate of null (in other words, current).
I would like to achieve :
PropertyId Amount
--------------------------
1 220
2 80
I have tried to do something like this :
SELECT
propertyId,
SUM() as TOTAL_COSTS
FROM
MyTable
However, inside the SUM, would I be forced to have multiple selects bringing back the current amount for each type of charge? I could see this becoming messy and I'm hoping for a much simpler solution.
Any ideas?
This will bring back totals per property and type
SELECT PropertyID,
TYPE,
SUM(Amount)
FROM yourTable
GROUP BY PropertyID,
TYPE
This will bring back only active values
SELECT PropertyID,
TYPE,
SUM(Amount)
FROM yourTable
WHERE EndDate IS NULL
GROUP BY PropertyID,
TYPE
and this will bring back totals for properties
SELECT PropertyID,
SUM(Amount)
FROM yourTable
WHERE EndDate IS NULL
GROUP BY PropertyID
Try this:
SELECT
PropertyId,
SUM(Amount) as TOTAL_COSTS
FROM
MyTable
WHERE
EndDate IS NULL
GROUP BY
PropertyId
You mean getting the sum of Amount across all types for each property where EndDate is null:
SELECT propertyId, SUM(Amount) as TOTAL_COSTS
FROM MyTable
WHERE EndDate IS NULL
GROUP BY propertyId
Sounds like you want something like:
select PropertyID, SUM(Amount)
from MyTable
Where EndDate is null
Group by PropertyID
The WHERE clause is always conceptually applied (the execution plan can do what it wants, obviously) prior to the GROUP BY. It must come before the GROUP BY in the query, and acts as a filter before things are SUMmed, which is how most of the answers here work.
You should also be aware of the optional HAVING clause which must come after the GROUP BY. This can be used to filter on the resulting properties of groups after GROUPing - for instance HAVING SUM(Amount) > 0
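As a quick illustration of the two filters, here is a sketch using Python's sqlite3 module in place of SQL Server. The ended RENT row and the threshold of 100 are hypothetical values chosen only to make both filters visibly do something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (PropertyID INTEGER, Amount INTEGER, Type TEXT, EndDate TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (1, 100, "RENT", None), (1, 50, "WATER", None),
        (1, 60, "ELEC", None), (1, 10, "OTHER", None),
        (2, 70, "RENT", None), (2, 10, "WATER", None),
        # Hypothetical ended charge: WHERE removes it before anything is summed.
        (1, 90, "RENT", "2011-12-31"),
    ],
)

# WHERE filters individual rows before grouping.
current = conn.execute(
    """
    SELECT PropertyID, SUM(Amount) AS TOTAL_COSTS
    FROM MyTable
    WHERE EndDate IS NULL
    GROUP BY PropertyID
    ORDER BY PropertyID
    """
).fetchall()

# HAVING filters whole groups after the sums exist (threshold is arbitrary).
large = conn.execute(
    """
    SELECT PropertyID, SUM(Amount) AS TOTAL_COSTS
    FROM MyTable
    WHERE EndDate IS NULL
    GROUP BY PropertyID
    HAVING SUM(Amount) > 100
    ORDER BY PropertyID
    """
).fetchall()

print(current)  # [(1, 220), (2, 80)]
print(large)    # [(1, 220)]
```

The ended charge never reaches the SUM, and property 2 is dropped only in the second query, after its total of 80 has been computed.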
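Use a common table expression to add a grand total row. TOP 100 is there only because SQL Server does not allow ORDER BY inside a CTE without TOP; note that it also caps the detail at 100 rows, and that the order of the final result is not guaranteed without an outer ORDER BY. propertyId is cast to varchar so that it can share a column with the ' Total ' label.
With Detail as
(
SELECT top 100 CAST(propertyId AS varchar(20)) AS propertyId, SUM(Amount) as TOTAL_COSTS
FROM MyTable
WHERE EndDate IS NULL
GROUP BY propertyId
ORDER BY TOTAL_COSTS desc
)
Select * from Detail
Union all
Select ' Total ', sum(TOTAL_COSTS) from Detail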