Group records by date - sql

I am using SQL Server 2008 R2. I have a database table like the one below:
+--+-----+---+---------+--------+----------+-----------------------+
|Id|Total|New|Completed|Assigned|Unassigned|CreatedDtUTC |
+--+-----+---+---------+--------+----------+-----------------------+
|1 |29 |1 |5 |6 |5 |2014-01-07 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
|2 |29 |1 |5 |6 |5 |2014-01-07 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
|3 |29 |1 |5 |6 |5 |2014-01-07 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
|4 |30 |1 |3 |2 |3 |2014-01-08 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
|5 |30 |0 |3 |4 |3 |2014-01-09 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
|6 |30 |0 |0 |0 |0 |2014-01-10 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
|7 |30 |0 |0 |0 |0 |2014-01-11 06:00:00.000|
+--+-----+---+---------+--------+----------+-----------------------+
Now I am facing a strange problem while grouping the records by the CreatedDtUTC column.
I want the distinct records from this table. The first three records are duplicates created at the same date and time. To get the distinct records, I ran the query below:
SELECT Id, Total, New, Completed, Assigned, Unassigned, MAX(CreatedDtUTC)
FROM TblUsage
GROUP BY CreatedDtUTC
But it gives me this error:
Column 'TblUsage.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I have also tried DISTINCT on the CreatedDtUTC column, but it gave the same error. Can anyone tell me how to get rid of it?
P.S. I want the CreatedDtUTC column in CONVERT(VARCHAR(10), CreatedDtUTC, 101) format.

Try this:
SELECT min(Id) Id, Total, New, Completed, Assigned, Unassigned, CreatedDtUTC
FROM TblUsage
GROUP BY Total, New, Completed, Assigned, Unassigned, CreatedDtUTC

The error message itself is explicit: you can't put a column in the SELECT clause without applying an aggregate function to it unless it is part of GROUP BY. The reason is simple: SQL Server doesn't know which value of that column to pick within a group. The result would not be deterministic, so it is prohibited.
You can either put all the columns besides Id in GROUP BY and use MIN() or MAX() on Id, or you can use the ROW_NUMBER() window function as follows:
SELECT Id, Total, New, Completed, Assigned, Unassigned, CONVERT(VARCHAR(10), CreatedDtUTC,101) CreatedDtUTC
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY Total, New, Completed, Assigned, Unassigned, CreatedDtUTC
ORDER BY id DESC) rnum
FROM TblUsage t
) q
WHERE rnum = 1
Output:
| ID | TOTAL | NEW | COMPLETED | ASSIGNED | UNASSIGNED | CREATEDDTUTC |
|----|-------|-----|-----------|----------|------------|--------------|
| 3 | 29 | 1 | 5 | 6 | 5 | 01/07/2014 |
| 6 | 30 | 0 | 0 | 0 | 0 | 01/10/2014 |
| 7 | 30 | 0 | 0 | 0 | 0 | 01/11/2014 |
| 5 | 30 | 0 | 3 | 4 | 3 | 01/09/2014 |
| 4 | 30 | 1 | 3 | 2 | 3 | 01/08/2014 |

Try this:
SELECT MIN(Id) AS Id, Total, New, Completed, Assigned, Unassigned,
CONVERT(VARCHAR(10), CreatedDtUTC, 101) AS CreatedDtUTC
FROM TblUsage
GROUP BY Total, New, Completed, Assigned, Unassigned, CreatedDtUTC
OUTPUT
| ID | TOTAL | NEW | COMPLETED | ASSIGNED | UNASSIGNED | CREATEDDTUTC |
|----|-------|-----|-----------|----------|------------|--------------|
| 1 | 29 | 1 | 5 | 6 | 5 | 01/07/2014 |
| 4 | 30 | 1 | 3 | 2 | 3 | 01/08/2014 |
| 5 | 30 | 0 | 3 | 4 | 3 | 01/09/2014 |
| 6 | 30 | 0 | 0 | 0 | 0 | 01/10/2014 |
| 7 | 30 | 0 | 0 | 0 | 0 | 01/11/2014 |
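The outputs above differ only in which Id survives for the duplicated 2014-01-07 rows (3 versus 1): ROW_NUMBER() with ORDER BY id DESC keeps the highest Id per group, while MIN(Id) keeps the lowest. A minimal Python sketch of that difference, using the sample rows from the question. The names `rows`, `dedupe`, and `pick` are illustrative assumptions, not part of any SQL Server API:

```python
# Sample rows from the question:
# (Id, Total, New, Completed, Assigned, Unassigned, CreatedDtUTC)
rows = [
    (1, 29, 1, 5, 6, 5, "2014-01-07 06:00:00.000"),
    (2, 29, 1, 5, 6, 5, "2014-01-07 06:00:00.000"),
    (3, 29, 1, 5, 6, 5, "2014-01-07 06:00:00.000"),
    (4, 30, 1, 3, 2, 3, "2014-01-08 06:00:00.000"),
    (5, 30, 0, 3, 4, 3, "2014-01-09 06:00:00.000"),
    (6, 30, 0, 0, 0, 0, "2014-01-10 06:00:00.000"),
    (7, 30, 0, 0, 0, 0, "2014-01-11 06:00:00.000"),
]

def dedupe(rows, pick):
    """Keep one Id per group of rows that agree on every column except Id."""
    groups = {}
    for row in rows:
        key = row[1:]  # every column except Id, like the GROUP BY / PARTITION BY list
        groups.setdefault(key, []).append(row[0])
    # One row per group, with the Id chosen by `pick` (min or max)
    return sorted((pick(ids),) + key for key, ids in groups.items())

lowest = dedupe(rows, min)   # mirrors MIN(Id) ... GROUP BY
highest = dedupe(rows, max)  # mirrors ROW_NUMBER() ... ORDER BY id DESC, rnum = 1

print([r[0] for r in lowest])   # [1, 4, 5, 6, 7]
print([r[0] for r in highest])  # [3, 4, 5, 6, 7]
```

Either choice is deterministic, which is exactly what the original query lacked: it never said which Id to keep.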

Related

Issue about multiple grouping. How to get a single row from a group?

This is a table with my data:
-----------------------------
| date | value | id |
|03/05/18 |5 | 1 |
|03/05/18 |3 | 2 |
|03/05/18 |5 | 3 |
|03/05/18 |6 | 4 |
|03/05/18 |9 | 5 |
|08/03/19 |5 | 6 |
|08/03/19 |3 | 7 |
|08/03/19 |1 | 8 |
|08/03/19 |6 | 9 |
|01/06/20 |7 | 10 |
|01/06/20 |0 | 11 |
|01/06/20 |2 | 12 |
-----------------------------
I need to find the maximum value in each date and output it with corresponding id.
Example:
-----------------------------
| date | value | id |
|03/05/18 |9 | 5 |
|08/03/19 |6 | 9 |
|01/06/20 |7 | 10 |
-----------------------------
Now I know how to output the maximum value in each date, but without the corresponding id.
Example:
----------------------
| date | value |
|03/05/18 |9 |
|08/03/19 |6 |
|01/06/20 |7 |
----------------------
Software I use is MS SQL Server 2012.
My code:
SELECT
date,
MAX(value)
FROM
my_table
GROUP BY date
I've tried the SQL Server function "FIRST_VALUE" but it didn't help.
I also tried to write a comparison condition in a subquery, but ran into problems referencing aliases outside and inside the subquery.
Any ideas, please?

You can filter with a subquery:
select t.*
from mytable t
where t.value = (select max(t1.value) from mytable t1 where t1.date = t.date)
This would allow top ties, if any. Another option is to use window functions:
select *
from (
select t.*, rank() over(partition by date order by value desc) rn
from mytable t
) t
where rn = 1
If you want to break ties, you can use row_number() instead of rank() - but to get a stable result, you would need a second column in the order by clause.
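The difference between keeping ties and breaking them can be sketched in Python. This is only an illustration of the logic, not SQL Server code. The rows come from the question, except the row with id 13, which is a hypothetical addition that creates a tie on 01/06/20. Using id as the tie-breaking column is also an assumption:

```python
from collections import defaultdict

# (date, value, id) rows from the question, plus one hypothetical tie (id 13)
rows = [
    ("03/05/18", 5, 1), ("03/05/18", 3, 2), ("03/05/18", 5, 3),
    ("03/05/18", 6, 4), ("03/05/18", 9, 5),
    ("08/03/19", 5, 6), ("08/03/19", 3, 7), ("08/03/19", 1, 8), ("08/03/19", 6, 9),
    ("01/06/20", 7, 10), ("01/06/20", 0, 11), ("01/06/20", 2, 12),
    ("01/06/20", 7, 13),  # hypothetical row that ties with id 10
]

by_date = defaultdict(list)
for date, value, row_id in rows:
    by_date[date].append((value, row_id))

with_ties = {}
one_per_date = {}
for date, group in by_date.items():
    top = max(value for value, _ in group)
    # rank() = 1 (or the correlated subquery): every row holding the maximum is kept
    with_ties[date] = sorted(rid for value, rid in group if value == top)
    # row_number() = 1 with ORDER BY value DESC, id ASC: exactly one row per date;
    # id is the second sort column that makes the choice stable
    one_per_date[date] = min(group, key=lambda g: (-g[0], g[1]))[1]

print(with_ties)     # {'03/05/18': [5], '08/03/19': [9], '01/06/20': [10, 13]}
print(one_per_date)  # {'03/05/18': 5, '08/03/19': 9, '01/06/20': 10}
```

Without the second sort key, either id 10 or id 13 could be returned for 01/06/20, which is why the extra order by column matters.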

Need a simple query to calculate sequence length in SQL Server

I have a view that represents the connection status of each user to a system, as shown below:
---------------------------------------
|id | date | User | Connexion |
|1 | 01/01/2018 | A | 1 |
|2 | 02/01/2018 | A | 0 |
|3 | 03/01/2018 | A | 1 |
|4 | 04/01/2018 | A | 1 |
|5 | 05/01/2018 | A | 0 |
|6 | 06/01/2018 | A | 0 |
|7 | 07/01/2018 | A | 0 |
|8 | 08/01/2018 | A | 1 |
|9 | 09/01/2018 | A | 1 |
|10 | 10/01/2018 | A | 1 |
|11 | 11/01/2018 | A | 1 |
---------------------------------------
The target output is the length of each run of consecutive succeeded or failed connections, ordered by date, like this:
---------------------------------------------------------------
|StartDate | EndDate | User | Connexion | Length |
|01/01/2018 | 01/01/2018 | A | 1 | 1 |
|02/01/2018 | 02/01/2018 | A | 0 | 1 |
|03/01/2018 | 04/01/2018 | A | 1 | 2 |
|05/01/2018 | 07/01/2018 | A | 0 | 3 |
|08/01/2018 | 11/01/2018 | A | 1 | 4 |
---------------------------------------------------------------

This is what is called a gaps-and-islands problem. The best solution for your version is a difference of row numbers:
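-- [User] and [date] are bracketed: unescaped, USER is a reserved keyword in SQL Server
select [User], min([date]) as StartDate, max([date]) as EndDate, Connexion, count(*) as Length
from (select t.*,
row_number() over (partition by [User] order by [date]) as seqnum,
row_number() over (partition by [User], Connexion order by [date]) as seqnum_uc
from t
) t
group by [User], Connexion, seqnum - seqnum_uc
order by [User], StartDate;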
Why this works is a little tricky to explain. Generally, I find that if you stare at the results of the subquery, you'll see how the difference is constant for the groups that you care about.
Note: You should not use user or date for the names of columns. These are keywords in SQL (of one type or another). If you do use them, you have to clutter up your SQL with escape characters, which just makes the code harder to write, read, and debug.
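To see why the difference of row numbers is constant within each run, it can help to compute the two sequence numbers by hand. The following is a rough Python sketch of what the subquery and the outer grouping do, using the rows from the question. The variable names are illustrative, and the rows are assumed to be already sorted by date:

```python
# (date, user, connexion) rows from the question, in chronological order
rows = [("%02d/01/2018" % day, "A", conn)
        for day, conn in enumerate([1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1], start=1)]

seq_per_user = {}  # counter behind row_number() over (partition by user ...)
seq_per_uc = {}    # counter behind row_number() over (partition by user, connexion ...)
tagged = []
for date, user, conn in rows:
    seq_per_user[user] = seq_per_user.get(user, 0) + 1
    seq_per_uc[(user, conn)] = seq_per_uc.get((user, conn), 0) + 1
    diff = seq_per_user[user] - seq_per_uc[(user, conn)]
    tagged.append((user, conn, diff, date))

# Within a run both counters advance together, so diff stays constant;
# a row with the other connexion value advances only the first counter.
diffs = [t[2] for t in tagged]
print(diffs)  # [0, 1, 1, 1, 3, 3, 3, 4, 4, 4, 4]

# Outer query: group by (user, connexion, diff), then take first/last date and count
groups = {}
for user, conn, diff, date in tagged:
    groups.setdefault((user, conn, diff), []).append(date)

islands = [(dates[0], dates[-1], user, conn, len(dates))
           for (user, conn, diff), dates in groups.items()]
for island in islands:
    print(island)
```

Note that diff alone is not unique (rows 2 to 4 all have diff 1), which is why connexion must also be part of the grouping key.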

SQL Server 2008:: Efficient way to do the following query

I have the following data:
Input:
----------------------------
| Id | Value|
----------------------------
| 1 |A |
| 1 |B |
| 2 |C |
| 2 |D |
| 2 |E |
| 3 |F |
----------------------------
I need to convert the results to the following:
Output (Count is based on Id)
----------------------------
| Id | Value| Count|
----------------------------
| 1 |A | 2 |
| 1 |B | 2 |
| 2 |C | 3 |
| 2 |D | 3 |
| 2 |E | 3 |
| 3 |F | 1 |
----------------------------
I am using SQL Server 2008. Is it possible to write a query to do this?
If so, could anyone help me with the SQL to obtain the above output from the input data I gave?

You are looking for COUNT OVER:
select id, value, count(*) over (partition by id) as [Count]
from mytable
order by id, value;
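Unlike GROUP BY, a windowed count keeps every input row and attaches the size of its partition to it. A small Python sketch of the same idea, using the input rows from the question (the names `per_id` and `result` are illustrative):

```python
from collections import Counter

# (Id, Value) rows from the question
rows = [(1, "A"), (1, "B"), (2, "C"), (2, "D"), (2, "E"), (3, "F")]

# Count rows per Id once, like count(*) over (partition by id)
per_id = Counter(row_id for row_id, _ in rows)

# Every row is kept; the per-Id count is attached to each of them
result = [(row_id, value, per_id[row_id]) for row_id, value in sorted(rows)]

for row in result:
    print(row)
```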

How to add multiple rows with the same value and ID in one row

I have data like this:
|ID|partner_name|quantity|Price|Period |
|1 |partner 1 | 1 | 100 |01/2017|
|2 |partner 1 | 2 | 200 |01/2017|
|3 |partner 1 | 4 | 400 |01/2017|
|4 |partner 1 | 1 | 100 |02/2017|
I want the data to be like this:
|ID|partner_name|quantity|Price|Period |
|1 |partner 1 | 7 | 700 |01/2017|
|2 |partner 1 | 1 | 100 |02/2017|
How can I create that with SQL?
Thanks.

You should group your query:
SELECT partner_name, SUM(quantity) AS quantity, SUM(price) AS price, period
FROM your_table
GROUP BY partner_name, period;
This merges rows that share the same partner_name and period into a single row.
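As a rough illustration of what the grouping does, here is a Python sketch over the rows from the question. The renumbered ID column in the desired output is not produced by the query above. Generating it with a running number, as done here, is an assumption about what the asker wants (in SQL it would need something like ROW_NUMBER()):

```python
# (ID, partner_name, quantity, Price, Period) rows from the question
rows = [
    (1, "partner 1", 1, 100, "01/2017"),
    (2, "partner 1", 2, 200, "01/2017"),
    (3, "partner 1", 4, 400, "01/2017"),
    (4, "partner 1", 1, 100, "02/2017"),
]

# Accumulate SUM(quantity) and SUM(price) per (partner_name, period)
totals = {}
for _id, partner, quantity, price, period in rows:
    q, p = totals.get((partner, period), (0, 0))
    totals[(partner, period)] = (q + quantity, p + price)

# Assumption: the new ID is just a running number over the grouped rows
result = [(new_id, partner, q, p, period)
          for new_id, ((partner, period), (q, p)) in enumerate(totals.items(), start=1)]

for row in result:
    print(row)
```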

Selecting all rows in which id is distinct

Hi, I need some advice on how to write a select statement that returns whole rows, using the phone number as the measure of "distinction".
Example of what I have:
|ID |Name |Phone Number| Address |
| | | | |
|1 |John | 1234567 | A.Road 1 |
|1 |John | 1234567 | B.Road 2 |
|2 |Jane | 7654321 | C.Road 3 |
|3 |Jim | 7654321 | C.road 3 |
Example of what I want:
|ID |Name |Phone Number| Address |
| | | | |
|1 |John | 1234567 | A.Road 1 |
|2 |Jane | 7654321 | C.Road 3 |
It doesn't matter which of the rows SQL picks for the result, only that the whole row is available and that the selection contains distinct phone numbers. I hope you understand what I'm trying to do here.

ANSI SQL supports the row_number() function, which is a typical solution:
select t.*
from (select t.*,
row_number() over (partition by phone_number order by id) as seqnum
from t
) t
where seqnum = 1;