I need a way to group data based on previous rows


Let me try to explain this again.
This table has a record for each person for each day of the month. There are approximately 20 fields in the table. If any of the fields change (other than the date fields), I want to start a new group. For example, if days 1, 2, and 3 are the same, then when I read in day 4 and notice that it has changed, I want to group days 1, 2, and 3 together with a begin date of day 1 and an end date of day 3, and so on.
Rownum ID BegDate EndDate Field1, Field2.... Field20
1 1 6/1/2017 6/1/2017 xxxx xxxx xxxxx
2 1 6/2/2017 6/2/2017 xxxx xxxx xxxxx
3 1 6/3/2017 6/3/2017 xxxx xxxx xxxxx
4 1 6/4/2017 6/4/2017 yyyy yyyy yyyy
5 1 6/5/2017 6/5/2017 yyyy yyyy yyyy
6 1 6/6/2017 6/6/2017 xxxx xxxx xxxxx
7 1 6/7/2017 6/7/2017 xxxx xxxx xxxxx
8 1 6/8/2017 6/8/2017 zzzz zzzz zzzz
....
So in the example data above, I would have a group with rows 1,2,3 then a group with rows 4,5 then a group with rows 6,7 then a group with 8...etc
ID BegDate EndDate Field1 Field2 ...... Field20 Sum
1 6/1/2017 6/3/2017 xxxx xxxx xxxxx 3
1 6/4/2017 6/5/2017 yyyy yyyy yyyy 2
1 6/6/2017 6/7/2017 xxxx xxxx xxxxx 2
1 6/8/2017 6/15/2017 zzzz zzzz zzzz 8
.....

As an example, create a table:
create table t
(date_ datetime,
status varchar(1));
Add some data:
insert into t values ('2017-11-01','A');
insert into t values ('2017-11-02','A');
insert into t values ('2017-11-03','A');
insert into t values ('2017-11-04','B');
insert into t values ('2017-11-05','B');
insert into t values ('2017-11-06','B');
insert into t values ('2017-11-07','C');
insert into t values ('2017-11-08','C');
insert into t values ('2017-11-09','C');
insert into t values ('2017-11-10','C');
insert into t values ('2017-11-11','B');
insert into t values ('2017-11-12','B');
insert into t values ('2017-11-13','B');
insert into t values ('2017-11-14','B');
insert into t values ('2017-11-15','B');
Then use this query:
select min(date_start),
IFNULL(date_end,now()),
status
from
( select
t1.date_ date_start,
(select min(date_) from t t2 where t2.date_>t1.date_ and t2.status<>t1.status) - interval 1 day as 'date_end',
t1.status status
from t t1
) a
group by date_end,status
order by 1
http://sqlfiddle.com/#!9/96e27/11

You can do this with a difference of row numbers:
select ID, min(BegDate) as BegDate, max(EndDate) as EndDate,
Field1, Field2, ...... Field20,
datediff(day, min(BegDate), max(EndDate)) + 1 as [Sum]
from (select t.*,
row_number() over (partition by id order by begdate) as seqnum,
row_number() over (partition by id, Field1, Field2, . . ., Field20 order by begdate) as seqnum_2
from t
) t
group by id, (seqnum - seqnum_2), Field1, Field2, . . . Field20 ;
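
To see why the difference of row numbers isolates each run, here is a minimal sketch that runs the same idea against SQLite through Python's sqlite3 module. The two-column table is a hypothetical stand-in for Field1..Field20, the dates are stored as ISO text, and window functions assume SQLite 3.25 or later. Counting rows as days relies on the question's statement that there is one row per person per day.

```python
import sqlite3

# Hypothetical miniature of the question's table: field1/field2 stand in for Field1..Field20.
rows = [
    (1, "2017-06-01", "x", "x"),
    (1, "2017-06-02", "x", "x"),
    (1, "2017-06-03", "x", "x"),
    (1, "2017-06-04", "y", "y"),
    (1, "2017-06-05", "y", "y"),
    (1, "2017-06-06", "x", "x"),
    (1, "2017-06-07", "x", "x"),
    (1, "2017-06-08", "z", "z"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id int, begdate text, field1 text, field2 text)")
conn.executemany("insert into t values (?, ?, ?, ?)", rows)

# seqnum counts rows per id; seqnum_2 counts rows per id and field values.
# Their difference is constant within one unbroken run of identical values.
query = """
select id, min(begdate) as first_day, max(begdate) as last_day,
       field1, field2, count(*) as days
from (select t.*,
             row_number() over (partition by id order by begdate) as seqnum,
             row_number() over (partition by id, field1, field2 order by begdate) as seqnum_2
      from t) s
group by id, seqnum - seqnum_2, field1, field2
order by id, min(begdate)
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

The second run of x values (6/6 to 6/7) comes out as its own group because its row-number difference is 2, while the first run's difference is 0.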

Try the query below (shown with only two of the fields, field1 and field2).
To handle all 20 fields, add the remaining columns wherever you see field1, field2, so that the list reads field1, field2, field3, ... field20.
create table #tmp (RowNum int, id int,begdate datetime,EndDate datetime, field1 varchar(10),field2 varchar(10))
insert into #tmp values(1,1,'2017-06-01','2017-06-01','xxxxx','xxxxx')
insert into #tmp values(2,1,'2017-06-02','2017-06-02','xxxxx','xxxxx')
insert into #tmp values(3,1,'2017-06-03','2017-06-03','xxxxx','xxxxx')
insert into #tmp values(4,1,'2017-06-04','2017-06-04','yyyyy','yyyyy')
insert into #tmp values(5,1,'2017-06-05','2017-06-05','yyyyy','yyyyy')
insert into #tmp values(6,1,'2017-06-06','2017-06-06','xxxxx','xxxxx')
insert into #tmp values(7,1,'2017-06-07','2017-06-07','xxxxx','xxxxx')
insert into #tmp values(8,1,'2017-06-08','2017-06-08','zzzzz','zzzzz')
insert into #tmp values(9,1,'2017-06-09','2017-06-09','zzzzz','zzzzz')
insert into #tmp values(10,1,'2017-06-10','2017-06-10','zzzzz','zzzzz')
insert into #tmp values(11,2,'2017-06-04','2017-06-04','yyyyy','yyyyy')
insert into #tmp values(12,2,'2017-06-05','2017-06-05','yyyyy','yyyyy')
insert into #tmp values(13,2,'2017-06-06','2017-06-06','xxxxx','xxxxx')
insert into #tmp values(14,2,'2017-06-07','2017-06-07','xxxxx','xxxxx')
insert into #tmp values(15,1,'2017-06-11','2017-06-11','xxxxx','xxxxx')
insert into #tmp values(16,1,'2017-06-12','2017-06-12','xxxxx','xxxxx')
insert into #tmp values(17,1,'2017-06-13','2017-06-13','zzzzz','xxxxx')
insert into #tmp values(18,1,'2017-06-14','2017-06-14','zzzzz','xxxxx')
insert into #tmp values(19,1,'2017-06-15','2017-06-15','yyyyy','xxxxx')
insert into #tmp values(20,1,'2017-06-16','2017-06-16','zzzzz','xxxxx')
select ID, min(BegDate) as Begdate, max(EndDate) as EndDate,
Field1,Field2, /*Add all other fields here*/
datediff(day, min(BegDate), max(EndDate))+1 As [Sum]
from(
select *,
row_number() over (partition by id order by begdate) as seqnum,
row_number() over (partition by id, Field1,field2 /*Add all other fields here*/ order by begdate) as seqnum_2
from #tmp
) t
group by id, (seqnum - seqnum_2), Field1,Field2 /*Add all other fields here*/
order by ID,Begdate
Drop table #tmp

Related

how to count the number of last day

I have data like this:
id date_ type
1 05/03/2020 A
2 07/03/2020 A
3 15/03/2020 A
4 25/03/2020 B
5 24/03/2020 B
6 31/03/2020 C
7 31/03/2020 D
I used the last_day function, like this:
select last_day(date_) from table1
But I got this:
31/03/2020 : 7
And I want to get this:
31/03/2020 : 2
thanks !

If you are looking for the count of records whose date_ value is the last day of its month, then:
Schema and insert statements:
create table table1(id int, date_ date, type varchar(10));
insert into table1 values(1, '05-Mar-2020', 'A');
insert into table1 values(2, '07-Mar-2020', 'A');
insert into table1 values(3, '15-Mar-2020', 'A');
insert into table1 values(4, '25-Mar-2020', 'B');
insert into table1 values(5, '24-Mar-2020', 'B');
insert into table1 values(6, '31-Mar-2020', 'C');
insert into table1 values(7, '31-Mar-2020', 'D');
Query:
select date_, count(*)cnt
from table1
where date_ = last_day(date_)
group by date_;
Output:
DATE_      CNT
31-MAR-20  2
If you need a count for every date_ value, there is no need to use last_day:
Query:
select date_, count(*)cnt
from table1
group by date_
order by date_;
Output:
DATE_      CNT
05-MAR-20  1
07-MAR-20  1
15-MAR-20  1
24-MAR-20  1
25-MAR-20  1
31-MAR-20  2

I think you want aggregation:
select date_, count(*)
from t
where date_ = last_day(date_)
group by date_;
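
The same filter can be tried without an Oracle or MySQL instance. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption of this example and not part of the answer. SQLite has no last_day function, so the last day of the month is computed with date modifiers, and the dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id int, date_ text, type text)")
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [
        (1, "2020-03-05", "A"),
        (2, "2020-03-07", "A"),
        (3, "2020-03-15", "A"),
        (4, "2020-03-25", "B"),
        (5, "2020-03-24", "B"),
        (6, "2020-03-31", "C"),
        (7, "2020-03-31", "D"),
    ],
)

# SQLite stand-in for last_day(date_):
# first day of the month, plus one month, minus one day.
query = """
select date_, count(*) as cnt
from table1
where date_ = date(date_, 'start of month', '+1 month', '-1 day')
group by date_
"""
result = conn.execute(query).fetchall()
print(result)
```

Only the two rows dated 31 March pass the filter, which matches the count the question asks for.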

The way I understood it, "last day" isn't the result of the LAST_DAY function, but maximum date value in that table. The result you're after is count of rows whose date is equal to that "maximum" date.
If that's so, then this might be one option: counting rows is easy. ROW_NUMBER analytic function calculates ordinal numbers of each row, sorted by date in descending order which means that it is the 1st row you need.
Something like this:
SQL> select date_, cnt
2 from (select date_,
3 count(*) cnt,
4 row_number() over (order by date_ desc) rn
5 from table1
6 group by date_
7 )
8 where rn = 1;
DATE_ CNT
---------- ----------
31/03/2020 2
SQL>

Find sum() from two different tables and join them based on a condition?

I have two tables,
Table1:
ID Amount Date
------------------
123 500.00 02-Sep-2020
123 240.00 02-Sep-2020
124 200.50 02-Sep-2020
125 150.70 03-Sep-2020
123 480.80 03-Sep-2020
Table2
ID Settled_Amount Date
-------------------------------
123 150.25 02-Sep-2020
124 200.00 03-Sep-2020
125 100.40 03-Sep-2020
I want to sum the Amount column of table1 and sum the settled_amount column of Table2 of a particular ID group by the Date column.
So My result would be for ID=123:
Sum(Amount) Sum(Settled_amount) Date
------------------------------------------
740.00 150.25 02-Sep-2020
480.80 03-Sep-2020

You can use union all and group by. For all ids:
select id, date, sum(amount), sum(settled_amount)
from ((select id, date, amount, null as settled_amount
from table1
) union all
(select id, date, null as amount, settled_amount
from table2
)
) t
group by id, date
order by date;
You can filter for a particular id using a where clause in the outer query.
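
As a rough check of the union all approach, the sketch below loads the question's sample rows into SQLite through Python's sqlite3 module and filters for one id. The column name dt (in place of Date) and the use of SQLite are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# dt is a hypothetical rename of the question's Date column.
conn.execute("create table table1 (id int, amount real, dt text)")
conn.execute("create table table2 (id int, settled_amount real, dt text)")
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [
        (123, 500.00, "2020-09-02"),
        (123, 240.00, "2020-09-02"),
        (124, 200.50, "2020-09-02"),
        (125, 150.70, "2020-09-03"),
        (123, 480.80, "2020-09-03"),
    ],
)
conn.executemany(
    "insert into table2 values (?, ?, ?)",
    [
        (123, 150.25, "2020-09-02"),
        (124, 200.00, "2020-09-03"),
        (125, 100.40, "2020-09-03"),
    ],
)

# Stack both tables into one shape, padding the missing column with null,
# then aggregate once. sum() skips nulls, so each column totals only its own table.
query = """
select dt, sum(amount) as amount, sum(settled_amount) as settled_amount
from (select id, dt, amount, null as settled_amount from table1
      union all
      select id, dt, null as amount, settled_amount from table2) t
where id = ?
group by dt
order by dt
"""
totals = conn.execute(query, (123,)).fetchall()
for row in totals:
    print(row)
```

For 3 September there is no settled row for id 123, so the settled total comes back as null (None in Python), matching the blank cell in the question's expected result.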

Another way to write it, using CTEs and a join instead of the subselect in Gordon's answer:
declare @table1 table (id int, amount numeric(18,2), Dates Date)
Insert into @table1
values
(123 ,500.00 ,'02-Sep-2020'),
(123 ,240.00 ,'02-Sep-2020'),
(124 ,200.50 ,'02-Sep-2020'),
(125 ,150.70 ,'03-Sep-2020'),
(123 ,480.80 ,'03-Sep-2020')
declare @table2 table (id int, Settled_Amount numeric(18,2), Dates Date)
insert into @table2
values
(123 , 150.25 ,'02-Sep-2020'),
(124 , 200.00 ,'03-Sep-2020'),
(125 , 100.40 ,'03-Sep-2020');
with Table1 as (
select sum(amount) as Amount,ID,Dates from @table1
group by ID,Dates
)
,
Table2 as (
Select sum(Settled_amount) as Settled_amount, ID,Dates from @table2
group by ID,Dates
)
select Amount,Settled_amount,a.Dates,a.ID from Table1 a left join Table2 b on a.id = b.id and a.Dates = b.Dates
where a.id=123

How to get rows from two tables on maximum value of particular field

I have two tables that each have a date_updated column.
TableA is like below
con_id date_updated type
--------------------------------------------
123 19/06/2018 2
123 15/06/2018 1
123 01/05/2018 3
101 06/04/2018 1
101 05/03/2018 2
TableB has the same structure:
con_id date_updated type
--------------------------------------------
123 15/05/2018 2
123 01/05/2018 1
101 07/06/2018 1
The result should contain, for each con_id, the row with the most recent date across both tables:
con_id date_updated type
--------------------------------------------
123 19/06/2018 2
101 07/06/2018 1
Here the date_updated column has the datetime data type in SQL Server. I tried using GROUP BY and selecting the maximum date_updated, but then I cannot include the type column in the SELECT list. When I add type to the GROUP BY, the result is wrong because the rows are grouped by type as well. How can I write this query?

SELECT *
FROM
(SELECT *, ROW_NUMBER() OVER(Partition By con_id ORDER BY date_updated DESC) as seq
FROM
(SELECT * FROM TableA
UNION ALL
SELECT * FROM TableB) as tblMain) as tbl2
WHERE seq = 1
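
A quick way to convince yourself that this returns one row per con_id is to run the same shape of query on the question's sample data. The sketch below does that in SQLite through Python's sqlite3 module, which is an assumption of this example (the question targets SQL Server); dates are stored as ISO text so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tablea (con_id int, date_updated text, type int)")
conn.execute("create table tableb (con_id int, date_updated text, type int)")
conn.executemany(
    "insert into tablea values (?, ?, ?)",
    [
        (123, "2018-06-19", 2),
        (123, "2018-06-15", 1),
        (123, "2018-05-01", 3),
        (101, "2018-04-06", 1),
        (101, "2018-03-05", 2),
    ],
)
conn.executemany(
    "insert into tableb values (?, ?, ?)",
    [
        (123, "2018-05-15", 2),
        (123, "2018-05-01", 1),
        (101, "2018-06-07", 1),
    ],
)

# Number the combined rows per con_id, newest first, and keep only row 1.
query = """
select con_id, date_updated, type
from (select u.*,
             row_number() over (partition by con_id order by date_updated desc) as seq
      from (select * from tablea
            union all
            select * from tableb) u) ranked
where seq = 1
order by con_id desc
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

Because type is carried along inside the numbered rows, it never has to appear in a GROUP BY, which is the problem the question ran into.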

One method:
WITH A AS(
SELECT TOP 1 con_id,
date_updated,
type
FROM TableA
ORDER BY date_updated DESC),
B AS(
SELECT TOP 1 con_id,
date_updated,
type
FROM TableB
ORDER BY date_updated DESC),
U AS(
SELECT *
FROM A
UNION ALL
SELECT *
FROM B)
SELECT *
FROM U;
The 2 CTE's at the top get your most recent rows from the tables, and then the end statement unions them together.
For the benefit of the person who says this doesn't work:
USE Sandbox;
GO
CREATE TABLE tablea (con_id int, date_updated date, [type] tinyint);
CREATE TABLE tableb (con_id int, date_updated date, [type] tinyint);
GO
INSERT INTO tablea
VALUES
(123,'19/06/2018',2),
(123,'15/06/2018',1),
(123,'01/05/2018',3),
(101,'06/04/2018',1),
(101,'05/03/2018',2);
INSERT INTO tableb
VALUES
(123,'15/05/2018',2),
(123,'01/05/2018',1),
(101,'07/06/2018',1);
GO
WITH A AS(
SELECT TOP 1 con_id,
date_updated,
[type]
FROM TableA
ORDER BY date_updated DESC),
B AS(
SELECT TOP 1 con_id,
date_updated,
[type]
FROM TableB
ORDER BY date_updated DESC),
U AS(
SELECT *
FROM A
UNION ALL
SELECT *
FROM B)
SELECT *
FROM U;
GO
DROP TABLE tablea;
DROP TABLE tableb;
This returns the dataset:
con_id date_updated type
----------- ------------ ----
123 2018-06-19 2
101 2018-06-07 1
Which is identical to the OP's data:
con_id date_updated type
--------------------------------------------
123 19/06/2018 2
101 07/06/2018 1

Hope this helps:
WITH combined
AS(
select * FROM tableA
UNION
select * FROM tableB)
SELECT t1.con_id,
t1.date_updated,
t1.type
FROM (
SELECT con_id,
date_updated,
type,
row_number() OVER(partition BY con_id ORDER BY date_updated DESC) AS rownumber
FROM combined) t1
WHERE rownumber = 1;

Can be done using window functions:
declare @TableA table (con_id int, date_updated date, [type] int)
declare @TableB table (con_id int, date_updated date, [type] int)
insert into @TableA values
(123, '2018-06-19', 2)
, (123, '2018-06-15', 1)
, (123, '2018-05-01', 3)
, (101, '2018-04-06', 1)
, (101, '2018-03-05', 2)
insert into @TableB values
(123, '2018-05-15', 2)
, (123, '2018-05-01', 1)
, (101, '2018-06-07', 1)
select distinct con_id
, first_value(date_updated) over (partition by con_id order by con_id, date_updated desc) as date_updated
, first_value([type]) over (partition by con_id order by con_id, date_updated desc) as [type]
from
(Select * from @TableA UNION Select * from @TableB) x

Select ID, Count(ID) and Group by Date

I have an article table with id and date (month/year) columns. First I would like to count the ids grouped by date; then I would like to see which id belongs to which date group, in a single query, like this:
id date count
-----------------
1 01/2015 2
2 01/2015 2
3 02/2015 1
4 03/2015 4
5 03/2015 4
6 03/2015 4
7 03/2015 4
I have 2 queries
Select Count(id)
from article
group by date
and
Select id
from article
gives results;
count date id date
------------- ----------
2 01/2015 1 01/2015
1 02/2015 2 01/2015
4 03/2015 3 02/2015
I need a single query like
select count(id), id, date
from....
which brings id, count, date columns to use in my C# code.
Can someone help me with this?

SELECT id,
date,
COUNT(*) OVER (PARTITION BY date) AS Count
FROM article
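
To illustrate what the windowed COUNT produces, here is a small sketch on the question's sample rows, run in SQLite through Python's sqlite3 module (SQLite 3.25 or later for window functions). The use of SQLite and the column name dt in place of date are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# dt is a hypothetical rename of the question's date (month/year) column.
conn.execute("create table article (id int, dt text)")
conn.executemany(
    "insert into article values (?, ?)",
    [
        (1, "01/2015"),
        (2, "01/2015"),
        (3, "02/2015"),
        (4, "03/2015"),
        (5, "03/2015"),
        (6, "03/2015"),
        (7, "03/2015"),
    ],
)

# count(*) over (partition by dt) attaches the group size to every row
# without collapsing the rows the way GROUP BY would.
query = """
select id, dt, count(*) over (partition by dt) as cnt
from article
order by id
"""
counted = conn.execute(query).fetchall()
for row in counted:
    print(row)
```

Each id keeps its own row, and the count column repeats the size of that row's date group.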

Can't quite do that in one query, but you could use a CTE to produce a single result set:
create table #tt (id int null, dt varchar(8))
insert #tt values
(1,'01/2015'),
(2,'01/2015'),
(3,'02/2015'),
(4,'03/2015'),
(5,'03/2015'),
(6,'03/2015'),
(7,'03/2015')
;with cteCount(d, c) AS
(
select dt, count(id) from #tt group by dt
)
select id, dt, c
from #tt a
inner join cteCount cc
on a.dt = cc.d
drop table #tt
results:
id dt c
1 01/2015 2
2 01/2015 2
3 02/2015 1
4 03/2015 4
5 03/2015 4
6 03/2015 4
7 03/2015 4

if not exists(select * from TEST.sys.objects where type=N'U' and name=N'article')
begin
create table article(
[id] int,
[date] date)
end
with this data:
insert into article(id,date) values(1,convert(date,'15/01/2015',103));
insert into article(id,date) values(1,convert(date,'15/02/2015',103));
insert into article(id,date) values(2,convert(date,'15/03/2015',103));
insert into article(id,date) values(2,convert(date,'15/01/2015',103));
insert into article(id,date) values(3,convert(date,'15/02/2015',103));
insert into article(id,date) values(4,convert(date,'15/03/2015',103));
insert into article(id,date) values(5,convert(date,'15/01/2015',103));
insert into article(id,date) values(5,convert(date,'15/02/2015',103));
insert into article(id,date) values(1,convert(date,'15/03/2015',103));
insert into article(id,date) values(2,convert(date,'15/01/2015',103));
insert into article(id,date) values(3,convert(date,'15/02/2015',103));
insert into article(id,date) values(4,convert(date,'15/03/2015',103));
insert into article(id,date) values(5,convert(date,'15/01/2015',103));
insert into article(id,date) values(1,convert(date,'15/02/2015',103));
insert into article(id,date) values(2,convert(date,'15/03/2015',103));
insert into article(id,date) values(3,convert(date,'15/01/2015',103));
insert into article(id,date) values(4,convert(date,'15/03/2015',103));
select id,[date], count(id) [count] from article
group by [date],[id]
the result:
id date count
1 2015-01-15 1
1 2015-02-15 2
1 2015-03-15 1
2 2015-01-15 2
2 2015-03-15 2
3 2015-01-15 1
3 2015-02-15 2
4 2015-03-15 3
5 2015-01-15 2
5 2015-02-15 1

It's not clear how you want to generate the id field in the result. If you want to generate it, use RANK(); if you want to take it from the table's id values, use MAX() or MIN(), depending on your expected result.
Using RANK(), try:
create table tt (id int null, dt varchar(8),count int)
insert tt values
(1,'01/2015',2),
(2,'01/2015',2),
(3,'02/2015',1),
(4,'03/2015',4),
(5,'03/2015',4),
(6,'03/2015',4),
(7,'03/2015',4)
Query:
select count(id) as count,dt,RANK()
over(order by count(id)) as id from tt group by dt
EDIT2:
Or you can just use MAX() or MIN(), like:
select count(id) as count,dt,Min(id) as id from tt group by dt
or
select count(id) as count,dt,MAX(id) as id from tt group by dt

Delete Duplicate rows from table which have same Id

I have a table Emp which has records like this:
Id Name
1 A
2 B
3 C
1 A
1 A
2 B
3 C
Now I want to delete the duplicate rows from the table.
I am using this query to find and count the duplicate records:
SELECT NameCol, COUNT(*) as TotalCount FROM TestTable
GROUP BY NameCol HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
What query should I write to delete the duplicate rows from the table?
If I run the following query to delete the duplicate records, it reports 0 rows affected.
DELETE FROM TestTable
WHERE ID NOT IN ( SELECT MAX(ID) FROM
TestTable
GROUP BY NameCol
)

For SQL Server 2005+
Test data:
declare @t table(Id int, Name char(1))
insert @t values
(1,'A'),(2,'B'),(3,'C'),(1,'A'),(1,'A'),(2,'B'),(3,'C')
Delete statement (replace @t with your Emp table):
;with a as
(
select row_number() over (partition by id, name order by id) rn
from @t
)
delete from a where rn > 1
select * from @t

Q: How to remove duplicate data with the help of ROWID
create table abcd(id number(10),name varchar2(20))
insert into abcd values(1,'abc')
insert into abcd values(2,'pqr')
insert into abcd values(3,'xyz')
insert into abcd values(1,'abc')
insert into abcd values(2,'pqr')
insert into abcd values(3,'xyz')
select * from abcd
id Name
1 abc
2 pqr
3 xyz
1 abc
2 pqr
3 xyz
Delete the duplicate records but keep one copy of each distinct record in the table:
DELETE
FROM abcd a
WHERE ROWID > (SELECT MIN(ROWID) FROM abcd b
WHERE b.id=a.id
);
Running the above query deletes 3 rows.
select * from abcd
id Name
1 abc
2 pqr
3 xyz
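
SQLite also exposes a rowid for ordinary tables, so the same keep-the-lowest-rowid idea can be sketched there. The example below runs it through Python's sqlite3 module; the choice of SQLite is an assumption of this example, and the extra comparison on name is added here so that only fully identical rows count as duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table abcd (id int, name text)")
conn.executemany(
    "insert into abcd values (?, ?)",
    [(1, "abc"), (2, "pqr"), (3, "xyz"), (1, "abc"), (2, "pqr"), (3, "xyz")],
)

# For each row, look up the smallest rowid among rows with the same id and name.
# Any row whose rowid is larger than that minimum is a later duplicate and is removed.
cur = conn.execute(
    """
    delete from abcd
    where rowid > (select min(b.rowid)
                   from abcd b
                   where b.id = abcd.id and b.name = abcd.name)
    """
)
deleted = cur.rowcount
conn.commit()

remaining = conn.execute("select id, name from abcd order by id").fetchall()
print(deleted, remaining)
```

The row with the smallest rowid in each group is never deleted, so exactly one copy of every distinct row survives.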
