get records that have only 1 record per group - sql

We have attendance data in SQL Server as follows:
empid date type
1 01-Jan In
1 01-Jan Out
2 01-Jan In
3 01-Jan In
3 01-Jan Out
How can we get the records of employees who have only one record per date (in the case above, empid 2 for 01-Jan)?
The query should simply list all records of employees that have only single type for a day.
EDIT
The result set should be a bit more specific: show all employees who have only an "In" for a date but no "Out".

Use Having
select empid, date, count(*)
from Mytable
group by empid, date
having count(*) = 1
You can use this to get the full line:
select t1.*
from MyTable t1
inner join
(
select empid, date, count(*) as cnt
from Mytable
group by empid, date
having count(*) = 1
) t2
on t1.empid = t2.empid
and t1.date = t2.date
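As a quick check of the HAVING-plus-join approach above, here is a minimal sketch that runs it against the sample data. It uses Python's standard-library sqlite3 module as a stand-in for SQL Server, and the table name attendance is an assumption.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (empid INTEGER, date TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?)",
    [
        (1, "01-Jan", "In"),
        (1, "01-Jan", "Out"),
        (2, "01-Jan", "In"),
        (3, "01-Jan", "In"),
        (3, "01-Jan", "Out"),
    ],
)

# Find (empid, date) groups with exactly one row, then join back for the full row.
single_rows = conn.execute(
    """
    SELECT t1.empid, t1.date, t1.type
    FROM attendance t1
    INNER JOIN (
        SELECT empid, date, COUNT(*) AS cnt
        FROM attendance
        GROUP BY empid, date
        HAVING COUNT(*) = 1
    ) t2 ON t1.empid = t2.empid AND t1.date = t2.date
    """
).fetchall()
print(single_rows)
```

With the sample rows from the question, only empid 2 has a single record for 01-Jan.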

You can use window functions:
select t.*
from (select t.*,
count(*) over (partition by empid, date) as cnt
from t
) t
where cnt = 1;
You can also use aggregation:
select empid, date, max(type) as type
from t
group by empid, date
having count(*) = 1;

Use a correlated subquery
select * from tablename a
where not exists (select 1 from tablename b where a.empid=b.empid and a.date=b.date and b.type='Out')
OR
select empid, date,count(distinct type)
from tablename
group by empid,date
having count(distinct type)=1
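For the edited requirement (an "In" with no matching "Out" on the same date), a NOT EXISTS check can be sketched as follows. This uses Python's sqlite3 module in place of SQL Server; the table name attendance and the extra employee 4 (who has only an "Out") are assumptions added to show that such rows are excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (empid INTEGER, date TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?)",
    [
        (1, "01-Jan", "In"),
        (1, "01-Jan", "Out"),
        (2, "01-Jan", "In"),
        (3, "01-Jan", "In"),
        (3, "01-Jan", "Out"),
        (4, "01-Jan", "Out"),  # hypothetical row: an "Out" without an "In"
    ],
)

# Keep "In" rows for which no "Out" exists for the same employee and date.
in_only = conn.execute(
    """
    SELECT a.empid, a.date
    FROM attendance a
    WHERE a.type = 'In'
      AND NOT EXISTS (
          SELECT 1
          FROM attendance b
          WHERE b.empid = a.empid
            AND b.date = a.date
            AND b.type = 'Out'
      )
    ORDER BY a.empid
    """
).fetchall()
print(in_only)
```

Filtering on a.type = 'In' in the outer query is what keeps an employee with only an "Out" from appearing in the result.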

The solution is very simple: you can use the DISTINCT keyword.
The query should be:
SELECT DISTINCT empid FROM attendance
This returns each distinct empid once; note that it does not filter on the number of records per date.
For reference, see https://www.techonthenet.com/sql_server/distinct.php

This will also work when an ID has only a single IN or only a single OUT:
Declare @t table (empid int,date varchar(50),types varchar(50))
insert into @t values (1,'01-Jan','IN')
insert into @t values (1,'01-Jan','OUT')
insert into @t values (2,'01-Jan','IN')
insert into @t values (3,'01-Jan','OUT')
insert into @t values (4,'01-Jan','OUT')
select * from @t a
where not exists (select 1 from @t b where a.empid=b.empid and a.types!=b.types)

Related

Count number of cases where visitor rank is higher on one page than on another

I want to count the number of fullvisitorID values where the rank in /page_y is higher than the rank in /page_x. In this case the result would be 1 (only 111).
fullvisitorID rank page
111 1 /page_x
111 2 /page_y
222 1 /page_x
222 2 /page_x
333 2 /page_x
333 1 /page_y
Consider below approach
select count(*) from (
select distinct fullvisitorID
from your_table
qualify max(if(page='/page_y',rank,null)) over win > max(if(page='/page_x',rank,null)) over win
window win as (partition by fullvisitorID)
)
SELECT COUNTIF(page = '/page_y') cnt FROM (
SELECT * FROM sample_table WHERE page IN ('/page_x', '/page_y')
QUALIFY ROW_NUMBER() OVER (PARTITION BY fullvisitorID ORDER BY rank DESC) = 1
);
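QUALIFY and COUNTIF are BigQuery features. For engines without them, the same comparison can be sketched with conditional aggregation and HAVING. The example below is a rough sketch using Python's sqlite3 module; the table name visits is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE visits (fullvisitorID INTEGER, "rank" INTEGER, page TEXT)')
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (111, 1, "/page_x"),
        (111, 2, "/page_y"),
        (222, 1, "/page_x"),
        (222, 2, "/page_x"),
        (333, 2, "/page_x"),
        (333, 1, "/page_y"),
    ],
)

# Per visitor, compare the highest rank seen on /page_y with the highest on /page_x.
# A visitor with no /page_y row yields NULL on the left side and is filtered out.
visitor_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT fullvisitorID
        FROM visits
        GROUP BY fullvisitorID
        HAVING MAX(CASE WHEN page = '/page_y' THEN "rank" END)
             > MAX(CASE WHEN page = '/page_x' THEN "rank" END)
    )
    """
).fetchone()[0]
print(visitor_count)
```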
for count you can use COUNT and GROUP BY
SELECT fullvisitorID, COUNT(fullvisitorID), Page FROM table t1
WHERE rank = (SELECT MAX(t2.rank) FROM table t2 WHERE t2.fullvisitorID = t1.fullvisitorID)
Group By fullvisitorID, Page
You can apply a SELF JOIN of the table with itself, matching on the "fullvisitorID" field, and then require:
the first copy of the table to have "/page_y" values
the second copy of the table to have "/page_x" values
the rank in the first copy to be higher than the rank in the second copy
SELECT *
FROM tab t1
INNER JOIN tab t2
ON t1.fullvisitorID = t2.fullvisitorID
AND t1.page = '/page_y'
AND t2.page = '/page_x'
AND t1.rank > t2.rank
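The self join above returns the matching row pairs; to get the count the question asks for, one option is to wrap it in COUNT(DISTINCT ...). A minimal sketch, using Python's sqlite3 module and an assumed table name visits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE visits (fullvisitorID INTEGER, "rank" INTEGER, page TEXT)')
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (111, 1, "/page_x"),
        (111, 2, "/page_y"),
        (222, 1, "/page_x"),
        (222, 2, "/page_x"),
        (333, 2, "/page_x"),
        (333, 1, "/page_y"),
    ],
)

# Pair each /page_y row with the same visitor's /page_x rows and keep pairs
# where the /page_y rank is higher; DISTINCT avoids double-counting a visitor
# who has several matching pairs.
matching_visitors = conn.execute(
    """
    SELECT COUNT(DISTINCT t1.fullvisitorID)
    FROM visits t1
    INNER JOIN visits t2
        ON t1.fullvisitorID = t2.fullvisitorID
       AND t1.page = '/page_y'
       AND t2.page = '/page_x'
       AND t1."rank" > t2."rank"
    """
).fetchone()[0]
print(matching_visitors)
```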
Table separation approach:
DECLARE @t1 TABLE ( fullvisitorID INT, [rank] INTEGER,[page] VARCHAR (max)) --rows where page = x
DECLARE @t2 TABLE ( fullvisitorID INT, [rank] INTEGER,[page] VARCHAR (max)) --rows where page = y
INSERT INTO @t1 SELECT * FROM #test t WHERE t.[page] LIKE '/page_x'
INSERT INTO @t2 SELECT * FROM #test t WHERE t.[page] LIKE '/page_y'
SELECT COUNT(*) FROM @t1 x INNER JOIN @t2 y ON x.fullvisitorID = y.fullvisitorID WHERE x.[rank] < y.[rank]

update all rows of a table based on minimum value of its group

I have a table like this
Date Value Group
2017-01-01 10 1
2017-01-02 9 1
2017-01-03 5 2
2017-01-04 4 2
I want to update the Value column of every row in the table so that it is set to the value on the minimum date in that row's group,
like this:
Date Value Group
2017-01-01 10 1
2017-01-02 10 1
2017-01-03 5 2
2017-01-04 5 2
Here you go: two subqueries. The first calculates the minimum date per group and is joined back to the original table to get the associated value. That result is then joined to the original table again to update every row in each group with that value:
UPDATE M SET M.Value = RESULT.Value FROM MyTable M
INNER JOIN (
SELECT MV.[Group], M2.Value FROM MyTable M2
INNER JOIN (
SELECT MIN(Date) as MinDateValue, [Group] FROM MyTable
GROUP BY [Group]
) MV ON MV.MinDateValue = M2.Date AND MV.[Group] = M2.[Group]
) RESULT ON RESULT.[Group] = M.[Group]
First get the minimum date and its value from a subquery. Based on this result, update the main table:
CREATE TABLE #Table(_Date Date,value INT,_Group INT)
INSERT INTO #Table(_Date ,value ,_Group)
SELECT '2017-01-01',10,1 UNION ALL
SELECT '2017-01-02',9,1 UNION ALL
SELECT '2017-01-03',5,2 UNION ALL
SELECT '2017-01-04',4,2
UPDATE #Table SET value = _Output._Value
FROM
(
SELECT A._Date , A._Group , T.value _Value
FROM #Table T
JOIN
(
SELECT MIN(_Date) _Date ,_Group
FROM #Table
GROUP BY _Group
) A ON A._Date = T._Date AND A._Group = T._Group
) _Output WHERE _Output._Group = #Table._Group
SELECT * FROM #Table
You can also use a CTE.
Query
;with cte as(
select [rn] = row_number() over(
partition by [Group]
order by [Date]
), *
from [your_table_name]
)
update t1
set t1.[Value] = t2.[Value]
from cte t1
join cte t2
on t1.[Group] = t2.[Group]
and t2.[rn] = 1;
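To see the intended result on the sample data, here is a rough sketch using a correlated subquery in the UPDATE instead of a join. It runs on Python's sqlite3 module rather than SQL Server; the table name readings and the column name grp (standing in for Group) are assumptions, and it assumes dates are unique within a group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (date TEXT, value INTEGER, grp INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2017-01-01", 10, 1),
        ("2017-01-02", 9, 1),
        ("2017-01-03", 5, 2),
        ("2017-01-04", 4, 2),
    ],
)

# For every row, look up the value stored on the earliest date of the same group.
# The earliest row is assigned its own value, so update order does not matter.
conn.execute(
    """
    UPDATE readings
    SET value = (
        SELECT r2.value
        FROM readings r2
        WHERE r2.grp = readings.grp
        ORDER BY r2.date
        LIMIT 1
    )
    """
)

updated = conn.execute(
    "SELECT date, value, grp FROM readings ORDER BY date"
).fetchall()
print(updated)
```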

SQL Server Query_ToDelete_Records

Can anyone help me with a script that selects the latest date from the column dtUpdated_On if the date is greater than the last date and the ID is less than the last ID? I know the question is not clear, but please try to understand. In this example I want to delete ID 1003 (I know that in this example we could simply say: Delete from tableName where ID=1003).
ID dtUpdated_On
-----------------------------------
1001 2009-12-11 20:08:16.857
1002 2012-03-31 02:35:16.650
1003 2012-09-01 00:00:00.000
1004 2012-03-31 02:35:16.650
Assuming that by "last" you mean the row with the highest id, then you can do:
select t.*
from t join
(select top 1 dtUpdated_On
from t
order by id desc
) last
on t.dtUpdated_On > last.dtUpdated_On;
You can also express this in the where clause, which is simpler for deletion (in my opinion):
delete t
where t.dtUpdated_On > (select top 1 t2.dtUpdated_On
from t t2
order by t2.id desc
)
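Under the same reading of "last" (the row with the highest ID), the delete can be tried out on the sample rows. This is only a sketch: it uses Python's sqlite3 module, so TOP 1 becomes LIMIT 1, the timestamps are stored as ISO-formatted text, and the table name updates is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE updates (ID INTEGER, dtUpdated_On TEXT)")
conn.executemany(
    "INSERT INTO updates VALUES (?, ?)",
    [
        (1001, "2009-12-11 20:08:16.857"),
        (1002, "2012-03-31 02:35:16.650"),
        (1003, "2012-09-01 00:00:00.000"),
        (1004, "2012-03-31 02:35:16.650"),
    ],
)

# Delete every row whose timestamp is later than that of the highest-ID row.
# ISO-formatted timestamps compare correctly as plain strings.
conn.execute(
    """
    DELETE FROM updates
    WHERE dtUpdated_On > (
        SELECT u2.dtUpdated_On
        FROM updates u2
        ORDER BY u2.ID DESC
        LIMIT 1
    )
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT ID FROM updates ORDER BY ID")]
print(remaining_ids)
```

Only ID 1003 has a timestamp later than that of ID 1004, so it is the only row removed.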
Try this:
DECLARE @MyTable TABLE(ID INT, dtUpdated_On DATETIME)
INSERT INTO @MyTable
VALUES (1001, '2009-12-11 20:08:16.857')
,(1002, '2012-03-31 02:35:16.650')
,(1003, '2012-09-01 00:00:00.000')
,(1004, '2012-03-31 02:35:16.650')
;WITH LatestDate AS
(
SELECT TOP 1 ID, dtUpdated_On
FROM @MyTable
ORDER BY dtUpdated_On DESC, ID DESC
),
LastestID AS
(
SELECT c.ID, c.dtUpdated_On, t.ID AS LatestID, t.dtUpdated_On AS LatestIDDate
FROM @MyTable t
INNER JOIN LatestDate c ON t.dtUpdated_On < c.dtUpdated_On
AND t.ID > c.ID
)
DELETE t
FROM @MyTable t
INNER JOIN LastestID c ON c.ID = t.ID
SELECT *
FROM @MyTable t

Oracle SQL to delete duplicate records based on columns

I have a table with records:
DATE NAME AGE ADDRESS
01/13/2014 abc 27 us
01/29/2014 abc 27 ma <- duplicate
02/03/2014 abc 27 ny <- duplicate
02/03/2014 def 28 ca
I want to delete the record number 2 and 3 since they are duplicates for record 1 based on name and age. DATE column is a timestamp based from the record when it was added (sql date) and considered unique.
I found this SQL but I am not sure whether it will work, and I am a bit concerned because the table has 2 million records and deleting the wrong ones would be a bad idea:
SELECT A.DATE, A.NAME, A.AGE
FROM table A
WHERE EXISTS (SELECT B.DATE
FROM table B
WHERE B.NAME = A.NAME
AND B.AGE = A.AGE);
There are many instances of such records, so can someone help me write a SQL statement to delete them?
Query
DELETE FROM tbl t1
WHERE dt IN
(
SELECT t1.dt
FROM tbl t1
JOIN tbl t2 ON
(
t2.name = t1.name
AND t2.age=t1.age
AND t2.dt > t1.dt
)
);
delete from table
where (date, name, age) not in ( select max( date ), name, age from table group by name, age )
Before delete verify with
select * from table
where (date, name, age) not in ( select max( date ), name, age from table group by name, age )
The ROW_NUMBER analytic function will be helpful here (it is supported by both Oracle and SQL Server).
It assigns a unique, ordered number to each row inside a partition, so the ORDER BY clause needs to be chosen carefully: it decides which row in each group gets number 1.
SELECT A_TABLE.*,
ROW_NUMBER ()
OVER (PARTITION BY NAME, AGE
ORDER BY DATE DESC)
seq_no
FROM A_TABLE;
Then you may use the result for delete operation:
Delete A_TABLE
where (DATE, NAME, AGE) IN
(
SELECT DATE, NAME, AGE FROM
(
SELECT A_TABLE.*,
ROW_NUMBER ()
OVER (PARTITION BY NAME, AGE
ORDER BY DATE DESC)
seq_no
FROM A_TABLE
)
WHERE seq_no != 1
)
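The question asks to keep record 1, the earliest row per name and age. A sketch of the ROW_NUMBER idea with ascending order, so that the earliest row gets number 1, might look like this. It uses Python's sqlite3 module (window functions need SQLite 3.25 or later) rather than Oracle; the table name people, the column name dt, the use of SQLite's rowid, and the ISO date format are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (dt TEXT, name TEXT, age INTEGER, address TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("2014-01-13", "abc", 27, "us"),
        ("2014-01-29", "abc", 27, "ma"),
        ("2014-02-03", "abc", 27, "ny"),
        ("2014-02-03", "def", 28, "ca"),
    ],
)

# Number the rows in each (name, age) group by ascending date, then delete
# everything except the first (earliest) row of each group.
conn.execute(
    """
    DELETE FROM people
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY name, age ORDER BY dt) AS seq_no
            FROM people
        )
        WHERE seq_no > 1
    )
    """
)

kept = conn.execute("SELECT dt, name, age, address FROM people ORDER BY dt").fetchall()
print(kept)
```

On a large table it is worth running the inner SELECT on its own first to confirm which rows would be removed.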

Group by id and select most recent

I have a table example like this:
date id status
01/01/2013 55555 high
01/01/2014 55555 low
01/01/2010 44444 high
01/01/2011 33333 low
I need to group by id and select the row with the most recent date.
this is the result I want.
date id status
01/01/2014 55555 low
01/01/2010 44444 high
01/01/2011 33333 low
I do not care about the order of the rows.
You need to join your table with a subquery that "links" the record date with the greatest date for each id:
select a.*
from your_table as a
inner join (
select id, max(date) as max_date
from your_table
group by id
) as b on a.id = b.id and a.date = b.max_date;
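The join against the per-id maximum date can be checked on the sample rows. The following is a minimal sketch using Python's sqlite3 module; the dates are stored in ISO format (an assumption) so that MAX compares them correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (date TEXT, id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("2013-01-01", 55555, "high"),
        ("2014-01-01", 55555, "low"),
        ("2010-01-01", 44444, "high"),
        ("2011-01-01", 33333, "low"),
    ],
)

# Join each row to the latest date of its id; only the most recent row matches.
latest_rows = conn.execute(
    """
    SELECT a.date, a.id, a.status
    FROM your_table AS a
    INNER JOIN (
        SELECT id, MAX(date) AS max_date
        FROM your_table
        GROUP BY id
    ) AS b ON a.id = b.id AND a.date = b.max_date
    ORDER BY a.id
    """
).fetchall()
print(latest_rows)
```

If two rows of the same id share the latest date, this approach returns both of them.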
I think you will need a subquery to get the MAX(Date) and then inner join. Try this:
SELECT A.[Date], A.[Id], A.[Status]
FROM Table A
INNER JOIN(SELECT Id, MAX([Date]) AS MaxDate
FROM Table
GROUP BY [Id]) B ON
A.[Id] = B.[Id] AND
A.[Date] = B.[MaxDate]
--return the group id and the latest date in that group
select id
, MAX([date]) [latestDateInGroup]
from tbl
group by id
--return the group id, and the related status and date for the record with the latest date in that group
select id
, [status] [latestDateInGroup'sStatus]
, [date] [latestDateInGroup]
from
(
select id
, [status]
, [date]
, row_number() over (partition by id order by [date] desc) r
from tbl
) x
where x.r = 1
--return all ids and statuses, along with the latest date in that group's group (requires SQL 2012+)
select id
, [status]
, max([date]) over (partition by id order by [date] desc) [latestDateInGroup]
from tbl
SQL Fiddle (http://sqlfiddle.com) is offline at the moment; once it is back up, the following code should let you build a table to test the above queries with:
create table tbl ([date] date, id bigint, [status] nvarchar(4))
go
insert tbl select '2013-01-01', 55555, 'high'
insert tbl select '2014-01-01', 55555, 'low'
insert tbl select '2010-01-01', 44444, 'high'
insert tbl select '2011-01-01', 33333, 'low'