Pull records with unique flags in an efficient way - sql

From the table below, I want to write a query that extracts the records where the flag first occurs. As an example, from the table below, I would want to pull the Nov 8 record, Dec 6 record, and Jan 10 record into a separate table. Any thoughts on how to best approach this? I'm not tied to having the flag column being a count - ideally it could be binary, but I'm not sure... the flag column is computed and not part of the raw data.
Date Location KPI Flag
11/8/2017 A 5 1
11/15/2017 A 5 1
11/22/2017 A 5 1
11/29/2017 A 5 1
12/6/2017 A 10 2
12/13/2017 A 10 2
12/20/2017 A 10 2
12/27/2017 A 10 2
1/3/2018 A 10 2
1/10/2018 A 15 3
1/17/2018 A 15 3
1/24/2018 A 15 3

Often the fastest method is a correlated subquery:
select t.*
from t
where t.date = (select min(t2.date)
from t t2
where t2.location = t.location and
t2.kpi = t.kpi
);
In particular, this can make use of an index on (location, kpi, date).
That said, if you want the rows where kpi changes, then you might want lag():
select t.*
from (select t.*,
lag(kpi) over (partition by location order by date) as prev_kpi
from t
) t
where prev_kpi is null or prev_kpi <> kpi;
In particular, this will allow kpi values to repeat at different times -- and you will get one for each group of adjacent values.
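The lag() approach can be tried against the sample data from the question. This is only a sketch: it assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), and it rewrites the dates in ISO format so they sort correctly as text. The table name t follows the answer above.

```python
import sqlite3

# Sample rows from the question; dates converted to ISO format (an assumption
# made here so that text ordering matches chronological ordering).
rows = [
    ("2017-11-08", "A", 5), ("2017-11-15", "A", 5), ("2017-11-22", "A", 5),
    ("2017-11-29", "A", 5), ("2017-12-06", "A", 10), ("2017-12-13", "A", 10),
    ("2017-12-20", "A", 10), ("2017-12-27", "A", 10), ("2018-01-03", "A", 10),
    ("2018-01-10", "A", 15), ("2018-01-17", "A", 15), ("2018-01-24", "A", 15),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, location TEXT, kpi INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Keep a row when there is no previous row for the location or the KPI changed.
first_rows = conn.execute("""
    SELECT date, location, kpi
    FROM (SELECT t.*,
                 LAG(kpi) OVER (PARTITION BY location ORDER BY date) AS prev_kpi
          FROM t) AS x
    WHERE prev_kpi IS NULL OR prev_kpi <> kpi
    ORDER BY date
""").fetchall()

for row in first_rows:
    print(row)
```

With this data the query returns the Nov 8, Dec 6 and Jan 10 rows, which are the three records the question asks for.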

You can use PARTITION BY together with ROW_NUMBER().
The query below works with your data:
SELECT [Date], [Flag] FROM (
SELECT [Date], [Flag], ROW_NUMBER() OVER
( PARTITION BY [Flag]
ORDER BY [Date]) row_num
FROM #test) t
WHERE t.row_num = 1

As I understand it, you want to extract the earliest date from each flag category.
select * from (
select
Date,
Location,
KPI,
Flag,
row_number() over(partition by Flag order by Date asc) as RN
from
Your_Table
) t
where t.RN = 1
This solution uses PARTITION BY to number the rows within each flag, so RN = 1 is the first occurrence of that flag.
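One difference between the ROW_NUMBER() answers and the lag() answer only shows up when a value comes back after changing. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later), partitions by location and kpi because the flag column is computed, and adds a hypothetical last row where the KPI returns to 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, location TEXT, kpi INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("2018-01-03", "A", 5), ("2018-01-10", "A", 5),
    ("2018-01-17", "A", 10), ("2018-01-24", "A", 10),
    ("2018-01-31", "A", 5),  # hypothetical row: KPI returns to an earlier value
])

# One row per distinct (location, kpi): the earliest date for each value.
by_row_number = conn.execute("""
    SELECT date, kpi
    FROM (SELECT date, kpi,
                 ROW_NUMBER() OVER (PARTITION BY location, kpi ORDER BY date) AS rn
          FROM t) AS x
    WHERE rn = 1
    ORDER BY date
""").fetchall()

# One row per run of adjacent equal values: every point where the KPI changes.
by_lag = conn.execute("""
    SELECT date, kpi
    FROM (SELECT date, kpi,
                 LAG(kpi) OVER (PARTITION BY location ORDER BY date) AS prev_kpi
          FROM t) AS x
    WHERE prev_kpi IS NULL OR prev_kpi <> kpi
    ORDER BY date
""").fetchall()

print(by_row_number)
print(by_lag)
```

Here the ROW_NUMBER() version returns two rows, while the lag() version also returns the 2018-01-31 row. Which one is right depends on whether a repeated value should count as a new first occurrence.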

Related

Detecting overlapped dates and updating the newest record SQL Server 2008

I want to detect all overlapping records (some are duplicates) and mark the ones uploaded later as OVER. So far I have a SELECT that returns the existing overlapping records and a CTE that sets the column to OVER.
My problem is adapting the SELECT so that only the newest record gets this value, and using it inside the CTE, as I'm unfamiliar with SQL.
The select:
select t.*
from testtable t where exists
(select 1 from testtable t2
where t.idd = t2.idd
AND t.id<>t2.id
AND t2.beg <= t.end
AND t.beg <= t2.end)
The half-finished CTE:
;with cte
as (select t.*, Row_number() over (partition by idd order by date_uploaded desc) RN
from testtable as t)
update cte set overlapped = 'OVER'
where RN > 1
and (overlapped is null or overlapped <> 'UNIQUE')
Example data, showing how it should look:
overlapped ID idd iduser iddate name beg end date_uploaded
UNIQUE 52 -1907372231 666 201802 sol 2018-09-01 2018-09-10 2018-09-12
OVER 53 -1907372231 666 201802 sol 2018-09-10 2018-09-12 2018-09-13
Notice how row 53's BEG date overlaps row 52's END date.
Any help with my problem is hugely appreciated.
Instead of updating the CTE, use a CASE expression when you SELECT from it:
;with cte
as (select t.*, Row_number() over (partition by idd order by date_uploaded desc) RN
from testtable as t)
SELECT CASE
WHEN RN > 1 and (overlapped is null or overlapped <> 'UNIQUE') THEN 'OVER'
ELSE overlapped
END AS overlapped
FROM cte
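A different way to get the marking described in the question is to combine the overlap test from the original SELECT with a comparison on date_uploaded. This is not the ROW_NUMBER() approach from the answer, just a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, keeps only the columns the logic needs, and adds a hypothetical row 54 that overlaps nothing. Rows without an earlier overlapping upload are marked UNIQUE here, which is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE testtable (
    id INTEGER, idd INTEGER, beg TEXT, "end" TEXT,
    date_uploaded TEXT, overlapped TEXT)""")
conn.executemany("INSERT INTO testtable VALUES (?, ?, ?, ?, ?, NULL)", [
    (52, -1907372231, "2018-09-01", "2018-09-10", "2018-09-12"),
    (53, -1907372231, "2018-09-10", "2018-09-12", "2018-09-13"),
    (54, -1907372231, "2018-10-01", "2018-10-05", "2018-09-14"),  # hypothetical row
])

# A row is OVER when another row with the same idd overlaps its date range
# and was uploaded earlier; otherwise it is marked UNIQUE.
conn.execute("""
    UPDATE testtable
    SET overlapped = CASE WHEN EXISTS (
            SELECT 1 FROM testtable AS t2
            WHERE t2.idd = testtable.idd
              AND t2.id <> testtable.id
              AND t2.beg <= testtable."end"
              AND testtable.beg <= t2."end"
              AND t2.date_uploaded < testtable.date_uploaded
        ) THEN 'OVER' ELSE 'UNIQUE' END
""")

result = conn.execute(
    "SELECT id, overlapped FROM testtable ORDER BY id").fetchall()
print(result)
```

In SQL Server the same EXISTS condition could be placed in the WHERE clause of an UPDATE, but that variant is not tested here.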

How to get second row value by player in SQL Server and insert it in other table

How can I get the second row value per player in SQL Server and insert it into another table?
For example, I have a table like this:
PlayerId VIpLevel StartDate
1 1 2000-01-01 00:10
1 4 2001-01-01 00:10
1 5 2001-01-11 00:10
2 1 2000-01-01 00:10
2 3 2000-01-02 00:10
2 7 2000-05-02 00:10
So I want to get the second VipLevel for player 1 and player 2, ordered by StartDate DESC.
So far I have found this, but it doesn't work for me:
UPDATE #Results
SET [PreviousVIPLevel] = (
SELECT
ROW_NUMBER() OVER(ORDER BY StartDate DESC) AS RowNum,
PlayerId,
VIPLevelId
FROM #table
) foo
WHERE RowNum =2
I assume you have a playerid in your #Results table so you can update each player's record with their 2nd highest level. Then you need to use partition by in your row_number function and join accordingly:
UPDATE A
SET PreviousVIPLevel= B.VIPLevelId
FROM #Results A
JOIN (SELECT
ROW_NUMBER() OVER(PARTITION BY PlayerId ORDER BY StartDate DESC) AS RowNum,
PlayerId,
VIPLevelId
FROM #table
) B ON A.PLayerId = B.PlayerId AND B.RowNum = 2
Your current query cannot work for multiple reasons. First, you cannot update a field selecting multiple columns. That's why I used a join instead. Second, if you were able to get yours to work, it would update all records to the same value since you are missing the partition by clause.
There is actually no update involved. It's an insert statement. Assuming the destination table is created:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY PlayerId ORDER BY StartDate DESC) AS RowNum,
PlayerId,
VIPLevelId
FROM #table
)
INSERT INTO Destination
SELECT * FROM CTE WHERE RowNum = 2
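The insert can be checked against the sample data from the question. This is a sketch under assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server, and the table names vip_history and destination are hypothetical stand-ins for #table and the destination table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vip_history (PlayerId INTEGER, VIPLevelId INTEGER, StartDate TEXT)")
conn.executemany("INSERT INTO vip_history VALUES (?, ?, ?)", [
    (1, 1, "2000-01-01 00:10"), (1, 4, "2001-01-01 00:10"), (1, 5, "2001-01-11 00:10"),
    (2, 1, "2000-01-01 00:10"), (2, 3, "2000-01-02 00:10"), (2, 7, "2000-05-02 00:10"),
])
conn.execute(
    "CREATE TABLE destination (PlayerId INTEGER, PreviousVIPLevel INTEGER)")

# Number each player's rows from newest to oldest and keep the second one.
conn.execute("""
    INSERT INTO destination (PlayerId, PreviousVIPLevel)
    SELECT PlayerId, VIPLevelId
    FROM (SELECT PlayerId, VIPLevelId,
                 ROW_NUMBER() OVER (PARTITION BY PlayerId
                                    ORDER BY StartDate DESC) AS RowNum
          FROM vip_history) AS ranked
    WHERE RowNum = 2
""")

second_levels = conn.execute(
    "SELECT PlayerId, PreviousVIPLevel FROM destination ORDER BY PlayerId").fetchall()
print(second_levels)
```

Listing the target columns explicitly avoids inserting the RowNum helper column, which SELECT * would carry along.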

How to get next minimum date that is not within 30 days and use as reference point in SQL?

I have a subset of records that look like this:
ID DATE
A 2015-09-01
A 2015-10-03
A 2015-10-10
B 2015-09-01
B 2015-09-10
B 2015-10-03
...
For each ID the first minimum date is the first index record. Now I need to exclude cases within 30 days of the index record, and any record with a date greater than 30 days becomes another index record.
For example, for ID A, 2015-09-01 and 2015-10-03 are both index records and would be retained since they are more than 30 days apart. 2015-10-10 would be dropped because it's within 30 days of the 2nd index case.
For ID B, 2015-09-10 would be dropped and would NOT be an index case because it's within 30 days of the 1st index record. 2015-10-03 would be retained because it's greater than 30 days of the 1st index record and would be considered the 2nd index case.
The output should look like this:
ID DATE
A 2015-09-01
A 2015-10-03
B 2015-09-01
B 2015-10-03
How do I do this in SQL server 2012? There's no limit to how many dates an ID can have, could be just 1 to as many as 5 or more. I'm fairly basic with SQL so any help would be greatly appreciated.
This works for your example; #test is your table with the data:
;with cte1
as
(
select
ID, Date,
row_number()over(partition by ID order by Date) groupID
from #test
),
cte2
as
(
select ID, Date, Date as DateTmp, groupID, 1 as getRow from cte1 where groupID=1
union all
select
c1.ID,
c1.Date,
case when datediff(Day, c2.DateTmp, c1.Date) > 30 then c1.Date else c2.DateTmp end as DateTmp,
c1.groupID,
case when datediff(Day, c2.DateTmp, c1.Date) > 30 then 1 else 0 end as getRow
from cte1 c1
inner join cte2 c2 on c2.groupID+1=c1.groupID and c2.ID=c1.ID
)
select ID, Date from cte2 where getRow=1 order by ID, Date
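The rule the recursive CTE implements is sequential: each row is compared with the most recent index record for its ID, not with the row directly before it. A minimal Python sketch of that rule, using the sample data from the question (the function name index_records is hypothetical):

```python
from datetime import date

def index_records(rows, gap_days=30):
    """Keep a row only if it is more than gap_days after the last kept row for its ID."""
    kept = []
    last_index = {}  # ID -> date of the most recent index record
    for rec_id, d in sorted(rows):
        anchor = last_index.get(rec_id)
        if anchor is None or (d - anchor).days > gap_days:
            kept.append((rec_id, d))
            last_index[rec_id] = d  # this row becomes the new reference point
    return kept

rows = [
    ("A", date(2015, 9, 1)), ("A", date(2015, 10, 3)), ("A", date(2015, 10, 10)),
    ("B", date(2015, 9, 1)), ("B", date(2015, 9, 10)), ("B", date(2015, 10, 3)),
]

result = index_records(rows)
for rec_id, d in result:
    print(rec_id, d.isoformat())
```

Because the reference point only moves when a row is kept, B's 2015-09-10 row is dropped and 2015-10-03 is still measured against 2015-09-01.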
select * from
(
select ID,DATE_, case when DATE_DIFF is null then 1 when date_diff>30 then 1 else 0 end comparison from
(
select ID, DATE_ ,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;
Tried in Oracle Database. Similar functions exist in SQL Server too.
I partition by the ID column and, ordering by the DATE_ field, compare the date in each row with the date in the previous row. The very first row for a given ID returns null, and that row is required in the output as the first index record. For all other rows, we return 1 when the date difference from the previous row is greater than 30.
Lag function in transact sql
Case function in transact sql
The logic explained in your question is inconsistent: in one place you say to compare against the first index record, and in another you compare against the immediately preceding record.
This works for immediate records:
with cte
as
(
select *, ROW_NUMBER() over (partition by id order by datee) as rownum
from #test
)
select *,datediff(day,beforedate,datee)
from cte t1
cross apply
(Select isnull(max(Datee),t1.datee) as beforedate from cte t2 where t1.id =t2.id and t2.rownum<t1.rownum) b
where datediff(day,beforedate,datee)= 0 or datediff(day,beforedate,datee)>=30
This works for constant base record:
select *,datediff(day,basedate,datee) from #test t1
cross apply
(select min(Datee) as basedate from #test t2 where t1.id=t2.id)b
where datediff(day,basedate,datee)>=30 or datediff(day,basedate,datee)=0
Try this solution.
Sample demo
with diffs as (
select t1.id,t1.dt strtdt,t2.dt enddt,datediff(dd,t1.dt,t2.dt) daysdiff
from t t1
join t t2 on t1.id=t2.id and t1.dt<t2.dt
)
, y as (
select id,strtdt,enddt
from (
select id,strtdt,enddt,row_number() over(partition by id,strtdt order by daysdiff) as rn
from diffs
where daysdiff > 30
) x
where rn=1
)
,z as (
select *,coalesce(lag(enddt) over(partition by id order by strtdt),strtdt) prevend
from y)
select id,strtdt from z where strtdt=prevend
union
select id,enddt from z where strtdt=prevend

Getting all fields from table filtered by MAX(Column1)

I have table with some data, for example
ID Specified TIN Value
----------------------
1 0 tin1 45
2 1 tin1 34
3 0 tin2 23
4 3 tin2 47
5 3 tin2 12
I need to get the full rows that have MAX(Specified) for each TIN. If several rows share the maximum (IDs 4 and 5 in the example), I must take the last one (ID 5).
The final result must be:
ID Specified TIN Value
-----------------------
2 1 tin1 34
5 3 tin2 12
This will give the desired result with using window function:
;with cte as(select *, row_number() over (partition by tin order by specified desc, id desc) as rn
from tablename)
select * from cte where rn = 1
Edit: Updated query after question edit.
Here is the fiddle
http://sqlfiddle.com/#!9/20e1b/1/0
SELECT * FROM TBL WHERE ID IN (
SELECT max(id) FROM
TBL WHERE SPECIFIED IN
(SELECT MAX(SPECIFIED) FROM TBL
GROUP BY TIN)
group by specified)
I am sure we can simplify it further, but this will work.
select * from tbl where id =(
SELECT MAX(ID) FROM
tbl where specified =(SELECT MAX(SPECIFIED) FROM tbl))
One method is to use window functions, row_number():
select t.*
from (select t.*, row_number() over (partition by tin
order by specified desc, id desc
) as seqnum
from t
) t
where seqnum = 1;
However, if you have an index on (tin, specified, id) and on id, the most efficient method is:
select t.*
from t
where t.id = (select top 1 t2.id
from t t2
where t2.tin = t.tin
order by t2.specified desc, t2.id desc
);
The reason this is better is that the index will be used for the subquery, and then the index on id will be used for the outer query as well. This is highly efficient. Although the index can also be used for the window function, the resulting execution plan probably requires scanning the entire table.
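Both forms can be compared on the sample data from the question. This is a sketch under assumptions: it uses Python's sqlite3 module (SQLite 3.25 or later), so LIMIT 1 stands in for SQL Server's TOP 1, and the index name is hypothetical. It only checks that the two queries return the same rows; it says nothing about their relative speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, specified INTEGER, tin TEXT, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (1, 0, "tin1", 45), (2, 1, "tin1", 34), (3, 0, "tin2", 23),
    (4, 3, "tin2", 47), (5, 3, "tin2", 12),
])
# Hypothetical index name; the columns follow the answer above.
conn.execute("CREATE INDEX idx_t_tin_specified_id ON t (tin, specified, id)")

# Window-function form: rank rows within each tin, keep the top one.
window_rows = conn.execute("""
    SELECT id, specified, tin, value
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY tin
                                    ORDER BY specified DESC, id DESC) AS seqnum
          FROM t) AS ranked
    WHERE seqnum = 1
    ORDER BY id
""").fetchall()

# Correlated-subquery form: look up the winning id for each row's tin.
subquery_rows = conn.execute("""
    SELECT t.id, t.specified, t.tin, t.value
    FROM t
    WHERE t.id = (SELECT t2.id
                  FROM t AS t2
                  WHERE t2.tin = t.tin
                  ORDER BY t2.specified DESC, t2.id DESC
                  LIMIT 1)
    ORDER BY t.id
""").fetchall()

print(window_rows)
print(subquery_rows)
```

Both queries return the rows with ID 2 and ID 5, matching the expected result in the question.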

SQL query to count number of objects in each state on each day

Given a set of database records that record the date when an object enters a particular state, I would like to produce a query that shows how many objects are in each state on any particular date. The results will be used to produce trend reports showing how the number of objects in each state changes over time.
I have a table like the following that records the date when an object enters a particular state:
ObjID EntryDate State
----- ---------- -----
1 2014-11-01 A
1 2014-11-04 B
1 2014-11-06 C
2 2014-11-01 A
2 2014-11-03 B
2 2014-11-10 C
3 2014-11-03 B
3 2014-11-08 C
There are an arbitrary number of objects and states.
I need to produce a query that returns the number of objects in each state on each date. The result would look like the following:
Date State Count
---------- ----- -----
2014-11-01 A 2
2014-11-01 B 0
2014-11-01 C 0
2014-11-02 A 2
2014-11-02 B 0
2014-11-02 C 0
2014-11-03 A 1
2014-11-03 B 2
2014-11-03 C 0
2014-11-04 A 0
2014-11-04 B 3
2014-11-04 C 0
2014-11-05 A 0
2014-11-05 B 3
2014-11-05 C 0
2014-11-06 A 0
2014-11-06 B 2
2014-11-06 C 1
2014-11-07 A 0
2014-11-07 B 2
2014-11-07 C 1
2014-11-08 A 0
2014-11-08 B 1
2014-11-08 C 2
2014-11-09 A 0
2014-11-09 B 1
2014-11-09 C 2
2014-11-10 A 0
2014-11-10 B 0
2014-11-10 C 3
I'm working with an Oracle database.
I haven't been able to find an example that matches my case. The following questions look like they are asking for solutions to similar but different problems:
SQL Count Of Open Orders Each Day Between Two Dates
Mysql select count per category per day
Any help or hints that can be provided would be much appreciated.
SELECT EntryDate AS "Date", State, COUNT(DISTINCT ObjectId) AS "Count" GROUP BY EntryDate, State ORDER BY EntryDate, State;
I'm going to do a quick and dirty way to get numbers. You can choose your preferred method . . . using recursive CTEs, connect by, or a numbers table. So, the following generates the all combinations of the dates and states. It then uses a correlated subquery to count the number of objects in each state on each date:
with n as (
select rownum - 1 as n
from table t
),
dates as (
select mind + n.n as date
from (select min(date) as mind, max(date) as maxd from table) t cross join n
where mind + n.n <= maxd
)
select d.date, s.state,
(select count(*)
from (select t2.*, lead(date) over (partition by ObjId order by date) as nextdate
from table t2
) t2
where d.date >= t2.date and (d.date < t2.nextdate or t2.nextdate is null) and
s.state = t2.state
) as counts
from dates d cross join
(select distinct state from table t) s
This query will list how many objects ENTERED a particular state on each day, assuming each object only changes state ONCE a day. If objects change state more than once a day, you would need to use count(distinct objid):
select entrydate, state, count(objid)
from my_table
group by entrydate, state
order by entrydate, state
However, you are asking how many objects ARE in a particular state on each day, thus you would need a very different query to show that. Since you only provide that particular table in your example, I'll work with that table only:
select alldatestates.entrydate, alldatestates.state, count(statesbyday.objid)
from
(
select alldates.entrydate, allstates.state
from (select distinct entrydate from mytable) alldates,
(select distinct state from mytable) allstates
) alldatestates
left join
(
select alldates.entrydate, allobjs.objid, (select min(state) as state from mytable t1
where t1.objid = allobjs.objid and
t1.entrydate = (select max(entrydate) from mytable t2
where t2.objid = t1.objid and
t2.entrydate <= alldates.entrydate)) as state
from (select distinct entrydate from mytable) alldates,
(select distinct objid from mytable) allobjs
) statesbyday
on alldatestates.entrydate = statesbyday.entrydate and alldatestates.state = statesbyday.state
group by alldatestates.entrydate, alldatestates.state
order by alldatestates.entrydate, alldatestates.state
Of course, this query will be much simpler if you have a table for all the possible states and another one for all the possible object ids.
Also, probably you could find a query simpler than that, but this one works. The downside is, it could very quickly become an optimizer's nightmare! :)
Since not every state is recorded on every date, you need a CROSS JOIN to get the unique states and then a GROUP BY.
SELECT EntryDate,
C.State,
SUM(case when C.state = Table1.state then 1 else 0 end) as Count
FROM Table1
CROSS JOIN ( SELECT DISTINCT State FROM Table1) C
GROUP BY EntryDate, C.State
ORDER BY EntryDate
Try this query :
select EntryDate As Date, State, COUNT(ObjID) AS Count from table_name
GROUP BY EntryDate , State
ORDER BY State
You can try this with analytic function as well:
Select
Date,
State,
count(distinct obj) OVER (PARTITION BY EntryDate, State) count
from table
order by 1;
Select EntryDate as Date, State, Count(Distinct ObjID) as Count From Table_1
Group by EntryDate, State
Working out of SQL SERVER because I'm more familiar, but here's what I've got so far:
fiddle example (SQL SERVER but the only difference should be the date functions I think...): http://sqlfiddle.com/#!3/8b9748/2
WITH zeroThruNine AS (SELECT 0 AS n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9),
nums AS (SELECT 10*b.n+a.n AS n FROM zeroThruNine a, zeroThruNine b),
Dates AS (
SELECT DATEADD(d,n.n,(SELECT MIN(t.EntryDate) FROM #tbl t)) AS Date
FROM nums n
WHERE DATEADD(d,n.n,(SELECT MIN(t.EntryDate) FROM #tbl t))<=(SELECT MAX(t.EntryDate) FROM #tbl t)
), Data AS (
SELECT d.Date, t.ObjID, t.State, ROW_NUMBER() OVER (PARTITION BY t.ObjID, d.Date ORDER BY t.EntryDate DESC) as r
FROM Dates d, #tbl t
WHERE d.Date>=t.EntryDate
)
SELECT t.Date, t.State, COUNT(*)
FROM Data t
WHERE t.r=1
GROUP BY t.Date, t.State
ORDER BY t.Date, t.State
First, start by making a numbers table (see http://web.archive.org/web/20150411042510/http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html for examples). There are different ways to create numbers tables in different databases, so the first two WITH expressions just create a view of the numbers 0 through 99. I'm sure there are other ways, and you may need more than 100 numbers (representing 100 dates between the first and last dates you provided).
Once you have the Dates CTE, the main part is the Data CTE.
It takes each date from the Dates CTE and pairs it with the rows of #tbl (your table) whose state was recorded on or before that date. It also numbers those rows per ObjID and date in decreasing EntryDate order. That way, in the final query, we can just use WHERE t.r=1 to get the latest state for each ObjID on each date.
One issue: this produces rows for all dates, even those where nothing was recorded, but it returns nothing for zero counts. If you want those, you could left join this result to a view of distinct states and take 0 where no match is found.
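The logic shared by the working answers (for each calendar day, take every object's latest state on or before that day, then count per state) can be written out in plain Python to check expected values, zero counts included. This is a sketch on the sample data from the question; the function name daily_state_counts is hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

entries = [  # (ObjID, EntryDate, State) rows from the question
    (1, date(2014, 11, 1), "A"), (1, date(2014, 11, 4), "B"), (1, date(2014, 11, 6), "C"),
    (2, date(2014, 11, 1), "A"), (2, date(2014, 11, 3), "B"), (2, date(2014, 11, 10), "C"),
    (3, date(2014, 11, 3), "B"), (3, date(2014, 11, 8), "C"),
]

def daily_state_counts(entries):
    """Return {(day, state): count} for every day between the first and last entry."""
    states = sorted({s for _, _, s in entries})
    first = min(d for _, d, _ in entries)
    last = max(d for _, d, _ in entries)
    result = {}
    day = first
    while day <= last:
        latest = {}  # ObjID -> (EntryDate, State) of the newest entry on or before day
        for obj, d, s in entries:
            if d <= day and (obj not in latest or d > latest[obj][0]):
                latest[obj] = (d, s)
        counts = Counter(s for _, s in latest.values())
        for s in states:
            result[(day, s)] = counts.get(s, 0)  # explicit zero for absent states
        day += timedelta(days=1)
    return result

counts = daily_state_counts(entries)
for (day, state), n in sorted(counts.items()):
    print(day.isoformat(), state, n)
```

An object with no entry yet on a given day (object 3 before 2014-11-03) is simply not counted, which matches the expected output in the question.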