Get one record over period with condition - sql

I have a big query that builds a history of object changes. In short, its result looks like this:
id changedOn recordtype
1 2019-12-5 history
1 2020-01-1 history
1 2020-01-7 actual
2 2018-10-9 history
The result I want:
id changedOn recordtype
1 2019-12-5 history
1 2020-01-7 actual
2 2018-10-9 history
If there are two or more records in the same month for an id, I want to omit the history records for that month.
I would like to avoid a cursor if possible, but I'm stuck.

If you want one record per month with a preference for "actual", then use row_number():
select t.*
from (select t.*,
row_number() over (partition by id, year(changedOn), month(changedOn) order by recordtype) as seqnum
from t
) t
where seqnum = 1;
If you want all "actual" records for a month -- and then if there are none -- all the history records, I would recommend logic like this:
select t.*
from t
where t.recordtype = 'actual' or
(t.recordtype = 'history' and
not exists (select 1
from t t2
where t2.id = t.id and
t2.recordtype = 'actual' and
year(t2.changedon) = year(t.changedon) and
month(t2.changedon) = month(t.changedon)
));
These two approaches are subtly different. But you will only notice the differences if you have multiple "actual"s or "history"s in a single month for a single id.
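To make the difference concrete, here is a minimal sketch that runs both queries against the question's sample rows using Python's built-in sqlite3 module. It is an illustration under assumptions rather than the original environment: the dates are rewritten in ISO format, and SQLite's strftime('%Y-%m', ...) stands in for year() and month(), which SQLite does not provide. The helper names (make_db, one_per_month, actual_else_history) are hypothetical.

```python
import sqlite3

# Sample rows from the question; dates rewritten in ISO format (an assumption,
# so that SQLite's strftime can parse them).
ROWS = [
    (1, "2019-12-05", "history"),
    (1, "2020-01-01", "history"),
    (1, "2020-01-07", "actual"),
    (2, "2018-10-09", "history"),
]


def make_db(rows):
    """Create an in-memory table t holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer, changedOn text, recordtype text)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    return conn


def one_per_month(conn):
    """row_number() variant: exactly one row per id and month, 'actual' first."""
    return conn.execute("""
        select id, changedOn, recordtype
        from (select src.*,
                     row_number() over (
                         partition by id, strftime('%Y-%m', changedOn)
                         order by recordtype) as seqnum
              from t as src) as ranked
        where seqnum = 1
        order by id, changedOn
    """).fetchall()


def actual_else_history(conn):
    """not exists variant: every 'actual' row, plus 'history' rows of months
    that have no 'actual' row for the same id."""
    return conn.execute("""
        select cur.id, cur.changedOn, cur.recordtype
        from t as cur
        where cur.recordtype = 'actual'
           or (cur.recordtype = 'history'
               and not exists (
                   select 1
                   from t as other
                   where other.id = cur.id
                     and other.recordtype = 'actual'
                     and strftime('%Y-%m', other.changedOn)
                         = strftime('%Y-%m', cur.changedOn)))
        order by cur.id, cur.changedOn
    """).fetchall()
```

On the question's data both functions return the same three rows. They diverge only when an id has several rows of the same type in one month: the first keeps one of them, the second keeps all of them.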

Just remove the records whose changedOn is not the most recent for their id and recordtype:
select * from tbl a
where not exists
(select 1 from tbl b where a.id = b.id and a.recordtype = b.recordtype and a.changedOn < b.changedOn )

Related

Oracle SQL: check amount of active users on given date (check closest date of grouped field)

I have a table that holds the status history of users:
ID USERID MODIFIED STATUS
1 1 01.01.2020 inactive
2 1 01.07.2020 active
3 2 04.08.2020 active
4 2 04.06.2020 active
5 2 01.08.2020 inactive
6 2 01.10.2020 active
7 3 01.09.2020 inactive
I want to provide a date, e.g. 01.07.2020, and find out how many user IDs were active on that day.
I therefore need to find, for each userid, the modified date that is closest to but not after 01.07.2020.
Desired result for 01.07.2020:
ID USERID MODIFIED STATUS
2 1 01.07.2020 active
4 2 04.06.2020 active
From there I could just sum the status, and see I had two users active on the checked date of 01.07.2020.
Current approach for first step:
select max(id), userid, max(modified)
keep (dense_rank first order by modified) as id
from MY_TABLE
where modified <= '01.07.2020'
group by userid;
It does not yet give fully correct results.
The final step would then be a simple sum, I assume, something like:
Select sum(case when status = 'active' then 1 else 0 end) as "active_users"
from MY_TABLE t1
inner join (
select max(id)
keep (dense_rank first order by modified) as id
from MY_TABLE
where modified <= '01.07.2020'
group by userid
) t2 on t1.id = t2.id
You can use row_number() to get the last status as of that date:
select count(*)
from (select t.*,
row_number() over (partition by userid order by modified desc) as seqnum
from my_table t
where t.modified <= date '2020-07-01'
) t
where seqnum = 1 and status = 'active';
Another option is a correlated subquery:
select count(*)
from my_table t
where t.modified = (select max(t2.modified)
from my_table t2
where t2.userid = t.userid and
t2.modified <= date '2020-07-01'
) and
t.status = 'active';
Or, you can use two levels of aggregation:
select count(*)
from (select userid,
max(status) keep (dense_rank first order by modified desc) as status
from my_table t
where t.modified <= date '2020-07-01'
group by userid
) t
where status = 'active';
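The "latest status as of a date" logic can be checked against the question's rows with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, so it is an approximation under assumptions: dates are stored as ISO text (which sorts chronologically), the status values are lowercase as in the question's table, and the function name active_users_on is hypothetical.

```python
import sqlite3

# Rows from the question: (id, userid, modified, status).
# Dates are converted from DD.MM.YYYY to ISO text so they compare correctly.
ROWS = [
    (1, 1, "2020-01-01", "inactive"),
    (2, 1, "2020-07-01", "active"),
    (3, 2, "2020-08-04", "active"),
    (4, 2, "2020-06-04", "active"),
    (5, 2, "2020-08-01", "inactive"),
    (6, 2, "2020-10-01", "active"),
    (7, 3, "2020-09-01", "inactive"),
]


def active_users_on(rows, day):
    """Count users whose most recent status on or before `day` is 'active'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table my_table (id integer, userid integer, modified text, status text)"
    )
    conn.executemany("insert into my_table values (?, ?, ?, ?)", rows)
    (count,) = conn.execute("""
        select count(*)
        from (select t.*,
                     row_number() over (partition by userid
                                        order by modified desc) as seqnum
              from my_table t
              where t.modified <= ?) as latest
        where seqnum = 1 and status = 'active'
    """, (day,)).fetchone()
    conn.close()
    return count
```

Calling active_users_on(ROWS, "2020-07-01") counts users 1 and 2, matching the desired result in the question. A date such as "2020-08-02" is a useful second check, because user 2 has earlier active rows but is inactive at that point.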
If you only need the number of users who have any 'active' record on or before the supplied date (which is not necessarily their latest status as of that date), the following would work:
select count(distinct userid)
from my_table
where modified <= date '2020-07-01'
and status = 'active'

SQL - delete record where sum = 0

I have a table which has below values:
If the sum of the values for the same ID is 0, I want to delete those rows from the table. The result should look like this:
The code I have:
DELETE FROM tmp_table
WHERE ID in
(SELECT ID
FROM tmp_table WITH(NOLOCK)
GROUP BY ID
HAVING SUM(value) = 0)
It only deletes the rows with ID = 2.
UPD: Including an additional example:
The rows in yellow need to be deleted.
Your query is working correctly: the only group that totals zero is id 2. The others contain sub-groups that total zero (such as the first two rows with id 1), but the total for all of those records is -3.
What you want is a much more complex algorithm, closer to "bin packing", that removes the sub-groups which sum to zero.
You can do what you want using window functions -- by enumerating the values for each id. Taking your approach using a subquery:
with t as (
select t.*,
row_number() over (partition by id, value order by id) as seqnum
from tmp_table t
)
delete from t
where exists (select 1
from t t2
where t2.id = t.id and t2.value = - t.value and t2.seqnum = t.seqnum
);
You can also do this with a second layer of window functions:
with t as (
select t.*,
row_number() over (partition by id, value order by id) as seqnum
from tmp_table t
),
tt as (
select t.*, count(*) over (partition by id, abs(value), seqnum) as cnt
from t
)
delete from tt
where cnt = 2;
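The pairing idea can be tried out with a small sketch in Python's built-in sqlite3 module. Several things here are assumptions: the sample rows are hypothetical (the question's table is not reproduced in the text), and because SQLite cannot delete through a CTE the way SQL Server can, the sketch deletes by rowid instead. The function name delete_zero_pairs is also hypothetical.

```python
import sqlite3

# Hypothetical sample rows (id, value); the original table is not shown.
ROWS = [
    (1, 5), (1, -5), (1, -3),   # 5 and -5 cancel, -3 stays
    (2, 2), (2, -2),            # both cancel
    (3, 4), (3, 4), (3, -4),    # one 4 cancels with -4, the other 4 stays
]


def delete_zero_pairs(rows):
    """Delete rows that can be paired with an opposite value for the same id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tmp_table (id integer, value integer)")
    conn.executemany("insert into tmp_table values (?, ?)", rows)
    conn.execute("""
        with t as (
            -- number repeated values so the n-th +x pairs with the n-th -x
            select rowid as rid, id, value,
                   row_number() over (partition by id, value
                                      order by rowid) as seqnum
            from tmp_table
        )
        delete from tmp_table
        where rowid in (
            select a.rid
            from t as a
            where exists (select 1
                          from t as b
                          where b.id = a.id
                            and b.value = -a.value
                            and b.seqnum = a.seqnum))
    """)
    remaining = conn.execute(
        "select id, value from tmp_table order by id, value"
    ).fetchall()
    conn.close()
    return remaining
```

Note that this only cancels exact opposite pairs. It does not find larger sub-groups that sum to zero (for example 1, 2 and -3), which is the harder problem mentioned above.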

SQL Query to update earliest date for unique combinations

I am trying to update the Contract_Start date to be the earliest date for each unique Company and Route combination. For example, the Contract_Start for the first three records should be 1/15/12 as Company = 1 and Route = 1 for all three. The Contract_Start would then change to 1/20/12 for record 4 as that is the earliest date for the combination of Company = 1 and Route = 2. Any help would be greatly appreciated.
Company Route Driver Date Contract_Start
1 1 A 1/29/12
1 1 B 2/3/12
1 1 C 1/15/12
1 2 A 1/28/12
1 2 B 1/20/12
2 1 A 1/7/12
2 1 B 1/16/12
2 2 A 2/9/12
1 2 B 1/4/12
An UPDATE with a correlated subquery can solve your problem:
UPDATE myTable
SET Contract_Start = (
SELECT MIN(TI.Date)
FROM myTable TI
WHERE TI.Company = myTable.Company
AND TI.Route = myTable.Route
);
Use a window function to compute Contract_Start. You don't need to store it in the table.
select Company, Route, Driver, Date, min(Date) over(partition by Company, Route) as Contract_Start
from myTable;
Use an updatable CTE with window functions:
with toupdate as (
select t.*, min(date) over (partition by company, route) as min_date
from t
)
update toupdate
set contract_start = min_date;
Updatable CTE/subqueries/views are a very handy feature in SQL Server and quite powerful when combined with window functions.
This approach ranks the dates within each Company and Route combination and joins back to the earliest one:
UPDATE TB SET Contract_Start = CE.Date1
FROM Table_Name TB JOIN (
SELECT Company,Route,date1, Count(*) AS Countnum,
Row_Number() OVER(PARTITION BY Company, Route ORDER BY date1 ASC) AS Rownumber
FROM Table_Name GROUP BY Company,Route,date1) as CE ON TB.Company = CE.Company AND TB.Route = CE.Route
WHERE Rownumber = 1;
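As a quick check of the "earliest date per Company and Route" logic, here is a minimal sketch using Python's built-in sqlite3 module and the rows from the question. It uses the correlated-subquery form, since SQLite has no updatable CTEs. The table name contracts, the ISO-formatted dates, and the function name fill_contract_start are assumptions made for the illustration.

```python
import sqlite3

# Rows from the question: (company, route, driver, date).
# Dates are rewritten from M/D/YY to ISO text so MIN() orders them correctly.
ROWS = [
    (1, 1, "A", "2012-01-29"),
    (1, 1, "B", "2012-02-03"),
    (1, 1, "C", "2012-01-15"),
    (1, 2, "A", "2012-01-28"),
    (1, 2, "B", "2012-01-20"),
    (2, 1, "A", "2012-01-07"),
    (2, 1, "B", "2012-01-16"),
    (2, 2, "A", "2012-02-09"),
    (1, 2, "B", "2012-01-04"),
]


def fill_contract_start(rows):
    """Set contract_start to the earliest date of each (company, route) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table contracts (company integer, route integer, "
        "driver text, date text, contract_start text)"
    )
    conn.executemany(
        "insert into contracts (company, route, driver, date) values (?, ?, ?, ?)",
        rows,
    )
    conn.execute("""
        update contracts
        set contract_start = (select min(c2.date)
                              from contracts c2
                              where c2.company = contracts.company
                                and c2.route = contracts.route)
    """)
    result = conn.execute("""
        select distinct company, route, contract_start
        from contracts
        order by company, route
    """).fetchall()
    conn.close()
    return result
```

With the nine rows as listed, the last row (1/4/12) becomes the earliest date for Company = 1 and Route = 2, so that pair gets 2012-01-04 rather than 2012-01-20.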

PostgreSQL array_agg but with stop condition

I have a table with records of children, and I want to get comma-separated results in descending order by month, but with a stop condition based on the child's status in each month. If the status is 0, push the month to the array. If the status is 1, don't push it, stop there, and don't check the records of earlier months.
Table
Desired Output:
I have tried it this way, which gives me all the months, but I don't know how to stop at the status = 1 condition for each child:
SELECT name, ARRAY_AGG(month ORDER BY month DESC)
FROM children
GROUP BY name
I think of this as:
SELECT name, ARRAY_AGG(month ORDER BY month DESC)
FROM (SELECT c.*,
MAX(c.month) FILTER (WHERE c.status = 1) OVER (PARTITION BY c.name) as last_1_month
FROM children c
) c
WHERE last_1_month IS NULL OR month > last_1_month
GROUP BY name;
This logic simply gets the last month where status = 1 and then chooses all later months (or every month, for a child with no status = 1 row).
If month is actually sequential with no gaps then you can do:
SELECT name,
(ARRAY_AGG(month ORDER BY month DESC))[1:MAX(month) - MAX(month) FILTER (WHERE status = 1)]
FROM children c
GROUP BY name;
I'd use a not exists condition to filter out the records you don't want:
SELECT name, ARRAY_AGG(month ORDER BY month DESC)
FROM children a
WHERE NOT EXISTS (SELECT *
FROM children b
WHERE a.name = b.name AND b.status = 1 and a.month <= b.month)
GROUP BY name
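The NOT EXISTS filter can be exercised with a small sketch in Python's built-in sqlite3 module. SQLite has no ARRAY_AGG, so the sketch runs the same filter in SQL and collects the months per child in Python. The sample rows are hypothetical (the question's table is not reproduced in the text), month is assumed to be an integer, and the function name months_after_last_status_1 is made up for the illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample rows: (name, month, status).
ROWS = [
    ("ann", 1, 0), ("ann", 2, 1), ("ann", 3, 0), ("ann", 4, 0),
    ("bob", 1, 0), ("bob", 2, 0),
    ("cat", 1, 0), ("cat", 2, 1),
]


def months_after_last_status_1(rows):
    """Per child, list months (descending) that come after the last status = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table children (name text, month integer, status integer)")
    conn.executemany("insert into children values (?, ?, ?)", rows)
    query = """
        select a.name, a.month
        from children a
        where not exists (select 1
                          from children b
                          where b.name = a.name
                            and b.status = 1
                            and a.month <= b.month)
        order by a.name, a.month desc
    """
    result = defaultdict(list)
    for name, month in conn.execute(query):
        result[name].append(month)  # rows arrive already sorted by month desc
    conn.close()
    return dict(result)
```

One consequence worth noticing: a child whose latest month has status = 1 ("cat" in the hypothetical data) has no qualifying rows and therefore does not appear in the output at all, in this sketch and in the GROUP BY query alike.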

SQL retrieve recent record

I want to retrieve the most recent non-null Topic 1 Scores value (by date) for each detailsID. There are only detailsID 2 and 3 here, so only two rows should be returned.
What about removing `Topic 1 Scores` from GROUP BY detailsID, `Topic 1 Scores`?
Use a subquery to get the max and then join to it.
SELECT a.detailsID,`Topic 1 Scores`, a.Date
FROM Information.scores AS a
JOIN (SELECT detailsID, MAX(Date) "MaxDate"
FROM Information.scores
WHERE `Topic 1 Scores` IS NOT NULL
GROUP BY detailsID) Maxes
ON a.detailsID = Maxes.detailsID
AND a.Date = Maxes.MaxDate
WHERE `Topic 1 Scores` IS NOT NULL
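The join-to-max pattern can be verified with a short sketch in Python's built-in sqlite3 module. The sample rows are hypothetical, because the question's data is not reproduced in the text. The table is created as plain scores (SQLite has no Information schema here), and the function name latest_scores is an assumption.

```python
import sqlite3

# Hypothetical sample rows: (detailsID, "Topic 1 Scores", Date).
ROWS = [
    (2, 55, "2021-01-10"),
    (2, 70, "2021-02-10"),
    (2, None, "2021-03-10"),   # most recent row, but the score is null
    (3, 40, "2021-01-05"),
    (3, None, "2021-01-20"),
]


def latest_scores(rows):
    """Return the most recent non-null score per detailsID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'create table scores (detailsID integer, "Topic 1 Scores" integer, Date text)'
    )
    conn.executemany("insert into scores values (?, ?, ?)", rows)
    result = conn.execute("""
        select a.detailsID, a."Topic 1 Scores", a.Date
        from scores as a
        join (select detailsID, max(Date) as MaxDate
              from scores
              where "Topic 1 Scores" is not null
              group by detailsID) as maxes
          on a.detailsID = maxes.detailsID
         and a.Date = maxes.MaxDate
        where a."Topic 1 Scores" is not null
        order by a.detailsID
    """).fetchall()
    conn.close()
    return result
```

The null filter has to appear inside the subquery, otherwise MaxDate could point at a row whose score is null and the join would return nothing for that detailsID.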
Assuming SQL Server:
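SELECT detailsID, Date, [Topic 1 Scores]
FROM (
SELECT detailsID, Date, [Topic 1 Scores],
ROW_NUMBER() OVER (PARTITION BY detailsID ORDER BY Date DESC) AS RowNumber
FROM Information.scores
WHERE [Topic 1 Scores] IS NOT NULL
) s
WHERE RowNumber = 1;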
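Try grouping by detailsID instead of by the date. This returns the latest scored date per detailsID, which you can then join back to fetch the score:
SELECT detailsID, MAX(Date) AS "Date" FROM Information.scores WHERE `Topic 1 Scores` IS NOT NULL GROUP BY detailsID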