How to get unique row numbers in SQL

How can I get only the first row for each date from the result of the query below? I need the latest record for each date, so I partitioned by created_date. In some places, however, I get the same row number more than once and cannot produce the expected output. The query, the current output, and the expected output are below.
What changes do I need to make to get the expected output? Thank you.
WITH ctetable
AS (
SELECT created_date BPMDate
,tenor
,row_number() OVER (
PARTITION BY created_date ORDER BY created_date DESC
) rw
FROM table1 a
INNER JOIN table2 b ON a.case_id = b.case_id
AND a.eligible_transaction = 'true'
AND to_date(a.created_date) >= '2020-10-01'
AND to_date(a.created_date) <= '2020-10-05'
AND case_status = 'Completed'
)
SELECT BPMDate
,Tenor
,rw
FROM ctetable
Current output:
date tenor rw
2020-10-05 13:24:15.0 1W 1
2020-10-05 12:15:43.0 1Y 1
2020-10-05 12:15:43.0 1Y 2
2020-10-01 13:30:59.0 1W 1
2020-10-01 13:30:59.0 1W 2
Expected output:
date tenor rw
2020-10-05 13:24:15.0 1W 1
2020-10-01 13:30:59.0 1W 1
Regards,
Viresh

That would be:
with ctetable as (
select created_date bpmdate, tenor,
row_number() over (partition by date(created_date) order by created_date desc) rn
from table1 a
inner join table2 b
on a.case_id = b.case_id
and a.eligible_transaction = 'true'
and to_date(a.created_date) >= '2020-10-01'
and to_date(a.created_date) <= '2020-10-05'
and case_status = 'Completed'
)
select bpmdate, tenor, rn
from ctetable
where rn = 1
Changes to your original code:
Remove the time portion of the date in the partition by clause of the window function. You didn't say which database you are using; I used date(), but the function may differ in your database (trunc() in Oracle, date_trunc() in Postgres, and so on).
Filter the outer query on the row number being equal to 1.

You seem to want the latest row per day:
select BPMDate, Tenor, seqnum as rw
from (select t.*,
row_number() over (partition by trunc(bpmdate) order by bpmdate desc) as seqnum
from ctetable t
) t
where seqnum = 1;
Note: I don't know whether your database supports trunc(); it stands in for whatever function extracts the date part of the column.
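The fix both answers describe (partition by the calendar day, not the full timestamp, then keep row number 1) can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module as a stand-in database; the single table cases is a hypothetical replacement for the table1/table2 join, and SQLite's date() plays the role of trunc() or to_date().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the joined table1/table2 result.
conn.execute("CREATE TABLE cases (created_date TEXT, tenor TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [
        ("2020-10-05 13:24:15", "1W"),
        ("2020-10-05 12:15:43", "1Y"),
        ("2020-10-05 12:15:43", "1Y"),
        ("2020-10-01 13:30:59", "1W"),
        ("2020-10-01 13:30:59", "1W"),
    ],
)

LATEST_PER_DAY = """
WITH ctetable AS (
    SELECT created_date AS bpmdate,
           tenor,
           -- date() drops the time part, so every row of a day shares a partition
           ROW_NUMBER() OVER (
               PARTITION BY date(created_date)
               ORDER BY created_date DESC
           ) AS rn
    FROM cases
)
SELECT bpmdate, tenor, rn
FROM ctetable
WHERE rn = 1
ORDER BY bpmdate DESC
"""

latest_rows = conn.execute(LATEST_PER_DAY).fetchall()
for row in latest_rows:
    print(row)
```

With the sample rows from the question this prints one row per day, matching the expected output. Window functions need SQLite 3.25 or newer.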

Related

SQL How to subtract 2 row values of a same column based on same key

How to extract the difference of a specific column of multiple rows with same id?
Example table:
id  prev_val  new_val  date
--  --------  -------  ----------------
1   0         1        2020-01-01 10:00
1   1         2        2020-01-01 11:00
2   0         1        2020-01-01 10:00
2   1         2        2020-01-02 10:00
expected result:
id  duration_in_hours
--  -----------------
1   1
2   24
summary:
with id=1, (2020-01-01 10:00 - 2020-01-01 11:00) is 1 hour;
with id=2, (2020-01-01 10:00 - 2020-01-02 10:00) is 24 hours
Can we achieve this with SQL?
This solution should be an effective way:
with pd as (
select
id,
max(date) filter (where prev_val = 0) as prev_ts,
max(date) filter (where prev_val = 1) as new_ts
from
my_table
group by
id )
select
id,
new_ts - prev_ts as diff
from
pd;
If you need the difference between successive readings, something like this should work:
select a.id, a.new_val, a.date - b.date
from my_table a join my_table b
on a.id = b.id and a.prev_val = b.new_val
You could use min/max subqueries. For example:
SELECT mn.id, (mx.maxdate - mn.mindate) as "duration"
FROM (SELECT id, min(date) as mindate FROM my_table GROUP BY id) mn
JOIN (SELECT id, max(date) as maxdate FROM my_table GROUP BY id) mx ON
mx.id=mn.id
Let me know if you need help converting the duration to hours.
You can use the lead()/lag() window functions to access data from the next/ previous row. You can further subtract timestamps to give an interval and extract the parts needed.
select id, floor( extract('day' from diff)*24 + extract('hour' from diff) ) "Time Difference: Hours"
from (select id, date_ts - lag(date_ts) over (partition by id order by date_ts) diff
from example
) hd
where diff is not null
order by id;
NOTE: Your expected results, as presented, are incorrect. The results would be -1 and -24 respectively.
DATE is a very poor choice for a column name. It is both a Postgres data type (at best leads to confusion) and a SQL Standard reserved word.
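For readers who want to try the lag() approach without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name example and column date_ts follow the answer above; the conversion to epoch seconds with strftime('%s', ...) is an SQLite-specific assumption that replaces the Postgres interval arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the lag() answer; date_ts is stored as ISO text.
conn.execute(
    "CREATE TABLE example (id INTEGER, prev_val INTEGER, new_val INTEGER, date_ts TEXT)"
)
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?, ?)",
    [
        (1, 0, 1, "2020-01-01 10:00"),
        (1, 1, 2, "2020-01-01 11:00"),
        (2, 0, 1, "2020-01-01 10:00"),
        (2, 1, 2, "2020-01-02 10:00"),
    ],
)

HOURS_BETWEEN = """
SELECT id, diff_seconds / 3600 AS duration_in_hours
FROM (
    SELECT id,
           -- current reading minus the previous reading of the same id, in seconds
           CAST(strftime('%s', date_ts) AS INTEGER)
           - CAST(strftime('%s', LAG(date_ts) OVER (
                 PARTITION BY id ORDER BY date_ts)) AS INTEGER) AS diff_seconds
    FROM example
) hd
WHERE diff_seconds IS NOT NULL   -- the first row of each id has no predecessor
ORDER BY id
"""

durations = conn.execute(HOURS_BETWEEN).fetchall()
print(durations)
```

Because the later timestamp is the minuend here, the differences come out positive (1 and 24 hours for the sample rows).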

Grouping results by a set of dates in Redshift with two tables

I am trying to count the number of observations I have in an employee database. The tables look more or less like this:
Date_Table
date_dt
----------
2020-09-07
2020-09-14
2020-09-21
Employee_table
login_id  effective_date  is_active
--------  --------------  ---------
a         2020-09-07      1
a         2020-09-14      1
b         2020-09-07      1
b         2020-09-14      0
c         2020-09-21      1
Keep in mind that effective_date marks some change (attrition, position change, whatever; those are easily filtered), and the row with the latest effective_date holds the current status.
In the example above, 2020-09-14 is the day login_id b stopped being active within the table.
I want to reflect something like this:
the_date    amount_of_employees
----------  -------------------
2020-09-07  2
2020-09-14  1
2020-09-21  2
This query works perfectly fine, and provides me the correct number:
SELECT '2020-09-07',COUNT(DISTINCT login_id) amount_of_employees
FROM (SELECT date_dt FROM Date_Table) AS dd,(SELECT *,
ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS chk
FROM Employee_table WHERE effective_date <= '2020-09-07' ) AS dp
WHERE
dp.is_active =1
AND
dp.chk=1
GROUP BY 1
ORDER BY 1 ASC;
Great! This one works and gives me the right value:
the_date    amount_of_employees
----------  -------------------
2020-09-07  2
However, when I try to build the full dataset with this query:
SELECT dd.date_dt ,COUNT(DISTINCT login_id) amount_of_employees
FROM (SELECT date_dt FROM Date_Table) AS dd,(SELECT *,
ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS chk
FROM Employee_table WHERE effective_date <= dd.date_dt ) AS dp
WHERE
dp.is_active =1
AND
dp.chk=1
GROUP BY 1
ORDER BY 1 ASC;
I get this error message:
Invalid operation: subquery in FROM may not refer to other relations of same query level
I looked into something like this:
https://w3coded.com/questions/672056/error-subquery-in-from-cannot-refer-to-other-relations-of-same-query-level
but it didn't work, or doesn't necessarily apply. Maybe I am not getting it.
Any ideas? I would rather not write a lot of unions, although that is a workaround.
Thanks in advance
I'm not familiar with Amazon Redshift, but as long as your query syntax is supported, you can use a subquery to get the count; inside it you can refer to the columns of the outer query, like this:
SELECT
dt.date_dt,
(
SELECT COUNT(DISTINCT login_id)
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY login_id ORDER BY effective_date DESC) AS rn
FROM employee_table et
WHERE et.effective_date <= dt.date_dt
ORDER BY effective_date DESC
) t
WHERE rn = 1 AND is_active = 1
) amount
FROM date_table dt
This is a solution for this:
SELECT dt.date_dt, COUNT(DISTINCT login_id) other_account
FROM Date_Table dt
LEFT JOIN employee_table et ON dt.date_dt BETWEEN et.effective_date AND et.effective_date + (some additional interval)
WHERE et.is_active = 1 (And other where clauses)
GROUP BY 1
Thanks for all your support
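One way to sidestep the "subquery in FROM may not refer to other relations" error altogether is to join the two tables first and number the rows per (date, login) pair, so no subquery needs to look at the outer query. This is a different technique from the answers above, sketched here with Python's built-in sqlite3 module on the sample data; it has not been run on Redshift, so treat the SQL as an assumption to verify there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE date_table (date_dt TEXT)")
conn.execute(
    "CREATE TABLE employee_table (login_id TEXT, effective_date TEXT, is_active INTEGER)"
)
conn.executemany(
    "INSERT INTO date_table VALUES (?)",
    [("2020-09-07",), ("2020-09-14",), ("2020-09-21",)],
)
conn.executemany(
    "INSERT INTO employee_table VALUES (?, ?, ?)",
    [
        ("a", "2020-09-07", 1),
        ("a", "2020-09-14", 1),
        ("b", "2020-09-07", 1),
        ("b", "2020-09-14", 0),
        ("c", "2020-09-21", 1),
    ],
)

ACTIVE_PER_DATE = """
SELECT date_dt AS the_date, COUNT(*) AS amount_of_employees
FROM (
    SELECT d.date_dt, e.login_id, e.is_active,
           -- rn = 1 marks each login's most recent status as of that date
           ROW_NUMBER() OVER (
               PARTITION BY d.date_dt, e.login_id
               ORDER BY e.effective_date DESC
           ) AS rn
    FROM date_table d
    JOIN employee_table e ON e.effective_date <= d.date_dt
) t
WHERE rn = 1 AND is_active = 1
GROUP BY date_dt
ORDER BY date_dt
"""

headcount = conn.execute(ACTIVE_PER_DATE).fetchall()
print(headcount)
```

The join multiplies rows (every date against every earlier status change), so on large tables it may be worth restricting the date range before joining.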

How to select most dense 1 min in Oracle

I have a table with a timestamp column tmstmp; the table contains a log of certain events. I need to find the maximum number of events that occurred within any 1-minute interval.
Please read carefully! I do NOT want to extract the minute part of the timestamp and count per minute like this:
select count(*), TO_CHAR(tmstmp,'MI')
from log_table
group by TO_CHAR(tmstmp,'MI')
order by TO_CHAR(tmstmp,'MI');
It needs to take the 1st record, look ahead to select all records within 1 minute of it, and count them; then take the 2nd record and do the same, and so on.
The result must be a recordset of (count, starting timestamp).
Does anyone have a snippet of code they could share?
An analytic function with a logical window can provide this information directly:
select l.tmstmp,
count(*) over (order by tmstmp range between current row and interval '59.999999' second following) cnt
from log_table l
order by 1
;
TMSTMP CNT
--------------------------- ----------
01.01.16 00:00:00,000000000 4
01.01.16 00:00:10,000000000 4
01.01.16 00:00:15,000000000 3
01.01.16 00:00:20,000000000 2
01.01.16 00:01:00,000000000 3
01.01.16 00:01:40,000000000 2
01.01.16 00:01:50,000000000 1
Adjust the interval length to your timestamp precision; it must be the highest possible value below 1 minute.
To get the busiest minute, use a subquery (and don't forget that you may receive more than one record with the MAX count):
with tst as (
select l.tmstmp,
count(*) over (order by tmstmp range between current row and interval '59.999999' second following) cnt
from log_table l)
select * from tst where cnt = (select max(cnt) from tst);
TMSTMP CNT
--------------------------- ----------
01.01.16 00:00:00,000000000 4
01.01.16 00:00:10,000000000 4
I think you can achieve your goal using a subquery in the SELECT statement, as follows:
SELECT tmstmp, (
SELECT COUNT(*)
FROM log_table t2
WHERE t2.tmstmp >= t.tmstmp AND t2.tmstmp < t.tmstmp + 1 / (24*60)
) AS events
FROM log_table t;
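The correlated-subquery idea can be checked against the sample timestamps from the earlier output with a small sketch in Python's built-in sqlite3 module. SQLite has no Oracle date arithmetic, so datetime(t.tmstmp, '+1 minute') is used here as an assumed equivalent of t.tmstmp + 1 / (24*60); the final "busiest window" step is done in Python for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_table (tmstmp TEXT)")
conn.executemany(
    "INSERT INTO log_table VALUES (?)",
    [
        ("2016-01-01 00:00:00",),
        ("2016-01-01 00:00:10",),
        ("2016-01-01 00:00:15",),
        ("2016-01-01 00:00:20",),
        ("2016-01-01 00:01:00",),
        ("2016-01-01 00:01:40",),
        ("2016-01-01 00:01:50",),
    ],
)

EVENTS_PER_WINDOW = """
SELECT t.tmstmp,
       (SELECT COUNT(*)
        FROM log_table t2
        -- half-open window: [start, start + 1 minute)
        WHERE t2.tmstmp >= t.tmstmp
          AND t2.tmstmp < datetime(t.tmstmp, '+1 minute')) AS events
FROM log_table t
ORDER BY t.tmstmp
"""

counts = conn.execute(EVENTS_PER_WINDOW).fetchall()
busiest = max(n for _, n in counts)
# More than one starting timestamp can share the maximum count.
densest = [(ts, n) for ts, n in counts if n == busiest]
print(densest)
```

The subquery runs once per row, so this form is easy to read but scales quadratically without an index on tmstmp.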
One method uses a join and aggregation:
select t.*
from (select l.tmstmp, count(*) as cnt
from log_table l join
log_table l2
on l2.tmstmp >= l.tmstmp and
l2.tmstmp < l.tmstmp + interval '1' minute
group by l.tmstmp
order by count(*) desc
) t
where rownum = 1;
Note: This assumes that tmstmp is unique on each row. If it is not, the subquery should aggregate by some column that is unique.
EDIT:
For large data, there is a more efficient way that makes use of cumulative sums:
select tmstmp - interval '1' minute as starttm, tmstmp as endtm, cumulative
from (select tmstmp, sum(inc) over (order by tmstmp) as cumulative
from (select tmstmp, 1 as inc from log_table union all
select tmstmp + interval '1' minute, -1 as inc from log_table
) t
order by cumulative desc
) t
where rownum = 1;

Select rows where value is equal given value or lower and nearest to it

Sorry for the confusing title. Please tell me whether this is possible with a database query. Assume we have the following table:
ind_id name value date
----------- -------------------- ----------- ----------
1 a 10 2010-01-01
1 a 20 2010-01-02
1 a 30 2010-01-03
2 b 10 2010-01-01
2 b 20 2010-01-02
2 b 30 2010-01-03
2 b 40 2010-01-04
3 c 10 2010-01-01
3 c 20 2010-01-02
3 c 30 2010-01-03
3 c 40 2010-01-04
3 c 50 2010-01-05
4 d 10 2010-01-05
I need a query that returns each ind_id once for the given date. If there is no row for an ind_id on the given date, take the nearest earlier date; if there are no earlier dates, return ind_id + name (each name maps to exactly one ind_id) with nulls.
For example, date is 2010-01-04, I expect following result:
ind_id name value date
----------- -------------------- ----------- ----------
1 a 30 2010-01-03
2 b 40 2010-01-04
3 c 40 2010-01-04
4 d NULL NULL
If it's possible, I'd be very grateful if someone could help me build the query. I'm using SQL Server 2008.
Check this SQL FIDDLE DEMO
with CTE_test
as
(
select ind_id,
max(date) MaxDate
from test
where date<='2010-01-04 00:00:00:000'
group by ind_id
)
select A.ind_id, A.[Value], A.[Date]
from test A
inner join CTE_test B
on a.ind_id=b.ind_id
and a.date = b.Maxdate
union all
select distinct ind_id, null, null
from test
where ind_id not in (select ind_id from CTE_test)
(Updated) Try:
with cte as
(select m.*,
max(date) over (partition by ind_id) max_date,
max(case when date <= @date then date end) over
(partition by ind_id) max_acc_date
from myTable m)
select ind_id,
name,
case when max_acc_date is null then null else value end value,
max_acc_date date
from cte c
where date = coalesce(max_acc_date, max_date)
(SQLFiddle here)
Here is a query that returns the result that you are looking for:
SELECT
t1.ind_id
, CASE WHEN t1.date <= '2010-01-04' THEN t1.value ELSE null END
FROM test t1
WHERE t1.date=COALESCE(
(SELECT MAX(DATE)
FROM test t2
WHERE t2.ind_id=t1.ind_id AND t2.date <= '2010-01-04')
, t1.date)
The idea is to pick a row in a correlated query such that its ID matches that of the current row, and the date is the highest one prior to your target date of '2010-01-04'.
When such row does not exist, the date for the current row is returned. This date needs to be replaced with a null; this is what the CASE statement at the top is doing.
Here is a demo on sqlfiddle.
You can use something like:
declare @date date = '2010-01-04'
;with ids as
(
select distinct ind_id
from myTable
)
,ranks as
(
select *
, ranking = row_number() over (partition by ind_id order by date desc)
from myTable
where date <= @date
)
select ids.ind_id
, ranks.value
, ranks.date
from ids
left join ranks on ids.ind_id = ranks.ind_id and ranks.ranking = 1
SQL Fiddle with demo.
Ideally you wouldn't be using the DISTINCT statement to get the ind_id values to include, but I've used it in this case to get the results you needed.
Also, standard disclaimer for these sorts of queries; if you have duplicate data you should consider a tie-breaker column in the ORDER BY or using RANK instead of ROW_NUMBER.
Edited after OPs update
Just add the new column into the existing query:
with ids as
(
select distinct ind_id, name
from myTable
)
,ranks as
(
select *
, ranking = row_number() over (partition by ind_id order by date desc)
from myTable
where date <= @date
)
select ids.ind_id
, ids.name
, ranks.value
, ranks.date
from ids
left join ranks on ids.ind_id = ranks.ind_id and ranks.ranking = 1
SQL Fiddle with demo.
As with the previous one it would be best to get the ind_id/name information through joining to a standing data table if available.
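The ids/ranks pattern above can be exercised on the question's sample table with Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for SQL Server 2008, the T-SQL variable @date becomes the named parameter :target (an assumption of this example), and the ranking column uses standard AS syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ind_id INTEGER, name TEXT, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, "a", 10, "2010-01-01"), (1, "a", 20, "2010-01-02"), (1, "a", 30, "2010-01-03"),
        (2, "b", 10, "2010-01-01"), (2, "b", 20, "2010-01-02"), (2, "b", 30, "2010-01-03"),
        (2, "b", 40, "2010-01-04"),
        (3, "c", 10, "2010-01-01"), (3, "c", 20, "2010-01-02"), (3, "c", 30, "2010-01-03"),
        (3, "c", 40, "2010-01-04"), (3, "c", 50, "2010-01-05"),
        (4, "d", 10, "2010-01-05"),
    ],
)

NEAREST_ON_OR_BEFORE = """
WITH ids AS (
    SELECT DISTINCT ind_id, name FROM myTable
),
ranks AS (
    SELECT ind_id, value, date,
           -- ranking = 1 is the latest row on or before the target date
           ROW_NUMBER() OVER (PARTITION BY ind_id ORDER BY date DESC) AS ranking
    FROM myTable
    WHERE date <= :target
)
SELECT ids.ind_id, ids.name, ranks.value, ranks.date
FROM ids
LEFT JOIN ranks ON ids.ind_id = ranks.ind_id AND ranks.ranking = 1
ORDER BY ids.ind_id
"""

nearest = conn.execute(NEAREST_ON_OR_BEFORE, {"target": "2010-01-04"}).fetchall()
for row in nearest:
    print(row)
```

The LEFT JOIN is what keeps ind_id 4 in the result with NULL value and date, since it has no row on or before the target date.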
Try
DECLARE @date DATETIME;
SET @date = '2010-01-04';
WITH temp1 AS
(
SELECT t.ind_id
, t.name
, CASE WHEN t.date <= @date THEN t.value ELSE NULL END AS value
, CASE WHEN t.date <= @date THEN t.date ELSE NULL END AS date
FROM test1 AS t
),
temp AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY ind_id ORDER BY t.date DESC) AS rn
FROM temp1 AS t
WHERE t.date <= @date OR t.date IS NULL
)
SELECT *
FROM temp AS t
WHERE rn = 1
Use option with EXISTS operator
DECLARE @date date = '20100104'
SELECT ind_id,
CASE WHEN date <= @date THEN value END AS value,
CASE WHEN date <= @date THEN date END AS date
FROM dbo.test57 t
WHERE EXISTS (
SELECT 1
FROM dbo.test57 t2
WHERE t.ind_id = t2.ind_id AND t2.date <= @date
HAVING ISNULL(MAX(t2.date), t.date) = t.date
)
Demo on SQLFiddle
This is not an exact answer, but it should give you the concept; I wrote it down quickly without any testing.
use
go
if
(Select value from table where col=@col1) is not null
-- your code to get the matching value
else if
(Select LOWER(Date) from table ) is not null
-- your query to get the nearest date record
else
-- your query with null values
end

How to get the last date from a DB table in MySQL

I have this table in my DB:
categoriesSupports -> id, category_id, support_id, date
I need to extract every support_id together with the date closest to now.
For example, if the DB table contains
id, category_id, support_id, date
1 1 1 2010-11-23
2 1 2 2010-11-25
3 1 1 2010-11-26
4 1 3 2010-11-24
I need to get just
id, category_id, support_id, date
2 1 2 2010-11-25
3 1 1 2010-11-26
4 1 3 2010-11-24
To make it clearer: I need the closest date for each support, and I only have dates from the past.
I've been trying a lot and I don't know how.
The following should give you:
all the categoriesSupports rows for the current date (one or multiple)
one previous categoriesSupports row (if it exists)
one future categoriesSupports row (if it exists)
(
SELECT *
FROM `categoriesSupports`
WHERE `date` < CURDATE()
ORDER BY `date` DESC
LIMIT 1
)
UNION
(
SELECT *
FROM `categoriesSupports`
WHERE `date` = CURDATE()
)
UNION
(
SELECT *
FROM `categoriesSupports`
WHERE `date` > CURDATE()
ORDER BY `date` ASC
LIMIT 1
)
A. This answers 'where date is the closest date from now...':
SELECT *
FROM `categoriesSupports`
WHERE `date` IN (
SELECT `date`
FROM `categoriesSupports`
ORDER BY `date` DESC
LIMIT 1
)
Notes:
You can set LIMIT n to select entries for more dates.
If you only want the last date, you can replace IN with = because the sub-select will return only one value.
If your table includes future dates, replace ORDER BY date DESC with ORDER BY ABS(DATEDIFF(NOW(), date)) ASC.
A solution with JOINS. Will work only if you have past dates.
SELECT a.*
FROM `categoriesSupports` AS a
LEFT JOIN `categoriesSupports` AS b
ON b.date > a.date
WHERE b.id IS NULL
Added just for reference.
B. This answers 'where date is in the last 3 days (including today)':
SELECT *
FROM `categoriesSupports`
WHERE DATEDIFF(NOW(), `date`) < 3
Replace 3 with any number if you want more or fewer days.
C. Same as A., but per support id
SELECT a.*
FROM `categoriesSupports` AS a
LEFT JOIN `categoriesSupports` AS b
ON b.support_id = a.support_id AND b.date > a.date
WHERE b.id IS NULL
This answers the latest version of the question.
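To see why the self-join in C keeps exactly one row per support_id, here is a minimal sketch on the question's sample rows using Python's built-in sqlite3 module (SQLite standing in for MySQL; the query text is otherwise the same apart from an added ORDER BY for stable output).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categoriesSupports ("
    "id INTEGER PRIMARY KEY, category_id INTEGER, support_id INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO categoriesSupports VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "2010-11-23"),
        (2, 1, 2, "2010-11-25"),
        (3, 1, 1, "2010-11-26"),
        (4, 1, 3, "2010-11-24"),
    ],
)

LATEST_PER_SUPPORT = """
SELECT a.id, a.category_id, a.support_id, a.date
FROM categoriesSupports AS a
LEFT JOIN categoriesSupports AS b
       -- b is any later row for the same support
       ON b.support_id = a.support_id AND b.date > a.date
WHERE b.id IS NULL   -- keep a only when no later row exists
ORDER BY a.id
"""

latest = conn.execute(LATEST_PER_SUPPORT).fetchall()
for row in latest:
    print(row)
```

Row 1 is dropped because row 3 has the same support_id and a later date. If two rows of one support share the same latest date, both are returned.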
SELECT *
FROM CurrentDeals
WHERE (julianday(Date('now')) - julianday(date))<=3
ORDER BY date ASC
Here you have to decide what "closest" means. I used 3 as a sample; the query lists the records whose date is no more than 3 days in the past.
Hope this is what you wanted.