Closest Date in Oracle - sql

I am trying to get the closest date to a given date in Oracle. I have been working from How to get the closest dates in Oracle sql, but the example in that question uses two different tables. I'm no PL SQL guru and I'm struggling to get this to work. I have a single table that contains an ID field and a Date field. I need the ID that is closest to the date passed into the query.
select *
from ( select SEQ_ID, ENTERED_DATE, rank() over ( partition by ENTERED_DATE order by difference asc ) as rnk
from ( select SEQ_ID, ENTERED_DATE, abs(ENTERED_DATE - 2/9/1999) from DOWNTIME_DETAILS)) as difference
where rnk = 1
This gives me an error: "SQL command not properly ended"
How can I fix the query? What am I doing wrong?

The as difference is assigning a table alias. You can't use as for table aliases, only for column aliases (so as rnk is OK). Just remove the second as. As you are referring to difference in the outer query, it looks like you meant it to be a column alias and just had it in the wrong place:
select *
from (
select SEQ_ID, ENTERED_DATE,
rank() over ( order by difference ) as rnk
from (
select SEQ_ID, ENTERED_DATE,
abs(ENTERED_DATE - to_date('2/9/1999', 'MM/DD/YYYY')) as difference
from DOWNTIME_DETAILS
)
)
where rnk = 1
You also had a date without any quote marks, so 2/9/1999 would have been evaluated as a numeric expression (2 divided by 9, divided by 1999) rather than as a date, and wouldn't have had the effect you were looking for. You should always use explicit conversion; I've guessed your date format. And you should not be partitioning by the original entered_date, as that puts each distinct date in its own partition and makes everything rank as 1. If you have two records that have the same difference they will still both rank as 1, so you'll see both. You could add a way to break ties by modifying the order by, e.g.
rank() over ( order by difference , entered_date, seq_id ) as rnk
... but you'll need to specify the criteria so it makes sense for your data and situation.
You could also do this:
select max(SEQ_ID) keep (dense_rank first
order by abs(ENTERED_DATE - to_date('2/9/1999', 'MM/DD/YYYY')))
as seq_id,
max(ENTERED_DATE) keep (dense_rank first
order by abs(ENTERED_DATE - to_date('2/9/1999', 'MM/DD/YYYY')))
as entered_date
from DOWNTIME_DETAILS;
... but then you have to supply the date twice.
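To see the ranking approach and its tie behaviour on concrete data, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Oracle. It assumes an SQLite build with window function support (3.25 or later), swaps to_date for SQLite's julianday, and uses made-up sample rows; the helper name closest_rows is hypothetical.

```python
import sqlite3

def closest_rows(conn, target):
    """Return every (seq_id, entered_date) row tied for closest to target.

    target is an ISO date string such as '1999-02-09'.
    """
    sql = """
        SELECT seq_id, entered_date
        FROM (
            SELECT seq_id, entered_date,
                   -- julianday() stands in for Oracle date arithmetic
                   RANK() OVER (
                       ORDER BY ABS(julianday(entered_date) - julianday(?))
                   ) AS rnk
            FROM downtime_details
        )
        WHERE rnk = 1
        ORDER BY seq_id
    """
    return conn.execute(sql, (target,)).fetchall()

# Made-up sample data: rows 2 and 3 are both two days from the target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE downtime_details (seq_id INTEGER, entered_date TEXT)")
conn.executemany(
    "INSERT INTO downtime_details VALUES (?, ?)",
    [(1, "1999-01-20"), (2, "1999-02-07"), (3, "1999-02-11"), (4, "1999-03-01")],
)

print(closest_rows(conn, "1999-02-09"))
# [(2, '1999-02-07'), (3, '1999-02-11')]
```

Both tied rows come back because RANK() gives them the same rank; adding further columns to the ORDER BY inside the window, as described above, would reduce that to a single row.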

Related

Find the second largest value with Groupings

In SQL Server, I am attempting to pull the second latest NOTE_ENTRY_DT_TIME (items highlighted in screenshot). With the query written below it still pulls the latest date (I believe it's because of the grouping but the grouping is required to join later). What is the best method to achieve this?
SELECT
hop.ACCOUNT_ID,
MAX(hop.NOTE_ENTRY_DT_TIME) AS latest_noteid
FROM
NOTES hop
WHERE
hop.GEN_YN IS NULL
AND hop.NOTE_ENTRY_DT_TIME < (SELECT MAX(hope.NOTE_ENTRY_DT_TIME)
FROM NOTES hope
WHERE hop.GEN_YN IS NULL)
GROUP BY
hop.ACCOUNT_ID
Data sample in the table:
One of the "easier" ways to get the Nth row in a group is to use a CTE and ROW_NUMBER:
WITH CTE AS(
SELECT Account_ID,
Note_Entry_Dt_Time,
ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Note_Entry_Dt_Time DESC) AS RN
FROM dbo.YourTable)
SELECT Account_ID,
Note_Entry_Dt_Time
FROM CTE
WHERE RN = 2;
Of course, if an ACCOUNT_ID only has 1 row, then it will not be returned in the result set.
The OP's statement "The row will not always be 2." from the comments conflicts with their statement "I am attempting to pull the second latest NOTE_ENTRY_DT_TIME" in the question. At a best guess, this means that the OP has several rows with the same date, which could all be the "latest" date. If so, they would simply need to replace ROW_NUMBER with DENSE_RANK. Their sample data, however, doesn't suggest this is the case.
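The difference between ROW_NUMBER and DENSE_RANK only shows up when the latest timestamp is duplicated. A small sketch using Python's sqlite3 module (assuming SQLite 3.25+ for window functions; the sample rows and the helper name second_latest are invented for illustration) makes that visible:

```python
import sqlite3

def second_latest(conn, ranking_function):
    """Return (account_id, note_entry_dt_time) rows ranked 2 within each account."""
    # The function name is interpolated into the SQL, so only allow known values.
    if ranking_function not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError("unsupported ranking function")
    sql = f"""
        WITH cte AS (
            SELECT account_id, note_entry_dt_time,
                   {ranking_function}() OVER (
                       PARTITION BY account_id
                       ORDER BY note_entry_dt_time DESC
                   ) AS rn
            FROM notes
        )
        -- DISTINCT collapses ties that DENSE_RANK may leave at rank 2
        SELECT DISTINCT account_id, note_entry_dt_time
        FROM cte
        WHERE rn = 2
        ORDER BY account_id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (account_id INTEGER, note_entry_dt_time TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [
        (1, "2020-01-03"), (1, "2020-01-03"), (1, "2020-01-01"),  # duplicated latest
        (2, "2020-02-05"), (2, "2020-02-01"),
        (3, "2020-03-01"),                                        # only one row
    ],
)

print(second_latest(conn, "ROW_NUMBER"))  # [(1, '2020-01-03'), (2, '2020-02-01')]
print(second_latest(conn, "DENSE_RANK"))  # [(1, '2020-01-01'), (2, '2020-02-01')]
```

Account 3 is missing from both results because it has no second row, and account 1 only yields a genuinely earlier timestamp under DENSE_RANK.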
You can use window functions:
select *
from (
select
n.*,
row_number() over(partition by account_id order by note_entry_dt_time desc) rn
from notes n
) t
where rn = 2

Get minimum without using row number/window function in Bigquery

I have a table like the one shown below
What I would like to do is get the row with the minimum value for each subject. Although I am able to do this with the row_number function, I would like to do it with a GROUP BY and min() approach. But it doesn't work.
row_number approach - works fine
SELECT * FROM (select subject_id,value,id,min_time,max_time,time_1,
row_number() OVER (PARTITION BY subject_id ORDER BY value) AS rank
from table A) WHERE RANK = 1
min() approach - doesn't work
select subject_id,id,min_time,max_time,time_1,min(value) from table A
GROUP BY SUBJECT_ID,id
As you can see, just the two columns (subject_id and id) are enough to group the items together. They help differentiate the groups. But why am I not able to use the other columns in the select clause? If I add the other columns to the GROUP BY, I may not get the expected output because time_1 has different values.
I expect my output to look like the one shown below
In BigQuery you can use aggregation for this:
SELECT ARRAY_AGG(a ORDER BY value LIMIT 1)[SAFE_OFFSET(0)].*
FROM table A
GROUP BY SUBJECT_ID;
This uses ARRAY_AGG() to aggregate each record (the a in the argument list). ARRAY_AGG() allows you to order the result (by value) and to limit the size of the array. The latter is important for performance.
After aggregating the records into an array, you want its first element. The .* expands the record referred to by a into its component columns.
I'm not sure why you don't want to use ROW_NUMBER(). If the problem is the lingering rank column, you can easily remove it:
SELECT a.* EXCEPT (rank)
FROM (SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY subject_id ORDER BY value) AS rank
FROM A
) a
WHERE RANK = 1;
Are you looking for something like the following?
SELECT
A.subject_id,
A.id,
A.min_time,
A.max_time,
A.time_1,
A.value
FROM table A
INNER JOIN(
SELECT subject_id, MIN(value) Value
FROM table
GROUP BY subject_id
) B ON A.subject_id = B.subject_id
AND A.Value = B.Value
If you do not need to select the time_1 column, the following query will work (as far as I can see, the values in min_time and max_time are the same within each group):
SELECT
A.subject_id,A.id,A.min_time,A.max_time,
--A.time_1,
MIN(A.value)
FROM table A
GROUP BY
A.subject_id,A.id,A.min_time,A.max_time
Finally, the best approach, if it suits your data, is to apply something like CAST(time_1 AS DATE) to your time column. This considers only the date part and ignores the time part. The query would be:
SELECT
A.subject_id,A.id,A.min_time,A.max_time,
CAST(A.time_1 AS DATE) Time_1,
MIN(A.value)
FROM table A
GROUP BY
A.subject_id,A.id,A.min_time,A.max_time,
CAST(A.time_1 AS DATE)
-- Check that the CAST ... AS DATE syntax in BigQuery
-- matches what I have written here; it may differ slightly.
The query below is for BigQuery Standard SQL and is the most efficient way to handle cases like the one in your question
#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY value LIMIT 1)[OFFSET(0)]
FROM `project.dataset.table` t
GROUP BY subject_id
Using ROW_NUMBER is not efficient and in many cases leads to a "Resources exceeded" error.
Note: a self join is also a very inefficient way of achieving your objective
A bit late to the party, but here is a cte-based approach which made sense to me:
with mins as (
select subject_id, id, min(value) as min_value
from table
group by subject_id, id
)
select distinct t.subject_id, t.id, t.time_1, t.min_time, t.max_time, m.min_value
from table t
join mins m on m.subject_id = t.subject_id and m.id = t.id
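The root of the question is that GROUP BY can only return grouped columns and aggregates, so columns such as time_1 have to be recovered by joining the per-group minimum back to the table. The sketch below illustrates that join-back idea with Python's sqlite3 module rather than BigQuery; the table name readings, its rows, and the helper name min_rows are assumptions made for the example.

```python
import sqlite3

def min_rows(conn):
    """Return the full row holding the minimum value for each subject_id."""
    sql = """
        SELECT a.subject_id, a.id, a.time_1, a.value
        FROM readings a
        JOIN (
            -- one row per subject with its smallest value
            SELECT subject_id, MIN(value) AS min_value
            FROM readings
            GROUP BY subject_id
        ) b ON a.subject_id = b.subject_id
           AND a.value = b.min_value
        ORDER BY a.subject_id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (subject_id INTEGER, id INTEGER, value INTEGER, time_1 TEXT)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (1, 10, 5, "08:00"), (1, 10, 3, "09:00"), (1, 10, 7, "10:00"),
        (2, 20, 4, "08:30"), (2, 20, 6, "11:00"),
    ],
)

print(min_rows(conn))
# [(1, 10, '09:00', 3), (2, 20, '08:30', 4)]
```

If two rows in the same subject share the minimum value, this join returns both of them, which is one reason the ranking and ARRAY_AGG approaches are often preferred.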

How to select the first observation in a category in PostgreSQL

My table contains different house IDs (dataid), the time of observation (readtime), and the meter reading.
The query is as follows:
select *
from university.gas_ert
where readtime between '01/01/2014' and '01/02/2014'
I am trying to get only the first observation of each day for all the dataids within the time span. I have tried GROUP BY, but it doesn't seem to work.
DISTINCT ON could make your query much simpler. Read more in the documentation.
Definition :
Keeps only the first row of each set of rows where the given
expressions evaluate to equal. Note that the “first row” of each set
is unpredictable unless ORDER BY is used to ensure that the desired
row appears first.
SELECT
DISTINCT ON (meter_value) meter_value,
dataid,
readtime
FROM
university.gas_ert
WHERE
readtime between '2014-01-01' and '2014-01-02'
ORDER BY
meter_value,
readtime ASC;
If you want one row for each unique dataid within the time range, you should use the DISTINCT ON construction. The following query will give you a row for each dataid for each day in the range described in the WHERE clause and lets you extend the range if you want to return rows for each day x dataid combination.
select distinct on(dataid, date_trunc('day', readtime)) *
from university.gas_ert
where readtime between '2014-01-01' and '2014-01-02'
order by dataid, date_trunc('day', readtime) asc, readtime asc
You can also take a look at window functions such as ROW_NUMBER to help with this.
Partition the records by day using date_trunc (i.e. without the time component) and then number them by readtime ascending:
select *
from (
select *
,row_number() over(partition by a.dataid, date_trunc('day',a.readtime) order by a.readtime asc ) as rnk
from university.gas_ert a
)x
where x.rnk=1
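As a rough, runnable illustration of "first observation per dataid per day", the sketch below applies the same ROW_NUMBER pattern in SQLite through Python's sqlite3 module. It assumes SQLite 3.25+, uses date(readtime) in place of PostgreSQL's date_trunc, and the rows and the helper name first_per_day are invented for the example.

```python
import sqlite3

def first_per_day(conn):
    """Return the earliest reading for each (dataid, calendar day) pair."""
    sql = """
        SELECT dataid, readtime, meter_value
        FROM (
            SELECT dataid, readtime, meter_value,
                   ROW_NUMBER() OVER (
                       -- date() drops the time part, like date_trunc('day', ...)
                       PARTITION BY dataid, date(readtime)
                       ORDER BY readtime
                   ) AS rnk
            FROM gas_ert
        )
        WHERE rnk = 1
        ORDER BY dataid, readtime
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gas_ert (dataid INTEGER, readtime TEXT, meter_value INTEGER)")
conn.executemany(
    "INSERT INTO gas_ert VALUES (?, ?, ?)",
    [
        (35, "2014-01-01 00:15:00", 100),
        (35, "2014-01-01 06:00:00", 102),
        (35, "2014-01-02 01:00:00", 110),
        (44, "2014-01-01 03:30:00", 50),
        (44, "2014-01-01 02:10:00", 49),
    ],
)

for row in first_per_day(conn):
    print(row)
```

Including dataid in the PARTITION BY is what gives one row per house per day; partitioning by the day alone would return a single row per day across all houses.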

Rank Over Partition By in Oracle SQL (Oracle 11g)

I have 4 columns in a table
Company Part Number
Manufacturer Part Number
Order Number
Part Receipt Date
Ex.
I just want to return one record based on the maximum Part Receipt Date which would be the first row in the table (The one with Part Receipt date 03/31/2015).
I tried
RANK() OVER (PARTITION BY Company Part Number,Manufacturer Part Number
ORDER BY Part Receipt Date DESC,Order Number DESC) = 1
at the end of the WHERE statement and this did not work.
This would seem to do what you want:
select t.*
from (select t.*
from t
order by partreceiptdate desc
) t
where rownum = 1;
Analytic functions like rank() are available in the SELECT clause; they can't be invoked directly in a WHERE clause. To use rank() the way you want, you must compute it in a subquery and then filter on it in the WHERE clause of the outer query. Something like this:
select company_part_number, manufacturer_part_number, order_number, part_receipt_date
from ( select t.*, rank() over (partition by... order by...) as rnk
from your_table t
)
where rnk = 1
Note also that you can't have a column name like company part number (with spaces in it) - at least not unless they are enclosed in double-quotes, which is a very poor practice, best avoided.

SQL Server : UNION ALL but remove duplicate IDs by choosing first date of occurrence

I am unioning two queries but I'm getting an ID that occurs in each query. I do not know how to keep only the first time the id occurs. Everything else about the row is different. In general, it will be hard to know which of the two queries I will have to keep a duplicate on, therefore, I need a general solution.
I was thinking about creating a temp table and choosing the min date (once the date has been converted to an int).
Any ideas on the proper syntax?
You can do this using the row_number() function. This will assign a sequential number, starting with 1, to each row with the same id (based on the partition by clause). The ordering of the sequence is determined by the order by clause. So, the following assigns 1 to the earliest date for each id:
select t.*
from (select t.*,
row_number() over (partition by id order by date asc) as seqnum
from ((select *
from <subquery1>
) union all
(select *
from <subquery2>
)
) t
) t
where seqnum = 1;
The final where clause simply filters for the first occurrence.
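Here is a small end-to-end sketch of that pattern, run against SQLite through Python's sqlite3 module instead of SQL Server (window functions need SQLite 3.25+). The tables table_1 and table_2 stand in for the two subqueries, and the column event_date, the source tag, and the helper name first_occurrence are assumptions for illustration.

```python
import sqlite3

def first_occurrence(conn):
    """Return one row per id: the one with the earliest event_date across both tables."""
    sql = """
        SELECT id, event_date, source
        FROM (
            SELECT u.*,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY event_date) AS seqnum
            FROM (
                -- the source column only records which query a row came from
                SELECT id, event_date, 'q1' AS source FROM table_1
                UNION ALL
                SELECT id, event_date, 'q2' AS source FROM table_2
            ) u
        )
        WHERE seqnum = 1
        ORDER BY id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (id INTEGER, event_date TEXT)")
conn.execute("CREATE TABLE table_2 (id INTEGER, event_date TEXT)")
conn.executemany("INSERT INTO table_1 VALUES (?, ?)", [(1, "2020-01-05"), (2, "2020-01-02")])
conn.executemany("INSERT INTO table_2 VALUES (?, ?)", [(1, "2020-01-03"), (3, "2020-01-09")])

print(first_occurrence(conn))
# [(1, '2020-01-03', 'q2'), (2, '2020-01-02', 'q1'), (3, '2020-01-09', 'q2')]
```

Id 1 appears in both tables, and only its earlier row (from the second query) survives the seqnum = 1 filter.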
If you use the keyword UNION, then it will remove duplicates from the two data sets you are working with. UNION ALL preserves duplicates.
You can view the specifics here:
http://www.w3schools.com/sql/sql_union.asp
If you want to keep only one of the two records and they are not identical, you will have to filter them yourself. You may need to do something like the following. This may be possible with a single (select union select) block, but this should get you started.
select *
from (
select id
, date
, otherstuf
from table_1
union all
select id
, date
, otherstuf
from table_2
) x1
, (
select id
, date
, otherstuf
from table_1
union all
select id
, date
, otherstuf
from table_2
) x2
where x1.id = x2.id
and x1.date < x2.date
On reflection, though, if you go down a path like this, why bother with the UNION at all?