Hive query to return single row based on eff and exp date - sql

I have a table with the following data.
I expect only the row with exp_dt "2020-09-22" to be returned, but the query below returns both rows. I do not understand why it also returns the first row, which has eff_dt "2020-09-19".
select id,cd,eff_dt,exp_dt,post_dt from table
where from_unixtime(unix_timestamp(eff_dt,"yyyy-MM-dd")) <= from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"))
and from_unixtime(unix_timestamp(exp_dt,"yyyy-MM-dd")) >= from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"));
Is there an issue with my query? I expect only the second row to be returned.

Use < for the comparison against exp_dt, so that a row expiring on the query date is excluded:
select id,cd,eff_dt,exp_dt,post_dt
from table
where from_unixtime(unix_timestamp('2020-09-21', 'yyyy-MM-dd')) >= from_unixtime(unix_timestamp(eff_dt, 'yyyy-MM-dd')) and
from_unixtime(unix_timestamp('2020-09-21', 'yyyy-MM-dd')) < from_unixtime(unix_timestamp(exp_dt, 'yyyy-MM-dd'))
I reversed the comparison order. I find it easier to follow the logic with the constants first.

Does this handle the edge case where a row takes effect and expires on the same day, and solve your problem at the same time?
select id,cd,eff_dt,exp_dt,post_dt from table
where
(from_unixtime(unix_timestamp(eff_dt,"yyyy-MM-dd")) <= from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"))
and
from_unixtime(unix_timestamp(exp_dt,"yyyy-MM-dd")) > from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"))
)
or
(from_unixtime(unix_timestamp(eff_dt,"yyyy-MM-dd")) = from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"))
and
from_unixtime(unix_timestamp(exp_dt,"yyyy-MM-dd")) = from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"))
)
;
In fact I suspect exp is always >= eff, so maybe only one condition
from_unixtime(unix_timestamp(eff_dt,"yyyy-MM-dd")) <= from_unixtime(unix_timestamp("2020-09-21","yyyy-MM-dd"))
is enough...?

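You do not need from_unixtime(unix_timestamp()) because the dates are already stored in yyyy-MM-dd format and the argument is in the same format, so plain string comparison gives the right order.
The issue in your query is that you are using inclusive comparisons (<= and >=) for both the eff and exp dates, so a row that expires on the query date still matches.
To find the record that is current on a given date, use this query: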
select id,cd,eff_dt,exp_dt,post_dt from table
where eff_dt <= "2020-09-21"
and exp_dt > "2020-09-21";
There should be no records with eff_dt = exp_dt in SCD2 if you store only dates (without a time component). The dates can be equal only if you are using timestamps and the times differ; in that case, convert your argument date to a timestamp before comparing.
SCD2 should be designed in such a way that a fact record can be mapped to exactly one SCD2 record.
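To see why the inclusive comparison returns two rows, here is a minimal sketch that runs both versions of the WHERE clause. It uses Python's sqlite3 module as a stand-in for Hive, and the two sample rows and the table name scd are assumptions, since the original data was not shown in the question.

```python
import sqlite3

# Hypothetical sample rows: the question's table was not shown, so these
# values are assumed from the dates mentioned in the text.
rows = [
    (1, "A", "2020-09-19", "2020-09-21", "2020-09-19"),
    (1, "B", "2020-09-21", "2020-09-22", "2020-09-21"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table scd (id integer, cd text, eff_dt text, exp_dt text, post_dt text)"
)
conn.executemany("insert into scd values (?, ?, ?, ?, ?)", rows)


def active_on(as_of, inclusive_expiry):
    """Return the cd values of rows considered active on the given date."""
    op = ">=" if inclusive_expiry else ">"
    sql = f"select cd from scd where eff_dt <= ? and exp_dt {op} ? order by eff_dt"
    return [r[0] for r in conn.execute(sql, (as_of, as_of))]


# exp_dt >= date: the row expiring on 2020-09-21 still matches.
both_rows = active_on("2020-09-21", inclusive_expiry=True)
# exp_dt > date: only the row that is still open on that date matches.
one_row = active_on("2020-09-21", inclusive_expiry=False)
print(both_rows, one_row)
```

The yyyy-MM-dd strings are compared as plain text here, which works because that format sorts the same way as the dates themselves.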

Related

tdate issue I'm facing in SQL query

When fetching a count from the table with either of the following queries, I get 0:
Select count(*)
from tab
where tdate = '17-05-19' ---> output 0
or
Select count(*)
from tab
where trunc(tdate) = '17-05-19' ---->output 0
If I use:
Select count(*)
from tab
where tdate >sysdate - 1 ---> it returns some count(yesterday+some of the today txn)
But I want only yesterday's transactions whenever I run this query.
You may use this.
Select count (*) from tab where
tdate >= TRUNC(SYSDATE) - 1
AND tdate < TRUNC(SYSDATE)
The advantage of this over applying TRUNC to the date column is that it can use an index on tdate, if one exists.
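The half-open range above can be tried out with a small sketch. It uses Python's sqlite3 module rather than Oracle, so date(?, '-1 day') plays the role of TRUNC(SYSDATE) - 1, and "today" is passed in as a parameter to keep the result repeatable. The sample timestamps are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tab (tdate text)")
conn.executemany(
    "insert into tab values (?)",
    [
        ("2019-05-16 23:59:59",),  # two days before the assumed "today"
        ("2019-05-17 00:00:00",),  # yesterday, first instant
        ("2019-05-17 18:30:00",),  # yesterday, evening
        ("2019-05-18 09:15:00",),  # today
    ],
)


def count_yesterday(today):
    """Count rows with tdate in [yesterday 00:00, today 00:00)."""
    sql = """
        select count(*) from tab
        where tdate >= date(?, '-1 day')
          and tdate <  date(?)
    """
    return conn.execute(sql, (today, today)).fetchone()[0]


yesterday_count = count_yesterday("2019-05-18")
print(yesterday_count)
```

Because tdate itself is not wrapped in a function, the same shape of predicate is the one an ordinary index on the column can serve.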
Your queries would return results, provided you have data for 17 May, if you used a date literal:
Select count(*) from tab where trunc(tdate) = date'2019-05-17'
or formatted the column with the to_char function:
Select count(*) from tab where to_char(tdate,'dd-mm-yy') = '17-05-19'
or, to get only the data for the day before:
Select count(*) from tab where trunc(tdate) = trunc(sysdate)-1
So in Oracle you need to write the literal as date'2019-05-17' (known as a date literal); in MySQL, for example, it can be written as '2019-05-17' without the date keyword.
By the way, the trunc function removes the time part of a date column value and keeps only the date portion.
If the table is large and performance matters, you can also create a function-based index on trunc(tdate).

Fetching records from SQL based on Month name

This is my table structure and data.
Now I want to filter data based on Month by ExpenseDate column.
So how can I achieve that?
I was trying
select * from tblExpenses where (ExpenseDate = MONTH('April'))
But it throws an error: "Conversion failed when converting date and/or time from character string."
Please help. Thank you.
You are applying month() to the wrong operand. It should be applied to the ExpenseDate column:
select *
from tblExpenses
where month(ExpenseDate) = 4;
Note that month() returns a number, not the name of the month.
I think it is more likely that you want records from a particular April, not every April. This would be expressed as:
where ExpenseDate >= '2018-04-01' and ExpenseDate < '2018-05-01'
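The difference between the two filters can be sketched as follows. This uses Python's sqlite3 module instead of SQL Server, so strftime('%m', ...) stands in for month(); the ExpenseId column and the sample dates are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ExpenseId is a hypothetical key column; only ExpenseDate comes from the question.
conn.execute(
    "create table tblExpenses (ExpenseId integer primary key, ExpenseDate text)"
)
conn.executemany(
    "insert into tblExpenses (ExpenseDate) values (?)",
    [("2017-04-10",), ("2018-04-05",), ("2018-04-30",), ("2018-05-01",)],
)

# Month-number filter: matches April of every year.
every_april = conn.execute(
    "select count(*) from tblExpenses "
    "where cast(strftime('%m', ExpenseDate) as integer) = 4"
).fetchone()[0]

# Half-open date range: matches April 2018 only.
april_2018 = conn.execute(
    "select count(*) from tblExpenses "
    "where ExpenseDate >= '2018-04-01' and ExpenseDate < '2018-05-01'"
).fetchone()[0]

print(every_april, april_2018)
```

The range version also leaves the column unwrapped, so it is the form that can benefit from an index on ExpenseDate.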
I think your where clause is just reversed. I think you want this (and change the month name to a number):
select * from tblExpenses where Month(ExpenseDate) = 4

sql query to get today new records compared with yesterday

I have this table:
COD (Integer) (PK)
ID (Varchar)
DATE (Date)
I just want to get the new IDs from today compared with yesterday (the IDs from today that were not present yesterday).
This needs to be done with just one query, as efficiently as possible, because the table will have 4-5 million records.
As a Java developer I am able to do this with two queries, but doing it with just one is beyond my knowledge, so any help would be much appreciated.
EDIT: the date format is dd/mm/yyyy, and each ID may appear 0 or 1 times per day.
Here is a solution that will go over the base data one time only. It selects the id and the date where the date is either yesterday or today (or both). Then it GROUPS BY id - each group will have either one or two rows. Then it filters by the condition that the MIN date in the group is "today". Those are the id's that exist today but did not exist yesterday.
DATE is an Oracle keyword, best not used as a column name. I changed that to DT. I also assume that your "dt" field is a pure date (as pure as it can be in Oracle, meaning: time of day, which is always present, is 00:00:00).
select id
from your_table
where dt in (trunc(sysdate), trunc(sysdate) - 1)
group by id
having min(dt) = trunc(sysdate)
;
Edit: Gordon makes a good point: perhaps you may have more than one such row per ID, in the same day? In that case the time-of-day may also be different from 00:00:00.
If so, the solution can be adapted:
select id
from your_table
where dt >= trunc(sysdate) - 1 and dt < trunc(sysdate) + 1
group by id
having min(dt) >= trunc(sysdate)
;
Either way: (1) the base table is read just once; (2) the column DT is not wrapped within any function, so if there is an index on that column, it can be used to access just the needed rows.
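Here is a small sketch of the GROUP BY / HAVING MIN idea, run against made-up data. It uses Python's sqlite3 module instead of Oracle, so date(?) and date(?, '-1 day') stand in for trunc(sysdate) and trunc(sysdate) - 1, and "today" is passed as a parameter so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table your_table (cod integer primary key, id text, dt text)"
)
conn.executemany(
    "insert into your_table (id, dt) values (?, ?)",
    [
        ("A", "2024-03-09"),  # yesterday only
        ("B", "2024-03-09"),  # yesterday and today
        ("B", "2024-03-10"),
        ("C", "2024-03-10"),  # today only
        ("D", "2024-03-01"),  # older row plus today: still new versus yesterday
        ("D", "2024-03-10"),
    ],
)


def new_ids(today):
    """IDs present today whose earliest row in the two-day window is today."""
    sql = """
        select id from your_table
        where dt in (date(?), date(?, '-1 day'))
        group by id
        having min(dt) = date(?)
        order by id
    """
    return [r[0] for r in conn.execute(sql, (today, today, today))]


todays_new_ids = new_ids("2024-03-10")
print(todays_new_ids)
```

ID B is dropped because its group contains a row from yesterday, so its minimum date is not today.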
The typical method would use not exists:
select t.*
from t
where t.date >= trunc(sysdate) and t.date < trunc(sysdate + 1) and
not exists (select 1
from t t2
where t2.id = t.id and
t2.date >= trunc(sysdate - 1) and t2.date < trunc(sysdate)
);
This is a general solution. If you know that there is at most one record per day, there are better solutions, such as using lag().
Use MINUS. I suppose your date column has a time part, so you need to truncate it.
select id from mytable where trunc(date) = trunc(sysdate)
minus
select id from mytable where trunc(date) = trunc(sysdate) - 1;
I suggest the following function-based index. Without it, the query would have to do a full table scan, which would probably be quite slow.
create index idx on mytable( trunc(date) , id );
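The NOT EXISTS and MINUS approaches can be checked against each other with a short sketch. It runs on Python's sqlite3 module rather than Oracle, so EXCEPT is used in place of MINUS and date(...) in place of trunc(...); the table t, the column name dt and the sample rows (which include a time part) are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id text, dt text)")
conn.executemany(
    "insert into t values (?, ?)",
    [
        ("A", "2024-03-09 08:00:00"),  # yesterday only
        ("B", "2024-03-09 09:30:00"),  # yesterday and today
        ("B", "2024-03-10 10:00:00"),
        ("C", "2024-03-10 11:45:00"),  # today only, twice
        ("C", "2024-03-10 16:20:00"),
    ],
)

params = {"today": "2024-03-10"}  # fixed "today" so the result is repeatable

not_exists_sql = """
    select distinct t.id from t
    where t.dt >= date(:today) and t.dt < date(:today, '+1 day')
      and not exists (
          select 1 from t t2
          where t2.id = t.id
            and t2.dt >= date(:today, '-1 day') and t2.dt < date(:today)
      )
    order by t.id
"""

# SQLite spells Oracle's MINUS as EXCEPT.
except_sql = """
    select id from t where date(dt) = date(:today)
    except
    select id from t where date(dt) = date(:today, '-1 day')
    order by id
"""

via_not_exists = [r[0] for r in conn.execute(not_exists_sql, params)]
via_except = [r[0] for r in conn.execute(except_sql, params)]
print(via_not_exists, via_except)
```

The EXCEPT version wraps the column in a function, which is why that answer pairs it with a function-based index, while the NOT EXISTS version compares the raw column against range bounds.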

Choose between two different date ranges

I have a check box; when it is checked, my date range flips, so I now have to choose which date range to look at.
For example, 4/15/14 through 4/20/14 … when the check box is checked, the date range becomes 4/10/14 through 4/15/14.
In my SQL Select I need to choose the date range based on this check box.
This didn't work:
Where ( ? Between Date_1 and Date_2 ) or ( ? Between Date_2 and Date_1 )
Nor did this:
Where ( ?
Case
When Ck_Bx Is Null
Then Date_2 and Date_1
Else Date_1 and Date_2
End
)
Here is the SQL that is working; I am trying to modify the WHERE clause:
ExecuteSQL ( "
Select ToDo_Name_Calc, ToDo_Name
From ToDo
WHERE ( ( ? Between ToDo_Alert_Date and ToDo_Date ) or ( ? Between ToDo_Date and ToDo_Alert_Date ) ) and ToDo_Ck_Bx Is Null
Order By ToDo_Alert_Date Asc " ; " - " ; "" ;
cDateOfFirstPortal +11 )
I would be grateful for any assistance.
Tom
between is really just a syntax shortcut and it is the exact equivalent of:
( field >= small-value and field <= larger-value )
Let's say we use 2014-04-20 as the larger value, but the field contains a time as well as a date. Evaluating <= 2014-04-20 against a stored value of 2014-04-20 11:12:13 excludes that value, because the bare date is treated as midnight at the start of the day. In truth we do want that value included (it occurs DURING the day of 2014-04-20), and the most reliable way to avoid that mistake is to get all data that is less than 2014-04-21.
So the biggest problem with date ranges using between is that you could miss almost 24 hours of data if you get it wrong. A safer approach uses less than for the upper bound, after adding one day to it; because we need to add one day, we may have to use database-specific code (e.g. date_add() for MySQL, dateadd() for SQL Server/Sybase).
Not using between, which is the safer option, requires some date arithmetic, represented here simply by +1:
SELECT
ToDo_Name_Calc
, ToDo_Name
FROM ToDo
WHERE (
? >= ToDo_Alert_Date AND ? < (ToDo_Date+1) --*
AND ToDo_Ck_Bx IS NULL)
OR (
? >= ToDo_Date AND ? < (ToDo_Alert_Date+1) --*
AND ToDo_Ck_Bx IS NOT NULL
)
ORDER BY
ToDo_Alert_Date ASC
--* use the relevant date addition method for your dbms.
Another reason for not using between here is that the dates MUST be in a specific order or nothing is returned: the older date must come first and the more recent date last. You cannot just reverse them inside the between syntax.
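Both pitfalls of between described above can be reproduced in a few lines. This is only a sketch using Python's sqlite3 module, where dates are compared as ISO-formatted text and date(x, '+1 day') is the date addition method; other systems need their own equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql, params=()):
    """Run a single-value query and return that value (0 or 1 for booleans)."""
    return conn.execute(sql, params).fetchone()[0]


stamp = "2014-04-20 11:12:13"  # a value that falls DURING the last day

# between with a bare date as the upper bound misses the value.
in_between = scalar(
    "select ? between '2014-04-15' and '2014-04-20'", (stamp,)
)

# >= lower bound and < (upper bound + 1 day) includes it.
in_half_open = scalar(
    "select ? >= '2014-04-15' and ? < date('2014-04-20', '+1 day')",
    (stamp, stamp),
)

# between with the bounds reversed matches nothing, even mid-range.
reversed_between = scalar(
    "select '2014-04-17' between '2014-04-20' and '2014-04-15'"
)

print(in_between, in_half_open, reversed_between)
```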

Query aggregate faster than MAX

I have a fairly large table in which one of the columns is a date column. The query I execute is as follows.
select max(date) from tbl where date < to_date('10/01/2010','MM/DD/YYYY')
That is, I want to find the value closest to and less than a particular date value. This takes considerable time because of the max on the large table. Is there a faster way to do this, maybe using LAST_VALUE?
Put an index on the date column and the query should be plenty fast.
1) Add an index to the date column. Simply put, an index lets the database engine keep the column's values in sorted order, which speeds up most queries that filter on that column. Info here http://docs.oracle.com/cd/B28359_01/server.111/b28310/indexes003.htm
2) Consider adding a second clause to the query. You have where date < to_date('10/01/2010','MM/DD/YYYY') now, why not change it to:
where date < to_date('10/01/2010','MM/DD/YYYY') and date > to_date('09/30/2010', 'MM/DD/YYYY')
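since this will reduce the number of scanned rows. Note that this gives the same answer only if at least one row falls inside that window; otherwise the query returns null.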
Try
select date from (
select date from tbl where date < to_date('10/01/2010','MM/DD/YYYY') order by date desc
) where rownum = 1
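As a rough check that the max() form and the "sort descending, take the first row" form return the same value, here is a sketch using Python's sqlite3 module. SQLite has no rownum, so limit 1 is used instead, and the table contents and index name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (dt text)")
# One made-up date per month of 2010.
conn.executemany(
    "insert into tbl values (?)",
    [(f"2010-{month:02d}-15",) for month in range(1, 13)],
)
# Index on the date column, as suggested in the answers above.
conn.execute("create index idx_tbl_dt on tbl (dt)")

cutoff = "2010-10-01"

via_max = conn.execute(
    "select max(dt) from tbl where dt < ?", (cutoff,)
).fetchone()[0]

via_top1 = conn.execute(
    "select dt from tbl where dt < ? order by dt desc limit 1", (cutoff,)
).fetchone()[0]

print(via_max, via_top1)
```

Whether either form is actually faster on a large Oracle table depends on the index and the execution plan, so it is worth comparing plans on the real data.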