Find gaps between date ranges in SQL (Oracle) - sql

I want to find the gaps between date ranges with a SQL query. Here is the situation:
I have a table employees like the one below. The employee should receive a payment every month.
ID Name From_date To_date Paid_Amount
1 ali 01/01/2002 31/01/2002 300
2 ali 01/02/2002 28/02/2002 300
3 ali 01/04/2002 30/04/2002 300
4 ali 01/05/2002 31/05/2002 300
5 ali 01/07/2002 31/07/2002 300
Now, we notice there are no payments in March and June.
So how can I get these months with a SQL query?

Try this,
with mine(ID,Name,From_date,To_date,Paid_Amount) as
(
select 1,'ali','01/01/2002','31/01/2002',300 from dual union all
select 2,'ali','01/02/2002','28/02/2002',300 from dual union all
select 3,'ali','01/04/2002','30/04/2002',300 from dual union all
select 4,'ali','01/05/2002','31/05/2002',300 from dual union all
select 5,'ali','01/07/2002','31/07/2002',300 from dual
),
gtfirst (fromdt,todt) as (
select min(to_Date(from_Date,'dd/mm/yyyy')) fromdt,max(to_Date(to_Date,'dd/mm/yyyy')) todt from mine
),
dualtbl(first,last,fromdt,todt) as
(
select * from (
select TRUNC(ADD_MONTHS(fromdt, rownum-1), 'MM') AS first,
TRUNC(LAST_DAY(ADD_MONTHS(fromdt, rownum-1))) AS last,
fromdt, todt
from gtfirst
connect by level <= 12
)
where first between fromdt and todt and last between fromdt and todt
)
select to_char(first,'month') no_payment_date from dualtbl where first not in (select to_Date(from_Date,'dd/mm/yyyy') from mine)
and first not in (select to_Date(to_date,'dd/mm/yyyy') from mine)
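The idea behind the query above (list every month between the first and the last payment, then keep the months with no payment row) can also be sketched outside the database. This is only an illustration in Python; the function name missing_months and the tuple layout are my own assumptions, not part of the question.

```python
from datetime import date

def missing_months(paid_ranges):
    """Return (year, month) pairs with no payment between the first and last range."""
    # Each payment covers one calendar month, so its start date identifies the month.
    paid = {(start.year, start.month) for start, _end in paid_ranges}
    first = min(start for start, _end in paid_ranges)
    last = max(end for _start, end in paid_ranges)
    gaps = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        if (year, month) not in paid:
            gaps.append((year, month))
        # Step to the next calendar month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return gaps

payments = [
    (date(2002, 1, 1), date(2002, 1, 31)),
    (date(2002, 2, 1), date(2002, 2, 28)),
    (date(2002, 4, 1), date(2002, 4, 30)),
    (date(2002, 5, 1), date(2002, 5, 31)),
    (date(2002, 7, 1), date(2002, 7, 31)),
]
print(missing_months(payments))  # [(2002, 3), (2002, 6)]
```

Unlike the connect by level <= 12 in the SQL, this loop is not limited to a twelve-month window.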

If you want to get the date difference between one payment date and the previous payment date and the ID field is sequential, then you may simply join back to the table and select the previous row.
-- X is the current payment, Y is the previous one (ID one lower)
SELECT X.From_date, Y.From_date AS Previous_From_date, X.From_date - Y.From_date AS Difference
FROM Employees X
LEFT OUTER JOIN Employees Y ON Y.ID = X.ID - 1
If the ID field is not sequential, then you can use a similar method, but build a temporary table with a row index that you can use to join back to the previous payment.
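Here is a small sketch of the join-to-previous-row idea, run against an in-memory SQLite database from Python rather than Oracle. The ISO date strings, the julianday() arithmetic and the filter on gaps longer than one day are my assumptions for the demo; in Oracle you would subtract DATE values directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER, name TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "ali", "2002-01-01", "2002-01-31"),
        (2, "ali", "2002-02-01", "2002-02-28"),
        (3, "ali", "2002-04-01", "2002-04-30"),
        (4, "ali", "2002-05-01", "2002-05-31"),
        (5, "ali", "2002-07-01", "2002-07-31"),
    ],
)

# Join each row to the row with the previous ID and keep only the pairs
# where more than one day lies between the previous end and the current start.
rows = conn.execute(
    """
    SELECT cur.from_date,
           prev.to_date,
           CAST(julianday(cur.from_date) - julianday(prev.to_date) AS INTEGER) AS days_since_prev
    FROM employees cur
    JOIN employees prev ON prev.id = cur.id - 1
    WHERE julianday(cur.from_date) - julianday(prev.to_date) > 1
    ORDER BY cur.id
    """
).fetchall()

for row in rows:
    print(row)
```

The two rows returned are the payments that follow the March and June gaps.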

Related

Exclude certain products based on date range

For example, I have sales data for one year, and some of the products are not available during a specific date range.
I currently handle one date range, but what is the best practice if I have multiple exclusions?
SELECT * FROM XXX
WHERE
IF(Date BETWEEN '2018-11-22' AND '2019-03-28',
ID IN (8467,8468,8469,8470),
ID IN (8467,8468,8469,8470,9551,9552,9553)
)
In particular, how do I handle overlapping date ranges?
If you are trying to exclude values, I am thinking:
SELECT *
FROM XXX
WHERE ID IN (8467, 8468, 8469, 8470, 9551, 9552, 9553) AND
(Date BETWEEN '2018-11-22' AND '2019-03-28' AND
ID NOT IN (9551, 9552, 9553) OR
Date NOT BETWEEN '2018-11-22' AND '2019-03-28'
);
You can add multiple pairs for other dates.
For a full solution, you might want to create a table with columns such as:
product_id
start_exclusion_date
end_exclusion_date
And then phrase the query as:
select xxx.*
from xxx left join
exclusions e
on xxx.id = e.product_id and
xxx.date >= e.start_exclusion_date and
xxx.date <= e.end_exclusion_date
where xxx.id in ( . . . ) and
e.product_id is null; -- keep only rows with no matching exclusion
This is likely to be easier to maintain in the long term.
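To see how the exclusions table copes with overlapping ranges, here is a minimal sketch using Python's sqlite3 module. The table name sales, the column sale_date and the sample rows are made up for the demo; only the shape of the exclusions table follows the suggestion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (id INTEGER, sale_date TEXT);
    CREATE TABLE exclusions (
        product_id INTEGER,
        start_exclusion_date TEXT,
        end_exclusion_date TEXT
    );
    INSERT INTO sales VALUES
        (8467, '2018-12-01'),
        (9551, '2018-12-01'),
        (9551, '2019-02-15'),
        (9551, '2019-04-01');
    -- Two overlapping exclusion windows for the same product.
    INSERT INTO exclusions VALUES
        (9551, '2018-11-22', '2019-03-28'),
        (9551, '2019-02-01', '2019-02-28');
    """
)

# Anti-join: a sale survives only if no exclusion row matches it.
kept = conn.execute(
    """
    SELECT s.id, s.sale_date
    FROM sales s
    LEFT JOIN exclusions e
      ON s.id = e.product_id
     AND s.sale_date >= e.start_exclusion_date
     AND s.sale_date <= e.end_exclusion_date
    WHERE e.product_id IS NULL
    ORDER BY s.id, s.sale_date
    """
).fetchall()

print(kept)  # [(8467, '2018-12-01'), (9551, '2019-04-01')]
```

The sale on 2019-02-15 matches both windows, but because the filter keeps only unmatched rows, overlapping windows neither duplicate nor resurrect it.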
Try this,
select * from xxx
where not(date between '2018-11-22' and '2019-03-28' and id in(9551,9552,9553))
order by id, date
Below is an example for BigQuery Standard SQL. It shows one direction for building a "complete picture" from whitelist and blacklist rules, using simplified dummy data just to demonstrate it in action.
#standardSQL
WITH `project.dataset.xxx` AS (
SELECT 1 id, DATE '2018-11-22' `date` UNION ALL
SELECT 2, '2018-11-23' UNION ALL
SELECT 3, '2018-11-24' UNION ALL
SELECT 4, '2018-11-25' UNION ALL
SELECT 1, '2018-11-26' UNION ALL
SELECT 2, '2018-11-27' UNION ALL
SELECT 3, '2018-11-28' UNION ALL
SELECT 8, '2018-11-29'
), `project.dataset.whitelist` AS (
SELECT DATE '2018-11-22' start, DATE '2018-11-29' finish, [2,3] ids UNION ALL
SELECT '2018-11-22', '2018-11-22', [1]
), `project.dataset.blacklist` AS (
SELECT DATE '2018-11-26' start, DATE '2018-11-28' finish, [1,3] ids UNION ALL
SELECT '2018-11-22', '2018-11-22', [10]
)
SELECT DISTINCT t.*
FROM `project.dataset.xxx` t
JOIN `project.dataset.whitelist` w
ON (`date` BETWEEN w.start AND w.finish AND id IN UNNEST(w.ids))
JOIN `project.dataset.blacklist` b
ON NOT(`date` BETWEEN b.start AND b.finish AND id IN UNNEST(b.ids))
with result
Row id date
1 1 2018-11-22
2 2 2018-11-27
3 2 2018-11-23
4 3 2018-11-28
5 3 2018-11-24
In a real case all the tables involved are real tables, and the query looks like this:
#standardSQL
SELECT DISTINCT t.*
FROM `project.dataset.xxx` t
JOIN `project.dataset.whitelist` w
ON (`date` BETWEEN w.start AND w.finish AND id IN UNNEST(w.ids))
JOIN `project.dataset.blacklist` b
ON NOT(`date` BETWEEN b.start AND b.finish AND id IN UNNEST(b.ids))
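One thing to watch with a join on a negated condition: a row is kept as soon as it fails to match any single blacklist rule. If the intent is to drop a row that matches at least one rule, NOT EXISTS is one way to express that. The sketch below uses SQLite from Python with the rules flattened to one id per row (SQLite has no arrays); the column names sale_date, rule_start and rule_end are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE xxx (id INTEGER, sale_date TEXT);
    CREATE TABLE whitelist (rule_start TEXT, rule_end TEXT, id INTEGER);
    CREATE TABLE blacklist (rule_start TEXT, rule_end TEXT, id INTEGER);
    INSERT INTO xxx VALUES
        (1, '2018-11-22'), (2, '2018-11-23'), (3, '2018-11-24'), (4, '2018-11-25'),
        (1, '2018-11-26'), (2, '2018-11-27'), (3, '2018-11-28'), (8, '2018-11-29');
    INSERT INTO whitelist VALUES
        ('2018-11-22', '2018-11-29', 2),
        ('2018-11-22', '2018-11-29', 3),
        ('2018-11-22', '2018-11-22', 1);
    INSERT INTO blacklist VALUES
        ('2018-11-26', '2018-11-28', 1),
        ('2018-11-26', '2018-11-28', 3),
        ('2018-11-22', '2018-11-22', 10);
    """
)

# Keep a row if some whitelist rule covers it and no blacklist rule does.
rows = conn.execute(
    """
    SELECT DISTINCT t.id, t.sale_date
    FROM xxx t
    WHERE EXISTS (
            SELECT 1 FROM whitelist w
            WHERE w.id = t.id
              AND t.sale_date BETWEEN w.rule_start AND w.rule_end)
      AND NOT EXISTS (
            SELECT 1 FROM blacklist b
            WHERE b.id = t.id
              AND t.sale_date BETWEEN b.rule_start AND b.rule_end)
    ORDER BY t.id, t.sale_date
    """
).fetchall()

for row in rows:
    print(row)
```

Under this reading the row (3, 2018-11-28) is dropped, because the first blacklist rule covers it even though the second one does not.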

SQL with nested condition

EDIT: added third requirement after playing with solution from Tim Biegeleisen
EDIT2: modified Robbie's DOB to be before his parent's marriage date
I am trying to create a query that will look at two tables and determine the difference in dates based on a percentage. I know, super confusing... Let me try and explain using the tables below:
Bob and Mary are married on 2010-01-01 and expect 4 kids (Parent table)
I want to know how many years it took until they met 50% of their expected kids (i.e. 2/4 kids). Using the Child table to see the DOB of their 4 kids, we know that Frankie is the second child which meets our 50% threshold so we use Frankie's DOB and subtract it from Frankie's parent's marriage date and end up with 3 years!
If the goal isn't reached then display no value e.g. Mick and Jo only had 1 child so far so they haven't yet reached their goal
Hoping this is doable using BigQuery standard SQL.
Parent table
id married_couple married_at expected_kids
--------------------------------------
1 Bob and Mary 2010-01-01 4
2 Mick and Jo 2010-01-01 4
Child table
id child_name parent_id date_of_birth
--------------------------------------
1 Eddie 1 2012-01-01
2 Frankie 1 2013-01-01
3 Robbie 1 2005-01-01
4 Duncan 1 2015-01-01
5 Rick 2 2014-01-01
Expected SQL result
parent_id half_goal_reached(years)
--------------------------------------
1 3
2
Below are two solutions for BigQuery Standard SQL.
The first is closer to classic SQL; the second is more BigQuery style (I think).
First Solution: with an analytic function
#standardSQL
SELECT
parent_id,
IF(
MAX(pos) = MAX(CAST(expected_kids / 2 AS INT64)),
MAX(DATE_DIFF(date_of_birth, married_at, YEAR)),
NULL
) AS half_goal_reached
FROM (
SELECT c.parent_id, c.date_of_birth, expected_kids, married_at,
ROW_NUMBER() OVER(PARTITION BY c.parent_id ORDER BY c.date_of_birth) AS pos
FROM `child` AS c
JOIN `parent` AS p
ON c.parent_id = p.id
)
WHERE pos <= CAST(expected_kids / 2 AS INT64)
GROUP BY parent_id
Second Solution: with use of ARRAY
#standardSQL
SELECT
parent_id,
DATE_DIFF(dates[SAFE_ORDINAL(CAST(expected_kids / 2 AS INT64))], married_at, YEAR) AS half_goal_reached
FROM (
SELECT
parent_id,
ARRAY_AGG(date_of_birth ORDER BY date_of_birth) AS dates,
MAX(expected_kids) AS expected_kids,
MAX(married_at) AS married_at
FROM `child` AS c
JOIN `parent` AS p
ON c.parent_id = p.id
GROUP BY parent_id
)
Dummy Data
You can test / play with both solutions using below dummy data
#standardSQL
WITH `parent` AS (
SELECT 1 id, 'Bob and Mary' married_couple, DATE '2010-01-01' married_at, 4 expected_kids UNION ALL
SELECT 2, 'Mick and Jo', DATE '2010-01-01', 4
),
`child` AS (
SELECT 1 id, 'Eddie' child_name, 1 parent_id, DATE '2012-01-01' date_of_birth UNION ALL
SELECT 2, 'Frankie', 1, DATE '2013-01-01' UNION ALL
SELECT 3, 'Robbie', 1, DATE '2014-01-01' UNION ALL
SELECT 4, 'Duncan', 1, DATE '2015-01-01' UNION ALL
SELECT 5, 'Rick', 2, DATE '2014-01-01'
)
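For anyone who wants to check the expected numbers without a BigQuery project, the array-based idea (sort the birth dates, pick the one at position expected_kids / 2) can be mirrored in a few lines of Python. The function name half_goal_years is hypothetical. Note that the floor division used here may differ from the CAST in the queries when expected_kids is odd, since the CAST rounds rather than truncates.

```python
from datetime import date

def half_goal_years(married_at, expected_kids, birth_dates):
    """Years from marriage to the birth that reaches half of expected_kids, or None."""
    needed = expected_kids // 2  # assumption: floor division for the 50% threshold
    ordered = sorted(birth_dates)
    if needed < 1 or len(ordered) < needed:
        return None  # goal not reached yet
    # Calendar-year difference, in the spirit of DATE_DIFF(..., YEAR).
    return ordered[needed - 1].year - married_at.year

bob_and_mary = half_goal_years(
    date(2010, 1, 1),
    4,
    [date(2012, 1, 1), date(2013, 1, 1), date(2014, 1, 1), date(2015, 1, 1)],
)
mick_and_jo = half_goal_years(date(2010, 1, 1), 4, [date(2014, 1, 1)])

print(bob_and_mary)  # 3
print(mick_and_jo)   # None
```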
Try the following query. I join the parent and child tables, lining up the parent id, the number of years elapsed since the marriage, the running number of children, and the expected number of children. With this information in hand, we can find the first row whose running number of children matches or exceeds half of the expected number.
SELECT parent_id, num_years AS half_goal_reached
FROM
(
SELECT parent_id, num_years, cnt, expected_kids,
ROW_NUMBER() OVER (PARTITION BY parent_id ORDER BY num_years) rn
FROM
(
SELECT
t2.parent_id,
YEAR(t2.date_of_birth) - YEAR(t1.married_at) AS num_years,
(SELECT COUNT(*) FROM child c
WHERE c.parent_id = t2.parent_id AND
c.date_of_birth <= t2.date_of_birth) AS cnt,
t1.expected_kids
FROM parent t1
INNER JOIN child t2
ON t1.id = t2.parent_id
) t
WHERE
cnt >= expected_kids / 2
) t
WHERE t.rn = 1;
Note that there may be issues with how I computed the yearly differences, or with how I compute the threshold for half the number of expected children. Also, with a recent enterprise database we could have used an analytic function to get the running number of children instead of a correlated subquery, but I was unsure whether BigQuery supports that, so I used the latter.

How to join 2 queries with different number of records and columns in oracle sql?

I have three tables:
Employee_leave(EmployeeID,Time_Period,leave_type)
Employee(EID,Department,Designation)
leave_eligibility(Department,Designation, LeaveType, LeavesBalance).
I want to fetch the number of leaves availed by a particular employee in each leave type (category), so I wrote the following query, Query1:
SELECT LEAVE_TYPE, SUM(TIME_PERIOD)
FROM EMPLOYEE_LEAVE
WHERE EMPLOYEEID=78
GROUP BY LEAVE_TYPE
order by leave_type;
output for Query1
Leave_Type | SUM(Time_Period)
Casual 1
Paid 4
Sick 1
I also want to fetch the number of leaves the employee is eligible for in each leave type (category). The following query, Query2, gives the desired result:
Select UNIQUE Leavetype,LEAVEBALANCE
from LEAVE_ELIGIBILITY
INNER JOIN EMPLOYEE
ON LEAVE_ELIGIBILITY.DEPARTMENT= EMPLOYEE.DEPARTMENT
AND LEAVE_ELIGIBILITY.DESIGNATION= EMPLOYEE.DESIGNATION
WHERE EID=78
order by leavetype;
output for Query2
LeaveType | LeaveBalance
Casual 10
Paid 15
Privlage 6
Sick 20
Now I want to join Query1 and Query2, or create a view that displays records from both. As the outputs show, the two queries return different numbers of records. For a record that is missing from the output of Query1, the final output should display 0. In the present case Query1 returns no Privlage row, so the final output should show 0 in Sum(time_period) for Privlage. I tried creating views from these two queries and then joining them, but I'm unable to run the final query.
Code for View 1
create or replace view combo_table1 as
Select UNIQUE Leavetype,LEAVEBALANCE,EMPLOYEE.DEPARTMENT,EMPLOYEE.DESIGNATION, EID
from LEAVE_ELIGIBILITY
INNER JOIN EMPLOYEE
ON LEAVE_ELIGIBILITY.DEPARTMENT= EMPLOYEE.DEPARTMENT
AND LEAVE_ELIGIBILITY.DESIGNATION= EMPLOYEE.DESIGNATION
WHERE EID='78';
Code for View 2
create or replace view combo_table2 as
SELECT LEAVE_TYPE, SUM(TIME_PERIOD) AS Leave_Availed
FROM EMPLOYEE_LEAVE
WHERE EMPLOYEEID='78'
GROUP BY LEAVE_TYPE;
Code for joining 2 views
SELECT combo_table1.Leavetype, combo_table1.LEAVEBALANCE, combo_table2.leave_availed
FROM combo_table1 v1
INNER JOIN combo_table2 v2
ON v1.Leavetype = v2.LEAVE_TYPE;
But I'm getting "%s: invalid identifier" when executing the above query. I also know I can't use UNION, because it requires the same columns on both sides, which is not the case here.
I'm using Oracle 11g, so please answer accordingly.
Thanks in advance.
Desired final output
LeaveType | LeaveBalance | Sum(Time_period)
Casual 10 1
Paid 15 4
Privlage 6 0
Sick 20 1
To get the final desired output ...
"For a record which is not present in output of query1, it should display 0 in final output. "
... use an outer join to tie the taken leave records to the other tables. Leave types which the employee has not taken get no matching rows, so the sum is NULL; wrapping it in NVL turns that into zero.
select emp.Employee_ID
, le.leavetype
, le.leavebalance
, nvl(sum (el.Time_Duration), 0) as total_Time_Duration
from employee emp
inner join leave_eligibility le
on le.department= emp.department
and le.designation= emp.designation
left outer join Employee_leave el
on el.EmployeeID = emp.Employee_ID
and el.leave_type = le.leavetype
group by emp.Employee_ID
, le.leavetype
, le.leavebalance
;
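Here is a quick way to check the outer-join approach against the desired output, using SQLite from Python instead of Oracle 11g. The column names follow the tables as posted in the question; the department and designation values are invented, and COALESCE stands in for Oracle's NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (eid INTEGER, department TEXT, designation TEXT);
    CREATE TABLE leave_eligibility (
        department TEXT, designation TEXT, leavetype TEXT, leavebalance INTEGER
    );
    CREATE TABLE employee_leave (employeeid INTEGER, time_period INTEGER, leave_type TEXT);
    INSERT INTO employee VALUES (78, 'IT', 'Developer');
    INSERT INTO leave_eligibility VALUES
        ('IT', 'Developer', 'Casual', 10),
        ('IT', 'Developer', 'Paid', 15),
        ('IT', 'Developer', 'Privlage', 6),
        ('IT', 'Developer', 'Sick', 20);
    INSERT INTO employee_leave VALUES
        (78, 1, 'Casual'), (78, 2, 'Paid'), (78, 2, 'Paid'), (78, 1, 'Sick');
    """
)

# Eligibility drives the result; taken leave is outer-joined so that
# leave types with no rows still appear, with the sum defaulted to 0.
rows = conn.execute(
    """
    SELECT le.leavetype,
           le.leavebalance,
           COALESCE(SUM(el.time_period), 0) AS leave_availed
    FROM employee emp
    JOIN leave_eligibility le
      ON le.department = emp.department
     AND le.designation = emp.designation
    LEFT JOIN employee_leave el
      ON el.employeeid = emp.eid
     AND el.leave_type = le.leavetype
    WHERE emp.eid = 78
    GROUP BY le.leavetype, le.leavebalance
    ORDER BY le.leavetype
    """
).fetchall()

for row in rows:
    print(row)
```

The Privlage row comes back with 0, which matches the desired final output in the question.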
Your immediate problem:
I'm getting "%s: invalid identifier"
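In the final query the views are aliased as v1 and v2, but the select list still refers to combo_table1 and combo_table2; once a table alias is declared, Oracle expects the alias, so those references raise the error. There is also confusion in the column names across your posts, such as EID versus Employee_ID and Time_Duration versus time_period.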
More generally, you will find life considerably easier if you use the exact same name for common columns (i.e. consistently use either employee_id or employeeid, don't chop and change).
Try this example:
with t as (
select 'Casual' as Leave_Type, 1 as Time_Period, 0 as LeaveBalance from dual
union all
select 'Paid', 4,0 from dual
union all
select 'Sick', 1,0 from dual),
t1 as (
select 'Casual' as Leave_Type, 0 as Time_Period, 10 as LeaveBalance from dual
union all
select 'Paid', 0, 15 from dual
union all
select 'Privlage', 0, 6 from dual
union all
select 'Sick', 0, 20 from dual)
select Leave_Type, sum(Time_Period), sum(LeaveBalance)
from(
select *
from t
UNION ALL
select * from t1
)
group by Leave_Type
Ok, edit:
create or replace view combo_table1 as
Select UNIQUE Leavetype, 0 AS Leave_Availed, LEAVEBALANCE
from LEAVE_ELIGIBILITY INNER JOIN EMPLOYEE ON LEAVE_ELIGIBILITY.DEPARTMENT= EMPLOYEE.DEPARTMENT AND LEAVE_ELIGIBILITY.DESIGNATION= EMPLOYEE.DESIGNATION
WHERE EID='78';
create or replace view combo_table2 as
SELECT LEAVE_TYPE as Leavetype, SUM(TIME_PERIOD) AS Leave_Availed, 0 as LEAVEBALANCE
FROM EMPLOYEE_LEAVE
WHERE EMPLOYEEID='78'
GROUP BY LEAVE_TYPE;
SELECT Leavetype, sum(LEAVEBALANCE), sum(leave_availed)
FROM (
select *
from combo_table1
UNION ALL
select * from combo_table2
)
group by Leavetype;

SQL where no match *for date range* in transaction table

Can't quite get my head into this: I want all those rows in master table that do not have matching transaction rows in transaction table, for a given date range. Example:
master table m:
id Name
1 John
2 David
3 Simon
4 Jessica
5 Becky
transaction table t
id parent date action
1 1 2015-08-28 IN
2 1 2015-09-03 IN
3 2 2015-08-17 IN
4 2 2015-10-01 IN
5 4 2015-09-05 IN
I want all those entries in m that do not have any transactions in September, so I should get m.id 2, 3, 5:
1 does not match: events in september
2 matches: events but none in september
3 matches: no events at all
4 does not match: events in september
5 matches: No events at all
I can get those with nothing in t, with left join, and those with dates but not in range, with join, but can't see how to get both conditions. I might just be having a duh day.
Often, when we try to jump directly to a final query, it can turn out to be much more complicated than it should be, if it works at all. Generally, it doesn't hurt to just perform a straight join on the tables in question and look at the results. If nothing else, you verify the join is correct:
with
Master( ID, Name )as(
select 1, 'John' from dual union all
select 2, 'David' from dual union all
select 3, 'Simon' from dual union all
select 4, 'Jessica' from dual union all
select 5, 'Becky' from dual
),
Trans( MasterID, EffDate, Action )as(
select 1, date '2015-08-28', 'IN' from dual union all
select 1, date '2015-09-03', 'IN' from dual union all
select 2, date '2015-08-17', 'IN' from dual union all
select 2, date '2015-10-01', 'IN' from dual union all
select 4, date '2015-09-05', 'IN' from dual
)
select *
from Master m
join Trans t
on t.MasterID = m.ID;
(Excuse me for renaming some of your fields.) I happen to have Oracle up at the moment, you can use whatever you have. Probably this code will work with any non-Oracle system by just removing the 'from dual' from the CTE code.
Now let's extend the join criteria, but let's do so to generate the data we don't want to see.
join Trans t
on t.MasterID = m.ID
and t.EffDate >= date '2015-09-01'
and t.EffDate < date '2015-10-01';
I've hard-coded the date values, but this is the best format to use to get "during the month of..." ranges. Every value is selected from the first tick of the first day of the month to the absolutely last tick before the first day of the next month. You'll want to store these values in variables or generate them on the fly, of course.
So now we see only the two transactions that occurred during September. The obvious next step is to perform an outer join so we get all the other Master records that don't match.
left join Trans t
Now we have all the records we want, plus the two that we don't want. As they are the only ones that match the date restrictions, we add filtering criteria to remove those. Here is the final query:
select m.*
from Master m
left join Trans t
on t.MasterID = m.ID
and t.EffDate >= date '2015-09-01'
and t.EffDate < date '2015-10-01'
where t.MasterID is null;
Simple really. Once you've done this a few times, you'll be able to jump to the finished query without the intervening steps. Still, it doesn't hurt to execute the intervening steps and look at the outputs along the way. Any flaws in the logic will be caught earlier when it can be fixed easier.
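If you don't have Oracle to hand, the same steps can be replayed in SQLite from Python. This is just a sketch: the table and column names (master, trans, master_id, eff_date) are lower-cased versions of the ones used above, and the dates are stored as ISO strings so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE master (id INTEGER, name TEXT);
    CREATE TABLE trans (master_id INTEGER, eff_date TEXT, action TEXT);
    INSERT INTO master VALUES
        (1, 'John'), (2, 'David'), (3, 'Simon'), (4, 'Jessica'), (5, 'Becky');
    INSERT INTO trans VALUES
        (1, '2015-08-28', 'IN'),
        (1, '2015-09-03', 'IN'),
        (2, '2015-08-17', 'IN'),
        (2, '2015-10-01', 'IN'),
        (4, '2015-09-05', 'IN');
    """
)

# The date range lives in the join condition, not the WHERE clause, so
# master rows without a September transaction come back with NULLs.
rows = conn.execute(
    """
    SELECT m.id, m.name
    FROM master m
    LEFT JOIN trans t
      ON t.master_id = m.id
     AND t.eff_date >= '2015-09-01'
     AND t.eff_date < '2015-10-01'
    WHERE t.master_id IS NULL
    ORDER BY m.id
    """
).fetchall()

print(rows)  # [(2, 'David'), (3, 'Simon'), (5, 'Becky')]
```

Moving the two date predicates into the WHERE clause would turn the outer join back into an inner join and return nothing, which is the trap the step-by-step build avoids.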
Select every master row and left join it to the transactions grouped by parent, which gives 0 or 1 joined row per master entry.
If there is no transaction record, t.parent is null, which means there are no transactions for that master entry.
If there are transaction entries, t.a holds their count, and t.m9 holds the count of those that fall in September.
If you want every master row without any transaction, filter on t.parent is null.
If you want every master row without a transaction in September, filter on t.parent is null or t.m9=0.
select m.id, m.name
, t.a, t.m9
from master_table m
left join ( select a = count(*)
, m9 = count(case when datepart(Month, t.date) = 9 then 1 end)
, t.parent
from transaction_table t
group by t.parent
) t on t.parent = m.id
where t.parent is null or t.m9=0
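The conditional-count variant can be tried the same way. The sketch below is SQLite driven from Python, so strftime('%m', ...) stands in for datepart(Month, ...) and the aliases are written with AS; the table and column names are assumptions based on the question. Be aware that testing only the month number matches September of any year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE master_table (id INTEGER, name TEXT);
    CREATE TABLE transaction_table (id INTEGER, parent INTEGER, tx_date TEXT, action TEXT);
    INSERT INTO master_table VALUES
        (1, 'John'), (2, 'David'), (3, 'Simon'), (4, 'Jessica'), (5, 'Becky');
    INSERT INTO transaction_table VALUES
        (1, 1, '2015-08-28', 'IN'),
        (2, 1, '2015-09-03', 'IN'),
        (3, 2, '2015-08-17', 'IN'),
        (4, 2, '2015-10-01', 'IN'),
        (5, 4, '2015-09-05', 'IN');
    """
)

# One aggregated row per parent: a = all transactions, m9 = September ones.
rows = conn.execute(
    """
    SELECT m.id, m.name, t.a, t.m9
    FROM master_table m
    LEFT JOIN (
        SELECT parent,
               COUNT(*) AS a,
               SUM(CASE WHEN strftime('%m', tx_date) = '09' THEN 1 ELSE 0 END) AS m9
        FROM transaction_table
        GROUP BY parent
    ) t ON t.parent = m.id
    WHERE t.parent IS NULL OR t.m9 = 0
    ORDER BY m.id
    """
).fetchall()

for row in rows:
    print(row)
```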

show recent records only

I have a requirement to show most recent records when user selects the option to view most recent records. I have 3 different tables from which I take data and display on the screen.
Below are the sample tables created.
Create table one(sealID integer,product_ser_num varchar2(20),create_time timestamp,status varchar2(10));
create table two(transID integer,formatID integer, formatStatus varchar,ctimeStamp timestamp,sealID integer);
create table three(transID integer,fieldStatus varchar,fieldValue varchar,exctype varchar);
I'm joining above 3 tables and showing the results in a single screen. I want to display the most recent records based on the timestamp.
Please find the sample data on the screen taken from 3 different tables.
ProductSerialNumber formatID formatStatus fieldStatus TimeStamp
ASD100 100 P P 2015-09-03 10:30:22
ASD100 200 p P 2015-09-03 10:30:22
ASD100 100 p P 2015-09-03 10:22:11
ASD100 200 p P 2015-09-03 10:22:11
I want to display the most recent records from the above shown table which should return first 2 rows as they are the recent records when checked with the timestamp column.
Please suggest what changes to be done to the below query to show most recent records.
SELECT transId,product_ser_num,status, to_char(timestamp, 'yyyy-mm-dd hh24:mi:ss') timestamp,
cnt
FROM (SELECT one.*,
row_number() over(ORDER BY
CASE
WHEN :orderDirection like '%asc%' THEN
CASE
WHEN :orderBy='product_ser_num' THEN product_ser_num,
WHEN :orderBy='status' THEN status
WHEN :orderBy='timestamp' THEN to_char(timestamp, 'yyyy-mm-dd hh24:mi:ss')
ELSE to_char(timestamp, 'yyyy-mm-dd hh24:mi:ss')
END
END ASC,
CASE
WHEN :orderDirection like '%desc%' THEN
CASE
WHEN :orderBy='product_ser_num' THEN product_ser_num,
WHEN :orderBy='status' THEN status
WHEN :orderBy='timestamp' THEN to_char(timestamp, 'yyyy-mm-dd hh24:mi:ss')
ELSE to_char(timestamp, 'yyyy-mm-dd hh24:mi:ss')
END
END DESC , transId ASC) line_number
FROM (select one_inner.*, COUNT(1) OVER() cnt
from (select two_tran.transaction_id,
one_res.product_serial_number productSerialNumber,
one_res.status status,from one one_res
left outer join two two_trans on two_trans.sealID = one_res.sealID
left outer join three three_flds on two_tran.transID = three_flds.transID and (three_flds.fieldStatus = 'P')
I don't think you are looking for a Top-n query as your topic title suggests.
It seems like you want to display the data in a customized order, as you have shown in the first image. You want the set of three rows to be grouped together on the basis of timestamp.
I have prepared a small test case to demonstrate the custom order of the rows:
WITH DATA(ID, num, datetime) AS(
SELECT 10, 1001, SYSDATE FROM dual UNION ALL
SELECT 10, 6009, SYSDATE FROM dual UNION ALL
SELECT 10, 3951, SYSDATE FROM dual UNION ALL
SELECT 10, 1001, SYSDATE -1 FROM dual UNION ALL
SELECT 10, 6009, SYSDATE -1 FROM dual UNION ALL
SELECT 10, 3951, SYSDATE -1 FROM dual
)
SELECT ID,
num,
TO_CHAR(DATETIME, 'yyyy-mm-dd hh24:mi:ss') TIMESTAMP
FROM
(SELECT t.*,
row_number() OVER(ORDER BY DATETIME DESC,
CASE num
WHEN 1001
THEN 1
WHEN 6009
THEN 2
WHEN 3951
THEN 3
END, num) rn
FROM DATA t
);
ID NUM TIMESTAMP
---------- ---------- -------------------
10 1001 2015-09-04 11:04:48
10 6009 2015-09-04 11:04:48
10 3951 2015-09-04 11:04:48
10 1001 2015-09-03 11:04:48
10 6009 2015-09-03 11:04:48
10 3951 2015-09-03 11:04:48
6 rows selected.
Now, you can see that for the same ID 10, the NUM values are grouped and also in a custom order.
This query seems very large and complex, so this may be oversimplifying things:
Add a clause to the end limit 3 ?
What I think you need to do is:
select
max(timestamp), engine_serial_number, formatID
from
<
joins here
>
group by engine_serial_number, formatID
This will basically give you the lines you want, but not all metadata.
Hence, you will just have to re-join all this with the main join to get the rest of the info (join on all three columns, engine serial number, formatID AND timestamp).
That should work.
Hope this helps!
It's hard to give you a precise answer, because your query is incomplete. But I'll give you the general idea, and you can tweak it into your query.
One way to accomplish what you want is by using the dense_rank() analytical function to number your rows by timestamp in descending order (You could use rank() too in this case, it doesn't actually matter). All rows with the same timestamp will be assigned the same "rank", so you can then filter by rank to only get the most recent records.
Try to adjust your query to something like this:
select ...
from (select ...,
dense_rank() over (order by timestamp desc) as timestamp_rank
from ...)
where timestamp_rank = 1
...
I suspect that with a better understanding of your data model and query, there would probably be a better solution. But based on the information provided, I think that the above should yield the results you are looking for.
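To make the dense_rank() idea concrete, here is a small sketch run in SQLite from Python (window functions need SQLite 3.25 or later). The table name records and its columns are invented for the demo and only mimic the sample output in the question; the real query would rank the joined result of the three tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (product_ser_num TEXT, format_id INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("ASD100", 100, "2015-09-03 10:30:22"),
        ("ASD100", 200, "2015-09-03 10:30:22"),
        ("ASD100", 100, "2015-09-03 10:22:11"),
        ("ASD100", 200, "2015-09-03 10:22:11"),
    ],
)

# Rows that share the newest timestamp all get rank 1, so filtering on
# the rank returns the whole most recent batch, not just a single row.
rows = conn.execute(
    """
    SELECT product_ser_num, format_id, ts
    FROM (
        SELECT r.*,
               DENSE_RANK() OVER (ORDER BY ts DESC) AS timestamp_rank
        FROM records r
    )
    WHERE timestamp_rank = 1
    ORDER BY format_id
    """
).fetchall()

for row in rows:
    print(row)
```

Adding PARTITION BY product_ser_num inside the OVER clause would give the most recent batch per product instead of one global batch.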