Oracle SQL Join columns on 2 conditions - sql

I searched the forums for my scenario but could not find anything remotely similar, so here goes my long-winded explanation. I have 3 tables: order_fact, session_fact and orderline.
create table order_fact (order_no varchar2(20), order_timestamp date, cookie_id number, session_id number);
insert into order_fact values ('69857-20210329', to_date('29-MAR-2021 10:11:58', 'DD-MON-YYYY HH24:MI:SS'), 827678, 79853421);
insert into order_fact values ('78345-20210411', to_date('11-APR-2021 18:37:07', 'DD-MON-YYYY HH24:MI:SS'), 569834, 84886798);
insert into order_fact values ('79678-20210519', to_date('19-MAY-2021 20:51:34', 'DD-MON-YYYY HH24:MI:SS'), 589623, 89556782);
insert into order_fact values ('78759-20210411', to_date('11-APR-2021 09:46:52', 'DD-MON-YYYY HH24:MI:SS'), 685213, 77549823);
create table session_fact (cookie_id number, session_id number, session_timestamp date, marketing_vendor varchar2(30) , referral_type VARCHAR2(2) );
insert into session_fact values (827678, 79853421, to_date('29-MAR-2021 09:47:36', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (827678, 79853378, to_date('28-MAR-2021 12:47:36', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (827678, 79853313, to_date('24-MAR-2021 13:23:36', 'DD-MON-YYYY HH24:MI:SS'), 'Naaptol', 'S');
insert into session_fact values (827678, 79853254, to_date('23-MAR-2021 14:39:56', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (569834, 84886798, to_date('11-APR-2021 14:41:44', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (569834, 84886735, to_date('10-APR-2021 11:03:44', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (569834, 84886687, to_date('08-APR-2021 17:26:49', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (569834, 84886659, to_date('03-APR-2021 11:03:44', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (569834, 84886497, to_date('01-APR-2021 07:59:08', 'DD-MON-YYYY HH24:MI:SS'), 'Google', 'R');
insert into session_fact values (685213, 77549823, to_date('11-APR-2021 09:07:34', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (685213, 77549786, to_date('09-APR-2021 20:51:34', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (685213, 77549589, to_date('07-APR-2021 14:11:57', 'DD-MON-YYYY HH24:MI:SS'), 'FabShopping', 'D');
insert into session_fact values (685213, 77548356, to_date('03-APR-2021 15:38:42', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (589623, 89556782, to_date('19-MAY-2021 16:46:52', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (589623, 89556512, to_date('18-MAY-2021 09:46:52', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (589623, 89556477, to_date('13-MAY-2021 18:34:29', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
insert into session_fact values (589623, 89556348, to_date('10-MAY-2021 16:13:49', 'DD-MON-YYYY HH24:MI:SS'), '-1', 'D');
create table orderline (order_no varchar2(20), ol_nbr number, ol_ref varchar2(5));
insert into orderline values ('78345-20210411', 0, '-2');
insert into orderline values ('78345-20210411', 1, 'HV3');
insert into orderline values ('78345-20210411', 2, 'HV3');
insert into orderline values ('78759-20210411', 0, '-2');
insert into orderline values ('78759-20210411', 1, 'PS5');
insert into orderline values ('78759-20210411', 2, 'PS5');
insert into orderline values ('78759-20210411', 3, 'PS5');
insert into orderline values ('79678-20210519', 0, '-2');
insert into orderline values ('79678-20210519', 1, 'NPT');
insert into orderline values ('79678-20210519', 2, 'NPT');
insert into orderline values ('69857-20210329', 0, '-2');
insert into orderline values ('69857-20210329', 1, 'HV3');
insert into orderline values ('69857-20210329', 2, 'HV3');
insert into orderline values ('69857-20210329', 3, 'HV3');
As can be seen above, the order_fact and session_fact tables are connected by cookie_id and session_id. The request is to get these columns from the 3 tables: ORDER_NO, MARKETING_VENDOR, REFERRAL_TYPE, OL_REF.
I have written the JOIN query :
select a.ORDER_NO, b.MARKETING_VENDOR,
b.REFERRAL_TYPE, c.OL_REF
FROM order_fact a
INNER JOIN session_fact b
ON (a.cookie_id = b.COOKIE_ID AND
b.session_timestamp < a.order_timestamp AND
b.session_timestamp > a.order_timestamp-7)
INNER JOIN orderline c ON
(a.ORDER_NO = c.ORDER_NO AND c.OL_NBR = 1);
Here is the sticky situation for me :
1. Get the data in the session_fact table for a cookie_id in order_fact where the session timestamp is not more than 7 days before the order_timestamp. For example, order_no 78345-20210411 was placed on 11-APR-2021 18:37:07. Using the cookie id of that order, I get all rows in session_fact back to 11-APR minus 7 days = 4-APR, so the 3rd and 1st Apr rows cannot be considered. My query already takes care of this, but I wanted to explain why I had the additional AND clauses in the 1st JOIN ON condition.
2. From the data obtained in point 1, do not consider records where REFERRAL_TYPE = 'D' and MARKETING_VENDOR = '-1'. 'S' and '-1' can be considered, and so can 'R' and '-1'. Basically any combination can be considered as long as it's NOT 'D' and '-1'. Then select the record whose timestamp is closest to the order_timestamp in table order_fact. Now this is where it gets tricky: if there are no records in the past 7 days where the combination of REFERRAL_TYPE and MARKETING_VENDOR is NOT 'D' and '-1', then join the tables order_fact and session_fact on both cookie_id and session_id and fetch the values.
3. Join tables order_fact and orderline ON ORDER_NO and OL_NBR = 1. My join query already takes care of this as well.
So my only problem is getting the JOIN between session_fact and order_fact on the 2 different conditions mentioned in point 2. Can this be done by SQL? The Tech Lead of my team asked me to write a PL/SQL block. I did that because the original request was to add MARKETING_VENDOR, REFERRAL_TYPE, OL_REF columns in order_fact table and get the values from their respective tables. I cannot help but feel this can be done by SQL using CASE. Or am I wrong? If anyone could please help me with this query I will be grateful.
Edit : Adding the result data set
Edit : Any kind soul to help me out? 🙂 I take it it's not possible in a SQL statement.

"And select the record whose timestamp is closest to the order_timestamp in table order_fact"
From your description, it looks like you just need the top 1 record by session_timestamp:
with
step1 as (
    SELECT
        a.ORDER_NO
       ,a.order_timestamp
       ,c.MARKETING_VENDOR
       ,c.REFERRAL_TYPE
       ,c.session_timestamp
    FROM order_fact a
    cross apply (
        select *
        from session_fact b
        where a.cookie_id = b.COOKIE_ID
        and (REFERRAL_TYPE,MARKETING_VENDOR) not in (('D','-1'))
        AND b.session_timestamp < a.order_timestamp
        --AND b.session_timestamp > a.order_timestamp-7
        order by b.session_timestamp desc
        fetch first 1 rows only
    ) c
)
select
    s.*
   ,o.OL_REF
FROM
    step1 s
    JOIN orderline o
        ON (s.ORDER_NO = o.ORDER_NO AND o.OL_NBR = 1)
;
Result:
ORDER_NO ORDER_TIMESTAMP MARKETING_VENDOR REFERRAL_TYPE SESSION_TIMESTAMP OL_REF
-------------- ------------------- ---------------- ------------- ------------------- ------
78345-20210411 2021-04-11 18:37:07 Google R 2021-04-01 07:59:08 HV3
78759-20210411 2021-04-11 09:46:52 FabShopping D 2021-04-07 14:11:57 PS5
69857-20210329 2021-03-29 10:11:58 Naaptol S 2021-03-24 13:23:36 HV3
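The query above leaves the 7-day filter commented out and has no fallback, so order 79678-20210519, whose sessions are all ('D', '-1'), is missing from the result. Below is a minimal sketch of one way to cover both conditions from point 2 of the question in a single statement. It assumes Oracle 12c or later (CROSS APPLY and FETCH FIRST), non-NULL REFERRAL_TYPE and MARKETING_VENDOR values, and that each order's own session row exists in session_fact. The aliases o, s and l are my own.

```sql
SELECT o.order_no,
       s.marketing_vendor,
       s.referral_type,
       l.ol_ref
FROM   order_fact o
CROSS APPLY (
         SELECT b.marketing_vendor,
                b.referral_type
         FROM   session_fact b
         WHERE  b.cookie_id = o.cookie_id
         AND    (   -- fallback candidate: the order's own session
                    b.session_id = o.session_id
                    -- preferred candidates: last 7 days, anything except ('D', '-1')
                 OR (    b.session_timestamp < o.order_timestamp
                     AND b.session_timestamp > o.order_timestamp - 7
                     AND NOT (b.referral_type = 'D' AND b.marketing_vendor = '-1')))
         -- rows that are not ('D', '-1') sort first, then the most recent session wins
         ORDER BY CASE
                    WHEN b.referral_type = 'D' AND b.marketing_vendor = '-1' THEN 1
                    ELSE 0
                  END,
                  b.session_timestamp DESC
         FETCH FIRST 1 ROW ONLY
       ) s
JOIN   orderline l
ON     l.order_no = o.order_no
AND    l.ol_nbr = 1;
```

On the sample data this should pick Naaptol/S for 69857-20210329 and FabShopping/D for 78759-20210411, while 78345-20210411 and 79678-20210519 fall back to their own ('-1', 'D') session rows. Note that an own-session row that is not ('D', '-1') competes with the 7-day candidates on timestamp alone, which may or may not be what you want.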

Related

Select users who don't have projects assigned in specific date, Oracle SQL

I have 3 tables: Projects, Employees and Assignments. Employees are assigned to different projects and projects can take some time. I need to find an employee who will be free on a specific day (won't have a project assigned).
CREATE TABLE Employee (
id NUMBER(10) NOT NULL PRIMARY KEY,
name VARCHAR2(50) NOT NULL
);
CREATE TABLE Projects (
id NUMBER(10) NOT NULL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
start_date DATE NOT NULL,
end_date DATE NOT NULL
);
INSERT INTO
Employee(id, name) VALUES (1, 'Joe Doe');
INSERT INTO
Employee(id, name) VALUES (2, 'Alex Doe');
INSERT INTO
Projects(id, name, start_date, end_date)
VALUES (1, 'AAA',
TO_DATE('2020/12/30 00:00:00', 'yyyy/mm/dd hh24:mi:ss'),
TO_DATE('2021/06/30 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO
Projects(id, name, start_date, end_date)
VALUES (2, 'BBB',
TO_DATE('2020/11/30 00:00:00', 'yyyy/mm/dd hh24:mi:ss'),
TO_DATE('2021/03/30 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO
Projects(id, name, start_date, end_date)
VALUES (3, 'CCC',
TO_DATE('2020/11/10 00:00:00', 'yyyy/mm/dd hh24:mi:ss'),
TO_DATE('2020/11/30 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
CREATE TABLE Assignments (
id NUMBER(10) NOT NULL PRIMARY KEY,
employee_id NUMBER(10) NOT NULL,
project_id NUMBER(10) NOT NULL,
CONSTRAINT employee_fk FOREIGN KEY (employee_id) REFERENCES Employee(id),
CONSTRAINT project_fk FOREIGN KEY (project_id) REFERENCES Projects(id)
);
INSERT INTO Assignments(id, project_id, employee_id)
VALUES (1, 1, 1);
INSERT INTO Assignments(id, project_id, employee_id)
VALUES (2, 1, 2);
INSERT INTO Assignments(id, project_id, employee_id)
VALUES (3, 2, 1);
INSERT INTO Assignments(id, project_id, employee_id)
VALUES (4, 3, 1);
I know how to list people and their projects, but I need help with the month and day condition. Let's say the specific day is the 24th of March. I came up with something like this, but it's not what I want. I guess I need to somehow check that the free day does not fall within a project's duration, am I right? How do I do this?
SELECT Employee.name, Projects.name
FROM Assignments
INNER JOIN Employee ON Employee.id = Assignments.employee_id
INNER JOIN Projects ON Projects.id = Assignments.project_id
WHERE EXTRACT(month FROM Projects.start_date) < 3 AND
EXTRACT(day FROM Projects.start_date) > 24
GROUP BY Employee.name, Projects.name;
http://sqlfiddle.com/#!4/f483d/4
You can use a DATE literal for the day and check with NOT EXISTS that the employee has no assignment to a project covering it. The check has to be made per employee, not per assignment; otherwise an employee is returned as soon as any one of their projects does not cover the date:
SELECT e.name FROM Employee e
WHERE NOT EXISTS (SELECT 1 FROM Assignments a
                  INNER JOIN Projects p ON p.id = a.project_id
                  WHERE a.employee_id = e.id
                  AND DATE '2020-03-24'
                      BETWEEN p.start_date AND p.end_date);
You can also use GROUP BY and HAVING as follows:
SELECT Employee.name FROM Assignments
INNER JOIN Employee ON Employee.id = Assignments.employee_id
INNER JOIN Projects ON Projects.id = Assignments.project_id
GROUP BY Employee.name
HAVING COUNT (CASE WHEN DATE '2020-03-24'
BETWEEN Projects.START_date
AND Projects.end_date
THEN 1 END) = 0
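The GROUP BY query above starts from Assignments, so an employee with no assignments at all never appears, and grouping by name alone merges employees who share a name. Here is a minimal sketch that starts from Employee instead. The year 2021 is my assumption: the question only says 24th of March, and the sample projects run from late 2020 to mid 2021.

```sql
SELECT e.id,
       e.name
FROM   Employee e
LEFT JOIN Assignments a ON a.employee_id = e.id
LEFT JOIN Projects    p ON p.id = a.project_id
GROUP  BY e.id, e.name
-- count only the projects that cover the day; unassigned employees count 0
HAVING COUNT(CASE
               WHEN DATE '2021-03-24' BETWEEN p.start_date AND p.end_date
               THEN 1
             END) = 0;
```

With the sample data both employees are assigned to project AAA, which covers 2021-03-24, so this returns no rows for that date. Since all stored dates are at midnight, BETWEEN treats the end date as a busy day too.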

Calculate difference in hours/days between different actions by a same user sql

I have a table where users place orders. I want to get the difference in dates between each user's two or more orders, do the same for all users, and then calculate the average or median.
Another issue is that the order rows are duplicated: another column in the table, order_received time, differs by about 5 seconds, so two rows are created for the same user with the same order time.
Based on your comment on my initial answer here is another worksheet.
Table DDL
create table tbl_order(
order_id integer,
account_number integer,
ordered_at date
);
Data as in other thread you pointed out
insert into tbl_order values (1, 1001, to_date('10-Sep-2019 00:00:00', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (2, 2001, to_date('01-Sep-2019 00:00:00', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (3, 2001, to_date('03-Sep-2019 00:00:00', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (4, 1001, to_date('12-Sep-2019 00:00:00', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (5, 3001, to_date('18-Sep-2019 00:00:00', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (6, 1001, to_date('20-Sep-2019 00:00:00', 'DD-MON-YYYY HH24:MI:SS'));
Query
WITH VW AS (
SELECT ACCOUNT_NUMBER,
MIN(ORDERED_AT) EARLIEST_ORDER_AT,
MAX(ORDERED_AT) LATEST_ORDER_AT,
ROUND(MAX(ORDERED_AT) - MIN(ORDERED_AT), 5) DIFF_IN_DAYS,
COUNT(*) TOTAL_ORDER_COUNT
FROM TBL_ORDER
GROUP BY ACCOUNT_NUMBER
)
SELECT ACCOUNT_NUMBER, EARLIEST_ORDER_AT, LATEST_ORDER_AT,
DIFF_IN_DAYS, ROUND( DIFF_IN_DAYS/TOTAL_ORDER_COUNT, 4) AVERAGE
FROM VW;
===========Initial answer hereafter===========
Your question is not entirely clear. For example:
Do you want the difference in dates per day (a user can make multiple orders per day), or just between their earliest and latest orders?
What do you mean by average? Is it just (latest order date - earliest order date) / total purchases? That would be hours per purchase; is it even useful?
Anyway, here is a worksheet that will hopefully set you in the right direction. It is written for Oracle and will mostly work on other databases, except for the time conversion functions used here; if your database is not Oracle, you will have to find and use the equivalent functions.
Create table
create table tbl_order(
order_id integer,
user_id integer,
item varchar2(100),
ordered_at date
);
Insert some data
insert into tbl_order values (8, 1, 'A2Z', to_date('21-Mar-2019 16:30:20', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (1, 1, 'ABC', to_date('22-Mar-2019 07:30:20', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (2, 1, 'ABC', to_date('22-Mar-2019 07:30:20', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (3, 1, 'EFGT', to_date('22-Mar-2019 09:30:30', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (4, 1, 'XYZ', to_date('22-Mar-2019 12:38:50', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (5, 1, 'ABC', to_date('22-Mar-2019 16:30:20', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (6, 2, 'ABC', to_date('22-Mar-2019 14:20:20', 'DD-MON-YYYY HH24:MI:SS'));
insert into tbl_order values (7, 2, 'A2C', to_date('22-Mar-2019 14:20:50', 'DD-MON-YYYY HH24:MI:SS'));
Get latest, earliest and total_purchase per user and an average
WITH VW AS (
SELECT USER_ID,
TO_CHAR(MIN(ORDERED_AT), 'DD-MON-YYYY HH24:MI:SS') EARLIEST_ORDER_AT,
TO_CHAR(MAX(ORDERED_AT), 'DD-MON-YYYY HH24:MI:SS')LATEST_ORDER_AT,
ROUND(MAX(ORDERED_AT) - MIN(ORDERED_AT), 5) * 24 DIFF_IN_HOURS,
COUNT(*) TOTAL_ORDER_COUNT
FROM TBL_ORDER
GROUP BY USER_ID
)
SELECT USER_ID, EARLIEST_ORDER_AT, LATEST_ORDER_AT,
DIFF_IN_HOURS, DIFF_IN_HOURS/TOTAL_ORDER_COUNT AVERAGE
FROM VW;
Get latest, earliest and total_purchase per user per day and an average
WITH VW AS (
SELECT USER_ID, TO_CHAR(ORDERED_AT, 'DD-MON-YYYY') ORDER_DATE_PART,
TO_CHAR(MIN(ORDERED_AT), 'DD-MON-YYYY HH24:MI:SS') EARLIEST_ORDER_AT,
TO_CHAR(MAX(ORDERED_AT), 'DD-MON-YYYY HH24:MI:SS')LATEST_ORDER_AT,
ROUND(MAX(ORDERED_AT) - MIN(ORDERED_AT), 5) * 24 DIFF_IN_HOURS,
COUNT(*) TOTAL_ORDER_COUNT
FROM TBL_ORDER
GROUP BY USER_ID, TO_CHAR(ORDERED_AT, 'DD-MON-YYYY')
)
SELECT USER_ID, ORDER_DATE_PART, EARLIEST_ORDER_AT, LATEST_ORDER_AT,
DIFF_IN_HOURS, DIFF_IN_HOURS/TOTAL_ORDER_COUNT AVERAGE
FROM VW;
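The queries above spread the span between the first and last order over the order count. If what is wanted is the gap between consecutive orders, with its average and median per user, LAG is one way to get it. This is a minimal sketch against the second tbl_order (user_id, ordered_at). It assumes duplicate rows share the same user_id and ordered_at, so DISTINCT collapses them. The CTE and column names are my own.

```sql
WITH distinct_orders AS (
  -- collapse duplicate rows that carry the same order time
  SELECT DISTINCT user_id, ordered_at
  FROM   tbl_order
),
gaps AS (
  -- DATE - DATE yields days in Oracle; * 24 converts to hours
  SELECT user_id,
         (ordered_at - LAG(ordered_at) OVER (PARTITION BY user_id
                                             ORDER BY ordered_at)) * 24 AS gap_hours
  FROM   distinct_orders
)
SELECT user_id,
       COUNT(*)                    AS gap_count,
       ROUND(AVG(gap_hours), 4)    AS avg_gap_hours,
       ROUND(MEDIAN(gap_hours), 4) AS median_gap_hours
FROM   gaps
WHERE  gap_hours IS NOT NULL       -- the first order of each user has no previous order
GROUP  BY user_id;
```

With the sample rows, user 1 has five distinct order times and therefore four gaps, and user 2 has one gap. A user with a single order produces no row at all.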

How can I format the output of my SQL query to display nicely?

I've defined my own database to play around and learn SQL (using SQL*Plus via SSH to remote into my school's linux machines). However, I've been having problems displaying my tables nicely, specifically this one:
CREATE TABLE customer_account
(
ACCOUNT_ID NUMBER(10) NOT NULL,
PHONE_NUMBER VARCHAR(20) NOT NULL,
EMAIL VARCHAR(100) NOT NULL,
FNAME VARCHAR(100) NOT NULL,
LNAME VARCHAR(100) NOT NULL,
ADDRESS_STREET VARCHAR(50) NOT NULL,
ADDRESS_CITY VARCHAR(20) NOT NULL,
ADDRESS_STATE VARCHAR(2) NOT NULL,
ADDRESS_ZIP VARCHAR(5) NOT NULL,
BIRTH DATE DEFAULT NULL,
PRIMARY KEY (ACCOUNT_ID)
);
INSERT INTO customer_account
VALUES (1, '9174560091', 'jhunters01#cuny.edu', 'Jack', 'Hunter', '11 67ST', 'New York', 'NY', '10024', TO_DATE('1998/01/22 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (2, '7134560012', 'L.Larson#gmail.com', 'Linda', 'Larson', '100-9 Brooklyn Hwy', 'New York', 'NY', '11225', TO_DATE('1996/12/20 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (3, '5303056927', 'sciencerules#gmail.com', 'Albert', 'Newton', '1206 Francis Mine', 'Sacramento', 'CA', '95814', TO_DATE('2001/05/17 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (4, '5106204676', 'luvlucy#yahoo.com', 'Ricky', 'Ricardo', '90 maple street west', 'Trenton', 'NJ', '08861', TO_DATE('1942/12/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (5, '3237843058', 'RalphJRiggins#dayrep.com', 'Ralph', 'Riggins', '3373 Hillhaven Drive', 'Los Angeles', 'CA', '90017', TO_DATE('1964/10/02 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (6, '2133384287', 'lavonnaRWilliams#mail.com', 'Lavonna', 'Williams', '1305 Zimmerman Lane', 'City of Commerce', 'CA', '90040', TO_DATE('1983/03/03 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (7, '6313604478', 'antoninetteRe#gmail.com', 'Antoinette', 'Reynolds', '2329 Wayback Lane', 'Smithtown', 'NY', '11787', TO_DATE('1990/10/25 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (8, '9736948587', 'Mcdonald#yahoo.com', 'Berger', 'McDonald', '3024 Spring Haven Trail', 'Mountain View', 'NJ', '07470', TO_DATE('1960/06/17 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (9, '9082074677', 'M.Lester#gmail.com', 'Moe', 'Lester', '2980 Williams Mine Road', 'Lakewood', 'NJ', '08701', TO_DATE('1988/10/05 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
INSERT INTO customer_account
VALUES (10, '8282351937', 'son#rhyta.com', 'Dam', 'Son', '98 McVaney Road', 'Canton', 'NC', '28716', TO_DATE('1957/08/28 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
Whenever I did
SQL> SELECT * FROM customer_account;
the entire table does not come out nicely no matter what I tried. I've used set linesize to no avail. This is the best I could do
Is there too much going on per column in the actual table or could I do something to fix this?
I recommend using a GUI client such as Oracle SQL Developer.
Here are 2 suggestions:
Oracle SQL Developer
http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index-097090.html
SQuirreL SQL
http://squirrel-sql.sourceforge.net/
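If you want to stay in SQL*Plus, SET LINESIZE alone is usually not enough, because each character column is displayed at its declared width (EMAIL, FNAME and LNAME are 100 characters each here). This is a sketch using COLUMN ... FORMAT to narrow the columns and shorten the headings. The widths are guesses based on the sample rows, and the terminal window has to be at least as wide as LINESIZE.

```sql
SET LINESIZE 160
SET PAGESIZE 50

-- A<n> sets the display width of a character column; longer values wrap
COLUMN account_id     FORMAT 99999 HEADING 'ID'
COLUMN phone_number   FORMAT A12   HEADING 'PHONE'
COLUMN email          FORMAT A26
COLUMN fname          FORMAT A10
COLUMN lname          FORMAT A10
COLUMN address_street FORMAT A24   HEADING 'STREET'
COLUMN address_city   FORMAT A16   HEADING 'CITY'
COLUMN address_state  FORMAT A5    HEADING 'STATE'
COLUMN address_zip    FORMAT A5    HEADING 'ZIP'

SELECT * FROM customer_account;
```

The COLUMN settings last for the session. Putting them in a script file that you run before querying saves retyping them.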

SQL loop inserts

I have a company table with a list of company names and company ids.
There is also a Value table which holds information about each company, referenced by company id.
I need to first get the list and size of the companies and then, for each of them, insert a particular feature's information into the Value table.
In other words, every company should end up with that feature in the Value table.
I tried the SQL below, which gives a compilation error, although the FOR loop works fine without the INSERT.
DECLARE
x NUMBER(2) ;
BEGIN
FOR x IN (select distinct company_num from company where comp_IN_comp='T') LOOP
INSERT INTO VALUE (PROPERTY_NUM, DATA_GROUP, NUM_UPDATES,
CREATED_DATE, CREATED_BY, LAST_UPDATED_DATE, LAST_UPDATED_BY, VALUE) VALUES
('78', x ,'0', TO_DATE('2015-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS'),
'ADMIN', TO_DATE('2015-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS'), 'ADMIN', 'N');
END LOOP;
END;
You don't need a loop for this - just use an insert-select statement:
INSERT INTO VALUE (PROPERTY_NUM,
DATA_GROUP,
NUM_UPDATES,
CREATED_DATE,
CREATED_BY,
LAST_UPDATED_DATE,
LAST_UPDATED_BY,
VALUE)
SELECT DISTINCT '78',
company_num,
'0',
TO_DATE('2015-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS'),
'ADMIN',
TO_DATE('2015-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS'),
'ADMIN',
'N'
FROM company
WHERE comp_in_comp='T'
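As for why the original block does not compile: in a cursor FOR loop the loop variable is an implicitly declared record, not a number (it also hides the x declared in the DECLARE section), so the column has to be referenced through the record. If a loop is needed for other reasons, a sketch of the corrected block could look like this, with rec being my own name for the loop record:

```sql
BEGIN
  -- rec is declared implicitly by the cursor FOR loop; no DECLARE section is needed
  FOR rec IN (SELECT DISTINCT company_num
              FROM   company
              WHERE  comp_in_comp = 'T')
  LOOP
    INSERT INTO value (property_num, data_group, num_updates,
                       created_date, created_by,
                       last_updated_date, last_updated_by, value)
    VALUES ('78', rec.company_num, '0',
            TO_DATE('2015-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS'), 'ADMIN',
            TO_DATE('2015-12-17 00:00:00', 'YYYY-MM-DD HH24:MI:SS'), 'ADMIN', 'N');
  END LOOP;
END;
/
```

The single INSERT ... SELECT is still the better choice here, because it avoids one round trip between the PL/SQL and SQL engines per company.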

Calculating Variance from different data Sources Oracle SQL

I am attempting to create a list of variances based on data that I get from two different sources. This data contains a date, a series of references and columns containing numeric counts etc.
The idea behind this is to check that the data from Data Source 1 has the same numeric count as the data from Data Source 2, and then logging the variances of each 5 minute interval.
Here I have the required code to create the table and sample data of a simplified scenario
Create Table ABP_PROFILE
( ABP_DATE Date not Null,
ABP_SOURCE_UID Number(10) not Null,
ABP_REFERENCE_1 Varchar2(30) not Null,
ABP_CHARGE Number(18,6),
ABP_COUNT Number(18)
);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:05:00', 'dd-mm-yyyy hh24:mi:ss'), 1, 'Another Reference', 757.500000, 101);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:05:00', 'dd-mm-yyyy hh24:mi:ss'), 1, 'Some Reference', 2954.000000, 211);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:05:00', 'dd-mm-yyyy hh24:mi:ss'), 2, 'Another Reference', 757.500000, 101);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:05:00', 'dd-mm-yyyy hh24:mi:ss'), 2, 'Some Reference', 2954.000000, 211);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:10:00', 'dd-mm-yyyy hh24:mi:ss'), 1, 'Another Reference', 5300.250000, 191);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:10:00', 'dd-mm-yyyy hh24:mi:ss'), 1, 'Some Reference', 9568.000000, 208);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:10:00', 'dd-mm-yyyy hh24:mi:ss'), 2, 'Another Reference', 5300.250000, 5555);
insert into ABP_PROFILE (ABP_DATE, ABP_SOURCE_UID, ABP_REFERENCE_1, ABP_CHARGE, ABP_COUNT)
values (to_date('15-06-2015 00:10:00', 'dd-mm-yyyy hh24:mi:ss'), 2, 'Some Reference', 1111.000000, 208);
Here I have created a basic SQL query to show what I want to do.
With A_DATA As (
Select ABP_DATE As A_DATE,
ABP_REFERENCE_1 As A_REFERENCE_1,
ABP_CHARGE As A_CHARGE,
ABP_COUNT As A_COUNT
From ABP_PROFILE
Where ABP_SOURCE_UID = 1
), B_DATA As (
Select ABP_DATE As B_DATE,
ABP_REFERENCE_1 As B_REFERENCE_1,
ABP_CHARGE As B_CHARGE,
ABP_COUNT As B_COUNT
From ABP_PROFILE
Where ABP_SOURCE_UID = 2
)
Select A_DATE,
A_REFERENCE_1,
B_CHARGE - A_CHARGE As ChargeDifference,
B_COUNT - A_COUNT As CountDifference
From A_DATA,
B_DATA
Where A_DATE = B_DATE
And A_REFERENCE_1 = B_REFERENCE_1
;
This does a join based on the date and reference from the two data sources. I need a much more versatile solution that also shows variances when data from one side is missing. Yes, a full outer join can be used for this, but I want to explore other options.
I've been looking into Analytic Functions and I am sure there is one out there which can do what I want it to do. I'm wondering if any Oracle SQL Experts have any ideas that can help here.
FYI I am running 11gR2 Enterprise
The solution with FULL JOIN seems more readable, but if you are looking for an alternative, here is one using the functions lag() and lead():
with data as (
select abp_date dt, abp_source_uid id, abp_reference_1 ref,
abp_charge charge, abp_count cnt,
lag(abp_source_uid) over (partition by abp_date, abp_reference_1
order by abp_source_uid) lgid,
lead(abp_charge) over (partition by abp_date, abp_reference_1
order by abp_source_uid) ldcharge,
lead( abp_count) over (partition by abp_date, abp_reference_1
order by abp_source_uid) ldcnt
from abp_profile a)
select dt, ref,
case when id = 1 then nvl(ldcharge, 0) - charge else charge end chrg_diff,
case when id = 1 then nvl(ldcnt, 0) - cnt else cnt end cnt_diff
from data
where id = 1 or (id = 2 and lgid is null);
... and here is your query transformed into a full join version, which I wrote to compare the results:
With A_DATA As (
Select ABP_DATE As A_DATE,
ABP_REFERENCE_1 As A_REFERENCE_1,
ABP_CHARGE As A_CHARGE,
ABP_COUNT As A_COUNT
From ABP_PROFILE
Where ABP_SOURCE_UID = 1
), B_DATA As (
Select ABP_DATE As B_DATE,
ABP_REFERENCE_1 As B_REFERENCE_1,
ABP_CHARGE As B_CHARGE,
ABP_COUNT As B_COUNT
From ABP_PROFILE
Where ABP_SOURCE_UID = 2
)
Select nvl(A_DATE, b_date) dt,
nvl(A_REFERENCE_1, b_reference_1) ref,
nvl(B_CHARGE, 0) - nvl(A_CHARGE, 0) As Chrg_Diff,
nvl(B_COUNT, 0) - nvl(A_COUNT, 0) As Cnt_Diff
From A_DATA
full join B_DATA on A_DATE = B_DATE and A_REFERENCE_1 = B_REFERENCE_1;
SQLFiddle
Both queries give the same results. In the examples I added two rows to show how to deal with missing data.
Here I used nvl(..., 0), but of course you can leave the nulls or add column(s) flagging such a situation.
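A third option, which is not from the answer above, is conditional aggregation: group by date and reference, and pick each source's values with CASE. This sketch assumes only sources 1 and 2 matter and treats a missing side as 0, like the nvl(..., 0) above. It uses only plain aggregates, so it should run on 11gR2. The output column names are my own.

```sql
SELECT abp_date        AS dt,
       abp_reference_1 AS reference_1,
       -- source 2 minus source 1; a side with no rows sums to NULL, hence NVL
       NVL(SUM(CASE WHEN abp_source_uid = 2 THEN abp_charge END), 0)
     - NVL(SUM(CASE WHEN abp_source_uid = 1 THEN abp_charge END), 0) AS chrg_diff,
       NVL(SUM(CASE WHEN abp_source_uid = 2 THEN abp_count END), 0)
     - NVL(SUM(CASE WHEN abp_source_uid = 1 THEN abp_count END), 0)  AS cnt_diff
FROM   abp_profile
WHERE  abp_source_uid IN (1, 2)
GROUP  BY abp_date, abp_reference_1
ORDER  BY abp_date, abp_reference_1;
```

On the eight sample rows from the question, both 00:05 rows come out as 0/0. At 00:10, 'Another Reference' shows a count difference of 5364 and 'Some Reference' a charge difference of -8457. If one source can have several rows for the same date and reference, they are summed before the comparison.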