Selecting the first instance of a vendor, part combination - sql

I am trying to create an indicator for if a particular transaction was the first time a part was purchased from a particular vendor.
I have a dataset that looks like this:
| transaction_id | vendor_id | part_id | trans_date |
|:--------------:|:---------:|:-------:|:-----------------:|
| 9Bx*2Pc' | a | 873 | 10/12/2018 |
| 1Po.4Ot, | a | 473 | 4/22/2016 |
| 9Sk"7Kv/ | b | 123 | 7/23/2016 |
| 2Lz&7Hu& | a | 873 | 12/20/2017 |
| 8Lz)5Is# | b | 743 | 10/22/2016 |
| 5Sc'6Jl/ | a | 113 | 10/6/2016 |
| 0Ra&8Hb& | a | 653 | 10/4/2017 |
| 4Wc-8Of* | c | 333 | 8/3/2017 |
| 8Vv+9Yo/ | c | 333 | 12/7/2016 |
| 6Qh!1Ha- | c | 333 | 3/28/2017 |
| 2Ol%4Rs# | c | 333 | 5/2/2017 |
| 1Gg#8Cm% | c | 333 | 11/15/2016 |
| 0Lw(6Pv/ | d | 873 | 8/13/2017 |
| 1Gy/7Zw, | a | 443 | 10/12/2018 |
| 2Gz,4Gp. | b | 103 | 1/5/2018 |
| 5Dj)6Wc+ | a | 893 | 12/17/2016 |
| 5Hl-8Ds! | a | 903 | 12/8/2017 |
| 8Ws$3Vy* | b | 873 | 1/13/2018 |
What I am looking to do is determine if the transaction_id was the first time (sorted by trans_date), that the part_id was purchased from a vendor_id. I would imagine the ideal output to look like this:
| transaction_id | vendor_id | part_id | trans_date | first_time |
|:--------------:|:---------:|:-------:|:-----------------:|:----------:|
| 9Bx*2Pc' | a | 873 | 10/12/2018 | N |
| 1Po.4Ot, | a | 473 | 4/22/2016 | Y |
| 9Sk"7Kv/ | b | 123 | 7/23/2016 | Y |
| 2Lz&7Hu& | a | 873 | 12/20/2017 | Y |
| 8Lz)5Is# | b | 743 | 10/22/2016 | Y |
| 5Sc'6Jl/ | a | 113 | 10/6/2016 | Y |
| 0Ra&8Hb& | a | 653 | 10/4/2017 | Y |
| 4Wc-8Of* | c | 333 | 8/3/2017 | N |
| 8Vv+9Yo/ | c | 333 | 12/7/2016 | N |
| 6Qh!1Ha- | c | 333 | 3/28/2017 | N |
| 2Ol%4Rs# | c | 333 | 5/2/2017 | N |
| 1Gg#8Cm% | c | 333 | 11/15/2016 | Y |
| 0Lw(6Pv/ | d | 873 | 8/13/2017 | Y |
| 1Gy/7Zw, | a | 443 | 10/12/2018 | Y |
| 2Gz,4Gp. | b | 103 | 1/5/2018 | Y |
| 5Dj)6Wc+ | a | 893 | 12/17/2016 | Y |
| 5Hl-8Ds! | a | 903 | 12/8/2017 | Y |
| 8Ws$3Vy* | b | 873 | 1/13/2018 | Y |
So far, I have tried (which was influenced by this post):
WITH
first_instance AS (
SELECT
tbl_trans.*,
ROW_NUMBER() OVER (PARTITION BY vendor_id||part_id ORDER BY trans_date) AS row_nums
FROM
tbl_trans
)
SELECT
x.*,
CASE WHEN y.row_nums = 1 THEN 'Y' ELSE 'N' END AS first_time_indicator
FROM
tbl_trans x
LEFT JOIN first_instance y
But I am met with:
ORA-00905: missing keyword
I have created a SQL FIDDLE with this data and the query thus far for testing. How can I determine the if a transaction was a first time purchase for a part/vendor combination?

Use window functions:
select t.*,
(case when row_number() over (partition by vendor_id, part_id order by trans_date) = 1
then 'Y' else 'N'
end) as first_time
from tbl_trans t;
You don't need a join.

Apart from row_number, there are multiple ways of achieving the desired result using analytical function as follows.
You can use first_value analytical function as follows:
Select t.*,
Case
when first_value(trans_date)
over (partition by vendor_id, part_id order by trans_date) = trans_date
then 'Y'
else 'N'
end as first_time
From your_table t;
The same way, you can also use min as follows:
Select t.*,
Case
when min(trans_date)
over (partition by vendor_id, part_id) = trans_date
then 'Y'
else 'N'
end as first_time
From your_table t;

Related

PostgresSql:Comparing two tables and obtaining its result and compare it with third table

TABLE 2 : trip_delivery_sales_lines
+-------+---------------------+------------+----------+------------+-------------+--------+--+
| Sl no | Order_date | Partner_id | Route_id | Product_id | Product qty | amount | |
+-------+---------------------+------------+----------+------------+-------------+--------+--+
| 1 | 2020-08-01 04:25:35 | 34567 | 152 | 432 | 2 | 100 | |
| 2 | 2021-09-11 02:25:35 | 34572 | 130 | 312 | 4 | 150 | |
| 3 | 2020-05-10 04:25:35 | 34567 | 152 | 432 | 3 | 123 | |
| 4 | 2021-02-16 01:10:35 | 34572 | 130 | 432 | 5 | 123 | |
| 5 | 2020-02-19 01:10:35 | 34567 | 152 | 432 | 2 | 600 | |
| 6 | 2021-03-20 01:10:35 | 34569 | 152 | 123 | 1 | 123 | |
| 7 | 2021-04-23 01:10:35 | 34570 | 152 | 432 | 4 | 200 | |
| 8 | 2021-07-08 01:10:35 | 34567 | 152 | 432 | 3 | 32 | |
| 9 | 2019-06-28 01:10:35 | 34570 | 152 | 432 | 2 | 100 | |
| 10 | 2018-11-14 01:10:35 | 34570 | 152 | 432 | 5 | 20 | |
| | | | | | | | |
+-------+---------------------+------------+----------+------------+-------------+--------+--+
From Table 2 : we had to find partners in route=152 and find the sum of product_qty of the last 2 sale [can be selected by desc order_date]
. We can find its result in table 3.
34567 – Serial number [ 1,8]
34570 – Serial number [ 7,9]
34569 – Serial number [6]
TABLE 3 : RESULT OBTAINED FROM TABLE 1,2
+------------+-------+
| Partner_id | count |
+------------+-------+
| 34567 | 5 |
| 34569 | 1 |
| 34570 | 6 |
| | |
+------------+-------+
From table 4 we want to find the above partner_ids leaf count
TABLE 4 :coupon_leaf
+------------+-------+
| Partner_id | Leaf |
+------------+-------+
| 34567 | XYZ1 |
| 34569 | XYZ2 |
| 34569 | DDHC |
| 34567 | DVDV |
| 34570 | DVFDV |
| 34576 | FVFV |
| 34567 | FVV |
| | |
+------------+-------+
From that we can find result as:
34567 – 3
34569-2
34570 -1
TABLE 5: result obtained from TABLE 4
+------------+-------+
| Partner_id | count |
+------------+-------+
| 34567 | 3 |
| 34569 | 2 |
| 34570 | 1 |
| | |
+------------+-------+
Now we want compare table 3 and 5
If partner_id count [table 3] > partner_id count [table 4]
Print partner_id
I want a single query to do all these operation
distinct partner_id can be found by: fROM TABLE 1
SELECT DISTINCT partner_id
FROM trip_delivery_sales ts
WHERE ts.route_id='152'
GROUP BY ts.partner_id
This answers the original version of the problem.
You seem to want to compare totals after aggregating tables 2 and 3. I don't know what table1 is for. It doesn't seem to do anything.
So:
select *
from (select partner_id, sum(quantity) as sum_quantity
from (select tdsl.*,
row_number() over (partition by t2.partner_id order by order_date) as seqnum
from trip_delivery_sales_lines tdsl
) tdsl
where seqnum <= 2
group by tdsl.partner_id
) tdsl left join
(select cl.partner_id, count(*) as leaf_cnt
from coupon_leaf cl
group by cl.partner_id
) cl
on cl.partner_id = tdsl.partner_id
where leaf_cnt is null or sum_quantity > leaf_cnt

Finding MAX date aggregated by order - Oracle SQL

I have a data orders that looks like this:
| Order | Step | Step Complete Date |
|:-----:|:----:|:------------------:|
| A | 1 | 11/1/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/3/2019 |
| | 5 | 11/3/2019 |
| | 6 | 11/5/2019 |
| | 7 | 11/5/2019 |
| B | 1 | 12/1/2019 |
| | 2 | 12/2/2019 |
| | 3 | |
| C | 1 | 10/21/2019 |
| | 2 | 10/23/2019 |
| | 3 | 10/25/2019 |
| | 4 | 10/25/2019 |
| | 5 | 10/25/2019 |
| | 6 | |
| | 7 | 10/27/2019 |
| | 8 | 10/28/2019 |
| | 9 | 10/29/2019 |
| | 10 | 10/30/2019 |
| D | 1 | 10/30/2019 |
| | 2 | 11/1/2019 |
| | 3 | 11/1/2019 |
| | 4 | 11/2/2019 |
| | 5 | 11/2/2019 |
What I need to accomplish is the following:
For each order, assign the 'Order_Completion_Date' field as the most recent 'Step_Complete_Date'. If ANY 'Step_Complete_Date' is NULL, then the value for 'Order_Completion_Date' should be NULL.
I set up a SQL FIDDLE with this data and my attempt, below:
SELECT
OrderNum,
MAX(Step_Complete_Date)
FROM
OrderNums
WHERE
Step_Complete_Date IS NOT NULL
GROUP BY
OrderNum
This is yielding:
ORDERNUM MAX(STEP_COMPLETE_DATE)
D 11/2/2019
A 11/5/2019
B 12/2/2019
C 10/30/2019
How can I achieve:
| OrderNum | Order_Completed_Date |
|:--------:|:--------------------:|
| A | 11/5/2019 |
| B | NULL |
| C | NULL |
| D | 11/2/2019 |
Aggregate function with KEEP can handle this
select ordernum,
max(step_complete_date)
keep (DENSE_RANK FIRST ORDER BY step_complete_date desc nulls first) res
FROM
OrderNums
GROUP BY
OrderNum
You can use a CASE expression to first count if there are any NULL values and if not then find the maximum value:
Query 1:
SELECT OrderNum,
CASE
WHEN COUNT( CASE WHEN Step_Complete_Date IS NULL THEN 1 END ) > 0
THEN NULL
ELSE MAX(Step_Complete_Date)
END AS Order_Completion_Date
FROM OrderNums
GROUP BY OrderNum
Results:
| ORDERNUM | ORDER_COMPLETION_DATE |
|----------|-----------------------|
| D | 11/2/2019 |
| A | 11/5/2019 |
| B | (null) |
| C | (null) |
First, you are representing dates as varchars in mm/dd/yyyy format (at least in fiddle). With max function it can produce incorrect result, try for example order with dates '11/10/2019' and '11/2/2019'.
Second, the most simple solution is IMHO to use fallback date for nulls and get null back when fallback date wins:
SELECT
OrderNum,
NULLIF(MAX(NVL(Step_Complete_Date,'~')),'~')
FROM
OrderNums
GROUP BY
OrderNum
(Example is still for varchars since tilde is greater than any digit. For dates, you could use 9999-12-31, for instance.)

SQL subcategory total is not properly placed at the end of every parent category

I am having trouble in SQl query,The query result should be like this
+------------+------------+-----+------+-------+--+--+--+
| District | Tehsil | yes | no | Total | | | |
+------------+------------+-----+------+-------+--+--+--+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304 | | | |
| ABBOTTABAD | HAVELIAN | 112 | 2276 | 2388 | | | |
| ABBOTTABAD | Overall | 489 | 8203 | 8692 | | | |
| CHARSADDA | CHARSADDA | 289 | 3762 | 4051 | | | |
| CHARSADDA | SHABQADAR | 121 | 1376 | 1497 | | | |
| CHARSADDA | TANGI | 94 | 1703 | 1797 | | | |
| CHARSADDA | Overall | 504 | 6841 | 7345 | | | |
+------------+------------+-----+------+-------+--+--+--+
The overall total should be should be shown at the end of every parent category but now it is showing like this
+------------+------------+-----+------+-------+--+--+--+
| District | Tehsil | yes | no | Total | | | |
+------------+------------+-----+------+-------+--+--+--+
| ABBOTTABAD | ABBOTTABAD | 377 | 5927 | 6304 | | | |
| ABBOTTABAD | HAVELIAN | 112 | 2276 | 2388 | | | |
| ABBOTTABAD | Overall | 489 | 8203 | 8692 | | | |
| CHARSADDA | CHARSADDA | 289 | 3762 | 4051 | | | |
| CHARSADDA | Overall | 504 | 6841 | 7345 | | | |
| CHARSADDA | SHABQADAR | 121 | 1376 | 1497 | | | |
| CHARSADDA | TANGI | 94 | 1703 | 1797 | | | |
+------------+------------+-----+------+-------+--+--+--+
My query is sorting second column with respect to first column although order by query is applied on my first column. This is my query
select District as 'District', tName as 'tehsil',[1] as 'yes',[0] as 'no',ISNULL([1]+[0], 0) as "Total" from
(
select d.Name as 'District',
case when grouping (t.Name)=1 then 'Overall' else t.Name end as tName,
BoundaryWallAvailable,
count(*) as total from School s
INNER JOIN SchoolIndicator i ON (i.refSchoolID=s.SchoolID)
INNER JOIN Tehsil t ON (t.TehsilID=s.refTehsilID)
INNER JOIN district d ON (d.DistrictID=t.refDistrictID)
group by
GROUPING sets((d.Name, BoundaryWallAvailable), (d.Name,t.Name, BoundaryWallAvailable))
) B
PIVOT
(
max(total) for BoundaryWallAvailable in ([1],[0])
) as Pvt
order by District
P.S: BoundaryWall is one column through pivoting i am breaking it into Yes and No Column

How to check dates condition from one table to another in SQL

Which way we can use to check and compare the dates from one table to another.
Table : inc
+--------+---------+-----------+-----------+-------------+
| inc_id | cust_id | item_id | serv_time | inc_date |
+--------+---------+-----------+-----------+-------------+
| 1 | john | HP | 40 | 17-Apr-2015 |
| 2 | John | HP | 60 | 10-Jan-2016 |
| 3 | Nick | Cisco | 120 | 11-Jan-2016 |
| 4 | samanta | EMC | 180 | 12-Jan-2016 |
| 5 | Kerlee | Oracle | 40 | 13-Jan-2016 |
| 6 | Amir | Microsoft | 300 | 14-Jan-2016 |
| 7 | John | HP | 120 | 15-Jan-2016 |
| 8 | samanta | EMC | 20 | 16-Jan-2016 |
| 9 | Kerlee | Oracle | 10 | 2-Feb-2017 |
+--------+---------+-----------+-----------+-------------+
Table: Contract:
+-----------+---------+----------+------------+
| item_id | con_id | Start | End |
+-----------+---------+----------+------------+
| Dell | DE2015 | 1/1/2015 | 12/31/2015 |
| HP | HP2015 | 1/1/2015 | 12/31/2015 |
| Cisco | CIS2016 | 1/1/2016 | 12/31/2016 |
| EMC | EMC2016 | 1/1/2016 | 12/31/2016 |
| HP | HP2016 | 1/1/2016 | 12/31/2016 |
| Oracle | OR2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2016 | 1/1/2016 | 12/31/2016 |
| Microsoft | MS2017 | 1/1/2017 | 12/31/2017 |
+-----------+---------+----------+------------+
Result:
+-------+---------+---------+--------------+
| Calls | Cust_id | Con_id | Tot_Ser_Time |
+-------+---------+---------+--------------+
| 2 | John | HP2016 | 180 |
| 2 | samanta | EMC2016 | 200 |
| 1 | Nick | CIS2016 | 120 |
| 1 | Amir | MS2016 | 300 |
| 1 | Oracle | OR2016 | 40 |
+-------+---------+---------+--------------+
MY Query:
select count(inc_id) as Calls, inc.cust_id, contract.con_id,
sum(inc.serv_time) as tot_serv_time
from inc inner join contract on inc.item_id = contract.item_id
where inc.inc_date between '2016-01-01' and '2016-12-31'
group by inc.cust_id, contract.con_id
The result from inc table with filter between 1-jan-2016 to 31-Dec-2016 with
count of inc_id based on the items and its contract start and end dates .
If I understand correctly your problem, this query will return the desidered result:
select
count(*) as Calls,
inc.cust_id,
contract.con_id,
sum(inc.serv_time) as tot_serv_time
from
inc inner join contract
on inc.item_id = contract.item_id
and inc.inc_date between contract.start and contract.end
where
inc.inc_date between '2016-01-01' and '2016-12-31'
group by
inc.cust_id,
contract.con_id
the question is a little vague so you might need some adjustments to this query.
select
Calls = count(*)
, Cust = i.Cust_id
, Contract = c.con_id
, Serv_Time = sum(Serv_Time)
from inc as i
inner join contract as c
on i.item_id = c.item_id
and i.inc_date >= c.[start]
and i.inc_date <= c.[end]
where c.[start]>='20160101'
group by i.Cust_id, c.con_id
order by i.Cust_Id, c.con_id
returns:
+-------+---------+----------+-----------+
| Calls | Cust | Contract | Serv_Time |
+-------+---------+----------+-----------+
| 1 | Amir | MS2016 | 300 |
| 2 | John | HP2016 | 180 |
| 1 | Kerlee | OR2016 | 40 |
| 1 | Nick | CIS2016 | 120 |
| 2 | samanta | EMC2016 | 200 |
+-------+---------+----------+-----------+
test setup: http://rextester.com/WSYDL43321
create table inc(
inc_id int
, cust_id varchar(16)
, item_id varchar(16)
, serv_time int
, inc_date date
);
insert into inc values
(1,'john','HP', 40 ,'17-Apr-2015')
,(2,'John','HP', 60 ,'10-Jan-2016')
,(3,'Nick','Cisco', 120 ,'11-Jan-2016')
,(4,'samanta','EMC', 180 ,'12-Jan-2016')
,(5,'Kerlee','Oracle', 40 ,'13-Jan-2016')
,(6,'Amir','Microsoft', 300 ,'14-Jan-2016')
,(7,'John','HP', 120 ,'15-Jan-2016')
,(8,'samanta','EMC', 20 ,'16-Jan-2016')
,(9,'Kerlee','Oracle', 10 ,'02-Feb-2017');
create table contract (
item_id varchar(16)
, con_id varchar(16)
, [Start] date
, [End] date
);
insert into contract values
('Dell','DE2015','20150101','20151231')
,('HP','HP2015','20150101','20151231')
,('Cisco','CIS2016','20160101','20161231')
,('EMC','EMC2016','20160101','20161231')
,('HP','HP2016','20160101','20161231')
,('Oracle','OR2016','20160101','20161231')
,('Microsoft','MS2016','20160101','20161231')
,('Microsoft','MS2017','20170101','20171231');

Considering values from one table as column header in another

I have a base table where I need to calculate the difference between two dates based on the type of the entry.
tblA
+----------+------------+---------------+--------------+
| TypeCode | Log_Date | Complete_Date | Pending_Date |
+----------+------------+---------------+--------------+
| 1 | 18/04/2016 | 19/04/2016 | |
| 2 | 10/04/2016 | 18/04/2016 | 15/04/2016 |
| 3 | 12/04/2016 | 19/04/2016 | |
| 4 | 15/04/2016 | 17/04/2016 | 16/04/2016 |
| 5 | 16/04/2016 | 21/04/2016 | |
| 1 | 19/04/2016 | 20/04/2016 | |
| 2 | 20/03/2016 | 31/03/2015 | |
| 3 | 25/03/2016 | 28/03/2016 | |
| 4 | 26/03/2016 | 27/03/2016 | |
| 5 | 27/03/2016 | 30/03/2016 | |
+----------+------------+---------------+--------------+
I have another look up table which has the column names to be considered based on the TypeCode.
tblB
+----------+----------+---------------+
| TypeCode | DateCol1 | DateCol2 |
+----------+----------+---------------+
| 1 | Log_Date | Complete_Date |
| 2 | Log_Date | Pending_Date |
| 3 | Log_Date | Complete_Date |
| 4 | Log_Date | Pending_Date |
| 5 | Log_Date | Complete_Date |
+----------+----------+---------------+
I am doing a simple DATEDIFF between two dates for my calculation. However I want to lookup which columns to consider for this calculation from tblB and apply it on tblA based on the TypeCode.
Resulting table:
For example: When the TypeCode is 2 or 4 then the calculation should be DATEDIFF(d, Log_Date, Pending_Date), otherwise DATEDIFF(d, Log_Date, Complete_Date)
+----------+------------+---------------+--------------+----------+
| TypeCode | Log_Date | Complete_Date | Pending_Date | Cal_Days |
+----------+------------+---------------+--------------+----------+
| 1 | 18/04/2016 | 19/04/2016 | | 1 |
| 2 | 10/04/2016 | 18/04/2016 | 15/04/2016 | 5 |
| 3 | 12/04/2016 | 19/04/2016 | | 7 |
| 4 | 15/04/2016 | 17/04/2016 | 16/04/2016 | 1 |
| 5 | 16/04/2016 | 21/04/2016 | | 5 |
| 1 | 19/04/2016 | 20/04/2016 | | 1 |
| 2 | 20/03/2016 | 31/03/2015 | | |
| 3 | 25/03/2016 | 28/03/2016 | | 3 |
| 4 | 26/03/2016 | 27/03/2016 | | |
| 5 | 27/03/2016 | 30/03/2016 | | 3 |
+----------+------------+---------------+--------------+----------+
Any help would be appreciated. Thanks.
Use JOIN with CASE expression:
SELECT
a.*,
Cal_Days =
DATEDIFF(
DAY,
CASE
WHEN b.DateCol1 = 'Log_Date' THEN a.Log_Date
WHEN b.DateCol1 = 'Complete_Date' THEN a.Complete_Date
ELSE a.Pending_Date
END,
CASE
WHEN b.DateCol2 = 'Log_Date' THEN a.Log_Date
WHEN b.DateCol2 = 'Complete_Date' THEN a.Complete_Date
ELSE a.Pending_Date
END
)
FROM TblA a
INNER JOIN TblB b
ON b.TypeCode = a.TypeCode