IF Else or Case Function for SQL select problem - sql

Hi, I would like to write a select expression using CASE or IF/ELSE. It seems simple from a logic perspective, but I can't get it to work. I am joining two tables: the first is the customer record table, filtered on a date called min_del_date, and the second is the model scoring table, with BIN and update_date columns.
There are two logics I want to apply:
1. Pick the model score from the month before min_del_date.
2. If the model score from the month before delivery is greater than 50 (BIN > 50), pick the model score from the same month as min_del_date instead.
My code for the 1st logic is below:
with cust as (
select
distinct cust_no, max(del_date) as del_date, min(del_date) as min_del_date, (EXTRACT(YEAR FROM min(del_date)) -1900)*12 + EXTRACT(MONTH FROM min(del_date)) AS upd_seq
from customer.cust_history
group by 1
)
,model as (
select party_id, update_date, upd_seq, bin, var_data8, var_data2
from
(
select
party_id, update_date, bin, var_data8, var_data2,
(EXTRACT(YEAR FROM UPDATE_DATE) -1900)*12 + EXTRACT(MONTH FROM UPDATE_DATE) AS upd_seq,
dense_Rank() over (partition by (EXTRACT(YEAR FROM UPDATE_DATE) -1900)*12 + EXTRACT(MONTH FROM UPDATE_DATE) order by update_date desc) as rank1
from
(
select party_id,update_date, bin, var_data8, var_data2
from model.rpm_model
group by party_id,update_date, bin, var_data8, var_data2
) model
)model_final
where rank1 = 1
)
-- Add model scores
-- 1st logic Picking the model score that was the month before delivery date
select *
from
(
select cust.cust_no, cust.del_date, cust.min_del_date, model.upd_seq, model.bin
from cust
left join model
on cust.cust_no = model.party_id
and cust.upd_seq = model.upd_seq + 1
)a
Now I am struggling to add the 2nd logic to the same query. Any assistance would be appreciated.
cust table
cust_no min_del_date upd_seq
123 2021-01-11 1453
234 2020-06-29 1446
456 2020-07-20 1447
model table
party_id update_date upd_seq BIN
123 2020-11-30 1451 22
123 2020-12-25 1452 54
123 2020-01-11 1453 14
234 2020-05-23 1445 76
234 2020-06-18 1446 48
234 2020-07-23 1447 12
456 2020-06-18 1446 23
456 2020-07-23 1447 39
456 2020-08-21 1448 21
desired results
cust_no min_del_date model.upd_seq update_date BIN
123 2021-01-11 1453 2020-01-11 14
234 2020-06-29 1446 2020-06-18 48
456 2020-07-20 1446 2020-06-18 23
Update
I managed to find the solution myself; thanks to everyone who looked at this question. The solution is below:
select a.cust_no, a.del_date, a.min_del_date, b.update_date, b.upd_seq, b.bin
from
(
select cust.cust_no, cust.del_date, cust.min_del_date,
CASE WHEN model.BIN <=50 THEN model.upd_seq WHEN BIN > 50 THEN model.upd_seq +1 ELSE NULL END as upd_seq
from cust
inner join model
on cust.cust_no = model.party_id
and cust.upd_seq = model.upd_seq + 1
)a
inner join model b
on a.cust_no = b.party_id
and a.upd_seq = b.upd_seq
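To check the CASE logic against the sample rows, here is a minimal sketch that replays the query above in an in-memory SQLite database through Python's sqlite3 module. The cust and model tables are created directly from the sample data (standing in for the CTEs), and using SQLite is an assumption made only so the example runs anywhere; the original database is not stated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cust  (cust_no INT, min_del_date TEXT, upd_seq INT);
CREATE TABLE model (party_id INT, update_date TEXT, upd_seq INT, bin INT);
-- rows copied from the sample tables in the question
INSERT INTO cust VALUES
  (123, '2021-01-11', 1453), (234, '2020-06-29', 1446), (456, '2020-07-20', 1447);
INSERT INTO model VALUES
  (123, '2020-11-30', 1451, 22), (123, '2020-12-25', 1452, 54), (123, '2020-01-11', 1453, 14),
  (234, '2020-05-23', 1445, 76), (234, '2020-06-18', 1446, 48), (234, '2020-07-23', 1447, 12),
  (456, '2020-06-18', 1446, 23), (456, '2020-07-23', 1447, 39), (456, '2020-08-21', 1448, 21);
""")

query = """
SELECT a.cust_no, b.upd_seq, b.bin
FROM (
    -- step 1: look at the previous month's score and decide which month to keep
    SELECT cust.cust_no,
           CASE WHEN model.bin <= 50 THEN model.upd_seq
                WHEN model.bin >  50 THEN model.upd_seq + 1
           END AS upd_seq
    FROM cust
    JOIN model
      ON cust.cust_no = model.party_id
     AND cust.upd_seq = model.upd_seq + 1
) a
-- step 2: fetch the score for the month chosen in step 1
JOIN model b
  ON a.cust_no = b.party_id
 AND a.upd_seq = b.upd_seq
ORDER BY a.cust_no
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The three rows returned match the upd_seq and BIN columns of the desired results table.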

Related

Retain all the records grouped by an ID comparing only the values with similar strings within the group that has the minimum value

Given this data:
Bolt_Table:
PID          UNIQ ID  GROUP_ID  Distance
PID_24_2225  14       13        1141
PID_5_1444E  3214     13        652
PID_5_14454  3152     13        802
PID_24_2225  15       14        1141
PID_5_14454  3151     14        802
PID_5_1444E  3213     14        652
PID_26_21FC  536      2300      597
PID_5_13388  4121     2300      620
PID_5_13382  4169     2300      802
This is the desired result:
PID          UNIQ_ID  GROUP_ID  Distance
PID_5_1444E  3214     13        652
PID_5_1444E  3213     14        652
PID_5_13388  4121     2300      620
Explanation:
1st record (GROUP_ID = 13): find the PIDs that look alike, here PID_5_1444E and PID_5_14454, and compare their distances (652 and 802). Since 652 is the smaller one, the row with PID_5_1444E is retained; this is record 1 of the desired table.
What would the SQL query for this be (Microsoft Access)?
I tried using LIKE, MID(String,1,4), GROUP BY and HAVING, but nothing seems to work. How should I write the query?
The closest I got was by hard-coding a single GROUP_ID; I would like to do this for each GROUP_ID:
SELECT TOP 1 PERCENT PID, UNIQ_ID, GROUP_ID, Distance
FROM
(
SELECT
a.PID, a.UNIQ_ID, a.GROUP_ID, ID, a.Distance,
(select count(PID) as counter from Bolt_Table where GROUP_ID = a.GROUP_ID and LEFT(PID, 9) = LEFT(a.PID, 9)) as counter from Bolt_Table a WHERE a.GROUP_ID = 13
)
where counter > 1
order by Distance
SELECT b.pid, b.[uniq id], b.group_id, b.distance
FROM bolt_table AS b
INNER JOIN (
SELECT group_id, min(distance) AS mindist
FROM bolt_table
GROUP BY group_id
) AS a
ON b.group_id = a.group_id
AND b.distance = a.mindist
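Note that the join above takes the minimum over the whole group, so for GROUP_ID 2300 it would return the PID_26_21FC row (597) rather than the desired PID_5_13388 row (620). A possible refinement, sketched here in SQLite through Python's sqlite3 rather than Access, first keeps only rows whose 9-character PID prefix is shared by another row in the same group, then takes the per-group minimum. The prefix length of 9 comes from the asker's own LEFT(PID, 9) attempt; the uniq_id column name and the use of substr() in place of LEFT() are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bolt_table (pid TEXT, uniq_id INT, group_id INT, distance INT);
-- rows copied from Bolt_Table in the question
INSERT INTO bolt_table VALUES
  ('PID_24_2225', 14, 13, 1141), ('PID_5_1444E', 3214, 13, 652), ('PID_5_14454', 3152, 13, 802),
  ('PID_24_2225', 15, 14, 1141), ('PID_5_14454', 3151, 14, 802), ('PID_5_1444E', 3213, 14, 652),
  ('PID_26_21FC', 536, 2300, 597), ('PID_5_13388', 4121, 2300, 620), ('PID_5_13382', 4169, 2300, 802);
""")

query = """
WITH similar AS (
    -- keep rows whose PID prefix occurs more than once within the same group
    SELECT b.*
    FROM bolt_table b
    WHERE (SELECT COUNT(*)
           FROM bolt_table c
           WHERE c.group_id = b.group_id
             AND substr(c.pid, 1, 9) = substr(b.pid, 1, 9)) > 1
)
SELECT s.pid, s.uniq_id, s.group_id, s.distance
FROM similar s
JOIN (SELECT group_id, MIN(distance) AS mindist
      FROM similar
      GROUP BY group_id) m
  ON s.group_id = m.group_id
 AND s.distance = m.mindist
ORDER BY s.group_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In Access the same idea would use LEFT(PID, 9) and a saved query or nested subquery in place of the WITH clause.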

SQL: Select only users who are new in 2021

If we have a table as follows:
User_ID Order_date Order_ID
1 2020-02-02 23
2 2021-03-03 45
1 2021-02-02 13
3 2019-05-23 34
3 2021-01-31 56
How do I select only the users whose first order is in the year 2021 (in this case, only User 2)?
You can use aggregation:
select user_id
from t
group by user_id
having min(order_date) >= '2021-01-01';
This checks that each user's earliest order date is on or after January 1, 2021.
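As a quick check of the aggregation, the sketch below runs it on the sample rows in an in-memory SQLite database via Python's sqlite3. The table name t follows the answer; the extra upper bound on the year is an assumption added here so that users whose first order falls after 2021 are not counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (user_id INT, order_date TEXT, order_id INT);
-- rows copied from the question; ISO dates compare correctly as text
INSERT INTO t VALUES
  (1, '2020-02-02', 23), (2, '2021-03-03', 45), (1, '2021-02-02', 13),
  (3, '2019-05-23', 34), (3, '2021-01-31', 56);
""")

query = """
SELECT user_id
FROM t
GROUP BY user_id
HAVING MIN(order_date) >= '2021-01-01'
   AND MIN(order_date) <  '2022-01-01'  -- assumption: exclude first orders after 2021
"""
new_users = [row[0] for row in conn.execute(query)]
print(new_users)
```

Users 1 and 3 drop out because their earliest orders are in 2020 and 2019 respectively.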

Query that'll identify returning active users in span of week

Write a query that'll identify returning active users. A returning active user is a user that has made a second purchase within 7 days of their first purchase.
id u_id item created_at revenue
1 109 milk 3/3/2020 0:00 123
2 139 biscuit 3/18/2020 0:00 421
3 120 milk 3/18/2020 0:00 176
4 108 banana 3/18/2020 0:00 862
5 130 milk 3/28/2020 0:00 333
6 103 bread 3/29/2020 0:00 862
7 122 banana 3/7/2020 0:00 952
8 125 bread 3/13/2020 0:00 317
9 139 bread 3/23/2020 0:00 929
10 141 banana 3/17/2020 0:00 812
11 116 bread 3/31/2020 0:00 226
12 128 bread 3/4/2020 0:00 112
13 146 biscuit 3/4/2020 0:00 362
14 119 banana 3/28/2020 0:00 127
You can use window functions to get the earliest creation date and then look for other records within one week:
select distinct u_id
from (select t.*,
min(created_at) over (partition by u_id) as min_created_at
from t
) t
where created_at > min_created_at and
created_at < min_created_at + interval 7 day;
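The query above uses MySQL's interval syntax. A minimal sketch of the same window-function idea, run on a subset of the sample rows in SQLite through Python's sqlite3, is shown below. The julianday() date arithmetic, the ISO date format, and the inclusive 7-day bound are assumptions of this sketch, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INT, u_id INT, item TEXT, created_at TEXT, revenue INT);
-- a subset of the question's rows, with dates rewritten as ISO strings
INSERT INTO t VALUES
  (1, 109, 'milk',    '2020-03-03', 123),
  (2, 139, 'biscuit', '2020-03-18', 421),
  (3, 120, 'milk',    '2020-03-18', 176),
  (5, 130, 'milk',    '2020-03-28', 333),
  (9, 139, 'bread',   '2020-03-23', 929);
""")

query = """
SELECT DISTINCT u_id
FROM (
    SELECT u_id, created_at,
           MIN(created_at) OVER (PARTITION BY u_id) AS first_at
    FROM t
) AS firsts
WHERE created_at > first_at                               -- a later purchase
  AND julianday(created_at) - julianday(first_at) <= 7    -- within 7 days of the first
"""
returning = [row[0] for row in conn.execute(query)]
print(returning)
```

Only user 139 qualifies: the second purchase on 3/23 comes 5 days after the first on 3/18.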
If you only compare against the customer's first purchase and look for a second visit in the next 7 days, you will miss a third purchase that follows closely after the second visit.
Instead, check globally for any two purchases within a 7-day interval, like this:
create table t(id integer, u_id integer, item varchar(100),created_at date,revenue float);
insert into t
values (1, 109, "milk" , STR_TO_DATE("3/3/2020", '%m/%d/%Y') , 123)
, (2,139,"biscuit",STR_TO_DATE("3/18/2020", '%m/%d/%Y'),421)
, (3,120,"milk",STR_TO_DATE("3/18/2020", '%m/%d/%Y'),176)
, (4,108,"banana",STR_TO_DATE("3/18/2020", '%m/%d/%Y'),862)
, (5,130,"milk",STR_TO_DATE("3/28/2020", '%m/%d/%Y'),333)
, (6,103,"bread",STR_TO_DATE("3/29/2020", '%m/%d/%Y'),862)
, (7,122,"banana",STR_TO_DATE("3/7/2020", '%m/%d/%Y'),952)
, (8,125,"bread",STR_TO_DATE("3/13/2020", '%m/%d/%Y'),317)
, (9,139,"bread",STR_TO_DATE("3/23/2020", '%m/%d/%Y'),929)
, (10,141,"banana",STR_TO_DATE("3/17/2020", '%m/%d/%Y'),812)
, (11,116,"bread",STR_TO_DATE("3/31/2020", '%m/%d/%Y'),226)
, (12,128,"bread",STR_TO_DATE("3/4/2020", '%m/%d/%Y'),112)
, (13,146,"biscuit",STR_TO_DATE("3/4/2020", '%m/%d/%Y'),362)
, (14,119,"banana",STR_TO_DATE("3/28/2020", '%m/%d/%Y'),127);
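-- datediff() returns the gap in days; subtracting DATE values directly in MySQL compares them as numbers
select * from t as t1
where exists (select * from t as t2
where t1.u_id = t2.u_id
and datediff(t1.created_at, t2.created_at) > 0
and datediff(t1.created_at, t2.created_at) <= 7);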
The example above can also be solved using window functions. You need to cast to date in the where condition because created_at in the given table is a datetime. This is the tricky part.
create table public.purchase_history(
id int,
userid int,
item varchar,
created_at datetime,
revenue int);
Insert into public.purchase_history values
(1, 109, 'milk' ,'03/03/2020' , 123)
, (2,139,'biscuit','03/18/2020',421)
, (3,120,'milk','03/18/2020',176)
, (4,108,'banana','03/18/2020',862)
, (5,130,'milk','03/28/2020',333)
, (6,103,'bread','03/29/2020',862)
, (7,122,'banana','03/07/2020',952)
, (8,125,'bread','03/13/2020',317)
, (9,139,'bread','03/23/2020',929)
, (10,141,'banana','03/17/2020',812)
, (11,116,'bread','03/31/2020',226)
, (12,128,'bread','03/04/2020',112)
, (13,146,'biscuit','03/04/2020',362)
, (14,119,'banana','03/28/2020',127)
, (15,120,'milk','03/28/2020',186);
select distinct userid
FROM
(select
id,
userid,
created_at,
coalesce(lead(created_at)over(partition by userid order by created_at),'9999-12-31') as next_purchase
from
public.purchase_history order by userid)repeated
where (repeated.next_purchase::date-repeated.created_at::date)<=7;
Let's say the table's name is amazon_transactions:
SELECT distinct a.user_id FROM amazon_transactions a
JOIN amazon_transactions b
ON a.user_id = b.user_id
WHERE a.id <>b.id
AND a.created_at <= b.created_at
AND b.created_at <= a.created_at+7
ORDER BY a.user_id

How to duplicate rows in SQL based on the difference between date columns and divide an aggregated column across the duplicated rows?

I have a table with records of fuel consumption. The important columns are CONSUME_DATE_FROM and CONSUM_DATE_TO.
I want to calculate the average fuel consumption per car on a monthly basis, but some rows span more than one month. For example, some rows cover three months, and the total gas in litres for the whole period is aggregated in a single row.
I need to find the records whose CONSUME_DATE_FROM and CONSUM_DATE_TO span more than one month, duplicate them (in the current table or a second table) once per month covered, and divide the total gas in litres among the resulting rows.
I have this table with the following data:
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 600
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 400
4 103 2018-03-29 2018-05-29 200
5 104 2018-02-05 2018-02-09 50
The expected output table should be as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 200
3 102 2018-12-31 2019-01-01 200
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
5 104 2018-02-05 2018-02-09 50
Or as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER DATE_RELOAD_GAS
1 100 2018-10-25 2018-12-01 200 2018-10-01
1 100 2018-10-25 2018-12-01 200 2018-11-01
1 100 2018-10-25 2018-12-01 200 2018-12-01
2 101 2018-07-19 2018-07-24 100 2018-07-01
3 102 2018-12-31 2019-01-01 200 2018-12-01
3 102 2018-12-31 2019-01-01 200 2019-01-01
4 103 2018-03-29 2018-05-29 66.66 2018-03-01
4 103 2018-03-29 2018-05-29 66.66 2018-04-01
4 103 2018-03-29 2018-05-29 66.66 2018-05-01
5 104 2018-02-05 2018-02-09 50 2018-02-01
Can someone please help me out with this query?
I'm using oracle database
Your business rule treats the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO as whole calendar months. So you expect the difference between 2018-10-25 and 2018-12-01 to be three months, whereas the difference in days actually equates to about 1.2 months. So we can't use simple date arithmetic to get your desired output; we need to do some additional massaging of the dates.
The query below implements your desired logic by deriving the first day of the month for CONSUME_DATE_FROM and the last day of the month for CONSUM_DATE_TO, then using ceil() to round the difference up to the nearest whole number of months.
This is calculated in a subquery, which the main query uses with the old connect by level trick to repeat each record diff times:
with cte as (
select f.*
, ceil(months_between(last_day(CONSUM_DATE_TO)
, trunc(CONSUME_DATE_FROM,'mm'))) as diff
from fuel_consumption f
)
select cte.id
, cte.VehicleId
, cte.CONSUME_DATE_FROM
, cte.CONSUM_DATE_TO
, cte.GAS_PER_LITER/cte.diff as GAS_PER_LITER
, add_months(trunc(cte.CONSUME_DATE_FROM, 'mm'), level-1) as DATE_RELOAD_GAS
from cte
connect by level <= cte.diff
and prior cte.id = cte.id
and prior sys_guid() is not null
;
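The month-splitting rule itself can be illustrated outside Oracle. The sketch below is plain Python, not the query above: it counts the calendar months touched by the date range, then emits one row per month with the first day of that month and an equal share of the total. The function name split_by_month is hypothetical and used only for illustration.

```python
from datetime import date


def split_by_month(date_from, date_to, total):
    """Return one (first_of_month, share) pair per calendar month touched."""
    # number of calendar months from date_from's month to date_to's month, inclusive
    n_months = (date_to.year - date_from.year) * 12 + (date_to.month - date_from.month) + 1
    share = total / n_months
    rows = []
    for i in range(n_months):
        m = date_from.month - 1 + i  # zero-based month offset, may run past December
        rows.append((date(date_from.year + m // 12, m % 12 + 1, 1), share))
    return rows


# ID 1 from the question: 2018-10-25 to 2018-12-01, 600 in total
rows_id1 = split_by_month(date(2018, 10, 25), date(2018, 12, 1), 600)
# ID 3 from the question crosses a year boundary
rows_id3 = split_by_month(date(2018, 12, 31), date(2019, 1, 1), 400)
for row in rows_id1 + rows_id3:
    print(row)
```

For whole calendar months this count agrees with ceil(months_between(last_day(to), trunc(from, 'mm'))) in the query above.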
"what about if add a additional column "DATE_RELOAD_GAS" that display difference date for similar rows"
From your posted sample it seems like DATE_RELOAD_GAS is the first day of the month for each month bounded by CONSUME_DATE_FROM and CONSUM_DATE_TO. I have amended my solution to implement this rule.
By using the connect by level structure and treating to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') as the month, I was able to resolve it as below:
select ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO,
trunc(GAS_PER_LITER/max(rn) over (partition by ID order by ID),2) as GAS_PER_LITER,
'01.'||substr(myMonth,5,2)||'.'||substr(myMonth,1,4) as DATE_RELOAD_GAS
from
(
with consumption( ID, VehicleId, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER ) as
(
select 1,100,date'2018-10-25',date'2018-12-01',600 from dual union all
select 2,101,date'2018-07-19',date'2018-07-24',100 from dual union all
select 3,102,date'2018-12-31',date'2019-01-01',400 from dual union all
select 4,103,date'2018-03-29',date'2018-05-29',200 from dual union all
select 5,104,date'2018-02-05',date'2018-02-09', 50 from dual
)
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID >= 2
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
union all
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID = 1
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
) q
group by ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER, rn
order by ID, myMonth;
I ran into an interesting issue: if I write the join condition in the subquery as c.ID >= 1, the query hangs for a very long time, so I split it into two parts with union all, as c.ID >= 2 and c.ID = 1.
Rextester Demo

List the last two records for each id

Good afternoon!
I'm having trouble listing the last two records for each idmicro.
Example:
idhist idmicro idother room unit Dtmov
100 1102 0 8 coa 2009-10-23 10:40:00.000
101 1102 0 1 coa 2009-10-28 10:40:00.000
102 1102 0 2 dib 2008-10-24 10:40:00.000
103 1201 0 6 diraf 2008-10-23 10:40:00.000
104 1201 0 7 diraf 2009-10-21 10:40:00.000
105 1201 0 4 dimel 2008-10-22 10:40:00.000
The result should look like this:
idhist idmicro idother room unit Dtmov
101 1102 0 1 coa 2009-10-28 10:40:00.000
102 1102 0 2 dib 2008-10-24 10:40:00.000
103 1201 0 6 diraf 2008-10-22 10:40:00.000
104 1201 0 7 diraf 2009-10-21 10:40:00.000
I'm just starting to delve into SQL and am having trouble finding a solution.
Sorry, and thank you.
EDIT: I am using SQL Server, and I have not written a query yet.
Yes, it is based on the date and time.
You can do the same thing with a nested SELECT statement.
SELECT *
FROM (
SELECT row_number() OVER (
PARTITION BY idmicro ORDER BY idhist DESC
) AS ind
,*
FROM data
) AS initialResultSet
WHERE initialResultSet.ind < 3
Here is a sample SQLFiddle with how this query works.
WITH etc
AS (
SELECT *
,row_number() OVER (
PARTITION BY idmicro ORDER BY idhist
) AS r
,count(*) OVER (
PARTITION BY idmicro
) AS c
FROM data
)
SELECT *
FROM etc
WHERE r > c - 2
Use row_number and over partition
SELECT *
FROM (
SELECT *, row_number() OVER (PARTITION BY idmicro ORDER BY idhist desc) AS rownum
FROM data
) AS initialResultSet
WHERE initialResultSet.rownum<=2
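Since the asker confirmed that "last" is based on the date and time, a variant that orders by Dtmov instead of idhist may be closer to what is wanted. The sketch below runs that variant on the sample rows in SQLite through Python's sqlite3; ordering by Dtmov is an assumption drawn from the asker's comment, and the table name data follows the answers above. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (idhist INT, idmicro INT, idother INT, room INT, unit TEXT, dtmov TEXT);
-- rows copied from the question
INSERT INTO data VALUES
  (100, 1102, 0, 8, 'coa',   '2009-10-23 10:40:00.000'),
  (101, 1102, 0, 1, 'coa',   '2009-10-28 10:40:00.000'),
  (102, 1102, 0, 2, 'dib',   '2008-10-24 10:40:00.000'),
  (103, 1201, 0, 6, 'diraf', '2008-10-23 10:40:00.000'),
  (104, 1201, 0, 7, 'diraf', '2009-10-21 10:40:00.000'),
  (105, 1201, 0, 4, 'dimel', '2008-10-22 10:40:00.000');
""")

query = """
SELECT idhist, idmicro
FROM (
    SELECT idhist, idmicro, dtmov,
           -- number each idmicro's rows from newest to oldest
           row_number() OVER (PARTITION BY idmicro ORDER BY dtmov DESC) AS rn
    FROM data
) AS ranked
WHERE rn <= 2
ORDER BY idmicro, dtmov DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this ordering the rows kept are 101 and 100 for idmicro 1102, and 104 and 103 for idmicro 1201, which differs from ordering by idhist.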