I am running an Oracle database and have the two tables below.
#account
+--------+------------+----------+
| acc_id | date       | acc_type |
+--------+------------+----------+
| 1      | 11-07-2018 | customer |
| 2      | 01-11-2018 | customer |
| 3      | 02-09-2018 | employee |
| 4      | 01-09-2018 | customer |
+--------+------------+----------+
#credit_request
+-----------+------------+-------------+--------+---------------+
| credit_id | date       | credit_type | acc_id | credit_amount |
+-----------+------------+-------------+--------+---------------+
| 1112      | 01-08-2018 | failed      | 1      | 2200          |
| 1214      | 02-12-2018 | success     | 2      | 1500          |
| 1312      | 03-11-2018 | success     | 4      | 8750          |
| 1468      | 01-12-2018 | failed      | 2      | 3500          |
+-----------+------------+-------------+--------+---------------+
For each customer, I want the following:
-the last successful credit_request
-the sum of credit_amount over all failed credit_requests
Here is one method:
-- note: DATE is a reserved word in Oracle, so the date column needs quoting or renaming
select a.acc_id, acr.num_fails, acr.failed_amount,
acr.num_successes / nullif(acr.num_fails, 0) as ratio, -- seems weird. Why not just the failure rate?
last_cr.credit_id, last_cr.date, last_cr.credit_amount
from account a left join
(select acc_id,
sum(case when credit_type = 'failed' then 1 else 0 end) as num_fails,
sum(case when credit_type = 'failed' then credit_amount else 0 end) as failed_amount,
sum(case when credit_type = 'success' then 1 else 0 end) as num_successes,
max(case when credit_type = 'success' then date end) as max_success_date
from credit_request
group by acc_id
) acr
on acr.acc_id = a.acc_id left join
credit_request last_cr
on last_cr.acc_id = acr.acc_id and last_cr.date = acr.max_success_date
and last_cr.credit_type = 'success';
The following query should do the trick.
SELECT
acc_id,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_id END) as last_successfull_credit_id,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN cdate END) as last_successfull_credit_date,
MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_amount END) as last_successfull_credit_amount,
SUM(CASE WHEN credit_type = 'failed' THEN credit_amount ELSE 0 END) total_amount_of_failed_credit,
SUM(CASE WHEN credit_type = 'failed' THEN 1 ELSE 0 END) / COUNT(*) ratio_success_request
FROM (
SELECT
a.acc_id,
a.cdate adate,
a.acc_type,
c.credit_id,
c.cdate,
c.credit_type,
c.credit_amount,
ROW_NUMBER() OVER(PARTITION BY c.acc_id, c.credit_type ORDER BY c.cdate DESC) rn
FROM
account a
LEFT JOIN credit_request c ON c.acc_id = a.acc_id
) x
GROUP BY acc_id
ORDER BY acc_id
The subquery assigns a sequence number to each record, within groups of accounts and credit types, using ROW_NUMBER(). The outer query uses conditional aggregation to compute the values you asked for.
This Db Fiddle demo with your test data returns:
ACC_ID | LAST_SUCCESSFULL_CREDIT_ID | LAST_SUCCESSFULL_CREDIT_DATE | LAST_SUCCESSFULL_CREDIT_AMOUNT | TOTAL_AMOUNT_OF_FAILED_CREDIT | RATIO_SUCCESS_REQUEST
-----: | -------------------------: | :--------------------------- | -----------------------------: | ----------------------------: | --------------------:
1 | null | null | null | 2200 | 1
2 | 1214 | 02-DEC-18 | 1500 | 3500 | .5
3 | null | null | null | 0 | 0
4 | 1312 | 03-NOV-18 | 8750 | 0 | 0
This might be what you are looking for... Since you did not show expected results, this might not be 100% accurate, feel free to adapt this.
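As a quick check of the ROW_NUMBER() plus conditional aggregation approach, here is a minimal sketch that runs the same shape of query from Python against an in-memory SQLite database instead of Oracle. The use of SQLite (3.25 or newer, for window functions), the ISO-formatted date strings, and the cdate column name are assumptions made for illustration; the ratio column is left out.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (acc_id INTEGER, cdate TEXT, acc_type TEXT);
CREATE TABLE credit_request (
    credit_id INTEGER, cdate TEXT, credit_type TEXT,
    acc_id INTEGER, credit_amount INTEGER
);
-- Dates are stored as ISO strings so that text ordering matches date ordering.
INSERT INTO account VALUES
    (1, '2018-07-11', 'customer'), (2, '2018-11-01', 'customer'),
    (3, '2018-09-02', 'employee'), (4, '2018-09-01', 'customer');
INSERT INTO credit_request VALUES
    (1112, '2018-08-01', 'failed', 1, 2200),
    (1214, '2018-12-02', 'success', 2, 1500),
    (1312, '2018-11-03', 'success', 4, 8750),
    (1468, '2018-12-01', 'failed', 2, 3500);
""")

query = """
SELECT acc_id,
       MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_id END),
       MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN cdate END),
       MAX(CASE WHEN credit_type = 'success' AND rn = 1 THEN credit_amount END),
       SUM(CASE WHEN credit_type = 'failed' THEN credit_amount ELSE 0 END)
FROM (
    -- rn = 1 marks the newest request per account and credit type
    SELECT a.acc_id, c.credit_id, c.cdate, c.credit_type, c.credit_amount,
           ROW_NUMBER() OVER (PARTITION BY c.acc_id, c.credit_type
                              ORDER BY c.cdate DESC) AS rn
    FROM account a
    LEFT JOIN credit_request c ON c.acc_id = a.acc_id
) x
GROUP BY acc_id
ORDER BY acc_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Accounts without any request still appear, because of the LEFT JOIN: their "last success" columns are NULL and their failed total is 0.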
I think the query below is easier to understand and implement. To keep the CASE expressions from piling up, you can also move them into a WITH clause and reference it, which shortens the query.
SELECT a.acc_id,
MAX(CASE WHEN c.credit_type = 'success' THEN c.date END) AS last_success_date,
SUM(CASE WHEN c.credit_type = 'failed' THEN c.credit_amount ELSE 0 END) AS failed_amount,
SUM(CASE WHEN c.credit_type = 'success' THEN 1 ELSE 0 END) /
NULLIF(SUM(CASE WHEN c.credit_type = 'failed' THEN 1 ELSE 0 END), 0) AS success_to_fail_ratio
FROM account a LEFT JOIN
credit_request c ON
a.acc_id = c.acc_id
WHERE a.acc_type = 'customer'
GROUP BY a.acc_id
Related
I have a table with the following columns.
For the above table, I need the count per date for each cd, depending on the combination of ind values; the expected output is the table below.
For row 2 of the output table, id 45 has one ok and one No, so the count should be 1 for the date 2020-02-24, because that is the date of the ok.
Similarly, for row 4, id 30 has not ok and No, so for this combination we need to take not ok at its max date.
I need to develop this in Hive. Could someone suggest how to implement it? I tried writing separate subqueries, but performance suffers because of the many joins (I write an individual query for each combination and join the results).
Update for another scenario:
I have the data below in the table.
With weights assigned, it looks as follows:
First case: when grouping by date, for 1/1/2020 I get a count of 1, which is correct.
Second case: for 1/2/2020, we are supposed to get a count of only 1 for notOk, but it gives 2 (as it is also picking up the first-case row for 1/2/2020 for cd 1).
Another scenario:
When I have multiple records for the same cd on different dates, the results are not right.
I have 2 ok rows for cd 1 on different dates. We need to count only one of them and drop the other ok (either 1/1/2020 or 1/2/2020), since it is for the same cd.
I really appreciate your help.
Thanks,
Babu
If you need the ind counts for the latest date of a given id, the query looks like the following:
select dt,count(case when ind='ok' then 1 end) as ok_count,
count(case when ind='No' then 1 end) as No_count,
count(case when ind='not ok' then 1 end) as not_ok_count
from mytable_test where dt in (select max(dt) from mytable_test group by cd) group by dt;
However, if there are truth-table conditions such as, for a given id:
-choose ok if it has both ok and No,
-choose not ok if it has both No and not ok,
then the following query will work, although it may not be very efficient.
select dt,count(case when ind='ok' then 1 end) as ok_count,
count(case when ind='No' then 1 end) as No_count,
count(case when ind='not ok' then 1 end) as not_ok_count
from mytable_test where dt in (
select max(a.dt) from mytable_test a,(select cd, (case when ind_to_consider=0 then 'No' when ind_to_consider=1 then 'ok' when ind_to_consider=2 then 'not ok' end ) as decoeded_ind from (select cd,max(ind_wt) as ind_to_consider from (select dt,cd,ind,(case when ind='ok' then 1 when ind='No' then 0 when ind='not ok' then 2 end ) as ind_wt from mytable_test) wt group by cd) decoder) k where a.cd=k.cd and a.ind=k.decoeded_ind group by a.cd,a.ind) group by dt;
Explanation
First, assign a weight to each value of ind.
In this case, based on your example, I am assuming No has the lowest weight, ok a medium weight, and not ok the highest.
select dt,cd,ind,(case when ind='ok' then 1 when ind='No' then 0 when ind='not ok' then 2 end ) as ind_wt from mytable_test
+-------------+-----+---------+---------+--+
| dt | cd | ind | ind_wt |
+-------------+-----+---------+---------+--+
| 2020-08-24 | 10 | ok | 1 |
| 2020-02-21 | 45 | No | 0 |
| 2020-02-24 | 45 | ok | 1 |
| 2020-08-25 | 20 | No | 0 |
| 2020-10-09 | 30 | not ok | 2 |
| 2020-10-13 | 30 | not ok | 2 |
| 2020-10-21 | 30 | No | 0 |
| 2020-10-23 | 30 | No | 0 |
| 2020-09-14 | 12 | No | 0 |
+-------------+-----+---------+---------+--+
Next, get the max weight for each cd (the wt block):
select cd,max(ind_wt) as ind_to_consider from (select dt,cd,ind,(case when ind='ok' then 1 when ind='No' then 0 when ind='not ok' then 2 end ) as ind_wt from mytable_test) wt group by cd
+-----+------------------+--+
| cd | ind_to_consider |
+-----+------------------+--+
| 10 | 1 |
| 12 | 0 |
| 20 | 0 |
| 30 | 2 |
| 45 | 1 |
+-----+------------------+--+
Now decode the weight back to the indicator, so that you can get the latest date for each cd and its max indicator.
select max(a.dt) from mytable_test a,(select cd, (case when ind_to_consider=0 then 'No' when ind_to_consider=1 then 'ok' when ind_to_consider=2 then 'not ok' end ) as decoeded_ind from (select cd,max(ind_wt) as ind_to_consider from (select dt,cd,ind,(case when ind='ok' then 1 when ind='No' then 0 when ind='not ok' then 2 end ) as ind_wt from mytable_test) wt group by cd) decoder) k where a.cd=k.cd and a.ind=k.decoeded_ind group by a.cd,a.ind
+-------------+--+
| _c0 |
+-------------+--+
| 2020-08-24 |
| 2020-09-14 |
| 2020-08-25 |
| 2020-10-13 |
| 2020-02-24 |
+-------------+--+
Then use these dates to get the pivot:
select dt,count(case when ind='ok' then 1 end) as ok_count,
count(case when ind='No' then 1 end) as No_count,
count(case when ind='not ok' then 1 end) as not_ok_count
from mytable_test where dt in (
select max(a.dt) from mytable_test a,(select cd, (case when ind_to_consider=0 then 'No' when ind_to_consider=1 then 'ok' when ind_to_consider=2 then 'not ok' end ) as decoeded_ind from (select cd,max(ind_wt) as ind_to_consider from (select dt,cd,ind,(case when ind='ok' then 1 when ind='No' then 0 when ind='not ok' then 2 end ) as ind_wt from mytable_test) wt group by cd) decoder) k where a.cd=k.cd and a.ind=k.decoeded_ind group by a.cd,a.ind) group by dt;
+-------------+-----------+-----------+---------------+--+
| dt | ok_count | no_count | not_ok_count |
+-------------+-----------+-----------+---------------+--+
| 2020-02-24 | 1 | 0 | 0 |
| 2020-08-24 | 1 | 0 | 0 |
| 2020-08-25 | 0 | 1 | 0 |
| 2020-09-14 | 0 | 1 | 0 |
| 2020-10-13 | 0 | 0 | 1 |
+-------------+-----------+-----------+---------------+--+
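The weighting steps above can also be written as one query with common table expressions. The sketch below runs it from Python on an in-memory SQLite database rather than Hive (an assumption for illustration), using the nine sample rows from the explanation; the CTE names wt, best, and picked are hypothetical. It counts from the picked (cd, dt) pairs instead of filtering with dt in (...), so a row of another cd that happens to share a date is not counted.

```python
import sqlite3

# SQLite stands in for Hive here; dates are ISO strings so MAX() orders them correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable_test (dt TEXT, cd INTEGER, ind TEXT)")
conn.executemany(
    "INSERT INTO mytable_test VALUES (?, ?, ?)",
    [
        ("2020-08-24", 10, "ok"), ("2020-02-21", 45, "No"),
        ("2020-02-24", 45, "ok"), ("2020-08-25", 20, "No"),
        ("2020-10-09", 30, "not ok"), ("2020-10-13", 30, "not ok"),
        ("2020-10-21", 30, "No"), ("2020-10-23", 30, "No"),
        ("2020-09-14", 12, "No"),
    ],
)

query = """
WITH wt AS (          -- step 1: weight each indicator
    SELECT dt, cd, ind,
           CASE ind WHEN 'No' THEN 0 WHEN 'ok' THEN 1 WHEN 'not ok' THEN 2 END AS ind_wt
    FROM mytable_test
),
best AS (             -- step 2: highest weight per cd
    SELECT cd, MAX(ind_wt) AS ind_wt FROM wt GROUP BY cd
),
picked AS (           -- step 3: latest date of the winning indicator per cd
    SELECT wt.cd, wt.ind, MAX(wt.dt) AS dt
    FROM wt
    JOIN best ON best.cd = wt.cd AND best.ind_wt = wt.ind_wt
    GROUP BY wt.cd, wt.ind
)
SELECT dt,            -- step 4: pivot the picked rows by date
       SUM(CASE WHEN ind = 'ok' THEN 1 ELSE 0 END) AS ok_count,
       SUM(CASE WHEN ind = 'No' THEN 1 ELSE 0 END) AS no_count,
       SUM(CASE WHEN ind = 'not ok' THEN 1 ELSE 0 END) AS not_ok_count
FROM picked
GROUP BY dt
ORDER BY dt
"""
pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

On the sample rows this gives the same five dates and counts as the result table above.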
Use conditional aggregation:
select date,
sum(case when ind = 'ok' then 1 else 0 end) ok_count,
sum(case when ind = 'No' then 1 else 0 end) no_count,
sum(case when ind = 'not ok' then 1 else 0 end) not_ok_count
from mytable
group by date
Or, if you want to take into account only the latest row per id, we can pre-filter with row_number() first:
select date,
sum(case when ind = 'ok' then 1 else 0 end) ok_count,
sum(case when ind = 'No' then 1 else 0 end) no_count,
sum(case when ind = 'not ok' then 1 else 0 end) not_ok_count
from (
select t.*, row_number() over(partition by id order by date desc) rn
from mytable t
) t
where rn = 1
group by date
I'm trying to pull a list of all open orders from a customer where the same customer has used both one of our special payment types as well as one of our standard options. Specifically, those that have open orders with either prepay or 10n30 and at least one normal order. So, in the example tables below I would want to return order_id 1, 3, and 4.
cust_orders order_info
+----------+-----------+ +----------+-------------+----------+
| cust_id | order_id | | order_id | pay_type | status |
+----------+-----------+ +----------+-------------+----------+
| 1 | 1 | | 1 | standard | open |
| 1 | 2 | | 2 | prepay | closed |
| 1 | 3 | | 3 | prepay | open |
| 1 | 4 | | 4 | 10n30 | open |
| 2 | 5 | | 5 | standard | deferred |
| 2 | 6 | | 6 | prepay | open |
| 3 | 7 | | 7 | N/A | deferred |
| 4 | 8 | | 8 | prepay | open |
| 4 | 9 | | 9 | standard | closed |
| 4 | 10 | | 10 | prepay | open |
+----------+-----------+ +----------+-------------+----------+
I have the following query
SELECT *
FROM cust_orders AS co
LEFT JOIN ( SELECT *
FROM order_info
WHERE pay_type IN('prepay', '10n30')
AND status = 'open' ) AS o1 on o1.order_id = co.order_id
LEFT JOIN ( SELECT *
FROM order_info
WHERE pay_type NOT IN('prepay', '10n30')
AND status = 'open' ) AS o2 on o2.order_id = co.order_id
WHERE o1.order_id IS NOT NULL
AND o2.order_id IS NOT NULL
ORDER BY co.order_id DESC;
but it runs very slowly and returns a bunch of duplicates.
I've looked at Search for orders that have two products, one with specific reference, other with specific description and SELECT all orders with more than one item and check all items status but neither seems to be what I need.
EDIT: Thanks to gjvdkamp for the basis to the code below; I modified their solution to use in a larger query and everything runs fine now.
SELECT co.*, [other fields]
FROM cust_order AS co
LEFT JOIN [other tables]
WHERE cust_id IN ( SELECT co.cust_id
FROM cust_order AS co
LEFT JOIN order_info o on o.order_id = co.order_id
WHERE o.status = 'open'
GROUP BY co.cust_id
HAVING SUM(CASE WHEN o.pay_type IN ('prepay', '10n30') THEN 1 ELSE 0 END) > 0
AND SUM(CASE WHEN (o.pay_type NOT IN ('prepay', '10n30') OR o.pay_type IS NULL) THEN 1 ELSE 0 END) > 0)
A 'handrolled pivot' would work well here:
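select co.cust_id,
sum(case when o.pay_type = 'standard' then 1 else 0 end) as NormalCount,
sum(case when o.pay_type in ('prepay', '10n30') then 1 else 0 end) as OtherCount
from cust_orders co
inner join order_info o on co.order_id = o.order_id
where o.status = 'open'
and o.pay_type in ('standard','prepay','10n30')
group by co.cust_id
-- repeat the aggregates: not every database accepts select aliases in HAVING
having sum(case when o.pay_type = 'standard' then 1 else 0 end) > 0 and
sum(case when o.pay_type in ('prepay', '10n30') then 1 else 0 end) > 0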
This requires only a single join (a merge join if your indexes are right) followed by an aggregation. I don't know the statistics on your orders table, but I added a WHERE condition on pay_type for good measure. This would be hard to beat speed-wise.
Edit: removed the with statement as it's not even needed
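To see the grouping idea end to end, the sketch below loads the sample tables from the question into an in-memory SQLite database (an assumption; the original database is not named) and uses the HAVING filter to find qualifying customers, then lists their open orders. Treating every pay_type other than prepay and 10n30 as a normal order is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cust_orders (cust_id INTEGER, order_id INTEGER);
CREATE TABLE order_info (order_id INTEGER, pay_type TEXT, status TEXT);
INSERT INTO cust_orders VALUES
    (1, 1), (1, 2), (1, 3), (1, 4), (2, 5),
    (2, 6), (3, 7), (4, 8), (4, 9), (4, 10);
INSERT INTO order_info VALUES
    (1, 'standard', 'open'), (2, 'prepay', 'closed'),
    (3, 'prepay', 'open'), (4, '10n30', 'open'),
    (5, 'standard', 'deferred'), (6, 'prepay', 'open'),
    (7, 'N/A', 'deferred'), (8, 'prepay', 'open'),
    (9, 'standard', 'closed'), (10, 'prepay', 'open');
""")

query = """
SELECT co.order_id
FROM cust_orders co
JOIN order_info o ON o.order_id = co.order_id
WHERE o.status = 'open'
  AND co.cust_id IN (
      -- customers with at least one open special order and one open normal order
      SELECT co2.cust_id
      FROM cust_orders co2
      JOIN order_info o2 ON o2.order_id = co2.order_id
      WHERE o2.status = 'open'
      GROUP BY co2.cust_id
      HAVING SUM(CASE WHEN o2.pay_type IN ('prepay', '10n30') THEN 1 ELSE 0 END) > 0
         AND SUM(CASE WHEN o2.pay_type NOT IN ('prepay', '10n30') THEN 1 ELSE 0 END) > 0
  )
ORDER BY co.order_id
"""
open_orders = [row[0] for row in conn.execute(query)]
print(open_orders)
```

Only customer 1 has both kinds of open order in the sample data, so the query returns that customer's open orders.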
I think some window functions do the trick:
select o.*
from (select o.*,
sum(case when o.pay_type in ('prepay', '10n30') then 1 else 0 end) over (partition by co.cust_id) as num_special,
sum(case when o.pay_type in ('standard') then 1 else 0 end) over (partition by co.cust_id) as num_standard
from cust_orders co join
order_info o
on co.order_id = o.order_id
where o.status = 'open'
) o
where num_standard > 0 and
num_special > 0;
I have a table with almost a million claim records for 6 different conditions such as Diabetes, Hypertension, Heart Failure, etc. Every member has a number of claims, each with a condition such as Diabetes or Hypertension. My goal is one row per member showing the number of claims for each condition.
Existing table
+--------------+---------------+------+------------+
| Conditions | ConditionCode | ID | Member_Key |
+--------------+---------------+------+------------+
| DM | 3001 | 1212 | A1528 |
| HTN | 5001 | 1213 | A1528 |
| COPD | 6001 | 1214 | A1528 |
| DM | 3001 | 1215 | A1528 |
| CAD | 8001 | 1823 | B4354 |
| HTN | 5001 | 3458 | B4354 |
+--------------+---------------+------+------------+
Desired Result
+------------+------+-----+----+----+-----+-----+
| Member_Key | COPD | CAD | DM | HF | CHF | HTN |
+------------+------+-----+----+----+-----+-----+
| A1528 | 1 | | 2 | | | 1 |
| B4354 | | 1 | | | | 1 |
+------------+------+-----+----+----+-----+-----+
Query
select distinct tr.Member_Key,C.COPD,D.CAD,DM.DM,HF.HF,CHF.CHF,HTN.HTN
FROM myTable tr
--COPD
left outer join (select Member_Key,'X' as COPD
FROM myTable
where Condition=6001) C
on C.Member_Key=tr.Member_Key
--CAD
left outer join ( ....
For now I'm just using 'X', but I'm trying to get the number of claims per condition in place of the X. I don't think a left outer join per condition is efficient when searching 1 million rows and doing a DISTINCT. Is there another approach to solving this?
You don't want so many sub-queries, this is easy with group by and case statements:
SELECT Member_Key,
SUM(CASE WHEN Condition=6001 THEN 1 ELSE 0 END) AS COPD,
SUM(CASE WHEN Condition=3001 THEN 1 ELSE 0 END) AS DM,
SUM(CASE WHEN Condition=5001 THEN 1 ELSE 0 END) AS HTN,
SUM(CASE WHEN Condition=8001 THEN 1 ELSE 0 END) AS CAD
FROM myTable
GROUP BY Member_Key
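A minimal sketch of this GROUP BY approach, run from Python on an in-memory SQLite database with the six sample rows from the question. SQLite and the use of the ConditionCode column from the table header (the queries in this thread call it Condition) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (Conditions TEXT, ConditionCode INTEGER, ID INTEGER, Member_Key TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        ("DM", 3001, 1212, "A1528"), ("HTN", 5001, 1213, "A1528"),
        ("COPD", 6001, 1214, "A1528"), ("DM", 3001, 1215, "A1528"),
        ("CAD", 8001, 1823, "B4354"), ("HTN", 5001, 3458, "B4354"),
    ],
)

# One pass over the table: each SUM(CASE ...) counts the claims for one condition.
query = """
SELECT Member_Key,
       SUM(CASE WHEN ConditionCode = 6001 THEN 1 ELSE 0 END) AS COPD,
       SUM(CASE WHEN ConditionCode = 8001 THEN 1 ELSE 0 END) AS CAD,
       SUM(CASE WHEN ConditionCode = 3001 THEN 1 ELSE 0 END) AS DM,
       SUM(CASE WHEN ConditionCode = 5001 THEN 1 ELSE 0 END) AS HTN
FROM myTable
GROUP BY Member_Key
ORDER BY Member_Key
"""
claim_counts = conn.execute(query).fetchall()
for row in claim_counts:
    print(row)
```

Conditions with no claims come back as 0 rather than blank; dropping the ELSE 0 would return NULL for them instead.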
This is an ideal situation for CASE statements:
SELECT tr.Member_Key,
SUM(CASE WHEN Condition=6001 THEN 1 ELSE 0 END) as COPD,
SUM(CASE WHEN Condition=6002 THEN 1 ELSE 0 END) as OtherIssue,
SUM(CASE etc.)
FROM myTable tr
GROUP BY tr.Member_Key
This should be done with a PIVOT, like:
SELECT *
FROM
(SELECT conditions, member_key
FROM t) src
PIVOT
(COUNT (conditions)
for conditions in ([COPD], [CAD], [DM], [HF], [CHF], [HTN])) pvt
I have table a
| id | value | comment |
|--------------------------|
| 1 | Some1 | comm1 |
|--------------------------|
| 2 | Some2 | comm2 |
|--------------------------|
and I have table b, with a foreign key to table a:
| id | id_a |name | amount | factor |
|--------------------------------------------|
| 1 | 1 |Car | 12 | 2 |
|--------------------------------------------|
| 2 | 1 |Bike | 22 | 5 |
|--------------------------------------------|
| 3 | 2 |Car | 54 | 1 |
|--------------------------------------------|
| 4 | 2 |Bike | 55 | 4 |
|--------------------------------------------|
As result I want to have a combination:
|id| value | comment | Car_Amount | Car_factor | Bike_Amount | Bike_Factor |
|--------------------------------------------------------------------------|
| 1| Some1 | comm1 | 12 | 2 | 22 | 5 |
|--------------------------------------------------------------------------|
| 2| Some2 | comm2 | 54 | 1 | 55 | 4 |
|--------------------------------------------------------------------------|
It is not a pivot as far as I can see. But I am not sure if this is good practise at all. I am not an expert in SQL things, but it looks utterly wrong to mix tables like that.
I mean "they" want to have it as a flat result to use it for reporting...
Is it possible at all?
thanks
Aggregate values like this:
select
a.id, a.value, a.comment,
sum(case when b.name='Car' then b.amount end) as Car_Amount,
sum(case when b.name='Car' then b.factor end) as Car_Factor,
sum(case when b.name='Bike' then b.amount end) as Bike_Amount,
sum(case when b.name='Bike' then b.factor end) as Bike_Factor
from a left join b on a.id=b.id_a
group by a.id, a.value, a.comment;
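The aggregation can be checked against the sample rows from the question. This is a minimal sketch using Python with an in-memory SQLite database, which is an assumption since the question does not name the database; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, value TEXT, comment TEXT);
CREATE TABLE b (id INTEGER, id_a INTEGER, name TEXT, amount INTEGER, factor INTEGER);
INSERT INTO a VALUES (1, 'Some1', 'comm1'), (2, 'Some2', 'comm2');
INSERT INTO b VALUES
    (1, 1, 'Car', 12, 2), (2, 1, 'Bike', 22, 5),
    (3, 2, 'Car', 54, 1), (4, 2, 'Bike', 55, 4);
""")

# Each CASE keeps only the rows of one name, so the SUM collapses them into one column.
query = """
SELECT a.id, a.value, a.comment,
       SUM(CASE WHEN b.name = 'Car' THEN b.amount END) AS Car_Amount,
       SUM(CASE WHEN b.name = 'Car' THEN b.factor END) AS Car_Factor,
       SUM(CASE WHEN b.name = 'Bike' THEN b.amount END) AS Bike_Amount,
       SUM(CASE WHEN b.name = 'Bike' THEN b.factor END) AS Bike_Factor
FROM a
LEFT JOIN b ON a.id = b.id_a
GROUP BY a.id, a.value, a.comment
ORDER BY a.id
"""
flat = conn.execute(query).fetchall()
for row in flat:
    print(row)
```

If a row in a could have several Car or Bike rows in b, SUM would add them up, so MAX or MIN may be the better aggregate depending on what the report should show.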
SELECT t1.id,
t1.value,
t1.comment,
MAX(CASE WHEN t2.name = 'Car' THEN t2.amount END) AS Car_Amount,
MAX(CASE WHEN t2.name = 'Car' THEN t2.factor END) AS Car_Factor,
MAX(CASE WHEN t2.name = 'Bike' THEN t2.amount END) AS Bike_Amount,
MAX(CASE WHEN t2.name = 'Bike' THEN t2.factor END) AS Bike_Factor
FROM a t1
INNER JOIN b t2
ON t1.id = t2.id_a
GROUP BY t1.id, t1.value, t1.comment
Try this
SELECT TableA.id, value, comment,
SUM(CASE WHEN Name='Car' THEN Amount END) AS Car_Amount,
SUM(CASE WHEN Name='Car' THEN factor END) AS Car_factor,
SUM(CASE WHEN Name='Bike' THEN Amount END) AS Bike_Amount,
SUM(CASE WHEN Name='Bike' THEN factor END) AS Bike_factor
FROM TableB
INNER JOIN TableA ON TableB.id_a = TableA.id
GROUP BY TableA.id, value, comment
I am trying to write a validation for the following set of data:
SSYS | Material_Number | Characteristic | Description
001 | 000000000001111 | SH_DESC | TEST
001 | 000000000001111 | DESIGN_TYPE | NULL
001 | 000000000001111 | VOLTAGE | NULL
001 | 000000000009999 | SH_DESC | TEST2
001 | 000000000009999 | OPER_METHOD | LIGHT
001 | 000000000009999 | FILTER_TYPE | Filter element,Air
001 | 000000000014560 | SH_DESC | Horn,Signal
001 | 000000000014560 | DIMENSION_SIZE | NULL
001 | 000000000014560 | FILTER_TYPE | NULL
I would like to group by the Material_Number and count as 1 (ie. true) if within the Material_Number group, the SH_DESC description is NOT NULL and all other characteristics' descriptions IS NULL. So, in this case my result would be:
SSYS | Material_Number | Characteristic | Description | COUNT
001 | 000000000001111 | SH_DESC | TEST | 1
001 | 000000000009999 | SH_DESC | TEST2 | 0
001 | 000000000014560 | SH_DESC | Horn,Signal | 1
My attempt:
Select COUNT (*), SSYS, Material_Number, Characteristic, Description
From myDB where (Characteristic = 'SH_DESC' AND DESCRIPTION IS NOT NULL) AND (Characteristic NOT IN ('SH_DESC') IS NULL)
GROUP BY SSYS, Material_Number, Characteristic, Description HAVING COUNT (*) < 2
Any help is much appreciated!
Try:
Select SSYS,
Material_Number,
'SH_DESC' Characteristic,
MAX(CASE WHEN Characteristic = 'SH_DESC' THEN Description END) Description,
CASE WHEN MAX(CASE WHEN Characteristic = 'SH_DESC' THEN Description END) IS NOT NULL AND
MAX(CASE WHEN Characteristic <>'SH_DESC' THEN Description END) IS NULL
THEN 1
ELSE 0
END COUNT
From myDB
GROUP BY SSYS, Material_Number
Try this:
select ssys, material_number, 'SH_DESC' as characteristic,
(case when sum(case when characteristic is not null and characteristic<> 'SH_DESC' and description is null then 1 else 0 end) = count(*) - 1
then 1
else 0
end) as count
from t
group by ssys, material_number
It groups by material and counts the rows that have a non-null characteristic other than SH_DESC and a null description. If that count equals the group size minus one (every row except SH_DESC), it sets count to 1; otherwise 0.
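The grouping idea can be checked against the sample rows from the question. The sketch below is a variant, not the exact query above: it tests the SH_DESC description explicitly and counts non-null descriptions on the other characteristics. It runs from Python on an in-memory SQLite database, which is an assumption since the question does not name the database, and the column name flag is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ssys TEXT, material_number TEXT, characteristic TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("001", "000000000001111", "SH_DESC", "TEST"),
        ("001", "000000000001111", "DESIGN_TYPE", None),  # None is stored as SQL NULL
        ("001", "000000000001111", "VOLTAGE", None),
        ("001", "000000000009999", "SH_DESC", "TEST2"),
        ("001", "000000000009999", "OPER_METHOD", "LIGHT"),
        ("001", "000000000009999", "FILTER_TYPE", "Filter element,Air"),
        ("001", "000000000014560", "SH_DESC", "Horn,Signal"),
        ("001", "000000000014560", "DIMENSION_SIZE", None),
        ("001", "000000000014560", "FILTER_TYPE", None),
    ],
)

query = """
SELECT ssys, material_number,
       MAX(CASE WHEN characteristic = 'SH_DESC' THEN description END) AS description,
       CASE WHEN MAX(CASE WHEN characteristic = 'SH_DESC' THEN description END) IS NOT NULL
             AND SUM(CASE WHEN characteristic <> 'SH_DESC'
                           AND description IS NOT NULL THEN 1 ELSE 0 END) = 0
            THEN 1 ELSE 0 END AS flag
FROM t
GROUP BY ssys, material_number
ORDER BY material_number
"""
flags = conn.execute(query).fetchall()
for row in flags:
    print(row)
```

Material 000000000009999 gets 0 because two of its other characteristics have descriptions; the other two materials get 1.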
An alternative to the GROUP BY and SUM(CASE WHEN) options...
SELECT
*,
CASE WHEN Description IS NULL THEN 0
WHEN EXISTS (SELECT *
FROM myDB as lookup
WHERE lookup.SSYS = myDB.SSYS
AND lookup.Material_Number = myDB.Material_Number
AND lookup.Characteristic <> 'SH_DESC'
AND lookup.Description IS NOT NULL) THEN 0
ELSE 1 END as myCount
FROM
myDB
WHERE
Characteristic = 'SH_DESC'
Try this. I guess you can't get the description here, because there is nothing to filter for the specific description.
CREATE TABLE yourtable(SSYS varchar(10),Material_Number varchar(100),Characteristic varchar(100),Description varchar(100));
-- sample rows as in the question; NULL is a real null here, not the string 'NULL'
INSERT INTO yourtable
VALUES('001','000000000001111','SH_DESC','TEST'),
('001','000000000001111','DESIGN_TYPE',NULL),
('001','000000000001111','VOLTAGE',NULL),
('001','000000000009999','SH_DESC','TEST2'),
('001','000000000009999','OPER_METHOD','LIGHT'),
('001','000000000009999','FILTER_TYPE','Filter element,Air'),
('001','000000000014560','SH_DESC','Horn,Signal'),
('001','000000000014560','DIMENSION_SIZE',NULL),
('001','000000000014560','FILTER_TYPE',NULL);
select max(SSYS),
max(Material_Number),
'SH_DESC' as Characteristic,
-- 1 only if SH_DESC has a description and no other characteristic does
CASE WHEN SUM(CASE WHEN Characteristic='SH_DESC' and Description is not null then 1 else 0 end) = 1
and SUM(CASE WHEN Characteristic<>'SH_DESC' and Description is not null then 1 else 0 end) = 0 then 1 else 0 end as cnt
from yourtable
group by Material_Number