Group rows while excluding null values on specific column and aggregating another


I have a table like this:
CUSTOMER_ID  ALIAS_ID  ULTIMATE_NAME  MODEL_SUB_TYPE  OLD_PD  OLD_EXP  OLD_ECAP  RATING
Client A     123       Company A      CI_COM_KN       1       1        1         BB+
Client A     123       Company A      CI_POL_KN       0.5     1        1         null
Client A     456       Company B      CI_COM_KN       1       1        3         BB+
Client A     456       Company B      CI_POL_KN       0.5     1        3         null
I need my query to ignore the values in the OLD_PD, OLD_EXP and RATING columns when MODEL_SUB_TYPE is the second sub-type (CI_POL_KN in the sample above), and to sum the OLD_ECAP column regardless of MODEL_SUB_TYPE.
What I have so far is this:
SELECT
    CUSTOMER_ID,
    ALIAS_ID,
    SUBSTR(ULTIMATE_NAME, 0, 70) as ULTIMATE_NAME,
    MODEL_SUB_TYPE,
    CASE
        WHEN MODEL_SUB_TYPE LIKE 'CI_COM_%' THEN ULTIMATE_POD
    END AS OLD_PD,
    SUM(
        CASE
            WHEN MODEL_SUB_TYPE LIKE 'CI_COM_%' THEN CREDIT_LIMIT_NET_EXPOSURE
        END
    ) AS OLD_EXP,
    SUM(EC_CONSUMPTION_ND) AS OLD_ECAP,
    ULTIMATE_RATING AS RATING
FROM
    CALC6619.SO_REPORTING -- OLD QUARTER --
WHERE
    MODEL_TYPE LIKE 'IR'
    AND MODEL_SUB_TYPE LIKE 'CI_%'
    AND CUSTOMER_ID = '09781C1 01' -- Customer ID
GROUP BY
    CUSTOMER_ID,
    ALIAS_ID,
    ULTIMATE_NAME,
    MODEL_SUB_TYPE,
    ULTIMATE_POD,
    ULTIMATE_RATING
What I want is for my query to return a table like this (based on the table above):
CUSTOMER_ID  ALIAS_ID  ULTIMATE_NAME  MODEL_SUB_TYPE  OLD_PD  OLD_EXP  OLD_ECAP  RATING
Client A     123       Company A      CI_COM_KN       1       1        2         BB+
Client A     456       Company B      CI_COM_KN       1       1        6         BB+
Instead, it returns a table like the first one: the null values appear where they are supposed to, but the rows are not grouped per company, like so:
CUSTOMER_ID  ALIAS_ID  ULTIMATE_NAME  MODEL_SUB_TYPE  OLD_PD  OLD_EXP  OLD_ECAP  RATING
Client A     123       Company A      CI_COM_KN       1       1        1         BB+
Client A     123       Company A      CI_POL_KN       null    null     1         null
Client A     456       Company B      CI_COM_KN       1       1        3         BB+
Client A     456       Company B      CI_POL_KN       null    null     3         null

Remove MODEL_SUB_TYPE, ULTIMATE_POD and ULTIMATE_RATING from the GROUP BY and aggregate them in the SELECT clause instead, as follows:
SELECT
    CUSTOMER_ID,
    ALIAS_ID,
    SUBSTR(ULTIMATE_NAME, 0, 70) as ULTIMATE_NAME,
    MAX(CASE WHEN MODEL_SUB_TYPE LIKE 'CI_COM_%' THEN MODEL_SUB_TYPE END) AS MODEL_SUB_TYPE,
    SUM(
        CASE
            WHEN MODEL_SUB_TYPE LIKE 'CI_COM_%' THEN ULTIMATE_POD
        END
    ) AS OLD_PD,
    SUM(
        CASE
            WHEN MODEL_SUB_TYPE LIKE 'CI_COM_%' THEN CREDIT_LIMIT_NET_EXPOSURE
        END
    ) AS OLD_EXP,
    SUM(EC_CONSUMPTION_ND) AS OLD_ECAP,
    MAX(ULTIMATE_RATING) AS RATING
FROM
    CALC6619.SO_REPORTING -- OLD QUARTER --
WHERE
    MODEL_TYPE LIKE 'IR'
    AND MODEL_SUB_TYPE LIKE 'CI_%'
    AND CUSTOMER_ID = '09781C1 01' -- Customer ID
GROUP BY
    CUSTOMER_ID,
    ALIAS_ID,
    SUBSTR(ULTIMATE_NAME, 0, 70)
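
To try the idea without access to the real schema, here is a minimal sketch that runs the same conditional aggregation against the sample rows from the question. It uses Python's built-in sqlite3 module rather than the original database, and the table name so_reporting and the simplified column names are assumptions made for the demo, not the real CALC6619.SO_REPORTING layout.

```python
import sqlite3

# In-memory SQLite database holding the four sample rows from the question.
# Table and column names are simplified stand-ins for the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE so_reporting (
    customer_id TEXT, alias_id INTEGER, ultimate_name TEXT,
    model_sub_type TEXT, old_pd REAL, old_exp REAL, old_ecap REAL, rating TEXT
);
INSERT INTO so_reporting VALUES
    ('Client A', 123, 'Company A', 'CI_COM_KN', 1,   1, 1, 'BB+'),
    ('Client A', 123, 'Company A', 'CI_POL_KN', 0.5, 1, 1, NULL),
    ('Client A', 456, 'Company B', 'CI_COM_KN', 1,   1, 3, 'BB+'),
    ('Client A', 456, 'Company B', 'CI_POL_KN', 0.5, 1, 3, NULL);
""")

# Group only by the identifying columns; everything else is aggregated.
# The CASE expressions yield NULL for non-CI_COM rows, and SUM/MAX skip NULLs.
rows = conn.execute("""
SELECT customer_id, alias_id, ultimate_name,
       MAX(CASE WHEN model_sub_type LIKE 'CI_COM_%' THEN model_sub_type END) AS model_sub_type,
       SUM(CASE WHEN model_sub_type LIKE 'CI_COM_%' THEN old_pd END)  AS old_pd,
       SUM(CASE WHEN model_sub_type LIKE 'CI_COM_%' THEN old_exp END) AS old_exp,
       SUM(old_ecap) AS old_ecap,
       MAX(rating)   AS rating
FROM so_reporting
GROUP BY customer_id, alias_id, ultimate_name
ORDER BY alias_id
""").fetchall()

for row in rows:
    print(row)
```

Each company collapses to a single row because MODEL_SUB_TYPE, the PD column and the rating no longer split the groups.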

Related

I need a 3 table join

This is my parent table, ACC_DETIAL.
ACC_DETIAL example:
acc_id
1
2
3
Now I have 3 tables:
ORDER
EMAIL
REPORT
Each table contains 100 rows, and acc_id is a foreign key referencing ACC_DETIAL.
In the ORDER table I have the columns ACC_ID and QUANTITY. I want the count of ACC_ID and the sum of QUANTITY.
ORDER table example:
acc_id  quantity  date
1       2         2022/01/22
2       5         2022/01/23
1       10        2022/01/25
3       1         2022/01/25
In the EMAIL table I have a column named ACC_ID, and I want the count of ACC_ID.
EMAIL table example:
acc_id  mail  date
1       5     2022/01/22
2       10    2022/01/22
1       7     2022/01/23
1       7     2022/01/24
2       10    2022/01/25
In the REPORT table I have the columns ACC_ID and TYPE, and I want the count of ACC_ID and of TYPE. Note that the TYPE column has only two possible values:
positive
negative
I want the count of each, i.e. the count of positive and the count of negative values in the TYPE column.
REPORT table example:
acc_id  type      date
1       positive  2022/01/22
2       negative  2022/01/22
1       negative  2022/01/23
2       positive  2022/01/26
2       positive  2022/01/27
I need to get this in a single query, either as a raw query or with SQLAlchemy. Is that possible, or do I need to write a separate query for each table?
Result, based on the examples above:
acc_id  total_Order_acc_id  total_Order_quantity  total_Email_acc_id  total_Report_acc_id  total_postitive_report  total_negative_report
1       2                   12                    3                   2                    1                       1
2       1                   5                     2                   3                    2                       1
3       1                   1                     Null                Null                 Null                    Null

You need to aggregate each table first and then join the results, as follows:
SELECT ADL.acc_id,
       ORD.ord_cnt AS total_Order_acc_id,
       ORD.tot_quantity AS total_Order_quantity,
       EML.eml_cnt AS total_Email_acc_id,
       RPT.rpt_cnt AS total_Report_acc_id,
       RPT.pcnt AS total_postitive_report,
       RPT.ncnt AS total_negative_report
FROM ACC_DETIAL ADL
LEFT JOIN
(
    SELECT acc_id,
           SUM(quantity) AS tot_quantity,
           COUNT(*) AS ord_cnt
    FROM ORDERS -- the question's ORDER table; ORDER is a reserved word, so quote it or rename it
    GROUP BY acc_id
) ORD
ON ADL.acc_id = ORD.acc_id
LEFT JOIN
(
    SELECT acc_id, COUNT(*) AS eml_cnt
    FROM EMAIL
    GROUP BY acc_id
) EML
ON ADL.acc_id = EML.acc_id
LEFT JOIN
(
    SELECT acc_id,
           COUNT(*) AS rpt_cnt,
           COUNT(*) FILTER (WHERE type='positive') AS pcnt,
           COUNT(*) FILTER (WHERE type='negative') AS ncnt
    FROM REPORT
    GROUP BY acc_id
) RPT
ON ADL.acc_id = RPT.acc_id
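
As a quick check of the aggregate-then-join approach, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. The table name orders, the short aliases, and the use of COUNT(CASE ...) in place of FILTER (so that it also runs on older SQLite builds) are choices made for this demo, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE acc_detial (acc_id INTEGER PRIMARY KEY);
CREATE TABLE orders (acc_id INTEGER, quantity INTEGER, date TEXT);
CREATE TABLE email  (acc_id INTEGER, mail INTEGER, date TEXT);
CREATE TABLE report (acc_id INTEGER, type TEXT, date TEXT);

INSERT INTO acc_detial VALUES (1), (2), (3);
INSERT INTO orders VALUES (1, 2, '2022/01/22'), (2, 5, '2022/01/23'),
                          (1, 10, '2022/01/25'), (3, 1, '2022/01/25');
INSERT INTO email VALUES (1, 5, '2022/01/22'), (2, 10, '2022/01/22'),
                         (1, 7, '2022/01/23'), (1, 7, '2022/01/24'),
                         (2, 10, '2022/01/25');
INSERT INTO report VALUES (1, 'positive', '2022/01/22'), (2, 'negative', '2022/01/22'),
                          (1, 'negative', '2022/01/23'), (2, 'positive', '2022/01/26'),
                          (2, 'positive', '2022/01/27');
""")

# Each child table is reduced to one row per acc_id before joining,
# so the joins cannot multiply rows and inflate the counts.
rows = conn.execute("""
SELECT a.acc_id, o.ord_cnt, o.tot_quantity, e.eml_cnt, r.rpt_cnt, r.pcnt, r.ncnt
FROM acc_detial a
LEFT JOIN (SELECT acc_id, COUNT(*) AS ord_cnt, SUM(quantity) AS tot_quantity
           FROM orders GROUP BY acc_id) o ON o.acc_id = a.acc_id
LEFT JOIN (SELECT acc_id, COUNT(*) AS eml_cnt
           FROM email GROUP BY acc_id) e ON e.acc_id = a.acc_id
LEFT JOIN (SELECT acc_id, COUNT(*) AS rpt_cnt,
                  COUNT(CASE WHEN type = 'positive' THEN 1 END) AS pcnt,
                  COUNT(CASE WHEN type = 'negative' THEN 1 END) AS ncnt
           FROM report GROUP BY acc_id) r ON r.acc_id = a.acc_id
ORDER BY a.acc_id
""").fetchall()

for row in rows:
    print(row)
```

Joining the three raw tables directly and aggregating afterwards would instead count every order once per matching email and report row, which is why the pre-aggregation matters.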

Sample:
SELECT
    `order`.`acc_id`,
    report_email_select.`type`,
    report_email_select.report_count,
    report_email_select.email_count,
    SUM(`quantity`) AS quantity_sum
FROM
    `order`
LEFT JOIN (
    SELECT
        report_select.`acc_id`,
        report_select.`type`,
        report_select.report_count,
        COUNT(*) AS email_count
    FROM
        (
            SELECT
                report.`acc_id`,
                report.`type`,
                COUNT(*) AS report_count
            FROM
                `report`
            WHERE
                1
            GROUP BY
                report.`acc_id`,
                report.`type`
        ) AS report_select
    INNER JOIN email ON email.acc_id = report_select.acc_id
    GROUP BY
        report_select.`acc_id`,
        report_select.`type`
) AS report_email_select ON `order`.acc_id = report_email_select.acc_id
GROUP BY
    `order`.`acc_id`,
    report_email_select.`type`;

Calculate sum and cumulative sum by age group and by date (but people change age group as time passes)

DBMS: PostgreSQL
My problem:
In my database I have a person table with an id and a birth date, an events table that links a person, an event (event_id) and a date, and an age table used for grouping ages. In the real database the person table has about 40 million rows, and the events table is 3 times bigger.
I need to produce a report (sum and cumulative sum, "cumul", of X events) by age (age_group) and date (event_date). Counting the number of events by date is no problem. The problem lies with the cumul: unlike other variables (sex, for example), a person grows older and changes age group as time passes, so for a given age group the cumul can increase and then decrease. I want the cumul of events, on every date in my report, to use the age of each person on that date.
Example of my inputs and desired output
The only way I have found is to do a Cartesian product of the table person and the dates in v_dates, which makes it easy to follow an event and make it change age_group. The code below uses this method.
BUT I can't use a Cartesian product on my real data (it makes a table that is far too big), so I need another method.
Reproducible example
In this simplified example I want to produce a report every 6 months from 2020-07-01 to 2022-07-01 (view v_dates). In reality I need to produce the same report by day, but the logic remains the same.
My inputs
/* create table person */
DROP TABLE IF EXISTS person;
CREATE TABLE person
(
    person_id varchar(1),
    person_birth_date date
);
INSERT INTO person
VALUES ('A', '2017-01-01'),
       ('B', '2016-07-01');
person_id  person_birth_date
A          2017-01-01
B          2016-07-01
/* create table events */
DROP TABLE IF EXISTS events;
CREATE TABLE events
(
    person_id varchar(1),
    event_id integer,
    event_date date
);
INSERT INTO events
VALUES ('A', 1, '2020-07-01'),
       ('A', 2, '2021-07-01'),
       ('B', 1, '2021-01-01'),
       ('B', 2, '2022-01-01');
person_id  event_id  event_date
A          1         2020-07-01
A          2         2021-07-01
B          1         2021-01-01
B          2         2022-01-01
/* create table age */
DROP TABLE IF EXISTS age;
CREATE TABLE age
(
    age integer,
    age_group varchar(8)
);
INSERT INTO age
VALUES (0,'[0-4]'),
       (1,'[0-4]'),
       (2,'[0-4]'),
       (3,'[0-4]'),
       (4,'[0-4]'),
       (5,'[5-9]'),
       (6,'[5-9]'),
       (7,'[5-9]'),
       (8,'[5-9]'),
       (9,'[5-9]');
Excerpt of the age table:
age  age_group
0    [0-4]
1    [0-4]
5    [5-9]
/* create view dates: contains one date every 6 months from 2020-07-01 to 2022-07-01 */
CREATE OR REPLACE VIEW v_dates AS
SELECT GENERATE_SERIES('2020-07-01'::date, '2022-07-01'::date, '6 month')::date AS event_date;
My current method, using a Cartesian product:
a CROSS JOIN of person and v_dates,
with a LEFT JOIN to get info from table events,
with a LEFT JOIN to get age_group from table age.
CREATE OR REPLACE VIEW v_person_event AS
SELECT
    pdev.person_id,
    pdev.event_date,
    pdev.age,
    ag.age_group,
    pdev.event1,
    pdev.event2
FROM
    (
        SELECT pd.person_id,
               pd.event_date,
               date_part('year', age(pd.event_date::TIMESTAMP, pd.person_birth_date::TIMESTAMP)) AS age,
               CASE WHEN ev.event_id = 1 THEN 1 ELSE 0 END AS event1,
               CASE WHEN ev.event_id = 2 THEN 1 ELSE 0 END AS event2
        FROM
            (
                SELECT *
                FROM person
                CROSS JOIN v_dates
            ) pd
        LEFT JOIN events ev
            ON pd.person_id = ev.person_id
            AND pd.event_date = ev.event_date
    ) pdev
LEFT JOIN age AS ag ON pdev.age = ag.age
ORDER BY pdev.person_id, pdev.event_date;
Then I add the columns event1_cum and event2_cum:
CREATE OR REPLACE VIEW v_person_event_cum AS
SELECT *,
       SUM(event1) OVER (PARTITION BY person_id ORDER BY event_date) event1_cum,
       SUM(event2) OVER (PARTITION BY person_id ORDER BY event_date) event2_cum
FROM v_person_event;
SELECT * FROM v_person_event_cum;
person_id  event_date  age  age_group  event1  event2  event1_cum  event2_cum
A          2020-07-01  3    [0-4]      1       0       1           0
A          2021-01-01  4    [0-4]      0       0       1           0
A          2021-07-01  4    [0-4]      0       1       1           1
A          2022-01-01  5    [5-9]      0       0       1           1
A          2022-07-01  5    [5-9]      0       0       1           1
B          2020-07-01  4    [0-4]      0       0       0           0
B          2021-01-01  4    [0-4]      1       0       1           0
B          2021-07-01  5    [5-9]      0       0       1           0
B          2022-01-01  5    [5-9]      0       1       1           1
B          2022-07-01  6    [5-9]      0       0       1           1
Desired output: a report grouped by the variables age_group and event_date
SELECT
    age_group,
    event_date,
    SUM(event1) AS event1,
    SUM(event2) AS event2,
    SUM(event1_cum) AS event1_cum,
    SUM(event2_cum) AS event2_cum
FROM v_person_event_cum
GROUP BY age_group, event_date
ORDER BY age_group, event_date;
age_group  event_date  event1  event2  event1_cum  event2_cum
[0-4]      2020-07-01  1       0       1           0
[0-4]      2021-01-01  1       0       2           0
[0-4]      2021-07-01  0       1       1           1
[5-9]      2021-07-01  0       0       1           0
[5-9]      2022-01-01  0       1       2           2
This is why it is not an ordinary cumul: for the age_group [0-4], event1_cum goes from 2 at '2021-01-01' to 1 at '2021-07-01', because B was in [0-4] when event 1 occurred ('2021-01-01') but had moved to [5-9] by '2021-07-01'.
Reading the report:
on 2021-01-01, there were 2 persons between 0 and 4 (at that date) who had had event1 and 0 persons who had had event2;
on 2021-07-01, there was 1 person between 0 and 4 who had had event1 and 1 person who had had event2.
I can't find a solution to this problem that does not use a Cartesian product...
Thanks in advance!

How to get Result 1 for all user_ids that at least one time have source as paid

How can I get result = 1 for all user_ids that have source = 'paid' at least once? I mean not just for the one row where source = 'paid', but for all rows of that user_id.
The result column does not exist in the table; it has to be computed by the query.
Raw table
source  session_number  user_id
NULL    1               12345
NULL    2               12345
NULL    3               12345
NULL    4               12345
NULL    1               67890
paid    2               67890
NULL    3               67890
Desired table
source  session_number  user_id  result
NULL    1               12345    0
NULL    2               12345    0
NULL    3               12345    0
NULL    4               12345    0
NULL    1               67890    1
paid    2               67890    1
NULL    3               67890    1

You seem to want a window function. It would be something like:
select t.*,
       max(case when source = 'paid' then 1 else 0 end) over (partition by user_id) as result
from t;
In Postgres, you can return a boolean instead:
select t.*,
       bool_or(source = 'paid') over (partition by user_id) as result
from t;
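
For anyone who wants to see the window-function version run end to end, here is a small sketch using Python's sqlite3 module with the rows from the question. It assumes a SQLite build with window-function support (3.25 or later); the table name t follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (source TEXT, session_number INTEGER, user_id INTEGER);
INSERT INTO t VALUES
    (NULL,   1, 12345), (NULL, 2, 12345), (NULL, 3, 12345), (NULL, 4, 12345),
    (NULL,   1, 67890), ('paid', 2, 67890), (NULL, 3, 67890);
""")

# MAX(...) OVER (PARTITION BY user_id) computes one flag per user and repeats it
# on every row of that user; a NULL source falls through to the ELSE branch.
rows = conn.execute("""
SELECT source, session_number, user_id,
       MAX(CASE WHEN source = 'paid' THEN 1 ELSE 0 END)
           OVER (PARTITION BY user_id) AS result
FROM t
ORDER BY user_id, session_number
""").fetchall()

for row in rows:
    print(row)
```

Unlike a GROUP BY, the window function keeps every original row, which is what the desired table requires.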

Use EXISTS:
select a.*,
       case when exists (select 1 from table_name b
                         where b.user_id = a.user_id
                         and b.source = 'paid')
            then 1 else 0 end as result
from table_name a

With a subquery:
SELECT *,
       CASE
           WHEN user_id IN
           (
               SELECT user_id
               FROM table_name
               WHERE source = 'paid'
           )
           THEN 1
           ELSE 0
       END AS result
FROM table_name

Sql Grouping Query

I'm wondering how I can write a query that puts these groupings on one line, so I can load the result into a VB.NET DataGrid.
For example: Number, Company Name, Current, 31-60, 61-90
For company A, for example, all the groupings would be on one line:
104680777, Company A, 643546.344, 34534534.77, 3454.55
To get even the query below, I had to do this:
select sum(Amount), DunsNum, CompanyName, Age
from tblARAged
group by DunsNum, Age, CompanyName
Amount Num CompanyName Age
63546.344 104680777 Company a 1
34534534.77 104680777 Company a 2
3454.55 104680777 Company a 3
3453453.66 186830733 Company b 1
345342.45 186830733 Company b 2
4542.55 186830733 Company c 3
3434.55 26409797 Company c 1
345345 26409797 Company c 2
For Age, 1 corresponds to Current, 2 to 31-60 and 3 to 61-90.

I would do what Nimesh stated, though I would make a few tweaks. You want to do aggregation as late as possible:
SELECT DunsNum,
       CompanyName,
       SUM(CASE WHEN Age = 1 THEN Amount ELSE 0 END) AS [Amount_Cur],
       SUM(CASE WHEN Age = 2 THEN Amount ELSE 0 END) AS [Amount_31-60],
       SUM(CASE WHEN Age = 3 THEN Amount ELSE 0 END) AS [Amount_61-90]
FROM tblARAged
GROUP BY DunsNum,
         CompanyName;
I haven't tested this code.
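
Since the answer is untested, here is a minimal sketch of the same pivot run through Python's sqlite3 module on a few of the rows from the question. The underscore aliases (amount_31_60 and so on) replace the bracketed SQL Server names purely for this demo, and only two of the companies are loaded to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblARAged (Amount REAL, DunsNum INTEGER, CompanyName TEXT, Age INTEGER);
INSERT INTO tblARAged VALUES
    (63546.344,   104680777, 'Company a', 1),
    (34534534.77, 104680777, 'Company a', 2),
    (3454.55,     104680777, 'Company a', 3),
    (3434.55,     26409797,  'Company c', 1),
    (345345,      26409797,  'Company c', 2);
""")

# One SUM(CASE ...) per age bucket turns the Age rows into columns.
# Buckets with no rows come out as 0 because of the ELSE 0 branch.
rows = conn.execute("""
SELECT DunsNum,
       CompanyName,
       SUM(CASE WHEN Age = 1 THEN Amount ELSE 0 END) AS amount_cur,
       SUM(CASE WHEN Age = 2 THEN Amount ELSE 0 END) AS amount_31_60,
       SUM(CASE WHEN Age = 3 THEN Amount ELSE 0 END) AS amount_61_90
FROM tblARAged
GROUP BY DunsNum, CompanyName
ORDER BY DunsNum
""").fetchall()

for row in rows:
    print(row)
```

Each company ends up on a single line with one column per age bucket, which is the shape a data grid can bind to directly.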

SQL required data based on date

I have been working on a report and cannot get the required output. The scenario: a block manufacturing firm has multiple orders from the same client, delivers the orders on credit on different dates, and the client pays partial amounts that are not tied to individual orders. I am stuck with these two tables:
Orders_master,
do_no Client_id Site_id Order_date Amount
1 1 1 2013-10-27 50000
2 1 1 2013-10-29 47000
3 1 1 2013-11-15 10000
Client_payments,
P_id Client_id Site_id P_date Amount
1 1 1 2013-11-05 30000
2 1 1 2013-11-10 67000
3 1 1 2013-11-20 10000
I need help writing a query that returns all rows from both tables, giving the following output:
Do_no Client_id Site_id Order_date P_date Order_amount Payment_amount
1 1 1 2013-10-27 Null 50000 Null
2 1 1 2013-10-29 Null 47000 Null
Null 1 1 Null 2013-11-05 Null 30000
Null 1 1 null 2013-11-10 Null 67000
3 1 1 2013-11-15 Null 10000 Null
Null 1 1 Null 2013-11-20 Null 10000
The query below returns all the rows of the orders_master table but misses the last row of the required output shown above:
select om.*, cp.*
from orders_master om left join
client_payments cp on
om.order_date = cp.p_date and
om.site_id = cp.site_id
where om.site_id = 1
I tried different joins, but none of them returns all the rows of both tables; when one does, the values are repeated instead of being null.

It looks like you want to use UNION [ALL] to combine the two tables, rather than a JOIN:
SELECT do_no,
       client_id,
       site_id,
       Order_Date,
       P_Date = NULL,
       Order_Amount = Amount,
       Payment_Amount = NULL
FROM Orders_Master
WHERE Site_ID = 1
UNION ALL
SELECT do_no = NULL,
       client_id,
       site_id,
       Order_Date = NULL,
       P_Date = P_Date,
       Order_Amount = NULL,
       Payment_Amount = Amount
FROM Client_Payments
WHERE Site_ID = 1;
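
The UNION ALL idea can be checked with the sketch below, which uses Python's sqlite3 module and the sample rows from the question. Two things here are assumptions added for the demo: the order date of do_no 3 is taken as 2013-11-15 (the value shown in the desired output), and an outer ORDER BY on COALESCE(order_date, p_date) is added so that orders and payments interleave by date as in that output. The column aliases use the standard AS form instead of the SQL Server `alias = expression` form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders_master (do_no INTEGER, client_id INTEGER, site_id INTEGER,
                            order_date TEXT, amount INTEGER);
CREATE TABLE client_payments (p_id INTEGER, client_id INTEGER, site_id INTEGER,
                              p_date TEXT, amount INTEGER);
INSERT INTO orders_master VALUES
    (1, 1, 1, '2013-10-27', 50000),
    (2, 1, 1, '2013-10-29', 47000),
    (3, 1, 1, '2013-11-15', 10000);  -- date as shown in the desired output
INSERT INTO client_payments VALUES
    (1, 1, 1, '2013-11-05', 30000),
    (2, 1, 1, '2013-11-10', 67000),
    (3, 1, 1, '2013-11-20', 10000);
""")

# Each branch pads the columns it does not have with NULL, so both
# branches share one column list; the outer query sorts by whichever date is set.
rows = conn.execute("""
SELECT * FROM (
    SELECT do_no, client_id, site_id, order_date, NULL AS p_date,
           amount AS order_amount, NULL AS payment_amount
    FROM orders_master
    WHERE site_id = 1
    UNION ALL
    SELECT NULL, client_id, site_id, NULL, p_date, NULL, amount
    FROM client_payments
    WHERE site_id = 1
)
ORDER BY COALESCE(order_date, p_date)
""").fetchall()

for row in rows:
    print(row)
```

A join cannot produce this shape because no order row is related to any particular payment row; stacking the two result sets avoids having to invent such a relationship.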
