How to sum values from multiple columns and split the result into separate columns for positive and negative results - SQL

I am using PostgreSQL and need to write a query that sums values from separate columns of two different tables and then splits the result into separate columns depending on whether it is positive or negative.
For example:
Below is the source table.
Below is the resultant table, which needs to be created and which is also used while populating it.
I have written the query below to aggregate the sums and am able to populate the TOT_CREDIT and TOT_DEBIT columns. Is there a more optimized query to achieve this?
select t.account_id,
       t.transaction_date,
       SUM(t.transaction_amt) filter (where t.transaction_amt >= 0) as tot_debit,
       SUM(t.transaction_amt) filter (where t.transaction_amt < 0) as tot_credit,
       case
           when (SUM(t.transaction_amt) +
                 SUM(COALESCE(b.credit_balance, 0)) +
                 SUM(COALESCE(b.debit_balance, 0))) < 0
           then (SUM(t.transaction_amt) +
                 SUM(COALESCE(b.credit_balance, 0)) +
                 SUM(COALESCE(b.debit_balance, 0)))
       end as credit_balance,
       case
           when (SUM(t.transaction_amt) +
                 SUM(COALESCE(b.credit_balance, 0)) +
                 SUM(COALESCE(b.debit_balance, 0))) > 0
           then (SUM(t.transaction_amt) +
                 SUM(COALESCE(b.credit_balance, 0)) +
                 SUM(COALESCE(b.debit_balance, 0)))
       end as debit_balance
from transaction t
LEFT OUTER JOIN balance b
     ON (t.account_id = b.account_id
         and t.transaction_date = b.transaction_date
         and b.transaction_date = t.transaction_date - INTERVAL '1 DAYS')
group by t.account_id,
         t.transaction_date
Please provide some pointers.
EDIT 1: This query is not working in the expected manner.

One way is to break your logic into small queries and join them at the end!
select tw.account_id, tw.t_date, tw.t_c, th.t_d, fo.c_b, fi.d_b from
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as t_c from TransactionTABLE
 where Transaction_AMT < 0 group by account_id, Transaction_date) as tw inner join
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as t_d from TransactionTABLE
 where Transaction_AMT > 0 group by account_id, Transaction_date) as th on tw.account_id = th.account_id and tw.t_date = th.t_date left join
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as c_b from TransactionTABLE
 group by account_id, Transaction_date having sum(Transaction_AMT) < 0) as fo on th.account_id = fo.account_id and th.t_date = fo.t_date left join
(select account_id, Transaction_date as t_date, sum(Transaction_AMT) as d_b from TransactionTABLE
 group by account_id, Transaction_date having sum(Transaction_AMT) > 0) as fi on th.account_id = fi.account_id and th.t_date = fi.t_date;
Or else
Alternatively, you could try something like the following, which calculates a running balance per account_id ordered by transaction_date and splits it into credit_balance and debit_balance:
select account_id,
       transaction_date,
       SUM(transaction_amt) filter (where transaction_amt >= 0) as tot_debit,
       SUM(transaction_amt) filter (where transaction_amt < 0) as tot_credit,
       case when sum(sum(transaction_amt)) over (partition by account_id order by transaction_date) < 0
            then sum(sum(transaction_amt)) over (partition by account_id order by transaction_date)
       end as credit_balance,
       case when sum(sum(transaction_amt)) over (partition by account_id order by transaction_date) >= 0
            then sum(sum(transaction_amt)) over (partition by account_id order by transaction_date)
       end as debit_balance
from TransactionTABLE
group by account_id, transaction_date
order by 1, 2;
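If the balance columns only need to reflect each day's own total rather than a running balance, a more compact sketch under that assumption (same table and column names as above, ignoring the balance table) would be:
select account_id,
       transaction_date,
       sum(transaction_amt) filter (where transaction_amt >= 0) as tot_debit,
       sum(transaction_amt) filter (where transaction_amt < 0) as tot_credit,
       case when sum(transaction_amt) < 0 then sum(transaction_amt) end as credit_balance,
       case when sum(transaction_amt) >= 0 then sum(transaction_amt) end as debit_balance
from TransactionTABLE
group by account_id, transaction_date
order by 1, 2;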

Related

Oracle SQL: get transactions within a period

I have 3 tables in Oracle SQL, namely investor, share and transaction.
I am trying to get new investors who invested in any share during a certain period. As they are new investors, there should not be a transaction in the transaction table for that investor against that share prior to the search period.
For the transaction table with the following records:
Id TranDt InvCode ShareCode
1 2020-01-01 00:00:00.000 inv1 S1
2 2019-04-01 00:00:00.000 inv1 S1
3 2020-04-01 00:00:00.000 inv1 S1
4 2021-03-06 11:50:20.560 inv2 S2
5 2020-04-01 00:00:00.000 inv3 S1
For the search period between 2020-01-01 and 2020-05-01, I should get the output as
5 2020-04-01 00:00:00.000 inv3 S1
Though there are transactions for inv1 in the table for that period, there is also a transaction prior to the search period, so inv1 shouldn't be included, as it's not considered a new investor within the search period.
The query below works, but it takes ages to return the results when called from C# code, leading to timeout issues. Is there anything we can do to refine it to get the results quicker?
WITH
INVESTORS AS
(
SELECT I.INVCODE FROM INVESTOR I WHERE I.CLOSED IS NULL
),
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
),
SHARES_IN_PERIOD AS
(
SELECT DISTINCT
T.INVCODE,
T.SHARECODE,
T.TYPE
FROM TRANSACTION T
JOIN INVESTORS I ON T.INVCODE = I.INVCODE
JOIN SHARES S ON T.SHARECODE = S.SHARECODE
WHERE T.TRANDT >= :startDate AND T.TRANDT <= :endDate
),
PREVIOUS_SHARES AS
(
SELECT DISTINCT
T.INVCODE,
T.SHARECODE,
T.TYPE
FROM TRANSACTION T
JOIN INVESTORS I ON T.INVCODE = I.INVCODE
JOIN SHARES S ON T.SHARECODE = S.SHARECODE
WHERE T.TRANDT < :startDate
)
SELECT
DISTINCT
SP.INVCODE AS InvestorCode,
SP.SHARECODE AS ShareCode,
SP.TYPE AS ShareType
FROM SHARES_IN_PERIOD SP
WHERE (SP.INVCODE, SP.SHARECODE, SP.TYPE) NOT IN
(
SELECT
PS.INVCODE,
PS.SHARECODE,
PS.TYPE
FROM PREVIOUS_SHARES PS
)
With the suggestion given by @Gordon Linoff, I tried the following options (for all the shares I need), but they are taking a long time too. The transaction table has over 32 million rows.
1.
WITH
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
)
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
join shares s on s.sharecode = t.sharecode
where seqnum = 1 and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
2.
WITH
INVESTORS AS
(
SELECT I.INVCODE FROM INVESTOR I WHERE I.CLOSED IS NULL
),
SHARES AS
(
SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL
)
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
join investors i on i.invcode = t.invcode
join shares s on s.sharecode = t.sharecode
where seqnum = 1 and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
3.
select t.invcode, t.sharecode, t.type
from (select t.*,
row_number() over (partition by invcode, sharecode, type order by trandt)
as seqnum
from transactions t
) t
where seqnum = 1 and
t.sharecode IN (SELECT S.SHARECODE FROM SHARE S WHERE S.DORMANT IS NULL)
and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
If you want to know if the first record in transactions for a share is during a period, you can use window functions:
select t.*
from (select t.*,
row_number() over (partition by invcode, sharecode order by trandt) as seqnum
from transactions t
) t
where seqnum = 1 and
t.sharecode = :sharecode and
t.trandt >= date '2020-01-01' and
t.trandt < date '2020-05-01';
For performance with this code, you want an index on transactions(invcode, sharecode, trandt).
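A minimal sketch of that index, assuming the table and column names used above (the index name itself is just illustrative):
-- supports PARTITION BY invcode, sharecode ORDER BY trandt
CREATE INDEX ix_transactions_inv_share_trandt
    ON transactions (invcode, sharecode, trandt);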

Find increase in history records in specific range

I want to find records in the date range 2019-01-01 to 2019-07-01 where the amount increased,
using table HISTORY:
DATE (date), AMOUNT (number), ID (varchar2(30))
I find the IDs inside the range correctly, assuming an increase/decrease can only happen when there are two records with the same Id:
with suspect as
(select id
from history
where history.date < to_date('2019-07-01', 'yyyy-mm-dd')
group by id
having count(1) > 1),
ids as
(select id
from history
join suspect
on history.id = suspect.id
where history.date > to_date('2019-01-01', 'yyyy-mm-dd')
and history.date < to_date('2019-07-01', 'yyyy-mm-dd'))
select count(distinct id)
from history a, history b
where a.id = b.id
and a.date < b.date
and a.amount < b.amount
The problem is that to find an increase I need to find the previous record, which can be before the time range.
I can find the last record before the time range, but I failed to use it:
ids_prevtime as (
select history.*, max(history.date) over (partition by history.id) max_date
from history
join ids on history.id = ids.id
where history.date < to_date('2019-01-01','yyyy-mm-dd')
), ids_prev as (
select * from ids_prevtime where date = max_date
)
I see that you found a solution, but maybe you could do it more simply, using lag():
select count(distinct id)
from (select id, date_, amount,
lag(amount) over (partition by id order by date_) prev_amt
from history)
where date_ between date '2019-01-01' and date '2019-07-01'
and amount > prev_amt;
Add a union of the last history records before the range with the records inside the range:
ids_prev as
(select ID, DATE, AMOUNT
from id_before_rangetime
where createddate = max_date),
ids_in_range as
(select history.*
from history
join ids
on history.ID = ids.ID
where history.date > to_date('2019-01-01', 'yyyy-mm-dd')
and history.date < to_date('2019-07-01', 'yyyy-mm-dd')),
all_relevant as
(select * from ids_in_range union all select * from ids_prev)
and then count increases:
select count(distinct id)
from all_relevant a, all_relevant b
where a.id = b.id
and a.date < b.date
and a.amount < b.amount

Postgres get sales for top account with ranking

I have the following tables:
Account (id, name)
Solution (id, name)
Sales (solution_id, account_id, month, year, amount)
I need to calculate the monthly sales of each account in a specific period:
SELECT
to_char(make_date(sales.year, sales.month, 1), 'YYYY-MM') AS period,
acc.id AS account_id,
acc.name AS account_name,
COALESCE(SUM(sales.net_sales), 0) AS amount
FROM
(SELECT *
FROM sales
WHERE make_date(year, month, 1) >= FROM_DATE
AND make_date(year, month, 1) <= TO_DATE) sales
INNER JOIN account acc.id = sales.account_id
GROUP BY sales.year, sales.month
ORDER BY sales.year, sales.month ASC
I can now calculate the total sales per account in the selected period:
SELECT
to_char(make_date(sales.year, sales.month, 1), 'YYYY-MM') AS period,
acc.id AS account_id,
acc.name AS account_name,
COALESCE(SUM(sales.net_sales), 0) AS amount
FROM
(SELECT *, COALESCE(SUM(net_sales) OVER (PARTITION BY client_id), 0) AS total
FROM sales
WHERE make_date(year, month, 1) >= FROM_DATE
AND make_date(year, month, 1) <= TO_DATE) sales
INNER JOIN account acc.id = sales.account_id
GROUP BY sales.year, sales.month
ORDER BY sales.year, sales.month ASC
Is there a way to rank the total sales in order to get only the n top account in the selected period?
Your queries are a bit of a mess. The first is not syntactically correct. I think you can simplify and the intention is:
SELECT to_char(make_date(s.year, s.month, 1), 'YYYY-MM') AS period,
a.id AS account_id, a.name AS account_name,
COALESCE(SUM(s.net_sales), 0) AS amount,
SUM(SUM(s.net_sales)) OVER (PARTITION BY a.id) as total
FROM sales s INNER JOIN
account a
ON a.id = s.account_id
WHERE make_date(s.year, s.month, 1) >= FROM_DATE AND
make_date(s.year, s.month, 1) <= TO_DATE
GROUP BY s.year, s.month, a.id, a.name
ORDER BY s.year, s.month ASC;
If you want to rank by total sales (or monthly sales), then you can use dense_rank():
SELECT ym.*
FROM (SELECT to_char(make_date(s.year, s.month, 1), 'YYYY-MM') AS period,
a.id AS account_id, a.name AS account_name,
COALESCE(SUM(s.net_sales), 0) AS amount,
total,
DENSE_RANK() OVER (ORDER BY total DESC) as seqnum
FROM (SELECT s.*, SUM(s.net_sales) OVER (PARTITION BY client_id) as total
FROM sales s
) s INNER JOIN
account a
ON a.id = s.account_id
WHERE make_date(s.year, s.month, 1) >= FROM_DATE AND
make_date(s.year, s.month, 1) <= TO_DATE
GROUP BY s.year, s.month, a.id, a.name, total
) ym
WHERE seqnum <= 3
ORDER BY ym.period ASC;
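If you would rather rank within each month (the "monthly sales" option), a sketch along the same lines, assuming the same tables and the FROM_DATE/TO_DATE placeholders used above:
SELECT ym.*
FROM (SELECT to_char(make_date(s.year, s.month, 1), 'YYYY-MM') AS period,
             a.id AS account_id, a.name AS account_name,
             COALESCE(SUM(s.net_sales), 0) AS amount,
             DENSE_RANK() OVER (PARTITION BY s.year, s.month
                                ORDER BY SUM(s.net_sales) DESC) AS seqnum
      FROM sales s INNER JOIN
           account a
           ON a.id = s.account_id
      WHERE make_date(s.year, s.month, 1) >= FROM_DATE AND
            make_date(s.year, s.month, 1) <= TO_DATE
      GROUP BY s.year, s.month, a.id, a.name
     ) ym
WHERE seqnum <= 3
ORDER BY ym.period ASC;
Here PARTITION BY s.year, s.month restarts the ranking each month, so seqnum <= 3 keeps the top three accounts per month rather than for the whole period.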

Oracle SQL: Show entries from component tables once apiece

My objective is to produce a dataset that shows a boatload of data from, in total, just shy of 50 tables, all in the same Oracle SQL database schema. Each table except the first consists of, as far as the report I'm building cares, two elements:
A foreign-key identifier that matches a row on the first table
A date
There may be many rows on one of these tables corresponding to one case, and it will NOT be the same number of rows from table to table.
My objective is to have each row in the first table show up as many times as needed to display all the results from the other tables once. So, something like this (except on a lot more tables):
CASE_FILE_ID INITIATED_DATE INSPECTION_DATE PAYMENT_DATE ACTION_DATE
------------ -------------- --------------- ------------ -----------
1000 10-JUL-1986 14-JUL-1987 10-JUL-1986
1000 14-JUL-1988 10-JUL-1987
1000 14-JUL-1989 10-JUL-1988
1000 10-JUL-1989
My current SQL code (shrunk down to five tables, but the rest all follow the same format as T1-T4):
SELECT DISTINCT
A.CASE_FILE_ID,
T1.DATE AS INITIATED_DATE,
T2.DATE AS INSPECTION_DATE,
T3.DATE AS PAYMENT_DATE,
T4.DATE AS ACTION_DATE
FROM
RECORDS.CASE_FILE A
LEFT OUTER JOIN RECORDS.INITIATE T1 ON A.CASE_FILE_ID = T1.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.INSPECTION T2 ON A.CASE_FILE_ID = T2.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.PAYMENT T3 ON A.CASE_FILE_ID = T3.CASE_FILE_ID
LEFT OUTER JOIN RECORDS.ACTION T4 ON A.CASE_FILE_ID = T4.CASE_FILE_ID
ORDER BY
A.CASE_FILE_ID
The problem is that the output this produces consists of all distinct combinations; so in the above example (where I added a 'WHERE' clause of A.CASE_FILE_ID = '1000'), instead of four rows for case 1000, it'd show twelve (1 Initiated Date * 3 Inspection Dates * 4 Payment Dates = 12 rows). Suffice it to say, as the number of tables increases, this would get very prohibitive in both display and runtime, very quickly.
What is the best way to get an output loosely akin to the ideal above, where any one date is only shown once? Failing that, is there a way to get it to only show as many lines for one CASE_FILE as it needs to show all the dates, even if some dates repeat within that?
There isn't a good way, but there are two ways. One method involves subqueries for each table and complex outer joins. The second involves subqueries and union all. Let's go with that one:
SELECT CASE_FILE_ID,
MAX(INITIATED_DATE) as INITIATED_DATE,
MAX(INSPECTION_DATE) as INSPECTION_DATE,
MAX(PAYMENT_DATE) as PAYMENT_DATE,
MAX(ACTION_DATE) as ACTION_DATE
FROM ((SELECT A.CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
1 as seqnum
FROM RECORDS.CASE_FILE A
) UNION ALL
(SELECT CASE_FILE_ID, DATE as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INITIATE
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, DATE as INSPECTION_DATE,
NULL as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INSPECTION
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
DATE as PAYMENT_DATE, NULL as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.PAYMENT
) UNION ALL
(SELECT CASE_FILE_ID, NULL as INITIATED_DATE, NULL as INSPECTION_DATE,
NULL as PAYMENT_DATE, DATE as ACTION_DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.ACTION
)
) a
GROUP BY CASE_FILE_ID, seqnum;
Hmmm, a closely related solution is easier to maintain:
SELECT CASE_FILE_ID,
MAX(CASE WHEN type = 'INITIATED' THEN DATE END) as INITIATED_DATE,
MAX(CASE WHEN type = 'INSPECTION' THEN DATE END) as INSPECTION_DATE,
MAX(CASE WHEN type = 'PAYMENT' THEN DATE END) as PAYMENT_DATE,
MAX(CASE WHEN type = 'ACTION' THEN DATE END) as ACTION_DATE
FROM ((SELECT A.CASE_FILE_ID, NULL as TYPE, NULL as DATE,
1 as seqnum
FROM RECORDS.CASE_FILE A
) UNION ALL
(SELECT CASE_FILE_ID, 'INITIATED', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INITIATE
) UNION ALL
(SELECT CASE_FILE_ID, 'INSPECTION', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.INSPECTION
) UNION ALL
(SELECT CASE_FILE_ID, 'PAYMENT', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.PAYMENT
) UNION ALL
(SELECT CASE_FILE_ID, 'ACTION', DATE,
ROW_NUMBER() OVER (PARTITION BY CASE_FILE_ID ORDER BY DATE) as seqnum
FROM RECORDS.ACTION
)
) a
GROUP BY CASE_FILE_ID, seqnum;

Filtering a WITH statement without using FROM

I want to count products shown in events between two dates. I have to fill 9 columns, each for a different product type.
I would like to ask if there is a way to shorten this statement.
The SQL below is my first attempt; it works but is not efficient.
with events(event_id, customer_id) as (
select * from event
where start_date >= :stare_date
and end_date <= :end_date
)
select
(select count(*) from event_product where event_id in (select event_id from events where customer_id = customer.customer_id) and product_type = 'YLW') customer_ylw_products, -- it works, but it's ugly and inefficient
-------
-- repeat seven times for other type of products
-------
(select count(*) from event_product where event_id in (select event_id from events where customer_id = customer.customer_id) and product_type = 'RTL') customer_rtl_products
from customer
;
Notice that the line
(select event_id from events where customer_id = customer.customer_id)
repeats about 9 times.
I've been trying to shorten this one by adding the following:
with events(event_id, customer_id) as (
select * from event
where start_date >= :stare_date
and end_date <= :end_date
),
customer_events (event_id, customer_id) as (select * from events)
select
(select count(*) from event_product where event_id in (select event_id from customer_events) and product_type = 'RTL') customer_rtl_products
from customers
where customer_events.customer_id = customer.customer_id -- doesn't work
having customer_events.customer_id = customer.customer_id -- doesn't work
Why don't you use case expressions?
WITH
events (event_id, customer_id)
AS (
SELECT
*
FROM event
WHERE start_date >= :stare_date
AND end_date <= :end_date
)
SELECT
*
FROM customer
LEFT JOIN (
SELECT
event_product.customer_id
, COUNT(CASE
WHEN event_product.product_type = 'YLW' THEN 1 END) AS count_YLW
, COUNT(CASE
WHEN event_product.product_type = 'RTL' THEN 1 END) AS count_RTL
FROM event_product
INNER JOIN events
ON event_product.event_id = events.event_id
GROUP BY
event_product.customer_id
) ev_counts
ON customer.customer_id = ev_counts.customer_id
;
You could do this without the CTE too if you prefer: just use what you currently have in the CTE as a derived table, placed where events now appears in the inner join.
Footnote: select * is a convenience only; I don't know what fields are to be used, but they should be specified.
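For illustration, a minimal sketch of that derived-table variant, keeping the same assumed table and column names (including the answer's assumption that event_product carries customer_id):
SELECT
    *
FROM customer
LEFT JOIN (
    SELECT
        event_product.customer_id
        , COUNT(CASE WHEN event_product.product_type = 'YLW' THEN 1 END) AS count_YLW
        , COUNT(CASE WHEN event_product.product_type = 'RTL' THEN 1 END) AS count_RTL
    FROM event_product
    INNER JOIN (
        -- the former events CTE, inlined as a derived table
        SELECT event_id, customer_id
        FROM event
        WHERE start_date >= :stare_date
          AND end_date <= :end_date
    ) events
        ON event_product.event_id = events.event_id
    GROUP BY
        event_product.customer_id
) ev_counts
    ON customer.customer_id = ev_counts.customer_id
;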
@Used_By_Already, thanks for the inspiration with the inner joins between event_product and event. event_product doesn't have a customer_id column, so I simply added it!
Here is my solution:
with events(event_id, customer_id) as (
select * from event
where start_date >= :stare_date
and end_date <= :end_date
),
product_events (customer_id, product_type) as (
select events.customer_id, event_product.product_type
from events, event_product
where event_product.event_id = events.event_id and event_product.product_type in (''product_types'')
)
select
(select count(*) from product_events where customer_id = customer.customer_id and product_type = 'RTL') customer_rtl_products
from customers;
Performance for 50 rows in the search improved from 45 seconds to only 5!
Thank you so much!