Aggregate columns based on different conditions? - sql

I have a Teradata query that generates:
customer | order | amount | days_ago
123 | 1 | 50 | 2
123 | 1 | 50 | 7
123 | 2 | 10 | 19
123 | 3 | 100 | 35
234 | 4 | 20 | 20
234 | 5 | 10 | 10
With performance in mind, what’s the most efficient way to produce an output per customer where orders is the number of distinct orders a customer had within the last 30 days and total is the sum of the amount of the distinct orders regardless of how many days ago the order was placed?
Desired output:
customer | orders | total
123 | 2 | 160
234 | 2 | 30

Given your rules, maybe it takes two steps - de-duplicate first then aggregate:
SELECT customer,
SUM(CASE WHEN days_ago <=30 THEN 1 ELSE 0 END) AS orders,
SUM(amount) AS total
FROM
(SELECT customer, order, MAX-or-MIN(amount) AS amount, MIN-or-MAX(days_ago) AS days_ago
FROM your_relation
GROUP BY 1, 2) AS DistinctCustOrder
GROUP BY 1;

Related

Subtract constant across database tables

I need to subtract a value, found in a different table, from values across different rows.
For example, the tables I have are:
ProductID | Warehouse | Locator | qtyOnHand
-------------------------------------------
100 | A | 123 | 12
100 | A | 124 | 12
100 | A | 124 | 8
101 | A | 126 | 6
101 | B | 127 | 12
ProductID | Sold
----------------
100 | 26
101 | 16
Result:
ProductID | Warehouse | Locator | qtyOnHand | available
-------------------------------------------------------
100 | A | 123 | 12 | 0
100 | A | 123 | 12 | 0
100 | A | 124 | 8 | 6
101 | A | 126 | 6 | 0
101 | B | 127 | 12 | 12
The value should only be subtracted from those in warehouse A.
Im using postgresql. Any help is much appreciated!
If I understand correctly, you want to compare the overall stock to the cumulative amounts in the first table. The rows in the first table appear to be ordered from largest to smallest. Note: This is an interpretation and not 100% consistent with the data in the question.
Use JOIN to bring the data together and then cumulative sums and arithmetic:
select t1.*,
(case when running_qoh < t2.sold then 0
when running_qoh - qtyOnHand < t2.sold then (running_qoh - t2.sold)
else qtyOnHand
end) as available
from (select t1.*,
sum(qtyOnHand) over (partition by productID order by qtyOnHand desc) as running_qoh
from table1 t1
) t1 join
table2 t2
using (ProductID)

count total items, sold items (in another table reference by id) and grouped by serial number

I have a table of items in the shop, an item may have different entries with same serial number (sn) (but different ids) if the same item was bought again later on with different price (price here is how much did a single item cost the shop)
id | sn | amount | price
----+------+--------+-------
1 | AP01 | 100 | 7
2 | AP01 | 50 | 8
3 | X2P0 | 200 | 12
4 | X2P0 | 30 | 18
5 | STT0 | 20 | 20
6 | PLX1 | 200 | 10
and a table of transactions
id | item_id | price
----+---------+-------
1 | 1 | 10
2 | 1 | 9
3 | 1 | 10
4 | 2 | 11
5 | 3 | 15
6 | 3 | 15
7 | 3 | 15
8 | 4 | 18
9 | 5 | 22
10 | 5 | 22
11 | 5 | 22
12 | 5 | 22
and transaction.item_id references items(id)
I want to group items by serial number (sn), get their sum(amount) and avg(price), and join it with a sold column that counts number of transactions with referenced id
I did the first with
select i.sn, sum(i.amount), avg(i.price) from items i group by i.sn;
sn | sum | avg
------+-----+---------------------
STT0 | 20 | 20.0000000000000000
PLX1 | 200 | 10.0000000000000000
AP01 | 150 | 7.5000000000000000
X2P0 | 230 | 15.0000000000000000
Then when I tried to join it with transactions I got strange results
select i.sn, sum(i.amount), avg(i.price) avg_cost, count(t.item_id) sold, sum(t.price) profit from items i left join transactions t on (i.id=t.item_id) group by i.sn;
sn | sum | avg_cost | sold | profit
------+-----+---------------------+------+--------
STT0 | 80 | 20.0000000000000000 | 4 | 88
PLX1 | 200 | 10.0000000000000000 | 0 | (null)
AP01 | 350 | 7.2500000000000000 | 4 | 40
X2P0 | 630 | 13.5000000000000000 | 4 | 63
As you can see, only the sold and profit columns show correct results, the sum and avg show different results than the expected
I can't separate the statements because I am not sure how can I add the count to the sn group which has the item_id as its id?
select
j.sn,
j.sum,
j.avg,
count(item_id)
from (
select
i.sn,
sum(i.amount),
avg(i.price)
from items i
group by i.sn
) j
left join transactions t
on (j.id???=t.item_id);
There are multiple matches in both tables, so the join multiplies the rows (and eventually produces wron results). I would recommend pre-joining, then aggregating:
select
sn,
sum(amount) total_amount,
avg(price) avg_price,
sum(no_transactions) no_transactions
from (
select
i.*,
(
select count(*)
from transactions t
where t.item_id = i.id
) no_transactions
from items i
) t
group by sn

SQL: How do I get a filtered count for each distinct row in a table?

Note: this is different from questions that ask "I want a count for each distinct row in a table" which has been answered numerous times. This is a filtered count, so the counting part of the query needs a more complex WHERE clause. Consider this dataset:
customer_id | user_id | age
-----------------------------
1 | 932 | 20
1 | 21 | 3
1 | 2334 | 32
2 | 232 | 10
2 | 238 | 28
3 | 838 | 39
3 | 928 | 83
4 | 842 | 12
I want to query this table and know the number of users over the age of 13 for each distinct customer_id. So the result would be:
customer_id | over_13_count
-----------------------------
1 | 2
2 | 1
3 | 2
4 | 0
I've tried something like this but it just runs forever, so I think I'm doing it wrong:
SELECT DISTINCT customer_id,
(SELECT COUNT(*) FROM mytable AS m2 WHERE m2.customer_id = m1.customer_id AND age > 13) AS over_13_count
FROM mytable AS m1
ORDER BY customer_id
Just use conditional aggregation:
SELECT customer_id,
SUM(CASE WHEN age > 13 THEN 1 ELSE 0 END) asover_13_count
FROM mytable m1
GROUP BY customer_id

Subtract the value of a row from grouped result

I have a table supplier_account which has five coloumns supplier_account_id(pk),supplier_id(fk),voucher_no,debit and credit. I want to get the sum of debit grouped by supplier_id and then subtract the value of credit of the rows in which voucher_no is not null. So for each subsequent rows the value of sum of debit gets reduced. I have tried using 'with' clause.
with debitdetails as(
select supplier_id,sum(debit) as amt
from supplier_account group by supplier_id
)
select acs.supplier_id,s.supplier_name,acs.purchase_voucher_no,acs.purchase_voucher_date,dd.amt-acs.credit as amount
from supplier_account acs
left join supplier s on acs.supplier_id=s.supplier_id
left join debitdetails dd on acs.supplier_id=dd.supplier_id
where voucher_no is not null
But here the debit value will be same for all rows. After subtraction in the first row I want to get the result in second row and subtract the next credit value from that.
I know it is possible by using temporary tables. The problem is I cannot use temporary tables because the procedure is used to generate reports using Jasper Reports.
What you need is an implementation of the running total. The easiest way to do it with a help of a window function:
with debitdetails as(
select id,sum(debit) as amt
from suppliers group by id
)
select s.id, purchase_voucher_no, dd.amt, s.credit,
dd.amt - sum(s.credit) over (partition by s.id order by purchase_voucher_no asc)
from suppliers s
left join debitdetails dd on s.id=dd.id
order by s.id, purchase_voucher_no
SQL Fiddle
Results:
| id | purchase_voucher_no | amt | credit | ?column? |
|----|---------------------|-----|--------|----------|
| 1 | 1 | 43 | 5 | 38 |
| 1 | 2 | 43 | 18 | 20 |
| 1 | 3 | 43 | 8 | 12 |
| 2 | 4 | 60 | 5 | 55 |
| 2 | 5 | 60 | 15 | 40 |
| 2 | 6 | 60 | 30 | 10 |

How to insert additional values in between a GROUP BY

i am currently making a monthly report using MySQL. I have a table named "monthly" that looks something like this:
id | date | amount
10 | 2009-12-01 22:10:08 | 7
9 | 2009-11-01 22:10:08 | 78
8 | 2009-10-01 23:10:08 | 5
7 | 2009-07-01 21:10:08 | 54
6 | 2009-03-01 04:10:08 | 3
5 | 2009-02-01 09:10:08 | 456
4 | 2009-02-01 14:10:08 | 4
3 | 2009-01-01 20:10:08 | 20
2 | 2009-01-01 13:10:15 | 10
1 | 2008-12-01 10:10:10 | 5
Then, when i make a monthly report (which is based by per month of per year), i get something like this.
yearmonth | total
2008-12 | 5
2009-01 | 30
2009-02 | 460
2009-03 | 3
2009-07 | 54
2009-10 | 5
2009-11 | 78
2009-12 | 7
I used this query to achieved the result:
SELECT substring( date, 1, 7 ) AS yearmonth, sum( amount ) AS total
FROM monthly
GROUP BY substring( date, 1, 7 )
But I need something like this:
yearmonth | total
2008-01 | 0
2008-02 | 0
2008-03 | 0
2008-04 | 0
2008-05 | 0
2008-06 | 0
2008-07 | 0
2008-08 | 0
2008-09 | 0
2008-10 | 0
2008-11 | 0
2008-12 | 5
2009-01 | 30
2009-02 | 460
2009-03 | 3
2009-05 | 0
2009-06 | 0
2009-07 | 54
2009-08 | 0
2009-09 | 0
2009-10 | 5
2009-11 | 78
2009-12 | 7
Something that would display the zeroes for the month that doesnt have any value. Is it even possible to do that in a MySQL query?
You should generate a dummy rowsource and LEFT JOIN with it:
SELECT *
FROM (
SELECT 1 AS month
UNION ALL
SELECT 2
…
UNION ALL
SELECT 12
) months
CROSS JOIN
(
SELECT 2008 AS year
UNION ALL
SELECT 2009 AS year
) years
LEFT JOIN
mydata m
ON m.date >= CONCAT_WS('.', year, month, 1)
AND m.date < CONCAT_WS('.', year, month, 1) + INTERVAL 1 MONTH
GROUP BY
year, month
You can create these as tables on disk rather than generate them each time.
MySQL is the only system of the major four that does have allow an easy way to generate arbitrary resultsets.
Oracle, SQL Server and PostgreSQL do have those (CONNECT BY, recursive CTE's and generate_series, respectively)
Quassnoi is right, and I'll add a comment about how to recognize when you need something like this:
You want '2008-01' in your result, yet nothing in the source table has a date in January, 2008. Result sets have to come from the tables you query, so the obvious conclusion is that you need an additional table - one that contains each month you want as part of your result.