SQL: How do I get a filtered count for each distinct row in a table? - sql

Note: this is different from questions that ask "I want a count for each distinct row in a table", which have been answered numerous times. This is a filtered count, so the counting part of the query needs a more complex WHERE clause. Consider this dataset:
customer_id | user_id | age
-----------------------------
1 | 932 | 20
1 | 21 | 3
1 | 2334 | 32
2 | 232 | 10
2 | 238 | 28
3 | 838 | 39
3 | 928 | 83
4 | 842 | 12
I want to query this table and know the number of users over the age of 13 for each distinct customer_id. So the result would be:
customer_id | over_13_count
-----------------------------
1 | 2
2 | 1
3 | 2
4 | 0
I've tried something like this but it just runs forever, so I think I'm doing it wrong:
SELECT DISTINCT customer_id,
(SELECT COUNT(*) FROM mytable AS m2 WHERE m2.customer_id = m1.customer_id AND age > 13) AS over_13_count
FROM mytable AS m1
ORDER BY customer_id

Just use conditional aggregation:
SELECT customer_id,
SUM(CASE WHEN age > 13 THEN 1 ELSE 0 END) AS over_13_count
FROM mytable m1
GROUP BY customer_id
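
As a quick check of the conditional aggregation above, here is a minimal sketch that runs the same query against the sample rows from the question. It assumes SQLite (through Python's built-in sqlite3 module) rather than the asker's actual database, which the question does not name.

```python
import sqlite3

# In-memory SQLite database seeded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (customer_id INTEGER, user_id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 932, 20), (1, 21, 3), (1, 2334, 32), (2, 232, 10),
     (2, 238, 28), (3, 838, 39), (3, 928, 83), (4, 842, 12)],
)

# Every row stays in its group; the CASE contributes 1 only when age > 13,
# so customers with no matching users still appear with a count of 0.
over_13 = conn.execute(
    """
    SELECT customer_id,
           SUM(CASE WHEN age > 13 THEN 1 ELSE 0 END) AS over_13_count
    FROM mytable
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

print(over_13)  # [(1, 2), (2, 1), (3, 2), (4, 0)]
```

Putting the condition in a WHERE clause instead would drop customer 4 entirely, because none of its rows pass the filter.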

Related

Aggregate columns based on different conditions?

I have a Teradata query that generates:
customer | order | amount | days_ago
123 | 1 | 50 | 2
123 | 1 | 50 | 7
123 | 2 | 10 | 19
123 | 3 | 100 | 35
234 | 4 | 20 | 20
234 | 5 | 10 | 10
With performance in mind, what’s the most efficient way to produce an output per customer where orders is the number of distinct orders a customer had within the last 30 days and total is the sum of the amount of the distinct orders regardless of how many days ago the order was placed?
Desired output:
customer | orders | total
123 | 2 | 160
234 | 2 | 30
Given your rules, maybe it takes two steps - de-duplicate first then aggregate:
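SELECT customer,
SUM(CASE WHEN days_ago <= 30 THEN 1 ELSE 0 END) AS orders,
SUM(amount) AS total
FROM
-- Duplicates of an order share one amount, so MAX or MIN both work.
-- MIN(days_ago) keeps the most recent occurrence; use MAX for the oldest.
-- ORDER is a reserved word, so the column name is quoted.
(SELECT customer, "order", MAX(amount) AS amount, MIN(days_ago) AS days_ago
FROM your_relation
GROUP BY 1, 2) AS DistinctCustOrder
GROUP BY 1;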
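
The two-step approach (de-duplicate, then aggregate) can be tried out on the sample rows. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 module instead of Teradata, it picks MAX(amount) and MIN(days_ago) for the de-duplication step, and it renames the `order` column to a hypothetical `order_id` to avoid the reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# order_id is a hypothetical name standing in for the question's "order" column.
conn.execute(
    "CREATE TABLE your_relation (customer INTEGER, order_id INTEGER, amount INTEGER, days_ago INTEGER)"
)
conn.executemany(
    "INSERT INTO your_relation VALUES (?, ?, ?, ?)",
    [(123, 1, 50, 2), (123, 1, 50, 7), (123, 2, 10, 19),
     (123, 3, 100, 35), (234, 4, 20, 20), (234, 5, 10, 10)],
)

per_customer = conn.execute(
    """
    SELECT customer,
           SUM(CASE WHEN days_ago <= 30 THEN 1 ELSE 0 END) AS orders,
           SUM(amount) AS total
    FROM (
        -- Step 1: collapse duplicate rows to one row per (customer, order).
        SELECT customer, order_id,
               MAX(amount) AS amount,
               MIN(days_ago) AS days_ago
        FROM your_relation
        GROUP BY customer, order_id
    ) AS DistinctCustOrder
    -- Step 2: aggregate the de-duplicated orders per customer.
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

print(per_customer)  # [(123, 2, 160), (234, 2, 30)]
```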

Oracle: apply condition over aggregated functions and not for all rows of the SQL statement

I want to calculate for an Oracle table the max value from a select statement based not on all rows returned from the query, but on a condition over a sub-set of those rows.
I'll try to explain better with an example:
ID | NAME | AGE | COMPANY
1 | Alex | 28 | A
2 | Alan | 22 | A
3 | Bob | 21 | B
4 | Carl | 20 | C
5 | Dave | 24 | C
6 | Eric | 26 | C
7 | Matt | 33 | D
For every company, I want the maximum age among people under 25, but I also want to count the total number of people in each company.
So, I want this result:
COMPANY | DEPENDENTS | MAX_UNDER_25
A | 2 | 22
B | 1 | 21
C | 3 | 24
D | 1 | (null)
How can I obtain this result with a single SQL query, without joining one subquery for the record count to another for the conditional max?
I want to avoid this select:
SELECT R1.COMPANY, R1.DEPENDENTS, R2.MAX_UNDER_25 FROM
(SELECT COMPANY, COUNT(*) DEPENDENTS FROM TABLE
GROUP BY COMPANY) R1
LEFT JOIN
(SELECT COMPANY, MAX(AGE) MAX_UNDER_25 FROM TABLE
WHERE AGE < 25
GROUP BY COMPANY) R2
ON R1.COMPANY = R2.COMPANY;
Is it possible to obtain that with a simpler query?
Simply use a case expression inside MAX. Rows that fail the condition yield NULL, which MAX ignores:
select company, count(*) as dependents, max(case when age < 25 then age end) as max_under_25
from table
group by company
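
To see the case-inside-MAX idea on the sample data, here is a small sketch. It assumes SQLite via Python's sqlite3 module rather than Oracle, and uses a hypothetical table name `people` because `table` is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a hypothetical table name; the question only calls it TABLE.
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, age INTEGER, company TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [(1, "Alex", 28, "A"), (2, "Alan", 22, "A"), (3, "Bob", 21, "B"),
     (4, "Carl", 20, "C"), (5, "Dave", 24, "C"), (6, "Eric", 26, "C"),
     (7, "Matt", 33, "D")],
)

# COUNT(*) sees every row in the group, while the CASE without ELSE turns
# ages of 25 and over into NULL, which MAX skips.
by_company = conn.execute(
    """
    SELECT company,
           COUNT(*) AS dependents,
           MAX(CASE WHEN age < 25 THEN age END) AS max_under_25
    FROM people
    GROUP BY company
    ORDER BY company
    """
).fetchall()

print(by_company)  # [('A', 2, 22), ('B', 1, 21), ('C', 3, 24), ('D', 1, None)]
```

Company D has nobody under 25, so its MAX is computed over NULLs only and comes back as NULL (None in Python).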

select only tuples where second column always has same value

I have a similar table to this one
ID | CountryID
1 | 22
1 | 22
2 | 19
3 | 0
3 | 14
3 | 18
3 | 21
3 | 22
3 | 23
4 | 19
5 | 9
5 | 9
6 | 14
and I want to group by the first ID column but select only rows, where the CountryID has the same value throughout an ID. The resulting table should look like
ID | CountryID
1 | 22
2 | 19
4 | 19
5 | 9
6 | 14
Any ideas?
I think the following query should work:
SELECT ID, MAX(CountryID)
FROM Table1
GROUP BY ID
HAVING MIN(CountryID) = MAX(CountryID)
Alternatively, keep only the IDs that have exactly one distinct CountryID:
SELECT ID, MAX(CountryID) AS CountryID
FROM Table1
GROUP BY ID
HAVING count(distinct CountryID) = 1
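
Both HAVING conditions can be compared on the sample rows. This is a minimal sketch assuming SQLite via Python's sqlite3 module; the second query selects MAX(CountryID) so that both return the CountryID itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, CountryID INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, 22), (1, 22), (2, 19), (3, 0), (3, 14), (3, 18), (3, 21),
     (3, 22), (3, 23), (4, 19), (5, 9), (5, 9), (6, 14)],
)

# Variant 1: a group holds a single value exactly when its MIN equals its MAX.
min_max = conn.execute(
    """
    SELECT ID, MAX(CountryID) AS CountryID
    FROM Table1
    GROUP BY ID
    HAVING MIN(CountryID) = MAX(CountryID)
    ORDER BY ID
    """
).fetchall()

# Variant 2: count the distinct values in each group directly.
distinct_count = conn.execute(
    """
    SELECT ID, MAX(CountryID) AS CountryID
    FROM Table1
    GROUP BY ID
    HAVING COUNT(DISTINCT CountryID) = 1
    ORDER BY ID
    """
).fetchall()

print(min_max)  # [(1, 22), (2, 19), (4, 19), (5, 9), (6, 14)]
```

The two variants differ when CountryID can be NULL: a group whose values are all NULL fails both conditions, but a group mixing NULL with one repeated value passes both, since aggregates ignore NULLs.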

Update a column and refer back it in the same query

I have a table in SQL Server 2014 and need to recursively update a column based on its previous value. For example:
---------------------------------------
ID | price | diff_with_prev_price |
---------------------------------------
1 | 29 | 0 |
2 | 25 | 0 |
3 | 20 | 0 |
4 | 35 | 0 |
5 | 40 | 0 |
--------------------------------------|
I want to recursively update third column like below
---------------------------------------
ID | price | diff_with_prev_price |
---------------------------------------
1 | 29 | 0 |
2 | 25 | 25 |
3 | 20 | 5 |
4 | 35 | -30 |
5 | 40 | 10 |
--------------------------------------|
It is the summation of previous value of third column with next value of 'price'.
Can someone please give a hint on how to do this using either a CTE or LEAD/LAG, but without cursors? I have to update millions of rows.
You can try this:
-- Build a temp table holding the sample data
SELECT 1 AS ID, 29 AS price, 0 AS diff_with_prev_price
INTO #tmp
UNION SELECT 2 AS ID, 25 AS price, 0 AS diff_with_prev_price
UNION SELECT 3 AS ID, 20 AS price, 0 AS diff_with_prev_price
UNION SELECT 4 AS ID, 35 AS price, 0 AS diff_with_prev_price
UNION SELECT 5 AS ID, 40 AS price, 0 AS diff_with_prev_price;

-- Compute price minus the previous row's price, then update through the CTE.
-- The statement before WITH must end with a semicolon.
WITH cte AS
(
    SELECT
        ID
        , price
        , diff_with_prev_price
        , price - ISNULL(LAG(price) OVER (ORDER BY ID), 0) AS new_value
    FROM #tmp
)
UPDATE t
SET diff_with_prev_price = t.new_value
FROM cte t;

SELECT * FROM #tmp;
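
The LAG-based update can be sketched outside SQL Server as well. The example below is an adaptation, not the answer's exact statement: it assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module, uses COALESCE in place of ISNULL, and applies the CTE through a correlated subquery because SQLite cannot update a CTE directly. The table name `prices` is hypothetical. It computes each price minus the previous row's price, which is what the answer's query does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "prices" is a hypothetical table name standing in for the answer's #tmp.
conn.execute(
    "CREATE TABLE prices (id INTEGER PRIMARY KEY, price INTEGER, diff_with_prev_price INTEGER)"
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, 0)",
    [(1, 29), (2, 25), (3, 20), (4, 35), (5, 40)],
)

# LAG(price) returns the previous row's price in id order (NULL for the first
# row, replaced by 0 here); the UPDATE then copies each computed value back.
conn.execute(
    """
    WITH cte AS (
        SELECT id,
               price - COALESCE(LAG(price) OVER (ORDER BY id), 0) AS new_value
        FROM prices
    )
    UPDATE prices
    SET diff_with_prev_price = (SELECT new_value FROM cte WHERE cte.id = prices.id)
    """
)

updated = conn.execute("SELECT * FROM prices ORDER BY id").fetchall()
print(updated)  # [(1, 29, 29), (2, 25, -4), (3, 20, -5), (4, 35, 15), (5, 40, 5)]
```

This is a single set-based statement, so no cursor or row-by-row loop is involved.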

Subtract the value of a row from grouped result

I have a table supplier_account which has five columns: supplier_account_id (pk), supplier_id (fk), voucher_no, debit and credit. I want to get the sum of debit grouped by supplier_id and then subtract the value of credit for the rows in which voucher_no is not null, so that for each subsequent row the remaining sum of debit gets reduced. I have tried using a 'with' clause.
with debitdetails as(
select supplier_id,sum(debit) as amt
from supplier_account group by supplier_id
)
select acs.supplier_id,s.supplier_name,acs.purchase_voucher_no,acs.purchase_voucher_date,dd.amt-acs.credit as amount
from supplier_account acs
left join supplier s on acs.supplier_id=s.supplier_id
left join debitdetails dd on acs.supplier_id=dd.supplier_id
where voucher_no is not null
But here the debit value will be the same for all rows. After the subtraction in the first row, I want to carry the result into the second row and subtract the next credit value from that.
I know it is possible by using temporary tables. The problem is I cannot use temporary tables because the procedure is used to generate reports using Jasper Reports.
What you need is an implementation of a running total. The easiest way to do it is with the help of a window function:
with debitdetails as(
select id,sum(debit) as amt
from suppliers group by id
)
select s.id, purchase_voucher_no, dd.amt, s.credit,
dd.amt - sum(s.credit) over (partition by s.id order by purchase_voucher_no asc)
from suppliers s
left join debitdetails dd on s.id=dd.id
order by s.id, purchase_voucher_no
Results:
| id | purchase_voucher_no | amt | credit | ?column? |
|----|---------------------|-----|--------|----------|
| 1 | 1 | 43 | 5 | 38 |
| 1 | 2 | 43 | 18 | 20 |
| 1 | 3 | 43 | 8 | 12 |
| 2 | 4 | 60 | 5 | 55 |
| 2 | 5 | 60 | 15 | 40 |
| 2 | 6 | 60 | 30 | 10 |
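
The running-total query can be reproduced with a small sketch. Assumptions: it runs on SQLite 3.25 or later via Python's sqlite3 module, and the per-row debit values are made up so that they sum to the 43 and 60 shown in the results; only the credits and totals come from the table above. The alias `balance` is also an addition, replacing the unnamed `?column?`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suppliers (id INTEGER, purchase_voucher_no INTEGER, debit INTEGER, credit INTEGER)"
)
# Debit values are hypothetical; they only need to sum to 43 and 60 per supplier.
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?, ?, ?)",
    [(1, 1, 20, 5), (1, 2, 13, 18), (1, 3, 10, 8),
     (2, 4, 30, 5), (2, 5, 20, 15), (2, 6, 10, 30)],
)

# SUM(credit) OVER (PARTITION BY ... ORDER BY ...) is a running total of the
# credits within each supplier; subtracting it from the total debit gives the
# remaining balance after each voucher.
balances = conn.execute(
    """
    WITH debitdetails AS (
        SELECT id, SUM(debit) AS amt
        FROM suppliers
        GROUP BY id
    )
    SELECT s.id, s.purchase_voucher_no, dd.amt, s.credit,
           dd.amt - SUM(s.credit) OVER (
               PARTITION BY s.id ORDER BY s.purchase_voucher_no
           ) AS balance
    FROM suppliers s
    LEFT JOIN debitdetails dd ON s.id = dd.id
    ORDER BY s.id, s.purchase_voucher_no
    """
).fetchall()

for row in balances:
    print(row)
```

Because this is a single SELECT, it needs no temporary tables, which fits the Jasper Reports constraint mentioned in the question.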