Sum two rows with different IDs - sql

I have the following data:
ID | Field A | Quantity
1 | A | 10
2 | B | 20
3 | C | 30
I would like to sum the rows with IDs 1 and 3 so that the result is:
ID | Field A | Quantity
1 | A | 40
2 | B | 20
This sounds to me more like code manipulation than SQL, but I still want to try it.
My DBMS is SQL Server.

You can try using CASE WHEN:
select case when id in(1,3) then 'A' else 'B' end as field,sum(Quantity) as Quantity
from tablename group by case when id in(1,3) then 'A' else 'B' end

I think simple aggregation does what you want:
select min(id) as id, min(fieldA) as fieldA, sum(quantity) as quantity
from t
group by (case when id in (1, 3) then 1 else id end);

Not sure what method to use to combine 1,3 ID but you could try case:
select
case when id in (1,3) then 1 else id end
, min("Field A") "Field A"
, sum(quantity) quantity from myTable
group by case when id in (1,3) then 1 else id end
The above uses group by to aggregate the data. In this case it organizes the ID field using some logic to combine 1,3. All other unique ID will have its own group.
Aggregate functions take care of the other fields, including logic to take the min() value for Field A which seems to fit your requirement
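The grouping-key idea from the answers above can be tried end to end with a minimal sketch. It assumes an in-memory SQLite database driven from Python rather than SQL Server, and the table name myTable is taken from the last answer; the CASE expression itself is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE myTable (id INTEGER, "Field A" TEXT, quantity INTEGER)')
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "B", 20), (3, "C", 30)],
)

# Map ids 1 and 3 onto the same grouping key; every other id keeps its own key.
merged_rows = conn.execute(
    """
    SELECT CASE WHEN id IN (1, 3) THEN 1 ELSE id END AS id,
           MIN("Field A") AS "Field A",
           SUM(quantity) AS quantity
    FROM myTable
    GROUP BY CASE WHEN id IN (1, 3) THEN 1 ELSE id END
    ORDER BY 1
    """
).fetchall()

print(merged_rows)
```

MIN("Field A") picks 'A' for the merged group only because 'A' sorts before 'C'; a different rule for the surviving label would need a different aggregate.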

Related

postgres to grab the first N most populous groups and put the rest in an "Other" group

I have a table with the columns status, operator, and cost, where the columns status and operator are categorical and I want to sum up the cost per (status, operator) pair. Usually I would do this with a simple statement like
SELECT SUM(cost), status, operator FROM my_table GROUP BY status, operator;
but the hard part is that there could be hundreds of unique operators, which I can't visualize for the client in a meaningful way. What I want is to explicitly show only the top N operator categories (meaning the top N operators that have the highest SUM(cost) across the entire dataset) and then group all of the remaining rows into an "Other" operator. An inefficient way to do this would be the following:
-- letting N = 12
SELECT
SUM(cost),
status,
CASE
WHEN operator IN (
SELECT t.operator
FROM my_table AS t
GROUP BY t.operator ORDER BY SUM(t.cost) DESC
LIMIT 12
) THEN operator
ELSE 'Other'
END AS operator
FROM my_table
GROUP BY
status,
CASE
WHEN operator IN (
SELECT t.operator
FROM my_table AS t
GROUP BY t.operator ORDER BY SUM(t.cost) DESC
LIMIT 12
) THEN operator
ELSE 'Other'
END;
While the inefficient way works, in production it is far too slow. In actuality, the cost is not a simple column in a table but is computed by a subquery that is very slow to compute and the table is large, so I can't afford to use the CASE statement with an IN clause. What I would rather do is somehow have the full table where I use the GROUP BY statement I listed first in a FROM-clause subquery, then aggregate that to get the top N operator categories and an "Other" category. I tried to do this with window functions but I don't really understand how those work and I could not find something that got the right answer. If somebody could help it would be greatly appreciated.
EDIT: The cost column is not an actual column. I should have been more clear. It's computed by a very expensive subquery so I want to compute the cost for each row of the original table as few times as possible.
Example:
Say we have a table that looks like:
pk | status | operator | cost
----+-----------+-------------------+----------------------
1 | A | op_1 | 1
2 | A | op_1 | 5
3 | A | op_1 | 3
4 | A | op_1 | 7
5 | B | op_2 | 10
6 | B | op_2 | 15
7 | A | op_3 | 100
8 | A | op_4 | 1000
9 | B | op_5 | 12000
10 | A | op_5 | 10200
11 | B | op_5 | 10020
If I only want the top 3 operators (meaning the three operators with the highest SUM(cost) - in this case operators 3, 4, 5), the query should return:
status | operator | cost
-----------+-------------------+----------------------
B | op_5 | 32220
A | op_4 | 1000
A | op_3 | 100
B | Other | 25
A | Other | 16
In this example, operators 1-2 get rolled up into the "Other" operator, since we only want the top 3 given explicitly. So the first "Other" row in the result table sums all rows where status=B and operator is not one of the top three operators. The second "Other" row sums up all the rows where status=A and operator is not one of the top three operators.
I have converted your query from sub-query to join. You can use Left join as follows:
SELECT
SUM(cost),
status,
CASE
WHEN tt.operator is not null
THEN tt.operator
ELSE 'Other'
END AS operator
FROM my_table t
LEFT JOIN (
SELECT t.operator FROM my_table AS t
GROUP BY t.operator
ORDER BY SUM(t.cost) DESC LIMIT 12 ) tt
On t.operator = tt.operator
GROUP BY
status,
CASE
WHEN tt.operator is not null
THEN tt.operator
ELSE 'Other'
END;
Now, coming to what I understood from the description: you want 13 or fewer rows (12 operators and 1 other) for a particular status when N = 12. You can use the row_number window function as follows:
SELECT SUM(cost),
status,
operator
FROM
(SELECT SUM(cost) AS cost, -- alias needed so the outer query can reference it
status,
CASE WHEN ROW_NUMBER() OVER (PARTITION BY status ORDER BY SUM(cost) DESC) <= 12
THEN operator
ELSE 'Other'
END AS operator
FROM my_table
GROUP BY status, operator) t
GROUP BY status, operator;
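The question asks for the expensive per-(status, operator) aggregation to be written only once and then reused to pick the global top N. One way to sketch that is with two common table expressions. This is an illustration under assumptions, not a tuned production query: it runs on in-memory SQLite from Python, the names per_pair, top_ops, pair_cost and total_cost are made up for the example, and whether the engine actually evaluates per_pair a single time depends on its planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (pk INTEGER, status TEXT, operator TEXT, cost INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "A", "op_1", 1), (2, "A", "op_1", 5), (3, "A", "op_1", 3),
        (4, "A", "op_1", 7), (5, "B", "op_2", 10), (6, "B", "op_2", 15),
        (7, "A", "op_3", 100), (8, "A", "op_4", 1000), (9, "B", "op_5", 12000),
        (10, "A", "op_5", 10200), (11, "B", "op_5", 10020),
    ],
)

TOP_N_QUERY = """
WITH per_pair AS (   -- the costly aggregation appears only here
    SELECT status, operator, SUM(cost) AS pair_cost
    FROM my_table
    GROUP BY status, operator
),
top_ops AS (         -- top N operators by overall cost, derived from per_pair
    SELECT operator
    FROM per_pair
    GROUP BY operator
    ORDER BY SUM(pair_cost) DESC
    LIMIT ?
)
SELECT p.status,
       CASE WHEN t.operator IS NULL THEN 'Other' ELSE p.operator END AS operator,
       SUM(p.pair_cost) AS total_cost
FROM per_pair AS p
LEFT JOIN top_ops AS t ON t.operator = p.operator
GROUP BY p.status,
         CASE WHEN t.operator IS NULL THEN 'Other' ELSE p.operator END
ORDER BY total_cost DESC
"""

top_rows = conn.execute(TOP_N_QUERY, (3,)).fetchall()
for row in top_rows:
    print(row)
```

Because op_5 occurs under both statuses in the sample data, grouping by (status, operator) yields one op_5 row per status, alongside one "Other" row per status.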

How to groupby by aggregating different keys in SQL

I have a table like the one below.
I would like to group by a newly generated key such as D, where D means A or B.
In this case, the count for D is 2 because A and B have 1 record each.
Is there any way to generate new keys and group by them?
product sex age
A M 10
B M 20
C F 30
My desired result is like below.
product count
C 1
D (A or B) 2
If you have had the same experience, please let me know.
Thanks
Instead of the column product you must group by a derived column that matches your condition:
select
case when product in ('A', 'B') then 'D' else product end product,
count(*)
from tablename
group by case when product in ('A', 'B') then 'D' else product end
Results:
| newproduct | count(*) |
| ---------- | -------- |
| C | 1 |
| D | 2 |
An ANSI SQL compliant query: use a case expression in a derived table to map A and B to D, then GROUP BY its result:
select product, count(*)
from
(
select case when product in ('A', 'B') then 'D' else product end product
from tablename
) dt
group by product
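The two answers above should return the same rows. A small sketch to check that, assuming an in-memory SQLite database driven from Python and the placeholder table name tablename from the answers (the alias cnt is made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (product TEXT, sex TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [("A", "M", 10), ("B", "M", 20), ("C", "F", 30)],
)

# Variant 1: repeat the CASE expression in SELECT and GROUP BY.
inline_counts = conn.execute(
    """
    SELECT CASE WHEN product IN ('A', 'B') THEN 'D' ELSE product END AS product,
           COUNT(*) AS cnt
    FROM tablename
    GROUP BY CASE WHEN product IN ('A', 'B') THEN 'D' ELSE product END
    ORDER BY 1
    """
).fetchall()

# Variant 2: compute the derived key once in a derived table, then group by it.
derived_counts = conn.execute(
    """
    SELECT product, COUNT(*) AS cnt
    FROM (
        SELECT CASE WHEN product IN ('A', 'B') THEN 'D' ELSE product END AS product
        FROM tablename
    ) AS dt
    GROUP BY product
    ORDER BY product
    """
).fetchall()

print(inline_counts, derived_counts)
```

The derived-table form avoids writing the CASE expression twice, which helps when the mapping grows.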

Improving query that counts distinct values that have particular values in another column

Say I have a table in the format:
| id | category|
|----|---------|
| 10 | A |
| 10 | B |
| 10 | C |
| 2 | C |
I want to count the number of distinct id's that have all three values A, B, and C in the category variable. In this case, the query would return 1 since only for id = 10 is this true.
My intuition is to write the following query to get this value:
SELECT
COUNT(DISTINCT id),
SUM(CASE WHEN category = 'A' THEN 1 else 0 END) AS A,
SUM(CASE WHEN category = 'B' THEN 1 else 0 END) AS B,
SUM(CASE WHEN category = 'C' THEN 1 else 0 END) AS C
FROM
table
GROUP BY
id
HAVING
A >= 1
AND
B >= 1
AND
C >= 1
This feels a bit overwrought though -- is there a simpler way to achieve the desired outcome?
You are close, but you need two levels of aggregation. Assuming no duplicate rows:
SELECT COUNT(*)
FROM (SELECT id
FROM t
WHERE Category IN ('A', 'B', 'C')
GROUP BY id
HAVING COUNT(*) = 3
) t;
I assume this is part of a larger table, your id and categories can appear multiple times and still be distinct due to other fields, and that you know how many categories you're looking for.
SELECT ID, COUNT(ID)
FROM(
SELECT DISTINCT ID, CATEGORY
FROM TABLE) t -- a derived table needs an alias in most databases
GROUP BY ID
HAVING COUNT(ID) = 3 --or however many categories you want
Your subquery here removes extraneous info and forces your id to show up once per category. You then count up the number of times it shows up and look up the ones that show up 3 or however many times you want.
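Both answers hinge on whether (id, category) pairs can repeat. A sketch of the difference, assuming in-memory SQLite from Python and a hypothetical extra id 7 that has a duplicated category and no C; COUNT(DISTINCT category) folds the deduplication step into the HAVING clause:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, category TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (10, "A"), (10, "B"), (10, "C"),
        (2, "C"),
        (7, "A"), (7, "A"), (7, "B"),  # hypothetical id: three rows, only two categories
    ],
)

def count_ids(having_clause):
    """Count ids whose group satisfies the given HAVING condition."""
    sql = f"""
        SELECT COUNT(*)
        FROM (
            SELECT id
            FROM t
            WHERE category IN ('A', 'B', 'C')
            GROUP BY id
            HAVING {having_clause}
        ) AS matching_ids
    """
    return conn.execute(sql).fetchone()[0]

# Counting rows is only safe when (id, category) pairs never repeat.
naive_count = count_ids("COUNT(*) = 3")
# Counting distinct categories tolerates duplicate rows.
distinct_count = count_ids("COUNT(DISTINCT category) = 3")

print(naive_count, distinct_count)
```

With the duplicated row present, the row-count version also lets id 7 through, while the distinct version keeps only id 10.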

SQL Query find users with only one product type

I solemnly swear I did my best to find an existing question, but maybe I'm not sure how to phrase it correctly.
I would like to return records for users that have quota for only one product type.
| user_id | product |
| 1 | A |
| 1 | B |
| 1 | C |
| 2 | B |
| 3 | B |
| 3 | C |
| 3 | D |
In the example above I'd like a query that only returns users who carry quota for only one product type - doesn't really matter which product at this point.
I tried using select user_id, product from table group by 1,2 having count(user) < 2 but this does not work, nor does select user_id, product from table group by 1,2 having count(*) < 2
Any help is appreciated.
Your having clause is good; the issue's with your group by. Try this:
select user_id
, count(distinct product) NumberOfProducts
from table
group by user_id
having count(distinct product) = 1
Or you could do this; which is closer to your original:
select user_id
from table
group by user_id
having count(*) < 2
Not every database lets the group by clause take ordinal arguments the way the order by clause does (PostgreSQL and MySQL accept them; SQL Server does not). Where ordinals aren't supported, grouping by a value like 1 means grouping by the literal value 1, which is the same for every row in the table, so all the rows end up in one group. Since there is more than one product in the entire table, no rows will be returned.
Instead, you should group by the user_id:
SELECT user_id
FROM mytable
GROUP BY user_id
HAVING COUNT(*) = 1
If you want the product, then do:
select user_id, max(product) as product
from table
group by user_id
having min(product) = max(product);
The having clause could also be:
having count(distinct product) = 1
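A runnable sketch of the last variant, assuming in-memory SQLite from Python and a hypothetical table name user_quota (the question only says "table"):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_quota (user_id INTEGER, product TEXT)")
conn.executemany(
    "INSERT INTO user_quota VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "B"), (3, "B"), (3, "C"), (3, "D")],
)

# Group per user, keep users with exactly one distinct product,
# and report that product via MAX (MIN would give the same value here).
single_product_users = conn.execute(
    """
    SELECT user_id, MAX(product) AS product
    FROM user_quota
    GROUP BY user_id
    HAVING COUNT(DISTINCT product) = 1
    ORDER BY user_id
    """
).fetchall()

print(single_product_users)
```

COUNT(DISTINCT product) stays correct if a user has several rows for the same product, whereas COUNT(*) = 1 would drop such a user.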

SQL Server - group by ID if column contains a value

I have following table:
ID | NR | Status
1000 | 1 | A
1000 | 2 | A
1001 | 3 | A
1002 | 4 | A
1002 | 5 | N
1003 | 6 | N
I need an output that groups these by ID. The NR column can be ignored. If any of the records for an ID has Status A, that status should be the result.
So my output would be:
ID | Status
1000 | A
1001 | A
1002 | A
1003 | N
Any suggestions/ideas?
Although min() is the simplest method, it is not easily generalizable. Another method is:
select id,
(case when sum(case when status = 'A' then 1 else 0 end) > 0
then 'A'
else 'N' -- or whatever
end) as status
from t
group by id;
Or, if you have a table with one row per id, then I would use exists:
select ids.id,
(case when exists (select 1 from t where t.id = ids.id and t.status = 'A')
then 'A' else 'N'
end) as status
from ids;
This saves on the group by aggregation and can use an index on (id, status) for optimal performance.
Do a GROUP BY, use MIN() to pick minimum status value for each id, and A < N!
select id, min(status)
from tablename
group by id
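The conditional-aggregation answer and the MIN() answer can be compared side by side on the question's data. A sketch assuming in-memory SQLite from Python, with the placeholder table name t from the first answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, nr INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1000, 1, "A"), (1000, 2, "A"), (1001, 3, "A"),
        (1002, 4, "A"), (1002, 5, "N"), (1003, 6, "N"),
    ],
)

# Explicit rule: 'A' wins if any row for the id has status 'A'.
conditional_rows = conn.execute(
    """
    SELECT id,
           CASE WHEN SUM(CASE WHEN status = 'A' THEN 1 ELSE 0 END) > 0
                THEN 'A' ELSE 'N'
           END AS status
    FROM t
    GROUP BY id
    ORDER BY id
    """
).fetchall()

# Shortcut: relies on 'A' sorting before 'N'.
min_rows = conn.execute(
    "SELECT id, MIN(status) AS status FROM t GROUP BY id ORDER BY id"
).fetchall()

print(conditional_rows)
print(min_rows)
```

The two agree here only because the preferred status happens to sort first; the conditional form keeps working if the statuses are renamed or a third value is introduced.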
You want exactly the records that match the predicate "If one of the records with those ID's contains Status A, that status will be given as result." ?
The query can be written simply as:
Select distinct ID, STATUS from [your working TABLE] where STATUS = 'A'.
Hope this can help.