Sum of distinct item values in Oracle SQL

I got data like this:
ITEM COLOR VOL
1 RED 3
2 BLUE 3
3 RED 3
4 GREEN 12
5 BLUE 3
6 GREEN 12
and I want the total of the volumes, counting each color once,
meaning RED + BLUE + GREEN = 3+3+12 = 18
P.S. I can't do it in a sub-query since it is already part of a big query. I am looking for a way to do it in the select clause, something like
select sum(distinct(COLOR) VOL) from myTable group by COLOR
Thanks a lot!

Sum the sum of distinct, as grouped by color
select sum(sum(distinct VOL))
from MyTable
group by COLOR
Tested locally.

One approach uses a CTE or subquery to find the mean volumes for each color. Then take the sum of all mean volumes, for all colors.
WITH cte AS (
SELECT COLOR, AVG(VOL) AS VOL -- or MIN(VOL), or MAX(VOL)
FROM yourTable
GROUP BY COLOR
)
SELECT SUM(t.VOL)
FROM cte t
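As a rough check of the approach above, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Oracle here, and the sketch uses the derived-table form because nested aggregates such as sum(max(vol)) are an Oracle feature that SQLite is not expected to accept.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ITEM INTEGER, COLOR TEXT, VOL INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (1, "RED", 3), (2, "BLUE", 3), (3, "RED", 3),
        (4, "GREEN", 12), (5, "BLUE", 3), (6, "GREEN", 12),
    ],
)

# Collapse each color to a single volume, then add the per-color volumes up.
total = conn.execute(
    """
    SELECT SUM(vol)
    FROM (SELECT COLOR, MAX(VOL) AS vol FROM myTable GROUP BY COLOR)
    """
).fetchone()[0]

print(total)  # 18
```

MAX, MIN and AVG give the same result only because every row of a given color carries the same VOL in the sample data.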

sum(max(vol)) from ... group by color
will work, but it's not clear why you should need such a thing. Likely this sum can be computed (much) earlier in your query, not right at the end.
Proof of concept (on a standard Oracle schema):
SQL> select sum(max(sal)) from scott.emp group by deptno;
SUM(MAX(SAL))
-------------
10850
1 row selected.

Related

SQL get first record in each of a list of records

HELP! Kind of new to SQL. I've been working with simple statements for a few years but I need a little advanced help. I know it can be done and will save me time.
Here is my example to try to find results:
select top 1 apples, color from fruits
where apples in ('gala', 'fuji', 'granny')
and (inStock is not null and inStock <> '')
In the above query I would get the first color in 'gala' apples and that's it. What I want is the first color in 'gala', the first in 'fuji', the first in 'granny' and so on.
InStock isn't as important - it's just an additional filter in the search results.
What I want is a two column list. Left Column being apple types and right column being the first color result for each apple type.
You can use the row_number() window function to number the colors within each apple type in a specific order, then pick the first row from each group.
with cte as
(
select apples, color, row_number() over (partition by apples order by apples) rn from fruits
where apples in ('gala', 'fuji', 'granny')
and (inStock is not null and inStock <> '')
)
select apples, color from cte where rn=1
I think one issue you might have here is the concept of "first". A color is a categorical variable and tables don't typically attach meaning to a "first" or "last" value with a few exceptions. If you're dead set on returning the first row for each fruit, one easy way to get the result utilizes union all.
SELECT top 1 apples, color from fruits where apples = 'gala'
UNION ALL
SELECT top 1 apples, color from fruits where apples = 'fuji'
UNION ALL
SELECT top 1 apples, color from fruits where apples = 'granny'
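To make "first" well defined, the row_number() idea can be combined with an explicit ordering column. The sketch below runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The id column and the sample rows are assumptions made up for illustration; the question does not show the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is a hypothetical column that defines what "first" means.
conn.execute(
    "CREATE TABLE fruits (id INTEGER PRIMARY KEY, apples TEXT, color TEXT, inStock TEXT)"
)
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?, ?)",
    [
        (1, "gala", "red", "yes"),
        (2, "gala", "yellow", "yes"),
        (3, "fuji", "pink", "yes"),
        (4, "fuji", "red", ""),        # filtered out: empty inStock
        (5, "granny", "green", None),  # filtered out: NULL inStock
        (6, "granny", "lime", "yes"),
        (7, "honeycrisp", "red", "yes"),  # not in the requested list
    ],
)

first_colors = conn.execute(
    """
    WITH cte AS (
        SELECT apples, color,
               ROW_NUMBER() OVER (PARTITION BY apples ORDER BY id) AS rn
        FROM fruits
        WHERE apples IN ('gala', 'fuji', 'granny')
          AND inStock IS NOT NULL AND inStock <> ''
    )
    SELECT apples, color FROM cte WHERE rn = 1 ORDER BY apples
    """
).fetchall()

print(first_colors)  # [('fuji', 'pink'), ('gala', 'red'), ('granny', 'lime')]
```

Ordering the window by the partition column itself, as in the answer above, leaves the choice of row within each group up to the database.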

How do I SELECT the minimum set of rows to cover all possible values of each column in SQL?

I am running a SQL query to get data from a table in order to map all the different possible values of the categories represented by each column.
How do I write the SELECT query so that it returns the minimum number of rows needed to include every possible value of every column?
For example, if I have a table of 10 rows and 3 columns, each column containing 3 possible values:
TABLE sales
--------------------------------
brandID color size
--------------------------------
2 red big
3 blue big
2 blue big
2 red small
2 blue medium
3 green small
3 red big
1 green medium
2 red medium
2 blue big
Of course I could SELECT all rows from table without filter, but that would be an expensive query of 10 rows.
However, as you can see, if we filter the SELECT query to only return the following rows below, it is possible to cover all the possible values of all columns:
1,2,3 for brandID
red,blue,green for color
big,small,medium for size
--------------------------------
brandID color size
--------------------------------
3 blue big
2 red small
1 green medium
How do I do that in SQL query?
This one does what you expect:
select b.brandid, c.color, s.size
from (
select brandid, row_number() over (order by brandid) as rn
from sales
group by brandid
) b
full join (
select color, row_number() over (order by color) as rn
from sales
group by color
) c on b.rn = c.rn
full join (
select size, row_number() over (order by size) as rn
from sales
group by size
) s on b.rn = s.rn;
Online example: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=e72e7d1dfed43825025c5703b5d3671a
But this only works properly if you have the same number of (distinct) brands, colors and sizes. If you have e.g. 5 brands, 6 colors and 7 sizes, the result is rather "strange":
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=4417a4d97ecf7601364f09d65f6522fa
First, a query that returns ten rows is not "expensive".
Second, this is a very hard problem. It involves looking at all combinations of rows to see if the set has all combinations of columns. I suspect that any algorithm will need to basically search through all possible combinations -- although there may be some efficiencies, such as automatically including all rows with a unique value in any column.
As a hard problem involving comparing zillions of sets, SQL is not really an appropriate language for addressing the issue.
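To illustrate why this is a search problem rather than a query, here is a minimal brute-force sketch in Python over the question's ten rows. It tries every subset of rows in increasing size, so it is exponential and only usable on tiny inputs. The function names covers and smallest_cover are made up for this example.

```python
from itertools import combinations

# The ten sample rows from the question: (brandID, color, size).
rows = [
    (2, "red", "big"), (3, "blue", "big"), (2, "blue", "big"),
    (2, "red", "small"), (2, "blue", "medium"), (3, "green", "small"),
    (3, "red", "big"), (1, "green", "medium"), (2, "red", "medium"),
    (2, "blue", "big"),
]


def covers(subset, rows):
    """True if every value of every column in rows also occurs in subset."""
    n_cols = len(rows[0])
    return all(
        {r[c] for r in subset} == {r[c] for r in rows} for c in range(n_cols)
    )


def smallest_cover(rows):
    """Exhaustive search: return one smallest subset of rows covering all values."""
    if not rows:
        return []
    unique_rows = sorted(set(rows))  # duplicate rows never help
    for k in range(1, len(unique_rows) + 1):
        for subset in combinations(unique_rows, k):
            if covers(subset, rows):
                return list(subset)
    return unique_rows


best = smallest_cover(rows)
print(len(best), best)
```

For the sample data a cover of three rows exists, since each column has exactly three distinct values, so the search stops at k = 3.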
This is a rather weird requirement... But you might try something along this:
DECLARE @sales TABLE(BrandID INT, color VARCHAR(10),size VARCHAR(10));
INSERT INTO @sales VALUES
(2,'red', 'big'),
(3,'blue', 'big'),
(2,'blue', 'big'),
(2,'red', 'small'),
(2,'blue', 'medium'),
(3,'green', 'small'),
(3,'red', 'big'),
(1,'green', 'medium'),
(2,'red', 'medium'),
(2,'blue', 'big');
WITH AllBrands AS (SELECT ROW_NUMBER() OVER(ORDER BY BrandID) AS RowInx, BrandID FROM @sales GROUP BY BrandID)
,AllColors AS (SELECT ROW_NUMBER() OVER(ORDER BY color) AS RowInx, color FROM @sales GROUP BY color)
,AllSizes AS (SELECT ROW_NUMBER() OVER(ORDER BY size) AS RowInx, size FROM @sales GROUP BY size)
SELECT COALESCE(b.RowInx,c.RowInx,s.RowInx) AS RowInx
,b.BrandID
,c.color
,s.size
FROM AllBrands b
FULL OUTER JOIN AllColors c ON COALESCE(b.RowInx,c.RowInx)=c.RowInx
FULL OUTER JOIN AllSizes s ON COALESCE(b.RowInx,c.RowInx,s.RowInx)=s.RowInx;
This solution is similar to @a_horse_with_no_name's, but avoids gaps in the result in case of unequal counts of values per column.
The idea in short:
We create a numbered set of all distinct values per column and join all sets on this number. As we don't know the count in advance I use COALESCE to pick the first value, which is not null.
This is not a good problem if you demand ONE AND ONLY ONE query and ONE AND ONLY ONE of each result set, and ONE AND ONLY ONE instance of each result. As Gordon Linoff accurately put it: that is not a problem for SQL. I get that maybe you have a MUCH larger table, but he's absolutely right.
But add another layer, and you can have exactly what you want, with all the efficiency you want, and a readable output. Use a cursor and some basic SELECT from dynamic SQL with a SELECT columns.name from sys.tables JOIN sys.columns ON tables.object_id = columns.object_id, if you absolutely have to do this with TSQL alone.
And if you're willing to build a basic application with any framework with a SQL driver, you can just run a SELECT DISTINCT on each column and put the various results into arrays.
Alternatively: reword your question, with the understanding that the results of any SQL query are gonna be x rows by x columns. Not an array for each column.
I think your example confuses things by having exactly 3 values for each field, which makes the requested result seem like a reasonable thing to expect. But what happens when two more brands are added, or a new colour? Then what would you expect to be returned?
Really you are asking three questions, so I feel this should be done as three queries:
"What are the different brands?"
"What are the different colours?"
"What are the different sizes?"
If they need to be displayed in a neat table, stitch them together afterwards in your application layer. You could maybe do it in the SQL with something like a_horse_with_no_name suggests, but really it's the wrong place.
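The "three queries, stitched together in the application" suggestion could look roughly like the sketch below. It uses Python's sqlite3 module with the question's sample data as a stand-in for the real database; distinct_values is a helper name invented for this example.

```python
import sqlite3
from itertools import zip_longest

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (brandID INTEGER, color TEXT, size TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2, "red", "big"), (3, "blue", "big"), (2, "blue", "big"),
        (2, "red", "small"), (2, "blue", "medium"), (3, "green", "small"),
        (3, "red", "big"), (1, "green", "medium"), (2, "red", "medium"),
        (2, "blue", "big"),
    ],
)

ALLOWED_COLUMNS = {"brandID", "color", "size"}


def distinct_values(column):
    """One query per question: 'what are the different values of this column?'"""
    if column not in ALLOWED_COLUMNS:  # column names cannot be bound as parameters
        raise ValueError(f"unexpected column: {column}")
    query = f"SELECT DISTINCT {column} FROM sales ORDER BY {column}"
    return [row[0] for row in conn.execute(query)]


brands = distinct_values("brandID")
colors = distinct_values("color")
sizes = distinct_values("size")

# Stitch the three lists side by side; shorter lists are padded with None.
table = list(zip_longest(brands, colors, sizes))
print(table)
# [(1, 'blue', 'big'), (2, 'green', 'medium'), (3, 'red', 'small')]
```

With unequal counts per column, zip_longest simply leaves None in the shorter columns, which answers the "what happens when a new colour is added" concern without any join logic.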

How to write a SQL query with GROUP BY?

I have an ERD, and I want to write a SQL query for it.
The goal is to select all columns of artgrp with regroupid 11, grouped by artdept.
I have this:
Select *
From artgrp
Where regroudid = "11"
Group by artdept;
My question is: how can I select all columns of artgrp, grouped by the columns of artdept?
Here is my model
SELECT d.description, d.lifetime, d.name, COUNT(d.artdeptid) as Departments
FROM artgrp g INNER JOIN artdept d ON g.artdeptid = d.artdeptid
WHERE regroudid = "11"
GROUP BY d.description, d.lifetime, d.name
GROUP BY is used to compute aggregates such as count, max, min or avg over clusters of rows. For instance, say you have a Cars table with the fields make, color and price, and you want the number of cars in each color (each color forms one cluster). You can use the following query:
select color, count(1) from cars group by color;
Output will look like this
Blue 3
Grey 17
Red 5
Note: every non-aggregated column in the select list must also appear in the group by. In the above example I grouped by color; if you add more fields, say make, there will be one cluster per (make, color) pair and the output would be
Ford Blue 3
Ford Grey 7
Honda Red 5
Honda Grey 10
So first identify which function you need to apply to the grouped data, such as count, min, max, avg, rank etc. If you want all the fields in your select clause, you will have to use an analytic function. Edit your question with sample data and the expected output, and I can give you an answer with an analytic query as well if required.
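The two groupings described above can be tried out with a small sketch on SQLite through Python's sqlite3 module. The rows are invented so that the counts match the sample output in this answer, and the price value is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (make TEXT, color TEXT, price INTEGER)")

# Hypothetical stock chosen to reproduce the counts shown in the answer.
stock = {
    ("Ford", "Blue"): 3,
    ("Ford", "Grey"): 7,
    ("Honda", "Red"): 5,
    ("Honda", "Grey"): 10,
}
for (make, color), n in stock.items():
    conn.executemany(
        "INSERT INTO cars VALUES (?, ?, ?)", [(make, color, 20000)] * n
    )

# One cluster per color.
by_color = conn.execute(
    "SELECT color, COUNT(*) FROM cars GROUP BY color ORDER BY color"
).fetchall()

# One cluster per (make, color) pair.
by_make_color = conn.execute(
    "SELECT make, color, COUNT(*) FROM cars "
    "GROUP BY make, color ORDER BY make, color"
).fetchall()

print(by_color)       # [('Blue', 3), ('Grey', 17), ('Red', 5)]
print(by_make_color)
# [('Ford', 'Blue', 3), ('Ford', 'Grey', 7), ('Honda', 'Grey', 10), ('Honda', 'Red', 5)]
```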
Thanks for editing your question. Sample data would make it clearer, but I will go ahead with what I understood. I am using an analytic function as the solution:
Select artgrpid,description,descid,relgroupid,
artdeptid,default,name,brand,questions,
ROW_NUMBER() over (Partition by artdeptid order by artdeptid) from
artgrp;
you can use rank() or count(1) etc instead of ROW_NUMBER().

SQL - Select top item from a grouping of two columns

I have a list of numbers attached to two separate columns, and I want to just return the first "match" of the two columns to get that data. I got close with this answer, but it only works with one field. I need it to work with a combination of fields.
Here's an example table "Item":
Item Color Area
Boat Red 1
Boat Red 2
Boat Blue 4
Boat Blue 5
Car Red 3
Car Red 4
Car Blue 10
Car Blue 31
And the result set returned should be:
Item Color Area
Boat Red 1
Boat Blue 4
Car Red 3
Car Blue 10
A much simpler way to do this:
select Item,
Color,
min(Area) as Area
from Item
group by Item,
Color
Just use the MIN function with a GROUP BY.
SELECT Item, Color, MIN(area) AS Area
FROM Item
GROUP BY Item, Color
Output:
Item Color Area
Boat Blue 4
Boat Red 1
Car Blue 10
Car Red 3
SQL Fiddle: http://sqlfiddle.com/#!9/46a154/1/0
SQL tables represent unordered sets. Hence, there is no "first" of anything without a column specifying the ordering.
For your example results, the simplest query is:
select item, color, min(area) as area
from item i
group by item, color;
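As a quick check of the MIN plus GROUP BY answer, here is a minimal sketch that loads the question's sample table into SQLite through Python's sqlite3 module (SQLite is just a convenient stand-in for whichever database is actually in use).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Item (Item TEXT, Color TEXT, Area INTEGER)")
conn.executemany(
    "INSERT INTO Item VALUES (?, ?, ?)",
    [
        ("Boat", "Red", 1), ("Boat", "Red", 2),
        ("Boat", "Blue", 4), ("Boat", "Blue", 5),
        ("Car", "Red", 3), ("Car", "Red", 4),
        ("Car", "Blue", 10), ("Car", "Blue", 31),
    ],
)

# The smallest Area per (Item, Color) pair plays the role of the "first" match.
result = conn.execute(
    """
    SELECT Item, Color, MIN(Area) AS Area
    FROM Item
    GROUP BY Item, Color
    ORDER BY Item, Color
    """
).fetchall()

print(result)
# [('Boat', 'Blue', 4), ('Boat', 'Red', 1), ('Car', 'Blue', 10), ('Car', 'Red', 3)]
```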
About ten seconds before I was ready to post the question, I realized the answer:
WITH summary AS (
SELECT i.item + ':' + i.color AS item_color,
i.area,
-- rk = 1 is the row with the smallest area in each item/color group
ROW_NUMBER() OVER(PARTITION BY i.item + ':' + i.color
ORDER BY i.area) AS rk
FROM Item i
group by i.item + ':' + i.color, i.area
)
SELECT s.*
FROM summary s
WHERE s.rk = 1
It's as simple as combining the two composite key fields into one field and grouping based on that. This might be a bit hackish so if anyone wants to suggest a better option I'm all for it.

Select unique record based on column value priority

This is a continuation of my previous question here.
In the following example:
id PRODUCT ID COLOUR
1 1001 GREEN
2 1002 GREEN
3 1002 RED
4 1003 RED
Given a product ID, I want to retrieve only one record - that with GREEN colour, if one exists, or the RED one otherwise. It sounds like I need to employ DISTINCT somehow, but I don't know how to supply the priority rule.
Pretty basic I'm sure, but my SQL skills are more than rusty..
Edit: Thank you everybody. One more question please: how can this be made to work with multiple records, ie. if the WHERE clause returns more than just one record? The LIMIT 1 would limit across the entire set, while what I'd want would be to limit just within each product.
For example, if I had something like SELECT * FROM table WHERE productID LIKE "1%" ... how can I retrieve each unique product, but still respecting the colour priority (GREEN>RED)?
try this:
SELECT top 1 *
FROM <table>
WHERE ProductID = <id>
ORDER BY case when colour ='GREEN' then 1
when colour ='RED' then 2 end
If you want to order it based on another color, you can give it in the case statement
SELECT *
FROM yourtable
WHERE ProductID = (your id)
ORDER BY colour
LIMIT 1
(Green will come before Red, you see. The LIMIT clause returns only one record)
For your subsequent edit, you can do this
select yourtable.*
from
yourtable
inner join
(select productid, min(colour) mincolour
from yourtable
where productid like '10%'
group by productid) v
on yourtable.productid=v.productid
and yourtable.colour=v.mincolour
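The min(colour) join above depends on GREEN sorting before RED alphabetically. A variation that spells the priority out with a CASE expression is sketched below, on SQLite through Python's sqlite3 module. The table name products and the column name product_id are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names are hypothetical; the rows mirror the question's example.
conn.execute("CREATE TABLE products (id INTEGER, product_id TEXT, colour TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, "1001", "GREEN"),
        (2, "1002", "GREEN"),
        (3, "1002", "RED"),
        (4, "1003", "RED"),
    ],
)

# Lower number means higher priority; unknown colours rank last.
PRIORITY = "CASE {col} WHEN 'GREEN' THEN 1 WHEN 'RED' THEN 2 ELSE 3 END"

query = f"""
    SELECT t.id, t.product_id, t.colour
    FROM products t
    JOIN (
        SELECT product_id, MIN({PRIORITY.format(col='colour')}) AS best
        FROM products
        WHERE product_id LIKE '10%'
        GROUP BY product_id
    ) v
      ON t.product_id = v.product_id
     AND {PRIORITY.format(col='t.colour')} = v.best
    ORDER BY t.product_id
"""
picked = conn.execute(query).fetchall()

print(picked)  # [(1, '1001', 'GREEN'), (2, '1002', 'GREEN'), (4, '1003', 'RED')]
```

If a product has several rows with the same best colour, this returns all of them, so a further tie-break would be needed to guarantee exactly one row per product.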