SQL Aggregation function to get concrete value - sql

I need help with an aggregation.
What I want to know is whether it is possible to extract a concrete value from one of the returned columns of a grouped query, like this:
STORE
--------------------------------
fruit    color    stock
--------------------------------
apple    red      30
apple    green    5
banana   yellow   40
berry    red      5
pear     green    5
SELECT SUM(stock), [?] FROM store GROUP BY fruit
[?] -> I need to pick one concrete value, for example red, but the SUM must still be 35 for apples.
Can this be done without a subquery?
Thanks
I expect this result:
--------------------------------
Column A    Column B
--------------------------------
35          red
In this example the query does not make much sense, but for my real case it does. I tried using STRING_AGG to collect the values and then split them in my code, but I don't think that is the best way.

I think you're looking for the GROUP BY clause. Try this:
SELECT SUM(stock), color
FROM store
GROUP BY color
This will return a list of all colors, and the sum of the stock for each color.

I'm not entirely clear what you mean by a "concrete value" (singular), as there are potentially two or more values per fruit, and you did mention STRING_AGG(). Also, you omitted "fruit" from the SELECT list, which made things a bit confusing. Nonetheless, this returns either one color value (using MAX()) or all color values (using STRING_AGG()) per fruit, without a subquery:
-- the WITH is just a way to get your data into the query
;
WITH AdriansData AS
(SELECT * FROM (VALUES ('apple', 'red', 30),
                       ('apple', 'green', 5),
                       ('banana', 'yellow', 40),
                       ('berry', 'red', 5),
                       ('pear', 'green', 5)
               ) AS X (fruit, color, stock)
)
SELECT fruit,
       SUM(stock) AS TotalStock,
       STRING_AGG(color, ', ') AS Colors,
       MAX(color) AS JustOneColor
FROM AdriansData
GROUP BY fruit
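If the goal is to pick one specific color (rather than whichever MAX() happens to return), conditional aggregation is one option that still avoids a subquery. The following is a minimal sketch, not the answerer's method: it assumes the `store` table from the question, runs the SQL through Python's built-in `sqlite3` module purely so it can be executed anywhere, and uses the hypothetical names `wanted`, `total_stock` and `picked_color`.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store (fruit TEXT, color TEXT, stock INTEGER)")
conn.executemany(
    "INSERT INTO store VALUES (?, ?, ?)",
    [
        ("apple", "red", 30),
        ("apple", "green", 5),
        ("banana", "yellow", 40),
        ("berry", "red", 5),
        ("pear", "green", 5),
    ],
)

wanted = "red"  # the "concrete value" to extract (hypothetical parameter)

# SUM() still covers every row of the group; the CASE expression is NULL
# for non-matching rows, so MAX() returns the wanted color only when the
# group actually contains it, and NULL otherwise.
rows = conn.execute(
    """
    SELECT fruit,
           SUM(stock) AS total_stock,
           MAX(CASE WHEN color = ? THEN color END) AS picked_color
    FROM store
    GROUP BY fruit
    ORDER BY fruit
    """,
    (wanted,),
).fetchall()

for row in rows:
    print(row)
```

With the sample data, apple comes back as 35 with `red`, while fruits that have no red row get NULL in the last column. Adding a HAVING clause on the same CASE expression would drop those groups, if that is what is wanted.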

Related

SQL get first record in each of a list of records

HELP! Kind of new to SQL. I've been working with simple statements for a few years but I need a little advanced help. I know it can be done and will save me time.
Here is my example to try to find results:
select top 1 apples, color from fruits
where apples in ('gala', 'fuji', 'granny')
and (inStock is not null and inStock <> '')
In the above query I would get the first color in 'gala' apples and that's it. What I want is the first color in 'gala', the first in 'fuji', the first in 'granny', and so on.
InStock isn't as important - it's just an additional filter in the search results.
What I want is a two column list. Left Column being apple types and right column being the first color result for each apple type.
You can use the row_number() window function to number the colors within each apple type in a specific order, and then keep only the row numbered 1 from each group.
with cte as
(
    -- note: ordering by the partition column itself leaves the "first" row arbitrary;
    -- order by another column if a particular row is wanted
    select apples, color, row_number() over (partition by apples order by apples) rn
    from fruits
    where apples in ('gala', 'fuji', 'granny')
    and (inStock is not null and inStock <> '')
)
select apples, color from cte where rn = 1
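To see the row_number() approach run end to end, here is a small sketch using Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer). The rows are invented for illustration, and the sketch orders by `color` inside each partition so that "first" is deterministic; that ordering is an assumption, since the question never defines what "first" means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (apples TEXT, color TEXT, inStock TEXT)")
# Hypothetical sample rows; only the column names come from the question.
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?)",
    [
        ("gala", "red", "yes"),
        ("gala", "pink", "yes"),
        ("fuji", "yellow", ""),        # excluded: empty inStock
        ("fuji", "red", "yes"),
        ("granny", "green", "yes"),
        ("granny", "lime", None),      # excluded: NULL inStock
        ("honeycrisp", "red", "yes"),  # excluded: not in the IN list
    ],
)

# Number the rows inside each apple type, then keep row 1 of every group.
first_colors = conn.execute(
    """
    WITH cte AS (
        SELECT apples,
               color,
               ROW_NUMBER() OVER (PARTITION BY apples ORDER BY color) AS rn
        FROM fruits
        WHERE apples IN ('gala', 'fuji', 'granny')
          AND inStock IS NOT NULL AND inStock <> ''
    )
    SELECT apples, color FROM cte WHERE rn = 1 ORDER BY apples
    """
).fetchall()

for apples, color in first_colors:
    print(apples, color)
```

Each apple type contributes exactly one row, and the filter on `inStock` is applied before the numbering, so excluded rows can never be picked as "first".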
I think one issue you might have here is the concept of "first". A color is a categorical variable and tables don't typically attach meaning to a "first" or "last" value with a few exceptions. If you're dead set on returning the first row for each fruit, one easy way to get the result utilizes union all.
SELECT top 1 apples, color from fruits where apples = 'gala'
UNION ALL
SELECT top 1 apples, color from fruits where apples = 'fuji'
UNION ALL
SELECT top 1 apples, color from fruits where apples = 'granny'

How do I SELECT the minimum set of rows to cover all possible values of each column in SQL?

I am running a SQL query against a table in order to map all the different possible values of the categories represented by each column.
How do I write the SELECT query so that it returns the minimum number of rows, just enough to include all possible values of all columns?
For example, if I have a table of 10 rows and 3 columns, each column containing 3 possible values:
TABLE sales
--------------------------------
brandID color size
--------------------------------
2 red big
3 blue big
2 blue big
2 red small
2 blue medium
3 green small
3 red big
1 green medium
2 red medium
2 blue big
Of course I could SELECT all rows from table without filter, but that would be an expensive query of 10 rows.
However, as you can see, if we filter the SELECT query to only return the following rows below, it is possible to cover all the possible values of all columns:
1,2,3 for brandID
red,blue,green for color
big,small,medium for size
--------------------------------
brandID color size
--------------------------------
3 blue big
2 red small
1 green medium
How do I do that in SQL query?
This one does what you expect:
select b.brandid, c.color, s.size
from (
    select brandid, row_number() over (order by brandid) as rn
    from sales
    group by brandid
) b
full join (
    select color, row_number() over (order by color) as rn
    from sales
    group by color
) c on b.rn = c.rn
full join (
    select size, row_number() over (order by size) as rn
    from sales
    group by size
) s on b.rn = s.rn;
Online example: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=e72e7d1dfed43825025c5703b5d3671a
But this only works properly, if you have the same number of (distinct) brands, colors and sizes. If you have e.g. 5 brands, 6 colors and 7 sizes the result is rather "strange":
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=4417a4d97ecf7601364f09d65f6522fa
First, a query that returns ten rows is not "expensive".
Second, this is a very hard problem. It involves looking at all combinations of rows to see if the set has all combinations of columns. I suspect that any algorithm will need to basically search through all possible combinations -- although there may be some efficiencies, such as automatically including all rows with a unique value in any column.
As a hard problem involving comparing zillions of sets, SQL is not really an appropriate language for addressing the issue.
This is a rather weird requirement... But you might try something along this:
DECLARE @sales TABLE(BrandID INT, color VARCHAR(10), size VARCHAR(10));
INSERT INTO @sales VALUES
(2,'red', 'big'),
(3,'blue', 'big'),
(2,'blue', 'big'),
(2,'red', 'small'),
(2,'blue', 'medium'),
(3,'green', 'small'),
(3,'red', 'big'),
(1,'green', 'medium'),
(2,'red', 'medium'),
(2,'blue', 'big');
WITH AllBrands AS (SELECT ROW_NUMBER() OVER(ORDER BY BrandID) AS RowInx, BrandID FROM @sales GROUP BY BrandID)
,AllColors AS (SELECT ROW_NUMBER() OVER(ORDER BY color) AS RowInx, color FROM @sales GROUP BY color)
,AllSizes AS (SELECT ROW_NUMBER() OVER(ORDER BY size) AS RowInx, size FROM @sales GROUP BY size)
SELECT COALESCE(b.RowInx,c.RowInx,s.RowInx) AS RowInx
,b.BrandID
,c.color
,s.size
FROM AllBrands b
FULL OUTER JOIN AllColors c ON COALESCE(b.RowInx,c.RowInx)=c.RowInx
FULL OUTER JOIN AllSizes s ON COALESCE(b.RowInx,c.RowInx,s.RowInx)=s.RowInx;
This solution is similar to @a_horse_with_no_name's, but avoids gaps in the result when the columns have unequal numbers of distinct values.
The idea in short:
We create a numbered set of all distinct values per column and join all sets on this number. As we don't know the count in advance I use COALESCE to pick the first value, which is not null.
This is not a good problem if you demand ONE AND ONLY ONE query and ONE AND ONLY ONE of each result set, and ONE AND ONLY ONE instance of each result. As Gordon Linoff accurately put it: that is not a problem for SQL. I get that maybe you have a MUCH larger table, but he's absolutely right.
But add another layer, and you can have exactly what you want, with all the efficiency you want, and a readable output. Use a cursor and some basic SELECT from dynamic SQL with a SELECT columns.name from sys.tables JOIN sys.columns ON tables.object_id = columns.object_id, if you absolutely have to do this with TSQL alone.
And if you're willing to build a basic application with any framework that has a SQL driver, you can just run a SELECT DISTINCT for each column and put the various results into arrays.
Alternatively: reword your question, with the understanding that the results of any SQL query are gonna be x rows by x columns. Not an array for each column.
I think your example confuses things by having exactly 3 values for each field, which makes the requested result seem like a reasonable thing to expect. But what happens when two more brands are added, or a new colour? Then what would you expect to be returned?
Really you are asking three questions, so I feel this should be done as three queries:
"What are the different brands?"
"What are the different colours?"
"What are the different sizes?"
If they need to be displayed in a neat table, stitch them together afterwards in your application layer. You could maybe do it in SQL with something like what a_horse_with_no_name suggests, but really it's the wrong place.
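The "three queries, stitched together in the application layer" idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with the `sales` data from the question, and `distinct_values` is a hypothetical helper, not part of any library.

```python
import sqlite3
from itertools import zip_longest

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (brandID INTEGER, color TEXT, size TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (2, "red", "big"), (3, "blue", "big"), (2, "blue", "big"),
        (2, "red", "small"), (2, "blue", "medium"), (3, "green", "small"),
        (3, "red", "big"), (1, "green", "medium"), (2, "red", "medium"),
        (2, "blue", "big"),
    ],
)

ALLOWED_COLUMNS = ("brandID", "color", "size")


def distinct_values(column):
    """Answer one question: "what are the different values of this column?"."""
    # Column names cannot be bound as parameters, so check them against a
    # fixed whitelist before building the statement.
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    query = f"SELECT DISTINCT {column} FROM sales ORDER BY {column}"
    return [row[0] for row in conn.execute(query)]


brands = distinct_values("brandID")
colors = distinct_values("color")
sizes = distinct_values("size")

# Stitch the three lists side by side; shorter lists are padded with None,
# which covers the case of unequal numbers of distinct values.
table = list(zip_longest(brands, colors, sizes))
for row in table:
    print(row)
```

Note that the stitched rows are just a display layout: a row such as `(1, 'blue', 'big')` need not exist in `sales`, which is the same caveat that applies to the FULL JOIN answers above.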

Count Total Bar codes of Specific Color by fetching values from various fields when they match in SQL

I have a situation where I have 4 color columns (Color1, Color2, Color3, Color4), and I need to add up the bar code values across all color columns wherever the color matches. It's a bit complicated, so here is a graphical representation:
Color1 Color2 Color3 Color4 Barcodes
Red 1
Red 3
Red 4
Red 2
Expected Result: Total Barcodes where Color is Red=10
I am using SQL Server
Any Assistance in this would be really helpful.
EDIT: there are 320 colors in the table
I would unpivot and aggregate:
select sum(Barcodes)
from t cross apply
(values (color1), (color2), (color3), (color4)) v(color)
where color = 'Red';
If you want this for each color:
select color, sum(Barcodes)
from t cross apply
(values (color1), (color2), (color3), (color4)) v(color)
group by color;
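CROSS APPLY (VALUES ...) is specific to SQL Server. On engines without it, the same unpivot-then-aggregate idea can be written with UNION ALL, which is what the sketch below does using Python's built-in `sqlite3` module. The table name `t` follows the answer; the placement of "Red" in a different column on each row and the extra "Blue" row are assumptions, since the original layout of the sample table did not survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (Color1 TEXT, Color2 TEXT, Color3 TEXT, Color4 TEXT, Barcodes INTEGER)"
)
# Assumed layout: "Red" sits in a different color column on each row.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        ("Red", None, None, None, 1),
        (None, "Red", None, None, 3),
        (None, None, "Red", None, 4),
        (None, None, None, "Red", 2),
        ("Blue", None, None, None, 7),  # hypothetical extra color
    ],
)

# Turn the four color columns into one (color, Barcodes) row each,
# then aggregate per color as in the answer's second query.
totals = conn.execute(
    """
    WITH unpivoted AS (
        SELECT Color1 AS color, Barcodes FROM t
        UNION ALL SELECT Color2, Barcodes FROM t
        UNION ALL SELECT Color3, Barcodes FROM t
        UNION ALL SELECT Color4, Barcodes FROM t
    )
    SELECT color, SUM(Barcodes) AS total
    FROM unpivoted
    WHERE color IS NOT NULL
    GROUP BY color
    ORDER BY color
    """
).fetchall()

for color, total in totals:
    print(color, total)
```

As with the CROSS APPLY version, a row that carries the same color in two columns would be counted twice; whether that is desired depends on the data.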

Sum of distinct items values Oracle SQL

I got data like this:
ITEM COLOR VOL
1 RED 3
2 BLUE 3
3 RED 3
4 GREEN 12
5 BLUE 3
6 GREEN 12
and I want the total of one VOL value per color,
that is, RED + BLUE + GREEN = 3 + 3 + 12 = 18.
P.S. I can't do it in a subquery, since this is already part of a big query. I am looking for a way to do it in the select clause, something like
select sum(distinct(COLOR) VOL) from myTable group by COLOR
Thanks a lot!
Sum the sum of distinct, as grouped by color
select sum(sum(distinct VOL))
from MyTable
group by COLOR
Tested locally and here
One approach uses a CTE or subquery to find the mean volumes for each color. Then take the sum of all mean volumes, for all colors.
WITH cte AS (
SELECT COLOR, AVG(VOL) AS VOL -- or MIN(VOL), or MAX(VOL)
FROM yourTable
GROUP BY COLOR
)
SELECT SUM(t.VOL)
FROM cte t
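For engines that do not accept nested aggregates such as sum(sum(distinct VOL)), the CTE form above is the portable one. Here is a runnable sketch of it using Python's built-in `sqlite3` module with the data from the question; it assumes, as the question implies, that VOL is constant within a color, so MAX(VOL) is just a way to pick that one value. The alias `vol_per_color` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ITEM INTEGER, COLOR TEXT, VOL INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (1, "RED", 3), (2, "BLUE", 3), (3, "RED", 3),
        (4, "GREEN", 12), (5, "BLUE", 3), (6, "GREEN", 12),
    ],
)

# Step 1 (inside the CTE): collapse each color to a single volume.
# Step 2 (outer query): add up those per-color volumes.
total = conn.execute(
    """
    WITH cte AS (
        SELECT COLOR, MAX(VOL) AS vol_per_color
        FROM myTable
        GROUP BY COLOR
    )
    SELECT SUM(vol_per_color) FROM cte
    """
).fetchone()[0]

print(total)
```

If VOL could differ between rows of the same color, MAX, MIN and AVG would give different totals, so the choice of inner aggregate would then need to be made deliberately.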
sum(max(vol)) from ... group by color
will work, but it's not clear why you should need such a thing. Likely this sum can be computed (much) earlier in your query, not right at the end.
Proof of concept (on a standard Oracle schema):
SQL> select sum(max(sal)) from scott.emp group by deptno;
SUM(MAX(SAL))
-------------
10850
1 row selected.

Is there such a thing as a 'IS IN' query

I have 3 tables:
Silk_Skey Name
1 Black White Checks Yellow Arms
2 Black Crimson Stripes
3 Crimson Yellow Stripes
Sub Colour Major Colour
Black Black
White White
Yellow Yellow
Crimson Red
MajorColour_Skey Major Colour
1 Black
2 White
3 Yellow
4 Red
And I want to achieve this:
ID Silk_Skey MajorColour_Skey
1 1 1
2 1 2
3 1 3
4 2 1
5 2 4
6 3 3
7 3 4
What I need to do is create a linking table that matches the colours across the 3 tables and breaks down the silk names into one row per colour (see the SQL below). My boss has advised me to use an 'IS IN' query, but I have no idea what that is. Can you help?
SELECT s.Silks_Skey, mc.MajorColour_Skey
FROM Silks s INNER JOIN SubColour sc on sc.SubColour **'IS IN HERE'** s.SilksName
INNER JOIN MajorColour mc
ON sc.MajorColour = mc.MajorColour
You can use IN
AND table.column IN ('a','b','c')
or
AND table.column IN (1,2,3)
or if you're looking for a string like something you can do
AND table.column LIKE '%word' -- table.column ends with 'word'
AND table.column LIKE 'word%' -- table.column starts with 'word'
AND table.column LIKE '%word%' -- table.column has 'word' anywhere in the column
This design is doomed to poor performance, and queries against it will be awkward and painful to write. If your database will never be large, it may be workable, but if it will be large, you cannot use this design and hope for good performance, because you will not be able to use indexes properly. Personally, I would add a silk colors table related to the silks table and store the colors individually. One of the first rules of database design is never to store more than one piece of information in a field. You are storing a list, which always means you need a related table to use the database effectively.
One clue to a bad (and over time usually unworkable) database design is if you need to join using functions or calculations of any type, or if you need to use wildcards at the start of a phrase in a LIKE clause. Fix this now and things will be much smoother: maintenance will take less time and performance will be better. There is no upside to your current structure at all.
You may need to take a bit of extra time to parse and store the silk names by individual color, but the time you save in querying the database will be significant, because you can then make use of a join and use indexes. Search for fn_split and you will see a method of splitting the silk names into individual colors that you can use when you insert the records.
If you foolishly decide to retain the current structure, then look into using full-text search. It will be faster than using a LIKE clause with a wildcard as the first character.
For what you want to do, you need to do string manipulation because you are trying to compare one color to a list of colors in a string.
The like operator can do this. Try this on clause:
on ' '+ s.SilksName +' ' like '% '+sc.SubColour+' %'
This checks whether a given color (sc.SubColour) is in the list (s.SilksName). For instance, if you have a list like 'RED GREEN', this will match either '%RED%' or '%GREEN%'.
The purpose of concatenating white space is to avoid partial-word matches. For instance, "blue-green" would match both "blue" and "green" without the delimiters.
The following query returns 7 rows, which seems to be correct (3 for the first row in silks and 2 for each of the other two):
with silks as (
select 1 as silks_skey, 'Black White Checks Yellow Arms' as silksname union all
select 2, 'Black Crimson Stripes' union all
select 3, 'Crimson Yellow Stripes'
),
subcolour as (
select 'black' as subcolour, 'black' as majorcolour union all
select 'white', 'white' union all
select 'yellow', 'yellow' union all
select 'crimson', 'red'
),
MajorColour as (
select 1 as MajorColour_skey, 'black' as MajorColour union all
select 2, 'white' union all
select 3, 'yellow' union all
select 4, 'red'
)
SELECT s.Silks_Skey, mc.MajorColour_Skey
FROM Silks s INNER JOIN SubColour sc on ' ' + s.SilksName + ' ' like '% ' + sc.SubColour + ' %'
INNER JOIN MajorColour mc
ON sc.MajorColour = mc.MajorColour
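The delimiter-padded LIKE join above can be tried out without a SQL Server instance. The sketch below runs the same idea through Python's built-in `sqlite3` module; the only substantive change is that SQLite concatenates strings with `||` instead of `+`. Table and column names follow the query in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Silks (Silks_Skey INTEGER, SilksName TEXT);
    CREATE TABLE SubColour (SubColour TEXT, MajorColour TEXT);
    CREATE TABLE MajorColour (MajorColour_Skey INTEGER, MajorColour TEXT);

    INSERT INTO Silks VALUES
        (1, 'Black White Checks Yellow Arms'),
        (2, 'Black Crimson Stripes'),
        (3, 'Crimson Yellow Stripes');
    INSERT INTO SubColour VALUES
        ('Black', 'Black'), ('White', 'White'),
        ('Yellow', 'Yellow'), ('Crimson', 'Red');
    INSERT INTO MajorColour VALUES
        (1, 'Black'), (2, 'White'), (3, 'Yellow'), (4, 'Red');
    """
)

# Pad both the list and the search term with spaces so that only whole
# words match, then map each matched sub-colour to its major colour key.
links = conn.execute(
    """
    SELECT s.Silks_Skey, mc.MajorColour_Skey
    FROM Silks s
    INNER JOIN SubColour sc
        ON ' ' || s.SilksName || ' ' LIKE '% ' || sc.SubColour || ' %'
    INNER JOIN MajorColour mc
        ON sc.MajorColour = mc.MajorColour
    ORDER BY s.Silks_Skey, mc.MajorColour_Skey
    """
).fetchall()

for silks_skey, major_skey in links:
    print(silks_skey, major_skey)
```

The seven pairs it produces line up with the Silk_Skey / MajorColour_Skey columns of the desired table in the question; the ID column there would come from an identity column on the new linking table.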
Sounds like what you really want to do is split the Name field on spaces and then, for each of those values that is contained in the colours table (joined on the sub-colour, given that major colours are valid sub-colours too), create one entry in a new table. The problem is that there is no intrinsic T-SQL function for splitting strings. To do that, your best bet is to visit Erland Sommarskog's definitive answer on how to do this.
An alternative, which is not very neat and may or may not work, is to use the CONTAINS keyword in your predicate. However, to do this you need to use full-text indexing, and I suspect using Erland's excellent guides on splitting strings and arrays in SQL will be more appropriate and faster.
This is the answer folks, thanks for all your ideas.
Select S.[Silks_Skey], MC.[MajorColour_Skey]
from [dbo].[Silks] S
inner join [dbo].[SubColour] SC on CHARINDEX(SC.[SubColour],S.[SilksName]) <> 0
inner join [dbo].[MajorColour] MC on SC.[MajorColour] = MC.[MajorColour]
UNION ALL
Select S.[Silks_Skey], MC.[MajorColour_Skey]
from [dbo].[Silks] S
inner join [dbo].[MajorColour] MC on CHARINDEX(MC.[MajorColour],S.[SilksName]) <> 0
ORDER BY S.[Silks_Skey]