How to tally distinct combinations of rows in sqlite? - sql

Say I have a collection of glass jars with some plastic balls of different colors in them. There can be at most one ball of each color in every jar and every jar contains one or more balls.
I represent this collection with an sqlite table with the following columns:
JarID INTEGER, Color TEXT
A collection of four jars might then look like this:
JarID Color
0, 'Red'
1, 'Red'
1, 'Green'
2, 'Red'
2, 'Blue'
3, 'Red'
I'd like to write a query that will find all the different color combinations that exist in my jar collection, and list each combination alongside the total number of jars with that combination.
For the table above, the query should return either:
'Red', 2
'Red,Green', 1
'Red,Blue', 1
Or:
Red
2
Red
Green
1
Red
Blue
1
Currently I have a terrible mess of common table expressions and window functions that seems to achieve the desired result, but I can't help feeling that I'm missing some elegant, standard SQL solution to this.

The group_concat function is what you're looking for, but first you need to deduplicate and sort each jar's colors so that red,green comes out the same as green,red.
Do that with (assuming the table is named Jars):
Select JarID, Color
From Jars
Group By JarID, Color
Order by JarID, Color
Now feed that to a group_concat subquery to make a list of colors by JarID:
Select JarID, Group_Concat(Color) as Combos
From <first subquery>
Group By JarID
Then feed that to an aggregator to count occurrences, and you're done:
Select Combos, count(*) as Occurrences
From <second subquery>
Group By Combos
Order by Combos
Put it all together with:
Select Combos, count(*) as Occurrences
From (
Select JarID, Group_Concat(Color) as Combos
From (
Select JarID, Color
From Jars
Group By JarID, Color
Order by JarID, Color
)
Group By JarID
)
Group By Combos
Order by Combos
One caution: SQL does not guarantee that an Order By in a subquery is honored, and SQLite documents the element order of group_concat as arbitrary by default. Most implementations will respect the subquery order in practice, but it is not strictly guaranteed.
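As a quick check, the combined query can be run against the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module; the table name Jars and the in-memory database are assumptions, and the column names follow the question (JarID, Color).

```python
import sqlite3

# In-memory database holding the sample jars from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Jars (JarID INTEGER, Color TEXT)")
conn.executemany(
    "INSERT INTO Jars (JarID, Color) VALUES (?, ?)",
    [(0, "Red"), (1, "Red"), (1, "Green"), (2, "Red"), (2, "Blue"), (3, "Red")],
)

QUERY = """
SELECT Combos, count(*) AS Occurrences
FROM (
    SELECT JarID, group_concat(Color) AS Combos
    FROM (
        SELECT JarID, Color
        FROM Jars
        GROUP BY JarID, Color
        ORDER BY JarID, Color
    )
    GROUP BY JarID
)
GROUP BY Combos
ORDER BY Combos
"""

# One (combination, number of jars) tuple per distinct combination.
combo_counts = conn.execute(QUERY).fetchall()
for combo, occurrences in combo_counts:
    print(combo, occurrences)
```

Because the inner subquery sorts by Color, the colors inside each combination should come out alphabetically (for example Green,Red rather than Red,Green), subject to the ordering caution above.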

Related

Multiple count() on table

I have a vehicle database and want to count how many cars have a specific colour.
But I don't know what colours there are as there are many, also combinations.
So this code does not do the trick for me:
SELECT
SUM(CASE WHEN colour='red' THEN 1 ELSE 0 END) red,
SUM(CASE WHEN colour='green' THEN 1 ELSE 0 END) green
(etc)
FROM vehicles
To get all colours, I could do:
select distinct colour from vehicles
But how can I use that information in a SQL statement like the one above?
I am using MS SQL Server.
You could put the result set in rows rather than columns:
SELECT colour, count(*)
FROM vehicles
GROUP BY colour;
The alternative is that you would need to use dynamic SQL or express the result set as XML or JSON.
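The dynamic SQL route can be sketched roughly as follows: first read the distinct colours, then build one SUM(CASE ...) column per colour. This is only an illustration using SQLite through Python's sqlite3 module rather than MS SQL Server; the table layout, the sample rows and the helper quote_ident are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (id INTEGER PRIMARY KEY, colour TEXT)")
conn.executemany(
    "INSERT INTO vehicles (colour) VALUES (?)",
    [("red",), ("green",), ("red",), ("blue",)],
)

# Step 1: discover which colours exist.
colours = [
    row[0]
    for row in conn.execute("SELECT DISTINCT colour FROM vehicles ORDER BY colour")
]


def quote_ident(name):
    """Quote a value for use as a column alias (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'


# Step 2: build one conditional-sum column per colour.
# The colour values are passed as bound parameters, not pasted into the SQL.
columns = ", ".join(
    f"SUM(CASE WHEN colour = ? THEN 1 ELSE 0 END) AS {quote_ident(c)}"
    for c in colours
)
counts = conn.execute(f"SELECT {columns} FROM vehicles", colours).fetchone()

# Pair each colour with its count, in the same order the columns were built.
pivot = dict(zip(colours, counts))
print(pivot)
```

The row-oriented GROUP BY query is usually simpler; the pivoted form is mainly useful when the consumer needs one column per colour.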
Why not simply do the aggregation?
select colour, count(*) as no_vehicles
from vehicles v
group by colour;
This will pull a list of all colours found in table vehicles:
SELECT distinct colour
from vehicles
But what you really want to use is the group by clause, like so:
SELECT
colour
,count(*) HowMany
from vehicles
group by
colour
This will produce one row for every distinct value in column colour. It will NOT parse out color combinations; "red with black trim" will be its own row, and not +1 for red, +1 for black. That would be a much more complex problem.

Pulling/Outputting data from Teradata

I'm using Teradata as a database and I'm trying to pull and output some data from it.
Here are the links to:
Relationship
Metadata
I'm looking to figure out the queries for the following items.
1) What is the most common color and how many unique products have this color?
SELECT TOP 1 COLOR
FROM SKUINFO
GROUP BY COLOR
ORDER BY Count (*) desc;
This allowed me to get the color: Black.
But now I don't know how to get the number of products.
2) What sku has the largest profit per unit and in which stores is this sku being sold?
Thank you!
Consider the WHERE clause subquery approach:
SELECT Count(*) AS ProductCount
FROM SKUINFO
WHERE Color IN
(SELECT TOP 1 COLOR
FROM SKUINFO
GROUP BY COLOR
ORDER BY Count (*) DESC);
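Another option is to return the color and its count from a single grouped query. The sketch below runs on SQLite through Python's sqlite3 module, so LIMIT 1 stands in for Teradata's TOP 1; the SKU column and the sample rows are assumptions, since the real table definition is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SKU is an assumed column name; only SKUINFO and COLOR appear in the question.
conn.execute("CREATE TABLE SKUINFO (SKU INTEGER, COLOR TEXT)")
conn.executemany(
    "INSERT INTO SKUINFO (SKU, COLOR) VALUES (?, ?)",
    [(1, "BLACK"), (2, "BLACK"), (3, "WHITE"), (4, "BLACK"), (5, "RED")],
)

# Group once, sort by group size, keep the largest group.
# COUNT(DISTINCT SKU) counts unique products even if a SKU were repeated.
top_color = conn.execute(
    """
    SELECT COLOR, COUNT(DISTINCT SKU) AS ProductCount
    FROM SKUINFO
    GROUP BY COLOR
    ORDER BY COUNT(*) DESC
    LIMIT 1
    """
).fetchone()
print(top_color)
```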

Issue in Query Formation

I am having trouble forming a query and would appreciate your input. The scenario is as follows:
I have t-shirts in three colors: blue, yellow and red. Each student is assigned exactly one t-shirt color. Can you help me find the number of t-shirts of each color in each class (2, 3 and 4), that is, grouped by class?
In the DB we have studentId, class and stShirtColor (B/Y/R).
Thanks in advance
Is this what you are looking for:
SELECT class,
stShirtColor,
Count(*) AS NumberOfTshirts
FROM yourtable
WHERE class IN( 2, 3, 4 )
GROUP BY class,stShirtColor
The WHERE clause filters the rows down to the classes you need, GROUP BY forms one group per (class, stShirtColor) combination, and count(*) gives the number of rows in each group.
You can get the color of the t-shirt and its count for each class with the following query:
SELECT class, stShirtColor, count(*) as NUMBER_OF_TSHIRTS FROM YOUR_TABLE WHERE class IN (2,3,4) group by class, stShirtColor;
GROUP BY organizes your data by class and stShirtColor. You can use most aggregate functions when using GROUP BY.
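To see what the grouped result looks like, here is a small sketch on made-up data, using SQLite through Python's sqlite3 module. The table name students and the sample rows are assumptions; the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (studentId INTEGER PRIMARY KEY, class INTEGER, stShirtColor TEXT)"
)
conn.executemany(
    "INSERT INTO students (studentId, class, stShirtColor) VALUES (?, ?, ?)",
    [
        (1, 2, "B"), (2, 2, "B"), (3, 2, "Y"),
        (4, 3, "R"),
        (5, 4, "Y"), (6, 4, "Y"),
        (7, 5, "R"),  # class 5 is filtered out by the WHERE clause
    ],
)

# One row per (class, color) pair that actually occurs in classes 2 to 4.
shirt_counts = conn.execute(
    """
    SELECT class, stShirtColor, COUNT(*) AS NumberOfTshirts
    FROM students
    WHERE class IN (2, 3, 4)
    GROUP BY class, stShirtColor
    ORDER BY class, stShirtColor
    """
).fetchall()
for row in shirt_counts:
    print(row)
```

Note that combinations with no students (for example red shirts in class 2) simply do not appear in the result; they are not reported as zero.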

In PostgreSQL, when is it best to use WHERE and when is it best to use CASE WHEN?

I don't really understand the difference between CASE WHEN and WHERE, can they be used to achieve the same thing? Will someone please clarify?
Thanks!
To understand the difference between the two, you have to think of the classic structure of a SQL statement:
SELECT ...
FROM ...
WHERE ...
ORDER BY ...;
For example, if you think of a table called "shirts" with 500 rows, let's start by selecting all 500 of them:
-- first example: all rows as stored
SELECT
name,
color,
size,
maker
FROM shirts
ORDER BY name;
Next, let's add a WHERE clause. This serves to limit the rows returned in the final result set to only those that match some specified criterion. For instance, you can say "WHERE color = 'white'" to return a smaller set of (say) 80 rows:
-- second example: only some rows
SELECT
name,
color,
size,
maker
FROM shirts
WHERE color = 'white'
ORDER BY name;
The CASE expression serves an entirely different function: it is normally part of the SELECT clause (it cannot be a separate clause in its own right) and serves to transform the output values in some way. So with a CASE expression, you would still get all 500 rows back, but your results won't look the same as they did in my first example:
-- third example: all rows, but color field is tweaked
SELECT
name,
(CASE WHEN color = 'white' THEN 'white' ELSE 'other' END) AS tweaked_color,
size,
maker
FROM shirts
ORDER BY name;
In my first example, you'd have seen various values for "color": "white", "black", "brown", "yellow", "red", etc. With this third example, you'll see 80 rows with "white" and 420 rows with "other".
Hope that helps clarify the difference.
P.S. Please note that I've arranged the whitespace in my examples in a somewhat unusual way for clarity's sake; how you arrange the whitespace doesn't make any difference to the syntax. I also added a pair of unnecessary parentheses around the CASE expression to help add some visual clarity; those would usually be omitted.
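The difference is easy to see on a tiny table. The following sketch uses SQLite through Python's sqlite3 module instead of PostgreSQL (both queries are plain SQL that should behave the same there); the four sample shirts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirts (name TEXT, color TEXT, size TEXT, maker TEXT)")
conn.executemany(
    "INSERT INTO shirts VALUES (?, ?, ?, ?)",
    [
        ("Alpha", "white", "M", "Acme"),
        ("Bravo", "black", "L", "Acme"),
        ("Charlie", "white", "S", "Zenith"),
        ("Delta", "red", "M", "Zenith"),
    ],
)

# WHERE removes rows: only the white shirts come back.
filtered = conn.execute(
    "SELECT name, color FROM shirts WHERE color = 'white' ORDER BY name"
).fetchall()

# CASE keeps every row but rewrites the value shown in one column.
relabeled = conn.execute(
    """
    SELECT name,
           CASE WHEN color = 'white' THEN 'white' ELSE 'other' END AS tweaked_color
    FROM shirts
    ORDER BY name
    """
).fetchall()

print(len(filtered), "rows with WHERE")
print(len(relabeled), "rows with CASE")
```

With WHERE the result shrinks to the matching rows; with CASE the row count is unchanged and only the displayed value differs.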

Nested loop to match colors

Basically I have a table of colors, and I have implemented a query which pairs every color with every other color. I was wondering whether it is possible to do this with a loop (perhaps a nested loop).
My idea is to loop the first color against every other color, then loop the second color against every other color, and so on. Help is greatly appreciated.
My table - contains different colors
CREATE TABLE Colors
(c_ID VARCHAR2(3) NOT NULL,
c_NAME VARCHAR2(11));
INSERT INTO Colors VALUES
('T01','RED');
INSERT INTO Colors VALUES
('T02','BLUE');
INSERT INTO Colors VALUES
('T03','BLACK');
INSERT INTO Colors VALUES
('T04','YELLOW');
INSERT INTO Colors VALUES
('T05','ORANGE');
The SQL query that I used to match different colors:
select a.c_id as HM, s.c_id as AW
from colors a, colors s
where a.c_id <> s.c_id
order by a.c_id;
Use a recursive query. (This is for Postgres; your syntax may vary.)
CREATE TABLE Colors
(c_ID INTEGER NOT NULL
, c_NAME VARCHAR
);
INSERT INTO Colors VALUES
(1,'RED'), (2,'BLUE'), (3,'BLACK'), (4,'YELLOW'), (5,'ORANGE');
WITH RECURSIVE xxx AS (
SELECT
c1.c_ID AS last_id
, c1.c_NAME::text AS all_colors
FROM Colors c1
UNION ALL
SELECT c2.c_ID AS last_id
, x.all_colors|| '+' || c2.c_NAME::text AS all_colors
FROM Colors c2
JOIN xxx x ON x.last_id < c2.c_ID
)
SELECT all_colors
FROM xxx
;
Results:
CREATE TABLE
INSERT 0 5
all_colors
------------------------------
RED
BLUE
BLACK
YELLOW
ORANGE
RED+BLUE
RED+BLACK
RED+YELLOW
RED+ORANGE
BLUE+BLACK
BLUE+YELLOW
BLUE+ORANGE
BLACK+YELLOW
BLACK+ORANGE
YELLOW+ORANGE
RED+BLUE+BLACK
RED+BLUE+YELLOW
RED+BLUE+ORANGE
RED+BLACK+YELLOW
RED+BLACK+ORANGE
RED+YELLOW+ORANGE
BLUE+BLACK+YELLOW
BLUE+BLACK+ORANGE
BLUE+YELLOW+ORANGE
BLACK+YELLOW+ORANGE
RED+BLUE+BLACK+YELLOW
RED+BLUE+BLACK+ORANGE
RED+BLUE+YELLOW+ORANGE
RED+BLACK+YELLOW+ORANGE
BLUE+BLACK+YELLOW+ORANGE
RED+BLUE+BLACK+YELLOW+ORANGE
(31 rows)
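The same recursive idea can be tried without a Postgres server. This is a sketch for SQLite through Python's sqlite3 module: the Postgres-specific ::text casts are dropped and the CTE columns are named in the WITH clause, but the logic is otherwise the same. The row order SQLite returns may differ from the Postgres output shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Colors (c_ID INTEGER NOT NULL, c_NAME TEXT)")
conn.executemany(
    "INSERT INTO Colors VALUES (?, ?)",
    [(1, "RED"), (2, "BLUE"), (3, "BLACK"), (4, "YELLOW"), (5, "ORANGE")],
)

QUERY = """
WITH RECURSIVE xxx(last_id, all_colors) AS (
    -- Seed: every color on its own.
    SELECT c1.c_ID, c1.c_NAME
    FROM Colors c1
    UNION ALL
    -- Step: extend each combination with any color that has a higher id,
    -- which prevents duplicates such as BLUE+RED alongside RED+BLUE.
    SELECT c2.c_ID, x.all_colors || '+' || c2.c_NAME
    FROM Colors c2
    JOIN xxx x ON x.last_id < c2.c_ID
)
SELECT all_colors FROM xxx
"""

combinations = [row[0] for row in conn.execute(QUERY)]
print(len(combinations), "combinations")
```

Five colors give 2^5 - 1 = 31 non-empty combinations, which matches the row count reported above.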
If I understand you correctly you want all colors in a single row, or even in a single column of a single row?
The query you show will only produce pairs of colors. If you want all colors, you need to self-join as many times as you have colors. So adding or removing a color will turn your ugly query into a broken query. In general you will have a query result where the number of columns depends on the number of colors. This does not play well with the relational paradigm.
If you just want all colors as a single value, then you need to aggregate colors. The query result will then be a single value, with all the colors combined, possibly separated by commas.
To aggregate things you need an aggregate function. Well-known aggregate functions are SUM, MIN or AVG, none of which do what you need here. Which aggregate function to choose depends on your particular SQL dialect.
For Oracle you may look into pivot or xmlagg.
You may also consider wrapping the whole thing in procedural code.
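As an illustration of the procedural route, the nested loop the question describes can be written in a few lines of host-language code. This sketch uses Python with its sqlite3 module in place of Oracle (so TEXT replaces VARCHAR2) and checks that the loop produces the same pairs as the self-join from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Colors (c_ID TEXT NOT NULL, c_NAME TEXT)")
conn.executemany(
    "INSERT INTO Colors VALUES (?, ?)",
    [("T01", "RED"), ("T02", "BLUE"), ("T03", "BLACK"),
     ("T04", "YELLOW"), ("T05", "ORANGE")],
)

colors = conn.execute("SELECT c_ID, c_NAME FROM Colors ORDER BY c_ID").fetchall()

# Nested loop: pair each color with every other color.
loop_pairs = []
for a_id, _a_name in colors:
    for s_id, _s_name in colors:
        if a_id != s_id:
            loop_pairs.append((a_id, s_id))

# The self-join from the question, with a full ordering so the two
# results can be compared directly.
sql_pairs = conn.execute(
    """
    SELECT a.c_ID AS HM, s.c_ID AS AW
    FROM Colors a, Colors s
    WHERE a.c_ID <> s.c_ID
    ORDER BY a.c_ID, s.c_ID
    """
).fetchall()

print(len(loop_pairs), "pairs from the loop")
print(loop_pairs == sql_pairs)
```

The loop and the self-join describe the same thing; the database effectively runs the nested loop for you, which is why the set-based query is usually the preferred form.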