Oracle SQL Query for Stacked Pareto Chart - sql

I have an Oracle Table that contains data similar to the following basic example:
+--------+----------+
| SERIES | CATEGORY |
+--------+----------+
| green | apple |
| green | pear |
| green | pear |
| yellow | apple |
| yellow | apple |
| yellow | pear |
| yellow | pear |
| yellow | pear |
| yellow | banana |
| yellow | banana |
| yellow | banana |
| red | apple |
+--------+----------+
I would like to generate a Pareto-like graph of this data that looks like a stacked Pareto chart.
To create this graph I would like to run a SQL query and get the following output:
+----------+--------+-------+
| CATEGORY | SERIES | COUNT |
+----------+--------+-------+
| pear | green | 2 |
| pear | yellow | 3 |
| apple | green | 1 |
| apple | yellow | 2 |
| apple | red | 1 |
| banana | yellow | 3 |
+----------+--------+-------+
The actual table has millions of entries and it currently takes a significant amount of time to query the database as the current procedure I am using is not very efficient:
Order the categories by the amount of entries in each category:
SELECT CATEGORY, COUNT(CATEGORY) FROM FRUIT GROUP BY CATEGORY ORDER BY COUNT(CATEGORY);
Then for each category I list the relevant series in order of the series:
SELECT SERIES, COUNT(SERIES) FROM FRUIT WHERE CATEGORY = [current category] GROUP BY SERIES ORDER BY SERIES;
What would be the most efficient way to query the database (Preferably a single SQL statement) in order to get the desired output?

Some shorter version:
select category, series, CntS
from (
select distinct count(category) over (partition by category) cntC,
count(series) over (partition by category, series ) cntS,
category, series
from fruit ) Tab
order by CntC desc, cntS desc;

You can achieve the desired result by grouping on both CATEGORY and SERIES:
SELECT
CATEGORY, SERIES, COUNT(*)
FROM FRUIT
GROUP BY CATEGORY, SERIES
ORDER BY COUNT(*);
UPDATE:
To order by total of CATEGORY first and then green, yellow, red, just like your expected output:
SELECT t1.*
FROM (
SELECT
CATEGORY, SERIES, COUNT(*) AS CNT
FROM FRUIT
GROUP BY CATEGORY, SERIES
) t1
INNER JOIN (
SELECT
CATEGORY, COUNT(*) AS CNT
FROM FRUIT
GROUP BY CATEGORY
) t2
ON t1.CATEGORY = t2.CATEGORY
ORDER BY
t2.CNT DESC,
CASE t1.SERIES
WHEN 'green' THEN 1
WHEN 'yellow' THEN 2
WHEN 'red' THEN 3
END
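As a rough check of the join-based ordering above, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for Oracle here, and the lowercase table and column names are assumptions made for the sketch.

```python
import sqlite3

# Sample rows from the question, as (series, category) pairs.
rows = (
    [("green", "apple")] + [("green", "pear")] * 2
    + [("yellow", "apple")] * 2 + [("yellow", "pear")] * 3
    + [("yellow", "banana")] * 3 + [("red", "apple")]
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (series TEXT, category TEXT)")
conn.executemany("INSERT INTO fruit VALUES (?, ?)", rows)

# t1 holds the per-(category, series) counts, t2 the per-category totals.
# Sorting by the total first puts the largest category at the front.
QUERY = """
SELECT t1.category, t1.series, t1.cnt
FROM (SELECT category, series, COUNT(*) AS cnt
      FROM fruit GROUP BY category, series) t1
JOIN (SELECT category, COUNT(*) AS cnt
      FROM fruit GROUP BY category) t2
  ON t1.category = t2.category
ORDER BY t2.cnt DESC,
         CASE t1.series WHEN 'green' THEN 1
                        WHEN 'yellow' THEN 2
                        WHEN 'red' THEN 3 END
"""
result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

On the sample data this returns the rows in the order shown in the question's desired output. Whether it is fast enough on millions of rows depends on the Oracle execution plan, which this sketch does not measure.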


Oracle - Fill null values in a column with values from another column

I am using Oracle 11.1.1.9.0 and my goal is to fill the Null values with the first NOT NULL values in "Raw Materials" column by each product i.e A, B and C in Product column. An example table and the intended result are illustrated at the end of this request.
None of the code snippets below works:

CODE 1:
IFNULL(Raw Materials,
First_value(Raw Materials) OVER (PARTITION BY Product))

CODE 2:
IFNULL(Raw Materials, 
First_value(Raw Materials) OVER (PARTITION BY Product
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))

CODE 3:
COALESCE(lag(Raw Materials ignore null) OVER (partition by Product),
Raw Materials)
CODE 4:
IFNULL(Raw Materials, EVALUATE('LAG(%1, 1) OVER (PARTITION BY %2)' AS varchar2(20), Raw Materials, Product))
Note: IFNULL function does work in the environment. It was tested with IFNULL(Raw Materials, '1') and it resulted in all null values becoming 1 in Raw Materials column.
Thank you.
+---------+----------+ +---------+----------+
| product | material | | product | material |
+---------+----------+ +---------+----------+
| A | | | A | Apple |
| A | | | A | Apple |
| A | | | A | Apple |
| A | | | A | Apple |
| A | Apple | | A | Apple |
| B | | | B | Orange |
| B | | | B | Orange |
| B | | => | B | Orange |
| B | | | B | Orange |
| B | Orange | | B | Orange |
| C | | | C | Banana |
| C | | | C | Banana |
| C | | | C | Banana |
| C | | | C | Banana |
| C | Banana | | C | Banana |
+---------+----------+ +---------+----------+
Left is the example table data. Right is the intended result.
The below link "Oracle code environment" shows the code environment and samples of Oracle Logical SQL function.
Oracle code environment
Oracle Logical SQL manual: https://docs.oracle.com/middleware/11119/biee/BIEUG/appsql.htm#CHDDCFJI
For your dataset, you could simply do a window MAX() or MIN():
NVL(Raw_Materials, MAX(Raw_Materials) OVER(PARTITION BY Product))
If you have a column that can be used to order the rows (I assumed id), you can use LAG() with the IGNORE NULLS clause:
NVL(Raw_Materials, LAG(Raw_Materials IGNORE NULLS) OVER(PARTITION BY Product ORDER BY id))
While you say that you are looking for some "first" value, your sample data suggests that you just want all same products to have the same material:
update mytable m1 set material =
(
select min(material)
from mytable m2
where m2.product = m1.product
);
If you just want to select this data, you can use this:
select product, min(material) over (partition by product)
from mytable;
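To illustrate the window-aggregate idea, here is a small sketch run against SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or later for window function support, and it is not OBIEE Logical SQL. The table name mytable and its columns follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (product TEXT, material TEXT)")

# Four NULL rows plus one filled row per product, as in the question.
data = []
for product, material in [("A", "Apple"), ("B", "Orange"), ("C", "Banana")]:
    data += [(product, None)] * 4 + [(product, material)]
conn.executemany("INSERT INTO mytable VALUES (?, ?)", data)

# MIN() skips NULLs, so the window aggregate yields the single non-null
# material of each product; COALESCE keeps a value that is already set.
filled = conn.execute("""
    SELECT product,
           COALESCE(material,
                    MIN(material) OVER (PARTITION BY product)) AS material
    FROM mytable
    ORDER BY product
""").fetchall()

for row in filled:
    print(row)
```

This only works because each product has exactly one distinct non-null material. With several candidates per product, an ordering column and FIRST_VALUE or LAG with IGNORE NULLS would be needed instead.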
According to the docs (https://docs.oracle.com/cd/E28280_01/bi.1111/e10540/sqlref.htm#BIEMG678) it seems OBIEE uses a special syntax for analytic window functions (e.g. MIN() OVER()):
select
product,
evaluate('min(%1) over (partition by %2)', material, product)
from mytable;
You must enable this by setting the EVALUATE_SUPPORT_LEVEL accordingly.
(I hope I got this right. Otherwise read the docs on this and try something along the lines for yourself.)
You can try the query below. It uses the FIRST_VALUE analytic function, because NULLIF, COALESCE, etc. work at the row level, not the column level.
with temp as (select 'A' product,NULL raw_material from dual union all
select 'A',NULL from dual union all
select 'A',NULL from dual union all
select 'A',NULL from dual union all
select 'A','APPLE' from dual union all
select 'B',NULL from dual union all
select 'B',NULL from dual union all
select 'B',NULL from dual union all
select 'B',NULL from dual union all
select 'B','ORANGE' from dual union all
select 'C',NULL from dual union all
select 'C',NULL from dual union all
select 'C',NULL from dual union all
select 'C',NULL from dual union all
select 'C',NULL from dual union all
select 'C','Banana' from dual)
select a.*,FIRST_VALUE(raw_material IGNORE NULLS)
OVER (partition by product ORDER BY product) first_product from temp a;
Oracle does not have an IFNULL function. Your FIRST_VALUE approach would have worked if you swapped IFNULL for COALESCE and added IGNORE NULLS so that the null rows are skipped:
SELECT t.*,
COALESCE(
raw_material,
FIRST_VALUE(raw_material)
IGNORE NULLS
OVER ( PARTITION BY product )
) AS updated_raw_material
FROM test_data t;
Outputs:
PRODUCT | RAW_MATERIAL | UPDATED_RAW_MATERIAL
:------ | :----------- | :-------------------
A | null | Apple
A | null | Apple
A | null | Apple
A | Apple | Apple
B | null | Orange
B | null | Orange
B | null | Orange
B | null | Orange
B | Orange | Orange
C | null | Banana
C | null | Banana
C | null | Banana
C | null | Banana
C | null | Banana
C | Banana | Banana

Join multiple rows from Table 2 into separate columns when selecting data based on ID linked to table 1?

I may have the incorrect database design, but I have an issue as follows:
+-----------+ +------------+-----------+
| table1 | | table2 |
+-----------+ +------------+-----------+
| Type | | Type | Item |
| Fruit | | Fruit | Apple |
| Vegetable | | Fruit | Orange |
+-----------+ | Fruit | Pear |
| Vegetable | Zucchini |
+------------+-----------+
I would like to query my DB to return something that looks like this:
+-------------+----------+----------+---------+
| Result | | | |
+-------------+----------+----------+---------+
| Type | Result1 | Result2 | Result3 |
| Fruit | Apple | Orange | Pear |
+-------------+----------+----------+---------+
When I query items based on the "Fruit" ID. My current query will return a row for each, but I'd like to turn these rows into separate columns in the results table.
I've looked around into using different types of joins and group_concat, but I don't think any of these are explicitly appropriate as a solution by itself. I'm a bit of an SQL rookie so I'm having trouble on knowing "what" to look for.
SELECT t1.Type, t2.item
FROM table1 as t1, table2 as t2
WHERE t2.Type="Fruit";
I understand that this will return each iteration of the results, but is not what I want.
You can use conditional aggregation:
select type,
max(case when seqnum = 1 then item end) as item_1,
max(case when seqnum = 2 then item end) as item_2,
max(case when seqnum = 3 then item end) as item_3
from (select t2.*, row_number() over (partition by type order by item) as seqnum
from table2 t2
) t2
where type = 'Fruit'
group by type;
Note that table1 is not needed because type is in table2.
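The conditional aggregation can be tried out with a short sketch like the following, which uses SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed for ROW_NUMBER). Ordering the row numbers by item is an assumption made here so that the column assignment is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (type TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("Fruit", "Apple"), ("Fruit", "Orange"),
     ("Fruit", "Pear"), ("Vegetable", "Zucchini")],
)

# Number the items within each type, then move item n into column n.
# MAX() picks the single non-null value per (type, seqnum) slot.
pivoted = conn.execute("""
    SELECT type,
           MAX(CASE WHEN seqnum = 1 THEN item END) AS item_1,
           MAX(CASE WHEN seqnum = 2 THEN item END) AS item_2,
           MAX(CASE WHEN seqnum = 3 THEN item END) AS item_3
    FROM (SELECT type, item,
                 ROW_NUMBER() OVER (PARTITION BY type ORDER BY item) AS seqnum
          FROM table2) AS numbered
    GROUP BY type
    ORDER BY type
""").fetchall()

for row in pivoted:
    print(row)
```

The number of output columns is fixed in the query text, so a type with more than three items would silently lose the extras. Types with fewer items get NULL in the unused columns.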

SQL count how many times a value appears across multiple columns?

Below is an example of a table in our CRM. It's not the way I'd have chosen to store this data, but that's by the by. What would be the 'nice' way to count how many times each option was selected by each team?
I'm asking here before I go headlong into a convoluted CASE statement :)
+----------+--------+---------+---------+---------+
| PersonID | Team | Option1 | Option2 | Option3 |
+----------+--------+---------+---------+---------+
| 1 | Blue | A | B | C |
| 2 | Blue | B | C | D |
| 3 | Blue | D | A | E |
| 4 | Red | A | B | D |
| 5 | Red | B | A | C |
| 6 | Yellow | A | B | C |
| 7 | Yellow | A | C | D |
+----------+--------+---------+---------+---------+
Thanks in advance
You can unpivot your 3 option columns into a single column using CROSS APPLY and a table value constructor and then perform your count:
SELECT t.Team, upvt.[Option], COUNT(*) AS Occurrences
FROM dbo.T AS t
CROSS APPLY (VALUES (t.Option1), (t.Option2), (t.Option3)) AS upvt ([Option])
GROUP BY t.Team, upvt.[Option]
ORDER BY t.Team, upvt.[Option];
So this would give:
Team Option Occurrences
-------------------------------
Blue A 2
Blue B 2
Blue C 2
Blue D 2
Blue E 1
Red A 2
Red B 2
Red C 1
Red D 1
Yellow A 2
Yellow B 1
Yellow C 2
Yellow D 1
You can use UNION ALL to move the values into single column and then, do aggregation:
select
-- OPTION is a reserved word in several databases, so the unpivoted column is named opt
team, opt, count(*) cnt
from(
select team, option1 opt from t union all
select team, option2 from t union all
select team, option3 from t
)t group by team, opt;
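As a quick check of the UNION ALL unpivot, the sketch below loads the sample table into SQLite through Python's sqlite3 module. The table name t and the column name opt are assumptions for the sketch; the data is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (person_id INTEGER, team TEXT, "
    "option1 TEXT, option2 TEXT, option3 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Blue", "A", "B", "C"),
        (2, "Blue", "B", "C", "D"),
        (3, "Blue", "D", "A", "E"),
        (4, "Red", "A", "B", "D"),
        (5, "Red", "B", "A", "C"),
        (6, "Yellow", "A", "B", "C"),
        (7, "Yellow", "A", "C", "D"),
    ],
)

# Stack the three option columns into one column, then count per team.
counts = conn.execute("""
    SELECT team, opt, COUNT(*) AS cnt
    FROM (SELECT team, option1 AS opt FROM t
          UNION ALL SELECT team, option2 FROM t
          UNION ALL SELECT team, option3 FROM t) AS unpivoted
    GROUP BY team, opt
    ORDER BY team, opt
""").fetchall()

for row in counts:
    print(row)
```

UNION ALL rather than UNION matters here: plain UNION would remove duplicate (team, option) pairs before counting, and every count would come out as 1.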

Show rows in which there are duplicates of a concat of 2 fields for Access SQL?

I have the below query:
SELECT DISTINCT [FRUIT], [FRUIT] & "-" & [COLOR] as FruitColor
FROM FRUITS as c;
That returns:
| Fruit | FruitColor |
|:-----------|----------------:|
| Banana | Banana-Yellow |
| Orange | Orange-Orange |
| Orange | Orange-Red |
| Banana | Banana-Green |
| Apple | Apple-Red |
| Pear | Pear-Green |
The original table has many combinations of Fruit and Color, however obviously, one fruit can only have one color i.e. Banana-Green is an incorrect entry. I would like to identify all those rows that have more than one Fruit-Color combination i.e. in this table that is Banana and Orange. I would like to have Access show me all these rows.
I have another code that may help:
SELECT [Fruit], count([Color]) as count_unique
FROM (select distinct [Fruit], [Color] from Fruits) AS [%$###_Alias]
WHERE [Color] NOT IN ('Blue', 'Beige') AND [Fruit] <> 'Cucumber'
GROUP BY [Fruit];
(I don't care about colors Blue and Beige or Cucumbers.)
Which returns:
| Fruit | count_unique |
|:-----------|----------------:|
| Banana | 2 |
| Orange | 2 |
| Apple | 1 |
| Pear | 1 |
Maybe I can use some kind of HAVING Count >1 , however I am unsure of how to structure this.
Solved / Solution:
SELECT x.*
FROM FRUITS x
INNER JOIN ( SELECT [Fruit], [COLOR]
FROM FRUITS GROUP BY [Fruit], [COLOR]) AS y
ON x.[Fruit] = y.[Fruit]
I think the following query will give you the fruits that have more than one colour, which is what you wanted, right? Note that it counts rows, so it assumes each Fruit/Color pair appears only once in FRUITS.
SELECT [Fruit], count([COLOR]) as count_unique
FROM FRUITS
GROUP BY [Fruit]
HAVING count([COLOR]) > 1;
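When the table can hold the same Fruit/Color pair more than once, the rows need de-duplicating before the HAVING filter. The sketch below shows that variant in SQLite through Python's sqlite3 module; it is not Access SQL, and the sample rows (including the repeated Banana/Yellow row) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (fruit TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?)",
    [
        ("Banana", "Yellow"), ("Banana", "Yellow"),  # repeated pair on purpose
        ("Banana", "Green"),
        ("Orange", "Orange"), ("Orange", "Red"),
        ("Apple", "Red"), ("Apple", "Red"),          # repeated, but one colour
        ("Pear", "Green"),
    ],
)

# De-duplicate the (fruit, color) pairs first, then keep only the fruits
# that still have more than one row, i.e. more than one distinct colour.
dupes = conn.execute("""
    SELECT fruit, COUNT(*) AS colour_count
    FROM (SELECT DISTINCT fruit, color FROM fruits) AS pairs
    GROUP BY fruit
    HAVING COUNT(*) > 1
    ORDER BY fruit
""").fetchall()

for row in dupes:
    print(row)
```

Apple appears twice in the made-up data but always with the same colour, so it is not reported. The derived-table form is used because Access does not support COUNT(DISTINCT ...).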
This should work:
SELECT DISTINCT a.Fruit, a.Fruit & "-" & a.Color AS FruitColor
FROM FRUITS AS a
WHERE 1 < ( SELECT COUNT(*) FROM FRUITS AS b WHERE a.Fruit = b.Fruit );
Kind regards, Rene

Running total of "matches" using a window function in SQL

I want to create a window function that will count how many times the value of the field in the current row appears in the part of the ordered partition coming before the current row. To make this more concrete, suppose we have a table like so:
| id| fruit | date |
+---+--------+------+
| 1 | apple | 1 |
| 1 | cherry | 2 |
| 1 | apple | 3 |
| 1 | cherry | 4 |
| 2 | orange | 1 |
| 2 | grape | 2 |
| 2 | grape | 3 |
And we want to create a table like so (omitting the date column for clarity):
| id| fruit | prior |
+---+--------+-------+
| 1 | apple | 0 |
| 1 | cherry | 0 |
| 1 | apple | 1 |
| 1 | cherry | 1 |
| 2 | orange | 0 |
| 2 | grape | 0 |
| 2 | grape | 1 |
Note that for id = 1, moving along the ordered partition, the first entry 'apple' doesn't match anything (since the implied set is empty), the next fruit, 'cherry' also doesn't match. Then we get to 'apple' again, which is a match and so on. I'm imagining the SQL looks something like this:
SELECT
id, fruit,
<some kind of INTERSECT?> OVER (PARTITION BY id ORDER by date) AS prior
FROM fruit_table;
But I cannot find anything that looks right. FWIW, I'm using PostgreSQL 8.4.
You could solve that without a window function rather elegantly with a self-left join and a count():
SELECT t.id, t.fruit, t.day, count(t0.*) AS prior
FROM tbl t
LEFT JOIN tbl t0 ON (t0.id, t0.fruit) = (t.id, t.fruit) AND t0.day < t.day
GROUP BY t.id, t.day, t.fruit
ORDER BY t.id, t.day
I renamed the date column to day because date is a reserved word in standard SQL and the name of a basic type in PostgreSQL.
I corrected a mistake in your sample data. The way you had it, it did not check out, which might confuse people.
If your point is to do it with a window function, this one should work:
SELECT id, fruit, day
,count(*) OVER (PARTITION BY id, fruit ORDER BY day) - 1 AS prior
FROM tbl
ORDER BY id, day
This works, because, I quote the manual:
If frame_end is omitted it defaults to CURRENT ROW.
You effectively count how many rows had the same (id, fruit) on prior days - including the current row. That's what the - 1 is for.
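Both variants can be compared side by side with a sketch like this one, which uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (SQLite 3.25 or later is assumed for the window function). The table name tbl and the column day follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, fruit TEXT, day INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "apple", 1), (1, "cherry", 2), (1, "apple", 3), (1, "cherry", 4),
     (2, "orange", 1), (2, "grape", 2), (2, "grape", 3)],
)

# Window version: the default frame ends at the current row, so the
# running count includes the row itself and 1 is subtracted.
window_rows = conn.execute("""
    SELECT id, fruit,
           COUNT(*) OVER (PARTITION BY id, fruit ORDER BY day) - 1 AS prior
    FROM tbl
    ORDER BY id, day
""").fetchall()

# Self-join version: count earlier rows with the same id and fruit.
# COUNT(t0.id) ignores the NULLs produced by unmatched left-join rows.
join_rows = conn.execute("""
    SELECT t.id, t.fruit, COUNT(t0.id) AS prior
    FROM tbl t
    LEFT JOIN tbl t0
      ON t0.id = t.id AND t0.fruit = t.fruit AND t0.day < t.day
    GROUP BY t.id, t.day, t.fruit
    ORDER BY t.id, t.day
""").fetchall()

for row in window_rows:
    print(row)
```

The two queries agree on this sample because each (id, fruit) pair has at most one row per day. With ties on day, the default RANGE frame would count all peers of the current row, and the window result would differ from the strict `<` comparison in the join.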