How to group items by rows - sql

I want to group shops by their number of items, but I am not sure of the syntax for creating a group that does not exist in the table. I want the output to look like this:
Group | Number of items
1 | XXX
2 | XXX
Group 1 would hold the entries whose number of items is less than 10, while group 2 would hold those with 10 or more. I have the data for the number of items, but I need to create the group number and I am not sure how. Thank you in advance.
What I have tried:
SELECT
case when b.item_stock < 10 then count(a.shopid) else null end as Group_1,
case when b.item_stock >= 10 or b.item_stock < 100 then count(a.shopid) else null end as Group_2
FROM `table_a` a
left join `table_b` b
on a.id= b.id
where registration_time between "2017-01-01" and "2017-05-31"
group by b.item_stock
LIMIT 1000

Below is the BigQuery way of doing this
select 'group_' || range_bucket(item_stock, [0, 10]) as group_id,
count(*) as number_of_items
from your_table
group by group_id
If applied to dummy data such as
with your_table as (
select 'ID001' shop_id, 40 item_stock union all
select 'ID002', 20 union all
select 'ID003', 30 union all
select 'ID004', 9 union all
select 'ID005', 44 union all
select 'ID006', 22 union all
select 'ID007', 28 union all
select 'ID008', 35 union all
select 'ID009', 20 union all
select 'ID010', 4 union all
select 'ID011', 5 union all
select 'ID012', 45 union all
select 'ID013', 29 union all
select 'ID014', 8 union all
select 'ID015', 40 union all
select 'ID016', 26 union all
select 'ID017', 31 union all
select 'ID018', 48 union all
select 'ID019', 45 union all
select 'ID020', 13
)
the output is
group_id | number_of_items
group_1 | 4
group_2 | 16
Benefit of this solution is that it is easily extended to any number of ranges just by adding those into range_bucket function -
for example : range_bucket(item_stock, [0, 10, 50, 100, 1000])
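To see how the bucket numbers come out, here is a minimal Python sketch that imitates the documented behaviour of RANGE_BUCKET (the result is the number of boundaries that are less than or equal to the value). The helper names `range_bucket` and `count_by_bucket` are made up for this illustration and the data is the dummy item_stock column from above; this is not BigQuery code.

```python
from bisect import bisect_right
from collections import Counter

# item_stock values from the dummy data in the answer above
item_stock = [40, 20, 30, 9, 44, 22, 28, 35, 20, 4,
              5, 45, 29, 8, 40, 26, 31, 48, 45, 13]


def range_bucket(point, boundaries):
    """Imitate RANGE_BUCKET: count the boundaries that are <= point.

    `boundaries` must be sorted ascending. A result of 0 means the point
    lies below the first boundary.
    """
    return bisect_right(boundaries, point)


def count_by_bucket(values, boundaries):
    """Count how many values fall into each 'group_N' bucket."""
    return Counter(f"group_{range_bucket(v, boundaries)}" for v in values)


two_groups = count_by_bucket(item_stock, [0, 10])
more_ranges = count_by_bucket(item_stock, [0, 10, 50, 100, 1000])

print(dict(two_groups))
print(dict(more_ranges))
```

Adding boundaries only adds possible buckets; with this dummy data every value is below 50, so the longer boundary list yields the same two non-empty groups.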

From the example you've shared you were close to solving this one, just need to tweak your case statement.
The case statement in your query is splitting the groups into two separate columns, whereas you need these groups in one column with the totals to the right.
Consider the below change to your select statement.
case when b.item_stock < 10 then "Group_1"
when b.item_stock >= 10 then "Group_2" else null end as Groups,
count(a.shop_id) as total
Schema (MySQL v5.7)
CREATE TABLE id (
`shop_id` VARCHAR(5),
`item_stock` INTEGER
);
INSERT INTO id
(`shop_id`, `item_stock`)
VALUES
('ID001', '40'),
('ID002', '20'),
('ID003', '30'),
('ID004', '9'),
('ID005', '44'),
('ID006', '22'),
('ID007', '28'),
('ID008', '35'),
('ID009', '20'),
('ID010', '4'),
('ID011', '5'),
('ID012', '45'),
('ID013', '29'),
('ID014', '8'),
('ID015', '40'),
('ID016', '26'),
('ID017', '31'),
('ID018', '48'),
('ID019', '45'),
('ID020', '13');
Query #1
SELECT
case when item_stock < 10 then "Group_1"
when item_stock >= 10 then "Group_2" else null end as Groups,
count(shop_id) as total
FROM id group by 1;
Groups | total
Group_1 | 4
Group_2 | 16
Tom

Related

How to use distinct keyword on two columns in oracle sql?

I used the DISTINCT keyword on one column and it worked very well, but when I add a second column to the select query it no longer works, because both columns contain duplicate values. I do not want duplicate values shown in either column. Is there a proper select query for that?
The sample data is:
For Col001:
555
555
7878
7878
89.
Col002:
43
43
56
56
56
67
67
67
79
79
79.
I want these data in this format:
Col001:
555
7878
89.
Col002:
43
56
67
79
I tried the following query:
Select distinct col001, col002 from tbl1
Use a set operator. UNION will give you the set of unique values from two subqueries.
select col001 as unq_col_val
from your_table
union
select col002
from your_table;
This presumes you're not fussed whether the value comes from COL001 or COL002. If you are fussed, this variant preserves that information:
select 'COL001' as source_col
,col001 as unq_col_val
from your_table
union
select 'COL002' as source_col
,col002
from your_table;
Note that this result set will contain more rows if the same value exists in both columns.
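A small sketch of that difference, run against SQLite from Python rather than Oracle. The rows below are hypothetical and chosen so that the value 43 appears in both columns; only the table and column names come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (col001 INTEGER, col002 INTEGER)")
# Hypothetical rows: 43 occurs in both col001 and col002
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(555, 43), (555, 56), (7878, 56), (43, 67), (43, 67)],
)

# Plain UNION: one row per distinct value, whichever column it came from
plain = [row[0] for row in conn.execute("""
    SELECT col001 AS unq_col_val FROM your_table
    UNION
    SELECT col002 FROM your_table
    ORDER BY 1
""")]

# UNION with a source label: a value present in both columns appears twice
tagged = conn.execute("""
    SELECT 'COL001' AS source_col, col001 AS unq_col_val FROM your_table
    UNION
    SELECT 'COL002', col002 FROM your_table
    ORDER BY 1, 2
""").fetchall()

print(plain)
print(tagged)
```

Here the plain UNION returns five values while the labelled variant returns six rows, because 43 is reported once per source column.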
DISTINCT works across the entire row considering all values in the row and will remove duplicate values where the entire row is duplicated.
For example, given the sample data:
CREATE TABLE table_name (col001, col002) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 1, 2 FROM DUAL UNION ALL
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 2, 2 FROM DUAL UNION ALL
--
SELECT 1, 2 FROM DUAL UNION ALL -- These are duplicates
SELECT 2, 2 FROM DUAL;
Then:
SELECT DISTINCT
col001,
col002
FROM table_name
Outputs:
COL001 | COL002
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
And the duplicates have been removed.
If you want to only display distinct values for each column then you need to consider each column separately and can use something like:
SELECT c1.col001,
c2.col002
FROM ( SELECT DISTINCT
col001,
DENSE_RANK() OVER (ORDER BY col001) AS rnk
FROM table_name
) c1
FULL OUTER JOIN
( SELECT DISTINCT
col002,
DENSE_RANK() OVER (ORDER BY col002) AS rnk
FROM table_name
) c2
ON (c1.rnk = c2.rnk)
Which outputs:
COL001 | COL002
1 | 1
2 | 2
null | 3

Oracle Finding a string match from multiple database tables

This is somewhat a complex problem to describe, but I'll try to explain it with an example. I thought I would have been able to use the Oracle Instr function to accomplish this, but it does not accept queries as parameters.
Here is a simplification of my data:
Table1
Person Qualities
Joe 5,6,7,8,9
Mary 7,8,10,15,20
Bob 7,8,9,10,11,12
Table2
Id Desc
5 Nice
6 Tall
7 Short
Table3
Id Desc
8 Angry
9 Sad
10 Fun
Table4
Id Desc
11 Boring
12 Happy
15 Cool
20 Mad
Here is somewhat of a query to give an idea of what I'm trying to accomplish:
select * from table1
where instr (Qualities, select Id from table2, 1,1) <> 0
and instr (Qualities, select Id from table3, 1,1) <> 0
and instr (Qualities, select Id from table4, 1,1) <> 0
I'm trying to figure out which people have at least 1 quality from each of the 3 groups of qualities (tables 2,3, and 4)
So Joe would not be returned in the results because he does not have a quality from each of the 3 groups, but Mary and Bob would, since they have at least 1 quality from each group.
We are running Oracle 12, thanks!
Here's one option:
SQL> with
2 table1 (person, qualities) as
3 (select 'Joe', '5,6,7,8,9' from dual union all
4 select 'Mary', '7,8,10,15,20' from dual union all
5 select 'Bob', '7,8,9,10,11,12' from dual
6 ),
7 table2 (id, descr) as
8 (select 5, 'Nice' from dual union all
9 select 6, 'Tall' from dual union all
10 select 7, 'Short' from dual
11 ),
12 table3 (id, descr) as
13 (select 8, 'Angry' from dual union all
14 select 9, 'Sad' from dual union all
15 select 10, 'Fun' from dual
16 ),
17 table4 (id, descr) as
18 (select 11, 'Boring' from dual union all
19 select 12, 'Happy' from dual union all
20 select 15, 'Cool' from dual union all
21 select 20, 'Mad' from dual
22 ),
23 t1new (person, id) as
24 (select person, regexp_substr(qualities, '[^,]+', 1, column_value) id
25 from table1 cross join table(cast(multiset(select level from dual
26 connect by level <= regexp_count(qualities, ',') + 1
27 ) as sys.odcinumberlist))
28 )
29 select a.person,
30 count(b.id) bid,
31 count(c.id) cid,
32 count(d.id) did
33 from t1new a left join table2 b on a.id = b.id
34 left join table3 c on a.id = c.id
35 left join table4 d on a.id = d.id
36 group by a.person
37 having ( count(b.id) > 0
38 and count(c.id) > 0
39 and count(d.id) > 0
40 );
PERS BID CID DID
---- ---------- ---------- ----------
Bob 1 3 2
Mary 1 2 2
SQL>
What does it do?
lines #1 - 22 represent your sample data
the T1NEW CTE (lines #23 - 28) splits the comma-separated qualities into rows, one row per quality per person
the final select (lines #29 - 40) outer-joins t1new to each of the "description" tables (table2/3/4) and counts, per person, how many of that person's qualities (the rows from t1new) are found in each table
the having clause returns only the desired persons: each of those counts has to be a positive number
Maybe this will help:
{1} Create a view that categorises all qualities and allows you to SELECT quality IDs and categories . {2} JOIN the view to TABLE1 and use a join condition that "splits" the CSV value stored in TABLE1.
{1} View
create or replace view allqualities
as
select 1 as category, id as qid, descr from table2
union
select 2, id, descr from table3
union
select 3, id, descr from table4
;
select * from allqualities order by category, qid ;
CATEGORY QID DESCR
---------- ---------- ------
1 5 Nice
1 6 Tall
1 7 Short
2 8 Angry
2 9 Sad
2 10 Fun
3 11 Boring
3 12 Happy
3 15 Cool
3 20 Mad
{2} Query
-- JOIN CONDITION:
-- {1} add a comma at the start and at the end of T1.qualities
-- {2} remove all blanks (spaces) from T1.qualities
-- {3} use LIKE and the qid (of allqualities), wrapped in commas
--
-- inline view: use UNIQUE, otherwise we may get counts > 3
--
select person
from (
select unique person, category
from table1 T1
join allqualities A
on ',' || replace( T1.qualities, ' ', '' ) || ',' like '%,' || A.qid || ',%'
)
group by person
having count(*) = ( select count( distinct category ) from allqualities )
;
-- result
PERSON
Bob
Mary
Tested with Oracle 18c and 11g.
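Why the commas in the join condition matter can be seen in a small sketch, run here against SQLite from Python instead of Oracle. The `allqualities` view is replaced by a plain table holding the same rows, and the helper `matches` is made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (person TEXT, qualities TEXT);
INSERT INTO table1 VALUES
  ('Joe', '5,6,7,8,9'),
  ('Mary', '7,8,10,15,20'),
  ('Bob', '7,8,9,10,11,12');
-- stand-in for the allqualities view (category, qid)
CREATE TABLE allqualities (category INTEGER, qid INTEGER);
INSERT INTO allqualities VALUES
  (1, 5), (1, 6), (1, 7),
  (2, 8), (2, 9), (2, 10),
  (3, 11), (3, 12), (3, 15), (3, 20);
""")

# Without delimiters, '5' is found inside '15'
naive = "T1.qualities LIKE '%' || A.qid || '%'"
# Wrapping both sides in commas only matches whole list entries
wrapped = ("',' || replace(T1.qualities, ' ', '') || ','"
           " LIKE '%,' || A.qid || ',%'")


def matches(condition, person, qid):
    """True if the join condition pairs this person with this quality id."""
    sql = (f"SELECT COUNT(*) FROM table1 T1 JOIN allqualities A ON {condition} "
           "WHERE T1.person = ? AND A.qid = ?")
    return conn.execute(sql, (person, qid)).fetchone()[0] > 0


people = [row[0] for row in conn.execute(f"""
    SELECT person
    FROM (SELECT DISTINCT person, category
          FROM table1 T1 JOIN allqualities A ON {wrapped})
    GROUP BY person
    HAVING COUNT(*) = (SELECT COUNT(DISTINCT category) FROM allqualities)
    ORDER BY person
""")]

print(matches(naive, "Mary", 5), matches(wrapped, "Mary", 5))
print(people)
```

Mary does not have quality 5, yet the undelimited pattern pairs her with it because of her 15; the comma-wrapped version does not.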

Oracle SQL (Toad): Expand table

Suppose I have an SQL (Oracle Toad) table named "test", which has the following fields and entries (dates are in dd/mm/yyyy format):
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/06/2014 3
1 01/09/2014 6
2 01/04/2015 7
2 01/08/2015 43
2 01/09/2015 85
2 01/12/2015 4
I know from how the table has been created that, since there are value entries for id = 1 for February 2014 and June 2014, the values for March through May 2014 must be 0. The same applies to July and August 2014 for id = 1, and for May through July 2015 and October through November 2015 for id = 2.
Now, if I want to calculate, say, the median of the value column for a given id, I will not arrive at the correct result using the table as it stands - as I'm missing 5 zero entries for each id.
I would therefore like to create/use the following (potentially just temporary table)...
id ref_date value
---------------------
1 01/01/2014 20
1 01/02/2014 25
1 01/03/2014 0
1 01/04/2014 0
1 01/05/2014 0
1 01/06/2014 3
1 01/07/2014 0
1 01/08/2014 0
1 01/09/2014 6
2 01/04/2015 7
2 01/05/2015 0
2 01/06/2015 0
2 01/07/2015 0
2 01/08/2015 43
2 01/09/2015 85
2 01/10/2015 0
2 01/11/2015 0
2 01/12/2015 4
...on which I could then compute the median by id:
select id, median(value) as med_value from test group by id
How do I do this? Or would there be an alternative way?
Many thanks,
Mr Clueless
In this solution, I build a table with all the "needed dates" and value of 0 for all of them. Then, instead of a join, I do a union all, group by id and ref_date and ADD the values in each group. If the date had a row with a value in the original table, then that's the resulting value; and if it didn't, the value will be 0. This avoids a join. In almost all cases a union all + aggregate will be faster (sometimes much faster) than a join.
I added more input data for more thorough testing. In your original question, you have two id's, and for both of them you have four positive values. You are missing five values in each case, so there will be five zeros (0) which means the median is 0 in both cases. For id=3 (which I added) I have three positive values and three zeros; the median is half of the smallest positive number. For id=4 I have just one value, which then should be the median as well.
The solution includes, in particular, an answer to your specific question - how to create the temporary table (which most likely doesn't need to be a temporary table at all, but an inline view). With factored subqueries (in the WITH clause), the optimizer decides whether to treat them as temporary tables or inline views; you can see what the optimizer decided if you look at the Explain Plan.
with
inputs ( id, ref_date, value ) as (
select 1, to_date('01/01/2014', 'dd/mm/yyyy'), 20 from dual union all
select 1, to_date('01/02/2014', 'dd/mm/yyyy'), 25 from dual union all
select 1, to_date('01/06/2014', 'dd/mm/yyyy'), 3 from dual union all
select 1, to_date('01/09/2014', 'dd/mm/yyyy'), 6 from dual union all
select 2, to_date('01/04/2015', 'dd/mm/yyyy'), 7 from dual union all
select 2, to_date('01/08/2015', 'dd/mm/yyyy'), 43 from dual union all
select 2, to_date('01/09/2015', 'dd/mm/yyyy'), 85 from dual union all
select 2, to_date('01/12/2015', 'dd/mm/yyyy'), 4 from dual union all
select 3, to_date('01/01/2016', 'dd/mm/yyyy'), 12 from dual union all
select 3, to_date('01/03/2016', 'dd/mm/yyyy'), 23 from dual union all
select 3, to_date('01/06/2016', 'dd/mm/yyyy'), 2 from dual union all
select 4, to_date('01/11/2014', 'dd/mm/yyyy'), 9 from dual
),
-- the "inputs" table constructed above is for testing only,
-- it is not part of the solution.
ranges ( id, min_date, max_date ) as (
select id, min(ref_date), max(ref_date)
from inputs
group by id
),
prep ( id, ref_date, value ) as (
select id, add_months(min_date, level - 1), 0
from ranges
connect by level <= 1 + months_between( max_date, min_date )
and prior id = id
and prior sys_guid() is not null
),
v ( id, ref_date, value ) as (
select id, ref_date, sum(value)
from ( select id, ref_date, value from prep union all
select id, ref_date, value from inputs
)
group by id, ref_date
)
select id, median(value) as median_value
from v
group by id
order by id -- ORDER BY is optional
;
ID MEDIAN_VALUE
-- ------------
1 0
2 0
3 1
4 9
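For checking the expected medians by hand, the same zero-filling idea can be sketched outside the database. This is only an illustration in Python, not a replacement for the query: the function names are made up, and the rows are the test inputs listed above.

```python
from datetime import date
from statistics import median

# (id, ref_date, value) rows copied from the "inputs" test data above
inputs = [
    (1, date(2014, 1, 1), 20), (1, date(2014, 2, 1), 25),
    (1, date(2014, 6, 1), 3), (1, date(2014, 9, 1), 6),
    (2, date(2015, 4, 1), 7), (2, date(2015, 8, 1), 43),
    (2, date(2015, 9, 1), 85), (2, date(2015, 12, 1), 4),
    (3, date(2016, 1, 1), 12), (3, date(2016, 3, 1), 23),
    (3, date(2016, 6, 1), 2),
    (4, date(2014, 11, 1), 9),
]


def month_index(d):
    """Map a date to a running month number so months can be iterated."""
    return d.year * 12 + (d.month - 1)


def medians_with_zero_fill(rows):
    """Median per id after treating every missing month in the id's range as 0."""
    by_id = {}
    for id_, ref_date, value in rows:
        months = by_id.setdefault(id_, {})
        m = month_index(ref_date)
        # Sum per month, mirroring the SUM(value) in the query
        months[m] = months.get(m, 0) + value
    result = {}
    for id_, months in by_id.items():
        first, last = min(months), max(months)
        filled = [months.get(m, 0) for m in range(first, last + 1)]
        result[id_] = median(filled)
    return result


medians = medians_with_zero_fill(inputs)
print(medians)
```

The medians agree with the query output above: 0 for ids 1 and 2, 1 for id 3 and 9 for id 4.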
Another option, assuming ref_date is a DATE column:
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= (select max(months_between(max_date, min_date)) + 1 from int1 ) ) -- + 1 so the last month of each range is included
select i.id
, add_months(i.min_date,s.n) as ref_date
, nvl(value,0) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
And with median
with int1 as (select id
, max(ref_date) as max_date
, min(ref_date) as min_date from test group by id )
, s(n) as (select level -1 from dual connect by level <= (select max(months_between(max_date, min_date)) + 1 from int1 ) ) -- + 1 so the last month of each range is included
select i.id
, MEDIAN(nvl(value,0)) as value
from int1 i
join s on add_months(i.min_date,s.n) <= i.max_date
LEFT join test t on t.id = i.id and add_months(i.min_date,s.n) = t.ref_date
group by i.id

How can I find unoccupied id numbers in a table?

In my table I want to see a list of unoccupied id numbers in a certain range.
For example there are 10 records in my table with id's: "2,3,4,5,10,12,16,18,21,22" and say that I want to see available ones between 1 and 25. So I want to see a list like:
1,6,7,8,9,11,13,14,15,17,19,20,23,24,25
How should I write my sql query?
Select the numbers from 1 to 25 and show only those that are not in your table:
select n from
( select rownum n from dual connect by level <= 25)
where n not in (select id from table);
Let's say you have a #numbers table with three numbers -
CREATE TABLE #numbers (num INT)
INSERT INTO #numbers (num)
SELECT 1
UNION
SELECT 3
UNION
SELECT 6
Now, you can use CTE to generate numbers recursively from 1-25 and deselect those which are in your #numbers table in the WHERE clause -
;WITH n(n) AS
(
SELECT 1
UNION ALL
SELECT n+1 FROM n WHERE n < 25
)
SELECT n FROM n
WHERE n NOT IN (select num from #numbers)
ORDER BY n
OPTION (MAXRECURSION 25);
You can also try an anti-join (LEFT OUTER JOIN ... IS NULL), which returns the start of each gap rather than every missing id:
select
u1.user_id + 1 as start
from users as u1
left outer join users as u2 on u1.user_id + 1 = u2.user_id
where
u2.user_id is null
see also SQL query to find Missing sequence numbers
You need LISTAGG to get the output in a single row.
SQL> WITH DATA1 AS(
2 SELECT LEVEL rn FROM dual CONNECT BY LEVEL <=25
3 ),
4 data2 AS(
5 SELECT 2 num FROM dual UNION ALL
6 SELECT 3 FROM dual UNION ALL
7 SELECT 4 from dual union all
8 SELECT 5 FROM dual UNION ALL
9 SELECT 10 FROM dual UNION ALL
10 SELECT 12 from dual union all
11 SELECT 16 from dual union all
12 SELECT 18 FROM dual UNION ALL
13 SELECT 21 FROM dual UNION ALL
14 SELECT 22 FROM dual)
15 SELECT listagg(rn, ',')
16 WITHIN GROUP (ORDER BY rn) num_list FROM data1
17 WHERE rn NOT IN(SELECT num FROM data2)
18 /
NUM_LIST
----------------------------------------------------
1,6,7,8,9,11,13,14,15,17,19,20,23,24,25
SQL>
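For engines without CONNECT BY, the same two steps (generate 1 to 25, then keep what is not in the table) can be sketched with a recursive CTE. The example below runs against SQLite from Python; the table name `items` is a placeholder, and the ids are the ten from the question. The list is joined in Python instead of with LISTAGG.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "items" is a placeholder table holding the ids from the question
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO items (id) VALUES (?)",
    [(i,) for i in (2, 3, 4, 5, 10, 12, 16, 18, 21, 22)],
)

missing = [row[0] for row in conn.execute("""
    WITH RECURSIVE seq(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM seq WHERE n < 25   -- stop at the upper bound
    )
    SELECT n FROM seq
    WHERE n NOT IN (SELECT id FROM items)
    ORDER BY n
""")]

# Build the single-row, comma-separated form
num_list = ",".join(str(n) for n in missing)
print(num_list)
```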

How to write a query to produce counts for arbitrary value bands?

My table has 2 fields: id and unit. I want to count how many ids have <10, 10-49, 50-99 etc. units. The final result should look like:
Category | countIds
<10 | 1516
10 - 49 | 710
50 - 99 | 632
etc.
This is the query that returns each id and how many units it has:
select id, count(unit) as numUnits
from myTable
group by id
How can I build on that query to give me the category, countIds result?
create temporary table ranges (
seq int primary key,
range_label varchar(10),
lower int,
upper int
);
insert into ranges values
(1, '<10', 0, 9),
(2, '10 - 49', 10, 49),
(3, '50 - 99', 50, 99)
etc.
select r.range_label, count(c.numUnits) as countIds
from ranges as r
join (
select id, count(unit) as numUnits
from myTable
group by id) as c
on c.numUnits between r.lower and r.upper
group by r.seq, r.range_label
order by r.seq;
edit: changed sum() to count() above.
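A runnable sketch of the range-table join, using SQLite from Python. The per-id unit counts are hypothetical, only three ranges are loaded, and the table is created as a regular in-memory table rather than a temporary one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (id INTEGER, unit INTEGER);
CREATE TABLE ranges (
  seq INTEGER PRIMARY KEY,
  range_label TEXT,
  lower INTEGER,
  upper INTEGER
);
INSERT INTO ranges VALUES
  (1, '<10', 0, 9),
  (2, '10 - 49', 10, 49),
  (3, '50 - 99', 50, 99);
""")

# Hypothetical data: id -> how many unit rows it has
units_per_id = {1: 3, 2: 12, 3: 60, 4: 5}
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(id_, u) for id_, n in units_per_id.items() for u in range(n)],
)

rows = conn.execute("""
    SELECT r.range_label, COUNT(c.numUnits) AS countIds
    FROM ranges AS r
    JOIN (SELECT id, COUNT(unit) AS numUnits
          FROM myTable
          GROUP BY id) AS c
      ON c.numUnits BETWEEN r.lower AND r.upper
    GROUP BY r.seq, r.range_label
    ORDER BY r.seq
""").fetchall()

print(rows)
```

Because this is an inner join, a range that no id falls into does not appear in the result; a LEFT JOIN from ranges would list it with a count of 0.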
select category_bucket, count(*) as countIds
from (select case when numUnits < 10 then '<10'
when numUnits between 10 and 49 then '10 - 49'
when numUnits between 50 and 99 then '50 - 99'
else '100+'
end as category_bucket
-- bucket the per-id unit counts, not the raw rows
from (select id, count(unit) as numUnits
from myTable
group by id) c) b
group by category_bucket
A dynamically grouped solution is much harder.
SELECT category, COUNT(*) AS countIds
FROM (
SELECT id
, 'LESS_THAN_TEN' category
, COUNT(unit) numUnits
FROM table1
GROUP BY id
HAVING COUNT(unit) < 10
UNION ALL
SELECT id
, 'BETWEEN_10_AND_49' category
, COUNT(unit) numUnits
FROM table1
GROUP BY id
HAVING COUNT(unit) BETWEEN 10 AND 49
UNION ALL
SELECT id
, 'BETWEEN_50_AND_99' category
, COUNT(unit) numUnits
FROM table1
GROUP BY id
HAVING COUNT(unit) BETWEEN 50 AND 99
) x
GROUP BY category
Giving an example for one range (10 - 49):
select count(id) from
(select id, count(unit) as numUnits from myTable group by id) c
where numUnits between 10 and 49
It's not precisely what you want, but you could use fixed ranges, like so:
select ' < ' || floor(id / 50) * 50, count(unit) as numUnits
from myTable
group by floor(id / 50) * 50
order by 1
Try this working sample in SQL Server TSQL
SET NOCOUNT ON
GO
WITH MyTable AS
(
SELECT 00 as Id, 1 Value UNION ALL
SELECT 05 , 2 UNION ALL
SELECT 10 , 3 UNION ALL
SELECT 15 , 1 UNION ALL
SELECT 20 , 2 UNION ALL
SELECT 25 , 3 UNION ALL
SELECT 30 , 1 UNION ALL
SELECT 35 , 2 UNION ALL
SELECT 40 , 3 UNION ALL
SELECT 45 , 1 UNION ALL
SELECT 40 , 3 UNION ALL
SELECT 45 , 1 UNION ALL
SELECT 50 , 3 UNION ALL
SELECT 55 , 1 UNION ALL
SELECT 60 , 3 UNION ALL
SELECT 65 , 1 UNION ALL
SELECT 70 , 3 UNION ALL
SELECT 75 , 1 UNION ALL
SELECT 80 , 3 UNION ALL
SELECT 85 , 1 UNION ALL
SELECT 90 , 3 UNION ALL
SELECT 95 , 1 UNION ALL
SELECT 100 , 3 UNION ALL
SELECT 105 , 1 UNION ALL
SELECT 110 , 3 UNION ALL
SELECT 115 , 1 Value
)
SELECT Category, COUNT (*) CountIds
FROM
(
SELECT
CASE
WHEN Id BETWEEN 0 and 9 then '<10'
WHEN Id BETWEEN 10 and 49 then '10-49'
WHEN Id BETWEEN 50 and 99 then '50-99'
WHEN Id > 99 then '>99'
ELSE '0' END as Category
FROM MyTable
) as A
GROUP BY Category
This will give you the following result
Category CountIds
-------- -----------
<10 2
>99 4
10-49 10
50-99 10