SQL Query: Count the number of distinct values in a table

I am trying to create a SQL query to count the number of distinct values in a SQL table.
My table has 6 columns:
n1 n2 n3 n4 n5 n6
______________________
3 5 7 9 11 20
3 7 11 15 17 20
3 15 26 28 30 40
15 26 30 40 55 56
3 4 5 9 15 17
17 20 26 28 30 40
And here's the result I am trying to get:
value frequency
______________________
3 4
4 1
5 2
7 2
9 2
11 2
15 4
17 3
20 3
26 3
28 2
30 3
40 3
55 1
56 1
So basically, I need the query to look at the whole table, take a note of each value that appears, and count the number of times that particular value appears.

Use UNION ALL to get all the nX column values in 1 column and aggregate:
select value, count(*) as frequency
from (
select n1 as value from tablename union all
select n2 from tablename union all
select n3 from tablename union all
select n4 from tablename union all
select n5 from tablename union all
select n6 from tablename
) t
group by value
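To check the UNION ALL approach against the sample data, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. SQLite is only a stand-in engine here (an assumption, since the question does not name one), and the `order by` is added just to make the output stable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tablename (n1 int, n2 int, n3 int, n4 int, n5 int, n6 int)"
)
conn.executemany(
    "insert into tablename values (?, ?, ?, ?, ?, ?)",
    [
        (3, 5, 7, 9, 11, 20),
        (3, 7, 11, 15, 17, 20),
        (3, 15, 26, 28, 30, 40),
        (15, 26, 30, 40, 55, 56),
        (3, 4, 5, 9, 15, 17),
        (17, 20, 26, 28, 30, 40),
    ],
)

# Stack the six columns into one column, then count each value.
query = """
select value, count(*) as frequency
from (
    select n1 as value from tablename union all
    select n2 from tablename union all
    select n3 from tablename union all
    select n4 from tablename union all
    select n5 from tablename union all
    select n6 from tablename
) t
group by value
order by value
"""
frequencies = dict(conn.execute(query).fetchall())
for value, frequency in frequencies.items():
    print(value, frequency)
```

The frequencies always add up to rows times columns (36 here), which is a quick way to validate the result. With this sample data, 15 appears in four of the six rows, so its count comes out as 4.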

I would recommend cross apply for this purpose:
select v.n, count(*) as frequency
from t cross apply
(values (n1), (n2), (n3), (n4), (n5), (n6)) v(n)
group by v.n;
cross apply, which implements a lateral join, is typically more efficient than union all for unpivoting data, because the source is read once instead of once per column. This is particularly true if your "table" is really a view or complex query.

Here is a good use case for UNPIVOT, if you are using SQL Server or Oracle (the bracketed identifiers below are SQL Server syntax):
SELECT
[value]
, count(*) frequency
FROM
( select n1,n2,n3,n4,n5,n6 from tablename) p
UNPIVOT ([value] for nums in ( n1,n2,n3,n4,n5,n6 )) as unpvt
GROUP BY [value]
ORDER BY frequency DESC
This is typically more efficient than UNION ALL, if performance matters.

Related

SQL Query - Looping

I'm trying to output one record per part for each unit of the quantity in the field. E.g. if a part has a qty of 10 then I'd want that part to be listed 10 times; if the qty was 2 then I'd only want the part to be listed twice.
Here's a sample of the data:
Part Qty
PSR6621581 17
PSR6620952 13
PSR6620754 11
PSR6621436 11
PSR6621029 9
PSR661712 9
PSR661907 9
PSR662998 8
PSR6620574 7
PSR661781 7
Any suggestions?
You can use a recursive CTE to expand the rows. For example:
with
p as (
select part, qty, 1 as n from t
union all
select part, qty, n + 1
from p
where n < qty
)
select part, qty from p
Result (for sample rows ABC with qty 1 and DEF with qty 4):
part qty
----- ---
ABC 1
DEF 4
DEF 4
DEF 4
DEF 4
See running example at db<>fiddle.
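A minimal sketch of the recursive CTE expansion, run through Python's built-in sqlite3 module as a stand-in engine (an assumption; SQLite spells it `with recursive`). The two sample rows mirror the result shown above, and the `order by` is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (part text, qty int)")
conn.executemany("insert into t values (?, ?)", [("ABC", 1), ("DEF", 4)])

# Anchor member: one row per part with counter n = 1.
# Recursive member: repeat the row with n + 1 until n reaches qty.
query = """
with recursive p(part, qty, n) as (
    select part, qty, 1 from t
    union all
    select part, qty, n + 1
    from p
    where n < qty
)
select part, qty from p
order by part, n
"""
expanded = conn.execute(query).fetchall()
for part, qty in expanded:
    print(part, qty)
```

Two things to keep in mind with this approach: a part with a qty of 0 would still be listed once, because the anchor member emits every row before the `n < qty` check applies, and on SQL Server the default recursion limit of 100 means larger quantities would need `OPTION (MAXRECURSION ...)`.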
Here is another option. This is using a tally which is the ideal way to handle this type of thing. I keep this view on my system as a crazy fast way of having a tally table.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), -- 10^2 = 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), -- 10^4 = 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
GO
Now we just create a dummy table with your sample data.
declare @Something table
(
Part varchar(10)
, Qty int
)
insert @Something
select 'PSR6621581', 17 union all
select 'PSR6620952', 13 union all
select 'PSR6620754', 11 union all
select 'PSR6621436', 11 union all
select 'PSR6621029', 9 union all
select 'PSR661712', 9 union all
select 'PSR661907', 9 union all
select 'PSR662998', 8 union all
select 'PSR6620574', 7 union all
select 'PSR661781', 7
Now that the setup is complete the query to produce the output you want is super easy and lightning fast to execute.
select s.Part
, s.Qty
from @Something s
join cteTally t on t.N <= s.Qty
order by s.Part
, t.N

How to unnest/explode/flatten the comma separated value in a column in Amazon Redshift?

I am trying to generate a new row for each value in col2. As the value is in string format, I need to wrap it in double quotes before using any Redshift json function on it.
Input:
col1(int) col2(varchar)
1 ab,cd,ef
2 gh
3 jk,lm,kn,ut,zx
Output:
col1(int) col2(varchar)
1 ab
1 cd
1 ef
2 gh
3 jk
3 lm
3 kn
3 ut
3 zx
with NS AS (
select 1 as n union all
select 2 union all
select 3 union all
select 4 union all
select 5 union all
select 6 union all
select 7 union all
select 8 union all
select 9 union all
select 10
)
select
B.col1,
TRIM(SPLIT_PART(B.col2, ',', NS.n)) AS col2
from NS
inner join your_table B ON NS.n <= REGEXP_COUNT(B.col2, ',') + 1
order by B.col1, NS.n
Here, NS (number sequence) is a CTE that returns the numbers from 1 to N, and your_table stands in for your actual table name. N must be at least the largest number of comma-separated values in any row, so add more numbers to the list if your data needs them.

finding consecutive numbers

ok - so i searched the internet for this, and none of the examples i found are exactly like mine.
i have a table with 5 columns and thousands of rows.
i need to find consecutive numbers within each row. i need to end up with 3 queries for the situations shown below
n1 n2 n3 n4 n5
=======================
1 3 4 6 9 = should result in 1 (when checking for pairs)
1 3 4 5 9 = should result in 1 (when checking for triplets)
1 2 5 8 9 = should result in 1 (when checking for double pairs)
This is what i have to move the columns into rows, but i am not sure how to check this now.
select n1 from (
select n1 from myTable where Id = 1
union all select n2 from myTable where Id = 1
union all select n3 from myTable where Id = 1
union all select n4 from myTable where Id = 1
union all select n5 from myTable where Id = 1
) t
order by n1
Thank you for all your help!
@TimBiegeleise, update:
so i found this on google for Gaps & Islands:
SELECT ID, StartSeqNo=MIN(SeqNo), EndSeqNo=MAX(SeqNo)
FROM (
SELECT ID, SeqNo
,rn=SeqNo-ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SeqNo)
FROM dbo.GapsIslands) a
GROUP BY ID, rn;
this is my updated query converting the columns to rows (it requires 2 statements, though i would much rather have 1) and implementing the islands part, but i don't understand how that gives me the result i need (see above). below i show the original row data and the result.
select n1, IDENTITY (INT, 1, 1) AS ID
into #test
from (
select n1 from myTable where Id = 8
union all select n2 from myTable where Id = 8
union all select n3 from myTable where Id = 8
union all select n4 from myTable where Id = 8
union all select n5 from myTable where Id = 8
) as t
order by n1
SELECT ID, StartSeqNo=MIN(n1), EndSeqNo=MAX(n1)
FROM (
SELECT ID, n1
,rn=n1-ROW_NUMBER() OVER (PARTITION BY ID ORDER BY n1)
FROM #test) a
GROUP BY ID, rn
drop table #test
original row - should return 1 (when checking for "pair"/consecutive numbers)
n1 n2 n3 n4 n5
=======================
31 27 28 36 12
the result i get with the above query:
StartSeqNo EndSeqNo
1 12 12
2 27 27
3 28 28
4 31 31
5 36 36
help :-) !
ok, i got it. this query returns a value of 1 for the above stated row
select COUNT(*) as pairs
from (
SELECT StartSeqNo=MIN(n1), EndSeqNo=MAX(n1)
FROM (
SELECT n1, rn=n1-ROW_NUMBER() OVER (ORDER BY n1)
from (
select n1 from myTable where Id = 8
union all select n2 from myTable where Id = 8
union all select n3 from myTable where Id = 8
union all select n4 from myTable where Id = 8
union all select n5 from myTable where Id = 8
) t
) x
GROUP BY rn
) z
where StartSeqNo+1 = EndSeqNo
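A minimal sketch of the same islands query with the run length turned into a parameter, so one query can cover pairs, triplets and double pairs. It runs through Python's built-in sqlite3 module as a stand-in engine (an assumption; window functions need SQLite 3.25 or later). The helper `count_islands` and the Ids 1 to 3 assigned to the example rows are hypothetical, and the trick assumes the numbers within a row are distinct.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table myTable (Id int, n1 int, n2 int, n3 int, n4 int, n5 int)"
)
conn.executemany(
    "insert into myTable values (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 3, 4, 6, 9),       # one pair: 3, 4
        (2, 1, 3, 4, 5, 9),       # one triplet: 3, 4, 5
        (3, 1, 2, 5, 8, 9),       # two pairs: 1, 2 and 8, 9
        (8, 31, 27, 28, 36, 12),  # one pair: 27, 28
    ],
)

# n - row_number() is constant within a run of consecutive numbers,
# so grouping by it yields one group ("island") per run.
ISLANDS_SQL = """
select count(*)
from (
    select min(n) as StartSeqNo, max(n) as EndSeqNo
    from (
        select n, n - row_number() over (order by n) as rn
        from (
            select n1 as n from myTable where Id = :id
            union all select n2 from myTable where Id = :id
            union all select n3 from myTable where Id = :id
            union all select n4 from myTable where Id = :id
            union all select n5 from myTable where Id = :id
        ) t
    ) x
    group by rn
) z
where EndSeqNo - StartSeqNo + 1 = :len
"""


def count_islands(row_id, length):
    """Count runs of exactly `length` consecutive numbers in one row."""
    params = {"id": row_id, "len": length}
    return conn.execute(ISLANDS_SQL, params).fetchone()[0]


print(count_islands(8, 2))  # pairs in the row with Id 8
print(count_islands(2, 3))  # triplets in the row with Id 2
print(count_islands(3, 2))  # pairs in the row with Id 3
```

With this helper, "has a pair" is `count_islands(id, 2) >= 1`, "has a triplet" is `count_islands(id, 3) >= 1`, and "has a double pair" is `count_islands(id, 2) == 2`.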

Find closest or higher values in SQL

I have a table:
table1
rank value
1 10
25 120
29 130
99 980
I have to generate the following table:
table2
rank value
1 10
2 10
3 10
4 10
5 10
6 10
7 10
8 10
9 10
10 10
11 10
12 10
13 120
14 120
15 120
.
.
.
.
25 120
26 120
27 130
28 130
29 130
30 130
.
.
.
.
.
62 980
63 980
.
.
.
99 980
100 980
So, table2 should have all values from 1 to 100. There are 3 cases:
If it's an exact match, for ex. rank 25, value would be 120
Find closest: for ex. rank 9 in table2 has no exact match, but rank 1 is closest to 9 (9-1 = 8 whereas 25-9 = 16), so assign the value of rank 1, which is 10.
If a rank is equally distant from both neighbours, use the higher rank's value: for ex. rank 27 is equally distant from 25 and 29, so take the higher rank, 29, and assign its value, 130.
something like
-- your testdata
with table1(rank,
value) as
(select 1, 10
from dual
union all
select 25, 120
from dual
union all
select 29, 130
from dual
union all
select 99, 980
from dual),
-- range 1..100
data(rank) as
(select level from dual connect by level <= 100)
select d.rank,
min(t.value) keep(dense_rank first order by abs(t.rank - d.rank) asc, t.rank desc)
from table1 t, data d
group by d.rank;
If I understand well your need, you could use the following:
select num, value
from (
select num, value, row_number() over (partition by num order by abs(num-rank) asc, rank desc) as rn
from table1
cross join ( select level as num from dual connect by level <= 100) numbers
)
where rn = 1
This joins your table with the [1,100] interval and then keeps only the first row for each number, ordering by the absolute difference and, in case of equal difference, preferring the row with the greater rank.
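A minimal sketch of this nearest-rank query, run through Python's built-in sqlite3 module as a stand-in engine (an assumption; window functions need SQLite 3.25 or later). SQLite has no `connect by`, so a recursive CTE named `numbers` (a hypothetical name) generates 1 to 100 instead; the ranking logic is unchanged.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('create table table1 ("rank" int, value int)')
conn.executemany(
    "insert into table1 values (?, ?)",
    [(1, 10), (25, 120), (29, 130), (99, 980)],
)

# For every num in 1..100, rank the table1 rows by distance to num,
# breaking ties in favour of the higher rank, and keep the best one.
query = """
with recursive numbers(num) as (
    select 1
    union all
    select num + 1 from numbers where num < 100
)
select num, value
from (
    select num, value,
           row_number() over (
               partition by num
               order by abs(num - "rank") asc, "rank" desc
           ) as rn
    from table1
    cross join numbers
)
where rn = 1
order by num
"""
nearest = dict(conn.execute(query).fetchall())
for num in (9, 13, 27, 63, 64, 100):
    print(num, nearest[num])
```

Ranks 13, 27 and 64 are the tie cases: each is equally far from its two neighbours, so the higher rank wins.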
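Join a hierarchical number generator with your table and use lag() with the ignore nulls clause. Note that this carries the previous known value forward rather than picking the nearest rank, so some ranks (for example 13 to 24 and 27 to 28) differ from the requested output: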
select h.rank, case when value is null
then lag(value ignore nulls) over (order by h.rank)
else value
end value
from (select level rank from dual connect by level <= 100) h
left join t on h.rank = t.rank
order by h.rank
Test:
with t(rank, value) as (
select 1, 10 from dual union all
select 25, 120 from dual union all
select 29, 130 from dual union all
select 99, 980 from dual )
select h.rank, case when value is null
then lag(value ignore nulls) over (order by h.rank)
else value
end value
from (select level rank from dual connect by level <= 100) h
left join t on h.rank = t.rank
order by h.rank
RANK VALUE
---------- ----------
1 10
2 10
...
24 10
25 120
26 120
27 120
28 120
29 130
30 130
...
98 130
99 980
100 980
Here's an alternative that doesn't need a cross join (but does use a couple of analytic functions, so you'd need to test whether this is more performant for your set of data than the other solutions):
WITH sample_data AS (SELECT 1 rnk, 10 VALUE FROM dual UNION ALL
SELECT 25 rnk, 120 VALUE FROM dual UNION ALL
SELECT 29 rnk, 130 VALUE FROM dual UNION ALL
SELECT 99 rnk, 980 VALUE FROM dual)
SELECT rnk + LEVEL - 1 rnk,
CASE WHEN rnk + LEVEL - 1 < rnk + (next_rank - rnk)/2 THEN
VALUE
ELSE next_value
END VALUE
FROM (SELECT rnk,
VALUE,
LEAD(rnk, 1, 100 + 1) OVER (ORDER BY rnk) next_rank,
LEAD(VALUE, 1, VALUE) OVER (ORDER BY rnk) next_value
FROM sample_data)
CONNECT BY PRIOR rnk = rnk
AND PRIOR sys_guid() IS NOT NULL
AND LEVEL <= next_rank - rnk;
RNK VALUE
---------- ----------
1 10
2 10
... ...
12 10
13 120
... ...
24 120
25 120
26 120
27 130
28 130
29 130
30 130
... ...
63 130
64 980
65 980
... ...
98 980
99 980
100 980
N.B. I'm not sure why you have 62 and 63 with a value of 980; the midpoint between 29 and 99 is 64.
Also, you'll see that I've used 100 + 1 instead of 101 - this is because if you wanted to parameterise things, you would replace 100 with the parameter - e.g. v_max_rank + 1
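You can use a join. Note that CROSS JOIN does not take an ON clause, so the self-join below is written as an inner join; on its own it returns only the four rows of table1, so it would still need a 1..100 row generator like the ones above to produce table2:
select t1.* from table1 t1
inner join
(select * from table1) t2
on (t1.rank=t2.rank);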

oracle sql - numbering group of rows

i have the following table with different prices in every week and need a numbering like in the last column. consecutive rows with same prices should have the same number like in weeks 11/12 or 18/19. but on the other side weeks 2 and 16 have the same prices but are not consecutive so they should get a different number.
w | price | r1 | need
===========================
1 167,93 1 1
2 180 1 2
3 164,72 1 3
4 147,42 1 4
5 133,46 1 5
6 145,43 1 6
7 147 1 7
8 147,57 1 8
9 150,95 1 9
10 158,14 1 10
11 170 1 11
12 170 2 11
13 166,59 1 12
14 161,06 1 13
15 162,88 1 14
16 180 2 15
17 183,15 1 16
18 195 1 17
19 195 2 17
i have already experimented with the analytic functions (row_number, rank, dense_rank), but haven't found a solution for this problem so far.
(oracle sql 10,11)
does anyone have a hint? thanks.
Simulating your table first:
create table mytable (w,price,r1)
as
select 1 , 167.93, 1 from dual union all
select 2 , 180 , 1 from dual union all
select 3 , 164.72, 1 from dual union all
select 4 , 147.42, 1 from dual union all
select 5 , 133.46, 1 from dual union all
select 6 , 145.43, 1 from dual union all
select 7 , 147 , 1 from dual union all
select 8 , 147.57, 1 from dual union all
select 9 , 150.95, 1 from dual union all
select 10, 158.14, 1 from dual union all
select 11, 170 , 1 from dual union all
select 12, 170 , 2 from dual union all
select 13, 166.59, 1 from dual union all
select 14, 161.06, 1 from dual union all
select 15, 162.88, 1 from dual union all
select 16, 180 , 2 from dual union all
select 17, 183.15, 1 from dual union all
select 18, 195 , 1 from dual union all
select 19, 195 , 2 from dual
/
Table created.
Your need column is calculated in two parts. First compute a delta column that is 1 when the current row's price differs from the previous row's price and 0 otherwise (the default of -1 in lag makes the first row count as a change). The second part is then easy: need is the running sum of those deltas.
with x as
( select w
, price
, r1
, case lag(price,1,-1) over (order by w)
when price then 0
else 1
end delta
from mytable
)
select w
, price
, r1
, sum(delta) over (order by w) need
from x
/
W PRICE R1 NEED
---------- ---------- ---------- ----------
1 167.93 1 1
2 180 1 2
3 164.72 1 3
4 147.42 1 4
5 133.46 1 5
6 145.43 1 6
7 147 1 7
8 147.57 1 8
9 150.95 1 9
10 158.14 1 10
11 170 1 11
12 170 2 11
13 166.59 1 12
14 161.06 1 13
15 162.88 1 14
16 180 2 15
17 183.15 1 16
18 195 1 17
19 195 2 17
19 rows selected.
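A minimal sketch of the delta plus running sum technique, run through Python's built-in sqlite3 module as a stand-in for Oracle (an assumption; window functions need SQLite 3.25 or later). The r1 column is left out because it plays no part in the calculation.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (w int, price real)")
prices = [
    167.93, 180, 164.72, 147.42, 133.46, 145.43, 147, 147.57, 150.95,
    158.14, 170, 170, 166.59, 161.06, 162.88, 180, 183.15, 195, 195,
]
# Weeks are numbered from 1 in the order the prices are listed.
conn.executemany(
    "insert into mytable values (?, ?)", list(enumerate(prices, start=1))
)

# delta is 0 when the price repeats the previous week's price, else 1;
# the running sum of delta numbers each block of equal consecutive prices.
query = """
with x as (
    select w,
           price,
           case when lag(price, 1, -1) over (order by w) = price
                then 0 else 1
           end as delta
    from mytable
)
select w, price, sum(delta) over (order by w) as need
from x
order by w
"""
rows = conn.execute(query).fetchall()
need = [row[2] for row in rows]
for w, price, n in rows:
    print(w, price, n)
```

Weeks 11/12 and 18/19 share a number because their delta is 0, while weeks 2 and 16 get different numbers even though both cost 180, because the comparison only looks at the immediately preceding week.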
You can nest your analytic functions using inline views, so you first group the consecutive weeks with same prices and then dense_rank using those groups:
select w
, price
, r1
, dense_rank() over (
order by first_w_same_price
) drank
from (
select w
, price
, r1
, last_value(w_start_same_price) ignore nulls over (
order by w
rows between unbounded preceding and current row
) first_w_same_price
from (
select w
, price
, r1
, case lag(price) over (order by w)
when price then null
else w
end w_start_same_price
from your_table
)
)
order by w
The innermost inline view with the LAG function gives the starting week of every consecutive group its own week number, while every following week with the same price gets null (weeks 12 and 19 in your data).
The middle inline view with the LAST_VALUE function then uses the IGNORE NULLS feature to give the consecutive weeks the same value as the first week within each group. So weeks 11 and 12 both get 11 in first_w_same_price, and weeks 18 and 19 both get 18 in first_w_same_price.
And finally the outer query use DENSE_RANK to give the desired result.
For each row, count the rows up to and including it whose price differs from the previous week's price, then add 1:
select T1.*,
(SELECT count(*)
FROM T T2
JOIN T T3 ON T2.w-1=T3.w
WHERE T2.Price<>T3.Price
AND T2.W<=T1.W)+1 rn
from t T1
Try this:
with tt as (
select t.*, decode(lag(price) over(order by w) - price, 0, 1, 0) diff
from t
)
select w
, price
, r1
, row_number() over (order by w) - sum(diff) over(order by w rows between UNBOUNDED PRECEDING and current row) need
from tt
SELECT w, price, r1,
ROW_NUMBER () OVER (PARTITION BY price ORDER BY price) row_column
FROM TABLE