Count values between double quotes and brackets - sql

How can I group by and count the values separated by double quotes between the brackets? I have 400K rows, so I'm also concerned about performance.
["853","1800"]
["852","1500"]
["833","1800"]
["857","1820"]
["23468","3184"]
.....
Desired output:
Value Count
23468 1212
09692 987
... ...

Do you mean something like this? (The with clause is only for testing - remove it, and use your actual table and column names in the main query.)
with
sample_data (j_arr) as (
select '["853","1800"]' from dual union all
select '["852","1500"]' from dual union all
select '["833","1800"]' from dual union all
select '["857","1820"]' from dual union all
select '["23468","3184"]' from dual union all
select '["013", "013", "013"]' from dual
)
select str, count(*) as ct
from sample_data cross apply json_table(j_arr, '$[*]' columns str path '$')
group by str
order by ct desc, str -- or whatever you need
;
STR CT
-------- ---
013 3
1800 2
1500 1
1820 1
23468 1
3184 1
833 1
852 1
853 1
857 1
I am sorry, but I have no clue what "register" means in this context. If you mean that you have 400K rows, I can't see how performance would be an issue. A quick test on my system (with 402K rows) took about 0.33 seconds.
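To sanity-check the counts outside the database, the same grouping can be sketched in Python. This is only an illustration of the logic, not part of the Oracle query above; the `rows` list stands in for the column values and `count_values` is a made-up helper name.

```python
import json
from collections import Counter


def count_values(rows):
    """Count every element across all JSON array strings."""
    counts = Counter()
    for row in rows:
        # json.loads keeps the elements as strings, so '013' keeps its leading zero
        counts.update(json.loads(row))
    return counts


rows = [
    '["853","1800"]',
    '["852","1500"]',
    '["833","1800"]',
    '["857","1820"]',
    '["23468","3184"]',
    '["013", "013", "013"]',
]

counts = count_values(rows)
# Sort by count descending, then by value, like ORDER BY ct DESC, str
ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
for value, ct in ordered:
    print(value, ct)
```

Because the values stay strings, ties are ordered as text (so '23468' sorts before '833'), which is also how the STR column is ordered in the output above.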

Related

How to use distinct keyword on two columns in oracle sql?

I used the DISTINCT keyword on one column and it worked well, but when I add a second column to the SELECT it no longer does what I want, because both columns contain duplicate values. I want each column to show its values without duplicates. Is there a proper SELECT query for that?
The sample data is:
For Col001:
555
555
7878
7878
89.
Col002:
43
43
56
56
56
67
67
67
79
79
79.
I want these data in this format:
Col001:
555
7878
89.
Col002:
43
56
67
79
I tried the following query:
Select distinct col001, col002 from tbl1
Use a set operator. UNION will give you the set of unique values from two subqueries.
select col001 as unq_col_val
from your_table
union
select col002
from your_table;
This presumes you're not fussed whether the value comes from COL001 or COL002. If you are fussed, this variant preserves that information:
select 'COL001' as source_col
,col001 as unq_col_val
from your_table
union
select 'COL002' as source_col
,col002
from your_table;
Note that this result set will contain more rows if the same value exists in both columns.
DISTINCT works across the entire row considering all values in the row and will remove duplicate values where the entire row is duplicated.
For example, given the sample data:
CREATE TABLE table_name (col001, col002) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 1, 2 FROM DUAL UNION ALL
SELECT 1, 3 FROM DUAL UNION ALL
SELECT 2, 1 FROM DUAL UNION ALL
SELECT 2, 2 FROM DUAL UNION ALL
--
SELECT 1, 2 FROM DUAL UNION ALL -- These are duplicates
SELECT 2, 2 FROM DUAL;
Then:
SELECT DISTINCT
col001,
col002
FROM table_name
Outputs:
COL001 COL002
------ ------
     1      1
     1      2
     1      3
     2      1
     2      2
And the duplicates have been removed.
If you want to only display distinct values for each column then you need to consider each column separately and can use something like:
SELECT c1.col001,
c2.col002
FROM ( SELECT DISTINCT
col001,
DENSE_RANK() OVER (ORDER BY col001) AS rnk
FROM table_name
) c1
FULL OUTER JOIN
( SELECT DISTINCT
col002,
DENSE_RANK() OVER (ORDER BY col002) AS rnk
FROM table_name
) c2
ON (c1.rnk = c2.rnk)
Which outputs:
COL001 COL002
------ ------
     1      1
     2      2
  null      3
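The pairing-by-rank idea can be sketched in Python as a rough cross-check. This is an illustration only, assuming the rows are plain tuples with no NULLs; `distinct_columns` is a hypothetical helper name.

```python
from itertools import zip_longest


def distinct_columns(rows):
    """Return the sorted distinct values of each column, paired by rank."""
    col1 = sorted({r[0] for r in rows})
    col2 = sorted({r[1] for r in rows})
    # zip_longest pads the shorter column with None,
    # much like the FULL OUTER JOIN pads with NULL
    return list(zip_longest(col1, col2))


# Same rows as the CREATE TABLE example, duplicates included
rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (1, 2), (2, 2)]

for c1, c2 in distinct_columns(rows):
    print(c1, c2)
```

Each column is de-duplicated independently, so the values on one output line are not related to each other; they only share the same rank.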

How do I extract the first 3 consonants from a string field in SQL?

How can I extract only the first 3 consonants from a field that contains names? If a name has fewer than 3 consonants, the result should be padded with the first vowel(s) of the name.
For example, if I had the following record in the People table:
Field:Name
VALUE:Richard result=> RCH
FIELD:Name
VALUE:Paul result=> PLA
Here's one option; read comments within code.
Sample data:
with test (name) as
  (select 'Richard' from dual union all
   select 'Paul' from dual
  ),
Query begins here:
temp as
  -- val1 - consonants; val2 - vowels
  (select
     name,
     translate(upper(name), '#AEIOU', '#') val1,
     translate(upper(name), '#BCDFGHJKLMNPQRSTVWXYZ', '#') val2
   from test
  )
-- finally: if there are enough consonants (val1's length is >= 3), return the first 3
-- letters (that's WHEN).
-- Otherwise, add as many vowels as necessary (that's what ELSE does)
select name,
       case when length(val1) >= 3 then substr(val1, 1, 3)
            else val1 || substr(val2, 1, 3 - length(val1))
       end result
from temp;
NAME RESULT
------- --------------
Richard RCH
Paul PLA
Just for fun using regexp:
select
name
,substr(
regexp_replace(
upper(name)
,'^([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*).*'
,'\2\4\6\1\3\5'
),1,3) as result
from test;
([AEIOU]*) - is a group of vowels, 0 or more characters
([^AEIOU]*) - is a group of not-vowels (or consonants in this case), 0 or more characters
so this regexp looks for a pattern (vowels1)(consonants1)(vowels2)(consonants2)(vowels3)(consonants3) and reorders it to (consonants1)(consonants2)(consonants3)(vowels1)(vowels2)(vowels3)
then we just take first 3 characters from the reordered string
Full test case:
with test (name) as
(select 'Richard' from dual union all
select 'Paul' from dual union all
select 'Annete' from dual union all
select 'Anny' from dual union all
select 'Aiua' from dual union all
select 'Isaiah' from dual union all
select 'Sue' from dual
)
select
name
,substr(
regexp_replace(
upper(name)
,'^([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*)([AEIOU]*)([^AEIOU]*).*'
,'\2\4\6\1\3\5'
),1,3) as result
from test;
NAME RESULT
------- ------------
Richard RCH
Paul PLA
Annete NNT
Anny NNY
Aiua AIU
Isaiah SHI
Sue SUE
7 rows selected.
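For comparison, here is the same rule as a small Python sketch. It is only a reference for the expected results, not a replacement for the SQL. It assumes plain ASCII names, treats Y as a consonant (as the queries above do), and skips non-letters, which the SQL versions do not; `name_code` is a made-up name.

```python
VOWELS = set("AEIOU")


def name_code(name, length=3):
    """First `length` consonants of the name, padded with its leading vowels if short."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    consonants = [ch for ch in letters if ch not in VOWELS]
    vowels = [ch for ch in letters if ch in VOWELS]
    # Consonants come first; vowels are only reached when there are too few consonants
    return "".join((consonants + vowels)[:length])


for name in ["Richard", "Paul", "Annete", "Anny", "Aiua", "Isaiah", "Sue"]:
    print(name, name_code(name))
```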

Selecting rows where column having number as substring

I have to write a query to fetch all the rows with a column containing numbers as substring
Data
------
abc123
defgh
wz127bdn
Now my desired result is
Result
-----
abc123
wz127bdn
I wrote the query like
SELECT data
FROM table
WHERE data like '%[0-9]%'
But this is not fetching the result.
You need a REGEXP_LIKE:
with test(data) as (
  select 'abc123' from dual union all
  select 'defgh' from dual union all
  select 'wz127bdn' from dual union all
  select '[0-9]' from dual
)
select *
from test
where regexp_like(data, '[0-9]')
;
DATA
--------
abc123
wz127bdn
[0-9]
LIKE will not interpret '[0-9]' as "look for a digit", but exactly as you write it, thus searching for the string '[0-9]':
with test(data) as (
  select 'abc123' from dual union all
  select 'defgh' from dual union all
  select 'wz127bdn' from dual union all
  select '[0-9]' from dual
)
select *
from test
where data like '%[0-9]%' ;
DATA
--------
[0-9]
Use regexp_like() in Oracle:
SELECT data
FROM table
WHERE regexp_like(data, '[0-9]');
Note that the wildcards are not necessary, because regular expressions match anywhere in the string. If you like, you can do:
WHERE regexp_like(data, '.*[0-9].*');
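The difference between the two predicates can be mimicked in Python. This is a rough analogy rather than Oracle's actual implementation: `like_contains` stands in for LIKE '%...%' with no wildcards inside the literal, and `regexp_like` for an unanchored regular-expression match. Both function names are invented for this sketch.

```python
import re


def like_contains(data, literal):
    """Mimic LIKE '%literal%' when the literal holds no % or _: a plain substring test."""
    return literal in data


def regexp_like(data, pattern):
    """Mimic REGEXP_LIKE: true if the pattern matches anywhere in the string."""
    return re.search(pattern, data) is not None


rows = ["abc123", "defgh", "wz127bdn", "[0-9]"]

# The literal search only finds the row that really contains the text '[0-9]'
print([r for r in rows if like_contains(r, "[0-9]")])
# The regular expression finds every row containing a digit
print([r for r in rows if regexp_like(r, "[0-9]")])
```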

How do I Display Aggregated Values Alongside Disaggregated Values in an Oracle Query?

I have a simple table in Oracle that I want to group a certain way: I want to display disaggregated results alongside aggregated results in the same row. Here is the input table:
with abc as
(
select 'aaa' nnn, 100 amt from dual union
select 'aaa', 20 from dual union
select 'aaa', 3 from dual union
select 'bbb', 44 from dual
)
select * from abc
I want to display each individual row joined with a sum of the AMT column grouped by the NNN column. I don't know how to explain this, so here's what it'd look like:
The Sum column in a given row will equal the sum of the values of the AMT column for all of the rows with an NNN value equal to the NNN value in the same given row.
I can do this by joining the input table with a grouped version of itself using the query below, but I think this is messy. My question is: Is there a builtin function in Oracle that accomplishes this? (My Oracle experience is a little weak, although I have lots of experience with SQL Server.)
with abc as
(
select 'aaa' nnn, 100 amt from dual union
select 'aaa', 20 from dual union
select 'aaa', 3 from dual union
select 'bbb', 44 from dual
)
select tblLeft.nnn, tblLeft.amt, tblRight.amtSum
from
(
select nnn, amt from abc
) tblLeft
inner join
(
select nnn, sum(amt) amtSum
from abc
group by nnn
) tblRight on tblLeft.nnn = tblRight.nnn
You could use an analytic function to achieve your goal:
with abc as
(
select 'aaa' nnn, 100 amt from dual union
select 'aaa', 20 from dual union
select 'aaa', 3 from dual union
select 'bbb', 44 from dual
)
select nnn,
amt,
sum(amt) over (partition by nnn)
from abc;
Output:
NNN AMT SUM(AMT)OVER(PARTITIONBYNNN)
aaa 3 123
aaa 20 123
aaa 100 123
bbb 44 44
What analytic functions do: they allow you to use functions like SUM, but they calculate the value for each row instead of aggregating the result. They have some other interesting options, if you would like to learn more:
https://oracle-base.com/articles/misc/analytic-functions
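If it helps to see the idea without SQL, here is a small Python sketch of what SUM(amt) OVER (PARTITION BY nnn) computes: every input row is kept, and each one carries the total of its own group. The function name `with_group_sum` is made up for this illustration.

```python
from collections import defaultdict


def with_group_sum(rows):
    """Attach to each (nnn, amt) row the sum of amt over all rows with the same nnn."""
    totals = defaultdict(int)
    for nnn, amt in rows:
        totals[nnn] += amt
    # No rows are collapsed: one output row per input row
    return [(nnn, amt, totals[nnn]) for nnn, amt in rows]


rows = [("aaa", 100), ("aaa", 20), ("aaa", 3), ("bbb", 44)]
for row in with_group_sum(rows):
    print(*row)
```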

Using ratio_to_report analytic

I am trying to get the percentage of rows that a particular set of values accounts for. It is best explained by example. I can do this for each individual value very simply using the ratio_to_report function with over(), but I am having issues with multiple groupings.
Assume table has 2 columns:
column a column b
1000 some data
1100 some data
2000 some data
1400 some data
1500 some data
With the following query I can see that, for this data set, each value accounts for 20% of the total rows:
select columna, count(*), trunc(ratio_to_report(count(columna)) over() * 100, 2) as perc
from table
group by columna
order by perc desc;
However, what I need is, for example, the count and percentage of the rows that contain 1000, 1400 or 2000. Looking at the data you can tell it's 60%, but I need a query that returns that. It has to be efficient, as it will run against millions of rows. As I said, I have this working for a single value and its percentage; it is the multiple-value case that is throwing me.
It seems I need to put an IN clause somewhere, but the values will not be these specific values each time. I will need to get the values for the "IN" part from another table, if that makes sense. I guess I need some kind of multiple grouping.
Potentially, you're looking for something like
with x as (
  select 1000 a from dual
  union all
  select 1100 from dual
  union all
  select 1400 from dual
  union all
  select 1500 from dual
  union all
  select 2000 from dual
)
select (case when a in (1000,1400,2000)
             then 1
             else 0
         end) bucket,
       count(*),
       ratio_to_report(count(*)) over ()
  from x
 group by (case when a in (1000,1400,2000)
                then 1
                else 0
            end);
BUCKET COUNT(*) RATIO_TO_REPORT(COUNT(*))OVER()
---------- ---------- -------------------------------
1 3 .6
0 2 .4
I'm not sure I entirely understand the requirement, but do you need ratio_to_report at all? Have a look at the following, and let me know how close this is to what you want, and we can work from there!
T1 is the table containing your sample data
create table t1(a primary key) as
select 1000 as a from dual union all
select 1100 as a from dual union all
select 1400 as a from dual union all
select 1500 as a from dual union all
select 2000 as a from dual;
T2 is the lookup table you mentioned (where you get the list of IDs)
create table t2(a primary key) as
select 1000 as a from dual union all
select 1400 as a from dual union all
select 2000 as a from dual;
A left join from T1->T2 will return all rows in T1 paired with all matching rows in T2. For each A in T1 that does not exist in your set (T2), the result will be padded with NULL. We can exploit the fact that COUNT() doesn't count (hehe) nulls.
select count(t1.a) as num_rows
,count(t2.a) as in_set
,count(t2.a) / count(t1.a) as shr_in_set
from t1
left
join t2 on(t1.a = t2.a);
The result of running the query is:
NUM_ROWS IN_SET SHR_IN_SET
---------- ---------- ----------
5 3 .6
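The arithmetic behind that result can be checked with a few lines of Python. This is a sketch of the logic only, assuming a non-empty list of values and a lookup list without duplicates (as the primary keys on T1 and T2 guarantee); `share_in_set` is a hypothetical helper.

```python
def share_in_set(values, lookup):
    """Return (row count, rows found in lookup, share of rows found in lookup)."""
    lookup = set(lookup)
    # Rows without a match contribute nothing, like COUNT(t2.a) ignoring NULLs
    in_set = sum(1 for v in values if v in lookup)
    return len(values), in_set, in_set / len(values)


t1 = [1000, 1100, 1400, 1500, 2000]   # the sample data
t2 = [1000, 1400, 2000]               # the lookup values

print(share_in_set(t1, t2))
```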