SQL - Point differences between two lists - sql

Given two comma separated (un-ordered) lists of numbers, I want to extract only the differences between them (using regexp probably).
e.g.:
select
'1010484,1025781,1051394,1069679' as list_1,
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
What I wish for is a result such as:
l1_additional_data: 1069679
l2_additional_data: 1005923,1034010,1044261,1048311
How can this be done?
I'm using Vertica, BTW - That means that no hierarchic ("connect by") queries could be used here.
Thanks in advance!

There's a relevant post that will be helpful - Splitting string into multiple rows in Oracle
I don't know vertica but based on oracle You could go with:
with list1 as
(
select
regexp_substr(list_1 ,'[^,]+', 1, level) as list_1_rows
from (
select
'1010484,1025781,1051394,1069679' as list_1
from dual)
connect by
regexp_substr(list_1 ,'[^,]+', 1, level) is not null),
list2 as (select
regexp_substr(list_2 ,'[^,]+', 1, level) as list_2_rows
from (
select
'1005923,1010484,1025781,1034010,1044261,1048311,1051394' as list_2
from dual)
connect by regexp_substr(list_2 ,'[^,]+', 1, level) is not null)
select * from list1
full outer join list2
on list1.list_1_rows = list2.list_2_rows
where list_1_rows is null or list_2_rows is null

OK, Here's my solution - But it's not very efficient, and it probably won't scale (in terms of performance):
WITH lists AS
(SELECT'1010484,1025781,1051394,1069679' AS list_1, '1005923,1010484,1025781,1034010,1044261,1048311,1051394' AS list_2 )
, numbers AS
(SELECT row_number() over() i
FROM system_columns limit 100)
SELECT group_concat(parsed_code_1) list_1_additions,
group_concat(parsed_code_2) list_2_additions
FROM(SELECT parsed_code_1
FROM(SELECT split_part(list_1, ',', i) parsed_code_1
FROM lists
CROSS JOIN numbers
WHERE i <= regexp_count(list_1, ',')+1) l
WHERE parsed_code_1 IS NOT NULL) a
FULL OUTER JOIN
(SELECT parsed_code_2
FROM(SELECT split_part(list_2, ',', i) parsed_code_2
FROM lists
CROSS JOIN numbers
WHERE i <= regexp_count(list_2, ',')+1) l
WHERE parsed_code_2 IS NOT NULL) b ON(parsed_code_1 = parsed_code_2)
WHERE parsed_code_1 IS NULL OR parsed_code_2 IS NULL

Related

How do I create a list of all possible anagrams of a word/string in PostgreSQL

How do I create a list of all possible anagrams of a word/string in PostgreSQL.
For example if String is 'act'
then the desired output should be:
act,
atc,
cta,
cat,
tac,
tca
I have one Table 'tbl_words' which contains million of words.
Then I want to check/search for only valid words in my database table from this anagrams list.
Like from above list of anagrams valid words are : act, cat.
Is there any way to do this?
Update 1:
I need output like this:
(all permutation for given word )
any idea ??
The query generates all permutations of 3 elements set:
with recursive numbers as (
select generate_series(1, 3) as i
),
rec as (
select i, array[i] as p
from numbers
union all
select n.i, p || n.i
from numbers n
join rec on cardinality(p) < 3 and not n.i = any(p)
)
select p as permutation
from rec
where cardinality(p) = 3
order by 1
permutation
-------------
{1,2,3}
{1,3,2}
{2,1,3}
{2,3,1}
{3,1,2}
{3,2,1}
(6 rows)
Modify the final query to generate permutations of the letters of a given word:
with recursive numbers as (
select generate_series(1, 3) as i
),
rec as (
select i, array[i] as p
from numbers
union all
select n.i, p || n.i
from numbers n
join rec on cardinality(p) < 3 and not n.i = any(p)
)
select a[p[1]] || a[p[2]] || a[p[3]] as result
from rec
cross join regexp_split_to_array('act', '') as a
where cardinality(p) = 3
order by 1
result
--------
act
atc
cat
cta
tac
tca
(6 rows)
Here is a solution:
with recursive params as (
select *
from (values ('cata')) v(str)
),
nums as (
select str, 1 as n
from params
union all
select str, 1 + n
from nums
where n < length(str)
),
pos as (
select str, array[n] as poses, array_remove(array_agg(n) over (partition by str), n) as rests, 1 as lev
from nums
union all
select pos.str, array_append(pos.poses, nums.n), array_remove(rests, nums.n), lev + 1
from pos join
nums
on pos.str = nums.str and array_position(pos.rests, nums.n) > 0
where cardinality(rests) > 0
)
select distinct pos.str , string_agg(substr(pos.str, thepos, 1), '')
from pos cross join lateral
unnest(pos.poses) thepos
where cardinality(rests) = 0
group by pos.str, pos.poses;
This is quite tricky, particularly when there are repeated letters in the string. The approach taken here generates all permutations of the numbers from 1 to n, where n is the length of the string. It then uses these as indexes to extract characters from the original string.
Those who are keen will notice that this uses select distinct with group by. That seems like the easiest way to avoid duplication in the resultant strings.

Apply order by in comma separated string in oracle

I have one of the column in oracle table which has below value :
select csv_val from my_table where date='09-OCT-18';
output
==================
50,100,25,5000,1000
I want this values to be in ascending order with select query, output would looks like :
output
==================
25,50,100,1000,5000
I tried this link, but looks like it has some restriction on number of digits.
Here, I made you a modified version of the answer you linked to that can handle an arbitrary (hardcoded) number of commas. It's pretty heavy on CTEs. As with most LISTAGG answers, it'll have a 4000-char limit. I also changed your regexp to be able to handle null list entries, based on this answer.
WITH
T (N) AS --TEST DATA
(SELECT '50,100,25,5000,1000' FROM DUAL
UNION
SELECT '25464,89453,15686' FROM DUAL
UNION
SELECT '21561,68547,51612' FROM DUAL
),
nums (x) as -- arbitrary limit of 20, can be changed
(select level from dual connect by level <= 20),
splitstr (N, x, substring) as
(select N, x, regexp_substr(N, '(.*?)(,|$)', 1, x, NULL, 1)
from T
inner join nums on x <= 1 + regexp_count(N, ',')
order by N, x)
select N, listagg(substring, ',') within group (order by to_number(substring)) as sorted_N
from splitstr
group by N
;
Probably it can be improved, but eh...
Based on sample data you posted, relatively simple query would work (you need lines 3 - 7). If data doesn't really look like that, query might need adjustment.
SQL> with my_table (csv_val) as
2 (select '50,100,25,5000,1000' from dual)
3 select listagg(token, ',') within group (order by to_number(token)) result
4 from (select regexp_substr(csv_val, '[^,]+', 1, level) token
5 from my_table
6 connect by level <= regexp_count(csv_val, ',') + 1
7 );
RESULT
-------------------------
25,50,100,1000,5000
SQL>

Select where record does not exists

I am trying out my hands on oracle 11g. I have a requirement such that I want to fetch those id from list which does not exists in table.
For example:
SELECT * FROM STOCK
where item_id in ('1','2'); // Return those records where result is null
I mean if item_id '1' is not present in db then the query should return me 1.
How can I achieve this?
You need to store the values in some sort of "table". Then you can use left join or not exists or something similar:
with ids as (
select 1 as id from dual union all
select 2 from dual
)
select ids.id
from ids
where not exists (select 1 from stock s where s.item_id = ids.id);
You can use a LEFT JOIN to an in-line table that contains the values to be searched:
SELECT t1.val
FROM (
SELECT '1' val UNION ALL SELECT '2'
) t1
LEFT JOIN STOCK t2 ON t1.val = t2.item_id
WHERE t2.item_id IS NULL
First create the list of possible IDs (e.g. 0 to 99 in below query). You can use a recursive cte for this. Then select these IDs and remove the IDs already present in the table from the result:
with possible_ids(id) as
(
select 0 as id from dual
union all
select id + 1 as id from possible_ids where id < 99
)
select id from possible_ids
minus
select item_id from stock;
A primary concern of the OP seems to be a terse notation of the query, notably the set of values to test for. The straightforwwrd recommendation would be to retrieve these values by another query or to generate them as a union of queries from the dual table (see the other answers for this).
The following alternative solution allows for a verbatim specification of the test values under the following conditions:
There is a character that does not occur in any of the test values provided ( in the example that will be - )
The number of values to test stays well below 2000 (to be precise, the list of values plus separators must be written as a varchar2 literal, which imposes the length limit ). However, this should not be an actual concern - If the test involves lists of hundreds of ids, these lists should definitely be retrieved froma table/view.
Caveat
Whether this method is worth the hassle ( not to mention potential performance impacts ) is questionable, imho.
Solution
The test values will be provided as a single varchar2 literal with - separating the values which is as terse as the specification as a list argument to the IN operator. The string starts and ends with -.
'-1-2-3-156-489-4654648-'
The number of items is computed as follows:
select cond, regexp_count ( cond, '[-]' ) - 1 cnt_items from (select '-1-2-3-156-489-4654648-' cond from dual)
A list of integers up to the number of items starting with 1 can be generated using the LEVEL pseudocolumn from hierarchical queries:
select level from dual connect by level < 42;
The n-th integer from that list will serve to extract the n-th value from the string (exemplified for the 4th value) :
select substr ( cond, instr(cond,'-', 1, 4 )+1, instr(cond,'-', 1, 4+1 ) - instr(cond,'-', 1, 4 ) - 1 ) si from (select cond, regexp_count ( cond, '[-]' ) - 1 cnt_items from (select '-1-2-3-156-489-4654648-' cond from dual) );
The non-existent stock ids are generated by subtracting the set of stock ids from the set of values. Putting it all together:
select substr ( cond, instr(cond,'-',1,level )+1, instr(cond,'-',1,level+1 ) - instr(cond,'-',1,level ) - 1 ) si
from (
select cond
, regexp_count ( cond, '[-]' ) - 1 cnt_items
from (
select '-1-2-3-156-489-4654648-' cond from dual
)
)
connect by level <= cnt_items + 1
minus
select item_id from stock
;

How to convert only first letter uppercase without using Initcap in Oracle?

Is there a way to convert the first letter uppercase in Oracle SQl without using the Initcap Function?
I have the problem, that I must work with the DISTINCT keyword in SQL clause and the Initcap function doesn´t work.
Heres is my SQL example:
select distinct p.nr, initcap(p.firstname), initcap(p.lastname), ill.describtion
from patient p left join illness ill
on p.id = ill.id
where p.deleted = 0
order by p.lastname, p.firstname;
I get this error message: ORA-01791: not a SELECTed expression
When SELECT DISTINCT, you can't ORDER BY columns that aren't selected. Use column aliases instead, as:
select distinct p.nr, initcap(p.firstname) fname, initcap(p.lastname) lname, ill.describtion
from patient p left join illness ill
on p.id = ill.id
where p.deleted = 0
order by lname, fname
this would do it, but i think you need to post your query as there may be a better solution
select upper(substr(<column>,1,1)) || substr(<column>,2,9999) from dual
To change string to String, you can use this:
SELECT
regexp_replace ('string', '[a-z]', upper (substr ('string', 1, 1)), 1, 1, 'i')
FROM dual;
This assumes that the first letter is the one you want to convert. It your input text starts with a number, such as 2 strings then it won't change it to 2 Strings.
You can also use the column number instead of the name or alias:
select distinct p.nr, initcap(p.firstname), initcap(p.lastname), ill.describtion
from patient p left join illness ill
on p.id = ill.id
where p.deleted = 0
order by 3, 2;
WITH inData AS
(
SELECT 'word1, wORD2, word3, woRD4, worD5, word6' str FROM dual
),
inRows as
(
SELECT 1 as tId, LEVEL as rId, trim(regexp_substr(str, '([A-Za-z0-9])+', 1, LEVEL)) as str
FROM inData
CONNECT BY instr(str, ',', 1, LEVEL - 1) > 0
)
SELECT tId, LISTAGG( upper(substr(str, 1, 1)) || substr(str, 2) , '') WITHIN GROUP (ORDER BY rId) AS camelCase
FROM inRows
GROUP BY tId;

SQL: how to get all the distinct characters in a column, across all rows

Is there an elegant way in SQL Server to find all the distinct characters in a single varchar(50) column, across all rows?
Bonus points if it can be done without cursors :)
For example, say my data contains 3 rows:
productname
-----------
product1
widget2
nicknack3
The distinct inventory of characters would be "productwigenka123"
Here's a query that returns each character as a separate row, along with the number of occurrences. Assuming your table is called 'Products'
WITH ProductChars(aChar, remain) AS (
SELECT LEFT(productName,1), RIGHT(productName, LEN(productName)-1)
FROM Products WHERE LEN(productName)>0
UNION ALL
SELECT LEFT(remain,1), RIGHT(remain, LEN(remain)-1) FROM ProductChars
WHERE LEN(remain)>0
)
SELECT aChar, COUNT(*) FROM ProductChars
GROUP BY aChar
To combine them all to a single row, (as stated in the question), change the final SELECT to
SELECT aChar AS [text()] FROM
(SELECT DISTINCT aChar FROM ProductChars) base
FOR XML PATH('')
The above uses a nice hack I found here, which emulates the GROUP_CONCAT from MySQL.
The first level of recursion is unrolled so that the query doesn't return empty strings in the output.
Use this (shall work on any CTE-capable RDBMS):
select x.v into prod from (values('product1'),('widget2'),('nicknack3')) as x(v);
Test Query:
with a as
(
select v, '' as x, 0 as n from prod
union all
select v, substring(v,n+1,1) as x, n+1 as n from a where n < len(v)
)
select v, x, n from a -- where n > 0
order by v, n
option (maxrecursion 0)
Final Query:
with a as
(
select v, '' as x, 0 as n from prod
union all
select v, substring(v,n+1,1) as x, n+1 as n from a where n < len(v)
)
select distinct x from a where n > 0
order by x
option (maxrecursion 0)
Oracle version:
with a(v,x,n) as
(
select v, '' as x, 0 as n from prod
union all
select v, substr(v,n+1,1) as x, n+1 as n from a where n < length(v)
)
select distinct x from a where n > 0
Given that your column is varchar, it means it can only store characters from codes 0 to 255, on whatever code page you have. If you only use the 32-128 ASCII code range, then you can simply see if you have any of the characters 32-128, one by one. The following query does that, looking in sys.objects.name:
with cteDigits as (
select 0 as Number
union all select 1 as Number
union all select 2 as Number
union all select 3 as Number
union all select 4 as Number
union all select 5 as Number
union all select 6 as Number
union all select 7 as Number
union all select 8 as Number
union all select 9 as Number)
, cteNumbers as (
select U.Number + T.Number*10 + H.Number*100 as Number
from cteDigits U
cross join cteDigits T
cross join cteDigits H)
, cteChars as (
select CHAR(Number) as Char
from cteNumbers
where Number between 32 and 128)
select cteChars.Char as [*]
from cteChars
cross apply (
select top(1) *
from sys.objects
where CHARINDEX(cteChars.Char, name, 0) > 0) as o
for xml path('');
If you have a Numbers or Tally table which contains a sequential list of integers you can do something like:
Select Distinct '' + Substring(Products.ProductName, N.Value, 1)
From dbo.Numbers As N
Cross Join dbo.Products
Where N.Value <= Len(Products.ProductName)
For Xml Path('')
If you are using SQL Server 2005 and beyond, you can generate your Numbers table on the fly using a CTE:
With Numbers As
(
Select Row_Number() Over ( Order By c1.object_id ) As Value
From sys.columns As c1
Cross Join sys.columns As c2
)
Select Distinct '' + Substring(Products.ProductName, N.Value, 1)
From Numbers As N
Cross Join dbo.Products
Where N.Value <= Len(Products.ProductName)
For Xml Path('')
Building on mdma's answer, this version gives you a single string, but decodes some of the changes that FOR XML will make, like & -> &.
WITH ProductChars(aChar, remain) AS (
SELECT LEFT(productName,1), RIGHT(productName, LEN(productName)-1)
FROM Products WHERE LEN(productName)>0
UNION ALL
SELECT LEFT(remain,1), RIGHT(remain, LEN(remain)-1) FROM ProductChars
WHERE LEN(remain)>0
)
SELECT STUFF((
SELECT N'' + aChar AS [text()]
FROM (SELECT DISTINCT aChar FROM Chars) base
ORDER BY aChar
FOR XML PATH, TYPE).value(N'.[1]', N'nvarchar(max)'),1, 1, N'')
-- Allow for a lot of recursion. Set to 0 for infinite recursion
OPTION (MAXRECURSION 365)