Select records where column has n character occurrences - sql

I was wondering if this is possible in sqlite.
SELECT * FROM tbl WHERE substr_count(f, '*') = 5
It should return records that have 5 asterisks in the "f" column, like
a*b**c**
****a*
and so on

SELECT * FROM tbl WHERE length(f) - length(replace(f, '*', '')) = 5
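The length-difference idea can be sketched end to end as follows. This is a minimal illustration in SQLite; the table and column names follow the question, while the sample rows are made up for the example.

```sql
-- Hypothetical sample data; tbl and f are the names used in the question.
CREATE TABLE tbl (f TEXT);
INSERT INTO tbl (f) VALUES ('a*b**c**'), ('****a*'), ('a*b'), ('no stars');

-- Removing every '*' shortens the string by exactly the number of
-- asterisks it contained, so the length difference is the count.
SELECT f,
       length(f) - length(replace(f, '*', '')) AS star_count
FROM tbl
WHERE length(f) - length(replace(f, '*', '')) = 5;
```

With this sample data the query should return the rows 'a*b**c**' and '****a*', each with a star_count of 5.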

This solution is easy if you have a tally or numbers table which simply contains a sequential list of integers. This would be a table you populated once but has many uses. With that you have:
Create Table Tally ( N int );
Insert Tally( N )
...
Select Z.<PrimaryKeyCol>, Sum( Z.Val )
From (
Select <PrimaryKeyCol>, 1 As Val
From tbl
Cross Join Tally As T
Where substr( tbl.f, T.N, 1 ) = '*'
) As Z
Group By Z.<PrimaryKeyCol>
Having Sum( Z.Val ) = 5
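One possible way to fill such a table in SQLite is sketched below. It assumes recursive common table expressions are available (SQLite 3.8.3 or later) and that 1000 rows are enough to cover the longest value of f. The upper bound and the INTEGER PRIMARY KEY choice are illustrative assumptions, not part of the answer above.

```sql
-- Assumption: a single integer column, as in the answer's Tally table.
CREATE TABLE Tally (N INTEGER PRIMARY KEY);

-- Generate 1..1000 with a recursive CTE and store the values once.
-- Raise the bound if f can be longer than 1000 characters.
WITH RECURSIVE seq(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000
)
INSERT INTO Tally (N)
SELECT n FROM seq;
```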

Related

How do I create a list of all possible anagrams of a word/string in PostgreSQL

How do I create a list of all possible anagrams of a word/string in PostgreSQL.
For example if String is 'act'
then the desired output should be:
act,
atc,
cta,
cat,
tac,
tca
I have a table, 'tbl_words', which contains millions of words.
I then want to search my database table for only the valid words from this list of anagrams.
For example, the valid words in the list above are: act, cat.
Is there any way to do this?
Update 1:
I need the output to contain all permutations of the given word. Any ideas?
The following query generates all permutations of a 3-element set:
with recursive numbers as (
select generate_series(1, 3) as i
),
rec as (
select i, array[i] as p
from numbers
union all
select n.i, p || n.i
from numbers n
join rec on cardinality(p) < 3 and not n.i = any(p)
)
select p as permutation
from rec
where cardinality(p) = 3
order by 1
permutation
-------------
{1,2,3}
{1,3,2}
{2,1,3}
{2,3,1}
{3,1,2}
{3,2,1}
(6 rows)
Modify the final query to generate permutations of the letters of a given word:
with recursive numbers as (
select generate_series(1, 3) as i
),
rec as (
select i, array[i] as p
from numbers
union all
select n.i, p || n.i
from numbers n
join rec on cardinality(p) < 3 and not n.i = any(p)
)
select a[p[1]] || a[p[2]] || a[p[3]] as result
from rec
cross join regexp_split_to_array('act', '') as a
where cardinality(p) = 3
order by 1
result
--------
act
atc
cat
cta
tac
tca
(6 rows)
Here is a solution:
with recursive params as (
select *
from (values ('cata')) v(str)
),
nums as (
select str, 1 as n
from params
union all
select str, 1 + n
from nums
where n < length(str)
),
pos as (
select str, array[n] as poses, array_remove(array_agg(n) over (partition by str), n) as rests, 1 as lev
from nums
union all
select pos.str, array_append(pos.poses, nums.n), array_remove(rests, nums.n), lev + 1
from pos join
nums
on pos.str = nums.str and array_position(pos.rests, nums.n) > 0
where cardinality(rests) > 0
)
select distinct pos.str , string_agg(substr(pos.str, thepos, 1), '')
from pos cross join lateral
unnest(pos.poses) thepos
where cardinality(rests) = 0
group by pos.str, pos.poses;
This is quite tricky, particularly when there are repeated letters in the string. The approach taken here generates all permutations of the numbers from 1 to n, where n is the length of the string. It then uses these as indexes to extract characters from the original string.
Those who are keen will notice that this uses select distinct with group by. That seems like the easiest way to avoid duplication in the resultant strings.
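To get from the permutations to the valid words the question asks for, the generated strings can be joined back to the word table. The following is a rough PostgreSQL sketch of that step, not the method of either answer. It assumes tbl_words has a text column named word (the question does not give the column name) and hard-codes 'act' as the input.

```sql
with recursive letters as (
    -- A NULL delimiter makes string_to_array split into single characters;
    -- WITH ORDINALITY numbers them so repeated letters stay distinguishable.
    select ch, pos
    from unnest(string_to_array('act', null)) with ordinality as u(ch, pos)
),
perm as (
    select ch as word, array[pos] as used
    from letters
    union all
    -- Append any letter whose position has not been used yet.
    select perm.word || letters.ch, perm.used || letters.pos
    from perm
    join letters on letters.pos <> all (perm.used)
)
select distinct perm.word
from perm
join tbl_words w on w.word = perm.word   -- assumed column name: word
where length(perm.word) = length('act');
```

The number of permutations grows factorially with the word length, so this is only practical for short inputs.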

select statement to list numbers in range

In DB2, I have this query to list numbers 1-x:
select level from SYSIBM.SYSDUMMY1 connect by level <= "some number"
But this maxes out due to SQL20450N Recursion limit exceeded within a hierarchical query.
How can I generate a list of numbers between 1 and x using a select statement when x is not known until runtime?
I found an answer based on this post:
WITH d AS
(SELECT LEVEL - 1 AS dig FROM SYSIBM.SYSDUMMY1 CONNECT BY LEVEL <= 10)
SELECT t1.n
FROM (SELECT (d7.dig * 1000000) +
(d6.dig * 100000) +
(d5.dig * 10000) +
(d4.dig * 1000) +
(d3.dig * 100) +
(d2.dig * 10) +
d1.dig AS n
FROM d d1
CROSS JOIN d d2
CROSS JOIN d d3
CROSS JOIN d d4
CROSS JOIN d d5
CROSS JOIN d d6
CROSS JOIN d d7) t1
JOIN ("subselect that returns desired value as i") t2
ON t1.n <= t2.i
ORDER BY t1.n
This is how I usually create lists. For your example:
with numberlist (num) as
(
select min(1) from anytable
union all
select num + 1 from numberlist
where num < x -- stop once num reaches x, so the list is 1..x
)
select num from numberlist
I did something like this when I wanted a list of values to correspond with months:
with t1 (mon) as (
values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)
)
select * from t1
It seems a bit kludgy, but for a small list like 1-12, or even 1-50, it did what I needed it to.
It's nice to see someone else tagging their questions with DB2.
If you have any table known to have more than x rows, you can always do:
select * from (
select row_number() over () num
from my_big_table
) where num <= x
or, per bhamby's suggestion:
select row_number() over () num
from my_big_table
fetch first X rows only
For DB2 you can use recursive common table expressions (cf. IBM documentation on recursive CTE):
with max(num) as (
select 1 from sysibm.sysdummy1
)
,result (num) as (
select num from max
union ALL
select result.num+1
from result
where result.num<=100
)
select * from result;

Query Split string into rows

I have a table that looks like this:
ID Value
1 1,10
2 7,9
I want my result to look like this:
ID Value
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
2 7
2 8
2 9
I'm after both a range between 2 numbers with , as the delimiter (there can only be one delimiter in the value) and how to split this into rows.
Splitting the comma separated numbers is a small part of this problem. The parsing should be done in the application and the range stored in separate columns. For more than one reason: Storing numbers as strings is a bad idea. Storing two attributes in a single column is a bad idea. And, actually, storing unsanitized user input in the database is also often a bad idea.
In any case, one way to generate the list of numbers is to use a recursive CTE:
with t as (
select t.*, cast(left(value, charindex(',', value) - 1) as int) as first,
cast(substring(value, charindex(',', value) + 1, 100) as int) as last
from table t
),
cte as (
select t.id, t.first as value, t.last
from t
union all
select cte.id, cte.value + 1, cte.last
from cte
where cte.value < cte.last
)
select id, value
from cte
order by id, value;
You may need to fiddle with the value of MAXRECURSION if the ranges are really big.
Any table that has a field holding multiple values like this is a design problem. The only ways to deal with such records as they stand are to split the values on the delimiter and put them into a temporary table, implement custom splitting code, or integrate a CTE as noted. Otherwise, redesign the original table to put the comma-delimited values into separate fields, e.g.
ID LOWLIMIT HILIMIT
1 1 10
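With the limits stored in separate columns, no string parsing is needed to expand a range. The following is a small T-SQL sketch of that redesign. The table name dbo.Ranges is hypothetical, and the column names follow the layout shown above.

```sql
-- Hypothetical redesigned table: one column per limit instead of '1,10'.
CREATE TABLE dbo.Ranges (
    ID       INT PRIMARY KEY,
    LOWLIMIT INT NOT NULL,
    HILIMIT  INT NOT NULL
);
INSERT INTO dbo.Ranges (ID, LOWLIMIT, HILIMIT) VALUES (1, 1, 10), (2, 7, 9);

-- Expand each range into one row per value with a recursive CTE.
WITH cte AS (
    SELECT ID, LOWLIMIT AS Value, HILIMIT
    FROM dbo.Ranges
    UNION ALL
    SELECT ID, Value + 1, HILIMIT
    FROM cte
    WHERE Value < HILIMIT
)
SELECT ID, Value
FROM cte
ORDER BY ID, Value
OPTION (MAXRECURSION 0);  -- 0 removes the default limit of 100 levels
```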
This is similar to Gordon Linoff's variant, with some differences:
--create temp table for data sample
CREATE TABLE #Yourdata ( id INT, VALUE VARCHAR(20) )
INSERT #Yourdata
( id, VALUE )
VALUES ( 1, '1,10' ),
( 2, '7,9' )
--final query
;WITH Tally
AS ( SELECT MIN(CONVERT(INT, SUBSTRING(y.VALUE, 1, CHARINDEX(',', y.value) - 1))) AS MinV ,
MAX(CONVERT(INT, SUBSTRING(y.VALUE, CHARINDEX(',', y.value) + 1, 18))) AS MaxV
FROM #yourdata AS y
UNION ALL
SELECT MinV = MinV + 1 , MaxV
FROM Tally
WHERE MinV < Maxv
)
SELECT y.id , t.minV AS value
FROM #yourdata AS y
JOIN tally AS t ON t.MinV BETWEEN CONVERT(INT, SUBSTRING(y.VALUE, 1, CHARINDEX(',', y.value) - 1))
AND CONVERT(INT, SUBSTRING(y.VALUE, CHARINDEX(',', y.value) + 1, 18))
ORDER BY id, minV
OPTION ( MAXRECURSION 999 ) --change it if required

SQL query for finding first missing sequence string (prefix+no)

T-SQL query for finding first missing sequence string (prefix+no)
The sequence can have a prefix followed by a running number.
An example sequence:
ID
-------
AUTO_500
AUTO_501
AUTO_502
AUTO_504
AUTO_505
AUTO_506
AUTO_507
AUTO_508
In the example above, the missing sequence value is AUTO_503; if nothing is missing, the query must return the next value in the sequence.
The starting number must also be specified (500 in this case), and the prefix can be null, i.e. the sequence consists of numbers only.
You could LEFT JOIN the id numbers on shifted(+1) values to find gaps in sequential order:
SELECT
MIN(a.offsetnum) AS first_missing_num
FROM
(
SELECT 500 AS offsetnum
UNION
SELECT CAST(REPLACE(id, 'AUTO_', '') AS INT) + 1
FROM tbl
) a
LEFT JOIN
(SELECT CAST(REPLACE(id, 'AUTO_', '') AS INT) AS idnum FROM tbl) b ON a.offsetnum = b.idnum
WHERE
a.offsetnum >= 500 AND b.idnum IS NULL
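A self-contained variation of the shifted-join idea that also puts the prefix back on the result might look like the sketch below. The table variable @ids and its sample rows are made up for illustration, and the prefix 'AUTO_' and starting number 500 are hard-coded as in the question.

```sql
-- Hypothetical sample data with a gap at 503.
DECLARE @ids TABLE (id VARCHAR(55));
INSERT INTO @ids (id)
VALUES ('AUTO_500'), ('AUTO_501'), ('AUTO_502'), ('AUTO_504'), ('AUTO_505');

-- Candidates are the starting number plus every existing number + 1;
-- the smallest candidate that is not itself present is the answer.
SELECT 'AUTO_' + CAST(MIN(a.candidate) AS VARCHAR(20)) AS first_missing_id
FROM (
    SELECT 500 AS candidate
    UNION
    SELECT CAST(REPLACE(id, 'AUTO_', '') AS INT) + 1 FROM @ids
) AS a
LEFT JOIN (
    SELECT CAST(REPLACE(id, 'AUTO_', '') AS INT) AS idnum FROM @ids
) AS b ON a.candidate = b.idnum
WHERE a.candidate >= 500
  AND b.idnum IS NULL;
```

For the sample rows this should yield AUTO_503. If there were no gap, the smallest unmatched candidate would be the highest existing number plus one, which is the "next sequence" case from the question.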
Using a recursive CTE to dynamically generate the sequence between the min and max of the ID numbers may overcomplicate things a bit, but it seems to work:
CREATE TABLE tbl (
id VARCHAR(55)
);
INSERT INTO tbl VALUES
('AUTO_500'),
('AUTO_501'),
('AUTO_502'),
('AUTO_504'),
('AUTO_505'),
('AUTO_506'),
('AUTO_507'),
('AUTO_508'),
('509');
;WITH
data_cte(id)AS
(SELECT [id] = CAST(REPLACE(id, 'AUTO_', '') AS INT) FROM tbl)
,maxmin_cte(minId, maxId)AS
(SELECT [minId] = min(id),[maxId] = max(id) FROM data_cte)
,recursive_cte(n) AS
(
SELECT [minId] n from maxmin_cte
UNION ALL
SELECT (1 + n) n FROM recursive_cte WHERE n < (SELECT [maxId] from maxmin_cte)
)
SELECT x.n
FROM
recursive_cte x
LEFT OUTER JOIN data_cte y ON
x.n = y.id
WHERE y.id IS NULL
Check this solution. Here you just need to add an identity column.
CREATE TABLE tbl (
id VARCHAR(55),
idn int identity(0,1)
);
INSERT INTO tbl VALUES
('AUTO_500'),
('AUTO_501'),
('AUTO_502'),
('AUTO_504'),
('AUTO_505'),
('AUTO_506'),
('AUTO_507'),
('AUTO_508'),
('509');
SELECT min(idn+500) FROM tbl where 'AUTO_'+cast((idn+500) as varchar)<>id
try this:
with cte as(
select cast(REPLACE(id,'AUTO_','') as int)-500+1 [diff],ROW_NUMBER()
over(order by cast(REPLACE(id,'AUTO_','') as int)) [rnk] from tbl)
select top 1 'AUTO_'+cast(500+rnk as varchar(50)) [ID] from cte
where [diff]=[rnk]
order by rnk desc
I had a similar situation, where we had R_CDs that looked like R01005:
;with Active_R_CD (R_CD)
As
(
Select Distinct Cast(Replace(R_CD,'R', ' ') as Int)
from table
where stat = 1)
select Arc.R_CD + 1 as 'Gaps in R Code'
from Active_R_CD as Arc
left outer join Active_R_CD as r on ARC.R_CD + 1 = R.R_CD
where R.R_CD is null
order by 1

SQL: how to get all the distinct characters in a column, across all rows

Is there an elegant way in SQL Server to find all the distinct characters in a single varchar(50) column, across all rows?
Bonus points if it can be done without cursors :)
For example, say my data contains 3 rows:
productname
-----------
product1
widget2
nicknack3
The distinct inventory of characters would be "productwigenka123"
Here's a query that returns each character as a separate row, along with the number of occurrences, assuming your table is called 'Products':
WITH ProductChars(aChar, remain) AS (
SELECT LEFT(productName,1), RIGHT(productName, LEN(productName)-1)
FROM Products WHERE LEN(productName)>0
UNION ALL
SELECT LEFT(remain,1), RIGHT(remain, LEN(remain)-1) FROM ProductChars
WHERE LEN(remain)>0
)
SELECT aChar, COUNT(*) FROM ProductChars
GROUP BY aChar
To combine them all to a single row, (as stated in the question), change the final SELECT to
SELECT aChar AS [text()] FROM
(SELECT DISTINCT aChar FROM ProductChars) base
FOR XML PATH('')
The above uses a nice hack I found here, which emulates the GROUP_CONCAT from MySQL.
The first level of recursion is unrolled so that the query doesn't return empty strings in the output.
Use this (it should work on any CTE-capable RDBMS with minor syntax changes):
select x.v into prod from (values('product1'),('widget2'),('nicknack3')) as x(v);
Test Query:
with a as
(
select v, '' as x, 0 as n from prod
union all
select v, substring(v,n+1,1) as x, n+1 as n from a where n < len(v)
)
select v, x, n from a -- where n > 0
order by v, n
option (maxrecursion 0)
Final Query:
with a as
(
select v, '' as x, 0 as n from prod
union all
select v, substring(v,n+1,1) as x, n+1 as n from a where n < len(v)
)
select distinct x from a where n > 0
order by x
option (maxrecursion 0)
Oracle version:
with a(v,x,n) as
(
select v, '' as x, 0 as n from prod
union all
select v, substr(v,n+1,1) as x, n+1 as n from a where n < length(v)
)
select distinct x from a where n > 0
Given that your column is varchar, it means it can only store characters from codes 0 to 255, on whatever code page you have. If you only use the 32-128 ASCII code range, then you can simply see if you have any of the characters 32-128, one by one. The following query does that, looking in sys.objects.name:
with cteDigits as (
select 0 as Number
union all select 1 as Number
union all select 2 as Number
union all select 3 as Number
union all select 4 as Number
union all select 5 as Number
union all select 6 as Number
union all select 7 as Number
union all select 8 as Number
union all select 9 as Number)
, cteNumbers as (
select U.Number + T.Number*10 + H.Number*100 as Number
from cteDigits U
cross join cteDigits T
cross join cteDigits H)
, cteChars as (
select CHAR(Number) as Char
from cteNumbers
where Number between 32 and 128)
select cteChars.Char as [*]
from cteChars
cross apply (
select top(1) *
from sys.objects
where CHARINDEX(cteChars.Char, name, 0) > 0) as o
for xml path('');
If you have a Numbers or Tally table which contains a sequential list of integers you can do something like:
Select Distinct '' + Substring(Products.ProductName, N.Value, 1)
From dbo.Numbers As N
Cross Join dbo.Products
Where N.Value <= Len(Products.ProductName)
For Xml Path('')
If you are using SQL Server 2005 and beyond, you can generate your Numbers table on the fly using a CTE:
With Numbers As
(
Select Row_Number() Over ( Order By c1.object_id ) As Value
From sys.columns As c1
Cross Join sys.columns As c2
)
Select Distinct '' + Substring(Products.ProductName, N.Value, 1)
From Numbers As N
Cross Join dbo.Products
Where N.Value <= Len(Products.ProductName)
For Xml Path('')
Building on mdma's answer, this version gives you a single string, but decodes some of the changes that FOR XML will make, like & -> &amp;.
WITH ProductChars(aChar, remain) AS (
SELECT LEFT(productName,1), RIGHT(productName, LEN(productName)-1)
FROM Products WHERE LEN(productName)>0
UNION ALL
SELECT LEFT(remain,1), RIGHT(remain, LEN(remain)-1) FROM ProductChars
WHERE LEN(remain)>0
)
SELECT (
SELECT N'' + aChar AS [text()]
FROM (SELECT DISTINCT aChar FROM ProductChars) base
ORDER BY aChar
FOR XML PATH(''), TYPE).value(N'.[1]', N'nvarchar(max)')
-- Allow for a lot of recursion. Set to 0 for unlimited recursion
OPTION (MAXRECURSION 365)