SQL _ wildcard not working as expected. Why?

So I have this query:
select id, col1, len(col1)
from tableA
From there I wanted to grab all rows where col1 has exactly 5 characters and starts with 15:
select id, col1, len(col1)
from tableA
where col1 like '15___' -- underscore 3 times
Now col1 is an nvarchar(192), and there are values that start with 15 and are 5 characters long. But the second query always returns no rows.
Why is that?

Could it be that the field is padded with trailing spaces, such as "15123   "?
You could also try another approach:
select id, col1, len(col1)
from tableA
where col1 like '15%' AND Len(col1)=5
EDIT - FOR FUTURE REFERENCE:
For the sake of completeness: char and nchar always use the full field size, so '15' stored in a char(10) becomes 15________ ("15" plus 8 trailing spaces) because the type implicitly pads to its declared length, whereas a varchar sizes itself to what it is supplied, so '15' is simply '15'.
To get around this you could:
A) Use LTRIM/RTRIM to cut off the extra spaces:
select id, col1, len(col1)
from tableA
where rtrim(ltrim(col1)) like '15___'
B) Use LEFT() to grab only the left 5 characters:
select id, col1, len(col1)
from tableA
where left(col1,5) like '15___'
C) Cast to varchar (a rather sloppy approach):
select id, col1, len(col1)
from tableA
where CAST(col1 AS Varchar(192)) like '15___'
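To make the trailing-space effect concrete, here is a small sketch using Python's sqlite3 with made-up sample values (SQLite rather than SQL Server, but the _ wildcard works the same way: each _ must match exactly one character):

```python
import sqlite3

# Sketch of the trailing-space problem (SQLite, made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, col1 TEXT)")
conn.executemany("INSERT INTO tableA VALUES (?, ?)",
                 [(1, "15123"), (2, "15123 ")])  # row 2 has a trailing space

# The padded value is 6 characters long, so it cannot match a
# 5-character pattern...
plain = conn.execute(
    "SELECT id FROM tableA WHERE col1 LIKE '15___'").fetchall()

# ...but trimming first makes both rows match.
trimmed = conn.execute(
    "SELECT id FROM tableA WHERE rtrim(ltrim(col1)) LIKE '15___'").fetchall()

print(plain)    # [(1,)]
print(trimmed)  # [(1,), (2,)]
```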

Does this query return anything?
select id, col1, len(col1)
from tableA
where len(col1) = 5 and
left(col1, 2) = '15';
If not, then there are no values that match that pattern. My best guess would be trailing spaces, in which case this might work:
select id, col1, len(col1)
from tableA
where ltrim(rtrim(col1)) like '15___';


Postgresql subtract comma separated string in one column from another column

The format is like:

col1               | col2
-------------------|---------
V1,V2,V3,V4,V5,V6  | V4,V1,V6
V1,V2,V3           | V2,V3
I want to create another column called col3 which contains the subtraction of two columns.
What I have tried:
UPDATE myTable
SET col3=(replace(col1,col2,''))
It works well for rows like row 2, where col2 happens to be a contiguous substring of col1, but since the order of the replaced values matters, it fails for rows like row 1.
I was wondering if there's a robust way to achieve the same goal for rows like row 1.
So the desired output would be:

col1               | col2      | col3
-------------------|-----------|---------
V1,V2,V3,V4,V5,V6  | V4,V1,V6  | V2,V3,V5
V1,V2,V3           | V2,V3     | V1
Any suggestions would be appreciated!
Split the values into rows, subtract the sets, and then assemble the result back. All of it can be done in a single expression defining the new query column:
with t (col1,col2) as (values
('V1,V2,V3,V4,V5,V6','V4,V1,V6'),
('V1,V2,V3','V2,V3')
)
select col1,col2
, (
select string_agg(v,',')
from (
select v from unnest(string_to_array(t.col1,',')) as a1(v)
except
select v from unnest(string_to_array(t.col2,',')) as a2(v)
) x
)
from t
You will have to unnest the elements, apply an EXCEPT on the unnested rows, then aggregate back:
select col1,
col2,
(select string_agg(item,',' order by item)
from (
select *
from string_to_table(col1, ',') as c1(item)
except
select *
from string_to_table(col2, ',') as c2(item)
) t)
from the_table;
I wouldn't store that result in a separate column, but if you really want to introduce even more problems by storing yet another comma-separated list:
update the_table
set col3 = (select string_agg(item,',' order by item)
from (
select *
from string_to_table(col1, ',') as c1(item)
except
select *
from string_to_table(col2, ',') as c2(item)
) t)
;
string_to_table() requires Postgres 14 or newer. If you are using an older version, you need to use unnest(string_to_array(col1, ',')) instead.
If you need that a lot, consider creating a function:
create function remove_items(p_one text, p_other text)
returns text
as
$$
select string_agg(item,',' order by item)
from (
select *
from string_to_table(p_one, ',') as c1(item)
except
select *
from string_to_table(p_other, ',') as c2(item)
) t;
$$
language sql
immutable;
Then the above can be simplified to:
select col1, col2, remove_items(col1, col2)
from the_table;
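For reference, the unnest + EXCEPT + string_agg pipeline is at heart just a set difference on the comma-separated items. A quick Python sketch of the same logic (the function name mirrors the SQL function above, but this is an illustration, not the SQL implementation):

```python
# Set difference on comma-separated lists, mirroring
# string_agg(... order by item) with a sorted join.
def remove_items(one: str, other: str) -> str:
    kept = set(one.split(",")) - set(other.split(","))
    return ",".join(sorted(kept))

print(remove_items("V1,V2,V3,V4,V5,V6", "V4,V1,V6"))  # V2,V3,V5
print(remove_items("V1,V2,V3", "V2,V3"))              # V1
```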
Note: PostgreSQL is not my forte, but I thought I'd have a go at it. Try:
SELECT col1, col2,
       RTRIM(REGEXP_REPLACE(col1, CONCAT('\m(?:', REPLACE(col2, ',', '|'), ')\M,?'), '', 'g'), ',') AS col3
FROM myTable
The idea is to use a regular expression to replace all values, based on the following pattern:
\m - Word-boundary at start of word;
(?:V4|V1|V6) - A non-capture group that holds the alternatives from col2;
\M - Word-boundary at end of word;
,? - Optional comma.
When replaced with nothing, we need to clean up a possible trailing comma with RTRIM(). (In an engine or demo tool that doesn't support \m and \M, the \b word-boundary can be used instead.)
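Here is a rough illustration of the same replace-and-trim idea using Python's re module (Python uses \b instead of the PostgreSQL-specific \m/\M; the sample values are from the question):

```python
import re

# Build the alternation from col2, remove each matched value plus an
# optional trailing comma, then strip a possible dangling comma.
col1, col2 = "V1,V2,V3,V4,V5,V6", "V4,V1,V6"
pattern = r"\b(?:" + col2.replace(",", "|") + r")\b,?"
col3 = re.sub(pattern, "", col1).rstrip(",")
print(col3)  # V2,V3,V5
```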

tsql - extract only a portion of a dynamic string

I have a table with a column that holds a string such as:
xx-xx-xxx-84
xx-25-xxx-xx
xx-xx-123-xx
I want to query out only the numbers, but as you can see, the numbers are placed at different positions in the string every time. Is there a way to query only the numbers in the string?
Thank you,
I appreciate your time!
This requires repeated application of string functions. One method that helps with all the nesting is using OUTER APPLY. Something like this:
select t3.col3
from t outer apply
(select t.*, patindex('%[0-9]%', t.col) as numpos) t1 outer apply
(select t1.*, substring(t1.col, t1.numpos, len(t1.col)) as col2) t2 outer apply
(select t2.*,
(case when t2.col2 like '%-%'
then left(t2.col2, charindex('-', t2.col2) - 1)
else t2.col2
end) as col3
) t3
The easy way (assuming the strings contain only 'x', '-', and digits):
SELECT REPLACE(REPLACE(s,'x',''),'-','') FROM T
Or, if the x can be any non-digit character, then using the PATINDEX() function:
SELECT S,
SUBSTRING(S,
PATINDEX('%[0-9]%', S),
CASE WHEN PATINDEX('%[0-9][^0-9]%', S) > 0
THEN PATINDEX('%[0-9][^0-9]%', S) + 1 - PATINDEX('%[0-9]%', S)
ELSE LEN(S) + 1 - PATINDEX('%[0-9]%', S)
END) AS digit
FROM T
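As a sanity check of the intent, here is the same extract-the-digit-run idea expressed with a regular expression in Python (the sample strings are the ones from the question):

```python
import re

# PATINDEX locates the first digit and the digit-to-non-digit
# transition; a regex captures the whole digit run in one step.
rows = ["xx-xx-xxx-84", "xx-25-xxx-xx", "xx-xx-123-xx"]
digits = [re.search(r"[0-9]+", s).group() for s in rows]
print(digits)  # ['84', '25', '123']
```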

combining 3 tables where combination of 2 columns is not unique

There are 3 (will be up to 6 in the future) tables with the same columns.
I need to unify them, i.e. UNION on the same columns. In addition, I only want the rows that are NOT unique, based on the combination of 2 columns! There are a couple of examples on the net, but all of them show how to find non-unique values based on a WHERE for a single column. In my case there are 2 columns (the Col1 and Col2 combination).
And here is how I imagined final query (for 3-tables) would look like:
SELECT
*
FROM
(
SELECT * FROM table1
UNION
SELECT * FROM table2
UNION
SELECT * FROM table3
)
GROUP BY
Col1, Col2
HAVING
COUNT (*) > 1
What would be a correct way?
P.S. FYI, some single-column solutions:
Multiple NOT distinct
How to select non "unique" rows
How to Select Every Row Where Column Value is NOT Distinct
EDIT:
I have used the code from accepted answer and added additional search criteria:
ON (SOUNDEX(Merged.[name_t1]) = SOUNDEX(Multiples.[name_t1]) OR Merged.[name_t1] LIKE '%' + Multiples.[name_t1] + '%' OR Multiples.[name_t1] LIKE '%' + Merged.[name_t1] + '%')
AND (SOUNDEX(Merged.[name_t2]) = SOUNDEX(Multiples.[name_t2]) OR Merged.[name_t2] LIKE '%' + Multiples.[name_t2] + '%' OR Multiples.[name_t2] LIKE '%' + Merged.[name_t2] + '%')
Search col1 and col2:
- by SOUNDEX
- by col1 LIKE (col1 from the other table)
- by (col1 from the other table) LIKE col1
Here are the basics of a CTE-based approach:
With Merged AS
( -- CTE 1 : All Data in one table
SELECT * FROM table1
UNION ALL
SELECT * FROM table2
UNION ALL
SELECT * FROM table3
)
, Multiples AS
( -- CTE 2 : Group by the fields in common
SELECT Col1, Col2
FROM Merged
GROUP BY Col1, Col2
HAVING Count(*)>1 -- only want Groups of 2 or more
)
SELECT
Merged.*
FROM Merged INNER JOIN Multiples
-- Only return rows from Merged if they're in Multiples
ON Merged.[Col1]=Multiples.[Col1]
AND Merged.[Col2]=Multiples.[Col2]
Something like that works with my own example MS-SQL data, and it looks like SQLite syntax is the same. HTH!
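If it helps, here is the same CTE pattern run against toy data in SQLite via Python's sqlite3 (the table contents and the Extra column are made up for the demo):

```python
import sqlite3

# Three tables with identical columns; only (Col1, Col2) = ('a', 'x')
# appears more than once across them.
conn = sqlite3.connect(":memory:")
for name in ("table1", "table2", "table3"):
    conn.execute(f"CREATE TABLE {name} (Col1 TEXT, Col2 TEXT, Extra TEXT)")
conn.execute("INSERT INTO table1 VALUES ('a', 'x', 't1')")
conn.execute("INSERT INTO table2 VALUES ('a', 'x', 't2')")  # duplicate (a, x)
conn.execute("INSERT INTO table3 VALUES ('b', 'y', 't3')")  # unique (b, y)

sql = """
WITH Merged AS (
    SELECT * FROM table1
    UNION ALL
    SELECT * FROM table2
    UNION ALL
    SELECT * FROM table3
),
Multiples AS (
    SELECT Col1, Col2 FROM Merged
    GROUP BY Col1, Col2
    HAVING COUNT(*) > 1
)
SELECT Merged.* FROM Merged
JOIN Multiples ON Merged.Col1 = Multiples.Col1
              AND Merged.Col2 = Multiples.Col2
"""
rows = conn.execute(sql).fetchall()
print(sorted(rows))  # [('a', 'x', 't1'), ('a', 'x', 't2')]
```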

SQL Server SELECT shows my own value not from database

I'm looking for a way to add my own data to some columns.
select co1, col2, col3 from tbl
I want col3 to exist but show only my own value:
select co1, col2, col3=3 from tbl
The output should be:
1, 0, 3
I have a problem with CR9 and I guess this is the only way.
If you want to call the 3rd column col3, just do:
select co1, col2, '3' as col3 from tbl
By the way,
select co1, col2, col = 3 from tbl
was valid (though not recommended by Microsoft) up to SQL Server 2008 R2; as of SQL Server 2012 it is no longer accepted.
Just use "select co1, col2, '3' from tbl".

Difference two rows in a single SQL SELECT statement

I have a database table that has a structure like the one shown below:
CREATE TABLE dated_records (
recdate DATE NOT NULL,
col1 DOUBLE NOT NULL,
col2 DOUBLE NOT NULL,
col3 DOUBLE NOT NULL,
col4 DOUBLE NOT NULL,
col5 DOUBLE NOT NULL,
col6 DOUBLE NOT NULL,
col7 DOUBLE NOT NULL,
col8 DOUBLE NOT NULL
);
I want to write an SQL statement that will allow me to return a record containing the changes between two supplied dates, for specified columns - e.g. col1, col2 and col3
For example, if I wanted to see how much the values in col1, col2 and col3 have changed during the interval between two dates. A dumb way of doing this would be to select the row for each date (separately) and then difference the fields outside the db server:
SQL1 = "SELECT col1, col2, col3 FROM dated_records WHERE recdate='2001-01-01'";
SQL2 = "SELECT col1, col2, col3 FROM dated_records WHERE recdate='2001-02-01'";
However, I'm sure there is a smarter way of performing the differencing in pure SQL. I'm guessing it will involve a self join (and possibly a nested subquery), but I may be overcomplicating things, so I decided it would be better to ask the SQL experts on here how they would solve this problem most efficiently.
Ideally the SQL should be DB agnostic, but if it needs to be tied to be a particular db, then it would have to be PostgreSQL.
Just select the two rows, join them into one, and subtract the values:
select d1.recdate, d2.recdate,
(d2.col1 - d1.col1) as delta_col1,
(d2.col2 - d1.col2) as delta_col2,
...
from (select *
from dated_records
where recdate = <date1>
) d1 cross join
(select *
from dated_records
where recdate = <date2>
) d2
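As a quick sanity check, here is the cross-join delta run in SQLite via Python's sqlite3, using a cut-down two-column version of the table with made-up values:

```python
import sqlite3

# Two dated rows; the cross join pairs them so the deltas can be
# computed in a single SELECT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dated_records (recdate TEXT, col1 REAL, col2 REAL)")
conn.execute("INSERT INTO dated_records VALUES ('2001-01-01', 10.0, 5.0)")
conn.execute("INSERT INTO dated_records VALUES ('2001-02-01', 17.5, 3.0)")

row = conn.execute("""
    SELECT d1.recdate, d2.recdate,
           d2.col1 - d1.col1 AS delta_col1,
           d2.col2 - d1.col2 AS delta_col2
    FROM (SELECT * FROM dated_records WHERE recdate = '2001-01-01') d1
    CROSS JOIN
         (SELECT * FROM dated_records WHERE recdate = '2001-02-01') d2
""").fetchone()
print(row)  # ('2001-01-01', '2001-02-01', 7.5, -2.0)
```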
I think that if what you want is the rows that don't intersect between the two select queries, you can use the EXCEPT operator:
The EXCEPT operator returns the rows that are in the first result set
but not in the second.
So your two queries become one single query with the EXCEPT operator joining them:
SELECT col1, col2, col3 FROM dated_records WHERE recdate='2001-01-01'
EXCEPT
SELECT col1, col2, col3 FROM dated_records WHERE recdate='2001-02-01'
Assuming the table also has a sequential id column (not shown in the DDL above):
SELECT
COALESCE
(a.col1 -
(
SELECT b.col1
FROM dated_records b
WHERE b.id = a.id + 1
),
a.col1)
FROM dated_records a
WHERE recdate='2001-01-01';
You could use window functions plus DISTINCT:
SELECT DISTINCT
first_value(recdate) OVER () AS date1
,last_value(recdate) OVER () AS date2
,last_value(col1) OVER () - first_value(col1) OVER () AS delta1
,last_value(col2) OVER () - first_value(col2) OVER () AS delta2
...
FROM dated_records
WHERE recdate IN ('2001-01-01', '2001-01-03')
For any two days. Uses a single index or table scan, so it should be fast.
I did not order the window, but all calculations use the same window, so the values are consistent.
This solution can easily be generalized for calculations between n rows. You may want to use nth_value() from the Postgres arsenal of window functions in this case.
This seemed a quicker way to write it if you are looking for a simple delta. Note that first() and last() are not built-in PostgreSQL aggregates; you would have to define them yourself (custom first/last aggregate definitions are available on the PostgreSQL wiki).
SELECT first(col1) - last(col1) AS delta_col1
, first(col2) - last(col2) AS delta_col2
FROM dated_records WHERE recdate IN ('2001-02-01', '2001-01-01')
You may not know whether the first row or the second row comes first, but you can always wrap the answer in abs(first(col1) - last(col1)).