LIKE with integers, in SQL - sql

Can I replace the = operator with LIKE for integers?
For example, are the following two queries the same thing?
select * from FOOS where FOOID like 2
-- and
select * from FOOS where FOOID = 2
I'd prefer to use LIKE instead of = because I could use % when I have no filter for FOOID...
SQL Server 2005.
EDIT 1 @Martin

select * from FOOS where FOOID like 2
should be avoided as it will cause both sides to be implicitly cast as varchar and mean that an index cannot be used to satisfy the query.
CREATE TABLE #FOOS
(
FOOID INT PRIMARY KEY,
Filler CHAR(1000)
)
INSERT INTO #FOOS(FOOID)
SELECT DISTINCT number
FROM master..spt_values
SELECT * FROM #FOOS WHERE FOOID LIKE 2
SELECT * FROM #FOOS WHERE FOOID = 2
DROP TABLE #FOOS
Plans (notice the estimated costs)
Another way of seeing the difference in costs is to add SET STATISTICS IO ON
You see that the first version returns something like
Table '#FOOS__000000000015'. Scan count 1, logical reads 310
The second version returns
Table '#FOOS__000000000015'. Scan count 0, logical reads 2
This is because the reads required for the seek on this index are proportional to the index depth, whereas the reads required for the scan are proportional to the number of pages in the index. The bigger the table gets, the larger the discrepancy between these two numbers becomes. You can see both figures by running the following.
SELECT index_depth, page_count
FROM
sys.dm_db_index_physical_stats (2,object_id('tempdb..#FOOS'), DEFAULT,DEFAULT, DEFAULT)
WHERE object_id = object_id('tempdb..#FOOS') /*In case it hasn't been created yet*/

Use a CASE statement to convert an input string to an integer. Convert the wildcard % to a NULL. This will give better performance than implicitly converting the entire int column to string.
CREATE PROCEDURE GetFoos(@fooIdOrWildcard varchar(100))
AS
BEGIN
DECLARE @fooId int
SET @fooId =
CASE
-- Case 1 - Wildcard
WHEN @fooIdOrWildcard = '%'
THEN NULL
-- Case 2 - Integer
WHEN LEN(@fooIdOrWildcard) BETWEEN 1 AND 9
AND @fooIdOrWildcard NOT LIKE '%[^0-9]%'
THEN CAST(@fooIdOrWildcard AS int)
-- Case 3 - Invalid input
ELSE 0
END
SELECT FooId, Name
FROM dbo.Foos
WHERE FooId BETWEEN COALESCE(@fooId, 1) AND COALESCE(@fooId, 2147483647)
END

Yes, you can just use it:
SELECT *
FROM FOOS
WHERE FOOID like 2
or
SELECT *
FROM FOOS
WHERE FOOID like '%'
Integers will be implicitly converted into strings.
Note that neither of these conditions is sargable, i.e. able to use an index on fooid. This will always result in a full table scan (or a full index scan on fooid).

This is a late comment, but other people may be looking for the same thing, so I thought I should share the solution I found.
A short description of the problem:
I needed to use a wildcard with an integer data type. I am using SQL Server, so my syntax is for SQL Server. I have a column that holds a department number, and I wanted to pass a variable from a drop-down menu on my page. The menu also has an 'All' option, in which case I wanted to pass '%' as the parameter. I was using this:
select * from table1 where deptNo Like @DepartmentID
It worked when I passed a number but not for %, because SQL Server implicitly converts @DepartmentID to int (as my deptNo is of type int).
So I cast deptNo to varchar, and that fixed the issue:
select * from table1 where CAST(deptNo AS varchar(2)) Like @DepartmentID
This works both when I pass a number like 4 and when I pass %.

Use NULL as the parameter value instead of % for your wildcard condition
select * from table1 where (@DepartmentID IS NULL OR deptNo = @DepartmentID)
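Here is a minimal sketch of that pattern, assuming a table variable @table1 that stands in for table1. The deptName column and the sample rows are made up for illustration and are not from the original post.

```sql
-- Hypothetical sample data standing in for table1 (not from the original post).
DECLARE @table1 TABLE (deptNo int NOT NULL, deptName varchar(20) NOT NULL);
INSERT INTO @table1 (deptNo, deptName) VALUES (4, 'Sales');
INSERT INTO @table1 (deptNo, deptName) VALUES (7, 'Support');

-- NULL plays the role of the 'All' option; a number filters to one department.
DECLARE @DepartmentID int;
SET @DepartmentID = NULL;

SELECT deptNo, deptName
FROM @table1
WHERE (@DepartmentID IS NULL OR deptNo = @DepartmentID);  -- returns both rows

SET @DepartmentID = 4;

SELECT deptNo, deptName
FROM @table1
WHERE (@DepartmentID IS NULL OR deptNo = @DepartmentID);  -- returns only deptNo 4
```

Because deptNo is compared as an int, no cast of the column is needed. On a large table it is still worth checking the execution plan, since one cached plan has to serve both the filtered and the unfiltered case.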

Related

T-SQL Stored Procedure: Performance of select count(*) vs. select count([uniqueId])

So, I'm looking at a stored procedure here, which has more than one line like the following pseudocode:
if(select count(*) > 0)
...
on tables having a unique id (or identifier, for making it more general).
Now, in terms of performance, is it more performant to change this clause
to
if(select count([uniqueId]) > 0)
...
where uniqueId is, e.g., an Idx containing double values?
An example:
Consider a table like Idx (double) | Name (String) | Address (String)
Now the 'Idx' is a foreign key which I want to join in a stored procedure.
So, in terms of performance: what is better here?
if(select count(*) > 0)
...
or
if(select count(Idx) > 0)
...
Or does the SQL engine change select count(*) to select count(Idx) internally, so that we do not have to bother about this? At first sight, I'd say that select count(Idx) would be more performant.
The two are slightly different. count(*) counts rows. count([uniqueid]) counts the number of non-NULL values for uniqueid. Because a unique constraint allows a NULL value, SQL Server actually needs to read the column. This could add microseconds of time to a query, particularly if the page with the id is not already in memory. This also gives SQL Server more opportunities to optimize count(*).
As @lad2025 writes in a comment, the performant solution is to use if (exists ...).
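A rough sketch of the difference, assuming a throwaway temp table #Customers with made-up columns and data (the table in the question uses a double Idx; an int is used here only to keep the example short):

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE #Customers (Idx int NOT NULL PRIMARY KEY, Name varchar(50) NOT NULL);
INSERT INTO #Customers (Idx, Name) VALUES (1, 'Alice');

-- Logically, COUNT(*) counts every qualifying row before comparing with 0
-- (the optimizer may or may not rewrite this into an existence check).
IF (SELECT COUNT(*) FROM #Customers WHERE Name = 'Alice') > 0
    PRINT 'found via COUNT(*)';

-- EXISTS only asks whether at least one qualifying row is there.
IF EXISTS (SELECT 1 FROM #Customers WHERE Name = 'Alice')
    PRINT 'found via EXISTS';

DROP TABLE #Customers;
```

Both branches fire for this data. The EXISTS form states the intent directly and does not depend on the optimizer recognising the COUNT(*) > 0 idiom.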
SELECT t1.*
FROM Table1 t1
JOIN Table2 t2 ON t2.idx = t1.idx
will give you only the rows in t1 that match an idx value in Table2. I'm not sure there is a good reason to do an if(select count...).
If you are really interested in the performance of something like this, just create a temp table with a million rows and give it a go:
CREATE TABLE #TempTable (id int identity, txt varchar(50))
GO
INSERT #TempTable (txt) VALUES (@@IDENTITY)
GO 1000000

Indexing a LEFT operation in SQL Server

I have a database table of E.164 calling codes (e.g. 1 for USA/Canada, 44 for the United Kingdom, etc). Here's the table design:
CREATE TABLE CountryCosts (
CallingCode varchar(5) NOT NULL PRIMARY KEY,
IsFree bit NOT NULL
)
I have a scalar function which accepts a full phone number, checks whether the number begins with any CallingCode in the table, and indicates whether IsFree is true or not.
SELECT
TOP 1
CallingCode,
GratisPermitted
FROM
CountryCosts
WHERE
GratisPermitted = 1
AND
LEFT( @recipient, LEN( CallingCode ) ) = CallingCode
(Variations exist, including using SELECT COUNT(1) inside a SELECT CASE WHEN EXISTS and using @recipient LIKE CONCAT( CallingCode, '%' ) as the predicate)
The Actual Execution Plan reports the main expense is a Clustered Index Scan of the Clustered PK index.
I want to know if there is any way to improve the performance by adding another index. Is there any index on varchar columns that SQL Server would use to optimize the LEFT predicate?

Solution to avoid non-sargable argument in where clause

In the code_list CTE in this query I have a row constructor that will eventually take any number of arguments. The column icd in the patient_codes CTE is a five-digit identifier that is more descriptive than the three-digit codes in the row constructor. The table icd_patient has 100 million rows, so for performance's sake I would like to filter the rows of this table before I do any further work. I have
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select distinct icd,pat_id,id
from icd_patient
where icd in (select icd from code_list)
)
select distinct pat_id from patient_codes
The problem, however, is that in the icd_patient table all of the icd values are five digits and more descriptive. If I look at the execution plan of this query, it's pretty streamlined. If I do
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select substring(icd,1,3) as icd,pat_id
from icd_patient2
where substring(icd,1,3) in (select * from code_list)
)
select * from patient_codes
this of course has a large performance impact because of the substring expression in the where clause. Does something akin to LIKE IN exist so that I can take advantage of my indexes?
Index on icd_patient
CREATE NONCLUSTERED INDEX [ix_icd_patient] ON [dbo].[icd_patient2]
(
[pat_id] ASC
)
INCLUDE ( [id],
This much simpler query should be better than (or, at worst, the same as) your existing query.
select pat_id
FROM dbo.icd_patient
where icd LIKE '707%'
OR icd LIKE '250%'
GROUP BY pat_id;
Note that sargability only matters if there is actually an index on this column.
An alternative (since OR can sometimes give the optimizer fits):
SELECT pat_id FROM
(
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '707%'
UNION ALL
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '250%'
) AS x
GROUP BY pat_id;
To make this extensible beyond a handful of OR conditions, I would use a table-valued parameter (TVP).
CREATE TYPE dbo.StringPatterns AS TABLE(s VARCHAR(3) PRIMARY KEY);
Then your stored procedure could say:
CREATE PROCEDURE dbo.whatever
@sp dbo.StringPatterns READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT p.pat_id
FROM dbo.icd_patient AS p
INNER JOIN @sp AS sp
ON p.icd LIKE sp.s + '%'
GROUP BY p.pat_id;
END
Then you can pass in your set of three-character substrings from a DataTable or other collection in C#. From T-SQL just as an example:
DECLARE @p dbo.StringPatterns;
INSERT @p VALUES('707'),('250');
EXEC dbo.whatever @sp = @p;
Something like LIKE IN does not exist. The following is sargable:
select *
from icd_patient
where icd like '70700%' or
icd like '25002%'
This works because LIKE with a constant initial substring is a special case for SQL Server. It does not work when the strings on the right are variables.
One solution is to create an indexed view on the icd_patient table with an index on the first five characters of the icd code.
Using "IN" makes that part of a command non-sargable on both sides. End of discussion.
The substring version completely changes what the query returns, and it remains non-sargable.
Any "fix" should exactly match the original results. The actual fix is one of the following: join the cte so that all five characters match; put three characters in the cte and match those in a join; or put four characters in the cte, where the fourth is "%", and join using LIKE.
A "like" pattern that starts with "%" makes the search more expensive, but it would still use the index to find the value, because reading through the index should take fewer reads: the full table row is only fetched when a search is successful.
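The last variant (patterns ending in "%" in the cte, joined with LIKE) could look roughly like the sketch below. The temp table #icd_patient_demo, its index, and its rows are invented stand-ins for icd_patient, used only so the example runs on its own.

```sql
-- Hypothetical miniature of icd_patient, for illustration only.
CREATE TABLE #icd_patient_demo
(
    id     int        NOT NULL PRIMARY KEY,
    pat_id int        NOT NULL,
    icd    varchar(5) NOT NULL
);
CREATE INDEX ix_demo_icd ON #icd_patient_demo (icd) INCLUDE (pat_id);

INSERT INTO #icd_patient_demo (id, pat_id, icd)
VALUES (1, 10, '70700'), (2, 10, '25002'), (3, 11, '70715'), (4, 12, '41401');

-- Three-digit codes with a trailing wildcard, matched against the
-- five-digit icd column with LIKE instead of SUBSTRING.
;WITH code_list(pattern) AS
(
    SELECT x.pattern
    FROM (VALUES ('707%'), ('250%')) AS x(pattern)
)
SELECT DISTINCT p.pat_id
FROM #icd_patient_demo AS p
JOIN code_list AS c
    ON p.icd LIKE c.pattern;   -- returns pat_id 10 and 11

DROP TABLE #icd_patient_demo;
```

The icd column is left untouched on its side of the comparison, which is what gives the optimizer a chance to seek on an icd index. Whether it actually does so for a pattern that comes from a join depends on the plan, so check the actual execution plan on the real table.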

Check if field is numeric, then execute comparison on only those field in one statement?

This may be simple, but I am no SQL whiz so I am getting lost. I understand that sql takes your query and executes it in a certain order, which I believe is why this query does not work:
select * from purchaseorders
where IsNumeric(purchase_order_number) = 1
and cast(purchase_order_number as int) >= 7
MOST of the purchase_order_number fields are numeric, but we recently introduced alphanumeric ones. What I am trying to find out is whether '7' is greater than the highest numeric purchase_order_number.
The IsNumeric() function filters out the alphanumeric fields fine, but the subsequent cast comparison throws this error:
Conversion failed when converting the nvarchar value '124-4356AB' to data type int.
I am not asking what the error means, that is obvious. I am asking if there is a way to accomplish what I want in a single query, preferably in the where clause due to ORM constraints.
Does this work for you?
select * from purchaseorders
where (case when IsNumeric(purchase_order_number) = 1
then cast(purchase_order_number as int)
else 0 end) >= 7
You can do a select with a subselect
select * from (
select * from purchaseorders
where IsNumeric(purchase_order_number) = 1) as correct_orders
where cast(purchase_order_number as int) >= 7
Try this:
select * from purchaseorders
where try_cast(purchase_order_number as int) >= 7
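A small self-contained sketch of why this works, assuming SQL Server 2012 or later (where TRY_CAST is available) and a table variable with made-up rows standing in for purchaseorders:

```sql
-- Hypothetical sample rows; TRY_CAST needs SQL Server 2012 or later.
DECLARE @purchaseorders TABLE (purchase_order_number nvarchar(20) NOT NULL);
INSERT INTO @purchaseorders (purchase_order_number)
VALUES (N'5'), (N'12'), (N'124-4356AB');

-- TRY_CAST returns NULL instead of raising an error when the conversion fails,
-- and NULL >= 7 is not true, so the alphanumeric row is simply filtered out.
SELECT purchase_order_number
FROM @purchaseorders
WHERE TRY_CAST(purchase_order_number AS int) >= 7;   -- returns only N'12'
```

This also sidesteps a quirk of IsNumeric(), which returns 1 for strings such as '$' or '1e4' that still cannot be cast to int.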
I have to check which columns contain numeric values only.
Currently, every field in the table is declared as nvarchar(max), like tableName (field1 nvarchar(max),field2 nvarchar(max),field3 nvarchar(3)), and tableName has 25 lakh (2.5 million) rows.
A manual check suggests that Field2 contains only numeric values. How can I check with T-SQL whether the complete column (Field2) contains only numeric or NULL values, and find the longest length in the column?

How do you query an int column for any value?

How can you query a column for any value in that column? (ie. How do I build a dynamic where clause that can either filter the value, or not.)
I want to be able to query for either a specific value, or not. For instance, I might want the value to be 1, but I might want it to be any number.
Is there a way to use a wild card (like "*"), to match any value, so that it can be dynamically inserted where I want no filter?
For instance:
select int_col from table where int_col = 1 -- Query for a specific value
select int_col from table where int_col = * -- Query for any value
The reason why I do not want to use 2 separate SQL statements is because I am using this as a SQL Data Source, which can only have 1 select statement.
Sometimes I will query for an actual value (like 1, 2...), so I can't simply leave the condition out either.
I take it you want some dynamic behavior on your WHERE clause, without having to dynamically build your WHERE clause.
With a single parameter, you can use ISNULL (or COALESCE) like this:
SELECT * FROM Table WHERE ID = ISNULL(@id, ID)
which allows a NULL parameter to match all. Some prefer the longer but more explicit:
SELECT * FROM Table WHERE (@id IS NULL) OR (ID = @id)
A simple answer would be to use IS NOT NULL. But if you are asking for, say, 123* to match numbers like 123456 or 1234 or 1237, then you could convert the column to a varchar and test it against standard wildcards.
In your where clause: cast(myIntColumn as varchar(15)) like '123%'.
Assuming the value you're filtering on is a parameter in a stored procedure, or contained in a variable called @Value, you can do it like this:
select * from table where @Value is null or intCol = @Value
If @Value is null then the or part of the clause is ignored, so the query won't filter on intCol.
The equivalent of wildcards for numbers are the comparators.
So, if you wanted to find all positive integers:
select int_col from table where int_col > 0
any numbers between a hundred and a thousand:
select int_col from table where int_col BETWEEN 100 AND 1000
and so on.
I don't quite understand what you're asking. I think you should use two different queries for the different situations you have.
When you're not looking for a specific value:
SELECT * FROM table
When you are looking for a specific value:
SELECT * FROM table WHERE intcol = 1
You can use the parameter as a wildcard by assigning special meaning to NULL:
DECLARE @q INT = 1
SELECT * FROM table WHERE IntegerColumn = @q OR @q IS NULL
This way, when you pass in NULL, you get all rows.
If NULL is a valid value to query for, then you need to use two parameters.
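One possible shape for the two-parameter version is sketched below. The flag name @apply_filter, the table variable @t, and its rows are assumptions made for this example.

```sql
-- Hypothetical table variable; the column allows NULL on purpose.
DECLARE @t TABLE (IntegerColumn int NULL);
INSERT INTO @t (IntegerColumn) VALUES (1), (2), (NULL);

-- @apply_filter (an assumed name) says whether to filter at all;
-- @q carries the value to look for, which may itself be NULL.
DECLARE @apply_filter bit = 1;
DECLARE @q int = NULL;

SELECT IntegerColumn
FROM @t
WHERE @apply_filter = 0
   OR IntegerColumn = @q
   OR (IntegerColumn IS NULL AND @q IS NULL);   -- here: returns only the NULL row
```

Setting @apply_filter to 0 returns all three rows regardless of @q, so NULL no longer has to double as the "no filter" signal.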
If you really want the value of your column for all rows on the table you can simply use
select int_col
from table
If you want to know all the distinct values, but don't care how many times they're repeated you can use
select distinct int_col
from table
And if you want to know all the distinct values and how many times they each appear, use
select int_col, count(*)
from table
group by int_col
To have the values sorted properly you can add
order by int_col
to all the queries above.
Share and enjoy.