There is a behaviour I would like to understand for good.
Query #1:
SELECT count(id) FROM table WHERE message like '%TEXT1%'
Output : 504
Query #2
SELECT count(distinct id) FROM table WHERE message like '%TEXT2%'
Output : 87
Query #3
SELECT count(distinct id) FROM table WHERE message in ('%TEXT1%','%TEXT2%' )
Output : 0
I want to understand why I am getting zero in the third query. Based on this, the ( , ) is equivalent to a multiple OR. Isn't this OR inclusive?
the ( , ) is equivalent to a multiple OR. Isn't this OR inclusive ?
Sure, it's inclusive. But it's still an equality comparison, with no wildcard matching. It's like writing
WHERE (message = '%TEXT1%' or message = '%TEXT2%')
rather than
WHERE (message LIKE '%TEXT1%' or message LIKE '%TEXT2%')
IN does not take wildcards. They are specific to LIKE.
So, you need to use:
WHERE message like '%TEXT1%' OR message like '%TEXT2%'
Or, you can use regular expressions:
WHERE message ~ 'TEXT1|TEXT2'
IN checks if the value on its left-hand side is equal to any of the values in the list. It does not support pattern matching.
This behavior is standard ANSI SQL, and is also described in Postgres documentation:
expression IN (value [, ...])
The right-hand side is a parenthesized list of scalar expressions. The result is “true” if the left-hand expression's result is equal to any of the right-hand expressions. This is a shorthand notation for:
expression = value1 OR expression = value2 OR ...
So if you want to match against several possible patterns, you need OR:
where message like '%TEXT1%' or message like '%TEXT2%'
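To see the difference concretely, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Postgres (the table name msgs and the sample rows are made up for illustration). The last row stores the literal string '%TEXT1%', which is the only thing IN can match:

```python
import sqlite3

# In-memory database with a few made-up messages (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msgs (id INTEGER, message TEXT)")
conn.executemany(
    "INSERT INTO msgs VALUES (?, ?)",
    [
        (1, "foo TEXT1 bar"),
        (2, "TEXT2 at the start"),
        (3, "nothing relevant"),
        (4, "%TEXT1%"),  # the literal string, percent signs included
    ],
)


def count_ids(where_clause):
    """Count distinct ids for a hard-coded WHERE clause (demo only)."""
    sql = f"SELECT count(DISTINCT id) FROM msgs WHERE {where_clause}"
    return conn.execute(sql).fetchone()[0]


# IN compares for equality, so '%' is just an ordinary character here.
in_count = count_ids("message IN ('%TEXT1%', '%TEXT2%')")

# LIKE treats '%' as a wildcard, so every row containing either text matches.
like_count = count_ids("message LIKE '%TEXT1%' OR message LIKE '%TEXT2%'")

print(in_count, like_count)
```

With this data the IN version finds only row 4, while the LIKE version finds rows 1, 2 and 4.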
I want to select rows in a ClickHouse table where two string columns are LIKE each other (for example, where column1 is 'Hello' and column2 is '%llo').
I tried LIKE operator:
SELECT * FROM table_name WHERE column1 LIKE column2;
but it said:
Received exception from server (version 21.2.8):
Code: 44. DB::Exception: Received from localhost:9000. DB::Exception: Argument at index 1 for function like must be constant: while executing 'FUNCTION like(column1 : 17, column2 : 17) -> like(column1, column2) UInt8 : 28'.
It seems that the second argument must be a constant value. Is there any other way to apply this condition?
ClickHouse's LIKE supports only a constant pattern argument.
There is no general solution. The same problem exists with the regex functions and so on (because ClickHouse compiles the expression once and applies it to the column's byte stream before splitting it into rows).
In some cases you can use position or countSubstrings functions for this task.
You can use LOCATE or POSITION for this (https://clickhouse.tech/docs/en/sql-reference/functions/string-search-functions/). The query would look something like this:
SELECT *
FROM table_name
WHERE position(column1, column2, character_length(column1) - character_length(column2) + 1) > 0;
This may be flawed. It seems that in ClickHouse most string functions work on bytes or on variable-length UTF-8 byte sequences rather than on characters, so one has to pay attention to how the functions work and how they should be combined. I am using the third parameter start_pos above and assume that it refers to the character position, but it could just as well be bytes; I have not been able to find this information in the docs.
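To illustrate why the bytes-versus-characters question matters, here is a small pure-Python model of the query above. It does not call ClickHouse and does not claim to reproduce its semantics; the helper names are hypothetical, and position is modelled as a 1-based byte search. It only shows what goes wrong if lengths are counted in characters while the search runs over bytes:

```python
def position(haystack: bytes, needle: bytes, start_pos: int = 1) -> int:
    """Model of a 1-based byte search: 0 means 'not found'."""
    return haystack.find(needle, start_pos - 1) + 1


def suffix_match_bytes(h: str, n: str) -> bool:
    """Lengths and search both in bytes: consistent."""
    hb, nb = h.encode("utf-8"), n.encode("utf-8")
    start = len(hb) - len(nb) + 1
    return position(hb, nb, start) > 0


def suffix_match_mixed(h: str, n: str) -> bool:
    """Lengths in characters, search in bytes: inconsistent."""
    start = len(h) - len(n) + 1
    return position(h.encode("utf-8"), n.encode("utf-8"), start) > 0


# ASCII only: both variants agree.
print(suffix_match_bytes("Hello", "llo"), suffix_match_mixed("Hello", "llo"))

# 'é' takes two bytes in UTF-8, so the character-based start is one too early
# and the mixed variant finds 'llo' even though the string ends in 'X'.
print(suffix_match_bytes("élloX", "llo"), suffix_match_mixed("élloX", "llo"))
```

So whichever unit the real functions use, the length function and the start position have to use the same one.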
Having a table
CREATE TABLE t (x int)
INSERT INTO t VALUES (null), (0) ,(42)
And query with two placeholders:
SELECT x FROM t WHERE x = ? OR x IS NULL AND ? IS NULL
The logic is the following: if I resolve those placeholders with 42 it returns 42. If I resolve them with null it returns null. In other words, it is a null-including search.
The question is:
Is it possible to rewrite this query (in Postgresql) to have a single placeholder ? instead of two?
Postgres implements null-safe equality through operator IS [NOT] DISTINCT FROM, which is exactly what you ask for.
So:
SELECT x FROM t WHERE x IS NOT DISTINCT FROM ?
The placeholder for parameters in PostgreSQL is $1, $2, etc. Letting you use ? instead is something some drivers implement for convenience, but they do offer less flexibility.
Using the real notation, you can specify one parameter to occur in more than one place:
SELECT x FROM t WHERE x = $1 OR x IS NULL AND $1 IS NULL
This has the advantage over IS NOT DISTINCT FROM in that it can use an index.
Yes. You can use is not distinct from and only use one placeholder:
where x is not distinct from ?
is distinct from/is not distinct from are null-safe comparison operators that treat null as a "real" value for comparison purposes (so null is not distinct from null evaluates to true, for instance).
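For a quick local check of both approaches, here is a sketch using Python's sqlite3 module rather than Postgres. Two assumptions to note: SQLite's IS operator is used as the null-safe comparison in place of IS NOT DISTINCT FROM, and the named placeholder :v stands in for $1 so that one parameter can appear twice:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x int)")
conn.executemany("INSERT INTO t VALUES (?)", [(None,), (0,), (42,)])

# One named parameter reused twice (the role $1 plays in Postgres).
REUSED_PARAM = "SELECT x FROM t WHERE x = :v OR (x IS NULL AND :v IS NULL)"

# Null-safe comparison; in SQLite, IS behaves like = but treats NULLs as equal.
NULL_SAFE = "SELECT x FROM t WHERE x IS :v"


def search(sql, value):
    """Run the query with a single bound value and return the x column."""
    return [row[0] for row in conn.execute(sql, {"v": value})]


for sql in (REUSED_PARAM, NULL_SAFE):
    print(search(sql, 42), search(sql, None))
```

Both queries return the 42 row for 42 and the null row for None.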
I am using a tool to produce SQL queries and I need to filter one of the queries with multiple parameters.
The query is similar to this:
Select *
From Products
Where (#ProductTypeIds is null or Product.ProductTypeId in (#ProductTypeIds))
I know the above query is not valid standard SQL; read on.
Essentially, I'm trying to apply a filter where if nothing is passed for #ProductTypeIds parameter, the where condition is not applied.
When multiple parameters are being passed, though, #ProductTypeIds is being translated by the tool into the following query:
Select *
From Products
Where (#ProductTypeIds1, #ProductTypeIds2 is null or Product.ProductTypeId in (#ProductTypeIds1, #ProductTypeIds2))
Which is clearly an invalid query. So I thought I could be clever and use COALESCE to check if they are null:
Select *
From Products
Where (COALESCE(#ProductTypeIds, null) is null or Product.ProductTypeId in (#ProductTypeIds))
This query is being translated correctly, however, now my use of COALESCE throws an error:
At least one of the arguments to COALESCE must be an expression that is not the NULL constant.
How can I efficiently check that #ProductTypeIds (which is being translated into #ProductTypeIds1, #ProductTypeIds2) is all null, so I can apply the filter or ignore it?
In other words, is there a way to Distinct a list of parameters to check if the final result is null ?
Thanks
I have no idea how your tool works, but try the following.
Instead of checking for null, check for a value that will never appear in your params, like:
WHERE COALESCE(#ProductTypeIds1, #ProductTypeIds2, -666) = -666 OR ...
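As a rough check of the sentinel idea, here is a sketch in Python's sqlite3 (not SQL Server, and not your tool). The named parameters :p1 and :p2 are stand-ins for the expanded #ProductTypeIds1 and #ProductTypeIds2, and the sample products are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Name TEXT, ProductTypeId INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("bolt", 1), ("nut", 2), ("gear", 3)],
)

# -666 is the sentinel: it must never be a real ProductTypeId.
SQL = """
SELECT Name FROM Products
WHERE COALESCE(:p1, :p2, -666) = -666
   OR ProductTypeId IN (:p1, :p2)
ORDER BY Name
"""


def product_names(p1, p2):
    """Return product names for the two (possibly null) type-id parameters."""
    return [row[0] for row in conn.execute(SQL, {"p1": p1, "p2": p2})]


print(product_names(None, None))  # no filter applied
print(product_names(1, None))     # filter on type 1 only
print(product_names(1, 3))        # filter on types 1 and 3
```

Because the last COALESCE argument is a non-null constant, the "all arguments are NULL" error from the question should not apply to this form.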
Short story. I am working on a project where I need to communicate with SQLite database. And there I have several problems:
There is one FTS table with nodeId and nodeName columns. I need to select all nodeIds for which nodeNames contains some text pattern. For instance all node names with "Donald" inside. Something similar was discussed in this thread. The point is that I can't use CONTAINS keyword. Instead I use MATCH. And here is the question itself: how should this "Donald" string be "framed"? With '*' or with '%' character? Here is my query:
SELECT * FROM nodeFtsTable WHERE nodeName MATCH "Donald"
Is it OK to write multiple comparison in SELECT statement? I mean something like this:
SELECT * FROM distanceTable WHERE pointId = 1 OR pointId = 4 OR pointId = 203 AND distance<200
I hope that it does not sound very confusing. Thank you in advance!
Edit: Sorry, I missed the fact that you are using FTS4. It looks like you can just do this:
SELECT * FROM nodeFtsTable WHERE nodeName MATCH 'Donald'
Here is relevant documentation.
No wildcard characters are needed in order to match all entries in which Donald is a discrete word (e.g. the above will match Donald Duck). If you want to match Donald as part of a word (e.g. Donalds) then you need to use * in the appropriate place:
SELECT * FROM nodeFtsTable WHERE nodeName MATCH 'Donald*'
If your query wasn't working, it was probably because you used double quotes.
From the SQLite documentation:
The MATCH operator is a special syntax for the match()
application-defined function. The default match() function
implementation raises an exception and is not really useful for
anything. But extensions can override the match() function with more
helpful logic.
FTS4 is an extension that provides a match() function.
Yes, it is OK to use multiple conditions as in your second query. When you have a complex set of conditions, it is important to understand the order in which the conditions will be evaluated. AND is always evaluated before OR (they are analogous to mathematical multiplication and addition, respectively). In practice, I think it is always best to use parentheses for clarity when using a combination of AND and OR:
--This is the same as with no parentheses, but is clearer:
SELECT * FROM distanceTable WHERE
pointId = 1 OR
pointId = 4 OR
(pointId = 203 AND distance<200)
--This is something completely different:
SELECT * FROM distanceTable WHERE
(pointId = 1 OR pointId = 4 OR pointId = 203) AND
distance<200
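If you want to convince yourself of the precedence rule, here is a small sketch with Python's sqlite3 module. The rows in distanceTable are invented for the demo; the three WHERE clauses are the ones discussed above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distanceTable (pointId INTEGER, distance INTEGER)")
conn.executemany(
    "INSERT INTO distanceTable VALUES (?, ?)",
    [(1, 500), (4, 100), (203, 150), (203, 300)],  # made-up sample rows
)


def run(where_clause):
    """Run a hard-coded WHERE clause and return rows in a stable order."""
    sql = (
        "SELECT pointId, distance FROM distanceTable "
        f"WHERE {where_clause} ORDER BY pointId, distance"
    )
    return conn.execute(sql).fetchall()


# No parentheses: AND binds tighter than OR.
bare = run("pointId = 1 OR pointId = 4 OR pointId = 203 AND distance < 200")

# Explicit version of the same grouping.
and_first = run("pointId = 1 OR pointId = 4 OR (pointId = 203 AND distance < 200)")

# Different grouping: the distance limit now applies to every point.
or_first = run("(pointId = 1 OR pointId = 4 OR pointId = 203) AND distance < 200")

print(bare == and_first, or_first)
```

The first two queries return the same three rows; the third drops the (1, 500) row because the distance condition now applies to it as well.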
Is it possible to use a comparison operator in a CONVERT or CAST function?
I've got a statement that looks like this:
SELECT
...
CASE field
WHEN 'Y' THEN 1 -- If Y then True
ELSE 0 -- Anything else is False
END
...
FROM ...
A similar thing happens for a few fields, so I would like to change it to a shorter version:
SELECT
...
CONVERT(BIT, field = 'Y')
...
FROM ...
But MSSQL is giving an error Incorrect syntax near '='.
My interpretation of the help is that it should work:
CONVERT ( data_type [ ( length ) ] , expression [ , style ] )
expression: expression { binary_operator } expression
binary_operator: Is an operator that defines the way two expressions are combined to yield a single result. binary_operator can be an arithmetic operator, the assignment operator (=), a bitwise operator, a comparison operator, a logical operator, the string concatenation operator (+), or a unary operator.
comparison operator: ( = | > | < | >= | <= | <> | != | !< | !> )
I ran a few tests and got these results:
SELECT CONVERT(BIT, 0) -- 0
SELECT CONVERT(BIT, 1) -- 1
SELECT CONVERT(BIT, 1+2) -- 1
SELECT CONVERT(BIT, 1=2) -- Error
SELECT CONVERT(BIT, (1=2)) -- Error
SELECT CONVERT(BIT, (1)=(2)) -- Error
I think you are misinterpreting the documentation for CONVERT. There is nothing in the documentation for CONVERT that states it will handle an expression that makes use of the comparison operators, only that it accepts an expression. It turns out that CONVERT does not handle every valid SQL expression. At the very least it cannot handle the results of an expression that uses a comparison operator.
If you check the documentation for Operators, you'll see that the comparison operators (which is what you want = to be, in this case) return a Boolean data type, and are used in WHERE clauses and control-of-flow statements. From the documentation for Operators:
The result of a comparison operator has the Boolean data type, which has three values:
TRUE, FALSE, and UNKNOWN. Expressions that return a Boolean data type are known as
Boolean expressions.
Unlike other SQL Server data types, a Boolean data type cannot be
specified as the data type of a table column or variable, and cannot
be returned in a result set.
...
Expressions with Boolean data types are used in the WHERE clause to
filter the rows that qualify for the search conditions and in
control-of-flow language statements such as IF and WHILE...
That helps to explain why SQL like SELECT 1=2 is invalid SQL, because it would create a result set with a Boolean data type, which the documentation says is not allowed. That also explains why the CASE WHEN construct is necessary, because it can evaluate the comparison operators and return a single value of a data type that SQL Server can return in a result set.
Furthermore, if you look at the documentation for CONVERT, you'll see that Boolean is not supported in either CAST or CONVERT (see the table towards the middle of the page, there is no Boolean data type in there).
For your purposes, I think you're stuck using CASE WHEN. If it helps you can write it all on one line:
CASE WHEN field = 'Y' THEN 1 ELSE 0 END
Alternatively, you could create a UDF to handle the CASE expression (something like dbo.func_DoCase(field, 'Y', 1, 0)), but personally I would just stick with CASE WHEN.
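For what it's worth, here is a tiny sketch of what the one-line CASE WHEN yields, run through Python's sqlite3 module as a stand-in (so it says nothing about SQL Server's BIT type or its Boolean restrictions; the flags table is made up). It is mainly useful for seeing that a null field falls into the ELSE branch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flags (field TEXT)")
conn.executemany("INSERT INTO flags VALUES (?)", [("Y",), ("N",), (None,)])

# The comparison lives inside CASE WHEN; the result is a plain 1 or 0.
rows = conn.execute(
    "SELECT field, CASE WHEN field = 'Y' THEN 1 ELSE 0 END "
    "FROM flags ORDER BY rowid"
).fetchall()

print(rows)
```

A null field compares as unknown rather than true, so it ends up as 0 just like 'N'.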