Querying case-insensitive columns by SQL in Tarantool

We know that Tarantool string indexes can be made case-insensitive by specifying the collation option: collation = "unicode_ci". For example:
t = box.schema.create_space("test")
t:format({{name = "id", type = "number"}, {name = "col1", type = "string"}})
t:create_index('primary')
t:create_index("col1_idx", {parts = {{field = "col1", type = "string", collation = "unicode_ci"}}})
t:insert{1, "aaa"}
t:insert{2, "bbb"}
t:insert{3, "ccc"}
Now we can do a case-insensitive query:
tarantool> t.index.col1_idx:select("AAA")
---
- - [1, 'aaa']
...
But how can we do this using SQL? This doesn't work:
tarantool> box.execute("select * from \"test\" where \"col1\" = 'AAA'")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows: []
...
Neither does this:
tarantool> box.execute("select * from \"test\" indexed by \"col1_idx\" where \"col1\" = 'AAA'")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows: []
...
There's a dirty trick, but it performs poorly because it does a full scan. We don't want that, do we?
tarantool> box.execute("select * from \"test\" indexed by \"col1_idx\" where upper(\"col1\") = 'AAA'")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows:
  - [1, 'aaa']
...
Finally, there is one more workaround:
tarantool> box.execute("select * from \"test\" where \"col1\" = 'AAA' collate \"unicode_ci\"")
---
- metadata:
  - name: id
    type: number
  - name: col1
    type: string
  rows:
  - [1, 'aaa']
...
But the question is: does it use the index? It also works without an index...

You can check the query plan to find out whether a particular index is used. To get the query plan, add the 'EXPLAIN QUERY PLAN ' prefix to the original query. For instance:
tarantool> box.execute("explain query plan select * from \"test\" where \"col1\" = 'AAA' collate \"unicode_ci\"")
---
- metadata:
  - name: selectid
    type: integer
  - name: order
    type: integer
  - name: from
    type: integer
  - name: detail
    type: text
  rows:
  - [0, 0, 0, 'SEARCH TABLE test USING COVERING INDEX col1_idx (col1=?) (~1 row)']
...
So the answer is yes: the index is used in this case.
As for the other example:
box.execute("select * from \"test\" indexed by \"col1_idx\" where \"col1\" = 'AAA'")
Unfortunately, the collation in this comparison is binary, because the index's collation is ignored. In SQL, only the column's collation is taken into account during a comparison. This limitation will be resolved once the corresponding issue is closed.

How to get all rows with initial values after the assertion in dataform?

I'm doing a transformation with Dataform, and I added some assertions to it.
Example of the input table (all columns are of type STRING):
id | value1 | value2 | value3
1 |1?2 | 01/01/1900 00:00:00|x1
2 |1.2 | 01-01-1900 00:00:00|x2
Example of the staging (stg) transformation:
config {
type: "table",
database: "dev",
schema: "schema1",
tags: ["staging", "daily"],
/* DESCRIPTION */
description:"some description",
columns: {
id: "description",
value1: "description",
value2: "description",
value3: "description" }
,
assertions: {
uniqueKey: ["id"],
nonNull: ["value1","value2"]
}
}
select
id,
safe_cast(
REGEXP_REPLACE(
value1,
",",
"."
) as numeric
) value1,
SAFE.PARSE_DATETIME('%d/%m/%Y %H:%M:%S', value2) value2,
value3
from
${ref('input')}
The problem is that when an assertion fails because of column value1 or value2 of the input table, I want to see the raw value in the view in assertionSchema, but I can't, because in the view the value is null (the result of safe_cast).
How can I get the raw value of the column that failed the assertion? I know that I can get this value from the input table, but I want the value to appear in the view in assertionSchema.
Thanks

How to query from "any"/"map" data type on Tarantool?

I'm following the example from this answer. If I create a map field without an index, how do I query an inner value of the map?
box.schema.create_space('x', {format = {[1] = {'id', 'unsigned'}, [2] = {'obj', 'map'}}})
box.space.x:create_index('pk', {parts = {[1] = {field = 1, type = 'unsigned'}}})
box.space.x:insert({2, {text = 'second', timestamp = 123}})
box.execute [[ SELECT * FROM "x" ]]
-- [2, {'timestamp': 123, 'text': 'second'}]
How can I fetch the timestamp or text field directly from SQL without creating an index?
I tried these, but none of them worked:
SELECT "obj.text" FROM "x"
SELECT "obj"."text" FROM "x"
SELECT "obj"["text"] FROM "x"
SELECT "obj"->"text" FROM "x"
You can register a Lua function and call it from SQL. The first example from our SQL + Lua manual shows exactly what you asked for.
Here is a slightly simplified version of that example to explain the idea:
box.schema.func.create('GETFIELD', {
language = 'LUA',
returns = 'any',
body = [[
function(msgpack_value, field)
return require('msgpack').decode(msgpack_value)[field]
end]],
is_sandboxed = false,
param_list = {'string', 'string'},
exports = {'SQL'},
is_deterministic = true
})
Once the function is registered, you can call it from SQL:
tarantool> \set language sql
tarantool> select getfield("obj", 'text') from "x"
---
- metadata:
  - name: COLUMN_1
    type: any
  rows:
  - ['second']
...
tarantool> select getfield("obj", 'timestamp') from "x"
---
- metadata:
  - name: COLUMN_1
    type: any
  rows:
  - [123]
...
Differences from the example in the manual:
No hack with the global variable, but no dot syntax ('foo.bar.baz').
Exported only to SQL.
The return type is 'any', so it can be used for, say, the numeric 'timestamp' field. Downside: 'any' is reported in the result set metadata.
(The idea suggested by Nikita Pettik, my teammate.)

How to filter on a json column for a specific value?

I'm on postgres and have a table orders with a data column which is jsonb. Here's a condensed example of data in one of them - they have UUID keys and a value of { id, value }
{
'36462bd9-4ffa-4ee3-9a04-c2eb7575fe6c': {
id: '',
value: '2020-04-20T01:32:14.017Z',
},
'9baaed61-1275-4bbc-ae4f-2994ec9f7fda': { id: '4', value: 'Paper Towels' },
}
How can I do operations such as finding any orders where data has some UUID (e.g. 9baaed61-1275-4bbc-ae4f-2994ec9f7fda) and { id: '4' }?
You can use the contains operator @>
select *
from the_table
where data @> '{"9baaed61-1275-4bbc-ae4f-2994ec9f7fda": {"id": "4"}}';
This assumes that the invalid JSON id: '4' from your question is really stored as "id":"4". If the value is stored as a number: "id": 4 then you need to use that in the comparison value.
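To make the string-versus-number point concrete, here is a rough JavaScript model of how jsonb containment treats nested objects. It is only a sketch: the `contains` and `isPlainObject` helpers are hypothetical names introduced for illustration, the sample document is copied from the question, and arrays (which PostgreSQL handles with extra rules) are deliberately left out.

```javascript
// Simplified model of jsonb containment for objects and scalars only.
// Arrays follow additional rules in PostgreSQL and are not modelled here.
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

// True when every key of `pattern` exists in `doc` and its value is
// contained in the corresponding value of `doc`; scalars must be equal
// in both type and value.
function contains(doc, pattern) {
  if (isPlainObject(pattern)) {
    if (!isPlainObject(doc)) return false;
    return Object.keys(pattern).every(
      (key) =>
        Object.prototype.hasOwnProperty.call(doc, key) &&
        contains(doc[key], pattern[key])
    );
  }
  return doc === pattern;
}

// Sample document taken from the question.
const uuid = '9baaed61-1275-4bbc-ae4f-2994ec9f7fda';
const data = {
  '36462bd9-4ffa-4ee3-9a04-c2eb7575fe6c': { id: '', value: '2020-04-20T01:32:14.017Z' },
  [uuid]: { id: '4', value: 'Paper Towels' },
};

console.log(contains(data, { [uuid]: { id: '4' } })); // id compared as a string
console.log(contains(data, { [uuid]: { id: 4 } }));   // id compared as a number
```

With the id stored as the string '4', only the first check matches, which is why the comparison value in the query has to use the same type as the stored value.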

How to write a CASE clause with another column as a condition using knex.js

My code looks like this:
.select('id','units',knex.raw('case when units > 0 then cost else 0 end'))
but it gives me an error like this:
hint: "No operator matches the given name and argument type(s). You might need to add explicit type casts."
Any idea how I should write my code so that I can use another column as a condition for a different column?
I don't get the same error you do:
CASE types integer and character varying cannot be matched
but regardless, the issue is that you're trying to compare apples and oranges. Postgres is quite strict on column types, so attempting to put an integer 0 and a string (value of cost) in the same column does not result in an implicit cast.
Turning your output into a string does the trick:
.select(
"id",
"units",
db.raw("CASE WHEN units > 0 THEN cost ELSE '0' END AS cost")
)
Sample output:
[
{ id: 1, units: null, cost: '0' },
{ id: 2, units: 1.2, cost: '2.99' },
{ id: 3, units: 0.9, cost: '4.50' },
{ id: 4, units: 5, cost: '1.23' },
{ id: 5, units: 0, cost: '0' }
]

How to match a string value from the database with an integer value from the response

My feature calls Java to query the database and then compares the results with the response. The Java call returns all values as strings, but in the response some of the values are integers. So the test fails with the reason: actual value is not a string. I have tried to convert the results to JSON, but that didn't work. If I print out the results, all keys and values are enclosed in double quotes, but there are no double quotes in the error message. I found a similar question in the forum where it was suggested to set the field in the response to '#ignore'. But I want to verify all the fields. How do I get this to work?
Scenario: Get a script by id
* def results = db.getRows("select * from ScriptVersion where id=4 order by version")
Given path '4'
When method get
Then status 200
And match response.version == results
[main] ERROR com.intuit.karate - assertion failed: path: $.version[0], actual: {id=4, version=1, created=2016-06-23T10:49:51.9630000-05:00, updated=2016-06-23T10:49:51.9630000-05:00, message=Initial Version, author=ocadm, hash=0023ad00455962eee4ef1db16a58ce41}, expected: {created=2016-06-23T10:49:51.9630000-05:00, author=ocadm, id=4, message=Initial Version, version=1, updated=2016-06-23T10:49:51.9630000-05:00, hash=0023ad00455962eee4ef1db16a58ce41}, reason: [path: $.version[0].id, actual: 4, expected: '4', reason: actual value is not a string]
Just convert the fields you need to the right data type before the match:
* def results = [{ id: '1', foo: 'bar' }, { id: '2', foo: 'baz' }]
* def fun = function(x){ x.id = ~~x.id; return x }
* def results = karate.map(results, fun)
* match results == [{ id: 1, foo: 'bar' }, { id: 2, foo: 'baz' }]
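The conversion function in that snippet is ordinary JavaScript, so the same idea can be tried outside Karate. The sketch below is a plain-JavaScript analogue, not Karate syntax: `toIntFields` is a hypothetical helper name, and `rows` is the same made-up sample data standing in for the database result.

```javascript
// Plain-JavaScript analogue of the Karate snippet above (no Karate runtime needed).
// `rows` is made-up sample data standing in for the database result.
const rows = [{ id: '1', foo: 'bar' }, { id: '2', foo: 'baz' }];

// Convert the listed fields of one row from string to integer.
// Returns a new object so the original row is left untouched.
function toIntFields(row, fields) {
  const copy = { ...row };
  for (const name of fields) {
    copy[name] = ~~copy[name]; // double bitwise NOT: '4' becomes 4
  }
  return copy;
}

const converted = rows.map((row) => toIntFields(row, ['id']));
console.log(JSON.stringify(converted));
```

Note that `~~` truncates toward zero and turns non-numeric strings into 0, so it only suits fields that are known to hold integer text.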