Tarantool multipart bitset index - indexing

box.space.test:create_index('secondary', {type='BITSET', unique=false, parts={2,'num', 3,'num'}})
---
- error: 'Can''t create or modify index ''secondary'' in space ''test'': BITSET index key can not be multipart
Tarantool 1.6.7-591-g7d4dbbb
I have a table with this structure:
id,
param1: {1,50,70} # values in [1-256]
param2: {6,8,128} # values in [1-128]
And I need to run queries like:
SELECT * FROM table WHERE param1 HAVE {1,2} AND param2 HAVE {78}
So I could do it by creating a 256-bit + 128-bit BITSET index. How can I do that?

Related

Convert oid to json as int instead of string

Using Postgres 12, the following will return an int JSON representation:
> SELECT to_json(2::int)
.. 2
Whereas if the type is oid, it will return it as string:
> SELECT to_json(2::oid)
.. "2"
Since oid is inherently an int value, I would like it to be represented as such. I tried creating a cast between oid and both text and json types, but neither seems to be picked up by to_json.
Is there a way to make to_json represent an oid as an int, outside of casting each oid column to int explicitly?
You will have to use an explicit cast, because PostgreSQL hard-codes treating oid as a string when converting to JSON.
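For example (just a sketch; the bigint cast is only there to stay safe with OID values above the integer range):
SELECT to_json(2::oid::int);      -- 2 instead of "2"
SELECT to_json(2::oid::bigint);   -- same idea, avoids overflow for very large OIDs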
You could suggest the following patch to the pgsql-hackers mailing list:
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 30ca2cf6c8..09e9a9ac08 100644
--- a/src/backend/utils/adt/json.c
+++ b/src/backend/utils/adt/json.c
@@ -170,6 +170,7 @@ json_categorize_type(Oid typoid,
         case FLOAT4OID:
         case FLOAT8OID:
         case NUMERICOID:
+        case OIDOID:
             getTypeOutputInfo(typoid, outfuncoid, &typisvarlena);
             *tcategory = JSONTYPE_NUMERIC;
             break;
diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 8d1e7fbf91..0e8edb0fc3 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -650,6 +650,7 @@ jsonb_categorize_type(Oid typoid,
         case FLOAT4OID:
         case FLOAT8OID:
         case NUMERICOID:
+        case OIDOID:
             getTypeOutputInfo(typoid, outfuncoid, &typisvarlena);
             *tcategory = JSONBTYPE_NUMERIC;
             break;
That would change the behavior, and I don't see why the patch shouldn't be accepted.
You can override to_json by creating a function with the same name in the public schema that does the casting for you.
CREATE FUNCTION public.to_json(IN prm oid) RETURNS json
LANGUAGE SQL AS
$body$
SELECT pg_catalog.to_json(prm::int);
$body$;
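With that wrapper in place (assuming public is on your search_path, so the oid-specific function is picked as the exact match), the original example now comes back as a number:
SELECT to_json(2::oid);   -- returns 2 rather than "2"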

Can I find distances between arrays in PostgreSQL using <->?

As far as I understood from this article, you can find nearest neighbors using the <-> distance operator when working with geometric data types:
SELECT name, location --location is point
FROM geonames
ORDER BY location <-> '(29.9691,-95.6972)'
LIMIT 5;
You can also get some optimizations using SP-GiST indexes:
CREATE INDEX idx_spgist_geonames_location ON geonames USING spgist(location);
But I can't find anything in the documentation about using the <-> operator with arrays. If I were to run the same queries using double precision[] instead of point, for example, would that work?
Apparently, we can't. For example, I've got a simple table:
CREATE TABLE test (
id SERIAL PRIMARY KEY,
loc double precision[]
);
And I want to query documents from it, ordered by distance:
SELECT loc FROM test ORDER BY loc <-> ARRAY[0, 0, 0, 0]::double precision[];
It doesn't work:
Query Error: error: operator does not exist: double precision[] <-> double precision[]
The documentation makes no mention of <-> for arrays either. I found a workaround in the accepted answer to this question, but it has some limitations, especially on array length. There is also an article (in Russian) that suggests a way around the array size limitation.
Creating the sample table:
import postgresql

def setup_db():
    db = postgresql.open('pq://user:pass@localhost:5434/db')
    db.execute("create extension if not exists cube;")
    db.execute("drop table if exists vectors")
    db.execute("create table vectors (id serial, file varchar, vec_low cube, vec_high cube);")
    db.execute("create index vectors_vec_idx on vectors (vec_low, vec_high);")
Element insertion:
query = "INSERT INTO vectors (file, vec_low, vec_high) VALUES ('{}', CUBE(array[{}]), CUBE(array[{}]))".format(
    file_name,
    ','.join(str(s) for s in encodings[0][0:64]),
    ','.join(str(s) for s in encodings[0][64:128]),
)
db.execute(query)
Element querying:
import time
import postgresql
import random

db = postgresql.open('pq://user:pass@localhost:5434/db')

for i in range(100):
    t = time.time()
    encodings = [random.random() for i in range(128)]
    threshold = 0.6
    query = "SELECT file FROM vectors WHERE sqrt(power(CUBE(array[{}]) <-> vec_low, 2) + power(CUBE(array[{}]) <-> vec_high, 2)) <= {} ".format(
        ','.join(str(s) for s in encodings[0:64]),
        ','.join(str(s) for s in encodings[64:128]),
        threshold,
    ) + \
    "ORDER BY sqrt(power(CUBE(array[{}]) <-> vec_low, 2) + power(CUBE(array[{}]) <-> vec_high, 2)) ASC LIMIT 1".format(
        ','.join(str(s) for s in encodings[0:64]),
        ','.join(str(s) for s in encodings[64:128]),
    )
    print(db.query(query))
    print('query time', time.time() - t, 'ind', i)
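One hedged side note on the setup above: the index created in setup_db is a plain btree on (vec_low, vec_high) and will not accelerate the <-> queries. The cube type's default GiST operator class supports <-> for KNN ordering (PostgreSQL 9.6+), so something along these lines (same table and column names, values elided) at least lets a single-column nearest-neighbour query use an index; the combined sqrt(power(...)) expression still cannot:
CREATE INDEX vectors_vec_low_gist ON vectors USING gist (vec_low);

SELECT file
FROM vectors
ORDER BY vec_low <-> cube(array[0.1, 0.2])   -- in practice all 64 values go here
LIMIT 1;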

SqlSave Error: Unable to append to table

Code:
sqlSave(SQL, data.frame(df), tablename = 'Data', append = TRUE, rownames = FALSE)
The table I am trying to insert the data into has an auto-increment primary key. My table has a total of 5 columns including the primary key. In my data frame, I have 4 columns because I don't want to insert the PK myself. However, when I run the command, I get the following error:
Error in `colnames<-`(`*tmp*`, value = c("BId", "name", "Set", :
  length of 'dimnames' [2] not equal to array extent
Also, when I include the primary key in the data frame myself, it still doesn't work:
Error in sqlSave(SQL, data.frame(df), tablename = "Data", :
unable to append to table ‘Data’
Try safer = FALSE.
The definition of sqlSave contains:
if (!append) {
    if (safer)
        stop("table ", sQuote(tablename), " already exists")
    ......
}
......
if (safer)
    stop("unable to append to table ", sQuote(tablename))
You can use the verbose argument to get the actual database error:
sqlSave(con, df, verbose = TRUE)

JSONStore difference between 'number' and 'integer' in searchFields

I have a question about JSONStore searchFields.
If I use number as the searchFields type and try to find data with the WL.JSONStore.find method using 0 as the query, it matches all of the data (nothing is filtered).
With integer, the same case works fine.
What's the difference between number and integer?
JSONStore uses SQLite to persist data; you can read about SQLite data types here. The short answer is that number stores data as REAL while integer stores data as INTEGER.
If you create a collection called nums with one searchField called num of type number
var nums = WL.JSONStore.initCollection('nums', {num: 'number'}, {});
and add some data:
var len = 5;
while (len--) {
    nums.add({num: len});
}
then call find with the query: {num: 0}
nums.find({num: 0}, {onSuccess: function (res) {
    console.log(JSON.stringify(res));
}});
you should get back:
[{"_id":1,"json":{"num":4}},{"_id":2,"json":{"num":3}},{"_id":3,"json":{"num":2}},{"_id":4,"json":{"num":1}},{"_id":5,"json":{"num":0}}]
Notice that you got back all the documents you stored (num = 4, 3, 2, 1, 0).
If you look at the .sqlite file:
$ cd ~/Library/Application Support/iPhone Simulator/6.1/Applications/[id]/Documents
$ sqlite3 jsonstore.sqlite
(The android file should be under /data/data/com.[app-name]/databases/)
sqlite> .schema
CREATE TABLE nums ( _id INTEGER primary key autoincrement, 'num' REAL, json BLOB, _dirty REAL default 0, _deleted INTEGER default 0, _operation TEXT);
Notice the data type for num is REAL.
Running the same query that the find function uses:
sqlite> SELECT * FROM nums WHERE num LIKE '%0%';
1|4.0|{"num":4}|1363326259.80431|0|add
2|3.0|{"num":3}|1363326259.80748|0|add
3|2.0|{"num":2}|1363326259.81|0|add
4|1.0|{"num":1}|1363326259.81289|0|add
5|0.0|{"num":0}|1363326259.81519|0|add
Notice that 4 is stored as 4.0, and since JSONStore's queries always use LIKE, any num containing a 0 will match the query.
If you use integer instead:
var nums = WL.JSONStore.initCollection('nums', {num: 'integer'}, {});
Find returns:
[{"_id":5,"json":{"num":0}}]
The schema shows that num has an INTEGER data type:
sqlite> .schema
CREATE TABLE nums ( _id INTEGER primary key autoincrement, 'num' INTEGER, json BLOB, _dirty REAL default 0, _deleted INTEGER default 0, _operation TEXT);
sqlite> SELECT * FROM nums WHERE num LIKE '%0%';
5|0|{"num":0}|1363326923.44466|0|add
I skipped some of the onSuccess and all the onFailure callbacks for brevity.
The actual difference between a number and an integer searchField is that
defining {age: 'number'} indexes 1 as 1.0,
while defining {age: 'integer'} indexes 1 as 1.
Hope that helps.

Sphinx Search: get list of words from index by source column

I have a table (let's call it my_table) with two text fields: title and description. I also have an index (my_index) whose source uses the following query:
SELECT * FROM my_table;
When I need to get all words and frequencies from my_index, I use something like:
$indexer my_index --buildstops word_freq.txt 1000 --buildfreqs
But now I need to get the words that appear in the title column only (and their frequencies from the title column only). What is the best way to do this?
Edit:
Ideally, the solution should not have to build new indexes on disk.
Create a new "index", that only includes the title column. No need to ever build an physical index with it, can just use it with --buildstops :)
Index inheritence, allows its creation with very compact bit in the config file
source my_index_title : my_index {
    sql_query = SELECT id, title FROM my_table
}

index my_index_title : my_index {
    source = my_index_title
    path = /tmp/my_index_title
}
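With that in place, the same buildstops run shown above can simply be pointed at the title-only index, for example: indexer my_index_title --buildstops title_word_freq.txt 1000 --buildfreqs (the output filename here is just an example).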