Best way to store and count available stock (PostgreSQL)

I am not sure how to proceed with counting the stock and storing the stock in generated invoices.
Live data: http://sqlfiddle.com/#!17/20ff4/4
Table articles, where the static product info is stored:
create table articles
(
prod_article_pvt_uuid uuid default uuid_generate_v1() not null
constraint proxies_pk
primary key,
prod_article_pub_uuid uuid default uuid_generate_v1mc(),
prod_article_pub_id serial not null,
t timestamp with time zone default CURRENT_TIMESTAMP(6),
archived boolean default false,
cat integer,
prod_brand_id integer,
barcode bigint,
supplier_code varchar
);
These tables open a document and define the type of document:
create table documents
(
sale_doc_pvt_uuid uuid default uuid_generate_v1() not null
constraint documents_pk
primary key,
sale_doc_pub_uuid uuid default uuid_generate_v1mc(),
sale_doc_alpha varchar,
t timestamp with time zone default CURRENT_TIMESTAMP(6),
archived boolean default false,
sale_type_id integer not null,
frozen boolean default false
);
create table types
(
sale_type_id serial not null
constraint types_pk
primary key,
name varchar,
draft boolean default false
);
Each article is added as an item of the document:
create table items
(
sale_unit_pvt_uuid uuid default uuid_generate_v1() not null
constraint units_pk
primary key,
sale_unit_pub_uuid uuid default uuid_generate_v1mc(),
t timestamp with time zone default CURRENT_TIMESTAMP(6),
archived boolean default false,
sale_doc_pvt_uuid uuid not null,
prod_article_pvt_uuid uuid not null,
quantity numeric(16,6) not null
);
These are the available products in the warehouse:
create table products
(
whse_prod_pvt_uuid uuid default uuid_generate_v1() not null,
whse_prod_pub_uuid uuid default uuid_generate_v1mc(),
t_created timestamp with time zone default CURRENT_TIMESTAMP(6),
prod_article_pvt_uuid uuid not null,
quantity numeric(16,6) default 1,
global_ctry_id integer
);
I am trying to query this like so:
SELECT
(sum(products.quantity) - COALESCE(_items.quantity, 0)) as quantity
,articles.prod_article_pub_id as article
FROM products
LEFT JOIN articles ON products.prod_article_pvt_uuid = articles.prod_article_pvt_uuid
LEFT JOIN (
SELECT DISTINCT ON (items.prod_article_pvt_uuid)
items.prod_article_pvt_uuid
,items.sale_doc_pvt_uuid
,sum(items.quantity) as quantity
FROM items
LEFT JOIN documents ON items.sale_doc_pvt_uuid = documents.sale_doc_pvt_uuid
LEFT JOIN types ON documents.sale_type_id = types.sale_type_id
WHERE
items.archived = false
AND documents.archived = false
AND types.draft = false
GROUP BY
items.prod_article_pvt_uuid
,items.sale_doc_pvt_uuid
) AS _items ON _items.prod_article_pvt_uuid = products.prod_article_pvt_uuid
GROUP BY
articles.prod_article_pub_id
,_items.quantity
ORDER BY
article ASC,
quantity DESC
It does give me the desired result, but I am wondering if this is the correct way to go about it, because it looks like it might struggle with millions of rows.
Basically, each time I need to display the quantity available in stock, I have to count over every single document ever generated.
Is this a good idea?
Will it be a performance bottleneck with millions of rows?
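One commonly suggested pattern for this kind of aggregation, assuming slightly stale figures are acceptable, is to cache the result in a materialized view and refresh it on a schedule or after bulk changes. A minimal sketch over the tables above (the view name is made up):
-- sketch only: cached on-hand stock per article
CREATE MATERIALIZED VIEW stock_on_hand AS
SELECT
    a.prod_article_pub_id AS article,
    COALESCE(SUM(p.quantity), 0)
  - COALESCE((
        SELECT SUM(i.quantity)
        FROM items i
        JOIN documents d ON d.sale_doc_pvt_uuid = i.sale_doc_pvt_uuid
        JOIN types t     ON t.sale_type_id      = d.sale_type_id
        WHERE i.prod_article_pvt_uuid = a.prod_article_pvt_uuid
          AND i.archived = false
          AND d.archived = false
          AND t.draft    = false
    ), 0) AS quantity
FROM articles a
LEFT JOIN products p ON p.prod_article_pvt_uuid = a.prod_article_pvt_uuid
GROUP BY a.prod_article_pvt_uuid, a.prod_article_pub_id;

-- refresh whenever fresher numbers are needed (or from a cron job)
REFRESH MATERIALIZED VIEW stock_on_hand;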

Related

Large SQL request optimization for face Euclidean distance calculations

I am calculating the Euclidean distance between faces and want to store the results in a table.
Current setup:
Each face is stored in the objects table and the distances between faces are stored in the faces_distances table.
The objects table has the following columns: objects_id, face_encodings, description.
The faces_distances table has the following columns: face_from, face_to, distance.
In my data set I have around 22,231 face objects, which result in 494,217,361 pairs of faces, although I understand it could be divided by 2 because
distance(face_from, face_to) = distance(face_to, face_from)
The database is Postgres 12.
The query below inserts the pairs of faces that have not been calculated yet (without performing the distance calculation itself), but the execution time is very long (it started 4 days ago and is still not done). Is there a way to optimize it?
-- public.objects definition
-- Drop table
-- DROP TABLE public.objects;
CREATE TABLE public.objects
(
objects_id int4 NOT NULL DEFAULT
nextval('objects_in_image_objects_id_seq'::regclass),
filefullname varchar(2303) NULL,
bbox varchar(255) NULL,
description varchar(255) NULL,
confidence numeric NULL,
analyzer varchar(255) NOT NULL DEFAULT 'object_detector'::character
varying,
analyzer_version int4 NOT NULL DEFAULT 100,
x int4 NULL,
y int4 NULL,
w int4 NULL,
h int4 NULL,
image_id int4 NULL,
derived_from_object int4 NULL,
object_image_filename varchar(2023) NULL,
face_encodings _float8 NULL,
face_id int4 NULL,
face_id_iteration int4 NULL,
text_found varchar NULL COLLATE "C.UTF-8",
CONSTRAINT objects_in_image_pkey PRIMARY KEY (objects_id),
CONSTRAINT objects_in_images FOREIGN KEY (objects_id) REFERENCES
public.objects(objects_id)
);
CREATE TABLE public.face_distances
(
face_from int8 NOT NULL,
face_to int8 NOT NULL,
distance float8 NULL,
CONSTRAINT face_distances_pk PRIMARY KEY (face_from, face_to)
);
-- public.face_distances foreign keys
ALTER TABLE public.face_distances ADD CONSTRAINT face_distances_fk
FOREIGN KEY (face_from) REFERENCES public.objects(objects_id);
ALTER TABLE public.face_distances ADD CONSTRAINT face_distances_fk_1
FOREIGN KEY (face_to) REFERENCES public.objects(objects_id);
Indexes
CREATE UNIQUE INDEX objects_in_image_pkey ON public.objects USING btree (objects_id);
CREATE INDEX objects_description_column ON public.objects USING btree (description);
CREATE UNIQUE INDEX face_distances_pk ON public.face_distances USING btree (face_from, face_to);
Query to add all pairs of faces that are not already in the table:
insert into face_distances (face_from,face_to)
select t1.face_from , t1.face_to
from (
select f_from.objects_id face_from,
f_from.face_encodings face_from_encodings,
f_to.objects_id face_to,
f_to.face_encodings face_to_encodings
from objects f_from,
objects f_to
where f_from.description = 'face'
and f_to.description = 'face' ) as t1
left join face_distances on (
t1.face_from= face_distances.face_from
and t1.face_to = face_distances.face_to )
where face_distances.face_from is null;
Try this simplified query.
It took only 5 minutes on my Apple M1 (SQL Server) with 22,231 'face' objects and generated 247,097,565 pairs, which is exactly C(22231, 2). The syntax is compatible with PostgreSQL.
Optimizations: explicit JOIN syntax instead of the old comma-style join, and ranking functions to remove duplicate permutations ((A,B) = (B,A)).
Removed the last [left join face_distances]: recomputing into an empty table is a lot faster than checking for existence, since an index key lookup would be issued for each key pair.
insert into face_distances (face_from,face_to)
select f1,f2
from(
select --only needed fields here as this will fill temporary tables
f1.objects_id f1
,f2.objects_id f2
,dense_rank()over(order by f1.objects_id) rank1
,rank()over(partition by f2.objects_id order by f1.objects_id) rank2
from objects f1
-- generates all permutations
join objects f2 on f2.objects_id <> f1.objects_id and f2.description = 'face'
where f1.description = 'face'
)a
where rank2 >= rank1 --removes duplicate permutations
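For comparison, a sketch of an arguably simpler way to keep just one ordering of each pair, which skips the window functions entirely (assuming an inequality on objects_id is an acceptable tiebreaker):
insert into face_distances (face_from, face_to)
select f1.objects_id, f2.objects_id
from objects f1
join objects f2
  on f2.objects_id > f1.objects_id   -- keeps (A,B), drops the mirror (B,A)
 and f2.description = 'face'
where f1.description = 'face';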

How to insert a new row into a table that has a unique auto-increment id?

After creating the table with a unique auto-increment ID, I realized my table is missing a row. But I don't know how to insert it without compromising the order of the other rows in the table!
TABLE flights
id INTEGER PRIMARY KEY AUTOINCREMENT,
origin TEXT NOT NULL,
destination TEXT NOT NULL,
duration INTEGER NOT NULL
I want to insert a row: 2|Shanghai|Paris|760 into the table with id = 2.
1|New York|London|415
2|Istanbul|Tokyo|700
3|New York|Paris|435
4|Moscow|Paris|245
5|Lima|New York|455
The table I want:
1|New York|London|415
2|Shanghai|Paris|760
3|Istanbul|Tokyo|700
4|New York|Paris|435
5|Moscow|Paris|245
6|Lima|New York|455
Thanks for any advice!
There is no way to do this with an auto-increment ID, because IDs are not meant to order rows; they identify a row and guarantee it is the only row with that ID. If you want an explicit ordering, add a dedicated column for that purpose; that way the IDs stay the same and you can sort on the new column.
CREATE TABLE flights (
id INTEGER AUTO_INCREMENT,
`index` INTEGER NOT NULL,
origin TEXT NOT NULL,
destination TEXT NOT NULL,
duration INTEGER NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY unique_index (`index`)
);
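If the existing rows should also move to make room for the new one, a rough sketch (MySQL syntax, assuming the `index` column above has been backfilled with the current ordering) could be:
-- bump existing positions from 2 onward, highest first, so the unique key is never violated mid-update
UPDATE flights SET `index` = `index` + 1 WHERE `index` >= 2 ORDER BY `index` DESC;

-- insert the new row at position 2; id keeps auto-incrementing independently
INSERT INTO flights (`index`, origin, destination, duration)
VALUES (2, 'Shanghai', 'Paris', 760);

-- read the table back in the intended order
SELECT * FROM flights ORDER BY `index`;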

How can I fix "operator does not exist: text = uuid" when using Haskell's postgresql-simple library to perform a multi-row insert?

I am using the postgresql-simple library to insert into the eligible_class_passes table, which is essentially a join table representing a many-to-many relationship.
I am using the executeMany function from postgresql-simple to do a multi-row insert.
updateEligibleClassPasses :: Connection -> Text -> Text -> [Text] -> IO Int64
updateEligibleClassPasses conn tenantId classTypeId classPassTypeIds =
withTransaction conn $ do
executeMany
conn
[sql|
INSERT INTO eligible_class_passes (class_type_id, class_pass_type_id)
SELECT upd.class_type_id::uuid, upd.class_pass_type_id::uuid
FROM (VALUES (?, ?, ?)) as upd(class_type_id, class_pass_type_id, tenant_id)
INNER JOIN class_types AS ct
ON upd.class_type_id::uuid = ct.id
INNER JOIN subscription_types AS st
ON class_pass_type_id::uuid = st.id
WHERE ct.tenant_id = upd.tenant_id::uuid AND st.tenant_id = upd.tenant_id::uuid
|]
params
where
addParams classPassTypeId = (classTypeId, classPassTypeId, tenantId)
params = addParams <$> classPassTypeIds
When this function is executed with the correct parameters applied, I get the following runtime error:
SqlError {sqlState = "42883", sqlExecStatus = FatalError, sqlErrorMsg = "operator does not exist: text = uuid", sqlErrorDetail = "", sqlErrorHint = "No operator matches the given name and argument type(s). You might need to add explicit type casts."}
However, when translated to SQL without the parameter substitutions (?), the query works correctly when executed in psql:
INSERT INTO eligible_class_passes (class_type_id, class_pass_type_id)
SELECT upd.class_type_id::uuid, upd.class_pass_type_id::uuid
FROM (VALUES ('863cb5ea-7a68-41d5-ab9f-5344605de500', 'e9195660-fd48-4fa2-9847-65a0ad323bd5', '597e6d7a-092a-49be-a2ea-11e8d85d8f82')) as upd(class_type_id, class_pass_type_id, tenant_id)
INNER JOIN class_types AS ct
ON upd.class_type_id::uuid = ct.id
INNER JOIN subscription_types AS st
ON class_pass_type_id::uuid = st.id
WHERE ct.tenant_id = upd.tenant_id::uuid AND st.tenant_id = upd.tenant_id::uuid;
My schema is as follows
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE tenants (
id UUID NOT NULL DEFAULT uuid_generate_v4() PRIMARY KEY,
name text NOT NULL UNIQUE,
email text NOT NULL UNIQUE,
created_at timestamp with time zone NOT NULL default now(),
updated_at timestamp with time zone NOT NULL default now()
);
CREATE TABLE class_types (
id UUID NOT NULL DEFAULT uuid_generate_v4() PRIMARY KEY,
tenant_id UUID NOT NULL,
created_at timestamp with time zone NOT NULL default now(),
updated_at timestamp with time zone NOT NULL default now(),
FOREIGN KEY (tenant_id) REFERENCES tenants (id)
);
CREATE TABLE class_pass_types (
id UUID NOT NULL DEFAULT uuid_generate_v4() PRIMARY KEY,
name TEXT NOT NULL,
tenant_id UUID NOT NULL,
price Int NOT NULL,
created_at timestamp with time zone NOT NULL default now(),
updated_at timestamp with time zone NOT NULL default now(),
FOREIGN KEY (tenant_id) REFERENCES tenants (id)
);
-- Many to many join through table.
-- Expresses class pass type redeemability against class types.
CREATE TABLE eligible_class_passes (
class_type_id UUID,
class_pass_type_id UUID,
created_at timestamp with time zone NOT NULL default now(),
updated_at timestamp with time zone NOT NULL default now(),
FOREIGN KEY (class_type_id) REFERENCES class_types (id) ON DELETE CASCADE,
FOREIGN KEY (class_pass_type_id) REFERENCES class_pass_types (id) ON DELETE CASCADE,
PRIMARY KEY (
class_type_id, class_pass_type_id
)
);
To help debug your issue, use the formatQuery function; then you can see exactly what final query postgresql-simple is sending to the server.
Also, I'd recommend using the UUID type from the uuid-types package instead of Text for the UUIDs. Using Text most likely hides some issues from you (which you'll hopefully see by using formatQuery).

What to join on

I have a table which associates an id to a latitude and a longitude.
For every id in that table, I'm trying to find the closest ids and store them in another table with the travel time, either if the route doesn't already exist or if the travel time is shorter (a route exists if there is an entry in transfers).
I'm currently using:
6371 * SQRT(POW( RADIANS(stop_lon - %lon) * COS(RADIANS(stop_lat + %lat)/2), 2) + POW(RADIANS(stop_lat - %lat), 2)) AS distance
To find this distance.
It does work pretty well, however I don't know what to join on (for the self join).
How should I do this?
Here is the SHOW CREATE TABLE output for the tables that are relevant here:
CREATE TABLE `stops` (
`stop_id` int(10) NOT NULL,
`stop_name` varchar(100) NOT NULL,
`stop_desc` text,
`stop_lat` decimal(20,16) DEFAULT NULL,
`stop_lon` decimal(20,16) DEFAULT NULL,
PRIMARY KEY (`stop_id`),
FULLTEXT KEY `stop_name` (`stop_name`),
FULLTEXT KEY `stop_desc` (`stop_desc`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1
CREATE TABLE `transfers` (
`transfer_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`from_stop_id` int(10) NOT NULL,
`to_stop_id` int(10) NOT NULL,
`transfer_time` int(10) NOT NULL,
PRIMARY KEY (`transfer_id`),
UNIQUE KEY `transfer_id` (`transfer_id`),
KEY `to_stop_id` (`to_stop_id`),
KEY `from_stop_id` (`from_stop_id`)
) ENGINE=InnoDB AUTO_INCREMENT=81810 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1
Perhaps:
FROM transfers AS a
JOIN transfers AS b ON b.from_stop_id = a.to_stop_id
So there is to be a third table? And it does not parallel either of the existing ones? Let me see if I have the right model: stops is like airports; transfers is like waiting in an airport for the next leg of a flight. But transfers does not include a stop_id of its own, which is confusing. And the third table would hold the flight time/distance between stops?
Or maybe a transfer is just a flight from one airport to another? And there is no delay while waiting for the next leg?
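If the goal is just pairwise distances between stops, one reading of the question (an assumption on my part) is a self join on stops rather than on transfers, reusing the distance expression from the question:
SELECT
    s1.stop_id AS from_stop_id,
    s2.stop_id AS to_stop_id,
    6371 * SQRT(POW(RADIANS(s2.stop_lon - s1.stop_lon) * COS(RADIANS(s2.stop_lat + s1.stop_lat) / 2), 2)
              + POW(RADIANS(s2.stop_lat - s1.stop_lat), 2)) AS distance
FROM stops AS s1
JOIN stops AS s2 ON s2.stop_id <> s1.stop_id   -- every ordered pair of distinct stops
ORDER BY s1.stop_id, distance;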
Other notes:
PRIMARY KEY (`transfer_id`),
UNIQUE KEY `transfer_id` (`transfer_id`),
Since a PRIMARY KEY is a UNIQUE KEY, the latter is redundant (and wasteful); DROP it.
decimal(20,16) is overkill.
Datatype Bytes resolution
------------------ ----- --------------------------------
DECIMAL(6,4)/(7,4) 7 16 m 52 ft Houses/Businesses
DECIMAL(8,6)/(9,6) 9 16cm 1/2 ft Friends in a mall
DECIMAL(20,16) 20 microscopic
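So, assuming roughly 16 cm resolution is more than enough for stops, the columns could be tightened to something like this (MySQL syntax):
ALTER TABLE stops
    MODIFY stop_lat DECIMAL(8,6) DEFAULT NULL,   -- latitude: -90..90 with 6 decimals
    MODIFY stop_lon DECIMAL(9,6) DEFAULT NULL;   -- longitude: one more integer digit for -180..180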

SQL JOIN To Find Records That Don't Have a Matching Record With a Specific Value

I'm trying to speed up some code that I wrote years ago for my employer's purchase authorization app. Basically I have a SLOW subquery that I'd like to replace with a JOIN (if it's faster).
When the director logs into the application he sees a list of purchase requests he has yet to authorize or deny. That list is generated with the following query:
SELECT * FROM SA_ORDER WHERE ORDER_ID NOT IN
(SELECT ORDER_ID FROM SA_SIGNATURES WHERE TYPE = 'administrative director');
There are only about 900 records in sa_order and 1800 records in sa_signature and this query still takes about 5 seconds to execute. I've tried using a LEFT JOIN to retrieve records I need, but I've only been able to get sa_order records with NO matching records in sa_signature, and I need sa_order records with "no matching records with a type of 'administrative director'". Your help is greatly appreciated!
The tables involved have the following layout:
CREATE TABLE sa_order
(
`order_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_number` BIGINT NOT NULL,
`submit_date` DATE NOT NULL,
`vendor_id` BIGINT NOT NULL,
`DENIED` BOOLEAN NOT NULL DEFAULT FALSE,
`MEMO` MEDIUMTEXT,
`year_id` BIGINT NOT NULL,
`advisor` VARCHAR(255) NOT NULL,
`deleted` BOOLEAN NOT NULL DEFAULT FALSE
);
CREATE TABLE sa_signature
(
`signature_id` BIGINT PRIMARY KEY AUTO_INCREMENT,
`order_id` BIGINT NOT NULL,
`signature` VARCHAR(255) NOT NULL,
`proxy` BOOLEAN NOT NULL DEFAULT FALSE,
`timestamp` TIMESTAMP NOT NULL DEFAULT NOW(),
`username` VARCHAR(255) NOT NULL,
`type` VARCHAR(255) NOT NULL
);
Create an index on sa_signatures (type, order_id).
It is not necessary to convert the query into a LEFT JOIN unless sa_signatures allows nulls in order_id. With the index, the NOT IN will perform just as well. However, just in case you're curious:
SELECT o.*
FROM sa_order o
LEFT JOIN
sa_signatures s
ON s.order_id = o.order_id
AND s.type = 'administrative director'
WHERE s.type IS NULL
You should pick a NOT NULL column from sa_signatures for the WHERE clause to perform well.
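For reference, the index suggested above would look something like this (the DDL in the question names the table sa_signature while the queries use sa_signatures; use whichever is the real name):
CREATE INDEX idx_signatures_type_order ON sa_signatures (type, order_id);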
You could replace the NOT IN operator with NOT EXISTS for faster performance.
So you'll have:
SELECT * FROM SA_ORDER WHERE NOT EXISTS
(SELECT ORDER_ID FROM SA_SIGNATURES
WHERE TYPE = 'administrative director'
AND ORDER_ID = SA_ORDER.ORDER_ID);
Reason: "When using “NOT IN”, the query performs nested full table scans, whereas for “NOT EXISTS”, the query can use an index within the sub-query."
Source : http://decipherinfosys.wordpress.com/2007/01/21/32/
The following query should work; however, I suspect your real issue is that you don't have the proper indexes in place. You should have an index on the SA_SIGNATURES table on the ORDER_ID column.
SELECT *
FROM
SA_ORDER
LEFT JOIN
SA_SIGNATURES
ON
SA_ORDER.ORDER_ID = SA_SIGNATURES.ORDER_ID AND
TYPE = 'administrative director'
WHERE
SA_SIGNATURES.ORDER_ID IS NULL;
select * from sa_order as o inner join sa_signature as s on o.order_id = s.order_id and s.type = 'administrative director'
Also, you can create a non-clustered index on type in the sa_signature table.
Even better: have a master table for types with a type id and a type name, and then instead of saving the type as text in your sa_signature table, simply save the type as an integer. That's because computing on integers is much faster than computing on text.
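A rough sketch of that lookup-table idea (table and column names here are hypothetical):
CREATE TABLE sa_signature_type
(
    `type_id`   INT PRIMARY KEY AUTO_INCREMENT,
    `type_name` VARCHAR(255) NOT NULL UNIQUE
);

-- sa_signature would then carry the integer key instead of the text
-- (left nullable here so existing rows can be backfilled first)
ALTER TABLE sa_signature
    ADD COLUMN `type_id` INT,
    ADD CONSTRAINT `fk_sa_signature_type`
        FOREIGN KEY (`type_id`) REFERENCES sa_signature_type (`type_id`);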