How to define a many-to-many relationship in Prisma/PostgreSQL

I'm struggling a bunch with some relatively simple Postgres that I was originally trying to define in Prisma, but wasn't having much luck. It's a many-to-many relationship joined on a single field between two tables, like so:
CREATE TABLE scheduleDate (
schedule_day_number int NOT NULL,
schedule_date date NOT NULL,
CONSTRAINT scheduledate_pkey PRIMARY KEY (schedule_date, schedule_day_number)
);
CREATE TABLE schoolDayBlock (
start_time varchar(10) NOT NULL,
end_time varchar(10) NOT NULL,
school_block_day_number int NOT NULL,
block_name varchar(10) NOT NULL,
CONSTRAINT schooldayblock_pkey PRIMARY KEY (start_time, end_time, school_block_day_number, block_name)
);
I eventually want to be able to get all the possible combinations between these two tables, with the day number fields being the link. Ideally, later down the line, I can query Prisma for a block name and get all of the dates and times where that block name shows up. Does anyone know how I could either:
Write this up in a Prisma schema, or
Write this up in Postgres SQL and use introspection to generate a Prisma schema that accomplishes this?

Here's the Prisma schema equivalent of your SQL statements for creating the two tables.
// This is your Prisma schema file,
// learn more about it in the docs: https://pris.ly/d/prisma-schema
generator client {
provider = "prisma-client-js"
}
datasource db {
provider = "postgresql"
url = env("DATABASE_URL")
}
model scheduleDate {
schedule_day_number Int
schedule_date DateTime @db.Date
@@id([schedule_date, schedule_day_number], map: "scheduledate_pkey")
}
model schoolDayBlock {
start_time String @db.VarChar(10)
end_time String @db.VarChar(10)
school_block_day_number Int
block_name String @db.VarChar(10)
@@id([start_time, end_time, school_block_day_number, block_name], map: "schooldayblock_pkey")
}
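The join itself is plain SQL on the two day-number columns. Here is a sketch of "all combinations, filtered by block name" ('A' is just a placeholder value); since Prisma relations need a unique field on at least one side, a query like this would go through prisma.$queryRaw or a database view:
SELECT sd.schedule_date, sdb.start_time, sdb.end_time, sdb.block_name
FROM scheduleDate sd
JOIN schoolDayBlock sdb ON sdb.school_block_day_number = sd.schedule_day_number
WHERE sdb.block_name = 'A'; -- placeholder block name
For option 2, running the DDL against the database and then introspecting with npx prisma db pull should produce models equivalent to the ones above.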

Related

SQL Table with mixed data type field Best Practice

Hi everyone,
I would like advice on best practice for creating a relational database structure with a field that has a mixed data type.
I have 'datasets' (some business objects), and I would like to have a list of parameters associated with each dataset. Those parameters can have different types: strings, integers, floats and JSON values.
What would be the best structure for the parameters table? Should I have a single column with a string type?
CREATE TABLE param_desc (
id serial PRIMARY KEY,
name varchar NOT NULL,
param_type int -- varchar, int, real, json
);
CREATE TABLE param_value (
id serial PRIMARY KEY,
dataset_id int NOT NULL,
param int NOT NULL REFERENCES param_desc (id),
value varchar NOT NULL,
CONSTRAINT _param_object_id_param_name_id_time_from_key UNIQUE (dataset_id, param)
);
The problem with this approach is that I can't easily cast value for additional conditions. For example, I want to get all datasets with a specific integer parameter whose value is more than 10. But if I write a WHERE clause like the one below, the cast raises an error, because the non-integer parameters can't be cast.
SELECT dataset_id FROM vw_param_current WHERE name = 'priority' AND value::int > 5
Or should I have 4 separate columns, with 3 of them being NULL for every row?
Or should I have 4 different tables?
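For what it's worth, the usual workaround for the cast error in the single-varchar-column design is to guard the cast with a CASE keyed on the parameter type; a plain regex test AND-ed in isn't guaranteed to run first, since Postgres may reorder predicates, but CASE only evaluates the branch it needs. A sketch, assuming vw_param_current also exposes param_desc.param_type and that 2 happens to be the integer type code:
SELECT dataset_id
FROM vw_param_current
WHERE name = 'priority'
AND CASE WHEN param_type = 2 -- assumed: the int type code
THEN value::int > 5
ELSE false END;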

EF6 throws concurrency exception when trying to add new entry with identity column

I'm generating entity models from my database with EF6. I created two test tables. One table has an Identity column, and the other table doesn't. Here are the tables:
CREATE TABLE [dbo].[TestNoIdentity]
(
[ID] INT NOT NULL,
[DTStamp] DATETIME NOT NULL,
[Note] VARCHAR(255) NULL,
PRIMARY KEY CLUSTERED ([ID] ASC, [DTStamp] ASC)
);
CREATE TABLE [dbo].[TestIdentity]
(
[ID] INT IDENTITY (1, 1) NOT NULL,
[DTStamp] DATETIME NOT NULL,
[Note] VARCHAR(255) NULL,
PRIMARY KEY CLUSTERED ([ID] ASC, [DTStamp] ASC)
);
Test code:
using (TestEntities entities = new TestEntities())
{
// This works
var entry1 = new TestNoIdentity();
entry1.ID = 1;
entry1.DTStamp = DateTime.Now;
entry1.Note = "No Identity";
entities.TestNoIdentity.Add(entry1);
entities.SaveChanges();
// This doesn't work
var entry2 = new TestIdentity();
entry2.DTStamp = DateTime.Now;
entities.TestIdentity.Add(entry2);
entities.SaveChanges(); //optimistic concurrency exception
// This query works
// entities.Database.ExecuteSqlCommand("INSERT INTO [dbo].[TestIdentity] ([DTStamp]) VALUES ('1/1/2021 12:00:00 PM')");
return entry2.ID.ToString();
}
Why is it throwing a concurrency exception? There are no other users or duplicated instances of the entity.
The message from the exception:
Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded.
The failing part is the fetch-back: without IDENTITY, EF doesn't have to read the generated key back after the INSERT. With IDENTITY it does, and your PK also contains a DATETIME column. DATETIME is only precise to roughly 3 ms, so the value SQL Server stores can differ from the value .NET generated, the re-select matches zero rows, and EF reports that as a concurrency violation. Change the column to DATETIME2, which matches the precision of .NET's DateTime, or truncate your .NET DateTime (e.g. to the nearest second) before assigning it.
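A sketch of the schema-side fix: because DTStamp is part of the primary key, the constraint has to be dropped and re-created around the type change. The constraint name below is assumed; the original DDL lets SQL Server auto-generate one, so look yours up first.
ALTER TABLE [dbo].[TestIdentity] DROP CONSTRAINT [PK_TestIdentity]; -- name assumed
ALTER TABLE [dbo].[TestIdentity] ALTER COLUMN [DTStamp] DATETIME2(7) NOT NULL;
ALTER TABLE [dbo].[TestIdentity] ADD CONSTRAINT [PK_TestIdentity]
PRIMARY KEY CLUSTERED ([ID] ASC, [DTStamp] ASC);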

Suggested indexing for a table with 50 million rows queried using its CREATED_DATE and USER_TYPE columns

Table Users:
ID PK INT
USER_TYPE VARCHAR(50) NOT NULL
CREATED_DATE DATETIME2(7) NOT NULL
I have this table with 50 million rows, and it is queried using the following WHERE clause:
WHERE
u.USER_TYPE = 'manager'
AND u.CREATED_DATE >= @StartDate
AND u.CREATED_DATE < @EndDate
What would be a good starting point for an index on this table to optimize for the above query where clause?
For that query, the index you want is a composite index with two columns: (user_type, created_date). The order matters; you want user_type first because of the equality comparison.
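In T-SQL that's a one-liner (index name assumed):
CREATE NONCLUSTERED INDEX ix_users_user_type_created_date
ON Users (USER_TYPE, CREATED_DATE);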
You'll also be well served by creating a user-type lookup table with an arbitrary INT ID and referring to the manager type by ID, instead of storing the type text directly in the users table. This narrows the table data as well as any index referring to the user type.
CREATE TABLE user_type (
id INT NOT NULL IDENTITY(1,1),
description NVARCHAR(128) NOT NULL,
CONSTRAINT pk_user_type PRIMARY KEY CLUSTERED(id)
);
CREATE TABLE users (
id INT NOT NULL IDENTITY(1,1),
user_type_id INT NOT NULL,
created_date DATETIME2(7) NOT NULL,
CONSTRAINT pk_users PRIMARY KEY CLUSTERED(id),
CONSTRAINT fk_users_user_type FOREIGN KEY(user_type_id) REFERENCES user_type(id)
);
CREATE NONCLUSTERED INDEX
ix_users_type_created
ON
users (
user_type_id,
created_date
);
You would be querying using the user_type ID rather than directly with the text of course.
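For example, against the narrowed schema above (a sketch; variable names assumed):
SELECT u.id, u.created_date
FROM users u
JOIN user_type t ON t.id = u.user_type_id
WHERE t.description = N'manager'
AND u.created_date >= @StartDate
AND u.created_date < @EndDate;
In practice you'd look the type id up once, or cache it in the application, rather than join on the description text every time.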
For any query, run it in SSMS with "Include Actual Execution Plan" turned on. SSMS will suggest an index if it thinks a suitable one doesn't exist.

How to find the columns that need to be indexed?

I'm starting to learn SQL and relational databases. Below is the table that I have, and it has around 10 million records. My composite key is (reltype, from_product_id, to_product_id).
What strategy should I follow when selecting the columns that need to be indexed? I have also documented the operations performed on the table below. Please help me determine which columns, or combinations of columns, need to be indexed.
Table DDL is shown below.
Table name: prod_rel.
Database schema name: public
CREATE TABLE public.prod_rel (
reltype varchar NULL,
assocsequence float4 NULL,
action varchar NULL,
from_product_id varchar NOT NULL,
to_product_id varchar NOT NULL,
status varchar NULL,
starttime varchar NULL,
endtime varchar NULL,
PRIMARY KEY (reltype, from_product_id, to_product_id)
);
Operations performed on table:
select distinct(reltype )
from public.prod_rel;
update public.prod_rel
set status = ? , starttime = ?
where from_product_id = ?;
update public.prod_rel
set status = ? , endtime = ?
where from_product_id = ?;
select *
from public.prod_rel
where from_product_id in (select distinct (from_product_id)
from public.prod_rel
where status = ?
and action in ('A', 'E', 'C', 'P')
and reltype = ?
fetch first 1000 rows only);
Note: I'm not performing any JOIN operations. Also, please ignore the casing of table and column names; I'm just getting started.
Ideal would be two indexes:
CREATE INDEX ON prod_rel (from_product_id);
CREATE INDEX ON prod_rel (status, reltype)
WHERE action IN ('A', 'E', 'C', 'P');
Your primary key (which is also implemented with an index) cannot support queries 2 and 3, because from_product_id is not at the beginning. If you redefine the primary key as (from_product_id, to_product_id, reltype), you don't need the first index I suggested.
Why does order matter? Imagine you are looking for a book in a library where the books are ordered by “last name, first name”. You can use this ordering to find all books by “Dickens” quickly, but not all books by any “Charles”.
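Redefining the key is a short ALTER, assuming Postgres used the default constraint name prod_rel_pkey:
ALTER TABLE public.prod_rel
DROP CONSTRAINT prod_rel_pkey, -- default name, check yours
ADD PRIMARY KEY (from_product_id, to_product_id, reltype);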
But let me also comment on your queries.
The first one will perform badly if there are lots of different reltype values; try raising work_mem in that case. It is always a sequential scan of the whole table, and no index can help.
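If you hit that case, work_mem can be raised for just the session running the query; the value here is only illustrative:
SET work_mem = '256MB';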
I have changed the order of the primary key columns as shown below, as per @a_horse_with_no_name's suggestion, and created only one index on the (from_product_id, reltype, status, action) columns.
CREATE TABLE public.prod_rel (
reltype varchar NULL,
assocsequence float4 NULL,
action varchar NULL,
from_product_id varchar NOT NULL,
to_product_id varchar NOT NULL,
status varchar NULL,
starttime varchar NULL,
endtime varchar NULL,
PRIMARY KEY (from_product_id, to_product_id, reltype)
);
Also, I have gone through the portal suggested by @a_horse_with_no_name. It was amazing; I learned a lot of new things about indexing.
https://use-the-index-luke.com/

Creating a table specifically for tracking change information to remove duplicated columns from tables

When creating tables, I have generally created them with a couple extra columns that track change times and the corresponding user:
CREATE TABLE dbo.Object
(
ObjectId int NOT NULL IDENTITY (1, 1),
ObjectName varchar(50) NULL ,
CreateTime datetime NOT NULL,
CreateUserId int NOT NULL,
ModifyTime datetime NULL ,
ModifyUserId int NULL
) ON [PRIMARY]
GO
I have a new project now where, if I continued with this structure, I would have 6 additional columns on each table for this type of change tracking: a time column, a user id column and a geography column, for both create and modify. I'm now thinking that adding 6 columns to every table I want to track doesn't make sense. What I'm wondering is whether the following structure would make more sense:
CREATE TABLE dbo.Object
(
ObjectId int NOT NULL IDENTITY (1, 1),
ObjectName varchar(50) NULL ,
CreateChangeId int NOT NULL,
ModifyChangeId int NULL
) ON [PRIMARY]
GO
-- foreign key relationships on CreateChangeId & ModifyChangeId
CREATE TABLE dbo.Change
(
ChangeId int NOT NULL IDENTITY (1, 1),
ChangeTime datetime NOT NULL,
ChangeUserId int NOT NULL,
ChangeCoordinates geography NULL
) ON [PRIMARY]
GO
Can anyone offer some insight into this minor database design problem, such as common practices and functional designs?
Where I work, we use the same construct as yours - every table has the following fields:
CreatedBy (int, not null, FK users table - user id)
CreationDate (datetime, not null)
ChangedBy (int, null, FK users table - user id)
ChangeDate (datetime, null)
Pro: easy to track and maintain; only one I/O operation (I'll come back to that later)
Con: I can't think of any at the moment (well, ok, sometimes we don't use the change fields ;-)
IMO the approach with the extra table has the problem that every record somehow also has to reference the table it belongs to (unless you only need the one direction, Object to Change table). The approach also leads to more database I/O operations - for every insert or modify you will need to:
add entry to Table Object
add entry to Tracking Table and get the new Id
update Object Table entry with the Tracking Table Id
It would certainly make the application code that communicates with the DB a bit more complicated and error-prone.
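A sketch of that round trip in T-SQL, against the dbo.Change and dbo.Object tables above (values are placeholders; note that because CreateChangeId is NOT NULL, the Change row actually has to be inserted first):
DECLARE @UserId int = 42, @ObjectName varchar(50) = 'example';
DECLARE @ChangeId int;
-- write the tracking row and capture its identity
INSERT INTO dbo.Change (ChangeTime, ChangeUserId, ChangeCoordinates)
VALUES (GETDATE(), @UserId, NULL);
SET @ChangeId = SCOPE_IDENTITY();
-- then write the object row pointing at it
INSERT INTO dbo.Object (ObjectName, CreateChangeId)
VALUES (@ObjectName, @ChangeId);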