Select from many to many table query - sql

I have some tables:
Sessions
SessionID int PK
Created datetime
SiteId int FK
Tracking_Parameters
ParamID int PK
ParamName nvarchar
Session_Custom_Tracking
SessionID int FK
ParamID int FK
ParamValue nvarchar
Site_Custom_Parameters
SiteID int FK
ParamID int FK
ParamKey nvarchar
Sessions: Contains the unique session id for a visitor and the time they entered the site.
Tracking_Parameters: Contains a list of things that you may want to track on a site (e.g. Email Open, Email Click, Article Viewed, etc.)
Site_Custom_Parameters: For a particular site (table not shown), declares the key for a Tracking_Parameter (i.e. the key to look for in a query string or route)
Session_Custom_Tracking: The link between a session and a tracking parameter and also contains the value for the parameter's key when it was found by my application.
Question:
I want to select the session IDs of sessions that have a record in Session_Custom_Tracking for each of two different ParamIDs. Specifically, I want to find sessions where a user both opened an email (ParamID 1) and clicked a link in that email (ParamID 3).

You can join to the same table twice:
SELECT S.SessionID
FROM Sessions AS S
JOIN Session_Custom_Tracking AS SCT1
ON SCT1.SessionID = S.SessionID
AND SCT1.ParamID = 1
JOIN Session_Custom_Tracking AS SCT2
ON SCT2.SessionID = S.SessionID
AND SCT2.ParamID = 3
An alternative that might be easier to read (because it more closely matches the way you describe the problem) is to use WHERE EXISTS:
SELECT S.SessionID
FROM Sessions AS S
WHERE EXISTS
(
SELECT *
FROM Session_Custom_Tracking AS SCT1
WHERE SCT1.SessionID = S.SessionID
AND SCT1.ParamID = 1
)
AND EXISTS
(
SELECT *
FROM Session_Custom_Tracking AS SCT2
WHERE SCT2.SessionID = S.SessionID
AND SCT2.ParamID = 3
)
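A third way to phrase the same check is to group the link table by session and count the distinct parameters found. The sketch below is only an illustration: it runs the query from Python against an in-memory SQLite database (a stand-in for your actual engine), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Session_Custom_Tracking (
        SessionID  INTEGER,
        ParamID    INTEGER,
        ParamValue TEXT
    );
    -- Made-up sample rows: session 10 opened and clicked,
    -- session 11 only opened, session 12 clicked twice but never opened.
    INSERT INTO Session_Custom_Tracking VALUES
        (10, 1, 'open'), (10, 3, 'click'),
        (11, 1, 'open'),
        (12, 3, 'click'), (12, 3, 'click');
""")

rows = conn.execute("""
    SELECT SessionID
    FROM Session_Custom_Tracking
    WHERE ParamID IN (1, 3)
    GROUP BY SessionID
    HAVING COUNT(DISTINCT ParamID) = 2   -- both parameters must be present
    ORDER BY SessionID
""").fetchall()

both_sessions = [r[0] for r in rows]
print(both_sessions)
```

COUNT(DISTINCT ...) matters here: a session with two rows for the same ParamID must not count as having both. For the same reason, the double-join version can return a session more than once if it has several rows for one of the parameters, whereas the EXISTS and GROUP BY versions return each session once.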

Related

How to verify table update and migrate data from another table - postgresql

I have the following two tables in my Postgres database, with the column types shown.
user
userid | bigint (PK) NOT NULL
username | character varying(255)
businessname | character varying(255)
inbox
messageid | bigint (PK) NOT NULL
username | character varying(255)
businessname | character varying(255)
What I want to achieve is to add a new field called userRefId to the inbox table and copy the user table's userid into it wherever both username and businessname match between the two tables.
These are the queries i use to do that.
ALTER TABLE inbox ADD userRefId bigint;
UPDATE inbox
SET userRefId = u.userid
from "user" u
WHERE u.username = inbox.username
AND u.businessname = inbox.businessname;
Now I want to verify that the data has been migrated correctly. What approaches can I take to achieve this? (Note: the username on inbox can be null.)
Would this be good enough as verification?
The result of select count(*) from inbox where username is not null; being equal to the result of
select count(userRefId) from inbox;
Is the data transferred correctly? First, the update looks correct, so you don't really need to worry.
You can get all rows in inbox whose userRefId does not match any user:
select i.* -- or count(*)
from inbox i
where not exists (select 1
from "user" u
where i.userRefId = u.userid
);
This doesn't mean that the update didn't work, just that those rows in inbox have no matching user.
Under the circumstances of your code, this is equivalent to:
select i.*
from inbox i
where i.userRefId is null;
Although this would not pick up a userRefId set to a non-matching record (cosmic rays, anyone?).
You can also validate the additional fields used for matching:
select i.* -- or count(*)
from inbox i
where not exists (select 1
from "user" u
where i.userRefId = u.userid and
i.username = u.username and
i.businessname = u.businessname
);
However, all this checking seems unnecessary, unless you have triggers on the tables or known non-matched records.
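To make the verification concrete, here is a minimal sketch that runs the migration and two checks on made-up rows. It uses Python with an in-memory SQLite database as a stand-in for Postgres, and it writes the update as a correlated subquery rather than UPDATE ... FROM; the sample data and the variable names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (userid INTEGER PRIMARY KEY, username TEXT, businessname TEXT);
    CREATE TABLE inbox (messageid INTEGER PRIMARY KEY, username TEXT,
                        businessname TEXT, userRefId INTEGER);
    INSERT INTO "user" VALUES (1, 'ann', 'acme'), (2, 'bob', 'bolt');
    INSERT INTO inbox (messageid, username, businessname) VALUES
        (100, 'ann', 'acme'),   -- should map to user 1
        (101, 'bob', 'bolt'),   -- should map to user 2
        (102, NULL,  'acme'),   -- NULL username: no match expected
        (103, 'zed', 'zinc');   -- no such user: no match expected

    -- Same migration as in the question, written as a correlated subquery.
    UPDATE inbox
    SET userRefId = (SELECT u.userid FROM "user" u
                     WHERE u.username = inbox.username
                       AND u.businessname = inbox.businessname);
""")

# Check 1: rows that have a matching user but did not receive that user's id.
missed = conn.execute("""
    SELECT COUNT(*)
    FROM inbox i
    JOIN "user" u ON u.username = i.username
                 AND u.businessname = i.businessname
    WHERE i.userRefId IS NULL OR i.userRefId <> u.userid
""").fetchone()[0]

# Check 2: rows whose userRefId is set but points at a user with other names.
wrong = conn.execute("""
    SELECT COUNT(*)
    FROM inbox i
    WHERE i.userRefId IS NOT NULL
      AND NOT EXISTS (SELECT 1 FROM "user" u
                      WHERE u.userid = i.userRefId
                        AND u.username = i.username
                        AND u.businessname = i.businessname)
""").fetchone()[0]

# The count comparison proposed in the question.
named = conn.execute("SELECT COUNT(*) FROM inbox WHERE username IS NOT NULL").fetchone()[0]
migrated = conn.execute("SELECT COUNT(userRefId) FROM inbox").fetchone()[0]

print(missed, wrong, named, migrated)
```

Both checks come back as zero when the migration is correct. The count comparison from the question is weaker: in this sample it reports 3 versus 2 even though nothing went wrong, because message 103 has a username that simply matches no user.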

How do I filter data between multiple columns into new columns after INNER JOINS?

Brand new to SQL, so apologies that I don't really know how to word the question or find an existing answer. Let me explain further. I'm creating a Chat app for fun with a DM system. I have a table (dms_history) setup that has a row gen. for every new distinct chat between 2 users w/ the last DM being sent between the 2 users. eg:
CREATE TABLE users (
uid SERIAL PRIMARY KEY,
pid VARCHAR(40),
uname VARCHAR(50) NOT NULL UNIQUE,
email valid_email,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE dms (
dmid SERIAL PRIMARY KEY,
uid INT NOT NULL REFERENCES users(uid),
recip INT NOT NULL REFERENCES users(uid),
msg TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- TABLE I'M TALKING ABOUT
CREATE TABLE dms_history (
user1 INT REFERENCES users(uid),
user2 INT REFERENCES users(uid),
last_dm INT REFERENCES dms(dmid),
PRIMARY KEY(user1, user2),
CHECK (user1 < user2)
);
Say I'm trying to get all the data w/ joins on this table (dms_history) for client user uid#2. eg:
SELECT
u1.uid as u1id,
u1.uname as u1name,
u2.uid as u2id,
u2.uname as u2name,
dms.dmid,
dms.msg
FROM dms_history h
INNER JOIN users u1 ON h.user1 = u1.uid
INNER JOIN users u2 ON h.user2 = u2.uid
INNER JOIN dms ON h.last_dm = dms.dmid
WHERE u1.uid = 2 OR u2.uid = 2;
That query is close to what I want, but what I really want is to show only the user opposite the client user being queried (so NOT user uid#2, alice), along with the last message sent. How do I filter or UNION those columns into a new column?
Think of Twitter DMs and how it shows the users you're DMing and a snippet of the last message sent.
Apologies if I did a poor job explaining, it's not my strong suit.
After doing some more reading the past few hours I found I could use subqueries with SELECTs and UNIONs to achieve what I was after. Not sure how efficient it is, but it works.
WITH chats AS (SELECT * FROM dms_history h WHERE h.user1 = 2 OR h.user2 = 2)
SELECT
chats2.user1 as recip_id,
u.uname as recip_uname,
chats2.last_dm as dmid,
dms.msg,
dms.created_at
FROM (
SELECT chats.user1, chats.last_dm FROM chats
UNION
SELECT chats.user2, chats.last_dm FROM chats
) AS chats2
INNER JOIN users u ON chats2.user1 = u.uid
INNER JOIN dms ON chats2.last_dm = dms.dmid
WHERE chats2.user1 != 2;
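Another option, instead of the UNION, is to pick the "other" user with a CASE expression directly in the join condition. The following is a rough sketch only: it uses Python with an in-memory SQLite database in place of Postgres, trimmed-down versions of the tables, and made-up rows (the names bob and carol are assumptions; alice is uid 2 as in the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, uname TEXT NOT NULL UNIQUE);
    CREATE TABLE dms (dmid INTEGER PRIMARY KEY, uid INTEGER, recip INTEGER, msg TEXT);
    CREATE TABLE dms_history (
        user1 INTEGER, user2 INTEGER, last_dm INTEGER,
        PRIMARY KEY (user1, user2), CHECK (user1 < user2)
    );
    INSERT INTO users VALUES (1, 'bob'), (2, 'alice'), (3, 'carol');
    INSERT INTO dms VALUES
        (1, 2, 1, 'hi bob'), (2, 3, 2, 'hi alice'), (3, 1, 3, 'hi carol');
    INSERT INTO dms_history VALUES (1, 2, 1), (2, 3, 2), (1, 3, 3);
""")

me = 2  # the client user being queried
chats = conn.execute("""
    SELECT u.uid, u.uname, d.dmid, d.msg
    FROM dms_history h
    JOIN users u
      -- join to whichever side of the pair is NOT the client user
      ON u.uid = CASE WHEN h.user1 = :me THEN h.user2 ELSE h.user1 END
    JOIN dms d ON d.dmid = h.last_dm
    WHERE :me IN (h.user1, h.user2)
    ORDER BY u.uid
""", {"me": me}).fetchall()
print(chats)
```

This reads each dms_history row once and needs no de-duplication, which may be easier to reason about than building both halves and filtering the client user back out.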

How do I check if matches on a left join ALL have a match on another table?

I am attempting to write a stored procedure that will return only records who either didn't yield any results on the right side of a LEFT JOIN or for all of the records found on the right side, only return a result set for those who have a match in another table.
To illustrate what I'm attempting to achieve, first consider the following table definitions:
CREATE TYPE [dbo].[TvpDocumentsSent] AS TABLE
(
DocumentId INT
, RecipientId INT
, TransactionId INT
);
CREATE TABLE [dbo].[Recipients]
(
RecipientId INT
, GroupId INT
)
CREATE TABLE [dbo].[RecipientEmails]
(
RecipientId INT
, TransactionID INT
)
CREATE TABLE [dbo].[DocumentTransactions]
(
TransactionId INT
, DocumentId INT
)
The first table, TvpDocumentsSent is used in the stored procedure as a table-valued parameter. It is indicative of the records that we are checking.
The second table, Recipients houses all of the potential document recipients. It is worth noting that recipients are placed in groups (indicated by the GroupId). All recipients in a group should receive a document before that document is marked as ready-for-archive. That is the part that I'm struggling with, btw.
Next, the RecipientEmails table houses all e-mails (that may or may not have contained a document) that have been sent to a recipient.
The latter table, DocumentTransactions stores a log of all document transactions that have occurred. This tells me what document was sent (indicated by the DocumentId). Although there is not a RecipientId on this table, the TransactionId can be used to trace the DocumentTransaction back to a recipient via the RecipientEmails table.
What I'm struggling with is how to write a query that gives me only a subset of the records passed in via TvpDocumentsSent: only those that either don't have another recipient in the group waiting for the document, or for which all recipients have received the document (i.e. there is a record in the DocumentTransactions table whose TransactionId maps back to a record in RecipientEmails whose recipient was eligible for this document).
What I've come up with so far is this. (Note: I'm aware that I'm using TvpDocumentsSent as a table and not a TVP in the query below. I did this to simplify my explanation.)
SELECT
SNT.DocumentId
FROM [dbo].[TvpDocumentsSent] AS SNT
INNER JOIN [dbo].[Recipients] AS RCP ON -- The recipient who received the document during this transaction.
RCP.RecipientId = SNT.RecipientId
LEFT JOIN [dbo].[Recipients] AS OTHR_RCP ON -- Other recipients who may have already received the document or could later.
RCP.GroupId = OTHR_RCP.GroupId
AND RCP.RecipientId != OTHR_RCP.RecipientId
WHERE OTHR_RCP.RecipientId IS NULL OR ??????
Keeping in mind that there are n number of recipients who could potentially receive the document, how do I fulfill the OR portion of the WHERE clause to ensure that everyone has received documents?
I tried the following and it does not work correctly:
SELECT
SNT.DocumentId
FROM [dbo].[TvpDocumentsSent] AS SNT
INNER JOIN [dbo].[Recipients] AS RCP ON -- The recipient who received the document during this transaction.
RCP.RecipientId = SNT.RecipientId
LEFT JOIN [dbo].[Recipients] AS OTHR_RCP ON -- Other recipients who may have already received the document or could later.
RCP.GroupId = OTHR_RCP.GroupId
AND RCP.RecipientId != OTHR_RCP.RecipientId
LEFT JOIN [dbo].[DocumentTransactions] AS DT ON
SNT.TransactionId = DT.TransactionId
WHERE OTHR_RCP.RecipientId IS NULL OR DT.DocumentId IS NOT NULL
That won't work because as long as one of the recipients has received the document, the OR part of the WHERE clause will pass. Let's say 5 recipients should receive the document but only 1 has received it so far. That OR will see the 1 record's match and pass the WHERE, which is wrong. It should enforce that ALL potential recipients have received the document.
I'm not sure the example below is exactly what you need, since I had to mock up the sample data and guess the expected results.
But aggregating in a sub-query and then comparing the totals (or doing the same via a HAVING clause) could probably help here.
Example snippet:
declare @Recipients table (RecipientId int primary key, GroupId int);
declare @DocumentTransactions table (TransactionId int primary key, DocumentId int);
declare @DocumentsSent table (DocumentId int, RecipientId int, TransactionId int);
declare @RecipientEmails table (RecipientId int, TransactionId int);
insert into @Recipients (RecipientId, GroupId) values
(201,1),(202,1),(203,1),(204,2),(205,2),(206,2);
insert into @DocumentTransactions (TransactionId, DocumentId) values
(301,101),(302,101),(303,101),(304,102),(305,102),(306,102);
insert into @DocumentsSent (DocumentId, RecipientId, TransactionId) values
(101,201,301),(101,202,302),(101,203,303)
,(102,204,304),(102,205,305),(102,206,306);
insert into @RecipientEmails (RecipientId, TransactionId) values
(201,301),(202,302),(203,303)
,(204,304);
-- Per document and group, compare how many recipients were sent the
-- document with how many of them have a matching e-mail record.
SELECT DocumentId
FROM
(
SELECT
tr.DocumentId,
rcpt.GroupId,
count(distinct sent.RecipientId) AS TotalSent,
count(distinct rcptmail.RecipientId) AS TotalRcptEmail
FROM @DocumentsSent AS sent
LEFT JOIN @Recipients AS rcpt ON rcpt.RecipientId = sent.RecipientId
LEFT JOIN @DocumentTransactions AS tr
ON (tr.TransactionId = sent.TransactionId AND tr.DocumentId = sent.DocumentId)
LEFT JOIN @RecipientEmails AS rcptmail
ON (rcptmail.TransactionId = sent.TransactionId AND rcptmail.RecipientId = sent.RecipientId)
GROUP BY tr.DocumentId, rcpt.GroupId
) AS q
WHERE (TotalSent = TotalRcptEmail OR (TotalSent > 0 AND TotalRcptEmail = 0))
GROUP BY DocumentId;
/*
-- Un-aggregated version of the joins above, useful for inspecting the rows.
SELECT
tr.TransactionId,
sent.DocumentId,
sent.RecipientId AS RecipientIdSent,
rcpt.GroupId AS GroupIdRcpt,
rcpt.RecipientId AS RecipientIdRcpt,
rcptmail.RecipientId AS RecipientIdEmail
FROM @DocumentsSent AS sent
LEFT JOIN @Recipients AS rcpt ON rcpt.RecipientId = sent.RecipientId
LEFT JOIN @DocumentTransactions AS tr
ON (tr.TransactionId = sent.TransactionId AND tr.DocumentId = sent.DocumentId)
LEFT JOIN @RecipientEmails AS rcptmail
ON (rcptmail.TransactionId = sent.TransactionId AND rcptmail.RecipientId = sent.RecipientId);
*/
Returns:
DocumentId
----------
101
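The "all other recipients in the group have received it" rule can also be written as a double NOT EXISTS: keep a sent document only if there is no other group member for whom no matching transaction exists. The sketch below is one possible reading of the requirement, not a definitive answer. It runs on Python with an in-memory SQLite database instead of SQL Server, uses plain tables named after the ones in the question, and the sample rows (including the lone recipient 207 in group 3) are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Recipients (RecipientId INTEGER PRIMARY KEY, GroupId INTEGER);
    CREATE TABLE RecipientEmails (RecipientId INTEGER, TransactionId INTEGER);
    CREATE TABLE DocumentTransactions (TransactionId INTEGER PRIMARY KEY, DocumentId INTEGER);
    CREATE TABLE DocumentsSent (DocumentId INTEGER, RecipientId INTEGER, TransactionId INTEGER);

    -- Group 1 is complete for document 101, group 2 is incomplete for 102,
    -- and group 3 has a single member who was sent document 103.
    INSERT INTO Recipients VALUES (201,1),(202,1),(203,1),(204,2),(205,2),(206,2),(207,3);
    INSERT INTO DocumentTransactions VALUES (301,101),(302,101),(303,101),(304,102),(307,103);
    INSERT INTO RecipientEmails VALUES (201,301),(202,302),(203,303),(204,304),(207,307);
    INSERT INTO DocumentsSent VALUES (101,201,301),(102,204,304),(103,207,307);
""")

rows = conn.execute("""
    SELECT DISTINCT sent.DocumentId
    FROM DocumentsSent AS sent
    JOIN Recipients AS rcp ON rcp.RecipientId = sent.RecipientId
    WHERE NOT EXISTS (
        -- another member of the same group ...
        SELECT 1
        FROM Recipients AS other
        WHERE other.GroupId = rcp.GroupId
          AND other.RecipientId <> rcp.RecipientId
          -- ... who has no e-mail transaction for this document
          AND NOT EXISTS (
              SELECT 1
              FROM RecipientEmails AS re
              JOIN DocumentTransactions AS dt
                ON dt.TransactionId = re.TransactionId
              WHERE re.RecipientId = other.RecipientId
                AND dt.DocumentId = sent.DocumentId
          )
    )
    ORDER BY sent.DocumentId
""").fetchall()

ready = [r[0] for r in rows]
print(ready)
```

A recipient with no other group members passes automatically, because the outer NOT EXISTS has nothing to find. That covers the "no one else is waiting" case without a separate OR branch.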

Database Schema for Claims Authentication

On a database I have the tables: USERS, USERS_PROFILES, USERS_CLAIMS.
create table dbo.USERS
(
Id int identity not null,
Username nvarchar (120) not null,
Email nvarchar (120) not null
);
create table dbo.USERS_PROFILES
(
Id int not null,
[Name] nvarchar (80) not null
);
create table dbo.USERS_CLAIMS
(
Id int not null,
[Type] nvarchar (200) not null,
Value nvarchar (200) not null,
);
I am using Claims authorization. When a user signs up, an Identity is created.
The identity contains claims and each claim has a type and a value:
UsernameType > Username from USERS
EmailType > Email from USERS
NameType > Name from USERS_PROFILES
RoleType > Directly from USERS_CLAIMS
So I am creating the Identity from many columns in 3 tables.
I ended up with this because I migrated to Claims Authentication.
QUESTION
Should I move the Username, Email and Name to USERS_CLAIMS?
The USERS_PROFILES table would disappear ...
And USERS table would contain only info like "UserId, LastLoginDate, CreatedDate, ..."
If I want to get a user by username I would just get the Claim of type username ...
If I want to sign in the user I just get all claims and create the identity.
So the Identity Model would be very similar to the SQL tables.
Does this make sense? How would you design the tables?
Thank You,
Miguel
You are creating a key-value store. They are a nightmare to query in SQL. Consider the difficulty of querying user attributes by a value on the USERS_CLAIMS table. Example:
-- User with name and email, looked up by user ID
SELECT u.Id, u.Username, p.Name, u.Email, u.LastLoggedIN
FROM USERS_PROFILES p
INNER JOIN USERS u ON p.Id = u.Id
WHERE u.Id = @UserID
-- The same lookup with everything moved to a claims table
-- Nothing enforces that there is only one email claim, so this could return multiple
-- rows for a single user.
SELECT u.Id, cUName.Value AS Username, cName.Value AS Name, cEmail.Value AS Email, u.LastLoggedIN
FROM USERS u
LEFT OUTER JOIN USERS_CLAIMS cName ON u.Id = cName.Id AND cName.[Type] = 'http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name'
LEFT OUTER JOIN USERS_CLAIMS cUName ON u.Id = cUName.Id AND cUName.[Type] = 'http://schemas.xmlsoap.org/ws/2005/05/identity/claims/privatepersonalidentifier'
LEFT OUTER JOIN USERS_CLAIMS cEmail ON u.Id = cEmail.Id AND cEmail.[Type] = 'http://schemas.xmlsoap.org/ws/2005/05/identity/claims/email'
WHERE u.Id = @UserID
Can a user have multiple profiles? If not, there is no need for the "USERS_PROFILES" table. Keep the "Username" and "Email" columns on the "USERS" table. If you put them on the "USERS_CLAIMS" table, you would be storing redundant information anytime a user files a claim.
I am not sure what kind of tracking you'd like to have for your users, but I would recommend having a separate table that tracks when a user signs in. Something like this:
CREATE TABLE USERS_LOG (user_id INT, log_in DATETIME);
You can then get rid of the "LastLoginDate" on your "USERS" table and do a join to get the last time the user signed in. It'll give you more ways to track your users and you won't be creating blocks on your "USERS" table by updating it constantly.
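As a rough illustration of that join, the sketch below derives the last sign-in per user from the log table. It runs from Python against an in-memory SQLite database standing in for SQL Server, stores timestamps as ISO-8601 text, and uses made-up users and log rows; only the table and column names come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (Id INTEGER PRIMARY KEY, Username TEXT NOT NULL, Email TEXT NOT NULL);
    CREATE TABLE USERS_LOG (user_id INTEGER, log_in TEXT);  -- ISO-8601 text timestamps
    INSERT INTO USERS VALUES
        (1, 'miguel', 'miguel@example.com'),
        (2, 'ana', 'ana@example.com');
    INSERT INTO USERS_LOG VALUES
        (1, '2024-01-05 09:00:00'), (1, '2024-02-01 18:30:00'),
        (2, '2024-01-20 12:15:00');
""")

last_logins = conn.execute("""
    SELECT u.Username, MAX(l.log_in) AS LastLogin
    FROM USERS u
    LEFT JOIN USERS_LOG l ON l.user_id = u.Id   -- LEFT JOIN keeps users who never signed in
    GROUP BY u.Id, u.Username
    ORDER BY u.Id
""").fetchall()
print(last_logins)
```

Each sign-in becomes an insert into USERS_LOG rather than an update of USERS, and the latest value is computed on demand.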

Transact-SQL Query / How to combine multiple JOIN statements?

I've been banging my head on this SQL puzzle for a couple of hours already, so I thought to myself: "Hey, why don't you ask the Stack and allow the web to benefit from the solution?"
So here it is. First thing, these are my SQL tables:
Fields
FieldID INT (PK)
FieldName NVARCHAR(50) (IX)
FormFields
FieldID INT (FK)
FormID INT (FK)
Values
FieldID INT (FK)
RecordID INT (FK)
Value NVARCHAR(1000)
Forms
FormID INT (PK)
FormName NVARCHAR(50) (IX)
Records
RecordID INT (PK)
FormID INT (FK)
PoolID INT (FK)
DataPools
PoolID INT (PK)
FormID INT (FK)
PoolName NVARCHAR(50) (IX)
Consider the following constraints.
Each Form has 0 or more DataPool. Each DataPool can only be assigned to one Form.
Each Form has 0 or more Field. Each Field might be assigned to several Form.
Each Record has 0 or more Value. Each Value is linked to a single Record.
Each DataPool has 0 or more Record. Each Record is linked to a single DataPool.
Each Value is linked to one Field.
Also, all the Name columns have unique values.
Now, here's the problem:
I need to query every value from the Values table based on the following columns:
The Name of the Field linked to the Value
The Name of the DataPool linked to the Record linked to the Value
The Name of the Form linked to that DataPool
The 3 columns above must be equal to the 3 received parameters in the stored procedure.
Here's what I got so far:
CREATE PROCEDURE [GetValues]
@FieldName NVARCHAR(50),
@FormName NVARCHAR(50),
@PoolName NVARCHAR(50)
AS SELECT Value FROM [Values]
JOIN [Fields]
ON [Fields].FieldID = [Values].FieldID
WHERE [Fields].FieldName = @FieldName
How can I filter the rows of the Values table by the PoolName column? The DataPools table isn't directly related to the Values table, but it's still related to the Records table which is directly related to the Values table. Any ideas on how to do that?
I feel like I am missing something in your question. If this solution is not addressing the problem, please let me know where it is missing the issue.
SELECT
[Values].Value
FROM
[Values] INNER JOIN Fields ON
[Values].FieldId = Fields.FieldId
INNER JOIN FormFields ON
Fields.FieldId = FormFields.FieldId
INNER JOIN Forms ON
FormFields.FormId = Forms.FormId
INNER JOIN DataPools ON
Forms.FormId = DataPools.FormId
WHERE
Fields.FieldName = @FieldName
AND
Forms.FormName = @FormName
AND
DataPools.PoolName = @PoolName;
If I understand what you're needing, this should work just fine.
select * from
[Values] v
join Records r
on v.recordid = r.recordid
join DataPools dp
on r.poolid = dp.poolid
join Forms f
on r.formid = f.formid
join Fields fi
on v.fieldid = fi.fieldid
where
fi.FieldName = @FieldName
AND
f.FormName = @FormName
AND
dp.PoolName = @PoolName;
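The key step is going from Values to DataPools through Records. Here is a small sketch of that join path, run from Python against an in-memory SQLite database rather than SQL Server. The helper name get_values, the sample rows, and the choice to reach Forms through DataPools.FormID are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Fields (FieldID INTEGER PRIMARY KEY, FieldName TEXT UNIQUE);
    CREATE TABLE Forms (FormID INTEGER PRIMARY KEY, FormName TEXT UNIQUE);
    CREATE TABLE DataPools (PoolID INTEGER PRIMARY KEY, FormID INTEGER, PoolName TEXT UNIQUE);
    CREATE TABLE Records (RecordID INTEGER PRIMARY KEY, FormID INTEGER, PoolID INTEGER);
    CREATE TABLE "Values" (FieldID INTEGER, RecordID INTEGER, Value TEXT);

    -- Made-up sample data: one form with two pools, one record per pool.
    INSERT INTO Fields VALUES (1, 'Email'), (2, 'Phone');
    INSERT INTO Forms VALUES (1, 'Contact');
    INSERT INTO DataPools VALUES (1, 1, 'Spring'), (2, 1, 'Autumn');
    INSERT INTO Records VALUES (1, 1, 1), (2, 1, 2);
    INSERT INTO "Values" VALUES
        (1, 1, 'a@example.com'), (2, 1, '555-0100'),
        (1, 2, 'b@example.com');
""")

def get_values(field_name, form_name, pool_name):
    """Hypothetical stand-in for the GetValues stored procedure."""
    rows = conn.execute("""
        SELECT v.Value
        FROM "Values" v
        JOIN Fields f     ON f.FieldID  = v.FieldID
        JOIN Records r    ON r.RecordID = v.RecordID   -- Values -> Records
        JOIN DataPools dp ON dp.PoolID  = r.PoolID     -- Records -> DataPools
        JOIN Forms fo     ON fo.FormID  = dp.FormID    -- DataPools -> Forms
        WHERE f.FieldName = ? AND fo.FormName = ? AND dp.PoolName = ?
    """, (field_name, form_name, pool_name)).fetchall()
    return [r[0] for r in rows]

print(get_values('Email', 'Contact', 'Spring'))
print(get_values('Email', 'Contact', 'Autumn'))
```

Because every join follows a key from the Values row outwards, each value is tied to exactly one record, pool, and form, so filtering on the three names cannot pull in values from a different pool of the same form.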