Compare set of data in one table to another table


I've gone round in circles using JOIN, EXISTS and ISNULL. I cannot figure this out, and I've gone through hundreds of threads.
Records in the Pages table drive the output of individual retail web pages. Fields in this table determine which items are displayed:
PageID INT
Category VARCHAR(30)
Colour VARCHAR(10)
Size VARCHAR(10)
OnSale BIT
e.g.
PageID = 201
Category = Shoes
Colour = Red
Size = Large
OnSale = 1
I need to compare these fields against identically-named fields in Accounts_Emails.
fk_AccountID INT
Category VARCHAR(30)
Colour VARCHAR(10)
Size VARCHAR(10)
OnSale BIT
This table allows users to save options for email newsletters. If they want to be sent more items similar to those on the current page, they click a button on that page.
What I need is a stored procedure that checks whether the user already has an Accounts_Emails record that exactly matches those options and, if not, INSERTs a new record with them.
In these fields, a NULL value means ALL, so NULLs must compare as equal to each other. I pass the PageID and AccountID into the procedure so that I can pick up the current Pages record and limit Accounts_Emails to the current user.
This is what I have:
IF NOT EXISTS
(
SELECT 1 FROM Accounts_Emails a
JOIN Pages l on l.PageID = @PageID
WHERE
a.fk_AccountID = @AccountID AND
(ISNULL(a.Category,'NULL') = ISNULL(l.Category,'NULL')) AND
(ISNULL(a.Colour,'NULL') = ISNULL(l.Colour,'NULL')) AND
(ISNULL(a.Size,'NULL') = ISNULL(l.Size,'NULL')) AND
(ISNULL(a.OnSale,'NULL') = ISNULL(l.OnSale,'NULL'))
)

It looks like you need a MERGE statement, with Accounts_Emails as the TARGET.
Plenty of good examples are available online.
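As a rough illustration of the "insert only if no exact match exists" logic, here is a minimal sketch. It uses Python with an in-memory SQLite database purely so that it is self-contained and runnable; it is not T-SQL and not the MERGE statement itself. The sample row and the helper name save_page_options are made up for the example. SQLite's IS operator treats two NULLs as equal; in SQL Server a comparable null-safe test is EXISTS (SELECT a.Category INTERSECT SELECT l.Category), or IS NOT DISTINCT FROM on SQL Server 2022.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pages (PageID INTEGER PRIMARY KEY, Category TEXT, Colour TEXT, Size TEXT, OnSale INTEGER);
CREATE TABLE Accounts_Emails (fk_AccountID INTEGER, Category TEXT, Colour TEXT, Size TEXT, OnSale INTEGER);
-- hypothetical sample page: Size is NULL, meaning ALL sizes
INSERT INTO Pages VALUES (201, 'Shoes', 'Red', NULL, 1);
""")

INSERT_IF_MISSING = """
    INSERT INTO Accounts_Emails (fk_AccountID, Category, Colour, Size, OnSale)
    SELECT ?, p.Category, p.Colour, p.Size, p.OnSale
    FROM Pages p
    WHERE p.PageID = ?
      AND NOT EXISTS (
          SELECT 1
          FROM Accounts_Emails a
          WHERE a.fk_AccountID = ?
            -- IS is null-safe: NULL IS NULL is true, whereas NULL = NULL is not
            AND a.Category IS p.Category
            AND a.Colour   IS p.Colour
            AND a.Size     IS p.Size
            AND a.OnSale   IS p.OnSale
      )
"""

def save_page_options(conn, page_id, account_id):
    """Copy the page's options to the account unless an identical row exists."""
    conn.execute(INSERT_IF_MISSING, (account_id, page_id, account_id))

save_page_options(conn, 201, 7)
save_page_options(conn, 201, 7)  # finds the existing row, inserts nothing

total = conn.execute("SELECT COUNT(*) FROM Accounts_Emails").fetchone()[0]
print(total)  # prints 1
```

No sentinel string such as 'NULL' is needed, so the BIT/INTEGER column is compared without any type conversion.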

Related

Query data from one column depending on other values on the table

So, I have a table (called table here) with three columns I am interested in: value, entity and field_entity.
There are other columns that are not important for this question. There are many different types of entity, and some of them share the same field_entity, but together those two columns determine what the value column refers to (an ID number for a person, an address, or something else).
If I need the name of a person I would do something like this:
select value from table where entity = 'person' and field_entity = 'person_name';
My problem is I need to get a lot of different values (names, last names, address, documents, etc.), so the way I am doing it now is using a left join like this:
select
doc_type.value as doc_type,
doc.value as doc,
status.value as status
from
table doc
-- Get doc type
left join table doc_type
on doc.entity = doc_type.entity
and doc.transaction_id = doc_type.transaction_id
and doc_type.field_entity = 'document_type'
-- Get Status
left join table status
on doc.entity = status.entity
and doc.transaction_id = status.transaction_id
and status.field_entity = 'status'
where doc.entity = 'users' and doc.field_entity = 'document' and doc.transaction_id = 11111;
There are 16 values or more, and this can get a bit bulky and difficult to maintain, so I was wondering if some of you can point out a better way to do this?
Thanks!

I assume that you are not in a position to alter the table structure, but can you add views to the database? If so, you can create a view for each type of entity in your table.
For example:
CREATE VIEW view_person AS
SELECT value AS name
FROM doc
WHERE doc.entity = 'person'
AND doc.field_entity = 'name';
Then you can write clearer queries:
SELECT view_person.name FROM view_person;
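Another option that avoids one self-join per field is conditional aggregation: read the rows once and pick each field out with a CASE expression. The sketch below is only an illustration, run through Python and an in-memory SQLite database so that it is self-contained. The table name entity_values and the sample rows are invented; the SELECT itself uses only standard SQL that SQL Server also accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical stand-in for the question's table
CREATE TABLE entity_values (transaction_id INTEGER, entity TEXT, field_entity TEXT, value TEXT);
INSERT INTO entity_values VALUES
  (11111, 'users', 'document',      'AB123'),
  (11111, 'users', 'document_type', 'passport'),
  (11111, 'users', 'status',        'active'),
  (22222, 'users', 'document',      'ZZ999');
""")

# One pass over the table: each output column keeps the value of one field_entity.
row = conn.execute("""
    SELECT
        MAX(CASE WHEN field_entity = 'document_type' THEN value END) AS doc_type,
        MAX(CASE WHEN field_entity = 'document'      THEN value END) AS doc,
        MAX(CASE WHEN field_entity = 'status'        THEN value END) AS status
    FROM entity_values
    WHERE entity = 'users' AND transaction_id = ?
    GROUP BY entity, transaction_id
""", (11111,)).fetchone()

print(row)  # prints ('passport', 'AB123', 'active')
```

Adding a 17th value then means adding one more CASE line rather than another join. One difference from the left-join version: this returns a row whenever any field exists for the transaction, not only when the 'document' row exists.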

Efficient SQL to find specific ID used with pagination

At work I need to implement a feature in the API that returns a specific page (20 entries) that contains the entry with a specified ID. That entry could be any of those 20 in the page.
Normally, a page is determined by taking the ID of the last element of the previous page, applying a filter to the elements after that ID and taking the first 20 entries of the result.
But with the new feature, you're supposed to receive the page that CONTAINS the specified ID, rather than using it to determine the first element of a new page.
I'm not working with databases that much so I'm not sure, but since I'm the only developer in this company, there's no one I can ask. If it helps, the database is MS SQL Server. If more info is needed, I can give it, as long as it's not against company policy.

Let's say we look for ID 123456. This can be anywhere from page 1 to page 6173 (or even missing completely). There is no way to tell other than to count the rows/pages until we get there. We must even keep counting past it to get the remaining rows on the same page. This is not difficult, but it is rather slow.
We cannot know which ID comes after another; after ID 5 the next may be ID 6 or 1234 or 1000000 or whatever. So the first step is to number all rows ordered by ID. Or rather assign them pages, for we know numbers 1 to 20 = page 1, numbers 21 to 40 = page 2, etc. Thus we get to know that our ID is on page X and we must select all rows marked with page X.
with rows_with_page as
(
select
t.*,
(row_number() over (order by id) - 1) / 20 + 1 as page
from mytable t
)
select *
from rows_with_page
where page = (select page from rows_with_page where id = @id)
order by id;
As a row number is an integer, the division with / is an integer division resulting in an integer in SQL Server (like in elementary school; 5 / 2 = 2).
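To see the row-numbering idea work on IDs with gaps, here is a small sketch run through Python and an in-memory SQLite database (window functions need SQLite 3.25 or later). The sample data, the PAGE_SIZE constant and the helper name page_containing are assumptions made for the example; the CTE is essentially the query above.

```python
import sqlite3

PAGE_SIZE = 20

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")
# 100 rows with IDs 3, 6, 9, ..., 300, so consecutive IDs are not contiguous
conn.executemany("INSERT INTO mytable VALUES (?)", [(i * 3,) for i in range(1, 101)])

def page_containing(conn, target_id):
    """Return (id, page) for every row on the page that holds target_id."""
    return conn.execute("""
        WITH rows_with_page AS (
            SELECT id,
                   (ROW_NUMBER() OVER (ORDER BY id) - 1) / ? + 1 AS page
            FROM mytable
        )
        SELECT id, page
        FROM rows_with_page
        WHERE page = (SELECT page FROM rows_with_page WHERE id = ?)
        ORDER BY id
    """, (PAGE_SIZE, target_id)).fetchall()

rows = page_containing(conn, 63)      # ID 63 is the 21st row
print(rows[0], rows[-1], len(rows))   # prints (63, 2) (120, 2) 20
print(page_containing(conn, 64))      # prints [] because ID 64 does not exist
```

ID 63 lands on page 2 because it is the 21st row, even though dividing the ID itself by 20 would suggest a later page.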

If I understand your question, you have a fixed page size of 20 records per page. You will receive an ID value, and that ID can appear anywhere on a given page of 20 records. So, assuming the ID values start at 1 and have no gaps, an ID between 1 and 20 should return all the records on page 1, an ID between 21 and 40 should return all the records on page 2, etc.
We can figure out which page we are on with integer division: subtract 1 from the ID, divide by 20, and then add 1. So for ID 4, ((4 - 1) / 20) is zero (remember integer division!) and adding 1 gives page 1. Subtracting 1 first keeps ID 20 on page 1 rather than pushing it to page 2. You can then use the OFFSET...FETCH clause of SQL Server (available from SQL Server 2012).
I will create a simple script to show how this works. I don't know if your plan is to create a stored procedure or what, but you can adapt this technique. So let's make a simple table variable to mimic your table, add 100 records for some sample data, and then write a query to retrieve the records on the appropriate page.
-- create a table for the query
DECLARE @Records AS TABLE(
ID INT IDENTITY(1,1) NOT NULL,
[Value] INT NOT NULL
);
-- populate with 100 sample records
DECLARE @i INT;
SELECT @i = 1;
WHILE (@i <= 100)
BEGIN
INSERT INTO @Records([Value]) VALUES (@i);
SELECT @i = @i + 1;
END
-- now find the records on the correct page
DECLARE @id INT = 41; -- the record to find
DECLARE @pageSize INT = 20; -- the number of records per page
DECLARE @page INT; -- the page, counting from 1
-- subtract 1 so that IDs 1-20 map to page 1, 21-40 to page 2, etc.
SELECT @page = ((@id - 1) / @pageSize) + 1;
SELECT ID, [Value]
FROM @Records
ORDER BY ID
OFFSET (@page - 1) * @pageSize ROWS
FETCH NEXT @pageSize ROWS ONLY;
That should work for you (if I understand what you were asking).

SQL Server - matching attributes query

SQL Server Gurus ...
Currently using MS SQL Server 2016
I know Joe Celko and all SQL purists are squirming at the thought of using bitmasks, but I have a use case in which I need to query for all widgets that contain a set of given attributes.
Each widget may contain several hundred attributes.
The attributes of a widget are either present or not (1 = present, 0 = not present).
One way I thought to do this is via bitmasks – the attributes to be found (a bitmask) could be ANDed with the attributes of each widget to find matches in a single operation. For example, the widgets table might be:
widgets table:
widget_uid Uniqueidentifier
attributes BigInt
SELECT widget_uid
FROM widgets
WHERE ( attributes & bitmask ) = bitmask;
The problem is that a BigInt limits the number of attributes to 64, and a widget can have several hundred. I could group the attributes into chunks of 64 bits, i.e.:
widgets table:
widget_uid Uniqueidentifier
attributes0 BigInt -- Attributes 0-63
attributes1 BigInt -- Attributes 64-127
attributes2 BigInt -- Attributes 128-191
SELECT widget_uid
FROM widgets
WHERE ( attributes0 & bitmask0 ) = bitmask0
AND ( attributes1 & bitmask1 ) = bitmask1
AND ( attributes2 & bitmask2 ) = bitmask2
... but was wondering if anyone has come up with a solution for bit operations using bitmasks with greater than 64 bits – or if other (more efficient?) solutions would exist?
In the use case, the widgets table does contain other columns, but I am only concerned with the attributes matching portion of the query at the moment.
Any and all ideas are welcome - would be interested in knowing how others tackle this particular problem.
Thanks in advance.

We had a similar use case, on a significantly large data set. This was for an e-commerce site with products and attributes. Our case was a bit more complex than here, where we had any possible number of attributes and then values assigned to those attributes. e.g. Color - Red/Green/Blue, Size - S/M/L etc.
We found that association tables with good indexing were the key in our case. While this may not be an option for you, we found it to be the optimal solution for a dynamic data set.
I can code you up an example if you feel it will be helpful.
Edited to add example:
DROP TABLE IF EXISTS #Widgets
DROP TABLE IF EXISTS #Attributes
DROP TABLE IF EXISTS #WidgetAttributes
CREATE TABLE #Widgets (widget_UID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, Name NVARCHAR(255))
CREATE TABLE #Attributes (Attribute_UID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, Name NVARCHAR(255))
CREATE TABLE #WidgetAttributes (widget_UID UNIQUEIDENTIFIER,Attribute_UID UNIQUEIDENTIFIER)
CREATE NONCLUSTERED INDEX ix_WidgetAttribute ON #WidgetAttributes (Attribute_UID) INCLUDE (widget_UID)
INSERT INTO #Widgets (widget_UID, Name) values
( '{c63bea73-2331-4698-82c9-f71845ab8601}', N'Widget 1' ),
( '{a0865b8f-606b-4273-9207-39a8a26016c4}', N'Widget 2' ),
( '{211fe27e-ab98-4b61-83a3-3d006d66db5a}', N'Widget 3' )
INSERT INTO #Attributes (Attribute_UID, Name)
VALUES
( '{99354dc0-d0b2-4919-a887-edf115eeb1bd}', N'Height' ),
( '{136bbe4c-497d-472f-a905-670e4a7805d0}', N'Width' ),
( '{f006f950-30d1-453e-8e09-4f7d140fa3cb}', N'Depth' ),
( '{0d190639-677f-4b75-8d36-1bdac00de132}', N'Colour' )
-- Set links
-- Widget 1 All attributes
-- Widget 2 Height Width
-- Widget 3 Colour
INSERT INTO #WidgetAttributes (widget_UID, Attribute_UID)
SELECT '{c63bea73-2331-4698-82c9-f71845ab8601}',Attribute_UID FROM #Attributes
UNION ALL
SELECT TOP (2) '{a0865b8f-606b-4273-9207-39a8a26016c4}',Attribute_UID FROM #Attributes WHERE Name<> 'Colour'
UNION ALL
SELECT '{211fe27e-ab98-4b61-83a3-3d006d66db5a}',Attribute_UID FROM #Attributes WHERE Name = 'Colour'
-- @SearchAttributes to hold list of attributes you are trying to find
DECLARE @SearchAttributes TABLE (Attribute_UID UNIQUEIDENTIFIER)
INSERT INTO @SearchAttributes
SELECT Attribute_UID FROM #Attributes WHERE Name<> 'Colour'
;WITH cte AS (
SELECT WA.widget_UID, COUNT(1) AttributesPresent FROM #WidgetAttributes WA
JOIN @SearchAttributes SA ON SA.Attribute_UID = WA.Attribute_UID
GROUP BY WA.widget_UID
)
SELECT cte.AttributesPresent
, W.widget_UID
, W.Name
FROM cte
JOIN #Widgets W ON W.widget_UID = cte.widget_UID
ORDER BY cte.AttributesPresent DESC
Gives an output of:
AttributesPresent widget_UID Name
----------------- ------------------------------------ ----------
3 C63BEA73-2331-4698-82C9-F71845AB8601 Widget 1
2 A0865B8F-606B-4273-9207-39A8A26016C4 Widget 2
We used an approach of counting how many attributes were present for each so we not only had the option of "exact match" but also "closest fit".
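For the "exact match" case, where a widget must have every searched attribute, the same count can be compared with the number of attributes searched for. The sketch below is one way to do that, run through Python and an in-memory SQLite database so that it is self-contained. It uses plain names instead of GUIDs for brevity, and the helper name widgets_with_all is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- simplified link table: names stand in for the GUID keys
CREATE TABLE WidgetAttributes (widget TEXT, attribute TEXT, PRIMARY KEY (widget, attribute));
INSERT INTO WidgetAttributes VALUES
  ('Widget 1', 'Height'), ('Widget 1', 'Width'), ('Widget 1', 'Depth'), ('Widget 1', 'Colour'),
  ('Widget 2', 'Height'), ('Widget 2', 'Width'),
  ('Widget 3', 'Colour');
""")

def widgets_with_all(conn, wanted):
    """Return the widgets that carry every attribute in wanted."""
    wanted = sorted(set(wanted))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT widget
        FROM WidgetAttributes
        WHERE attribute IN ({placeholders})
        GROUP BY widget
        HAVING COUNT(*) = ?      -- matched as many attributes as were asked for
        ORDER BY widget
    """
    return [r[0] for r in conn.execute(sql, (*wanted, len(wanted)))]

print(widgets_with_all(conn, ["Height", "Width"]))           # prints ['Widget 1', 'Widget 2']
print(widgets_with_all(conn, ["Height", "Width", "Depth"]))  # prints ['Widget 1']
```

The count is only reliable because the primary key on (widget, attribute) rules out duplicate links; without it, COUNT(DISTINCT attribute) would be the safer comparison.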

Using a bitmask in a database is the wrong approach. Even if you somehow get it to work, you will not be able to use indexes to speed up execution.
Use the standard solution, because this is a standard situation. There is a standard M:N relationship between Widgets and Attributes (both should be tables, of course). You add another table that assigns Attributes to Widgets - you can call it WidgetAttributes.
It will have 3 columns: Id, WidgetId, AttributeId
Then you can, for example, simply get the list of Widgets that have a given Attribute:
select w.*
from Widgets w
inner join WidgetAttributes wa on wa.WidgetId = w.Id
inner join Attributes a on a.Id = wa.AttributeId
where a.AttributeName='xxx'

How do I get SQL to append columns to the result set instead of adding more rows?

I have 3 tables, they are Events, SignOffs and Users.
Events has the fields EventId (PK, int, autoincrement) and EventTitle (nvarchar(50)).
SignOffs has the fields SignOffId (PK, int, autoincrement), EventId (FK to Events.EventId) and SignedOffByUserId (FK to Users.UserId).
Users has the fields UserId (PK, int, autoincrement) and UserName (nvarchar(50)).
I want to do something like this:
SELECT [Events].EventId, EventTitle, UserName
FROM [Events]
INNER JOIN [SignOffs] ON [Events].EventId = [SignOffs].EventId
INNER JOIN [Users] ON [SignOffs].SignedOffByUserId = [Users].UserId
The problem with the above is you get a row for each person that signed off, so a given event can be repeated in the list multiple times if multiple people signed off.
What I want is for columns to be added to the result set for each person that signed off on an event. So the result set should look like this for an event where 3 people signed off:
EventId - EventTitle - SignedOffByUser1 - SignedOffByUser2 - SignedOffByUser3
I don't have any ideas about how this can be done and I'm not even sure how to articulate the problem succinctly to be able to search for answers.

You need to pivot the data. This can be done using the PIVOT operator:
https://msdn.microsoft.com/en-us/library/ms177410.aspx?f=255&MSPPError=-2147217396
or you can use multiple CASE expressions to achieve the same effect.
The limitation of both approaches is that you can only have a predefined number of columns, so if an event has more than 3 sign-offs, for example, you will not see the extra ones.
Another possibility is to put them into a CSV list, which lets you see them all on a single row no matter how many there are.
http://blog.sqlauthority.com/2009/11/25/sql-server-comma-separated-values-csv-from-table-column/
SELECT SUBSTRING(
(SELECT ',' + s.Name
FROM HumanResources.Shift s
ORDER BY s.Name
FOR XML PATH('')),2,200000) AS CSV
This can be done in a subquery.
Hope this helps.
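To make the CASE approach concrete, here is a minimal sketch: number each event's sign-offs, then pick out sign-off 1, 2 and 3 with CASE expressions. It runs through Python and an in-memory SQLite database (version 3.25 or later for window functions) so that it is self-contained; the sample events and users are invented, and the fixed limit of three columns is the limitation mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (EventId INTEGER PRIMARY KEY, EventTitle TEXT);
CREATE TABLE Users (UserId INTEGER PRIMARY KEY, UserName TEXT);
CREATE TABLE SignOffs (SignOffId INTEGER PRIMARY KEY, EventId INTEGER, SignedOffByUserId INTEGER);
-- hypothetical sample data
INSERT INTO Events VALUES (1, 'Launch'), (2, 'Audit');
INSERT INTO Users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO SignOffs (EventId, SignedOffByUserId) VALUES (1, 1), (1, 2), (1, 3), (2, 2);
""")

rows = conn.execute("""
    WITH numbered AS (
        -- rn = 1, 2, 3, ... restarts for every event
        SELECT e.EventId, e.EventTitle, u.UserName,
               ROW_NUMBER() OVER (PARTITION BY e.EventId ORDER BY s.SignOffId) AS rn
        FROM Events e
        JOIN SignOffs s ON s.EventId = e.EventId
        JOIN Users u ON u.UserId = s.SignedOffByUserId
    )
    SELECT EventId, EventTitle,
           MAX(CASE WHEN rn = 1 THEN UserName END) AS SignedOffByUser1,
           MAX(CASE WHEN rn = 2 THEN UserName END) AS SignedOffByUser2,
           MAX(CASE WHEN rn = 3 THEN UserName END) AS SignedOffByUser3
    FROM numbered
    GROUP BY EventId, EventTitle
    ORDER BY EventId
""").fetchall()

for r in rows:
    print(r)
# prints (1, 'Launch', 'Ann', 'Bob', 'Cy')
# prints (2, 'Audit', 'Bob', None, None)
```

ROW_NUMBER() with PARTITION BY and the CASE expressions are also available in SQL Server, so the query should carry over with little change.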

SQL query with joins help needed

I have four tables.
DocumentList:
DocumentID int
DocumentDescription varchar(100)
DocumentName varchar(100)
DocumentTypeCode int
Archive ud_DefaultBitFalse:bit
DocumentStepLevel:
DocumentStepID int
DocumentID int
StepLevelCode int
DocumentAttachment:
DocumentAttachmentGenID int
DocumentStepID int
AttachmentGenID int
FacilityGenID int
Submitted ud_DefaultBitFalse:bit
Attachment:
AttachmentGenId int
FileName varchar(255)
FileDescription varchar(255)
UploadDate ud_DefaultDate:datetime
DocumentData varbinary(MAX)
MimeType varchar(30)
Archive ud_DefaultBitFalse:bit
UpdateBy int
UpdateDate ud_DefaultDate:datetime
The DocumentList table contains a list of documents.
DocumentStepLevel associates documents in DocumentList with a step level. We have six steps right now, and each step has some documents associated with it.
DocumentAttachment is a junction table that relates DocumentStepLevel to Attachment.
The Attachment table holds the actual file data uploaded to the system.
Question:
I need to write a query that will fetch the following columns.
DocumentList.[DocumentDescription]
DocumentList.[DocumentName]
DocumentStepLevel.[DocumentStepID]
DocumentStepLevel.[StepLevelCode]
DocumentAttachment.[DocumentAttachmentGenID]
DocumentAttachment.[FacilityGenID]
DocumentAttachment.[Submitted]
Attachment.[FileName]
Attachment.[FileDescription]
Attachment.[UploadDate]
The query should return data from the DocumentList table for a specific step level. When the DocumentAttachment.[Submitted] column is set to true, it should also return the data from the DocumentAttachment and Attachment tables. Otherwise those columns should be empty.
I tried using a left outer join, but the problem appears when I add the Submitted column to the query: it stops returning any rows unless that flag is set to true.

SELECT *
FROM documentStepLevel dsl
JOIN documentList dl
ON dl.documentId = dsl.documentId
LEFT JOIN
documentAttachment da
ON da.documentStepID = dsl.documentStepId
AND submitted = 1
LEFT JOIN
attachment a
ON a.attachmentGenId = da.attachmentGenId
WHERE dsl.stepLevelCode = @stepLevelCode

Is it DocumentAttachment you're left outer joining?
Difficult to say for sure without seeing your current query, but I'm guessing you've outer joined to DocumentAttachment and then have something like "where documentattachment.submitted = 1"?
In this case I believe it won't return anything as for rows where documentattachment doesn't exist, submitted is effectively null. So you might need to change your where statement to "where (documentattachment.submitted = 1 or documentattachment.submitted is null)"
This also assumes that when DocumentAttachment is populated, submitted by default has a 0 value rather than a null value (otherwise you'll need a different method of ascertaining the absence of a DocumentAttachment)
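The difference between filtering in the ON clause and filtering in the WHERE clause can be seen on a tiny data set. The sketch below runs through Python and an in-memory SQLite database so that it is self-contained. The tables are cut down to the columns that matter, with FileName placed directly on DocumentAttachment as a simplification, and the sample rows and the helper name run are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DocumentStepLevel (DocumentStepID INTEGER PRIMARY KEY, StepLevelCode INTEGER);
CREATE TABLE DocumentAttachment (DocumentStepID INTEGER, FileName TEXT, Submitted INTEGER);
INSERT INTO DocumentStepLevel VALUES (1, 10), (2, 10), (3, 10);
-- step 1: submitted attachment; step 2: unsubmitted attachment; step 3: no attachment
INSERT INTO DocumentAttachment VALUES (1, 'a.pdf', 1), (2, 'b.pdf', 0);
""")

QUERY = """
    SELECT dsl.DocumentStepID, da.FileName
    FROM DocumentStepLevel dsl
    LEFT JOIN DocumentAttachment da
      ON da.DocumentStepID = dsl.DocumentStepID {on_extra}
    WHERE dsl.StepLevelCode = 10 {where_extra}
    ORDER BY dsl.DocumentStepID
"""

def run(on_extra="", where_extra=""):
    """Run the query with an extra condition in the ON or the WHERE clause."""
    return conn.execute(QUERY.format(on_extra=on_extra, where_extra=where_extra)).fetchall()

filter_in_on = run(on_extra="AND da.Submitted = 1")
filter_in_where = run(where_extra="AND da.Submitted = 1")
filter_or_null = run(where_extra="AND (da.Submitted = 1 OR da.Submitted IS NULL)")

print(filter_in_on)     # prints [(1, 'a.pdf'), (2, None), (3, None)]
print(filter_in_where)  # prints [(1, 'a.pdf')]
print(filter_or_null)   # prints [(1, 'a.pdf'), (3, None)]
```

With the condition in the ON clause every step is kept and only the attachment columns go empty. The OR ... IS NULL variant in the WHERE clause still loses step 2, whose only attachment exists but is not submitted.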
