SQL Server: join on array of IDs from previous join

I have 2 tables. One has been pruned to show only IDs which meet certain criteria. The second needs to be pruned to show only data that matches the previous "array" of IDs. There can be multiple results.
Consider the following:
Query_1_final: Returns the IDs of users who meet certain criteria:
select
t1.[user_id]
from
[SQLDB].[db].[meeting_parties] as t1
inner join
(select distinct
[user_id]
from
[SQLDB].[db].[meeting_parties]
group by
[user_id]
having
count([user_id]) = 1) as t2 on t1.user_id = t2.user_id
where
[type] = 'organiser'
This works great and returns:
user_id
--------------------
22
1255
9821
and so on...
It produces a single column with the ID's of everyone who is a "Meeting Organizer" and also in the active_meetings table. (note, there are multiple types/roles, this was the best way to grab them all)
Now, I need this data to filter another table, another join. Here is the start of my query
Query_2_PREP: returns 5 columns where the meeting has "started" already.
SELECT
[meeting_id]
,[meeting_style]
,[meeting_day]
,[address]
,[promos]
FROM
[SQLDB].[db].[all_meetings]
WHERE
[meeting_started] = 'TRUE'
This works as well
meeting_id | meeting_style | meeting_day ...
---------------------------------------------
23 open M,F,SA
23 discussion TU,TH
23 lead W,F
and so on...
and returns ALL 10,982 meetings that have started, but I need it to return only the meetings that belong to the distinct organiser IDs from Query_1_final (which should be more like 1200 records or so)
Ideally, I need something "like" this below (but of course it does not work)
Query 2: needs to return all meetings that are from organiser ID's only.
SELECT
[meeting_party_id]
,[meeting_style]
,[meeting_day]
,[address]
,[promos]
FROM
[SQLDB].[db].[all_meetings]
WHERE
[meeting_started] = 'TRUE'
AND [meeting_party_id] = "ANY Query_1_final results, especially multiple"
I have tried nesting JOIN and INNER JOIN's but I think there is something fundamental I am missing here about SQL. In PHP I would use an array compare or just run another query... any help would be much appreciated.

Just use IN. Here is the structure of the logic:
with q1 as (
<first query here>
)
SELECT m.*
FROM [SQLDB].[db].[all_meetings] m
WHERE meeting_started = 'TRUE' AND
meeting_party_id IN (SELECT user_id FROM q1);
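To see the IN pattern above run end to end, here is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server. The table and column names follow the question, but the rows are invented for illustration, the database prefix is dropped, and meeting_started is assumed to be stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meeting_parties (user_id INTEGER, type TEXT);
CREATE TABLE all_meetings (meeting_party_id INTEGER, meeting_style TEXT,
                           meeting_started TEXT);
-- Invented sample rows: user 9 appears twice, so the HAVING clause drops it.
INSERT INTO meeting_parties VALUES
  (22, 'organiser'), (1255, 'organiser'), (7, 'guest'),
  (9, 'organiser'), (9, 'guest');
INSERT INTO all_meetings VALUES
  (22, 'open', 'TRUE'), (22, 'lead', 'TRUE'),
  (1255, 'discussion', 'FALSE'), (7, 'open', 'TRUE'), (9, 'open', 'TRUE');
""")

rows = conn.execute("""
WITH q1 AS (
    -- The first query from the question, wrapped as a CTE.
    SELECT t1.user_id
    FROM meeting_parties AS t1
    INNER JOIN (SELECT user_id FROM meeting_parties
                GROUP BY user_id
                HAVING COUNT(user_id) = 1) AS t2
        ON t1.user_id = t2.user_id
    WHERE t1.type = 'organiser'
)
SELECT m.meeting_party_id, m.meeting_style
FROM all_meetings AS m
WHERE m.meeting_started = 'TRUE'
  AND m.meeting_party_id IN (SELECT user_id FROM q1)
ORDER BY m.meeting_style
""").fetchall()

print(rows)
```

Only the started meetings of users 22 and 1255 can qualify here; IN matches any number of IDs from q1, which is the "array compare" the question asks for.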

Related

Query build to find records where all of a series of records have a value

Let me explain a little bit about what I am trying to do, because I don't even know the vocabulary to use to ask. I have an Access 2016 database that records staff QA data. When a staff member misses a QA we assign a job aid that explains the process, and they can optionally send back a worksheet showing they learned about what was missed. If they do all of these in a 3-month period they get a credit on their QA score. So I have a series of records, all of which have a date we assigned the work (RA1) and MAY have a work returned date (RC1).
In the below image "lavalleer" has earned the credit because both of her sheets got returned. "maduncn" Did not earn the credit because he didn't do one.
I want to create a query that returns to me only the people that are like "lavalleer". I tried hitting google and searched here and access.programmers.co.uk but I'm only coming up with instructions to use Not null statements. That wouldn't work for me because if I did a IS Not Null on "maduncn" I would get the 4 records but it would exclude the null.
What I need to do is build a query where I can see staff that have dates in ALL of their RC1 fields. If any of their RC1 fields are blank, I don't want them returned.
Consider:
SELECT * FROM tablename WHERE NOT UserLogin IN (SELECT UserLogin FROM tablename WHERE RC1 IS NULL);
You could use a not exists clause with a correlated subquery, e.g.
select t.* from YourTable t where not exists
(select 1 from YourTable u where t.userlogin = u.userlogin and u.rc1 is null)
Here, select 1 is just a convention: EXISTS only tests whether the subquery returns any rows, so we don't care what the query returns, just that it has records (or doesn't have records).
Or, you could use a left join to exclude those users for which there is a null rc1 record, e.g.:
select t.* from YourTable t left join
(select u.userlogin from YourTable u where u.rc1 is null) v on t.userlogin = v.userlogin
where v.userlogin is null
In all of the above, change all occurrences of YourTable to the name of your table.
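As a quick check of the not exists approach above, here is a small sketch run through Python's sqlite3 module rather than Access. The table name qa and all of the dates are made up for illustration; only the two logins and the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: one row per assigned job aid.
CREATE TABLE qa (userlogin TEXT, ra1 TEXT, rc1 TEXT);
INSERT INTO qa VALUES
  ('lavalleer', '2019-01-05', '2019-01-09'),
  ('lavalleer', '2019-02-11', '2019-02-15'),
  ('maduncn',   '2019-01-07', '2019-01-10'),
  ('maduncn',   '2019-02-03', NULL);
""")

# Keep a login only if none of its rows has an empty RC1.
complete = conn.execute("""
SELECT DISTINCT t.userlogin
FROM qa AS t
WHERE NOT EXISTS (SELECT 1 FROM qa AS u
                  WHERE u.userlogin = t.userlogin
                    AND u.rc1 IS NULL)
""").fetchall()

print(complete)
```

A plain "rc1 IS NOT NULL" filter would still return maduncn's completed row; the correlated subquery removes the whole login as soon as one of its rows is missing a return date.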

SELECT query with conditional joins, return all rows when no data in lowest join

I'm looking for a solution where a query should return:
a) a limited set of rows when there are rows in the lowest joined table
b) all rows if there is no data in the lowest joined table
c) taking into account that it is possible that there is more than 1 such join
Objective:
we are implementing security using data. Rows from the table (MainTable) are filtered on 1 or more columns. These columns have a relationship with other tables (LookupTable). Security is defined on the LookupTable.
Example1: the MainTable contains contact information. One of the columns holds the country code, this column has a relationship with a LookupTable that contains the country codes. The user can only select a country code that exists in the LookupTable. The security admin can then define that a user can only work with contacts of one or more countries. When that user accesses the MainTable he/she will only get the contacts of that limited list of countries.
Example2: the MainTable contains products. One column holds the country of origin code, another column the product group. Security setup can limit the access to the product MainTable of a user to a list of countries AND a list of product groups.
The security setup works by Management-by-Exception, which means that the MainTable is filtered when one or more "security filters" are defined, but if no security filters are defined then the user will get ALL rows from MainTable. So my query should return a limited number of rows if any security filter is defined but should return all rows if there are no security filters defined.
Current situation:
I have been working on a query for the case of Example2. There are 4 possible scenarios:
1. No security filters are defined
expected outcome: all rows are returned
2. Security filter defined only for the first LookupTable
expected outcome: only rows matching values between LookupTable1 and security filter are returned
3. Security filter defined only for the second LookupTable
expected outcome: only rows matching values between LookupTable2 and security filter are returned
4. Security filters defined for both LookupTables
expected outcome: only rows matching values between LookupTable1 AND LookupTable2 and security filter are returned
The query I have is correct for cases 2,3 and 4 but fails for case 1 where no rows are returned (as per my understanding this is due to the fact that both JOINS return an empty result set).
Background:
The application provides the (power) users with some kind of table designer which means that they can define which columns are linked to a LookupTable and which of these LookupTables can be used for the "security filters".
This means that, potentially, we could have a MainTable with for example 200 columns of which 20 are linked to a LookupTable which are defined as security filter. The queries are stored procedures which are generated when "design" changes are saved.
With the query I have now (working for 3 out of 4 cases) the number of scenarios is equal to 2^N where N is the number of LookupTables. If N is 20 the total goes over 1 million.
Security setup is done with Profiles assigned to Users and Filter Sets assigned to Profiles and Filter Set Entries containing the actual values to filter on (if any).
The environment is currently on MS SQL 2017 but will be put into production on SQL on Azure.
Example of the query (but look further below for the link to dbfiddle):
SELECT E.col_pk, E.col_28, E.col_7, E.col_8, E.col_9, E.col_1052
FROM MainTable AS E
LEFT JOIN LookupTable2 AS L28 ON L28.col_pk = E.col_28
JOIN SecUserProfile AS UP28 ON UP28.IdentityUserId = @UserId
JOIN SecProfileFilterSets AS PFS28 ON PFS28.SecProfileId = UP28.SecProfileId
LEFT JOIN SecFilterSetEntry AS SE28 ON SE28.SecFilterSetId = PFS28.SecFilterSetId AND SE28.MdxEntityId = 2 AND SE28.EntityKey = L28.col_pk
LEFT JOIN LookupTable13 AS L1052 ON L1052.col_pk = E.col_1052
JOIN SecUserProfile AS UP1052 ON UP1052.IdentityUserId = @UserId
JOIN SecProfileFilterSets AS PFS1052 ON PFS1052.SecProfileId = UP1052.SecProfileId
LEFT JOIN SecFilterSetEntry AS SE1052 ON SE1052.SecFilterSetId = PFS1052.SecFilterSetId AND SE1052.MdxEntityId = 13 AND SE1052.EntityKey = L1052.col_pk
WHERE
(SE28.SecFilterSetId IS NOT NULL AND SE1052.SecFilterSetId IS NOT NULL)
OR
(
SE28.SecFilterSetId IS NOT NULL AND
NOT EXISTS
(
SELECT TOP 1 NUP1052.Id FROM SecUserProfile AS NUP1052
JOIN SecProfileFilterSets AS NPFS1052 ON NPFS1052.SecProfileId = NUP1052.SecProfileId
JOIN SecFilterSetEntry AS NSE1052 ON NSE1052.SecFilterSetId = NPFS1052.SecFilterSetId AND NSE1052.MdxEntityId = 13
WHERE NUP1052.IdentityUserId = @UserId
)
)
OR
(
NOT EXISTS
(
SELECT TOP 1 NUP28.Id FROM SecUserProfile AS NUP28
JOIN SecProfileFilterSets AS NPFS28 ON NPFS28.SecProfileId = NUP28.SecProfileId
JOIN SecFilterSetEntry AS NSE28 ON NSE28.SecFilterSetId = NPFS28.SecFilterSetId AND NSE28.MdxEntityId = 2
WHERE NUP28.IdentityUserId = @UserId
)
AND SE1052.SecFilterSetId IS NOT NULL
)
OR
(
NOT EXISTS
(
SELECT TOP 1 NUP28.Id FROM SecUserProfile AS NUP28
JOIN SecProfileFilterSets AS NPFS28 ON NPFS28.SecProfileId = NUP28.SecProfileId
JOIN SecFilterSetEntry AS NSE28 ON NSE28.SecFilterSetId = NPFS28.SecFilterSetId AND NSE28.MdxEntityId = 2
WHERE NUP28.IdentityUserId = @UserId
)
AND
NOT EXISTS
(
SELECT TOP 1 NUP1052.Id FROM SecUserProfile AS NUP1052
JOIN SecProfileFilterSets AS NPFS1052 ON NPFS1052.SecProfileId = NUP1052.SecProfileId
JOIN SecFilterSetEntry AS NSE1052 ON NSE1052.SecFilterSetId = NPFS1052.SecFilterSetId AND NSE1052.MdxEntityId = 13
WHERE NUP1052.IdentityUserId = @UserId
)
)
Issue:
I have the following issues but they probably boil down to 1 in the end:
my current query is only 75% correct
even if my current query is correct it cannot be used in production with the potential high(er) number of lookup tables.
performance needs to be taken into account. Just as we don't know the number of columns and lookup tables at design time we don't know how many rows the tables will contain. The main table may hold 500, 50000 or 500000 records.
In the end all this will boil down to the right solution :)
I think this is not the easiest of questions (otherwise I will feel very stupid) and for those willing to take a look I've prepared a sandbox environment on dbfiddle representing the use-case I'm working with. I've setup the query to run 4 times, once for each of the scenarios.
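One possible way to express the "filter only if a filter exists" rule described above is a single predicate per lookup column of the form (no filter defined for this column OR the row matches a filter entry). That gives N predicates instead of 2^N branches. The sketch below is only an illustration under strong assumptions: it runs on SQLite through Python's sqlite3 module, and it collapses the profile, filter set and filter set entry tables into one hypothetical SecFilter table (user_id, entity, entity_key); none of these simplified names come from the question, and nothing here says how it would perform on the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, flattened stand-ins for the question's tables.
CREATE TABLE MainTable (col_pk INTEGER, country TEXT, product_group TEXT);
CREATE TABLE SecFilter (user_id INTEGER, entity TEXT, entity_key TEXT);
INSERT INTO MainTable VALUES
  (1, 'BE', 'A'), (2, 'BE', 'B'), (3, 'FR', 'A'), (4, 'FR', 'B');
-- User 1 has no filters, user 2 only a country filter,
-- user 3 only a group filter, user 4 both.
INSERT INTO SecFilter VALUES
  (2, 'country', 'BE'),
  (3, 'group', 'A'),
  (4, 'country', 'BE'), (4, 'group', 'A');
""")

# One "no filter defined OR row matches a filter" predicate per lookup column.
QUERY = """
SELECT e.col_pk
FROM MainTable AS e
WHERE (NOT EXISTS (SELECT 1 FROM SecFilter f
                   WHERE f.user_id = :u AND f.entity = 'country')
       OR EXISTS (SELECT 1 FROM SecFilter f
                  WHERE f.user_id = :u AND f.entity = 'country'
                    AND f.entity_key = e.country))
  AND (NOT EXISTS (SELECT 1 FROM SecFilter f
                   WHERE f.user_id = :u AND f.entity = 'group')
       OR EXISTS (SELECT 1 FROM SecFilter f
                  WHERE f.user_id = :u AND f.entity = 'group'
                    AND f.entity_key = e.product_group))
ORDER BY e.col_pk
"""

def visible_rows(user_id):
    """Return the primary keys of MainTable rows the given user may see."""
    return [row[0] for row in conn.execute(QUERY, {"u": user_id})]

for user in (1, 2, 3, 4):
    print(user, visible_rows(user))
```

The four users correspond to the four scenarios listed in the question, so the loop shows the unfiltered, single-filter and double-filter outcomes side by side.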

Compare 2 columns in 2 tables with DISTINCT value

I am creating a reporting service with visual business intelligence.
I am trying to count how many users have been created under an org_id,
but the report consists of multiple org_ids, and I am having difficulty counting how many users have been created under each particular org_id.
TBL_USER
USER_ID
0001122
0001234
ABC9999
DEF4545
DEF7676
TBL_ORG
ORG_ID
000
ABC
DEF
EXPECTED OUTPUT
TBL_RESULT
USER_CREATED
000 - 2
ABC - 1
DEF - 2
in my understanding, i need nested SELECT, but so far i have come to nothing.
SELECT COUNT(TBL_USER.USER_ID) AS Expr1
FROM TBL_USER INNER JOIN TBL_ORG
WHERE TBL_USER.USER_ID LIKE 'TBL_ORG.ORG_ID%')
this is totally wrong. but i hope it might give us clue.
It looks like the USER_ID value is the concatenation of your ORG_ID and something to make it unique. I'm assuming this is from a COTS product and nothing a human would have built.
Your desire is to find out how many entries there are by department. In SQL, when you read the word by in a requirement, that implies grouping. The action you want to take is to get a count and the reserved word for that is COUNT. Unless you need something out of the TBL_ORG, I see no need to join to it
SELECT
LEFT(T.USER_ID, 3) AS USER_CREATED
, COUNT(1) AS GroupCount
FROM
TBL_USER AS T
GROUP BY
LEFT(T.USER_ID, 3)
Anything that isn't in an aggregate (COUNT, SUM, AVG, etc) must be in your GROUP BY.
SQLFiddle
I updated the fiddle to also show how you could link to TBL_ORG if you need an element from the row in that table.
-- Need to have the friendly name for an org
-- Now we need to do the join
SELECT
LEFT(T.USER_ID, 3) AS USER_CREATED
, O.SOMETHING_ELSE
, COUNT(1) AS GroupCount
FROM
TBL_USER AS T
-- inner join assumes there will always be a match
INNER JOIN
TBL_ORG AS O
-- Using a function on a column is a performance killer
ON O.ORG_ID = LEFT(T.USER_ID, 3)
GROUP BY
LEFT(T.USER_ID, 3)
, O.SOMETHING_ELSE;
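For anyone who wants to try the grouping without a SQL Server instance, here is a small sketch of the first query using Python's sqlite3 module and the sample rows from the question. SQLite has no LEFT() function, so substr(USER_ID, 1, 3) stands in for LEFT(T.USER_ID, 3); that substitution is mine, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- USER_ID is declared TEXT so the leading zeros are preserved.
CREATE TABLE TBL_USER (USER_ID TEXT);
INSERT INTO TBL_USER VALUES
  ('0001122'), ('0001234'), ('ABC9999'), ('DEF4545'), ('DEF7676');
""")

# Group on the three-character org prefix and count the users in each group.
rows = conn.execute("""
SELECT substr(T.USER_ID, 1, 3) AS USER_CREATED,
       COUNT(*) AS GroupCount
FROM TBL_USER AS T
GROUP BY substr(T.USER_ID, 1, 3)
ORDER BY USER_CREATED
""").fetchall()

print(rows)
```

The result has one row per prefix with its user count, which is the shape of the expected output in the question.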

SQL JOIN returning multiple rows when I only want one row

I am having a slow brain day...
The tables I am joining:
Policy_Office:
PolicyNumber OfficeCode
1 A
2 B
3 C
4 D
5 A
Office_Info:
OfficeCode AgentCode OfficeName
A 123 Acme
A 456 Acme
A 789 Acme
B 111 Ace
B 222 Ace
B 333 Ace
... ... ....
I want to perform a search to return all policies that are affiliated with an office name. For example, if I search for "Acme", I should get two policies: 1 & 5.
My current query looks like this:
SELECT
*
FROM
Policy_Office P
INNER JOIN Office_Info O ON P.OfficeCode = O.OfficeCode
WHERE
O.OfficeName = 'Acme'
But this query returns multiple rows, which I know is because there are multiple matches from the second table.
How do I write the query to only return two rows?
SELECT DISTINCT a.PolicyNumber
FROM Policy_Office a
INNER JOIN Office_Info b
ON a.OfficeCode = b.OfficeCode
WHERE b.officeName = 'Acme'
SQLFiddle Demo
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
An inner join returns one row for every matching pair of rows. You have 2 'A' rows in the first table and 3 'A' rows in the second table, so you probably get 2 x 3 = 6 results. If you want only the policy number then you should do a distinct on it.
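The row multiplication is easy to reproduce. This sketch loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module (an assumption; the question does not name its database) and compares the plain join with the DISTINCT version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Policy_Office (PolicyNumber INTEGER, OfficeCode TEXT);
CREATE TABLE Office_Info (OfficeCode TEXT, AgentCode INTEGER, OfficeName TEXT);
INSERT INTO Policy_Office VALUES (1,'A'), (2,'B'), (3,'C'), (4,'D'), (5,'A');
INSERT INTO Office_Info VALUES
  ('A', 123, 'Acme'), ('A', 456, 'Acme'), ('A', 789, 'Acme'),
  ('B', 111, 'Ace'),  ('B', 222, 'Ace'),  ('B', 333, 'Ace');
""")

# Plain join: every policy in office A pairs with every agent row of office A.
joined = conn.execute("""
SELECT P.PolicyNumber, O.AgentCode
FROM Policy_Office AS P
INNER JOIN Office_Info AS O ON P.OfficeCode = O.OfficeCode
WHERE O.OfficeName = 'Acme'
""").fetchall()

# DISTINCT on the policy number collapses the repeated matches.
distinct = [row[0] for row in conn.execute("""
SELECT DISTINCT a.PolicyNumber
FROM Policy_Office AS a
INNER JOIN Office_Info AS b ON a.OfficeCode = b.OfficeCode
WHERE b.OfficeName = 'Acme'
ORDER BY a.PolicyNumber
""")]

print(len(joined), distinct)
```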
(using MS-Sqlserver)
I know this thread is 10 years old, but I don't like distinct (in my head it means that the engine gathers all possible data, computes every selected row in each record into a hash and adds it to a tree ordered by that hash; I may be wrong, but it seems inefficient).
Instead, I use CTE and the function row_number(). The solution may very well be a much slower approach, but it's pretty, easy to maintain and I like it:
Given is a person and a telephone table tied together with a foreign key (in the telephone table). This construct means that a person can have more than one number, but I only want the first, so that each person appears only once in the result set (I ought to be able to concatenate multiple telephone numbers into one string (pivot, I think), but that's another issue).
; -- don't forget this one!
with telephonenumbers
as
(
select [id]
, [person_id]
, [number]
, row_number() over (partition by [person_id] order by [activestart] desc) as rowno
from [dbo].[telephone]
where ([activeuntil] is null or [activeuntil] > getdate())
)
select p.[id]
,p.[name]
,t.[number]
from [dbo].[person] p
left join telephonenumbers t on t.person_id = p.id
and t.rowno = 1
This does the trick (in fact the last line does), and the syntax is readable and easy to expand. The example is simple, but when creating large scripts that join tables left and right (literally), it is difficult to keep unwanted duplicates out of the result, and difficult to identify which tables create them. CTEs work great for me.
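Here is a rough, runnable version of the same idea on SQLite through Python's sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions), swaps getdate() for SQLite's date('now'), and uses invented people and phone numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER, name TEXT);
CREATE TABLE telephone (id INTEGER, person_id INTEGER, number TEXT,
                        activestart TEXT, activeuntil TEXT);
-- Invented rows: Ann has two active numbers, Bob has none,
-- and Cy's only number expired long ago.
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO telephone VALUES
  (1, 1, '555-0101', '2020-01-01', NULL),
  (2, 1, '555-0202', '2022-01-01', NULL),
  (3, 3, '555-0303', '1999-01-01', '2000-01-01');
""")

rows = conn.execute("""
WITH telephonenumbers AS (
    -- Number each person's active phones, newest first.
    SELECT id, person_id, number,
           ROW_NUMBER() OVER (PARTITION BY person_id
                              ORDER BY activestart DESC) AS rowno
    FROM telephone
    WHERE activeuntil IS NULL OR activeuntil > date('now')
)
SELECT p.id, p.name, t.number
FROM person AS p
LEFT JOIN telephonenumbers AS t
       ON t.person_id = p.id AND t.rowno = 1
ORDER BY p.id
""").fetchall()

print(rows)
```

Because rowno = 1 sits in the join condition rather than in a WHERE clause, people without an active number still appear, with a NULL number.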

MySQL Query, Join, and Myself, or how I always take the hard way through

I'm creating a small forum.
Attempting to run a SELECT... JOIN... query to pick up information on the individual posts, plus the last reply (if any). As part of my desire to do everything the hard way, this covers five tables (only columns relevant to this issue are stated)
commentInfo referenceID | referenceType | authorID | create
postit id | title
postitInfo referencePostitID | create | authorID
user id | username | permission
userInfo referenceUserID | title
So, I run this query SELECT... JOIN... query to get the most recent topics and their last replies.
SELECT DISTINCT
t1.id, t1.title, t2.create, t2.lastEdit, t2.authorID, t3.username,
t4.title AS userTitle, t3.permission, t5.create AS commentCreate,
t5.authorID AS commentAuthor, t6.username AS commentUsername,
t6.permission AS commentPermission
FROM rantPostit AS t1
LEFT JOIN (rantPostitInfo AS t2)
ON ( t1.id = t2.referencePostitID)
LEFT OUTER JOIN (rantUser as t3, rantUserInfo as t4)
ON (t2.authorId = t3.id AND t4.referenceUserId = t2.authorId)
LEFT OUTER JOIN (rantCommentInfo as t5, rantUser as t6)
ON (t5.referenceType = 8 AND t5.referenceID = t1.id AND t6.id = t5.authorID)
ORDER BY t2.create DESC, t5.create DESC
Now, this returns the topic posts. Say I have two of them, it returns both of them fine. Say I have eight replies to the first, it will return 9 entries (one each for the topic + reply, and the individual one with no replies). So, I guess my issue is this: I don't know what to do to limit the number of returns in the final LEFT OUTER JOIN clause to just the most recent, or simply strike the least recent ones out of the window.
(Yes, I realize the ORDER BY... clause is messed up, as it'll first order it by the post create date, then by the comment create date. Yes, I realize I could simplify all my problems by adding two fields into postitInfo, lastCommentCreate and lastCommentCreateID, and have it update each time a reply is made, but... I like the hard way.)
So what am I doing wrong?
Or is this such an inane problem that I should be taken 'round the woodshed and beat with a hammer?
The splits between post and postInfo, and the user and userInfo tables, appear to be doing nothing much here except obfuscate things. To better see solutions, let's boil things down to their essence: a table Posts (with a primary key id, a creation date date, and other fields) and a table Comments (with a primary key id, a foreign key refId referencing Posts, a unique creation date date, and other fields); we want to see all posts, each with its most recent comment if any (the primary keys id of the table rows retrieved, and the other fields, can of course be contextually used in the SELECT to fetch and show more info yet, but that doesn't change the core structure, and simplifying things down to the core structure should help illustrate the solutions). I'm assuming the creation date of a comment is unique, otherwise "latest comment" can be ambiguous (of course, that ambiguity could be arbitrarily truncated in other ways, picking one item of the set of "latest comments" to a given post).
So, here's one approach:
SELECT Posts.id, Comments.id FROM Posts
LEFT OUTER JOIN Comments ON (Posts.id = Comments.refId)
WHERE Comments.create IS NULL OR (
Comments.create = (SELECT c.create FROM Comments c
WHERE c.refId = Posts.id
ORDER BY c.create DESC
LIMIT 1)
) /* create is a reserved word, so it is always qualified here; add ORDER BY &c to taste;-) */
The idea: for each post, we want "a null comment" (when there has been no comment on it) or else the comment whose create date is the highest among those referencing the post; here, the inner SELECT takes care of finding that "highest" create date. So, in the same spirit, the inner select might be SELECT MAX(c.create) FROM Comments c WHERE c.refId = Posts.id which is probably preferable (as shorter and more direct, and maybe faster).
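A small sketch of the MAX variant, run on SQLite via Python's sqlite3 module rather than MySQL. The Posts and Comments tables follow the simplified structure described above, except that the date column is named created here (my choice, to sidestep the reserved word), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts (id INTEGER PRIMARY KEY);
CREATE TABLE Comments (id INTEGER PRIMARY KEY, refId INTEGER, created TEXT);
-- Post 1 has three comments, post 2 has none.
INSERT INTO Posts VALUES (1), (2);
INSERT INTO Comments VALUES
  (10, 1, '2020-01-01'), (11, 1, '2020-02-01'), (12, 1, '2020-03-01');
""")

rows = conn.execute("""
SELECT Posts.id, Comments.id
FROM Posts
LEFT OUTER JOIN Comments ON Posts.id = Comments.refId
WHERE Comments.created IS NULL
   OR Comments.created = (SELECT MAX(c.created)
                          FROM Comments AS c
                          WHERE c.refId = Posts.id)
ORDER BY Posts.id
""").fetchall()

print(rows)
```

Each post comes back once: paired with its newest comment if it has any, or with NULL otherwise.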
It looks like the last LEFT JOIN is the only one that can return multiple rows. If that's true, you can just use LIMIT 5 to get the last five comments:
ORDER BY t5.create DESC
LIMIT 5
If not, a very simple solution would be to retrieve the comments with a separate query:
SELECT *
FROM rantCommentInfo t5
LEFT OUTER JOIN rantUser t6
ON t6.id = t5.authorID
WHERE t5.referenceType = 8
AND t5.referenceID = YourT1Id
ORDER BY t5.create DESC
LIMIT 5
I can't think of a way to do it in one query without ROW_NUMBER, which MySQL did not support before version 8.0.