SQL to search with a subsets of records

SQL to search with a subsets of records - sql

Is there any way to select all subsets from table A where corresponding subsets exist in table B? Each subset in table A must have at least all the entries that corresponding subset in table B.
In this link it's called "Division with a Remainder" but my problem is more complex because I've got many to many relation.
Table A
UserName text | File text | AccessLevel int
Table B
AppName text | File text | AccessLevel int
Sample data
Table A
User1 | aaa.txt | 1
User1 | bbb.txt | 3
User1 | ccc.txt | 1
User2 | aaa.txt | 3
Table B
Appl1 | aaa.txt | 1
Appl1 | bbb.txt | 1
Appl2 | aaa.txt | 1
Appl3 | bbb.txt | 5
Appl4 | aaa.txt | 1
Appl4 | bbb.txt | 1
Appl4 | ccc.txt | 1
Appl4 | ddd.txt | 1
Expected results:
User1 | Appl1
User1 | Appl2
User2 | Appl2
User1 has "complete" access to applications Appl1 and Appl2 because he has necessary access to ALL files used by these applications. He doesn't have access to application Appl3 because access level is not high enough. He doesn't have access to application Appl4 because he doesn't have access to file ddd.txt.
Basically I need to compare subsets of records and return all cases where subset in table B is equal or greater than subset in table A. Is there any way to do it in SQL?
Any help appreciated.

One method is a self-join and then comparing the number of matching records to the number needed for the application.
Assuming no duplicates:
select a.username, b.appname
from a join
(select b.*, count(*) over (partition by b.appname) as cnt
from b
) b
on a.filetext = b.filetext and
a.accesslevel >= b.accesslevel
group by a.username, b.appname, b.cnt
having count(*) = b.cnt

SELECT distinct A1.USERNAME, B1.APPLICATION
FROM TABLEUSER A1
INNER JOIN TABLEAPPLI B1 on B1.filename=A1.filename and B1.accesslevel<=A1.accesslevel
where not exists
(
select *
from TABLEUSER A2 INNER JOIN TABLEAPPLI B2 on B2.filename<>A2.filename or B2.accesslevel>A2.accesslevel
where (A1.USERNAME, B1.APPLICATIONAME) = (A2.USERNAME, B2.APPLICATIONAME)
)

Related

Comparing aggregated columns to non aggregated columns to remove matches

I have two separate tables from two different databases that are performing a matching check.
If the values match I want them out of the result set. The first table (A) has multiple entries that contain the same symbol matches for the matching columns in the second table (B).
The entries in table B, if added up will ideally equal the value of one of the matching rows of A.
The tables look like below when queried separately.
Underneath the tables is what my query currently looks like. I thought if I group the columns by the symbols I could use the SUM of B to add up to the value of A which would get rid of the entries. However, I think because I am summing from B and not from A, then the A doesn't count as an aggregated column so must be included in the group by and doesn't allow for the summing to work in the way I'm wanting it to calculate.
How would I be able to run this query so the values in B are all summed up. Then, if matching to the symbol/value from any of the entries in A, don't get included in the result set?
Table A
| Symbol | Value |
|--------|-------|
| A | 1000 |
| A | 1000 |
| B | 1440 |
| B | 1440 |
| C | 1235 |
Table B
| Symbol | Value |
|--------|-------|
| A | 750 |
| A | 250 |
| B | 24 |
| B | 1416|
| C | 1874|
SELECT DBA.A, DBB.B
FROM DatabaseA DBA
INNER JOIN DatabaseB DBB on DBA.Symbol = DBB.Symbol
and DBA.Value != DBB.Value
group by DBA.Symbol, DBB.Symbol, DBB.Value
having SUM(DBB.Value) != DBA.Value
order by Symbol, Value
Edited to add ideal results
Table C
| SymbolB| ValueB| SymbolA | ValueA |
|--------|-------|---------|--------|
| C | 1874 | C | 1235 |
Wherever B adds up to A remove both. If they don't add, leave number inside result set

I will use CTE and use this common table expression (CTE) to search in Table A. Then join table A and table B on symbol.
WITH tDBB as (
SELECT DBB.Symbol, SUM(DBB.Value) as total
FROM tableB as DBB
GROUP BY DBB.Symbol
)
SELECT distinct DBB.Symbol as SymbolB, DBB.Value as ValueB, DBA.Symbol as SymbolA, DBA.Value as ValueA
FROM tableA as DBA
INNER JOIN tableB as DBB on DBA.Symbol = DBB.Symbol
WHERE DBA.Symbol in (Select Symbol from tDBB)
AND NOT DBA.Value in (Select total from tDBB)
Result:
|symbolB |valueB |SymbolA |ValueA |
|--------|-------|--------|-------|
| C | 1874 | C | 1235 |

with t3 as (
select symbol
,sum(value) as value
from t2
group by symbol
)
select *
from t3 join t on t.symbol = t3.symbol and t.value != t3.value
symbol
value
Symbol
Value
C
1874
C
1235
Fiddle

Only select from tables that are of interest

SQL Server 2016
I have a number of tables
Table A Table B Table C Table D
User | DataA User | DataB User | DataC User | DataD
=========== =========== =================== =============
1 | 10 1 | 'hello' 4 | '2020-01-01' 1 | 0.34
2 | 20 2 | 'world'
3 | 30
So some users have data for A,B,C and/or D.
Table UserEnabled
User | A | B | C | D
=============================
1 | 1 | 1 | 0 | 0
2 | 1 | 1 | 0 | 0
3 | 1 | 0 | 0 | 0
4 | 0 | 0 | 1 | 0
Table UserEnabled indicates whether we are interested in any of the data in the corresponding tables A,B,C and/or D.
Now I want to join those tables on User but I do only want the columns where the UserEnabled table has at least one user with a 1 (ie at least one user enabled). Ideally I only want to join the tables that are enabled and not filter the columns from the disabled tables afterwards.
So as a result for all users I would get
User | DataA | DataB | DataC
===============================
1 | 10 | 'hello' | NULL
2 | 20 | 'world' | NULL
3 | 30 | NULL | NULL
4 | NULL | NULL | '2020-01-01'
No user has D enabled so it does not show up in a query
I was going to come up with a dynamic SQL that's built every time I execute the query depending on the state of UserEnabled but I'm afraid this is going to perform poorly on a huge data set as the execution plan will need to be created every time. I want to dynamically display only the enabled data, not columns with all NULL.
Is there another way?
Usage will be a data sheet that may be generated up to a number of times per minute.

You have no choice but to approach this through dynamic SQL. A select query has a fixed set of columns defined when the query is created. No such thing as "variable" columns.
What can you do? One method is to "play a trick". Store the columns as JSON (or XML) and delete the empty columns.
Another method is to create a view that has the specific logic you need. I think you can maintain this view by altering it in a trigger, based on when data in the enabled table changes. That said, altering the view requires dynamic SQL so the code will not be pretty.

Just because I thought this could be fun.
Example
Declare #Col varchar(max) = ''
Declare #Src varchar(max) = ''
Select #Col = #Col+','+Item+'.[Data'+Item+']'
,#Src = #Src+'Left Join [Table'+Item+'] '+Item+' on U.[User]=['+Item+'].[User] and U.['+Item+']=1'+char(13)
From (
Select Item
From ( Select A=max(A)
,B=max(B)
,C=max(C)
,D=max(D)
From UserEnabled
Where 1=1 --<< Use any Key Inital Filter Condition Here
) A
Unpivot ( value for item in (A,B,C,D)) B
Where Value=1
) A
Declare #SQL varchar(max) = '
Select U.[User]'+#Col+'
From #UserEnabled U
'+#Src
--Print #SQL
Exec(#SQL)
Returns
User DataA DataB DataC
1 10 Hello NULL
2 20 World NULL
3 30 NULL NULL
4 NULL NULL 2020-01-01
The Generated SQL
Select A.[User],A.[DataA],B.[DataB],C.[DataC]
From UserEnabled U
Left Join TableA A on U.[User]=[A].[User] and U.[A]=1
Left Join TableB B on U.[User]=[B].[User] and U.[B]=1
Left Join TableC C on U.[User]=[C].[User] and U.[C]=1

If all the relations are 1:1, you can make one query with
...
FROM u
LEFT JOIN a ON u.id = a.u_id
LEFT JOIN b ON u.id = b.u_id
LEFT JOIN c ON u.id = c.u_id
LEFT JOIN d ON u.id = d.u_id
...
and use display logic on the client to omit the irrelevant columns.
If more than one relation is 1:N, then you'd likely have to do multiple queries anyway to prevent N1xN2 results.

Display two linked values for two id field pointing the same table

In my PostgreSQL database I have a table like this one:
id|link1|link2|
---------------
1 | 34 | 66
2 | 23 | 8
3 | 11 | 99
link1 and link2 fields are both pointing to the same table table2 which has id and descr fields.
I would make an SQL query that returns the same row the id and the descr value for the two field like this:
id|link1|link2|desc_l1|desc_l2|
-------------------------------
1 | 34 |66 | bla | sisusj|
2 | 23 | 8 | ghhj | yui |
3 | 11 | 99 | erd | bnbn |
I've try different queries, but everyone returns two rows per id instead of one.
How can I achieve these results in my PostgreSQL 9.04 database?

Normally, this query should work for you. Assume your first table name's table_name.
SELECT t.id, t.link1, t.link2,
l1.descr AS desc_l1,
l2.descr AS desc_l2
FROM table_name t
LEFT JOIN table2 l1
ON t.link1 = l1.id
LEFT JOIN table2 l2
ON t.link2 = l2.id;

you can use case here Like:
select link1,link2,
case
when link1='34' and link2='66' then 'bla'
when link1='23' and link2='8' then 'ghs'
when link1='11' and link2='99' then 'erd'
end as desc_li,
case
when link1='34' and link2='66' then 'sjm'
when link1='23' and link2='8' then 'yur'
when link1='11' and link2='99' then 'bnn'
end as desc_l2
from table1

join on three tables? Error in phpMyAdmin

I'm trying to use a join on three tables query I found in another post (post #5 here). When I try to use this in the SQL tab of one of my tables in phpMyAdmin, it gives me an error:
#1066 - Not unique table/alias: 'm'
The exact query I'm trying to use is:
select r.*,m.SkuAbbr, v.VoucherNbr from arrc_RedeemActivity r, arrc_Merchant m, arrc_Voucher v
LEFT OUTER JOIN arrc_Merchant m ON (r.MerchantID = m.MerchantID)
LEFT OUTER JOIN arrc_Voucher v ON (r.VoucherID = v.VoucherID)
I'm not entirely certain it will do what I need it to do or that I'm using the right kind of join (my grasp of SQL is pretty limited at this point), but I was hoping to at least see what it produced.
(What I'm trying to do, if anyone cares to assist, is get all columns from arrc_RedeemActivity, plus SkuAbbr from arrc_Merchant where the merchant IDs match in those two tables, plus VoucherNbr from arrc_Voucher where VoucherIDs match in those two tables.)
Edited to add table samples
Table arrc_RedeemActivity
RedeemID | VoucherID | MerchantID | RedeemAmt
----------------------------------------------
1 | 2 | 3 | 25
2 | 6 | 5 | 50
Table arrc_Merchant
MerchantID | SkuAbbr
---------------------
3 | abc
5 | def
Table arrc_Voucher
VoucherID | VoucherNbr
-----------------------
2 | 12345
6 | 23456
So ideally, what I'd like to get back would be:
RedeemID | VoucherID | MerchantID | RedeemAmt | SkuAbbr | VoucherNbr
-----------------------------------------------------------------------
1 | 2 | 3 | 25 | abc | 12345
2 | 2 | 5 | 50 | def | 23456

The problem was you had duplicate table references - which would work, except for that this included table aliasing.
If you want to only see rows where there are supporting records in both tables, use:
SELECT r.*,
m.SkuAbbr,
v.VoucherNbr
FROM arrc_RedeemActivity r
JOIN arrc_Merchant m ON m.merchantid = r.merchantid
JOIN arrc_Voucher v ON v.voucherid = r.voucherid
This will show NULL for the m and v references that don't have a match based on the JOIN criteria:
SELECT r.*,
m.SkuAbbr,
v.VoucherNbr
FROM arrc_RedeemActivity r
LEFT JOIN arrc_Merchant m ON m.merchantid = r.merchantid
LEFT JOIN arrc_Voucher v ON v.voucherid = r.voucherid

MS Access Pass Through Query find duplicates using multiple tables

I'm trying to find all coverage_set_id with more than one benefit_id attached summary_attribute (value=2004687).
The query seems to be working fine without the GROUP BY & HAVING parts, but once I add those lines in (for the COUNT) my results are incorrect. Just trying to get duplicate coverage_set_id.
Pass-Through query via OBDC database:
SELECT DISTINCT
b.coverage_set_id,
COUNT (b.coverage_set_id) AS "COUNT"
FROM
coverage_set_detail_view a
JOIN contracts_by_sub_group_view b ON b.coverage_set_id = a.coverage_set_id
JOIN request c ON c.request_id = b.request_id
WHERE
b.valid_from_date BETWEEN to_date('10/01/2010','mm/dd/yyyy')
AND to_date('12/01/2010','mm/dd/yyyy')
AND c.request_status = 1463
AND summary_attribute = 2004687
AND benefit_id <> 1092333
GROUP BY
b.coverage_set_id
HAVING
COUNT (b.coverage_set_id) > 1
My results look like this:
-----------------------
COVERAGE_SET_ID | COUNT
-----------------------
4193706 | 8
4197052 | 8
4193926 | 112
4197078 | 96
4174168 | 8
I'm expecting all the COUNTs to be 2.
::EDIT::
Solution:
SELECT
c.coverage_set_id AS "COVERAGE SET ID",
c1.description AS "Summary Attribute",
count(d.benefit_id) AS "COUNT"
FROM (
SELECT DISTINCT coverage_set_id
FROM contracts_by_sub_group_view
WHERE
valid_from_date BETWEEN '01-OCT-2010' AND '01-DEC-2010'
AND request_id IN (
SELECT request_id
FROM request
WHERE request_status = 1463)
) a
JOIN coverage_set_master e ON e.coverage_set_id = a.coverage_set_id
JOIN coverage_set_detail c ON c.coverage_set_id = a.coverage_set_id
JOIN benefit_summary d ON d.benefit_id = c.benefit_id
AND d.coverage_type = e.coverage_type
JOIN codes c1 ON c1.code_id = d.summary_attribute
WHERE
d.summary_attribute IN (2004687, 2004688)
AND summary_structure = 1000217
GROUP BY c.coverage_set_id, c1.description
HAVING COUNT(d.benefit_id) > 1
ORDER BY c.coverage_set_id, c1.description
And these were the results:
COVERAGE SET ID | SUMMARY ATTRIBUTE | COUNT
-------------------------------------------------
4174168 | INPATIENT | 2
4174172 | INPATIENT | 2
4191828 | INPATIENT | 2
4191832 | INPATIENT | 2
4191833 | INPATIENT | 2
4191834 | INPATIENT | 2
4191838 | INPATIENT | 2
4191842 | INPATIENT | 2
4191843 | INPATIENT | 2
4191843 | OUTPATIENT | 2
4191844 | INPATIENT | 2
4191844 | OUTPATIENT | 2

The coverage_set_id in both the HAVING and count part of the SELECT should be benefit_id.
Since benefit_id is also in table a you can do the following
SELECT
a.coverage_set_id,
COUNT (a.benefit_id) AS "COUNT"
FROM
coverage_set_detail_view a
WHERE
a.coverage_set_id in (
SELECT b.coverage_set_id
FROM contracts_by_sub_group_view b
WHERE b.valid_from_date BETWEEN to_date('10/01/2010','mm/dd/yyyy') AND to_date('12/01/2010','mm/dd/yyyy'))
AND a.coverage_set_id in (
SELECT b2.coverage_set_id
FROM contracts_by_sub_group_view b2
INNER JOIN request c on c.request_id=b2.request_id
WHERE c.request_status = 1463)
AND ?.summary_attribute = 2004687
AND a.benefit_id <> 1092333
GROUP BY
a.coverage_set_id
HAVING
COUNT (a.benefit_id) > 1
This removes the JOIN magnification that was occurring on the FROM since those tables are not needed to pull coverage_set_id and benefit_id. The only remaining need for the other 2 tables is to filter out data based on criteria, which is in the WHERE clause.
I'm not sure what table summary_attribute lives in but it would follow a similar pattern to valid_from_date, request_status, or benefit_id.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL to search with a subsets of records - sql

Related

Comparing aggregated columns to non aggregated columns to remove matches

Only select from tables that are of interest

Display two linked values for two id field pointing the same table

join on three tables? Error in phpMyAdmin

MS Access Pass Through Query find duplicates using multiple tables

Categories

Resources