Union 2 table with diff schema - sql

Following is the requirement and the given table schema.
Write a query in Canvas using Table 1, Table 2 & Table 3, only for books available in both tables, with the conditions below:
It has to be non-fiction and have a rating above 4.2.
Introduce a column using price to show grouping.
Union Table 3.
Table Schema
I was able to solve the first point. However, I am not able to wrap my head around points 2 & 3. Do you think it's an incorrect question?
My Query:
Select table2.book, table1.genre, table2.ratings, table2.reviews, table2.type, table2.price
from table1 inner join table2 on table1.book_name = table2.book
where table1.genre = 'non-fiction'
and table2.ratings > 4.2

It is not possible to UNION two result sets with a different number of columns.
UNION is a set operator that combines data vertically across tables, so both sides must have the same number of columns, with compatible data types, in the same order.
The trick here is to use NULL for the columns that exist on only one side if you want to use UNION. That way the number of columns matches.
(select table2.book, table2.author, NULL as noOfCopies, table1.genre, table2.ratings, table2.reviews, table2.type, sum(table2.price) as totalprice
from table1 inner join table2 on table1.book_name = table2.book
where table1.genre = 'non-fiction' and table2.ratings > 4.2
group by table2.book, table2.author, table1.genre, table2.ratings, table2.reviews, table2.type)
UNION
select NULL as book, author_name, noOfCopies, NULL as genre, NULL as ratings, NULL as reviews, NULL as type, NULL as totalprice from table3;
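The NULL-padding idea can be tried out on a small scale. The following is a minimal sketch using Python's built-in sqlite3 module; the `books` table stands in for the joined result of table1 and table2, and all rows are made up for illustration.

```python
import sqlite3

# Hypothetical miniature tables; column names follow the queries above,
# the rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books  (book TEXT, author TEXT, price REAL);
    CREATE TABLE table3 (author_name TEXT, noOfCopies INTEGER);
    INSERT INTO books  VALUES ('Book A', 'Author X', 12.5);
    INSERT INTO table3 VALUES ('Author Y', 300);
""")

# Each branch supplies four columns in the same order; NULL fills the
# columns that the branch's table does not have.
rows = conn.execute("""
    SELECT book, author, NULL AS noOfCopies, price FROM books
    UNION
    SELECT NULL, author_name, noOfCopies, NULL FROM table3
    ORDER BY author
""").fetchall()

for row in rows:
    print(row)
```

The column names of the result come from the first branch, which is why the second branch does not need aliases.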

To bring in table3, I used an inner join to link table2 and table3 on the author name:
Select table2.book, table1.genre, table2.ratings, table2.reviews, table2.type, sum(table2.price), sum(noofcopies)
from table1
inner join table2 on table1.book_name = table2.book
inner join table3 on table2.author=table3.author_name
where table1.genre = 'non-fiction' and table2.ratings > 4.2
group by table2.book, table1.genre, table2.ratings, table2.reviews,table2.type

Related

Need help in joining tables to get the accurate results

I have 4 tables. The 1st table, 'LD0P0K', is the main table. I need to join it with the 2nd, 'LD0P0K01', and the 3rd, 'LD0P0K04', on the 1st column value 'HUSHLNR' to get the 'PNR' from both tables, and then join with the last table, 'LD0P0A', to get the values by 'PNR'.
I tried the solution below, but it does not return the data from the 3rd table, and it gives 6 records with 2 rows for each record of the 2nd table.
Select HS.HUSHLNR, HS.FOMDAT,HS.TOMDAT,HSM.PNR,HSM.FAMHUVUD,HSM.MARK,HSM.AVFOMDAT,HSM.AVTOMDAT,P.KUNDNUMMER,P.FODDAT from LD0P0K HS
LEFT OUTER JOIN LD0P0K01 HSM on HS.HUSHLNR = HSM.HUSHLNR
LEFT OUTER JOIN LD0P0K04 CHM on HS.HUSHLNR = CHM.HUSHLNR
LEFT OUTER JOIN LD0P0A P on p.PNR= HSM.PNR AND p.PNR = CHM.PNR
Where HS.HUSHLNR='906'
You can simply use UNION to combine the fields.
Here is a link that explains UNION:
SQL Union
And here is the code; change it however you need:
Select HS.HUSHLNR as HUSHLNR, HS.FOMDAT as FromDte,HSM.PNR as PNR from LD0P0K HS
inner JOIN LD0P0K01 HSM on HS.HUSHLNR = HSM.HUSHLNR
union
Select CHM.HUSHLNR as HUSHLNR, '' as FromDte,CHM.PNR as PNR from LD0P0K04 CHM
order by PNR

SQL joined tables are causing duplicates

So table A is an overall table of policy_id information, while table b is policy_id's with claims attached. Not all of the id's in A exist in B, but I want to join the two tables and sum(total claims).
The issue is that the sum is way higher than the actual sum within the table itself.
Here is what I've tried so far:
select a.policy_id, coalesce(sum(b.claim_amt), 0)
from database.table1 as a
left join database2.table2 as b on a.policy_id = b.policy_id
where product_code = 'CI'
group by a.policy_id
The id's that don't exist in b show up just fine with a 0 next to them, it's the ones that do exist where the claim_amt's seem like they're being duplicated heavily in the sum.
I suspect the policy_id values in table1 are not unique, which leads to the doubled, tripled, etc. amounts.
You could aggregate the sums from table2 in a CTE to get around this.
WITH CTE AS (
SELECT
policy_id,
sum(claim_amt) as sum_amt
FROM database2.table2
group by policy_id
)
select a.policy_id, coalesce(b.sum_amt, 0) as sum_amt
from database.table1 as a
left join CTE as b on a.policy_id = b.policy_id
where product_code = 'CI'
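To see why the sum inflates, here is a small sketch with Python's sqlite3 module and made-up rows, assuming (as suspected above) that a policy_id is repeated in table1. The `DISTINCT` in the second query is an addition of this sketch, used to collapse the repeated table1 rows in the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (policy_id INTEGER, product_code TEXT);
    CREATE TABLE table2 (policy_id INTEGER, claim_amt REAL);
    -- policy 1 appears twice in table1 (the suspected cause)
    INSERT INTO table1 VALUES (1, 'CI'), (1, 'CI'), (2, 'CI');
    INSERT INTO table2 VALUES (1, 100.0), (1, 50.0);
""")

# Joining first: every claim row is repeated once per matching table1 row.
naive = conn.execute("""
    SELECT a.policy_id, COALESCE(SUM(b.claim_amt), 0)
    FROM table1 AS a
    LEFT JOIN table2 AS b ON a.policy_id = b.policy_id
    WHERE a.product_code = 'CI'
    GROUP BY a.policy_id
    ORDER BY a.policy_id
""").fetchall()

# Aggregating first: each policy has one pre-summed row before the join.
fixed = conn.execute("""
    WITH claims AS (
        SELECT policy_id, SUM(claim_amt) AS sum_amt
        FROM table2
        GROUP BY policy_id
    )
    SELECT DISTINCT a.policy_id, COALESCE(c.sum_amt, 0)
    FROM table1 AS a
    LEFT JOIN claims AS c ON a.policy_id = c.policy_id
    WHERE a.product_code = 'CI'
    ORDER BY a.policy_id
""").fetchall()

print(naive)  # policy 1 is summed over 2 x 2 joined rows
print(fixed)  # policy 1 shows the true claim total
```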

SQL, only if matching all foreign key values to return the record?

I have two tables
Table A
type_uid, allowed_type_uid
9,1
9,2
9,4
1,1
1,2
24,1
25,3
Table B
type_uid
1
2
From table A I need to return
9
1
Using a WHERE IN clause I can return
9
1
24
SELECT
TableA.type_uid
FROM
TableA
INNER JOIN
TableB
ON TableA.allowed_type_uid = TableB.type_uid
GROUP BY
TableA.type_uid
HAVING
COUNT(distinct TableB.type_uid) = (SELECT COUNT(distinct type_uid) FROM TableB)
Join the two tables together, so that you only have the records matching the types you are interested in.
Group the result set by TableA.type_uid.
Check that each group has the same number of allowed_type_uid values as exist in TableB.type_uid.
distinct is required only if there can be duplicate records in either table. If both tables are known to only have unique values, the distinct can be removed.
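As a quick check of the steps above, this sketch loads the sample rows from the question into an in-memory SQLite database (via Python's sqlite3 module) and runs the same GROUP BY / HAVING query. Only the ORDER BY is added, to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (type_uid INTEGER, allowed_type_uid INTEGER);
    CREATE TABLE TableB (type_uid INTEGER);
    INSERT INTO TableA VALUES (9,1),(9,2),(9,4),(1,1),(1,2),(24,1),(25,3);
    INSERT INTO TableB VALUES (1),(2);
""")

# Keep only the groups that matched every distinct value in TableB.
matches = [row[0] for row in conn.execute("""
    SELECT TableA.type_uid
    FROM TableA
    INNER JOIN TableB ON TableA.allowed_type_uid = TableB.type_uid
    GROUP BY TableA.type_uid
    HAVING COUNT(DISTINCT TableB.type_uid)
         = (SELECT COUNT(DISTINCT type_uid) FROM TableB)
    ORDER BY TableA.type_uid
""")]

print(matches)
```

Type 24 is dropped because it matches only one of the two TableB values, and type 25 never survives the inner join.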
It should also be noted that as TableA grows in size, this type of query will quickly degrade in performance. This is because indexes are not actually much help here.
It can still be a useful structure, but not one where I'd recommend running the queries in real-time. Rather use it to create another persisted/cached result set, and use this only to refresh those results as/when needed.
Or a slightly cheaper version (resource wise):
SELECT
Data.type_uid
FROM
A AS Data
CROSS JOIN
B
LEFT JOIN
A
ON Data.type_uid = A.type_uid AND B.type_uid = A.allowed_type_uid
GROUP BY
Data.type_uid
HAVING
MIN(ISNULL(A.allowed_type_uid,-999)) != -999
Your explanation is not very clear. I think you want to get those type_uid's from table A where for all records in table B there is a matching A.Allowed_type_uid.
SELECT T2.type_uid
FROM (SELECT COUNT(*) as AllAllowedTypes FROM #B) as T1,
(SELECT #A.type_uid, COUNT(*) as AllowedTypes
FROM #A
INNER JOIN #B ON
#A.allowed_type_uid = #B.type_uid
GROUP BY #A.type_uid
) as T2
WHERE T1.AllAllowedTypes = T2.AllowedTypes
(Dems, you were faster than me :) )

Filter a SQL Server table dynamically using multiple joins

I am trying to filter a single table (master) by the values in multiple other tables (filter1, filter2, filter3 ... filterN) using only joins.
I want the following rules to apply:
(A) If one or more rows exist in a filter table, then include only those rows from the master that match the values in the filter table.
(B) If no rows exist in a filter table, then ignore it and return all the rows from the master table.
(C) This solution should work for N filter tables in combination.
(D) Static SQL using JOIN syntax only, no Dynamic SQL.
I'm really trying to get rid of dynamic SQL wherever possible, and this is one of those places I truly think it's possible, but just can't quite figure it out. Note: I have solved this using Dynamic SQL already, and it was fairly easy, but not particularly efficient or elegant.
What I have tried:
Various INNER JOINS between master and filter tables - works for (A) but fails on (B) because the join removes all records from the master (left) side when the filter (right) side has no rows.
LEFT JOINS - Always returns all records from the master (left) side. This fails (A) when some filter tables have records and some do not.
What I really need:
It seems like what I need is to be able to INNER JOIN on each filter table that has 1 or more rows and LEFT JOIN (or not JOIN at all) on each filter table that is empty.
My question: How would I accomplish this without resorting to Dynamic SQL?
In SQL Server 2005+ you could try this:
WITH
filter1 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable1 f ON join_condition
),
filter2 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable2 f ON join_condition
),
…
SELECT m.*
FROM masterdata m
INNER JOIN filter1 f1 ON m.ID = f1.ID AND f1.HasMatched = f1.AllHasMatched
INNER JOIN filter2 f2 ON m.ID = f2.ID AND f2.HasMatched = f2.AllHasMatched
…
My understanding is, filter tables without any matches simply must not affect the resulting set. The output should only consist of those masterdata rows that have matched all the filters where matches have taken place.
SELECT *
FROM master_table mt
WHERE (0 = (select count(*) from filter_table_1)
OR mt.id IN (select id from filter_table_1))
AND (0 = (select count(*) from filter_table_2)
OR mt.id IN (select id from filter_table_2))
AND (0 = (select count(*) from filter_table_3)
OR mt.id IN (select id from filter_table_3))
Be warned that this could be inefficient in practice. Unless you have a specific reason to kill your existing, working, solution, I would keep it.
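A minimal sketch of the "empty filter is ignored" pattern, run against SQLite through Python's sqlite3 module with made-up rows. It swaps the `0 = (select count(*) ...)` test for `NOT EXISTS`, which expresses the same emptiness check; the table names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_table   (id INTEGER PRIMARY KEY);
    CREATE TABLE filter_table_1 (id INTEGER);
    CREATE TABLE filter_table_2 (id INTEGER);
    INSERT INTO master_table VALUES (1), (2), (3);
    INSERT INTO filter_table_1 VALUES (1), (2);
""")

# Each filter passes a row if the filter table is empty OR the row matches it.
QUERY = """
    SELECT mt.id
    FROM master_table mt
    WHERE (NOT EXISTS (SELECT 1 FROM filter_table_1)
           OR mt.id IN (SELECT id FROM filter_table_1))
      AND (NOT EXISTS (SELECT 1 FROM filter_table_2)
           OR mt.id IN (SELECT id FROM filter_table_2))
    ORDER BY mt.id
"""

def filtered_ids():
    """Run the static query and return the surviving master ids."""
    return [row[0] for row in conn.execute(QUERY)]

only_filter_1 = filtered_ids()   # filter_table_2 is still empty here
conn.execute("INSERT INTO filter_table_2 VALUES (2), (3)")
both_filters = filtered_ids()    # now both filters apply

print(only_filter_1, both_filters)
```

The query text never changes between the two calls; only the contents of the filter tables do.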
Do an inner join to get the results for (A) only, and a left join to get the results for (B) only (you will have to put something like this in the where clause: filterN.column is null), then combine the results of the inner join and the left join with UNION.
Left Outer Join - gives you the MISSING entries in master table ....
SELECT * FROM MASTER M
INNER JOIN APPRENTICE A ON A.PK = M.PK
LEFT OUTER JOIN FOREIGN F ON F.FK = M.PK
If MASTER has keys with no matching row in FOREIGN, you will have "null columns" where the slots are missing.
I think that is what you are looking for ...
Mike
First off, it is impossible to have "N number of Joins" or "N number of filters" without resorting to dynamic SQL. The SQL language was not designed for dynamic determination of the entities against which you are querying.
Second, one way to accomplish what you want (but would be built dynamically) would be something along the lines of:
Select ...
From master
Where Exists (
    -- Intersect binds tighter than Union All, so each filter's pair is parenthesized
    (
    Select 1
    From filter_1
    Where filter_1 = master.col1
    Union All
    Select 1
    From ( Select 1 As x ) As dummy
    Where Not Exists (
        Select 1
        From filter_1
        )
    )
    Intersect
    (
    Select 1
    From filter_2
    Where filter_2 = master.col2
    Union All
    Select 1
    From ( Select 1 As x ) As dummy
    Where Not Exists (
        Select 1
        From filter_2
        )
    )
    ...
    Intersect
    (
    Select 1
    From filter_N
    Where filter_N = master.colN
    Union All
    Select 1
    From ( Select 1 As x ) As dummy
    Where Not Exists (
        Select 1
        From filter_N
        )
    )
)
I previously posted a (now deleted) answer based on wrong assumptions about your problem.
But I think you could go for a solution where you split your initial search problem into two steps: constructing the set of ids from the master table, and then selecting the data by joining on that set of ids. Here I naturally assume you have some kind of ID on your master table. The filter tables contain the filter values only. This could then be combined into the statement below, where each SELECT in the eligible subset provides a set of master ids; these are unioned to avoid duplicates, and that set of ids is joined to the table with the data.
SELECT * FROM tblData INNER JOIN
(
SELECT id FROM tblData td
INNER JOIN fa on fa.a = td.a
UNION
SELECT id FROM tblData td
INNER JOIN fb on fb.b = td.b
UNION
SELECT id FROM tblData td
INNER JOIN fc on fc.c = td.c
) eligible ON eligible.id = tblData.id
The test has been made against the tables and values shown below. These are just an appendix.
CREATE TABLE tblData (id int not null primary key identity(1,1), a varchar(40), b datetime, c int)
CREATE TABLE fa (a varchar(40) not null primary key)
CREATE TABLE fb (b datetime not null primary key)
CREATE TABLE fc (c int not null primary key)
Since you have filter tables, I am assuming that these tables are probably dynamically populated from a front-end. This would mean that you have these tables as #temp_table (or even a materialized table, doesn't matter really) in your script before filtering on the master data table.
Personally, I use the below code bit for filtering dynamically without using dynamic SQL.
SELECT *
FROM [masterdata] [m]
INNER JOIN
[filter_table_1] [f1]
ON
[m].[filter_column_1] = ISNULL(NULLIF([f1].[filter_column_1], ''), [m].[filter_column_1])
As you can see, the code NULLs the JOIN condition if the column value is a blank record in the filter table. However, the gist in this is that you will have to actively populate the column value to blank in case you do not have any filter records on which you want to curtail the total set of the master data. Once you have populated the filter table with a blank, the JOIN condition NULLs in those cases and instead joins on itself with the same column from the master data table. This should work for all the cases you mentioned in your question.
I have found this bit of code to be faster in terms of performance.
Hope this helps. Please let me know in the comments.
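For anyone who wants to try the blank-row trick, here is a small sketch in SQLite via Python's sqlite3 module. `COALESCE` is used in place of T-SQL's `ISNULL` (an assumption of this sketch, since SQLite has no two-argument `ISNULL`), and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE masterdata     (id INTEGER, filter_column_1 TEXT);
    CREATE TABLE filter_table_1 (filter_column_1 TEXT);
    INSERT INTO masterdata VALUES (1, 'red'), (2, 'green'), (3, 'blue');
    -- a single blank row means "no filter requested"
    INSERT INTO filter_table_1 VALUES ('');
""")

# A blank filter value becomes NULL, and COALESCE then falls back to the
# master column itself, so the row joins to itself.
QUERY = """
    SELECT m.id
    FROM masterdata m
    INNER JOIN filter_table_1 f1
        ON m.filter_column_1 =
           COALESCE(NULLIF(f1.filter_column_1, ''), m.filter_column_1)
    ORDER BY m.id
"""

no_filter = [row[0] for row in conn.execute(QUERY)]

# Replace the blank row with real filter values.
conn.execute("DELETE FROM filter_table_1")
conn.execute("INSERT INTO filter_table_1 VALUES ('red'), ('blue')")
with_filter = [row[0] for row in conn.execute(QUERY)]

print(no_filter, with_filter)
```

One caveat worth checking in your own data: master rows whose filter column is NULL never satisfy the join condition, so they drop out even when the filter is blank.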

SQL Count on multiple joins with dynamic WHERE

My issue is that I have a Select statement that has a where clause that is generated on the fly. It is joined across 5 tables.
I basically need a count of each DISTINCT instance of a USER ID in table 1 that falls into the scope of the WHERE. This has to be executable in one statement as well. So, essentially, I can't do a global GROUP BY because of the data I need returned from the other 4 tables.
If I could get a column holding the count, repeated on every row alongside the primary key column, that would be perfect. Right now this is what I'm looking at as my query:
SELECT *
FROM TBL1 1
INNER JOIN TBL2 2 On 2.FK = 1.FK
INNER JOIN TBL3 3 On 3.PK = 2.PK INNER JOIN TBL4 4 On 4.PK = 3.PK
LEFT OUTER JOIN TBL5 5 ON 4.PK = 5.PK
WHERE 1.Date_Time_In BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00'
ORDER BY
4.Column
, 3.Column
, 3.Column2
, 1.Date_Time_In DESC
So instead of selecting all columns, I will be filtering it down to about 5 or 6 but with that I need something like a Total column that is the Distinct count of TBL1's Primary Key that applies the WHERE clause that has a possibility of growing and shrinking in size.
I almost wish there was a way to apply the same WHERE clause to a subselect because I realize that would work but don't know of a way other than creating a variable and just placing it in both places which I can't do either.
If you are using SQL Server 2005 or higher, you could use an aggregate function with an OVER clause.
SELECT *
, COUNT(UserID) OVER(PARTITION BY UserID) AS 'Total'
FROM TBL1 1
INNER JOIN TBL2 2 On 2.FK = 1.FK
INNER JOIN TBL3 3 On 3.PK = 2.PK INNER JOIN TBL4 4 On 4.PK = 3.PK
LEFT OUTER JOIN TBL5 5 ON 4.PK = 5.PK
WHERE 1.Date_Time_In BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00'
ORDER BY
4.Column, 3.Column, 3.Column2, 1.Date_Time_In DESC
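Note that `COUNT(UserID) OVER(PARTITION BY UserID)` returns the number of joined rows per user. If what is wanted is the number of distinct users across the whole filtered set, one possible sketch is the two-`DENSE_RANK` trick below, a swapped-in technique rather than the query above. It is run here in SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer); the table and column names are hypothetical stand-ins for TBL1 and TBL2, and it assumes user_id is never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (pk INTEGER PRIMARY KEY, user_id INTEGER, date_time_in TEXT);
    CREATE TABLE tbl2 (fk INTEGER, detail TEXT);
    INSERT INTO tbl1 VALUES (1, 10, '2010-11-16 09:00:00');
    INSERT INTO tbl1 VALUES (2, 20, '2010-11-20 09:00:00');
    -- this row falls outside the date range used below
    INSERT INTO tbl1 VALUES (3, 30, '2010-12-05 09:00:00');
    INSERT INTO tbl2 VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd');
""")

# Ascending rank + descending rank - 1 equals the number of distinct
# user_id values among the rows that survive the WHERE clause.
rows = conn.execute("""
    SELECT t1.user_id, t2.detail,
           DENSE_RANK() OVER (ORDER BY t1.user_id)
         + DENSE_RANK() OVER (ORDER BY t1.user_id DESC) - 1 AS total
    FROM tbl1 t1
    INNER JOIN tbl2 t2 ON t2.fk = t1.pk
    WHERE t1.date_time_in BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00'
    ORDER BY t1.user_id, t2.detail
""").fetchall()

for row in rows:
    print(row)
```

Because the window functions are evaluated after the WHERE clause, the filter is written only once and the total follows it automatically.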
Something like adding:
inner join (select pk, count(distinct user_id) as total from tbl1 WHERE Date_Time_In BETWEEN '2010-11-15 12:00:00' AND '2010-11-30 12:00:00' group by pk) as tbl1too on 1.PK = tbl1too.PK