I have two tables (Table A & Table B) in a SpatiaLite database. I would like to join all the entries in Table A with Table B using two foreign keys (TableA.Location & TableB.LSD, TableA.LICENCE_NO & TableB.Licence_No); however, there will be multiple INCIDEN_NO entries in Table A that match up with the joined rows in Table B.
Since there will be many INCIDEN_NO entries associated with each Licence_No in Table B, I would like to evenly distribute the INCIDEN_NO entries among all the LIC_LI_NO entries in Table B that align with the foreign keys. The rows from Table A can be assigned to each LIC_LI_NO in Table B randomly and in no particular order.
I cannot seem to find a SQL expression for this operation, and it has stumped me for weeks.
You could match the rows up randomly with something like this:
-- window functions such as row_number() require SQLite 3.25+
with b_numbered as (
  select row_number() over () as rn, lic_li_no
  from B
),
a_randomized as (
  select abs(random()) % (select count(*) from B) + 1 as randnum, a.*
  from A a
)
select *
from a_randomized
inner join b_numbered on a_randomized.randnum = b_numbered.rn;
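If the rows should be spread only across the B rows that share the same foreign keys, a partitioned variant might look like the sketch below. This is only a sketch: it assumes the tables are literally named A and B as above, uses the column names from the question, and relies on row_number()'s arbitrary ordering for the "no particular order" requirement.
with b_numbered as (
  select LSD, Licence_No, LIC_LI_NO,
         row_number() over (partition by LSD, Licence_No) as rn,
         count(*) over (partition by LSD, Licence_No) as cnt
  from B
),
a_numbered as (
  select a.*,
         row_number() over (partition by Location, LICENCE_NO) as arn
  from A a
)
select a.INCIDEN_NO, b.LIC_LI_NO
from a_numbered a
join b_numbered b
  on a.Location = b.LSD
 and a.LICENCE_NO = b.Licence_No
 and ((a.arn - 1) % b.cnt) + 1 = b.rn;
Each group of A rows is dealt out round-robin over the matching B rows, so the counts per LIC_LI_NO differ by at most one.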
You could also generate the cross product and keep random rows, although when I tried this query on SQLite it didn't behave the way I expected:
select * from A cross join B where abs(random()) % 20 = 1
That said, I don't understand the purpose behind all this and it's certainly not a common thing to do in a database.
Related
I need to join three tables to get all the info I need. Table a has 70 million rows; after joining a with b, I get 40 million rows. But after I join table c, which has only 1.7 million rows, it becomes 300 million rows.
In table c, pt_id and fi_id are not unique: one pt_id can map to many different fi_id values, but each fi_id maps to only one pt_id.
I'm wondering if there is any way to get rid of the duplicate rows, because I join table c only to get the pt_id.
Thanks for any help!
select c.pt_id,b.fi_id,a.zq_id
from a
inner join (select zq_id, fi_id from b) b
on a.zq_id = b.zq_id
inner join (select fi_id,pt_id from c) c
on b.fi_id = c.fi_id
You can use GROUP BY
select c.pt_id,b.fi_id,a.zq_id
from a
inner join (select zq_id, fi_id from b) b
on a.zq_id = b.zq_id
inner join (select fi_id,pt_id from c) c
on b.fi_id = c.fi_id
group by c.pt_id,b.fi_id,a.zq_id
to remove all duplicate rows, as in this question:
How do I (or can I) SELECT DISTINCT on multiple columns?
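For reference, the same result can also be written with DISTINCT instead of GROUP BY (a sketch using the same tables and columns as above):
select distinct c.pt_id, b.fi_id, a.zq_id
from a
inner join b on a.zq_id = b.zq_id
inner join c on b.fi_id = c.fi_id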
I have a database with 3 tables: tblCustomers, tblBookings, tblFlights.
I want to find the customers' last names (LName) from the Customers table for customers who do not appear in the bookings table. It should return just three names, but it returns each name 10 times. There are 10 records in the bookings table, so I think the command is returning the correct names, just not once each...
I have tried:
SELECT tblCustomers.LName
FROM tblCustomers, tblBookings
WHERE tblCustomers.CustID
NOT IN (SELECT CustID FROM tblBookings)
How do I return just one instance of the name, not the name repeated 10 times?
You are doing a CROSS JOIN of the 2 tables.
Use only NOT IN:
SELECT LName
FROM tblCustomers
WHERE CustID NOT IN (SELECT CustID FROM tblBookings)
The (implicit) cross join against the bookings table in the outer query makes no sense, and it multiplies the customer rows.
Also, I would recommend NOT EXISTS for filtering instead of NOT IN: it usually performs better with the right index in place, and it is null-safe:
SELECT c.LName
FROM tblCustomers c
WHERE NOT EXISTS (SELECT 1 FROM tblBookings b WHERE b.CustID = c.CustID)
For performance, make sure to have an index on tblBookings(CustID); depending on the database, a declared foreign key may already have created one for you.
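For example (the index name here is just a suggestion):
CREATE INDEX IX_tblBookings_CustID ON tblBookings (CustID);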
Hi, I have two tables, A and B. A has 6 rows and B has 7 rows. Both tables have a common value in the name column, and all 6 rows of table A are present in table B on the name column.
When I write the query select * from a,b where a.name = b.name I get 14 rows returned; I was expecting an inner join with 6 rows in the result.
Please explain how the query works when we have two tables in the FROM clause.
Table A
Table B
The query is:
select * from a,b where a.tt = b.tt and a.nename=b.nename;
The result is:
You've got duplicates in both tables (except for {2, 2017-03-04 03:00:00}, which has three copies), which is why you get 14 = (2 * 4) + (2 * 3).
It's very hard to make sense of duplicate data, and it's even harder when the data is duplicated on both sides of a join.
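A quick way to see where the copies come from is to count the rows per join key on each side (a diagnostic sketch using the same columns as your query; run it against b as well):
select nename, tt, count(*) as copies
from a
group by nename, tt
having count(*) > 1;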
You could do something like:
WITH fixedA AS (
  SELECT
    *,
    row_number() over (partition by nename, tt order by nename) AS rn
  FROM
    A
),
fixedB AS (
  SELECT
    *,
    row_number() over (partition by nename, tt order by nename) AS rn
  FROM
    B
)
SELECT *
FROM fixedA a
FULL OUTER JOIN fixedB b
  ON a.neName = b.neName
  AND a.tt = b.tt
  AND a.rn = b.rn
Note, however, that this will leave one B record paired with a NULL A record.
The row_number also seems to play the same role as cellID, so you could just do:
SELECT *
FROM a full outer join b
on a.neName = b.neName
and a.tt = b.tt
and a.cellID = b.cellID
You should be doing something like a full outer join on the tables you need the result set from. I would suggest something like this:
select * from a full outer join b on a.tt = b.tt and a.nename=b.nename;
If you're dealing with a bigger data set, joining on a data type like varchar can take a long time to return the result set because of the string comparisons, so it is better to join on a primary/foreign key.
https://www.w3schools.com/sql/sql_join_full.asp
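If you do keep joining on nename and tt, indexing those columns on both sides can also help (a sketch; the index names are made up):
create index idx_a_nename_tt on a (nename, tt);
create index idx_b_nename_tt on b (nename, tt);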
I have two tables
Table A
type_uid, allowed_type_uid
9,1
9,2
9,4
1,1
1,2
24,1
25,3
Table B
type_uid
1
2
From table A I need to return
9
1
Using a WHERE IN clause I can return
9
1
24
SELECT
TableA.type_uid
FROM
TableA
INNER JOIN
TableB
ON TableA.allowed_type_uid = TableB.type_uid
GROUP BY
TableA.type_uid
HAVING
COUNT(distinct TableB.type_uid) = (SELECT COUNT(distinct type_uid) FROM TableB)
Join the two tables together, so that you only have the records matching the types you are interested in.
Group the result set by TableA.type_uid.
Check that each group has the same number of allowed_type_uid values as exist in TableB.type_uid.
distinct is required only if there can be duplicate records in either table. If both tables are known to only have unique values, the distinct can be removed.
It should also be noted that as TableA grows in size, this type of query will quickly degrade in performance. This is because indexes are not actually much help here.
It can still be a useful structure, but not one where I'd recommend running the queries in real time. Rather, use it to create another persisted/cached result set, and run the query only to refresh those results as and when needed.
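For example, the refresh could be as simple as rebuilding a small results table on a schedule rather than running the query per request (a sketch in SQL Server syntax; the CachedFullMatches table name is made up):
TRUNCATE TABLE CachedFullMatches;

INSERT INTO CachedFullMatches (type_uid)
SELECT TableA.type_uid
FROM TableA
INNER JOIN TableB ON TableA.allowed_type_uid = TableB.type_uid
GROUP BY TableA.type_uid
HAVING COUNT(DISTINCT TableB.type_uid) = (SELECT COUNT(DISTINCT type_uid) FROM TableB);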
Or a slightly cheaper version (resource wise):
SELECT
  Data.type_uid
FROM
  A AS Data
CROSS JOIN
  B
LEFT JOIN
  A
  ON Data.type_uid = A.type_uid AND B.type_uid = A.allowed_type_uid
GROUP BY
  Data.type_uid
HAVING
  -- -999 marks a (type_uid, B.type_uid) combination that has no match in A
  MIN(ISNULL(A.allowed_type_uid, -999)) != -999
Your explanation is not very clear. I think you want to get those type_uids from table A for which every record in table B has a matching allowed_type_uid.
SELECT T2.type_uid
FROM (SELECT COUNT(*) as AllAllowedTypes FROM #B) as T1,
(SELECT #A.type_uid, COUNT(*) as AllowedTypes
FROM #A
INNER JOIN #B ON
#A.allowed_type_uid = #B.type_uid
GROUP BY #A.type_uid
) as T2
WHERE T1.AllAllowedTypes = T2.AllowedTypes
(Dems, you were faster than me :) )
I am trying to filter a single table (master) by the values in multiple other tables (filter1, filter2, filter3 ... filterN) using only joins.
I want the following rules to apply:
(A) If one or more rows exist in a filter table, then include only those rows from the master that match the values in the filter table.
(B) If no rows exist in a filter table, then ignore it and return all the rows from the master table.
(C) This solution should work for N filter tables in combination.
(D) Static SQL using JOIN syntax only, no Dynamic SQL.
I'm really trying to get rid of dynamic SQL wherever possible, and this is one of those places I truly think it's possible, but just can't quite figure it out. Note: I have solved this using Dynamic SQL already, and it was fairly easy, but not particularly efficient or elegant.
What I have tried:
Various INNER JOINS between master and filter tables - works for (A) but fails on (B) because the join removes all records from the master (left) side when the filter (right) side has no rows.
LEFT JOINS - Always returns all records from the master (left) side. This fails (A) when some filter tables have records and some do not.
What I really need:
It seems like what I need is to be able to INNER JOIN on each filter table that has 1 or more rows and LEFT JOIN (or not JOIN at all) on each filter table that is empty.
My question: How would I accomplish this without resorting to Dynamic SQL?
In SQL Server 2005+ you could try this:
WITH
filter1 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable1 f ON join_condition
),
filter2 AS (
SELECT DISTINCT
m.ID,
HasMatched = CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END,
AllHasMatched = MAX(CASE WHEN f.ID IS NULL THEN 0 ELSE 1 END) OVER ()
FROM masterdata m
LEFT JOIN filtertable2 f ON join_condition
),
…
SELECT m.*
FROM masterdata m
INNER JOIN filter1 f1 ON m.ID = f1.ID AND f1.HasMatched = f1.AllHasMatched
INNER JOIN filter2 f2 ON m.ID = f2.ID AND f2.HasMatched = f2.AllHasMatched
…
My understanding is that filter tables without any matches simply must not affect the resulting set. The output should consist only of those masterdata rows that have matched all the filters where matches have taken place.
SELECT *
FROM master_table mt
WHERE (0 = (select count(*) from filter_table_1)
       OR mt.id IN (select id from filter_table_1))
  AND (0 = (select count(*) from filter_table_2)
       OR mt.id IN (select id from filter_table_2))
  AND (0 = (select count(*) from filter_table_3)
       OR mt.id IN (select id from filter_table_3))
Be warned that this could be inefficient in practice. Unless you have a specific reason to replace your existing, working solution, I would keep it.
Do an inner join to get the results for (A) only, and a left join to get the results for (B) only (you will have to put something like filterN.column IS NULL in the WHERE clause), then combine the results of the inner join and the left join with UNION.
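For a single filter table, that idea might look like the sketch below (reusing the master_table/filter_table_1/id names from the answer above; note the extra emptiness check so that rule (B) holds):
SELECT mt.*
FROM master_table mt
INNER JOIN filter_table_1 f1 ON f1.id = mt.id
UNION
SELECT mt.*
FROM master_table mt
LEFT JOIN filter_table_1 f1 ON f1.id = mt.id
WHERE f1.id IS NULL
  AND NOT EXISTS (SELECT 1 FROM filter_table_1);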
A LEFT OUTER JOIN shows you which entries in the master table have no match:
SELECT * FROM MASTER M
INNER JOIN APPRENTICE A ON A.PK = M.PK
LEFT OUTER JOIN FOREIGN F ON F.FK = M.PK
Where MASTER has keys with no match in FOREIGN, you will have NULL columns where the slots are missing.
I think that is what you are looking for...
Mike
First off, it is impossible to have "N number of Joins" or "N number of filters" without resorting to dynamic SQL. The SQL language was not designed for dynamic determination of the entities against which you are querying.
Second, one way to accomplish what you want (though it would still have to be built dynamically) would be something along the lines of:
Select ...
From master
Where Exists (
    (
      Select 1
      From filter_1
      Where filter_1 = master.col1
      Union All
      Select 1
      From ( Select 1 As dummy ) As d1
      Where Not Exists (
        Select 1
        From filter_1
      )
    )
    Intersect
    (
      Select 1
      From filter_2
      Where filter_2 = master.col2
      Union All
      Select 1
      From ( Select 1 As dummy ) As d2
      Where Not Exists (
        Select 1
        From filter_2
      )
    )
    ...
    Intersect
    (
      Select 1
      From filter_N
      Where filter_N = master.colN
      Union All
      Select 1
      From ( Select 1 As dummy ) As dN
      Where Not Exists (
        Select 1
        From filter_N
      )
    )
)
I previously posted a (now deleted) answer based on wrong assumptions about your problem.
But I think you could go for a solution where you split your initial search problem into two parts: first construct the set of ids from the master table, then select the data by joining on that set of ids. Here I naturally assume you have some kind of ID on your master table and that the filter tables contain the filter values only. This can be combined into the statement below, where each SELECT in the eligible subquery provides a set of master ids, these are UNIONed to avoid duplicates, and that set of ids is joined to the table with the data.
SELECT * FROM tblData INNER JOIN
(
SELECT id FROM tblData td
INNER JOIN fa on fa.a = td.a
UNION
SELECT id FROM tblData td
INNER JOIN fb on fb.b = td.b
UNION
SELECT id FROM tblData td
INNER JOIN fc on fc.c = td.c
) eligible ON eligible.id = tblData.id
The test was made against the tables shown below; they are included just as an appendix.
CREATE TABLE tblData (id int not null primary key identity(1,1), a varchar(40), b datetime, c int)
CREATE TABLE fa (a varchar(40) not null primary key)
CREATE TABLE fb (b datetime not null primary key)
CREATE TABLE fc (c int not null primary key)
Since you have filter tables, I am assuming that these tables are probably dynamically populated from a front-end. This would mean that you have these tables as #temp_table (or even a materialized table, doesn't matter really) in your script before filtering on the master data table.
Personally, I use the below code bit for filtering dynamically without using dynamic SQL.
SELECT *
FROM [masterdata] [m]
INNER JOIN
[filter_table_1] [f1]
ON
[m].[filter_column_1] = ISNULL(NULLIF([f1].[filter_column_1], ''), [m].[filter_column_1])
As you can see, the code neutralizes the JOIN condition when the column value in the filter table is a blank record: NULLIF turns the blank into NULL, and ISNULL then falls back to the column from the master data table, so the condition effectively compares the master column with itself. The catch is that you have to actively populate the filter table with a blank value whenever there are no filter records to restrict the master data with; once that blank row is present, the JOIN passes every master row through. This should work for all the cases you mentioned in your question.
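Seeding that blank row might look like this (a sketch using the same names as above):
-- If the user supplied no values for this filter, insert a single blank row so the
-- ISNULL/NULLIF join above degenerates to comparing the master column with itself.
IF NOT EXISTS (SELECT 1 FROM [filter_table_1])
    INSERT INTO [filter_table_1] ([filter_column_1]) VALUES ('');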
I have found this bit of code to be faster in terms of performance.
Hope this helps. Please let me know in the comments.