Select that finds IF multiple - sql

I am doing an inner join between two tables where one is an association table, so there is a many-to-one relationship. I am trying to come up with a query that can decide whether the join key exists more than once and, if so, store the value "multiple" in the update column, but I am not sure of an efficient way to make this happen:
SELECT
MainTable.Name
FROM MainTable
INNER JOIN ASSN_Main ON MainTable.AppID = ASSN_Main.AppID
WHERE
EXISTS (SELECT
COUNT(MainTable.AppID)
FROM MainTable
INNER JOIN ASSN_Main ON MainTable.AppID = ASSN_Main.AppID
GROUP BY
MainTable.AppID
HAVING
(COUNT(MainTable.AppID)>1));
The problem is that the subquery grabs the correct rows (the ones that have duplicates on AppID), but the main SELECT query grabs all AppID names instead of only the ones that exist in the subquery. I'm not sure what's going wrong, since the subquery is correct.

There is no relation between the items in the main query and the items in the subquery, so what the query is returning is "all items, if there are any items that have duplicates". What you want is "all items where there are duplicates":
SELECT
m.Name
FROM MainTable m
INNER JOIN ASSN_Main a ON m.AppID = a.AppID
WHERE
EXISTS (SELECT AppID
FROM ASSN_Main
WHERE AppID = m.AppID
GROUP BY AppID
HAVING COUNT(*)>1);
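
To see the difference between the uncorrelated and the correlated EXISTS, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The sample rows and the `Other` column are made up for illustration, and DISTINCT is added only so that each name is listed once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MainTable (AppID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ASSN_Main (AppID INTEGER, Other TEXT);
    INSERT INTO MainTable VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    -- AppID 1 appears twice in the association table; 2 and 3 appear once
    INSERT INTO ASSN_Main VALUES (1, 'x'), (1, 'y'), (2, 'z'), (3, 'w');
""")

# Uncorrelated: the subquery never looks at the outer row, so EXISTS is
# true for every row as long as any duplicate exists anywhere.
uncorrelated = conn.execute("""
    SELECT DISTINCT m.Name
    FROM MainTable m
    INNER JOIN ASSN_Main a ON m.AppID = a.AppID
    WHERE EXISTS (SELECT 1 FROM ASSN_Main
                  GROUP BY AppID HAVING COUNT(*) > 1)
    ORDER BY m.Name
""").fetchall()

# Correlated: the subquery is restricted to the outer row's AppID.
correlated = conn.execute("""
    SELECT DISTINCT m.Name
    FROM MainTable m
    INNER JOIN ASSN_Main a ON m.AppID = a.AppID
    WHERE EXISTS (SELECT 1 FROM ASSN_Main x
                  WHERE x.AppID = m.AppID
                  GROUP BY x.AppID HAVING COUNT(*) > 1)
    ORDER BY m.Name
""").fetchall()

print(uncorrelated)  # all three names
print(correlated)    # only 'alpha'
```

With this sample data, only the correlated version narrows the result to the AppID that really has more than one association row.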

I don't know access SQL, but something like this would work in SQL Server:
SELECT
MainTable.Name
FROM
MainTable INNER JOIN ASSN_Main ON MainTable.AppID = ASSN_Main.AppID
WHERE
MainTable.AppID IN
(SELECT
MainTable.AppID
FROM MainTable INNER JOIN ASSN_Main ON MainTable.AppID = ASSN_Main.AppID
GROUP BY
MainTable.AppID
HAVING
(COUNT(MainTable.AppID)>1));
So basically, replace the EXISTS with IN and return the AppID from the subquery.

I'm not really sure what you're trying to do, but to debug Access queries in general, I'd break it into multiple queries so you can see what's going on at each step. Then if you want you can combine them into one query that does all the steps.

Try changing the INNER JOIN to a WHERE clause. When you change it to a WHERE, ASSN_Main in the subquery will refer to the table from the parent query.
There's a good overview of the EXISTS clause here:
http://www.techonthenet.com/sql/exists.php

Related

Left Outer Join Without Duplicate Rows

After a lot of searching I could not solve my problem. I have the following tables:
I want to select all records from my 'product' table. but I have a problem. I got multiple rows from 'product' table when I execute the following query:
SELECT dbo.product.id, dbo.product.name, dbo.product_price.value,
dbo.product_barcode.barcode
FROM dbo.product LEFT OUTER JOIN
dbo.product_price ON dbo.product.id = dbo.product_price.product_id LEFT OUTER JOIN
dbo.product_barcode ON dbo.product.id = dbo.product_barcode.product_id
My problem was solved with the following query:
SELECT dbo.product.id, dbo.product.name, dbo.product_price.value,
dbo.product_barcode.barcode
FROM dbo.product LEFT OUTER JOIN
dbo.product_price ON dbo.product.id = dbo.product_price.product_id LEFT OUTER JOIN
dbo.product_barcode ON dbo.product.id = dbo.product_barcode.product_id
WHERE (dbo.product_price.id IN
(SELECT MIN(id) AS minPriceID
FROM dbo.product_price AS product_price_1
GROUP BY product_id)) AND (dbo.product_barcode.id IN
(SELECT MIN(id) AS Expr1
FROM dbo.product_barcode AS product_barcode_1
GROUP BY product_id))
Now I have just one problem: if the 'product_price' table or the 'product_barcode' table has no matching record, no rows are returned at all. Instead, I should still get the rows from the 'product' table, with NULL in the columns that come from the other tables.
Please help me. Thanks.
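I think the problem is that the conditions are in the WHERE clause. A WHERE condition on a column of an outer-joined table rejects the rows where that column is NULL, so the LEFT JOIN effectively behaves like an INNER JOIN. Try moving the conditions out of the WHERE clause and into each join: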
SELECT dbo.product.id, dbo.product.name, dbo.product_price.value,
dbo.product_barcode.barcode
FROM dbo.product
LEFT OUTER JOIN dbo.product_price ON dbo.product.id = dbo.product_price.product_id
--moved from WHERE clause
AND dbo.product_price.id IN (SELECT MIN(id) AS minPriceID FROM dbo.product_price AS product_price_1 GROUP BY product_id)
LEFT OUTER JOIN dbo.product_barcode ON dbo.product.id = dbo.product_barcode.product_id
--moved from WHERE clause
AND dbo.product_barcode.id IN (SELECT MIN(id) AS Expr1 FROM dbo.product_barcode AS product_barcode_1 GROUP BY product_id)
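
The effect of moving the condition can be checked on a small example. This is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, keeps just the product_price join, and the two products and their prices are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_price (id INTEGER PRIMARY KEY,
                                product_id INTEGER, value REAL);
    INSERT INTO product VALUES (1, 'pen'), (2, 'ink');
    -- product 1 has two prices, product 2 has none
    INSERT INTO product_price VALUES (10, 1, 1.5), (11, 1, 2.0);
""")

min_ids = "SELECT MIN(id) FROM product_price GROUP BY product_id"

# Condition in WHERE: the NULL-extended row for product 2 is filtered out.
in_where = conn.execute(f"""
    SELECT p.id, p.name, pp.value
    FROM product p
    LEFT OUTER JOIN product_price pp ON p.id = pp.product_id
    WHERE pp.id IN ({min_ids})
    ORDER BY p.id
""").fetchall()

# Condition in ON: it only limits which price rows can match,
# so product 2 is kept with a NULL price.
in_on = conn.execute(f"""
    SELECT p.id, p.name, pp.value
    FROM product p
    LEFT OUTER JOIN product_price pp ON p.id = pp.product_id
        AND pp.id IN ({min_ids})
    ORDER BY p.id
""").fetchall()

print(in_where)
print(in_on)
```

With the condition in the ON clause, the product without a price still appears, with NULL (None in Python) in the value column.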
SELECT dbo.product.id, dbo.product.name, MIN(dbo.product_price.value),MIN( dbo.product_barcode.barcode)
FROM dbo.product
LEFT OUTER JOIN dbo.product_price ON dbo.product.id = dbo.product_price.product_id
LEFT OUTER JOIN dbo.product_barcode ON dbo.product.id = dbo.product_barcode.product_id
GROUP BY dbo.product.id, dbo.product.name
The above query should achieve your desired result: one row per product, with the minimum product_price value (if one exists) and the minimum product_barcode (if one exists).
I only assumed you wanted this based on the query you were writing. You need to spend more time thinking about the question you are trying to answer.
Joining will multiply your results if more than one entry per join key is present in one of the tables.
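
As a rough illustration of that multiplication, here is a sketch with made-up rows, run on SQLite through Python's sqlite3 module (an assumption; the question is about SQL Server). One product with two prices and three barcodes produces 2 x 3 = 6 joined rows, and the GROUP BY query collapses them again:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_price (id INTEGER PRIMARY KEY,
                                product_id INTEGER, value REAL);
    CREATE TABLE product_barcode (id INTEGER PRIMARY KEY,
                                  product_id INTEGER, barcode TEXT);
    INSERT INTO product VALUES (1, 'pen'), (2, 'ink');
    INSERT INTO product_price VALUES (10, 1, 1.5), (11, 1, 2.0);
    INSERT INTO product_barcode VALUES (20, 1, 'A1'), (21, 1, 'A2'), (22, 1, 'A3');
""")

joins = """
    FROM product p
    LEFT OUTER JOIN product_price pp ON p.id = pp.product_id
    LEFT OUTER JOIN product_barcode pb ON p.id = pb.product_id
"""

# Plain join: every price row pairs with every barcode row of the product.
joined = conn.execute(
    "SELECT p.id, pp.value, pb.barcode" + joins
).fetchall()

# Grouping by product collapses the combinations to one row each.
grouped = conn.execute(
    "SELECT p.id, p.name, MIN(pp.value), MIN(pb.barcode)" + joins +
    "GROUP BY p.id, p.name ORDER BY p.id"
).fetchall()

print(len(joined))  # 6 rows for 'pen' plus 1 NULL-extended row for 'ink'
print(grouped)
```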
Just remember this diagram:
When you are calling LEFT JOIN and applying filtering like in the second picture from the top left, you will get only items belonging to the A table.
I want to select all records from my 'product' table. but I have a problem. I got multiple rows from 'product' table when I execute the following query:
Can you try calling DISTINCT after SELECT, like this:
SELECT DISTINCT dbo.product.id, dbo.product.name, dbo.product_price.value,
dbo.product_barcode.barcode
...

SQLite GROUP_CONCAT from another table, multiple joins

Having trouble with my sql query. Not an SQL expert by any means.
SELECT
transactions.*,
categories.*,
GROUP_CONCAT(tags.tagName) as concatTags
FROM transactions
INNER JOIN categories
ON transactions.category = categories.categoryId
LEFT JOIN TransactionTagRelation AS ttr
ON transactions.transactionId = ttr.transactionId
LEFT JOIN tags
ON tags.tagId = ttr.tagId;
(There's also a where and group by, but didn't think it was relevant to the question).
I'm trying to get:
transactionId1, ...otherStuff..., "tagId1,tagId2,tagId3"
transactionId2, ...otherStuff..., "tagId1,tagId3"
What I have now seems to merge the tags into one transaction or something. I tried adding a GROUP BY transactionID at the end, but it gives a syntax error for some reason. I have a feeling my joins are incorrect, but I wasn't able to get anything better.
Do something like this:
SELECT t.*, c.*,
(SELECT GROUP_CONCAT(tg.tagName)
FROM TransactionTagRelation ttr JOIN
Tags tg
ON tg.tagId = ttr.tagId
WHERE t.transactionId = ttr.transactionId
) as concatTags
FROM transactions t JOIN
categories c
ON t.category = c.categoryId;
This eliminates the GROUP BY in the outer query and allows you to use t.* and c.* in the SELECT.
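
A quick way to try the correlated-subquery approach is Python's sqlite3 module. This is a minimal sketch: the table and column names follow the question, but the sample rows and the `categoryName` column are invented. Note that SQLite does not promise a particular order for the concatenated tag names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (categoryId INTEGER PRIMARY KEY, categoryName TEXT);
    CREATE TABLE transactions (transactionId INTEGER PRIMARY KEY, category INTEGER);
    CREATE TABLE tags (tagId INTEGER PRIMARY KEY, tagName TEXT);
    CREATE TABLE TransactionTagRelation (transactionId INTEGER, tagId INTEGER);
    INSERT INTO categories VALUES (1, 'food');
    INSERT INTO transactions VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO tags VALUES (1, 'lunch'), (2, 'work'), (3, 'cash');
    -- transaction 1 has three tags, transaction 2 has two, transaction 3 none
    INSERT INTO TransactionTagRelation VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 3);
""")

rows = conn.execute("""
    SELECT t.transactionId, c.categoryName,
           (SELECT GROUP_CONCAT(tg.tagName)
            FROM TransactionTagRelation ttr
            JOIN tags tg ON tg.tagId = ttr.tagId
            WHERE ttr.transactionId = t.transactionId) AS concatTags
    FROM transactions t
    JOIN categories c ON t.category = c.categoryId
    ORDER BY t.transactionId
""").fetchall()

for row in rows:
    print(row)
```

Each transaction comes back exactly once, and a transaction without tags gets NULL (None) in concatTags instead of disappearing.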

Why is LEFT JOIN deleting rows?

I have been using sql for a long time, but I am now working in Databricks and I am getting a very strange result. I have a table called block_durations with a set of ids (called block_ts), and I have another table called mergetable, which I want to left join to that table. Mergetable is indexed by acct_id and block_ts, so it has many different records for each block_ts. I want to keep the rows in block_durations that don't match, and if there are multiple matches in mergetable I want there to be multiple corresponding entries in the resulting join, as you would expect from a left join.
But this is not happening. In order to demonstrate this, I am showing the result of joining mergetable, after filtering for a single acct_id so that there is at most one match per block_ts.
select count(*) from mergetable where acct_id = '0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98'
16579
select count(*) from block_durations
82817
select count(*) from
(
SELECT
mt.*,
bd.block_duration
FROM
block_durations bd
left outer JOIN mergetable mt
ON mt.block_ts = bd.block_ts
where acct_id='0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98'
) countTable
16579
As you can see, even though there are >80000 records in block_durations, most of them are getting lost in the left join. Why is this happening? I thought the whole point of a left join is that the non-matching rows of the left table are kept. This is exactly the behavior I would expect from an inner join -- and indeed when I switch to an inner join nothing changes.
Could someone please help me figure out what's going on?
-Paul
All rows from the left side of the join are preserved, but the WHERE condition is applied afterwards, to the joined result. Rows of block_durations that have no match carry NULL in acct_id, so they fail the condition and are removed.
Merge your WHERE condition into the JOIN condition:
SELECT
mt.*,
bd.block_duration
FROM
block_durations bd
left outer JOIN mergetable mt
ON mt.block_ts = bd.block_ts AND acct_id='0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98'
You can also filter mergetable before you run JOIN on the results:
SELECT
mt.*,
bd.block_duration
FROM
block_durations bd
left outer JOIN (SELECT * FROM mergetable WHERE acct_id='0xfbb1b73c4f0bda4f67dca266ce6ef42f520fbb98') mt
ON mt.block_ts = bd.block_ts
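
The three variants can be compared by row count on a toy data set. This sketch assumes SQLite through Python's sqlite3 module in place of Databricks, with short made-up values for acct_id and block_ts:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE block_durations (block_ts INTEGER, block_duration INTEGER);
    CREATE TABLE mergetable (acct_id TEXT, block_ts INTEGER);
    INSERT INTO block_durations VALUES (1, 10), (2, 10), (3, 10), (4, 10), (5, 10);
    INSERT INTO mergetable VALUES ('A', 1), ('A', 2), ('B', 1), ('B', 3);
""")

def count(sql):
    """Return the number of rows produced by the given SELECT."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

# Filter in WHERE: unmatched left rows have acct_id NULL and are dropped.
where_count = count("""
    SELECT mt.*, bd.block_duration
    FROM block_durations bd
    LEFT OUTER JOIN mergetable mt ON mt.block_ts = bd.block_ts
    WHERE mt.acct_id = 'A'
""")

# Filter in ON: every block_durations row survives.
on_count = count("""
    SELECT mt.*, bd.block_duration
    FROM block_durations bd
    LEFT OUTER JOIN mergetable mt
        ON mt.block_ts = bd.block_ts AND mt.acct_id = 'A'
""")

# Filter mergetable first, then join: same result as the ON version.
subquery_count = count("""
    SELECT mt.*, bd.block_duration
    FROM block_durations bd
    LEFT OUTER JOIN (SELECT * FROM mergetable WHERE acct_id = 'A') mt
        ON mt.block_ts = bd.block_ts
""")

print(where_count, on_count, subquery_count)
```

Here the WHERE version returns only the rows for account 'A', just like the inner join in the question, while the other two keep all five rows of block_durations.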

The opposite of INNER JOIN

I have a query with an inner join.
Query 1:
select *
from vehicle_models vmodel
Inner join ogpo_voilure_model md on md.Name = vmodel.VEHICLE_MODEL
Now I need the data that is not in this result. In other words, the opposite of an inner join.
I tried to write the query I need, but without success.
Query 2:
Select top 500 *
from ogpo_voilure_model md
Where md.id not in
(
select md.id
from Novelty.dbo.vehicle_models vmodel
Inner join [ogpo_voilure_model] md on md.Name = vmodel.VEHICLE_MODEL
)
I found an answer like this on Stack Overflow (the sixth example), but my fields are not NULL.
How can I achieve it?
The inner join means you want every row of vehicle_models that also has one or more correlated rows in ogpo_voilure_model.
If you define the opposite as every row of vehicle_models that has no correlated row in ogpo_voilure_model,
all you need is:
select *
from [Novelty].[dbo].[vehicle_models] vmodel
Where vmodel.VEHICLE_MODEL not in
(
select md.Name
from [Novelty].[dbo].[ogpo_voilure_model] md
)
By that definition, the query is correct even if it returns zero rows.
If that's not the answer you want, you must first define what the opposite of an inner join means to you. For example, you may want to swap the tables.
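
One thing to keep in mind with NOT IN: if the subquery can return a NULL, the comparison is never true and the query returns no rows. The asker says the fields are not NULL, so this may not apply here. The sketch below (SQLite through Python's sqlite3 module, with invented model names, both assumptions) shows the behaviour and a NOT EXISTS form that is not affected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle_models (VEHICLE_MODEL TEXT);
    CREATE TABLE ogpo_voilure_model (id INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO vehicle_models VALUES ('Civic'), ('Corolla'), ('Focus');
    -- one matching name and one NULL name
    INSERT INTO ogpo_voilure_model (Name) VALUES ('Civic'), (NULL);
""")

# NOT IN against a list containing NULL evaluates to NULL, never true.
not_in = conn.execute("""
    SELECT vmodel.VEHICLE_MODEL
    FROM vehicle_models vmodel
    WHERE vmodel.VEHICLE_MODEL NOT IN (SELECT md.Name FROM ogpo_voilure_model md)
    ORDER BY vmodel.VEHICLE_MODEL
""").fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs do no harm.
not_exists = conn.execute("""
    SELECT vmodel.VEHICLE_MODEL
    FROM vehicle_models vmodel
    WHERE NOT EXISTS (SELECT 1 FROM ogpo_voilure_model md
                      WHERE md.Name = vmodel.VEHICLE_MODEL)
    ORDER BY vmodel.VEHICLE_MODEL
""").fetchall()

print(not_in)      # empty because of the NULL name
print(not_exists)  # the two models without a match
```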
What do you mean by "my fields are not NULL"?
The link that you gave in the question has an answer: FULL JOIN.
SELECT *
FROM
vehicle_models vmodel
FULL JOIN ogpo_voilure_model md on md.Name = vmodel.VEHICLE_MODEL
WHERE
md.Name IS NULL
OR vmodel.VEHICLE_MODEL IS NULL
;
This will return the rows from vehicle_models and ogpo_voilure_model that have no vehicle model name in common.
It would help if you could add a few sample rows to the question, along with your expected result.

SQL - why is this 'where' needed to remove row duplicates, when I'm already grouping?

Why, in this query, is the final 'WHERE' clause needed to limit duplicates?
The first LEFT JOIN is linking programs to entities on a UID
The first INNER JOIN is linking programs to a subquery that gets statistics for those programs, by linking on a UID
The subquery (that gets the StatsForDistributorClubs subset) is doing a grouping on UID columns
So, I would've thought that this would all be joining unique records anyway so we shouldn't get row duplicates
So why the need to limit based on the final WHERE by ensuring the 'program' is linked to the 'entity'?
(irrelevant parts of query omitted for clarity)
SELECT LmiEntity.[DisplayName]
,StatsForDistributorClubs.*
FROM [Program]
LEFT JOIN
LMIEntityProgram
ON LMIEntityProgram.ProgramUid = Program.ProgramUid
INNER JOIN
(
SELECT e.LmiEntityUid,
sp.ProgramUid,
SUM(attendeecount) [Total attendance]
FROM LMIEntity e,
Timetable t,
TimetableOccurrence [to],
ScheduledProgramOccurrence spo,
ScheduledProgram sp
WHERE
t.LicenseeUid = e.lmientityUid
AND [to].TimetableOccurrenceUid = spo.TimetableOccurrenceUid
AND sp.ScheduledProgramUid = spo.ScheduledProgramUid
GROUP BY e.lmientityUid, sp.ProgramUid
) AS StatsForDistributorClubs
ON Program.ProgramUid = StatsForDistributorClubs.ProgramUid
INNER JOIN LmiEntity
ON LmiEntity.LmiEntityUid = StatsForDistributorClubs.LmiEntityUid
LEFT OUTER JOIN Region
ON Region.RegionId = LMIEntity.RegionId
WHERE (
[Program].LicenseeUid = LmiEntity.LmiEntityUid
OR
[LMIEntityProgram].LMIEntityUid = LmiEntity.LmiEntityUid
)
If you were grouping in your outer query, the extra criteria probably wouldn't be needed, but only your inner query is grouped. A JOIN to a grouped inner query can still return multiple records; for that matter, any of your JOINs could be the culprit.
Without seeing a sample of the duplication it's hard to know where the duplicates originate, but a GROUP BY on the outer query would definitely remove full duplicates, or revised JOIN criteria could take care of it.
Your result set is:
SELECT LmiEntity.[DisplayName]
,StatsForDistributorClubs.*
I suppose that your duplicates come from LMIEntityProgram.
My conjecture: LMIEntityProgram is a bridge table with both LmiEntityId and ProgramId, but you join only on ProgramId.
If you have several LmiEntityId values for a single ProgramId, you will get duplicates.
And these are the duplicates you are filtering out in the WHERE:
[LMIEntityProgram].LMIEntityUid = LmiEntity.LmiEntityUid
You can do that in the JOIN instead:
LEFT JOIN LMIEntityProgram
ON LMIEntityProgram.ProgramUid = Program.ProgramUid
AND [LMIEntityProgram].LMIEntityUid = LmiEntity.LmiEntityUid
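
The conjecture can be reproduced on a stripped-down model. Everything here is an assumption for illustration: the schema is reduced to a few columns, the rows are invented, a plain table called `Stats` stands in for the grouped StatsForDistributorClubs subquery, and it runs on SQLite through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Program (ProgramUid TEXT, LicenseeUid TEXT);
    CREATE TABLE LmiEntity (LmiEntityUid TEXT, DisplayName TEXT);
    CREATE TABLE LMIEntityProgram (ProgramUid TEXT, LMIEntityUid TEXT);
    -- Hypothetical stand-in for the grouped StatsForDistributorClubs subquery
    CREATE TABLE Stats (LmiEntityUid TEXT, ProgramUid TEXT, Total INTEGER);
    INSERT INTO Program VALUES ('P1', 'E1');
    INSERT INTO LmiEntity VALUES ('E1', 'Licensee'), ('E2', 'Club A'), ('E3', 'Club B');
    -- the bridge table links program P1 to two entities
    INSERT INTO LMIEntityProgram VALUES ('P1', 'E2'), ('P1', 'E3');
    INSERT INTO Stats VALUES ('E2', 'P1', 10);
""")

base = """
    SELECT e.DisplayName, s.Total
    FROM Program p
    LEFT JOIN LMIEntityProgram ep ON ep.ProgramUid = p.ProgramUid
    INNER JOIN Stats s ON p.ProgramUid = s.ProgramUid
    INNER JOIN LmiEntity e ON e.LmiEntityUid = s.LmiEntityUid
"""

# Joining the bridge table on ProgramUid alone yields one row per bridge row.
unfiltered = conn.execute(base).fetchall()

# The final WHERE keeps only the bridge row that belongs to the same entity.
filtered = conn.execute(base + """
    WHERE p.LicenseeUid = e.LmiEntityUid
       OR ep.LMIEntityUid = e.LmiEntityUid
""").fetchall()

print(unfiltered)  # the same stats row twice
print(filtered)    # the stats row once
```

In this model the single stats row for Club A is repeated once per bridge row for the program, and the WHERE condition removes the repeat.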