Left outer join and group by issue - sql

I wrote a query that sums fields from two different tables and groups by the main table's id field. But the second left outer join does not seem to be grouped, and it gives me different results.
SELECT s.*,
f.firma_adi,
sum(sd.fiyat) AS konak,
sum(ss.fiyat) AS sponsor
FROM fuar_sozlesme1 s
INNER JOIN fuar_firma_2012 f
ON ( s.cari = f.cari )
LEFT OUTER JOIN fuar_sozlesme1_detay sd
ON ( sd.sozlesme_id = s.id )
LEFT OUTER JOIN fuar_sozlesme1_sponsor ss
ON ( ss.sozlesme_id = s.id )
GROUP BY s.id
ORDER BY s.id DESC
I know it is rather complicated, but I'm stuck on this issue.
My question is: why does the second left outer join not sum the field correctly? If I remove either the first or the second left outer join, everything is normal.

The problem is that you have multiple dimensions on your data, and the number of rows is multiplying beyond what you expect. I would suggest that you run the query for one id, without the group by, to see what rows the join is producing.
One way to fix this is by using correlated subqueries:
select s.*, f.firma_adi,
(select SUM(sd.fiyat)
from fuar_sozlesme1_detay sd
where sd.sozlesme_id = s.id
) as konak,
(select SUM(ss.fiyat)
from fuar_sozlesme1_sponsor ss
where (ss.sozlesme_id = s.id)
) as sponsor
from fuar_sozlesme1 s inner join
fuar_firma_2012 f
on (s.cari = f.cari)
order by s.id DESC
By the way, you appear to be using MySQL (because your query is not parsable in any other dialect). You should tag your questions with the version of the database you are using.
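The row multiplication described above can be reproduced on a tiny data set. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module and hypothetical table names (contract, detail, sponsor) that stand in for the real ones. One contract has 2 detail rows and 3 sponsor rows, so the double join produces 2 x 3 = 6 rows and both sums are inflated, while the correlated subqueries return the true totals.

```python
import sqlite3

# Hypothetical stand-ins for fuar_sozlesme1 and its two child tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract (id INTEGER PRIMARY KEY);
CREATE TABLE detail  (contract_id INTEGER, price INTEGER);
CREATE TABLE sponsor (contract_id INTEGER, price INTEGER);
INSERT INTO contract VALUES (1);
INSERT INTO detail  VALUES (1, 10), (1, 20);
INSERT INTO sponsor VALUES (1, 100), (1, 200), (1, 300);
""")

# Two independent child tables joined to the same parent:
# every detail row is paired with every sponsor row.
joined = conn.execute("""
    SELECT c.id, SUM(d.price), SUM(s.price)
    FROM contract c
    LEFT JOIN detail  d ON d.contract_id = c.id
    LEFT JOIN sponsor s ON s.contract_id = c.id
    GROUP BY c.id
""").fetchone()

# Correlated subqueries aggregate each child table on its own.
correlated = conn.execute("""
    SELECT c.id,
           (SELECT SUM(d.price) FROM detail d  WHERE d.contract_id = c.id),
           (SELECT SUM(s.price) FROM sponsor s WHERE s.contract_id = c.id)
    FROM contract c
""").fetchone()

print(joined)      # inflated sums
print(correlated)  # true sums
```

The detail sum is multiplied by the number of sponsor rows and the sponsor sum by the number of detail rows, which is why removing either join makes the remaining sum look correct.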

What is causing the multi-part join error in this SQL?

I have been struggling with this SQL and have tried a few approaches, but I can't get it working.
Can any SQL experts work out why this SQL is erroring? I think it's due to the ORDER BY.
SELECT
t.*
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY [owner_details].[id]) AS _row_num,
COUNT(count_column)
FROM
(SELECT 1 AS count_column
FROM [owner_details]
LEFT OUTER JOIN currencies cur ON owner_details.currency_id = cur.id
LEFT OUTER JOIN primary_contacts as pc ON owner_details.primary_contact_id = pc.id
LEFT OUTER JOIN contacts ON pc.contact_id = contacts.id
WHERE [owner_details].[primary_id] = 405062121) subquery_for_count) AS t
WHERE
t._row_num BETWEEN 1 AND 20
I should note that this SQL is programmatically generated via an ORM in Ruby on Rails but if I can work out the issue with the SQL maybe I can figure out how to change my code.
I want to understand the SQL better.
The error:
The multi-part identifier "owner_details.id" could not be bound.
Try the query below, which orders by subquery_for_count.[id]; I have assumed that owner_details has an id column.
SELECT
t.*
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY subquery_for_count.[id]) AS _row_num,
COUNT(count_column) OVER () AS total_count
FROM
(SELECT 1 AS count_column,owner_details.id as id
FROM [owner_details]
LEFT OUTER JOIN currencies cur ON owner_details.currency_id = cur.id
LEFT OUTER JOIN primary_contacts as pc ON owner_details.primary_contact_id = pc.id
LEFT OUTER JOIN contacts ON pc.contact_id = contacts.id
WHERE [owner_details].[primary_id] = 405062121
) subquery_for_count) AS t
WHERE
t._row_num BETWEEN 1 AND 20
The multi-part naming convention basically has four levels (other than columns) in SQL Server, i.e.
Object (Table, view, stored procedure etc..)
Schema (schema name, default 'dbo')
Database (Database name)
Server (Server name, particularly when querying linked servers)
In your case, [owner_details] is a table in scope inside the sub-query, but it is not visible in the OVER clause of the outer query, so [owner_details].[id] cannot be bound there. Since the sub-query has the alias subquery_for_count, expose id in the sub-query's select list and refer to it as subquery_for_count.id in the OVER clause.
The full query should be:
SELECT
t.*
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY subquery_for_count.[id]) AS _row_num,
COUNT(count_column) as CountColumn
FROM
(SELECT 1 AS count_column, owner_details.id as id
FROM [owner_details]
LEFT OUTER JOIN currencies cur ON owner_details.currency_id = cur.id
LEFT OUTER JOIN primary_contacts as pc ON owner_details.primary_contact_id = pc.id
LEFT OUTER JOIN contacts ON pc.contact_id = contacts.id
WHERE [owner_details].[primary_id] = 405062121
) subquery_for_count
GROUP BY subquery_for_count.[id]
) AS t
WHERE
t._row_num BETWEEN 1 AND 20
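The scoping rule behind this error is not specific to SQL Server. As a minimal sketch, assuming SQLite through Python's sqlite3 module and a cut-down owner_details table with made-up rows, the outer query below can only see the derived table's alias (here the hypothetical name sub), not the table that was used inside it. SQLite words the error differently ("no such column"), but the cause is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner_details (id INTEGER PRIMARY KEY, primary_id INTEGER);
INSERT INTO owner_details VALUES (1, 7), (2, 7), (3, 8);
""")

# Referencing the inner table name from the outer query fails:
# only the alias "sub" is in scope out here.
try:
    conn.execute("""
        SELECT owner_details.id
        FROM (SELECT id FROM owner_details WHERE primary_id = 7) AS sub
    """)
    outer_ref_failed = False
except sqlite3.OperationalError as exc:
    outer_ref_failed = True
    print("error:", exc)

# Referencing the column through the derived table's alias works.
ids = [row[0] for row in conn.execute("""
    SELECT sub.id
    FROM (SELECT id FROM owner_details WHERE primary_id = 7) AS sub
    ORDER BY sub.id
""")]
print(ids)
```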

Group by and Having aggregation

I'm trying to determine who the top scorer is in each World Cup group (this is a personal project).
I have the data, but I'm having a hard time using COUNT, GROUP BY and HAVING to accomplish what I need.
I need to count Messi's goals (top scorer) and group by each of the groups so that I get the highest scorer of each group.
For now I just have the joins:
select * from zonas
left join goles_zonas on (zonas.id = goles_zonas.Id_zona)
inner join goles on (goles.id = goles_zonas.id_gol)
inner join jugadores on (goles.id_jugador = jugadores.id)
Instead of displaying all columns with SELECT *, select only the columns that act as grouping keys, that is, the columns that distinguish one group of rows from another, plus the aggregate (COUNT in this case) computed for each group:
-- id_gol is left out of the keys: grouping by it would give a count of 1 per goal
SELECT Id_zona, id_jugador, COUNT(1) as number_of_goal
FROM zonas
left join goles_zonas on (zonas.id = goles_zonas.Id_zona)
inner join goles on (goles.id = goles_zonas.id_gol)
inner join jugadores on (goles.id_jugador = jugadores.id)
GROUP BY Id_zona, id_jugador
Every column in the SELECT list that is not aggregated has to appear in the GROUP BY clause.
If you also want to display other columns that are not part of the grouping keys, you can join the grouped result back like this:
SELECT goles_zonas.* , x.* FROM (
SELECT Id_zona, id_jugador, COUNT(1) as number_of_goal
FROM zonas
left join goles_zonas on (zonas.id = goles_zonas.Id_zona)
inner join goles on (goles.id = goles_zonas.id_gol)
inner join jugadores on (goles.id_jugador = jugadores.id)
GROUP BY Id_zona, id_jugador ) X
LEFT JOIN goles_zonas on (x.Id_zona = goles_zonas.Id_zona)
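To get from per-player counts to the top scorer of each group, one option is to count goals per (group, player) first and then keep the rows whose count equals the maximum for their group. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module; only the goles and goles_zonas tables are modelled, the rows are made up, and the names counts, zona, jugador and n are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE goles (id INTEGER PRIMARY KEY, id_jugador INTEGER);
CREATE TABLE goles_zonas (Id_zona INTEGER, id_gol INTEGER);
-- made-up data: player 1 scores goals 1-2, player 2 scores goals 3-5
INSERT INTO goles VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
-- goals 1-3 belong to group 1, goals 4-5 to group 2
INSERT INTO goles_zonas VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 5);
""")

top_scorers = conn.execute("""
    WITH counts AS (
        -- one row per (group, player) with that player's goal count
        SELECT gz.Id_zona AS zona, g.id_jugador AS jugador, COUNT(*) AS n
        FROM goles_zonas gz
        JOIN goles g ON g.id = gz.id_gol
        GROUP BY gz.Id_zona, g.id_jugador
    )
    SELECT c.zona, c.jugador, c.n
    FROM counts c
    -- keep only the rows that reach the maximum count of their group
    WHERE c.n = (SELECT MAX(c2.n) FROM counts c2 WHERE c2.zona = c.zona)
    ORDER BY c.zona
""").fetchall()

print(top_scorers)
```

If two players tie for the most goals in a group, this form returns both of them.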

T-SQL Left-Join with 1 row (limit, subselect)

I have already read a lot on this topic, but I'm unable to get it to work for my case.
I have the following situation:
A list of orderitems (the main datasets I want to get)
Articles which have a 1:1 relation to an order item
An n:m join table "Articlesupplier", which creates a relation between an article and a partner
A Partner table with detailed information about partners.
Target:
One dataset per OrderItem, and from the suppliers I only want the first one found in the join. No prioritization required.
Tables:
Table IDX_ORDERITEM
id,article_id
Table IDX_ARTICLE
id,name
Table IDX_ARTICLESUPPLIER
article_id,partner_id
Table IDX_PARTNER
id,abbr
My actual statement (short version):
SELECT IDX_ORDERITEM.id
FROM
dbo.IDX_ORDERITEM AS IDX_ORDERITEM
-- ARTICLE --
INNER JOIN dbo.IDX_ARTICLE AS IDX_ARTICLE
ON IDX_ORDERITEM.article_id=IDX_ARTICLE.id
-- SUPPLIER VIA ARTICLE --
LEFT JOIN
(SELECT TOP(1) IDX_PARTNER.id, IDX_PARTNER.abbr
FROM IDX_PARTNER, IDX_ARTICLESUPPLIER
WHERE IDX_PARTNER.id = IDX_ARTICLESUPPLIER.partner_id
AND IDX_ARTICLESUPPLIER.article_id=IDX_ARTICLE.id) AS IDX_PARTNER_SUPPLIER
ON IDX_PARTNER_SUPPLIER.id=IDX_ARTICLE.supplier_partner_id
WHERE 1>0
ORDER BY orderitem.id DESC
But it seems I can't access IDX_ARTICLE.id in the subquery. I get the following error message:
The multi-part identifier "IDX_ARTICLE.id" could not be bound.
Is the problem that the Article alias has the same name as the table name?
Thanks a lot in advance for possible ideas,
Mike
Well, I changed your aliases and the subquery to which you were joining (I also modified that subquery so it doesn't use implicit joins anymore), though these changes were mostly cosmetic. The actually important change was the use of OUTER APPLY instead of LEFT JOIN:
SELECT OI.id
FROM dbo.IDX_ORDERITEM AS OI
INNER JOIN dbo.IDX_ARTICLE AS A
ON OI.article_id = A.id
OUTER APPLY
(SELECT TOP(1) P.id, P.abbr
FROM IDX_PARTNER AS P
INNER JOIN IDX_ARTICLESUPPLIER AS SUP
ON P.id = SUP.partner_id
WHERE SUP.article_id = A.id
AND P.id = A.supplier_partner_id) AS PS
ORDER BY OI.id DESC
The error is thrown because the following part of the query
(SELECT TOP(1) IDX_PARTNER.id, IDX_PARTNER.abbr
FROM IDX_PARTNER, IDX_ARTICLESUPPLIER
WHERE IDX_PARTNER.id = IDX_ARTICLESUPPLIER.partner_id
AND IDX_ARTICLESUPPLIER.article_id=IDX_ARTICLE.id) AS IDX_PARTNER_SUPPLIER
is a derived table in a JOIN, not a correlated sub-query, so it cannot reference IDX_ARTICLE.id the way a correlated sub-query references a column of the outer query.
I see two problems.
According to your DDLs there is no IDX_ARTICLE.supplier_partner_id which you refer to in the left join on clause.
Second, I'm quite sure you cannot use IDX_ARTICLE.id in your derived table. Simply add IDX_ARTICLESUPPLIER.article_id to your derived table selected fields and use it in your left join on clause against IDX_ARTICLE.id.
I prefer to avoid nested queries. If I can, I will always rewrite them using a CTE.
WITH Part_Sup
AS (
SELECT TOP ( 1 ) P.id
,P.abbr
,SUP.article_id
FROM IDX_PARTNER AS P
INNER JOIN IDX_ARTICLESUPPLIER AS SUP
ON P.id = SUP.partner_id
)
SELECT OI.id
FROM dbo.IDX_ORDERITEM AS OI
INNER JOIN dbo.IDX_ARTICLE AS A
ON OI.article_id = A.id
LEFT OUTER JOIN Part_Sup AS PS
ON PS.article_id = A.Id
AND PS.id = A.supplier_partner_id
ORDER BY OI.id DESC;
Note that TOP (1) in the CTE above keeps a single row for the whole CTE, not one row per article. Next, I rewrote the query to use the ROW_NUMBER() function instead of TOP (1); with ROW_NUMBER() you can control which rows you keep and which you discard.
WITH Part_Sup
AS (
SELECT P.id
,P.abbr
,SUP.article_id
,ROW_NUMBER() OVER ( PARTITION BY SUP.article_id ORDER BY P.id ) AS RowNum
FROM IDX_PARTNER AS P
INNER JOIN IDX_ARTICLESUPPLIER AS SUP
ON P.id = SUP.partner_id
)
SELECT OI.id
FROM dbo.IDX_ORDERITEM AS OI
INNER JOIN dbo.IDX_ARTICLE AS A
ON OI.article_id = A.id
LEFT OUTER JOIN Part_Sup AS PS
ON PS.article_id = A.Id
AND PS.id = A.supplier_partner_id
AND RowNum = 1
ORDER BY OI.id DESC;
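A small, self-contained sketch of the ROW_NUMBER() idea: number the suppliers within each article and keep only the row numbered 1, so every article gets at most one supplier. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module, cut-down versions of the tables with made-up rows, and a hypothetical rule that the lowest partner id counts as "first". The names ranked and rn are also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IDX_ARTICLE (id INTEGER PRIMARY KEY);
CREATE TABLE IDX_PARTNER (id INTEGER PRIMARY KEY, abbr TEXT);
CREATE TABLE IDX_ARTICLESUPPLIER (article_id INTEGER, partner_id INTEGER);
INSERT INTO IDX_ARTICLE VALUES (1), (2), (3);
INSERT INTO IDX_PARTNER VALUES (10, 'AAA'), (11, 'BBB');
-- article 1 has two suppliers, article 2 has one, article 3 has none
INSERT INTO IDX_ARTICLESUPPLIER VALUES (1, 11), (1, 10), (2, 11);
""")

rows = conn.execute("""
    WITH ranked AS (
        SELECT sup.article_id,
               p.abbr,
               -- restart the numbering for every article
               ROW_NUMBER() OVER (PARTITION BY sup.article_id
                                  ORDER BY p.id) AS rn
        FROM IDX_ARTICLESUPPLIER AS sup
        JOIN IDX_PARTNER AS p ON p.id = sup.partner_id
    )
    SELECT a.id, r.abbr
    FROM IDX_ARTICLE AS a
    LEFT JOIN ranked AS r
           ON r.article_id = a.id
          AND r.rn = 1          -- keep only the first supplier per article
    ORDER BY a.id
""").fetchall()

print(rows)
```

Putting rn = 1 in the ON clause rather than in a WHERE clause keeps articles without any supplier in the result.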
Thanks Lamak - you solved it :)
I used your input to extract the basic solution, to make it a bit easier to read for others who have the same problem:
Using OUTER APPLY (without ORDER_ITEM Table here):
SELECT IDX_ARTICLE.id AS AR_ID, IDX_PARTNER_SUPPLIER.id, IDX_PARTNER_SUPPLIER.abbr
FROM
dbo.IDX_ARTICLE AS IDX_ARTICLE
OUTER APPLY
(SELECT TOP(1) _PARTNER.id, _PARTNER.abbr
FROM IDX_PARTNER AS _PARTNER
INNER JOIN IDX_ARTICLESUPPLIER AS _ARTICLESUPPLIER
ON _PARTNER.id = _ARTICLESUPPLIER.partner_id
WHERE _ARTICLESUPPLIER.article_id=IDX_ARTICLE.id
AND _ARTICLESUPPLIER.deleted IS NULL) AS IDX_PARTNER_SUPPLIER
WHERE IDX_ARTICLE.id=67

sql left outer join with a constraining column

Here is the SQL. 'aal_county_zip' has entries for 2 zip codes, whereas 'us_zip' has 15 zip codes. The requirement is to get 15 rows, with only 2 rows having data from 'aal_county_zip'. Instead, the query behaves like a normal (inner) join. How can I change the SQL or the table structure to make this work? I also want to add the condition that is commented out below.
SELECT DISTINCT a.zcta5ce10 AS zipcode,
c.report_year,
c.aal
FROM aal_county_zip c
RIGHT OUTER JOIN us_zip a
ON ( c.zip = a.zcta5ce10 )
WHERE Lower(c.name) = Lower('alachua')
--and c.report_year=2009
ORDER BY c.report_year DESC
The WHERE Lower(c.name) = Lower('alachua') in your query turns the outer join into an inner join, since it prevents c.name from being NULL.
Consider using a left join instead, as they're often more natural to write. And in any event, apply that condition to the join clause rather than to the where clause, so as to avoid turning it into an inner join.
Borrowing and amending @dasblinkenlight's query:
SELECT DISTINCT
a.zcta5ce10 AS zipcode
, c.report_year
, c.aal
FROM us_zip a
LEFT OUTER JOIN aal_county_zip c
ON c.zip = a.zcta5ce10
AND c.report_year=2009
AND LOWER(c.name) = LOWER('alachua')
ORDER BY c.report_year DESC
That should fix your "only two rows returned" problem. That said, the query is likely missing some additional criteria (and ordering criteria) on us_zip.
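The difference between filtering in WHERE and filtering in ON can be checked on a toy data set. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and made-up rows (three zip codes, one of which has county data); the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE us_zip (zcta5ce10 TEXT);
CREATE TABLE aal_county_zip (zip TEXT, name TEXT, report_year INTEGER, aal REAL);
INSERT INTO us_zip VALUES ('32601'), ('32602'), ('32603');
INSERT INTO aal_county_zip VALUES ('32601', 'Alachua', 2009, 1.5);
""")

# Filter in WHERE: unmatched rows have c.name = NULL, the comparison is
# not true for them, and they are dropped (inner-join behaviour).
where_rows = conn.execute("""
    SELECT a.zcta5ce10, c.aal
    FROM us_zip a
    LEFT JOIN aal_county_zip c ON c.zip = a.zcta5ce10
    WHERE LOWER(c.name) = LOWER('alachua')
    ORDER BY a.zcta5ce10
""").fetchall()

# Filter in ON: the condition only decides which c rows match,
# so every us_zip row is kept.
on_rows = conn.execute("""
    SELECT a.zcta5ce10, c.aal
    FROM us_zip a
    LEFT JOIN aal_county_zip c
           ON c.zip = a.zcta5ce10
          AND LOWER(c.name) = LOWER('alachua')
          AND c.report_year = 2009
    ORDER BY a.zcta5ce10
""").fetchall()

print(where_rows)
print(on_rows)
```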
SELECT DISTINCT a.zcta5ce10 AS zipcode,
c.report_year,
c.aal
FROM aal_county_zip c
RIGHT OUTER JOIN us_zip a
ON ( c.zip = a.zcta5ce10 )
WHERE Lower(c.name) = Lower('alachua')
AND COALESCE(c.report_year, 2009)=2009
ORDER BY c.report_year DESC
or
SELECT DISTINCT a.zcta5ce10 AS zipcode,
c.report_year,
c.aal
FROM aal_county_zip c
RIGHT OUTER JOIN us_zip a
ON ( c.zip = a.zcta5ce10 AND c.report_year=2009)
WHERE Lower(c.name) = Lower('alachua')
ORDER BY c.report_year DESC
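You're doing a RIGHT OUTER JOIN, so the columns from your first table, aal_county_zip, may be NULL for unmatched rows. So either you account for those NULLs by using COALESCE, or you make the condition part of the join condition. The same applies to the filter on c.name: as long as it stays in the WHERE clause, unmatched rows are still removed.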

I want to modify this SQL statement to return only distinct rows of a column

select
picks.`fbid`,
picks.`time`,
categories.`name` as cname,
options.`name` as oname,
users.`name`
from
picks
left join categories
on (categories.`id` = picks.`cid`)
left join options
on (options.`id` = picks.oid)
left join users
on (users.fbid = picks.`fbid`)
order by
time desc
That query returns one row per pick, so an fbid can appear several times.
My question is: I would like to modify the query to select only DISTINCT fbids (perhaps only the first row for each, sorted by time).
Can someone help with this?
select
p2.fbid,
p2.time,
c.`name` as cname,
o.`name` as oname,
u.`name`
from
( select p1.fbid,
min( p1.time ) FirstTimePerID
from picks p1
group by p1.fbid ) as FirstPerID
JOIN Picks p2
on FirstPerID.fbid = p2.fbid
AND FirstPerID.FirstTimePerID = p2.time
LEFT JOIN Categories c
on p2.cid = c.id
LEFT JOIN Options o
on p2.oid = o.id
LEFT JOIN Users u
on p2.fbid = u.fbid
order by
time desc
I don't know why you originally had LEFT JOINs, as it appears that all picks must be associated with a valid category, option and user... I would then remove the left, and change them to INNER joins instead.
The first inner query grabs for each fbid, the FIRST entry time which will result in a single entity for the FBID. From that, it re-joins to the picks table for the same ID and timeslot... then continues for the rest of the category, options, users join criteria of that single entry.
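The "earliest row per fbid, then join back" pattern described above can be tried on a few rows. This is a minimal sketch, assuming SQLite through Python's sqlite3 module, a cut-down picks table with made-up values, and integer timestamps; the alias names first_per_id and first_time are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE picks (fbid INTEGER, time INTEGER, cid INTEGER);
-- fbid 1 has two picks, fbid 2 has one
INSERT INTO picks VALUES (1, 10, 100), (1, 20, 101), (2, 15, 102);
""")

rows = conn.execute("""
    SELECT p2.fbid, p2.time, p2.cid
    FROM (
        -- earliest pick time for each fbid
        SELECT fbid, MIN(time) AS first_time
        FROM picks
        GROUP BY fbid
    ) AS first_per_id
    JOIN picks p2
      ON p2.fbid = first_per_id.fbid
     AND p2.time = first_per_id.first_time   -- re-join to fetch the full row
    ORDER BY p2.time DESC
""").fetchall()

print(rows)
```

If one fbid has two picks with the same earliest time, both rows come back, so a tie-breaker is needed when timestamps are not unique per fbid.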
There are two options. You could write a GROUP BY clause,
or you could write a nested query joined back to the table to get the pertinent info.
Nested aliased table:
SELECT
n.fBId
FROM
MyTable t
INNER JOIN
(SELECT DISTINCT fBId
FROM MyTable) n
ON n.fBId = t.fBId
Or group by option
SELECT fBId from MyTable
GROUP BY fBID
select picks.`fbid`, picks.`time`, categories.`name` as cname,
options.`name` as oname, users.`name` from picks left join categories
on (categories.`id` = picks.`cid`) left join options on (options.`id` = picks.oid)
left join users on (users.fbid = picks.`fbid`)
GROUP BY picks.`fbid` order by time desc
select
picks.fbid,
MIN(picks.time) as first_time,
MAX(picks.time) as last_time
from
picks
group by
picks.fbid
order by
MIN(picks.time) desc
However, if you want only distinct fbid's you cannot display cname and other columns at the same time.