Postgres: what's wrong with the syntax of this query? - sql

I'm trying to write a fairly straightforward PSQL query to retrieve some data (I realise it's not the most efficient query right now):
SELECT c.name AS article, c.id AS article_id, t.name AS template, t.id AS template_id, brand_names, COUNT(p.component_id)
FROM publications p
INNER JOIN components c
(SELECT string_agg(b.name, ', ') AS brand_names
FROM brands b
INNER JOIN brands_components
ON b.id = brands_components.brand_id
WHERE brands_components.component_id = c.id
) brand_query
ON c.id = p.component_id
INNER JOIN brands_components bc
ON c.id = bc.component_id
AND bc.brand_id IN (16, 23, 24, 35, 37)
INNER JOIN components_templates ct
ON c.id = ct.component_id
INNER JOIN templates t
ON t.id = ct.template_id
This gives me a syntax error though on line 4. What's missing? If I run the subquery alone it works fine:
syntax error at or near "SELECT" LINE 4: (SELECT string_agg(b.name, ', ') AS brand_names ^ : SELECT c.name AS article, c.id AS article_id, t.name AS template, t.id AS template_id, brand_nam
The subquery is designed to retrieve all the brand names per component and display them in a single row instead of many. Their join table is brands_components.
A fiddle is available here; the desired result should be something like:
article | article_id | template | template_id | count | brands
--------------------------------------------------------------------------------------------------------------
component one | 1 | template one | 1 | 4 | brand one, brand two, brand three, brand four

Your immediate problem could be solved with a lateral join:
SELECT c.name AS article, c.id AS article_id, t.name AS template, t.id AS template_id, brand_names, COUNT(p.component_id)
FROM publications p
JOIN components c ON c.id = p.component_id
JOIN brands_components bc ON c.id = bc.component_id AND bc.brand_id IN (1, 2, 3, 4)
JOIN LATERAL (
SELECT b.id, string_agg(b.name, ', ') AS brand_names
FROM brands b
JOIN brands_components ON b.id = brands_components.brand_id
WHERE brands_components.component_id = c.id
GROUP BY b.id
) brand_query ON brand_query.id = bc.brand_id
JOIN components_templates ct ON c.id = ct.component_id
JOIN templates t ON t.id = ct.template_id
GROUP BY 1,2,3,4
The above would still not run because the GROUP BY doesn't include the brand_names column. Postgres doesn't know that brand_names is already aggregated.
However, the derived table is not really needed if you move the aggregation to the outer query:
SELECT c.name AS article,
c.id AS article_id,
t.name AS template,
t.id AS template_id,
string_agg(b.name, ',') as brand_names,
COUNT(p.component_id)
FROM publications p
JOIN components c ON c.id = p.component_id
JOIN brands_components bc ON c.id = bc.component_id AND bc.brand_id IN (1, 2, 3, 4)
JOIN brands b on b.id = bc.brand_id
JOIN components_templates ct ON c.id = ct.component_id
JOIN templates t ON t.id = ct.template_id
GROUP BY c.name, c.id, t.name, t.id;
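If the derived table is kept, one option is to give it its own GROUP BY and join it on component_id instead of referencing c.id from inside it. The sketch below is only an illustration under stated assumptions: it runs on SQLite through Python, uses group_concat as a stand-in for Postgres's string_agg, and the sample rows and the publication_count alias are invented.

```python
import sqlite3

# Hypothetical miniature of the tables named in the question; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE components (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE brands (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE brands_components (brand_id INTEGER, component_id INTEGER);
CREATE TABLE publications (id INTEGER PRIMARY KEY, component_id INTEGER);
INSERT INTO components VALUES (1, 'component one');
INSERT INTO brands VALUES (1, 'brand one'), (2, 'brand two');
INSERT INTO brands_components VALUES (1, 1), (2, 1);
INSERT INTO publications VALUES (1, 1), (2, 1), (3, 1);
""")

# Each derived table is aggregated per component first, then joined with
# its own alias and ON condition, so neither aggregate inflates the other.
row = conn.execute("""
SELECT c.name, c.id, bq.brand_names, pq.publication_count
FROM components c
JOIN (SELECT bc.component_id, group_concat(b.name, ', ') AS brand_names
      FROM brands b
      JOIN brands_components bc ON bc.brand_id = b.id
      GROUP BY bc.component_id) bq ON bq.component_id = c.id
JOIN (SELECT component_id, COUNT(*) AS publication_count
      FROM publications
      GROUP BY component_id) pq ON pq.component_id = c.id
""").fetchone()
print(row)
```

The order of names inside the concatenated string is not guaranteed here; in Postgres, string_agg accepts an ORDER BY inside the call if a fixed order matters.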

Try this Query:
SELECT c.name AS article, c.id AS article_id, t.name AS template, t.id AS template_id,MAX(brand_names) AS brand_names, COUNT(p.component_id) AS Counts
FROM publications p
INNER JOIN components c
ON c.id = p.component_id
INNER JOIN brands_components bc
ON c.id = bc.component_id
AND bc.brand_id IN (1, 2, 3, 4)
INNER JOIN components_templates ct
ON c.id = ct.component_id
INNER JOIN templates t
ON t.id = ct.template_id
INNER JOIN (SELECT string_agg(b.name, ', ') AS brand_names
FROM brands b
INNER JOIN brands_components bcc
ON b.id = bcc.brand_id
INNER JOIN components c ON bcc.component_id = c.id
) brand_query ON brand_names IS NOT NULL
Group by c.name,c.id,t.name,t.id

I got it working with a function:
CREATE FUNCTION brands(int) RETURNS varchar AS $$
SELECT string_agg(b.name, ', ') AS brand_names
FROM brands b
INNER JOIN brands_components
ON b.id = brands_components.brand_id
WHERE brands_components.component_id = $1
$$ LANGUAGE SQL;
SELECT c.name, c.id, t.name AS template_name, t.id AS template_id, brands(c.id), COUNT(p.component_id)
FROM publications p
INNER JOIN components c
ON c.id = p.component_id
INNER JOIN brands_components bc
ON c.id = bc.component_id
AND bc.brand_id IN (1, 2, 3, 4)
INNER JOIN components_templates ct
ON c.id = ct.component_id
INNER JOIN templates t
ON t.id = ct.template_id
GROUP BY 1, 2, 3, 4
Not sure which is preferable, likely it's DineshDB's though.

Related

How to join queries with a subquery?

So I'm a total newbie trying to solve this exercise where I have to find all the dishes that are marked as Vegetarian but contain Turkey meat in their ingredients.
This is what I've tried (this is where I inner join 3 tables to find the ingredients):
SELECT Name
FROM Dishes
INNER JOIN DishesIngredients ON DishesIngredients.DishId = s.Id
INNER JOIN Ingredients ON DishesIngredients.IngredientID = Ingredients.ID
This is where I can't seem to join the subquery to identify the Vegetarian tag:
WHERE Ingredients.Name = 'Turkey meat' =
(SELECT Name
FROM Tags
INNER JOIN DishesTags ON DishesTags.TagID = Tags.ID
INNER JOIN Dishes ON DishesTags.DishID = Dishes.ID)
The diagram of the database is here for reference:
Let's first find which dishes have Turkey meat as an ingredient.
You have:
SELECT D.ID
FROM
Dishes D
JOIN DishesIngredients DI ON D.ID = DI.DishID
JOIN Ingredients I ON DI.IngredientID = I.ID
WHERE I.Name LIKE 'Turkey meat'
Then narrow these down to the dishes that also carry the tag 'Vegetarian':
SELECT D.ID
FROM
Dishes D
JOIN DishesIngredients DI ON D.ID = DI.DishID
JOIN Ingredients I ON DI.IngredientID = I.ID
JOIN DishesTags DT on D.ID = DT.DishID
JOIN Tags T ON DT.TagID = T.ID
WHERE I.Name LIKE 'Turkey meat'
AND T.Name = 'Vegetarian'
You could use exists and subqueries:
select d.*
from dishes d
where
exists (
select 1
from dishestags dt
inner join tags t on t.id = dt.tagid
where dt.dishid = d.id and t.name = 'Vegetarian'
)
and exists (
select 1
from dishesingredients di
inner join ingredients i on i.id = di.ingredientid
where di.dishid = d.id and i.name = 'Turkey meat'
)
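As a quick check of the EXISTS approach, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table and column names follow the thread; the dishes and their rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dishes (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Ingredients (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Tags (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE DishesIngredients (DishID INTEGER, IngredientID INTEGER);
CREATE TABLE DishesTags (DishID INTEGER, TagID INTEGER);

INSERT INTO Dishes VALUES (1, 'Club sandwich'), (2, 'Garden salad'), (3, 'Roast dinner');
INSERT INTO Ingredients VALUES (1, 'Turkey meat'), (2, 'Lettuce');
INSERT INTO Tags VALUES (1, 'Vegetarian'), (2, 'Spicy');
INSERT INTO DishesIngredients VALUES (1, 1), (1, 2), (2, 2), (3, 1);
INSERT INTO DishesTags VALUES (1, 1), (2, 1), (3, 2);
""")

# A dish qualifies only if both correlated subqueries find at least one row.
mislabelled = [row[0] for row in conn.execute("""
SELECT d.Name
FROM Dishes d
WHERE EXISTS (SELECT 1
              FROM DishesTags dt
              JOIN Tags t ON t.ID = dt.TagID
              WHERE dt.DishID = d.ID AND t.Name = 'Vegetarian')
  AND EXISTS (SELECT 1
              FROM DishesIngredients di
              JOIN Ingredients i ON i.ID = di.IngredientID
              WHERE di.DishID = d.ID AND i.Name = 'Turkey meat')
ORDER BY d.Name
""")]
print(mislabelled)
```

Because EXISTS only tests for the presence of a match, a dish with several tags or ingredients still appears once, which a plain multi-table join would not guarantee.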

How to get distinct id in subquery from fact table

I'm trying to return all the fields in my subquery minus the duplicated childids in the fact table cor.scores. I don't want the duplicate ids to inflate my count; every id needs to be counted once.
select distinct cs.childid
from
(select s.sitename, c.primarylanguage, count(Primarylanguage) as 'Count'
from cor.scores cs
left join cor.sites s on s.id = cs.siteid
left join cor.children c on c.id = cs.childid
group by s.sitename, c.primarylanguage)
Error:
Msg 102, Level 15, State 1, Line 7 Incorrect syntax near ')'.
What's the best way to go about this?
select distinct cs_childid
from
(
select s.sitename,cs.childid as cs_childid, c.primarylanguage, count(Primarylanguage) as 'Count'
from cor.scores cs
left join cor.sites s on s.id = cs.siteid
left join cor.children c on c.id = cs.childid
group by s.sitename, c.primarylanguage,cs.childid
) AS T
Do you want this:
1. This can return any column of the subquery:
SELECT t.childid, t.[Count]
FROM (
SELECT s.sitename, c.primarylanguage, cs.childid, COUNT(c.primarylanguage) OVER (PARTITION BY s.sitename) AS [Count]
FROM cor.scores cs
LEFT JOIN cor.sites s ON s.id = cs.siteid
LEFT JOIN cor.children c ON c.id = cs.childid
) AS t
Or this:
2. This can only return the grouped columns and aggregate columns:
select s.childid,s.[Count]
FROM (
SELECT s.sitename,cs.childid, count(Primarylanguage)OVER(PARTITION BY s.sitename) as 'Count'
FROM cor.scores cs
LEFT join cor.sites s on s.id = cs.siteid
GROUP BY s.sitename,cs.childid
) AS s LEFT join cor.children c on c.id = s.childid

SQL - Multiple join left with sum doesn't give expected result

Here is my query:
SELECT j.* ,
c.name as client_name ,
s.name as supplier_name,
s.ID as supplier_id ,
mt.* ,
SUM(pb.require_followup) as nb_followup,
SUM(ws.worked_time) as hours_on_job,
SUM(iv.total) as total_price,
SUM(iv.hour_expected) as hours_planned,
j.ID as ID
FROM $wpdb->posts j
LEFT JOIN ".Job::$META_TABLE." mt ON mt.post_id = j.ID
LEFT JOIN ".Job::$LINK_TABLE_JOB_CONTACT." l1 ON l1.job_id = j.ID
LEFT JOIN ".Contact::$TABLE_NAME." c ON c.ID = l1.contact_id
LEFT JOIN ".Supplier::$TABLE_NAME." s ON s.ID = c.supplier_id
LEFT JOIN ".Problem::$TABLE_NAME." pb ON pb.job_id = j.ID
LEFT JOIN ".Worksheet::$TABLE_NAME." ws ON ws.job_id = j.ID
LEFT JOIN ".Invoice::$TABLE_NAME." iv ON iv.job_id = j.ID
WHERE j.post_status = 'publish'
AND j.post_type = 'job'
".implode(' ',$where_condition)."
GROUP BY j.ID
ORDER BY j.post_date DESC
The problem is that the results of SUM are wrong when I LEFT JOIN the other tables.
Row 53, for example, gives 105 for nb_followup instead of 1.
The query returns the right result when I simply remove the last two LEFT JOINs (LEFT JOIN ".Worksheet::$TABLE_NAME." ws ON ws.job_id = j.ID and LEFT JOIN ".Invoice::$TABLE_NAME." iv ON iv.job_id = j.ID):
SELECT j.* ,
c.name as client_name ,
s.name as supplier_name,
s.ID as supplier_id ,
mt.* ,
SUM(pb.require_followup) as nb_followup,
j.ID as ID
FROM $wpdb->posts j
LEFT JOIN ".Job::$META_TABLE." mt ON mt.post_id = j.ID
LEFT JOIN ".Job::$LINK_TABLE_JOB_CONTACT." l1 ON l1.job_id = j.ID
LEFT JOIN ".Contact::$TABLE_NAME." c ON c.ID = l1.contact_id
LEFT JOIN ".Supplier::$TABLE_NAME." s ON s.ID = c.supplier_id
LEFT JOIN ".Problem::$TABLE_NAME." pb ON pb.job_id = j.ID
WHERE j.post_status = 'publish'
AND j.post_type = 'job'
".implode(' ',$where_condition)."
GROUP BY j.ID
ORDER BY j.post_date DESC
Removing only LEFT JOIN ".Invoice::$TABLE_NAME." iv ON iv.job_id = j.ID gives 15 as the result for row 53.
To summarize:
The full query gives 105 -> wrong, should be 1
Removing the last join gives 15 -> wrong, should be 1
Removing the last 2 joins gives 1 -> correct
You need to calculate the SUM()s BEFORE you join; otherwise the rows multiply because of the joins, and this in turn leads to errors in summation, e.g.:
SELECT
j.ID as ID
, pb.nb_followup
FROM $wpdb->posts j
LEFT JOIN (select pb.job_id, SUM(pb.require_followup) as nb_followup from ".Problem::$TABLE_NAME." pb GROUP BY pb.job_id) pb ON pb.job_id = j.ID
The other problem you are facing is that MySQL permits "lazy syntax" for GROUP BY. Don't use this lazy syntax or you will get unexpected errors/bugs. It is very simple to avoid: REPEAT every column of the select clause in the group by clause UNLESS the column is using an aggregate function such as SUM(), COUNT(), MIN(), MAX() and so on, e.g.:
select a.col1, b.col2, c.col3 , sum(d.col4)
from a
inner join b on a.id = b.aid
inner join c on b.id = c.bid
inner join d on c.id = d.cid
group by a.col1, b.col2, c.col3
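The row multiplication described above can be reproduced in a few lines. The following is a minimal sketch on SQLite through Python; the table names jobs, problems and worksheets are simplified stand-ins for the PHP-interpolated table names in the question, and the rows are invented.

```python
import sqlite3

# Hypothetical stand-in tables: one job with 1 problem row and 3 worksheet rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY);
CREATE TABLE problems (job_id INTEGER, require_followup INTEGER);
CREATE TABLE worksheets (job_id INTEGER, worked_time REAL);

INSERT INTO jobs VALUES (53);
INSERT INTO problems VALUES (53, 1);
INSERT INTO worksheets VALUES (53, 2.0), (53, 3.5), (53, 1.5);
""")

# Joining both child tables first repeats the single problem row once per
# worksheet row, so SUM(require_followup) is inflated.
naive = conn.execute("""
SELECT j.id, SUM(pb.require_followup), SUM(ws.worked_time)
FROM jobs j
LEFT JOIN problems pb ON pb.job_id = j.id
LEFT JOIN worksheets ws ON ws.job_id = j.id
GROUP BY j.id
""").fetchone()

# Aggregating each child table to one row per job before joining avoids that.
fixed = conn.execute("""
SELECT j.id, pb.nb_followup, ws.hours_on_job
FROM jobs j
LEFT JOIN (SELECT job_id, SUM(require_followup) AS nb_followup
           FROM problems GROUP BY job_id) pb ON pb.job_id = j.id
LEFT JOIN (SELECT job_id, SUM(worked_time) AS hours_on_job
           FROM worksheets GROUP BY job_id) ws ON ws.job_id = j.id
""").fetchone()

print(naive)
print(fixed)
```

In this sample only the follow-up sum is inflated because there is a single problem row; with several rows in both child tables, both sums would be wrong in the naive version.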

Too many results in query

I'm fetching some data from our database in MSSQL. Out of this data I want to determine who created the client entry and who took the first payment from this client.
There can be many payment entries for a client on a single booking/enquiry and at the moment, my query shows results for each payment. How can I limit the output to only show the first payment entry?
My query:
SELECT
c.FirstName,
c.LastName,
c.PostalCode,
o.OriginOfEnquiry,
s.SuperOriginName,
c.DateOfCreation,
DATEDIFF(day, c.DateOfCreation, p.DateOfCreation) AS DaysToPayment,
pc.PackageName,
CONCAT(u.FirstName, ' ', u.LastName) AS CreateUser,
(SELECT CONCAT(u.FirstName, ' ', u.LastName)
WHERE u.UserID = p.UserID ) AS PaymentUser
FROM tblBookings b
INNER JOIN tblPayments p
ON b.BookingID = p.BookingID
INNER JOIN tblEnquiries e
ON e.EnquiryID = b.EnquiryID
INNER JOIN tblCustomers c
ON c.CustomerID = e.CustomerID
INNER JOIN tblOrigins o
ON o.OriginID = e.OriginID
INNER JOIN tblSuperOrigins s
ON s.SuperOriginID = o.SuperOriginID
INNER JOIN tblBookingPackages bp
ON bp.bookingID = p.BookingID
INNER JOIN tblPackages pc
ON pc.PackageID = bp.packageID
INNER JOIN tblUsers u
ON u.UserID = c.UserID
WHERE c.DateOfCreation >= '2016-06-01' AND c.DateOfCreation < '2016-06-30'
AND p.PaymentStatusID IN (1,2)
AND e.CustomerID = c.CustomerID
AND p.DeleteMark != 1
AND c.DeleteMark != 1
AND b.DeleteMark != 1
;
I tried adding a "TOP 1" to the nested select statement for PaymentUser, but it made no difference.
You can use CROSS APPLY with TOP 1:
FROM tblBookings b
cross apply
(select top 1 * from tblPayments p where b.BookingID = p.BookingID order by p.DateOfCreation) as p
Instead of the table tblPayments, specify a subquery like this:
(SELECT TOP 1 BookingID, UserID, DateOfCreation
FROM tblPayments
WHERE DeleteMark != 1
AND PaymentStatusID IN (1,2)
ORDER BY DateOfCreation) as p
I'm assuming that tblPayments has a primary key column ID. If that is true, you can use this statement:
FROM tblBookings b
INNER JOIN tblPayments p ON p.ID = (
SELECT TOP 1 ID
FROM tblPayments
WHERE BookingID = b.BookingID
AND DeleteMark != 1
AND PaymentStatusID IN (1,2)
ORDER BY DateOfCreation)
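The "join on the key of the first payment" idea can be tried out with a small sketch. It runs on SQLite through Python, so LIMIT 1 takes the place of T-SQL's TOP 1; the ID column is the same assumption made above, and the sample rows are invented.

```python
import sqlite3

# Hypothetical sample data: two bookings, two payments each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblBookings (BookingID INTEGER PRIMARY KEY);
CREATE TABLE tblPayments (
    ID INTEGER PRIMARY KEY,      -- assumed primary key, not confirmed by the question
    BookingID INTEGER,
    UserID INTEGER,
    DateOfCreation TEXT
);
INSERT INTO tblBookings VALUES (1), (2);
INSERT INTO tblPayments VALUES
    (10, 1, 7, '2016-06-05'),
    (11, 1, 8, '2016-06-02'),
    (12, 2, 9, '2016-06-10'),
    (13, 2, 7, '2016-06-12');
""")

# The correlated subquery picks the earliest payment for each booking;
# ID is added to the ORDER BY as a tie-breaker for equal dates.
rows = conn.execute("""
SELECT b.BookingID, p.ID, p.UserID
FROM tblBookings b
JOIN tblPayments p ON p.ID = (
    SELECT ID
    FROM tblPayments
    WHERE BookingID = b.BookingID
    ORDER BY DateOfCreation, ID
    LIMIT 1)
ORDER BY b.BookingID
""").fetchall()
print(rows)
```

Each booking yields exactly one row, so the joins to the other tables in the original query no longer repeat once per payment.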

SQL use nested select in middle of inner join

Is it possible to use a select in the middle of joining...
I am trying to do the following:
FROM
tblorders o
INNER JOIN tblunits u on o.id = u.orderid
INNER JOIN ((SELECT
,Min(n.date) as [MinDate]
from tblNotes n
Where n.test = 'test') te
INNER JOIN tblnotes n on te.id = n.id
and te.[MinDate] = n.AuditinsertTimestamp)
INNER Join tblClient c ON o.ClientId = c.Id
Basically, the select in the middle of the query selects only the notes with the min date. The problem is that I need to do this here because tblOrders has to be the first table.
Suggestions?
The INNER JOIN failed because you have a leading comma here:
,Min(n.date) as [MinDate]
I think you are looking for something like this:
SELECT ...
FROM tblorders o
INNER JOIN tblunits u on o.id = u.orderid
INNER JOIN (
SELECT id, Min(date) as [MinDate]
from tblNotes
Where test = 'test'
group by id
) te -- not sure what JOIN condition to use here, please post schema
INNER JOIN tblnotes n on te.id = n.id
and te.[MinDate] = n.AuditinsertTimestamp
INNER Join tblClient c ON o.ClientId = c.Id
You are missing an alias and join condition:
FROM
tblorders o
INNER JOIN tblunits u on o.id = u.orderid
INNER JOIN ((SELECT Min(n.date) as [MinDate]
from tblNotes n
Where n.test = 'test') te
INNER JOIN tblnotes n on te.id = n.id
and te.[MinDate] = n.AuditinsertTimestamp)
-- missing
AS z
ON <join conditions here>
INNER Join tblClient c ON o.ClientId = c.Id
Yes, you can have a Select in a Join.
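A derived table in the middle of a join chain needs both an alias and an ON condition. The sketch below shows the shape on SQLite through Python. Since the thread never shows the schema, the orderid column on tblnotes and all sample rows are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical schema: tblnotes.orderid is an assumed link to tblorders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblorders (id INTEGER PRIMARY KEY);
CREATE TABLE tblnotes (id INTEGER PRIMARY KEY, orderid INTEGER, test TEXT, date TEXT);
INSERT INTO tblorders VALUES (1), (2);
INSERT INTO tblnotes VALUES
    (1, 1, 'test',  '2020-01-03'),
    (2, 1, 'test',  '2020-01-01'),
    (3, 2, 'test',  '2020-02-01'),
    (4, 2, 'other', '2019-12-31');
""")

# tblorders stays the first table; the derived table "te" carries an alias
# and its own ON condition, then is used to pick the matching note row.
rows = conn.execute("""
SELECT o.id, n.id, n.date
FROM tblorders o
JOIN (SELECT orderid, MIN(date) AS MinDate
      FROM tblnotes
      WHERE test = 'test'
      GROUP BY orderid) te ON te.orderid = o.id
JOIN tblnotes n ON n.orderid = te.orderid
               AND n.date = te.MinDate
               AND n.test = 'test'
ORDER BY o.id
""").fetchall()
print(rows)
```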