SQL using table from join in subqueries - sql

I am writing a query to be used as databases view, it looks now like this:
SELECT
contact.*,
contact_users.names AS user_names,
contact_status.status_id AS status_id,
status_translation.name AS status_name,
status_translation.lang_id AS lang_id
FROM contacts as contact
LEFT JOIN contact_status AS contact_status ON contact_status.status_id = contact.status
LEFT JOIN contact_status_translation AS status_translation ON status_translation.id = contact.status
LEFT JOIN (
SELECT
contacts_users.contact_id,
string_agg(users.fullname || ', ' || users.id, ' | ') as names
FROM v_contacts_users as contacts_users
LEFT JOIN v_users as users on users.id = contacts_users.user_id
WHERE users.lang_id = status_translation.lang_id
GROUP BY contacts_users.contact_id
) AS contact_users ON contact_users.contact_id = contact.id
WHERE contact.deleted = FALSE
Everything works as expected except the WHERE condition in the last LEFT JOIN - the WHERE users.lang_id = status_translation.lang_id part says that status_translation cannot be referenced in this part of the query? Why is that? I tried to reference this table with various always but the result is still the same.Thing is that v_users is translated as well so I need to have only one result from this table.

Insert LATERAL between LEFT JOIN and the opening parenthesis if you want to reference previous FROM list entries.

Related

PostgreSQL query optimize

Here is my query
SELECT
DISTINCT(org.id),
org.name,
org.partner_id,
pos.partner_id,
pos.id,
org.partner_offer_section_id,
pos.title,
pos.offer_value,
pos.offer_currency,
(SELECT user_info.email FROM user_info WHERE user_info.org_id=org.id ORDER BY created ASC LIMIT 1) as user_email,
(SELECT CONCAT(user_info.first_name,' ',user_info.last_name) FROM user_info WHERE user_info.org_id=org.id ORDER BY created ASC LIMIT 1) as name
FROM org
INNER JOIN partner_offer_section pos ON org.partner_offer_section_id = pos.id
WHERE org.partner_offer_section_id != 0 AND org.partner_id != 0
Here is the same subquery that is executing the twice the same query. I was trying to left join this query but the problem is when I left join I got a null value. I have to get one user name or user email insted of multiple users aginst org.
SELECT org.name,
org.partner_id,
org.partner_offer_section_id,
org.offer_applied_date,
partner_offer_section.title,
partner_offer_section.offer_value,
partner_offer_section.offer_currency,
user_info.email
FROM org
left join (SELECT user_info.id, user_info.email,user_info.created, user_info.org_id FROM user_info WHERE role='Org Admin' LIMIT 1) user_info on org.id = user_info.org_id
left join partner_offer_section on org.partner_offer_section_id = partner_offer_section.id
where org.partner_id = 1
Now I wanna optime this query instead of multiple same subqueries.
You should join the table directly instead of doing a subquery. Bellow is the example, making a JOIN with the first table and the LEFT only with the last one. Also, DISTINCT applies to all columns, it's not a function, as user #a_horse_with_no_name pointed out
SELECT DISTINCT
org.name,
org.partner_id,
org.partner_offer_section_id,
org.offer_applied_date,
partner_offer_section.title,
partner_offer_section.offer_value,
partner_offer_section.offer_currency,
user_info.email
FROM org
join partner_offer_section on org.partner_offer_section_id = partner_offer_section.id
left join user_info on org.id = user_info.org_id
and user_info.role='Org Admin'
where org.partner_id = 1

SQL left joins returning weird IDs

I am running this SQL query in a SQL Server database,
SELECT staff_list.id AS staff_id,
staff_list.status,
core_teams.workbase,
core_teams.service_user,
core_teams.staff_name,
core_teams.role,
service_users.id AS su_id
FROM staff_list
LEFT JOIN core_teams
ON staff_list.workbase = core_teams.workbase
LEFT JOIN service_users
ON service_users.name = core_teams.service_user
WHERE staff_list.status <> 'Left'
AND staff_list.status <> 'Name Changed';
The join and results are being returned, however the attribute staff_id is being repeated multiple times over many rows for users that have relevance to the staff_id. Am I doing something incorrectly?
You may be missing a link between staff_list and core_teams.
SELECT staff_list.id AS staff_id, staff_list.status,
core_teams.workbase, core_teams.service_user, core_teams.staff_name,
core_teams.role, service_users.id AS su_id
FROM staff_list
LEFT JOIN core_teams
ON staff_list.workbase = core_teams.workbase
--missing link
AND staff_list.name = core_teams.staff_name
LEFT JOIN service_users
ON service_users.name = core_teams.service_user
WHERE staff_list.status <> 'Left'
AND staff_list.status <> 'Name Changed';

Access: Integrating Subquery into excisting Query

The "Last" function in the query below (line 4 & 5)that I'm using is not exactly what I'm after. The last function finds the last record in that table.
What i need find is the most recent record in the table according to a date field.
SELECT
tblinmate.statusid,
tblinmate.activedate,
Last(tblclassificationhistory.classificationid) AS LastOfclassificationID,
Last(tblsquadhistory.squadid) AS LastOfsquadID,
tblperson.firstname,
tblperson.middlename,
tblperson.lastname,
tblinmate.prisonnumber,
tblinmate.droppeddate,
tblinmate.personid,
tblinmate.inmateid
FROM tblsquad
INNER JOIN (tblperson
INNER JOIN ((tblinmate
INNER JOIN (tblclassification
INNER JOIN tblclassificationhistory
ON tblclassification.classificationid =
tblclassificationhistory.classificationid)
ON tblinmate.inmateid =
tblclassificationhistory.inmateid)
INNER JOIN tblsquadhistory
ON tblinmate.inmateid =
tblsquadhistory.inmateid)
ON tblperson.personid = tblinmate.personid)
ON tblsquad.squadid = tblsquadhistory.squadid
GROUP BY tblinmate.statusid,
tblinmate.activedate,
tblperson.firstname,
tblperson.middlename,
tblperson.lastname,
tblinmate.prisonnumber,
tblinmate.droppeddate,
tblinmate.personid,
tblinmate.inmateid;
This query below does just that, finds the most recent record in a table according to a date field.
my problem is i dont know how to integrate this Query into the above to replace the "Last" function
SELECT a.inmateID,
a.classificationID,
b.max_date
FROM (
SELECT tblClassificationHistory.inmateID,
tblClassificationHistory.classificationID,
tblClassificationHistory.reclassificationDate
FROM tblinmate
INNER JOIN tblClassificationHistory
ON tblinmate.inmateID = tblClassificationHistory.inmateID
) a
INNER JOIN (
SELECT tblClassificationHistory.inmateID,
MAX(tblClassificationHistory.reclassificationDate) as max_date
FROM tblinmate
INNER JOIN tblClassificationHistory
ON tblinmate.inmateID = tblClassificationHistory.inmateID
GROUP BY tblClassificationHistory.inmateID
) b
ON a.inmateID = b.inmateID
AND a.reclassificationDate = b.max_date
ORDER BY a.inmateID;
I got a tip from another forum to combine queries like this
SELECT qryMainTemp.*, qrySquad.*, qryClassification.*
FROM (qryMainTemp INNER JOIN qrySquad ON qryMainTemp.inmateID = qrySquad.inmateID) INNER JOIN qryClassification ON qryMainTemp.inmateID = qryClassification.inmateID;
and it worked :) i separated the first query into the two queries it was made of and then combined the three like shown above.
Sadly this made another problem arise the query is now not up-datable..working on a solution for this

Cannot get both records with SQL/Bigquery JOINs

I've got a query that returns order details, I want information from the briisk table for deals it has found. I also want it to display orders even if the briisk table has nothing.
If I add the final line (and flostream.briisk.master = "") my query only returns one result instead of two.
SELECT *
FROM (SELECT orderno,ifnull(dealid,sales_rule) as DealIDCombo from flostream.orders left join mobileheads.surveys on mobileheads.surveys.order_number = flostream.orders.externalreference) as first
INNER JOIN flostream.orders on first.orderno = flostream.orders.orderno
LEFT JOIN flostream.briisk on first.dealidcombo = flostream.briisk.uniquereference
WHERE first.orderno in (359692,359683)
//AND flostream.briisk.master = ""
When you use a left outer join, then you need to include filter conditions on the second table in the on clause. So try this:
SELECT *
FROM (SELECT orderno,ifnull(dealid,sales_rule) as DealIDCombo
from flostream.orders left join
mobileheads.surveys
on mobileheads.surveys.order_number = flostream.orders.externalreference
) as first INNER JOIN
flostream.orders
on first.orderno = flostream.orders.orderno LEFT JOIN
flostream.briisk
on first.dealidcombo = flostream.briisk.uniquereference AND
flostream.briisk.master = ""
WHERE first.orderno in (359692, 359683)
Conditions on the first table should go in the WHERE clause.

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.