I know you've answered similar questions before, but I have a specific query I hope you can help with.
I have a table of organizations (clients).
I need to find the most recent job per client.
The thing is, the client table isn't directly connected to the job table. The chain is Job - Job Header - Organization.
So I have a query for ALL organizations (Select * From Organizations), then a JOIN to a query that finds the most recent job, using the client Org as the join criteria.
For example:
Select * From Organization
LEFT JOIN (Select Top 1 JobDate, JobNumber,JobWeight From Jobs LEFT JOIN JobHeader on Job.PK = JobHeader.ParentPK LEFT JOIN Organization on JObHeader.Org = Organization.PK Order by JObDate DESC)
When I run it, I get an error saying the ORDER BY clause is invalid in views, inline functions.
How else can I find the most recent JobDate in the JobHeader table for each related Organization?
Your syntax looks like SQL Server. You can do what you want using a correlated subquery or lateral join (apply). This looks like
select o.*, j.*
from Organization o outer apply
(select Top 1 JobDate, JobNumber, JobWeight
from Jobs j join
JobHeader jh
on j.PK = jh.ParentPK
where jh.Org = o.PK
order by JObDate DESC
) j;
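For readers who want to try the correlated-subquery variant mentioned above, here is a minimal sketch run through Python's sqlite3 module. SQLite has no OUTER APPLY, so this is the correlated-subquery form, not the APPLY form. The table layout follows the question, but the sample rows and the Name column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the table/column names come from the question.
conn.executescript("""
CREATE TABLE Organization (PK INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Jobs (PK INTEGER PRIMARY KEY, JobDate TEXT, JobNumber TEXT, JobWeight REAL);
CREATE TABLE JobHeader (ParentPK INTEGER, Org INTEGER);
INSERT INTO Organization VALUES (1, 'Acme'), (2, 'Globex'), (3, 'NoJobs');
INSERT INTO Jobs VALUES (10, '2020-01-05', 'J10', 1.5),
                        (11, '2020-03-01', 'J11', 2.0),
                        (12, '2020-02-10', 'J12', 3.0);
INSERT INTO JobHeader VALUES (10, 1), (11, 1), (12, 2);
""")

# For each organization, the subquery picks the PK of its newest job.
# LEFT JOIN keeps organizations that have no jobs at all.
latest_jobs = conn.execute("""
SELECT o.PK, o.Name, j.JobDate, j.JobNumber, j.JobWeight
FROM Organization o
LEFT JOIN Jobs j
  ON j.PK = (SELECT j2.PK
             FROM Jobs j2
             JOIN JobHeader jh ON j2.PK = jh.ParentPK
             WHERE jh.Org = o.PK
             ORDER BY j2.JobDate DESC
             LIMIT 1)
ORDER BY o.PK
""").fetchall()

for row in latest_jobs:
    print(row)
```

In SQL Server the LIMIT 1 would become TOP 1, as in the answer above.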
I have a copy of our salesforce data in bigquery, I'm trying to join the contact table together with the account table.
I want to return every account in the dataset but I only want the contact that was created first for each account.
I've gone around and around in circles today googling and trying to cobble a query together but all roads either lead to no accounts, a single account or loads of contacts per account (ignoring the earliest requirement).
Here's the latest query, which produces no results. I think I'm nearly there but I'm still struggling. Any help would be most appreciated.
SELECT distinct
c.accountid as Acct_id
,a.id as a_Acct_ID
,c.id as Cont_ID
,a.id AS a_CONT_ID
,c.email
,c.createddate
FROM `sfdcaccounttable` a
INNER JOIN `sfdccontacttable` c
ON c.accountid = a.id
INNER JOIN
(SELECT a2.id, c2.accountid, c2.createddate AS MINCREATEDDATE
FROM `sfdccontacttable` c2
INNER JOIN `sfdcaccounttable` a2 ON a2.id = c2.accountid
GROUP BY 1,2,3
ORDER BY c2.createddate asc LIMIT 1) c3
ON c.id = c3.id
ORDER BY a.id asc
LIMIT 10
The solution shared above is very BigQuery specific: it does have some quirks you need to work around like the memory error you got.
I once answered a similar question here that is more portable and easier to maintain.
Essentially you need to create a smaller table (even better, make it a view) with the ID and its first transaction. It's similar to what you shared but slightly different, as you need to group ONLY in the topmost query.
It looks something like this
select
# contact ids that are first time contacts
b.id as cont_id,
b.accountid
from `sfdccontacttable` as b inner join
( select accountid,
min(createddate) as first_tx_time
FROM `sfdccontacttable`
group by 1) as a on (a.accountid = b.accountid and b.createddate = a.first_tx_time)
group by 1, 2
You need to do it this way because otherwise you can end up with multiple IDs per account (if there are any other dimensions associated with it). It is also fairly future-proof: you can add more dimensions to the underlying tables without affecting the result, and you can use a WHERE clause in the inner query to define a "valid" contact, and so on. You can then save that as a view and simply reference it in any subquery or join operation.
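As a rough illustration of the grouped-minimum join described above, extended to return every account, here is a sketch using Python's sqlite3 module in place of BigQuery. The short table names (account, contact) and the sample rows are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the sfdc account and contact tables.
conn.executescript("""
CREATE TABLE account (id TEXT PRIMARY KEY);
CREATE TABLE contact (id TEXT PRIMARY KEY, accountid TEXT, email TEXT, createddate TEXT);
INSERT INTO account VALUES ('A1'), ('A2'), ('A3');
INSERT INTO contact VALUES
  ('C1', 'A1', 'one@example.com',   '2019-01-01'),
  ('C2', 'A1', 'two@example.com',   '2019-06-01'),
  ('C3', 'A2', 'three@example.com', '2018-05-05');
""")

# f holds one row per account with its earliest createddate;
# joining contact back on (accountid, createddate) recovers the contact row.
# LEFT JOINs keep accounts that have no contacts.
first_contacts = conn.execute("""
SELECT a.id, b.id, b.email, b.createddate
FROM account a
LEFT JOIN (
    SELECT accountid, MIN(createddate) AS first_created
    FROM contact
    GROUP BY accountid
) f ON f.accountid = a.id
LEFT JOIN contact b
  ON b.accountid = f.accountid AND b.createddate = f.first_created
ORDER BY a.id
""").fetchall()

for row in first_contacts:
    print(row)
```

If two contacts on one account share the same earliest createddate, this join returns both, so a tie-breaker may still be needed.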
Set up a view/subquery for client_first or client_last as:
SELECT * except(_rank) from (
select rank() over (partition by accountid order by createddate ASC) as _rank,
*
FROM `prj.dataset.sfdccontacttable`
) where _rank=1
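Basically, it uses a window function to rank the rows and returns the first row per account: with ASC that's the first client entry, with DESC the last. Note that rank() gives tied rows the same rank, so contacts sharing the earliest createddate all come back; row_number() avoids that.
You can do the same for accounts as well, then join the two simply, as there will be exactly one record for each entity (barring such ties).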
UPDATE
You could also try using ARRAY_AGG, which has a smaller memory footprint.
#standardSQL
SELECT e.* FROM (
SELECT ARRAY_AGG(
t ORDER BY t.createddate ASC LIMIT 1
)[OFFSET(0)] e
FROM `dataset.sfdccontacttable` t
GROUP BY t.accountid
)
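To see how the window-function approach behaves when two contacts share the same createddate, here is a small sketch using Python's sqlite3 module (it assumes a SQLite build with window functions, 3.25 or later). The contact table and its rows are invented for the demo. It compares rank() with row_number() plus a tie-breaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: C1 and C2 are created at the same time on account A1.
conn.executescript("""
CREATE TABLE contact (id TEXT, accountid TEXT, createddate TEXT);
INSERT INTO contact VALUES
  ('C1', 'A1', '2019-01-01'),
  ('C2', 'A1', '2019-01-01'),
  ('C3', 'A1', '2019-06-01'),
  ('C4', 'A2', '2018-05-05');
""")

# rank() assigns 1 to every row tied for first place.
with_rank = conn.execute("""
SELECT id, accountid FROM (
    SELECT id, accountid,
           RANK() OVER (PARTITION BY accountid ORDER BY createddate ASC) AS rnk
    FROM contact
) WHERE rnk = 1
ORDER BY id
""").fetchall()

# row_number() with id as a tie-breaker yields exactly one row per account.
with_row_number = conn.execute("""
SELECT id, accountid FROM (
    SELECT id, accountid,
           ROW_NUMBER() OVER (PARTITION BY accountid
                              ORDER BY createddate ASC, id ASC) AS rn
    FROM contact
) WHERE rn = 1
ORDER BY id
""").fetchall()

print(with_rank)
print(with_row_number)
```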
I'm writing a query to multiply the count that I receive from a subquery by the fees amount, but I don't know how to do that. Any help/suggestions?
Oracle query is:
select courseid,coursename,fees*tmp
from course c join registration r on
r.courseid=c.courseid
and tmp IN (select count(*)
from course c join registration r on
r.courseid=c.courseid group by coursename);
I tried to use tmp like a variable, but I don't think that works in an Oracle query. Is there an alternative way to do this?
You can't do that, because you can only select data from tables that appear between FROM and WHERE. The IN operator is a shorthand that saves writing a bunch of OR conditions; it cannot establish a variable in the outer query.
Instead do something like:
select c.courseid, c.coursename, c.fees * COUNT(r.courseID) OVER(PARTITION BY c.coursename)
from course c join registration r on
r.courseid=c.courseid
Edit/update: you noted that this query produces too many rows and you only want to see distinct course names. In that case it would be better to just use the registrations table to count the number of people on the course and then multiply the fees:
SELECT
c.courseid, c.coursename, c.fees * COALESCE(r.numberOfstudents, 0) as courseWorth
FROM
course c
LEFT OUTER JOIN
(select courseid, COUNT(*) as numberofstudents FROM registration GROUP BY courseid) r
ON c.courseID = r.courseid
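The pre-aggregated join above can be tried out with a sketch like the following, run against SQLite through Python's sqlite3 module instead of Oracle. The column types, the studentid column and the sample rows are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical course/registration data; 'Rust' has no registrations.
conn.executescript("""
CREATE TABLE course (courseid INTEGER PRIMARY KEY, coursename TEXT, fees INTEGER);
CREATE TABLE registration (studentid INTEGER, courseid INTEGER);
INSERT INTO course VALUES (1, 'SQL', 100), (2, 'Java', 250), (3, 'Rust', 300);
INSERT INTO registration VALUES (1, 1), (2, 1), (3, 1), (1, 2);
""")

# Count registrations once per course, then multiply by the fee.
# COALESCE turns the NULL count of an unregistered course into 0.
course_worth = conn.execute("""
SELECT c.courseid, c.coursename,
       c.fees * COALESCE(r.numberofstudents, 0) AS courseworth
FROM course c
LEFT OUTER JOIN (
    SELECT courseid, COUNT(*) AS numberofstudents
    FROM registration
    GROUP BY courseid
) r ON c.courseid = r.courseid
ORDER BY c.courseid
""").fetchall()

for row in course_worth:
    print(row)
```

Because the counting happens in the derived table, each course appears once no matter how many registrations it has.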
You can use a windowing function like Caius or you can use a join like this:
select c.courseid, c.coursename, c.fees * COALESCE(sub.cnt,0)
from course c
join registration r on r.courseid=c.courseid
left join (
select c2.coursename, count(*) as cnt
from course c2
join registration r2 on r2.courseid=c2.courseid
group by c2.coursename
) sub on sub.coursename = c.coursename;
note: I make no claim your joins are correct -- I'm basing this query off of your example not on any knowledge of your data model.
I have 3 Oracle tables for a project that link a demo Transaction table to Transaction_Customer and Transaction_Employee as shown below. Each transaction can have multiple customers involved and many employees involved.
I am trying to write a SQL query which will list each Customer_ID that has had transactions with multiple employees within a given period. I would like the output to include a single row for each Customer_ID with a comma-separated list of the Employee_IDs that had a transaction with that customer.
The output should look like this:
Customer_ID|Employees
601|007,008,009
The basic query to join the tables together looks like this:
select * from transactions t
left join transactions_customer tc
on t.t_id = tc.t_id
left join transactions_employee te
on t.t_id = te.t_id
How do I finish this assignment and get the query working the way intended?
Thank you!
Transactions
T_ID|Date|Amount
1|1/10/2017|100
2|1/10/2017|200
3|1/31/2017|150
4|2/16/2017|175
5|2/17/2017|175
6|2/18/2017|185
Transactions_Customer
T_ID|Customer_ID
1|600
1|601
1|602
2|605
3|606
4|601
5|607
6|607
Transactions_Employee
T_ID|Employee_ID
1|007
1|008
2|009
3|008
4|009
5|007
6|007
Is this what you want?
select tc.Customer_id,
listagg(te.employee_id, ',') within group (order by te.employee_id) as employees
from Transactions_Customer tc join
Transactions_Employee te
on tc.t_id = te.t_id
group by tc.Customer_id;
You only need the Transactions table for filtering on the date. Your question alludes to such filtering but does not exactly describe it, so I left it out.
Edit:
The customer data (and perhaps the employees data too) has duplicates. To avoid these in the output:
select tc.Customer_id,
listagg(te.employee_id, ',') within group (order by te.employee_id) as employees
from (select distinct tc.t_id, tc.customer_id
from Transactions_Customer tc
) tc join
(select distinct te.t_id, te.employee_id
from Transactions_Employee te
) te
on tc.t_id = te.t_id
group by tc.Customer_id;
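A rough way to experiment with this outside Oracle is the sketch below, which loads the question's sample rows into SQLite through Python's sqlite3 module. SQLite has no LISTAGG, so GROUP_CONCAT(DISTINCT ...) stands in for it and the list is sorted in Python afterwards. The HAVING clause is an addition of this sketch, used to keep only customers with more than one employee as the question asks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; employee ids are TEXT to keep leading zeros.
conn.executescript("""
CREATE TABLE transactions_customer (t_id INTEGER, customer_id INTEGER);
CREATE TABLE transactions_employee (t_id INTEGER, employee_id TEXT);
INSERT INTO transactions_customer VALUES
  (1, 600), (1, 601), (1, 602), (2, 605), (3, 606), (4, 601), (5, 607), (6, 607);
INSERT INTO transactions_employee VALUES
  (1, '007'), (1, '008'), (2, '009'), (3, '008'), (4, '009'), (5, '007'), (6, '007');
""")

rows = conn.execute("""
SELECT tc.customer_id, GROUP_CONCAT(DISTINCT te.employee_id)
FROM transactions_customer tc
JOIN transactions_employee te ON tc.t_id = te.t_id
GROUP BY tc.customer_id
HAVING COUNT(DISTINCT te.employee_id) > 1
""").fetchall()

# GROUP_CONCAT does not promise an order, so sort each list in Python.
employees_by_customer = {
    customer_id: ",".join(sorted(employees.split(",")))
    for customer_id, employees in rows
}

for customer_id in sorted(employees_by_customer):
    print(f"{customer_id}|{employees_by_customer[customer_id]}")
```

In Oracle the same HAVING clause can be appended to the LISTAGG query, with the DISTINCT subqueries from the edit handling the duplicates.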
This query feeds a data table with sorting, filtering, and pagination. All features worked fine until I added the INNER JOIN, and then I got:
The multi-part identifier "Types.Description" could not be bound
If I remove the second WHERE clause at the end of the query the LIKE statements work, but I lose pagination. I removed some of the LIKE clauses to try to clean up this monstrous query.
SELECT *
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY TAG asc) AS RowNumber, *
FROM (
SELECT (SELECT COUNT(*) FROM Instruments) AS TotalDisplayRows, (SELECT COUNT(*) FROM Instruments) AS TotalRows, Instruments.Tag, Instruments.Location, Instruments.Description, Types.Description As TypeDesc, Manufacturer.Name, Lease.Name as LeaseName, Facility.Name as FacName
FROM Instruments
INNER JOIN Types ON Instruments.Type = Types.ID
INNER JOIN Manufacturer ON Instruments.Manufacturer = Manufacturer.ID
INNER JOIN Facility ON Instruments.Facility = Facility.ID
INNER JOIN Lease ON Instruments.Lease = Lease.ID
WHERE (Types.Description LIKE '%Cat%')
) RawResults
) Results
WHERE (Types.Description LIKE '%Cat%') AND RowNumber BETWEEN 1 AND 10
I think this is your problem
WHERE (types.description LIKE '%Cat%')
You can't do this because you are actually selecting from your derived table named Results and you aliased the column as TypeDesc.
So it should be
WHERE (results.typeDesc LIKE '%Cat%')
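The scoping rule can be reproduced on a much smaller query. The sketch below uses SQLite through Python's sqlite3 module, so the error text differs from SQL Server's, and the two tiny tables and their rows are invented for the demo. The outer query sees only the derived table's own column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down versions of the Instruments and Types tables.
conn.executescript("""
CREATE TABLE Types (ID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE Instruments (Tag TEXT, Type INTEGER);
INSERT INTO Types VALUES (1, 'Catalyst probe'), (2, 'Flow meter');
INSERT INTO Instruments VALUES ('T-100', 1), ('T-200', 2), ('T-300', 1);
""")

inner_sql = """
SELECT Instruments.Tag, Types.Description AS TypeDesc
FROM Instruments
INNER JOIN Types ON Instruments.Type = Types.ID
"""

# Outside the derived table, Types no longer exists as a name.
try:
    conn.execute(
        f"SELECT * FROM ({inner_sql}) Results "
        "WHERE Types.Description LIKE '%Cat%'"
    )
    outer_error = None
except sqlite3.OperationalError as exc:
    outer_error = str(exc)

# Referencing the alias exposed by the derived table works.
fixed_rows = conn.execute(
    f"SELECT * FROM ({inner_sql}) Results "
    "WHERE Results.TypeDesc LIKE '%Cat%' ORDER BY Tag"
).fetchall()

print(outer_error)
print(fixed_rows)
```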
I think I'm pretty close on this query, but can't seem to crack it, and I'm not sure if I've got the most efficient approach.
I am trying to find a day where a user is not booked from a range of dates where they are booked.
Think staff scheduling. I need to find who is available to work on Tuesday, and is working on other days this week.
My query currently looks like this
SELECT employees.uid, name, date
FROM employees
LEFT JOIN storelocation ON employees.uid = storelocation.uid
LEFT JOIN schedule ON emplyees.uid = schedule.uid
WHERE slid =9308
AND date
BETWEEN '2009-11-10'
AND '2009-12-20'
AND NOT
EXISTS (
SELECT uid
FROM schedule
WHERE date = '2009-11-11'
)
If I don't include the 'Not Exists', I get 1500 results
If I use only the Select from the 'Not Exists', I get 200 results, so both of those queries work independently.
However, my query as I've written it returns 0 results.
You might want to try something more like this:
SELECT employees.uid, name, date
FROM users
LEFT JOIN storelocation ON employees.uid = storelocation.uid
LEFT JOIN schedule ON emplyees.uid = schedule.uid
WHERE slid =9308
AND date BETWEEN '2009-11-10' AND '2009-12-20'
AND employees.uid NOT IN (
SELECT uid
FROM schedule
WHERE date = '2009-11-11'
)
The problem is that your NOT EXISTS isn't correlated, and you won't be able to correlate it without using table aliases:
SELECT e.uid,
e.name,
s.date
FROM EMPLOYEES e
LEFT JOIN STORELOCATION sl ON sl.uid = e.uid
LEFT JOIN schedule s ON s.uid = e.uid
AND s.date BETWEEN '2009-11-10' AND '2009-12-20'
WHERE slid = 9308
AND NOT EXISTS (SELECT NULL
FROM SCHEDULE t
WHERE t.uid = e.uid
AND t.date = '2009-11-11')
The SELECT list in an EXISTS clause doesn't do anything: you could use EXISTS( SELECT 1/0 ..., which you would expect to cause a "cannot divide by zero" error, but it won't. EXISTS only returns true if one or more rows match the WHERE/etc clause. There are numerous questions on SO asking whether it matters what's in the SELECT clause, if you want to read more.
Jim's answer, typo aside, should be faster in MySQL than the alternative I supplied. To read more about why, check this article: NOT IN vs NOT EXISTS vs LEFT JOIN/IS NULL in MySQL.
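To make the difference between the uncorrelated and correlated forms concrete, here is a minimal sketch using SQLite through Python's sqlite3 module, not MySQL. The tables are cut down to the columns that matter and the sample rows are invented. The uncorrelated subquery is true or false for the whole query, which is why the original returned 0 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Ann and Cy are booked on 2009-11-11, Bob is not.
conn.executescript("""
CREATE TABLE employees (uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE schedule (uid INTEGER, date TEXT);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO schedule VALUES
  (1, '2009-11-10'), (1, '2009-11-11'), (2, '2009-11-12'), (3, '2009-11-11');
""")

# Uncorrelated: the subquery finds *someone* booked that day,
# so NOT EXISTS is false for every employee.
uncorrelated = conn.execute("""
SELECT e.uid, e.name FROM employees e
WHERE NOT EXISTS (SELECT NULL FROM schedule WHERE date = '2009-11-11')
""").fetchall()

# Correlated: the subquery is evaluated per employee via t.uid = e.uid.
correlated = conn.execute("""
SELECT e.uid, e.name FROM employees e
WHERE NOT EXISTS (SELECT NULL FROM schedule t
                  WHERE t.uid = e.uid AND t.date = '2009-11-11')
ORDER BY e.uid
""").fetchall()

# NOT IN gives the same answer here because schedule.uid has no NULLs.
not_in = conn.execute("""
SELECT e.uid, e.name FROM employees e
WHERE e.uid NOT IN (SELECT uid FROM schedule WHERE date = '2009-11-11')
ORDER BY e.uid
""").fetchall()

print(uncorrelated, correlated, not_in)
```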