Missing Keyword Error using Inner join in SQL - sql

I am new to SQL and thus facing some problems in enhancing a SQL query.
There are two tables basically: one is ef_dat_app_ef_link and the other is ef_dat_lspd_machine_type_model
Here is the query I am talking about. I have added an inner join in the 10th line which is the base of all problems.
select country_isocode as country,
language_isocode as language, model_number,
machine_type_model_name model_name,machine_type_group_id,
machine_type_model_id, product_type_name product_type,
machine_type_model_id_to,filenet_link
from ef_dat_lspd_machine_type_model,
cross join ef_dim_lspd_country_language
left join ef_dat_lspd_model_country
using (machine_type_model_id, country_isocode)
inner join ef_dat_app_ef_link on (ef_dat_lspd_machine_type_model.lpmd_revenue_pid = ef_dat_app_ef_link.ef_product_revenue_pid)
where (ef_dat_app_ef_link.country_isocode='US' or ef_dat_app_ef_link.country_isocode='CA') and ef_dat_app_ef_link.language_isocode='en'
join ef_dat_lspd_product_type using (product_type_id)
left join ef_dat_lspd_model_relationship on (machine_type_model_id_from = machine_type_model_id)
where (discontinue_date is null or discontinue_date > sysdate) and
(announce_date is null or announce_date <= sysdate)
and (machine_type_model_id in (select machine_type_model_id from ef_dat_lspd_model_parts)
or not machine_type_model_id_to is null) order by machine_type_model_id
Below is the unchanged query which I am supposed to work on.
select country_isocode as country,
language_isocode as language,
machine_type_model_id,
product_type_name product_type,machine_type_model_id_to,
image_url
from ef_dat_lspd_machine_type_model cross join ef_dim_lspd_country_language
left join ef_dat_lspd_model_country using (machine_type_model_id, country_isocode)
join ef_dat_lspd_product_type using (product_type_id)
left join ef_dat_lspd_model_relationship on (machine_type_model_id_from = machine_type_model_id)
where (discontinue_date is null or discontinue_date > sysdate) and
(announce_date is null or announce_date <= sysdate) and
(machine_type_model_id in (select machine_type_model_id from ef_dat_lspd_model_parts) or >not machine_type_model_id_to is null)
order by machine_type_model_id
Now in the ef_dat_app_ef_link table there is a link with images,countrycode and languagecode and a revenueid and the other is ef_dat_lspd_machine_type_model which contains the image links and a few more columns. What I am trying to do is, create a query which shall replace pull up images from the ef_dat_app_ef_link table where the revenue id in this table is equal to the revenue id in the other table. A lot of columns come up when a table is searched by the revenue id, thats why i want to pull up a row which has language as 'en' and country as 'US' or 'CA'.
I have added an inner join statement to the same effect but it is consistently throwing a ORA00905 error for a missing keyword.
I have italicized the changes I have made. Sorry for such a bad representation of the code. I couldn't make head or tail of how to make it look better.

You added a WHERE clause smack dab in the middle of the existing query. You can't have more than one WHERE clause in a query, and you also can't place it in the middle of your table joins.
Merge the WHERE clause you added to the pre-existing one (indicated by the green arrow in the snapshot).

You can't mix the structure like that you have to do it like
select *
from t
join t1
join t2
where
and
or
order by
you can't do this
select *
from t
join t2
where
order
join
where
order
try
select country_isocode as country,
language_isocode as language, model_number,
machine_type_model_name model_name,machine_type_group_id,
machine_type_model_id, product_type_name product_type,
machine_type_model_id_to,filenet_link
from ef_dat_lspd_machine_type_model
cross join ef_dim_lspd_country_language
left join ef_dat_lspd_model_country
using (machine_type_model_id, country_isocode)
inner join ef_dat_app_ef_link on (ef_dat_lspd_machine_type_model.lpmd_revenue_pid = ef_dat_app_ef_link.ef_product_revenue_pid)
join ef_dat_lspd_product_type using (product_type_id)
left join ef_dat_lspd_model_relationship on (machine_type_model_id_from = machine_type_model_id)
where (ef_dat_app_ef_link.country_isocode='US' or ef_dat_app_ef_link.country_isocode='CA') and ef_dat_app_ef_link.language_isocode='en'
and (discontinue_date is null or discontinue_date > sysdate) and
(announce_date is null or announce_date <= sysdate)
and (machine_type_model_id in (select machine_type_model_id from ef_dat_lspd_model_parts)
or not machine_type_model_id_to is null) order by machine_type_model_id

Related

SQL subquery with join to main query

I have this:
SELECT
SU.FullName as Salesperson,
COUNT(DS.new_dealsheetid) as Units,
SUM(SO.New_profits_sales_totaldealprofit) as TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) as PPU,
-- opportunities subquery
(SELECT COUNT(*) FROM Opportunity O
LEFT JOIN Account A ON O.AccountId = A.AccountId
WHERE A.OwnerId = SU.SystemUserId AND
YEAR(O.CreatedOn) = 2022)
-- /opportunities subquery
FROM New_dealsheet DS
LEFT JOIN SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId
LEFT JOIN New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId
LEFT JOIN SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId
WHERE
YEAR(SO.New_purchaseordersenddate) = 2022 AND
SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName
I'm getting an error from the subquery:
Column 'SystemUser.SystemUserId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Is it possible to use the SystemUser table join from the main query in this way?
As has been mentioned extensively in the comments, the error is actually telling you the problem; SU.SystemUserId isn't in the GROUP BY nor in an aggregate function, and it appears in the SELECT of the query (albeit in the WHERE of a correlated subquery). Any columns in the SELECT must be either aggregated or in the GROUP BY when using one of the other for a query scope. As the column in question isn't aggregated nor in the GROUP BY, the error occurs.
There are, however, other problems. Like mentioned ikn the comments too, your LEFT JOINs make little sense, as many of the tables you LEFT JOIN to require a column in that table to have a non-NULL value; it is impossible for a column to have a non-NULL value if a row was not found.
You also use syntax like YEAR(<Column Name>) = <int Value> in the WHERE; this is not SARGable, and thus should be avoided. Use explicit date boundaries instead.
I assume here that SU.SystemUserId is a primary key, and so should be in the GROUP BY. This is probably a good thing anyway, as a person's full name isn't something that can be used to determine who a person is on their own (take this from someone who shared their name, date of birth and post code with another person in their youth; it caused many problems on the rudimentary IT systems of the time). This results in a query like this:
SELECT SU.FullName AS Salesperson,
COUNT(DS.new_dealsheetid) AS Units,
SUM(SO.New_profits_sales_totaldealprofit) AS TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) AS PPU,
(SELECT COUNT(*)
FROM dbo.Opportunity O
JOIN dbo.Account A ON O.AccountId = A.AccountId --A.OwnerID must have a non-NULL value, so why was this a LEFT JOIN?
WHERE A.OwnerId = SU.SystemUserId
AND O.CreatedOn >= '20220101' --Don't use YEAR(<Column Name>) = <int Value> syntax, it isn't SARGable
AND O.CreatedOn < '20230101') AS SomeColumnAlias
FROM dbo.New_dealsheet DS
JOIN dbo.SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId --SO.New_purchaseordersenddate must have a non-NULL value, so why was this a LEFT JOIN?
JOIN dbo.New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId --SP.New_SalesGroupIdName must have a non-NULL value, so why was this a LEFT JOIN?
LEFT JOIN dbo.SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId --This actually looks like it can be a LEFT JOIN.
WHERE SO.New_purchaseordersenddate >= '20220101' --Don't use YEAR(<Column Name>) = <int Value> syntax, it isn't SARGable
AND SO.New_purchaseordersenddate < '20230101'
AND SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName,
SU.SystemUserId;
Doing such a sub-query is bad on a performance-wise point of view
better do it like this:
SELECT
SU.FullName as Salesperson,
COUNT(DS.new_dealsheetid) as Units,
SUM(SO.New_profits_sales_totaldealprofit) as TDP,
SUM(SO.New_profits_sales_totaldealprofit) / COUNT(DS.new_dealsheetid) as PPU,
SUM(csq.cnt) as Count
FROM New_dealsheet DS
LEFT JOIN SalesOrder SO ON DS.New_DSheetId = SO.SalesOrderId
LEFT JOIN New_salespeople SP ON DS.New_SalespersonId = SP.New_salespeopleId
LEFT JOIN SystemUser SU ON SP.New_SystemUserId = SU.SystemUserId
-- Moved subquery as sub-join
LEFT JOIN (SELECT a.OwnerId, YEAR(o.CreatedOn) as year, COUNT(*) cnt FROM Opportunity O
LEFT JOIN Account A ON O.AccountId = A.AccountId
GROUP BY a.OwnerId, YEAR(o.CreatedOn) as csq ON csq.OwnerId = su.SystemUserId and csqn.Year = 2022
WHERE
YEAR(SO.New_purchaseordersenddate) = 2022 AND
SP.New_SalesGroupIdName = 'LO'
GROUP BY SU.FullName
So you have a nice join and a clean result
The query above is untested

LEFT JOIN not keeping only records that occur in a SELECT query

I have the following SQL select statement that I use to get a subset of products, or wines:
SELECT pv.SkProdVariantId AS id,
pa.Colour AS colour,
FROM Dim.ProductVariant AS pv
JOIN ProductAttributes_new AS pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
The length of this table generated is 3,905. I want to get all the transactional data for these products.
At the moment I'm using this select statement
SELECT c.CalDate AS timestamp,
f.SkProductVariantId AS sku_id,
f.Quantity AS quantity
FROM fact.FTransactions AS f
LEFT JOIN Dim.Calendar AS c
ON f.SkDateId = c.SkDateId
LEFT JOIN (
SELECT pv.SkProdVariantId AS id,
pa.Colour AS colour,
FROM Dim.ProductVariant AS pv
JOIN ProductAttributes_new AS pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
) AS s
ON s.id = f.SkProductVariantId
WHERE c.CalDate LIKE '%2019%'
The calendar dates are correct, but the number of unique products returned is 5,648, rather than the expected 3,905 from the select query.
Why does my LEFT JOIN on the first select query not work as I expect it to, please?
Thanks for any help!
If you want all the rows form your query, it needs to be the first reference in the LEFT JOIN. Then, I am guessing that you want transaction in 2019:
select . . .
from (SELECT pv.SkProdVariantId AS id, pa.Colour AS colour,
FROM Dim.ProductVariant pv JOIN
ProductAttributes_new pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
) s LEFT JOIN
(fact.FTransactions f JOIN
Dim.Calendar c
ON f.SkDateId = c.SkDateId AND
c.CalDate >= '2019-01-01' AND
c.CalDate < '2020-01-01'
)
ON s.id = f.SkProductVariantId;
Note that this assumes that CalDate is really a date and not a string. LIKE should only be used on strings.
You misunderstand somehow how outer joins work. See Gordon's answer and my request comment on that.
As to the task: It seems you want to select transactions of 2019, but you want to restrict your results to wine products. We typically restrict query results in the WHERE clause. You can use IN or EXISTS for that.
SELECT
c.CalDate AS timestamp,
f.SkProductVariantId AS sku_id,
f.Quantity AS quantity
FROM fact.FTransactions AS f
INNER JOIN Dim.Calendar AS c ON f.SkDateId = c.SkDateId
WHERE DATEPART(YEAR, c.CalDate) = 2019
AND f.SkProductVariantId IN
(
SELECT pv.SkProdVariantId
FROM Dim.ProductVariant AS pv
WHERE pv.ProdTypeName = 'Wines'
);
(I've removed the join to ProductAttributes_new, because it doesn't seem to play any part in this query.)

Rows which appeared only once in query results

I want to have rows which appeared in 25th of February and did not appear in 6th of March. I've tried to do such query:
SELECT favfd.day, dv.title, array_to_string(array_agg(distinct "host_name"), ',') AS affected_hosts , htmltoText(dsol.fix),
count(dv.title) FROM fact_asset_vulnerability_finding_date favfd
INNER JOIN dim_vulnerability dv USING(vulnerability_id)
INNER JOIN dim_asset USING(asset_id)
INNER JOIN dim_vulnerability_solution USING(vulnerability_id)
INNER JOIN dim_solution dsol USING(solution_id)
INNER JOIN dim_solution_highest_supercedence dshs USING (solution_id)
WHERE (favfd.day='2018-02-25' OR favfd.day='2018-03-06') AND
dsol.solution_type='PATCH' AND dshs.solution_id=dshs.superceding_solution_id
GROUP BY favfd.day, dv.title, host_name, dsol.fix
ORDER BY favfd.day, dv.title
which gave me following results:
results
I've read that I need to add something like "HAVING COUNT(*)=1" but like you can see in query results my count columns looks quite weird. Here is my results with that line added:
results with having
Can you advice me what I am doing wrong?
One way is to use a HAVING clause to assert your date criteria:
SELECT
dv.title,
array_to_string(array_agg(distinct "host_name"), ',') AS affected_hosts,
htmltoText(dsol.fix),
count(dv.title)
FROM fact_asset_vulnerability_finding_date favfd
INNER JOIN dim_vulnerability dv USING(vulnerability_id)
INNER JOIN dim_asset USING(asset_id)
INNER JOIN dim_vulnerability_solution USING(vulnerability_id)
INNER JOIN dim_solution dsol USING(solution_id)
INNER JOIN dim_solution_highest_supercedence dshs USING (solution_id)
WHERE
dsol.solution_type = 'PATCH' AND
dshs.solution_id = dshs.superceding_solution_id
GROUP BY dv.title, dsol.fix
HAVING
SUM(CASE WHEN favfd.day = '2018-02-25' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN favfd.day = '2018-03-06' THEN 1 ELSE 0 END) = 0
ORDER BY favfd.day, dv.title
The only major changes I made were to remove host_name from the GROUP BY clause, because you use this column in aggregate, in the select clause. And I added the HAVING clause to check your logic.
The change you should make is to replace your implicit join syntax with explicit joins. Putting join criteria into the WHERE clause is considered bad practice nowadays.

SQL query that uses a GROUP BY and IN is too slow

I am struggling to speed this SQL query up. I have tried removing all the fields besides the two SUM() functions and the Id field but it is still incredibly slow. It is currently taking 15 seconds to run. Does anyone have any suggestions to speed this up as it is currently causing a timeout on a page in my web app. I need the fields shown so I can't really remove them but there surely has to be a way to improve this?
SELECT [Customer].[iCustomerID],
[Customer].[sCustomerSageCode],
[Customer].[sCustomerName],
[Customer].[sCustomerTelNo1],
SUM([InvoiceItem].[fQtyOrdered]) AS [Quantity],
SUM([InvoiceItem].[fNetAmount]) AS [Value]
FROM [dbo].[Customer]
LEFT JOIN [dbo].[CustomerAccountStatus] ON ([Customer].[iAccountStatusID] = [CustomerAccountStatus].[iAccountStatusID])
LEFT JOIN [dbo].[SalesOrder] ON ([SalesOrder].[iCustomerID] = [dbo].[Customer].[iCustomerID])
LEFT JOIN [Invoice] ON ([Invoice].[iCustomerID] = [Customer].[iCustomerID])
LEFT JOIN [dbo].[InvoiceItem] ON ([Invoice].[iInvoiceNumber] = [InvoiceItem].[iInvoiceNumber])
WHERE ([InvoiceItem].[sNominalCode] IN ('4000', '4001', '4002', '4004', '4005', '4006', '4007', '4010', '4015', '4016', '700000', '701001', '701002', '701003'))
AND( ([dbo].[SalesOrder].[dOrderDateTime] >= '2013-01-01')
OR ([dbo].[Customer].[dDateCreated] >= '2014-01-01'))
GROUP BY [Customer].[iCustomerID],[Customer].[sCustomerSageCode],[Customer].[sCustomerName], [Customer].[sCustomerTelNo1];
I don't think this query is doing what you want anyway. As written, there are no relationships between the Invoice table and the SalesOrder table. This leads me to believe that it is producing a cartesian product between invoices and orders, so customers with lots of orders would be generating lots of unnecessary intermediate rows.
You can test this by removing the SalesOrder table from the query:
SELECT c.[iCustomerID], c.[sCustomerSageCode], c.[sCustomerName], c.[sCustomerTelNo1],
SUM(it.[fQtyOrdered]) AS [Quantity], SUM(it.[fNetAmount]) AS [Value]
FROM [dbo].[Customer] c LEFT JOIN
[dbo].[CustomerAccountStatus] cas
ON c.[iAccountStatusID] = cas.[iAccountStatusID] LEFT JOIN
[Invoice] i
ON (i.[iCustomerID] = c.[iCustomerID]) LEFT JOIN
[dbo].[InvoiceItem] it
ON (i.[iInvoiceNumber] = it.[iInvoiceNumber])
WHERE it.[sNominalCode] IN ('4000', '4001', '4002', '4004', '4005', '4006', '4007', '4010', '4015', '4016', '700000', '701001', '701002', '701003') AND
c.[dDateCreated] >= '2014-01-01'
GROUP BY c.[iCustomerID], c.[sCustomerSageCode], c.[sCustomerName], c.[sCustomerTelNo1];
If this works and you need the SalesOrder, then you will need to either pre-aggregate by SalesOrder or find better join keys.
The above query could benefit from an index on Customer(dDateCreated, CustomerId).
You have a lot of LEFT JOIN
I don't see CustomerAccountStatus usage. Ou can exclude it
The [InvoiceItem].[sNominalCode] could be null in case of LEFT JOIN so add [InvoiceItem].[sNominalCode] is not null or <THE IN CONDITION>
Also add the is not null checks to other conditions
It seems you are looking for customers that are either created this year or for which sales orders exist from last year or this year. So select from customers, but use EXISTS on SalesOrder. Then you want to count invoices. So outer join them and make sure to have the criteria in the ON clause. (sNominalCode will be NULL for any outer joined records. Hence asking for certain sNominalCode in the WHERE clause will turn your outer join into an inner join.)
SELECT
c.iCustomerID,
c.sCustomerSageCode,
c.sCustomerName,
c.sCustomerTelNo1,
SUM(ii.fQtyOrdered) AS Quantity,
SUM(ii.fNetAmount) AS Value
FROM dbo.Customer c
LEFT JOIN dbo.Invoice i ON (i.iCustomerID = c.iCustomerID)
LEFT JOIN dbo.InvoiceItem ii ON (ii.iInvoiceNumber = i.iInvoiceNumber AND ii.sNominalCode IN ('4000', '4001', '4002', '4004', '4005', '4006', '4007', '4010', '4015', '4016', '700000', '701001', '701002', '701003'))
WHERE c.dDateCreated >= '2014-01-01'
OR EXISTS
(
SELECT *
FROM dbo.SalesOrder
WHERE iCustomerID = c.iCustomerID
AND dOrderDateTime >= '2013-01-01'
)
GROUP BY c.iCustomerID, c.sCustomerSageCode, c.sCustomerName, c.sCustomerTelNo1;

What could create a syntax error if you take a SQL query and perform an UNION with itself?

I have this strange error in SQL Server 2005 where I take a working query, add the UNION keyword below it and then copy the query again. In my opinion, this should always be working, but it is not. I get the message 'Incorrect syntax near the keyword 'union'.
What could create this problem ?
To be more specific, here is the complete query :
select distinct deliveries.id, orders.id, 20 + sum(orders.mass1) as allowed_duration
from features_resources
inner join features on features.id = featureid
inner join orders on orders.id = features_resources.resourceid
inner join orderinformations on orders.id = orderinformations.orderid
inner join deliveries on orderinformations.deliveryid = deliveries.id
where features.name = 'O_FRAIS'
and (deliveries.ID IN
(SELECT ID
FROM dbo.DeliveriesInExportedSchedule))
group by deliveries.id, features.name ,orders.id order by deliveries.id
union
select distinct deliveries.id, orders.id, 20 + sum(orders.mass1) as allowed_duration
from features_resources
inner join features on features.id = featureid
inner join orders on orders.id = features_resources.resourceid
inner join orderinformations on orders.id = orderinformations.orderid
inner join deliveries on orderinformations.deliveryid = deliveries.id
where features.name = 'O_FRAIS'
and (deliveries.ID IN
(SELECT ID
FROM dbo.DeliveriesInExportedSchedule))
group by deliveries.id, features.name ,orders.id order by deliveries.id
I have tried to reproduce the error on a smaller query, by starting from a simple query and adding features one by one (inner join, nested queryes, group by, sum,....) but failed to reproduce the error again.
Any idea ?
It is actually the order by deliveries.id in the top half that causes the problem.
The order by needs to apply to the whole query.
Example Syntax
SELECT v1.number
FROM master.dbo.spt_values v1
WHERE v1.number > 2000
UNION
SELECT v2.number
FROM master.dbo.spt_values v2
WHERE v2.number < 10
ORDER BY v1.number
Try putting the individual SELECTs in parentheses:
(SELECT ... )
UNION
(SELECT ... )
The way you have it now, the second WHERE and GROUP BY clauses are ambiguous - should that apply to the SELECT, or to the UNION? I don't have any way to tell, and neither has your DB server.