Debugging SQL Query

I've researched this question, but the solutions I've found are for debugging syntax errors, whereas I am encountering logic errors. I have a query in Access that joins a table to a query, and it works almost perfectly: it runs and displays exactly the layout I want. The only problem is that the data it shows is wrong. It has a GROUP BY field, and the other fields are sums based on the query joins. However, the sums are wrong for some groups and right for others, and I'm unsure why. Is there a way to get into my code and see where these queries are grabbing their numbers?
Here is the query I'm working with:
SELECT m.Bldg, Sum([e].[TotCost]*IIf([e].[Utility]="E",1,0)) AS ECost, Sum(g.TotCost*Switch(g.Utility='G',1,True,0)) AS GCost, Sum(h.TotCost*Switch(h.Utility='H',1,True,0)) AS HCost, Sum(c.TotCost*Switch(c.Utility='C',1,True,0)) AS CCost, Sum(w.TotCost*Switch(w.Utility='W',1,True,0)) AS WCost, Sum(s.TotCost*Switch(s.Utility='S',1,True,0)) AS SCost
FROM (((((tblBldgMeters AS m LEFT JOIN qryCurrentMonthMtrHis AS e ON m.EMeters = e.MeterID) LEFT JOIN qryCurrentMonthMtrHis AS g ON m.GMeters = g.MeterID) LEFT JOIN qryCurrentMonthMtrHis AS h ON m.HMeters = h.MeterID) LEFT JOIN qryCurrentMonthMtrHis AS c ON m.CMeters = c.MeterID) LEFT JOIN qryCurrentMonthMtrHis AS w ON m.WMeters = w.MeterID) LEFT JOIN qryCurrentMonthMtrHis AS s ON m.SMeters = s.MeterID
GROUP BY m.Bldg;
The problem is that the Cost fields are sometimes right and sometimes wrong, and I can't understand why. They can be anywhere from a hundred to a million dollars off. Each building has several meters with separate costs that have to be added together, so I have a query with just the current month's costs for each meter, and a table that lists all the buildings and their corresponding meters.

The best approach is to isolate a data set that works and one that doesn't. Then break the aggregate and verify the data to see what the query is actually doing. Without data samples or even a copy of Access, I could shoot from the hip with something like:
SELECT m.Bldg,
[e].[TotCost]*IIf([e].[Utility]="E",1,0) AS ECost,
e.Utility as e_Utility,
g.TotCost*Switch(g.Utility='G',1,True,0) AS GCost,
g.Utility as g_Utility,
h.TotCost*Switch(h.Utility='H',1,True,0) AS HCost,
h.Utility as h_Utility,
c.TotCost*Switch(c.Utility='C',1,True,0) AS CCost,
c.utility as c_Utility,
w.TotCost*Switch(w.Utility='W',1,True,0) AS WCost,
w.Utility as w_Utility,
s.TotCost*Switch(s.Utility='S',1,True,0) AS SCost,
s.Utility as s_Utility
FROM
(((((tblBldgMeters AS m
LEFT JOIN qryCurrentMonthMtrHis AS e ON m.EMeters = e.MeterID)
LEFT JOIN qryCurrentMonthMtrHis AS g ON m.GMeters = g.MeterID)
LEFT JOIN qryCurrentMonthMtrHis AS h ON m.HMeters = h.MeterID)
LEFT JOIN qryCurrentMonthMtrHis AS c ON m.CMeters = c.MeterID)
LEFT JOIN qryCurrentMonthMtrHis AS w ON m.WMeters = w.MeterID)
LEFT JOIN qryCurrentMonthMtrHis AS s ON m.SMeters = s.MeterID
You might want to separate out the TotCost fields as well. This should give you decent insight into what is actually happening in the query. That's always my go-to in troubleshooting: break the aggregate, check the data. If the data set is too large, isolate a test group and narrow it down.
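One thing the un-aggregated rows often reveal is join fan-out, where one extra matching row in one join repeats the values pulled in by the other joins. The sketch below is only an illustration of that check. It uses Python's sqlite3 rather than Access, and the tables (bldg_meters, meter_costs), column names and numbers are all invented. It also assumes that a meter can have more than one row in the current-month query, which may or may not be true of the original data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical stand-ins for tblBldgMeters and qryCurrentMonthMtrHis
    CREATE TABLE bldg_meters (bldg TEXT, e_meter TEXT, g_meter TEXT);
    CREATE TABLE meter_costs (meter_id TEXT, utility TEXT, tot_cost REAL);
    INSERT INTO bldg_meters VALUES ('A', 'E1', 'G1');
    -- Assumption: meter E1 has two cost rows this month, G1 has one
    INSERT INTO meter_costs VALUES ('E1', 'E', 100), ('E1', 'E', 50), ('G1', 'G', 30);
""")

joins = """
    FROM bldg_meters AS m
    LEFT JOIN meter_costs AS e ON m.e_meter = e.meter_id
    LEFT JOIN meter_costs AS g ON m.g_meter = g.meter_id
"""

# Aggregated query: the gas cost comes out as 60 although only 30 was billed
agg = con.execute(
    "SELECT m.bldg, SUM(e.tot_cost), SUM(g.tot_cost)" + joins + "GROUP BY m.bldg"
).fetchall()

# Same joins with the aggregate broken: the G1 row shows up once per E1 row
detail = con.execute(
    "SELECT m.bldg, e.meter_id, e.tot_cost, g.meter_id, g.tot_cost"
    + joins + "ORDER BY e.tot_cost"
).fetchall()

print(agg)
for row in detail:
    print(row)
```

If the detail rows show the same meter cost repeated, the sums are being inflated by the joins rather than by the data itself.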

Related

How to do multiple left joins in SQL Server SQL Query

I'm trying to perform the query below on SQL Server:
SELECT
[dbo].[Machine].[MachineID],
[dbo].[Machine].[CompanyID],
[dbo].[Company].[AccountRef],
[dbo].[Machine].[ProductTypeID],
[dbo].[Machine].[SerialNo],
[dbo].[Machine].[InstallationDate],
[dbo].[Machine].[SalesTypeID],
[dbo].[SalesType].[SalesType],
[dbo].[Machine].[LeasingCompanyID],
[dbo].[LeasingCompany].[Name],
[dbo].[Machine].[QuarterlyRentalCost],
[dbo].[Machine].[Term],
[dbo].[Machine].[ExpiryDate],
[dbo].[Machine].[Scales],
[dbo].[Machine].[Chips],
[dbo].[Machine].[ContractTypeID],
[dbo].[ContractType].[ContractType],
[dbo].[Machine].[ContractCost],
[dbo].[Machine].[InvoiceDate],
[dbo].[Machine].[ServiceDueDate],
[dbo].[Machine].[ServiceNotes],
[dbo].[Machine].[modelID],
[dbo].[Machine].[Model],
[dbo].[Machine].[IMP_Machine Reference],
[dbo].[Machine].[Smart]
FROM
[dbo].[Machine], [dbo].[Company], [dbo].[SalesType], [dbo].[LeasingCompany], [dbo].[ContractType]
LEFT JOIN
[dbo].[Machine] as A ON A.[CompanyID] = [dbo].[Company].[CompanyID]
LEFT JOIN
[dbo].[Machine] as B ON B.[SalesTypeID] = [dbo].[SalesType].[SalesTypeID]
LEFT JOIN
[dbo].[Machine] as C ON C.[LeasingCompanyID] = [dbo].[LeasingCompany].[LeasingCompanyID]
LEFT JOIN
[dbo].[Machine] as D ON D.[ContractTypeID] = [dbo].[ContractType].[ContractTypeID] ;
But for some reason that I cannot see for the life of me, the destination column name in the bottom 3 join statements is reporting "The multi-part identifier could not be bound".
Could anyone assist, please?
Many Thanks,
You were pretty close. The error comes from mixing the comma-separated table list with explicit JOINs: the JOINs attach only to the last table in the list (ContractType), so the other tables named in the ON clauses are not in scope there. For readability, you also don't need the extensive [dbo].[table] prefix all over. You can define each table once in the FROM clause with an alias, then use that alias the rest of the way through. Also, make each alias meaningful in the context of its table, as you will see below: LeasingCompany lc, SalesType st, etc.
Also, I indent so that I always know the first table of the join relationship, with each JOIN indented under the table it is being joined to. Then I keep the orientation of the from/to table aliases consistent in every ON clause.
Since the machine table is used once and joined to 4 different lookup tables, you can reuse the same "m" alias to each underlying. Think of a tree and branches. The root tree is your "Machine", and all the branches are the lookups.
SELECT
m.MachineID,
m.CompanyID,
c.AccountRef,
m.ProductTypeID,
m.SerialNo,
m.InstallationDate,
m.SalesTypeID,
st.SalesType,
m.LeasingCompanyID,
lc.Name,
m.QuarterlyRentalCost,
m.Term,
m.ExpiryDate,
m.Scales,
m.Chips,
m.ContractTypeID,
ct.ContractType,
m.ContractCost,
m.InvoiceDate,
m.ServiceDueDate,
m.ServiceNotes,
m.modelID,
m.Model,
m.[IMP_Machine Reference],
m.Smart
FROM
Machine m
LEFT JOIN Company c
ON m.CompanyID = c.CompanyID
LEFT JOIN SalesType st
ON m.SalesTypeID = st.SalesTypeID
LEFT JOIN LeasingCompany lc
ON m.LeasingCompanyID = lc.LeasingCompanyID
LEFT JOIN ContractType ct
ON m.ContractTypeID = ct.ContractTypeID ;
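As a rough illustration of the "root and branches" shape described above, here is a cut-down version run through Python's sqlite3. Only two of the four lookup tables are included, the rows are invented, and SQLite stands in for SQL Server, so treat it as a sketch of the join pattern rather than of the real schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Reduced, hypothetical versions of the tables in the question
    CREATE TABLE Company (CompanyID INTEGER PRIMARY KEY, AccountRef TEXT);
    CREATE TABLE SalesType (SalesTypeID INTEGER PRIMARY KEY, SalesType TEXT);
    CREATE TABLE Machine (MachineID INTEGER PRIMARY KEY,
                          CompanyID INTEGER, SalesTypeID INTEGER);
    INSERT INTO Company VALUES (1, 'ACME01');
    INSERT INTO SalesType VALUES (10, 'Lease');
    -- Machines 101 and 102 each lack one of the lookups
    INSERT INTO Machine VALUES (100, 1, 10), (101, 1, NULL), (102, NULL, 10);
""")

# Machine is the root; every lookup is joined back to the same alias "m"
rows = con.execute("""
    SELECT m.MachineID, c.AccountRef, st.SalesType
    FROM Machine AS m
        LEFT JOIN Company AS c
            ON m.CompanyID = c.CompanyID
        LEFT JOIN SalesType AS st
            ON m.SalesTypeID = st.SalesTypeID
    ORDER BY m.MachineID
""").fetchall()

for row in rows:
    print(row)
```

Every machine is returned exactly once, with NULL (None in Python) wherever a lookup row is missing.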

Convert Legacy SQL Outer JOIN *=, =* to ANSI

I need to convert a legacy SQL outer Join to ANSI.
The reason for that being, we're upgrading from a legacy DB instance (2000/5 ?) to SQL 2016.
Legacy SQL query :-
SELECT
--My Data to Select--
FROM counterparty_alias ca1,
counterparty_alias ca2,
counterparty cp,
party p
WHERE cp.code *= ca1.counterparty_code AND
ca1.alias = 'Party1' AND
cp.code *= ca2.counterparty_code AND
ca2.alias = 'Party2' AND
cp.code *= p.child_code AND
cp.category in ('CAT1','CAT2')
Here, Party1 and Party2 Are the party type codes and CAT1 and CAT2 are the category codes. They're just data; I have abstracted it, because the values don't really matter.
Now, when I try to replace the *= with a LEFT OUTER JOIN, I get a huge mismatch in the data, both in the number of rows and in the data itself.
What am I doing wrong?
The query I'm using is this:
SELECT
--My Data to Select--
FROM
counterparty cp
LEFT OUTER JOIN counterparty_alias ca1 ON cp.code = ca1.counterparty_code
LEFT OUTER JOIN counterparty_alias ca2 ON cp.code = ca2.counterparty_code
LEFT OUTER JOIN party p ON cp.code = p.child_code
WHERE
ca1.alias = 'Party1' AND
ca2.alias = 'Party2' AND
cp.category in ('CAT1','CAT2')
Clearly, in all three legacy joins, the cp (counterparty) table is on the left-hand side of the *=, so that should translate to a LEFT OUTER JOIN with all three tables. However, my solution doesn't seem to be working.
How can I fix this? What am I doing wrong here?
Any help would be much appreciated. Thanks in advance :)
EDIT
I also have another query like this :
SELECT
--My Data to Select--
FROM dbo.deal d,
dbo.deal_ccy_option dvco,
dbo.deal_valuation dv,
dbo.strike_modifier sm
WHERE d.deal_id = dvco.deal_id
AND d.deal_id = dv.deal_id
AND dvco.base + dvco.quoted *= sm.ccy_pair
AND d.maturity_date *= sm.expiry_date
In this case, both the dvco and d tables seem to be doing a LEFT OUTER JOIN on the same table sm. How do I proceed with this?
Maybe join to the same table twice, using the aliases sm1 and sm2?
Or should I use sm as the central table and change the joins on the dvco and d tables to RIGHT OUTER JOIN?
I think the problem with your translation is that you are using conditions on the right tables in the where clause instead of in the on clause.
When I tried to translate it, this is the translation I've got:
FROM counterparty cp
LEFT JOIN counterparty_alias ca1 ON cp.code = ca1.counterparty_code
AND ca1.alias = 'Party1'
LEFT JOIN counterparty_alias ca2 ON cp.code = ca2.counterparty_code
AND ca2.alias = 'Party2'
LEFT JOIN party p ON cp.code = p.child_code
WHERE cp.category in ('CAT1','CAT2')
However, it's hard to know if I'm correct since you didn't provide sample data, desired results, or even a complete query.
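The difference between filtering the right-hand table in WHERE and filtering it in ON can be seen on a tiny data set. This is a sketch in Python's sqlite3, not SQL Server; the column "value" and the rows are invented, and only the table names come from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE counterparty (code TEXT, category TEXT);
    CREATE TABLE counterparty_alias (counterparty_code TEXT, alias TEXT, value TEXT);
    INSERT INTO counterparty VALUES ('C1', 'CAT1'), ('C2', 'CAT1');
    -- Only C1 has a 'Party1' alias row
    INSERT INTO counterparty_alias VALUES ('C1', 'Party1', 'x');
""")

# Filter in WHERE: rows where ca1 is NULL fail the test, so C2 disappears
where_rows = con.execute("""
    SELECT cp.code, ca1.value
    FROM counterparty cp
    LEFT JOIN counterparty_alias ca1 ON cp.code = ca1.counterparty_code
    WHERE ca1.alias = 'Party1'
    ORDER BY cp.code
""").fetchall()

# Filter in ON: C2 is kept, with NULL for the alias columns
on_rows = con.execute("""
    SELECT cp.code, ca1.value
    FROM counterparty cp
    LEFT JOIN counterparty_alias ca1 ON cp.code = ca1.counterparty_code
                                    AND ca1.alias = 'Party1'
    ORDER BY cp.code
""").fetchall()

print(where_rows)
print(on_rows)
```

The WHERE version behaves like an inner join, which is one plausible explanation for a row-count mismatch after a conversion.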
If you're doing a straight conversion, *= translates to a LEFT OUTER JOIN and =* to a RIGHT OUTER JOIN: the asterisk sits on the side of the table whose rows are all preserved.
I am converting hundreds of stored procs and views now, and I test each one the same way: I run the query as the original first, then make the changes and re-run it with the ANSI-compliant code.
The data returned needs to be the same for consistency in our application.
So for your second query I think it would look something like this:
FROM dbo.deal d
INNER JOIN dbo.deal_ccy_option dvco ON d.deal_id = dvco.deal_id
INNER JOIN dbo.deal_valuation dv ON d.deal_id = dv.deal_id
LEFT OUTER JOIN dbo.strike_modifier sm ON d.maturity_date = sm.expiry_date
AND (dvco.base + dvco.quoted) = sm.ccy_pair
Thanks for the help, and sorry for the late post. I got it to work with a quick hack, using the Query Designer tool built into SSMS. It refactored all my queries, putting in the correct join (either LEFT or RIGHT) and moving the WHERE condition into an AND condition on the join itself, so I got the same result set before and after the conversion; only the row ordering was sometimes a little off.
I got lost with deadlines and couldn't update with the solution earlier. Thanks again for the help. Hope this helps someone else too!
I'm still a little unsure why the ordering was off when the join conditions and filters were the same, because the data was a 100% match.
To get the Query Designer to work, just select your legacy SQL and open the Query Designer by pressing Ctrl + Shift + Q, or go to the main menu toolbar => Query => Design Query in Editor.
That's it. This will refactor your legacy code to the ANSI join syntax. You will get the converted query with the new joins, which you can copy and test. It worked 100% of the time for me, except in some cases where the sorting did not match, which you can check by adding a simple ORDER BY clause to both the pre and post queries to compare the data.
For reference, I cross checked with this post :
http://sqlblog.com/blogs/john_paul_cook/archive/2013/03/02/using-the-query-designer-to-convert-non-ansi-joins-to-ansi.aspx

Access SQL query without duplicate results

I made a query and wanted it to return no duplicates, but I sometimes got 3 duplicates, and when I used DISTINCT or DISTINCTROW I still got 2 duplicates.
SELECT f.flight_code,
f.status,
a.airport_name,
a1.airport_name,
f.departing_date+f.departing_time AS SupposedDepartingTime,
f.landing_date+f.landing_time AS SupposedLandingTime,
de.actual_takeoff_date+de.actual_takeoff_time AS ActualDepartingTime,
SupposedLandingTime+(ActualDepartingTime-SupposedDepartingTime) AS ActualLandingTime
FROM
(((Flights AS f
LEFT JOIN Aireports AS a
ON a.airport_code = f.depart_ap)
LEFT JOIN Aireports AS a1
ON f.target_ap = a1.airport_code)
LEFT JOIN Irregular_Events AS ie
ON f.flight_code = ie.flight_code)
LEFT JOIN Delay_Event AS de
ON ie.IE_code = de.delay_code;
I had to use LEFT JOIN because with INNER JOIN I missed some of the rows I wanted to show: I want to see all the flights, not only the flights that were delayed or canceled.
These are the results when I used INNER JOIN: you can see only the flights that have the status "ביטול" (canceled) or "עיכוב" (delayed), and that is not what I wanted.
[the results with LEFT JOIN][2]
[2]: https://i.stack.imgur.com/cgE2G.png
And when I used DISTINCT, the rows with the number 6 in the first column appear only two times.
IMPORTANT!
I just checked my query and all the tables I use in it, and I found my problem but don't know how to fix it!
In the table Irregular_Events I have more than one event for flights 3, 6 and 8, and that is why I see extra rows when I use LEFT JOIN, even though I use DISTINCT. Please give me some help!
Not entirely sure without seeing the table structure, but this might work:
SELECT f.flight_code,
f.status,
a.airport_name,
a1.airport_name,
f.departing_date+f.departing_time AS SupposedDepartingTime,
f.landing_date+f.landing_time AS SupposedLandingTime,
de.actual_takeoff_date+de.actual_takeoff_time AS ActualDepartingTime,
SupposedLandingTime+(ActualDepartingTime-SupposedDepartingTime) AS ActualLandingTime
FROM
((Flights AS f
LEFT JOIN Aireports AS a
ON a.airport_code = f.depart_ap)
LEFT JOIN Aireports AS a1
ON f.target_ap = a1.airport_code)
LEFT JOIN
(
SELECT
ie.flight_code,
de1.actual_takeoff_date,
de1.actual_takeoff_time
FROM
Irregular_Events AS ie
INNER JOIN Delay_Event AS de1
ON ie.IE_code = de1.delay_code
) AS de
ON f.flight_code = de.flight_code
It is hard to tell what the problem with your query is without a sample of the output or a description of the structure of your tables.
But your problem is that you are querying from the Flights table, which (I assume) can be linked to multiple Irregular_Events rows, each of which can possibly be linked to multiple Delay_Event rows.
If you want only one row per flight, you need to make sure your joins return only one row per flight too. Maybe you can do that by adding one more condition to the join, or by adding a condition in a sub-query.
EDIT
You could try adding a GROUP BY to the query; every selected column that is not listed in it must then be wrapped in an aggregate such as Max():
GROUP BY
f.flight_code,
f.status,
a.airport_name,
a1.airport_name;
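One way to get a single row per flight is to collapse the events in a sub-query before joining them to the flights. The sketch below uses Python's sqlite3 instead of Access, with simplified, invented tables and a single actual_takeoff column. Keeping the latest takeoff per flight is purely an assumption made for the example; the right rule depends on what the events mean.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Simplified, hypothetical versions of the tables in the question
    CREATE TABLE Flights (flight_code INTEGER, status TEXT);
    CREATE TABLE Irregular_Events (IE_code INTEGER, flight_code INTEGER);
    CREATE TABLE Delay_Event (delay_code INTEGER, actual_takeoff TEXT);
    INSERT INTO Flights VALUES (3, 'on time'), (6, 'delayed');
    -- Flight 6 has two irregular events, each with a delay row
    INSERT INTO Irregular_Events VALUES (1, 6), (2, 6);
    INSERT INTO Delay_Event VALUES (1, '2020-01-01 10:00'), (2, '2020-01-01 11:30');
""")

# Joining straight through repeats flight 6 once per event
naive = con.execute("""
    SELECT f.flight_code, de.actual_takeoff
    FROM Flights AS f
    LEFT JOIN Irregular_Events AS ie ON f.flight_code = ie.flight_code
    LEFT JOIN Delay_Event AS de ON ie.IE_code = de.delay_code
    ORDER BY f.flight_code, de.actual_takeoff
""").fetchall()

# Collapsing the events to one row per flight first removes the repeats
one_per_flight = con.execute("""
    SELECT f.flight_code, d.actual_takeoff
    FROM Flights AS f
    LEFT JOIN (
        SELECT ie.flight_code, MAX(de.actual_takeoff) AS actual_takeoff
        FROM Irregular_Events AS ie
        INNER JOIN Delay_Event AS de ON ie.IE_code = de.delay_code
        GROUP BY ie.flight_code
    ) AS d ON f.flight_code = d.flight_code
    ORDER BY f.flight_code
""").fetchall()

print(naive)
print(one_per_flight)
```

DISTINCT cannot help in the first query because the repeated rows differ in the takeoff column, so they are not true duplicates.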

Best way to optimize this SQL Query

I have to optimize this query and I am really in a hurry here. The following query searches by client. The input value in RIF.keyvaluechar LIKE 'V%10553790 ' uses a wildcard because some old records in the database were padded when the ID was missing characters: an ID was stored as V0012345678 when it should have been V12345678, as that's the maximum number of characters the ID can have. I know 12345678 should have been stored as a number and the V as a char and then compared, but that's another issue.
Anyway, the query is this one:
SELECT DISTINCT idata.itemnum AS [ID],
LTRIM(RTRIM(ISNULL(CONTRATO.keyvaluechar,'N/A'))) AS [Contrato],
idata.datestored AS [Fecha],
NUMERO.keyvaluesmall AS [Numero],
TIPO.keyvaluechar AS [Tipo],
LTRIM(RTRIM(ISNULL(LC.lifecyclename,'N/A'))) AS [Flujo],
LTRIM(RTRIM(ISNULL(LC.lcnum,-1))) AS [FlujoID],
LTRIM(RTRIM(ISNULL(LCS.statename,'N/A'))) AS [Cola],
LTRIM(RTRIM(ISNULL(LCS.statenum,-1))) AS [ColaID],
CASE
WHEN PC.NombreProceso IN('PTD','PV2','PV3') THEN 1
ELSE 0
END AS [Portada]
FROM OnBase.hsi.itemdata idata WITH (NOLOCK)
INNER JOIN OnBase.hsi.keyitem109 TIPO WITH (NOLOCK) ON TIPO.itemnum = idata.itemnum
INNER JOIN OnBase.hsi.keyitem113 NUMERO WITH (NOLOCK) ON NUMERO.itemnum = idata.itemnum
LEFT JOIN OnBase.hsi.keyitem132 CONTRATO WITH (NOLOCK) ON CONTRATO.itemnum = idata.itemnum
LEFT JOIN OnBase.hsi.keyitem114 CLIENTE WITH (NOLOCK) ON CLIENTE.itemnum = idata.itemnum
LEFT JOIN OnBase.hsi.keyitem111 RIF WITH (NOLOCK) ON RIF.itemnum = idata.itemnum
INNER JOIN OnBase.hsi.doctype DOC WITH (NOLOCK) ON DOC.itemtypenum = idata.itemtypenum
INNER JOIN BD_WorkFlow.dbo.BBVA_ProcesosConfig PC WITH (NOLOCK) ON PC.ID_Documento = idata.itemtypenum
LEFT JOIN Onbase.hsi.itemlc ILC WITH (NOLOCK) ON ILC.itemnum = idata.itemnum
LEFT JOIN Onbase.hsi.lcstate LCS WITH (NOLOCK) ON LCS.statenum = ILC.statenum
LEFT JOIN Onbase.hsi.lifecycle LC WITH (NOLOCK) ON LC.lcnum = ILC.lcnum
WHERE PC.NombreProceso <> 'XXX' AND
PC.NombreProceso NOT IN('PTD','PV2','PV3') AND
TIPO.keyvaluechar = 'CCD' AND
RIF.keyvaluechar LIKE 'V%10553790 '
As you can see, it is written this way so it finds V0012345678 or V12345678, but I don't feel this is the right way or the best optimization, although I am no expert in databases.
Anyway, I've thought about something like this instead of the last line:
AND LEFT(RIF.keyvaluechar, 1) = 'V'
AND SUBSTRING(RIF.keyvaluechar, 2, LEN(RIF.keyvaluechar)) = '12345678'
What do you guys think? Is there any other better way to improve upon this?
First, your query has a logic problem. You have this:
LEFT JOIN OnBase.hsi.keyitem111 RIF WITH(NOLOCK) ON RIF.itemnum = idata.itemnum
and then this in your where clause:
AND RIF.keyvaluechar LIKE 'V%10553790 '
Putting that filter in your where clause effectively changes your left join to an inner join. To fix this, move the filter to the join.
In terms of optimizing it, I assume that means to make it run faster. What you were thinking about will probably slow things down because you are filtering on function results instead of fields. A better approach, no matter how much of a hurry you are in, is to look at the indexes in your database and try to filter on those. In fact, it might be appropriate to add new ones.
Is the keyvaluechar always a number from the second character onwards, and do you want to treat it as a number (that is, remove the leading zeros)? You could try adding a persisted computed column, convert(int, SUBSTRING(keyvaluechar, 2, 10)), to the table, then index it and use it as the search criterion. I would assume that should help a lot.
In addition to that, looking at statistics IO output might be a good idea too, to see what table is actually responsible for the biggest I/O amounts.
Just a note, I hope you also know the problems using NOLOCK can cause you.
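The idea of searching on the numeric part of the key can be tried out on a small scale. The sketch below uses Python's sqlite3, where an index on an expression plays the role that a persisted computed column plus index would play in SQL Server. The table contents are invented, and it assumes that everything after the first character is numeric.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical sample of the RIF key table
    CREATE TABLE keyitem111 (itemnum INTEGER, keyvaluechar TEXT);
    INSERT INTO keyitem111 VALUES
        (1, 'V0010553790'),  -- old padded form
        (2, 'V10553790'),    -- current form
        (3, 'V20553790');    -- a different client
    -- SQLite analogue of "persisted computed column + index":
    -- an index on the numeric part of the key
    CREATE INDEX ix_rif_number
        ON keyitem111 (CAST(SUBSTR(keyvaluechar, 2) AS INTEGER));
""")

# Both spellings of the same ID compare equal once the leading zeros are gone
rows = con.execute("""
    SELECT itemnum
    FROM keyitem111
    WHERE CAST(SUBSTR(keyvaluechar, 2) AS INTEGER) = ?
    ORDER BY itemnum
""", (10553790,)).fetchall()

print(rows)
```

The search becomes an equality test on a number instead of a LIKE with a wildcard in the middle of the pattern.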

Optimize SQL query with many left join

I have a SQL query with many left joins
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
LEFT JOIN T_PLAN_TYPE tp ON tp.plan_type_id = po.Plan_Type_Fk
LEFT JOIN T_PRODUCT_TYPE pt ON pt.PRODUCT_TYPE_ID = po.cust_product_type_fk
LEFT JOIN T_PROPOSAL_TYPE prt ON prt.PROPTYPE_ID = po.proposal_type_fk
LEFT JOIN T_BUSINESS_SOURCE bs ON bs.BUSINESS_SOURCE_ID = po.CONT_AGT_BRK_CHANNEL_FK
LEFT JOIN T_USER ur ON ur.Id = po.user_id_fk
LEFT JOIN T_ROLES ro ON ur.roleid_fk = ro.Role_Id
LEFT JOIN T_UNDERWRITING_DECISION und ON und.O_Id = po.decision_id_fk
LEFT JOIN T_STATUS st ON st.STATUS_ID = po.piv_uw_status_fk
LEFT OUTER JOIN T_MEMBER_INFO mi ON mi.proposal_info_fk = po.O_ID
WHERE 1 = 1
AND po.CUST_APP_NO LIKE '%100010233976%'
AND 1 = 1
AND po.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
The performance seems to be not good and I would like to optimize the query.
Any suggestions please?
Try this one:
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
WHERE PO.CUST_APP_NO LIKE '%100010233976%'
AND PO.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
First, check your indexes. Are they old? Did they get fragmented? Do they need rebuilding?
Then, check your "execution plan" (varies depending on the SQL engine): are all joins properly understood? Are some of them 'out of order'? Do some of them transfer too much data?
Then, check your plan and indexes: are all important columns covered? Are there any outstandingly lengthy table scans or joins? Are the columns in indexes IN ORDER with the query?
Then, revise your query:
- can you extract some parts that normally would quickly generate small rowset?
- can you add new columns to indexes so join/filter expressions will get covered?
- or reorder them so they match the query better?
And, supporting the solution from @Devart:
Can you eliminate some tables on the way? Does the WHERE touch the other tables at all? Does the data in the other tables modify the count significantly? If neither SELECT nor WHERE ever touches the other joined columns, and if the exact COUNT value is not that important (i.e. does that T_PROPOSAL_INFO exist?), then you might remove all the joins completely, as Devart suggested. LEFT JOINs never reduce the number of rows. They only copy/expand/multiply the rows.
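The claim that LEFT JOINs only multiply rows can be checked on a toy data set. This sketch uses Python's sqlite3 with two invented, cut-down tables standing in for T_PROPOSAL_INFO and T_MEMBER_INFO; the column names and rows are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical, reduced versions of the tables in the question
    CREATE TABLE proposal (o_id INTEGER, cust_app_no TEXT);
    CREATE TABLE member (proposal_info_fk INTEGER);
    INSERT INTO proposal VALUES (1, 'A100'), (2, 'A100');
    -- Proposal 1 has three members, proposal 2 has none
    INSERT INTO member VALUES (1), (1), (1);
""")

# With the join: 4 joined rows, but still only 2 distinct proposals
rows_joined, distinct_joined = con.execute("""
    SELECT COUNT(*), COUNT(DISTINCT po.o_id)
    FROM proposal po
    LEFT JOIN member mi ON mi.proposal_info_fk = po.o_id
""").fetchone()

# Without the join: the same distinct count from far fewer rows
(distinct_plain,) = con.execute(
    "SELECT COUNT(DISTINCT o_id) FROM proposal"
).fetchone()

print(rows_joined, distinct_joined, distinct_plain)
```

Because the count is a COUNT(DISTINCT) on a column of the left-most table, and the WHERE clause filters only that table, the joined and un-joined queries agree in this example.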