SQL GROUP BY function returning incorrect SUM amount - sql

I've been working on this problem and researching what I could be doing wrong, but I can't find an answer or a fault in the code I've written. I'm extracting data from an MS SQL Server database, with a WHERE clause that successfully filters the results to what I want. I get roughly 4 rows per employee and want to add together a value column. The moment I add a GROUP BY clause on the employee ID and put a SUM on the value, I get a number that is completely wrong. I suspect the SQL code is ignoring my WHERE clause.
Below is a small selection of data:
hr_empl_code hr_doll_paid
1 20.5
1 51.25
1 102.49
1 560
I expect that a GROUP BY with SUM would give me 734.24. The value I get is 211461.12. While troubleshooting, I added a COUNT(*) column to my query to see how many rows it's running against, and it returns 1152, which further reinforces my belief that it's ignoring my WHERE clause.
My SQL code is as below. Most of it has been generated by the front-end application that I'm running it from, so there is some additional code in there that I believe does assist the query.
SELECT DISTINCT
T000.hr_empl_code,
SUM(T175.hr_doll_paid)
FROM
hrtempnm T000,
qmvempms T001,
hrtmspay T166,
hrtpaytp T175,
hrtptype T177
WHERE 1 = 1
AND T000.hr_empl_code = T001.hr_empl_code
AND T001.hr_empl_code = T166.hr_empl_code
AND T001.hr_empl_code = T175.hr_empl_code
AND T001.hr_ploy_ment = T166.hr_ploy_ment
AND T001.hr_ploy_ment = T175.hr_ploy_ment
AND T175.hr_paym_code = T177.hr_paym_code
AND T166.hr_pyrl_code = 'f' AND T166.hr_paid_dati = 20180404
AND (T175.hr_paym_type = 'd' OR T175.hr_paym_type = 't')
GROUP BY T000.hr_empl_code
ORDER BY hr_empl_code
I'm really lost as to where it could be going wrong. I have stripped out the additional AND conditions and brought it down to just T166.hr_empl_code = T175.hr_empl_code, but it doesn't make a difference.
By no means am I an expert in SQL Server and queries, but I have a decent grasp of the technology. Any help would be much appreciated!

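GROUP BY is not wrong; how you are using it is wrong. The joins multiply the payment rows before SUM runs (your COUNT(*) of 1152 shows this), so total the payments in a derived table first and only then join to the employee table: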
SELECT
T000.hr_empl_code,
T.totpaid
FROM
hrtempnm T000
inner join (SELECT
hr_empl_code,
SUM(hr_doll_paid) as totPaid
FROM
hrtpaytp T175
where hr_paym_type = 'd' OR hr_paym_type = 't'
GROUP BY hr_empl_code
) T on t.hr_empl_code = T000.hr_empl_code
where exists
(select * from qmvempms T001,
hrtmspay T166,
hrtpaytp T175,
hrtptype T177
WHERE T000.hr_empl_code = T001.hr_empl_code
AND T001.hr_empl_code = T166.hr_empl_code
AND T001.hr_empl_code = T175.hr_empl_code
AND T001.hr_ploy_ment = T166.hr_ploy_ment
AND T001.hr_ploy_ment = T175.hr_ploy_ment
AND T175.hr_paym_code = T177.hr_paym_code
AND T166.hr_pyrl_code = 'f' AND T166.hr_paid_dati = 20180404
)
ORDER BY hr_empl_code
Note: The query would be clearer if you used explicit JOIN syntax instead of old-style comma joins with the conditions in the WHERE clause.
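To see the row multiplication for yourself, here is a minimal sketch using Python's built-in sqlite3 rather than SQL Server. The tables pay and run are hypothetical, cut-down stand-ins for hrtpaytp and hrtmspay, and the three run rows are made-up data; only the four payment values come from the question.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pay (hr_empl_code INTEGER, hr_doll_paid REAL);    -- like hrtpaytp
    CREATE TABLE run (hr_empl_code INTEGER, hr_paid_dati INTEGER); -- like hrtmspay
    INSERT INTO pay VALUES (1, 20.5), (1, 51.25), (1, 102.49), (1, 560);
    INSERT INTO run VALUES (1, 20180404), (1, 20180404), (1, 20180404);
""")

# Join first, aggregate second: every payment row is repeated once per
# matching run row, so COUNT(*) and SUM are both inflated.
inflated = conn.execute("""
    SELECT p.hr_empl_code, COUNT(*), SUM(p.hr_doll_paid)
    FROM pay p
    JOIN run r ON r.hr_empl_code = p.hr_empl_code
    GROUP BY p.hr_empl_code
""").fetchone()

# Aggregate first, then only test the other table with EXISTS.
correct = conn.execute("""
    SELECT t.hr_empl_code, t.totpaid
    FROM (SELECT hr_empl_code, SUM(hr_doll_paid) AS totpaid
          FROM pay
          GROUP BY hr_empl_code) t
    WHERE EXISTS (SELECT 1 FROM run r
                  WHERE r.hr_empl_code = t.hr_empl_code
                    AND r.hr_paid_dati = 20180404)
""").fetchone()

print("joined rows:", inflated[1], "sum:", round(inflated[2], 2))
print("pre-aggregated sum:", round(correct[1], 2))
```

The first query counts 12 rows (4 payments times 3 run rows) and sums each payment three times; the second sums each payment once. The WHERE clause is not being ignored in either case.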

Related

MAX function not working in Oracle statement

I have the following statement using MAX(woq.wq_version) and it keeps returning two results.
SELECT woq.wo_number, woq.quote_amount, MAX(woq.wq_version) version
FROM ba_view_wo_quote woq
LEFT JOIN sm_header smh
ON woq.woo_auto_key = smh.woo_auto_key
WHERE woq.woo_auto_key = smh.woo_auto_key
AND woq.wo_number = 'WO1110885'
AND woq.quote_amount <> '0'
HAVING woq.wq_version = MAX(woq.wq_version)
GROUP BY woq.wq_version, woq.quote_amount, woq.wo_number
I keep receiving these results:
wo_number quote_amount version
WO1110885 2803.15 1
WO1110885 1200 2
It sounds like you just want
select woq.wo_number,
woq.quote_amount,
woq.wq_version version
from ba_view_wo_quote woq
left join sm_header smh on woq.woo_auto_key=smh.woo_auto_key
where woq.wo_number = 'WO1110885'
and woq.quote_amount<>'0'
order by woq.wq_version desc
fetch first 1 row only
If that isn't what you're looking for, it would be helpful to update your question with a reproducible test case that shows us what your tables look like, what your data looks like, and what results you want for that data.
Note that it doesn't make sense to duplicate the same condition in the on clause of your join and in the where clause, so I got rid of the where clause condition.
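Here is a small sketch of why the original query returns two rows, assuming the goal is the row with the highest wq_version. It uses Python's built-in sqlite3 instead of Oracle, a hypothetical table named quote in place of ba_view_wo_quote, and LIMIT 1 where Oracle would use fetch first 1 row only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for ba_view_wo_quote, holding the two rows shown above.
    CREATE TABLE quote (wo_number TEXT, quote_amount REAL, wq_version INTEGER);
    INSERT INTO quote VALUES ('WO1110885', 2803.15, 1), ('WO1110885', 1200, 2);
""")

# Grouping by wq_version puts every version in its own group, so
# HAVING wq_version = MAX(wq_version) is true for each group.
grouped = conn.execute("""
    SELECT wo_number, quote_amount, MAX(wq_version)
    FROM quote
    GROUP BY wq_version, quote_amount, wo_number
    HAVING wq_version = MAX(wq_version)
    ORDER BY wq_version
""").fetchall()

# Sorting by version and keeping one row returns only the latest quote.
latest = conn.execute("""
    SELECT wo_number, quote_amount, wq_version
    FROM quote
    WHERE wo_number = 'WO1110885'
    ORDER BY wq_version DESC
    LIMIT 1
""").fetchall()

print(len(grouped), "rows from the grouped query")
print(latest)
```

Because the MAX is computed inside each one-row group, the HAVING clause never filters anything out.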

Self joining columns from the same table with calculation on one column not displaying column name

I am fairly new to SQL and having trouble figuring out how to solve the simple issue below. I have a dataset I am trying to self-join, and I am using (b.calendar_year_number -1) as one of the columns to join. I applied a calculation of -1 with the goal of matching values from the previous year. However, it is not working: the resulting column shows (No column name). How do I change the alias to b.calendar_year_number after the calculation?
Code:
SELECT a.day_within_fiscal_period,
a.calendar_month_name,
a.cost_period_rolling_three_month_start_date,
a.calendar_year_number,
b.day_within_fiscal_period,
b.calendar_month_name,
b.cost_period_rolling_three_month_start_date,
(b.calendar_year_number -1)
FROM [data_mart].[v_dim_date_consumer_complaints] AS a
JOIN [data_mart].[v_dim_date_consumer_complaints] AS b
ON b.day_within_fiscal_period = a.day_within_fiscal_period AND
b.calendar_month_name = a.calendar_month_name AND
b.calendar_year_number = a.calendar_year_number
"I am using (b.calendar_year_number -1) as one of the columns to join."
Nope, you're not. Look at your join statement and you'll see the third condition is:
b.calendar_year_number = a.calendar_year_number
So just change that to include the calculation. As far as the 'no column name' issue, you can use colname = somelogic syntax or somelogic as colname. Below, I used the former syntax.
select a.day_within_fiscal_period,
a.calendar_month_name,
a.cost_period_rolling_three_month_start_date,
a.calendar_year_number,
b.day_within_fiscal_period,
b.calendar_month_name,
b.cost_period_rolling_three_month_start_date,
bCalYearNum = b.calendar_year_number
from [data_mart].[v_dim_date_consumer_complaints] a
left join [data_mart].[v_dim_date_consumer_complaints] b
on b.day_within_fiscal_period = a.day_within_fiscal_period
and b.calendar_month_name = a.calendar_month_name
and b.calendar_year_number - 1 = a.calendar_year_number;
You could use the analytical function LAG/LEAD to get your required result, no self-join necessary:
select a.day_within_fiscal_period,
a.calendar_month_name,
a.cost_period_rolling_three_month_start_date,
a.calendar_year_number,
old_cost_period_rolling_three_month_start_date =
LAG(cost_period_rolling_three_month_start_date) OVER
(PARTITION BY calendar_month_name, day_within_fiscal_period
ORDER BY calendar_year_number),
old_CalYearNum = LAG(calendar_year_number) OVER
(PARTITION BY calendar_month_name, day_within_fiscal_period
ORDER BY calendar_year_number)
from [data_mart].[v_dim_date_consumer_complaints] a
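A quick way to try the LAG approach on toy data is the sketch below. It uses Python's built-in sqlite3 (window functions need SQLite 3.25 or later) rather than SQL Server, with a hypothetical dim_date table holding made-up rows, and it uses the expr AS alias form because the alias = expr form is specific to SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down version of v_dim_date_consumer_complaints.
    CREATE TABLE dim_date (day_within_fiscal_period INTEGER,
                           calendar_month_name TEXT,
                           calendar_year_number INTEGER);
    INSERT INTO dim_date VALUES (1, 'January', 2020),
                                (1, 'January', 2021),
                                (1, 'January', 2022);
""")

# LAG looks one row back within each (month, day) partition, ordered by year.
rows = conn.execute("""
    SELECT calendar_year_number,
           LAG(calendar_year_number) OVER (
               PARTITION BY calendar_month_name, day_within_fiscal_period
               ORDER BY calendar_year_number) AS old_cal_year_num
    FROM dim_date
    ORDER BY calendar_year_number
""").fetchall()

for row in rows:
    print(row)
```

One caveat: LAG returns the previous row that exists in the partition, which is the previous calendar year only if no year is missing from the data. The self-join on calendar_year_number - 1 does not have that limitation.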

SQL Server - Need to SUM values in across multiple returned records

In the following query I am trying to get TotalQty to SUM across both the locations for item 6112040, but so far I have been unable to make this happen. I do need to keep both lines for 6112040 separate in order to capture the different location.
This query feeds into a Jasper ireport using something called Java.Groovy. Despite this, none of the PDFs printed yet have been either stylish or stained brown. Perhaps someone could address that issue as well, but this SUM issue takes priority.
I know Gordon Linoff will get on in about an hour so maybe he can help.
DECLARE @receipt INT
SET @receipt = 20
SELECT
ent.WarehouseSku AS WarehouseSku,
ent.PalletId AS [ReceivedPallet],
ISNULL(inv.LocationName,'') AS [ActualLoc],
SUM(ISNULL(inv.Qty,0)) AS [LocationQty],
SUM(ISNULL(inv.Qty,0)) AS [TotalQty],
MAX(CAST(ent.ReceiptLineNumber AS INT)) AS [LineNumber],
MAX(ent.WarehouseLotReference) AS [WarehouseLot],
LEFT(SUM(ent.WeightExpected),7) AS [GrossWeight],
LEFT(SUM(inv.[Weight]),7) AS [NetWeight]
FROM WarehouseReceiptDetail AS det
INNER JOIN WarehouseReceiptDetailEntry AS ent
ON det.ReceiptNumber = ent.ReceiptNumber
AND det.FacilityName = ent.FacilityName
AND det.WarehouseName = ent.WarehouseName
AND det.ReceiptLineNumber = ent.ReceiptLineNumber
LEFT OUTER JOIN Inventory AS inv
ON inv.WarehouseName = det.WarehouseName
AND inv.FacilityName = det.FacilityName
AND inv.WarehouseSku = det.WarehouseSku
AND inv.CustomerLotReference = ent.CustomerLotReference
AND inv.LotReferenceOne = det.ReceiptNumber
AND ISNULL(ent.CaseId,'') = ISNULL(inv.CaseId,'')
WHERE
det.WarehouseName = $Warehouse
AND det.FacilityName = $Facility
AND det.ReceiptNumber = @receipt
GROUP BY
ent.PalletId
, ent.WarehouseSku
, inv.LocationName
, inv.Qty
, inv.LotReferenceOne
ORDER BY ent.WarehouseSku
The lines I need partially coalesced are 4 and 5 in the above return.
Create a second dataset with a subquery and join to that subquery - you can extrapolate from the following to apply to your situation:
First the Subquery:
SELECT
WarehouseSku,
SUM(Qty)
FROM
Inventory
GROUP BY
WarehouseSku
Now apply to your query - insert into the FROM clause:
...
LEFT JOIN (
SELECT
WarehouseSKU,
SUM(Qty) AS SumQty -- a derived table column needs a name in SQL Server
FROM
Inventory
GROUP BY
WarehouseSKU
) AS TotalQty
ON ent.WarehouseSku = TotalQty.WarehouseSku
Without seeing the actual schema DDL it is hard to know the exact cardinality, but I think this will point you in the right direction.
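As a rough illustration of the idea, the sketch below keeps one row per location while repeating the per-SKU total on each row. It runs on Python's built-in sqlite3 instead of SQL Server; the Inventory table is cut down to three columns, the location names and quantities are made up, and the alias names tot and SumQty are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down Inventory table with made-up rows.
    CREATE TABLE Inventory (WarehouseSku TEXT, LocationName TEXT, Qty INTEGER);
    INSERT INTO Inventory VALUES ('6112040', 'A-01', 10),
                                 ('6112040', 'B-07', 5),
                                 ('7000001', 'C-03', 8);
""")

# The derived table totals Qty per SKU; joining it back repeats that
# total on every location row of the same SKU.
rows = conn.execute("""
    SELECT inv.WarehouseSku,
           inv.LocationName,
           SUM(inv.Qty) AS LocationQty,
           tot.SumQty   AS TotalQty
    FROM Inventory AS inv
    LEFT JOIN (SELECT WarehouseSku, SUM(Qty) AS SumQty
               FROM Inventory
               GROUP BY WarehouseSku) AS tot
      ON tot.WarehouseSku = inv.WarehouseSku
    GROUP BY inv.WarehouseSku, inv.LocationName, tot.SumQty
    ORDER BY inv.WarehouseSku, inv.LocationName
""").fetchall()

for row in rows:
    print(row)
```

Both rows for 6112040 keep their own LocationQty (10 and 5) and share a TotalQty of 15. In the real query the derived table would probably also need the receipt and lot filters, which depends on the schema.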

SQL outer join in combination with MAX function in right table

I have an SQL question based on below table structure.
The database is currently in MS Access, with plans to migrate to SQL Server. The query should work in both DBMSs.
I want to get devName and the latest dswSW_Version, based on dswTimestamp, for the device in question. If no SW history exists, I want to just return the devName.
The closest I could get was:
SELECT dev.devname, dsw1.dswsw_version
FROM device_sw_history AS dsw1 RIGHT JOIN device AS dev
ON dsw1.dswdevid = dev.devid
WHERE dsw1.dswtimestamp = (SELECT MAX(dswtimestamp) FROM device_sw_history AS dsw2 WHERE dsw1.dswdevid = dsw2.dswdevid)
AND devid = @devid
But nothing is returned for devid = 2, because MAX returns null. I want to return Apple, null.
Is there a way to construct this statement without using a UNION and still return devname even if no SW history exists?
Device:
devid devname
1 Samsung
2 Apple
Device_SW_History:
dswid dswdevid dswtimestamp dswsw_version
1 1 5/dec/13 One
2 1 6/dec/13 Two
Thank you !
Just put your condition in the on clause:
SELECT dev.devname, dsw1.dswsw_version
FROM device_sw_history AS dsw1 RIGHT JOIN device AS dev
ON dsw1.dswdevid = dev.devid
AND dsw1.dswtimestamp = (SELECT MAX(dswtimestamp) FROM device_sw_history AS dsw2 WHERE dsw1.dswdevid = dsw2.dswdevid)
WHERE devid = @devid
For inner joins, a condition behaves the same whether it sits in the on clause or the where clause, so where you put it is merely a question of style and readability. Outer joins introduce a difference: the on clause decides which rows of the optional table match, while the where clause filters the joined result, after unmatched rows have already been filled in with nulls.
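A minimal sketch of that difference, using the sample data from the question in Python's built-in sqlite3. It is written as a LEFT JOIN from device (equivalent to the RIGHT JOIN above, and supported by older SQLite versions), and the timestamps are stored as ISO text so that MAX sorts them correctly; both choices are assumptions made for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (devid INTEGER, devname TEXT);
    CREATE TABLE device_sw_history (dswid INTEGER, dswdevid INTEGER,
                                    dswtimestamp TEXT, dswsw_version TEXT);
    INSERT INTO device VALUES (1, 'Samsung'), (2, 'Apple');
    INSERT INTO device_sw_history VALUES (1, 1, '2013-12-05', 'One'),
                                         (2, 1, '2013-12-06', 'Two');
""")

# Condition in WHERE: the NULL-filled row for a device without history
# fails the comparison and is filtered out.
in_where = """
    SELECT dev.devname, dsw1.dswsw_version
    FROM device AS dev
    LEFT JOIN device_sw_history AS dsw1
      ON dsw1.dswdevid = dev.devid
    WHERE dsw1.dswtimestamp = (SELECT MAX(dswtimestamp)
                               FROM device_sw_history AS dsw2
                               WHERE dsw2.dswdevid = dsw1.dswdevid)
      AND dev.devid = ?
"""

# Condition in ON: it only limits which history rows match, so the
# device row survives even when nothing matches.
in_on = """
    SELECT dev.devname, dsw1.dswsw_version
    FROM device AS dev
    LEFT JOIN device_sw_history AS dsw1
      ON dsw1.dswdevid = dev.devid
     AND dsw1.dswtimestamp = (SELECT MAX(dswtimestamp)
                              FROM device_sw_history AS dsw2
                              WHERE dsw2.dswdevid = dsw1.dswdevid)
    WHERE dev.devid = ?
"""

apple_where = conn.execute(in_where, (2,)).fetchall()
apple_on = conn.execute(in_on, (2,)).fetchall()
samsung_on = conn.execute(in_on, (1,)).fetchall()
print(apple_where, apple_on, samsung_on)
```

With the condition in where, Apple disappears; with it in on, Apple comes back with a null version and Samsung still gets only its latest version.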
On SQL Server, a simple subquery should do the trick:
SELECT
devname,
(SELECT TOP 1 dswsw_version FROM device_sw_history WHERE dswdevid = devid
ORDER BY dswtimestamp DESC)
FROM device
This will return all the device names from device, even those that do not have an entry in device_sw_history.

Simple SQL query too long

I just have a simple query. On SQL Server 2008 it sometimes runs for a very long time and nearly hangs, and sometimes it doesn't. The same SQL on an Oracle server always returns at once.
SELECT D.DESCITEM, D.LONGDESC, D.DESCTABL, D.DESCCOY, D.DESCPFX
FROM VM1DTA.DESCPF D, VM1DTA.ITEMPF I
WHERE D.DESCPFX='IT'AND D.DESCITEM=I.ITEMITEM AND I.VALIDFLAG='1'
AND D.DESCTABL = I.ITEMTABL AND D.DESCCOY = I.ITEMCOY AND "LANGUAGE" = 'E'
AND "VALIDFLAG" = '1' AND DESCTABL IN('T1680')
ORDER BY LONGDESC ASC;
Each table has about 100k records.
Could someone point me to the root cause? Thanks
I'm not sure what the issue is, but your query could use some refactoring. The condition "VALIDFLAG" = '1' looks unneeded as well, because it has no table prefix. The IN may have something to do with it.
SELECT D.DESCITEM, D.LONGDESC, D.DESCTABL, D.DESCCOY, D.DESCPFX
FROM
VM1DTA.DESCPF D INNER JOIN VM1DTA.ITEMPF I ON
D.DESCITEM=I.ITEMITEM
AND
D.DESCTABL = I.ITEMTABL
AND
D.DESCCOY = I.ITEMCOY
WHERE
D.DESCPFX='IT'
AND
"LANGUAGE" = 'E'
AND
I.VALIDFLAG='1'
AND
"VALIDFLAG" = '1'
AND
D.DESCTABL = 'T1680'
ORDER BY
LONGDESC ASC;
The other thing to look at is putting indexes on all of the join columns.
Hope this helps.
I find that joining two tables on free-text fields takes up a lot of resources and time. Additionally, these fields are traditionally not indexed in any way.
Try to see if there are any indexes worth using instead, or other columns to join on.
Also, your join on I.ITEMTABL is not required, because D.DESCTABL is already filtered to a single value; joining on it only makes the engine do extra work unless the column is indexed. You can filter I.ITEMTABL by the same constant instead.
Another method would be to use a subquery in your where clause like:
SELECT D.DESCITEM, D.LONGDESC, D.DESCTABL, D.DESCCOY, D.DESCPFX
FROM VM1DTA.DESCPF D
WHERE D.DESCPFX = 'IT'
AND D.LANGUAGE = 'E'
AND D.VALIDFLAG = '1'
AND D.DESCTABL = 'T1680'
AND EXISTS (SELECT NULL
FROM VM1DTA.ITEMPF I
WHERE I.VALIDFLAG = '1'
AND I.ITEMTABL = 'T1680'
AND D.DESCITEM = I.ITEMITEM
AND D.DESCCOY = I.ITEMCOY)
ORDER BY LONGDESC ASC;
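One thing to be aware of when switching from a join to EXISTS: EXISTS returns each DESCPF row at most once, while a join returns one row per matching ITEMPF row. The sketch below shows this on made-up data using Python's built-in sqlite3; the tables are cut down to a few columns, the VM1DTA schema prefix is dropped, and the duplicate ITEMPF row is a deliberate assumption for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of DESCPF and ITEMPF.
    CREATE TABLE DESCPF (DESCITEM TEXT, DESCTABL TEXT, DESCCOY TEXT, LONGDESC TEXT);
    CREATE TABLE ITEMPF (ITEMITEM TEXT, ITEMTABL TEXT, ITEMCOY TEXT, VALIDFLAG TEXT);
    INSERT INTO DESCPF VALUES ('A1', 'T1680', '1', 'First item');
    INSERT INTO ITEMPF VALUES ('A1', 'T1680', '1', '1'),
                              ('A1', 'T1680', '1', '1');
""")

# Inner join: one output row per matching ITEMPF row.
joined = conn.execute("""
    SELECT D.DESCITEM, D.LONGDESC
    FROM DESCPF D
    INNER JOIN ITEMPF I
       ON D.DESCITEM = I.ITEMITEM
      AND D.DESCCOY = I.ITEMCOY
    WHERE D.DESCTABL = 'T1680'
      AND I.ITEMTABL = 'T1680'
      AND I.VALIDFLAG = '1'
""").fetchall()

# EXISTS: the DESCPF row is returned once, however many ITEMPF rows match.
semi = conn.execute("""
    SELECT D.DESCITEM, D.LONGDESC
    FROM DESCPF D
    WHERE D.DESCTABL = 'T1680'
      AND EXISTS (SELECT NULL
                  FROM ITEMPF I
                  WHERE I.VALIDFLAG = '1'
                    AND I.ITEMTABL = 'T1680'
                    AND D.DESCITEM = I.ITEMITEM
                    AND D.DESCCOY = I.ITEMCOY)
""").fetchall()

print(len(joined), "row(s) from the join,", len(semi), "row(s) from EXISTS")
```

If ITEMPF is unique on the join columns, the two forms return the same rows and the choice comes down to which one the optimizer handles better.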