Group by clause not giving expected result


I have two tables
InvoiceHead(InvoiceID,RetailerID,ExecutiveID,InvoiceDate)
InvoiceItem(InvoiceID,ItemID,Qty,Contract,Amount).
I want to get the sales quantity grouped by ExecutiveID and ItemID.
I tried the following query, but it doesn't give the expected outcome: the grouping by both columns is not happening. As the following screenshot shows, rows for the same executive and item are not added together; they appear as two separate rows.
Screenshot of the result
SELECT
SecItem.ItemID
,SecHead.ExecutiveID
,sum(SecItem.QTY) AS Total_Qty
FROM Secondary_Sales.dbo.InvoiceHead AS SecHead
INNER JOIN Secondary_Sales.dbo.InvoiceItem AS SecItem
ON SecHead.InvoiceID = SecItem.Invoice_ID
GROUP BY
SecItem.ItemID
,SecHead.ExecutiveID
This query works in MySQL and gives the expected result, but the same query doesn't work in SQL Server.

The query as given should work; SQL Server does not implement GROUP BY differently from MySQL. I suspect this is a data issue: some of the IDs may have trailing spaces or other stray whitespace, so that values that look identical, such as "GALSR02" and "GALSR02 ", are not actually the same. Try this query to see if that is the case:
SELECT
RTRIM(SecItem.ItemID)
,RTRIM(SecHead.ExecutiveID)
,sum(SecItem.QTY) AS Total_Qty
FROM Secondary_Sales.dbo.InvoiceHead AS SecHead
INNER JOIN Secondary_Sales.dbo.InvoiceItem AS SecItem
ON RTRIM(SecHead.InvoiceID) = RTRIM(SecItem.Invoice_ID)
GROUP BY
RTRIM(SecItem.ItemID)
,RTRIM(SecHead.ExecutiveID)
If that works, run the following to correct the data, and investigate how the whitespace got there in the first place. Once the data is corrected you can revert to your original query.
UPDATE Secondary_Sales.dbo.InvoiceHead
SET ExecutiveID = RTRIM(ExecutiveID), InvoiceID = RTRIM(InvoiceID)
UPDATE Secondary_Sales.dbo.InvoiceItem
SET ItemID = RTRIM(ItemID), Invoice_ID = RTRIM(Invoice_ID)
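The diagnosis above can be tried out on a throwaway database. This is only a minimal sketch using Python's sqlite3 module with invented sample rows. SQLite compares strings byte by byte, so a trailing space is enough to split a group there; SQL Server pads trailing spaces before comparing, so on that engine the same symptom is more likely to come from leading spaces or other invisible characters. The table is a cut-down, hypothetical version of InvoiceItem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE InvoiceItem (InvoiceID TEXT, ItemID TEXT, Qty INTEGER)")
# Invented sample data: the second ItemID carries a trailing space.
conn.executemany(
    "INSERT INTO InvoiceItem VALUES (?, ?, ?)",
    [("INV1", "GALSR02", 5), ("INV2", "GALSR02 ", 7)],
)

# Grouping on the raw column keeps the two spellings apart (in SQLite).
raw_groups = conn.execute(
    "SELECT ItemID, SUM(Qty) FROM InvoiceItem GROUP BY ItemID ORDER BY ItemID"
).fetchall()

# Grouping on the trimmed value merges them.
trimmed_groups = conn.execute(
    "SELECT RTRIM(ItemID), SUM(Qty) FROM InvoiceItem GROUP BY RTRIM(ItemID)"
).fetchall()

# List the values that carry leading or trailing spaces, with their lengths.
padded = conn.execute(
    "SELECT ItemID, LENGTH(ItemID) FROM InvoiceItem WHERE ItemID <> TRIM(ItemID)"
).fetchall()

print(raw_groups)
print(trimmed_groups)
print(padded)
```

Listing the padded values first, before running any UPDATE, shows how many rows a cleanup would touch.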

Try the query below.
;with cte_1
as
(SELECT InvoiceID,ItemID,SUM(QTY) OVER (partition by ItemID order by ItemID)New_QTY
FROM InvoiceItem )
SELECT c.ItemID,h.ExecutiveID,sum(c.New_QTY) AS Total_Qty
FROM Secondary_Sales.dbo.InvoiceHead h
JOIN cte_1 c
on h.InvoiceID=c.InvoiceID
GROUP BY h.ExecutiveID,c.ItemID
If you want the sum based on ExecutiveID and ItemID, use the query below.
;with cte_1
as
(SELECT c.ItemID,h.ExecutiveID
,ROW_NUMBER() OVER(partition by h.ExecutiveID,c.ItemID order by h.ExecutiveID,c.ItemID) RNO
,SUM(QTY) OVER (partition by h.ExecutiveID,c.ItemID order by h.ExecutiveID,c.ItemID) AS Total_Qty
FROM Secondary_Sales.dbo.InvoiceHead h
JOIN Secondary_Sales.dbo.InvoiceItem c
on h.InvoiceID=c.InvoiceID)
SELECT ExecutiveID,ItemID,Total_Qty
FROM cte_1
WHERE RNO=1

Try the query below; I hope it helps.
SELECT SecItem.ItemID,SecHead.ExecutiveID,sum(SecItem.QTY) AS Total_Qty
FROM Secondary_Sales.dbo.InvoiceHead AS SecHead
INNER JOIN Secondary_Sales.dbo.InvoiceItem AS SecItem ON SecHead.InvoiceID = SecItem.Invoice_ID
GROUP BY SecItem.ItemID,SecHead.ExecutiveID,SecItem.QTY

Related

How to speed up a slow update query in SQL Server 2012

I have an update query that works fine, but it is far too slow and takes over 2 minutes to complete. Is there another way I can write this query to speed it up? Here is my code, thanks:
UPDATE #tmpIMDS
SET
ModelFileName = b.ModelFileName,
SendEMail = b.SendEMail
FROM
(
SELECT DISTINCT
IMDSConversionReportData.ModelNumber,
ModelFileName,
'Send Email' AS SendEmail
FROM
IMDSConversionReportData,
(
SELECT DISTINCT
ModelNumber,
Max(DateAdded) AS DateAdded
FROM
IMDSConversionReportData
GROUP BY
ModelNumber) a
WHERE
IMDSConversionReportData.ModelNumber = a.ModelNumber
AND IMDSConversionReportData.DateAdded = a.DateAdded
) b
WHERE ModelID = b.ModelNumber

Instead of hitting the IMDSConversionReportData table twice to get the maximum DateAdded per ModelNumber, you can generate a row number that identifies the row with the maximum DateAdded for each ModelNumber.
Also, DISTINCT is meaningless when you select a single non-aggregate column that is already in the GROUP BY, so remove it.
Try this:
;WITH cte
AS (SELECT *,
'Send Email' AS SendEmail,
Row_number()OVER(partition BY ModelNumber ORDER BY DateAdded DESC) AS rn
FROM IMDSConversionReportData)
UPDATE t
SET ModelFileName = c.ModelFileName,
SendEMail = c.SendEMail
FROM #tmpIMDS t
INNER JOIN cte c
ON t.ModelID = c.ModelNumber
Where Rn = 1
Note: Always use proper INNER JOIN syntax to join two tables instead of the old-style comma-separated join. The INNER JOIN syntax is more readable, and it leaves the WHERE clause for filters alone.
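The row-number idea can be checked in isolation before it is used in an UPDATE. The following is a minimal sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later) as a stand-in for SQL Server. The sample rows and file names are invented, and only the SELECT part of the technique is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IMDSConversionReportData "
    "(ModelNumber TEXT, ModelFileName TEXT, DateAdded TEXT)"
)
# Invented sample data: model M1 has an older and a newer row.
conn.executemany(
    "INSERT INTO IMDSConversionReportData VALUES (?, ?, ?)",
    [
        ("M1", "m1_old.xml", "2016-01-01"),
        ("M1", "m1_new.xml", "2016-03-01"),
        ("M2", "m2_only.xml", "2016-02-15"),
    ],
)

# One pass over the table: number the rows of each model from newest to
# oldest, then keep only the newest one (rn = 1).
latest = conn.execute(
    """
    WITH cte AS (
        SELECT ModelNumber,
               ModelFileName,
               ROW_NUMBER() OVER (PARTITION BY ModelNumber
                                  ORDER BY DateAdded DESC) AS rn
        FROM IMDSConversionReportData
    )
    SELECT ModelNumber, ModelFileName
    FROM cte
    WHERE rn = 1
    ORDER BY ModelNumber
    """
).fetchall()

print(latest)
```

Each ModelNumber appears exactly once in the result, which is what the join in the UPDATE relies on.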

INNER JOIN on a Sub Query

I have a list of tasks in a table called dbo.Task.
In the database, each Task can have one or more rows in the TaskLine table.
TaskLine has a TaskID column that relates the TaskLines to the Task.
A TaskLine has a column called TaskHeadingTypeID.
I need to return all the tasks, joined to the LAST TaskLine for that Task.
In plain English, I need to display each task with its latest TaskLine heading. So I basically need to join to the TaskLine table like this (which is incorrect and maybe inefficient, but hopefully shows what I am trying to do):
SELECT *
FROM #Results r
INNER JOIN (
SELECT TOP 1 TaskID, TaskHeadingTypeID FROM dbo.TaskLine
ORDER BY TaskLineID DESC
) tl
ON tl.TaskID = r.TaskID
However, the issue is that the subquery only brings back the last TaskLine row of the whole table, which is incorrect.
Edit:
At the moment, it's 'Working' like the code below, but it seems highly inefficient, because for each task row, it has to run two extra queries. And they're both on the same table, just slightly different columns in that table:
(An extract of the columns in the SELECT clause)
SELECT TaskStatusID,
TaskStatus,
(SELECT TOP 1 TaskHeadingTypeID FROM dbo.TaskLine
WHERE TaskID = r.TaskID
ORDER BY TaskLineID DESC) AS TaskHeadingID,
(SELECT TOP 1 LongName FROM dbo.TaskLine tl
INNER JOIN ref.TaskHeadingType tht
ON tht.TaskHeadingTypeID = tl.TaskHeadingTypeID
WHERE TaskID = r.TaskID
ORDER BY TaskLineID DESC) AS TaskHeading,
PersonInCareID,
ICMSPartyID,
CarerID.... FROM...
EDIT:
Thanks to the ideas and comments below, I have ended up with this, using CTE:
;WITH ValidTaskLines (RowNumber, TaskID, TaskHeadingTypeID, TaskHeadingType)
AS
(SELECT
ROW_NUMBER()OVER(PARTITION BY tl.TaskID, tl.TaskHeadingTypeID ORDER BY tl.TaskLineID) AS RowNumber,
tl.TaskID,
tl.TaskHeadingTypeID,
LongName AS TaskHeadingType
FROM dbo.TaskLine tl
INNER JOIN ref.TaskHeadingType tht
ON tht.TaskHeadingTypeID = tl.TaskHeadingTypeID
)
SELECT AssignedByBusinessUserID,
BusinessUserID,
LoginName,
Comments,
r.CreateDate,
r.CreateUser,
r.Deleted,
r.Version,
IcmsBusinessUserID,
r.LastUpdateDate,
r.LastUpdateUser,
OverrrideApprovalBusinessUserID,
PlacementID,
r.TaskID,
TaskPriorityTypeID,
TaskPriorityCode,
TaskPriorityType,
TaskStatusID,
TaskStatus,
vtl.TaskHeadingTypeID AS TaskHeadingID,
vtl.TaskHeadingType AS TaskHeading,
PersonInCareID,
ICMSPartyID,
CarerID,
ICMSCarerEntityID,
StartDate,
EndDate
FROM #Results r
INNER JOIN ValidTaskLines vtl
ON vtl.TaskID = r.TaskID
AND vtl.RowNumber = 1

You could use the ROW_NUMBER() function for this:
SELECT *
FROM #Results r
INNER JOIN (SELECT TaskID
, TaskHeadingTypeID
, ROW_NUMBER()OVER(PARTITION BY TaskID, TaskHeadingTypeID ORDER BY TAskLineID DESC) RN
FROM dbo.TaskLine
) tl
ON tl.TaskID = r.TaskID
AND tl.RN = 1
The ROW_NUMBER() function assigns a number to each row. PARTITION BY is optional, but is used to start the numbering over for each value in that group, i.e. if you PARTITION BY Some_Date, then the numbering starts over at 1 for each unique date value. ORDER BY defines the order in which rows are numbered, and is required in the ROW_NUMBER() function.
You may need to adjust the PARTITION BY to suit your query. Run the subquery by itself to get an idea of how ROW_NUMBER() works.
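To see how the choice of PARTITION BY changes which rows get number 1, the subquery can be run on a few rows. This is a small sketch using Python's sqlite3 module (SQLite 3.25 or later) with invented TaskLine rows. The helper first_rows is hypothetical and exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TaskLine (TaskLineID INTEGER, TaskID INTEGER, TaskHeadingTypeID INTEGER)"
)
# Invented sample data: task 10 has three lines with two heading types.
conn.executemany(
    "INSERT INTO TaskLine VALUES (?, ?, ?)",
    [(1, 10, 1), (2, 10, 2), (3, 10, 1), (4, 20, 3)],
)


def first_rows(partition_by):
    """Return the rows numbered 1 for the given PARTITION BY column list.

    partition_by is a hard-coded, trusted string here, so plain string
    formatting is acceptable for this illustration.
    """
    sql = f"""
        SELECT TaskID, TaskHeadingTypeID, TaskLineID
        FROM (
            SELECT TaskID, TaskHeadingTypeID, TaskLineID,
                   ROW_NUMBER() OVER (PARTITION BY {partition_by}
                                      ORDER BY TaskLineID DESC) AS RN
            FROM TaskLine
        ) AS tl
        WHERE RN = 1
        ORDER BY TaskID, TaskHeadingTypeID
    """
    return conn.execute(sql).fetchall()


# One row per task: the latest TaskLine of each task.
per_task = first_rows("TaskID")
# One row per task and heading type: task 10 now appears twice.
per_task_and_heading = first_rows("TaskID, TaskHeadingTypeID")

print(per_task)
print(per_task_and_heading)
```

With this data, partitioning by TaskID alone yields a single latest line per task, while adding TaskHeadingTypeID to the partition yields one row per heading type, so a join on RN = 1 could return a task more than once.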

SQL Group By Clause and Empty Entries

I have a SQL Server 2005 query that I'm trying to assemble right now but I am having some difficulties.
I have a group by clause based on 5 columns: Project, Area, Name, User, Engineer.
Engineer comes from another table and is a one-to-many relationship:
WITH TempCTE
AS (
SELECT htce.HardwareProjectID AS ProjectId
,area.AreaId AS Area
,hs.NAME AS 'Status'
,COUNT(*) AS Amount
,MAX(htce.DateEdited) AS DateModified
,UserEditing AS LastModifiedName
,Engineer
,ROW_NUMBER() OVER (
PARTITION BY htce.HardwareProjectID
,area.AreaId
,hs.NAME
,htce.UserEditing ORDER BY htce.HardwareProjectID
,Engineer DESC
) AS row
FROM HardwareTestCase_Execution AS htce
INNER JOIN HardwareTestCase AS htc ON htce.HardwareTestCaseID = htc.HardwareTestCaseID
INNER JOIN HardwareTestGroup AS htg ON htc.HardwareTestGroupID = htg.HardwareTestGroupId
INNER JOIN Block AS b ON b.BlockId = htg.BlockId
INNER JOIN Area ON b.AreaId = Area.AreaId
INNER JOIN HardwareStatus AS hs ON htce.HardwareStatusID = hs.HardwareStatusId
INNER JOIN j_Project_Testcase AS jptc ON htce.HardwareProjectID = jptc.HardwareProjectId AND htce.HardwareTestCaseID = jptc.TestcaseId
WHERE (htce.DateEdited > @LastDateModified)
GROUP BY htce.HardwareProjectID
,area.AreaId
,hs.NAME
,htce.UserEditing
,jptc.Engineer
)
The gist of what I want is to be able to deal with empty Engineer columns. I don't want this column to have a blank second entry (where row=2).
What I want to do:
Group the items with "row" value of 1 & 2 together.
Select the Engineer that isn't empty.
Do not deselect engineers where there is not a matching row=2.
I've tried a series of joins to try and make things work. No luck so far.

Use j_Project_Testcase PIVOT (MAX(Engineer) FOR Row IN ([1], [2])), then select ISNULL([1], [2]) to get the Engineer value.
I can give you a more robust example if you set up a SQL Fiddle.
Try reading this: PIVOT and UNPIVOT
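The same goal, folding the blank Engineer row into the row that has a name, can also be sketched with plain aggregation instead of PIVOT. This is a swapped-in technique, not the PIVOT approach itself. The example uses Python's sqlite3 module with a hypothetical, already-counted table named Counts and invented rows, and it assumes the amounts of the merged rows should be summed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the per-engineer output of the CTE.
conn.execute(
    "CREATE TABLE Counts (ProjectId INTEGER, Area TEXT, Status TEXT, "
    "UserEditing TEXT, Engineer TEXT, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO Counts VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "A", "Pass", "bob", "Alice", 3),
        (1, "A", "Pass", "bob", "", 2),      # blank engineer, same group
        (1, "B", "Fail", "bob", "", 4),      # blank engineer, no named partner
    ],
)

# NULLIF turns the empty string into NULL, and MAX ignores NULLs, so each
# group keeps its non-empty engineer if there is one.
merged = conn.execute(
    """
    SELECT ProjectId, Area, Status, UserEditing,
           MAX(NULLIF(Engineer, '')) AS Engineer,
           SUM(Amount) AS Amount
    FROM Counts
    GROUP BY ProjectId, Area, Status, UserEditing
    ORDER BY Area
    """
).fetchall()

print(merged)
```

Groups that only have a blank engineer are kept, with a NULL Engineer, rather than dropped.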

How to return the last record in a table with multiple states

The following SQL query will return all my programs that are in development or completed mode. The goal here is to get the latest state of all programs.
I use the following query to return all my program states
SELECT PK_ProgramState, FK_Program, State
FROM ProgramStates
I get the following results:
As seen by the yellow highlight in the colored rectangles of this image, I want those "FK_Program" records to be returned. The others who come before the last highlighted record state are not needed.
I can't seem to figure out how to do it ... All the queries I've been trying give me bogus results. All help is appreciated.
Thanks in advance.

SELECT s1.PK_ProgramState, s1.FK_Program, s1.State
FROM ProgramStates s1
inner join
(
SELECT max(PK_ProgramState) as mstate, FK_Program
FROM ProgramStates
group by FK_Program
) s2 on s2.mstate = s1.PK_ProgramState and s2.FK_Program = s1.FK_Program
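The join above, against the highest PK_ProgramState per program, can be tried on a few rows. This is a minimal sketch using Python's sqlite3 module. The sample states are invented, and it assumes that a higher PK_ProgramState always means a later state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProgramStates (PK_ProgramState INTEGER, FK_Program INTEGER, State TEXT)"
)
# Invented sample data: several states per program, in insertion order.
conn.executemany(
    "INSERT INTO ProgramStates VALUES (?, ?, ?)",
    [
        (1, 100, "Development"),
        (2, 100, "Completed"),
        (3, 200, "Development"),
        (4, 300, "Planning"),
        (5, 300, "Development"),
    ],
)

# The derived table finds the last key per program; joining back on that
# key returns the full row holding each program's latest state.
latest_states = conn.execute(
    """
    SELECT s1.PK_ProgramState, s1.FK_Program, s1.State
    FROM ProgramStates s1
    INNER JOIN (
        SELECT MAX(PK_ProgramState) AS mstate, FK_Program
        FROM ProgramStates
        GROUP BY FK_Program
    ) s2 ON s2.mstate = s1.PK_ProgramState
        AND s2.FK_Program = s1.FK_Program
    ORDER BY s1.FK_Program
    """
).fetchall()

print(latest_states)
```

A WHERE clause on s1.State can then restrict the result to particular states such as 'Development' or 'Completed'.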

Here is one way:
select fk_program
from ProgramStates ps
group by fk_program
having substring_index(group_concat(State order by PK_ProgramState desc), ',', 1
) in ('Development', 'Completed');
This finds the last state using group_concat() and then compares it to the states that you want to look for.
You could also write the having clause as:
having group_concat(State order by PK_ProgramState desc) like 'Completed%' or
group_concat(State order by PK_ProgramState desc) like 'Development%'
The intention might be clearer in this form.

select p.fk, (select ps.state from ProgramStates ps
where ps.FK_Program = p.fk
order by ps.PK_ProgramState desc limit 1)
from (select distinct q.FK_Program as fk
from ProgramStates q) as p
http://sqlfiddle.com/#!2/422d92/19

Try this:
SELECT DISTINCT FK_Program,
(SELECT TOP(1) State FROM ProgramStates P1
WHERE P1.FK_Program = ProgramStates.FK_Program
ORDER BY PK_ProgramState DESC) as State
FROM ProgramStates

select
ps.PK_ProgramState,
ps.FK_Program,
ps.state
from
ProgramStates ps
inner join
(select max(PK_ProgramState)PK_ProgramState, FK_Program from ProgramStates group by FK_Program) stg
on stg.FK_Program=ps.FK_Program and stg.PK_ProgramState=ps.PK_ProgramState

MS-Access -> SELECT AS + ORDER BY = error

I'm trying to make a query to retrieve the region which got the most sales for sweet products. 'grupo_produto' is the product type, and 'regiao' is the region. So I got this query:
SELECT TOP 1 r.nm_regiao, (SELECT COUNT(*)
FROM Dw_Empresa
WHERE grupo_produto='1' AND
cod_regiao = d.cod_regiao) as total
FROM Dw_Empresa d
INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao ORDER BY total DESC
Then when i run the query, MS-Access asks for the "total" parameter. Why it doesn't consider the newly created 'column' I made in the select clause?
Thanks in advance!

Old question, I know, but it may help someone to know that while you can't order by aliases, you can order by column index. For example, this will work without error:
SELECT
firstColumn,
IIF(secondColumn = '', thirdColumn, secondColumn) As yourAlias
FROM
yourTable
ORDER BY
2 ASC
The results are then ordered by the values in the second column, which is the alias "yourAlias".
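Ordering by column position can be tried on any engine that accepts it. The sketch below uses Python's sqlite3 module rather than Access, so it only illustrates the ordinal form, not Access's handling of aliases. A CASE expression stands in for IIF, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (firstColumn INTEGER, secondColumn TEXT, thirdColumn TEXT)"
)
# Invented sample data: row 1 has an empty secondColumn and falls back
# to thirdColumn.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, "", "c"), (2, "a", "z"), (3, "b", "y")],
)

# ORDER BY 2 sorts by the second item of the select list, i.e. the
# computed column, without naming its alias.
ordered = conn.execute(
    """
    SELECT firstColumn,
           CASE WHEN secondColumn = '' THEN thirdColumn
                ELSE secondColumn END AS yourAlias
    FROM yourTable
    ORDER BY 2 ASC
    """
).fetchall()

print(ordered)
```

The drawback of the ordinal form is that reordering the select list silently changes the sort.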

Aliases are only usable in the query output. You can't use them in other parts of the query. Unfortunately, you'll have to copy and paste the entire subquery to make it work.

You can do it like this
select * from(
select a + b as c, * from table)
order by c
Access has some differences compared to SQL Server.

Why it doesn't consider the newly created 'column' I made in the select clause?
Because Access (ACE/Jet) is not compliant with the SQL-92 Standard.
Consider this example, which is valid SQL-92:
SELECT a AS x, c - b AS y
FROM MyTable
ORDER
BY x, y;
In fact, x and y are the only valid elements in the ORDER BY clause because all others are out of scope (ordinal numbers of columns in the SELECT clause are valid, though their use is deprecated).
However, Access chokes on the above syntax. The equivalent Access syntax is this:
SELECT a AS x, c - b AS y
FROM MyTable
ORDER
BY a, c - b;
However, I understand from @Remou's comments that a subquery in the ORDER BY clause is invalid in Access.

Try using a subquery and order the results in an outer query.
SELECT TOP 1 * FROM
(
SELECT
r.nm_regiao,
(SELECT COUNT(*)
FROM Dw_Empresa
WHERE grupo_produto='1' AND cod_regiao = d.cod_regiao) as total
FROM Dw_Empresa d
INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
) T1
ORDER BY total DESC
(Not tested.)

How about:
SELECT TOP 1 r.nm_regiao
FROM (SELECT Dw_Empresa.cod_regiao,
Count(Dw_Empresa.cod_regiao) AS CountOfcod_regiao
FROM Dw_Empresa
WHERE Dw_Empresa.[grupo_produto]='1'
GROUP BY Dw_Empresa.cod_regiao
ORDER BY Count(Dw_Empresa.cod_regiao) DESC) d
INNER JOIN tb_regiao AS r
ON d.cod_regiao = r.cod_regiao

I suggest using an intermediate query.
SELECT r.nm_regiao, d.grupo_produto, COUNT(*) AS total
FROM Dw_Empresa d INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
GROUP BY r.nm_regiao, d.grupo_produto;
If you call that GroupTotalsByRegion, you can then do:
SELECT TOP 1 nm_regiao, total FROM GroupTotalsByRegion
WHERE grupo_produto = '1' ORDER BY total DESC
You may think it's extra work to create the intermediate query (and, in a sense, it is), but you will also find that many of your other queries will be based off of GroupTotalsByRegion. You want to avoid repeating that logic in many other queries. By keeping it in one view, you provide a simplified route to answering many other questions.
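The intermediate-query idea can be sketched with a view. This example uses Python's sqlite3 module as a stand-in for Access, so a view plays the role of the saved query and LIMIT 1 replaces TOP 1. The region names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tb_regiao (cod_regiao INTEGER, nm_regiao TEXT);
    CREATE TABLE Dw_Empresa (cod_regiao INTEGER, grupo_produto TEXT);

    INSERT INTO tb_regiao VALUES (1, 'Norte'), (2, 'Sul');
    INSERT INTO Dw_Empresa VALUES
        (1, '1'), (2, '1'), (2, '1'),
        (2, '2'), (1, '2'), (1, '2');

    -- The reusable intermediate query: totals per region and product group.
    CREATE VIEW GroupTotalsByRegion AS
    SELECT r.nm_regiao, d.grupo_produto, COUNT(*) AS total
    FROM Dw_Empresa d
    INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
    GROUP BY r.nm_regiao, d.grupo_produto;
    """
)


def top_region(grupo_produto):
    """Return the (region, total) with the most sales for one product group."""
    return conn.execute(
        "SELECT nm_regiao, total FROM GroupTotalsByRegion "
        "WHERE grupo_produto = ? ORDER BY total DESC LIMIT 1",
        (grupo_produto,),
    ).fetchone()


top_sweets = top_region("1")
top_other = top_region("2")

print(top_sweets)
print(top_other)
```

The second call reuses the same view for a different product group without repeating the grouping logic.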

How about use:
WITH xx AS
(
SELECT TOP 1 r.nm_regiao, (SELECT COUNT(*)
FROM Dw_Empresa
WHERE grupo_produto='1' AND
cod_regiao = d.cod_regiao) as total
FROM Dw_Empresa d
INNER JOIN tb_regiao r ON r.cod_regiao = d.cod_regiao
) SELECT * FROM xx ORDER BY total
