SQL Group By Clause and Empty Entries - sql

I have a SQL Server 2005 query that I'm trying to assemble right now but I am having some difficulties.
I have a group by clause based on 5 columns: Project, Area, Name, User, Engineer.
Engineer is coming from another table and is a one to many relationship
WITH TempCTE
AS (
SELECT htce.HardwareProjectID AS ProjectId
,area.AreaId AS Area
,hs.NAME AS 'Status'
,COUNT(*) AS Amount
,MAX(htce.DateEdited) AS DateModified
,UserEditing AS LastModifiedName
,Engineer
,ROW_NUMBER() OVER (
PARTITION BY htce.HardwareProjectID
,area.AreaId
,hs.NAME
,htce.UserEditing ORDER BY htce.HardwareProjectID
,Engineer DESC
) AS row
FROM HardwareTestCase_Execution AS htce
INNER JOIN HardwareTestCase AS htc ON htce.HardwareTestCaseID = htc.HardwareTestCaseID
INNER JOIN HardwareTestGroup AS htg ON htc.HardwareTestGroupID = htg.HardwareTestGroupId
INNER JOIN Block AS b ON b.BlockId = htg.BlockId
INNER JOIN Area ON b.AreaId = Area.AreaId
INNER JOIN HardwareStatus AS hs ON htce.HardwareStatusID = hs.HardwareStatusId
INNER JOIN j_Project_Testcase AS jptc ON htce.HardwareProjectID = jptc.HardwareProjectId AND htce.HardwareTestCaseID = jptc.TestcaseId
WHERE (htce.DateEdited > #LastDateModified)
GROUP BY htce.HardwareProjectID
,area.AreaId
,hs.NAME
,htce.UserEditing
,jptc.Engineer
)
The gist of what I want is to be able to deal with empty Engineer columns. I don't want this column to have a blank second entry (where row=2).
What I want to do:
Group the items with "row" value of 1 & 2 together.
Select the Engineer that isn't empty.
Do not deselect engineers where there is not a matching row=2.
I've tried a series of joins to try and make things work. No luck so far.

Use j_Project_Testcase PIVOT( MAX(Engineer) for Row in ( [1], [2] ) then select ISNULL( [1],[2]) to select the Engineer value
I can give you a more robust example if you set up a SQL fiddle
Try reading this: PIVOT and UNPIVOT

Related

SQL - Pivoting where the rows of one table are the column titles of another

I'm looking to create a table (shown below) more in the form of a matrix, where the column titles are variables.
#ACTable AS ACT
#WCTable AS WCT
The temp table, derived from #ProductionTable AS PT
The output I'm looking for looks like this. Essentially I want the ACT.AC running as column titles, the WCT.WC running down, and counting how many were ActFin on Nov 6th. The colour shows the matching associations. I'll coalesce the rest after, not too concerned about NULLs or 0s.
The query so far (it fails at the FOR statement)
SELECT * FROM
(
SELECT
PT.ParentPart,
ACT.AC,
WCT.WC,
PT.ActFin
FROM #ProductionTable AS PT
INNER JOIN #WCTable AS WCT ON WCT.WC = PT.WC
INNER JOIN #ACTable AS ACT ON PT.AC = ACT.AC
) t
PIVOT(
COUNT(CASE
WHEN
PT.ActFin > '2019-11-06' --count
THEN
1
END)
FOR ACT.AC IN ( --this is where things fall apart
'54',
'53',
'52')
)
Is this possible?
The columns in the FOR clause need to be wrapped in []:
SELECT * FROM
(
SELECT
PT.ParentPart,
ACT.AC,
WCT.WC,
PT.ActFin
FROM #ProductionTable AS PT
INNER JOIN #WCTable AS WCT ON WCT.WC = PT.WC
INNER JOIN #ACTable AS ACT ON PT.AC = ACT.AC
) t
PIVOT(
COUNT(CASE
WHEN
PT.ActFin > '2019-11-06' --count
THEN
1
END)
FOR ACT.AC IN ( --this is where things fall apart
[54],
[53],
[52])
)

SELECT and JOIN column not in Group by function

I have to join two different table to get my result.
The table 'Resource' it is simple, while the table 'Dimension.[Code]' contains, among the others, a column with different values (i.e :
Code
SILO
GRADE
OTHER 1
OTHER2
This is the reason why a join twice that column to get two different columns called GRADE and SILO.
Now, I have a query that selects the maximum value of a grade within the group as follows:
`SELECT
R.[ID] -- If I inserted that here, it is not working obviously.
-- This cannot But this is the additional column I need (see later)
DD_SILO.[Value] DIR ,
max(R.[GRADE]) GRADE_DIR
FROM [Resource] R
LEFT JOIN
Dimension DD_SILO ON R.[ID] = DD_SILO.[ID] AND DD_SILO.[Code] = 'SILO'
group by DD_SILO.[Value]'
What I need is basically to have, beside GRADE AND SILO, also the ID name, which is contained into the [Resource] table.
Please notice that [Resource].ID = [Dimension].ID
I would have solved the problem with ROW_NUMBER () to select the highest within the group, avoiding then then 'group by', but as the query has to be inserted in a bigger one, that would take too much time to run. I am using Microsoft SQL Server 2016.
Could you use a virtual table something like: -
`
select
a.max_grade_silo,
a.max_grade_value,
(select max(r.id)
from [resource] r,
[dimension] d
where r.[ID] = d.[ID] and
d.[CODE]= 'SILO' and
r.[GRADE] = a.[max_grade_value]
),
max_grade_silo a
from
(SELECT
DD_SILO.[Value] DIR ,
max(R.[GRADE]) GRADE_DIR
FROM [Resource] R
LEFT JOIN
Dimension DD_SILO ON R.[ID] = DD_SILO.[ID] AND DD_SILO.[Code] = 'SILO'
group by DD_SILO.[Value]
) temp_result (max_grade_silo, max_grade_value)
'
Probably better to look at normalizing the tables?
SELECT
MAX(R.[ID]) as ID ,
DD_SILO.[Value] DIR ,
max(R.[GRADE]) GRADE_DIR
FROM [Resource] R
LEFT JOIN
Dimension DD_SILO ON R.[ID] = DD_SILO.[ID] AND DD_SILO.[Code] = 'SILO'
group by DD_SILO.[Value]

How can I join on multiple columns within the same table that contain the same type of info?

I am currently joining two tables based on Claim_Number and Customer_Number.
SELECT
A.*,
B.*,
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbp.Compound_Info AS B ON A.Claim_Number = B.Claim_Number AND A.Customer_Number = B.Customer_Number
WHERE A.Filled_YearMonth = '201312' AND A.Compound_Ind = 'Y'
This returns exactly the data I'm looking for. The problem is that I now need to join to another table to get information based on a Product_ID. This would be easy if there was only one Product_ID in the Compound_Info table for each record. However, there are 10. So basically I need to SELECT 10 additional columns for Product_Name based on each of those Product_ID's that are being selected already. How can do that? This is what I was thinking in my head, but is not working right.
SELECT
A.*,
B.*,
PD_Info_1.Product_Name,
PD_Info_2.Product_Name,
....etc {Up to 10 Product Names}
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B ON A.Claim_Number = B.Claim_Number AND A.Customer_Number = B.Customer_Number
LEFT JOIN Company.dbo.Product_Info AS PD_Info_1 ON B.Product_ID_1 = PD_Info_1.Product_ID
LEFT JOIN Company.dbo.Product_Info AS PD_Info_2 ON B.Product_ID_2 = PD_Info_2.Product_ID
.... {Up to 10 LEFT JOIN's}
WHERE A.Filled_YearMonth = '201312' AND A.Compound_Ind = 'Y'
This query not only doesn't return the correct results, it also takes forever to run. My actual SQL is a lot longer and I've changed table names, etc but I hope that you can get the idea. If it matters, I will be creating a view based on this query.
Please advise on how to select multiple columns from the same table correctly and efficiently. Thanks!
I found put my extra stuff into CTE and add ROW_NUMBER to insure that I get only 1 row that I care about. it would look something like this. I only did for first 2 product info.
WITH PD_Info
AS ( SELECT Product_ID
,Product_Name
,Effective_Date
,ROW_NUMBER() OVER ( PARTITION BY Product_ID, Product_Name ORDER BY Effective_Date DESC ) AS RowNum
FROM Company.dbo.Product_Info)
SELECT A.*
,B.*
,PD_Info_1.Product_Name
,PD_Info_2.Product_Name
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B
ON A.Claim_Number = B.Claim_Number
AND A.Customer_Number = B.Customer_Number
LEFT JOIN PD_Info AS PD_Info_1
ON B.Product_ID_1 = PD_Info_1.Product_ID
AND B.Fill_Date >= PD_Info_1.Effective_Date
AND PD_Info_2.RowNum = 1
LEFT JOIN PD_Info AS PD_Info_2
ON B.Product_ID_2 = PD_Info_2.Product_ID
AND B.Fill_Date >= PD_Info_2.Effective_Date
AND PD_Info_2.RowNum = 1

Providing Language FallBack In A SQL Select Statement

I have a table that represents an Object. It has many columns but also fields that require language support.
For simplicity let's say I have 3 tables:
MainObjectTable
LanguageDependantField1
LanguageDependantField2.
MainObjectTable has a PK int called ID, and both LanguageDependantTables have a foreign key link back to the MainObjectTable along with a language code and the date they were added.
I've created a stored procedure that accepts the MainObjectTable ID and a Language. It will return a single row containing the most recent items from the language tables. The select statement looks like
SELECT
MainObjectTable.VariousColumns,
LanguageDependantField1.Description,
LanguageDependantField2.SomeOtherText
FROM
MainObjectTable
OUTER APPLY
(SELECT TOP 1 LanguageDependantField1.Description
FROM LanguageDependantField1
WHERE LanguageDependantField1.MainObjectTable_ID = MainObjectTable.ID
AND LanguageDependantField1.Language_ID = #language
ORDER BY
LanguageDependantField1.[Default], LanguageDependantField1.CreatedDate DESC) LanguageDependantField1
OUTER APPLY
(SELECT TOP 1 LanguageDependantField2.SomeOtherText
FROM LanguageDependantField2
WHERE LanguageDependantField2.MainObjectTable_ID = MainObjectTable.ID
AND LanguageDependantField2.Language_ID = #language
ORDER BY
LanguageDependantField2.[Default] DESC, LanguageDependantField2.CreatedDate DESC) LanguageDependantField2
WHERE
MainObjectTable.ID = #MainObjectTableID
What I want to add is the ability to fallback to a default language if a row isn't found in the specified language. Let's say we use "German" as the selected language. Is it possible to return an English row from LanguageDependantField1 if the German does not exist presuming we have #fallbackLanguageID
Also am I right to use OUTER APPLY in this scenario or should I be using JOIN?
Many thanks for your help.
Try this:
SELECT MainObjectTable.VariousColumns,
COALESCE(PrefLang.Description,Fallback.Description,'Not Found Desc')
as Description,
COALESCE(PrefLang.SomeOtherText,FallBack.SomeOtherText,'Not found')
as SomeOtherText
FROM MainObjectTable
LEFT JOIN
(SELECT TOP 1 pl.Description,pl.SomeOtherText
FROM LanguageDependantField1 pl
WHERE pl.MainObjectTable_ID = MainObjectTable.ID
AND pl.Language_ID = #language
ORDER BY
pl.[Default], pl.CreatedDate DESC)
PrefLang ON 1=1
LEFT JOIN
(SELECT TOP 1 fb.Description,fb.SomeOtherText
FROM LanguageDependantField1 fb
WHERE fb.MainObjectTable_ID = MainObjectTable.ID
AND fb.Language_ID = #fallbackLanguageID
ORDER BY
fb.[Default], fb.CreatedDate DESC)
Fallback ON 1=1
WHERE
MainObjectTable.ID = #MainObjectTableID
Basically, make two queries, one to the preferred language and one to English (Default). Use the LEFT JOIN, so if the first one isn't found, the second query is used...
I don't have your actual tables, so there might be a syntax error in above, but hope it gives you the concept you want to try...
Yes, the use of Outer Apply is correct if you want to correlate the MainObjectTable table rows to the inner queries. You cannot use Joins with references in the derived table to the outer table. If you wanted to use Joins, you would need to include the joining column(s) and in this case pre-filter the results. Here is what that might look like:
With RankedLanguages As
(
Select LDF1.MainObjectTable_ID, LDF1.Language_ID, LDF1.Description, LDF1.SomeOtherText, ...
, Row_Number() Over ( Partition By LDF1.MainObjectTable_ID, LDF1.Language_ID
Order By LDF1.[Default] Desc, LDF1.CreatedDate Desc ) As Rnk
From LanguageDependantField1 As LDF1
Where LDF1.Language_ID In( #languageId, #defaultLanguageId )
)
Select M.VariousColumns
, Coalesce( SpecificLDF.Description, DefaultLDF.Description ) As Description
, Coalesce( SpecificLDF.SomeOtherText, DefaultLDF.SomeOtherText ) As SomeOtherText
, ...
From MainObjectTable As M
Left Join RankedLanguages As SpecificLDF
On SpecificLDF.MainObjectTable_ID = M.ID
And SpecifcLDF.Language_ID = #languageId
And SpecifcLDF.Rnk = 1
Left Join RankedLanguages As DefaultLDF
On DefaultLDF.MainObjectTable_ID = M.ID
And DefaultLDF.Language_ID = #defaultLanguageId
And DefaultLDF.Rnk = 1
Where M.ID = #MainObjectTableID

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
this will select the latest "EsnId" from "EsnsSalesOrderItems" based on the column creation_date. As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specifiy an order or the rows. A table as such is not ordered, nor is the result of a query unless you specify an order by
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...