I have two tables, Dim (DimEmp) and User. The Dim table has an EmpId column that holds incremental values (1, 2, ...) but is not an identity column. The User table joins to the Dim table on SalesKey (DimEmp.SalesKey = User.UserId).
Three rows that exist in the User table are missing from Dim. I want to insert the missing rows into the Dim table, but the catch is that the EmpId column needs to get the next incremental values for the new rows.
The queries I have tried so far are below. They give me the results separately, but I am not able to merge them into a single query. Maybe a nested subquery would help, but I'm not sure how.
Create table DimEmp
(
EmpId bigint not null,
SalesKey varchar(10),
EmpName varchar(100)
CONSTRAINT PK_DimEmp_EmpId PRIMARY KEY (EmpId)
)
GO
INSERT INTO DimEmp (EmpId,SalesKey,EmpName)
VALUES (1,'001A','John'), (2,'002B','Stephen')
GO
Create table [User]
(
UserId varchar(10),
EmpName varchar(100)
CONSTRAINT PK_User_UserId PRIMARY KEY (UserId)
)
GO
INSERT INTO [User] (UserId,EmpName)
VALUES ('001A','John'), ('002B','Stephen'),
('003C','Bruce'), ('004D','Clark'),('005E','Mitchel')
GO
SELECT u.UserId,u.EmpName
FROM [User] u
LEFT JOIN DimEmp d
ON d.SalesKey=u.UserId
WHERE d.SalesKey IS NULL -- prints missing 3 records of Dim
GO
SELECT 1 + EmpId + 1 AS NewincrEmpId,
( SELECT MAX(EmpId) FROM DimEmp
) AS MaxEmpid
FROM DimEmp -- inner query gives max empid and outer query increments value for each row
GO
Expected Output in Dim table after inserting 3 new records using INSERT INTO SELECT (subquery) statement
You will need to wrap the whole thing in a transaction.
Grab max(EmpId) using a serializable table hint, to make sure no other process adds or modifies EmpId in the meantime.
Use row_number() to generate the new unique ids.
Query:
begin tran
declare @maxid bigint
set @maxid =
(
select max(EmpId) from DimEmp with(serializable)
)
insert into DimEmp
(
EmpId,
SalesKey,
EmpName
)
select
isnull(@maxid, 0) +
row_number() over (order by u.UserId),
u.UserId,
u.EmpName
from
[User] as u
left join
DimEmp as d on
d.SalesKey = u.UserId
where
d.SalesKey is null
commit tran
Sounds like you've got a serious design issue there.
For a quick fix you can use row_number() and add it to the maximum ID.
INSERT INTO [dimemp]
([empid],
[saleskey],
[empname])
SELECT (SELECT coalesce(max(de1.[empid]), 0)
FROM [dimemp] de1) + row_number() OVER (ORDER BY u1.[userid]),
u1.[userid],
u1.[empname]
FROM [user] u1
WHERE NOT EXISTS (SELECT *
FROM [dimemp] de2
WHERE de2.[saleskey] = u1.[userid]);
Try this: another option is to use OUTER APPLY to get the max EmpId and ROW_NUMBER to number the new rows, as below:
SELECT ISNULL(tt.NewincrEmpId, 0)+ROW_NUMBER() OVER(ORDER BY u.UserId ASC) AS NewincrEmpId,
u.UserId,
u.EmpName
FROM [User] u
LEFT JOIN DimEmp d ON d.SalesKey=u.UserId
OUTER APPLY(SELECT MAX(de.EmpId) AS NewincrEmpId
FROM DimEmp de) tt
WHERE d.SalesKey IS NULL
OUTPUT:
NewincrEmpId UserId EmpName
3 003C Bruce
4 004D Clark
5 005E Mitchel
For a given "CustomerId" I need to get 4 related values from a column ("CompanySales") in another table.
I have joined the two tables and, with the query below, managed to get 2 "CompanySales" values from the column in the other table.
How do I extend this to get 4 values? (I need CompanySales for "WeekNumber" = 1, 2, 3 and 4.)
This is the SQL query I use to get "CompanySales" for "WeekNumber" = 1 and 2:
Declare @TempTable1 table
(
CustomerID INT,
CustomerName Varchar (50),
CompanySales DEC (8,2),
WeekNumber INT
)
-- the column list must match the SELECT list in count and order
INSERT INTO @TempTable1 (CustomerID, CustomerName, WeekNumber, CompanySales)
SELECT Customer.CustomerID, Customer.CustomerName, Company.WeekNumber, Company.Sales
FROM Customer INNER JOIN
Company ON Customer.CustomerID = Company.CustomerID;
With tblDifference as
(
Select Row_Number() OVER (Order by WeekNumber) as RowNumber,CustomerID,CustomerName, CompanySales, WeekNumber from @TempTable1
)
Select Top (50) Cur.CustomerID, Cur.CustomerName,
Cur.WeekNumber as CurrentWeek, Prv.WeekNumber as PreviousWeek,
Cur.CompanySales as CurrentSales, Prv.CompanySales as PreviousSales,
CAST(((Cur.CompanySales-Prv.CompanySales)/Prv.CompanySales)*100 As Decimal(8,2)) as PercentChange
from tblDifference Cur Left Outer Join tblDifference Prv
On Cur.CustomerID=Prv.CustomerID
Where Cur.WeekNumber = 1 AND Prv.WeekNumber = 2
Order BY PercentChange ASC
How about using BETWEEN in the WHERE clause, e.g. cur.WeekNumber BETWEEN 1 AND 4?
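One way the BETWEEN filter could be combined with the join to return all four weekly values on a single row per customer is conditional aggregation. This is only a sketch: it assumes the Customer and Company tables from the question, at most one Company row per customer per week, and the output column names SalesWeek1 to SalesWeek4 are made up for illustration.

```sql
-- Sketch only: assumes one Company row per customer per week.
-- SalesWeek1..SalesWeek4 are illustrative column names.
SELECT
    cu.CustomerID,
    cu.CustomerName,
    MAX(CASE WHEN co.WeekNumber = 1 THEN co.Sales END) AS SalesWeek1,
    MAX(CASE WHEN co.WeekNumber = 2 THEN co.Sales END) AS SalesWeek2,
    MAX(CASE WHEN co.WeekNumber = 3 THEN co.Sales END) AS SalesWeek3,
    MAX(CASE WHEN co.WeekNumber = 4 THEN co.Sales END) AS SalesWeek4
FROM Customer AS cu
INNER JOIN Company AS co
    ON co.CustomerID = cu.CustomerID
WHERE co.WeekNumber BETWEEN 1 AND 4   -- keep only the four weeks of interest
GROUP BY cu.CustomerID, cu.CustomerName;
```

A week with no row for a customer comes back as NULL in the corresponding column, so any percent-change calculation built on top of this would need to handle NULL and zero values.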
At my organization, clients can be enrolled in multiple programs at one time. I have a table listing every program a client has been enrolled in, one row per enrollment, along with the dates they were enrolled in that program.
Using an outer join I can take any client name and a date from a table (say, a table of tests that the clients have completed) and have it return all of the programs that client was in on that particular date. If a client was in multiple programs on that date, it duplicates the data from that table for each program they were in on that date.
The problem is that I want it to return only one program as the "Primary Program" for each client and date, even if they were in multiple programs on that date. I have created a hierarchy for which program should be selected as the primary program and returned.
For Example:
1.)Inpatient
2.)Outpatient Clinical
3.)Outpatient Vocational
4.)Outpatient Recreational
So if a client was enrolled in Outpatient Clinical, Outpatient Vocational and Outpatient Recreational at the same time on that date, it would return only "Outpatient Clinical" as the program.
My way of thinking for doing this would be to join to the table with the previous programs multiple times like this:
FROM dbo.TestTable as TestTable
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms1
ON TestTable.date = PreviousPrograms1.date AND PreviousPrograms1.type = 'Inpatient'
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms2
ON TestTable.date = PreviousPrograms2.date AND PreviousPrograms2.type = 'Outpatient Clinical'
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms3
ON TestTable.date = PreviousPrograms3.date AND PreviousPrograms3.type = 'Outpatient Vocational'
LEFT OUTER JOIN dbo.PreviousPrograms as PreviousPrograms4
ON TestTable.date = PreviousPrograms4.date AND PreviousPrograms4.type = 'Outpatient Recreational'
and then use a conditional CASE WHEN in the SELECT statement, like this:
SELECT
CASE
WHEN PreviousPrograms1.name IS NOT NULL
THEN PreviousPrograms1.name
WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NOT NULL
THEN PreviousPrograms2.name
WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NULL AND PreviousPrograms3.name IS NOT NULL
THEN PreviousPrograms3.name
WHEN PreviousPrograms1.name IS NULL AND PreviousPrograms2.name IS NULL AND PreviousPrograms3.name IS NULL AND PreviousPrograms4.name IS NOT NULL
THEN PreviousPrograms4.name
ELSE NULL
END as PrimaryProgram
The bigger problem is that in my actual table there are a lot more than just four possible programs it could be and the CASE WHEN select statement and the JOINs are already cumbersome enough.
Is there a more efficient way to do either the SELECT part or the JOIN part? Or possibly a better way to do it altogether?
I'm using SQL Server 2008.
You can simplify (replace) your CASE by using COALESCE() instead:
SELECT
COALESCE(PreviousPrograms1.name, PreviousPrograms2.name,
PreviousPrograms3.name, PreviousPrograms4.name) AS PreviousProgram
COALESCE() returns the first non-null value.
Due to your design, you still need the JOINs, but it would be much easier to read if you used very short aliases, for example PP1 instead of PreviousPrograms1 - it's just a lot less code noise.
You can simplify the Join by using a bridge table containing all the program types and their priority (my sql server syntax is a bit rusty):
create table BridgeTable (
programType varchar(30),
programPriority smallint
);
This table will hold all the program types and the program priority will reflect the priority you've specified in your question.
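As a rough illustration of how such a bridge table might be filled and used, here is a minimal sketch for SQL Server 2008 or later. The table variables @BridgeTable and @PreviousPrograms, the sample rows and the priority numbers are stand-ins invented for this example, and it ranks with ROW_NUMBER() rather than string manipulation.

```sql
-- Sketch only: table variables and sample data are illustrative stand-ins.
DECLARE @BridgeTable TABLE (programType varchar(30), programPriority smallint);
INSERT INTO @BridgeTable (programType, programPriority)
VALUES ('Inpatient', 10),
       ('Outpatient Clinical', 20),
       ('Outpatient Vocational', 30),
       ('Outpatient Recreational', 40);  -- any ascending numbers work

DECLARE @PreviousPrograms TABLE ([date] datetime, [type] varchar(30));
INSERT INTO @PreviousPrograms ([date], [type])
VALUES ('2013-08-08', 'Inpatient'),
       ('2013-08-07', 'Outpatient Recreational'),
       ('2013-08-07', 'Outpatient Clinical'),
       ('2013-08-07', 'Outpatient Vocational');

-- Rank each date's programs by bridge-table priority and keep the top one.
SELECT ranked.[date], ranked.[type] AS PrimaryProgram
FROM (
    SELECT pp.[date],
           pp.[type],
           ROW_NUMBER() OVER (PARTITION BY pp.[date]
                              ORDER BY b.programPriority) AS rn
    FROM @PreviousPrograms AS pp
    INNER JOIN @BridgeTable AS b
        ON b.programType = pp.[type]
) AS ranked
WHERE ranked.rn = 1;
```

With this sample data, 2013-08-08 should come back as Inpatient and 2013-08-07 as Outpatient Clinical. Adding a new program type then only requires a new row in the bridge table, not another join.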
As for the part of the case, that will depend on the number of records involved. One of the tricks that I usually do is this (assuming programPriority is a number between 10 and 99 and no type can have more than 30 bytes, because I'm being lazy):
Select TestTable.patient, TestTable.date,
substring( min(cast(BridgeTable.programPriority as varchar(2)) + PreviousPrograms.type), 3, 30) as PrimaryProgram
From dbo.TestTable as TestTable
Cross Join dbo.BridgeTable as BridgeTable
Left Outer Join dbo.PreviousPrograms as PreviousPrograms
on PreviousPrograms.type = BridgeTable.programType
and TestTable.date = PreviousPrograms.date
Group by TestTable.patient, TestTable.date
You can achieve this using sub-queries, or you could refactor it to use CTEs. Take a look at the following and see if it makes sense:
DECLARE @testTable TABLE
(
[id] INT IDENTITY(1, 1),
[date] datetime
)
DECLARE @previousPrograms TABLE
(
[id] INT IDENTITY(1,1),
[date] datetime,
[type] varchar(50)
)
INSERT INTO @testTable ([date])
SELECT '2013-08-08'
UNION ALL SELECT '2013-08-07'
UNION ALL SELECT '2013-08-06'
INSERT INTO @previousPrograms ([date], [type])
-- a sample user as an inpatient
SELECT '2013-08-08', 'Inpatient'
-- your use case of someone being enrolled in all 3 outpatient programs
UNION ALL SELECT '2013-08-07', 'Outpatient Recreational'
UNION ALL SELECT '2013-08-07', 'Outpatient Clinical'
UNION ALL SELECT '2013-08-07', 'Outpatient Vocational'
-- showing our workings, this is what we'll join to
SELECT
PPP.[date],
PPP.[type],
ROW_NUMBER() OVER (PARTITION BY PPP.[date] ORDER BY PPP.[Priority]) AS [RowNumber]
FROM (
SELECT
[type],
[date],
CASE
WHEN [type] = 'Inpatient' THEN 1
WHEN [type] = 'Outpatient Clinical' THEN 2
WHEN [type] = 'Outpatient Vocational' THEN 3
WHEN [type] = 'Outpatient Recreational' THEN 4
ELSE 999
END AS [Priority]
FROM @previousPrograms
) PPP -- Previous Programs w/ Priority
SELECT
T.[date],
PPPO.[type]
FROM @testTable T
LEFT JOIN (
SELECT
PPP.[date],
PPP.[type],
ROW_NUMBER() OVER (PARTITION BY PPP.[date] ORDER BY PPP.[Priority]) AS [RowNumber]
FROM (
SELECT
[type],
[date],
CASE
WHEN [type] = 'Inpatient' THEN 1
WHEN [type] = 'Outpatient Clinical' THEN 2
WHEN [type] = 'Outpatient Vocational' THEN 3
WHEN [type] = 'Outpatient Recreational' THEN 4
ELSE 999
END AS [Priority]
FROM @previousPrograms
) PPP -- Previous Programs w/ Priority
) PPPO -- Previous Programs w/ Priority + Order
ON T.[date] = PPPO.[date] AND PPPO.[RowNumber] = 1
Basically we have our deepest sub-select giving all PreviousPrograms a priority based on type, then our wrapping sub-select gives them row numbers per date so we can select only the ones with a row number of 1.
I am guessing you would need to include a UR Number or some other patient identifier. To do that, add it as an output column of both sub-selects, include it in the PARTITION BY, and add it to the join condition.
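A sketch of what that change might look like, assuming a hypothetical [clientId] column exists on both tables (the column name and the sample rows are invented for illustration):

```sql
-- Sketch only: [clientId] is a hypothetical patient identifier.
DECLARE @testTable TABLE ([clientId] INT, [date] datetime);
DECLARE @previousPrograms TABLE ([clientId] INT, [date] datetime, [type] varchar(50));

INSERT INTO @testTable ([clientId], [date])
VALUES (1, '2013-08-07'), (2, '2013-08-07');

INSERT INTO @previousPrograms ([clientId], [date], [type])
VALUES (1, '2013-08-07', 'Outpatient Recreational'),
       (1, '2013-08-07', 'Outpatient Clinical'),
       (2, '2013-08-07', 'Inpatient'),
       (2, '2013-08-07', 'Outpatient Vocational');

SELECT
    T.[clientId],
    T.[date],
    PPPO.[type] AS PrimaryProgram
FROM @testTable T
LEFT JOIN (
    SELECT
        PPP.[clientId],
        PPP.[date],
        PPP.[type],
        -- number the programs per client and date, best priority first
        ROW_NUMBER() OVER (PARTITION BY PPP.[clientId], PPP.[date]
                           ORDER BY PPP.[Priority]) AS [RowNumber]
    FROM (
        SELECT
            [clientId],
            [type],
            [date],
            CASE
                WHEN [type] = 'Inpatient' THEN 1
                WHEN [type] = 'Outpatient Clinical' THEN 2
                WHEN [type] = 'Outpatient Vocational' THEN 3
                WHEN [type] = 'Outpatient Recreational' THEN 4
                ELSE 999
            END AS [Priority]
        FROM @previousPrograms
    ) PPP -- Previous Programs w/ Priority
) PPPO -- Previous Programs w/ Priority + Order
    ON T.[clientId] = PPPO.[clientId]
   AND T.[date] = PPPO.[date]
   AND PPPO.[RowNumber] = 1;
```

With these sample rows, client 1 should get Outpatient Clinical and client 2 should get Inpatient for the same date. Without the identifier in the PARTITION BY, the two clients would compete for a single row number 1 per date.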
I'm getting the wrong result from my report. Maybe I'm missing something simple.
The report is an inline table-valued function that should count goods movement in our shop and how often these spare parts are claimed (replaced in a repair).
The problem: different spare parts in the shop table (let's call it SP) can be linked to the same spare part in the "repair table" (TSP). I need the goods movement of every spare part in SP and the claim count of every distinct spare part in TSP.
This is a very simplified excerpt of the relevant part:
create table #tsp(id int, name varchar(20),claimed int);
create table #sp(id int, name varchar(20),fiTsp int,ordered int);
insert into #tsp values(1,'1235-6044',300);
insert into #tsp values(2,'1234-5678',400);
insert into #sp values(1,'1235-6044',1,30);
insert into #sp values(2,'1235-6044',1,40);
insert into #sp values(3,'1235-6044',1,50);
insert into #sp values(4,'1234-5678',2,60);
WITH cte AS(
select tsp.id As TspID,tsp.name as TspName,tsp.claimed As Claimed
,sp.id As SpID,sp.name As SpName,sp.ordered As Ordered
from #sp sp inner join #tsp tsp
on sp.fiTsp=tsp.id
)
SELECT TspName, SUM(Claimed) As Claimed, Sum(Ordered) As Ordered
FROM cte
Group By TspName
drop table #tsp;
drop table #sp;
Result:
TspName Claimed Ordered
1234-5678 400 60
1235-6044 900 120
The Ordered count is correct, but the Claimed count should be 300 instead of 900 for TspName='1235-6044'.
I need to group by Tsp.ID for the claim count and group by Sp.ID for the order count. But how do I do that in one query?
Edit: The actual TVF looks like this (note that getOrdered and getClaimed are SVFs and that I'm grouping in the outer select on TSP's Category):
CREATE FUNCTION [Gambio].[rptReusedStatistics](
@fromDate datetime
,@toDate datetime
,@fromInvoiceDate datetime
,@toInvoiceDate datetime
,@idClaimStatus varchar(50)
,@idSparePartCategories varchar(1000)
,@idSpareParts varchar(1000)
)
RETURNS TABLE AS
RETURN(
WITH ExclusionCat AS(
SELECT idSparePartCategory AS ID From tabSparePartCategory
WHERE idSparePartCategory IN(- 3, - 1, 6, 172,168)
), Report AS(
SELECT Cat.SparePartCategoryName AS Category
,TSP.SparePartDescription AS Part
,TSP.SparePartName AS PartNumber
,SP.Inventory
,Gambio.getGoodsIn(SP.idSparePart,@FromDate,@ToDate) GoodsIn
,Gambio.getOrdered(SP.idSparePart,@FromDate,@ToDate) Ordered
--,CASE WHEN TSP.idSparePart IS NULL THEN 0 ELSE
-- Gambio.getClaimed(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus,NULL)END AS Claimed
,CASE WHEN TSP.idSparePart IS NULL THEN 0 ELSE
Gambio.getClaimed(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus,1)END AS ClaimedReused
,CASE WHEN TSP.idSparePart IS NULL THEN 0 ELSE
Gambio.getCostSaving(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus)END AS Costsaving
FROM Gambio.SparePart AS SP
INNER JOIN tabSparePart AS TSP ON SP.fiTabSparePart = TSP.idSparePart
INNER JOIN tabSparePartCategory AS Cat
ON Cat.idSparePartCategory=TSP.fiSparePartCategory
WHERE Cat.idSparePartCategory NOT IN(SELECT ID FROM ExclusionCat)
AND (@idSparePartCategories IS NULL
OR TSP.fiSparePartCategory IN(
SELECT Item From dbo.Split(@idSparePartCategories,',')
)
)
AND (@idSpareParts IS NULL
OR TSP.idSparePart IN(
SELECT Item From dbo.Split(@idSpareParts,',')
)
)
)
SELECT Category
--, Part
--, PartNumber
, SUM(Inventory)As InventoryCount
, SUM(GoodsIn) As GoodsIn
, SUM(Ordered) As Ordered
--, SUM(Claimed) As Claimed
, SUM(ClaimedReused)AS ClaimedReused
, SUM(Costsaving) As Costsaving
, Count(*) AS PartCount
FROM Report
GROUP BY Category
)
Solution:
Thanks to Aliostad, I've solved it by grouping first and then joining (actual TVF, reduced to a minimum):
WITH Report AS(
SELECT Cat.SparePartCategoryName AS Category
,TSP.SparePartDescription AS Part
,TSP.SparePartName AS PartNumber
,SP.Inventory
,SP.GoodsIn
,SP.Ordered
,Gambio.getClaimed(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus,1) AS ClaimedReused
,Gambio.getCostSaving(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus) AS Costsaving
FROM (
SELECT GSP.fiTabSparePart
,SUM(GSP.Inventory)AS Inventory
,SUM(Gambio.getGoodsIn(GSP.idSparePart,@FromDate,@ToDate))AS GoodsIn
,SUM(Gambio.getOrdered(GSP.idSparePart,@FromDate,@ToDate))AS Ordered
FROM Gambio.SparePart GSP
GROUP BY GSP.fiTabSparePart
)As SP
INNER JOIN tabSparePart TSP ON SP.fiTabSparePart = TSP.idSparePart
INNER JOIN tabSparePartCategory AS Cat
ON Cat.idSparePartCategory=TSP.fiSparePartCategory
)
SELECT Category
, SUM(Inventory)As InventoryCount
, SUM(GoodsIn) As GoodsIn
, SUM(Ordered) As Ordered
, SUM(ClaimedReused)AS ClaimedReused
, SUM(Costsaving) As Costsaving
, Count(*) AS PartCount
FROM Report
GROUP BY Category
You are JOINing first and then GROUPing by. You need to reverse it, GROUP BY first and then JOIN.
So here in my subquery, I group by first and then join:
select
claimed,
ordered
from
#tsp
inner JOIN
(select
fitsp,
SUM(ordered) as ordered
from
#sp
group by
fitsp) as SUMS
on
SUMS.fiTsp = id;
I think you just need to select Claimed and add it to the Group By in order to get what you are looking for.
WITH cte AS(
select tsp.id As TspID,tsp.name as TspName,tsp.claimed As Claimed
,sp.id As SpID,sp.name As SpName,sp.ordered As Ordered
from #sp sp inner join #tsp tsp
on sp.fiTsp=tsp.id )
SELECT TspName, Claimed, Sum(Ordered) As Ordered
FROM cte
Group By TspName, Claimed
Your cte is an inner join between tsp and sp, which means that the data you're querying looks like this:
SpID Ordered TspID TspName Claimed
1 30 1 1235-6044 300
2 40 1 1235-6044 300
3 50 1 1235-6044 300
4 60 2 1234-5678 400
Notice how TspID, TspName and Claimed all get repeated. Grouping by TspName means that the data gets grouped in two groups, one for 1235-6044 and one for 1234-5678. The first group has 3 rows on which to run the aggregate functions, the second group only one. That's why your sum(Claimed) will get you 300*3=900.
As Aliostad suggested, you should first group by TspID and do the sum of Ordered and then join to tsp.
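Applied to the simplified temp tables from the question, grouping first and joining afterwards might look like the sketch below. The derived-table alias sums is just an illustrative name.

```sql
-- Sketch only: rebuilds the question's sample data, then aggregates #sp
-- per fiTsp before joining, so each #tsp row is matched exactly once.
CREATE TABLE #tsp (id int, name varchar(20), claimed int);
CREATE TABLE #sp (id int, name varchar(20), fiTsp int, ordered int);

INSERT INTO #tsp VALUES (1, '1235-6044', 300);
INSERT INTO #tsp VALUES (2, '1234-5678', 400);
INSERT INTO #sp VALUES (1, '1235-6044', 1, 30);
INSERT INTO #sp VALUES (2, '1235-6044', 1, 40);
INSERT INTO #sp VALUES (3, '1235-6044', 1, 50);
INSERT INTO #sp VALUES (4, '1234-5678', 2, 60);

SELECT tsp.name    AS TspName,
       tsp.claimed AS Claimed,
       sums.ordered AS Ordered
FROM #tsp AS tsp
INNER JOIN (
    -- one row per TSP part, so the join cannot repeat Claimed
    SELECT fiTsp, SUM(ordered) AS ordered
    FROM #sp
    GROUP BY fiTsp
) AS sums
    ON sums.fiTsp = tsp.id;

DROP TABLE #tsp;
DROP TABLE #sp;
```

With the sample data this should return Claimed = 300 and Ordered = 120 for 1235-6044, and Claimed = 400 and Ordered = 60 for 1234-5678 (row order is not guaranteed without an ORDER BY).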
No need to join, just subselect:
create table #tsp(id int, name varchar(20),claimed int);
create table #sp(id int, name varchar(20),fiTsp int,ordered int);
insert into #tsp values(1,'1235-6044',300);
insert into #tsp values(2,'1234-5678',400);
insert into #sp values(1,'1235-6044',1,30);
insert into #sp values(2,'1235-6044',1,40);
insert into #sp values(3,'1235-6044',1,50);
insert into #sp values(4,'1234-5678',2,60);
WITH cte AS(
select tsp.id As TspID,tsp.name as TspName,tsp.claimed As Claimed
,sp.id As SpID,sp.name As SpName,sp.ordered As Ordered
from #sp sp inner join #tsp tsp
on sp.fiTsp=tsp.id
)
SELECT id, name, SUM(claimed) as Claimed, (SELECT SUM(ordered) FROM #sp WHERE #sp.fiTsp = #tsp.id GROUP BY #sp.fiTsp) AS Ordered
FROM #tsp
GROUP BY id, name
drop table #tsp;
drop table #sp;
Produces:
id name Claimed Ordered
1 1235-6044 300 120
2 1234-5678 400 60
-- EDIT --
Based on the additional info, this is how I might try to split the CTE to form the data as per the example. I fully admit that Aliostad's approach may yield a cleaner query but here's an attempt (completely blind) using the subselect:
CREATE FUNCTION [Gambio].[rptReusedStatistics](
@fromDate datetime
,@toDate datetime
,@fromInvoiceDate datetime
,@toInvoiceDate datetime
,@idClaimStatus varchar(50)
,@idSparePartCategories varchar(1000)
,@idSpareParts varchar(1000)
)
RETURNS TABLE AS
RETURN(
WITH ExclusionCat AS (
SELECT idSparePartCategory AS ID From tabSparePartCategory
WHERE idSparePartCategory IN(- 3, - 1, 6, 172,168)
), ReportSP AS (
SELECT fiTabSparePart
,Inventory
,Gambio.getGoodsIn(idSparePart,@FromDate,@ToDate) GoodsIn
,Gambio.getOrdered(idSparePart,@FromDate,@ToDate) Ordered
FROM Gambio.SparePart
), ReportTSP AS (
SELECT TSP.idSparePart
,Cat.SparePartCategoryName AS Category
,TSP.SparePartDescription AS Part
,TSP.SparePartName AS PartNumber
,CASE WHEN TSP.idSparePart IS NULL THEN 0 ELSE
Gambio.getClaimed(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus,1)END AS ClaimedReused
,CASE WHEN TSP.idSparePart IS NULL THEN 0 ELSE
Gambio.getCostSaving(TSP.idSparePart,@FromInvoiceDate,@ToInvoiceDate,@idClaimStatus)END AS Costsaving
FROM tabSparePart AS TSP
INNER JOIN tabSparePartCategory AS Cat
ON Cat.idSparePartCategory=TSP.fiSparePartCategory
WHERE Cat.idSparePartCategory NOT IN(SELECT ID FROM ExclusionCat)
AND (@idSparePartCategories IS NULL
OR TSP.fiSparePartCategory IN(
SELECT Item From dbo.Split(@idSparePartCategories,',')
)
)
AND (@idSpareParts IS NULL
OR TSP.idSparePart IN(
SELECT Item From dbo.Split(@idSpareParts,',')
)
)
)
SELECT Category
--, Part
--, PartNumber
, (SELECT SUM(Inventory) FROM ReportSP WHERE ReportSP.fiTabSparePart = idSparePart GROUP BY fiTabSparePart) AS Inventory
, (SELECT SUM(GoodsIn) FROM ReportSP WHERE ReportSP.fiTabSparePart = idSparePart GROUP BY fiTabSparePart) AS GoodsIn
, (SELECT SUM(Ordered) FROM ReportSP WHERE ReportSP.fiTabSparePart = idSparePart GROUP BY fiTabSparePart) AS Ordered
, Claimed
, ClaimedReused
, Costsaving
, Count(*) AS PartCount
FROM ReportTSP
GROUP BY Category
)
Without a better understanding of the whole schema it's difficult to cover all the eventualities. Whether this works or not (I suspect PartCount will be 1 for all instances), hopefully it'll give you some fresh thoughts for alternate approaches.
SELECT
tsp.name
,max(tsp.claimed) as claimed
,sum(sp.ordered) as ordered
from #sp sp
inner join #tsp tsp
on sp.fiTsp=tsp.id
GROUP BY tsp.name