optimize the query which results the duplicates taken from two diff data tables - sql

select
A.SlNo
,A.[Company Code]
,A.[Document Number]
,A.[Document Type]
,CONVERT(varchar(10),A.[Document Date],101) AS [Document Date]
,A.[Reference]
,CAST(Null as nvarchar(5)) As NumericInvoice
,CONVERT(varchar(10),A.[Posting Date],101) AS [Posting Date]
,A.[Clearing Document]
,A.[Amount in doc# curr#]`enter code here`
,A.[Document currency]
,A.[Account]
,A.[Account Description]
,A.[User Name]
,A.dumptype
,CAST(Null As Varchar(5)) As Similarity
,CASE
When (A.[Reference] = B.[Reference] and A.[Amount in doc# curr#]= B.[Amount in doc# curr#]) then 'DUPEPAYA'
END As [Status]
,CAST(Null As Varchar(5)) As [DupeSets]
from MasterData A
Join InputData B on (A.[Reference] = B.[Reference] and A.[Amount in doc# curr#]= B.[Amount in doc# curr#])
Union All
select distinct
A.SlNo
,A.[Company Code]
,A.[Document Number]
,A.[Document Type]
,CONVERT(varchar(10),A.[Document Date],101) AS [Document Date]
,A.[Reference]
,CAST(Null as nvarchar(5)) As NumericInvoice
,CONVERT(varchar(10),A.[Posting Date],101) AS [Posting Date]
,A.[Clearing Document]
,A.[Amount in doc# curr#]
,A.[Document currency]
,A.[Account]
,A.[Account Description]
,A.[User Name]
,A.dumptype
,CAST(Null As Varchar(5)) As Similarity
,CASE
When (A.[Reference] = B.[Reference] and A.[Amount in doc# curr#]= B.[Amount in doc# curr#]) then 'DUPEPAYA'
END As [Status]
,DENSE_Rank () Over (order by A.[Reference], A.[Account Description], A.dumptype asc) As [DupeSets]
from InputData A
Join MasterData B on (A.[Reference] = B.[Reference] and A.[Amount in doc# curr#]= B.[Amount in doc# curr#])
order by A.[Reference], A.[Account Description], A.dumptype asc
I want to optimize the above query which results in the duplicates from masterdata table to that of the input data table on few validations, these data tables deal with a huge amount of data hence taking a long time to execute the query. please help
Note: I'm a beginner in SQL.

Related

Failed to convert NVARCHAR to INT, but I'm not trying to

I get this error:
Msg 245, Level 16, State 1, Line 5
Conversion failed when converting the nvarchar value 'L NOVAK ENTERPRISES, INC' to data type int.
I've been wrestling with this query for quite a while and just can't figure out in what place that the conversion is being attempted. Using SQL Server 2017.
DECLARE #StartDate AS DateTime
DECLARE #MfgGroupCode AS Varchar(20)
SET #StartDate='2/27/2020'
SET #MfgGroupCode = 'VOLVO_NLA'
SELECT DISTCINT
CT.No_ AS [Contact Number],
CT.Name AS [Contact Name],
UAL.Time AS [Search Date],
UAL.Param1 AS [Search Part],
CT.[E-Mail] AS [Contact Email],
CT.[Phone No_] AS [Contact Phone],
CT.[Company Name] AS [Search By Customer],
(SELECT C.Name
FROM dbo.[Customer] C
WHERE C.No_ = SL.[Sell-to Customer No_]
AND C.Name <> '') AS [Sold To Customer],
SL.[Posting Date] AS [Invoice Date],
SL.[Document No_] AS [Invoice],
SL.Quantity AS [Quantity],
SL.[Unit Price] AS [Unit Price],
SL.Amount AS [Amount],
DATEDIFF(DAY, UAL.Time, SL.[Posting Date]) AS [Interval]
FROM
dbo.[User Action Log] UAL
JOIN
dbo.[User Action Types] UAT ON UAL.[User Action ID] = UAT.ID
JOIN
dbo.[Item] I ON UAL.Param1 = I.[OEM Part Number]
JOIN
dbo.[Contact] CT ON UAL.[Contact No_] = CT.No_
LEFT OUTER JOIN
dbo.[Sales Invoice Line] SL ON UAL.Param1 = SL.[OEM Part Number]
AND SL.[Posting Date] >= #StartDate
WHERE
UAT.Name IN ('SinglePartSearch', 'MultiPartSearch')
AND UAL.[MFG Group Code] = #MfgGroupCode
AND UAL.Time >= #StartDate
AND UAL.Param3 > 0
-- AND DATEDIFF(DAY, UAL.Time, SL.[Posting Date]) < 0 -- Uncomment to see Current Searches with Past Orders
-- AND DATEDIFF(DAY, UAL.Time, SL.[Posting Date]) > -1 -- Uncomment to see Searches resulting in Future Order
AND DATEDIFF(DAY, UAL.Time, SL.[Posting Date]) IS NULL -- Uncomment to See Searches with no Order
ORDER BY
Interval DESC
Thanks to all of your help and questioning, I was able to identify that the culprit is UAL.Param3.
The User Action Log table stores a variety of different "Actions" and the parameters that are affiliated with each type of action (param1, param2, param3). For one action "RequestForAccess", the "L NOVAK" value is perfectly acceptable. For this query, we're looking at the "SinglePartSearch" and "MultiPartSearch" actions, which will only contain numeric values in UAL.Param3
I replaced this line (AND UAL.Param3 > 0) with (AND ISNUMERIC(UAL.Param3) = 1 AND UAL.Param3 <> 0) in the Where clause and it is now returning the results I hoped for.
Let me know if there is a more correct way of doing this and thank you to all who contributed!

Performance issue with SQL Query

I am pulling data from two tables-Forecast and Orders to compute sales forecast accuracy.
Steps I am taking:
Identifying all exhaustive combinations of product-region-demand month b/w both data sets...let's call this (1)
Identifying different forecast snapshots in forecast data... let's call this (2)
Performing a cross-join of (1) and (2)...let's call this (3)
Performing the "SUMIF()" equivalent on the lines from (3) for both orders and forecast. For example, if I am comparing forecast to actual orders for February,
Jan "INDPOR" Forecast---> For some Product/Region-Feb Delivery Combination: February Forecast (generated in January) vs. Orders booked after Jan 1st with a delivery schedule in Feb
Feb "INDPOR" Forecast---> For the same Product/Region-Feb Delivery Date Combination: February Forecast (generated in February) vs. Orders booked after Jan 27th* with a delivery schedule in Feb
Note 1: Generating multiple forecasts for the same month
Note 2: Fiscal calendar definitions used; that is why Feb starts on Jan 27th
Output is generating correctly. But, it is painfully slow (1 hour +). Please help me fine-tune this and make it faster as I will need to use this for larger data sets too.
Other Details:
I am running this on SQL Server 2014 locally from my desktop.Uploading this using the SQL data import wizard into SQL from an Excel file currently
Input Forecast data: ForecastAggTable
Input Orders data: OrderAggTable
Input and Output Files
Code:
Select *
from
(
Select *,
(Select isnull(sum([Forecast Qty]),0) from ForecastAggTable t2 where t2.LOB=D.LOB and
t2.[Demand Month]=D.[Demand Month] and t2.Class=D.Class
and t2.[Item Type]=D.[Item Type] and t2.[LoB Region]=D.[LoB Region] and
t2.[Key Account]=D.[Key Account] and t2.Country=D.Country
and t2.[Master Customer]=D.[Master Customer] and t2.[INDPOR Version]=D.[INDPOR Version])[Forecast Qty],
(
Select isnull(sum([Order Qty]),0) from OrderAggTable t1 where t1.LOB=D.LOB and
t1.[SAD Month]=D.[Demand Month] and t1.Class=D.Class
and t1.[Item Type]=D.[Item Type] and t1.[LoB Region]=D.[LoB Region] and
t1.[Key Account]=D.[Key Account] and t1.Country=D.Country
and t1.[Master Customer]=D.[Master Customer] and t1.[Book Date]>=D.[INDPOR Timestamp]
)[SAD-OrderQty],
(
Select isnull(sum([Order Revenue]),0) from OrderAggTable t1 where t1.LOB=D.LOB and
t1.[SAD Month]=D.[Demand Month] and t1.Class=D.Class
and t1.[Item Type]=D.[Item Type] and t1.[LoB Region]=D.[LoB Region] and
t1.[Key Account]=D.[Key Account] and t1.Country=D.Country
and t1.[Master Customer]=D.[Master Customer] and t1.[Book Date]>=D.[INDPOR Timestamp]
)[SAD-OrderRevenue],
(
Select isnull(sum([Order Qty]),0) from OrderAggTable t1 where t1.LOB=D.LOB and
t1.[RDD Month]=D.[Demand Month] and t1.Class=D.Class
and t1.[Item Type]=D.[Item Type] and t1.[LoB Region]=D.[LoB Region] and
t1.[Key Account]=D.[Key Account] and t1.Country=D.Country
and t1.[Master Customer]=D.[Master Customer] and t1.[Book Date]>=D.[INDPOR Timestamp]
)[RDD-OrderQty],
(
Select isnull(sum([Order Revenue]),0) from OrderAggTable t1 where t1.LOB=D.LOB and
t1.[RDD Month]=D.[Demand Month] and t1.Class=D.Class
and t1.[Item Type]=D.[Item Type] and t1.[LoB Region]=D.[LoB Region] and
t1.[Key Account]=D.[Key Account] and t1.Country=D.Country
and t1.[Master Customer]=D.[Master Customer] and t1.[Book Date]>=D.[INDPOR Timestamp]
)[RDD-OrderRevenue]
from
(
Select distinct LOB,[INDPOR Version],[INDPOR Timestamp],[Demand Month],
[Demand Quarter],[Min Date],Class,[Item Type],[Offer PF],
[LoB Region],[Key Account],Country,[Master Customer]
from
(
Select V.LOB,V.[SAD Month][Demand Month],V.[SAD Quarter][Demand Quarter],V.[SAD Min Date][Min Date],V.Class,
[Item Type],[Offer PF],[LoB Region],[Key Account],Country,[Master Customer]
from OrderAggTable V
union
(
Select Z.LOB,Z.[RDD Month][Demand Month],Z.[RDD Quarter][Demand Quarter],Z.[RDD Min Date][Min Date],Z.Class,
[Item Type],[Offer PF],[LoB Region],[Key Account],Country,[Master Customer]
from OrderAggTable Z
)
union
(
Select LOB,[Demand Month],[Demand Quarter],[Min Date],Class[Class],[Item Type],[Offer PF],[LoB Region],
[Key Account],Country,[Master Customer] from ForecastAggTable
)
)A
cross join
(
select distinct [INDPOR Version],[INDPOR Timestamp]
from ForecastAggTable
)B
)D
where [Min Date]>=[INDPOR Timestamp]
)E
where ([SAD-OrderQty] + [RDD-OrderQty] + [Forecast Qty]<>0)
How about simplifying, and doing less passes of your tables.
In this example I do two scans of the forecast table, one for the distinct, and one for the union, and one scan of the orders table.
with cte as
(
select distinct [INDPOR Version],[INDPOR Timestamp]
from ForecastAggTable
)
,cte2 as
(
Select
V.LOB
,iif(DUP=0,V.[SAD Month] ,V.[RDD Month] ) [Demand Month]
,iif(DUP=0,V.[SAD Quarter] ,V.[RDD Quarter] ) [Demand Quarter]
,iif(DUP=0,V.[SAD Min Date] ,V.[RDD Min Date] ) [Min Date]
,V.[Book Date]
,V.Class
,V.[Item Type]
,V.[Offer PF]
,V.[LoB Region]
,V.[Key Account]
,V.Country
,V.[Master Customer]
,null [INDPOR Version]
,null [Forecast Qty]
,iif(DUP=0,v.[Order Qty] ,null ) [SAD-OrderQty]
,iif(DUP=0,V.[Order Revenue] ,null ) [SAD-OrderRevenue]
,iif(DUP=1,V.[Order Qty] ,null ) [RDD-OrderQty]
,iif(DUP=1,V.[Order Revenue] ,null ) [RDD-OrderRevenue]
from OrderAggTable V
cross join (select dup from (values (0),(1))a(dup)) a
union all
Select
LOB
,[Demand Month]
,[Demand Quarter]
,[Min Date]
,[Min Date]
,Class
,[Item Type]
,[Offer PF]
,[LoB Region]
,[Key Account]
,Country
,[Master Customer]
,[INDPOR Version]
,[Forecast Qty]
,null[SAD-OrderQty]
,null[SAD-OrderRevenue]
,null[RDD-OrderQty]
,null[RDD-OrderRevenue]
from ForecastAggTable
)
select
cte2.LOB
,cte.[INDPOR Version]
,cte.[INDPOR Timestamp]
,cte2.[Demand Month]
,cte2.[Demand Quarter]
,cte2.[Min Date]
,cte2.Class
,cte2.[Item Type]
,cte2.[Offer PF]
,cte2.[LoB Region]
,cte2.[Key Account]
,cte2.Country
,cte2.[Master Customer]
,isnull(sum(cte2.[Forecast Qty] ),0) [Forecast Qty]
,isnull(sum(cte2.[SAD-OrderQty] ),0) [SAD-OrderQty]
,isnull(sum(cte2.[SAD-OrderRevenue]) ,0) [SAD-OrderRevenue]
,isnull(sum(cte2.[RDD-OrderQty] ),0) [RDD-OrderQty]
,isnull(sum(cte2.[RDD-OrderRevenue]) ,0) [RDD-OrderRevenue]
from cte2
inner join cte
on cte2.[Book Date]>=cte.[INDPOR Timestamp]
where isnull(cte2.[INDPOR Version],cte.[INDPOR Version])=cte.[INDPOR Version]
group by
cte2.LOB
,cte2.[Demand Month]
,cte2.[Demand Quarter]
,cte2.[Min Date]
,cte2.Class
,cte2.[Item Type]
,cte2.[Offer PF]
,cte2.[LoB Region]
,cte2.[Key Account]
,cte2.Country
,cte2.[Master Customer]
,cte.[INDPOR Version]
,cte.[INDPOR Timestamp]
having
isnull(sum(cte2.[Forecast Qty] ),0) +
isnull(sum(cte2.[SAD-OrderQty] ),0) +
isnull(sum(cte2.[RDD-OrderQty] ),0)
!=0

Script help - count , total and avg

I have the following script, trying to count how many distinct customers are there , how many distinct orders , what is total of all orders under £15 and its's avg, total of orders above £20 and it's avg.
with consignments as
(select
[Sell-to Customer No_],
[Convert-to Document No_],
ic.[Shipping Agent Service Code],
[Pick Completed DateTime] as [Shipped DateTime],
ROUND((ic.[Amount Including VAT] + ic.Postage + ic.[Gift Wrap Price] +
ic.[Handling Fee] + ic.[Personalisation Fee]),2) as [Document Amount]
from dbo.[Temp$Consignment] ic inner join [dbo].
[Temp$Order] oh
on ic.[Owner Header GuID]=oh.[Order Guid]
where ic.[Shipping Agent Service Code]='secstan' and ic.[Pick Completed
DateTime] >= '2016-11-01T00:00:00.000' AND
ic.[Pick Completed DateTime] <= '2016-11-30T23:59:55.000' ),summary as
(select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from consignments )select * from summary
I have working script like below, but as I am new to sql I am bit confused how to convert above script to below.
select amountclass,[Shipping Agent Service Code],
count(distinct [Sell-to Customer No_]) as Total_customers,
count(*) as Total_orders,
sum([Amount]) as total_revenue,
avg([Amount] * 1.0) as AOV
from
(
select [Sell-to Customer No_], oh.[Original Order No_], [Amount],ic.
[Shipping Agent Service Code],
case when [Amount] <= 20 then 'Under_20'
else 'Over_20'
end as amountclass
from [TBW_BI].[dbo].[Temp$Order] oh INNER JOIN [TBW_BI].[dbo].
[Temp$Consignment] IC
ON IC.[Owner Header GuID]=OH.[Order Guid]
where[order date] >= '2016-09-01' AND
[order date] <= '2016-09-30' AND [COUNTRY]='UNITED KINGDOM' and
[document type] like 'ord%' and ic.[Shipping Agent Service
Code]='secstan'
) dt
group by amountclass,[Shipping Agent Service Code]
order by amountclass,[Shipping Agent Service Code]
It looks like you have a query with a 2 Common Table Expressions (CTE) and you want to convert the CTE to a derived table. First, we can eliminate one of the CTEs.
summary as
(select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from consignments )
select * from summary
The CTE for summary is unnecessary. We can convert that code to this:
select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from consignments
Now, instead of using the consignments CTE; we can copy the sql code within it to create a derived table.
select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from (
select [Sell-to Customer No_],
[Convert-to Document No_],
ic.[Shipping Agent Service Code],
[Pick Completed DateTime] as [Shipped DateTime],
ROUND((ic.[Amount Including VAT] + ic.Postage + ic.[Gift Wrap Price] +
ic.[Handling Fee] + ic.[Personalisation Fee]),2) as [Document Amount]
from dbo.[Temp$Consignment] ic
inner join [dbo].[Temp$Order] oh on ic.[Owner Header GuID]=oh.[Order Guid]
where ic.[Shipping Agent Service Code]='secstan'
and ic.[Pick Completed
DateTime] >= '2016-11-01T00:00:00.000'
AND
ic.[Pick Completed DateTime] <= '2016-11-30T23:59:55.000' ) as t

Workaround for PIVOT statement

I have this query, is taking like 2 minutes to resolve, I need to find a workaround, I know that UNPIVOT has a better solution using CROSS APPLY, is there anything similar for PIVOT?
SELECT [RowId], [invoice date], [GL], [Entité], [001], [Loc], [Centre Cout], [Compte_1], [Interco_1], [Futur_1], [Department], [Division], [Compagnie], [Localisation], [Centre/Cout], [Compte], [Interco], [Futur], [Account], [Mobile], [Last Name], [First Name], [license fee], [GST], [HST], [PST], [Foreign Tax], [Sales Tax License], [Net Total], [Total], [ServiceType], [Oracle Cost Center], [CTRL], [EXPENSE], [Province]
FROM
(SELECT fd.[RowId], fc.[ColumnName], fd.[Value]
FROM dbo.FileData fd
INNER JOIN dbo.[FileColumn] fc
ON fc.[FileColumnId] = fd.[FileColumnId]
WHERE FileId = 1
AND TenantId = 1) x
PIVOT
(
MAX(Value)
FOR [ColumnName] IN ( [invoice date], [GL], [Entité], [001], [Loc], [Centre Cout], [Compte_1], [Interco_1], [Futur_1], [Department], [Division], [Compagnie], [Localisation], [Centre/Cout], [Compte], [Interco], [Futur], [Account], [Mobile], [Last Name], [First Name], [license fee], [GST], [HST], [PST], [Foreign Tax], [Sales Tax License], [Net Total], [Total], [ServiceType], [Oracle Cost Center], [CTRL], [EXPENSE], [Province])
) AS p
Pivots are great, but so are Conditional Aggregations. Also, there would be no datatype conficts or conversions necessary
SELECT [RowId]
,[invoice date] = max(case when [FileColumnId] = ??? then Value end)
,[GL] = max(case when [FileColumnId] = ??? then Value end)
,... more fields
FROM dbo.FileData fd
WHERE FileId = 1
AND TenantId = 1
Group By [RowId]
EDIT
You could add back the join to make it more readable.

only return records count() = 1

I have this table called myTable
Posting Date|Item No_|Entry Type|
2015-01-13|1234|1
2015-01-13|1234|1
2015-01-12|1234|1
2015-01-12|5678|1
2015-02-12|4567|1
What I want, is only return result where a [Item No_] is ind the table 1 time.
So in this example of my table, i only want to return [Item No_] 5678 and 4567, because there only are one record in it. And then ignore [Item No_] 1234
This is my SQL i have tried, but something is wrong. Can anyone help me?
SELECT [Item No_], [Posting Date], COUNT([Item No_]) AS Antal
FROM myTable
GROUP BY [Entry Type], [Posting Date], [Item No_]
HAVING ([Entry Type] = 1) AND (COUNT([Item No_]) = 1)
ORDER BY [Posting Date] DESC
select [Item No_]
from myTable
group by [Item No_]
having count(*)=1
Remove Posting Date from group by
SELECT [Item No_],Entry Type, COUNT([Item No_]) AS Antal
FROM myTable
GROUP BY [Entry Type], [Item No_]
HAVING COUNT([Item No_]) = 1
or if you want other details use a subquery
SELECT [Item No_],
Entry Type,
Posting Date
FROM myTable a
WHERE EXISTS (SELECT 1
FROM myTable b
where a.[Item No_]=b.[Item No_]
GROUP BY [Entry Type],
[Item No_]
HAVING Count(1) = 1)
ORDER BY [Posting Date] DESC
or window function
;WITH cte
AS (SELECT [Item No_],
[Posting Date],
[Entry Type],
Row_number()OVER (Partition BY [Entry Type], [Item No_] ORDER BY [Item No_]) RN
FROM myTable)
SELECT *
FROM cte a
WHERE NOT EXISTS (SELECT 1
FROM cte b
WHERE a.[Item No_] = b.[Item No_]
AND rn > 1)
ORDER BY [Posting Date] DESC
You can use ROW_Number() in Sql Server
select * from (
SELECT [Item No_], [Posting Date],
Row_Number() over (Parition by [Item No_]
order by [Item No]) RN
FROM myTable
)D
where D.RN=1