I made this view in SQL Server to combine the values of two records across multiple columns. The problem with this solution is that it needs a CONCAT for every column in table2. Is it possible to generate the CONCAT part with a loop and a dynamic variable for the column numbers? (The columns in table2 are named 1, 2, 3, 4, 5, ...)
SELECT
dbo.table1.lot_id AS lot,
dbo.table1.hybird_id AS hybrid,
concat(
LEFT( (SELECT dbo.table2.[1] FROM dbo.table2 WHERE dbo.table2.parentals_id = dbo.table1.parental_male_id AND dbo.table2.lot_id = dbo.table1.lot_id) , 1),
LEFT( (SELECT dbo.table2.[1] FROM dbo.table2 WHERE dbo.table2.parentals_id = dbo.table1.parental_female_id AND dbo.table2.lot_id = dbo.table1.lot_id) , 1)
) AS '1',
--above concat x31 times more
FROM dbo.table2
INNER JOIN dbo.table1 ON dbo.table2.lot_id = dbo.table1.lot_id
GROUP BY dbo.table1.lot_id, dbo.table1.hybird_id,
dbo.table1.parental_male_id,
dbo.table1.parental_female_id
I tried a few things but nothing worked, any ideas?
Try simplifying it a bit, along these lines:
SELECT t.lot, t.hybrid, t.parental_male_id, t.parental_female_id,
concat(Left(m.[1],1), left(f.[1], 1)) AS [1]
--,..
FROM (
SELECT dbo.table1.lot_id AS lot
, dbo.table1.hybird_id AS hybrid
, dbo.table1.parental_male_id
, dbo.table1.parental_female_id
FROM dbo.table2
INNER JOIN dbo.table1 ON dbo.table2.lot_id = dbo.table1.lot_id
GROUP BY dbo.table1.lot_id, dbo.table1.hybird_id,
dbo.table1.parental_male_id,
dbo.table1.parental_female_id
) t
JOIN dbo.table2 m ON m.parentals_id = t.parental_male_id AND m.lot_id = t.lot
JOIN dbo.table2 f ON f.parentals_id = t.parental_female_id AND f.lot_id = t.lot
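On the original question of producing the expressions with a loop: a view cannot contain a loop, but the statement text can be generated with dynamic SQL. The following is a minimal, untested sketch. It assumes the columns of table2 are named [1] through [32] (the one shown plus the "31 more" from the question) and that table1 holds one row per lot and hybrid, so the GROUP BY derived table is left out. The variable names are made up for illustration.

```sql
-- Assumption: table2 has columns [1]..[32]; adjust @n to the real count.
DECLARE @i    int = 1,
        @n    int = 32,
        @c    nvarchar(20),
        @cols nvarchar(max) = N'',
        @sql  nvarchar(max);

-- Build one CONCAT expression per numbered column.
WHILE @i <= @n
BEGIN
    SET @c = QUOTENAME(CAST(@i AS nvarchar(10)));   -- e.g. [1]
    SET @cols = @cols
        + N', CONCAT(LEFT(m.' + @c + N', 1), LEFT(f.' + @c + N', 1)) AS ' + @c;
    SET @i = @i + 1;
END;

-- Assemble the full statement around the generated column list.
SET @sql = N'
SELECT t.lot_id AS lot, t.hybird_id AS hybrid' + @cols + N'
FROM dbo.table1 AS t
JOIN dbo.table2 AS m ON m.parentals_id = t.parental_male_id   AND m.lot_id = t.lot_id
JOIN dbo.table2 AS f ON f.parentals_id = t.parental_female_id AND f.lot_id = t.lot_id;';

PRINT @sql;                    -- inspect the generated text first
EXEC sys.sp_executesql @sql;   -- then run it
```

The printed text can also be pasted into a CREATE VIEW statement if the result has to stay a view.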
Related
I have a query that does a select with joins across multiple tables that contain about 90 million rows in total. I only need data from the last 30 days. The problem is that while the select is running, SQL Server throws a timeout and new records are not created during that time. The query takes about 5 seconds to complete.
I would like to optimise this query so that it won't go through the entire tables looking at the datetime and only searches the latest entries.
Right now it seems that I need to index the datetime column. Please advise whether I need to create indexes or whether there is another way to optimise this query.
SELECT [table1].Column1 AS InvoiceNo,
'ND' AS VATRegistrationNumber,
'ND' AS RegistrationNumber,
Column2 AS Country,
[table2].Column3 + ' ' + [table2].Column4 AS Name,
CAST([table1].Column5 AS date) AS InvoiceDate,
'SF' AS InvoiceType,
'' AS SpecialTaxation,
'' AS VATPointDate,
ROUND([table1Line].Column6, 2) AS TaxableValue,
CASE
WHEN [table1Line].Column7 = 9 THEN 'PVM2'
WHEN [table1Line].Column7 = 21 THEN 'PVM1'
WHEN [table1Line].Column7 = 0 THEN 'PVM14'
END AS TaxCode,
CAST([table1Line].Column7 AS int) AS TaxPercentage,
table1Line.Column8 - ROUND([table1Line].Column6, 2) AS Amount,
'' AS VATPointDate2,
[table1].Column1 AS InvoiceNo,
'' AS ReferenceNo,
'' AS ReferenceDate,
[table1].CustomerPersonID AS CustomerID
FROM [table1]
INNER JOIN [table2] ON [table1].CustomerPersonID = [table2].ID
INNER JOIN [table3] ON [table2].Column9 = [table3].ID
INNER JOIN [table1Line] ON [table1].ID = [table1Line].table1ID
INNER JOIN [table4] ON table1Line.TaxID = [table4].ID
INNER JOIN [table5] ON [table1].CompanyID = [table5].ID
INNER JOIN table6 ON [table1].SalesChannelID = table6.ID
WHERE Column5 LIKE '%date%'
AND table6.id = 5
OR table6.id = 2
AND Column5 LIKE '%date%'
ORDER BY Column5 DESC;
First things first: each database behaves a little differently, because the optimizer builds its plans from that database's own data and statistics and keeps trying to make common operations run better.
Version differences also play a part in the performance of the server.
Beyond that, here are a few things to do to optimize this query.
When working with joins, I put the joined table first in the ON clause and compare it against the table that was already specified. This is a readability convention and does not change the result.
For example t2 checks against t1:
select t1.name, t2.car
from customers as t1
left join purchases as t2
on t2.customerid = t1.customerid
The next thing I see is the LIKE condition in the WHERE clause.
The date it is searching appears to be stored as text in your example.
I would recommend storing and comparing the date as a datetime instead of a string data type.
I would include that in the code below, but I'm not sure what the format looks like for your string of text.
%date% is the same thing as saying "contains date".
Because of the leading wildcard, the engine cannot jump to a starting point in an index. It has to take the value from every row and look for the pattern at each character position from left to right.
So if your date text is 20200130, it tests for a match starting at the first character, then at the second, then at the third, and so on.
This significantly increases the time the query takes.
I also see that the date is accidentally being filtered twice instead of once.
I would recommend doing:
WHERE LTRIM(RTRIM(Column5)) LIKE 'date'
As for the inner joins, I would not use them.
Use a LEFT JOIN, and then in the WHERE clause make sure the joined data has no NULL values.
This makes the LEFT JOIN return the same rows as the INNER JOIN. Whether it actually runs faster depends on the plan the optimizer picks, so compare the execution plans of both forms.
For instance, the first join would look like this:
FROM [table1]
LEFT JOIN [table2] ON [table2].ID = [table1].CustomerPersonID
WHERE table2.id IS NOT NULL
I see a problem in the WHERE clause, where AND and OR are mixed without parentheses:
AND table6.id = 5
OR table6.id = 2
This should be:
AND (table6.id = 5 OR table6.id = 2)
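AND is evaluated before OR in T-SQL, so `A AND B OR C` is read as `(A AND B) OR C`. A small self-contained check with constant conditions (no tables needed) shows the difference the parentheses make:

```sql
-- Stand-ins: A = (1 = 0) is false, B = (1 = 0) is false, C = (1 = 1) is true.
SELECT
    -- Read as (A AND B) OR C, so the true C alone lets the row through.
    CASE WHEN 1 = 0 AND 1 = 0 OR 1 = 1
         THEN 'row kept' ELSE 'row dropped' END AS without_parentheses,
    -- Read as A AND (B OR C), so the false A blocks the row.
    CASE WHEN 1 = 0 AND (1 = 0 OR 1 = 1)
         THEN 'row kept' ELSE 'row dropped' END AS with_parentheses;
```

The first column comes back as 'row kept' and the second as 'row dropped'.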
So here should be an optimized version of your code:
SELECT [table1].Column1 AS InvoiceNo,
'ND' AS VATRegistrationNumber,
'ND' AS RegistrationNumber,
Column2 AS Country,
[table2].Column3 + ' ' + [table2].Column4 AS Name,
CAST([table1].Column5 AS date) AS InvoiceDate,
'SF' AS InvoiceType,
'' AS SpecialTaxation,
'' AS VATPointDate,
ROUND([table1Line].Column6, 2) AS TaxableValue,
(CASE WHEN [table1Line].Column7 = 9 THEN 'PVM2'
WHEN [table1Line].Column7 = 21 THEN 'PVM1'
WHEN [table1Line].Column7 = 0 THEN 'PVM14'
ELSE '' END ) AS TaxCode,
CAST([table1Line].Column7 AS int) AS TaxPercentage,
table1Line.Column8 - ROUND([table1Line].Column6, 2) AS Amount,
'' AS VATPointDate2,
[table1].Column1 AS InvoiceNo,
'' AS ReferenceNo,
'' AS ReferenceDate,
[table1].CustomerPersonID AS CustomerID
FROM [table1]
LEFT JOIN [table2] ON [table2].ID = [table1].CustomerPersonID
LEFT JOIN [table3] ON [table3].ID = [table2].Column9
LEFT JOIN [table1Line] ON [table1Line].table1ID = [table1].ID
LEFT JOIN [table4] ON [table4].ID = table1Line.TaxID
LEFT JOIN [table5] ON [table5].ID = [table1].CompanyID
LEFT JOIN [table6] ON table6.ID = [table1].SalesChannelID
WHERE table2.ID IS NOT null
AND table3.ID IS NOT null
AND table1Line.ID IS NOT null
AND table4.ID IS NOT null
AND table5.ID IS NOT null
AND table6.ID IS NOT null
AND LTRIM(RTRIM(Column5)) LIKE 'date'
AND (table6.id = 5 OR table6.id = 2)
ORDER BY Column5 DESC;
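For the "last 30 days" requirement in the question, a range predicate on a real date column is usually the more direct route. The sketch below is only an illustration: it assumes table1.Column5 is a date or datetime column in the dbo schema, and the index name is made up.

```sql
-- Assumption: table1.Column5 is a date/datetime column; the index name is hypothetical.
CREATE NONCLUSTERED INDEX IX_table1_Column5
    ON dbo.table1 (Column5 DESC);

-- A range predicate on the bare column allows an index seek on Column5.
-- Wrapping the column in a function, or using LIKE with a leading wildcard, does not.
SELECT t1.Column1 AS InvoiceNo,
       CAST(t1.Column5 AS date) AS InvoiceDate
FROM dbo.table1 AS t1
WHERE t1.Column5 >= DATEADD(DAY, -30, CAST(GETDATE() AS date))
ORDER BY t1.Column5 DESC;
```

The remaining joins and columns from the full query can be added back unchanged. Check the execution plan to confirm the index is actually used.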
I am reverse engineering some legacy SQL algorithms to move to Apache Spark.
I have encountered a CROSS APPLY, which I understand is T-SQL specific and has no direct equivalent in ANSI or Spark SQL.
The sanitized algorithm is:
SELECT
Id_P ,
Monthindex ,
(
SELECT
100 * (STDEV(ResEligible.num_valid) / AVG(ResEligible.num_valid)) AS Pre_Coef_Var
FROM
tbl_p a CROSS APPLY
(
SELECT
e.Monthindex ,
e.num AS num_valid
FROM
dbo.tbl_p e
WHERE
e.Monthindex = a.MonthIndex
AND e.Id_P = a.Id_P
UNION ALL
SELECT DISTINCT
B1.[MonthIndex] ,
Tr.num AS num_valid
FROM
#tbl_pr B1
INNER JOIN
#tbl_pr B2
ON
B1.[Id_P] = B2.[Id_P]
AND B2.Rang - B1.Rang BETWEEN 0 AND 2
INNER JOIN
dbo.tbl_p Tr
ON
Tr.Id_P = B1.Id_P
AND Tr.Monthindex = B1.Monthindex
WHERE
a.Id_P = B1.[Id_P]
AND B2.[MonthIndex] =
(
SELECT
MAX([MonthIndex])
FROM
#tbl_pr
WHERE
[MonthIndex] < a.MonthIndex
AND [Id_P] = a.Id_P) ) AS ResEligible
WHERE
a.Id_P = result.Id_P
AND a.MonthIndex = result.MonthIndex) AS Coeff
FROM
tbl_p AS result
WHERE
1 = 1
AND MonthIndex = #CurrentMonth
GROUP BY
Id_P ,
Monthindex) AS CC
So for every row in alias a we cross apply to the inner queries.
Is it possible to rewrite the CROSS APPLY in terms of join operations (or otherwise) so I can reimplement it in Spark SQL?
Cheers
Terry
Seems like you could rewrite your query as the below:
SELECT T1.col1,
T1.col2,
sq.col3Sum
FROM tbl1 T1
CROSS JOIN (SELECT SUM(T1sq.Col3) AS col3Sum
FROM tbl1 T1sq
JOIN tbl2 T2 ON T1sq.Col1 = T2.Col2
JOIN tbl3 T3 ON T2.col1 = T3.Col1) sq;
Seems odd, however, that there were no JOIN criteria between the 2 references to tbl1.
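As a general pattern, a correlated CROSS APPLY that returns a single aggregate per outer row can often be expressed as a join to a grouped derived table, which Spark SQL accepts. The sketch below uses hypothetical tables parent_tbl and child_tbl, not the tables from the question, and is a starting point only.

```sql
-- T-SQL form: the inner query is re-evaluated for each outer row.
SELECT p.id, x.total
FROM parent_tbl AS p
CROSS APPLY (
    SELECT SUM(c.val) AS total
    FROM child_tbl AS c
    WHERE c.parent_id = p.id
) AS x;

-- Join form: aggregate once per key, then join.
-- LEFT JOIN keeps parents with no children (total is NULL), which matches
-- the APPLY form, because an aggregate without GROUP BY always returns one row.
SELECT p.id, x.total
FROM parent_tbl AS p
LEFT JOIN (
    SELECT c.parent_id, SUM(c.val) AS total
    FROM child_tbl AS c
    GROUP BY c.parent_id
) AS x
    ON x.parent_id = p.id;
```

The correlation columns of the APPLY (here parent_id) become the GROUP BY and join keys of the derived table.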
I have to clean up some imported data, changing some entries to the proper codes I use. I have it working, but is there a cleaner way than a list of UPDATE/SET statements?
update List
SET STCode = REPLACE(STCode, 'Georgia','GEO')
update List
SET STCode = REPLACE(STCode, 'Louisiana','LOU')
etc...
Provided STCode stores at most one value to be replaced (nothing like 'Georgia and Louisiana'), a lookup table is the solution:
update List
set STCode = REPLACE(STCode, replacement.bad, replacement.good)
from List l
join (
values
('Georgia','GEO')
,('Louisiana','LOU')
) replacement(bad, good)
on l.STCode like '%' + replacement.bad + '%'
;
You could also use a translation table built with UNION and a join:
UPDATE T1
SET T1.STCode = T2.code
FROM List AS T1
INNER JOIN (
select 'Georgia' name ,'GEO' code
union
select 'Louisiana','LOU'
) T2 on T1.STCode = T2.name
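If the list of replacements keeps growing, the same join can run against a permanent mapping table instead of inline values. A minimal sketch follows; the table name dbo.StateCodeMap and the column sizes are assumptions, not part of the original question.

```sql
-- Hypothetical mapping table; the name and column sizes are assumptions.
CREATE TABLE dbo.StateCodeMap (
    name varchar(50) NOT NULL PRIMARY KEY,
    code varchar(10) NOT NULL
);

INSERT INTO dbo.StateCodeMap (name, code)
VALUES ('Georgia', 'GEO'),
       ('Louisiana', 'LOU');

-- Only rows whose STCode exactly matches a name in the map are updated.
UPDATE l
SET l.STCode = m.code
FROM List AS l
INNER JOIN dbo.StateCodeMap AS m
    ON m.name = l.STCode;
```

New codes then only need a row in the mapping table, not another UPDATE statement.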
I have a working SQL select, which looks like this:
[Edited: I'm sorry, I made a mistake in the question. I edited the alias of Table1, but I'm trying the answers.]
SELECT
m.Column1
,t2.Column2
,COALESCE
(
(
SELECT TOP 1 Vat
FROM LinkedDBServer.DatabaseName.dbo.TableName t3
WHERE
m.MaterialNumber = t3.MaterialNumber COLLATE Czech_CI_AS
and t3.Currency = …
and ...
ORDER BY [Date] DESC
), m.Vat
) as Vat
FROM Table1 m
JOIN Table2 t2 on (m.Column1 = t2.Column1)
It works, but the problem is that it takes too long, and the linked server cuts my connection because the query runs for more than 10 minutes. The purpose of the query is to get newer data from a different database if it exists (I get the newest data with TOP and ordering by date; the precondition is that all data in that database is newer than mine, which is why I'm using COALESCE).
My thought is that if I could rewrite it as a JOIN it might be faster. Another problem could be that I don't have a primary key (and can't change that).
How can I speed up that query? (I'm using SQL Server 2008 R2.)
Thank you
Here I attached the estimated query plan (it's readable with browser zoom). The estimate is for 2 COALESCE columns.
Try rewriting the query using OUTER APPLY:
SELECT
t1.Column1
,m.Column2
,COALESCE(ou.vat, m.Vat) as Vat
FROM Table1 t1
JOIN Table2 m on (m.Column1 = t1.Column1)
outer apply
(
SELECT TOP 1 Vat
FROM LinkedDBServer.DatabaseName.dbo.TableName t3
WHERE
m.MaterialNumber = t3.MaterialNumber COLLATE Czech_CI_AS
and t3.Currency = …
and ...
ORDER BY [Date] DESC
) ou
Another option:
; WITH vat AS (
SELECT MaterialNumber COLLATE Czech_CI_AS As MaterialNumber
, Vat
, Row_Number() OVER (PARTITION BY MaterialNumber ORDER BY "Date" DESC) As sequence
FROM LinkedDBServer.DatabaseName.dbo.TableName
WHERE Currency = ...
AND ...
)
SELECT t1.Column1
, m.Column2
, Coalesce(vat.Vat, m.Vat) As Vat
FROM Table1 As t1
INNER
JOIN Table2 As m
ON m.Column1 = t1.Column1
LEFT
JOIN vat
ON vat.MaterialNumber = m.MaterialNumber
AND vat.sequence = 1
;
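A further option that is sometimes used with slow linked servers is to copy the remote rows into a local temp table once and run the lookup against that copy. The sketch below is untested and makes several assumptions: the temp table and index names are invented, the aliases follow the question (m for Table1, t2 for Table2), and the filters elided in the question still have to be added by hand.

```sql
-- Step 1: copy the remote rows once. Add a WHERE clause here with the
-- Currency and other filters that are elided in the question.
SELECT MaterialNumber COLLATE Czech_CI_AS AS MaterialNumber,
       Vat,
       [Date]
INTO #remote_vat
FROM LinkedDBServer.DatabaseName.dbo.TableName;

-- Step 2: index the local copy so the newest row per material is cheap to find.
CREATE CLUSTERED INDEX IX_remote_vat
    ON #remote_vat (MaterialNumber, [Date] DESC);

-- Step 3: the same lookup as before, but against the local copy.
SELECT m.Column1,
       t2.Column2,
       COALESCE(r.Vat, m.Vat) AS Vat
FROM Table1 AS m
JOIN Table2 AS t2 ON m.Column1 = t2.Column1
OUTER APPLY (
    SELECT TOP 1 v.Vat
    FROM #remote_vat AS v
    WHERE v.MaterialNumber = m.MaterialNumber
    ORDER BY v.[Date] DESC
) AS r;
```

Whether this helps depends on how many remote rows survive the filters, since step 1 transfers all of them across the link.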
I would like to consolidate a one to many relationship that outputs on different rows to a single row.
(select rate_value1
FROM xgenca_enquiry_event
INNER JOIN xgenca_enquiry_iso_code_translation
ON
xgenca_enquiry_event_rate.rate_code_id
= xgenca_enquiry_iso_code_translation.id
where xgenca_enquiry_event_rate.event_id = xgenca_enquiry_event.id
and ISO_code = 'PDIV') as PDIVrate,
(select rate_value1
FROM xgenca_enquiry_event
INNER JOIN xgenca_enquiry_iso_code_translation
ON
xgenca_enquiry_event_rate.rate_code_id
= xgenca_enquiry_iso_code_translation.id
where xgenca_enquiry_event_rate.event_id = xgenca_enquiry_event.id
and ISO_code = 'TAXR') as TAXrate
PDIVrate TAXrate
NULL 10.0000000
0.0059120 NULL
I would like the results on one row. Any help would be greatly appreciated.
Thanks.
You can use an aggregate function to perform this:
select
max(case when ISO_code = 'PDIV' then rate_value1 end) PDIVRate,
max(case when ISO_code = 'TAXR' then rate_value1 end) TAXRate
FROM xgenca_enquiry_event_rate r
INNER JOIN xgenca_enquiry_iso_code_translation t
ON r.rate_code_id = t.id
INNER JOIN xgenca_enquiry_event e
ON r.event_id = e.id
It looks like you are joining three tables that are identical in both subqueries. This consolidates them into a single query using joins.
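If the result should contain one row per event, not one row overall, the same aggregate can be grouped by the event key. This is a sketch under the assumption that xgenca_enquiry_event.id identifies the event, as the join conditions in the question suggest.

```sql
-- One output row per event; each CASE picks out the rate for one ISO code.
select
  e.id,
  max(case when ISO_code = 'PDIV' then rate_value1 end) as PDIVRate,
  max(case when ISO_code = 'TAXR' then rate_value1 end) as TAXRate
FROM xgenca_enquiry_event e
INNER JOIN xgenca_enquiry_event_rate r
  ON r.event_id = e.id
INNER JOIN xgenca_enquiry_iso_code_translation t
  ON r.rate_code_id = t.id
GROUP BY e.id
```

MAX ignores the NULLs that the CASE expressions produce for non-matching rows, which is what collapses the two rows into one.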
Look here:
Can I Comma Delimit Multiple Rows Into One Column?
Simulate Oracle's LISTAGG() in SQL Server using STUFF:
SELECT Column1,
stuff((
SELECT ', ' + Column2
FROM tableName as t1
where t1.Column1 = t2.Column1
FOR XML PATH('')
), 1, 2, '')
FROM tableName as t2
GROUP BY Column1;
Copied from here: https://github.com/jOOQ/jOOQ/issues/1277
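On SQL Server 2017 and later, the built-in STRING_AGG function covers the same case without the XML workaround. A minimal sketch using the same placeholder names (the output alias is made up):

```sql
-- Requires SQL Server 2017 or later.
SELECT Column1,
       STRING_AGG(Column2, ', ') AS Column2List
FROM tableName
GROUP BY Column1;
```

On older versions such as 2008 R2, the STUFF and FOR XML PATH form above remains the usual approach.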