Problems shortening a SQL query - sql

I am trying to make a query that works with a temp table, work without that temp table
I tried doing a join in the subquery without the temp table but I don't get the same results as the query with the temp table.
This is the query with the temp table that works as I want:
create table #results(
RowId id_t,
LastUpdatedAt date_T
)
insert into #results
select H.RowId, H.LastUpdatedAt from MemberCarrierMap M Join MemberCarrierMapHistory H on M.RowId = H.RowId
update MemberCarrierMap
set CreatedAt = (select MIN(LastUpdatedAt) from #results r where r.rowId = MemberCarrierMap.rowId)
Where CreatedAt is null;
and here is the query I tried without the temp table that doesn't work like the above:
update MemberCarrierMap
set CreatedAt = (select MIN(MH.LastUpdatedAt) from MemberCarrierMapHistory MH join MemberCarrierMap M on MH.RowId = M.RowId where MH.RowId = M.RowId )
Where CreatedAt is null;
I was expecting the 2nd query to work as the first but It is not. Any suggestions on how to achieve what the first query does without the temp table?

This should work:
update M
set M.CreatedAt = (select MIN(MH.LastUpdatedAt) from MemberCarrierMapHistory MH WHERE MH.RowId = M.RowId)
FROM MemberCarrierMap M
Where M.CreatedAt is null;

Your question is more or less a duplicate of this answer. There, you will find multiple solutions. But the ones that implement correlated subqueires are less performant than the one that simply uses an uncorrelated aggregation subquery inside a join.
Applying it to your situation, you will have this:
update m
set m.createdDate = hAgg.maxVal
from memberCarrierMap m
join (
select rowId, max(lastUpdatedAt) as maxVal
from memberCarrierMapHistory
group by rowId
) as hAgg
on m.rowId = hAgg.rowId
where m.createdAt is null;
Basically, it's more performant because it is more expensive to run aggregations and filterings on a row-by-row basis (which is what happens in a correlated subquery) than to just get the aggregations out of the way all at once (joins tend to happen early in processing) and perform the match afterwards.

Related

Rewrite query without using temp table

I have a query that is using a temp table to insert some data then another select from to extract distinct results. That query by it self was fine but now with entity-framework it is causing all kinds of unexpected errors at the wrong time.
Is there any way I can rewrite the query not to use a temp table? When this is converted into a stored procedure and in entity framework the result set is of type int which throws an error:
Could not find an implementation of the query pattern Select not found.
Here is the query
Drop Table IF EXISTS #Temp
SELECT
a.ReceiverID,
a.AntennaID,
a.AntennaName into #Temp
FROM RFIDReceiverAntenna a
full join Station b ON (a.ReceiverID = b.ReceiverID) and (a.AntennaID = b.AntennaID)
where (a.ReceiverID is NULL or b.ReceiverID is NULL)
and (a.AntennaID IS NULL or b.antennaID is NULL)
select distinct r.ReceiverID, r.ReceiverName, r.receiverdescription
from RFIDReceiver r
inner join #Temp t on r.ReceiverID = t.ReceiverID;
No need for anything fancy, you can just replace the reference to #temp with an inner sub-query containing the query that generates #temp e.g.
select distinct r.ReceiverID, r.ReceiverName, r.receiverdescription
from RFIDReceiver r
inner join (
select
a.ReceiverID,
a.AntennaID,
a.AntennaName
from RFIDReceiverAntenna a
full join Station b ON (a.ReceiverID = b.ReceiverID) and (a.AntennaID = b.AntennaID)
where (a.ReceiverID is NULL or b.ReceiverID is NULL)
and (a.AntennaID IS NULL or b.antennaID is NULL)
) t on r.ReceiverID = t.ReceiverID;
PS: I haven't made any effort to improve the query overall like Gordon has but do consider his suggestions.
First, a full join makes no sense in the first query. You are selecting only columns from the first table, so you need that.
Second, you can use a CTE.
Third, you should be able to get rid of the SELECT DISTINCT by using an EXISTS condition.
I would suggest:
WITH ra AS (
SELECT ra.*
FROM RFIDReceiverAntenna ra
Station s
ON s.ReceiverID = ra.ReceiverID AND
s.AntennaID = ra.AntennaID)
WHERE s.ReceiverID is NULL
)
SELECT r.ReceiverID, r.ReceiverName, r.receiverdescription
FROM RFIDReceiver r
WHERE EXISTS (SELECT 1
FROM ra
WHERE r.ReceiverID = ra.ReceiverID
);
You can use CTE instead of the temp table:
WITH
CTE
AS
(
SELECT
a.ReceiverID,
a.AntennaID,
a.AntennaName
FROM
RFIDReceiverAntenna a
full join Station b
ON (a.ReceiverID = b.ReceiverID)
and (a.AntennaID = b.AntennaID)
where
(a.ReceiverID is NULL or b.ReceiverID is NULL)
and (a.AntennaID IS NULL or b.antennaID is NULL)
)
select distinct
r.ReceiverID, r.ReceiverName, r.receiverdescription
from
RFIDReceiver r
inner join CTE t on r.ReceiverID = t.ReceiverID
;
This query will return the same results as your original query with the temp table, but its performance may be quite different; not necessarily slower, it can be faster. Just something that you should be aware about.

Improve update performance in oracle

I've generated two temporary tables, also assigned a primary key to the generated tables - to get an index on them.
Like this on both:
ALTER TABLE TEMP_MEASURINGS ADD PRIMARY KEY (MEASURINGID)
ALTER TABLE TEMP_VALUES ADD PRIMARY KEY (<some_other_col>)
The two temp-tables are related by a date and another id as you can see by the query. Now I need to update the "measuringid" in TEMP_VALUES based on the other table.
Can I make this query go faster in any way?
UPDATE TEMP_VALUES v
SET v.MEASURINGID =
(
SELECT MEASURINGID
FROM TEMP_MEASURINGS m
WHERE m.MEASURDATE = v.MEASUREDATE
AND m.ORDERID = v.ORDERID
)
The tables needs to be generated first, so I can't do an insert directly.
SELECT COUNT(*) FROM TEMP_VALUES ~6M
SELECT COUNT(*) FROM TEMP_MEASURINGS ~1.5M
Your query is going to be slow, because so many rows are being updated.
You can speed it with an index on TEMP_MEASURINGS(MEASUREDATE, ORDERID, MEASURINGID). This is a covering index for the subquery. The lookups should be fast.
You might find it faster just to create a new table:
create new_temp_values as
select v.*, m.measuringid
from temp_values v left join
temp_measurings m
on v.measuredate = m.measuredate and v.orderid = m.orderid;
The same index will work here (you can adjust the select columns to be what you really need).
Typically, creating a new table is much, much faster than updating all or even a significant number of rows in a given table.
Try with below merge for peformance:
MERGE TEMP_VALUES v
USING (SELECT MEASURINGID,
MEASURDATE,
ORDERID
FROM TEMP_MEASURINGS) m
ON
m.MEASURDATE = v.MEASUREDATE
AND m.ORDERID = v.ORDERID
WHEN MATCHED THEN
UPDATE
SET v.MEASURINGID = m.MEASURINGID;
Instead of updating all records you can update records that exists in temp table:
UPDATE TEMP_VALUES v
SET v.MEASURINGID =
(
SELECT MEASURINGID
FROM TEMP_MEASURINGS m
WHERE m.MEASURDATE = v.MEASUREDATE
AND m.ORDERID = v.ORDERID
)
where
exists (select 1 from TEMP_VALUES ttt, TEMP_MEASURINGS ttm WHERE ttm.MEASURDATE = ttt.MEASUREDATE
AND ttm.ORDERID = ttt.ORDERID and ttt.ID = v.ID)

Ms Access query to SQL Server - DistinctRow

What would the syntax be to convert this MS Access query to run in SQL Server as it doesn't have a DistinctRow keyword
UPDATE DISTINCTROW [MyTable]
INNER JOIN [AnotherTable] ON ([MyTable].J5BINB = [AnotherTable].GKBINB)
AND ([MyTable].J5BHNB = [AnotherTable].GKBHNB)
AND ([MyTable].J5BDCD = [AnotherTable].GKBDCD)
SET [AnotherTable].TessereCorso = [MyTable].[J5F7NR];
DISTINCTROW [MyTable] removes duplicate MyTable entries from the results. Example:
select distinctrow items
items.item_number, items.name
from items
join orders on orders.item_id = items.id;
In spite of the join getting you the same item_number and name multiple times when there is more than one order for it, DISTINCTROW reduces this to one row per item. So the whole join is merely for assuring that you only select items for which exist at least one order. You don't find DISTINCTROW in any other DBMS as far as I know. Probably because it is not needed. When checking for existence, we use EXISTS of course (or IN for that matter).
You are joining MyTable and AnotherTable and expect for some reason to get the same MyTable record multifold for one AnotherTable record, so you use DISTINCTROW to only get it once. Your query would (hopefully) fail if you got two different MyTable records for one AnotherTable record.
What the update does is:
update anothertable
set tesserecorso = (select top 1 j5f7nr from mytable where mytable.j5binb = anothertable.gkbinb and ...)
where exists (select * from mytable where mytable.j5binb = anothertable.gkbinb and ...)
But this uses about the same subquery twice. So we'd want to update from a query instead.
The easiest way to get one result record per <some columns> in a standard SQL query is to aggregate data:
select *
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
How to write an updateble query is different from one DBMS to another. Here is the final update statement for SQL-Server:
update a
set a.tesserecorso = m.j5f7nr
from anothertable a
join
(
select j5binb, j5bhnb, j5bdcd, max(j5f7nr) as j5f7nr
from mytable
group by j5binb, j5bhnb, j5bdcd
) m on m.j5binb = a.gkbinb and m.j5bhnb = a.gkbhnb and m.j5bdcd = a.gkbdcd;
The DISTINCTROW predicate in MS Access SQL removes duplicates across all fields of a table in join statements and not just the selected fields of query (which DISTINCT in practically all SQL dialects do). So consider selecting all fields in a derived table with DISTINCT predicate:
UPDATE [AnotherTable]
SET [AnotherTable].TessereCorso = main.[J5F7NR]
FROM
(SELECT DISTINCT m.* FROM [MyTable] m) As main
INNER JOIN [AnotherTable]
ON (main.J5BINB = [AnotherTable].GKBINB)
AND (main.J5BHNB = [AnotherTable].GKBHNB)
AND (main.J5BDCD = [AnotherTable].GKBDCD)
Another variant of the query.. (Too lazy to get the original tables).
But like the query above updates 35 rows =, so does this one
UPDATE [Albi-Anagrafe-Associati]
SET
[Albi-Anagrafe-Associati].CRegDitte = [055- Registri ditte].[CRegDitte],
[Albi-Anagrafe-Associati].NIscrTribunale = [055- Registri ditte].[NIscrTribunale],
[Albi-Anagrafe-Associati].NRegImprese = [055- Registri ditte].[NRegImprese]
FROM [055- Registri ditte]
WHERE EXISTS(
SELECT *
FROM [055- Registri ditte]-- [Albi-Anagrafe-Associati]
WHERE ([055- Registri ditte].GIBINB = [Albi-Anagrafe-Associati].GKBINB)
AND ([055- Registri ditte].GIBHNB = [Albi-Anagrafe-Associati].GKBHNB)
AND ([055- Registri ditte].GIBDCD = [Albi-Anagrafe-Associati].GKBDCD))
Update [AnotherTable]
Set [AnotherTable].TessereCorso = MyTable.[J5F7NR]
From [AnotherTable]
Inner Join
(
Select Distinct [J5BINB],[5BHNB],[J5BDCD]
,(Select Top 1 [J5F7NR] From MyTable) as [J5F7NR]
,[J5BHNB]
From MyTable
)as MyTable
On (MyTable.J5BINB = [AnotherTable].GKBINB)
AND (MyTable.J5BHNB = [AnotherTable].GKBHNB)
AND (MyTable.J5BDCD = [AnotherTable].GKBDCD)

SQL query: Iterate over values in table and use them in subquery

I have a simple SQL table containing some values, for example:
id | value (table 'values')
----------
0 | 4
1 | 7
2 | 9
I want to iterate over these values, and use them in a query like so:
SELECT value[0], x1
FROM (some subquery where value[0] is used)
UNION
SELECT value[1], x2
FROM (some subquery where value[1] is used)
...
etc
In order to get a result set like this:
4 | x1
7 | x2
9 | x3
It has to be in SQL as it will actually represent a database view. Of course the real query is a lot more complicated, but I tried to simplify the question while keeping the essence as much as possible.
I think I have to select from values and join the subquery, but as the value should be used in the subquery I'm lost on how to accomplish this.
Edit: I oversimplified my question; in reality I want to have 2 rows from the subquery and not only one.
Edit 2: As suggested I'm posting the real query. I simplified it a bit to make it clearer, but it's a working query and the problem is there. Note that I have hardcoded the value '2' in this query two times. I want to replace that with values from a different table, in the example table above I would want a result set of the combined results of this query with 4, 7 and 9 as values instead of the currently hardcoded 2.
SELECT x.fantasycoach_id, SUM(round_points)
FROM (
SELECT DISTINCT fc.id AS fantasycoach_id,
ffv.formation_id AS formation_id,
fpc.round_sequence AS round_sequence,
round_points,
fpc.fantasyplayer_id
FROM fantasyworld_FantasyCoach AS fc
LEFT JOIN fantasyworld_fantasyformation AS ff ON ff.id = (
SELECT MAX(fantasyworld_fantasyformationvalidity.formation_id)
FROM fantasyworld_fantasyformationvalidity
LEFT JOIN realworld_round AS _rr ON _rr.id = round_id
LEFT JOIN fantasyworld_fantasyformation AS _ff ON _ff.id = formation_id
WHERE is_valid = TRUE
AND _ff.coach_id = fc.id
AND _rr.sequence <= 2 /* HARDCODED USE OF VALUE */
)
LEFT JOIN fantasyworld_FantasyFormationPlayer AS ffp
ON ffp.formation_id = ff.id
LEFT JOIN dbcache_fantasyplayercache AS fpc
ON ffp.player_id = fpc.fantasyplayer_id
AND fpc.round_sequence = 2 /* HARDCODED USE OF VALUE */
LEFT JOIN fantasyworld_fantasyformationvalidity AS ffv
ON ffv.formation_id = ff.id
) x
GROUP BY fantasycoach_id
Edit 3: I'm using PostgreSQL.
SQL works with tables as a whole, which basically involves set operations. There is no explicit iteration, and generally no need for any. In particular, the most straightforward implementation of what you described would be this:
SELECT value, (some subquery where value is used) AS x
FROM values
Do note, however, that a correlated subquery such as that is very hard on query performance. Depending on the details of what you're trying to do, it may well be possible to structure it around a simple join, an uncorrelated subquery, or a similar, better-performing alternative.
Update:
In view of the update to the question indicating that the subquery is expected to yield multiple rows for each value in table values, contrary to the example results, it seems a better approach would be to just rewrite the subquery as the main query. If it does not already do so (and maybe even if it does) then it would join table values as another base table.
Update 2:
Given the real query now presented, this is how the values from table values could be incorporated into it:
SELECT x.fantasycoach_id, SUM(round_points) FROM
(
SELECT DISTINCT
fc.id AS fantasycoach_id,
ffv.formation_id AS formation_id,
fpc.round_sequence AS round_sequence,
round_points,
fpc.fantasyplayer_id
FROM fantasyworld_FantasyCoach AS fc
-- one row for each combination of coach and value:
CROSS JOIN values
LEFT JOIN fantasyworld_fantasyformation AS ff
ON ff.id = (
SELECT MAX(fantasyworld_fantasyformationvalidity.formation_id)
FROM fantasyworld_fantasyformationvalidity
LEFT JOIN realworld_round AS _rr
ON _rr.id = round_id
LEFT JOIN fantasyworld_fantasyformation AS _ff
ON _ff.id = formation_id
WHERE is_valid = TRUE
AND _ff.coach_id = fc.id
-- use the value obtained from values:
AND _rr.sequence <= values.value
)
LEFT JOIN fantasyworld_FantasyFormationPlayer AS ffp
ON ffp.formation_id = ff.id
LEFT JOIN dbcache_fantasyplayercache AS fpc
ON ffp.player_id = fpc.fantasyplayer_id
-- use the value obtained from values again:
AND fpc.round_sequence = values.value
LEFT JOIN fantasyworld_fantasyformationvalidity AS ffv
ON ffv.formation_id = ff.id
) x
GROUP BY fantasycoach_id
Note in particular the CROSS JOIN which forms the cross product of two tables; this is the same thing as an INNER JOIN without any join predicate, and it can be written that way if desired.
The overall query could be at least a bit simplified, but I do not do so because it is a working example rather than an actual production query, so it is unclear what other changes would translate to the actual application.
In the example I create two tables. See how outer table have an alias you use in the inner select?
SQL Fiddle Demo
SELECT T.[value], (SELECT [property] FROM Table2 P WHERE P.[value] = T.[value])
FROM Table1 T
This is a better way for performance
SELECT T.[value], P.[property]
FROM Table1 T
INNER JOIN Table2 p
on P.[value] = T.[value];
Table 2 can be a QUERY instead of a real table
Third Option
Using a cte to calculate your values and then join back to the main table. This way you have the subquery logic separated from your final query.
WITH cte AS (
SELECT
T.[value],
T.[value] * T.[value] as property
FROM Table1 T
)
SELECT T.[value], C.[property]
FROM Table1 T
INNER JOIN cte C
on T.[value] = C.[value];
It might be helpful to extract the computation to a function that is called in the SELECT clause and is executed for each row of the result set
Here's the documentation for CREATE FUNCTION for SQL Server. It's probably similar to whatever database system you're using, and if not you can easily Google for it.
Here's an example of creating a function and using it in a query:
CREATE FUNCTION DoComputation(#parameter1 int)
RETURNS int
AS
BEGIN
-- Do some calculations here and return the function result.
-- This example returns the value of #parameter1 squared.
-- You can add additional parameters to the function definition if needed
DECLARE #Result int
SET #Result = #parameter1 * #parameter1
RETURN #Result
END
Here is an example of using the example function above in a query.
SELECT v.value, DoComputation(v.value) as ComputedValue
FROM [Values] v
ORDER BY value

Update with Sub Query Derived Table Error

I have the following SQL statement to simply update the #temp temp table with the latest package version number in our Sybase 15 database.
UPDATE t
SET versionId = l.latestVersion
FROM #temp t INNER JOIN (SELECT gp.packageId
, MAX(gp.versionId) latestVersion
FROM Group_Packages gp
WHERE gp.groupId IN (SELECT groupId
FROM User_Group
WHERE userXpId = 'someUser')
GROUP BY gp.packageId) l
ON t.packageId = l.packageId
To me (mainly Oracle & SQL Server experience more than Sybase) there is little wrong with this statement. However, Sybase throws an exception:
You cannot use a derived table in the FROM clause of an UPDATE or DELETE statement.
Now, I don't get what the problem is here. I assume it is because of the aggregation / GROUP BY being used. Of course, I could put the sub query in a temp table and join on it but I really want to know what the 'correct' method should be and what the hell is wrong.
Any ideas or guidance would be much appreciated.
It seems that SYBASE doesn't support nested queries in UPDATE FROM class. Similar problem
Try to use this:
UPDATE #temp
SET versionId = (SELECT MAX(gp.versionId) latestVersion
FROM Group_Packages gp
WHERE gp.packageId=#temp.packageId
and
gp.groupId IN (SELECT groupId
FROM User_Group
WHERE userXpId = 'someUser')
)
And also what if l.latestVersion is NULL? Do you want update #temp with null ? if not then add WHERE:
WHERE (SELECT MAX(gp.versionId)
....
) is not null
I guess this is a limitation of Sybase (not allowing derived tables) in the FROM clause of the UPDATE. Perhaps you can rewrite like this:
UPDATE t
SET t.versionId = l.versionId
FROM #temp t
INNER JOIN
Group_Packages l
ON t.packageId = l.packageId
WHERE
l.groupId IN ( SELECT groupId
FROM User_Group
WHERE userXpId = 'someUser')
AND
l.versionId =
( SELECT MAX(gp.versionId)
FROM Group_Packages gp
WHERE gp.groupId IN ( SELECT groupId
FROM User_Group
WHERE userXpId = 'someUser')
AND gp.packageId = l.packageId
) ;
Your table alias for #temp is called "t" and your original table is called "t".
My guess is that this is the problem.
I think you want to start with:
update #temp
Does this syntax work in Sybase?
update dstTable T
set (T.field1, T.field2, T.field3) =
(select S.value1, S.value2, S.value3
from srcTable S
where S.key = T.Key);
I believe the correlated subquery can be as complicated as you like (including using CTE. etc). It just has to project the right number (and type) of values.