How to Update a group of rows - sql

My sqlfiddle: http://sqlfiddle.com/#!15/4f9da/1
I'm not good at explaining this, and I've only written basic queries so far, so bear with me.
Situation: the revision column numbers the revisions of one object. For example, ids 1, 2 and 3 are the same object, and each revision always refers to the previous one through ground_id.
Problem: I need to fill the ord column so that every row of the same object gets the same value, namely the id of that object's revision 0. For example, ids 1, 2 and 3 need ord set to 1, because id 1 is revision 0. Likewise, id 4 must get ord 4, and so must id 5.
Basically must be like this:

You need a recursive query to do this. First you select the rows where ground_id IS NULL, set ord to the value of id. In the following iterations you add more rows based on the value of ground_id, setting the ord value to that of the row it is being matched to. You can then use that set of rows (id, ord) as a row source for the UPDATE:
WITH RECURSIVE set_ord (id, ord) AS (
    SELECT id, id
    FROM ground
    WHERE ground_id IS NULL
    UNION
    SELECT g.id, o.ord
    FROM ground g
    JOIN set_ord o ON o.id = g.ground_id
)
UPDATE ground g
SET ord = s.ord
FROM set_ord s
WHERE g.id = s.id;
(SQLFiddle is currently not-responsive so I can't post my code there)
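Since the fiddle is not available, here is a minimal sketch of the same recursive idea that can be run locally with Python's built-in sqlite3 module. The table contents are hypothetical sample rows made up to match the description in the question (ids 1, 2, 3 form one chain, ids 4 and 5 another). SQLite stands in for PostgreSQL here, and the UPDATE uses a correlated subquery instead of UPDATE ... FROM so that it also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ground (id INTEGER PRIMARY KEY, ground_id INTEGER, ord INTEGER);
-- hypothetical sample data: 1 <- 2 <- 3 is one object, 4 <- 5 is another
INSERT INTO ground (id, ground_id) VALUES (1, NULL), (2, 1), (3, 2), (4, NULL), (5, 4);
""")

conn.execute("""
WITH RECURSIVE set_ord (id, ord) AS (
    -- anchor: revision 0 rows carry their own id as ord
    SELECT id, id FROM ground WHERE ground_id IS NULL
    UNION
    -- step: each later revision inherits ord from the row it points to
    SELECT g.id, o.ord FROM ground g JOIN set_ord o ON o.id = g.ground_id
)
UPDATE ground
SET ord = (SELECT s.ord FROM set_ord s WHERE s.id = ground.id)
""")

rows = conn.execute("SELECT id, ord FROM ground ORDER BY id").fetchall()
print(rows)  # [(1, 1), (2, 1), (3, 1), (4, 4), (5, 4)]
```

Each iteration walks one step further along the ground_id chain, so a chain of n revisions needs n - 1 recursive steps.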

Related

SQL query to join same table multiple times

I have a scenario where I need to join the same table multiple times to get the desired output. For example, I have two tables, TABLE A and TABLE B.
Step 1: Take all the parties from TABLE A that have the lowest idate. The lowest idate is determined per party, using the partyid and idate columns.
Step 2: Based on the CID fetched from TABLE A in step 1, fetch the corresponding MID from TABLE B with MIDTYPE = 130300.
Step 3: Based on the MID fetched in step 2, go through the same table again, find the latest record for that MID by idate in TABLE B, and fetch its CID.
Step 4: For that CID, fetch the MID value for MIDTYPE 130307 in the same table (TABLE B). The final output should be the combination of the MID from step 3 and the MID fetched for 130307 in step 4.
I wrote the query below, but it takes a long time to run because it goes through the same table (TABLE B) multiple times, and TABLE B has millions of rows. Is there a way to rewrite this query differently? Could someone help me with this?
SELECT
    ident.mid mid1,
    b.mid mid2
FROM
    (
        SELECT
            *
        FROM
            tableb
        WHERE
            midtype = 130307
    ) ident
    INNER JOIN (
        SELECT
            s.cid,
            s.mid,
            s.midtype
        FROM
            (
                SELECT
                    cid,
                    partyid,
                    admin_sys_tp_cd,
                    mid,
                    ilast
                FROM
                    (
                        SELECT
                            cq.cid,
                            RANK() OVER(
                                PARTITION BY cq.partyid
                                ORDER BY
                                    cq.idate ASC
                            ) rnk,
                            cq.idate,
                            cq.partyid,
                            i.mid,
                            i.idate AS ilast
                        FROM
                            tablea cq
                            INNER JOIN tableb i ON cq.cid = i.cid
                            INNER JOIN tablec c ON i.cid = c.cid
                        WHERE
                            i.midtype = 130300
                    )
                WHERE
                    rnk = 1
            ) a
            INNER JOIN (
                SELECT
                    *
                FROM
                    (
                        SELECT
                            cid,
                            mid,
                            midtype,
                            RANK() OVER(
                                PARTITION BY mid
                                ORDER BY
                                    idate DESC
                            ) rnk_mpid
                        FROM
                            tableb
                    )
                WHERE
                    rnk_mpid = 1
            ) s ON a.mid = s.mid
            AND s.midtype = 130300
    ) b ON ident.cid = b.cid
    AND ident.midtype = 130307
This is not what you asked, but before others and I spend time trying out different approaches for you, let's make sure the basics are covered.
No matter how you write an SQL query, it will never perform well against a table with millions of rows if you don't have the proper indexes for it. That is especially true in your case, as you have to access the table at least 3 times.
Just by looking at your detailed steps, I would say that you should have at least 3 different indexes to support this query:
TableA_Index1 (PARTYID, IDATE, INCLUDES CID)
TableB_Index1 (CID, MIDTYPE, INCLUDES MID)
TableB_Index2 (MID, IDATE, INCLUDES CID)
Do you have them?
Have you tried running this query through the Db2 Design Advisor (db2advis) to get recommended indexes for it?
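To illustrate what such indexes buy you, here is a small sketch using Python's built-in sqlite3 module rather than Db2. The column lists are assumptions based on the steps in the question, and because SQLite has no INCLUDE clause the "included" column is simply appended as a trailing key column. The point is only to show a lookup being answered from the index alone, without touching the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- assumed, simplified versions of the tables from the question
CREATE TABLE tablea (cid INTEGER, partyid INTEGER, idate TEXT);
CREATE TABLE tableb (cid INTEGER, mid INTEGER, midtype INTEGER, idate TEXT);

-- the three suggested indexes, with the included column as a trailing key
CREATE INDEX tablea_index1 ON tablea (partyid, idate, cid);
CREATE INDEX tableb_index1 ON tableb (cid, midtype, mid);
CREATE INDEX tableb_index2 ON tableb (mid, idate, cid);
""")

# Step 2 of the question: find the MID for a given CID and MIDTYPE.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT mid FROM tableb WHERE cid = ? AND midtype = ?",
    (1, 130300),
).fetchall()

# The human-readable plan text is the fourth column of each row.
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

On Db2 the equivalent check would be the access plan from the explain facility; the exact plan wording above is specific to SQLite.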

removing non-matching rows based on order in SQL within a WITH statement

I've got a many-to-many setup with items and item names (keyed by languageID).
I want to retrieve all names for a given id, where a name is replaced with an alternate name (same itemID, but different languageID) when it is NULL.
I've set up a table that holds every combination of item ids and item names, even the missing ones, with a hasName flag set to 0, 1 or 2 depending on whether a name exists: 0 means both the languageId and the name exist, 1 means only a name exists, and 2 means neither. I then sort the results with ORDER BY itemId, hasName, languageId. This works well enough, because the top row for every itemId meets the criteria, and I can just pull that.
However, I still need to run other queries against the result, and that is where it falls apart: as soon as I put it in a WITH statement, the ORDER BY can no longer be used, which breaks the approach.
What I'm using instead is a join that selects the top 1 matching row from the ordered table.
The problem there is that the execution time goes up 10x.
Any ideas what else I could try?
I'm using SQL Server 10.50.
The slow query:
SELECT
    *,
    (SELECT top 1 ItemName
     FROM ItemNameMultiLang x
     WHERE x.ItemId = tc.ItemId
     ORDER BY ItemID, hasName, LangID) AS ItemName
FROM ItemCategories tc
ORDER BY ItemId
One way to approach this is with row_number(), so you can get the first row per item from ItemNameMultiLang, which is what you want:
SELECT tc.*, inml.ItemName
FROM ItemCategories tc left outer join
     (select inml.*,
             row_number() over (partition by inml.ItemId order by inml.hasName, inml.LangID) as seqnum
      from ItemNameMultiLang inml
     ) inml
     on tc.ItemId = inml.ItemId and
        inml.seqnum = 1
ORDER BY tc.ItemId;
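A quick way to try the row_number() approach without a SQL Server instance is Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The Category column and all rows below are hypothetical sample data; only the table and column names ItemId, LangID, hasName and ItemName come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ItemCategories (ItemId INTEGER, Category TEXT);
CREATE TABLE ItemNameMultiLang (ItemId INTEGER, LangID INTEGER, hasName INTEGER, ItemName TEXT);

-- hypothetical sample data
INSERT INTO ItemCategories VALUES (1, 'tools'), (2, 'toys'), (3, 'misc');
INSERT INTO ItemNameMultiLang VALUES
    (1, 1, 0, 'Hammer'),   -- requested language, name present
    (1, 2, 1, 'Marteau'),  -- alternate language
    (2, 1, 2, NULL),       -- requested language, name missing
    (2, 2, 1, 'Poupee');   -- alternate language supplies the fallback
""")

rows = conn.execute("""
SELECT tc.ItemId, inml.ItemName
FROM ItemCategories tc
LEFT OUTER JOIN (
    -- number the candidate names per item, best candidate first
    SELECT n.*,
           ROW_NUMBER() OVER (PARTITION BY n.ItemId
                              ORDER BY n.hasName, n.LangID) AS seqnum
    FROM ItemNameMultiLang n
) inml ON tc.ItemId = inml.ItemId AND inml.seqnum = 1
ORDER BY tc.ItemId
""").fetchall()

print(rows)  # [(1, 'Hammer'), (2, 'Poupee'), (3, None)]
```

Item 3 has no name rows at all, so the left outer join keeps it with a NULL name instead of dropping it.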

Comparing a list of values

For example, I have a head table with one column, id, and a position table with id, head-id (a reference to the head table, 1 to N), and a value. Now I select one row in the head table, say id 1. I look into the position table and find 2 rows that reference this head and have the values 1337 and 1338. Now I want to select all heads that also have 2 positions with these values, 1337 and 1338. The position ids are not the same, only the values, because it is not an M to N relation. Can anyone tell me an SQL statement for this? I have no idea how to get it done :/
Assuming that the value is not repeated for a given headid in the position table, and that it is never NULL, then you can do this using the following logic. Do a full outer join on the position table to the specific head positions you care about. Then check whether there is a full match.
The following query does this:
select *
from (select p.headid,
             sum(case when p.value is not null then 1 else 0 end) as pmatches,
             sum(case when ref.value is not null then 1 else 0 end) as refmatches
      from (select p.value
            from position p
            where p.headid = <whatever>
           ) ref full outer join
           position p
           on p.value = ref.value and
              p.headid <> ref.headid
     ) t
where t.pmatches = t.refmatches
If you do have NULLs in the values, you can accommodate these using coalesce. If you have duplicates, you need to specify more clearly what to do in this case.
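For experimenting, here is a sketch of the same "count the matches" idea as a runnable script using Python's built-in sqlite3 module. It is a variant, not the query above: it uses a left join plus GROUP BY and HAVING instead of a full outer join, and it compares the match count against the number of reference values. The table name pos, its columns and the sample rows are hypothetical, and the same assumptions apply (no repeated values per head, no NULLs).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pos (id INTEGER PRIMARY KEY, head_id INTEGER, value INTEGER);

-- hypothetical sample data
INSERT INTO pos (head_id, value) VALUES
    (1, 1337), (1, 1338),              -- the reference head
    (2, 1337), (2, 1338),              -- same set: should match
    (3, 1337),                         -- subset only: no match
    (4, 1337), (4, 1338), (4, 1339);   -- superset: no match
""")

sql = """
SELECT p.head_id
FROM pos p
LEFT JOIN (SELECT value FROM pos WHERE head_id = :ref) r
       ON r.value = p.value
WHERE p.head_id <> :ref
GROUP BY p.head_id
HAVING COUNT(*) = COUNT(r.value)   -- every position of this head is in the reference set
   AND COUNT(r.value) = (SELECT COUNT(*) FROM pos WHERE head_id = :ref)  -- and none is missing
ORDER BY p.head_id
"""

matches = [row[0] for row in conn.execute(sql, {"ref": 1})]
print(matches)  # [2]
```

The first HAVING condition rejects heads with extra values, the second rejects heads that lack one of the reference values.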
Assuming you have:
Create table head
(
    id int
)
Create table pos
(
    id int,
    head_id int,
    value int
)
and you need to find duplicates by value, then I'd use:
Select distinct p.head_id, p1.head_id
from pos p
join pos p1 on p.value = p1.value and p.head_id <> p1.head_id
where p.head_id = 1
This is for a specific head_id; drop the final WHERE clause to get the matches for every head_id.

Variant use of the GROUP BY clause in TSQL

Imagine the following schema and sample data (SQL Server 2008):
OriginatingObject
----------------------------------------------
ID
1
2
3
ValueSet
----------------------------------------------
ID  OriginatingObjectID  DateStamp
1   1                    2009-05-21 10:41:43
2   1                    2009-05-22 12:11:51
3   1                    2009-05-22 12:13:25
4   2                    2009-05-21 10:42:40
5   2                    2009-05-20 02:21:34
6   1                    2009-05-21 23:41:43
7   3                    2009-05-26 14:56:01
Value
----------------------------------------------
ID  ValueSetID  Value
1   1           28
etc (a set of rows for each related ValueSet)
I need to obtain the ID of the most recent ValueSet record for each OriginatingObject. Do not assume that the higher the ID of a record, the more recent it is.
I am not sure how to use GROUP BY properly in order to make sure the set of results grouped together to form each aggregate row includes the ID of the row with the highest DateStamp value for that grouping. Do I need to use a subquery or is there a better way?
You can do it with a correlated subquery, or using IN with multiple columns and a GROUP BY.
Please note that a simple GROUP BY only gets you the list of OriginatingObjectIDs and their latest DateStamps. To pull the relevant ValueSet IDs, the cleanest solution is to use a subquery.
Multiple-column IN with GROUP BY (probably faster, but note that SQL Server does not support row-value IN, so this form only runs on databases such as PostgreSQL or MySQL):
SELECT O.ID, V.ID
FROM OriginatingObject AS O, ValueSet AS V
WHERE O.ID = V.OriginatingObjectID
  AND (V.OriginatingObjectID, V.DateStamp) IN
      (
          SELECT OriginatingObjectID, Max(DateStamp)
          FROM ValueSet
          GROUP BY OriginatingObjectID
      )
Correlated subquery:
SELECT O.ID, V.ID
FROM OriginatingObject AS O, ValueSet AS V
WHERE O.ID = V.OriginatingObjectID
  AND V.DateStamp =
      (
          SELECT Max(DateStamp)
          FROM ValueSet V2
          WHERE V2.OriginatingObjectID = O.ID
      )
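As a portable take on the GROUP BY idea, the grouped subquery can also be joined back to ValueSet instead of being used with a multiple-column IN, which SQL Server does not accept. Below is a sketch with Python's built-in sqlite3 module, loaded with the sample ValueSet rows from the question. Storing DateStamp as ISO-formatted text is an assumption made for SQLite, where such strings compare in chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ValueSet (ID INTEGER PRIMARY KEY, OriginatingObjectID INTEGER, DateStamp TEXT);
INSERT INTO ValueSet VALUES
    (1, 1, '2009-05-21 10:41:43'),
    (2, 1, '2009-05-22 12:11:51'),
    (3, 1, '2009-05-22 12:13:25'),
    (4, 2, '2009-05-21 10:42:40'),
    (5, 2, '2009-05-20 02:21:34'),
    (6, 1, '2009-05-21 23:41:43'),
    (7, 3, '2009-05-26 14:56:01');
""")

latest = conn.execute("""
SELECT v.OriginatingObjectID, v.ID
FROM ValueSet v
JOIN (
    -- one row per object with its most recent DateStamp
    SELECT OriginatingObjectID, MAX(DateStamp) AS MaxStamp
    FROM ValueSet
    GROUP BY OriginatingObjectID
) newest ON newest.OriginatingObjectID = v.OriginatingObjectID
        AND newest.MaxStamp = v.DateStamp
ORDER BY v.OriginatingObjectID
""").fetchall()

print(latest)  # [(1, 3), (2, 4), (3, 7)]
```

Note that object 2's latest row is ID 4, not ID 5, which shows why the highest ID cannot be used as a shortcut.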
With a ranking window function:
SELECT OriginatingObjectID, id
FROM (
    SELECT id, OriginatingObjectID,
           RANK() OVER(PARTITION BY OriginatingObjectID
                       ORDER BY DateStamp DESC) AS ranking
    FROM ValueSet
) AS ranked
WHERE ranking = 1;
This can be done with a correlated subquery. No GROUP BY necessary.
SELECT
    vs.ID,
    vs.OriginatingObjectID,
    vs.DateStamp,
    v.Value
FROM
    ValueSet vs
    INNER JOIN Value v ON v.ValueSetID = vs.ID
WHERE
    NOT EXISTS (
        SELECT 1
        FROM ValueSet
        WHERE OriginatingObjectID = vs.OriginatingObjectID
          AND DateStamp > vs.DateStamp
    )
This works only if there cannot be two equal DateStamps for an OriginatingObjectID in the ValueSet table.
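To make the tie caveat concrete, here is a sketch using Python's built-in sqlite3 module (SQLite 3.25 or later for the window function). The rows are hypothetical and deliberately give object 1 two identical DateStamps. The NOT EXISTS form then returns both tied rows, while a ROW_NUMBER() variant with ID as an arbitrary tie-breaker returns exactly one row per object. Choosing the higher ID on ties is an assumption, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ValueSet (ID INTEGER PRIMARY KEY, OriginatingObjectID INTEGER, DateStamp TEXT);
-- hypothetical data: rows 1 and 2 share the same DateStamp
INSERT INTO ValueSet VALUES
    (1, 1, '2009-05-22 12:00:00'),
    (2, 1, '2009-05-22 12:00:00'),
    (3, 2, '2009-05-21 10:00:00');
""")

# NOT EXISTS keeps every row that has no strictly newer sibling, so ties survive.
not_exists_ids = [r[0] for r in conn.execute("""
SELECT vs.ID
FROM ValueSet vs
WHERE NOT EXISTS (
    SELECT 1 FROM ValueSet x
    WHERE x.OriginatingObjectID = vs.OriginatingObjectID
      AND x.DateStamp > vs.DateStamp
)
ORDER BY vs.ID
""")]

# ROW_NUMBER with a tie-breaker picks exactly one row per object.
one_per_object = conn.execute("""
SELECT OriginatingObjectID, ID
FROM (
    SELECT ID, OriginatingObjectID,
           ROW_NUMBER() OVER (PARTITION BY OriginatingObjectID
                              ORDER BY DateStamp DESC, ID DESC) AS rn
    FROM ValueSet
) AS ranked
WHERE rn = 1
ORDER BY OriginatingObjectID
""").fetchall()

print(not_exists_ids)   # [1, 2, 3]
print(one_per_object)   # [(1, 2), (2, 3)]
```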

Selecting a single (random) row for an SQL join

I've got an SQL query that selects data from several tables, but I only want to match a single (randomly selected) row from another table.
Easier to show some code, I guess ;)
Table K is (k_id, selected)
Table C is (c_id, image)
Table S is (c_id, date)
Table M is (c_id, k_id, score)
All ID columns are primary keys, with appropriate FK constraints.
What I want, in English, is for each row in K that has selected = 1 to get a random row from C where there exists a row in M with (k_id, c_id) whose score is higher than a given value, where c.image is not null, and where there is a row in S with that c_id.
Something like:
select k.k_id, c.c_id, m.score
from k, c, m, s
where k.selected = 1
  and m.score > some_value
  and m.k_id = k.k_id
  and m.c_id = c.c_id
  and c.image is not null
  and s.c_id = c.c_id;
The only problem is that this returns all the rows in C that match the criteria, and I only want one...
I can see how to do it using PL/SQL to select all relevant rows into a collection and then select a random one, but I'm stuck on how to actually select a random one.
You can use ORDER BY dbms_random.value in your query, i.e.:
SELECT column FROM
(
    SELECT column FROM table
    ORDER BY dbms_random.value
)
WHERE rownum = 1
References:
http://awads.net/wp/2005/08/09/order-by-no-order/
http://www.petefreitag.com/item/466.cfm
With analytics:
SELECT k_id, c_id, score
FROM (SELECT k.k_id, c.c_id, m.score,
             row_number() over(PARTITION BY k.k_id ORDER BY NULL) rk
      FROM k, c, m, s
      WHERE k.selected = 1
        AND m.score > some_value
        AND m.k_id = k.k_id
        AND m.c_id = c.c_id
        AND c.image IS NOT NULL
        AND s.c_id = c.c_id)
WHERE rk = 1
This will select one row that satisfies your criteria per k_id. This will likely select the same set of rows if you run the query several times. If you want more randomness (each run produces a different set of rows), you would replace ORDER BY NULL by ORDER BY dbms_random.value
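The random-per-group variant can be tried out with Python's built-in sqlite3 module (SQLite 3.25 or later). This is only a sketch: SQLite's random() stands in for Oracle's dbms_random.value, the threshold some_value = 3 and all rows are hypothetical sample data, and the implicit joins are written as explicit JOINs.

```python
import sqlite3

some_value = 3  # hypothetical score threshold

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE k (k_id INTEGER PRIMARY KEY, selected INTEGER);
CREATE TABLE c (c_id INTEGER PRIMARY KEY, image TEXT);
CREATE TABLE s (c_id INTEGER PRIMARY KEY, date TEXT);
CREATE TABLE m (c_id INTEGER, k_id INTEGER, score INTEGER);

-- hypothetical sample data
INSERT INTO k VALUES (1, 1), (2, 1), (3, 0);
INSERT INTO c VALUES (10, 'a.png'), (11, 'b.png'), (12, NULL);
INSERT INTO s VALUES (10, '2020-01-01'), (11, '2020-01-02'), (12, '2020-01-03');
INSERT INTO m VALUES (10, 1, 5), (11, 1, 7), (12, 1, 9), (10, 2, 8), (11, 2, 1), (10, 3, 9);
""")

picked = conn.execute("""
SELECT k_id, c_id, score
FROM (SELECT k.k_id, c.c_id, m.score,
             -- random order within each k_id, so rk = 1 is a random candidate
             ROW_NUMBER() OVER (PARTITION BY k.k_id ORDER BY random()) AS rk
      FROM k
      JOIN m ON m.k_id = k.k_id
      JOIN c ON c.c_id = m.c_id
      JOIN s ON s.c_id = c.c_id
      WHERE k.selected = 1
        AND m.score > ?
        AND c.image IS NOT NULL)
WHERE rk = 1
ORDER BY k_id
""", (some_value,)).fetchall()

print(picked)  # one row per selected k_id; the c_id for k_id 1 varies between runs
```

With this data, k_id 1 has two qualifying candidates (c_id 10 and 11), k_id 2 has only c_id 10, and k_id 3 is skipped because it is not selected.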
I'm not too familiar with Oracle SQL, but try using LIMIT random(), if there is such a function available.