Can I join each row of table1 to a unique row of table 2? - sql

I was hoping to query in all the rows of a table that has its ids starting at some number, and update each row of the original table with a one to one of the second table.
For example:
normal
id | fk_test_id
----------------
1 | null
2 | null
3 | null
starts_after
id |
----
12 |
13 |
14 |
What UPDATE can I use to make normal look like this:
id | fk_test_id
----------------
1 | 12
2 | 13
3 | 14
I tried:
UPDATE normal SET fk_test_id = starts_after.id FROM starts_after; which just joins on the first row of starts_after.
UPDATE normal SET fk_test_id = (SELECT id FROM starts_after ORDER BY random() LIMIT 1); Where the subquery only executes once.
Filtering the subquery by which fk_test_ids are already chosen, but it only executes on the pre-updated data.

If you added record with specific order in to starts_after you can use below query:
update normal n
set fk_test_id = tmp.id
from (select id,
row_number() over (order by id)
from starts_after) tmp
where tmp.row_number = n.id;
I ordered by id from starts_after table (ASC) and create range of record with row num:
id | row_number
----------------
12 | 1
13 | 2
14 | 3
After that i join two table and update records

Related

Update records in SQL by looking up in different table

I am copying data from few tables in SQL server A to B. I have a set of staging tables in B and need to update some of those staging tables based on updated values in final target table in B.
Example:
Server B:
StagingTable1:
ID | NAME | CITY
1 ABC XYZ
2 BCD XXX
StagingTable2:
ID | AGE | Table1ID(FK)
10 15 1
20 16 2
After Copying StagingTable1 to TargetTable1 (ID's get auto polulated and I get new ID's, now ID 1 becomes 2 and ID 2 becomes 3)
TargetTable1:
ID | NAME | CITY
1 PQR YYY (pre-existing record)
2 ABC XYZ
3 BCD XXX
So now before I can copy the StagingTable2 I need to update the Table1ID column in it by correct values from TargetTable1.
StagingTable2 should become:
ID | AGE | Table1ID(FK)
10 15 2
20 16 3
I am writing a stored procedure for this and not sure how do I lookup and update the records in staging tables?
Assuming that (name, city) tuples are unique in StagingTable1 and TargetTable1, you can use an updatable common table expression to generate the new mapping and assign the corresponding values:
with cte as (
select st2.Table1ID, tt1.id
from StagingTable2 st2
inner join StagingTable1 st1 on st1.ID = st2.Table1ID
inner join TargetTable1 tt1 on tt1.name = st1.name and tt1.city = st1.city
)
update cte set Table1ID = id
Demo on DB Fiddle - content of StagingTable2 after the update:
id | age | Table1ID
-: | --: | -------:
10 | 15 | 2
20 | 16 | 3

pulling data from max field

I have a table structure with columns similar to the following:
ID | line | value
1 | 1 | 10
1 | 2 | 5
2 | 1 | 6
3 | 1 | 7
3 | 2 | 4
ideally, i'd like to pull the following:
ID | value
1 | 5
2 | 6
3 | 4
one solution would be to do something like the following:
select a.ID, a.value
from
myTable a
inner join (select id, max(line) as line from myTable group by id) b
on a.id = b.id and a.line = b.line
Given the size of the table and that this is just a part of a larger pull, I'd like to see if there's a more elegant / simpler way of pulling this directly.
This is a task for OLAP-functions:
select *
from myTable a
qualify
rank() -- assign a rank for each id
over (partition by id
order by line desc) = 1
Might return multiple rows per id if they share the same max line. If you want to return only one of them, add another column to the order by to make it unique or switch to row_number to get an indeterminate row.

SQL advanced query counting the max value of a group

I want to create a query that will count the number of times the following condition is met.
I have a table that consists of multiple records with a matching foreign key. I want to check only for each of the foreign key groups if the highest value of another column of that key occurs more than once. If it does that will up the count.
Data
--------------------------
ID | Foreign Key | Value
--------------------------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 2
4 | 2 | 0
5 | 2 | 2
6 | 2 | 1
7 | 3 | 0
8 | 3 | 1
9 | 3 | 1
The query I want should return the number 2. This is because the maximum value in group 1(Foreign Key) occurs twice, the value is 2. In group 2 the maximum value is 2 but only occurs once so this will not up the count. Then in group 3 the maximum value is 1 which occurs twice which will up the count. The count therefore ends up as two.
All credit goes to the comment from #Bob, but here is the sql that solved this problem.
SELECT Count(1)
FROM (SELECT DISTINCT foreign_key
FROM (SELECT foreign_key,
Count(1)
FROM data
WHERE ( foreign_key, value ) IN (SELECT foreign_key,
Max(value)
FROM data
GROUP BY foreign_key)
GROUP BY foreign_key
HAVING Count(1) > 1) AS data) AS data;
This is one approach:
select max(num_at_max)
from (select t.*, count(val) over(partition by fk) as num_at_max
from tbl t
join (select max(max_val_by_grp) as max_val_all_grps
from (select fk, max(val) as max_val_by_grp
from tbl
group by fk) x) x
on t.val = x.max_val_all_grps) x

Delete rows except for one for every id

I have a dataset with multiple ids. For every id there are multiple entries. Like this:
--------------
| ID | Value |
--------------
| 1 | 3 |
| 1 | 4 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 3 | 3 |
| 3 | 5 |
--------------
Is there a SQL DELETE query to delete (random) rows for every id, except for one (random rows would be nice but is not essential)? The resulting table should look like this:
--------------
| ID | Value |
--------------
| 1 | 2 |
| 2 | 1 |
| 3 | 5 |
--------------
Thanks!
It doesn't look like hsqldb fully supports olap functions (in this case row_number() over (partition by ...), so you'll need to use a derived table to identify the one value you want to keep for each ID. It certainly won't be random, but I don't think anything else will be either. Something like so
This query will give you the first part:
select
id,
min(value) as minval
from
group by id
Then you can delete from your table where you don't match:
delete from
<your table> t1
inner join
(
select
id,
min(value) as minval
from
<your table>
group by id
) t2
on t1.id = t2.id
and t1.value <> t2.value
Try this:
alter ignore table a add unique(id);
Here a is the table name
This should do what you want:
SELECT ID, Value
FROM (SELECT ID, Value, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NEWID()) AS RN
FROM #Table) AS A
WHERE A.RN = 1
I tried the given answers with HSQLDB but it refused to execute those queries for different reasons (join is not allowed in delete query, ignore statement is not allowed in alter query). Thanks to Andrew I came up with this solution (which is a little bit more circumstantial, but allows it to delete random rows):
Add a new column for random values:
ALTER TABLE <table> ADD COLUMN rand INT
Fill this column with random data:
UPDATE <table> SET rand = RAND() * 1000000
Delete all rows which don't have the minimum random value for their id:
DELETE FROM <table> WHERE rand NOT IN (SELECT MIN(rand) FROM <table> GROUP BY id)
Drop the random column:
ALTER TABLE <table> DROP rand
For larger tables you probably should ensure that the random values are unique, but this worked perfectly for me.

Select multiple distinct rows from table SQL

I am attempting to select distinct (last updated) rows from a table in my database. I am trying to get the last updated row for each "Sub section". However I cannot find a way to achieve this.
The table looks like:
ID | Name |LastUpdated | Section | Sub |
1 | Name1 | 2013-04-07 16:38:18.837 | 1 | 1 |
2 | Name2 | 2013-04-07 15:38:18.837 | 1 | 2 |
3 | Name3 | 2013-04-07 12:38:18.837 | 1 | 1 |
4 | Name4 | 2013-04-07 13:38:18.837 | 1 | 3 |
5 | Name5 | 2013-04-07 17:38:18.837 | 1 | 3 |
What I am trying to get my SQL Statement to do is return rows:
1, 2, and 5.
They are distinct for the Sub, and the most recent.
I have tried:
SELECT DISTINCT Sub, LastUpdated, Name
FROM TABLE
WHERE LastUpdated = (SELECT MAX(LastUpdated) FROM TABLE WHERE Section = 1)
Which only returns the distinct row for the most recent updated Row. Which makes sense.
I have googled what I am trying, and checked relevant posts on here. However not managed to find one which really answers what I am trying.
You can use the row_number() window function to assign numbers for each partition of rows with the same value of Sub. Using order by LastUpdated desc, the row with row number one will be the latest row:
select *
from (
select row_number() over (
partition by Sub
order by LastUpdated desc) as rn
, *
from YourTable
) as SubQueryAlias
where rn = 1
Wouldn't it be enough to use group by?
SELECT DISTINCT MIN(Sub), MAX(LastUpdated), MIN(NAME) FROM TABLE GROUP BY Sub Where Section = 1