I have a table (Fruits) with the following columns:
Fruit_Name(varchar2(10)) | IsDuplicate Number(1)
Mango 0
Orange 0
Mango 0
What I have to do is update the IsDuplicate column to 1 for the first row of each distinct Fruit_Name, i.e.
Fruit_Name(varchar2(10)) | IsDuplicate Number(1)
Mango 1
Orange 1
Mango 0
How should I do this?
This should do it, as far as I can tell:
update fruits
   set IsDuplicate =
   (
     -- a name that occurs only once (such as Orange) gets 0 with this test;
     -- test only row_num = 1 if it should be flagged as well
     select case
              when dupe_count > 1 and row_num = 1 then 1
              else 0
            end as is_dupe
     from (
       select f2.fruit_name,
              count(*) over (partition by f2.fruit_name) as dupe_count,
              row_number() over (partition by f2.fruit_name order by f2.fruit_name) as row_num,
              f2.rowid as row_id
       from fruits f2
     ) ft
     where ft.row_id = fruits.rowid
       and ft.fruit_name = fruits.fruit_name
   )
Edit
But instead of actually updating the table, why don't you create a view that returns the information? Depending on the size of the table, it might be more efficient.
create view fruit_dupe_view
as
select fruit_name,
       case
         when dupe_count > 1 and row_num = 1 then 1
         else 0
       end as is_duplicate
from (
  select fruit_name,
         count(*) over (partition by fruit_name) as dupe_count,
         row_number() over (partition by fruit_name order by fruit_name) as row_num
  from fruits
) ft
Straight and simple -- you can't. Not with vanilla SQL. SQL is a set-based processing language, and you do things in sets. There is no way for SQL to know which one of your many Mangos should be tagged 1. You can probably tag one of them with 1 using windowing functions, ROWNUM, etc. in a SELECT, but I don't think it can be done with an UPDATE.
In other words, your table lacks a unique key in the first place, so it is not something that SQL is designed to process.
However, you may try adding a sequential primary key to each row. Then you can easily write an UPDATE query that sets to 1 every row with COUNT > 1 and key = MIN(key).
More generally, you really have to look at your database design. Relational databases are not supposed to contain "duplicates". The fact that you need to mark something as a duplicate means that your tables are designed wrong in the first place. The database should not even allow duplicates to enter its data.
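The suggestion above (add a sequential key, then flag the row with the smallest key per name) can be sketched roughly as follows. This is only an illustration, assuming SQLite driven from Python's sqlite3 module rather than the Oracle database in the question; the key column fruit_id, the snake_case column names and the function name are hypothetical. It flags the first row of every name, which is the output the question asks for; to flag only names that occur more than once, a COUNT(*) > 1 condition would have to be added.

```python
import sqlite3


def flag_first_occurrences(names):
    """Insert the names, flag the first row of each name, return (name, flag) pairs."""
    con = sqlite3.connect(":memory:")
    # fruit_id is the hypothetical sequential key that the original table lacks
    con.execute(
        "CREATE TABLE fruits ("
        " fruit_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " fruit_name TEXT NOT NULL,"
        " is_duplicate INTEGER NOT NULL DEFAULT 0)"
    )
    con.executemany(
        "INSERT INTO fruits (fruit_name) VALUES (?)", [(n,) for n in names]
    )
    # Set the flag on the row that holds the smallest key within its name
    con.execute(
        "UPDATE fruits SET is_duplicate = 1 "
        "WHERE fruit_id = (SELECT MIN(f2.fruit_id) FROM fruits f2 "
        "                  WHERE f2.fruit_name = fruits.fruit_name)"
    )
    rows = con.execute(
        "SELECT fruit_name, is_duplicate FROM fruits ORDER BY fruit_id"
    ).fetchall()
    con.close()
    return rows


print(flag_first_occurrences(["Mango", "Orange", "Mango"]))
```

With a key in place, "which Mango" has a well-defined answer: the one that was inserted first.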
Related
In our databases we have a table called conditions which references a table called attributes.
So it looks like this (ignoring some other columns that aren't relevant to the question)
id | attribute_id | execution_index
-----------------------------------
1  | 1000         | 1
2  | 1000         | 2
3  | 1000         | 1
4  | 2000         | 1
5  | 2000         | 2
6  | 2000         | 2
In theory the combination of attribute_id and execution_index should always be unique, but in practice they're not, and the software ends up essentially using the id to decide which comes first between two conditions with the same execution index. We want to add a uniqueness constraint to the table, but before we do that we need to update the execution indexes. So essentially we want to group them by attribute_id, order them by execution_index then id, and give them new execution indexes so that it becomes
id | attribute_id | execution_index
-----------------------------------
1  | 1000         | 1
2  | 1000         | 3
3  | 1000         | 2
4  | 2000         | 1
5  | 2000         | 2
6  | 2000         | 3
I'm not sure how to do this without just ordering by attribute_id, execution_index, id and then iterating through incrementing the execution_index by 1 each time and resetting it to be 1 whenever the attribute_id changes. (That would work but it'd be slow and someone is going to have to run this script on several dozen databases so I'd rather it didn't take more than a couple of seconds per database.)
Really I'd like to do something along the lines of
UPDATE c
SET c.execution_index = [this needs to be the index within the group somehow]
FROM condities c
GROUP BY c.attribute_id
ORDER BY c.execution_index asc, c.id asc
But I don't know how to make that actually work.
It looks like you can use an updatable CTE:
with cte as (
select *,
Row_Number() over(partition by attribute_id order by execution_index, id) new
from conditions
)
update cte set execution_index = new
I would suggest adding a new column and first updating that and checking the results are as expected.
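The "write to a new column first and check" suggestion can be tried out on the sample data with a small script. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), SQLite does not support updating through a CTE, so a correlated subquery is used instead, and the column name new_index and the function name are made up for the illustration.

```python
import sqlite3


def renumber_into_new_column(rows):
    """rows: (id, attribute_id, execution_index) tuples.

    Returns ([(id, attribute_id, new_index), ...], number_of_duplicate_pairs).
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE conditions ("
        " id INTEGER PRIMARY KEY,"
        " attribute_id INTEGER,"
        " execution_index INTEGER)"
    )
    con.executemany("INSERT INTO conditions VALUES (?, ?, ?)", rows)
    # new_index is a hypothetical scratch column; execution_index stays untouched
    con.execute("ALTER TABLE conditions ADD COLUMN new_index INTEGER")
    con.execute(
        """
        UPDATE conditions
        SET new_index = (
            SELECT ranked.rn
            FROM (
                SELECT c2.id,
                       ROW_NUMBER() OVER (
                           PARTITION BY c2.attribute_id
                           ORDER BY c2.execution_index, c2.id
                       ) AS rn
                FROM conditions c2
            ) AS ranked
            WHERE ranked.id = conditions.id
        )
        """
    )
    # Check: no (attribute_id, new_index) pair may occur more than once
    duplicates = con.execute(
        "SELECT COUNT(*) FROM ("
        " SELECT attribute_id, new_index FROM conditions"
        " GROUP BY attribute_id, new_index HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    result = con.execute(
        "SELECT id, attribute_id, new_index FROM conditions ORDER BY id"
    ).fetchall()
    con.close()
    return result, duplicates


sample = [(1, 1000, 1), (2, 1000, 2), (3, 1000, 1),
          (4, 2000, 1), (5, 2000, 2), (6, 2000, 2)]
print(renumber_into_new_column(sample))
```

Once the duplicate count is 0 and the values look right, the new column can be copied over execution_index and dropped.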
WITH cte AS
(
SELECT
*,
ROW_NUMBER() OVER
(
PARTITION BY attribute_id
ORDER BY execution_index, id
) AS RowNum
FROM condities
)
UPDATE cte
SET execution_index = RowNum
I've been trying to find the answer to this without success. This may be because I don't know the right term to search for.
I've looked at some flattening suggestions using Pivot or Cross Apply but none of the examples I've looked at seem to do exactly what I want.
I basically need the most efficient way to flatten records in a table (because it's over a table with millions of records).
(Please excuse the formatting, I wasn't sure of the best way to make it readable).
Here's an example of what the original records look like :-
MainId--ID1--ID2--ID3
20241--0--2881--0
20241--0--2871--0
20241--0--2884--0
20241--1580--0--0
20241--1588--0--0
20241--0--0--1205
20241--0--0--1001
20241--0--0--1268
20241--0--0--1311
And here is what I need them to end up like :-
MainId--ID1--ID2--ID3
20241--1580--2881--1205
20241--1588--2871--1001
20241--0--2884--1268
20241--0--0--1311
So, I just need them to be the fewest number of records for each MainId.
It actually doesn't matter which IDs are in which record as there is no relation between them. They just need to be related to the correct MainId.
NB. This is a simplified example. The table in question can actually have up to 10 different ID columns, but if I get it working for 3 IDs, I should be able to extend this out to more.
Thanks in advance.
Regards,
Jason
This is a little tricky. But you can do it using union all and aggregation. However, you need a column that specifies the ordering if you care about the actual order in the results:
select main_id,
       max(id1) as id1, max(id2) as id2, max(id3) as id3
from ((select t.*, row_number() over (partition by main_id order by ?) as seqnum
       from t
       where id1 <> 0
      ) union all
      (select t.*, row_number() over (partition by main_id order by ?) as seqnum
       from t
       where id2 <> 0
      ) union all
      (select t.*, row_number() over (partition by main_id order by ?) as seqnum
       from t
       where id3 <> 0
      )
     ) t
group by main_id, seqnum;
This enumerates the valid values for each column and then combines them on a single row. The ? is for the ordering column. If you don't care about the ordering just use main_id.
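Note: As written, each branch selects t.*, so a column that has run out of values for a given seqnum comes out as 0. If each branch selects only its own column (and NULL for the others), those gaps come out as NULLs instead -- which seems useful to me. You can use coalesce() if you really want them as 0s.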
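To see the union all plus aggregation approach run on the question's sample rows, here is a rough sketch. It assumes SQLite via Python's sqlite3 module (SQLite 3.25 or later for window functions), a hypothetical row_no column standing in for the ? ordering column, and a variation in which each branch selects only its own ID column with NULL for the others, so that coalesce() decides how gaps are shown.

```python
import sqlite3

FLATTEN_SQL = """
SELECT main_id,
       COALESCE(MAX(id1), 0) AS id1,
       COALESCE(MAX(id2), 0) AS id2,
       COALESCE(MAX(id3), 0) AS id3
FROM (
    SELECT main_id, id1, NULL AS id2, NULL AS id3,
           ROW_NUMBER() OVER (PARTITION BY main_id ORDER BY row_no) AS seqnum
    FROM t WHERE id1 <> 0
    UNION ALL
    SELECT main_id, NULL, id2, NULL,
           ROW_NUMBER() OVER (PARTITION BY main_id ORDER BY row_no)
    FROM t WHERE id2 <> 0
    UNION ALL
    SELECT main_id, NULL, NULL, id3,
           ROW_NUMBER() OVER (PARTITION BY main_id ORDER BY row_no)
    FROM t WHERE id3 <> 0
) AS u
GROUP BY main_id, seqnum
ORDER BY main_id, seqnum
"""


def flatten(rows):
    """rows: (main_id, id1, id2, id3) tuples; insertion order defines row_no."""
    con = sqlite3.connect(":memory:")
    # row_no is a hypothetical ordering column (the "?" in the answer's query)
    con.execute(
        "CREATE TABLE t (row_no INTEGER PRIMARY KEY AUTOINCREMENT,"
        " main_id INTEGER, id1 INTEGER, id2 INTEGER, id3 INTEGER)"
    )
    con.executemany(
        "INSERT INTO t (main_id, id1, id2, id3) VALUES (?, ?, ?, ?)", rows
    )
    result = con.execute(FLATTEN_SQL).fetchall()
    con.close()
    return result


sample = [
    (20241, 0, 2881, 0), (20241, 0, 2871, 0), (20241, 0, 2884, 0),
    (20241, 1580, 0, 0), (20241, 1588, 0, 0),
    (20241, 0, 0, 1205), (20241, 0, 0, 1001), (20241, 0, 0, 1268),
    (20241, 0, 0, 1311),
]
for row in flatten(sample):
    print(row)
```

Each branch numbers the non-zero values of one column, and the outer GROUP BY lines up the values that share a sequence number. Extending this to 10 ID columns means adding one branch per column.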
In SQL, datasets / tables don't have an implicit order. There isn't a "first" record, unless you can enforce an order using an ORDER BY clause.
This means that there's currently no way to determine that 2881 should be in the same row as 1580 (2881 could just as easily be put in the same row as 1588.)
To resolve this, I'm going to assume you actually have an additional column:
MainId SubID ID1 ID2 ID3
20241 1 0 2881 0
20241 2 0 2871 0
20241 3 0 2884 0
20241 1 1580 0 0
20241 2 1588 0 0
20241 1 0 0 1205
20241 2 0 0 1001
20241 3 0 0 1268
20241 4 0 0 1311
Then it's a matter of aggregating the rows together...
SELECT
MainID,
SubID,
COALESCE(MAX(ID1), 0) AS ID1,
COALESCE(MAX(ID2), 0) AS ID2,
COALESCE(MAX(ID3), 0) AS ID3
FROM
yourTable
GROUP BY
MainID,
SubID
I want to do something like this.
This works:
Select ID, number, cost from table order by number
The same number can occur two or more times, always with the same cost:
1 A33 66.50
2 A34 73.50
3 A34 73.50
But I want to have
1 A33 66.50
2 A34 73.50
3 A34 0
I want the SQL to change the repeated cost to 0.
I tried DISTINCT and if/then/else.
I want to do something like this:
declare #oldcost int;
Select ID, number,
if(cost=#oldcost) then
cost=0;
else
cost=cost;
end if
#oldcost=cost;
from table order by number
How can I do it in SQL?
You can use window functions and a case expression:
select ID, number,
(case when row_number() over (partition by number order by id) = 1
then cost else 0
end) as cost
from table
order by number, id;
Note that SQL generally does not take ordering into account, so results can be returned in any order -- and even with an order by, rows with the same keys can be in any order (and in different orders on different executions).
Hence, the order by includes id as well as number so you get the cost on the "first" row for each number.
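The window function plus case expression can be checked against the question's three rows with a short script. This is a sketch assuming SQLite through Python's sqlite3 module (SQLite 3.25 or later); the table name items is hypothetical, since the question only calls it "table".

```python
import sqlite3


def zero_repeated_costs(rows):
    """rows: (id, number, cost) tuples. Keeps cost on the first row per number."""
    con = sqlite3.connect(":memory:")
    # "items" is a stand-in name for the unnamed table in the question
    con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, number TEXT, cost REAL)")
    con.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT id, number,
               CASE WHEN ROW_NUMBER() OVER (PARTITION BY number ORDER BY id) = 1
                    THEN cost ELSE 0
               END AS cost
        FROM items
        ORDER BY number, id
        """
    ).fetchall()
    con.close()
    return result


for row in zero_repeated_costs([(1, "A33", 66.5), (2, "A34", 73.5), (3, "A34", 73.5)]):
    print(row)
```

Ordering the window by id is what makes "first row" well defined; without it, either A34 row could be the one that keeps its cost.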
I have the following table:
Id Son RowOrder Technology
1 8 NULL fa
8 0 NULL fa
9 15 NULL gr
15 0 NULL gr
I would like to create an SQL query that does an "order by" in the following order: technology, "father" (a record that has a son), son (the direct son of the previous "father"),
and then updates the RowOrder column so that next time I can order these records solely based on RowOrder.
Any ideas?
Thanks
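If you do not want the data to change, then instead of storing the row number, just compute it in the query:
SELECT
    id,
    son,
    technology,
    ISNULL(father, 0) AS father,
    ROW_NUMBER() OVER (
        ORDER BY technology,
                 ISNULL(father, id),                         -- keeps each father/son pair together
                 CASE WHEN father IS NULL THEN 0 ELSE 1 END  -- father before son
    ) AS RowNumber
FROM (
    SELECT t1.*,
           -- the father of a row is the row whose son column points at it
           (SELECT t2.id FROM table t2 WHERE t2.son = t1.id) AS father
    FROM table t1
) t
ORDER BY
    RowNumber ASC
The functions ISNULL() and ROW_NUMBER() may have different names depending on which database you are using, but that is a rough idea of what you should go for.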
I have a table like so
Id | Type | Value
--------------------
0 | Big | 2
1 | Big | 3
2 | Small | 3
3 | Small | 3
I would like to get a table like this
Type | Last Value
--------------------
Small | 3
Big | 3
How can I do this? I understand there is an SQL Server function called LAST_VALUE(...) OVER (...), but I can't get this to work with GROUP BY.
I've also tried using SELECT MAX(ID) & SELECT TOP 1.. but this seems a bit inefficient since there would be a subquery for each value. The queries take too long when the table has a few million rows in it.
Is there a way to quickly get the last value for these, perhaps using LAST_VALUE?
You can do it using row_number():
select
    type,
    value
from
(
    select
        type,
        value,
        row_number() over (partition by type order by id desc) as RN
    from your_table  -- placeholder: the question does not name the table
) TMP
where RN = 1
Can't test this now since SQL Fiddle doesn't seem to work, but hopefully that's ok.
The most efficient method might be not exists, which uses an anti-join for the underlying operator:
select type, value
from likeso l
where not exists (select 1 from likeso l2 where l2.type = l.type and l2.id > l.id)
For performance, you want an index on likeso(type, id).
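As a quick sanity check that the not exists form returns the same rows as the row_number() form, both can be run side by side on the sample data. This is only a sketch, assuming SQLite via Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server; the index name is made up, and it says nothing about relative speed on a table with millions of rows.

```python
import sqlite3


def last_value_per_type(rows):
    """rows: (id, type, value) tuples. Returns (not_exists_result, row_number_result)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE likeso (id INTEGER PRIMARY KEY, type TEXT, value INTEGER)")
    # The index suggested in the answer; the name likeso_type_id is hypothetical
    con.execute("CREATE INDEX likeso_type_id ON likeso (type, id)")
    con.executemany("INSERT INTO likeso VALUES (?, ?, ?)", rows)
    # Keep a row only if no row of the same type has a larger id
    via_not_exists = con.execute(
        """
        SELECT l.type, l.value
        FROM likeso l
        WHERE NOT EXISTS (
            SELECT 1 FROM likeso l2
            WHERE l2.type = l.type AND l2.id > l.id
        )
        ORDER BY l.type
        """
    ).fetchall()
    # Number the rows of each type from the highest id down and keep number 1
    via_row_number = con.execute(
        """
        SELECT type, value
        FROM (
            SELECT type, value,
                   ROW_NUMBER() OVER (PARTITION BY type ORDER BY id DESC) AS rn
            FROM likeso
        ) AS ranked
        WHERE rn = 1
        ORDER BY type
        """
    ).fetchall()
    con.close()
    return via_not_exists, via_row_number


print(last_value_per_type([(0, "Big", 2), (1, "Big", 3), (2, "Small", 3), (3, "Small", 3)]))
```

On real data, comparing the execution plans of the two queries is the way to find out which one the optimizer handles better.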
I really wonder if there is a more efficient solution, but I use the following query for such needs:
Select Id, Type, Value
From ( Select *, Max (Id) Over (Partition By Type) As LastId
From #Table) T
Where Id = LastId