How to update distinct column in SQL Server 2005? - sql

In my table I have a column Segment_No which has duplicates. Now I want to update those duplicates.
For example: the value 249X5601 is present in two rows. I want to change the second value to 249X5601R.

The general form would be as follows:
;with AllRows as (
select Donation_ID,Registration_No,Segment_No,
ROW_NUMBER() OVER (
PARTITION BY Segment_No
order by <Suitable_Column>
) as rn
from UnnamedTable
)
update AllRows set Segment_no = <New_Value>
where rn > 1
Where <Suitable_Column> gives the column(s) that define which row is "first" and which is second. <New_Value> defines how the new Segment_no value should be computed, and rn gives the row numbers, so the where clause skips the "first" row in each group.
So if there are only ever a max of two rows sharing a single Segment_no value, and the "first" is the one with the lowest Donation_ID value, then it would be:
;with AllRows as (
select Donation_ID,Registration_No,Segment_No,
ROW_NUMBER() OVER (
PARTITION BY Segment_No
order by Donation_ID
) as rn
from UnnamedTable
)
update AllRows set Segment_no = Segment_no + 'R'
where rn > 1
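For readers without a SQL Server instance at hand, the same idea can be tried out as a rough sketch using Python's built-in sqlite3 module. This is an illustration under assumptions, not the answer's exact statement: SQLite cannot update through a CTE, so the row numbers are computed in a subquery and matched back by Donation_ID (assumed unique here), string concatenation uses || instead of +, and window functions need SQLite 3.25 or newer. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UnnamedTable (Donation_ID INTEGER PRIMARY KEY, Segment_No TEXT)"
)
# Hypothetical sample data: 249X5601 appears twice.
conn.executemany(
    "INSERT INTO UnnamedTable VALUES (?, ?)",
    [(1, "249X5601"), (2, "249X5601"), (3, "300A1000")],
)

# Number the rows inside each Segment_No group, then append 'R'
# to every row that is not the first one in its group.
conn.execute("""
    UPDATE UnnamedTable
    SET Segment_No = Segment_No || 'R'
    WHERE Donation_ID IN (
        SELECT Donation_ID FROM (
            SELECT Donation_ID,
                   ROW_NUMBER() OVER (
                       PARTITION BY Segment_No
                       ORDER BY Donation_ID
                   ) AS rn
            FROM UnnamedTable
        )
        WHERE rn > 1
    )
""")

rows_after_update = conn.execute(
    "SELECT Donation_ID, Segment_No FROM UnnamedTable ORDER BY Donation_ID"
).fetchall()
print(rows_after_update)
```

Note that with three or more rows sharing a value, every row after the first receives the same 'R' suffix, so this only removes the duplicates when at most two rows share a Segment_No.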

Related

Select rows based on distinct values of nested field in BigQuery

I have a table in BigQuery which looks like this:
The sequence field is a repeated RECORD. I want to select one row per stepName but if there are multiple rows per step name, I want to choose the one where sequence.step.elapsedSeconds and sequence.step.elapsedMinutes are not null, otherwise select the rows where these columns are null.
As shown in the image above, I want to select row no. 2, 4 and 5. I have calculated ROW_NUMBER like this: ROW_NUMBER() OVER(PARTITION BY step.stepName) AS RowNum.
Here's my query so far in trying to filter out the unwanted rows:
WITH DistinctRows AS
(
select timestamp,
ARRAY (
SELECT
STRUCT(
STRUCT(
step.elapsedSeconds,
step.elapsedMinutes,
) as step
)
FROM
UNNEST(source_table.sequence) AS sequence
) AS sequence,
ROW_NUMBER() OVER(PARTITION BY step.stepName) AS RowNum
from source_table,
unnest(sequence) as previousCalls
order by timestamp asc
)
SELECT *
FROM DistinctRows,
unnest(sequence) as sequence
where (rowNum = 1 and (step.elapsedSeconds is null and step.elapsedMinutes is null)
or (RowNum > 1 and step.elapsedSeconds is not null and step.elapsedSeconds is not null)
order by timestamp asc
I need help in figuring out how to filter out the rows like no. 1 and 3 and would appreciate some help.
Thanks in advance.
Hmmm . . . Assuming that stepname is not part of the repeated column:
SELECT dr.* EXCEPT (sequence),
(SELECT seq
FROM unnest(dr.sequence) seq
ORDER BY seq.step.elapsedSeconds DESC NULLS LAST,
seq.step.elapsedMinutes DESC NULLS LAST
LIMIT 1
) as sequence
FROM DistinctRows dr
ORDER BY timestamp asc;
If stepname is part of sequence, then the subquery would reaggregate:
SELECT dr.* EXCEPT (sequence),
(SELECT ARRAY_AGG(s.seq ORDER BY s.seq.stepName)
FROM (SELECT seq,
ROW_NUMBER() OVER (PARTITION BY seq.stepName
ORDER BY seq.step.elapsedSeconds DESC NULLS LAST, seq.step.elapsedMinutes DESC NULLS LAST
) as seqnum
FROM unnest(dr.sequence) seq
) s
WHERE seqnum = 1
) as sequence
FROM DistinctRows dr
ORDER BY timestamp asc

Need to identify multiple records from the existing SELECT query output and remove all duplicates except one with value <> 0 in column K

There is a SELECT statement whose output can contain duplicates. Among those duplicates, only one row can have a non-zero value in column K; the others have 0. I need to remove the duplicates that have 0 values, keeping only the one with a value.
If you want to filter out 0 values, then you can use:
select t.*
from t
where value <> 0;
If you want to keep exactly one row per k, with preference to non-zero values, you can use:
select t.*
from (select t.*,
row_number() over (partition by k order by value desc) as seqnum
from t
) t
where seqnum = 1;
Note: This assumes that value is never negative. If it can be, then use order by abs(value) desc.
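As a quick way to check the "one row per k, prefer non-zero" query, here is a minimal sketch run against an in-memory SQLite database from Python. The table t, its columns k and value, and the sample rows are assumptions for illustration; it uses the abs(value) variant so that negative values are also preferred over 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k TEXT, value INTEGER)")
# Hypothetical sample data: each k has at most one non-zero value.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 0), ("a", 7), ("a", 0), ("b", 0), ("c", -3), ("c", 0)],
)

# Rank rows within each k so the non-zero value comes first,
# then keep only the top-ranked row per k.
kept = conn.execute("""
    SELECT k, value
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY k ORDER BY ABS(value) DESC) AS seqnum
        FROM t
    )
    WHERE seqnum = 1
    ORDER BY k
""").fetchall()
print(kept)
```

A key that only has 0 rows (here "b") still keeps exactly one row, which is the difference from simply filtering out zeros.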
We can have ROW_NUMBER() and consider ColumnK value of 0, as rank =1 and filter them out. Assuming there are no negative numbers in ColumnK.
SELECT ColumnA, ColumnB,ColumnC.....
FROM
(SELECT *, row_number() over(partition by columnK ORDER BY columnk asc) as rnk
from tablename
) as t
WHERE rnk > 1 -- 0 will be having rank = 1

Avoid duplicate records from a particular column of a table

I have a table as shown in the image. In the Number column, some values appear more than once (for example, 63 appears twice). I would like to keep only one row per value. Please see my code:
delete from t1 where
(SELECT *,row_number() OVER (
PARTITION BY
Number
ORDER BY
Date) as rn from t1 where rn > 1)
It shows an error. Can anyone please assist?
The column created by row_number() is not accessible to your main query. To make it accessible, wrap the query in a subquery and apply the desired filter in the outer query:
SELECT *
FROM
(
SELECT *,
row_number() OVER (PARTITION BY Number ORDER BY Date) as rn
FROM t1 ) T
where rn = 1;
The partition by determines how row numbers repeat. The row numbers are assigned per group of partition by keys. So, you can get duplicates.
If you want a unique row number over all rows, just leave out the partition by:
select t1.*
from (select t1.*,
row_number() over (order by date) as rn
from t1
) t1
where rn > 1
If you want to keep only one value, use rn = 1 instead of rn > 1.
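The queries above select rows; if the duplicates should actually be deleted, as the question attempts, one possible sketch is shown below using Python's sqlite3 module. It relies on SQLite's implicit rowid to identify each physical row, which is an SQLite-specific assumption (other databases would need a primary key or a deletable CTE), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (Number INTEGER, Date TEXT)")
# Hypothetical sample data: 63 and 70 each appear twice.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [
        (63, "2020-01-01"),
        (63, "2020-02-01"),
        (70, "2020-01-15"),
        (70, "2020-01-10"),
        (81, "2020-03-01"),
    ],
)

# Number the rows per Number by Date and delete everything
# except the earliest row (rn = 1) in each group.
conn.execute("""
    DELETE FROM t1
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Date) AS rn
            FROM t1
        )
        WHERE rn > 1
    )
""")

remaining = conn.execute("SELECT Number, Date FROM t1 ORDER BY Number").fetchall()
print(remaining)
```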

SQL - Order by issue

How can I add another ORDER BY at the end so that records are sorted by field_id, but in the same order as the values appear in the IN clause (32015102, 32015100, 32015101, 32015105)? The numbers can change, and their count can increase or decrease.
select * from dba.form_data where form_id = 207873 and field_id in (32015102,32015100, 32015101, 32015105 )
order by sub_id, array_number
An IN clause contains an unordered set of values. IN (1,2,3) and IN (3,2,1) are considered equal.
So you must add some order criteria. In your case you want 32015102 first, 32015100 second, etc. Present the DBMS the values with appropriate sort keys. E.g.:
select *
from dba.form_data fd
join
(
select 32015102 as value, 1 as sortkey
union all
select 32015100 as value, 2 as sortkey
union all
select 32015101 as value, 3 as sortkey
union all
select 32015105 as value, 4 as sortkey
) criteria on criteria.value = fd.field_id
where fd.form_id = 207873
order by criteria.sortkey;
TSQL solution on one of my DBs:
WITH CTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) AS a,
* FROM (VALUES (21963), (21961), (27665)) AS X(b)
)
SELECT tpb.AutoId FROM dbo.TPB AS tpb
JOIN CTE ON CTE.b = tpb.AutoId
ORDER BY CTE.a
Step by step:
Row constructor syntax: SELECT * FROM (VALUES (1), (2), (3)) X(a)
Create a table named X with a column a that contains 1, 2,3.
ROW_NUMBER(): I needed row number for final sorting of the array. It needs an over clause so I used dummy OVER(ORDER BY(SELECT NULL)) which is, well, order by nothing.
Encapsulate this in a CTE, then select from your target table and join it with CTE on your desired order then sort it using corresponding row_numbers.
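Since the list of IDs can change in length, the (value, sortkey) pairs are usually easier to generate from application code than to type by hand. The following is a minimal sketch of that idea in Python with sqlite3; the helper name field_ids_in_list_order, the unqualified table name form_data, and the sample rows are assumptions for illustration, and the VALUES-based CTE mirrors the join-on-sort-key approach described above.

```python
import sqlite3

def field_ids_in_list_order(conn, form_id, field_ids):
    """Return matching field_id values in the order given by field_ids."""
    if not field_ids:
        return []
    # One "(?, ?)" row per requested id: (value, sortkey).
    placeholders = ", ".join("(?, ?)" for _ in field_ids)
    params = []
    for sortkey, field_id in enumerate(field_ids, start=1):
        params.extend([field_id, sortkey])
    sql = f"""
        WITH criteria(value, sortkey) AS (VALUES {placeholders})
        SELECT fd.field_id
        FROM form_data AS fd
        JOIN criteria ON criteria.value = fd.field_id
        WHERE fd.form_id = ?
        ORDER BY criteria.sortkey
    """
    return [row[0] for row in conn.execute(sql, params + [form_id])]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE form_data (form_id INTEGER, field_id INTEGER)")
# Hypothetical sample data, inserted in an order unrelated to the wanted one.
conn.executemany(
    "INSERT INTO form_data VALUES (?, ?)",
    [
        (207873, 32015100),
        (207873, 32015101),
        (207873, 32015102),
        (207873, 32015105),
        (1, 32015102),
    ],
)

ordered = field_ids_in_list_order(
    conn, 207873, [32015102, 32015100, 32015101, 32015105]
)
print(ordered)
```

Only the placeholders are interpolated into the SQL text; the IDs themselves are passed as bound parameters.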

Update based on group by, top 1 row and where case

I have a table as follows:
This is a result of this select:
SELECT ParentID, ID, [Default], IsOnTop, OrderBy
FROM [table]
WHERE ParentID IN (SELECT ParentID
FROM [table]
GROUP BY ParentID
HAVING SUM([Default]) <> 1)
ORDER BY ParentID
Now, what I want to do is to: for each ParentID group, set one of the rows as a Default ([Default] = 1), where the row is chosen using this logic:
if group has a row with IsOnTop = 1 then take this row, otherwise take top 1 row ordered by OrderBy.
I'm completely clueless as to how to do that in SQL, and I have over 40 such groups, so I'd like to ask you for help, preferably with some explanation of your query.
Just slightly modify your current query by assigning a row number, across each ParentID group. The ordering logic for the row number assignment is that records with IsOnTop values of 1 come first, and after that the OrderBy column determines position. I update the CTE under the condition that only the first record in each ParentID group gets assigned a Default value of 1.
WITH cte AS (
SELECT ParentID, ID, [Default], IsOnTop, OrderBy,
ROW_NUMBER() OVER (PARTITION BY ParentID
ORDER BY IsOnTop DESC, OrderBy) rn
FROM [table]
WHERE ParentID IN (SELECT ParentID FROM [table]
GROUP BY ParentID HAVING SUM([Default]) <> 1)
)
UPDATE cte
SET [Default] = 1
WHERE rn = 1;
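To see the ordering logic at work on a small data set, here is a rough sketch in Python with sqlite3. SQLite cannot update through a CTE, so the rn = 1 rows are located in a subquery and matched back by ID, which is assumed to be unique; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE [table] (ParentID INTEGER, ID INTEGER PRIMARY KEY, "
    "[Default] INTEGER, IsOnTop INTEGER, OrderBy INTEGER)"
)
# Hypothetical sample data:
#   parent 1 has no default and one IsOnTop row (ID 11)
#   parent 2 has no default and no IsOnTop row (lowest OrderBy is ID 21)
#   parent 3 already has exactly one default (ID 30) and must stay untouched
conn.executemany(
    "INSERT INTO [table] VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 0, 0, 1),
        (1, 11, 0, 1, 2),
        (2, 20, 0, 0, 5),
        (2, 21, 0, 0, 3),
        (3, 30, 1, 0, 1),
        (3, 31, 0, 1, 2),
    ],
)

conn.execute("""
    UPDATE [table]
    SET [Default] = 1
    WHERE ID IN (
        SELECT ID FROM (
            SELECT ID,
                   ROW_NUMBER() OVER (PARTITION BY ParentID
                                      ORDER BY IsOnTop DESC, OrderBy) AS rn
            FROM [table]
            WHERE ParentID IN (SELECT ParentID FROM [table]
                               GROUP BY ParentID HAVING SUM([Default]) <> 1)
        )
        WHERE rn = 1
    )
""")

default_ids = [
    row[0]
    for row in conn.execute("SELECT ID FROM [table] WHERE [Default] = 1 ORDER BY ID")
]
print(default_ids)
```

One thing to keep in mind: for a group that already has more than one default, this only sets the chosen row to 1 and does not clear the other defaults.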
There might be a quicker way, but this is how I would do it.
First we create a CTE that adds a row_number per ParentID, ordered so that a row with IsOnTop = 1 comes first; if there is none, the first row by the OrderBy column is picked.
Then we update the rows with row number 1.
WITH FindSoonToBeDefault AS (
SELECT ParentID, ID, [Default], IsOnTop, OrderBy, row_number() OVER(PARTITION BY ParentID ORDER BY IsOnTop DESC, [OrderBy] ASC) AS [rn]
FROM [table]
WHERE ParentID IN (SELECT ParentID
FROM [table]
GROUP BY ParentID
HAVING SUM([Default]) <> 1)
)
UPDATE FindSoonToBeDefault
SET [Default] = 1
WHERE [rn] = 1
In your screenshot row 12 will be default.
Row 13 will be not.
(1-IsOnTop)*OrderBy combines IsOnTop and OrderBy into a single result that can be ranked so that the lowest value is the one you want. Use a derived table to identify the lowest result for each ParentID, then JOIN to that to identify your defaults.
UPDATE [table]
SET [Default] = 1
FROM [table]
INNER JOIN
(
SELECT ParentID, MIN((1-IsOnTop)*OrderBy) DefaultRank
FROM [table]
GROUP BY ParentID
) AS rankForDefault
ON rankForDefault.ParentID=[table].ParentID
AND rankForDefault.DefaultRank=(1-[table].IsOnTop)*[table].OrderBy