Update one row based off distinct values of another column - sql

I've got a data set with post codes, suburbs and their longitude and latitude.
For each postcode there are multiple rows with the corresponding suburbs within that postcode, so when I match it with another table which has sales by postcode in Power BI I end up with multiple rows returned for each post code.
What I'd like to do is add a boolean column called unique_postcode that marks one row of each post code as True. I don't mind which one. I tried the statement below, as well as a few other options; it didn't give any errors but didn't have any effect.
UPDATE postcodes
SET post_codes.unique_postcode = 1
FROM (
SELECT DISTINCT(postcode)
FROM postcodes
);

You could use an updatable CTE which targets one arbitrary row per postcode:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY postcode ORDER BY postcode) rn
FROM postcodes
)
UPDATE cte
SET unique_postcode = 1
WHERE rn = 1;
Note that because the ORDER BY inside ROW_NUMBER uses the postal code itself, which is identical for every row in a partition, the "first" row could be any of the rows when a postal code has more than one record associated with it.

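If you only need to flag a single arbitrary row in the whole table, the simplest way would be to select TOP 1. Note that this flags one row in total, not one row per postcode.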
with cte as (select top 1 * from postcodes)
update cte
set unique_postcode = 1;

You can use row_number() to define a particular one to assign the flag to. In an update, this looks like:
WITH toupdate AS (
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY postcode ORDER BY postcode) as seqnum
FROM postcodes p
)
UPDATE toupdate
SET unique_postcode = (CASE WHEN seqnum = 1 THEN 1 ELSE 0 END);
Note: This sets one value to "1" and the rest to "0". It is also safe to run multiple times on the table.
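For readers who want to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: it cannot update through a CTE, so the sketch filters on a hypothetical row_id key instead. The table contents are made up for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE postcodes (
        row_id INTEGER PRIMARY KEY,          -- hypothetical surrogate key
        postcode TEXT,
        suburb TEXT,
        unique_postcode INTEGER DEFAULT 0
    )
""")
conn.executemany(
    "INSERT INTO postcodes (postcode, suburb) VALUES (?, ?)",
    [("2000", "Suburb A"), ("2000", "Suburb B"),
     ("2000", "Suburb C"), ("3000", "Suburb D")],  # made-up sample rows
)

# Flag the first row of each postcode with 1 and every other row with 0.
FLAG_ONE_PER_POSTCODE = """
    UPDATE postcodes
    SET unique_postcode = CASE
        WHEN row_id IN (
            SELECT row_id
            FROM (
                SELECT row_id,
                       ROW_NUMBER() OVER (PARTITION BY postcode ORDER BY row_id) AS rn
                FROM postcodes
            )
            WHERE rn = 1
        ) THEN 1 ELSE 0 END
"""

# Run it twice to show that repeating the statement does not add extra flags.
for _ in range(2):
    conn.execute(FLAG_ONE_PER_POSTCODE)

flags = conn.execute(
    "SELECT postcode, SUM(unique_postcode) FROM postcodes "
    "GROUP BY postcode ORDER BY postcode"
).fetchall()
print(flags)
```

Ordering by row_id rather than by postcode makes the choice of flagged row deterministic, which may or may not matter for the Power BI join.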

Related

How create a unique ID based on conditions in SQL?

I would like to get a new ID, no matter the format (in the example below 11,12,13...)
Based on the following condition:
Every time the days column value is greater than 1 and not null, the current row and all following ones get the same ID until a new value meets the condition.
Within the same email
Below you can see the expected 1 (in the format of XX)
I thought about using two conditions, in the following order:
1. Every time the days column value is greater than 1, all following rows get the same ID until a new value meets the condition.
2. AND when the lag (previous) value is equal to 0/1/null.
Assuming you have an EmailDate column over which you're ordering (a DATETIME field, really), try something like this:
WITH
TableNameWithEmailDateIDs AS (
SELECT
*,
ROW_NUMBER() OVER (
ORDER BY
Email DESC,
EmailDate
) AS EmailDateID
FROM
TableName
),
IDs AS (
SELECT
*,
LEAD(EmailDateID, 1) OVER (
ORDER BY
Email,
EmailDate
) AS LeadEmailDateID
FROM
(
SELECT
*,
-- REMOVE +10 if you don't want 11 to be starting ID
ROW_NUMBER() OVER (
ORDER BY
Email DESC,
EmailDate
)+10 AS ID
FROM
TableNameWithEmailDateIDs
WHERE
Days > 1
OR Days IS NULL
) X
)
SELECT
COALESCE(TableName.EmailDate, IDs.EmailDate) AS EmailDate,
IDs.Email,
COALESCE(TableName.Days, IDs.Days) AS Days,
IDs.ID
FROM
IDs
LEFT JOIN TableNameWithEmailDateIDs TableName
ON IDs.Email = TableName.Email
AND TableName.EmailDateID BETWEEN
IDs.EmailDateID
AND IDs.LeadEmailDateID-1
ORDER BY
ID DESC,
TableName.EmailDate DESC
;
First, create a CTE that generates IDs for each distinct Email/Date combo (helpful for LEFT JOIN condition later). Then, create a CTE that generates IDs for rows that meet your condition (i.e. the important rows). Finally, LEFT JOIN your main table onto that CTE to fill in the "gaps", so to speak.
I suggest running each of the components of this query independently to fully understand what's going on.
Hope it helps!
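A different technique that is often used for this kind of "carry the ID forward" problem is a conditional running sum; it is not the approach taken in the answer above. The sketch below runs it through Python's sqlite3 module (SQLite 3.25 or later for window functions). It assumes the same start condition as the WHERE clause above (Days > 1 OR Days IS NULL), uses made-up rows, and produces a per-email group number rather than a global ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (Email TEXT, EmailDate TEXT, Days INTEGER)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    [  # made-up sample rows
        ("a@example.com", "2024-01-01", None),
        ("a@example.com", "2024-01-02", 1),
        ("a@example.com", "2024-01-03", 5),
        ("a@example.com", "2024-01-04", 0),
        ("a@example.com", "2024-01-05", 3),
        ("b@example.com", "2024-01-01", None),
        ("b@example.com", "2024-01-02", 1),
    ],
)

# Each row that meets the start condition adds 1 to a running total, so every
# following row keeps the same number until the next row that meets it.
rows = conn.execute("""
    SELECT Email, EmailDate, Days,
           SUM(CASE WHEN Days > 1 OR Days IS NULL THEN 1 ELSE 0 END)
               OVER (PARTITION BY Email ORDER BY EmailDate) AS GroupNo
    FROM TableName
    ORDER BY Email, EmailDate
""").fetchall()

group_numbers = [(email, group_no) for email, _, _, group_no in rows]
for row in rows:
    print(row)
```

The pair (Email, GroupNo) identifies each run of rows; turning it into a single ID in a format such as 11, 12, 13 would be an extra step on top of this.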

How to extract the nth row of a sql statement?

I am trying to extract the nth row of an SQL statement (not the nth row of a table).
Is there an easy way to run a query and read values from specific rows?
I have tried something similar to this but it does not work since rownum tells me what nth record it is in the table.
SELECT * FROM (
SELECT
ROW_NUMBER() OVER(ORDER BY RewardID ASC) AS rownumber,
RewardID,
Name,
Description,
Image,
RewardType,
price
FROM
Reward
) AS num
WHERE
RewardType = 'Electronics' and rownum = 2
Are you using Postgres or SQL Server? Both have a ROW_NUMBER() window function, but the syntax, especially around subqueries, may be slightly different. I'll assume Postgres.
Your query looks good to me, except that the rownum = 2 in your WHERE should presumably be rownumber = 2, since that's what you aliased the ROW_NUMBER() column as. You also presumably don't need the RewardType and rownumber columns in your result, since their values will always be 'Electronics' and 2 respectively. Corrected and formatted for readability, this looks ok to me:
SELECT RewardID, Name, Description, Image, RewardType, price
FROM (
SELECT
ROW_NUMBER() OVER(ORDER BY RewardID ASC) AS rownumber,
RewardID,
Name,
Description,
Image,
RewardType,
price
FROM Reward) AS num
WHERE RewardType = 'Electronics' and rownumber = 2
The only questionable part of that is the WHERE RewardType = 'Electronics'. Should that actually be in the subquery rather than the outer query? The difference is that if it is in the subquery, the row counting will include only reward type 'Electronics', whereas in the outer query, all reward types will be counted. To only count the reward type 'Electronics', modify it as so:
SELECT RewardID, Name, Description, Image, RewardType, price
FROM (
SELECT
ROW_NUMBER() OVER(ORDER BY RewardID ASC) AS rownumber,
RewardID,
Name,
Description,
Image,
RewardType,
price
FROM Reward
WHERE RewardType = 'Electronics') AS num
WHERE rownumber = 2
Edit: Since the comment you just made clarifies what you're really trying to do, I'll add that you should NOT be making an individual query for each row of the data that you want. Whatever interface you are using to your database will have a way of iterating over a query result that uses multiple rows, and if that's what you really need, you should find out how to do that.
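To make the difference between the two filter placements concrete, here is a small sketch using Python's sqlite3 module (SQLite 3.25 or later for window functions). The Reward table is cut down to three columns and filled with made-up rows. The last query illustrates the advice about iterating over one multi-row result instead of issuing one query per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Reward (RewardID INTEGER PRIMARY KEY, Name TEXT, RewardType TEXT)"
)
conn.executemany(
    "INSERT INTO Reward VALUES (?, ?, ?)",
    [  # made-up sample rows
        (1, "Item 1", "Electronics"),
        (2, "Item 2", "Books"),
        (3, "Item 3", "Electronics"),
        (4, "Item 4", "Books"),
        (5, "Item 5", "Electronics"),
    ],
)

# Filter in the outer query: rows of every type are numbered, so row 2 is a
# 'Books' row here and the outer filter then discards it.
filter_outside = conn.execute("""
    SELECT RewardID
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY RewardID ASC) AS rownumber,
               RewardID, RewardType
        FROM Reward
    ) AS num
    WHERE RewardType = 'Electronics' AND rownumber = 2
""").fetchall()

# Filter in the subquery: only 'Electronics' rows are numbered.
filter_inside = conn.execute("""
    SELECT RewardID
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY RewardID ASC) AS rownumber,
               RewardID, RewardType
        FROM Reward
        WHERE RewardType = 'Electronics'
    ) AS num
    WHERE rownumber = 2
""").fetchall()

# One query, iterated row by row, instead of one query per row.
electronics_ids = [
    reward_id
    for (reward_id,) in conn.execute(
        "SELECT RewardID FROM Reward WHERE RewardType = 'Electronics' ORDER BY RewardID"
    )
]

print(filter_outside, filter_inside, electronics_ids)
```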
You can try using OFFSET/FETCH:
SELECT
RewardID,
Name,
Description,
Image,
RewardType,
price
FROM Reward
WHERE RewardType = 'Electronics'
ORDER BY RewardID
OFFSET 1 ROWS FETCH NEXT 1 ROWS ONLY
This tells the database to OFFSET (skip) 1 row and then FETCH (select) the 1 row after that, which is row 2. This removes the need for a subselect.
The key is the ORDER BY; try using null in the ORDER BY.
SELECT * FROM (
SELECT
ROW_NUMBER() OVER(ORDER BY (select null)) AS rownumber,
RewardID,
Name,
Description,
Image,
RewardType,
price
FROM
Reward
) AS num
WHERE
RewardType = 'Electronics' and rownumber = 2

How to get the desired row out of multiple rows based upon a column value in sql by exclusion

My sql query potentially can pull 1 or 2 rows which differ in the value of the SourceType column as well as CreatedDate column. The possible values of the SourceType column are "New" or "Old". If only one row containing "Old" is present, I want to get that row. If two rows are present, I want to get the row with the value "New". I could have done this by ordering the rows by CreatedDate and get the top 1, but I would like to use the SourceType to get the required row. I am not sure whether coalesce will work here or not.
You can do this using prioritization logic. It looks like this:
select t.*
from t
where t.sourcetype = 'New'
union all
select t.*
from t
where t.sourcetype = 'Old' and
not exists (select 1 from t t2 where t2.? = t.? and t2.sourcetype = 'New');
The ? is for the column that specifies the duplicates.
Actually, the above logic will evaluate the subquery three times. That is okay for a single table. But there is actually a better way:
select q.*
from (select q.*,
row_number() over (partition by ? order by sourceType) as seqnum
from query q
) q
where seqnum = 1;
The ? is the id to identify multiple rows.
The order by sourceType is really a shorthand for order by case when sourceType = 'New' then 1 else 2 end; it works because 'New' sorts before 'Old'.
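The row_number() version can be tried out with Python's sqlite3 module (SQLite 3.25 or later for window functions). This is only a sketch: the table t, the item_id column standing in for the ? placeholder, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (item_id INTEGER, sourcetype TEXT, createddate TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [  # made-up sample rows; item_id is a hypothetical duplicate key
        (1, "Old", "2023-01-01"),
        (1, "New", "2023-02-01"),
        (2, "Old", "2023-03-01"),
    ],
)

# Rank 'New' ahead of 'Old' inside each item_id and keep the top-ranked row.
chosen = conn.execute("""
    SELECT item_id, sourcetype
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY item_id
                   ORDER BY CASE WHEN sourcetype = 'New' THEN 1 ELSE 2 END
               ) AS seqnum
        FROM t
    ) AS q
    WHERE seqnum = 1
    ORDER BY item_id
""").fetchall()
print(chosen)
```

Spelling out the CASE expression keeps the priority explicit, so it would still hold if the SourceType values were ever renamed to something that does not sort alphabetically.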

Update one specific row where same employee id has multiple entries in sql

I am new to sql and stuck on an issue with updating duplicate entries; any help would be greatly appreciated.
I have a table called employee history; it has empid, roleid, rolestartdate, roleenddate columns.
In the table there are multiple entries for one empid based on role assignment and unassignment.
I need to update only one row per empid based on the condition below.
Select the one empid row where rolestartdate is the max date; if that returns more than one row, check the roleid column and filter on the max roleid.
It should also return those empid rows which have only one entry.
Thank you
You can use Row_Number window function.
;with cte as
(
select *,
Rn = row_number()over(partition by empid order by rolestartdate desc, roleid desc)
from EmployeeHistory
)
/*
--To check the records which will be updated
Select * from cte where Rn = 1
*/
update cte
set update_column = 'whatever value'
where Rn = 1
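The same ranking can be checked on a toy table with Python's sqlite3 module (SQLite 3.25 or later for window functions). SQLite cannot update through a CTE, so this sketch filters on a hypothetical row_id key instead; the is_latest column stands in for update_column, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EmployeeHistory (
        row_id INTEGER PRIMARY KEY,      -- hypothetical surrogate key
        empid INTEGER,
        roleid INTEGER,
        rolestartdate TEXT,
        roleenddate TEXT,
        is_latest INTEGER DEFAULT 0      -- hypothetical column to update
    )
""")
conn.executemany(
    "INSERT INTO EmployeeHistory (empid, roleid, rolestartdate, roleenddate) "
    "VALUES (?, ?, ?, ?)",
    [  # made-up sample rows
        (1, 10, "2020-01-01", "2020-12-31"),
        (1, 20, "2021-01-01", None),
        (1, 30, "2021-01-01", None),   # same start date as role 20; higher roleid wins
        (2, 5, "2019-06-01", None),    # single entry, still selected
    ],
)

# Rank rows per empid by latest start date, then highest roleid; update rank 1.
conn.execute("""
    UPDATE EmployeeHistory
    SET is_latest = 1
    WHERE row_id IN (
        SELECT row_id
        FROM (
            SELECT row_id,
                   ROW_NUMBER() OVER (
                       PARTITION BY empid
                       ORDER BY rolestartdate DESC, roleid DESC
                   ) AS rn
            FROM EmployeeHistory
        )
        WHERE rn = 1
    )
""")

updated = conn.execute(
    "SELECT empid, roleid FROM EmployeeHistory WHERE is_latest = 1 ORDER BY empid"
).fetchall()
print(updated)
```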

Removing dups and updating null values

I've just been tasked with removing all the duplicate values in a database. Simple enough. But they also want me to go through and check if there are any Null values that were not Null in previous entries for that record.
So let's say that we have user 123. User 123 doesn't have a zip code listed for whatever reason. But in a past entry he had zip code 55555. I'm supposed to update the latest entry with that zip code from a past entry and then delete the past entry. Leaving me with only one entry for user 123 AND having the zip code 55555.
I'm just unsure how to do the update portion. Anybody have any suggestions?
Thanks!
Here is how you can do the update. It finds the last value for zip, and then updates the field, if necessary:
with lastval as (
select *
from (select id, zip, row_number() over (partition by id order by datecreated desc) as seqnum
from t
where zip is not null
) t
where seqnum = 1
)
update t
set t.zip = lastval.zip
from lastval
where t.id = lastval.id
However, I would suggest that you create a new table with the data that you want. Don't bother deleting and updating a zillion rows; create a table using a query such as:
select *
from (select t.*, row_number() over (partition by id order by datecreated desc) as seqnum
from t
where zip is not null
) t
where seqnum = 1
And insert the rows into a new table.
And, one more suggestion. Ask another question, with a better notion of what the fields are like in the table, and which ones you want to look up last values for. That will provide additional information for better solutions.
You could use a statement similar to the following one:
update t1
set t1.address = dt.address,
t1.city = dt.city,
... and so on ...
from your_table as t1
inner join
(
select
max(id) as id,
companyname,
max(address) as address,
max(city) as city,
... and so on ...
from your_table
group by companyname -- your duplicate detection goes here
) dt
on dt.id = t1.id
This way you fill up all gaps in your duplicates. Then you just have to delete the duplicates.
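As a rough end-to-end illustration of "fill the gaps, then delete the duplicates", here is a sketch using Python's sqlite3 module (SQLite 3.25 or later for window functions). The row_id key, the single zip column and the sample rows are assumptions; a real table would need the same treatment for each column that can be null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        row_id INTEGER PRIMARY KEY,   -- hypothetical surrogate key
        id INTEGER,
        zip TEXT,
        datecreated TEXT
    )
""")
conn.executemany(
    "INSERT INTO t (id, zip, datecreated) VALUES (?, ?, ?)",
    [  # made-up sample rows
        (123, "55555", "2020-01-01"),
        (123, None, "2021-01-01"),     # latest entry is missing the zip
        (456, "11111", "2020-05-01"),
        (456, "22222", "2021-05-01"),
    ],
)

# Step 1: fill each missing zip with the most recent non-null zip for that id.
conn.execute("""
    UPDATE t
    SET zip = (
        SELECT prev.zip
        FROM t AS prev
        WHERE prev.id = t.id AND prev.zip IS NOT NULL
        ORDER BY prev.datecreated DESC
        LIMIT 1
    )
    WHERE zip IS NULL
""")

# Step 2: keep only the latest row per id and delete the rest.
conn.execute("""
    DELETE FROM t
    WHERE row_id NOT IN (
        SELECT row_id
        FROM (
            SELECT row_id,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY datecreated DESC) AS rn
            FROM t
        )
        WHERE rn = 1
    )
""")

remaining = conn.execute("SELECT id, zip, datecreated FROM t ORDER BY id").fetchall()
print(remaining)
```

If no earlier entry has a zip for a given id, the subquery returns nothing and the zip simply stays null.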