Replace value in column based on another column - sql

I have the following table:
+----+--------+------------+----------------------+
| ID | Name | To_Replace | Replaced |
+----+--------+------------+----------------------+
| 1 | Fruits | 1 | Fruits |
| 2 | Apple | 1-2 | Fruits-Apple |
| 3 | Citrus | 1-3 | Fruits-Citrus |
| 4 | Orange | 1-3-4 | Fruits-Citrus-Orange |
| 5 | Empire | 1-2-5 | Fruits-Apple-Empire |
| 6 | Fuji | 1-2-6 | Fruits-Apple-Fuji |
+----+--------+------------+----------------------+
How can I create the column Replaced? I thought of creating up to 10 columns (I know there are no more than 10 nested levels), fetching the ID from every substring split by '-', and then concatenating the non-null ones into Replaced, but I think there is a simpler solution.

While what you ask for is technically feasible (probably using a recursive query or a tally), I will take a different stance and suggest that you fix your data model instead.
You should not be storing multiple values as a delimited list in a single database column. This defeats the purpose of a relational database, and makes simple things both unnecessarily complicated and inefficient.
Instead, you should have a separate table to store that data, with each replacement id on a separate row, and possibly a column that indicates the sequence of each element in the list.
For your sample data, this would look like:
id replace_id seq
1 1 1
2 1 1
2 2 2
3 1 1
3 3 2
4 1 1
4 3 2
4 4 3
5 1 1
5 2 2
5 5 3
6 1 1
6 2 2
6 6 3
Now you can efficiently generate the expected result with either a join, a subquery, or a lateral join. Assuming that your table is called mytable and that the mapping table is mymapping, the lateral join solution would be:
select t.*, x.replaced
from mytable t
outer apply (
select string_agg(t1.name, '-') within group(order by m.seq) replaced
from mymapping m
inner join mytable t1 on t1.id = m.replace_id
where m.id = t.id
) x
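As a rough sketch of the plain join alternative mentioned above, the mapping table could be declared and queried along these lines. The names mymapping, id, replace_id and seq follow the sample data; the primary key and the '-' separator are assumptions made for illustration, and STRING_AGG requires SQL Server 2017 or later.

```sql
-- Assumed definition of the mapping table; the key choice is illustrative.
create table mymapping (
    id         int not null,  -- row in mytable whose path is described
    replace_id int not null,  -- id of one element of that path
    seq        int not null,  -- position of the element within the path
    primary key (id, seq)
);

-- Join variant: aggregate the names of each path in seq order.
select t.id, t.name,
       string_agg(t1.name, '-') within group (order by m.seq) as replaced
from mytable t
inner join mymapping m on m.id = t.id
inner join mytable t1 on t1.id = m.replace_id
group by t.id, t.name;
```

Unlike the outer apply version, the inner joins drop any row of mytable that has no entries in mymapping, so the lateral form is the safer choice if such rows can exist.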

You can try something like this:
DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
( 1, 'Fruits', '1' ),
( 2, 'Apple', '1-2' ),
( 3, 'Citrus', '1-3' ),
( 4, 'Orange', '1-3-4' ),
( 5, 'Empire', '1-2-5' ),
( 6, 'Fuji', '1-2-6' );
SELECT
*
FROM @Data AS d
OUTER APPLY (
SELECT STRING_AGG ( [Name], '-' ) AS Replaced FROM @Data WHERE ID IN (
SELECT CAST ( [value] AS INT ) FROM STRING_SPLIT ( d.To_Replace, '-' )
)
) List
ORDER BY ID;
Returns
+----+--------+------------+----------------------+
| ID | Name | To_Replace | Replaced |
+----+--------+------------+----------------------+
| 1 | Fruits | 1 | Fruits |
| 2 | Apple | 1-2 | Fruits-Apple |
| 3 | Citrus | 1-3 | Fruits-Citrus |
| 4 | Orange | 1-3-4 | Fruits-Citrus-Orange |
| 5 | Empire | 1-2-5 | Fruits-Apple-Empire |
| 6 | Fuji | 1-2-6 | Fruits-Apple-Fuji |
+----+--------+------------+----------------------+
UPDATE
Ensure the id list order is maintained when aggregating names.
DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
( 1, 'Fruits', '1' ),
( 2, 'Apple', '1-2' ),
( 3, 'Citrus', '1-3' ),
( 4, 'Orange', '1-3-4' ),
( 5, 'Empire', '1-2-5' ),
( 6, 'Fuji', '1-2-6' ),
( 7, 'Test', '6-2-7' );
SELECT
*
FROM @Data AS d
OUTER APPLY (
SELECT STRING_AGG ( [Name], '-' ) AS Replaced FROM (
SELECT TOP 100 PERCENT
Names.[Name]
FROM ( SELECT CAST ( '<ids><id>' + REPLACE ( d.To_Replace, '-', '</id><id>' ) + '</id></ids>' AS XML ) AS id_list ) AS xIds
CROSS APPLY (
SELECT
x.f.value('.', 'INT' ) AS name_id,
ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) AS row_id
FROM xIds.id_list.nodes('//ids/id') x(f)
) AS ids
INNER JOIN @Data AS Names ON Names.ID = ids.name_id
ORDER BY row_id
) AS x
) List
ORDER BY ID;
Returns
+----+--------+------------+----------------------+
| ID | Name | To_Replace | Replaced |
+----+--------+------------+----------------------+
| 1 | Fruits | 1 | Fruits |
| 2 | Apple | 1-2 | Fruits-Apple |
| 3 | Citrus | 1-3 | Fruits-Citrus |
| 4 | Orange | 1-3-4 | Fruits-Citrus-Orange |
| 5 | Empire | 1-2-5 | Fruits-Apple-Empire |
| 6 | Fuji | 1-2-6 | Fruits-Apple-Fuji |
| 7 | Test | 6-2-7 | Fuji-Apple-Test |
+----+--------+------------+----------------------+
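I'm sure there's optimization that can be done here, but this solution seems to keep the list order. Note that SQL Server does not document TOP 100 PERCENT ... ORDER BY in a derived table as an ordering guarantee, so treat the result order as observed behavior rather than a promise.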
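One possible way to state the order explicitly is to carry each element's position through to STRING_AGG ... WITHIN GROUP. The sketch below is not the query above: it swaps the XML split for OPENJSON, whose [key] column holds the array index. It assumes To_Replace contains only digits and hyphens (so that it can be turned into a JSON array), database compatibility level 130 or higher for OPENJSON, and SQL Server 2017 or later for STRING_AGG.

```sql
DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
    ( 1, 'Fruits', '1' ),
    ( 2, 'Apple',  '1-2' ),
    ( 6, 'Fuji',   '1-2-6' ),
    ( 7, 'Test',   '6-2-7' );

SELECT d.*, List.Replaced
FROM @Data AS d
OUTER APPLY (
    -- '6-2-7' becomes the JSON array [6,2,7]; [key] is the 0-based position.
    SELECT STRING_AGG ( n.[Name], '-' )
               WITHIN GROUP ( ORDER BY CAST ( j.[key] AS INT ) ) AS Replaced
    FROM OPENJSON ( '[' + REPLACE ( d.To_Replace, '-', ',' ) + ']' ) AS j
    INNER JOIN @Data AS n ON n.ID = CAST ( j.[value] AS INT )
) AS List
ORDER BY d.ID;
```

Here the ordering is part of the aggregate itself, so it does not depend on how the optimizer happens to evaluate the derived table.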

Related

Oracle SQL, how to select * having distinct columns

I want to have a query something like this (this doesn't work!)
select * from foo where rownum < 10 having distinct bar
Meaning I want to select all columns from ten random rows with distinct values in column bar. How to do this in Oracle?
Here is an example. I have the following data
| item | rate |
-------------------
| a | 50 |
| a | 12 |
| a | 26 |
| b | 12 |
| b | 15 |
| b | 45 |
| b | 10 |
| c | 5 |
| c | 15 |
And result would be for example
| item no | rate |
------------------
| a | 12 | --from (26 , 12 , 50)
| b | 45 | --from (12 ,15 , 45 , 10)
| c | 5 | --from (5 , 15)
Always having a distinct item no
SQL Fiddle
Oracle 11g R2 Schema Setup:
Generate a table with 12 items A - L each with rates 0 - 4:
CREATE TABLE items ( item, rate ) AS
SELECT CHR( 64 + CEIL( LEVEL / 5 ) ),
MOD( LEVEL - 1, 5 )
FROM DUAL
CONNECT BY LEVEL <= 60;
Query 1:
SELECT item,
rate
FROM (
SELECT i.*,
-- Give the rates for each item a unique index assigned in a random order
ROW_NUMBER() OVER ( PARTITION BY item ORDER BY DBMS_RANDOM.VALUE ) AS rn
FROM items i
ORDER BY DBMS_RANDOM.VALUE -- Order all the rows randomly
)
WHERE rn = 1 -- Only get the first row for each item
AND ROWNUM <= 10 -- Only get the first 10 items.
Results:
| ITEM | RATE |
|------|------|
| A | 0 |
| K | 2 |
| G | 4 |
| C | 1 |
| E | 0 |
| H | 0 |
| F | 2 |
| D | 3 |
| L | 4 |
| I | 1 |
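On Oracle 12c and later, the same idea can probably be written with the row-limiting clause instead of ROWNUM. This is only a sketch against the items table generated above; FETCH FIRST is not available on the 11g R2 instance used in the fiddle.

```sql
SELECT item,
       rate
FROM (
    SELECT i.*,
           -- Number the rates of each item in a random order
           ROW_NUMBER() OVER ( PARTITION BY item ORDER BY DBMS_RANDOM.VALUE ) AS rn
    FROM items i
)
WHERE rn = 1                  -- Keep one randomly chosen row per item
ORDER BY DBMS_RANDOM.VALUE    -- Shuffle the surviving rows
FETCH FIRST 10 ROWS ONLY;     -- Keep at most 10 of them
```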
Here is the table creation script and a query that returns one row per distinct item (the row with the highest rate for each item):
(Ref SqlFiddle)
create table foo(item varchar(20), rate int);
insert into foo values('a',50);
insert into foo values('a',12);
insert into foo values('a',26);
insert into foo values('b',12);
insert into foo values('b',15);
insert into foo values('b',45);
insert into foo values('b',10);
insert into foo values('c',5);
insert into foo values('c',15);
-- First number the rows within each item, then keep only the first row per item:
select item, rate from (
select item, rate, ROW_NUMBER() over(PARTITION BY item ORDER BY rate desc)
row_num from foo
) where row_num=1;

Select distinct one field other first non empty or null

I have table
| Id | val |
| --- | ---- |
| 1 | null |
| 1 | qwe1 |
| 1 | qwe2 |
| 2 | null |
| 2 | qwe4 |
| 3 | qwe5 |
| 4 | qew6 |
| 4 | qwe7 |
| 5 | null |
| 5 | null |
Is there an easy way to select the distinct 'id' values, each with its first non-null 'val' value, or null if no such value exists? For example,
the result should be
| Id | val |
| --- | ---- |
| 1 | qwe1 |
| 2 | qwe4 |
| 3 | qwe5 |
| 4 | qew6 |
| 5 | null |
In your case a simple GROUP BY should be the solution:
SELECT Id
,MIN(val)
FROM dbo.mytable
GROUP BY Id
Whenever using a GROUP BY, you have to use an aggregate function on all columns, which are not listed in the GROUP BY.
If an Id has a value (val) other than NULL, this value will be returned.
If there are just NULLs for the Id, NULL will be returned.
As far as I understood (regarding your comment), this is exactly what you are trying to achieve.
If you always want "the first" non-NULL value, you'll need another sort criterion (like a timestamp column), and you might be able to solve it with a window function.
If you want the first non-NULL value (where "first" is based on id), then MIN() doesn't quite do it. Window functions do:
select t.*
from (select t.*,
row_number() over (partition by id
order by (case when val is not null then 1 else 2 end),
id
) as seqnum
from t
) t
where seqnum = 1;
SQL Fiddle:
Create Table from SQL Fiddle:
CREATE TABLE tab1(pid integer, id integer, val varchar(25))
Insert dummy records :
insert into tab1
values (1, 1 , null),
(2, 1 , 'qwe1' ),
(3, 1 , 'qwe2'),
(4, 2 , null ),
(5, 2 , 'qwe4' ),
(6, 3 , 'qwe5' ),
(7, 4 , 'qew6' ),
(8, 4 , 'qwe7' ),
(9, 5 , null ),
(10, 5 , null );
Run the query below:
SELECT Id ,MIN(val) as val FROM tab1 GROUP BY Id;
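MIN(val) returns the alphabetically smallest value, which only happens to match "first" for this sample data. If pid in tab1 is taken as the insertion order (an assumption; the original table has no such column), a window function can pick the first non-null val per id. A minimal sketch:

```sql
SELECT id, val
FROM (
    SELECT id, val,
           ROW_NUMBER() OVER (
               PARTITION BY id
               -- Non-null values sort first; ties are broken by pid
               ORDER BY CASE WHEN val IS NOT NULL THEN 0 ELSE 1 END, pid
           ) AS seqnum
    FROM tab1
) ranked
WHERE seqnum = 1
ORDER BY id;
```

For an id whose rows are all NULL, the first row by pid is returned, so val stays NULL as requested.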

Querying latest date for a particular attribute where it is not in date format

I need to set up a query that allows me to pick the most recent updated record within a group. If two records have the latest update, then the one with the longest update history should be picked. If both are null, or both have the same length of history, then neither should be chosen. The fields are varchar2 format. The last two digits in first record and last record correspond to the years those records were taken. The letters in the history length correspond to codes for what type of data was taken. Below is a sample table, with the expected results:
| group_id | id | First Record | Last Record | History Length |
---------------------------------------------------------------------------------
| a | 1 | record98 | record16 | SNDAWEDSPSEDSYSEAOE |
| a | 2 | record97 | record14 | AVNDAWEDSPSEDSYS |
| b | 3 | record96 | record15 | BVNDAWEDSPSEDSYSEAOE |
| b | 4 | record98 | record16 | UNDAWEDSPSEDSYSEAOP |
| b | 5 | record95 | record16 | UNDAWEDSPSEDSYSEAOPHYE|
| c | 6 | record96 | record12 | BVNDAWEDSPSEDSYSE |
| c | 7 | record10 | record15 | HUSIKD |
| d | 8 | null | null | null |
| d | 9 | null | null | null |
| e | 10 | record11 | record16 | ASIKSO |
| e | 11 | record11 | record16 | SIXLLO |
-------------------------------------------------------------------------------------------------------------------
Output
| group_id | id | First Record | Last Record | History Length |
---------------------------------------------------------------------------------
| a | 1 | record98 | record16 | SNDAWEDSPSEDSYSEAOE |
| b | 5 | record95 | record16 | UNDAWEDSPSEDSYSEAOPHYE|
| c | 7 | record10 | record15 | HUSIKD |
The history isn't as important as the latest record, so if that is too difficult to implement, I just need the one row with the latest record. Thank you.
Let me know if the below query works for your requirement.
SELECT group_id,ID,first_record,last_record,history_length
FROM (
SELECT group_id,ID,first_record,last_record,history_length,diff,
MAX(LENGTH(history_length)) OVER (PARTITION BY group_id) max_len,
count(1) OVER (PARTITION BY group_id,LENGTH(history_length)) cnt
FROM (
SELECT group_id,ID,first_record,last_record,history_length,
count(1) OVER (PARTITION BY group_id,LENGTH(history_length)) cnt,
MAX(to_date(to_number(substr(last_record, 7,2)),'RR')-to_date(to_number(substr(first_record, 7,2)),'RR')) OVER (PARTITION BY group_id) diff
FROM (
SELECT group_id,ID,first_record,last_record,history_length,
MAX(last_record) OVER (PARTITION BY group_id) max_last_record
FROM t
WHERE nvl(first_record,last_record) IS NOT NULL
)
WHERE last_record=max_last_record
)
WHERE (to_date(to_number(substr(last_record, 7,2)),'RR')-to_date(to_number(substr(first_record, 7,2)),'RR'))=diff
)
WHERE cnt=1
AND LENGTH(history_length)=max_len;
Personally I find hemalp108's answer hard to follow; I prefer to break each step down.
Below is how I did this using CTEs, where each subsequent CTE is the next step with a descriptive name i.e.
Add Max LastRecord
then Search by Max LastRecord
then Add HistoryTally
then Add Max HistoryTally
then Search by Max HistoryTally
then Add HistoryTally Frequency
then Search by HistoryTally Frequency
then return the result
P.S. SQLFiddle wasn't working, so I had to do this in a local SQL Server (I don't have a local Oracle) and tried to translate it back!
WITH YourTable AS
( SELECT *
FROM ( VALUES ( 'a',1,'record98','record16','SNDAWEDSPSEDSYSEAOE' ),
( 'a',2,'record97','record14','AVNDAWEDSPSEDSYS' ),
( 'b',3,'record96','record15','BVNDAWEDSPSEDSYSEAOE' ),
( 'b',4,'record98','record16','UNDAWEDSPSEDSYSEAOP' ),
( 'b',5,'record95','record16','UNDAWEDSPSEDSYSEAOPHYE' ),
( 'c',6,'record96','record12','BVNDAWEDSPSEDSYSE' ),
( 'c',7,'record10','record15','HUSIKD' ),
( 'd',8,null,null,null),
( 'd',9,null,null,null),
( 'e',10,'record11','record16','ASIKSO' ),
( 'e',11,'record11','record16','SIXLLO' )
) AS T ( group_id, id, FirstRecord, LastRecord, HistoryLength ) ),
AddMaxLastRecord AS
( SELECT *, MAX( LastRecord ) OVER ( PARTITION BY group_id ) MaxLastRecord
FROM YourTable ),
SearchByMaxLastRecord AS
( SELECT group_id, id, FirstRecord, LastRecord, HistoryLength
FROM AddMaxLastRecord
WHERE LastRecord = MaxLastRecord ),
AddHistoryTally AS
( SELECT *, LEN( HistoryLength ) AS HistoryTally
FROM SearchByMaxLastRecord ),
AddMaxHistoryTally AS
( SELECT *, MAX( HistoryTally ) OVER ( PARTITION BY group_id ) MaxHistoryTally
FROM AddHistoryTally ),
SearchByMaxHistoryTally AS
( SELECT group_id, id, FirstRecord, LastRecord, HistoryLength, HistoryTally
FROM AddMaxHistoryTally
WHERE HistoryTally = MaxHistoryTally ),
AddHistoryTallyFrequency AS
( SELECT *, COUNT( HistoryTally ) OVER ( PARTITION BY group_id ) AS HistoryTallyFreq
FROM SearchByMaxHistoryTally ),
SearchByHistoryTallyFrequency AS
( SELECT group_id, id, FirstRecord, LastRecord, HistoryLength
FROM AddHistoryTallyFrequency
WHERE HistoryTallyFreq = 1 )
SELECT *
FROM SearchByHistoryTallyFrequency;
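Since the query above still uses SQL Server syntax (the VALUES table constructor and LEN), here is a hedged sketch of how the same steps might look in Oracle. It assumes the data lives in a table t with columns group_id, id, first_record, last_record and history_length, as in the first answer, and it compares last_record as a string, which only works while all years fall in the same century.

```sql
WITH latest AS (
    -- Step 1: attach the latest last_record of each group
    SELECT t.*,
           MAX(last_record) OVER (PARTITION BY group_id) AS max_last_record
    FROM t
),
longest AS (
    -- Step 2: keep the latest rows and attach the longest history length
    SELECT l.*,
           MAX(LENGTH(history_length)) OVER (PARTITION BY group_id) AS max_len
    FROM latest l
    WHERE last_record = max_last_record
),
ties AS (
    -- Step 3: keep the longest histories and count how many rows share them
    SELECT g.*,
           COUNT(*) OVER (PARTITION BY group_id) AS tie_count
    FROM longest g
    WHERE LENGTH(history_length) = max_len
)
-- Step 4: return a row only when the group has a single winner
SELECT group_id, id, first_record, last_record, history_length
FROM ties
WHERE tie_count = 1;
```

Rows with a NULL last_record never satisfy the equality in step 2, so all-NULL groups drop out without a separate filter.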

T-SQL How to translate multiple sub-strings to new values

First of all, sorry because I don't know how to title my problem.
My situation is, I have 1 lookup table with this format:
+----+-----------+------------+
| ID | Fruit | Color |
+----+-----------+------------+
| 1 | Banana | Yellow |
| 2 | Apple | Red |
| 3 | Blueberry | NotYetBlue |
+----+-----------+------------+
And my main table is like this:
+-------+------------------------+------------+
| MixID | Contains | MixedColor |
+-------+------------------------+------------+
| 1 | Banana | |
| 2 | Apple:Blueberry | |
| 3 | Banana:Apple:Blueberry | |
+-------+------------------------+------------+
I want to make a look-up on the first table and fill in the MixedColor column as below:
+-------+------------------------+-----------------------+
| MixID | Contains | MixedColor |
+-------+------------------------+-----------------------+
| 1 | Banana | Yellow |
| 2 | Apple:Blueberry | Red:NotYetBlue |
| 3 | Banana:Apple:Blueberry | Yellow:Red:NotYetBlue |
+-------+------------------------+-----------------------+
Any help will be very appreciated.
Thank you
I agree that ideally your table structure should be altered. But, you can get what you want with:
SELECT MIXID, [CONTAINS],
STUFF((
SELECT ':' + Color
FROM Table1 a
WHERE ':'+b.[Contains]+':' LIKE '%:'+a.Fruit+':%'
FOR XML PATH('')
), 1, 1, '') AS Color
FROM Table2 b
GROUP BY MIXID, [CONTAINS]
Demo: SQL Fiddle
As "Charles Bretana" suggested, it would be best to modify your schema to something like this:
+--------+-------+----------+
| RowID | MixID | FruitID |
+--------+-------+----------+
| 0 | 1 | 1 |
| 1 | 2 | 2 |
| 2 | 2 | 3 |
| 3 | 3 | 1 |
| 4 | 3 | 2 |
| 5 | 3 | 3 |
+--------+-------+----------+
Now, using a simple inner join, you can select the correct color and match the fruit.
If you cannot change the schema, you could use a recursive query, as mentioned here: Turning a Comma Separated string into individual rows,
to transform your data into that shape.
Here is a SQL Fiddle: http://sqlfiddle.com/#!3/8d68f/12
table data :
create table Mixses(MixID int, ContainsData varchar(max))
insert Mixses select 1, '10:11:12'
insert Mixses select 2, '10:11'
insert Mixses select 3, '10'
insert Mixses select 4, '11:12'
create table Fruits(FruitID int, Name varchar(200), Color varchar(200))
insert Fruits select 10, 'Bannana' , 'Yellow'
insert Fruits select 11, 'Apple' , 'Red'
insert Fruits select 12, 'BlueBerry' , 'Blue'
insert Fruits select 13, 'Pineapple' , 'Brown'
Query:
;with tmp(MixID, DataItem, Data) as
(
select
MixID,
LEFT(ContainsData, CHARINDEX(':',ContainsData+':')-1),
STUFF(ContainsData, 1, CHARINDEX(':',ContainsData+':'), '')
from Mixses
union all
select MixID,
LEFT(Data, CHARINDEX(':',Data+':')-1),
STUFF(Data, 1, CHARINDEX(':',Data+':'), '')
from tmp
where Data > ''
)
select t.MixID, t.DataItem, f.Color
from tmp t
inner join Fruits f on f.FruitID=t.DataItem
order by MixID
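For the normalized layout suggested above, the inner join could be sketched as follows. The table name MixFruits is hypothetical (the answer does not name the table); its columns RowID, MixID and FruitID come from the sample, Fruits is the table created above, and STRING_AGG assumes SQL Server 2017 or later.

```sql
-- MixFruits(RowID, MixID, FruitID) is a hypothetical name for the normalized table.
select mf.MixID,
       -- Rebuild the colon-separated color list in RowID order
       string_agg(f.Color, ':') within group (order by mf.RowID) as MixedColor
from MixFruits mf
inner join Fruits f on f.FruitID = mf.FruitID
group by mf.MixID
order by mf.MixID;
```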

Create New Table From Other Table After Grouping

How can I insert values into a table by "grouping" the rows of another table?
I have 2 tables with different structures.
The table ORDRE already contains data:
Table ORDRE:
ORDRE ID | CODE_DEST |
-------------------------
1 | a |
2 | b |
3 | c |
4 | a |
5 | a |
6 | b |
7 | g |
I want to INSERT the value FROM Table ORDRE INTO TABLE VOIT:
ID_VOIT | ORDRE ID | CODE_DEST |
---------------------------------------
1 | 1 | a |
1 | 4 | a |
1 | 5 | a |
2 | 2 | b |
2 | 6 | b |
3 | 3 | c |
4 | 7 | g |
This is my best guess on what you need using only the info available.
declare @Ordre table
(
ordre_id int,
code_dest char(1)
)
declare @Voit table
(
id_voit int,
ordre_id int,
code_dest char(1)
)
insert into @Ordre values
(1,'a'),
(2,'b'),
(3,'c'),
(4,'a'),
(5,'a'),
(6,'b'),
(7,'g')
insert into @Voit
select id_voit, ordre_id, rsOrdre.code_dest
from @Ordre rsOrdre
inner join
(
select code_dest, ROW_NUMBER() over (order by code_dest) as id_voit
from @Ordre
group by code_dest
) rsVoit on rsVoit.code_dest = rsOrdre.code_dest
order by id_voit, ordre_id
select * from @Voit
Working Example.
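A shorter variant, offered only as a sketch, replaces the grouped subquery and join with DENSE_RANK, which gives every row with the same code_dest the same number. It reuses the table variable layout from the answer above and assumes that numbering the groups alphabetically by code_dest is acceptable.

```sql
declare @Ordre table (ordre_id int, code_dest char(1));
declare @Voit table (id_voit int, ordre_id int, code_dest char(1));

insert into @Ordre values
    (1,'a'), (2,'b'), (3,'c'), (4,'a'), (5,'a'), (6,'b'), (7,'g');

-- Rows sharing a code_dest get the same id_voit; no join is needed.
insert into @Voit (id_voit, ordre_id, code_dest)
select dense_rank() over (order by code_dest), ordre_id, code_dest
from @Ordre;

select * from @Voit order by id_voit, ordre_id;
```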
For the specific data you give as an example, this works:
insert into VOIT
select
case code_dest
when 'a' then 1
when 'b' then 2
when 'c' then 3
when 'g' then 4
else 0
end, orderId, code_dest from ORDRE order by code_dest, orderId
But it kind of sucks because it requires hard-coding in a huge case statement.
Test is here - https://data.stackexchange.com/stackoverflow/q/119442/
What I like more is moving the VOIT ID / Code_Dest associations to a new table, so then you could do an inner join instead.
insert into VOIT
select voit_id, orderId, t.code_dest
from ORDRE t
join Voit_CodeDest t2 on t.code_dest = t2.code_dest
order by code_dest, orderId
Working example of that here - https://data.stackexchange.com/stackoverflow/q/119443/