SQL: How to update multiple fields so empty field content is moved to the logically last columns - lose blank address lines

I have three address line columns, aline1, aline2, aline3 for a street
address. As staged from inconsistent data, any or all of them can be
blank. I want to move the first non-blank to addrline1, 2nd non-blank
to addrline2, and clear line 3 if there aren't three non blank lines,
else leave it. ("First" means aline1 is first unless it's blank,
aline2 is first if aline1 is blank, aline3 is first if aline1 and 2
are both blank)
The rows in this staging table do not have a key and there could be
duplicate rows. I could add a key.
Not counting a big case statement that enumerates the possible
combination of blank and non blank and moves the fields around, how
can I update the table? (This same problem comes up with a lot more
than 3 lines, so that's why I don't want to use a case statement)
I'm using Microsoft SQL Server 2008

Another alternative. It uses the undocumented %%physloc%% function to work without a key. You would be much better off adding a key to the table.
CREATE TABLE #t
(
aline1 VARCHAR(100),
aline2 VARCHAR(100),
aline3 VARCHAR(100)
)
INSERT INTO #t VALUES(NULL, NULL, 'a1')
INSERT INTO #t VALUES('a2', NULL, 'b2')
;WITH cte
AS (SELECT *,
MAX(CASE WHEN RN=1 THEN value END) OVER (PARTITION BY %%physloc%%) AS new_aline1,
MAX(CASE WHEN RN=2 THEN value END) OVER (PARTITION BY %%physloc%%) AS new_aline2,
MAX(CASE WHEN RN=3 THEN value END) OVER (PARTITION BY %%physloc%%) AS new_aline3
FROM #t
OUTER APPLY (SELECT ROW_NUMBER() OVER (ORDER BY CASE WHEN value IS NULL THEN 1 ELSE 0 END, idx) AS
RN, idx, value
FROM (VALUES(1,aline1),
(2,aline2),
(3,aline3)) t (idx, value)) d)
UPDATE cte
SET aline1 = new_aline1,
aline2 = new_aline2,
aline3 = new_aline3
SELECT *
FROM #t
DROP TABLE #t
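Adding a key, as recommended above, could look like the following minimal sketch. The table name dbo.StagingTable and the column name RowId are assumptions for illustration, not names from the question.

```sql
-- Hypothetical staging table and column names.
-- Adding an IDENTITY column numbers the existing rows, so even
-- duplicate rows become individually addressable.
ALTER TABLE dbo.StagingTable
    ADD RowId INT IDENTITY(1, 1) NOT NULL;

-- The windowed aggregates in the CTE can then partition by the key
-- instead of %%physloc%%, for example:
--   MAX(CASE WHEN RN = 1 THEN value END) OVER (PARTITION BY RowId)
```

With a real key in place, the rest of the query stays the same; only the PARTITION BY expression changes.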

Here's an alternative
Sample table for discussion, don't worry about the nonsensical data, they just need to be null or not
create table taddress (id int,a varchar(10),b varchar(10),c varchar(10));
insert taddress
select 1,1,2,3 union all
select 2,1, null, 3 union all
select 3,null, 1, 2 union all
select 4,null,null,2 union all
select 5,1, null, null union all
select 6,null, 4, null
The query, which really just normalizes the data
;with tmp as (
select *, rn=ROW_NUMBER() over (partition by t.id order by sort)
from taddress t
outer apply
(
select 1, t.a where t.a is not null union all
select 2, t.b where t.b is not null union all
select 3, t.c where t.c is not null
--- EXPAND HERE
) u(sort, line)
)
select t0.id, t1.line, t2.line, t3.line
from taddress t0
left join tmp t1 on t1.id = t0.id and t1.rn=1
left join tmp t2 on t2.id = t0.id and t2.rn=2
left join tmp t3 on t3.id = t0.id and t3.rn=3
--- AND HERE
order by t0.id
EDIT - for the update back into table
;with tmp as (
select *, rn=ROW_NUMBER() over (partition by t.id order by sort)
from taddress t
outer apply
(
select 1, t.a where t.a is not null union all
select 2, t.b where t.b is not null union all
select 3, t.c where t.c is not null
--- EXPAND HERE
) u(sort, line)
)
UPDATE taddress
set a = t1.line,
b = t2.line,
c = t3.line
from taddress t0
left join tmp t1 on t1.id = t0.id and t1.rn=1
left join tmp t2 on t2.id = t0.id and t2.rn=2
left join tmp t3 on t3.id = t0.id and t3.rn=3

Update - Changed statement to an Update statement. Removed Case statement solution
With this solution, you will need a unique key in the staging table.
With Inputs As
(
Select PK, 1 As LineNum, aline1 As Value
From StagingTable
Where aline1 Is Not Null
Union All
Select PK, 2, aline2
From StagingTable
Where aline2 Is Not Null
Union All
Select PK, 3, aline3
From StagingTable
Where aline3 Is Not Null
)
, ResequencedInputs As
(
Select PK, Value
, Row_Number() Over( Partition By PK Order By LineNum ) As LineNum
From Inputs
)
, NewValues As
(
Select S.PK
, Min( Case When R.LineNum = 1 Then R.Value End ) As addrline1
, Min( Case When R.LineNum = 2 Then R.Value End ) As addrline2
, Min( Case When R.LineNum = 3 Then R.Value End ) As addrline3
From StagingTable As S
Left Join ResequencedInputs As R
On R.PK = S.PK
Group By S.PK
)
Update OtherTable
Set addrline1 = T2.addrline1
, addrline2 = T2.addrline2
, addrline3 = T2.addrline3
From OtherTable As T
Left Join NewValues As T2
On T2.PK = T.PK

R. A. Cyberkiwi, Thomas, and Martin, thanks very much - these were very generous responses by each of you. All of these answers were the type of spoonfeeding I was looking for. I'd say they all rely on a key-like device and work by dividing addresses into lines, some of which are empty and some of which aren't, excluding the empties. In the case of lines of addresses, in my opinion this is semantically a gimmick to make the problem fit what SQL does well, and it's not a natural way to conceptualize the problem. Address lines are not "really" separate rows in a table that just got denormalized for a report. But that's debatable and whether you agree or not, I (a rank beginner) think each of your alternatives are idiomatic solutions worth elaborating on and studying.
I also get lots of similar cases where there really is normalization to be done - e.g., collatDesc1, collatCode1, collatLastAppraisal1, ... collatLastAppraisal5, with more complex criteria about what to exclude and how to order than with addresses, and I think techniques from your answers will be helpful.
%%physloc%% is fun - since I'm able to create a key in this case I won't use it (as Martin advises). There were other things in Martin's answer I wasn't familiar with too, and I'm still tossing them all around.
FWIW, here's the trigger I tried out. I don't know that I'll actually use it for the problem at hand. I think this qualifies as a "bubble sort", with the swapping expressed in a peculiar way.
create trigger fixit on lines
instead of insert as
declare @maybeBlank1 as varchar(max)
declare @maybeBlank2 as varchar(max)
declare @maybeBlank3 as varchar(max)
set @maybeBlank1 = (select line1 from inserted)
set @maybeBlank2 = (select line2 from inserted)
set @maybeBlank3 = (select line3 from inserted)
declare @counter int
set @counter = 0
while @counter < 3
begin
set @counter = @counter + 1
if @maybeBlank2 = ''
begin
set @maybeBlank2 = @maybeBlank3
set @maybeBlank3 = ''
end
if @maybeBlank1 = ''
begin
set @maybeBlank1 = @maybeBlank2
set @maybeBlank2 = ''
end
end
select * into #kludge from inserted
update #kludge
set line1 = @maybeBlank1,
line2 = @maybeBlank2,
line3 = @maybeBlank3
insert into lines
select * from #kludge

You could make an insert and update trigger that checks whether the fields are empty and then moves them.
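A set-based version of such a trigger might look like the sketch below. It assumes the table is called dbo.lines with columns line1, line2, line3 (as in the trigger posted above) plus a key column id, which is an assumption; the trigger name is made up. Unlike the variable-based trigger, it handles multi-row inserts, and it writes NULL rather than '' into the emptied columns.

```sql
-- Sketch only: dbo.lines, id and trg_lines_compact are assumed names.
CREATE TRIGGER trg_lines_compact ON dbo.lines
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Do nothing when this trigger was fired by its own UPDATE below.
    IF TRIGGER_NESTLEVEL(@@PROCID) > 1 RETURN;

    WITH ranked AS (
        -- Unpivot the three columns, drop blanks, renumber what is left.
        SELECT i.id, v.line,
               ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY v.idx) AS rn
        FROM inserted AS i
        CROSS APPLY (VALUES (1, i.line1), (2, i.line2), (3, i.line3)) AS v(idx, line)
        WHERE v.line IS NOT NULL AND v.line <> ''
    ),
    compacted AS (
        -- Pivot the renumbered lines back into three columns.
        SELECT id,
               MAX(CASE WHEN rn = 1 THEN line END) AS line1,
               MAX(CASE WHEN rn = 2 THEN line END) AS line2,
               MAX(CASE WHEN rn = 3 THEN line END) AS line3
        FROM ranked
        GROUP BY id
    )
    UPDATE l
    SET line1 = c.line1,
        line2 = c.line2,
        line3 = c.line3
    FROM dbo.lines AS l
    JOIN inserted AS i ON i.id = l.id
    LEFT JOIN compacted AS c ON c.id = l.id;  -- no match: all three lines were blank
END;
```

The LEFT JOIN matters: a row whose three lines are all blank has no entry in compacted, and it should still end up with all three columns cleared.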

Related

T-SQL - Copying & Transposing Data

I'm trying to copy data from one table to another, while transposing it and combining it into appropriate rows, with different columns in the second table.
First time posting. Yes this may seem simple to everyone here. I have tried for a couple hours to solve this. I do not have much support internally and have learned a great deal on this forum and managed to get so much accomplished with your other help examples. I appreciate any help with this.
Table 1 has the data in this format.
Type Date Value
--------------------
First 2019 1
First 2020 2
Second 2019 3
Second 2020 4
Table 2 already has the Date rows populated and columns created. It is waiting for the Values from Table 1 to be placed in the appropriate column/row.
Date First Second
------------------
2019 1 3
2020 2 4

For an update, I might use two joins:
update t2
set first = tf.value,
second = ts.value
from table2 t2 left join
table1 tf
on t2.date = tf.date and tf.type = 'First' left join
table1 ts
on t2.date = ts.date and ts.type = 'Second'
where tf.date is not null or ts.date is not null;

Use conditional aggregation:
select date,max(case when type='First' then value end) as First,
max(case when type='Second' then value end) as Second from t
group by date

You can do conditional aggregation:
select date,
max(case when type = 'First' then value end) as First,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date;
After that, you can use a CTE to drive the update:
with cte as (
select date,
max(case when type = 'First' then value end) as First,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date
)
update t2
set t2.First = t1.First,
t2.Second = t1.Second
from table2 t2 inner join
cte t1
on t1.date = t2.date;

Seems like you're after a PIVOT
DECLARE @Table1 TABLE
(
[Type] NVARCHAR(100)
, [Date] INT
, [Value] INT
);
DECLARE @Table2 TABLE(
[Date] int
,[First] int
,[Second] int
)
INSERT INTO @Table1 (
[Type]
, [Date]
, [Value]
)
VALUES ( 'First', 2019, 1 )
, ( 'First', 2020, 2 )
, ( 'Second', 2019, 3 )
, ( 'Second', 2020, 4 );
INSERT INTO @Table2 (
[Date]
)
VALUES (2019),(2020)
--Show us what's in the tables
SELECT * FROM @Table1
SELECT * FROM @Table2
--How to pivot the data from Table 1
SELECT * FROM @Table1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make a column for each of these [Type] values
) AS [pvt] --Table alias
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4
--Using that we can update @Table2
UPDATE [tbl2]
SET [tbl2].[First] = pvt.[First]
,[tbl2].[Second] = pvt.[Second]
FROM @Table1 tbl1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make a column for each of these [Type] values
) AS [pvt] --Table alias
INNER JOIN @Table2 tbl2 ON [tbl2].[Date] = [pvt].[Date]
--Results from @Table2 after the update
SELECT * FROM @Table2
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4

Using Multiple CTE with one INSERT statement. [Error:More Columns than Specified in Column List]

I have CTE1 and CTE2 as below. The CTE2 shows error
CTE2 has more columns than specified in the column list
I would like to know what I am doing wrong. It cannot be because Insert statement has more columns than the CTE2 because CTE1 worked fine before. CTE1 and CTE2 are both using different tables. Is that the problem?
If I remove the parameters in CTE2(NewRoomCost,NewQuantity) Then I get the error
No Columns specified for Column3 of CTE2
Below is the code that I tried. Any help would be appreciated.
CREATE PROCEDURE dbo.SpTransactionGenerate
AS
BEGIN
SET NOCOUNT ON
DECLARE @MinReservationId INT = (SELECT MIN(f.ReservationId) FROM dbo.Reservation AS f)
DECLARE @MaxReservationId INT = (SELECT MAX(f.ReservationId) FROM dbo.Reservation AS f)
DECLARE @FirstSeasonEndDate DATE= '2018-02-13';
DECLARE @SecondSeasonEndDate DATE='2018-02-14';
DECLARE @ThirdSeasonEndDate DATE='2018-12-31';
WHILE @MinReservationId<=@MaxReservationId
BEGIN
WITH CTE1(ServiceId,ServiceRate,Quantity) AS
(
SELECT ServiceId,
ServiceRate,
ABS(CHECKSUM(NEWID())%3) + 1 AS Quantity
FROM dbo.[Service]
),
CTE2(NewRoomCost,NewQuantity) AS
(
SELECT
(SELECT roomRate.RoomCost FROM dbo.RoomRate as roomRate WHERE roomRate.RoomTypeId=
(SELECT room.RoomTypeId FROM dbo.Room as room
JOIN dbo.Reservation as res ON res.RoomId=room.RoomId WHERE res.ReservationId=@MinReservationId
AND roomRate.SeasonId=(
CASE WHEN (SELECT resv.CheckInDate FROM dbo.Reservation as resv WHERE resv.ReservationId=@MinReservationId)<=@FirstSeasonEndDate
THEN (SELECT sea.SeasonId FROM dbo.Season as sea WHERE sea.SeasonEndDate=@FirstSeasonEndDate)
WHEN (SELECT resv.CheckInDate FROM dbo.Reservation as resv WHERE resv.ReservationId=@MinReservationId)<=@SecondSeasonEndDate
THEN (SELECT sea.SeasonId FROM dbo.Season as sea WHERE sea.SeasonEndDate=@SecondSeasonEndDate)
ELSE (SELECT sea.SeasonId FROM dbo.Season as sea WHERE sea.SeasonEndDate=@ThirdSeasonEndDate) END
)
)) AS NewRoomCost,
DATEDIFF(DAY,(SELECT r.CheckinDate FROM dbo.Reservation AS r WHERE r.ReservationId=@MinReservationId), (SELECT r.CheckOutDate FROM dbo.Reservation AS r WHERE r.ReservationId=@MinReservationId)) AS NewQuantity,
)
INSERT INTO dbo.[Transaction]
(
ReservationId,
ServiceId,
Rate,
Quantity,
Amount
)
SELECT
@MinReservationId,
ServiceId,
ServiceRate,
Quantity,
ServiceRate*Quantity
FROM CTE1
UNION
SELECT
@MinReservationId,
NULL,
NewRoomCost,
NewQuantity,
NewRoomCost*NewQuantity
FROM CTE2
SELECT @MinReservationId=@MinReservationId+1
END
END
UPDATE : The error resulted because of a single extra comma in the CTE2. Sorry for the unnecessary question asked.

The issue in CTE2 is that you have an extra comma at the end of this line:
DATEDIFF(DAY,(SELECT r.CheckinDate FROM dbo.Reservation AS r WHERE r.ReservationId=@MinReservationId), (SELECT r.CheckOutDate FROM dbo.Reservation AS r WHERE r.ReservationId=@MinReservationId)) AS NewQuantity,
A sidenote: I suggest not writing explicit column names in the future but rather just naming them as you already did with the AS keyword. It just gives more flexibility overall.
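As a rough illustration of that side note, here is a CTE with no explicit column list, so each column takes its name from the AS alias in the SELECT. The table and column names come from the question; the CTE name StayLength is made up for this sketch.

```sql
-- No column list after the CTE name: the aliases inside the SELECT
-- define the column names, so the list and the SELECT cannot disagree.
WITH StayLength AS
(
    SELECT r.ReservationId,
           DATEDIFF(DAY, r.CheckInDate, r.CheckOutDate) AS NewQuantity
    FROM dbo.Reservation AS r
)
SELECT ReservationId, NewQuantity
FROM StayLength;
```

Written this way, a stray trailing comma still fails, but as a plain syntax error at the comma rather than as a column-count mismatch.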

Because your 2nd CTE defines two columns:
CTE2(NewRoomCost,NewQuantity) AS
But your select statement returns 3.
(SELECT roomRate.RoomCost FROM...
DATEDIFF(DAY,(SELECT r.CheckinDate...
(SELECT r.CheckOutDate FROM dbo.Reservation...

How to merge two columns from CASE STATEMENT of DIFFERENT CONDITION

My expected result should be like
----invoiceNo----
T17080003,INV14080011
But right now, I've come up with following query.
SELECT AccountDoc.jobCode,AccountDoc.shipmentSyskey,AccountDoc.docType,
CASE AccountDoc.docType
WHEN 'M' THEN
JobInvoice.invoiceNo
WHEN 'I' THEN
(STUFF((SELECT ', ' + RTRIM(CAST(AccountDoc.docNo AS VARCHAR(20)))
FROM AccountDoc LEFT OUTER JOIN JobInvoice
ON AccountDoc.principalCode = JobInvoice.principalCode AND
AccountDoc.jobCode = JobInvoice.jobCode
WHERE (AccountDoc.isCancelledByCN = 0)
AND (AccountDoc.docType = 'I')
AND (AccountDoc.jobCode = @jobCode)
AND (AccountDoc.shipmentSyskey = @shipmentSyskey)
AND (AccountDoc.principalCode = @principalCode) FOR XML
PATH(''), TYPE).value('.','NVARCHAR(MAX)'),1,2,' '))
END AS invoiceNo
FROM AccountDoc LEFT OUTER JOIN JobInvoice
ON JobInvoice.principalCode = AccountDoc.principalCode AND
JobInvoice.jobCode = AccountDoc.jobCode
WHERE (AccountDoc.jobCode = @jobCode)
AND (AccountDoc.isCancelledByCN = 0)
AND (AccountDoc.shipmentSyskey = @shipmentSyskey)
AND (AccountDoc.principalCode = @principalCode)
OUTPUT:
----invoiceNo----
T17080003
INV14080011
Explanation:
I want to select docNo from table AccountDoc if AccountDoc.docType = 'I',
or select invoiceNo from table JobInvoice if AccountDoc.docType = 'M'.
The problem: if the same jobCode has two docType values, M and I, how do I display both invoices in one row?

You can achieve this by using a CTE and FOR XML. Below is sample code I created using tables similar to yours:
Create table #AccountDoc (
id int ,
docType char(1),
docNo varchar(10)
)
Create table #JobInvoice (
id int ,
invoiceNo varchar(10)
)
insert into #AccountDoc
select 1 , 'M' ,'M1234'
union all select 2 , 'M' ,'M2345'
union all select 3 , 'M' ,'M3456'
union all select 4 , 'I' ,'I1234'
union all select 5 , 'I' ,'I2345'
union all select 6 , 'I' ,'I3456'
insert into #JobInvoice
select 1 , 'INV1234'
union all select 2 , 'INV2345'
union all select 3 , 'INV3456'
select *
from #AccountDoc t1 left join #JobInvoice t2
on t1.id = t2.id
;with cte as
(
select isnull( case t1.docType WHEN 'M' THEN t2.invoiceNo WHEN 'I' then
t1.docNo end ,'') invoiceNo
from #AccountDoc t1 left join #JobInvoice t2
on t1.id = t2.id )
select invoiceNo + ',' from cte For XML PATH ('')

You need to pivot your data if you have situations where there are two rows, and you want two columns. Your sql is a bit messy, particularly the bit where you put an entire select statement inside a case when in the select part of another query. These two queries are virtually the same, you should look for a more optimal way of writing them. However, you can wrap your entire sql in the following:
select
Jobcode, shipmentsyskey, [M],[I]
from(
--YOUR ENTIRE SQL GOES HERE BETWEEN THESE BRACKETS. Do not alter anything else, just paste your entire sql here
) yoursql
pivot(
max(invoiceno)
for docType in([M],[I])
)pvt

how to scan each row of a table, and update current row based on previous row?

I need to update the current row using the following logic:
if current row is null, then set it as previous row
if current row is not null, then no action
the 1st row is not null, then NULL appears randomly
Those NULLs need to be updated using the logic previously mentioned
e.g.
1. 1
2. null
3. null
4. 2
5. null
6. null
needs to be updated as
1. 1
2. 1
3. 1
4. 2
5. 2
6. 2
How to do it in SQL?
Thanks

When there are two or more NULL values in a row, each NULL needs the nearest preceding non-null value, so I think OUTER APPLY will handle your problem:
CREATE TABLE #TB(ID Int Identity(1, 1), Value Int)
INSERT INTO #TB([Value]) VALUES(1),(Null),(Null),(2),(Null),(Null)
UPDATE G SET G.Value = GG.Value
FROM
#TB AS G
OUTER APPLY
(SELECT
TOP 1 *
FROM
#TB AS GG
WHERE
GG.Value IS NOT NULL
AND
GG.ID < G.ID
ORDER BY
GG.ID DESC
) AS GG
WHERE
G.Value IS NULL
SELECT * FROM #TB AS T
Note that if the first value is NULL it stays NULL, since you have not defined the logic for that scenario.

This might help:
SELECT
t1.col1,
t1.col2 AS previous,
(SELECT
t2.col2
FROM table_1 t2
WHERE t2.col1 = (SELECT
MAX(t3.col1)
FROM table_1 t3
WHERE t3.col1 <= t1.col1
AND col2 IS NOT NULL))
AS new
FROM table_1 t1;

Where are you using this SQL code? If you are using Hive SQL for example, there is a function which allows you to directly get last non null value:
LAST_VALUE(col, true) over (PARTITION BY id ORDER BY date)
Oracle 10g also has a function to do this, as addressed in this thread:
Fill null values with last non-null amount - Oracle SQL
Are you familiar with window functions?
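For T-SQL, one window-function approach is sketched below. It assumes SQL Server 2012 or later (for ORDER BY inside an aggregate's OVER clause) and an id column that defines the row order; the temp table #t and the names grp and filled are illustrative only. COUNT(value) skips NULLs, so the running count only increases on non-null rows, which puts each NULL in the same group as the last non-null value before it.

```sql
-- Sample data matching the question: 1, NULL, NULL, 2, NULL, NULL
CREATE TABLE #t (id INT IDENTITY(1, 1), value INT);
INSERT INTO #t (value) VALUES (1), (NULL), (NULL), (2), (NULL), (NULL);

WITH grouped AS (
    SELECT id, value,
           -- Running count of non-null values: 1, 1, 1, 2, 2, 2
           COUNT(value) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grp
    FROM #t
)
SELECT id,
       -- Each group holds exactly one non-null value; spread it over the group.
       MAX(value) OVER (PARTITION BY grp) AS filled
FROM grouped
ORDER BY id;
-- filled: 1, 1, 1, 2, 2, 2

DROP TABLE #t;
```

Because the grouping depends on position rather than on the values themselves, this also works when the non-null values are not in increasing order.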

while (select count(*) FROM Table_1 where c1_derived = '') > 0
begin
update top(1) Table_1
set c1_derived = (select c1_derived from Table_1 t2 where (t2.id = [Table_1].id-1))
where c1_derived = ''
end

Try the below script. (sql 2008 +)
CREATE TABLE #table(id Int Identity(1, 1), value Int)
INSERT INTO #table([Value]) VALUES(1),(Null),(Null),(2),(Null),(Null)
;WITH cte AS
(
SELECT ID,Value,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS row
FROM #table
)
SELECT a.ID,max(b.Value)
FROM cte a
INNER JOIN cte b ON a.row >=b.row
GROUP BY a.ID
drop table #table
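Edit 2: here is another script, using "ROWS UNBOUNDED PRECEDING" (this form needs SQL Server 2012+). Like the script above it takes a running MAX, so it only fills correctly when the non-null values never decrease.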
CREATE TABLE #table(id Int Identity(1, 1), value Int)
INSERT INTO #table([Value]) VALUES(1),(Null),(Null),(2),(Null),(Null)
select * ,max(t.value) over(order by Id Rows UNBOUNDED PRECEDING) maxValue
from #table t
drop table #table
See the documentation for the OVER clause:
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql

SQL - How to list all tuples in a relation such that tuple 1 is greater than tuple 2

Suppose that I have a relation with only 1 column "Value (INT)" and its values are in descending order.
+----------+
+ VALUE +
+----------+
+ 10 +
+ 9 +
+ 8 +
+ 7 +
....
How can I list all the combinations of two tuples such that the first tuple is greater than the second tuple?
Note: two tuples may have the same value.
The desired outputs should be like: (10,9) (10, 8) (9,8), (9,7) (8,7)

You can do a self join on the same table.
SELECT t1.VALUE AS VALUE1, t2.VALUE AS VALUE2
FROM thing t1 JOIN thing t2 ON t1.VALUE > t2.VALUE

I understand that a single tuple may appear only twice on the left side of the result set, am I right? Is that why there is no (10,7)?
Then you need to compute a row number.
select t1.value, t2.value
from
(
select t.value, row_number() over (order by t.value) as rnum
from table t
) t1 inner join
(
select t.value, row_number() over (order by t.value) as rnum
from table t
) t2 on t1.value > t2.value and t1.rnum <= t2.rnum + 2
Performance of this query will be pretty bad, but I don't know what database you are using - I've used the MS SQL row_number function.
Another idea:
If you are using SQL Server 2012+ and your answer to the question posed at the beginning of this post is positive, you can use:
select t.value, lead(t.value,1,0) over(order by t.value desc) as lead1,
lead(t.value,2,0) over(order by t.value desc) as lead2
from table t
You may need to handle 0 (the default value when there is no "lead" tuple). I'm not sure if output in this form is acceptable.
And here you go with cursor solution:
DECLARE @result TABLE
(
value1 int,
value2 int
);
DECLARE
@value int,
@lag1 int,
@lag2 int
DECLARE c CURSOR LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
SELECT value
FROM table
ORDER BY value desc
OPEN c;
FETCH NEXT FROM c INTO @lag2;
FETCH NEXT FROM c INTO @lag1;
FETCH NEXT FROM c INTO @value;
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT @result(value1, value2) SELECT @lag2, @lag1
INSERT @result(value1, value2) SELECT @lag2, @value
SET @lag2 = @lag1
SET @lag1 = @value
FETCH NEXT FROM c INTO @value
END
-- The last two values form one more pair that the loop does not emit
IF @lag1 IS NOT NULL
INSERT @result(value1, value2) SELECT @lag2, @lag1
CLOSE c;
DEALLOCATE c;
SELECT value1, value2 FROM @result;
Again, I used MS SQL syntax. If you write how you want duplicates handled, I can update the solution.
