Merge Multiple SQL Table Columns into One - sql

I have a table like this (with thousands of rows):
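+-----------------------------+-------------------------+-------------------------+-------------------------+
| ogr_fid (PK, int, not null) | 90 (varchar(157), null) | 80 (varchar(157), null) | 70 (varchar(157), null) |
+-----------------------------+-------------------------+-------------------------+-------------------------+
| 1                           | some_text               | NULL                    | NULL                    |
| 2                           | NULL                    | NULL                    | other_text              |
| 3                           | NULL                    | more_text               | NULL                    |
| 4                           | even_more_text          | NULL                    | NULL                    |
| 5                           | NULL                    | NULL                    | NULL                    |
+-----------------------------+-------------------------+-------------------------+-------------------------+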
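I would like to merge the table to read as follows:
+-----------------------------+-------------------+
| ogr_fid (PK, int, not null) | m_agl (int, null) |
+-----------------------------+-------------------+
| 1                           | 90                |
| 2                           | 70                |
| 3                           | 80                |
| 4                           | 90                |
| 5                           | NULL              |
+-----------------------------+-------------------+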
As you can see, I will use the text simply as an IS NOT NULL test. The merged column will be populated with the name of the column that IS NOT NULL.
I have been trying to understand IIF and CASE but it is far beyond my current level of understanding without a working example to learn from.
My latest attempt:
ALTER TABLE [dbo].[my_table] ADD [m_agl] AS SELECT
CASE WHEN 90 IS NOT NULL THEN '90'
CASE WHEN 80 IS NOT NULL THEN '80'
CASE WHEN 70 IS NOT NULL THEN '70'
GO
ALTER TABLE [dbo].[my_table] DROP COLUMN (90, 80, 70)
GO
Many thanks.

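Note: a column name cannot be a bare integer unless it is delimited, for example [90].
Solution using IIF (column_1, column_2 and column_3 stand in for your [90], [80] and [70] columns):
ALTER TABLE [dbo].[my_table] ADD [column_merged] int NULL;
GO
-- Store the number of the first non-NULL column; rows with no text stay NULL.
UPDATE dbo.my_table
SET column_merged = IIF(column_1 IS NOT NULL, 90, IIF(column_2 IS NOT NULL, 80, IIF(column_3 IS NOT NULL, 70, NULL)));
GO
-- DROP COLUMN takes a plain comma-separated list, without parentheses.
ALTER TABLE [dbo].[my_table] DROP COLUMN column_1, column_2, column_3;
GO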
Learn more about CASE at Microsoft Docs
Learn more about IIF at Microsoft Docs
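Since the question also asks about CASE, here is a minimal sketch of the same test written as a single CASE expression. It runs against a scratch temp table (#my_table is a hypothetical name, filled with the sample rows from the question) so it can be tried without touching the real table.

```sql
-- Hypothetical scratch table mirroring the question's layout.
CREATE TABLE #my_table (
    ogr_fid int NOT NULL PRIMARY KEY,
    [90] varchar(157) NULL,
    [80] varchar(157) NULL,
    [70] varchar(157) NULL
);

INSERT INTO #my_table (ogr_fid, [90], [80], [70]) VALUES
    (1, 'some_text',      NULL,        NULL),
    (2, NULL,             NULL,        'other_text'),
    (3, NULL,             'more_text', NULL),
    (4, 'even_more_text', NULL,        NULL),
    (5, NULL,             NULL,        NULL);

-- One CASE with several WHEN branches; the first branch that matches wins.
SELECT ogr_fid,
       CASE
           WHEN [90] IS NOT NULL THEN 90
           WHEN [80] IS NOT NULL THEN 80
           WHEN [70] IS NOT NULL THEN 70
       END AS m_agl   -- no ELSE branch, so all-NULL rows yield NULL
FROM #my_table
ORDER BY ogr_fid;

DROP TABLE #my_table;
```

The question's attempt repeated the CASE keyword on every line and never closed it with END; a CASE expression needs one CASE, any number of WHEN ... THEN branches, and a final END.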

Related

DATEDIFF based on next populated column in row

I am working on a query in SQL Server that is giving me a result set that looks something like this:
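+----+-------------+--------------+--------------+--------------+
| ID | DaysInState | DaysInState2 | DaysInState3 | DaysInState4 |
+----+-------------+--------------+--------------+--------------+
| 1  | 2022-04-01  | 2022-04-07   | NULL         | NULL         |
| 2  | NULL        | 2022-04-09   | NULL         | NULL         |
| 3  | 2022-04-11  | 2022-04-15   | NULL         | 2022-04-18   |
| 4  | 2022-04-11  | NULL         | NULL         | 2022-04-18   |
+----+-------------+--------------+--------------+--------------+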
I need to calculate the number of days that a given item spent in a given state. The challenge I am facing is 'looking ahead' in the row. Using row 1 as an example these would be the following values:
DaysInState: 6 (DATEDIFF(day, '2022-04-01', '2022-04-07'))
DaysInState2: 12 (DATEDIFF(day, '2022-04-07', GETDATE()))
DaysInState3: NULL
DaysInState4: NULL
The challenging part here is that for each column in each row, I have to look at all the columns to the right of the reference column to see if a date exists to use in DATEDIFF. If a date is not found to the right of the reference column then GETDATE() is used. The table below shows what the result set should look like:
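+----+-------------+--------------+--------------+--------------+
| ID | DaysInState | DaysInState2 | DaysInState3 | DaysInState4 |
+----+-------------+--------------+--------------+--------------+
| 1  | 6           | 12           | NULL         | NULL         |
| 2  | NULL        | 10           | NULL         | NULL         |
| 3  | 4           | 3            | NULL         | 1            |
| 4  | 7           | NULL         | NULL         | 1            |
+----+-------------+--------------+--------------+--------------+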
I can write fairly convoluted CASE...WHEN statements for each column such that
SELECT
CASE
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState2)
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NULL AND DaysInState3 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState3)
...
END
...
However this isn't very maintainable when states are added / removed. Is there a more dynamic approach to solving this problem that doesn't involve lengthy CASE statements or just generally a "better" approach that maybe I am not seeing?
The COALESCE function allows multiple parameters, evaluating them from left to right, returning the first non-null value, eliminating the need for nesting:
DaysInState =
DATEDIFF(day,
DaysInState,
COALESCE(DaysInState2
,DaysInState3
,DaysInState4
,GETDATE())
)
If it is possible to adjust the query generating your result set, I'd like to suggest a new approach. One advantage is that it can handle additional DayInState variables (5,6,7,...).
Rewrite your query so that your results have three columns: one for ID, one for "DayInState" number, and one for the date. That is, no NULL values returned. Union the result set with the distinct IDs, an exceedingly large "DayInState" number, and the result of GETDATE(). Then you can use DATEDIFF() with LAG() to look at the next dates.
Here's a working example in SQL Server using your data:
begin
declare @temp table (id int,state_num int,dt date)
insert into @temp values
(1,1,'2022-04-01'),
(1,2,'2022-04-07'),
(2,2,'2022-04-09'),
(3,1,'2022-04-11'),
(3,2,'2022-04-15'),
(3,4,'2022-04-18'),
(4,1,'2022-04-11'),
(4,4,'2022-04-18')
-- LAG over a descending order returns the next state's date for each row.
-- The WHERE filter runs before LAG, so the default argument supplies GETDATE() for the last state.
select t.id,t.state_num,DATEDIFF(day,t.dt,LAG(t.dt,1,GETDATE()) over(partition by t.id order by t.state_num desc)) as days_in_state
from
(select * from @temp
union (select distinct id,999 as state_num, GETDATE() as dt from @temp) ) t
where t.state_num!=999
order by t.id,t.state_num
end
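If the query producing the wide result set cannot be changed, the three-column shape can also be derived from it on the fly. The following is a sketch under a few assumptions: #wide is a hypothetical temp table holding the question's rows, @today is pinned to 2022-04-19 so the numbers match the expected output (use GETDATE() in practice), and LEAD over ascending order is used in place of LAG over descending order.

```sql
-- Assumption: fixed "today" so the output matches the question's expected values.
DECLARE @today date = '2022-04-19';

-- Hypothetical stand-in for the wide result set from the question.
CREATE TABLE #wide (
    Id           int  NOT NULL PRIMARY KEY,
    DaysInState  date NULL,
    DaysInState2 date NULL,
    DaysInState3 date NULL,
    DaysInState4 date NULL
);

INSERT INTO #wide (Id, DaysInState, DaysInState2, DaysInState3, DaysInState4) VALUES
    (1, '2022-04-01', '2022-04-07', NULL, NULL),
    (2, NULL,         '2022-04-09', NULL, NULL),
    (3, '2022-04-11', '2022-04-15', NULL, '2022-04-18'),
    (4, '2022-04-11', NULL,         NULL, '2022-04-18');

-- Unpivot each row into (state_num, dt) pairs, drop the NULL dates,
-- then compare each date with the next populated one for the same Id.
SELECT w.Id,
       s.state_num,
       DATEDIFF(day, s.dt,
                LEAD(s.dt, 1, @today) OVER (PARTITION BY w.Id ORDER BY s.state_num)) AS days_in_state
FROM #wide AS w
CROSS APPLY (VALUES (1, w.DaysInState),
                    (2, w.DaysInState2),
                    (3, w.DaysInState3),
                    (4, w.DaysInState4)) AS s(state_num, dt)
WHERE s.dt IS NOT NULL
ORDER BY w.Id, s.state_num;

DROP TABLE #wide;
```

Adding a fifth state then only means adding one more pair to the VALUES list. With the pinned date this returns 6 and 12 for ID 1, 10 for ID 2, 4, 3 and 1 for ID 3, and 7 and 1 for ID 4, one row per populated state rather than one column per state.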
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DateTest]') AND type in (N'U'))
DROP TABLE [dbo].[DateTest]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DateTest](
[Id] [int] IDENTITY(1,1) NOT NULL,
[DaysInState] [date] NULL,
[DaysInState2] [date] NULL,
[DaysInState3] [date] NULL,
[DaysInState4] [date] NULL,
CONSTRAINT [PK_DateTest] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[DateTest] ([DaysInState],[DaysInState2],[DaysInState3],[DaysInState4]) VALUES
('2022-04-01','2022-04-07',NULL,NULL),
(NULL,'2022-04-09',NULL,NULL),
('2022-04-11','2022-04-15',NULL,'2022-04-18'),
('2022-04-11',NULL,NULL,'2022-04-18');
GO
SELECT [ID],[DaysInState],[DaysInState2],[DaysInState3],[DaysInState4] FROM dbo.DateTest
Use nested ISNULL to check the next column or pass GETDATE().
Using the variable means you can alter the date if needed.
DECLARE @theDate date = GETDATE()
SELECT
[DaysInState] =DATEDIFF(day,[DaysInState], ISNULL([DaysInState2],ISNULL([DaysInState3],ISNULL([DaysInState4],@theDate))))
,[DaysInState2] =DATEDIFF(day,[DaysInState2],ISNULL([DaysInState3],ISNULL([DaysInState4],@theDate)))
,[DaysInState3] =DATEDIFF(day,[DaysInState3],ISNULL([DaysInState4],@theDate))
,[DaysInState4] =DATEDIFF(day,[DaysInState4],@theDate)
FROM dbo.DateTest
Your original query
SELECT
[DaysInState]=CASE
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState2)
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NULL AND DaysInState3 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState3)
END
FROM dbo.DateTest

Compare between columns on different tables

I need to write T-SQL code that finds the T2 stage that T1.PercentComplete falls into, that is, the row where it lies between T2.StageFrom and T2.StageTo, and then returns the matching T2.Bonus_Prec joined to T1.
The desired result for T2.Bonus_Prec is 0.02, since T1.Percent_Complete is 0.27, which is between 0 and 0.8.
The thing is that each Key can have different T2.StageID values between 1 and 6.
If a Key has just one T2.StageID, it will be 0 (a fast way for me to know that there is only one bonus option).
If it has more than one, they will start at 1 (this can be changed if needed).
T1:
DROP TABLE T1;
CREATE TABLE T1(
Key VARCHAR(10) NOT NULL PRIMARY KEY
,Percent_Complete_ NUMBER(16,2) NOT NULL
);
INSERT INTO T1(Key,Percent_Complete_) VALUES ('Key Vendor',Percent_Complete);
INSERT INTO T1(Key,Percent_Complete_) VALUES ('***',0.27);
T2:
DROP TABLE T2;
CREATE TABLE T2(
Key VARCHAR(50) NOT NULL
,StageID INT NOT NULL
,Stage_From NUMERIC(10,2) NOT NULL
,Stage_To NUMERIC(8,2) NOT NULL
,Stage_Bonus_Prec NUMERIC(16,2) NOT NULL
);
INSERT INTO T2(Key,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('Key',Stage_Id,Stage_From,Stage_To,Stage_Bonus_Prec);
INSERT INTO T2(Key,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('***',1,0,0.8,0.02);
INSERT INTO T2(Key,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('***',2,0.8,1,0.035);
INSERT INTO T2(Key,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('***',3,1,-1,0.05);
OUTPUT:
+-----+-------------------+--------------------+
| Key | Percent_Complete | [Stage_Bonus_Prec] |
+-----+-------------------+--------------------+
| *** | 0.27 | 0.02 |
+-----+-------------------+--------------------+
Here is a SQLFiddle with these values
It is still not clear what you are trying to do, but I made an attempt. Please note that I also corrected a number of issues with the DDL and sample data you posted.
if OBJECT_ID('T1') is not null
drop table T1
CREATE TABLE T1(
KeyVendor VARCHAR(10) NOT NULL PRIMARY KEY
,PercentComplete NUMERIC(16,2) NOT NULL
);
INSERT INTO T1(KeyVendor,PercentComplete) VALUES ('***',0.27);
if OBJECT_ID('T2') is not null
drop table T2
-- Stage_From and Stage_To keep two decimals so that 0.8 is not rounded to 1
CREATE TABLE T2(
MyKey VARCHAR(50) NOT NULL
,StageID INT NOT NULL
,Stage_From NUMERIC(10,2) NOT NULL
,Stage_To NUMERIC(8,2) NOT NULL
,Stage_Bonus_Prec NUMERIC(16,3) NOT NULL
);
INSERT INTO T2(MyKey,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('***',1,0,0.8,0.02);
INSERT INTO T2(MyKey,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('***',2,0.8,1,0.035);
INSERT INTO T2(MyKey,StageID,Stage_From,Stage_To,Stage_Bonus_Prec) VALUES ('***',3,1,-1,0.05);
select *
from T1
cross apply
(
-- pick the highest stage whose lower bound has been reached
select top 1 Stage_Bonus_Prec
from T2
where t1.PercentComplete >= t2.Stage_From
and t1.KeyVendor = t2.MyKey
order by t2.Stage_From desc
) x
Taking a shot at this as well, since it's still a bit unclear:
SELECT t1.percent_complete, t2.Stage_Bonus_Prec
FROM T1 INNER JOIN T2
ON T1.[key vendor] = T2.[Key] AND
T1.[percent_complete] BETWEEN T2.Stage_From AND T2.Stage_To
This joins T1 and T2 on [Key Vendor] and [Key] and uses the BETWEEN operator to find the row where percent_complete lies between Stage_From and Stage_To.
I think I'm still missing something since I'm still confused about where Key value of *** comes from in your desired results.
SQLFiddle of this in action, based on a slightly fixed up version of your DDL (you put your field names in their own data record, I've removed them since they don't belong there).

MULTIPLE INSERT OVERWRITE in HIVE

I'm trying to do multiple insert overwrite in Hive by the following commands.
INSERT OVERWRITE table results_3 SELECT NULL, res, NULL, NULL FROM results where field= 'title';
And the content of results_3 table after the first command
NULL Up On Cripple Creek (2000 Digital Remaster) NULL NULL
NULL The Weight (2000 Digital Remaster) NULL NULL
NULL Rhythm Of The Rain (LP Version) NULL NULL
NULL Who'll Stop the Rain NULL NULL
NULL I Walk the Line NULL NULL
NULL Against The Wind NULL NULL
NULL Lyin' Eyes NULL NULL
NULL North To Alaska NULL NULL
NULL You Gave Me A Mountain NULL NULL
NULL Night Moves NULL NULL
INSERT OVERWRITE table results_3 SELECT NULL, NULL, res, NULL FROM results where field= 'albums';
And the content of results_3 table after the second command
NULL NULL The Band NULL
NULL NULL The Band NULL
NULL NULL The Cascades NULL
NULL NULL Creedence Clearwater Revival NULL
NULL NULL Johnny Cash NULL
NULL NULL Bob Seger NULL
NULL NULL The Eagles NULL
NULL NULL Johnny Horton NULL
NULL NULL Marty Robbins NULL
NULL NULL Bob Seger NULL
But I want to merge the two results together. Do you have any idea how I can tackle this?
Thanks
You can append this way:
INSERT OVERWRITE TABLE table1
SELECT col1 ... coln
FROM
(
SELECT col1 ... coln FROM table1 -- old data
UNION ALL
SELECT col1 ... coln FROM table2 -- new data
) merged
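Applied to the two statements from the question, the UNION ALL idea might look like the sketch below. It assumes that results_3 has four STRING columns (the question does not show its schema), and the aliases c1 to c4 and merged are hypothetical names used only inside this query. The NULLs are cast explicitly so both branches of the union have matching column types.

```sql
-- Assumption: results_3 has four STRING columns; adjust the casts otherwise.
INSERT OVERWRITE TABLE results_3
SELECT merged.c1, merged.c2, merged.c3, merged.c4
FROM (
    -- rows that previously came from the first INSERT OVERWRITE
    SELECT CAST(NULL AS STRING) AS c1, res AS c2,
           CAST(NULL AS STRING) AS c3, CAST(NULL AS STRING) AS c4
    FROM results
    WHERE field = 'title'
    UNION ALL
    -- rows that previously came from the second INSERT OVERWRITE
    SELECT CAST(NULL AS STRING) AS c1, CAST(NULL AS STRING) AS c2,
           res AS c3, CAST(NULL AS STRING) AS c4
    FROM results
    WHERE field = 'albums'
) merged;
```

This writes both sets of rows in one statement, so the second load no longer wipes out the first. It stacks the rows rather than pairing each title with its album on the same row; that would need some key in results that links the two, which the question does not show.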
Hive insert does not support append so far.
A simple way: insert overwrite into two directories and merge them manually.
Or insert into a table using a different partition for each load (in fact, different partitions are stored in different directories).
Please see the Hive wiki for more information.

How to allow temporary tables to accept null values

If you create temp tables using "select into" in SQL Server, it uses the first insert to determine whether a column accepts NULL values or not. If the first insert has a NULL value, the column becomes nullable; otherwise it will be non-nullable.
Is there a way to create temp tables using "select into" so that they accept NULL values?
Example
This works without any problem
Select 'one' as a , null as b
into #temp
insert into #temp
Select 'two' as a , 500 as b
However this throws "Cannot insert the value NULL into column 'b'"
Select 'one' as a , 500 as b
into #temp
insert into #temp
Select 'two' as a , null as b
I know I could use a CREATE TABLE or ALTER COLUMN statement, but I want to do it without rewriting hundreds of existing queries.
How about this?
Select CONVERT(varchar(100), 'one') as a , CONVERT(int, 500) as b
into #temp
insert into #temp
Select 'two' as a , null as b
select * from #temp order by 1
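To see what SELECT ... INTO actually decided for each column, the nullability can be read back from the catalog of tempdb. This is a small sketch for SQL Server; #plain and #converted are hypothetical table names, one built from bare literals and one from CONVERT expressions as in the answer above.

```sql
-- Built from bare literals, as in the failing example from the question.
SELECT 'one' AS a, 500 AS b
INTO #plain;

-- Built from CONVERT expressions, as suggested above.
SELECT CONVERT(varchar(100), 'one') AS a, CONVERT(int, 500) AS b
INTO #converted;

-- is_nullable = 1 means the column accepts NULL.
SELECT 'plain' AS temp_table, c.name, c.is_nullable
FROM tempdb.sys.columns AS c
WHERE c.object_id = OBJECT_ID('tempdb..#plain')
UNION ALL
SELECT 'converted', c.name, c.is_nullable
FROM tempdb.sys.columns AS c
WHERE c.object_id = OBJECT_ID('tempdb..#converted');

DROP TABLE #plain, #converted;
```

Comparing the two result groups shows whether wrapping a literal in CONVERT changes the nullability on your server version before you rely on it across many queries.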
I would work around this by explicitly creating the temporary table before the first insert.
create table #temp (a varchar(10) not null, b int null)
(Un)fortunately, this question is too popular and appears at the top for Sybase ASE 15.7 as well, so just adding my answer for Sybase here.
For me neither of cast, convert or coalesce worked, but a case statement did (which is what coalesce is, but eh...)
select
a = case when 1 = 0 then null else 'one' end,
b = case when 1 = 0 then null else 500 end
into #temp
This is an old question but I had a similar issue where I UNION NULLs to the initial query which may have helped the OP.
Select 'one' as a , 500 as b
into #temp
UNION
SELECT NULL, NULL
insert into #temp
Select 'two' as a , NULL as b
Putting it here so the next time I need to do this and forget how...

Add a SQL XOR Constraint between two nullable FK's

I'd like to define a constraint between two nullable FKs in a table such that exactly one of them has a value: if one is null the other needs a value, but both can't be null and both can't have values. The logic is that the derived table inherits data from either of the FK tables to determine its type. Also, for fun bonus points, is this a bad idea?
One way to achieve it is to simply write down what "exclusive OR" actually means:
CHECK (
(FK1 IS NOT NULL AND FK2 IS NULL)
OR (FK1 IS NULL AND FK2 IS NOT NULL)
)
However, if you have many FKs, the above method can quickly become unwieldy, in which case you can do something like this:
CHECK (
1 = (
(CASE WHEN FK1 IS NULL THEN 0 ELSE 1 END)
+ (CASE WHEN FK2 IS NULL THEN 0 ELSE 1 END)
+ (CASE WHEN FK3 IS NULL THEN 0 ELSE 1 END)
+ (CASE WHEN FK4 IS NULL THEN 0 ELSE 1 END)
...
)
)
BTW, there are legitimate uses for that pattern, for example this one (albeit not applicable to MS SQL Server due to the lack of deferred constraints). Whether it is legitimate in your particular case, I can't judge based on the information you provided so far.
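As a runnable illustration of the counting form of the check, here is a sketch for SQL Server with three nullable columns. The table #derived and its columns are hypothetical, and the columns are plain ints rather than real foreign keys, because foreign key constraints are not enforced on temporary tables; only the CHECK logic is being demonstrated.

```sql
-- Hypothetical table: FK1..FK3 stand in for the nullable foreign keys.
CREATE TABLE #derived (
    id  int NOT NULL PRIMARY KEY,
    FK1 int NULL,
    FK2 int NULL,
    FK3 int NULL,
    -- exactly one of the three columns must be non-NULL
    CHECK (
        1 = (CASE WHEN FK1 IS NULL THEN 0 ELSE 1 END)
          + (CASE WHEN FK2 IS NULL THEN 0 ELSE 1 END)
          + (CASE WHEN FK3 IS NULL THEN 0 ELSE 1 END)
    )
);

-- Exactly one value: accepted.
INSERT INTO #derived (id, FK1, FK2, FK3) VALUES (1, 10, NULL, NULL);

-- No value at all: rejected by the CHECK constraint.
BEGIN TRY
    INSERT INTO #derived (id, FK1, FK2, FK3) VALUES (2, NULL, NULL, NULL);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();
END CATCH;

-- Two values: rejected as well.
BEGIN TRY
    INSERT INTO #derived (id, FK1, FK2, FK3) VALUES (3, 10, 20, NULL);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();
END CATCH;

SELECT id, FK1, FK2, FK3 FROM #derived;   -- only the row with id = 1 remains

DROP TABLE #derived;
```

Each CASE term contributes 1 for a populated column, so the sum counts the populated columns and the constraint pins that count to one. Because every term evaluates to 0 or 1, the comparison is never NULL, which matters since a CHECK constraint accepts rows for which its condition evaluates to unknown.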
You can use check constraint:
create table #t (
a int,
b int);
alter table #t add constraint c1
check ( coalesce(a, b) is not null and a*b is null );
insert into #t values ( 1,null);
insert into #t values ( null ,null);
Running the second insert fails with:
The INSERT statement conflicted with the CHECK constraint "c1".
An alternative is to enforce this check in a stored procedure: before a record is inserted into the derived table, the procedure verifies that the condition is satisfied, and otherwise the insert fails or returns an error.