Creating new columns in SQL Server based on other columns - sql

I have a table T :
CREATE TABLE T
(
id INT,
type VARCHAR(200),
type_value VARCHAR(10),
value VARCHAR(200)
);
INSERT INTO T VALUES (1, 'HomePhone', 'p1', '1234 ');
INSERT INTO T VALUES (1, 'HomePhone', 'p2', '5678 ');
INSERT INTO T VALUES (1, 'HomePhone', 'p3', '4567');
INSERT INTO T VALUES (1, 'WorkPhone', 'w1', '9007 ');
INSERT INTO T VALUES (2, 'Email', 'e1', 'abc@xyz.com ');
INSERT INTO T VALUES (2, 'Email', 'e1', 'efg@xyz.com');
INSERT INTO T VALUES (2, 'Email', 'e2', 'mno@xyz.com');
INSERT INTO T VALUES (3, 'WorkPhone', 'w1', '0100');
INSERT INTO T VALUES (3, 'WorkPhone', 'w2', '0110');
INSERT INTO T VALUES (4, 'OtherPhone', 'o1', '1010 ');
INSERT INTO T VALUES (4, 'OtherPhone', 'o1', '1110 ');
INSERT INTO T VALUES (4, 'OtherPhone', 'o1', '1011');
INSERT INTO T VALUES (4, 'HomePhone', 'p1', '2567 ');
I need to transform it into :
id  primaryhomephone  secondaryhomephone  primaryemail  secondaryemail  primaryworkphone  secondaryworkphone  primaryotherphone  secondaryotherphone
----------------------------------------------------------------------------------------------------------------------------------------------------
1   1234              5678                null          null            9007              null                null               null
2   null              null                abc@xyz.com   efg@xyz.com     null              null                null               null
3   null              null                null          null            0100              0110                null               null
4   2567              null                null          null            null              null                1010               1011
Basically, the values are split into columns by type. If an id has two type_value entries for the same type, the first becomes the primary column and the second becomes the secondary column.
If there are more than two entries for the same id and type, discard the third one.
If several entries for the same id and type share one type_value (for example o1, o1 for id 4), the first o1 is primaryotherphone and the second is secondaryotherphone.
Sorry if this has been asked before, but I can't solve it. Can anyone please help? Thanks a lot.

You can use the MAX/GROUP BY technique. PIVOT is quite complex for beginners, although I agree that PIVOT is designed for exactly what you need:
SELECT id
, MAX(CASE WHEN type_value = 'p1' THEN value END ) AS primaryhomephone
, MAX(CASE WHEN type_value = 'p2' THEN value END ) AS secondaryhomephone
, MAX(CASE WHEN type_value = 'p3' THEN value END ) AS thirdphone
, MAX(CASE WHEN type_value = 'w1' THEN value END ) AS workphone
, MAX(CASE WHEN type_value = 'w2' THEN value END ) AS secondaryworkphone
, MAX(CASE WHEN type_value = 'o2' THEN value END ) AS otherworkphone
, MAX(CASE WHEN type_value = 'e1' THEN value END ) AS primaryemail
, MAX(CASE WHEN type_value = 'e2' THEN value END ) AS secondaryemail
FROM T
GROUP BY id
;
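The query above keys on fixed type_value codes, so rows that repeat a code (such as the three o1 rows for id 4) collapse into a single column. A minimal sketch of a variation that ranks the rows inside each (id, type) pair and keeps only the first two is given here. It is wrapped in a Python script around SQLite (3.25 or later, for window functions) purely so that it runs without a server; the CTE, ROW_NUMBER() and CASE expressions are written the same way in SQL Server. The ORDER BY type_value, value ranking is an assumption, because T has no column that records which row came first, and the sample values are stored here without trailing spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (id INT, type TEXT, type_value TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?, ?)",
    [
        (1, "HomePhone", "p1", "1234"),
        (1, "HomePhone", "p2", "5678"),
        (1, "HomePhone", "p3", "4567"),
        (1, "WorkPhone", "w1", "9007"),
        (2, "Email", "e1", "abc@xyz.com"),
        (2, "Email", "e1", "efg@xyz.com"),
        (2, "Email", "e2", "mno@xyz.com"),
        (3, "WorkPhone", "w1", "0100"),
        (3, "WorkPhone", "w2", "0110"),
        (4, "OtherPhone", "o1", "1010"),
        (4, "OtherPhone", "o1", "1110"),
        (4, "OtherPhone", "o1", "1011"),
        (4, "HomePhone", "p1", "2567"),
    ],
)

# Rank rows inside each (id, type). The ordering is an assumption: T has no
# column that says which row is "first", so type_value and value are used.
QUERY = """
WITH ranked AS (
    SELECT id, type, value,
           ROW_NUMBER() OVER (
               PARTITION BY id, type
               ORDER BY type_value, value
           ) AS rn
    FROM T
)
SELECT id,
       MAX(CASE WHEN type = 'HomePhone'  AND rn = 1 THEN value END) AS primaryhomephone,
       MAX(CASE WHEN type = 'HomePhone'  AND rn = 2 THEN value END) AS secondaryhomephone,
       MAX(CASE WHEN type = 'Email'      AND rn = 1 THEN value END) AS primaryemail,
       MAX(CASE WHEN type = 'Email'      AND rn = 2 THEN value END) AS secondaryemail,
       MAX(CASE WHEN type = 'WorkPhone'  AND rn = 1 THEN value END) AS primaryworkphone,
       MAX(CASE WHEN type = 'WorkPhone'  AND rn = 2 THEN value END) AS secondaryworkphone,
       MAX(CASE WHEN type = 'OtherPhone' AND rn = 1 THEN value END) AS primaryotherphone,
       MAX(CASE WHEN type = 'OtherPhone' AND rn = 2 THEN value END) AS secondaryotherphone
FROM ranked
WHERE rn <= 2          -- discard the third and later rows per (id, type)
GROUP BY id
ORDER BY id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this ranking the sketch reproduces the table requested in the question; a different ORDER BY would pick different primary and secondary rows wherever type_value is repeated.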

You need to look into computed columns in SQL Server.
CREATE TABLE [dbo].T
(
...
[primaryhomephone] AS (CASE WHEN type_value = 'p1' THEN value ELSE NULL END)
)
Alternatively, you can use a view:
CREATE VIEW dbo.vwT
AS
SELECT
id
, CASE WHEN type_value = 'p1' THEN value ELSE NULL END AS [primaryhomephone]
, CASE WHEN type_value = 'p2' THEN value ELSE NULL END AS [secondaryhomephone]
FROM dbo.T

You need to look into PIVOT. You can use it in two ways: create a new table with the new columns and use a PIVOT query to insert into it, or simply run the PIVOT query at runtime. If there is a lot of data and all the other programming interfaces are ready to be upgraded, create the new table; otherwise just use the query.
Here is the link for your reference: https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx

Related

SQL Group By specific column with nullable

Let's say I have this data in my table A:
group_id  type  active
----------------------
1         A     true
1         B     false
1         C     true
2         null  false
3         B     true
3         C     false
I want a query that returns the type A row for each group if one exists (without the type column), and otherwise returns a row with active = false.
For this specific table the result will be:
group_id  active
----------------
1         true
2         false
3         false
How can I do this?
I'm assuming I have to use a GROUP BY but I can't find a way to do it.
Thank you
This is a classic row_number problem: generate a row number based on your ordering criteria, then select just the first row in each group.
declare @MyTable table (group_id int, [type] char(1), active bit);
insert into @MyTable (group_id, [type], active)
values
    (1, 'A', 1),
    (1, 'B', 0),
    (1, 'C', 1),
    (2, null, 0),
    (3, 'B', 1),
    (3, 'C', 0);
with cte as (
    select *
        , row_number() over (
            partition by group_id
            order by case when [type] = 'A' then 1 else 0 end desc, active asc
        ) rn
    from @MyTable
)
select group_id, active
from cte
where rn = 1
order by group_id;
Returns:
group_id  active
----------------
1         1
2         0
3         0
Note: Providing the DDL+DML as I have shown makes it much easier for people to assist.
This should do it. We select all the distinct group_ids and then join our table back to that. There is an ISNULL function that will insert the 'false' when 'A' type records are not found.
DECLARE @tableA TABLE (
    group_id int
    , [type] nchar(1)
    , active nvarchar(10)
);
INSERT INTO @tableA (group_id, [type], active)
VALUES
    (1, 'A', 'true')
    , (1, 'B', 'false')
    , (1, 'C', 'false')
    , (2, null, 'false')
    , (3, 'B', 'true')
    , (3, 'C', 'false')
;
SELECT
    gid.group_id
    , ISNULL(a.active, 'false') as active
FROM (SELECT DISTINCT group_id FROM @tableA) as gid
LEFT OUTER JOIN @tableA as a
    ON a.group_id = gid.group_id
    AND a.type = 'A'
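The same requirement can also be sketched as a single GROUP BY with conditional aggregation, which is what the question originally asked about. The script below is an illustration only: it runs the SQL through Python's sqlite3 module so it needs no server, it stores active as 0/1, and group 4 is a hypothetical extra row added to show a group that has no A row but does have an active row. It assumes a group counts as active when any of its A rows is active.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (group_id INT, type TEXT, active INT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [
        (1, "A", 1), (1, "B", 0), (1, "C", 1),
        (2, None, 0),
        (3, "B", 1), (3, "C", 0),
        (4, "B", 1),  # hypothetical extra group: no A row, only an active row
    ],
)

# For every group, report 1 only when an active type-A row exists;
# rows of other types (or with a NULL type) contribute 0.
QUERY = """
SELECT group_id,
       MAX(CASE WHEN type = 'A' AND active = 1 THEN 1 ELSE 0 END) AS active
FROM A
GROUP BY group_id
ORDER BY group_id
"""

result = conn.execute(QUERY).fetchall()
print(result)
```

Because the CASE expression looks only at type A rows, group 4 reports 0 even though its single row is active.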

Basic SQL question to Select all results with changes in value

Have a table that's basically the following:
ID FileDate Reg# Value
1 01012022 ABC 100.00
2 01012022 CDE 51.20
3 02052022 ABC 101.25
4 02082022 CDE 51.20
(Note - the dates noted above will be properly formatted. I'm just using the example above.)
I want to write a query that will return the rows where the VALUE field has changed for any REG# over a given period of time (example, show me all results where the value of a reg has changed between Jan 1 2022 and March 1 2022).
Ideally, the results would show:
01012022 ABC 100.00
02052022 ABC 101.25
You can use the LAG() function:
-- Declare and populate a table variable with the values you mentioned above (plus two extra entries for testing)
DECLARE @tbl TABLE (
    ID INT,
    FileDate DATE,
    Reg# VARCHAR(3),
    Value FLOAT);
INSERT INTO @tbl (ID, FileDate, Reg#, Value)
VALUES (1, '2022-01-01', 'ABC', 100), (2, '2022-01-01', 'CDE', 51.20), (3, '2022-02-05', 'ABC', 101.25), (4, '2022-02-08', 'CDE', 51.20), (5, '2022-03-01', 'ABC', 101.25), (6, '2022-03-02', 'CDE', 52);
-- Query
WITH lagtbl AS (
    SELECT ID, Reg#, FileDate, Value, LAG(t.Value, 1) OVER (PARTITION BY Reg# ORDER BY Reg#, ID) AS Lag
    FROM @tbl t)
SELECT lagtbl.ID, lagtbl.Reg#, lagtbl.FileDate, lagtbl.Value
FROM lagtbl
WHERE Value != Lag
ORDER BY Reg#, ID
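The question's ideal output lists both the row before the change and the row after it, limited to a date range. One possible sketch of that, pairing LAG() with LEAD(), follows. It is run through Python's sqlite3 module (SQLite 3.25 or later) only so that it is self-contained; the window functions are written the same way in SQL Server 2012 and later. The table name readings and the column name Reg (instead of Reg#) are assumptions made for the example, and only the four rows from the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ID INT, FileDate TEXT, Reg TEXT, Value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (1, "2022-01-01", "ABC", 100.00),
        (2, "2022-01-01", "CDE", 51.20),
        (3, "2022-02-05", "ABC", 101.25),
        (4, "2022-02-08", "CDE", 51.20),
    ],
)

# Restrict to the period first, then compare each row with its neighbours
# for the same Reg. A row is kept when it differs from the previous or the
# next value; comparisons against a missing neighbour (NULL) never match.
QUERY = """
WITH windowed AS (
    SELECT ID, FileDate, Reg, Value,
           LAG(Value)  OVER (PARTITION BY Reg ORDER BY FileDate, ID) AS prev_value,
           LEAD(Value) OVER (PARTITION BY Reg ORDER BY FileDate, ID) AS next_value
    FROM readings
    WHERE FileDate BETWEEN ? AND ?
)
SELECT FileDate, Reg, Value
FROM windowed
WHERE Value <> prev_value OR Value <> next_value
ORDER BY Reg, FileDate, ID
"""

changed = conn.execute(QUERY, ("2022-01-01", "2022-03-01")).fetchall()
for row in changed:
    print(row)
```

Only the two ABC rows come back, since the CDE value stays at 51.20 throughout the period.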

How to include count of NULL values in a temp table without changing the NULL data to 0?

I have a table with these values
create table LoanExample (
LoanId int,
ConstraintId int,
BorrowerName varchar(128));
insert into LoanExample values (1, null, 'Jack')
insert into LoanExample values (1, 33, 'July')
insert into LoanExample values (2, 78, 'Mike')
insert into LoanExample values (2, 72, 'Wayne')
insert into LoanExample values (3, null, 'David')
insert into LoanExample values (3, 79, 'Chris')
insert into LoanExample values (4, null, 'Finn')
insert into LoanExample values (4, null, 'James')
I want to count the constraints of each LoanId, even where the value is null, and add the result to a temp table.
I tried this
select
LoanId,
Constraints_Count = count(ConstraintId)
into #Test
from LoanExample
group by LoanId
But this query ignores all the null values in the count function and gives me a warning message "Warning: Null value is eliminated by an aggregate or other SET operation."
I expected Constraints_Count to be '2' for every LoanId, but any LoanId that has a null ConstraintId gets a lower count.
So, for LoanId 1 and 3 I get Constraints_Count as '1' but I expect '2' and for LoanId 4, I get Constraints_Count as '0' but I expect '2'.
I suppose I can use ROW_NUMBER() but I'm not quite sure how.
simply use count(*) or sum(1):
select
LoanId,
Constraints_Count = count(*)
into #Test
from LoanExample
group by LoanId
Or
select
LoanId,
Constraints_Count = sum(1)
into #Test
from LoanExample
group by LoanId
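To make the difference concrete, here is a small sketch that puts COUNT(*) and COUNT(ConstraintId) side by side on the question's data. It uses Python's sqlite3 module only so that it can be run as-is; both aggregates treat NULL the same way in SQL Server. The column aliases all_rows and non_null_constraints are names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LoanExample (LoanId INT, ConstraintId INT, BorrowerName TEXT)"
)
conn.executemany(
    "INSERT INTO LoanExample VALUES (?, ?, ?)",
    [
        (1, None, "Jack"), (1, 33, "July"),
        (2, 78, "Mike"), (2, 72, "Wayne"),
        (3, None, "David"), (3, 79, "Chris"),
        (4, None, "Finn"), (4, None, "James"),
    ],
)

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
QUERY = """
SELECT LoanId,
       COUNT(*)            AS all_rows,
       COUNT(ConstraintId) AS non_null_constraints
FROM LoanExample
GROUP BY LoanId
ORDER BY LoanId
"""

counts = conn.execute(QUERY).fetchall()
for row in counts:
    print(row)
```

The all_rows column is 2 for every loan, while non_null_constraints gives the 1, 2, 1, 0 pattern described in the question.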
Depending on your DBMS, you could convert null to 0.
SQL Server:
select
LoanId,
Constraints_Count = count(isnull(ConstraintId,0))
into #Test
from LoanExample
group by LoanId
Oracle:
select
LoanId,
count(nvl(ConstraintId,0)) as Constraints_Count
from LoanExample
group by LoanId
Other:
select
LoanId,
Constraints_Count = count(case when ConstraintId is null then 0 else ConstraintId end)
into #Test
from LoanExample
group by LoanId
Instead use coalesce() which will convert null values to a valid integer value:
Constraints_Count = count(coalesce(ConstraintId,1))
In your case the value 1 I used is irrelevant. It could be any other integer value.
count() ignores null values, so you could use a CASE expression to turn the nulls into countable values. Note that this counts only the rows where ConstraintId is null:
select
LoanId,
count(case when ConstraintId is null then 1 else null end) Constraints_Count
from LoanExample
group by LoanId
Try it like this, using CASE WHEN:
select
LoanId,
Constraints_Count = sum(case when ConstraintId is null then 1 else 1 end)
into #Test
from LoanExample
group by LoanId

Counting and grouping challenge in a pivot table with T-SQL

I have a pivot table that converts a vertical database design to a horizontal one:
The source table:
Id ParentId Property Value
---------------------------------
1 1 Date 01-09-2015
2 1 CountValue 2
3 1 TypeA Value1
4 1 TypeB Value2
5 1 TypeC Value2
6 2 Date 15-10-2015
7 2 CountValue 3
8 2 TypeA Value3
9 2 TypeB Value22
10 2 TypeC Value99
After pivoting this looks like:
ParentId Date CountValue TypeA TypeB TypeC
----------------------------------------------------------
1 01-09-2015 2 Value1 Value2 Value2
2 15-10-2015 3 Value3 Value22 Value99
Then, there's a look-up table for valid values in columns TypeA, TypeB and TypeC:
Id Name Value
-----------------
1 TypeA Value1
2 TypeA Value2
3 TypeA Value3
4 TypeB Value20
5 TypeB Value21
6 TypeB Value22
7 TypeC Value1
8 TypeC Value2
So, given the above structure I'm looking for a way to query the pivot table in a way that I'll get a count of all invalid values in TypeA, TypeB and TypeC where Date is a valid date and CountValue is not empty and greater than 0.
How can I achieve a result that is expected and outputted like below:
Count Column
--------------
0 TypeA
1 TypeB
1 TypeC
I've accomplished this by writing three separate queries and gluing the results together with UNION, but I think it should also be possible using the pivot table; I'm just not sure how. Can the desired result be produced from the pivot table?
Note: the database used is a SQL Server 2005 database.
I would not approach this with a PIVOT; otherwise you have to pivot your data and then unpivot it to get the required output. Breaking it down step by step, you can get your valid parent IDs using this:
SELECT t.ParentID
FROM #T AS t
GROUP BY t.ParentID
HAVING ISDATE(MAX(CASE WHEN t.Property = 'Date' THEN t.Value END)) = 1
AND MAX(CASE WHEN t.Property = 'CountValue' THEN CONVERT(INT, t.Value) END) > 0;
The two HAVING conditions limit this to your criteria of a valid date and a CountValue greater than 0.
The next step would be to find your invalid properties:
SELECT t.*
FROM #T AS t
WHERE NOT EXISTS
( SELECT 1
FROM #V AS v
WHERE v.Name = t.Property
AND v.Value = t.Value
);
This will include Date and CountValue, and it won't include TypeA because all of its values are valid, so a bit more work is required. We must find the distinct properties we are interested in:
SELECT DISTINCT Name
FROM #V
Now we can combine this with the invalid properties to get the count, and with the valid parent IDs to get the desired result:
WITH ValidParents AS
( SELECT t.ParentID
FROM #T AS t
GROUP BY t.ParentID
HAVING ISDATE(MAX(CASE WHEN t.Property = 'Date' THEN t.Value END)) = 1
AND MAX(CASE WHEN t.Property = 'CountValue' THEN CONVERT(INT, t.Value) END) > 0
), InvalidProperties AS
( SELECT t.Property
FROM #T AS t
WHERE t.ParentID IN (SELECT vp.ParentID FROM ValidParents AS vp)
AND NOT EXISTS
( SELECT 1
FROM #V AS v
WHERE v.Name = t.Property
AND v.Value = t.Value
)
)
SELECT [Count] = COUNT(t.Property),
[Column] = v.Name
FROM (SELECT DISTINCT Name FROM #V) AS V
LEFT JOIN InvalidProperties AS t
ON t.Property = v.Name
GROUP BY v.Name;
Which gives:
Count Column
--------------
0 TypeA
1 TypeB
1 TypeC
SCHEMA FOR ABOVE QUERIES
For SQL Server 2008+. Apologies, I don't have SQL Server 2005 anymore, and forgot it doesn't support table value constructors.
CREATE TABLE #T (Id INT, ParentId INT, Property VARCHAR(10), Value VARCHAR(10));
INSERT #T (Id, ParentId, Property, Value)
VALUES
(1, 1, 'Date', '01-09-2015'), (2, 1, 'CountValue', '2'), (3, 1, 'TypeA', 'Value1'),
(4, 1, 'TypeB', 'Value2'), (5, 1, 'TypeC', 'Value2'), (6, 2, 'Date', '15-10-2015'),
(7, 2, 'CountValue', '3'), (8, 2, 'TypeA', 'Value3'), (9, 2, 'TypeB', 'Value22'),
(10, 2, 'TypeC', 'Value99');
CREATE TABLE #V (ID INT, Name VARCHAR(5), Value VARCHAR(7));
INSERT #V (Id, Name, Value)
VALUES
(1, 'TypeA', 'Value1'), (2, 'TypeA', 'Value2'), (3, 'TypeA', 'Value3'),
(4, 'TypeB', 'Value20'), (5, 'TypeB', 'Value21'), (6, 'TypeB', 'Value22'),
(7, 'TypeC', 'Value1'), (8, 'TypeC', 'Value2');
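For readers without a SQL Server instance, the three steps above (valid parents, invalid properties, counts per lookup name) can be sketched as a script that runs against SQLite through Python's sqlite3 module. This is an approximation, not the T-SQL itself: SQLite has no ISDATE, so a hypothetical helper is_dmy_date is registered with create_function and accepts only DD-MM-YYYY strings, and the table names props and lookup are assumptions standing in for #T and #V.

```python
import sqlite3
from datetime import datetime


def is_dmy_date(text):
    """Return 1 if text parses as DD-MM-YYYY, else 0 (stand-in for ISDATE)."""
    try:
        datetime.strptime(text, "%d-%m-%Y")
        return 1
    except (TypeError, ValueError):
        return 0


conn = sqlite3.connect(":memory:")
conn.create_function("is_dmy_date", 1, is_dmy_date)
conn.executescript("""
CREATE TABLE props (Id INT, ParentId INT, Property TEXT, Value TEXT);
INSERT INTO props VALUES
    (1, 1, 'Date', '01-09-2015'), (2, 1, 'CountValue', '2'),
    (3, 1, 'TypeA', 'Value1'), (4, 1, 'TypeB', 'Value2'),
    (5, 1, 'TypeC', 'Value2'), (6, 2, 'Date', '15-10-2015'),
    (7, 2, 'CountValue', '3'), (8, 2, 'TypeA', 'Value3'),
    (9, 2, 'TypeB', 'Value22'), (10, 2, 'TypeC', 'Value99');
CREATE TABLE lookup (Id INT, Name TEXT, Value TEXT);
INSERT INTO lookup VALUES
    (1, 'TypeA', 'Value1'), (2, 'TypeA', 'Value2'), (3, 'TypeA', 'Value3'),
    (4, 'TypeB', 'Value20'), (5, 'TypeB', 'Value21'), (6, 'TypeB', 'Value22'),
    (7, 'TypeC', 'Value1'), (8, 'TypeC', 'Value2');
""")

QUERY = """
WITH valid_parents AS (      -- parents with a valid date and CountValue > 0
    SELECT ParentId
    FROM props
    GROUP BY ParentId
    HAVING MAX(CASE WHEN Property = 'Date' THEN is_dmy_date(Value) END) = 1
       AND MAX(CASE WHEN Property = 'CountValue'
                    THEN CAST(Value AS INTEGER) END) > 0
),
invalid AS (                 -- property rows with no matching lookup entry
    SELECT p.Property
    FROM props AS p
    WHERE p.ParentId IN (SELECT ParentId FROM valid_parents)
      AND NOT EXISTS (SELECT 1 FROM lookup AS l
                      WHERE l.Name = p.Property AND l.Value = p.Value)
)
SELECT n.Name, COUNT(i.Property)
FROM (SELECT DISTINCT Name FROM lookup) AS n
LEFT JOIN invalid AS i ON i.Property = n.Name
GROUP BY n.Name
ORDER BY n.Name
"""

counts = dict(conn.execute(QUERY).fetchall())
print(counts)
```

Starting from the distinct lookup names and left joining the invalid rows is what keeps TypeA in the output with a count of 0.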
Final result without PIVOT:
SELECT [count] = SUM(CASE WHEN l.id IS NULL THEN 1 ELSE 0 END)
,t.Property
FROM #lookup l
RIGHT JOIN #tab t
ON t.Property = l.Name
AND t.Value = l.Value
WHERE t.Property LIKE 'Type%'
GROUP BY t.Property;
Data:
CREATE TABLE #tab(
Id INTEGER NOT NULL PRIMARY KEY
,ParentId INTEGER NOT NULL
,Property VARCHAR(10) NOT NULL
,Value VARCHAR(10) NOT NULL
);
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (1,1,'Date','01-09-2015');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (2,1,'CountValue','2');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (3,1,'TypeA','Value1');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (4,1,'TypeB','Value2');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (5,1,'TypeC','Value2');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (6,2,'Date','15-10-2015');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (7,2,'CountValue','3');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (8,2,'TypeA','Value3');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (9,2,'TypeB','Value22');
INSERT INTO #tab(Id,ParentId,Property,Value) VALUES (10,2,'TypeC','Value99');
CREATE TABLE #lookup(
Id INTEGER NOT NULL PRIMARY KEY
,Name VARCHAR(5) NOT NULL
,Value VARCHAR(7) NOT NULL
);
INSERT INTO #lookup(Id,Name,Value) VALUES (1,'TypeA','Value1');
INSERT INTO #lookup(Id,Name,Value) VALUES (2,'TypeA','Value2');
INSERT INTO #lookup(Id,Name,Value) VALUES (3,'TypeA','Value3');
INSERT INTO #lookup(Id,Name,Value) VALUES (4,'TypeB','Value20');
INSERT INTO #lookup(Id,Name,Value) VALUES (5,'TypeB','Value21');
INSERT INTO #lookup(Id,Name,Value) VALUES (6,'TypeB','Value22');
INSERT INTO #lookup(Id,Name,Value) VALUES (7,'TypeC','Value1');
INSERT INTO #lookup(Id,Name,Value) VALUES (8,'TypeC','Value2');
EDIT:
Adding more criteria:
SELECT [count] = SUM(CASE WHEN l.id IS NULL THEN 1 ELSE 0 END)
,t.Property
FROM #lookup l
RIGHT JOIN #tab t
ON t.Property = l.Name
AND t.Value = l.Value
WHERE t.Property LIKE 'Type%'
AND t.ParentId IN (SELECT ParentId FROM #tab WHERE Property = 'Date' AND ISDATE(VALUE) = 1)
AND t.ParentID IN (SELECT ParentId FROM #tab WHERE Property = 'CountValue' AND Value > 0)
GROUP BY t.Property;

Query to reflect actual significant change in data

Given a table with employee statuses and effective dates, how can I retrieve just the data that reflects a change in status?
For example, given the following structure:
CREATE TABLE #STATUSES (
EMPLOYEE_ID INT NOT NULL,
EFFECTIVE_DATE DATE NOT NULL,
STATUS_CODE CHAR(1) NOT NULL
)
INSERT #STATUSES VALUES (1, '2012-01-01', 'A')
INSERT #STATUSES VALUES (1, '2012-02-28', 'A')
INSERT #STATUSES VALUES (1, '2012-03-01', 'T')
INSERT #STATUSES VALUES (2, '2012-01-01', 'A')
INSERT #STATUSES VALUES (2, '2012-02-14', 'A')
INSERT #STATUSES VALUES (2, '2012-03-10', 'A')
INSERT #STATUSES VALUES (3, '2012-02-01', 'A')
INSERT #STATUSES VALUES (3, '2012-03-17', 'A')
INSERT #STATUSES VALUES (3, '2012-03-18', 'T')
INSERT #STATUSES VALUES (3, '2012-04-01', 'A')
INSERT #STATUSES VALUES (4, '2012-03-01', 'A')
What query can be used to result in the following?
EMPLOYEE_ID EFFECTIVE_DATE STATUS_CODE
1 2012-01-01 A
1 2012-03-01 T
2 2012-01-01 A
3 2012-02-01 A
3 2012-03-18 T
3 2012-04-01 A
4 2012-03-01 A
In other words, I want to leave out those records that have the same employee id and status code as the one before it, if one exists with an earlier effective date. Notice that employee 1 is listed only twice because there were only two actual changes in status--the one on 2012-02-28 is inconsequential since the status didn't change from the earlier date. Also notice that employee 2 is listed just once since his status never changed despite there being three records. Only the earliest date is shown for each change.
With some further experimenting, it looks like this will do what I want.
;WITH cte
AS (SELECT ROW_NUMBER() OVER (PARTITION BY EMPLOYEE_ID ORDER BY EFFECTIVE_DATE) AS rownum
,EMPLOYEE_ID
,EFFECTIVE_DATE
,STATUS_CODE
FROM #STATUSES)
SELECT t2.EMPLOYEE_ID
,t2.EFFECTIVE_DATE
,t2.STATUS_CODE
FROM cte t2
LEFT JOIN cte t1
ON t2.EMPLOYEE_ID = t1.EMPLOYEE_ID
AND t2.STATUS_CODE = t1.STATUS_CODE
AND t2.rownum = t1.rownum + 1
WHERE t1.EMPLOYEE_ID IS NULL
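On SQL Server 2012 or later, the same "compare with the previous row" idea can be written with LAG() instead of a self-join on row numbers. The following is a sketch under that assumption, run through Python's sqlite3 module (SQLite 3.25 or later) so it is self-contained; the table name statuses is a stand-in for the table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statuses ("
    "EMPLOYEE_ID INT NOT NULL, EFFECTIVE_DATE TEXT NOT NULL, "
    "STATUS_CODE TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO statuses VALUES (?, ?, ?)",
    [
        (1, "2012-01-01", "A"), (1, "2012-02-28", "A"), (1, "2012-03-01", "T"),
        (2, "2012-01-01", "A"), (2, "2012-02-14", "A"), (2, "2012-03-10", "A"),
        (3, "2012-02-01", "A"), (3, "2012-03-17", "A"), (3, "2012-03-18", "T"),
        (3, "2012-04-01", "A"),
        (4, "2012-03-01", "A"),
    ],
)

# Keep a row when it is the employee's first record (no previous status)
# or when its status differs from the immediately preceding record.
QUERY = """
WITH with_prev AS (
    SELECT EMPLOYEE_ID, EFFECTIVE_DATE, STATUS_CODE,
           LAG(STATUS_CODE) OVER (
               PARTITION BY EMPLOYEE_ID
               ORDER BY EFFECTIVE_DATE
           ) AS prev_status
    FROM statuses
)
SELECT EMPLOYEE_ID, EFFECTIVE_DATE, STATUS_CODE
FROM with_prev
WHERE prev_status IS NULL OR prev_status <> STATUS_CODE
ORDER BY EMPLOYEE_ID, EFFECTIVE_DATE
"""

changes = conn.execute(QUERY).fetchall()
for row in changes:
    print(row)
```

On the sample data this returns the seven rows listed in the question.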
You could use a CURSOR.
You'd need two sets of variables: @PreviousRecord and @CurrentRecord.
Declare the cursor for the table sorted by employee ID and date.
Fetch the first record from the cursor into the @PreviousRecord variables. Depending on your requirement, register this as a significant change or not (write the record to a temp table).
Then set up a loop that:
Fetches the next record into the @CurrentRecord variables.
Compares it with the previous record and, if it matches your requirement for a significant change, writes it to the temp table.
Moves the @CurrentRecord values into the @PreviousRecord variables.
I'd be interested to know whether the CTE method is more efficient.
SELECT
EMPLOYEE_ID, MIN(EFFECTIVE_DATE) AS EFFECTIVE_DATE, STATUS_CODE
FROM
(
SELECT
T1.EMPLOYEE_ID, T1.EFFECTIVE_DATE, T1.STATUS_CODE,
MAX(T2.EFFECTIVE_DATE) AS MOST_RECENT_PREVIOUS_STATUS_DATE
FROM
#STATUSES T1
LEFT JOIN
#STATUSES T2
ON
T1.EMPLOYEE_ID = T2.EMPLOYEE_ID
AND
T1.EFFECTIVE_DATE > T2.EFFECTIVE_DATE
AND
T1.STATUS_CODE <> T2.STATUS_CODE
GROUP BY
T1.EMPLOYEE_ID, T1.EFFECTIVE_DATE, T1.STATUS_CODE
) SubQuery
GROUP BY
EMPLOYEE_ID, STATUS_CODE, MOST_RECENT_PREVIOUS_STATUS_DATE