Convert number of years and months into number of months - sql

My table holds a column that holds a value that signifies a period in years and months, in other words:
ID PERIOD
1 1Y
2 1Y1M
3 11M
4 5Y2M
When doing a select statement, how do I convert/calculate this into a number of months? It's easy to deal with values that are 'Y' only or 'M' only, but I'm not sure how to handle, for example, ID 2 and ID 4 from the example above.
The result from the select statement if I'd select all above would be:
12
13
11
62

You can use
SUBSTRING(period, 1, CHARINDEX('Y', Period) - 1)
to get only the numbers before the Y, and then multiply that by 12.
The cases where the Y is not present can be handled with CASE.
So something like:
SELECT CASE WHEN Period LIKE '%Y%' THEN
           CAST(SUBSTRING(Period, 1, CHARINDEX('Y', Period) - 1) AS INT) * 12
           +
           CAST(REPLACE(SUBSTRING(Period, CHARINDEX('Y', Period) + 1, LEN(Period)), 'M', '') AS INT)
       ELSE
           CAST(REPLACE(Period, 'M', '') AS INT)
       END

You can do the following
SELECT *, TRY_CAST(REPLACE(Y, 'Y', '') AS INT) * 12 +
TRY_CAST(REPLACE(REPLACE(Period, Y, ''), 'M', '') AS INT)
FROM
(
VALUES
(1, '1Y'),
(2, '1Y1M'),
(3, '11M'),
(4, '5Y2M'),
(5, 'Whatever')
) T(Id, Period)
CROSS APPLY
(
VALUES
(LEFT(Period, CHARINDEX('Y', Period)))
) CI(Y)
Returns:
+----+----------+----+------------------+
| Id | Period   | Y  | (No column name) |
+----+----------+----+------------------+
| 1  | 1Y       | 1Y | 12               |
| 2  | 1Y1M     | 1Y | 13               |
| 3  | 11M      |    | 11               |
| 4  | 5Y2M     | 5Y | 62               |
| 5  | Whatever |    |                  |
+----+----------+----+------------------+
Here is another way
SELECT Id,
Period,
(REPLACE(Years, 'Y', '') * 12) + REPLACE(Months, 'M', '') TotalMonths
FROM
(
VALUES
(1, '1Y'),
(2, '1Y1M'),
(3, '5M'),
(4, '10Y11M'),
(5, 'Whatever write Y and M')
) T(Id, Period)
CROSS APPLY
(
VALUES
(
LEFT(Period, CHARINDEX('Y', Period)), REPLACE(REPLACE(Period, 'Y', ''), 'M', '')
)
) TT(Years, Value)
CROSS APPLY
(
VALUES
(
REPLACE(Period, Years, '')
)
) TTT(Months)
WHERE TRY_CAST(TT.Value AS INT) IS NOT NULL;
db-fiddle

Another alternative
select id, (parsename (clean,2) * 12) + (parsename(clean,1)) as months
from t
cross apply (select replace(replace(case when charindex('M', period)=0 then period + '0M'
when charindex('Y', period)=0 then '0Y' + period
else period end,'M',''),'Y','.') as clean) t2
Outputs
+----+--------+
| id | months |
+----+--------+
| 1  | 12     |
| 2  | 13     |
| 3  | 11     |
| 4  | 62     |
+----+--------+

There aren't so many possibilities. Build a reference table:
select identity(int) as period_id, v.*
into periods
from (values ('1M', 1),
('2M', 2),
. . .
('1Y', 12),
('1Y1M', 13),
. . .
) v(period, months);
This can easily be constructed using a spreadsheet. Or even a recursive CTE.
I am suggesting this for a serious reason: you should not be doing calculations on string representations like this. These values should be treated as foreign key references to a table. And, in fact, they should be using the primary key (the identity column) rather than the string.
You will probably find malformed strings as you go about fixing this. That is a good thing, from a data quality perspective.
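As for the recursive CTE mentioned above, a minimal sketch of generating the reference rows might look like this (the 50-year cap is an assumption, not part of the original answer):
-- generate 0..50 years and 0..11 months, then build the 'NyMm' strings and their month totals
with y as (select 0 as yr union all select yr + 1 from y where yr < 50),
     m as (select 0 as mo union all select mo + 1 from m where mo < 11)
select case when y.yr > 0 then cast(y.yr as varchar(10)) + 'Y' else '' end +
       case when m.mo > 0 then cast(m.mo as varchar(10)) + 'M' else '' end as period,
       y.yr * 12 + m.mo as months
into periods
from y cross join m
where y.yr > 0 or m.mo > 0
option (maxrecursion 0);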

I think something like this would work
Data
drop table if exists #tTEST;
go
select * INTO #tTEST from (values
(1, '1Y'),
(2, '1Y1M'),
(3, '11M'),
(4, '5Y2M')) V(ID, [Period]);
Query
select
t.*,
isnull(substring(t.[Period], 1, nullif(y_ndx, 0)-1)*12, 0) +
isnull(substring(t.[Period], y_ndx+1, nullif(m_ndx, 0)-isnull(y_ndx, 0)-1),0) answer
from #tTEST t
cross apply (select len(t.[Period]) p_len,
CHARINDEX('Y', t.[Period]) y_ndx,
CHARINDEX('M', t.[Period]) m_ndx) ndx;
Results
ID  Period  answer
1   1Y      12
2   1Y1M    13
3   11M     11
4   5Y2M    62

Try the scalar-valued function below to get the period value:
CREATE FUNCTION [dbo].[fn_sum](
    @delimited NVARCHAR(MAX),
    @delimiter NVARCHAR(100)
) RETURNS INT
AS
BEGIN
    DECLARE @value INT
    SET @delimited = REPLACE(REPLACE(@delimited, 'M', 'M,'), 'Y', 'Y,')
    DECLARE @xml XML
    SET @xml = N'<t>' + REPLACE(@delimited, @delimiter, '</t><t>') + '</t>'
    SELECT @value = SUM(CAST(REPLACE(REPLACE(r.value('.', 'NVARCHAR(MAX)'), 'M', ''), 'Y', '') AS INT)
                        * CASE WHEN RIGHT(r.value('.', 'NVARCHAR(MAX)'), 1) = 'M' THEN 1 ELSE 12 END)
    FROM @xml.nodes('/t') AS records(r)
    RETURN @value
END
Check the sample output below:
DECLARE @tblPeriod AS TABLE(id VARCHAR(100), period VARCHAR(100))
INSERT INTO @tblPeriod(id, period) VALUES(1,'1Y'),(2,'1Y1M'),(3,'11M'),(4,'5Y2M')
SELECT id, period, dbo.[fn_sum](period, ',') AS period_value FROM @tblPeriod a

SQL Server query for multiple conditions on the same column

Here's the schema and data that I am working with:
CREATE TABLE tbl (
name varchar(20) not null,
groups int NOT NULL
);
insert into tbl values('a', 35);
insert into tbl values('a', 36);
insert into tbl values('b', 35);
insert into tbl values('c', 36);
insert into tbl values('d', 37);
| name | groups|
|------|-------|
| a | 35 |
| a | 36 |
| b | 35 |
| c | 36 |
| d | 37 |
Now I need the names of only those rows having groups greater than or equal to 35,
but with an additional condition: I can only include a row with groups=35 when a corresponding groups=36 row is also present for that name.
| name | groups|
|------|-------|
| a | 35 |
| a | 36 |
The second condition is that it CAN include names that have groups greater than or equal to 36 without having a groups=35 row:
| name | groups|
|------|-------|
| c | 36 |
| d | 37 |
The only case it should leave out is where a name has only groups=35 present without a corresponding groups=36:
| name | groups|
|------|-------|
| b | 35 |
I have tried the following:
select name from tbl
where groups>=35
group by name
having count(distinct(groups))>=2
or groups>=36;
This is the error I am facing: Column 'tbl.groups' is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.
Try this:
DECLARE @tbl table ( [name] varchar(20) not null, groups int NOT NULL );
INSERT INTO #tbl VALUES
('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);
DECLARE @group int = 35;
; WITH cte AS (
SELECT
[name]
, COUNT ( DISTINCT groups ) AS distinct_group_count
FROM @tbl
WHERE
groups >= @group
GROUP BY
[name]
)
SELECT t.* FROM @tbl AS t
INNER JOIN cte
ON t.[name] = cte.[name]
WHERE
cte.distinct_group_count > 1
OR t.groups > @group;
RETURNS
+------+--------+
| name | groups |
+------+--------+
| a    | 35     |
| a    | 36     |
| c    | 36     |
| d    | 37     |
+------+--------+
Basically, this restricts the name results to groups with a value >= 35 with more than one distinct group associated, or any name with a group value greater than 35. Several assumptions were made in regard to your data, but I believe the logic still applies.
So, as far as I can tell, you just want to filter out the names where groups=35 stands by itself. I thought: let's try to isolate the names that only have groups=35 and then NOT EXISTS from there. Is this the correct output you're after?
Also, using complicated ORs in the WHERE clause will often lead to your query not being SARGable. It is better to UNION, or to somehow build the query so that each part can use indexes (if they can); a rough sketch of the UNION idea follows the NOT EXISTS query below.
if object_id('tempdb..#tbl') is not null drop table #tbl;
CREATE TABLE #tbl (
name varchar(20) not null,
groups int NOT NULL
);
insert into #tbl values('a', 35), ('a', 36), ('b', 35), ('c', 36), ('d', 37);
select *
from #tbl tbl
WHERE NOT EXISTS
(
SELECT COUNT(groups), name
FROM #tbl t
WHERE EXISTS
(
SELECT name
FROM #tbl tb
WHERE groups = 35
and tb.name=t.name
)
AND t.name = tbl.name
GROUP BY name
HAVING COUNT(groups)=1
)
;
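For comparison, the UNION idea mentioned above could look roughly like this (a sketch against the question's tbl table, not the answer's actual query):
-- part 1: rows qualifying through groups >= 36
select t.name, t.groups
from tbl t
where t.groups >= 36
union
-- part 2: the groups = 35 rows whose name also has a groups = 36 row
select t.name, t.groups
from tbl t
where t.groups = 35
  and exists (select 1 from tbl t2 where t2.name = t.name and t2.groups = 36);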
It looks like you need an exists() condition. Try:
select *
from tbl t
where t.groups >= 35
and (
t.groups > 35
or exists(select * from tbl t2 where t2.name = t.name and t2.groups = 36)
)
There are other ways to arrange the where clause to achieve the same effect. Having the t.groups >= 35 condition up front should give the query optimizer the ability to leverage an index on groups.
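The kind of index being assumed there might look like this (the index name and the included column are illustrative, not from the answer):
create index IX_tbl_groups on tbl (groups) include (name);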
You can use a windowed count for this
This avoids joining the table multiple times
SELECT
name,
groups
FROM (
SELECT *,
Count36 = COUNT(CASE WHEN groups = 36 THEN 1 END) OVER (PARTITION BY name)
FROM tbl
WHERE groups >= 35
) tbl
WHERE groups >= 36 OR Count36 > 0;
db<>fiddle

How can I duplicate records with T-SQL and keep track of the progressive number?

How can I duplicate the records of table1 and store them in table2 along with the progressive number calculated from startnum and endnum?
Thanks
The first row must be duplicated into 4 records, i.e. num: 80, 81, 82, 83.
Startnum | Endnum | Data
---------+-------------+----------
80 | 83 | A
10 | 11 | C
14 | 16 | D
Result:
StartEndNum | Data
------------+-----------
80 | A
81 | A
82 | A
83 | A
10 | C
11 | C
14 | D
15 | D
16 | D
A simple method uses a recursive CTE:
with cte as (
select startnum, endnum, data
from t
union all
select startnum + 1, endnum, data
from cte
where startnum < endnum
)
select startnum, data
from cte;
If you have ranges that exceed 100, you need option (maxrecursion 0).
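For instance, with the CTE above, the hint goes on the outer statement:
select startnum, data
from cte
option (maxrecursion 0);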
Note: There are other solutions as well, using numbers tables (either built-in or generated). I like this solution as a gentle introduction to recursive CTEs.
Without recursion:
declare @t table(Startnum int, Endnum int, Data varchar(20))
insert into @t values
(80, 83, 'A'),
(10, 11, 'C'),
(14, 16, 'D');
select a.StartEndNum, t.Data
from @t t cross apply (select top (t.Endnum - t.Startnum + 1)
t.Startnum + row_number() over(order by getdate()) - 1 as StartEndNum
from sys.all_columns) a;
You can use any other table with enough rows instead of sys.all_columns
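For example, a home-grown tally CTE can stand in for sys.all_columns (a sketch; the 1000-row cap is arbitrary):
with n1 as (select 1 as n from (values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) v(x)),
     nums as (select row_number() over (order by (select null)) as n
              from n1 a cross join n1 b cross join n1 c)
select t.Startnum + n.n - 1 as StartEndNum, t.Data
from @t t
join nums n on n.n <= t.Endnum - t.Startnum + 1;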

Count Based on Columns in SQL Server

I have 3 tables:
SELECT id, letter
FROM As
+--------+--------+
| id | letter |
+--------+--------+
| 1 | A |
| 2 | B |
+--------+--------+
SELECT id, letter
FROM Xs
+--------+------------+
| id | letter |
+--------+------------+
| 1 | X |
| 2 | Y |
| 3 | Z |
+--------+------------+
SELECT id, As_id, Xs_id
FROM A_X
+--------+-------+-------+
| id | As_id | Xs_id |
+--------+-------+-------+
| 9 | 1 | 1 |
| 10 | 1 | 2 |
| 11 | 2 | 3 |
| 12 | 1 | 2 |
| 13 | 2 | 3 |
| 14 | 1 | 1 |
+--------+-------+-------+
I can count all As and Bs with group by. But I want to count As and Bs based on X,Y and Z. What I want to get is below:
+---+---+---+---+
|   | X | Y | Z |
+---+---+---+---+
| A | 2 | 2 | 0 |
| B | 0 | 0 | 2 |
+---+---+---+---+
What is the best way to do this in MSSQL? Would it be efficient to use a foreach, for example?
edit: It is not a duplicate because I just wanted to know the efficient way, not just any way.
For what you're trying to do, without knowing what is inefficient about your current code (because none was provided), a PIVOT is best. There are a million resources online and here in the Stack Overflow Q&A forums to find what you need. This is probably the simplest explanation of a PIVOT, and the one I frequently use to remind myself of its complicated syntax.
To specifically answer your question, this is the code that shows how the link above applies to your case.
First, the tables needed to be created:
DECLARE @AS AS TABLE (ID INT, LETTER VARCHAR(1))
DECLARE @XS AS TABLE (ID INT, LETTER VARCHAR(1))
DECLARE @XA AS TABLE (ID INT, AsID INT, XsID INT)
Values were added to the tables
INSERT INTO @AS (ID, Letter)
SELECT 1,'A'
UNION
SELECT 2,'B'
INSERT INTO @XS (ID, Letter)
SELECT 1,'X'
UNION
SELECT 2,'Y'
UNION
SELECT 3,'Z'
INSERT INTO @XA (ID, ASID, XSID)
SELECT 9,1,1
UNION
SELECT 10,1,2
UNION
SELECT 11,2,3
UNION
SELECT 12,1,2
UNION
SELECT 13,2,3
UNION
SELECT 14,1,1
Then the query which does the pivot is constructed:
SELECT LetterA, [X],[Y],[Z]
FROM (SELECT A.LETTER AS LetterA
,B.LETTER AS LetterX
,C.ID
FROM @XA C
JOIN @AS A
ON A.ID = C.ASID
JOIN @XS B
ON B.ID = C.XSID
) Src
PIVOT (COUNT(ID)
FOR LetterX IN ([X],[Y],[Z])
) AS PVT
When executed, your results are as follows:
Letter X Y Z
A 2 2 0
B 0 0 2
As I said in a comment ... just join and do a simple pivot:
if object_id('tempdb..#AAs') is not null drop table #AAs
create table #AAs(id int, letter nvarchar(5))
if object_id('tempdb..#XXs') is not null drop table #XXs
create table #XXs(id int, letter nvarchar(5))
if object_id('tempdb..#A_X') is not null drop table #A_X
create table #A_X(id int, AAs int, XXs int)
insert into #AAs (id, letter) values (1, 'A'), (2, 'B')
insert into #XXs (id, letter) values (1, 'X'), (2, 'Y'), (3, 'Z')
insert into #A_X (id, AAs, XXs)
values (9, 1, 1),
(10, 1, 2),
(11, 2, 3),
(12, 1, 2),
(13, 2, 3),
(14, 1, 1)
select LetterA,
ISNULL([X], 0) [X],
ISNULL([Y], 0) [Y],
ISNULL([Z], 0) [Z]
from (
select distinct a.letter [LetterA], x.letter [LetterX],
count(*) over (partition by a.letter, x.letter order by a.letter) [Counted]
from #A_X ax
join #AAs A on ax.AAs = A.ID
join #XXs X on ax.XXs = X.ID
)src
PIVOT
(
MAX ([Counted]) for LetterX in ([X], [Y], [Z])
) piv
You get the result you asked for:
LetterA X Y Z
A 2 2 0
B 0 0 2

SQL Pivoting or Transposing or ... column to row?

I have a question and this looks way better in SQLfiddle:
http://www.sqlfiddle.com/#!3/dffa1/2
I have a table with multiple rows per user, each with a datestamp and test result, and I would like to transpose or pivot it into a one-line result per user where each user has all of the time and value results listed:
USERID  PSA1_time  PSA1_result  PSA2_time  PSA2_result  PSA3_time  PSA3_result  ...
1       1999-....  2            1998...    4            1999...    6
3       1992...    4            1994       6
4       2006 ...   8
Table below:
CREATE TABLE yourtable
([userid] int, [Ranking] int,[test] varchar(3), [Date] datetime, [result] int)
;
INSERT INTO yourtable
([userid], [Ranking],[test], [Date], [result])
VALUES
('1', '1', 'PSA', '1997-05-20', 2),
('1', '2', 'PSA', '1998-05-07', 4),
('1', '3', 'PSA', '1999-06-08', 6),
('1', '4', 'PSA', '2001-06-08', 8),
('1', '5', 'PSA', '2004-06-08', 0),
('3', '1', 'PSA', '1992-05-07', 4),
('3', '2', 'PSA', '1994-06-08', 6),
('4', '1', 'PSA', '2006-06-08', 8)
;
Since you want to PIVOT two columns my suggestion would be to unpivot the date and result columns first, then apply the PIVOT function.
The unpivot process will convert the two columns date and result into multiple rows:
select userid,
col = test +'_'+cast(ranking as varchar(10))+'_'+ col,
value
from yourtable t1
cross apply
(
select 'time', convert(varchar(10), date, 120) union all
select 'result', cast(result as varchar(10))
) c (col, value)
See Demo. This will give you a result:
| USERID | COL          | VALUE      |
--------------------------------------
| 1      | PSA_1_time   | 1997-05-20 |
| 1      | PSA_1_result | 2          |
| 1      | PSA_2_time   | 1998-05-07 |
| 1      | PSA_2_result | 4          |
| 1      | PSA_3_time   | 1999-06-08 |
Now that you have the data in this format, then you can apply pivot to get the max/min value for each item in col:
If you have a limited number of columns, then you can hard-code the query:
select *
from
(
select userid,
col = test +'_'+cast(ranking as varchar(10))+'_'+ col,
value
from yourtable t1
cross apply
(
select 'time', convert(varchar(10), date, 120) union all
select 'result', cast(result as varchar(10))
) c (col, value)
) d
pivot
(
max(value)
for col in (PSA_1_time, PSA_1_result,
PSA_2_time, PSA_2_result,
PSA_3_time, PSA_3_result,
PSA_4_time, PSA_4_result,
PSA_5_time, PSA_5_result)
) piv;
See SQL Fiddle with Demo
If you have unknown columns, then you will need to use dynamic SQL:
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(test +'_'+cast(ranking as varchar(10))+'_'+ col)
from yourtable
cross apply
(
select 'time', 1 union all
select 'result', 2
) c (col, so)
group by test, ranking, col, so
order by Ranking, so
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT userid,' + @cols + '
from
(
select userid,
col = test +''_''+cast(ranking as varchar(10))+''_''+ col,
value
from yourtable t1
cross apply
(
select ''time'', convert(varchar(10), date, 120) union all
select ''result'', cast(result as varchar(10))
) c (col, value)
) x
pivot
(
max(value)
for col in (' + @cols + ')
) p '
execute sp_executesql @query;
See SQL Fiddle with Demo. Both versions will give a result:
| USERID | PSA_1_TIME | PSA_1_RESULT | PSA_2_TIME | PSA_2_RESULT | PSA_3_TIME | PSA_3_RESULT | PSA_4_TIME | PSA_4_RESULT | PSA_5_TIME | PSA_5_RESULT |
------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | 1997-05-20 | 2 | 1998-05-07 | 4 | 1999-06-08 | 6 | 2001-06-08 | 8 | 2004-06-08 | 0 |
| 3 | 1992-05-07 | 4 | 1994-06-08 | 6 | (null) | (null) | (null) | (null) | (null) | (null) |
| 4 | 2006-06-08 | 8 | (null) | (null) | (null) | (null) | (null) | (null) | (null) | (null) |

CONCAT(column) OVER(PARTITION BY...)? Group-concatenating rows without grouping the result itself

I need a way to concatenate all rows per group in a kind of window function, the way COUNT(*) OVER(PARTITION BY...) repeats the aggregate count of all rows per group across each row of that group. I need something similar, but a string concatenation of all values per group repeated across each group.
Here is some example data and my desired result to better illustrate my problem:
grp | val
------------
1 | a
1 | b
1 | c
1 | d
2 | x
2 | y
2 | z
And here is what I need (the desired result):
grp | val | groupcnct
---------------------------------
1 | a | abcd
1 | b | abcd
1 | c | abcd
1 | d | abcd
2 | x | xyz
2 | y | xyz
2 | z | xyz
Here is the really tricky part of this problem:
My particular situation prevents me from being able to reference the same table twice (I'm actually doing this within a recursive CTE, so I can't do a self-join of the CTE or it will throw an error).
I'm fully aware that one can do something like:
SELECT a.*, b.groupcnct
FROM tbl a
CROSS APPLY (
SELECT STUFF((
SELECT '' + aa.val
FROM tbl aa
WHERE aa.grp = a.grp
FOR XML PATH('')
), 1, 0, '') AS groupcnct
) b
But as you can see, that is referencing tbl two times in the query.
I can only reference tbl once, hence why I'm wondering if windowing the group-concatenation is possible (I'm a bit new to TSQL since I come from a MySQL background, so not sure if something like that can be done).
Create Table:
CREATE TABLE tbl
(grp int, val varchar(1));
INSERT INTO tbl
(grp, val)
VALUES
(1, 'a'),
(1, 'b'),
(1, 'c'),
(1, 'd'),
(2, 'x'),
(2, 'y'),
(2, 'z');
In SQL Server 2017 and later you can use the STRING_AGG function:
SELECT STRING_AGG(T.val, ',') AS val
, T.grp
FROM @tbl AS T
GROUP BY T.grp
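If the order of the concatenated values matters, STRING_AGG also accepts a WITHIN GROUP clause (a small sketch against the tbl table from the question):
SELECT T.grp
     , STRING_AGG(T.val, ',') WITHIN GROUP (ORDER BY T.val) AS val
FROM tbl AS T
GROUP BY T.grp;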
I tried using a pure CTE approach (Which is the best way to form the string value using column from a Table with rows having same ID?), thinking it would be faster.
But the benchmark tells otherwise; it's better to use subquery (or CROSS APPLY) results from FOR XML PATH, as they are faster (same link as above).
DECLARE @tbl TABLE
(
grp INT
,val VARCHAR(1)
);
BEGIN
INSERT INTO @tbl(grp, val)
VALUES
(1, 'a'),
(1, 'b'),
(1, 'c'),
(1, 'd'),
(2, 'x'),
(2, 'y'),
(2, 'z');
END;
----------- Your Required Query
SELECT ST2.grp,
SUBSTRING(
(
SELECT ','+ST1.val AS [text()]
FROM #tbl ST1
WHERE ST1.grp = ST2.grp
ORDER BY ST1.grp
For XML PATH ('')
), 2, 1000
) groupcnct
FROM @tbl ST2
Is it possible for you to just put your STUFF in the select instead, or do you run into the same issue? (I replaced 'tbl' with 'TEMP.TEMP123'.)
Select
A.*
, [GROUPCNT] = STUFF((
SELECT '' + aa.val
FROM TEMP.TEMP123 AA
WHERE aa.grp = a.grp
FOR XML PATH('')
), 1, 0, '')
from TEMP.TEMP123 A
This worked for me -- wanted to see if this worked for you too.
I know this post is old but, just in case someone is still wondering, you can create a scalar function that concatenates the row values.
IF OBJECT_ID('dbo.fnConcatRowsPerGroup','FN') IS NOT NULL
DROP FUNCTION dbo.fnConcatRowsPerGroup
GO
CREATE FUNCTION dbo.fnConcatRowsPerGroup
(@grp as int) RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @val AS VARCHAR(MAX)
    SELECT @val = COALESCE(@val, '') + val
    FROM tbl
    WHERE grp = @grp
    RETURN @val;
END
GO
select *, dbo.fnConcatRowsPerGroup(grp)
from tbl
Here is the result set I got from querying a sample table:
grp | val | (No column name)
---------------------------------
1 | a | abcd
1 | b | abcd
1 | c | abcd
1 | d | abcd
2 | x | xyz
2 | y | xyz
2 | z | xyz