My table has a column that holds a period in years and months, for example:
ID PERIOD
1 1Y
2 1Y1M
3 11M
4 5Y2M
In a SELECT statement, how do I convert this into a number of months? Values that are 'Y' only or 'M' only are easy to deal with, but I'm not sure how to handle, for example, ID 2 and ID 4 above.
Selecting all of the rows above should return:
12
13
11
62
You can use
SUBSTRING(period, 1, CHARINDEX('Y', Period) - 1)
To get only the number before the Y, and then multiply it by 12.
Cases where the Y is not present can be handled with CASE.
So something like:
SELECT CASE WHEN Period LIKE '%Y%' THEN
CAST(SUBSTRING(Period, 1, CHARINDEX('Y', Period) - 1) AS INT) * 12
+
CAST(REPLACE(SUBSTRING(Period, CHARINDEX('Y', Period) + 1, LEN(Period)), 'M', '') AS INT)
ELSE
CAST(REPLACE(Period, 'M', '') AS INT)
END AS TotalMonths
FROM YourTable; -- YourTable is a placeholder for your table name
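For readers who want to check the arithmetic outside the database, here is a minimal Python sketch of the same years-times-twelve-plus-months rule. The helper name period_to_months and the regular expression are illustrative assumptions, not part of the answer above.

```python
import re

# Hypothetical helper: optional "<digits>Y" followed by optional "<digits>M".
PERIOD_RE = re.compile(r"(?:(\d+)Y)?(?:(\d+)M)?")

def period_to_months(period):
    """Return total months for strings like '1Y', '11M' or '5Y2M'; None if malformed."""
    match = PERIOD_RE.fullmatch(period)
    if match is None or not period:
        return None
    years, months = match.groups()
    # A missing part counts as zero, like the ELSE branch of the CASE.
    return int(years or 0) * 12 + int(months or 0)

for value in ["1Y", "1Y1M", "11M", "5Y2M", "Whatever"]:
    print(value, period_to_months(value))
```

Malformed input returns None here rather than raising, which is only one possible design choice.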
You can do the following
SELECT *, TRY_CAST(REPLACE(Y, 'Y', '') AS INT) * 12 +
TRY_CAST(REPLACE(REPLACE(Period, Y, ''), 'M', '') AS INT)
FROM
(
VALUES
(1, '1Y'),
(2, '1Y1M'),
(3, '11M'),
(4, '5Y2M'),
(5, 'Whatever')
) T(Id, Period)
CROSS APPLY
(
VALUES
(LEFT(Period, CHARINDEX('Y', Period)))
) CI(Y)
Returns:
+----+----------+----+------------------+
| Id | Period | Y | (No column name) |
+----+----------+----+------------------+
| 1 | 1Y | 1Y | 12 |
| 2 | 1Y1M | 1Y | 13 |
| 3 | 11M | | 11 |
| 4 | 5Y2M | 5Y | 62 |
| 5 | Whatever | | |
+----+----------+----+------------------+
Here is another way:
SELECT Id,
Period,
(REPLACE(Years, 'Y', '') * 12) + REPLACE(Months, 'M', '') TotalMonths
FROM
(
VALUES
(1, '1Y'),
(2, '1Y1M'),
(3, '5M'),
(4, '10Y11M'),
(5, 'Whatever write Y and M')
) T(Id, Period)
CROSS APPLY
(
VALUES
(
LEFT(Period, CHARINDEX('Y', Period)), REPLACE(REPLACE(Period, 'Y', ''), 'M', '')
)
) TT(Years, Value)
CROSS APPLY
(
VALUES
(
REPLACE(Period, Years, '')
)
) TTT(Months)
WHERE TRY_CAST(TT.Value AS INT) IS NOT NULL;
db-fiddle
Another alternative
select id, (parsename (clean,2) * 12) + (parsename(clean,1)) as months
from t
cross apply (select replace(replace(case when charindex('M', period)=0 then period + '0M'
when charindex('Y', period)=0 then '0Y' + period
else period end,'M',''),'Y','.') as clean) t2
Outputs
+----+--------+
| id | months |
+----+--------+
| 1 | 12 |
| 2 | 13 |
| 3 | 11 |
| 4 | 62 |
+----+--------+
There aren't so many possibilities. Build a reference table:
select identity(int) as period_id, v.*
into periods
from (values ('1M', 1),
('2M', 2),
. . .
('1Y', 12),
('1Y1M', 13),
. . .
) v(period, months);
This can easily be constructed using a spreadsheet, or even a recursive CTE.
I am suggesting this for a serious reason: you should not be doing calculations on string representations like this. These values should be treated as foreign key references to a table. And, in fact, they should be using the primary key (the identity column) rather than the string.
You will probably find malformed strings as you go about fixing this. That is a good thing, from a data quality perspective.
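As a rough illustration of how small that reference table is, the rows could also be generated with a few lines of Python. This is only a sketch: the function name build_period_rows and the max_years bound are assumptions, and it emits just one canonical spelling per month count (so '1Y', never '12M').

```python
def build_period_rows(max_years):
    """Return (period, months) pairs such as ('1M', 1), ('1Y', 12), ('1Y1M', 13)."""
    rows = []
    for total in range(1, max_years * 12 + 1):
        years, months = divmod(total, 12)
        # Omit the part that is zero, matching strings like '1Y' and '11M'.
        label = (f"{years}Y" if years else "") + (f"{months}M" if months else "")
        rows.append((label, total))
    return rows

rows = build_period_rows(max_years=2)
# Print the pairs in a shape that can be pasted into a VALUES list.
print(",\n".join(f"('{period}', {months})" for period, months in rows))
```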
I think something like this would work
Data
drop table if exists #tTEST;
go
select * INTO #tTEST from (values
(1, '1Y'),
(2, '1Y1M'),
(3, '11M'),
(4, '5Y2M')) V(ID, [Period]);
Query
select
t.*,
isnull(substring(t.[Period], 1, nullif(y_ndx, 0)-1)*12, 0) +
isnull(substring(t.[Period], y_ndx+1, nullif(m_ndx, 0)-isnull(y_ndx, 0)-1),0) answer
from #tTEST t
cross apply (select len(t.[Period]) p_len,
CHARINDEX('Y', t.[Period]) y_ndx,
CHARINDEX('M', t.[Period]) m_ndx) ndx;
Results
ID Period answer
1 1Y 12
2 1Y1M 13
3 11M 11
4 5Y2M 62
Try the scalar-valued function below to get the period value:
CREATE FUNCTION [dbo].[fn_sum](
@delimited NVARCHAR(MAX),
@delimiter NVARCHAR(100)
) RETURNS INT
AS
BEGIN
DECLARE @value INT
set @delimited=replace(replace(@delimited,'M','M,'),'Y','Y,')
DECLARE @xml XML
SET @xml = N'<t>' + REPLACE(@delimited,@delimiter,'</t><t>') + '</t>'
SELECT @value=sum(cast(replace(replace(r.value('.','Nvarchar(MAX)'),'M',''),'Y','') AS INT)
* CASE WHEN right(r.value('.','Nvarchar(MAX)'),1)='M' THEN 1 ELSE 12 END)
FROM @xml.nodes('/t') AS records(r)
RETURN @value
END
Check it with the sample below:
DECLARE @tblPeriod AS TABLE(id VARCHAR(100),period VARCHAR(100))
INSERT INTO @tblPeriod(id,period) VALUES(1,'1Y'),(2,'1Y1M'),(3,'11M'),(4,'5Y2M')
SELECT id,period,dbo.[fn_sum](period,',') AS period_value FROM @tblPeriod a
I have data that looks something like this example (on an unfortunately much larger scale):
+----+-------+--------------------+-----------------------------------------------+
| ID | Data | Cost | Comments |
+----+-------+--------------------+-----------------------------------------------+
| 1 | 1|2|3 | $0.00|$3.17|$42.42 | test test||previous thing has a blank comment |
+----+-------+--------------------+-----------------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-------+--------------------+-----------------------------------------------+
| 3 | 1|2 | $3.50|$4.20 | |test |
+----+-------+--------------------+-----------------------------------------------+
Some of the columns in my table are pipe delimited, but they are consistent within each row: each delimited value corresponds to the value at the same index in the other columns of the same row.
So I can do something like this which is what I want for a single column:
SELECT ID, s.value AS datavalue
FROM MyTable t CROSS APPLY STRING_SPLIT(t.Data, '|') s
and that would give me this:
+----+-----------+
| ID | datavalue |
+----+-----------+
| 1 | 1 |
+----+-----------+
| 1 | 2 |
+----+-----------+
| 1 | 3 |
+----+-----------+
| 2 | 1 |
+----+-----------+
| 3 | 1 |
+----+-----------+
| 3 | 2 |
+----+-----------+
but I also want to get the other columns as well (cost and comments in this example) so that the corresponding items are all in the same row like this:
+----+-----------+-----------+------------------------------------+
| ID | datavalue | costvalue | commentvalue |
+----+-----------+-----------+------------------------------------+
| 1 | 1 | $0.00 | test test |
+----+-----------+-----------+------------------------------------+
| 1 | 2 | $3.17 | |
+----+-----------+-----------+------------------------------------+
| 1 | 3 | $42.42 | previous thing has a blank comment |
+----+-----------+-----------+------------------------------------+
| 2 | 1 | $420.69 | test |
+----+-----------+-----------+------------------------------------+
| 3 | 1 | $3.50 | |
+----+-----------+-----------+------------------------------------+
| 3 | 2 | $4.20 | test |
+----+-----------+-----------+------------------------------------+
I'm not sure what the best or simplest way to achieve this would be.
This isn't achievable with STRING_SPLIT as originally shipped, because it does not return the ordinal position as part of the result set (an optional enable_ordinal argument only arrived in SQL Server 2022). As a result, you'll need to use a different function which does. Personally, I recommend Jeff Moden's DelimitedSplit8K.
Then, you can do this:
CREATE TABLE #Sample (ID int,
[Data] varchar(200),
Cost varchar(200),
Comments varchar(8000));
GO
INSERT INTO #Sample
VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment'),
(2,'1','$420.69','test'),
(3,'1|2','$3.50|$4.20','|test');
GO
SELECT S.ID,
DSd.Item AS DataValue,
DSc.Item AS CostValue,
DSct.Item AS CommentValue
FROM #Sample S
CROSS APPLY dbo.DelimitedSplit8K(S.[Data],'|') DSd
CROSS APPLY (SELECT *
FROM DelimitedSplit8K(S.Cost,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSc
CROSS APPLY (SELECT *
FROM DelimitedSplit8K(S.Comments,'|') SS
WHERE SS.ItemNumber = DSd.ItemNumber) DSct;
GO
DROP TABLE #Sample;
GO
There is, however, only one true answer to this question: Don't store delimited values in SQL Server. Store them in a normalised manner, and you won't have this problem.
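To make the role of the ordinal position concrete, here is a small Python sketch of the same idea: split every column, then pair the pieces up by position. The function name explode_row is hypothetical, and the sketch assumes every column in a row has the same number of delimited items, as the question states.

```python
def explode_row(row_id, data, cost, comments, delimiter="|"):
    """Split three aligned delimited strings and pair them up by position."""
    parts = zip(data.split(delimiter), cost.split(delimiter), comments.split(delimiter))
    # enumerate() supplies the ordinal that plain STRING_SPLIT does not return.
    return [(row_id, ordinal, d, c, m) for ordinal, (d, c, m) in enumerate(parts, start=1)]

sample = [
    (1, "1|2|3", "$0.00|$3.17|$42.42", "test test||previous thing has a blank comment"),
    (2, "1", "$420.69", "test"),
    (3, "1|2", "$3.50|$4.20", "|test"),
]
for row in sample:
    for exploded in explode_row(*row):
        print(exploded)
```

Joining on the ordinal is what the ItemNumber comparisons in the query above do.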
Here is an approach using a recursive CTE instead of a user-defined function (UDF), which is useful for those without permission to create functions.
CREATE TABLE mytable(
ID INTEGER NOT NULL PRIMARY KEY
,Data VARCHAR(7) NOT NULL
,Cost VARCHAR(20) NOT NULL
,Comments VARCHAR(47) NOT NULL
);
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (1,'1|2|3','$0.00|$3.17|$42.42','test test||previous thing has a blank comment');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (2,'1','$420.69','test');
INSERT INTO mytable(ID,Data,Cost,Comments) VALUES (3,'1|2','$3.50|$4.20','|test');
This query lets you choose the delimiter through a variable. Using a common table expression, it then parses each delimited string to produce a row for each portion of those strings, retaining the ordinal position of each.
declare @delimiter as varchar(1)
set @delimiter = '|'
;with cte as (
select id
, convert(varchar(max), null) as datavalue
, convert(varchar(max), null) as costvalue
, convert(varchar(max), null) as commentvalue
, convert(varchar(max), data + @delimiter) as data
, convert(varchar(max), cost + @delimiter) as cost
, convert(varchar(max), comments + @delimiter) as comments
from mytable as t
union all
select id
, convert(varchar(max), left(data, charindex(@delimiter, data) - 1))
, convert(varchar(max), left(cost, charindex(@delimiter, cost) - 1))
, convert(varchar(max), left(comments, charindex(@delimiter, comments) - 1))
, convert(varchar(max), stuff(data, 1, charindex(@delimiter, data), ''))
, convert(varchar(max), stuff(cost, 1, charindex(@delimiter, cost), ''))
, convert(varchar(max), stuff(comments, 1, charindex(@delimiter, comments), ''))
from cte
where (data like ('%' + @delimiter + '%') and cost like ('%' + @delimiter + '%')) or comments like ('%' + @delimiter + '%')
)
select id, datavalue, costvalue, commentvalue
from cte
where datavalue IS NOT NULL
order by id, datavalue
As the recursion adds new rows, it places the first portion of each delimited string into the wanted output columns using left(), then uses stuff() to remove that portion and its delimiter from the source strings so that the next row starts at the next portion. Note that, to initiate the extractions, the delimiter is appended to the end of each source string; this ensures the where clause does not exclude any of the wanted strings.
the result:
id datavalue costvalue commentvalue
---- ----------- ----------- ------------------------------------
1 1 $0.00 test test
1 2 $3.17
1 3 $42.42 previous thing has a blank comment
2 1 $420.69 test
3 1 $3.50
3 2 $4.20 test
demonstrated here at dbfiddle.uk
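The peel-one-portion-per-step idea behind the recursion can be sketched in Python as a simple loop. This is an illustration of the technique only, not a translation of the query; the function name peel is an assumption.

```python
def peel(source, delimiter="|"):
    """Repeatedly take the text before the first delimiter until nothing is left."""
    remaining = source + delimiter  # trailing delimiter so the last portion is found too
    portions = []
    while delimiter in remaining:
        cut = remaining.index(delimiter)
        portions.append(remaining[:cut])               # like left(s, charindex(d, s) - 1)
        remaining = remaining[cut + len(delimiter):]   # like stuff(s, 1, charindex(d, s), '')
    return portions

print(peel("1|2|3"))
print(peel("|test"))
```

Each pass of the loop corresponds to one level of recursion in the CTE, and the list index of a portion is its ordinal position.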
If I have a very simple table called tree
create table if not exists tree (id int primary key, parent int, name text);
And a few rows of data
insert into tree values (1, null, 'A');
insert into tree values (2, 1, 'B');
insert into tree values (3, 1, 'C');
insert into tree values (4, 2, 'D');
insert into tree values (5, 2, 'E');
insert into tree values (6, 3, 'F');
insert into tree values (7, 3, 'G');
I can easily run CTEs on it, and produce an output giving me path like this
with recursive R(id, level, path, name) as (
select id,1,name,name from tree where parent is null
union select tree.id, level + 1, path || '.' || tree.name, tree.name from tree join R on R.id=tree.parent
) select level,path,name from R;
Which gives the output
level | path | name
-------+-------+------
1 | A | A
2 | A.B | B
2 | A.C | C
3 | A.B.D | D
3 | A.B.E | E
3 | A.C.F | F
3 | A.C.G | G
What I'm wondering is whether it is possible to project this output into another table, dynamically creating columns based on level (level1, level2, level3, etc.), giving me something like this in return:
id | level1 | level2 | level3
---+--------+--------+-------
1 | A | |
2 | A | B |
3 | A | C |
4 | A | B | D
5 | A | B | E
6 | A | C | F
7 | A | C | G
Any help would be appreciated.
If you know the maximum depth of your tree, I'd keep your approach and simplify it using array concatenation to produce the desired output.
So for a five-level tree, that would look like this:
WITH RECURSIVE R(id, path) AS (
SELECT id, ARRAY[name::text] FROM tree WHERE parent IS NULL
UNION SELECT tree.id, path || tree.name FROM tree JOIN R ON R.id=tree.parent
)
SELECT id,
path[1] AS l1,
path[2] AS l2,
path[3] AS l3,
path[4] AS l4,
path[5] AS l5
FROM R;
PS: sorry for not commenting on Ziggy's answer, which is very close, but I don't have enough reputation to do so. I don't see why you would need a windowing function here.
PostgreSQL requires the output column types to be defined up front, so you can't have the levelX columns produced dynamically. However, you can do the following:
with recursive
R(id, path) as (
select id,ARRAY[name::text] from tree where parent is null
union
select tree.id, path || tree.name::text from tree join R on R.id=tree.parent
)
select row_number() over (order by cardinality(path), path), id,
path[1] as level1, path[2] as level2, path[3] as level3
from R
order by 1
In the example above, the column row_number happens to match id, but probably that wouldn't happen with your real data.
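If the column count really has to follow the data, one option is to do the final projection on the client side. The Python sketch below is an assumption-laden illustration, not something either answer proposes: it rebuilds each path from the (id, parent, name) rows and pads it to the deepest level found. The function names are hypothetical, and it assumes the parent links contain no cycles.

```python
rows = [(1, None, "A"), (2, 1, "B"), (3, 1, "C"), (4, 2, "D"),
        (5, 2, "E"), (6, 3, "F"), (7, 3, "G")]

def paths_by_id(rows):
    """Map each node id to its list of names from the root down to the node."""
    nodes = {node_id: (parent, name) for node_id, parent, name in rows}
    paths = {}
    for node_id in nodes:
        path, current = [], node_id
        while current is not None:  # walk up until the root (parent is NULL)
            parent, name = nodes[current]
            path.append(name)
            current = parent
        paths[node_id] = path[::-1]
    return paths

def to_level_columns(rows):
    """Return a header and rows with one levelN column per depth actually present."""
    paths = paths_by_id(rows)
    depth = max(len(path) for path in paths.values())
    header = ["id"] + [f"level{i}" for i in range(1, depth + 1)]
    table = [[node_id] + path + [None] * (depth - len(path))
             for node_id, path in sorted(paths.items())]
    return header, table

header, table = to_level_columns(rows)
print(header)
for line in table:
    print(line)
```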
I have 3 tables:
SELECT id, letter
FROM As
+--------+--------+
| id | letter |
+--------+--------+
| 1 | A |
| 2 | B |
+--------+--------+
SELECT id, letter
FROM Xs
+--------+------------+
| id | letter |
+--------+------------+
| 1 | X |
| 2 | Y |
| 3 | Z |
+--------+------------+
SELECT id, As_id, Xs_id
FROM A_X
+--------+-------+-------+
| id | As_id | Xs_id |
+--------+-------+-------+
| 9 | 1 | 1 |
| 10 | 1 | 2 |
| 11 | 2 | 3 |
| 12 | 1 | 2 |
| 13 | 2 | 3 |
| 14 | 1 | 1 |
+--------+-------+-------+
I can count all As and Bs with group by. But I want to count As and Bs based on X,Y and Z. What I want to get is below:
+-------+
| X,Y,Z |
+-------+
| 2,2,0 |
| 0,0,2 |
+-------+
X,Y,Z
A 2,2,0
B 0,0,2
What is the best way to do this in MSSQL? Would using foreach, for example, be an efficient way?
Edit: this is not a duplicate, because I wanted to know the efficient way, not just any way.
Without knowing what is inefficient in your current code (none was provided), a PIVOT is the best fit for what you're trying to do. There are countless resources online and here on Stack Overflow to find what you need. This is probably the simplest explanation of PIVOT; I frequently use it to remind myself of the complicated syntax.
To answer your question specifically, this code shows how the link above applies to it.
First, the tables need to be created:
DECLARE @AS AS TABLE (ID INT, LETTER VARCHAR(1))
DECLARE @XS AS TABLE (ID INT, LETTER VARCHAR(1))
DECLARE @XA AS TABLE (ID INT, AsID INT, XsID INT)
Values were added to the tables
INSERT INTO @AS (ID, Letter)
SELECT 1,'A'
UNION
SELECT 2,'B'
INSERT INTO @XS (ID, Letter)
SELECT 1,'X'
UNION
SELECT 2,'Y'
UNION
SELECT 3,'Z'
INSERT INTO @XA (ID, ASID, XSID)
SELECT 9,1,1
UNION
SELECT 10,1,2
UNION
SELECT 11,2,3
UNION
SELECT 12,1,2
UNION
SELECT 13,2,3
UNION
SELECT 14,1,1
Then the query which does the pivot is constructed:
SELECT LetterA, [X],[Y],[Z]
FROM (SELECT A.LETTER AS LetterA
,B.LETTER AS LetterX
,C.ID
FROM @XA C
JOIN @AS A
ON A.ID = C.ASID
JOIN @XS B
ON B.ID = C.XSID
) Src
PIVOT (COUNT(ID)
FOR LetterX IN ([X],[Y],[Z])
) AS PVT
When executed, your results are as follows:
LetterA X Y Z
A 2 2 0
B 0 0 2
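What the pivot computes is a count per (row letter, column letter) pair, with zero where a pair never occurs. A minimal Python sketch of that counting step, using the sample data from the question, might look like this; the function name pivot_counts and the dictionary layout are illustrative assumptions.

```python
from collections import Counter

a_rows = {1: "A", 2: "B"}
x_rows = {1: "X", 2: "Y", 3: "Z"}
links = [(9, 1, 1), (10, 1, 2), (11, 2, 3), (12, 1, 2), (13, 2, 3), (14, 1, 1)]

def pivot_counts(a_rows, x_rows, links):
    """Count link rows per (A letter, X letter) pair and lay them out as a grid."""
    counts = Counter((a_rows[a_id], x_rows[x_id]) for _, a_id, x_id in links)
    columns = sorted(x_rows.values())
    # Counter returns 0 for pairs that never occur, like COUNT in the PIVOT.
    return {a: [counts[(a, x)] for x in columns] for a in sorted(a_rows.values())}

print(pivot_counts(a_rows, x_rows, links))
```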
As I said in the comments, just join and do a simple pivot:
if object_id('tempdb..#AAs') is not null drop table #AAs
create table #AAs(id int, letter nvarchar(5))
if object_id('tempdb..#XXs') is not null drop table #XXs
create table #XXs(id int, letter nvarchar(5))
if object_id('tempdb..#A_X') is not null drop table #A_X
create table #A_X(id int, AAs int, XXs int)
insert into #AAs (id, letter) values (1, 'A'), (2, 'B')
insert into #XXs (id, letter) values (1, 'X'), (2, 'Y'), (3, 'Z')
insert into #A_X (id, AAs, XXs)
values (9, 1, 1),
(10, 1, 2),
(11, 2, 3),
(12, 1, 2),
(13, 2, 3),
(14, 1, 1)
select LetterA,
ISNULL([X], 0) [X],
ISNULL([Y], 0) [Y],
ISNULL([Z], 0) [Z]
from (
select distinct a.letter [LetterA], x.letter [LetterX],
count(*) over (partition by a.letter, x.letter order by a.letter) [Counted]
from #A_X ax
join #AAs A on ax.AAs = A.ID
join #XXs X on ax.XXs = X.ID
)src
PIVOT
(
MAX ([Counted]) for LetterX in ([X], [Y], [Z])
) piv
You get the result you asked for:
LetterA X Y Z
A 2 2 0
B 0 0 2
I need a way to concatenate all rows (per group) as a kind of window function, the way COUNT(*) OVER(PARTITION BY ...) repeats the aggregate count of each group across all rows of that group. I need something similar, but with a string concatenation of all values per group repeated across each group.
Here is some example data and my desired result to better illustrate my problem:
grp | val
------------
1 | a
1 | b
1 | c
1 | d
2 | x
2 | y
2 | z
And here is what I need (the desired result):
grp | val | groupcnct
---------------------------------
1 | a | abcd
1 | b | abcd
1 | c | abcd
1 | d | abcd
2 | x | xyz
2 | y | xyz
2 | z | xyz
Here is the really tricky part of this problem:
My particular situation prevents me from being able to reference the same table twice (I'm actually doing this within a recursive CTE, so I can't do a self-join of the CTE or it will throw an error).
I'm fully aware that one can do something like:
SELECT a.*, b.groupcnct
FROM tbl a
CROSS APPLY (
SELECT STUFF((
SELECT '' + aa.val
FROM tbl aa
WHERE aa.grp = a.grp
FOR XML PATH('')
), 1, 0, '') AS groupcnct
) b
But as you can see, that is referencing tbl two times in the query.
I can only reference tbl once, hence why I'm wondering if windowing the group-concatenation is possible (I'm a bit new to TSQL since I come from a MySQL background, so not sure if something like that can be done).
Create Table:
CREATE TABLE tbl
(grp int, val varchar(1));
INSERT INTO tbl
(grp, val)
VALUES
(1, 'a'),
(1, 'b'),
(1, 'c'),
(1, 'd'),
(2, 'x'),
(2, 'y'),
(2, 'z');
In SQL Server 2017 and later you can use the STRING_AGG function:
SELECT STRING_AGG(T.val, ',') AS val
, T.grp
FROM #tbl AS T
GROUP BY T.grp
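Note that a GROUP BY query like this returns one row per group, whereas the question wants the concatenation repeated on every row. The difference in shape can be sketched in Python as follows; this only illustrates the desired result, with a hypothetical function name, and is not a T-SQL solution.

```python
from collections import defaultdict

def with_group_concat(rows, separator=""):
    """Return (grp, val, groupcnct) with the group's concatenation on every row."""
    groups = defaultdict(list)
    for grp, val in rows:
        groups[grp].append(val)
    joined = {grp: separator.join(vals) for grp, vals in groups.items()}
    # Keep every input row and attach its group's full concatenation.
    return [(grp, val, joined[grp]) for grp, val in rows]

rows = [(1, "a"), (1, "b"), (1, "c"), (1, "d"), (2, "x"), (2, "y"), (2, "z")]
for line in with_group_concat(rows):
    print(line)
```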
I tried a pure CTE approach, thinking it would be faster; see "Which is the best way to form the string value using column from a Table with rows having same ID?"
But the benchmark says otherwise: it's better to use a subquery (or CROSS APPLY) with FOR XML PATH, as it is faster (see the same question).
DECLARE @tbl TABLE
(
grp INT
,val VARCHAR(1)
);
BEGIN
INSERT INTO @tbl(grp, val)
VALUES
(1, 'a'),
(1, 'b'),
(1, 'c'),
(1, 'd'),
(2, 'x'),
(2, 'y'),
(2, 'z');
END;
----------- Your Required Query
SELECT ST2.grp,
SUBSTRING(
(
SELECT ','+ST1.val AS [text()]
FROM @tbl ST1
WHERE ST1.grp = ST2.grp
ORDER BY ST1.grp
For XML PATH ('')
), 2, 1000
) groupcnct
FROM @tbl ST2
Is it possible for you to just put your STUFF in the SELECT list instead, or do you run into the same issue? (I replaced 'tbl' with 'TEMP.TEMP123'.)
Select
A.*
, [GROUPCNT] = STUFF((
SELECT '' + aa.val
FROM TEMP.TEMP123 AA
WHERE aa.grp = a.grp
FOR XML PATH('')
), 1, 0, '')
from TEMP.TEMP123 A
This worked for me -- wanted to see if this worked for you too.
I know this post is old, but in case someone is still wondering: you can create a scalar function that concatenates row values.
IF OBJECT_ID('dbo.fnConcatRowsPerGroup','FN') IS NOT NULL
DROP FUNCTION dbo.fnConcatRowsPerGroup
GO
CREATE FUNCTION dbo.fnConcatRowsPerGroup
(@grp as int) RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @val AS VARCHAR(MAX)
SELECT @val = COALESCE(@val,'')+val
FROM tbl
WHERE grp = @grp
RETURN @val;
END
GO
select *, dbo.fnConcatRowsPerGroup(grp)
from tbl
Here is the result set I got from querying a sample table:
grp | val | (No column name)
---------------------------------
1 | a | abcd
1 | b | abcd
1 | c | abcd
1 | d | abcd
2 | x | xyz
2 | y | xyz
2 | z | xyz