SQL Server 2008: duplicate a row n-times, where n is a value in a field

In SQL Server 2008 I have three tables:
T1 (idService, dateStart, dateStop)
T2 (idService, totalCostOfService)
T3 (idService, companyName)
Using joins, I created a view:
V1 (idService, dateStart, dateStop, totalCostOfService, companyName)
And we are fine. I can do my selects on the view and obtain the list of services done.
What I would like to do now is to duplicate every row of the view n times, where n=dateStop-dateStart; every row should have a "new" totalCostOfService = totalCostOfService/n.
I can do that using a temporary table, declaring variables, insert in temp using some while etc. etc. Let's call it "the procedure"
But what I would like to understand is:
is it possible to do that directly with a select on V1? If not, is it possible to save "the procedure" as a view so that I can have it as an easy select?
Sorry if my question looks somewhat stupid, but I'm totally new to SQL. I tried searching here and on Google but I couldn't find an answer to my questions.
Thank you!

Rather than an rCTE (which is RBAR), you could use a Tally Table:
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1
CROSS JOIN N N2 --100
CROSS JOIN N N3 --1000
CROSS JOIN N N4) --10000
SELECT *
FROM YourTable
JOIN Tally T ON T.I <= dateStop-dateStart --Assumes dateStart and dateStop are integer values, even though their names imply otherwise
--If they are dates, then use DATEDIFF(DAY, dateStart, dateStop)
That tally will generate 10,000 numbers (0 to 9,999), which is over 27 years' worth of days. That should be far more than enough.
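If dateStart and dateStop are real dates, the same tally can expand the rows and split the cost in one statement. This is only a sketch under assumptions: the V1 CTE is a hypothetical one-row stand-in for the real view (remove it to query the actual V1), the day count is inclusive of both start and stop, and the output names dateOfService and dailyCost are made up for illustration.

```sql
WITH V1 AS (
    -- Hypothetical sample row; delete this CTE to run against the real view V1
    SELECT 1 AS idService,
           CAST('2020-01-01' AS date) AS dateStart,
           CAST('2020-01-04' AS date) AS dateStop,
           CAST(100.00 AS decimal(10,2)) AS totalCostOfService,
           'Acme' AS companyName
),
N AS (
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS (
    -- 0..9999, so offset 0 is the start day itself
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS I
    FROM N N1
    CROSS JOIN N N2
    CROSS JOIN N N3
    CROSS JOIN N N4)
SELECT V.idService,
       DATEADD(DAY, T.I, V.dateStart) AS dateOfService,
       -- * 1.0 avoids integer division if the cost column is an integer type
       V.totalCostOfService * 1.0
           / (DATEDIFF(DAY, V.dateStart, V.dateStop) + 1) AS dailyCost,
       V.companyName
FROM V1 V
JOIN Tally T ON T.I <= DATEDIFF(DAY, V.dateStart, V.dateStop);
```

For the sample row this yields one row per day from the 1st to the 4th, each carrying a quarter of the total cost. Because it is a single SELECT, it could also be wrapped in CREATE VIEW, which is what the question asks for.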

I will assume the existence of a numbers table with a column val holding the individual numbers. If you don't have one, you will find plenty of examples by searching around.
Add this in the end of the FROM clause of your view:
cross apply (select datediff(day,T1.dateStart,T1.dateStop)+1 as n_days)q1 -- number of days INCLUDING start
cross apply (select dateadd(day, n.val, T1.dateStart) as day_of_charge from numbers n where n.val between 0 and q1.n_days-1) q2
Then you will be able to have the following field on your SELECT:
T2.totalCostOfService/n_days as totalCostOfService
I'll add a numbers table solution shortly.
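In case it helps, here is one possible way to build such a numbers table. The name dbo.numbers and the column val follow the assumption in this answer; the upper bound of 9,999 is an arbitrary choice for the sketch.

```sql
-- Helper table assumed by the answer above: one row per integer
CREATE TABLE dbo.numbers (val int NOT NULL PRIMARY KEY);

-- Fill it with 0..9999 by combining four digits (units, tens, hundreds, thousands)
WITH d AS (
    SELECT v FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS x(v)
)
INSERT INTO dbo.numbers (val)
SELECT a.v + 10 * b.v + 100 * c.v + 1000 * e.v
FROM d AS a
CROSS JOIN d AS b
CROSS JOIN d AS c
CROSS JOIN d AS e;
```

A persisted table like this only needs to be created once and can then be reused by any query that needs a row generator.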

You can use a recursive CTE:
with cte as (
select idService, dateStart, dateStop,
totalCostOfService / (datediff(day, dateStart, dateStop) + 1) as dailyCostOfService,
companyName
from v1
union all
select idService,
dateadd(day, 1, dateStart),
dateStop,
dailyCostOfService,
companyName
from cte
where dateStart < dateStop
)
select idservice, dateStart as dateOfService,
dailyCostOfService, companyName
from cte;
Note that if there are more than 100 days in any row, then you will need to add OPTION (MAXRECURSION 0).

Related

Count Similar Substrings SQL query

I've tried a few scenarios and googled a lot, but still can't find a solution.
I have a table of user names with entries something like the below:
UserName
Cakes420
18Jack01
18Jack04
16Jack22
22Jack16
Mapple7609
Chrom44
chrom22
chrom77
013Cake
016Cake
122Cake
123Cake87
So I need a query that checks for all records that share 4 or more (in sequence) characters in the table.
So I need to return something like :
Characters  Times Used  Names Sharing
Cake        5           Cakes420, 013Cake, 016Cake, 122Cake, 123Cake87
Chro        3           Chrom44, chrom22, chrom77
or anything similar as I'd prefer not to repeat patterns, but hey, at this stage if it returns the values properly, I don't mind.
The shared characters can naturally appear in any place in the string, which is what makes this so difficult.
Should you do this in T-SQL? Probably not.
Can you do this in T-SQL? Yes.
Sample data
create table Names
(
Name nvarchar(20)
);
insert into Names (Name) values
('Cakes420'),
('18Jack01'),
('18Jack04'),
('16Jack22'),
('22Jack16'),
('Mapple7609'),
('Chrom44'),
('chrom22'),
('chrom77'),
('013Cake'),
('016Cake'),
('122Cake'),
('123Cake87');
Solution
Using STRING_AGG() for easy concatenation. Available from SQL Server 2017. Alternatives available for older SQL versions (use the search box on this site, there are many examples).
with rcte as
(
select n.Name,
convert(nvarchar(4), substring(n.Name, 1, 4)) as Part,
1 as PartFrom
from Names n
where len(n.Name) >= 4
union all
select r.Name,
convert(nvarchar(4), substring(r.Name, r.PartFrom+1, 4)), -- third argument of substring is a length, not an end position
r.PartFrom+1
from rcte r
where len(r.Name) >= r.PartFrom+4
),
cte_count as
(
select r.Part,
count(1) as PartCount
from rcte r
where r.Part not like '%[0-9]%' -- exclude parts with numbers in them
group by r.Part
having count(1) > 1
)
select c.Part,
c.PartCount,
string_agg(r.Name, ', ') as Names
from cte_count c
join rcte r
on r.Part = c.Part
group by c.Part,
c.PartCount
order by c.Part;
Result
Part  PartCount  Names
----  ---------  ----------------------------------------------
Cake  5          Cakes420, 123Cake87, 122Cake, 016Cake, 013Cake
Chro  3          Chrom44, chrom22, chrom77
hrom  3          chrom77, chrom22, Chrom44
Jack  4          22Jack16, 16Jack22, 18Jack04, 18Jack01
Fiddle to see it in action with the intermediate CTE results.
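For versions before SQL Server 2017, one commonly used substitute for STRING_AGG() is FOR XML PATH combined with STUFF. The sketch below is not part of the query above: the table variable @Parts and its rows are hypothetical stand-ins for the (Part, Name) pairs the recursive CTE produces.

```sql
-- Hypothetical sample of (Part, Name) pairs
DECLARE @Parts TABLE (Part nvarchar(4), Name nvarchar(20));
INSERT INTO @Parts (Part, Name) VALUES
    (N'Cake', N'013Cake'),
    (N'Cake', N'016Cake'),
    (N'Jack', N'18Jack01'),
    (N'Jack', N'18Jack04');

SELECT p.Part,
       COUNT(*) AS PartCount,
       -- Build ', name1, name2' per part, then STUFF removes the leading ', '
       STUFF((SELECT ', ' + p2.Name
              FROM @Parts p2
              WHERE p2.Part = p.Part
              ORDER BY p2.Name
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
             1, 2, '') AS Names
FROM @Parts p
GROUP BY p.Part
ORDER BY p.Part;
```

The TYPE directive with .value() keeps characters such as & or < from being returned as XML entities.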
Let's use Itzik Ben-Gan's Tally Function to break out a list of substrings, then group them. This is called N-Gram, after the more common Trigram which is 3-character substrings.
I've removed one extra cross join from the function to speed it up slightly; it's now good for strings of up to 65,536 characters (16 rows in L0, 256 in L1, 65,536 in L2):
CREATE OR ALTER FUNCTION dbo.GetNums(@num AS BIGINT)
RETURNS TABLE
AS
RETURN
WITH
L0 AS ( SELECT 1 AS c
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1)) AS D(c) ),
L1 AS ( SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B ),
L2 AS ( SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B ),
Nums AS ( SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM L2 )
SELECT TOP(@num)
rownum AS rn
FROM Nums
ORDER BY rownum;
GO
DECLARE @substringLen int = 4;
SELECT
Characters,
[Times Used] = COUNT(*),
[Names Sharing] = STRING_AGG(Username, ', ')
FROM (
SELECT DISTINCT
-- remove DISTINCT if you want to know about multiple in a single username
t.Username,
Characters = SUBSTRING(t.Username, n.rn, @substringLen)
FROM myTable t
CROSS APPLY dbo.GetNums (LEN(t.UserName) - @substringLen + 1) n
) t
GROUP BY t.Characters
HAVING COUNT(*) > 1

Generate group code between two values of the group with SQL

I have a big issue: I need to generate a code for every value in the range between two existing columns (CodeFrom / CodeTo), as in the screenshots below:
Input:
Expected output:
Any ideas would help for sure. Thanks.
In SQL Server, you can use a recursive CTE:
with cte as (
select codefrom, codeto, town, codefrom as code
from t
union all
select codefrom, codeto, town, code + 1
from cte
where code < codeto
)
select *
from cte;
SQL Server has a built-in default recursion limit of 100. So, if you might be generating more than 100 codes, then add option (maxrecursion 0).
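To make the hint concrete, here is a self-contained sketch. The t CTE is a hypothetical one-row sample (250 codes for 'Paris') chosen only because it exceeds the default limit.

```sql
WITH t AS (
    -- Hypothetical sample row: needs 249 recursion levels, more than the default 100
    SELECT 1 AS codefrom, 250 AS codeto, 'Paris' AS town
),
cte AS (
    SELECT codefrom, codeto, town, codefrom AS code
    FROM t
    UNION ALL
    SELECT codefrom, codeto, town, code + 1
    FROM cte
    WHERE code < codeto
)
SELECT codefrom, codeto, town, code
FROM cte
OPTION (MAXRECURSION 0);  -- 0 removes the limit entirely
```

The hint goes on the outermost statement, after the final SELECT. A positive cap (for example 1000) instead of 0 may be a safer choice, since it still stops a runaway recursion.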
Like I mentioned under Gordon's answer in the comments, use a Tally for this. They are far faster (especially with larger datasets) and don't suffer from the max recursion error, as they aren't recursive:
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3) --1,000 rows, Add more N for more rows
SELECT YT.CodeFrom,
YT.CodeTo,
YT.Town,
T.I AS Code
FROM (VALUES(1,7,'Paris'),
(14,17,'Sao Paulo'))YT(CodeFrom,CodeTo,Town)
JOIN Tally T ON YT.CodeFrom <= T.I
AND YT.CodeTo >= T.I;

Iterate through SQL table creating another SQL Table

Wonder if someone could cast an eye over the following problem:
I'm running a SQL SELECT statement which gives me the following results:
DATE NumberOfHours
2017-05-01 4
2017-06-01 38
2017-07-01 68
What I'd like to be able to do, off the back of this table, is create another table that contains 4 rows for 2017-05-01, 38 rows for 2017-06-01 and 68 rows for 2017-07-01, so that I end up with a table that has 110 rows in it.
I'm at a bit of a loss as to how this could be achieved...could anyone assist?
////////////////////////////////////////////////////////////
Using the response posted by Gordon Linoff, I managed to get this working by using:
with cte as (
SELECT DATEADD(month, datediff(month,0,L.DateAdded),0) AS 'Date', CEILING(SUM(l.CPDHours))AS NumberOfHours
FROM WebsiteICA_SF.dbo.CPD_Log L
WHERE L.DateAdded >= DATEADD(month, -6, GETDATE())
AND (L.Provider = 'ICA' OR L.Provider like 'International Compli%')
GROUP BY DATEADD(month, datediff(month,0,L.DateAdded),0)
union all
select date, NumberOfHours - 1
from cte
where NumberOfHours > 1
)
select 1 AS 'ObId', date, 'ICA' AS Provider, '# ICA' AS DataType
from cte
order by DATEADD(month, datediff(month,0,cte.Date),0)
OPTION (maxrecursion 10000);
One simple method is a recursive CTE:
with cte as (
select date, NumberOfHours
from t
union all
select date, NumberOfHours - 1
from cte
where NumberOfHours > 1
)
select date
from cte;
By default, recursion stops with an error after 100 levels, so this handles at most about 100 hours per row. However, that is easily changed using the MAXRECURSION option.
Other methods generally rely on a second table to generate numbers. I also like this approach because it is a gentle introduction to recursive CTEs.
Here is a nice SQL Fiddle.
So you have a result set with 3 rows and one column in it which tells you how many rows it represents. You want to generate that many rows.
Not sure what you want to store in that, but here is a solution to the base problem:
Create a table (temp table or CTE is fine too) which contains only one column, storing numbers from 0 to whatever. This is called Tally Table or Numbers Table.
Join this table to your resultset:
WITH NumbersCTE AS (
-- This will give you a bunch of Numbers
-- Persist a table if you want to use it more frequently
SELECT ROW_NUMBER() OVER (ORDER BY name) AS Number FROM sys.columns
)
SELECT
MT.Date,
N.Number
FROM
dbo.MyTable MT
INNER JOIN NumbersCTE N
ON N.Number <= MT.NumberOfHours
As Pieter Geerkens pointed out in the comments, the above method is not the best way to generate a numbers table, but for demonstration purposes it is fine.
For more info about how to generate tally tables in SQL Server, you can check
http://www.sqlservercentral.com/blogs/dwainsql/2014/03/27/tally-tables-in-t-sql/
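Since the question asks for the expanded rows to end up in another table, here is a sketch that materializes them with SELECT ... INTO. The table variable @Source is a hypothetical stand-in for the query result shown in the question, and #ExpandedHours and HourNumber are made-up names.

```sql
-- Hypothetical stand-in for the result set in the question
DECLARE @Source TABLE ([Date] date, NumberOfHours int);
INSERT INTO @Source ([Date], NumberOfHours) VALUES
    ('2017-05-01', 4), ('2017-06-01', 38), ('2017-07-01', 68);

WITH E1 AS (
    SELECT 1 AS c FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(c)
),
Numbers AS (
    -- 1..1000; add another cross join if NumberOfHours can exceed that
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Number
    FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 d
)
SELECT S.[Date], N.Number AS HourNumber
INTO #ExpandedHours                     -- new temp table created by the SELECT
FROM @Source S
JOIN Numbers N ON N.Number <= S.NumberOfHours;

-- One row per hour: 4 + 38 + 68 = 110 rows for the sample data
SELECT COUNT(*) AS TotalRows FROM #ExpandedHours;
```

Swapping #ExpandedHours for a permanent table name would persist the result beyond the session.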

How to split and display distinct letters from a word in SQL?

Yesterday in a job interview I was asked this question and I had no clue about it. Suppose I have the word "Manhattan" and I want to display only its distinct letters 'M','A','N','H','T' in SQL. How do I do it?
Any help is appreciated.
Well, here is my solution (sqlfiddle) - it aims to use "relational SQL" operations, which may have been what the interviewer was going for conceptually.
Most of the work done is simply to turn the string into a set of (pos, letter) records as the relevant final applied DQL is a mere SELECT with a grouping and ordering applied.
select letter
from (
-- All of this just to get a set of (pos, letter)
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
-- Or use another form to create a "numbers table"
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
) as pairs
group by letter -- guarantees distinctness
order by min(pos) -- ensure output is ordered MANHT
The above query works in SQL Server 2008, but the "Numbers Table" may have to be altered for other vendors. Otherwise, there is nothing used that is vendor specific - no CTE, or cross application of a function, or procedural language code ..
That being said, the above is to show a conceptual approach - SQL is designed for use with sets and relations and multiplicity across records; the above example is, in some sense, merely a perversion of such.
Examining the intermediate relation,
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
uses a cross join to generate the Cartesian product of the string (1 row) with the numbers (9 rows); the substring function is then applied with the string and each number to obtain each character in accordance with its position. The resulting set contains the records-
POS LETTER
1 M
2 A
3 N
..
9 N
Then the outer select groups the records by letter, and the resulting groups are ordered by the minimum (first) position at which each letter occurs. (Without the order by, the letters would still be distinct, but the final order would not be guaranteed.)
One way (if using SQL Server) is with a recursive CTE (Common Table Expression).
DECLARE @source nvarchar(100) = 'MANHATTAN'
;
WITH cte AS (
SELECT SUBSTRING(@source, 1, 1) AS c1, 1 as Pos
WHERE LEN(@source) > 0
UNION ALL
SELECT SUBSTRING(@source, Pos + 1, 1) AS c1, Pos + 1 as Pos
FROM cte
WHERE Pos < LEN(@source)
)
SELECT DISTINCT c1 from cte
SqlFiddle for this is here. I had to inline the @source for SqlFiddle, but the code above works fine in SQL Server.
The first SELECT generates the initial row (in this case 'M', 1). The second SELECT is the recursive part that generates the subsequent rows, with the Pos column incremented each time until the condition WHERE Pos < LEN(@source) no longer holds. The final select removes the duplicates. Internally, SELECT DISTINCT sorts the rows here in order to facilitate the removal of duplicates, which is why the final output happens to be in alphabetical order (that order is not guaranteed without an ORDER BY). Since you didn't specify order as a requirement, I left it as-is. But you could modify it to use a GROUP BY instead, ordered on MIN(Pos), if you needed the output in the characters' original order.
This same technique can be used for things like generating all the Bigrams for a string, with just a small change to the general structure above.
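As a rough illustration of that small change, the sketch below takes two characters per step instead of one and stops one position earlier. The column names Bigram, FirstPos and Occurrences are made up for the example.

```sql
DECLARE @source nvarchar(100) = N'MANHATTAN';

WITH cte AS (
    -- Anchor: the first two characters, if the string is long enough
    SELECT SUBSTRING(@source, 1, 2) AS Bigram, 1 AS Pos
    WHERE LEN(@source) >= 2
    UNION ALL
    -- Recursive step: slide the two-character window one position to the right
    SELECT SUBSTRING(@source, Pos + 1, 2), Pos + 1
    FROM cte
    WHERE Pos + 1 < LEN(@source)   -- the last bigram starts at LEN - 1
)
SELECT Bigram, MIN(Pos) AS FirstPos, COUNT(*) AS Occurrences
FROM cte
GROUP BY Bigram
ORDER BY MIN(Pos);
```

For 'MANHATTAN' the window visits MA, AN, NH, HA, AT, TT, TA and AN, so AN is the only bigram counted twice.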
declare @charr varchar(99)
declare @lp int
set @charr='Manhattan'
set @lp=1
DECLARE @T1 TABLE (
FLD VARCHAR(max)
)
while(@lp<=LEN(@charr))
begin
if(not exists(select * from @T1 where FLD=(select SUBSTRING(@charr,@lp,1))))
begin
insert into @T1
select SUBSTRING(@charr,@lp,1)
end
set @lp=@lp+1
end
select * from @T1
Check this, it may help you.
Here's an Oracle version of @user2864740's answer. The only difference is how you construct the "numbers table" (plus slight differences in aliasing)
select letter
from (
select ns.n as pos, substr(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s from dual) ss
cross join (
SELECT LEVEL as n
FROM DUAL
CONNECT BY LEVEL <= 9
ORDER BY LEVEL) ns
) pairs
group by letter
order by min(pos)

Query question regarding aggregates over a date range

I have a data set where the structure could be like this
yes_no date
0 1/1/2011
1 1/1/2011
1 1/2/2011
0 1/4/2011
1 1/9/2011
Given a start date and an end date, I would like to create a query that aggregates over the date and provides a 0 for dates that do not exist in the table, for all dates between start_date and end_date, including both.
This is in SQL.
I am stumped. I can get the aggregate queries very simply, but I don't know how to get zeros for dates that do not exist in the table.
If you're working with a DBMS that supports common table expressions, the following will generate a derived table of dates that you can then left join to your table. This was written for MSSQL, so you may need to derive your dates differently (i.e., an object other than master..spt_values)
with AllDates as (
select top 100000
convert(datetime, row_number() over (order by x.name)) as 'Date'
from
master..spt_values x
cross join master..spt_values y
)
select
ad.Date, isnull(yt.yn, 0)
from
AllDates ad
left join (
select date, sum(yes_no) yn
from YourTable
group by date -- one aggregated row per date
) yt
on ad.date = yt.date
where
ad.Date between YourStartDate and YourEndDate
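If you would rather not depend on master..spt_values, the dates can also be generated directly from the range. This is a sketch, not a drop-in replacement: @YourTable holds the sample rows from the question (assuming they are month/day/year), and @StartDate and @EndDate are hypothetical parameters.

```sql
-- Sample data from the question, loaded into a table variable for the sketch
DECLARE @YourTable TABLE (yes_no int, [date] date);
INSERT INTO @YourTable (yes_no, [date]) VALUES
    (0, '2011-01-01'), (1, '2011-01-01'), (1, '2011-01-02'),
    (0, '2011-01-04'), (1, '2011-01-09');

DECLARE @StartDate date = '2011-01-01',   -- hypothetical range, both ends included
        @EndDate   date = '2011-01-10';

WITH E1 AS (
    SELECT 1 AS c FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS v(c)
),
Numbers AS (
    -- Day offsets 0..9999
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
    FROM E1 a CROSS JOIN E1 b CROSS JOIN E1 d CROSS JOIN E1 e
),
AllDates AS (
    SELECT DATEADD(DAY, n, @StartDate) AS [Date]
    FROM Numbers
    WHERE n <= DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT ad.[Date],
       ISNULL(SUM(yt.yes_no), 0) AS yn    -- 0 for dates with no rows
FROM AllDates ad
LEFT JOIN @YourTable yt ON yt.[date] = ad.[Date]
GROUP BY ad.[Date]
ORDER BY ad.[Date];
```

Here the aggregation happens after the left join, so every generated date appears exactly once even when the table has no rows for it.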
Generating the dates has to be the way to go.
In Oracle you could join on to a generated list of dates, for example:
(SELECT TRUNC(startdate + LEVEL - 1)
FROM DUAL CONNECT BY LEVEL <= (enddate - startdate) + 1) -- includes both startdate and enddate
If you can't generate your dates on the fly,
a database-agnostic solution would be to create a table containing all of the dates you will ever need and join on to that. (This should be your last resort.)
Here's the pseudocode; you will need to substitute mydates with either the on-the-fly SQL or a select from the date table:
SELECT
a.date,
CASE WHEN COUNT(b.date)=0
THEN
0
ELSE
1
END as yes_no
FROM (mydates) a
LEFT JOIN aggtable b ON a.date=b.date
GROUP BY a.date