Generate group code between two values of the group with SQL - sql

I have a big issue: I need to generate one code for every value in the range between two existing columns (CodeFrom / CodeTo), as in the following screenshots:
Input:
Expected output:
Any ideas would be a great help. Thanks

In SQL Server, you can use a recursive CTE:
with cte as (
select codefrom, codeto, town, codefrom as code
from t
union all
select codefrom, codeto, town, code + 1
from cte
where code < codeto
)
select *
from cte;
SQL Server has a built-in default recursion limit of 100. So, if you might be generating more than 100 codes, then add option (maxrecursion 0).
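As a rough, self-contained sketch of the query above with the recursion hint applied: the inline VALUES rows (Paris 1 to 7, Sao Paulo 14 to 17) are assumed sample data standing in for the real table t, which the question only shows as a screenshot.

```sql
-- Assumed sample data in place of the real table t
WITH t AS (
    SELECT CodeFrom, CodeTo, Town
    FROM (VALUES (1, 7, 'Paris'),
                 (14, 17, 'Sao Paulo')) AS v(CodeFrom, CodeTo, Town)
),
cte AS (
    -- Anchor: one row per range, starting at CodeFrom
    SELECT CodeFrom, CodeTo, Town, CodeFrom AS Code
    FROM t
    UNION ALL
    -- Recursive step: keep adding 1 until CodeTo is reached
    SELECT CodeFrom, CodeTo, Town, Code + 1
    FROM cte
    WHERE Code < CodeTo
)
SELECT CodeFrom, CodeTo, Town, Code
FROM cte
ORDER BY CodeFrom, Code
OPTION (MAXRECURSION 0); -- lifts the default limit of 100 recursion levels
```

The hint goes on the outermost statement, after the final SELECT, not inside the CTE definition.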

As I mentioned in the comments under Gordon's answer, use a Tally for this. Tallies are far faster (especially with larger datasets) and don't suffer from the max recursion error, as they aren't recursive:
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3) --1,000 rows, Add more N for more rows
SELECT YT.CodeFrom,
YT.CodeTo,
YT.Town,
T.I AS Code
FROM (VALUES(1,7,'Paris'),
(14,17,'Sao Paulo'))YT(CodeFrom,CodeTo,Town)
JOIN Tally T ON YT.CodeFrom <= T.I
AND YT.CodeTo >= T.I;

Related

Create 10 Million products in SQL

I'm trying to add 10 million products to my products table.
That was my attempt so far:
INSERT INTO Artikel (Hersteller, Artikelnummer, Artikelnamen, Artikelbeschreibung, Preis)
VALUES (
(SELECT TOP 1 Hersteller FROM Artikel ORDER BY NEWID()) ,
(SELECT FLOOR(RAND() * (100000000000-101 + 1)) + 101 ) ,
(SELECT REPLACE(NEWID(),'-','')),
(SELECT REPLACE(NEWID(),'-','')),
(SELECT ROUND(RAND(CHECKSUM(NEWID())) * (9999), 2))
)
GO 10000000
But this takes forever. After ~45 minutes my query was nowhere near 200K values. Are there any faster/more efficient solutions?
What you actually want is unclear; however, you could likely get some very good performance by doing this in only a couple of batches. I don't understand why you are getting a value (of Hersteller) from your table (Artikel) only to insert it into the table again, but I've incorporated that anyway.
This does the INSERT in 2 batches of 5,000,000:
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3, N N4, N N5, N N6, N N7), --10,000,000 rows
Dataset AS(
SELECT TOP (5000000)
A.Hersteller
FROM dbo.Artikel A
CROSS JOIN Tally T)
INSERT INTO dbo.Artikel
SELECT D.Hersteller,
FLOOR(RAND() * (100000000000-101 + 1)) + 101, --This'll be the same for every row, is that intended?
REPLACE(NEWID(),'-',''),
REPLACE(NEWID(),'-',''),
ROUND(RAND(CHECKSUM(NEWID())) * (9999), 2) --Seeded from NEWID(), so this one does differ per row
FROM Dataset D;
GO 2
Note my comment about the unseeded RAND(): it'll produce the same value on every row (within the batch). If that isn't desired, then see this post about making a random number per row: How do I generate a random number for each row in a T-SQL select?
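For illustration, a minimal sketch of the difference between a per-statement and a per-row random value. The three-row VALUES list and the 101 to 100100 range are placeholders chosen for the example, not the ranges from the question.

```sql
SELECT V.I,
       RAND() AS SameOnEveryRow,                      -- unseeded: evaluated once per statement
       RAND(CHECKSUM(NEWID())) AS SeededPerRow,       -- new seed per row, so a new value per row
       ABS(CHECKSUM(NEWID()) % 100000) + 101 AS RandomIntPerRow -- integer from 101 to 100100
FROM (VALUES (1), (2), (3)) AS V(I);
```

Taking the modulo before ABS avoids the overflow that ABS raises on the smallest possible int value.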

SQL Server 2008: duplicate a row n-times, where n is a value in a field

In SQL Server 2008 I have three tables:
T1 (idService, dateStart, dateStop)
T2 (idService, totalCostOfService)
T3 (idService, companyName)
Using joins, I created a view:
V1 (idService, dateStart, dateStop, totalCostOfService, companyName)
And we are fine. I can do my selects on the view and obtain the list of services done.
What I would like to do now is to duplicate every row of the view n times, where n = dateStop - dateStart; every row should have a "new" totalCostOfService = totalCostOfService/n.
I can do that using a temporary table, declaring variables, insert in temp using some while etc. etc. Let's call it "the procedure"
But what I would like to understand is:
Is it possible to do that directly with a select on V1? If not, is it possible to save "the procedure" as a view so that I can have it as an easy select?
Sorry if my question looks somewhat stupid, but I'm totally new to SQL. I tried searching here and on Google but I couldn't find an answer to my questions.
Thank you!
Rather than an rCTE (which is RBAR), you could use a Tally Table:
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1
CROSS JOIN N N2 --100
CROSS JOIN N N3 --1000
CROSS JOIN N N4) --10000
SELECT *
FROM YourTable
JOIN Tally T ON T.I <= dateStop-dateStart --Assumes dateStart and dateStop are integer values, even though their names imply otherwise
--If they are dates, then use DATEDIFF(DAY, dateStart, dateStop)
That tally will generate the numbers 0 to 9,999 (which is over 27 years' worth of days; that should be far more than enough).
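If the columns are real dates, the same idea might look like the sketch below. The single V1 row is invented sample data, and counting both the start and the stop day (so n = DATEDIFF + 1) is an assumption; drop the + 1 and use < in the join if the stop day should be excluded.

```sql
WITH N AS (
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS I -- 0 to 9,999
    FROM N N1
    CROSS JOIN N N2
    CROSS JOIN N N3
    CROSS JOIN N N4),
V1 AS ( -- hypothetical sample row standing in for the view
    SELECT idService, dateStart, dateStop, totalCostOfService, companyName
    FROM (VALUES (1, CONVERT(date, '20180101'), CONVERT(date, '20180104'),
                  CONVERT(decimal(10,2), 100.00), 'Acme'))
         V(idService, dateStart, dateStop, totalCostOfService, companyName))
SELECT V1.idService,
       DATEADD(DAY, T.I, V1.dateStart) AS dateOfService, -- one row per day
       V1.totalCostOfService / (DATEDIFF(DAY, V1.dateStart, V1.dateStop) + 1) AS dailyCostOfService,
       V1.companyName
FROM V1
JOIN Tally T ON T.I <= DATEDIFF(DAY, V1.dateStart, V1.dateStop);
```

Because this is a single SELECT with no variables or loops, it can be saved as a view.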
I will assume the existence of a numbers table which has the column val for the individual value numbers. If you don't, you will find plenty by searching around.
Add this at the end of the FROM clause of your view:
cross apply (select datediff(day,T1.dateStart,T1.dateStop)+1 as n_days) q1 -- number of days INCLUDING start
cross apply (select dateadd(day,n.val,T1.dateStart) as day_of_charge from numbers n where n.val between 0 and n_days-1) q2
Then you will be able to have the following field on your SELECT:
T2.totalCostOfService/n_days as totalCostOfService
I'll add a numbers table solution shortly.
You can use a recursive CTE:
with cte as (
select idService, dateStart, dateStop,
totalCostOfService / (datediff(day, dateStart, dateStop) + 1) as dailyCostOfService,
companyName
from v1
union all
select idService,
dateadd(day, 1, dateStart),
dateStop,
dailyCostOfService,
companyName
from cte
where dateStart < dateStop
)
select idservice, dateStart as dateOfService,
dailyCostOfService, companyName
from cte;
Note that if there are more than 100 days in any row, then you will need to add OPTION (MAXRECURSION 0).

The right way to use CTE

I'm new to Common Table Expressions and I think I need to use one in order to achieve what I require.
If I run the following script -
select MainRentAccountReference,EffectiveFromDate,CollectionDay,NumberOfCollections,DirectDebitTotalOverrideAmount
from DirectDebitApportionment
where id = 1
It would give me the below results -
So for each row that my CTE would return- for each unique MainRentAccountReference - I would want to create a row based on the following criteria.
3 Rows as the NumberOfCollections is set to 3
The following dates on each row - 01/05/18, 01/06/18, 01/07/18 so basically plus one month.
However, if the CollectionDay was set to, say, 10, then I would want the 3 dates to be 10/05/18, 10/06/18, 10/07/18
Finally each row to have a value of DirectDebitTotalOverrideAmount divided by number of NumberOfCollections.
I've been playing about with this and can get nowhere near the results I'm trying to achieve. Any help would be greatly appreciated. Thanks
You can do this with a recursive CTE
with t as (
select *
from DirectDebitApportionment
where id = 1
),
cte as (
select . . ., , 1 as collection, DirectDebitTotalOverrideAmount / NumberOfCollections as collection_amount
from t
union all
select . . ., , collection + 1, DirectDebitTotalOverrideAmount / NumberOfCollections as collection_amount
from cte
where collection < NumberOfCollections
)
select . . .
from cte;
In some dialects of SQL, you need the recursive keyword.
This can also be accomplished using a numbers table, and that can be more efficient than the recursive CTE (although recursive CTEs often perform surprisingly well).
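A numbers-table version could be sketched roughly as follows. The account reference, dates and amount in t are invented sample values, the inline Numbers list (1 to 12) is a stand-in for a real numbers table, and DATEFROMPARTS requires SQL Server 2012 or later.

```sql
WITH t AS ( -- hypothetical sample row; replace with the real table and filter
    SELECT MainRentAccountReference, EffectiveFromDate, CollectionDay,
           NumberOfCollections, DirectDebitTotalOverrideAmount
    FROM (VALUES ('ACC001', CONVERT(date, '20180501'), 10, 3, CONVERT(decimal(18,2), 300.00)))
         AS v(MainRentAccountReference, EffectiveFromDate, CollectionDay,
              NumberOfCollections, DirectDebitTotalOverrideAmount)
),
Numbers AS ( -- assumed upper bound of 12 collections
    SELECT n
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) AS x(n)
)
SELECT t.MainRentAccountReference,
       Numbers.n AS collection,
       -- first collection falls on CollectionDay of the starting month, then one month apart
       DATEADD(MONTH, Numbers.n - 1,
               DATEFROMPARTS(YEAR(t.EffectiveFromDate), MONTH(t.EffectiveFromDate), t.CollectionDay)) AS CollectionDate,
       CONVERT(decimal(18,2), t.DirectDebitTotalOverrideAmount / t.NumberOfCollections) AS collection_amount
FROM t
JOIN Numbers ON Numbers.n <= t.NumberOfCollections
ORDER BY t.MainRentAccountReference, collection;
```

The join condition does the work of the recursion: each source row is matched to as many numbers as it has collections.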
This seems to do the trick based on the pointers that Gordon gave me -
with t as (
select MainRentAccountReference,EffectiveFromDate,CollectionDay,NumberOfCollections,DirectDebitTotalOverrideAmount
from DirectDebitApportionment
where id = 1
),
cte as (
select 1 as collection
,t.MainRentAccountReference
,convert(decimal(18,2),DirectDebitTotalOverrideAmount / NumberOfCollections) as collection_amount
,NumberOfCollections
,convert(datetime,DATEFROMPARTS ( DATEPART(YEAR,EffectiveFromDate), DATEPART(MONTH,EffectiveFromDate), CollectionDay )) AS EffectiveFromDate
,CollectionDay
from t
union all
select collection + 1,MainRentAccountReference,collection_amount,NumberOfCollections,DATEADD(M,1,EffectiveFromDate),CollectionDay
from cte
where collection < cte.NumberOfCollections
)
select *
from cte
Order by MainRentAccountReference,collection
;
Gives me the following results -

SQL to find missing numbers in sequence starting from min?

I have found several examples of SQL queries which will find missing numbers in a sequence. For example this one:
Select T1.val+1
from table T1
where not exists(select val from table T2 where T2.val = T1.val + 1);
This will only find gaps in an existing sequence. I would like to find gaps in a sequence starting from a minimum.
For example, if the values in my sequence are 2, 4 then the query above will return 3,5.
I would like to specify that my sequence must start at 0 so I would like the query to return 0,1,3,5.
How can I add the minimum value to my query?
A few answers to questions below:
There is no maximum, only a minimum
The DB is oracle
This is quite easy in Postgres:
select x.i as missing_sequence_value
from (
select i
from generate_series(0,5) i -- 0,5 are the lower and upper bounds you want
) x
left join the_table t on t.val = x.i
where t.val is null;
SQLFiddle: http://www.sqlfiddle.com/#!15/acb07/1
Edit
The Oracle solution is a bit more complex because generating the numbers requires a workaround
with numbers as (
select level - 1 as val
from dual
connect by level <= (select max(val) + 2 from the_table) -- this is the maximum
), number_range as (
select val
from numbers
where val >= 0 -- this is the minimum
)
select nr.val as missing_sequence_value
from number_range nr
left join the_table t on t.val = nr.val
where t.val is null;
SQLFiddle: http://www.sqlfiddle.com/#!4/71584/4
The idea (in both cases) is to generate a list of numbers you are interested in (from 0 to 5) and then doing an outer join against the values in your table. The rows where the outer join does not return something from your table (that's the condition where t.val is null) are the values that are missing.
The Oracle solution requires two common table expressions ("CTE", the "with" things) because you can't add a where level >= x in the first CTE that generates the numbers.
Note that the connect by level <= ... relies on an undocumented (and unsupported) way of using connect by. But so many people are using that to get a "number generator" that I doubt that Oracle will actually remove this.
If you have the option to use a common table expression, you can generate a sequence of numbers and use that as the source of numbers.
The variables @start and @end define the range of numbers (you could easily use max(val) from yourtable as the end instead).
This example is for MS SQL Server (but CTEs are a SQL:1999 feature supported by many databases):
declare @start int, @end int
select @start=0, @end=5
;With sequence(num) as
(
select @start as num
union all
select num + 1
from sequence
where num < @end
)
select * from sequence seq
where not exists(select val from YourTable where YourTable.val = seq.num)
Option (MaxRecursion 1000)
Sample SQL Fiddle
Just presenting 2 small variations to suggestions above, both are for Oracle.
First is a small variation to one presented by jpw using a recursive Common Table Expression (CTE); simply to demonstrate that this technique is also available in Oracle.
WITH
seq (val)
AS (
SELECT 0 FROM dual
UNION ALL
SELECT val + 1
FROM seq
WHERE val < (
SELECT MAX(val) FROM the_table -- note below
)
)
SELECT
seq.val AS missing_sequence_value
FROM seq
LEFT JOIN the_table t
ON seq.val = t.val
WHERE t.val IS NULL
ORDER BY
missing_sequence_value
;
This variation at SQLfiddle
Notable difference to SQL Server: you can use a subquery to limit the recursion
Also, Oracle documentation often refers to Subquery Factoring e.g. subquery_factoring_clause::= instead of CTE
Second is a variation on the use of connect by level as used by a_horse_with_no_name
Level is a pseudo column available in Oracle's hierarchical queries and the root of a hierarchy is 1. When using connect by level, by default this will commence at 1
For this variation I just wished to demonstrate that it does not need to be coupled with CTE's at all, and hence the syntax can be quite concise.
SELECT
seq.val AS missing_sequence_value
FROM (
SELECT
level - 1 AS val
FROM dual
CONNECT BY LEVEL <= (SELECT max(val) FROM the_table)
) seq
LEFT JOIN the_table t
ON seq.val = t.val
WHERE t.val IS NULL
ORDER BY
missing_sequence_value
;
This variation at SQLfiddle

How to split and display distinct letters from a word in SQL?

Yesterday in a job interview session I was asked this question and I had no clue about it. Suppose I have a word "Manhattan " I want to display only the letters 'M','A','N','H','T'
in SQL. How to do it?
Any help is appreciated.
Well, here is my solution (sqlfiddle). It aims to use "relational SQL" operations, which may have been what the interviewer was going for conceptually.
Most of the work done is simply to turn the string into a set of (pos, letter) records as the relevant final applied DQL is a mere SELECT with a grouping and ordering applied.
select letter
from (
-- All of this just to get a set of (pos, letter)
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
-- Or use another form to create a "numbers table"
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
) as pairs
group by letter -- guarantees distinctness
order by min(pos) -- ensure output is ordered MANHT
The above query works in SQL Server 2008, but the "Numbers Table" may have to be altered for other vendors. Otherwise, there is nothing used that is vendor specific - no CTE, or cross application of a function, or procedural language code ..
That being said, the above is to show a conceptual approach - SQL is designed for use with sets and relations and multiplicity across records; the above example is, in some sense, merely a perversion of such.
Examining the intermediate relation,
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
uses a cross join to generate the Cartesian product of the string (1 row) with the numbers (9 rows); the substring function is then applied with the string and each number to obtain each character in accordance with its position. The resulting set contains the records-
POS LETTER
1 M
2 A
3 N
..
9 N
Then the outer select groups each record according to the letter, and the resulting records are ordered by the minimum (first) occurrence position of the letter that establishes the grouping. (Without the order by, the letters would have been distinct but the final order would not be guaranteed.)
One way (if using SQL Server) is with a recursive CTE (Common Table Expression).
DECLARE @source nvarchar(100) = 'MANHATTAN'
;
WITH cte AS (
SELECT SUBSTRING(@source, 1, 1) AS c1, 1 as Pos
WHERE LEN(@source) > 0
UNION ALL
SELECT SUBSTRING(@source, Pos + 1, 1) AS c1, Pos + 1 as Pos
FROM cte
WHERE Pos < LEN(@source)
)
SELECT DISTINCT c1 from cte
SqlFiddle for this is here. I had to inline the @source for SqlFiddle, but the code above works fine in SQL Server.
The first SELECT generates the initial row (in this case 'M', 1). The second SELECT is the recursive part that generates the subsequent rows, with the Pos column incremented each time until the condition WHERE Pos < LEN(@source) is no longer satisfied. The final select removes the duplicates. Here SELECT DISTINCT happens to sort the rows in order to remove the duplicates, which is why the final output comes out in alphabetical order, although that order is not guaranteed without an ORDER BY. Since you didn't specify order as a requirement, I left it as-is. But you could modify it to use a GROUP BY instead, ordered on MIN(Pos), if you needed the output in the characters' original order.
This same technique can be used for things like generating all the Bigrams for a string, with just a small change to the general structure above.
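The GROUP BY variant mentioned above might look like this sketch, which keeps the letters in their order of first appearance. It uses the same recursive CTE, with variables written with T-SQL's @ prefix.

```sql
DECLARE @source nvarchar(100) = N'MANHATTAN';

WITH cte AS (
    -- Anchor: first character, position 1
    SELECT SUBSTRING(@source, 1, 1) AS c1, 1 AS Pos
    WHERE LEN(@source) > 0
    UNION ALL
    -- Recursive step: next character until the end of the string
    SELECT SUBSTRING(@source, Pos + 1, 1) AS c1, Pos + 1 AS Pos
    FROM cte
    WHERE Pos < LEN(@source)
)
SELECT c1
FROM cte
GROUP BY c1          -- one row per distinct letter
ORDER BY MIN(Pos);   -- first occurrence decides the order: M, A, N, H, T
```

For strings longer than 100 characters the query would also need OPTION (MAXRECURSION 0), since each character costs one recursion level.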
declare @charr varchar(99)
declare @lp int
set @charr='Manhattan'
set @lp=1
DECLARE @T1 TABLE (
FLD VARCHAR(max)
)
while(@lp<=LEN(@charr))
begin
if(not exists(select * from @T1 where FLD=(select SUBSTRING(@charr,@lp,1))))
begin
insert into @T1
select SUBSTRING(@charr,@lp,1)
end
set @lp=@lp+1
end
select * from @T1
Check this; it may help you.
Here's an Oracle version of @user2864740's answer. The only difference is how you construct the "numbers table" (plus slight differences in aliasing)
select letter
from (
select ns.n as pos, substr(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s from dual) ss
cross join (
SELECT LEVEL as n
FROM DUAL
CONNECT BY LEVEL <= 9
ORDER BY LEVEL) ns
) pairs
group by letter
order by min(pos)