Split one record into multiple rows - sql

Can somebody help me split records into multiple rows?
My records look like this
321517 2013 SEPTEMBER 3 30 286787 321517-2013
321517 2013 SEPTEMBER 2 42 286787 321517-2013
I want them to look like this
321517 2013 SEPTEMBER 1 30 286787 321517-2013
321517 2013 SEPTEMBER 1 30 286787 321517-2013
321517 2013 SEPTEMBER 1 30 286787 321517-2013
321517 2013 SEPTEMBER 1 42 286787 321517-2013
321517 2013 SEPTEMBER 1 42 286787 321517-2013

You can get the max possible value and then make a recursive CTE to generate rows.
;WITH MAX_VALUE AS (
SELECT MAX(C4) AS VAL FROM Table1
),
TMP_ROWS AS (
SELECT 1 AS PARENT, 0 AS LVL, 1 AS ID
UNION ALL
SELECT
CHILD.PARENT,
TMP_ROWS.LVL + 1 AS LVL,
TMP_ROWS.ID
FROM (SELECT 1 AS PARENT, 1 AS ID, 0 AS NIVEL) AS CHILD
INNER JOIN TMP_ROWS ON CHILD.PARENT = TMP_ROWS.ID
WHERE TMP_ROWS.LVL < (SELECT VAL FROM MAX_VALUE)
)
select C1, C2, C3, 1 C4, C5, C6, C7
from Table1 join TMP_ROWS on C4 > TMP_ROWS.LVL
order by C1, C2, C3, C5, C6, C7
Demo (based on the previous reply's data)
*Edit: "ROWS" isn't a good name for a table.
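The row-generation idea above can be tried out with a simpler counter CTE. This is only a sketch under assumptions: it runs on SQLite through Python's sqlite3 module rather than on SQL Server (SQLite needs the RECURSIVE keyword, T-SQL does not), and the CTE name counter and the untyped column definitions are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names C1..C7 follow the answer; the table layout is assumed.
conn.execute("CREATE TABLE Table1 (C1, C2, C3, C4, C5, C6, C7)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (321517, 2013, "SEPTEMBER", 3, 30, 286787, "321517-2013"),
        (321517, 2013, "SEPTEMBER", 2, 42, 286787, "321517-2013"),
    ],
)

# The recursive CTE counts 0 .. MAX(C4) - 1, so a source row with C4 = n
# joins to exactly n counter values and is emitted n times.
split_rows = conn.execute(
    """
    WITH RECURSIVE counter(lvl) AS (
        SELECT 0
        UNION ALL
        SELECT lvl + 1 FROM counter
        WHERE lvl + 1 < (SELECT MAX(C4) FROM Table1)
    )
    SELECT C1, C2, C3, 1 AS C4, C5, C6, C7
    FROM Table1 JOIN counter ON C4 > counter.lvl
    ORDER BY C5, counter.lvl
    """
).fetchall()

for row in split_rows:
    print(row)
```

With the two sample records this prints five rows: three with C5 = 30 and two with C5 = 42, each with C4 set to 1.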

You can try something like this. Please note that this query assumes maximum value of your 4th Column is 10. You can add more rows to the CTE using a cross join if you have higher values.
;WITH CTE AS (
select Digit
from ( values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS t(Digit)
)
select C1, C2, C3, 1 C4, C5, C6, C7
from Table1 join CTE on C4 > Digit
order by C1
Fiddle demo on sql server 2008
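The cross join mentioned above can be sketched as follows, extending the digit list to the numbers 0 to 99. This is an illustration under assumptions: it runs on SQLite via Python's sqlite3 module instead of SQL Server 2008, the table is cut down to three columns, and the names digits, numbers, tens and ones are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of Table1: a key, the repeat count, a payload.
conn.execute("CREATE TABLE Table1 (C1, C4, C5)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [("a", 3, 30), ("b", 12, 42)],
)

rows = conn.execute(
    """
    WITH digits(d) AS (
        VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)
    ),
    numbers(n) AS (
        -- Cross joining the digit list with itself yields 0 .. 99.
        SELECT tens.d * 10 + ones.d
        FROM digits AS tens CROSS JOIN digits AS ones
    )
    SELECT C1, 1 AS C4, C5
    FROM Table1 JOIN numbers ON C4 > n
    ORDER BY C1
    """
).fetchall()

print(len(rows))
```

Each extra cross join with digits multiplies the supported maximum by ten, so the row with C4 = 12 is expanded correctly here even though it exceeds the single-digit limit.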

Related

How to select first 5 records and group rest others records in sql?

Suppose I have 2 columns NAME and COUNT.
NAME COUNT
a1 2
a2 4
a3 5
a4 1
a5 6
a6 2
a7 4
a8 6
a9 7
a10 4
a11 1
I want to select the first 5 records and group all the remaining records into a single record named "others".
The output I need is
NAME COUNT
a1 2
a2 4
a3 5
a4 1
a5 6
others 24
In others I need sum of all the count values excluding first 5 records.
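We can use a union approach with the help of ROW_NUMBER(). Note that ORDER BY NAME sorts the names as strings (a1, a10, a11, a2, ...), so if "first 5" is meant in numeric or insertion order, order by a suitable id column instead: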
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (ORDER BY NAME) rn
FROM yourTable t
)
SELECT NAME, COUNT
FROM
(
SELECT NAME, COUNT, 1 AS pos FROM cte WHERE rn <= 5
UNION ALL
SELECT 'others', SUM(COUNT), 2 FROM cte WHERE rn > 5
) t
ORDER BY pos, NAME;
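Here is a runnable sketch of the same union approach. It is not the answer's exact query: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, it assumes an extra id column to define what "first 5" means, and it renames the COUNT column to cnt. All of those names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is an assumed ordering column; the question's table has only NAME and COUNT.
conn.execute("CREATE TABLE yourTable (id INTEGER PRIMARY KEY, name TEXT, cnt INTEGER)")
counts = [2, 4, 5, 1, 6, 2, 4, 6, 7, 4, 1]
conn.executemany(
    "INSERT INTO yourTable (id, name, cnt) VALUES (?, ?, ?)",
    [(i, f"a{i}", c) for i, c in enumerate(counts, start=1)],
)

result = conn.execute(
    """
    WITH cte AS (
        SELECT name, cnt, ROW_NUMBER() OVER (ORDER BY id) AS rn
        FROM yourTable
    )
    SELECT name, cnt
    FROM (
        -- The first five rows are passed through unchanged.
        SELECT name, cnt, 1 AS pos, rn FROM cte WHERE rn <= 5
        UNION ALL
        -- Everything after row five collapses into one summary row.
        SELECT 'others', SUM(cnt), 2, NULL FROM cte WHERE rn > 5
    ) AS t
    ORDER BY pos, rn
    """
).fetchall()

for name, cnt in result:
    print(name, cnt)
```

The pos column keeps the summary row last regardless of how its label sorts against the other names.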

Get DISTINCT COUNT in one pass in SQL Server

I have a table like below:
Region Country Manufacturer Brand Period Spend
R1 C1 M1 B1 2016 5
R1 C1 M1 B1 2017 10
R1 C1 M1 B1 2017 20
R1 C1 M1 B2 2016 15
R1 C1 M1 B3 2017 20
R1 C2 M1 B1 2017 5
R1 C2 M2 B4 2017 25
R1 C2 M2 B5 2017 30
R2 C3 M1 B1 2017 35
R2 C3 M2 B4 2017 40
R2 C3 M2 B5 2017 45
...
I wrote the query below to aggregate them:
SELECT [Region]
,[Country]
,[Manufacturer]
,[Brand]
,Period
,SUM([Spend]) AS [Spend]
FROM myTable
GROUP BY [Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
ORDER BY 1,2,3,4
which yields something like below:
Region Country Manufacturer Brand Period Spend
R1 C1 M1 B1 2016 5
R1 C1 M1 B1 2017 30 -- this row is an aggregate from raw table above
R1 C1 M1 B2 2016 15
R1 C1 M1 B3 2017 20
R1 C2 M1 B1 2017 4 -- aggregated result
R1 C2 M2 B4 2017 25
R1 C2 M2 B5 2017 30
R2 C3 M2 B4 2017 40
R2 C3 M2 B5 2017 45
I'd like to add another column to the above table that shows the DISTINCT COUNT of Brand grouped by Region, Country, Manufacturer and Period. So the final table would become as follows:
Region Country Manufacturer Brand Period Spend UniqBrandCount
R1 C1 M1 B1 2016 5 2 -- two brands by R1, C1, M1 in 2016
R1 C1 M1 B1 2017 30 1
R1 C1 M1 B2 2016 15 2 -- same as first row's result
R1 C1 M1 B3 2017 20 1
R1 C2 M1 B1 2017 4 1
R1 C2 M2 B4 2017 25 2
R1 C2 M2 B5 2017 30 2
R2 C3 M2 B4 2017 40 2
R2 C3 M2 B5 2017 45 2
I know how to get to final result in three steps.
Run this query (Query #1):
SELECT [Region]
,[Country]
,[Manufacturer]
,[Period]
,COUNT(DISTINCT [Brand]) AS [BrandCount]
INTO Temp1
FROM myTable
GROUP BY [Region]
,[Country]
,[Manufacturer]
,[Period]
Run this query (Query #2)
SELECT [Region]
,[Country]
,[Manufacturer]
,[Brand]
,YEAR([Period]) AS Period
,SUM([Spend]) AS [Spend]
INTO Temp2
FROM myTable
GROUP BY [Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
Then LEFT JOIN Temp2 and Temp1 to bring in [BrandCount] from the latter like below:
SELECT a.*
,b.*
FROM Temp2 AS a
LEFT JOIN Temp1 AS b ON a.[Region] = b.[Region]
AND a.[Country] = b.[Country]
AND a.[Manufacturer] = b.[Manufacturer]
AND a.[Period] = b.[Period]
I'm pretty sure there is a more efficient way to do this, is there? Thank you in advance for your suggestions/answers!
Borrowing heavily from this question: https://dba.stackexchange.com/questions/89031/using-distinct-in-window-function-with-over
COUNT(DISTINCT ...) isn't allowed as a window function in SQL Server, so dense_rank is required. Ranking the brands in forward and in reverse order, adding the two ranks together, and then subtracting 1 gives the distinct count.
Your sum function can also be rewritten using PARTITION BY logic. This way you can use different grouping levels for each aggregation:
SELECT
[Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
,dense_rank() OVER
(PARTITION BY
[Region]
,[Country]
,[Manufacturer]
,[Period] Order by Brand)
+ dense_rank() OVER
(PARTITION BY
[Region]
,[Country]
,[Manufacturer]
,[Period] Order by Brand Desc)
- 1
AS [BrandCount]
,SUM([Spend]) OVER
(PARTITION BY
[Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]) as [Spend]
from
myTable
ORDER BY 1,2,3,4
You may then need to reduce the number of rows in your output, as this syntax gives the same number of rows as myTable, but with the aggregation totals appearing on each row they apply to:
R1 C1 M1 B1 2016 2 5
R1 C1 M1 B1 2017 2 30 --dup1
R1 C1 M1 B1 2017 2 30 --dup1
R1 C1 M1 B2 2016 2 15
R1 C1 M1 B3 2017 2 20
R1 C2 M1 B1 2017 1 5
R1 C2 M2 B4 2017 2 25
R1 C2 M2 B5 2017 2 30
R2 C3 M1 B1 2017 1 35
R2 C3 M2 B4 2017 2 40
R2 C3 M2 B5 2017 2 45
Selecting distinct rows from this output gives you what you need.
How the dense_rank trick works
Consider this data:
Col1 Col2
B 1
B 1
B 3
B 5
B 7
B 9
dense_rank() ranks data according to the number of distinct items before the current one, plus 1. So:
1->1, 3->2, 5->3, 7->4, 9->5.
In reverse order (using desc) this yields the reverse pattern:
1->5, 3->4, 5->3, 7->2, 9->1.
Adding the two ranks together gives the same value for every row:
1+5 = 2+4 = 3+3 = 4+2 = 5+1 = 6
Spelling it out in words helps here:
(number of distinct items before + 1) + (number of distinct items after + 1)
= number of distinct OTHER items before AND after + 2
= Total number of distinct items + 1
So to get the total number of distinct items, add the ascending and descending dense_ranks together and subtract 1.
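The arithmetic above can be checked on the six-row sample. This sketch assumes SQLite 3.25 or later (for window functions) driven from Python's sqlite3 module rather than SQL Server; the table name demo and the column aliases are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO demo VALUES ('B', ?)",
    [(v,) for v in (1, 1, 3, 5, 7, 9)],
)

rows = conn.execute(
    """
    SELECT col2,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2)      AS rank_asc,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2 DESC) AS rank_desc,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2)
         + DENSE_RANK() OVER (PARTITION BY col1 ORDER BY col2 DESC)
         - 1                                                         AS distinct_count
    FROM demo
    ORDER BY col2
    """
).fetchall()

for row in rows:
    print(row)
```

Every row reports distinct_count = 5, the number of distinct values in Col2, even though the value 1 appears twice.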
The window-functions tag on your question suggests you already have a pretty good idea.
For DISTINCT COUNT of Brand grouped by Region,Country,Manufacturer and Period: you may write:
Select Region
,Country
,Manufacturer
,Brand
,Period
,Spend
,DENSE_RANK() Over (Partition By Region, Country, Manufacturer, Period Order By Brand asc)
+ DENSE_RANK() Over (Partition By Region, Country, Manufacturer, Period Order By Brand desc)
-1 UniqBrandCount
From myTable T1
Order By 1,2,3,4
The double dense_rank idea means that you need two sorts (assuming no index exists that provides sort order). Assuming no NULL brands (as that idea does) you can use a single dense_rank and a windowed MAX as below (demo)
WITH T1
AS (SELECT *,
DENSE_RANK() OVER (PARTITION BY [Region], [Country], [Manufacturer], [Period] ORDER BY Brand) AS [dr]
FROM myTable),
T2
AS (SELECT *,
MAX([dr]) OVER (PARTITION BY [Region], [Country], [Manufacturer], [Period]) AS UniqBrandCount
FROM T1)
SELECT [Region],
[Country],
[Manufacturer],
[Brand],
Period,
SUM([Spend]) AS [Spend],
MAX(UniqBrandCount) AS UniqBrandCount
FROM T2
GROUP BY [Region],
[Country],
[Manufacturer],
[Brand],
[Period]
ORDER BY [Region],
[Country],
[Manufacturer],
[Period],
Brand
The above has some inevitable spooling (it isn't possible to do this in a 100% streaming manner) but a single sort.
Strangely the final order by clause is needed to keep the number of sorts down to one (or zero if a suitable index exists).
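To see what the single dense_rank plus windowed MAX query returns, here is a sketch that runs it against the eleven sample rows from the question. It uses SQLite (3.25 or later) via Python's sqlite3 module as a stand-in for SQL Server, so it says nothing about sorts or spooling in the SQL Server plan; the CTE names ranked and counted are invented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (Region TEXT, Country TEXT, Manufacturer TEXT,"
    " Brand TEXT, Period INTEGER, Spend INTEGER)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("R1", "C1", "M1", "B1", 2016, 5), ("R1", "C1", "M1", "B1", 2017, 10),
        ("R1", "C1", "M1", "B1", 2017, 20), ("R1", "C1", "M1", "B2", 2016, 15),
        ("R1", "C1", "M1", "B3", 2017, 20), ("R1", "C2", "M1", "B1", 2017, 5),
        ("R1", "C2", "M2", "B4", 2017, 25), ("R1", "C2", "M2", "B5", 2017, 30),
        ("R2", "C3", "M1", "B1", 2017, 35), ("R2", "C3", "M2", "B4", 2017, 40),
        ("R2", "C3", "M2", "B5", 2017, 45),
    ],
)

rows = conn.execute(
    """
    WITH ranked AS (
        SELECT *, DENSE_RANK() OVER (
                      PARTITION BY Region, Country, Manufacturer, Period
                      ORDER BY Brand) AS dr
        FROM myTable
    ),
    counted AS (
        -- The highest dense rank in a partition equals its distinct brand count.
        SELECT *, MAX(dr) OVER (
                      PARTITION BY Region, Country, Manufacturer, Period
                  ) AS UniqBrandCount
        FROM ranked
    )
    SELECT Region, Country, Manufacturer, Brand, Period,
           SUM(Spend) AS Spend, MAX(UniqBrandCount) AS UniqBrandCount
    FROM counted
    GROUP BY Region, Country, Manufacturer, Brand, Period
    ORDER BY Region, Country, Manufacturer, Period, Brand
    """
).fetchall()

for row in rows:
    print(row)
```

The eleven input rows collapse to ten output rows, because the two R1/C1/M1/B1/2017 rows are summed to a Spend of 30.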

Oracle left join returning no rows

I have a (now thoroughly derived) CTE:
feature_id | function_id | group_id | subgroup_id | type_id
1 1 null 1 null
1 1 null null 14
2 1 null 5 null
2 1 null null 21
3 1 null 7 null
3 1 null null 5
I am trying to collate the rows together using this:
select C1.feature_id, C1.function_Id, C2.Group_ID, C3.Subgroup_ID, C4.Type_id
from CTE C1
left join CTE C2
on C1.feature_id = C2.feature_id
and c1.function_id = c2.function_id
and c2.group_id is not null
left join CTE C3
on C1.feature_id = C3.feature_id
and c1.function_id = c3.function_id
and c3.subgroup_id is not null
left join CTE C4
on C1.feature_id = C4.feature_id
and c1.function_id = c4.function_id
and c4.type_id is not null
This gives me 0 rows...
To validate, I ran:
select *
from CTE C1
236 rows selected
Can anyone help? Surely the rows from C1 should be coming back...
EDIT: Fixed it with the Oracle syntax:
select C1.feature_id, C1.function_Id, C2.Group_ID, C3.Subgroup_ID, C4.Type_id
from CTE C1, CTE C2, CTE C3, CTE C4
where C1.feature_id = C2.feature_id(+)
and c1.function_id = c2.function_id(+)
and C2.group_id(+) is not null
...
(I hate the oracle syntax)

how to group data based on its sequence and group by other columns

I have a table with 3 columns c1,c2,c3 in Oracle like below:
c1 c2 c3
1 34 2
2 34 2
3 34 2
4 24 2
5 24 2
6 34 2
7 34 2
8 34 1
I need to group the rows and get the min and max value of c1 for each consecutive run of rows (in c1 order) that share the same c2 and c3.
i.e., I need the result as below:
c1_min c1_max c2 c3
1 3 34 2
4 5 24 2
6 7 34 2
8 8 34 1
There are a number of ways to approach a gaps-and-islands problem. As an alternative to Sylvain's lag version - not better, just different - you can use a trick with row numbers calculated analytically based on your grouping fields. This adds a 'chain' pseudocolumn to the table values, which will be unique for each contiguous group of c2/c3 pairs:
select c1, c2, c3,
dense_rank() over (partition by c2, c3 order by c1)
- dense_rank() over (partition by null order by c1) as chain
from t42
order by c1, c2, c3;
(I can't take credit for this - I first saw it here). You can then use that as an inline view to calculate the min and max of each group:
select min(c1) as c1_min, max(c1) as c1_max, c2, c3
from (
select c1, c2, c3,
dense_rank() over (partition by c2, c3 order by c1)
- dense_rank() over (partition by null order by c1) as chain
from t42
)
group by c2, c3, chain
order by c1_min;
C1_MIN C1_MAX C2 C3
---------- ---------- ---------- ----------
1 3 34 2
4 5 24 2
6 7 34 2
8 8 34 1
SQL Fiddle showing the intermediate stage too.
You can use other analytic functions like row_number() instead of dense_rank(); they may give slightly different results for some data, but you get the same result with this sample.
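The chain trick can be reproduced on the eight sample rows as below. This is a sketch under assumptions: it runs on SQLite 3.25 or later through Python's sqlite3 module rather than Oracle, it drops the "partition by null" clause (an OVER clause with only ORDER BY covers the whole table), and the inline view alias chained is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t42 (c1 INTEGER, c2 INTEGER, c3 INTEGER)")
conn.executemany(
    "INSERT INTO t42 VALUES (?, ?, ?)",
    [(1, 34, 2), (2, 34, 2), (3, 34, 2), (4, 24, 2),
     (5, 24, 2), (6, 34, 2), (7, 34, 2), (8, 34, 1)],
)

rows = conn.execute(
    """
    SELECT MIN(c1) AS c1_min, MAX(c1) AS c1_max, c2, c3
    FROM (
        -- Within a contiguous run both ranks grow in step, so their
        -- difference stays constant; it changes when another c2/c3 pair
        -- interrupts the run.
        SELECT c1, c2, c3,
               DENSE_RANK() OVER (PARTITION BY c2, c3 ORDER BY c1)
             - DENSE_RANK() OVER (ORDER BY c1) AS chain
        FROM t42
    ) AS chained
    GROUP BY c2, c3, chain
    ORDER BY c1_min
    """
).fetchall()

for row in rows:
    print(row)
```

For this sample the chain values are 0 for c1 1 to 3, -3 for 4 to 5, -2 for 6 to 7 and -7 for 8, which is why the two separate (34, 2) runs end up in different groups.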
If I understand correctly, you want to group consecutive rows together. This is far from trivial, or at least I can't find a simple way of doing it right now. For ease of understanding, I will break the query into several steps:
Step 1:
The first thing is to identify your "groups" boundaries. Using the LAG analytic function might help you here:
SELECT
CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
THEN 0
ELSE 1
END CLK,
T.* FROM T
ORDER BY "c1"
Step 2:
The second step is to number each of your groups. A running SUM of CLK over the rows ordered by "c1" will do the trick. That leads to:
SELECT SUM(CLK) OVER (ORDER BY "c1"
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) GRP,
V.*
FROM (
SELECT
CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
THEN 0
ELSE 1
END CLK,
T.* FROM T
) V
ORDER BY "c1";
Final step:
Finally, you can wrap that in a simple GROUP BY query to obtain the desired output:
SELECT MIN("c1"), MAX("c1"), "c2", "c3" FROM
(
SELECT SUM(CLK) OVER (ORDER BY "c1"
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) GRP,
V.*
FROM (
SELECT
CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
THEN 0
ELSE 1
END CLK,
T.* FROM T
) V
)
GROUP BY GRP, "c2", "c3"
ORDER BY GRP
See http://sqlfiddle.com/#!4/7d57c/10
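The three steps can be run end to end on the sample data as sketched below. The assumptions: SQLite 3.25 or later via Python's sqlite3 module stands in for Oracle, the column names are left unquoted, and the inline view aliases V and W are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (c1 INTEGER, c2 INTEGER, c3 INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [(1, 34, 2), (2, 34, 2), (3, 34, 2), (4, 24, 2),
     (5, 24, 2), (6, 34, 2), (7, 34, 2), (8, 34, 1)],
)

rows = conn.execute(
    """
    SELECT MIN(c1), MAX(c1), c2, c3
    FROM (
        -- Step 2: a running total of the boundary flags numbers the groups.
        SELECT SUM(clk) OVER (ORDER BY c1
                              ROWS BETWEEN UNBOUNDED PRECEDING
                                       AND CURRENT ROW) AS grp,
               V.*
        FROM (
            -- Step 1: flag a row with 1 when it starts a new group. On the
            -- first row LAG returns NULL, so the ELSE branch applies.
            SELECT CASE WHEN LAG(c2, 1) OVER (ORDER BY c1) = c2
                         AND LAG(c3, 1) OVER (ORDER BY c1) = c3
                        THEN 0 ELSE 1 END AS clk,
                   T.*
            FROM T
        ) AS V
    ) AS W
    GROUP BY grp, c2, c3
    ORDER BY grp
    """
).fetchall()

for row in rows:
    print(row)
```

The flags come out as 1, 0, 0, 1, 0, 1, 0, 1, so the running totals assign group numbers 1, 1, 1, 2, 2, 3, 3, 4.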

SQL Lookup T1.ColumnA vs T2.ColumnA, if all 5 values in T1.ColumnA are listed in T2.ColumnA, Replace T1.ColumnA with "ALL"

I am trying to perform a lookup of specific values. If all the Region values from the lookup table (Table2) are listed in Table1, replace the Table1.Region field with the value 'ALL'. In other words, when all 5 regions are listed, group them and substitute 'ALL' for the region. When not all regions are listed, keep the region code.
Year Region Value
2012 A1 24
2012 B2 24
2012 C3 24
2012 D4 24
2012 E5 24
2012 A1 36
2012 B2 36
2012 C3 36
2012 D4 36
2012 E5 36
2013 A1 24
2013 B2 24
Lookup Table
Region Value
1 A1
2 B2
3 C3
4 D4
5 E5
Result Desired
Year Region Term
2012 ALL 24 <-- Note region change to all because there are 5 regions per above
2012 ALL 36 <-- Note region change to all because there are 5 regions per above
2013 A1 24 <-- Region did not change because there were only 2 regions in source
2013 B2 24 <-- Region did not change because there were only 2 regions in source
I know how to group data but the grouped data needs to be consolidated further and replaced with ALL
Thanks for any direction!
Carlos
There's a little trick to this:
(@var IS NULL OR Table1.Region = @var)
You need to translate 'ALL' to NULL, so that when 'ALL' is selected it cancels out the filter.
I'd imagine you're either using SSRS, or a drop down above this, but here's a link:
https://www.simple-talk.com/content/print.aspx?article=835#hr2
You did not specify what RDBMS you are using, but you should be able to use the following in most database products:
select distinct t1.year,
case
when t2.year is null and t2.value is null
then t1.region
else 'ALL' End Region,
t1.value
from table1 t1
left join
(
select year, value
from table1
group by year, value
having count(distinct region) = (select count(distinct value)
from table2)
) t2
on t1.year = t2.year
and t1.value = t2.value
See SQL Fiddle with Demo
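The query above can be tried against the sample data from the question. This sketch assumes SQLite driven from Python's sqlite3 module; the lookup table's columns are taken to be region (the number) and value (the code), as in the question's listing, and the output alias region_out is invented to avoid clashing with t1.region.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (year INTEGER, region TEXT, value INTEGER)")
conn.execute("CREATE TABLE table2 (region INTEGER, value TEXT)")

codes = ["A1", "B2", "C3", "D4", "E5"]
conn.executemany("INSERT INTO table2 VALUES (?, ?)", list(enumerate(codes, start=1)))

data = [(2012, code, value) for value in (24, 36) for code in codes]
data += [(2013, "A1", 24), (2013, "B2", 24)]
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", data)

rows = conn.execute(
    """
    SELECT DISTINCT t1.year,
           CASE WHEN t2.year IS NULL AND t2.value IS NULL
                THEN t1.region
                ELSE 'ALL' END AS region_out,
           t1.value
    FROM table1 AS t1
    LEFT JOIN (
        -- year/value combinations that cover every region in the lookup table
        SELECT year, value
        FROM table1
        GROUP BY year, value
        HAVING COUNT(DISTINCT region) = (SELECT COUNT(DISTINCT value) FROM table2)
    ) AS t2
      ON t1.year = t2.year AND t1.value = t2.value
    ORDER BY 1, 2, 3
    """
).fetchall()

for row in rows:
    print(row)
```

Note that the comparison only counts distinct regions, so it relies on Table1 containing no region codes that are missing from the lookup table.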
Use this query with the APPLY and EXCEPT operators (SQL Server):
SELECT t1.Year, CASE WHEN o.Value IS NULL THEN 'ALL' ELSE t1.Region END AS Region,
t1.Value
FROM dbo.test7 t1 OUTER APPLY (
SELECT Value
FROM test8
EXCEPT
SELECT Region
FROM test7 t2
WHERE t1.Year = t2.Year
AND t1.Value = t2.Value
) o
GROUP BY t1.Year, CASE WHEN o.Value IS NULL THEN 'ALL' ELSE t1.Region END,
t1.Value
Demo on SQLFiddle