Multiply rows in single query - sql

Table1 has the following 2 columns and 4 rows:
Entity Number
------ ------
Car 4
Shop 1
Apple 3
Pear 1
I'd like a single set-based SQL query that produces the desired result below: each Entity repeated as many times as the value in its Number column.
The only way I found was to loop through the rows one by one, which is neither elegant nor set-based.
Desired result:
Entity
------
Car
Car
Car
Car
Shop
Apple
Apple
Apple
Pear

One method uses recursive CTEs:
with cte as (
select t1.entity, t1.number
from table1 t1
union all
select cte.entity, cte.number - 1
from cte
where cte.number > 1
)
select entity
from cte;
Note: With the default settings, recursion stops after 100 levels, so this is limited to about 100 rows per entity. You can use OPTION (MAXRECURSION 0) to get around this.
You can also solve this with a numbers table, but such a problem is a good introduction to recursive CTEs.
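To check the row counts the recursive approach produces, here is a minimal sketch that runs the same idea in SQLite through Python's sqlite3 module rather than in SQL Server. It assumes the sample data from the question, and it assumes a stop condition of number > 1 so that an entity with Number = 4 yields exactly four rows (the anchor row plus three recursive ones). SQLite requires the RECURSIVE keyword and has no MAXRECURSION option, so treat this as an illustration of the logic only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (entity TEXT, number INTEGER);
    INSERT INTO table1 VALUES ('Car', 4), ('Shop', 1), ('Apple', 3), ('Pear', 1);
""")

# The anchor member emits every row once; each recursive step emits one more
# copy while the remaining count is still above 1.
rows = conn.execute("""
    WITH RECURSIVE cte(entity, number) AS (
        SELECT entity, number FROM table1
        UNION ALL
        SELECT entity, number - 1 FROM cte WHERE number > 1
    )
    SELECT entity FROM cte ORDER BY entity
""").fetchall()

# Count how many copies of each entity came back.
counts = {}
for (entity,) in rows:
    counts[entity] = counts.get(entity, 0) + 1
print(counts)
```

Note that in this sketch a row with Number = 0 would still appear once, because the anchor member always emits it; add a filter on the anchor if zero counts are possible.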

Use this
;WITH CTE
AS
(
SELECT
SeqNo = 1,
Entity,
Number
FROM YourTable
UNION ALL
SELECT
SeqNo = SeqNo+1,
Entity,
Number
FROM CTE
WHERE SeqNo < Number
)
SELECT
Entity
FROM CTE
ORDER BY 1

A non-recursive solution is to use a fixed sequence of numbers and join the table to it, like this:
WITH numbers
AS
(
SELECT n
FROM (VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9), (10)) AS numbers(n)
)
SELECT t.Entity
FROM Table1 AS t
INNER JOIN numbers as n ON t.number >= n.n;
This supports up to 10 copies per row; add more numbers to the list to support higher counts.

You can use master..spt_values as the source for a numbers table:
select EntityList.*
from EntityList
, (
select number as n from master..spt_values WHERE Type = 'P' and Number between 1 and (select max(number) from EntityList)
) t
where n <= number
order by entity

Related

Random Samples of XX rows per Column Value

I'm using T-SQL and require some sample output of random rows.
Typically I would write some SQL as per below
Select top 10 *
from SampleTable as ST
Order by NewID()
However, this time I want, say, 100 rows split equally by the value of another column, for instance the column 'Type'.
For example: 100 rows made up of 25 rows for Type A, 25 rows for Type B, 25 rows for Type C and 25 rows for Type D.
My 'Type' values are saved to a temp table
Select top 10 *
from SampleTable as ST
Inner Join #Types as TY
on TY.Type = ST.Type
Order by NewID()
I've seen NTILE but not sure if applicable for my problem.
Thanks.
Use ROW_NUMBER in conjunction with NEWID():
WITH cte AS (
SELECT ST.*, ROW_NUMBER() OVER (PARTITION BY ST.Type ORDER BY NEWID()) AS rn
FROM SampleTable AS ST
INNER JOIN #Types AS TY ON TY.Type = ST.Type
)
SELECT *
FROM cte
WHERE rn <= 25;
The above solution will return 25 records from each type (or however many fewer might be available), randomly.
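As a rough check of the "N random rows per group" behaviour, the sketch below runs the same ROW_NUMBER pattern in SQLite via Python's sqlite3 module. The table contents and the PER_TYPE value are made up for illustration, SQLite's random() stands in for NEWID(), and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SampleTable (id INTEGER PRIMARY KEY, type TEXT)")
# Hypothetical data: 10 rows of type A, 10 of type B, and only 1 of type C.
data = [("A",)] * 10 + [("B",)] * 10 + [("C",)]
conn.executemany("INSERT INTO SampleTable (type) VALUES (?)", data)

PER_TYPE = 3  # sample size per type (25 in the question)

# Number the rows of each type in random order, then keep the first PER_TYPE.
sample = conn.execute("""
    WITH cte AS (
        SELECT id, type,
               ROW_NUMBER() OVER (PARTITION BY type ORDER BY random()) AS rn
        FROM SampleTable
    )
    SELECT id, type FROM cte WHERE rn <= ?
""", (PER_TYPE,)).fetchall()

# Tally how many rows were sampled for each type.
per_type = {}
for _id, t in sample:
    per_type[t] = per_type.get(t, 0) + 1
print(per_type)
```

Types A and B each contribute three rows, while type C contributes only the single row it has, which matches the "however many fewer might be available" caveat.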

How to find Max value in a column in SQL Server 2012

I want to find the max value in a column
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
2 1 10 P2
3 2 50 P2
4 2 80 P1
Above is my table structure. I want only the row with the maximum Tot_Val for each CName. For example, IDs 1 and 2 have the same CName but different Tot_Val and PName values, and of those two rows I expect only the one with the higher Tot_Val.
Expected result:
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
4 2 80 P1
I need the result exactly as shown above.
select Max(Tot_Val), CName
from table1
where PName in ('P1', 'P2')
group by CName
This is the query I have tried, but my problem is that I cannot include PName in the result. If I add PName to the select list and the group by list, the rows multiply: for example, the result is 100 rows without PName but 600 rows with it.
Can someone please help me resolve this?
One possible option is to use a subquery. Give each row a number within each CName group ordered by Tot_Val. Then select the rows with a row number equal to one.
select x.*
from ( select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt ) x
where x.No = 1;
An alternative would be to use a common table expression (CTE) instead of a subquery to isolate the first result set.
with x as
(
select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt
)
select x.*
from x
where x.No = 1;
See both solutions in action in this fiddle.
You can search top-n-per-group for this kind of a query.
There are two common ways to do it. The most efficient method depends on your indexes and data distribution and whether you already have another table with the list of all CName values.
Using ROW_NUMBER
WITH
CTE
AS
(
SELECT
ID, CName, Tot_Val, PName,
ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
FROM table1
)
SELECT
ID, CName, Tot_Val, PName
FROM CTE
WHERE rn=1
;
Using CROSS APPLY
WITH
CTE
AS
(
SELECT CName
FROM table1
GROUP BY CName
)
SELECT
A.ID
,A.CName
,A.Tot_Val
,A.PName
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
table1.ID
,table1.CName
,table1.Tot_Val
,table1.PName
FROM table1
WHERE
table1.CName = CTE.CName
ORDER BY
table1.Tot_Val DESC
) AS A
;
See a very detailed answer on dba.se, Retrieving n rows per group, or here: Get top 1 row of each group.
CROSS APPLY might be as fast as a correlated subquery, but the correlated subquery often has very good performance (and better than ROW_NUMBER()):
select t.*
from t
where t.tot_val = (select max(t2.tot_val)
from t t2
where t2.cname = t.cname
);
Note: The performance depends on having an index on (cname, tot_val).
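The correlated-subquery form can be tried against the sample rows from the question with the sketch below, which uses SQLite through Python's sqlite3 module instead of SQL Server. The table name t follows the query above, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (ID INTEGER, CName INTEGER, Tot_Val INTEGER, PName TEXT);
    INSERT INTO t VALUES (1, 1, 100, 'P1'), (2, 1, 10, 'P2'),
                         (3, 2, 50, 'P2'), (4, 2, 80, 'P1');
    -- Hypothetical index name; covers the lookup made by the subquery.
    CREATE INDEX ix_t_cname_totval ON t (CName, Tot_Val);
""")

# Keep each row whose Tot_Val equals the maximum Tot_Val within its CName.
rows = conn.execute("""
    SELECT t.ID, t.CName, t.Tot_Val, t.PName
    FROM t
    WHERE t.Tot_Val = (SELECT MAX(t2.Tot_Val) FROM t AS t2 WHERE t2.CName = t.CName)
    ORDER BY t.ID
""").fetchall()
print(rows)  # [(1, 1, 100, 'P1'), (4, 2, 80, 'P1')]
```

One difference from the ROW_NUMBER version: if two rows in a group tie for the maximum Tot_Val, this form returns both of them, whereas filtering on a row number of 1 returns exactly one row per group.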

Split one row into multiple rows in SQL with amounts divided equally

I have a table that contains ID, an amount column and a count column. For each row I would like to split them into multiple rows, based on the count column. I would then like the amount column to be split evenly between these rows, and create a new id based on the original id and the row count.
This is how the table looks like:
ID Amount Count
1001 8 2
1002 15 3
And this is the desired output
ID Amount
1001-1 4
1001-2 4
1002-1 5
1002-2 5
1002-3 5
What's the best approach for this?
You can use a recursive CTE. This looks something like:
with recursive cte as (
select id, amount / cnt as amount, cnt, 1 as lev
from t
union all
select id, amount, cnt, lev + 1
from cte
where lev < cnt
)
select id || '-' || lev, amount
from cte;
Note that this uses standard syntax; the exact syntax might vary depending on your database.
Unfortunately, Redshift does not support recursive queries.
Here is another option using a temporary table of numbers.
create temp table tmp(n int);
insert into tmp(n) values (1), (2), (3), (4), ...; -- expand as needed
select concat(t.id, '-', p.n) id, t.amount/t.count amount
from mytable t
inner join tmp p on p.n <= t.count
order by t.id, p.n
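
The numbers-table join can be checked against the sample rows with the following sketch, run in SQLite through Python's sqlite3 module. It assumes the count column is named cnt (to avoid any clash with the COUNT function), uses the || operator in place of concat, and fills tmp with only four numbers, which is enough for this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- cnt is an assumed column name standing in for the question's Count column.
    CREATE TABLE mytable (id INTEGER, amount INTEGER, cnt INTEGER);
    INSERT INTO mytable VALUES (1001, 8, 2), (1002, 15, 3);
    CREATE TABLE tmp (n INTEGER);
    INSERT INTO tmp (n) VALUES (1), (2), (3), (4);
""")

# Each source row joins to the numbers 1..cnt, giving one output row per number.
rows = conn.execute("""
    SELECT t.id || '-' || p.n AS id, t.amount / t.cnt AS amount
    FROM mytable AS t
    INNER JOIN tmp AS p ON p.n <= t.cnt
    ORDER BY t.id, p.n
""").fetchall()
for row in rows:
    print(row)
```

With integer columns the division truncates, so an amount that is not a multiple of the count would lose its remainder; cast to a decimal type first if that matters.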

get ROW NUMBER of random records

For a simple SQL like,
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
how do I add row numbers to the results so that the rows are numbered 1, 2, and 3?
UPDATE:
I thought I could simplify my question as above, but it turns out to be more complicated. So here is a fuller version: I need to give three random picks (from MyTable) to each person, with a pick/row number of 1, 2, and 3, and there is no logical join between persons and picks.
SELECT * FROM Person
LEFT JOIN (
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
) D ON 1=1
The problems with the above SQL are:
- Obviously, the pick/row numbers 1, 2, and 3 still need to be added.
- Less obviously, the above SQL gives every person the same picks, whereas I need different picks for each person.
Here is a working SQL to test it out:
SELECT TOP 15 database_id, create_date, cs.name FROM sys.databases
CROSS apply (
SELECT top 3 Row_number()OVER(ORDER BY (SELECT NULL)) AS RowNo,*
FROM (SELECT top 3 name from sys.all_views ORDER BY NEWID()) T
) cs
So, Please help.
NOTE: This is NOT about MySQL but T-SQL. Their syntax is different, so the solution is different as well.
Add ROW_NUMBER to the outer query. Try this:
SELECT Row_number()OVER(ORDER BY (SELECT NULL)),*
FROM (SELECT TOP 3 MyId
FROM MyTable
ORDER BY Newid()) a
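Logically, the TOP keyword is processed after the SELECT list. If the row number were generated in the inner query, it would be assigned to all rows first and only then would the 3 random records be pulled. So you should not generate the row number in the original query.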
Update
It can be achieved through CROSS APPLY. Replace the column name in the WHERE clause inside the CROSS APPLY with a valid column name from the Person table.
SELECT *
FROM Person p
CROSS apply (SELECT Row_number()OVER(ORDER BY (SELECT NULL)) rn,*
FROM (SELECT TOP 3 MyId
FROM MyTable
WHERE p.some_col = p.some_col -- Replace it with some column from person table
ORDER BY Newid())a) cs
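
SQLite has no CROSS APPLY, so the sketch below swaps in a different technique to illustrate the same goal: it cross joins every person with every candidate row and numbers the candidates per person in random order with ROW_NUMBER. It runs through Python's sqlite3 module (SQLite 3.25 or later for window functions); the PersonId and Name columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and data for illustration only.
    CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Person (Name) VALUES ('Ann'), ('Bob');
    CREATE TABLE MyTable (MyId INTEGER PRIMARY KEY);
    INSERT INTO MyTable (MyId) VALUES (1), (2), (3), (4), (5), (6);
""")

# Shuffle the candidates independently for each person, then keep the first 3.
rows = conn.execute("""
    WITH ranked AS (
        SELECT p.PersonId, m.MyId,
               ROW_NUMBER() OVER (PARTITION BY p.PersonId ORDER BY random()) AS rn
        FROM Person AS p
        CROSS JOIN MyTable AS m
    )
    SELECT PersonId, rn, MyId
    FROM ranked
    WHERE rn <= 3
    ORDER BY PersonId, rn
""").fetchall()
for row in rows:
    print(row)
```

Each person gets pick numbers 1 to 3 with an independently shuffled set of MyId values. The trade-off is that the full cross product is materialized before filtering, which may be too expensive when both tables are large.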

Replace nested query to single select query

Consider the table fields as follows.
Appid Client_name is_real RTT
100 C1 1 1
200 C1 1 6
200 C2 1 7
100 C1 1 9
200 C1 0 7
Now I need the total number of unique real Appids in the table. An Appid record is real if 'is_real' is 1.
In the above table, we have only 3 real Appids, which are (100, C1), (200, C1) and (200, C2).
PostgreSQL command:
Select sum(r)
from (select count(is_real) as r from table group by Appid, Client_name) as t;
I don't want any recursive query. If you can fetch with single select query, it would be helpful.
Since you seem to define a unique id by (Appid, Client_name) (which is confusing, since you are mixing terms):
SELECT COUNT(DISTINCT (Appid, Client_name)) AS ct
FROM tbl
WHERE is_real = 1;
(Appid, Client_name) is a row-type expression, short for ROW(Appid, Client_name). Only distinct combinations are counted.
Another trick to get this done without subquery is to use a window function:
SELECT DISTINCT count(*) OVER () AS ct
FROM tbl
WHERE is_real = 1
GROUP BY Appid, Client_name;
But neither is going to be faster than using a subquery (which is not a recursive query):
SELECT count(*) AS ct
FROM (
SELECT 1
FROM tbl
WHERE is_real = 1
GROUP BY Appid, Client_name
) sub;
That's what I would use.
It's essential to understand the sequence of events in a SELECT query; see: Best way to get result count before LIMIT was applied
"total number of unique real Appid's in the table"
I assume is_real is 1 = true, 0 = false.
SELECT COUNT(DISTINCT Appid)
FROM table
WHERE is_real = 1;
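
The two readings of "unique real Appid" give different counts on the sample data, which the sketch below shows by running both queries in SQLite through Python's sqlite3 module. SQLite does not accept the row-value form COUNT(DISTINCT (Appid, Client_name)), so the GROUP BY subquery variant is used for the pair count; the table name tbl follows the earlier answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (Appid INTEGER, Client_name TEXT, is_real INTEGER, RTT INTEGER);
    INSERT INTO tbl VALUES (100, 'C1', 1, 1), (200, 'C1', 1, 6), (200, 'C2', 1, 7),
                           (100, 'C1', 1, 9), (200, 'C1', 0, 7);
""")

# Distinct (Appid, Client_name) pairs among the real rows.
pairs = conn.execute("""
    SELECT count(*) FROM (
        SELECT 1 FROM tbl WHERE is_real = 1 GROUP BY Appid, Client_name
    ) AS sub
""").fetchone()[0]

# Distinct Appid values alone among the real rows.
appids = conn.execute(
    "SELECT COUNT(DISTINCT Appid) FROM tbl WHERE is_real = 1"
).fetchone()[0]

print(pairs, appids)  # 3 2
```

Counting pairs gives 3, matching the (100, C1), (200, C1), (200, C2) listed in the question, while counting Appid alone gives 2.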