Using CROSS JOIN SEQUENCE to produce large duplicated tables

I have the following in an Azure notebook (Databricks SQL):
CREATE TABLE my_new_big_table AS
SELECT t.*
FROM my_table t
CROSS JOIN VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) v(i);
This duplicates my_table 10 times. How can I use a generator (SEQUENCE, EXPLODE) instead of a hard-coded list to produce 100, 1000, etc. copies?

With your code and my sample data, I got output in which every row is repeated 10 times.
To do this dynamically, you can use range(start, end), where end is exclusive, as below.
%sql
CREATE TABLE result2 AS
SELECT t.*
FROM sample1 as t
CROSS JOIN (select * from range(0,10)) v(i);
select * from result2;
(Or)
Use sequence(start, end) with explode(), as suggested by the community in the comments. Unlike range(), sequence() includes both endpoints, so sequence(1, 10) yields 10 rows.
create table result3 as
SELECT t.*
FROM sample1 as t
CROSS JOIN (select explode(sequence(1,10))) v(i);
select * from result3;
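To make the copy count a parameter (100, 1000, and so on), the same cross-join idea can be sketched outside Databricks. The block below is a minimal illustration using Python's built-in sqlite3 module as a stand-in, which is an assumption made only so the example runs anywhere. SQLite has no range() or sequence(), so a recursive CTE generates the numbers instead. The helper name duplicate_rows and the two-column sample1 table are hypothetical.

```python
import sqlite3


def duplicate_rows(conn, table, copies):
    """Return every row of `table` repeated `copies` times, tagged with a copy number."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The table name is interpolated into the SQL text, so pass trusted identifiers only.
    # The recursive CTE v(i) stands in for range()/sequence() and counts from 1 to `copies`.
    query = f"""
        WITH RECURSIVE v(i) AS (
            SELECT 1
            UNION ALL
            SELECT i + 1 FROM v WHERE i < ?
        )
        SELECT t.*, v.i
        FROM {table} AS t
        CROSS JOIN v
    """
    return conn.execute(query, (copies,)).fetchall()


# Hypothetical sample data: three rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample1 (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO sample1 VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

rows_x100 = duplicate_rows(conn, "sample1", 100)
rows_x1000 = duplicate_rows(conn, "sample1", 1000)
print(len(rows_x100), len(rows_x1000))  # 300 3000
```

In Databricks itself, the equivalent change is simply to widen the bounds passed to range() or sequence() in the queries above.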

Related

MS SQL Server how to instantly insert a 1 to 10 number column to table (virtual column)?

Any idea how to add, on the fly, a number column (1 to 10) for each row of an existing table?

You can use a CROSS JOIN in concert with an ad-hoc tally table.
Example:
Select A.*
,B.Code
From YourTable A
Cross Join ( Select Top 10 Code=row_number() Over (Order By (Select NULL)) From master..spt_values n1 ) B

You can generate the rows with a recursive query, then cross join that with your table.
with codes as (
select 1 code
union all select code + 1 from codes where code < 10
)
select t.*, c.code
from mytable t
cross join codes c
For a small number of rows, I would expect the recursive query to be faster than TOP 10 against a large table.

Is there a SQL function to expand table?

I vaguely remember there being a function that does this, but I think I may be going crazy.
Say I have a datatable, call it table1. It has three columns: column1, column2, column3. The query
SELECT * FROM table1
returns all rows/columns from table1. Isn't there some type of EXPAND function that allows me to duplicate that result? For example, if I want to duplicate everything from the SELECT * FROM table1 query three times, I can do something like EXPAND(3) ?

In BigQuery, I would recommend a CROSS JOIN:
SELECT t1.*
FROM table1 t1 CROSS JOIN
(SELECT 1 as n UNION ALL SELECT 2 UNION ALL SELECT 3) n;
This can get cumbersome for lots of copies, but you can simplify this by generating the numbers:
SELECT t1.*
FROM table1 t1 CROSS JOIN
UNNEST(GENERATE_ARRAY(1, 3)) n
This creates an array with three elements and unnests it into rows.
In both these cases, you can include n in the SELECT to distinguish the copies.

Below is for BigQuery Standard SQL
I think below is close enough to what "got you crazy" o)
#standardSQL
SELECT copy.*
FROM `project.dataset.table1` t, UNNEST(FN.EXPAND(t, 3)) copy
To do so, you can use the recently announced support for persistent standard SQL UDFs: create the FN.EXPAND() function as in the example below. (Note: you need an FN dataset in your project, or you can use an existing dataset, in which case you should reference YOUR_DATASET.EXPAND() instead.)
#standardSQL
CREATE FUNCTION FN.EXPAND(s ANY TYPE, dups INT64) AS (
ARRAY (
SELECT s FROM UNNEST(GENERATE_ARRAY(1, dups))
)
);
Finally, if you don't want to create a persistent UDF, you can use a temp UDF as in the example below:
#standardSQL
CREATE TEMP FUNCTION EXPAND(s ANY TYPE, dups INT64) AS ( ARRAY(
SELECT s FROM UNNEST(GENERATE_ARRAY(1, dups))
));
SELECT copy.*
FROM `project.dataset.table1` t, UNNEST(EXPAND(t, 3)) copy

If you want a Cartesian product (every combination of rows, side by side in one row), you could use:
SELECT a.*, b.*, c.*
FROM table1 a
CROSS JOIN table1 b
CROSS JOIN table1 c
If you want the same rows repeated, you can use UNION ALL:
SELECT *
FROM table1
UNION ALL
SELECT *
FROM table1
UNION ALL
SELECT *
FROM table1
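The two queries above produce very different shapes, which a small experiment makes concrete. This is a minimal sketch using Python's sqlite3 module as a stand-in engine (an assumption for runnability); the two-row, two-column table1 is hypothetical sample data.

```python
import sqlite3

# Hypothetical sample data: 2 rows, 2 columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 INTEGER, column2 TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "y")])

# Self cross join: every combination of three rows, columns placed side by side.
cartesian = conn.execute("""
    SELECT a.*, b.*, c.*
    FROM table1 a
    CROSS JOIN table1 b
    CROSS JOIN table1 c
""").fetchall()

# UNION ALL: the same rows stacked three times, column layout unchanged.
repeated = conn.execute("""
    SELECT * FROM table1
    UNION ALL
    SELECT * FROM table1
    UNION ALL
    SELECT * FROM table1
""").fetchall()

print(len(cartesian), len(cartesian[0]))  # 8 rows, 6 columns
print(len(repeated), len(repeated[0]))    # 6 rows, 2 columns
```

With r rows, the self cross join grows as r cubed and triples the column count, while UNION ALL grows as 3r and keeps the original columns, so only the second matches the "duplicate the result three times" request.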

Use union all
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
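For reuse, you can embed this code in a procedure like:
Create Or Replace Procedure
expandTable(result Out Sys_Refcursor)
As
Begin
-- A bare SELECT is not valid inside PL/SQL, so the rows are returned through a cursor.
-- The unused tablename parameter was dropped; the query reads table1 directly.
Open result For
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
Union all
Select * from table1;
End;
/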

SQL Duplicate Row Results

I have a very simple select query which is being used to create an input file for a piece of software. I have the query pulling all the required fields, however I need to replicate the results six times with a hard coded ID number (1,2,3,4,5,6).
I have seen CROSS APPLY and PIVOT, but the problem is that the column I would need for these doesn't exist, because I'm hard-coding the number.
Any help would be much appreciated.
Thanks in Advance

Maybe like this:
select CJ.ID,T.* from dbo.Table T
CROSS JOIN
(select 1 ID UNION ALL select 2 ID UNION ALL select 3 ID UNION ALL select 4 ID UNION ALL select 5 ID UNION ALL select 6 ID) CJ

Bit of a pure guess here, but are you saying that every row in your table needs to be repeated 6 times with the ID 1-6? If so, you can use a CTE of the values 1-6 and CROSS APPLY to that.
WITH Nums AS(
SELECT *
FROM (VALUES (1),(2),(3),(4),(5),(6)) V(N))
SELECT *
FROM YourTable YT
CROSS APPLY Nums;

SQL multiple merge statements with same data source (WITH as)

I would like to have multiple merge statements (one after another) in the same query,
but I can't reuse the same data source. Example:
WITH DATA as(
SELECT * FROM tables_or_joins
)
MERGE table_name as Target
USING
(SELECT * FROM DATA JOIN another table
)
....
do something more; --and finish this statement here
-- start another merge here
MERGE table_name_2 as Target
USING(
SELECT * FROM DATA and join with another table
)
do something else
But the second merge fails with Invalid object name 'DATA'. Is there another way to use DATA in both merges? I hope this is clear enough.

A CTE is only good for one statement.
You cannot even run two simple selects on DATA.
You can define multiple CTEs, but they still feed only one actual statement.
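The scoping rule can be seen in a small experiment. The sketch below uses Python's sqlite3 module as a stand-in engine (an assumption; the question is about SQL Server, where the error text differs). It also shows one common workaround that the answer above does not mention: materializing the data into a temporary table, the rough analogue of a #temp table in SQL Server. The table and column names are hypothetical.

```python
import sqlite3

# Hypothetical source data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO source VALUES (?, ?)", [(1, "a"), (2, "b")])

# The CTE is visible inside the single statement that defines it...
first = conn.execute(
    "WITH data AS (SELECT * FROM source) SELECT COUNT(*) FROM data"
).fetchone()[0]

# ...but it no longer exists when the next statement runs.
try:
    conn.execute("SELECT COUNT(*) FROM data")
    cte_visible_later = True
except sqlite3.OperationalError:
    cte_visible_later = False

# Workaround sketch: materialize once, then reuse across several statements.
conn.execute("CREATE TEMP TABLE data AS SELECT * FROM source")
second = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
third = conn.execute("SELECT COUNT(*) FROM data WHERE id = 1").fetchone()[0]

print(first, cte_visible_later, second, third)  # 2 False 2 1
```

The trade-off is that a temporary table is a snapshot: it is computed once and does not reflect later changes to the underlying tables.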

You can use multiple CTEs in one statement:
WITH DATA as( SELECT * FROM tables_or_joins
), DATA2 AS( SELECT * FROM DATA JOIN another table )
SELECT * from DATA d join DATA2 d2 on d.id=d2.id
OR
Select * from DATA union Select * from DATA2

SQL Server: Find Values that don't exist in a table

I have a list or set of values that I would like to know which ones do not currently exist in a table. I know I can find out which ones do exist with:
SELECT * FROM Table WHERE column1 IN (x,x,x,x,x)
The set is the values I am checking against. Is there a way to find out which values in that set do not exist in column1? Basically, I'm looking for the inverse of the sql statement above.
This is for a report, so all I need is the values that don't exist to be returned back.
I have done this, and could again, with a left join after putting the values in another table, but the values I check are always different, and I was hoping for a solution that didn't involve clearing a table and inserting data first. I'm looking for a better solution if one exists.

You can also use EXCEPT as well as the OUTER JOIN e.g.
SELECT * FROM
(
SELECT -1 AS N
UNION
SELECT 2 AS N
) demo
EXCEPT
SELECT number
FROM spt_values

WITH q(x) AS
(
SELECT x1
UNION ALL
SELECT x2
UNION ALL
SELECT x3
)
SELECT x
FROM q
WHERE x NOT IN
(
SELECT column1
FROM [table]
)

Put the values you want to check for in a table A.
LEFT OUTER JOIN table A against your Table and keep the rows WHERE Table.column1 IS NULL:
SELECT A.column1
FROM A
LEFT OUTER JOIN
Table
ON A.column1 = Table.column1
WHERE Table.column1 IS NULL
This will only show the rows that exist in A but not in Table.

As you want some of the values from the set in the result, and you can't take them from the table (since you want the ones that don't exist there), you have to put the set in some kind of table or result so that you can use it as a source.
You can, for example, build a temporary result and join it against the table to filter out the values that do exist in the table:
select s.x
from (
select 1 as x union all
select 2 union all
select 3 union all
select 4 union all
select 5
) as s -- SET is a reserved word in SQL Server, so the alias is s
left join Table as t on t.column1 = s.x
where t.column1 is null
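Because the values to check change on every run, it can help to build the derived table from a parameter list instead of typing it by hand. This is a minimal sketch using Python's sqlite3 module as a stand-in engine (an assumption; the question targets SQL Server). The helper names and the my_table table are hypothetical. It shows both the LEFT JOIN ... IS NULL form and the EXCEPT form mentioned in an earlier answer.

```python
import sqlite3


def missing_values(conn, candidates):
    """Return the candidates that do not appear in my_table.column1 (anti-join)."""
    if not candidates:
        return []
    # One "(?)" per candidate builds an inline table of bound parameters.
    placeholders = ", ".join("(?)" for _ in candidates)
    query = f"""
        WITH wanted(x) AS (VALUES {placeholders})
        SELECT w.x
        FROM wanted AS w
        LEFT JOIN my_table AS t ON t.column1 = w.x
        WHERE t.column1 IS NULL
        ORDER BY w.x
    """
    return [row[0] for row in conn.execute(query, candidates)]


def missing_values_except(conn, candidates):
    """Same result via EXCEPT, which also removes duplicate candidates."""
    if not candidates:
        return []
    placeholders = ", ".join("(?)" for _ in candidates)
    query = f"""
        WITH wanted(x) AS (VALUES {placeholders})
        SELECT x FROM wanted
        EXCEPT
        SELECT column1 FROM my_table
        ORDER BY x
    """
    return [row[0] for row in conn.execute(query, candidates)]


# Hypothetical table containing 1, 3 and 5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (column1 INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?)", [(1,), (3,), (5,)])

missing = missing_values(conn, [1, 2, 3, 4])
missing_via_except = missing_values_except(conn, [1, 2, 3, 4])
print(missing, missing_via_except)  # [2, 4] [2, 4]
```

Both forms start from the candidate set and subtract the table, which is what the question asks for; no permanent table has to be cleared and refilled between runs.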

One way you can do it is:
SELECT * FROM Table WHERE column1 NOT IN(...);

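Use the NOT operator:
SELECT * FROM Table WHERE column1 NOT IN (x,x,x,x,x)
Note that NOT IN used this way returns the table rows whose column1 is outside the list, which is not the same as the list values that are absent from the table.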