"Duplicating" entries SQL - sql

I have a column that looks like
a
b
c
and I think I can select using some sort of a window function to get
a 1
a 2
b 1
b 2
c 1
c 2
but can't seem to find something suitable.
I know you can do this using a union but would prefer using a window function if it exists.

Considering you just want 2 rows, I would just CROSS JOIN to a VALUES table construct with the values 1 and 2 in it:
SELECT YT.YourColumn,
V.I
FROM dbo.YourTable YT
CROSS JOIN (VALUES(1),(2))V(I);
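To see the effect of that cross join without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database is an assumption made for illustration, and the two values come from a small UNION ALL subquery rather than the VALUES ... V(I) construct, simply to keep the sketch portable.

```python
import sqlite3

# In-memory SQLite database (an assumption for illustration; the answer
# above targets SQL Server). Table and column names mirror the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (YourColumn TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?)", [("a",), ("b",), ("c",)])

# Every source row is paired with every row of the two-row derived table,
# so each value appears once with 1 and once with 2.
rows = conn.execute(
    """
    SELECT YT.YourColumn, V.I
    FROM YourTable AS YT
    CROSS JOIN (SELECT 1 AS I UNION ALL SELECT 2) AS V
    ORDER BY YT.YourColumn, V.I
    """
).fetchall()

for row in rows:
    print(*row)
```

Adding more rows to the derived table duplicates each source row that many times.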

select t.myColumn, x.N
from myTable t
CROSS JOIN
(SELECT TOP (2)
ROW_NUMBER() OVER (ORDER BY t1.Object_ID) AS N
FROM Master.sys.All_Columns t1
CROSS JOIN Master.sys.All_Columns t2) x

Related

Insert missing values in column at all levels of another column in SQL?

I have been working with some big data in SQL/BigQuery and found that it has some holes in it that need to be filled with values in order to complete the dataset. What I'm struggling with is how to insert the missing values properly.
Say that I have multiple levels of a variable (1, 2, 3... no upper bound) and for each of these levels, they should have an A, B, C value. Some of these records will have data, others will not.
Current dataset:
level value data
1 A 1a_data
1 B 1b_data
1 C 1c_data
2 A 2a_data
2 C 2c_data
3 B 3b_data
What I want the dataset to look like:
level value data
1 A 1a_data
1 B 1b_data
1 C 1c_data
2 A 2a_data
2 B NULL
2 C 2c_data
3 A NULL
3 B 3b_data
3 C NULL
What would be the best way to do this?
You need a CROSS join of the distinct levels with the distinct values and a LEFT join to the table:
SELECT l.level, v.value, t.data
FROM (SELECT DISTINCT level FROM tablename) l
CROSS JOIN (SELECT DISTINCT value FROM tablename) v
LEFT JOIN tablename t ON t.level = l.level AND t.value = v.value
ORDER BY l.level, v.value;
See the demo.
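As a rough check of the query above, the following sketch loads the sample rows from the question into an in-memory SQLite database (an assumption; the question is about BigQuery) and runs the same CROSS JOIN plus LEFT JOIN.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here; this is only a sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (level INTEGER, value TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, "A", "1a_data"), (1, "B", "1b_data"), (1, "C", "1c_data"),
        (2, "A", "2a_data"), (2, "C", "2c_data"),
        (3, "B", "3b_data"),
    ],
)

# The cross join builds all level/value combinations; the left join then
# attaches data where a matching row exists and NULL (None) where it does not.
rows = conn.execute(
    """
    SELECT l.level, v.value, t.data
    FROM (SELECT DISTINCT level FROM tablename) AS l
    CROSS JOIN (SELECT DISTINCT value FROM tablename) AS v
    LEFT JOIN tablename AS t ON t.level = l.level AND t.value = v.value
    ORDER BY l.level, v.value
    """
).fetchall()

for row in rows:
    print(row)
```

The three combinations missing from the sample data, (2, B), (3, A) and (3, C), come back with None in the data column.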
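We can use an INSERT INTO ... SELECT, building every level/value combination the way a calendar table would and inserting only the combinations that are missing:
INSERT INTO yourTable (level, value, data)
SELECT t1.level, t2.value, NULL
FROM (SELECT DISTINCT level FROM yourTable) t1
CROSS JOIN (SELECT DISTINCT value FROM yourTable) t2
LEFT JOIN yourTable t3
ON t3.level = t1.level AND
t3.value = t2.value
-- test the join key, not data: an existing row may legitimately have NULL data
WHERE t3.level IS NULL;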

Using A CTE with a left join

I have an existing query with this structure:
With MainQuery As (.....),
Sub1 As (.....),
Sub2 As (.....),
Sub3 As (.....)
select.....
I now need to join the results of this query to another query I have, so I want to left join it, like this:
left join (
select * from (
With MainQuery As (.....),
Sub1 As (.....),
Sub2 As (.....),
Sub3 As (.....)
select.....) as results
I keep getting errors. I'm not sure whether this is possible, or which other methods I could use.
Thank you
I can't think of any case where this wouldn't work: all you have to do is move your CTEs to the top of the main query.
In this particular example, your subquery is not correlated (it does not access any fields from the outer query; it simply returns a dataset), so you can just move the CTEs to the top. Take these two queries:
-- Generate list of numbers 1-9, then select top 5 randomly
WITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1)) x(x))
, c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y)
, c3(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY c2.x) FROM c2)
SELECT TOP(5) x.n
FROM c3 x
ORDER BY NEWID();
-- All numbers 1-9
SELECT x.n
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9)) x(n)
Say I have these two queries, and I want to LEFT JOIN the query with the CTEs into the second query. That way I have all numbers 1-9 on the left, and I can see which numbers are missing from the right.
Just move your SELECT statement into the LEFT JOIN subquery, and you can still reference your CTEs:
WITH c1 AS (SELECT x.x FROM (VALUES(1),(1),(1)) x(x))
, c2(x) AS (SELECT 1 FROM c1 x CROSS JOIN c1 y)
, c3(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY c2.x) FROM c2)
SELECT x.n, z.n
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9)) x(n)
LEFT JOIN (
SELECT TOP(5) x.n
FROM c3 x
ORDER BY NEWID()
) z ON z.n = x.n
Now, if you were trying to use something that references outside the subquery, like a correlated subquery in the SELECT list, or an OUTER/CROSS APPLY or maybe an EXISTS() clause, that's different. But in this case, if it's not correlated, then simply moving it to the top should work just fine.
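The pattern of declaring the CTEs once at the top and referencing them from inside the LEFT JOIN subquery can be sketched with Python's sqlite3 module. This is an illustration under assumptions: it runs on SQLite (3.25 or later for ROW_NUMBER), and it swaps the random TOP(5) ... ORDER BY NEWID() for a deterministic ORDER BY n LIMIT 5 so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The WITH clause sits at the very top of the statement; the derived table
# inside the LEFT JOIN can still read from c3.
rows = conn.execute(
    """
    WITH c1(x) AS (VALUES (1), (1), (1)),
         c2(x) AS (SELECT 1 FROM c1 AS a CROSS JOIN c1 AS b),
         c3(n) AS (SELECT ROW_NUMBER() OVER (ORDER BY x) FROM c2)
    SELECT x.n, z.n
    FROM c3 AS x
    LEFT JOIN (
        SELECT n FROM c3 ORDER BY n LIMIT 5  -- deterministic stand-in for TOP(5) ... NEWID()
    ) AS z ON z.n = x.n
    ORDER BY x.n
    """
).fetchall()

for left_n, right_n in rows:
    print(left_n, right_n)
```

Numbers 1 to 5 find a match on the right; 6 to 9 come back paired with None.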

How can I display two columns together in SQL?

I have 2 queries that return data in the form:
query 1:
column 1
a
b
c
query 2:
column 2
d
e
How can I combine the 2 queries to get output as:
column 1 column 2
a d
b e
c
The order of data in the columns does not matter.
Possibly anything with joins ?
Thanks
Use row_number():
select t1.col1,t2.col2 from
(
select *,row_number() over(order by col1) rn from query1
) t1 full outer join
(
select *,row_number() over(order by col2) rn from query2
) t2 on t1.rn=t2.rn
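Because the two queries can return different numbers of rows (n and m), use a full outer join.
A possible solution is to number the rows of each query with row_number() and join on that number. With a FULL OUTER JOIN, as in the example, the order of the two subqueries does not matter; only with a LEFT JOIN would you have to select first from the query with more rows. Example: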
select
col_1,
col_2
from (
select
a.col_1,
row_number() over () rn
from a
) s1
FULL OUTER JOIN (
select
b.col_2,
row_number() over () rn
from b
) s2 on s1.rn = s2.rn
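Here is a minimal sketch of pairing rows by row number, run on an in-memory SQLite database with the sample values from the question. Two things are assumptions made for portability: the tables a and b are hypothetical stand-ins for the two queries, and a LEFT JOIN replaces the FULL OUTER JOIN because older SQLite versions do not support it. The LEFT JOIN only gives the same result here because the left side has more rows.

```python
import sqlite3

# Requires SQLite 3.25+ for ROW_NUMBER(); tables a and b are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (col_1 TEXT)")
conn.execute("CREATE TABLE b (col_2 TEXT)")
conn.executemany("INSERT INTO a VALUES (?)", [("a",), ("b",), ("c",)])
conn.executemany("INSERT INTO b VALUES (?)", [("d",), ("e",)])

# Number the rows on each side, then join on that number so the two
# unrelated columns line up position by position.
rows = conn.execute(
    """
    SELECT s1.col_1, s2.col_2
    FROM (SELECT col_1, ROW_NUMBER() OVER (ORDER BY col_1) AS rn FROM a) AS s1
    LEFT JOIN (SELECT col_2, ROW_NUMBER() OVER (ORDER BY col_2) AS rn FROM b) AS s2
        ON s1.rn = s2.rn
    ORDER BY s1.rn
    """
).fetchall()

for row in rows:
    print(row)
```

The third row has no partner on the right, so its second column is None.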

Cross Apply Equivalent in Hive?

I want to create the effect of a cross apply in AWS EMR Hive. I've got a little sample code here that runs in SQL Server 2017.
with r as (
select 1 as d
union all
select 2 as d
)
select * from r
cross apply (select 'f' as u) e;
How can I run the equivalent of this in EMR Hive?
I've checked out the Lateral View documentation, but it all references explode, and I don't have an array.
Instead of CROSS APPLY, you can use CROSS JOIN in your case. For example:
SET hive.strict.checks.cartesian.product = false;
WITH r AS (
SELECT 1 AS d
UNION ALL
SELECT 2 AS d
)
SELECT *
FROM r
CROSS JOIN (SELECT 'f' AS u) e;
I ended up working around it by adding an extra field with a single constant value to each table and joining the two tables on that field, which produces the same effect.
It ended up looking something like:
with d as (
select column, 'AreYouKiddingMe' as k from table
), e as (
select column2, 'AreYouKiddingMe' as k from table2
)
select * from d inner join e on d.k = e.k
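That the constant-key workaround matches a plain cross join can be checked with a small sketch. It uses Python's sqlite3 module rather than Hive (an assumption made so the example runs anywhere), and the names r, left_side, right_side and the key value 'k' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (d INTEGER)")
conn.executemany("INSERT INTO r VALUES (?)", [(1,), (2,)])

# Plain cross join, as in the first answer.
cross_rows = conn.execute(
    """
    SELECT r.d, e.u
    FROM r
    CROSS JOIN (SELECT 'f' AS u) AS e
    ORDER BY r.d
    """
).fetchall()

# Workaround: give both sides the same constant key and inner join on it.
# Every left row matches every right row, which is exactly a cross join.
keyed_rows = conn.execute(
    """
    WITH left_side AS (SELECT d, 'k' AS k FROM r),
         right_side AS (SELECT 'f' AS u, 'k' AS k)
    SELECT left_side.d, right_side.u
    FROM left_side
    INNER JOIN right_side ON left_side.k = right_side.k
    ORDER BY left_side.d
    """
).fetchall()

print(cross_rows)
print(keyed_rows)
```

Both queries return the same two rows, so the choice between them is mostly about what the engine's settings allow.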

How to hardcode row in Select statement?

Select 0 AS A, 1 AS B FROM someTable
With the above query I can hardcode the number of columns and their values regardless of what data is in someTable, but the number of rows still depends on the number of rows in someTable.
What can I do if I want to hardcode the number of rows as well?
For example, if someTable has only 10 rows, how should I modify the above query so that it returns 1000 rows?
You can just keep cross joining your table:
SELECT TOP 1000 0 AS A, 1 AS B
FROM someTable a
CROSS JOIN someTable b
CROSS JOIN someTable c
CROSS JOIN someTable d;
Since you have tagged the question with SSMS, I am assuming this is SQL Server. If not, you may need to use LIMIT:
SELECT 0 AS A, 1 AS B
FROM someTable a
CROSS JOIN someTable b
CROSS JOIN someTable c
CROSS JOIN someTable d
LIMIT 1000;
The problem here is that if SomeTable only has 1 row, it won't matter how many times you cross join it, you will still only have one row. If you don't actually care about the values in the table, and only want to use it to generate rows then you could just use a system view that you know has more rows than you need (again assuming SQL Server):
SELECT TOP 1000 0 AS A, 1 AS B
FROM sys.all_objects a;
Even on an empty database, sys.all_objects will have 2083 rows (the exact count may vary by SQL Server version). If you might need more, just CROSS JOIN the view with itself:
SELECT TOP 1000 0 AS A, 1 AS B
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
Without the TOP, this cross join would give you 4,338,889 rows (2083 squared), which leaves plenty of room for a larger TOP value.
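How the row count depends on the size of the source table can be sketched as follows. The example uses an in-memory SQLite database and the LIMIT form of the query (both assumptions; the answer targets SQL Server), and count_rows is a hypothetical helper written only for this illustration.

```python
import sqlite3

def count_rows(n_source_rows, cap=1000):
    """Return how many rows the four-way self cross join yields, capped by LIMIT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE someTable (id INTEGER)")
    conn.executemany(
        "INSERT INTO someTable VALUES (?)",
        [(i,) for i in range(n_source_rows)],
    )
    # The join can produce up to n_source_rows ** 4 rows; LIMIT keeps at most `cap`.
    rows = conn.execute(
        """
        SELECT 0 AS A, 1 AS B
        FROM someTable AS a
        CROSS JOIN someTable AS b
        CROSS JOIN someTable AS c
        CROSS JOIN someTable AS d
        LIMIT ?
        """,
        (cap,),
    ).fetchall()
    conn.close()
    return len(rows)

print(count_rows(10))  # 10 ** 4 combinations, capped at 1000
print(count_rows(5))   # 5 ** 4 = 625, below the cap
print(count_rows(1))   # a single row stays a single row
```

This is the caveat from the answer in miniature: a one-row table never grows, no matter how many times it is cross joined with itself.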