SELECT VALUES in Teradata

I know that it's possible in other SQL flavors (T-SQL) to "select" provided data without a table. Like:
SELECT *
FROM (VALUES (1,2), (3,4)) tbl
How can I do this using Teradata?

Teradata has strange syntax for this:
select t.*
from (select * from (select 1 as a, 2 as b) x
union all
select * from (select 3 as a, 4 as b) x
) t;
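For readers without a Teradata system at hand, the UNION ALL pattern above can be tried with Python's built-in sqlite3 module. This is only a sketch: SQLite is a stand-in and not Teradata, so it shows the shape of the query rather than Teradata's parser behavior. The helper name inline_rows is made up for illustration.

```python
import sqlite3

def inline_rows():
    """Build a two-row, two-column result purely from literals via UNION ALL."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database (SQLite, not Teradata)
    query = """
        SELECT t.a, t.b
        FROM (
            SELECT 1 AS a, 2 AS b
            UNION ALL
            SELECT 3 AS a, 4 AS b
        ) AS t
        ORDER BY t.a
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

print(inline_rows())
```

Each SELECT branch plays the role of one tuple in a VALUES list, and the aliases in the first branch name the columns.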

I don't have access to a TD system to test, but you might be able to remove one of the nested SELECTs from the answer above:
select x.*
from (
select 1 as a, 2 as b
union all
select 3 as a, 4 as b
) x
If you need to generate some random rows, you can always do a SELECT from a system table, like sys_calendar.calendar:
SELECT 1, 2
FROM sys_calendar.calendar
SAMPLE 10;
Updated example:
SELECT TOP 1000 -- Limit to 1000 rows (you can use SAMPLE too)
ROW_NUMBER() OVER() MyNum, -- Sequential numbering
MyNum MOD 7, -- Modulo operator
RANDOM(1,1000), -- Random number between 1,1000
HASHROW(MyNum) -- Rowhash value of given column(s)
FROM sys_calendar.calendar; -- Use as table to source rows
A couple notes:
make sure you pick a system table that will always be present and have rows
if you need more rows than are available in the source table, do a UNION to get more rows
you can always easily create a one-column table and populate it to whatever number of rows you want by INSERT/SELECT into it:
CREATE TABLE DummyTable (c1 INT); -- Create table
INSERT INTO DummyTable VALUES (1); -- Seed table
INSERT INTO DummyTable SELECT * FROM DummyTable; -- Run this to duplicate rows as many times as you want
Then use this table to create whatever resultset you want, similar to the query above with sys_calendar.calendar.
I don't have a TD system to test so you might get syntax errors...but that should give you a basic idea.
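The seed-and-double idea can be checked quickly with Python's sqlite3 module. This is a hedged sketch: SQLite stands in for Teradata here, and the function name build_dummy_table is hypothetical. Each INSERT ... SELECT pass doubles the row count, so n passes give 2^n rows.

```python
import sqlite3

def build_dummy_table(doublings):
    """Seed a one-column table with one row, then double it `doublings` times."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Teradata
    conn.execute("CREATE TABLE DummyTable (c1 INT)")
    conn.execute("INSERT INTO DummyTable VALUES (1)")  # seed row
    for _ in range(doublings):
        # Copy every existing row back into the table, doubling the row count.
        conn.execute("INSERT INTO DummyTable SELECT * FROM DummyTable")
    (count,) = conn.execute("SELECT COUNT(*) FROM DummyTable").fetchone()
    conn.close()
    return count

print(build_dummy_table(10))
```

Ten passes are already enough for just over a thousand rows, and twenty for just over a million.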

I am a bit late to this thread, but recently got the same error.
I solved this by simply using
select distinct 1 as a, 2 as b from DBC.tables
union all
select distinct 3 as a, 4 as b from DBC.tables
Here, DBC.tables is a DB backend table with a few rows only. So, the query runs fast as well

Related

Microsoft SQL Server - Convert column values to list for SELECT IN

I have this (3 int columns in one table)
Int1 Int2 Int3
---------------
1 2 3
I would like to run such query with another someTable:
SELECT * FROM someTable WHERE someInt NOT IN (1,2,3)
where 1,2,3 are list of INTs converted to a list that I can use with SELECT * NOT IN statement
Any suggestions on how to achieve this without stored procedures in Microsoft SQL Server 2019?

If you want rows in some table that are not in one of three columns of another table, then use not exists:
select t.*
from sometable t
where not exists (select 1
from t t2
where t.someint in (t2.int1, t2.int2, t2.int3)
);
The subquery returns a row where there is a match. The outer query then rejects any rows with a match.

Seems like you actually want a NOT EXISTS?
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
WHERE sT.someInt IN (oT.int1,oT.int2,oT.int3));
An alternative method would be to unpivot the data, and then use an equality operator:
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
CROSS APPLY (VALUES(oT.int1),(oT.int2),(oT.int3))V(I)
WHERE V.I = sT.someInt);
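A quick way to sanity-check the NOT EXISTS logic is to run it on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, with made-up sample data (values 1 to 5 in someTable, one row 1, 2, 3 in oneTable). Note that the inner predicate uses IN: a row is rejected when its value appears in any of the three columns.

```python
import sqlite3

def rows_not_listed():
    """Return someInt values that appear in none of oneTable's three columns."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
    conn.execute("CREATE TABLE someTable (someInt INT)")
    conn.execute("CREATE TABLE oneTable (int1 INT, int2 INT, int3 INT)")
    # Sample data is an assumption for illustration only.
    conn.executemany("INSERT INTO someTable VALUES (?)", [(n,) for n in range(1, 6)])
    conn.execute("INSERT INTO oneTable VALUES (1, 2, 3)")
    query = """
        SELECT sT.someInt
        FROM someTable AS sT
        WHERE NOT EXISTS (SELECT 1
                          FROM oneTable AS oT
                          WHERE sT.someInt IN (oT.int1, oT.int2, oT.int3))
        ORDER BY sT.someInt
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

print(rows_not_listed())
```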

How to split records evenly into 4 tables from one table?

I have to design a solution that will help me out to load data into 4 tables from 1 master table.
All that the function or package is supposed to do is following :
Count total number of rows in a master table
Divide by 4
Load into table 1,2,3 and 4.
Every time we run the program, this function wipes out the 4 tables and does the above process again. The names of the main table and of the destination tables will always be the same.
For example, if the Master Table has 4200 records then :
Table A will get 1-1000
Table B will get 1001-2000
Table C will get 2001-3000
Table D will get 3001-4200.
Can anyone help me?

This is a very simple way to do it. There may be faster ways. Replace [TABLE] with the name of your table and [ID] with the name of a unique column in the table.
DECLARE @count int = 0;
DECLARE @numRecsPerTable int = 0;
SELECT @count = COUNT(*) FROM [TABLE];
SELECT @numRecsPerTable = @count / 4;
-- ORDER BY [ID] makes every TOP pick the same rows on each pass
SELECT TOP (@numRecsPerTable) *
INTO temp_1
FROM [TABLE]
ORDER BY [ID];
SELECT TOP (@numRecsPerTable) *
INTO temp_2
FROM [TABLE]
WHERE [ID] NOT IN (SELECT TOP (@numRecsPerTable) [ID] FROM [TABLE] ORDER BY [ID])
ORDER BY [ID];
SELECT TOP (@numRecsPerTable) *
INTO temp_3
FROM [TABLE]
WHERE [ID] NOT IN (SELECT TOP (@numRecsPerTable * 2) [ID] FROM [TABLE] ORDER BY [ID])
ORDER BY [ID];
SELECT *
INTO temp_4
FROM [TABLE]
WHERE [ID] NOT IN (SELECT TOP (@numRecsPerTable * 3) [ID] FROM [TABLE] ORDER BY [ID]);
Note: the remainder of recs / 4 will be in the 4th table.
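The arithmetic behind this approach (integer division by 4, with the leftover rows going to the last table) can be sketched in plain Python. This is an illustration of the logic only, not T-SQL, and the function name split_into_four is hypothetical.

```python
def split_into_four(records):
    """Split records into four chunks; the fourth also takes the remainder."""
    per_table = len(records) // 4  # same integer division as @count / 4
    # The first three tables get exactly per_table rows each.
    tables = [records[i * per_table:(i + 1) * per_table] for i in range(3)]
    # The fourth table gets everything that is left, including the remainder.
    tables.append(records[3 * per_table:])
    return tables

sizes = [len(t) for t in split_into_four(list(range(1, 4203)))]
print(sizes)
```

With 4202 input records, the first three chunks hold 1050 rows each and the fourth holds 1052.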

The SSIS implementation is similar to Steve's answer.
Source
The first difference is that instead of division, we'll use the modulo operator, %, which generates the remainder after division. In this example, I use % 4, which means I will have values of 0, 1, 2 and 3: four "buckets" of data. To give the modulo operator something to work on, I use the ROW_NUMBER function to generate an arbitrary, monotonically increasing sequence of numbers.
The query would look something like
SELECT
T.*
, (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) % 4 AS bucketNumber
FROM
sys.all_columns AS T;
Conditional Split
I route the data to a Conditional Split component. Here you define boolean expressions and correlate them to named outputs. I defined mine as Bucket0, Bucket1, Bucket2, Bucket3 and used expressions of bucketNumber==0...
Destination
I now have 4 connectors coming out of my conditional split and wire them up to tables Bucket0 to Bucket3.
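The routing described above (row number modulo 4, then one output per remainder) can be sketched in plain Python. This only mirrors the logic of the query plus the Conditional Split; it is not SSIS, and bucket_rows is a made-up name.

```python
def bucket_rows(rows, bucket_count=4):
    """Assign each row to a bucket using its 1-based row number modulo bucket_count."""
    buckets = {b: [] for b in range(bucket_count)}
    for row_number, row in enumerate(rows, start=1):  # mimics ROW_NUMBER()
        buckets[row_number % bucket_count].append(row)
    return buckets

print(bucket_rows(list("abcdefghij")))
```

Unlike the divide-by-4 approach, consecutive rows land in different buckets, and bucket sizes differ by at most one row.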

SQL Join on sequence number

I have 2 tables (A, B). They each have a different column that is basically an order or a sequence number. Table A has 'Sequence' and the values range from 0 to 5. Table B has 'Index' and the values are 16740, 16744, 16759, 16828, 16838, and 16990. Unfortunately I do not know the significance of these values. But I do believe they will always match in sequential order. I want to join these tables on these numbers where 0 = 16740, 1 = 16744, etc. Any ideas?
Thanks

You could use a case expression to convert table a's values to table b's values (or vice versa) and join on that:
SELECT *
FROM a
JOIN b ON a.[sequence] = CASE b.[index] WHEN 16740 THEN 0
WHEN 16744 THEN 1
WHEN 16759 THEN 2
WHEN 16828 THEN 3
WHEN 16838 THEN 4
WHEN 16990 THEN 5
ELSE NULL
END;

@Mureinik has a great example. If down the road you end up adding more numbers, putting this information into a new table might be a good idea.
CREATE TABLE C(
AInfo INT,
BInfo INT
)
INSERT INTO C(AInfo,BInfo) VALUES(0,16740)
INSERT INTO C(AInfo,BInfo) VALUES(1,16744)
etc
Then you can Join all the tables.

If the values are in ascending order as per your example, you can use the ROW_NUMBER() function to achieve this:
;with cte AS (SELECT *, ROW_NUMBER() OVER(ORDER BY [Index])-1 RN
FROM B)
SELECT *
FROM cte
JOIN A ON A.[Sequence] = cte.RN
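The idea behind the ROW_NUMBER approach is to rank the Index values in ascending order, starting at 0, and match each rank to the Sequence value. A small Python sketch of that pairing, using the values from the question (the function name pair_by_position is hypothetical):

```python
def pair_by_position(sequences, indexes):
    """Pair each sequence number with the index value of the same ascending rank."""
    # Equivalent of ROW_NUMBER() OVER (ORDER BY [Index]) - 1: rank starts at 0.
    rank_to_index = {rank: idx for rank, idx in enumerate(sorted(indexes))}
    # Equivalent of the join on Sequence = RN; unmatched sequences are dropped.
    return [(seq, rank_to_index[seq]) for seq in sorted(sequences) if seq in rank_to_index]

pairs = pair_by_position(range(6), [16990, 16740, 16828, 16744, 16838, 16759])
print(pairs)
```

This only works under the question's assumption that both columns line up in sequential order.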

create a table of duplicated rows of another table using the select statement

I have a table with one column containing different integers.
For each integer in the table I would like to duplicate it as the number of digits -
For example:
12345 (5 digits):
1. 12345
2. 12345
3. 12345
4. 12345
5. 12345
I thought of doing it using with recursive t (...) as (), but I didn't manage, since I don't really understand how it works and what is happening "behind the scenes".
I don't want to use insert because I want it to be scalable and automatic for as many integers as needed in a table.
Any thoughts and an explanation would be great.

The easiest way is to join to a table with numbers from 1 to n in it.
SELECT n, x
FROM yourtable
JOIN
(
SELECT day_of_calendar AS n
FROM sys_calendar.CALENDAR
WHERE n BETWEEN 1 AND 12 -- maximum number of digits
) AS dt
ON n <= CHAR_LENGTH(TRIM(ABS(x)))
In my example I abused TD's builtin calendar, but that's not a good choice, as the optimizer doesn't know how many rows will be returned and as the plan must be a Product Join it might decide to do something stupid. So better use a number table...
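What the join to a numbers table computes can be sketched in plain Python: for each value x, emit one row per n from 1 up to the number of digits in ABS(x). This mirrors the logic only, not Teradata's execution, and repeat_by_digit_count is a made-up name.

```python
def repeat_by_digit_count(values):
    """Return (n, x) pairs, one per digit of each value, like the join on n <= digit count."""
    result = []
    for x in values:
        digits = len(str(abs(x)))  # same idea as CHAR_LENGTH(TRIM(ABS(x)))
        for n in range(1, digits + 1):
            result.append((n, x))
    return result

print(repeat_by_digit_count([12345, 678]))
```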

Create a numbers table that will contain the integers from 1 to the maximum number of digits that the numbers in your table will have (I went with 6):
create table numbers(num int)
insert numbers
select 1 union select 2 union select 3 union select 4 union select 5 union select 6
You already have your table (but here's what I was using to test):
create table your_table(num int)
insert your_table
select 12345 union select 678
Here's the query to get your results:
select ROW_NUMBER() over(partition by b.num order by b.num) row_num, b.num, LEN(cast(b.num as char)) num_digits
into #temp
from your_table b
cross join numbers n
select t.num
from #temp t
where t.row_num <= t.num_digits

I found a nice way to perform this action. Here goes:
with recursive t (num,num_as_char,char_n)
as
(
select num
,cast (num as varchar (100)) as num_as_char
,substr (num_as_char,1,1)
from numbers
union all
select num
,substr (t.num_as_char,2) as num_as_char2
,substr (num_as_char2,1,1)
from t
where char_length (num_as_char2) > 0
)
select *
from t
order by num,char_length (num_as_char) desc

Numeric Overflow in Recursive Query : Teradata

I'm new to teradata. I want to insert numbers 1 to 1000 into the table test_seq, which is created as below.
create table test_seq(
seq_id integer
);
After searching on this site, I came up with a recursive query to insert the numbers.
insert into test_seq(seq_id)
with recursive cte(id) as (
select 1 from test_dual
union all
select id + 1 from cte
where id + 1 <= 1000
)
select id from cte;
test_dual is created as follows and it contains just a single value. (something like DUAL in Oracle)
create table test_dual(
test_dummy varchar(1)
);
insert into test_dual values ('X');
But, when I run the insert statement, I get the error, Failure 2616 Numeric overflow occurred during computation.
What did I do wrong here? Isn't the integer datatype enough to hold numeric value 1000?
Also, is there a way to write the query so that I can do away with the test_dual table?

When you simply write 1 the parser assigns the best matching datatype to it, which is a BYTEINT. The valid range of values for BYTEINT is -128 to 127, so just add a typecast to INT :-)
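To see why the range matters, here is a toy Python model of the recursion. It assumes, as described above, that the literal 1 is typed BYTEINT, and additionally assumes that the recursive column keeps the seed's type; it does not reproduce Teradata's actual evaluation. The name count_up is hypothetical.

```python
BYTEINT_RANGE = (-128, 127)
INTEGER_RANGE = (-2**31, 2**31 - 1)

def count_up(limit, value_range):
    """Mimic `id + 1` repeated up to `limit` in a column that only holds value_range."""
    lo, hi = value_range
    value = 1  # the seed row
    while value + 1 <= limit:
        if not lo <= value + 1 <= hi:
            # Toy equivalent of "Numeric overflow occurred during computation".
            raise OverflowError(f"{value} + 1 does not fit in [{lo}, {hi}]")
        value += 1
    return value

print(count_up(1000, INTEGER_RANGE))  # an INTEGER-sized column reaches the limit
```

Calling count_up(1000, BYTEINT_RANGE) raises OverflowError at 127 + 1, which is the step where the original query fails in this model.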
Usually you don't need a dummy DUAL table in Teradata, "SELECT 1;" is valid, but in some cases the parser still insists on a FROM (don't ask me why). This trick should work:
SEL * FROM (SELECT 1 AS x) AS dt;
You can create a view on this:
REPLACE VIEW oDUAL AS SELECT * FROM (SELECT 'X' AS dummy) AS dt;
Explain "SELECT 1 FROM oDUAL;" is a bit stupid, so a real table might be better. But to get efficient access (= single AMP/single row) it must be defined as follows:
CREATE TABLE dual_tbl(
dummy VARCHAR(1) CHECK ( dummy = 'X')
) UNIQUE PRIMARY INDEX(dummy); -- i remember having fun when you inserted another row in Oracle's DUAL :_)
INSERT INTO dual_tbl VALUES ('X');
REPLACE VIEW oDUAL AS SELECT dummy FROM dual_tbl WHERE dummy = 'X';
insert into test_seq(seq_id)
with recursive cte(id) as (
select cast(1 as int) from oDUAL
union all
select id + 1 from cte
where id + 1 <= 1000
)
select id from cte;
But recursion is not an appropriate way to get a range of numbers, as it's sequential and always an "all-AMP step" even if the data resides on a single AMP like in this case.
If it's less than 73414 values (201 years) better use sys_calendar.calendar (or any other table with a known sequence of numbers) :
SELECT day_of_calendar
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 1000;
Otherwise use CROSS joins, e.g. to get numbers from 1 to 1,000,000:
WITH cte (i) AS
( SELECT day_of_calendar
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 1000
)
SELECT
(t2.i - 1) * 1000 + t1.i
FROM cte AS t1 CROSS JOIN cte AS t2;
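The expression (t2.i - 1) * 1000 + t1.i maps every pair from the cross join to a distinct number: t2 picks the block of 1000 and t1 the position inside it. A small Python sketch of the same arithmetic, with a configurable base so it can be checked on a small case (numbers_via_cross_join is a made-up name):

```python
from itertools import product

def numbers_via_cross_join(base=1000):
    """Combine two copies of 1..base into the numbers 1..base*base."""
    small = range(1, base + 1)
    # product() plays the role of the CROSS JOIN: every (t2, t1) pair once.
    return [(t2 - 1) * base + t1 for t2, t1 in product(small, small)]

numbers = numbers_via_cross_join(10)
print(numbers[0], numbers[-1], len(numbers))
```

With the default base of 1000 the result covers 1 to 1,000,000 without gaps or duplicates.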
