PostgreSQL: Select a single-row x amount of times - sql

A single row in a table has a column with an integer value >= 1 and must be selected however many times the column says. So if the column had '2', I'd like the select query to return the single-row 2 times.
How can this be accomplished?

Don't know why you would want to do such a thing, but...
CREATE TABLE testy (a int,b text);
INSERT INTO testy VALUES (3,'test');
SELECT testy.*,generate_series(1,a) from testy; --returns 3 rows

You could make a table that is just full of numbers, like this:
CREATE TABLE numbers
(
num INT NOT NULL
, CONSTRAINT numbers_pk PRIMARY KEY (num)
);
and populate it with as many numbers as you need, starting from one:
INSERT INTO numbers VALUES(1);
INSERT INTO numbers VALUES(2);
INSERT INTO numbers VALUES(3);
...
Then, if you had the table "mydata" whose rows had to repeat based on the column "repeat_count", you would query it like so:
SELECT mydata.*
FROM mydata
JOIN numbers
ON numbers.num <= mydata.repeat_count
WHERE ...
Of course you need to know the maximum repeat count up front, and have your numbers table go that high.
No idea why you would want to do this, though. Care to share?
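For anyone who wants to try the numbers-table join without a Postgres server at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here, and the `label` column and the `repeat_rows` helper are made up for the illustration. It also shows the caveat above: if the numbers table does not go high enough, rows are silently repeated fewer times than requested.

```python
import sqlite3


def repeat_rows(rows, max_repeat):
    """Repeat each (label, repeat_count) row repeat_count times via a numbers table."""
    conn = sqlite3.connect(":memory:")
    # Helper table holding 1..max_repeat (hypothetical size, chosen by the caller).
    conn.execute("CREATE TABLE numbers (num INTEGER NOT NULL PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO numbers VALUES (?)",
        [(n,) for n in range(1, max_repeat + 1)],
    )
    conn.execute("CREATE TABLE mydata (label TEXT, repeat_count INTEGER)")
    conn.executemany("INSERT INTO mydata VALUES (?, ?)", rows)
    # Each mydata row matches one numbers row per value <= repeat_count.
    result = conn.execute(
        "SELECT mydata.label FROM mydata "
        "JOIN numbers ON numbers.num <= mydata.repeat_count "
        "ORDER BY mydata.label"
    ).fetchall()
    conn.close()
    return [r[0] for r in result]


print(repeat_rows([("a", 2), ("b", 3)], max_repeat=5))
# A numbers table that is too small caps the repetition:
print(repeat_rows([("a", 4)], max_repeat=2))
```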

You can do it with a recursive query; check out the examples in the PostgreSQL docs.
Something like:
WITH RECURSIVE t(cnt, id, field2, field3) AS (
SELECT 1, id, field2, field3
FROM foo
UNION ALL
SELECT t.cnt+1, t.id, t.field2, t.field3
FROM t, foo f
WHERE t.id = f.id and t.cnt < f.repeat_cnt
)
SELECT id, field2, field3 FROM t;
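A quick way to check the recursive approach is to run essentially the same query against an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for Postgres, the table is cut down to two payload columns, and the sample rows and the `repeat_with_recursive_cte` helper are invented for the test.

```python
import sqlite3


def repeat_with_recursive_cte(rows):
    """Repeat each (id, field2, repeat_cnt) row repeat_cnt times with a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE foo (id INTEGER PRIMARY KEY, field2 TEXT, repeat_cnt INTEGER)"
    )
    conn.executemany("INSERT INTO foo VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH RECURSIVE t(cnt, id, field2) AS (
            SELECT 1, id, field2 FROM foo            -- first copy of every row
            UNION ALL
            SELECT t.cnt + 1, t.id, t.field2         -- one more copy per step
            FROM t JOIN foo f ON t.id = f.id
            WHERE t.cnt < f.repeat_cnt               -- stop at repeat_cnt copies
        )
        SELECT id, field2 FROM t ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


print(repeat_with_recursive_cte([(1, "x", 3), (2, "y", 1)]))
```

Note that the anchor part always emits one copy, so a row with a repeat count of 0 would still show up once; that is fine here because the question guarantees values >= 1.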

The simplest way is making a simple select, like this:
SELECT generate_series(1,{xTimes}), a.field1, a.field2 FROM my_table a;

Related

UNION two SELECT queries but result set is smaller than one of them

In a SQL Server statement there is
SELECT id, book, acnt, prod, category from Table1 <where clause...>
UNION
SELECT id, book, acnt, prod, category from Table2 <where clause...>
The first query returned 131,972 rows; the second one, 147,692 rows. I didn't notice any rows shared between these two tables, so I expected the result set after UNION to have 131,972 + 147,692 = 279,664 rows.
However, the result set after UNION has 133,857 rows. Even if there were overlapping rows that I accidentally missed, the result should be at least as large as the larger of the two result sets. I can't figure out where the number 133,857 came from.
Is my understanding of SQL UNION correct? I use SQL Server in this case.
To expand on the comment given under the question, which I think states what you already know:
UNION removes duplicates from the combined result, including duplicates that occur within a single table.
Just take a look at an example:
SETUP:
create table tbl1 (col1 int, col2 int);
insert into tbl1 values
(1,2),
(3,4);
create table tbl2 (col1 int, col2 int);
insert into tbl2 values
(1,2),
(1,2),
(1,2),
(3,4);
Query
select * from tbl1
union
select * from tbl2;
will produce output
col1 | col2
-----|------
1 | 2
3 | 4
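To see the row counts side by side, the same setup can be run in SQLite through Python's sqlite3 module. SQLite is used here only as a convenient stand-in for SQL Server (UNION and UNION ALL behave the same way in this respect), and the `union_row_counts` helper is made up for the illustration.

```python
import sqlite3


def union_row_counts():
    """Return (rows after UNION, rows after UNION ALL) for the example tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tbl1 (col1 INT, col2 INT);
        INSERT INTO tbl1 VALUES (1,2),(3,4);
        CREATE TABLE tbl2 (col1 INT, col2 INT);
        INSERT INTO tbl2 VALUES (1,2),(1,2),(1,2),(3,4);
        """
    )
    counts = []
    for op in ("UNION", "UNION ALL"):
        # Count the rows produced by the compound query.
        query = (
            "SELECT COUNT(*) FROM "
            f"(SELECT * FROM tbl1 {op} SELECT * FROM tbl2)"
        )
        counts.append(conn.execute(query).fetchone()[0])
    conn.close()
    return tuple(counts)


distinct_rows, all_rows = union_row_counts()
print("UNION:", distinct_rows, "UNION ALL:", all_rows)
```

With 2 + 4 input rows, UNION ALL keeps all 6 while UNION keeps only the 2 distinct ones, which is how a UNION result can end up smaller than either input.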

How can I merge 2 partially overlapping strings using Apache Hive?

I have a field which holds a short list of ids of a fixed length.
e.g. aab:aac:ada:afg
The field is intended to hold at most 5 ids, growing gradually. I update it by adding from a similarly constructed field that may partially overlap with my existing set, e.g. ada:afg:fda:kfc.
The field expands when joined to an "update" table, as in the following example.
Here, id_list is the aforementioned list I want to "merge", and table_update is a table with new values I want to "merge" into table1.
insert overwrite table table1
select
id,
field1,
field2,
case
when (some condition) then a.id_list
else merge(a.id_list, b.id_list)
end as id_list
from table1 a
left join
table_update b
on a.id = b.id;
I'd like to produce a combined field with the following value:
aab:aac:ada:afg:fda.
The challenge is that I don't know whether or how much overlap the strings have until execution, and I cannot run any external code, or create UDFs.
Any suggestions how I could approach this?
Split the strings to get arrays, explode them, select existing UNION ALL new, and aggregate using collect_set, which produces an array of unique values; then concatenate the array into a string using concat_ws(). Not tested:
select concat_ws(':',collect_set(id))
from
(
select explode(split('aab:aac:ada:afg',':')) as id --existing
union all
select explode(split('ada:afg:fda:kfc',':')) as id --new
);
You can use UNION instead of UNION ALL to get distinct values before aggregating into an array. Or you can concatenate the new and existing strings into one, then do the same:
select concat_ws(':',collect_set(id))
from
(
select explode(split(concat('aab:aac:ada:afg',':','ada:afg:fda:kfc'),':')) as id --existing+new
);
Most probably you will need to use LATERAL VIEW with explode in the real query.
Update:
insert overwrite table table1
select
id,
field1,
field2,
concat_ws(':',collect_set(a.idl)) as id_list --kept last to match the column order of the original insert
from
(
select
a.id,
a.field1,
a.field2,
split(
case
when (some condition) then a.id_list
when b.id_list is null then a.id_list
else concat(a.id_list,':',b.id_list)
end,':') as id_list_array
from table1 a
left join table_update b on a.id = b.id
)s
LATERAL VIEW OUTER explode(id_list_array ) a AS idl
group by
id,
field1,
field2
;
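The split / concatenate / de-duplicate / join logic can be sanity-checked outside Hive with a few lines of Python. This is only an illustration of the intended result, not something to run in the cluster: the `merge_id_lists` helper is hypothetical, it keeps first-seen order (which, as far as I know, collect_set does not promise), and the `limit` of 5 is my reading of "at most 5 ids" in the question, which the Hive query above does not enforce.

```python
def merge_id_lists(existing, new, sep=":", limit=5):
    """Merge two sep-delimited id lists, dropping duplicates and keeping first-seen order."""
    parts = existing.split(sep) if existing else []
    if new:  # a missing update row (NULL in SQL) leaves the list unchanged
        parts += new.split(sep)
    # dict.fromkeys keeps insertion order, so this is an ordered de-duplication.
    unique = list(dict.fromkeys(p for p in parts if p))
    return sep.join(unique[:limit])


print(merge_id_lists("aab:aac:ada:afg", "ada:afg:fda:kfc"))
print(merge_id_lists("aab:aac", None))
```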

Numeric Overflow in Recursive Query : Teradata

I'm new to teradata. I want to insert numbers 1 to 1000 into the table test_seq, which is created as below.
create table test_seq(
seq_id integer
);
After searching on this site, I came up with a recursive query to insert the numbers.
insert into test_seq(seq_id)
with recursive cte(id) as (
select 1 from test_dual
union all
select id + 1 from cte
where id + 1 <= 1000
)
select id from cte;
test_dual is created as follows and it contains just a single value. (something like DUAL in Oracle)
create table test_dual(
test_dummy varchar(1)
);
insert into test_dual values ('X');
But, when I run the insert statement, I get the error, Failure 2616 Numeric overflow occurred during computation.
What did I do wrong here? Isn't the integer datatype enough to hold numeric value 1000?
Also, is there a way to write the query so that I can do away with the test_dual table?
When you simply write 1 the parser assigns the best matching datatype to it, which is a BYTEINT. The valid range of values for BYTEINT is -128 to 127, so just add a typecast to INT :-)
Usually you don't need a dummy DUAL table in Teradata, "SELECT 1;" is valid, but in some cases the parser still insists on a FROM (don't ask me why). This trick should work:
SEL * FROM (SELECT 1 AS x) AS dt;
You can create a view on this:
REPLACE VIEW oDUAL AS SELECT * FROM (SELECT 'X' AS dummy) AS dt;
The Explain output for "SELECT 1 FROM oDUAL;" is a bit stupid, so a real table might be better. But to get efficient access (= single AMP/single row) it must be defined as follows:
CREATE TABLE dual_tbl(
dummy VARCHAR(1) CHECK ( dummy = 'X')
) UNIQUE PRIMARY INDEX(dummy); -- i remember having fun when you inserted another row in Oracle's DUAL :_)
INSERT INTO dual_tbl VALUES ('X');
REPLACE VIEW oDUAL AS SELECT dummy FROM dual_tbl WHERE dummy = 'X';
insert into test_seq(seq_id)
with recursive cte(id) as (
select cast(1 as int) from oDUAL
union all
select id + 1 from cte
where id + 1 <= 1000
)
select id from cte;
But recursion is not an appropriate way to get a range of numbers, as it's sequential and always an "all-AMP step" even if the data resides on a single AMP like in this case.
If it's less than 73414 values (201 years) better use sys_calendar.calendar (or any other table with a known sequence of numbers) :
SELECT day_of_calendar
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 1000;
Otherwise use CROSS joins, e.g. to get numbers from 1 to 1,000,000:
WITH cte (i) AS
( SELECT day_of_calendar
FROM sys_calendar.CALENDAR
WHERE day_of_calendar BETWEEN 1 AND 1000
)
SELECT
(t2.i - 1) * 1000 + t1.i
FROM cte AS t1 CROSS JOIN cte AS t2;
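If the arithmetic in that cross join is not obvious, here is a small Python sketch of the same formula (plain Python, nothing Teradata-specific; the `cross_join_numbers` helper is made up for the illustration). Each pair (t1, t2) drawn from 1..n maps to (t2 - 1) * n + t1, so t2 picks a block of n numbers and t1 picks the position inside it.

```python
def cross_join_numbers(n):
    """Emulate the cross join of 1..n with itself, applying (t2 - 1) * n + t1."""
    base = range(1, n + 1)
    return [(t2 - 1) * n + t1 for t2 in base for t1 in base]


numbers = cross_join_numbers(1000)
# One value per pair, covering the whole range without gaps or repeats.
print(len(numbers), min(numbers), max(numbers), len(set(numbers)))
```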

Return id if a row exists, INSERT otherwise

I'm writing a function in node.js to query a PostgreSQL table.
If the row exists, I want to return the id column from the row.
If it doesn't exist, I want to insert it and return the id (insert into ... returning id).
I've been trying variations of case and if else statements and can't seem to get it to work.
A solution in a single SQL statement. Requires PostgreSQL 9.1 or later though, since it relies on a data-modifying CTE.
Consider the following demo:
Test setup:
CREATE TEMP TABLE tbl (
id serial PRIMARY KEY
,txt text UNIQUE -- obviously there is a unique column (or set of columns)
);
INSERT INTO tbl(txt) VALUES ('one'), ('two');
INSERT / SELECT command:
WITH v AS (SELECT 'three'::text AS txt)
,s AS (SELECT id FROM tbl JOIN v USING (txt))
,i AS (
INSERT INTO tbl (txt)
SELECT txt
FROM v
WHERE NOT EXISTS (SELECT * FROM s)
RETURNING id
)
SELECT id, 'i'::text AS src FROM i
UNION ALL
SELECT id, 's' FROM s;
The first CTE v is not strictly necessary, but it means you have to enter your values only once.
The second CTE s selects the id from tbl if the "row" exists.
The third CTE i inserts the "row" into tbl if (and only if) it does not exist, returning id.
The final SELECT returns the id. I added a column src indicating the "source" - whether the "row" pre-existed and id comes from a SELECT, or the "row" was new and so is the id.
This version should be as fast as possible as it does not need an additional SELECT from tbl and uses the CTEs instead.
To make this safe against possible race conditions in a multi-user environment, and for updated techniques using the new UPSERT in Postgres 9.5 or later, see:
Is SELECT or INSERT in a function prone to race conditions?
I would suggest doing the checking on the database side and just returning the id to nodejs.
Example:
CREATE OR REPLACE FUNCTION foo(p_param1 tableFoo.attr1%TYPE, p_param2 tableFoo.attr2%TYPE) RETURNS tableFoo.id%TYPE AS $$
DECLARE
v_id tableFoo.id%TYPE;
BEGIN
-- look for an existing row first
SELECT id
INTO v_id
FROM tableFoo
WHERE attr1 = p_param1
AND attr2 = p_param2;
-- nothing found: insert the row and capture its new id
IF v_id IS NULL THEN
INSERT INTO tableFoo(id, attr1, attr2) VALUES (DEFAULT, p_param1, p_param2)
RETURNING id INTO v_id;
END IF;
RETURN v_id;
END;
$$ LANGUAGE plpgsql;
And then on the Node.js side (I'm using node-postgres in this example):
var pg = require('pg');
pg.connect('someConnectionString', function(connErr, client, done){
// do some error checking on connErr here
// foo() returns a single value, so alias it as id
client.query('SELECT foo($1, $2) AS id;', ['foo', 'bar'], function(queryErr, result){
done(); // release the client back to the pool
// do some error checking on queryErr here
var id = result.rows[0].id;
});
});
Something like this, if you are on PostgreSQL 9.1 or later:
with test_insert as (
insert into foo (id, col1, col2)
select 42, 'Foo', 'Bar'
where not exists (select * from foo where id = 42)
returning foo.id, foo.col1, foo.col2
)
select id, col1, col2
from test_insert
union
select id, col1, col2
from foo
where id = 42;
It's a bit long and you need to repeat the id being tested several times, but I can't think of a different solution that involves a single SQL statement.
If a row with id=42 exists, the writeable CTE will not insert anything, and the existing row will be returned by the second part of the union.
When testing this I actually thought the new row would be returned twice (therefore a union, not a union all), but it turns out that the second select statement works from the state of the table before the statement started, so it does not see the newly inserted row. So when a new row is inserted, it is taken from the "returning" part.
create table t (
id serial primary key,
a integer
)
;
insert into t (a)
select 2
from (
select count(*) as s
from t
where a = 2
) s
where s.s = 0
;
select id
from t
where a = 2
;
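The two-statement variant just above (a conditional INSERT followed by a SELECT) is easy to try out with Python's built-in sqlite3 module. Treat this as a sketch only: SQLite stands in for PostgreSQL, the `get_or_create_id` helper is a made-up name, and without a unique constraint plus proper locking or an upsert it is not safe against two sessions inserting the same value at the same time.

```python
import sqlite3


def get_or_create_id(conn, a):
    """Insert a row with value a only if none exists, then return its id."""
    # The INSERT is a no-op when a matching row is already present.
    conn.execute(
        "INSERT INTO t (a) SELECT ? WHERE NOT EXISTS (SELECT 1 FROM t WHERE a = ?)",
        (a, a),
    )
    return conn.execute("SELECT id FROM t WHERE a = ?", (a,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, a INTEGER)")

first = get_or_create_id(conn, 2)   # row is inserted
second = get_or_create_id(conn, 2)  # existing row is reused
other = get_or_create_id(conn, 7)   # a different value gets a new id
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(first, second, other, row_count)
```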

Make SQL Select same row multiple times

I need to test my mail server. How can I make a Select statement
that selects say ID=5469 a thousand times.
If I get your meaning then a very simple way is to cross join on a derived query on a table with more than 1000 rows in it and put a top 1000 on that. This would duplicate your results 1000 times.
EDIT: As an example (This is MSSQL, I don't know if Access is much different)
SELECT
MyTable.*
FROM
MyTable
CROSS JOIN
(
SELECT TOP 1000
*
FROM
sysobjects
) [BigTable]
WHERE
MyTable.ID = 1234
You can use the UNION ALL statement.
Try something like:
SELECT * FROM tablename WHERE ID = 5469
UNION ALL
SELECT * FROM tablename WHERE ID = 5469
You'd have to repeat the SELECT statement a bunch of times but you could write a bit of VB code in Access to create a dynamic SQL statement and then execute it. Not pretty but it should work.
Create a helper table for this purpose:
JUST_NUMBER(NUM INT primary key)
Insert (with the help of some (VB) script) numbers from 1 to N. Then execute this unjoined query:
SELECT MYTABLE.*
FROM MYTABLE,
JUST_NUMBER
WHERE MYTABLE.ID = 5469
AND JUST_NUMBER.NUM <= 1000
Here's a way of using a recursive common table expression to generate some empty rows, then to cross join them back onto your desired row:
declare @myData table (val int) ;
insert @myData values (666),(888),(777) --some dummy data
;with cte as
(
select 100 as a
union all
select a-1 from cte where a>0
--generates 101 rows (100 down to 0), staying within the default max recursion depth of 100
)
,someRows as
(
select top 1000 0 a from cte,cte x1,cte x2
--xjoin the 101 rows a few times
--to generate 1030301 rows, then select top n rows
)
select m.* from @myData m,someRows where m.val=666
Substitute your real table for @myData, and alter the final predicate to suit.
An easy way...
Only one row like this exists in the DB:
sku = 52 , description = Skullcandy Inkd Green ,price = 50,00
Try joining the main table to another table that has no key constraint relating it to the main table.
Original Query
SELECT Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod WHERE Prod_SKU = N'52'
The Functional Query, adding an unrelated table called 'dbo.TB_Labels' (it must contain at least as many rows as the number of copies you want)
SELECT TOP ('times') Prod_SKU , Prod_Descr , Prod_Price FROM dbo.TB_Prod,dbo.TB_Labels WHERE Prod_SKU = N'52'
In Postgres there is a nice function called generate_series. So in PostgreSQL it is as simple as:
select information from test_table, generate_series(1, 1000) where id = 5469
This way, the matching row is returned 1000 times, once per generated number.
Example for postgreSQL:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; --To be able to use function uuid_generate_v4()
--Create a test table
create table test_table (
id serial not null,
uid UUID NOT NULL,
CONSTRAINT uid_pk PRIMARY KEY(id));
-- Insert 10000 rows
insert into test_table (uid)
select uuid_generate_v4() from generate_series(1, 10000);
-- Read the data from id=5469 one thousand times
select id, uid, uuid_generate_v4() from test_table, generate_series(1, 1000) where id = 5469;
As you can see in the result below, the data from uid is read 1000 times as confirmed by the generation of a new uuid at every new row.
id |uid |uuid_generate_v4
----------------------------------------------------------------------------------------
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5630cd0d-ee47-4d92-9ee3-b373ec04756f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"ed44b9cb-c57f-4a5b-ac9a-55bd57459c02"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"3428b3e3-3bb2-4e41-b2ca-baa3243024d9"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7c8faf33-b30c-4bfa-96c8-1313a4f6ce7c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"b589fd8a-fec2-4971-95e1-283a31443d73"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"8b9ab121-caa4-4015-83f5-0c2911a58640"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"7ef63128-b17c-4188-8056-c99035e16c11"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"5bdc7425-e14c-4c85-a25e-d99b27ae8b9f"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"9bbd260b-8b83-4fa5-9104-6fc3495f68f3"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"c1f759e1-c673-41ef-b009-51fed587353c"
5469|"10791df5-ab72-43b6-b0a5-6b128518e5ee"|"4a70bf2b-ddf5-4c42-9789-5e48e2aec441"
Of course other DBs won't necessarily have the same function, but it can be done there as well.
If you are doing this in SQL Server
declare @cnt int
set @cnt = 0
while @cnt < 1000
begin
select '12345'
set @cnt = @cnt + 1
end
select '12345' can be any expression
This may be another solution: repeat rows based on a column value of TestTable. First run the CREATE TABLE and INSERT statements, then run the following query for the desired result.
CREATE TABLE TestTable
(
ID INT IDENTITY(1,1),
Col1 varchar(10),
Repeats INT
)
INSERT INTO TestTable
VALUES ('A',2), ('B',4),('C',1),('D',0);
WITH x AS
(
SELECT TOP (SELECT MAX(Repeats)+1 FROM TestTable) rn = ROW_NUMBER()
OVER (ORDER BY [object_id])
FROM sys.all_columns
ORDER BY [object_id]
)
SELECT * FROM x
CROSS JOIN TestTable AS d
WHERE x.rn <= d.Repeats
ORDER BY Col1;
This trick helped me with my requirement.
Here, PRODUCTDETAILS is my table
and orderid is my column.
declare @Req_Rows int = 12
;WITH cte AS
(
SELECT 1 AS Number
UNION ALL
SELECT Number + 1 FROM cte WHERE Number < @Req_Rows
)
SELECT PRODUCTDETAILS.*
FROM cte, PRODUCTDETAILS
WHERE PRODUCTDETAILS.orderid = 3
create table #tmp1 (id int, fld varchar(max))
insert into #tmp1 (id, fld)
values (1,'hello!'),(2,'world'),(3,'nice day!')
select * from #tmp1
go
select * from #tmp1 where id=3
go 1000
drop table #tmp1
In SQL Server (from SSMS or sqlcmd, where GO accepts a repeat count), try:
print 'wow'
go 5
output:
Beginning execution loop
wow
wow
wow
wow
wow
Batch execution completed 5 times.
The easy way is to create a table with 1000 rows. Let's call it BigTable. Then you would query for the data you want and join it with the big table, like this:
SELECT MyTable.*
FROM MyTable, BigTable
WHERE MyTable.ID = 5469