Recursive SQL statement (PostgreSQL) - simplified version

This is a simplified version of a more complicated question posted here:
Recursive SQL statement (PostgreSQL 9.1.4)
Simplified question
Suppose you have an upper triangular matrix stored in 3 columns (RowIndex, ColumnIndex, MatrixValue):
    ColumnIndex
    1  2  3  4  5
1   2  2  3  3  4
2   4  4  5  6  X
3   3  2  2  X  X
4   2  1  X  X  X
5   1  X  X  X  X
X values are to be calculated using the following algorithm:
M[i,j] = (M[i-1,j]+M[i,j-1])/2
(i= rows, j = columns, M=matrix)
Example:
M[3,4] = (M[2,4]+M[3,3])/2
M[3,5] = (M[2,5]+M[3,4])/2
The full required result is:
    ColumnIndex
    1  2  3     4     5
1   2  2  3     3     4
2   4  4  5     6     5
3   3  2  2     4     4.5
4   2  1  1.5   2.75  3.625
5   1  1  1.25  2.00  2.8125
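The recurrence can be checked outside the database with a short Python sketch. It is only an illustration of the rule M[i,j] = (M[i-1,j]+M[i,j-1])/2 applied row by row; the names `known` and `fill_matrix` are made up for this example and are not part of the question.

```python
from fractions import Fraction

# Known cells of the 5x5 example, keyed by (row, column); values from the question.
known = {
    (1, 1): 2, (1, 2): 2, (1, 3): 3, (1, 4): 3, (1, 5): 4,
    (2, 1): 4, (2, 2): 4, (2, 3): 5, (2, 4): 6,
    (3, 1): 3, (3, 2): 2, (3, 3): 2,
    (4, 1): 2, (4, 2): 1,
    (5, 1): 1,
}

def fill_matrix(cells, size=5):
    """Return a copy of cells in which every missing cell is the mean of
    the cell above it and the cell to its left."""
    m = {pos: Fraction(v) for pos, v in cells.items()}
    for i in range(1, size + 1):          # rows, top to bottom
        for j in range(1, size + 1):      # columns, left to right
            if (i, j) not in m:
                # Both neighbours exist already because of the visiting order.
                m[(i, j)] = (m[(i - 1, j)] + m[(i, j - 1)]) / 2
    return m

filled = fill_matrix(known)
for i in range(1, 6):
    print(" ".join(str(float(filled[(i, j)])) for j in range(1, 6)))
```

Exact fractions are used so that the values can be compared with the required result without rounding noise.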
Sample data:
create table matrix_data (
RowIndex integer,
ColumnIndex integer,
MatrixValue numeric);
insert into matrix_data values (1,1,2);
insert into matrix_data values (1,2,2);
insert into matrix_data values (1,3,3);
insert into matrix_data values (1,4,3);
insert into matrix_data values (1,5,4);
insert into matrix_data values (2,1,4);
insert into matrix_data values (2,2,4);
insert into matrix_data values (2,3,5);
insert into matrix_data values (2,4,6);
insert into matrix_data values (3,1,3);
insert into matrix_data values (3,2,2);
insert into matrix_data values (3,3,2);
insert into matrix_data values (4,1,2);
insert into matrix_data values (4,2,1);
insert into matrix_data values (5,1,1);
Can this be done?

Test setup:
CREATE TEMP TABLE matrix (
rowindex integer,
columnindex integer,
matrixvalue numeric);
INSERT INTO matrix VALUES
(1,1,2),(1,2,2),(1,3,3),(1,4,3),(1,5,4)
,(2,1,4),(2,2,4),(2,3,5),(2,4,6)
,(3,1,3),(3,2,2),(3,3,2)
,(4,1,2),(4,2,1)
,(5,1,1);
Run INSERTs in a LOOP with DO:
DO $$
BEGIN
FOR i IN 2 .. 5 LOOP
FOR j IN 7-i .. 5 LOOP
INSERT INTO matrix
VALUES (i,j, (
SELECT sum(matrixvalue)/2
FROM matrix
WHERE (rowindex, columnindex) IN ((i-1, j),(i, j-1))
));
END LOOP;
END LOOP;
END;
$$;
See result:
SELECT * FROM matrix ORDER BY 1,2;

This can be done in a single SQL select statement, but only because recursion is not necessary. I'll outline the solution. If you actually want the SQL code, let me know.
First, notice that the only items that contribute to the sums are along the diagonal. Now, if we follow the contribution of the value "4" in (1, 5), it contributes 4/2 to (2,5) and 4/4 to (3,5) and 4/8 to (4,5). Each time, the contribution is cut in half, because (a+b)/2 is (a/2 + b/2).
When we extend this, we start to see a pattern similar to Pascal's triangle. In fact, for any given point in the lower triangular matrix (below where you have values), you can find the diagonal elements that contribute to the value. Extend a vertical line up to hit the diagonal and a horizontal line to hit the diagonal. Those are the contributors from the diagonal row.
How much do they contribute? Well, for that we can go to Pascal's triangle. For the first diagonal below where we have values, the contributions are (1,1)/2. For the second diagonal, (1,2,1)/4. For the third, (1,3,3,1)/8 . . . and so on.
Fortunately, we can calculate the contributions for each value using a formula (the "choose" function from combinatorics). The power of 2 is easy. And, determining how far a given cell is from the diagonal is not too hard.
All of this can be combined into a single Postgres SQL statement. However, @Erwin's solution also works. I only want to put the effort into debugging the statement if his solution doesn't meet your needs.
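The Pascal's triangle argument can be sketched in Python as follows. This is a hedged illustration, not the SQL statement mentioned above: it assumes the 5x5 example, where the known diagonal is the set of cells with row + column = 6, and the function name `closed_form` is invented for the sketch.

```python
from fractions import Fraction
from math import comb

# Known diagonal of the 5x5 example: cells with row + column == 6.
diagonal = {(1, 5): 4, (2, 4): 6, (3, 3): 2, (4, 2): 1, (5, 1): 1}

def closed_form(i, j, size=5):
    """Value of a cell below the known diagonal, written as a binomially
    weighted average of the diagonal cells it depends on."""
    d = i + j - (size + 1)                 # number of diagonals below the known one
    total = sum(
        comb(d, k) * diagonal[(i - d + k, j - k)]   # "choose" weight times diagonal value
        for k in range(d + 1)
    )
    return Fraction(total, 2 ** d)         # divide by the power of 2

print(closed_form(3, 5))   # (1*4 + 2*6 + 1*2) / 4
print(closed_form(5, 5))   # (1*4 + 4*6 + 6*2 + 4*1 + 1*1) / 16
```

For cell (5,5) the weights are (1,4,6,4,1)/16, which gives 45/16 = 2.8125, the same value as in the required result.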

... and here comes the recursive CTE with multiple embedded CTE's (tm):
DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;
CREATE TABLE matrix_data (
yyy integer,
xxx integer,
val numeric);
insert into matrix_data (yyy,xxx,val) values
(1,1,2) , (1,2,2) , (1,3,3) , (1,4,3) , (1,5,4)
, (2,1,4) , (2,2,4) , (2,3,5) , (2,4,6)
, (3,1,3) , (3,2,2) , (3,3,2)
, (4,1,2) , (4,2,1)
, (5,1,1)
;
WITH RECURSIVE rr AS (
WITH xx AS (
SELECT MIN(xxx) AS x0
, MAX(xxx) AS x1
FROM matrix_data
)
, mimax AS (
SELECT generate_series(xx.x0,xx.x1) AS xxx
FROM xx
)
, yy AS (
SELECT MIN(yyy) AS y0
, MAX(yyy) AS y1
FROM matrix_data
)
, mimay AS (
SELECT generate_series(yy.y0,yy.y1) AS yyy
FROM yy
)
, cart AS (
SELECT * FROM mimax mm
JOIN mimay my ON (1=1)
)
, empty AS (
SELECT * FROM cart ca
WHERE NOT EXISTS (
SELECT *
FROM matrix_data nx
WHERE nx.xxx = ca.xxx
AND nx.yyy = ca.yyy
)
)
, hot AS (
SELECT * FROM empty emp
WHERE EXISTS (
SELECT *
FROM matrix_data ex
WHERE ex.xxx = emp.xxx -1
AND ex.yyy = emp.yyy
)
AND EXISTS (
SELECT *
FROM matrix_data ex
WHERE ex.xxx = emp.xxx
AND ex.yyy = emp.yyy -1
)
)
-- UPDATE from here:
SELECT h.xxx,h.yyy, md.val / 2 AS val
FROM hot h
JOIN matrix_data md ON
(md.yyy = h.yyy AND md.xxx = h.xxx-1)
OR (md.yyy = h.yyy-1 AND md.xxx = h.xxx)
UNION ALL
SELECT e.xxx,e.yyy, r.val / 2 AS val
FROM empty e
JOIN rr r ON ( e.xxx = r.xxx+1 AND e.yyy = r.yyy)
OR ( e.xxx = r.xxx AND e.yyy = r.yyy+1 )
)
INSERT INTO matrix_data(yyy,xxx,val)
SELECT DISTINCT yyy,xxx
,SUM(val)
FROM rr
GROUP BY yyy,xxx
;
SELECT * FROM matrix_data
;
New result:
NOTICE: drop cascades to table tmp.matrix_data
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 15
INSERT 0 10
yyy | xxx | val
-----+-----+------------------------
1 | 1 | 2
1 | 2 | 2
1 | 3 | 3
1 | 4 | 3
1 | 5 | 4
2 | 1 | 4
2 | 2 | 4
2 | 3 | 5
2 | 4 | 6
3 | 1 | 3
3 | 2 | 2
3 | 3 | 2
4 | 1 | 2
4 | 2 | 1
5 | 1 | 1
2 | 5 | 5.0000000000000000
5 | 5 | 2.81250000000000000000
4 | 3 | 1.50000000000000000000
3 | 5 | 4.50000000000000000000
5 | 2 | 1.00000000000000000000
3 | 4 | 4.00000000000000000000
5 | 3 | 1.25000000000000000000
4 | 5 | 3.62500000000000000000
4 | 4 | 2.75000000000000000000
5 | 4 | 2.00000000000000000000
(25 rows)

-- T-SQL (SQL Server) syntax; fills one diagonal per iteration until cell (5,5) exists
while (select max(ColumnIndex+RowIndex) from matrix_data)<10
begin
insert matrix_data
select c1.RowIndex, c1.ColumnIndex+1, (c1.MatrixValue+c2.MatrixValue)/2
from matrix_data c1
inner join
matrix_data c2
on c1.ColumnIndex+1=c2.ColumnIndex and c1.RowIndex-1 = c2.RowIndex
where c1.RowIndex+c1.ColumnIndex=(select max(RowIndex+ColumnIndex) from matrix_data)
and c1.ColumnIndex<5
end

Related

get the nth-lowest value in a `group by` clause

Here's a tough one: I have data coming back in a temporary table foo in this form:
id n v
-- - -
1 3 1
1 3 10
1 3 100
1 3 201
1 3 300
2 1 13
2 1 21
2 1 300
4 2 1
4 2 7
4 2 19
4 2 21
4 2 300
8 1 11
Grouping by id, I need to get the row with the nth-lowest value for v based on the value in n. For example, for the group with an ID of 1, I need to get the row which has v equal to 100, since 100 is the third-lowest value for v.
Here's what the final results need to look like:
id n v
-- - -
1 3 100
2 1 13
4 2 7
8 1 11
Some notes about the data:
the number of rows for each ID may vary
n will always be the same for every row with a given ID
n for a given ID will never be greater than the number of rows with that ID
the data will already be sorted by id, then v
Bonus points if you can do it in generic SQL instead of oracle-specific stuff, but that's not a requirement (I suspect that rownum may factor prominently in any solutions). It has in my attempts, but I wind up confusing myself before I get a working solution.
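I would use the row_number() window function in a CTE to number the rows within each (id, n) group and keep only the rows whose row number is at most n. A second CTE then numbers those rows by v descending.
The row with rn = 1 is the largest of the first n values in the group, which is the nth-lowest value.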
CREATE TABLE foo(
id int,
n int,
v int
);
insert into foo values (1,3,1);
insert into foo values (1,3,10);
insert into foo values (1,3,100);
insert into foo values (1,3,201);
insert into foo values (1,3,300);
insert into foo values (2,1,13);
insert into foo values (2,1,21);
insert into foo values (2,1,300);
insert into foo values (4,2,1);
insert into foo values (4,2,7);
insert into foo values (4,2,19);
insert into foo values (4,2,21);
insert into foo values (4,2,300);
insert into foo values (8,1,11);
Query 1:
with cte as(
select id,n,v
from
(
select t.*, row_number() over(partition by id ,n order by v) as rn
from foo t
) t1
where rn <= n
), maxcte as (
select id,n,v, row_number() over(partition by id ,n order by v desc) rn
from cte
)
select id,n,v
from maxcte
where rn = 1
Results:
| ID | N | V |
|----|---|-----|
| 1 | 3 | 100 |
| 2 | 1 | 13 |
| 4 | 2 | 7 |
| 8 | 1 | 11 |
Use a window function:
select * from
(
select t.*, row_number() over(partition by id ,n order by v) as rn
from foo t
) t1
where t1.rn=t1.n
The condition t1.rn = t1.n picks the nth-lowest v in each group, as the description requires (for id 1 that is the third-lowest value, 100).
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=65abf8d4101d2d1802c1a05ed82c9064
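The row-number approach can be mirrored in plain Python to make the logic explicit. This is only a sketch of the idea (sort by id and v, number the rows per id, keep the row whose number equals n); the function name `nth_lowest` is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# (id, n, v) rows from the question.
rows = [
    (1, 3, 1), (1, 3, 10), (1, 3, 100), (1, 3, 201), (1, 3, 300),
    (2, 1, 13), (2, 1, 21), (2, 1, 300),
    (4, 2, 1), (4, 2, 7), (4, 2, 19), (4, 2, 21), (4, 2, 300),
    (8, 1, 11),
]

def nth_lowest(rows):
    """For each id, return the row whose v is the n-th lowest in that group."""
    result = []
    ordered = sorted(rows, key=itemgetter(0, 2))        # by id, then v
    for _, group in groupby(ordered, key=itemgetter(0)):
        for rn, row in enumerate(group, start=1):       # rn plays the role of ROW_NUMBER()
            if rn == row[1]:                            # keep the row where rn = n
                result.append(row)
    return result

print(nth_lowest(rows))
```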
If your database is version 12.1 or higher then there is a much simpler solution:
SELECT DISTINCT ID, n, NTH_VALUE(v,n) OVER (PARTITION BY ID) AS v
FROM foo
ORDER BY ID;
| ID | N | V |
|----|---|-----|
| 1 | 3 | 100 |
| 2 | 1 | 13 |
| 4 | 2 | 7 |
| 8 | 1 | 11 |
Depending on your real data you may have to add an ORDER BY v clause and/or a windowing_clause such as RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING; see the NTH_VALUE documentation.

Teradata - Counting previous values

I am trying to count previous values in a query as an intermediate step to accomplish another task. I want to count the previous occurrences of the value 3,
for example
Type value
A 3
A 3
A 3
A 3
A 3
A 3
A 3
B 2.3
B 2.3
B 3
B 2.3
B 2.3
B 3
B 2.3
and my ideal answers would be
Type value Previous 3's
A 3 0
A 3 1
A 3 2
A 3 3
A 3 4
A 3 5
A 3 6
B 2.3 7
B 2.3 7
B 3 7
B 2.3 8
B 2.3 8
B 3 8
B 2.3 9
How would I achieve this in Teradata or SQL?
SQL tables represent unordered sets. To count previous values, you need a column that specifies the ordering, and you don't have one in your question.
You can use a cumulative count or sum:
select t.*,
count(case when value = 3 then 1 end) over
(order by ? rows between unbounded preceding and 1 preceding)
from t;
The ? is for the column specifying the ordering.
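What the window frame "unbounded preceding to 1 preceding" computes can be sketched in Python. The sketch assumes the list position stands in for the missing ordering column, which is an assumption made only for illustration; `previous_threes` is a made-up name.

```python
from decimal import Decimal

# Values in the order shown in the question; the list position stands in for
# the ordering column that the table itself is missing (an assumption).
values = ["3"] * 7 + ["2.3", "2.3", "3", "2.3", "2.3", "3", "2.3"]

def previous_threes(values):
    """For each row, count how many earlier rows have the value 3."""
    counts = []
    seen = 0
    for v in values:
        counts.append(seen)           # only rows strictly before the current one
        if Decimal(v) == 3:
            seen += 1
    return counts

print(previous_threes(values))
```

The current row is excluded from its own count, which is why the first row gets 0.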
You can achieve this at least in MySQL:
create table teadata(Type varchar(1), value float);
insert into teadata(Type, value) values('A', 3);
insert into teadata(Type, value) values('A', 3);
insert into teadata(Type, value) values('A', 3);
insert into teadata(Type, value) values('B', 3);
insert into teadata(Type, value) values('B', 2.3);
select type, value, (@sum := @sum + (case when value = 3 then 1 else 0 end)) as cumesum
from teadata t cross join (select @sum := 0) params;
This will print:
+------+-------+---------+
| type | value | cumesum |
+------+-------+---------+
| A | 3 | 1 |
| A | 3 | 2 |
| A | 3 | 3 |
| B | 3 | 4 |
| B | 2.3 | 4 |
+------+-------+---------+
The trick in this case is to use a user variable @sum together with a CASE expression. This works on MySQL. Not sure about Teradata.

Juggling the values of a column in oracle

In a table tab I have a column with the name of col1 and it has 5 rows with values 1 to 5.
col1
1
2
3
4
5
Now I want to write a select query that will juggle (shuffle) the values in col1 and put the shuffled values in a new column.
The output below should help you understand my requirement.
col1 New_col
1 3
2 5
3 4
4 1
5 2
Note: If 1 is changed to 3, then no other value in col1 should result in 3 after juggling. I have to do this for 500 rows; I am using a small example for better understanding.
Please let me know if you require further clarification.
This is a step by step approach:
Try it at SQL Fiddle
Oracle 11g R2 Schema Setup:
create table t ( i int );
insert into t values (1);
insert into t values (2);
insert into t values (3);
insert into t values (4);
insert into t values (5);
Step by step query:
with
/*add a random column to shuffle*/
a as
( select i, dbms_random.value as o
from t),
/*get last element to pair it with the first*/
b as
( select i,
o,
last_Value(i) over (ORDER BY o asc
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS i2
from a)
/*pair each element with the previous one in the random order, the first one takes the last one as default*/
select i, LAG(i, 1, i2 ) OVER (ORDER BY o ) AS i3
from b
Results:
| I | I3 |
|---|----|
| 2 | 5 |
| 1 | 2 |
| 3 | 1 |
| 4 | 3 |
| 5 | 4 |
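The same idea (shuffle, then pair each element with its predecessor, wrapping the first around to the last) can be sketched in Python. This mirrors the logic of the query rather than translating it; the function name `juggle` and the seeded generator are assumptions for the example.

```python
import random

def juggle(values, rng=random):
    """Map every value to another value by shuffling and pairing each element
    with its predecessor in the shuffled order (the first pairs with the last)."""
    order = list(values)
    rng.shuffle(order)
    # order[k - 1] with k == 0 wraps around to the last element, like the LAG default.
    return {order[k]: order[k - 1] for k in range(len(order))}

mapping = juggle([1, 2, 3, 4, 5], random.Random(0))
print(mapping)
```

Because the pairing is one cycle through all elements, every value is used exactly once as a target, and with at least two distinct values none is mapped to itself.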
What about this?
SELECT row_number() over (order by 1) col, col1 new_col
FROM tab
ORDER BY DBMS_RANDOM.VALUE
demo

Duplicate selected row in Vertica Database

I have been asked to duplicate a specific row in a table. For this I used a simple SQL statement:
insert into xyz_tablename(x,y) select * from xyz_tablename where x = 'something';
However, this statement copies every row where x = 'something', which just doubles the selected rows.
What I want is to control, via a counter, the number of copies that are made. Is there any function/procedure for this in Vertica?
What I have done so far:
Studied functions (I understand them, but I cannot use them for this problem).
Studied procedures (I have read about them, but cannot understand how to make that bash file).
Learnt that there are no for or while loops in Vertica.
Can anyone help me with this problem? I hope I am clear. Let me know if I am missing something. Thanks in advance.
Try the following:
create table mystore.xyz_tablename(
x varchar (10)
,y INT
)
;
INSERT INTO mystore.xyz_tablename VALUES('abc' ,1);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,2);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,3);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,4);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,5);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,6);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,7);
INSERT INTO mystore.xyz_tablename VALUES('abc' ,8);
select * from mystore.xyz_tablename;
mystore_owner=> select * from mystore.xyz_tablename;
x | y
-----+---
abc | 1
abc | 2
abc | 3
abc | 4
abc | 5
abc | 6
abc | 7
abc | 8
(8 rows)
INSERT INTO mystore.xyz_tablename(x,y)
SELECT a.*
FROM (SELECT * from mystore.xyz_tablename where y = 8 LIMIT 1) a
INNER JOIN (SELECT y FROM mystore.xyz_tablename LIMIT 5) b
ON (1=1)
;
OUTPUT
--------
5
(1 row)
mystore_owner=> select * from mystore.xyz_tablename;
x | y
-----+---
abc | 1
abc | 2
abc | 3
abc | 4
abc | 5
abc | 6
abc | 7
abc | 8
abc | 8
abc | 8
abc | 8
abc | 8
abc | 8
(13 rows)
Let us know if this works for your requirement.
The number of copies is controlled by the LIMIT clause, currently 5; change it to whatever you need. You can also select from another table that has more rows. If the table behind the LIMIT subquery has fewer rows than the limit, you will get fewer copies than requested.
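The cross-join trick can be pictured with a small Python sketch: one selected row is paired with a "counter" of N items, which yields N copies. This is an illustration of the idea only, not Vertica code; `duplicate` and its parameters are hypothetical names.

```python
from itertools import product

table = [("abc", y) for y in range(1, 9)]       # the eight sample rows

def duplicate(table, predicate, copies):
    """Return table plus `copies` extra copies of the first row matching predicate."""
    source = [row for row in table if predicate(row)][:1]    # ... WHERE ... LIMIT 1
    counter = range(copies)                                   # stands in for the LIMIT 5 subquery
    extra = [row for row, _ in product(source, counter)]      # cross join: 1 row x N rows
    return table + extra

result = duplicate(table, lambda row: row[1] == 8, 5)
print(len(result))
```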

How to insert into separate tables result of aggregate SQL query

I have a table with an index and I am executing an aggregate SQL query using SUM.
You can see what I am doing here in sqlfiddle.
Create table TX (
i int NOT NULL PRIMARY KEY,
x1 DECIMAL(7,3),
x2 DECIMAL(7,3),
x3 DECIMAL(7,3)
);
INSERT INTO TX (i,x1,x2,x3) values
(1,5, 6,6) ;
INSERT INTO TX (i,x1,x2,x3) values
(2,6, 7, 5);
INSERT INTO TX (i,x1,x2,x3) values
(3,5, 6, 7) ;
INSERT INTO TX (i,x1,x2,x3) values
(4,6, 7, 4);
My question is: how can I insert the results of this query into 3 different tables?
SELECT SUM(1),
SUM(x1),SUM(x2),SUM(x3),
SUM(x1*x1),
SUM(x2*x1),SUM(x2*x2),
SUM(x3*x1),SUM(x3*x2),SUM(x3*x3)
FROM TX
So how can I get something like
Sum(1)
-----
n
index Sums
------------
1 4
2 22
3 26
index1 index2 Mult
----------------------
1 1 122
2 1 144
2 2 170
3 1 119
3 2 141
3 3 126
Instead of
SUM(1) SUM(X1) SUM(X2) SUM(X3) SUM(X1*X1) SUM(X2*X1) SUM(X2*X2) SUM(X3*X1) SUM(X3*X2) SUM(X3*X3)
_____________________________________________________________________________________________________
4 22 26 22 122 144 170 119 141 126
Run 3 separate queries. Turning the SELECTs into INSERTs depends on the RDBMS. For SQL Server, it's just adding an INTO newTableName before the FROM clause to create a new one, or INSERT INTO existingTableName before the SELECT statement.
Create table TX (
i int NOT NULL PRIMARY KEY,
x1 DECIMAL(7,3),
x2 DECIMAL(7,3),
x3 DECIMAL(7,3)
);
INSERT INTO TX (i,x1,x2,x3) values
(1,5, 6,6) ;
INSERT INTO TX (i,x1,x2,x3) values
(2,6, 7, 5);
INSERT INTO TX (i,x1,x2,x3) values
(3,5, 6, 7) ;
INSERT INTO TX (i,x1,x2,x3) values
(4,6, 7, 4);
Query 1:
SELECT COUNT(*) AS SUM1
FROM TX
Results:
| SUM1 |
--------
| 4 |
Query 2:
SELECT SUM(X1) index1, SUM(X2) sums
FROM TX
Results:
| INDEX1 | SUMS |
-----------------
| 22 | 26 |
Query 3:
SELECT x.index1,
x.index2,
case x.id
when 1 then SUM(x1*x1)
when 2 then SUM(x2*x1)
when 3 then SUM(x2*x2)
when 4 then SUM(x3*x1)
when 5 then SUM(x3*x2)
when 6 then SUM(x3*x3)
end Mult
FROM TX
CROSS JOIN
(select 1 id, 1 index1, 1 index2 union all
select 2 id, 2 index1, 1 index2 union all
select 3 id, 2 index1, 2 index2 union all
select 4 id, 3 index1, 1 index2 union all
select 5 id, 3 index1, 2 index2 union all
select 6 id, 3 index1, 3 index2) x
GROUP BY x.id, x.index1, x.index2
ORDER BY x.id
Results:
| INDEX1 | INDEX2 | MULT |
--------------------------
| 1 | 1 | 122 |
| 2 | 1 | 144 |
| 2 | 2 | 170 |
| 3 | 1 | 119 |
| 3 | 2 | 141 |
| 3 | 3 | 126 |
SELECT SUM(1)
FROM TX;
SELECT 1, SUM(x1)
FROM TX
UNION ALL
SELECT 2, SUM(x2)
FROM TX
UNION ALL
SELECT 3, SUM(x3)
FROM TX;
SELECT a.x i1, b.x i2, SUM(a.s * b.s)
FROM
(
SELECT i, 1 x, x1 s
FROM TX
UNION ALL
SELECT i, 2 x, x2 s
FROM TX
UNION ALL
SELECT i, 3 x, x3 s
FROM TX
) a
INNER JOIN
(
SELECT i, 1 x, x1 s
FROM TX
UNION ALL
SELECT i, 2 x, x2 s
FROM TX
UNION ALL
SELECT i, 3 x, x3 s
FROM TX
) b ON a.i = b.i AND a.x >= b.x
GROUP BY a.x, b.x;
SQL Fiddle using your data - Note that your data's sums (second query) do not match those in your question. I trust this is a typo.
Notice I got a bit lazy with the third query. Instead of writing out the expansion I flattened the table first and joined it on itself.
Also note that in the first query SUM(1) can be replaced with COUNT(*).
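The flatten-then-self-join idea can be checked with a short Python sketch that uses the sample rows from the question. It is an illustration of the technique rather than a replacement for the SQL; the names `flat`, `sums` and `mult` are invented here.

```python
rows = {1: (5, 6, 6), 2: (6, 7, 5), 3: (5, 6, 7), 4: (6, 7, 4)}   # i -> (x1, x2, x3)

# Flatten to (i, column index, value), like the UNION ALL subquery.
flat = [(i, x, s) for i, cols in rows.items() for x, s in enumerate(cols, start=1)]

count = len(rows)          # first result: the row count
sums = {}                  # second result: column index -> sum
mult = {}                  # third result: (index1, index2) -> sum of products

for i, x, s in flat:
    sums[x] = sums.get(x, 0) + s

# Self-join on the same row i, keeping only pairs with a.x >= b.x.
for ia, xa, sa in flat:
    for ib, xb, sb in flat:
        if ia == ib and xa >= xb:
            mult[(xa, xb)] = mult.get((xa, xb), 0) + sa * sb

print(count, sums, mult)
```

With the sample data the column sums come out as 22, 26 and 22, which is consistent with the remark that the sums listed in the question contain a typo.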