Dividing each value of a column by the column's total sum - SQL

I have the following table in my database. I am currently using Oracle 11g. The data looks like this:

id   value
1    100
2    200
3    300     -- total = 600

I want to derive a new column by dividing each value in the "value" column by the total sum of that column, then load the result into another table. The data in the other table should look like:

id   value   derived_col
1    100     100/600
2    200     200/600
3    300     300/600

Thanks.

SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE data ( id, value ) AS
SELECT 1, 100 FROM DUAL
UNION ALL SELECT 2, 200 FROM DUAL
UNION ALL SELECT 3, 300 FROM DUAL;
CREATE TABLE derived_data AS
SELECT id,
       value,
       value / SUM(value) OVER ( ORDER BY NULL ) AS derived_col
FROM   data;
Or if the derived_data table already exists then you can do:
INSERT INTO derived_data
SELECT id,
       value,
       value / SUM(value) OVER ( ORDER BY NULL ) AS derived_col
FROM   data;
Query 1:
SELECT * FROM derived_data
Results:
| ID | VALUE | DERIVED_COL    |
|----|-------|----------------|
|  1 |   100 | 0.166666666667 |
|  2 |   200 | 0.333333333333 |
|  3 |   300 | 0.5            |
Or if you want the derived_col as a string:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE data ( id, value ) AS
SELECT 1, 100 FROM DUAL
UNION ALL SELECT 2, 200 FROM DUAL
UNION ALL SELECT 3, 300 FROM DUAL;
CREATE TABLE derived_data AS
SELECT id,
       value,
       value || '/' || SUM(value) OVER ( ORDER BY NULL ) AS derived_col
FROM   data;
Query 1:
SELECT * FROM derived_data
Results:
| ID | VALUE | DERIVED_COL |
|----|-------|-------------|
|  1 |   100 | 100/600     |
|  2 |   200 | 200/600     |
|  3 |   300 | 300/600     |

Assuming your table already exists, you want to use an INSERT INTO new_table SELECT to insert the data in the derived table based on a query. For the insertion query to perform the division, it needs two subqueries:
query the sum of the values
query the (id,value) pair
Because the sum of the values is a single value, constant for all rows, you can then combine these subqueries with a CROSS JOIN, which needs no join condition:

INSERT INTO derived_table
SELECT ot.id    AS id,
       ot.value AS value,
       CAST(ot.value AS FLOAT) / summed.total AS derived_col
FROM   orig_table ot
CROSS JOIN ( SELECT SUM(value) AS total FROM orig_table ) summed;
The CAST(ot.value AS FLOAT) is necessary if value is a column of integers. Otherwise, your division will be integer division and all of the derived values will be zero.
There is no join condition here because the summed total is a single value that applies to every row of orig_table. If you want to apply different divisors to different rows, you will need a more complicated subquery and an appropriate join condition.
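The divide-by-the-total pattern can be exercised end to end with SQLite via Python's sqlite3 module. This is an illustrative sketch, not the Oracle answer itself; table and column names mirror the answer above, and the CAST avoids integer division as noted.

```python
import sqlite3

# In-memory SQLite database standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orig_table (id INTEGER, value INTEGER);
    INSERT INTO orig_table VALUES (1, 100), (2, 200), (3, 300);
    CREATE TABLE derived_table (id INTEGER, value INTEGER, derived_col REAL);
""")

# Cross join the rows against the single-row sum, dividing each value
# by the total; CAST to REAL forces floating-point division.
conn.execute("""
    INSERT INTO derived_table
    SELECT ot.id,
           ot.value,
           CAST(ot.value AS REAL) / summed.total
    FROM   orig_table ot
    CROSS JOIN (SELECT SUM(value) AS total FROM orig_table) summed
""")

rows = conn.execute("SELECT * FROM derived_table ORDER BY id").fetchall()
print(rows)
```

Each derived value is the row's share of the 600 total, matching the 100/600, 200/600, 300/600 fractions in the question.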


How to convert JSONB array of pair values to rows and columns?

Given that I have a jsonb column with an array of pair values:
[1001, 1, 1002, 2, 1003, 3]
I want to turn each pair into a row, with each pair values as columns:
| a | b |
|------|---|
| 1001 | 1 |
| 1002 | 2 |
| 1003 | 3 |
Is something like that even possible in an efficient way?
I found a few inefficient (slow) ways, like using LEAD(), or joining the same table with the value from next row, but queries take ~ 10 minutes.
DDL:
CREATE TABLE products (
    id   int   not null,
    data jsonb not null
);
INSERT INTO products VALUES (1, '[1001, 1, 1002, 2, 1003, 3]');
DB Fiddle: https://www.db-fiddle.com/f/2QnNKmBqxF2FB9XJdJ55SZ/0
Thanks!
This is not an elegant approach from a declarative standpoint, but can you please see whether this performs better for you?
with indexes as (
    select id, generate_series(1, jsonb_array_length(data) / 2) - 1 as idx
    from products
)
select p.id, p.data->>(2 * i.idx) as a, p.data->>(2 * i.idx + 1) as b
from indexes i
join products p on p.id = i.id;
This query:

SELECT j.data
FROM products
CROSS JOIN jsonb_array_elements(data) j(data)

should run faster if you just need to unpivot all elements within the query.
Or even drop the columns coming from the products table:
SELECT jsonb_array_elements(data)
FROM products
Or, if you need the result shaped like this,
| a | b |
|------|---|
| 1001 | 1 |
| 1002 | 2 |
| 1003 | 3 |
i.e. unpivoted into two columns, then use:

SELECT MAX(CASE WHEN mod(rn, 2) = 1 THEN data->>(rn - 1)::int END) AS a,
       MAX(CASE WHEN mod(rn, 2) = 0 THEN data->>(rn - 1)::int END) AS b
FROM  (SELECT p.data, row_number() over () as rn
       FROM products p
       CROSS JOIN jsonb_array_elements(data) j(data)) q
GROUP BY ceil(rn / 2::float)
ORDER BY ceil(rn / 2::float)
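The index arithmetic behind that grouping, pairing element 2k with element 2k+1, can be sketched in plain Python (used here purely as an illustration of the pairing step):

```python
import json

# The flat JSONB array from the question: values at even positions are
# the "a" column, values at odd positions are the "b" column.
data = json.loads("[1001, 1, 1002, 2, 1003, 3]")

# Walk the array two elements at a time, exactly what the
# GROUP BY ceil(rn/2) trick achieves in SQL.
pairs = [(data[i], data[i + 1]) for i in range(0, len(data), 2)]
print(pairs)  # [(1001, 1), (1002, 2), (1003, 3)]
```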

Select data using conditions between 2 tables which are not linked

I have a large select with a lot of inner joins.
In the select I have an array_agg function for one set of data.
This array contains only one column of a table, but now I want to append, at the end of the array, data from another table. The data I need to add is not directly linked to the table the aggregated column comes from.
Query example:
select
    origin_table.x,
    origin_table.y,
    array_agg(table1.data) ...
from
    origin_table
    inner join ... inner join ...
    full join table1 on table1.origin_table_id = origin_table.id ...
group by
    ...
Result array:
ID 1: example_data, {baba, bobo}
ID 2: example_data, {bibi, bubu}
Example of my tables:
table 1:
id | data | origin_table_id
----+---------+----------
1 | baba | 1
2 | bobo | 1
3 | bibi | 2
4 | bubu | 2
table 2:
id | data_bis
---+---------
1 | byby
2 | bebe
origin table:
id | table2_id
---+----------
1 | 2
2 | 1
Expected result with the 3 tables:
ID 1: example_data, {baba, bobo, bebe}
ID 2: example_data, {bibi, bubu, byby}
But got :
ID 1: example_data, {baba, bobo, bebe, byby}
ID 2: example_data, {bibi, bubu, bebe, byby}
What I need is: how to keep all the rows of table 1 that satisfy the condition, and append to them only the single matching table 2 value, not every element of that table.
Try the query below.
create table tab1 (id integer, data character varying, origin_table_id integer);

insert into tab1
select 1, 'baba', 1
union all
select 2, 'bobo', 1
union all
select 3, 'bibi', 2
union all
select 4, 'bubu', 2;

create table tab2 (id integer, data_bis character varying);

insert into tab2
select 1, 'byby'
union all
select 2, 'bebe';

create table OriginalTable (id integer, table2_id integer);

insert into OriginalTable
select 1, 2
union all
select 2, 1;

select origin_table_id, data
from tab1
union all
select OriginalTable.id, data_bis
from OriginalTable
join tab2 on tab2.id = OriginalTable.table2_id
order by origin_table_id;
Result:
1;"baba"
1;"bobo"
1;"bebe"
2;"bibi"
2;"bubu"
2;"byby"
I can give you an idea and sample code to achieve your requirement.
First you can UNION ALL tables 'table 1' and 'table 2' using the relation table 'origin table':

WITH CTE
AS
(
    SELECT AA.id, AA.data AS data_bis, AA.origin_table_id
    FROM table_1 AA
    UNION ALL
    SELECT NULL AS id, A.data_bis, B.id AS origin_table_id
    FROM table_2 A
    INNER JOIN origin_table B
        ON A.id = B.table2_id
)
SELECT * FROM CTE
After applying the UNION ALL, the data will look like this:

data_bis   origin_table_id
baba       1
bobo       1
bibi       2
bubu       2
byby       2
bebe       1
Now you can apply string_agg to the data:

SELECT origin_table_id, string_agg(DISTINCT data_bis, ',')
FROM CTE
GROUP BY origin_table_id
And the output will be:

origin_table_id   string_agg
1                 baba,bebe,bobo
2                 bibi,bubu,byby
Now you can apply further joins to this data as per your requirement.
Please keep in mind that this is not an exact solution to your issue, just the idea.
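The UNION ALL-then-aggregate idea can be verified with SQLite via Python's sqlite3 module; this is a sketch only, with string_agg replaced by SQLite's group_concat and the table names taken from the demo script above.

```python
import sqlite3

# Rebuild the three demo tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (id INTEGER, data TEXT, origin_table_id INTEGER);
    INSERT INTO tab1 VALUES (1,'baba',1), (2,'bobo',1), (3,'bibi',2), (4,'bubu',2);
    CREATE TABLE tab2 (id INTEGER, data_bis TEXT);
    INSERT INTO tab2 VALUES (1,'byby'), (2,'bebe');
    CREATE TABLE origin_table (id INTEGER, table2_id INTEGER);
    INSERT INTO origin_table VALUES (1,2), (2,1);
""")

# UNION ALL the tab1 values with the tab2 value resolved through
# origin_table, then aggregate per origin_table_id.
rows = conn.execute("""
    WITH cte AS (
        SELECT data AS data_bis, origin_table_id FROM tab1
        UNION ALL
        SELECT t2.data_bis, o.id
        FROM tab2 t2 JOIN origin_table o ON t2.id = o.table2_id
    )
    SELECT origin_table_id, group_concat(data_bis, ',')
    FROM cte
    GROUP BY origin_table_id
    ORDER BY origin_table_id
""").fetchall()
print(rows)
```

ID 1 collects {baba, bobo, bebe} and ID 2 collects {bibi, bubu, byby}, matching the expected result in the question.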

Join Postgresql array to table

I have following tables
create table top100
(
id integer not null,
top100ids integer[] not null
);
create table top100_data
(
id integer not null,
data_string text not null
);
Rows in table top100 look like:
1, {1,2,3,4,5,6...100}
Rows in table top100_data look like:
1, 'string of text, up to 500 chars'
I need to get the text values from table top100_data and join them with table top100.
So the result will be:
1, {'text1','text2','text3',...'text100'}
I am currently doing this on the application side: selecting from top100, iterating over all the array items, then selecting from top100_data and transforming the ids into their _data text values.
This can be very slow on large data sets.
Is is possible to get this same result with single SQL query?
You can unnest() and re-aggregate:
select t100.id, array_agg(t100d.data_string order by top100id)
from top100 t100 cross join
     unnest(top100ids) as top100id join
     top100_data t100d
     on t100d.id = top100id
group by t100.id;
Or if you want to keep the original ordering:
select t100.id, array_agg(t100d.data_string order by top100id.n)
from top100 t100 cross join
     unnest(top100ids) with ordinality as top100id(id, n) join
     top100_data t100d
     on t100d.id = top100id.id
group by t100.id;
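The unnest-and-reaggregate step, sketched in plain Python: expand the id array, look each id up in the data table, and collect the texts back into a list in array order (the WITH ORDINALITY variant). The sample data here is hypothetical.

```python
# top100: row id -> array of ids; top100_data: id -> text.
top100 = {1: [3, 1, 2]}
top100_data = {1: "text1", 2: "text2", 3: "text3"}

# "unnest" each array, map ids to their text, "array_agg" back,
# preserving the original array ordering.
result = {row_id: [top100_data[i] for i in ids]
          for row_id, ids in top100.items()}
print(result)  # {1: ['text3', 'text1', 'text2']}
```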
Just use the unnest and array_agg functions in PostgreSQL; your final SQL could look like this:

with core as (
    select
        id,
        unnest(top100ids) as top_id
    from
        top100
)
select
    c.id,
    array_agg(t1.data_string) as text_datas
from
    top100_data t1
join
    core c on t1.id = c.top_id
group by
    c.id;
The example of unnest as below:
postgres=# select * from my_test;
id | top_ids
----+--------------
1 | {1,2,3,4,5}
2 | {6,7,8,9,10}
(2 rows)
postgres=# select id, unnest(top_ids) from my_test;
id | unnest
----+--------
1 | 1
1 | 2
1 | 3
1 | 4
1 | 5
2 | 6
2 | 7
2 | 8
2 | 9
2 | 10
(10 rows)
The example of array_agg as below:
postgres=# select * from my_test_1 ;
id | content
----+---------
1 | a
1 | b
1 | c
1 | d
2 | x
2 | y
(6 rows)
postgres=# select id,array_agg(content) from my_test_1 group by id;
id | array_agg
----+-----------
1 | {a,b,c,d}
2 | {x,y}
(2 rows)

SQL inner join using multiple in statements on single table

Having a bit of trouble with an SQL query I am trying to create. The table format is as follows,
ID | Data Identifier | Date Added | Data Column
1 | 1001 | 15400 | Newest Value
1 | 1001 | 15000 | Oldest Value
1 | 1001 | 15200 | Older Value
1 | 1002 | 16000 | Newest Value
2 | 1001 | 16000 | Newest Value
What I am trying to do is, for each ID in a list (1, 2) and for each Data Identifier in (1001, 1002), return just the row with those matching ids whose date is nearest to, and below, 16001.
So the results would be :
1 | 1001 | 15400 | Newest Value
1 | 1002 | 16000 | Newest Value
2 | 1001 | 16000 | Newest Value
I have tried several manner of joins but I keep returning duplicate records. Any advice or help would be appreciated.
It seems as if you want a GROUP BY and maybe a self-join onto the table.
I have the following code for you:
-- Preparing a test table
CREATE TABLE #tmpTable (ID int, Identifier int, DateAdded int, DataColumn varchar(50));

INSERT INTO #tmpTable (ID, Identifier, DateAdded, DataColumn)
SELECT 1, 1001, 15400, 'Newest Value'
UNION
SELECT 1, 1001, 15000, 'Oldest Value'
UNION
SELECT 1, 1001, 15200, 'Older Value'
UNION
SELECT 1, 1002, 16000, 'Newest Value'
UNION
SELECT 2, 1001, 16000, 'Newest Value';

-- Actual select
SELECT b.ID, b.Identifier, b.DateAdded, b.DataColumn
FROM
    (SELECT ID, Identifier, MAX(DateAdded) AS DateAdded
     FROM #tmpTable
     WHERE DateAdded < 16001
     GROUP BY ID, Identifier) a
    INNER JOIN #tmpTable b ON a.DateAdded = b.DateAdded
                          AND a.ID = b.ID
                          AND a.Identifier = b.Identifier;
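This group-by-then-self-join pattern runs unchanged on SQLite, so it can be checked with Python's sqlite3 module; a sketch with an illustrative table name, using the sample rows from the question.

```python
import sqlite3

# Sample data from the question, loaded into an in-memory table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (ID INTEGER, Identifier INTEGER,
                    DateAdded INTEGER, DataColumn TEXT);
    INSERT INTO t VALUES
        (1, 1001, 15400, 'Newest Value'),
        (1, 1001, 15000, 'Oldest Value'),
        (1, 1001, 15200, 'Older Value'),
        (1, 1002, 16000, 'Newest Value'),
        (2, 1001, 16000, 'Newest Value');
""")

# Per (ID, Identifier) group, find the latest date below the cutoff,
# then join back to recover the full row.
rows = conn.execute("""
    SELECT b.ID, b.Identifier, b.DateAdded, b.DataColumn
    FROM (SELECT ID, Identifier, MAX(DateAdded) AS DateAdded
          FROM t
          WHERE DateAdded < 16001
          GROUP BY ID, Identifier) a
    JOIN t b ON a.ID = b.ID
            AND a.Identifier = b.Identifier
            AND a.DateAdded = b.DateAdded
    ORDER BY b.ID, b.Identifier
""").fetchall()
print(rows)
```

The result is exactly the three rows the question expects: one per (ID, Identifier) pair, each with the newest date at or below 16000.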
You can create a CTE that finds, per group, the smallest distance from the target date, and then join it back to the table to select the matching rows.
The aggregate function MIN(ABS(15500 - DateAdded)) returns the distance of the value closest to 15500.
WITH g AS
(
    SELECT ID, DataIdentifier, MIN(ABS(15500 - DateAdded)) AS DateTest
    FROM test
    GROUP BY ID, DataIdentifier
)
SELECT test.ID, test.DataIdentifier, test.DateAdded, test.DataColumn
FROM g
INNER JOIN test
    ON  test.ID = g.ID
    AND test.DataIdentifier = g.DataIdentifier
    AND ABS(15500 - test.DateAdded) = g.DateTest
I think in this case a self-join would be best, but I still don't quite understand the "nearest and below" value... (maybe 15400?)

How do I print out 'NULL' or '0' values for column values when an element isn't found?

I need to loop through a set of values (fewer than 10) and see whether they are in a table. If they are, I need to print out all of the record values; if an item doesn't exist, I still want it included in the printed result, but with NULL or 0 values. So, for example, the following query returns:
select *
from ACTOR
where ID in (4, 5, 15);
+----+----------+-------------+----------+------+
| ID | NAME     | DESCRIPTION | ORDER_ID | TYPE |
+----+----------+-------------+----------+------+
|  4 | [TEST-1] |             |        3 | NULL |
|  5 | [TEST-2] |             |        4 | NULL |
+----+----------+-------------+----------+------+
But I want it to return
+----+----------+-------------+----------+------+
| ID | NAME     | DESCRIPTION | ORDER_ID | TYPE |
+----+----------+-------------+----------+------+
|  4 | [TEST-1] |             |        3 | NULL |
|  5 | [TEST-2] |             |        4 | NULL |
| 15 | NULL     |             |        0 | NULL |
+----+----------+-------------+----------+------+
Is this possible?
To get the output you want, you first have to construct a derived table containing the ACTOR.id values you desire. UNION ALL works for small data sets:
SELECT *
FROM (SELECT 4 AS actor_id
FROM DUAL
UNION ALL
SELECT 5
FROM DUAL
UNION ALL
SELECT 15
FROM DUAL) x
With that, you can OUTER JOIN to the actual table to get the results you want:
SELECT x.actor_id,
a.name,
a.description,
a.orderid,
a.type
FROM (SELECT 4 AS actor_id
FROM DUAL
UNION ALL
SELECT 5
FROM DUAL
UNION ALL
SELECT 15
FROM DUAL) x
LEFT JOIN ACTOR a ON a.id = x.actor_id
If there's no match between x and a, the a columns will be null. So if you want orderid to be zero when there's no match for id 15:
SELECT x.actor_id,
a.name,
a.description,
COALESCE(a.orderid, 0) AS orderid,
a.type
FROM (SELECT 4 AS actor_id
FROM DUAL
UNION ALL
SELECT 5
FROM DUAL
UNION ALL
SELECT 15
FROM DUAL) x
LEFT JOIN ACTOR a ON a.id = x.actor_id
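The derived-id-list LEFT JOIN with COALESCE can be demonstrated with SQLite via Python's sqlite3 module (SQLite needs no FROM DUAL). The ACTOR table here is a reduced stand-in for the one in the question.

```python
import sqlite3

# Reduced stand-in for the ACTOR table: only id 4 and 5 exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (id INTEGER, name TEXT, order_id INTEGER);
    INSERT INTO actor VALUES (4, '[TEST-1]', 3), (5, '[TEST-2]', 4);
""")

# Build the wanted id list as a derived table, LEFT JOIN the real
# table onto it, and default the missing order_id to 0 via COALESCE.
rows = conn.execute("""
    SELECT x.actor_id, a.name, COALESCE(a.order_id, 0) AS order_id
    FROM (SELECT 4 AS actor_id UNION ALL SELECT 5 UNION ALL SELECT 15) x
    LEFT JOIN actor a ON a.id = x.actor_id
    ORDER BY x.actor_id
""").fetchall()
print(rows)
```

Id 15 has no match, so its name comes back NULL and its order_id defaults to 0, which is exactly the row the question wants added to the output.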
Well, for that few values, you could do something ugly like this, I suppose:
SELECT
*
FROM
(
SELECT 4 AS id UNION
SELECT 5 UNION
SELECT 15
) ids
LEFT JOIN ACTOR ON ids.id = ACTOR.ID
(That should work in MySQL, I think; for Oracle you'd need to use DUAL, e.g. SELECT 4 as id FROM DUAL...)
Another option is a temporary table.
CREATE TABLE actor_temp (id INTEGER);
INSERT INTO actor_temp VALUES(4);
INSERT INTO actor_temp VALUES(5);
INSERT INTO actor_temp VALUES(15);
select actor_temp.id, ACTOR.* from ACTOR RIGHT JOIN actor_temp on ACTOR.id = actor_temp.id;
DROP TABLE actor_temp;
If you know the upper and lower limits on the ID, it's not too bad. Set up a view with all possible ids - the connect by trick is the simplest way - and do an outer join with your real table. Here, I've limited it to values from 1-1000.
select * from (
select ids.id, a.name, a.description, nvl(a.order_id,0), a.type
from Actor a,
(SELECT level as id from dual CONNECT BY LEVEL <= 1000) ids
where ids.id = a.id (+)
)
where id in (4,5,15);
Can you make a table that contains expected actor ids?
If so you can left join from it.
SELECT * FROM expected_actors LEFT JOIN actors USING (ID)