PostgreSQL: Join columns - sql

I have a PostgreSQL version 12.5 database with a table that has three columns of the same type: c_id, columna and columnb. The columns can hold different values, and I need to merge them into a single JSON object per row.
Here is the sample data:
c_id columna columnb
1 a b
2 c d
3 x y
I need to run a query that merges the columns columna and columnb into a single object, like this:
c_id merge_column
1 {"columna":a, "columnb": "b"}
2 {"columna":d, "columnb": "d"}
3 {"columna":x, "columnb": "y"}
Any ideas?

You can convert the whole row to JSON, then remove the c_id key:
select t.c_id, to_jsonb(t) - 'c_id' as merge_column
from the_table t
If there are more columns than you have shown, and you only want to get two of them, using jsonb_build_object() is probably easier:
select t.c_id,
jsonb_build_object('columna', t.columna, 'columnb', t.columnb) as merge_column
from the_table t
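A minimal end-to-end sketch of the first approach, using the table and sample data from the question:
create table the_table (c_id int, columna text, columnb text);
insert into the_table values (1, 'a', 'b'), (2, 'c', 'd'), (3, 'x', 'y');

select t.c_id, to_jsonb(t) - 'c_id' as merge_column
from the_table t;
-- 1 | {"columna": "a", "columnb": "b"}
-- 2 | {"columna": "c", "columnb": "d"}
-- 3 | {"columna": "x", "columnb": "y"}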

Related

Combining two columns into a single map column in hive table

I have a hive table like
a b
-------------
1 2
3 4
How can I create a map column c
c
-----------
1: 2
3: 4
where column a is the keys and column b is the values?
Use the map() constructor in Hive:
select map(a,b) as c from mytable
In Presto you can use map(array[key], array[value])
select map(array[a],array[b]) as c from mytable
Or map_agg(), which aggregates the key/value pairs from all rows into a single map:
select map_agg(a, b) as c from mytable
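The two differ in shape; a sketch assuming a table mytable(a int, b int) holding the sample rows:
-- map(): one single-entry map per input row
select map(a, b) as c from mytable;
-- {1:2}
-- {3:4}

-- map_agg() (Presto): one map covering all rows
select map_agg(a, b) as c from mytable;
-- {1:2, 3:4}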

Snowflake SQL: concat values from multiple rows based on shared key

I have a table of values with a variable number of rows per key value. I want to output a table that concatenates those row values together for each distinct key value.
INPUT TABLE
KEY_ID SOURCE_VAL
1 a
1 b
1 c
2 d
3 e
3 f
Target OUTPUT TABLE
KEY_ID OUTPUT_VAL
1 a,b,c
2 d
3 e,f
What is the most efficient way to write this in Snowflake SQL?
It could be done with LISTAGG:
SELECT KEY_ID,
LISTAGG(SOURCE_VAL, ',') WITHIN GROUP(ORDER BY SOURCE_VAL) AS OUTPUT_VAL
FROM tab
GROUP BY KEY_ID
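A minimal setup to try it, assuming the table is named tab as in the query:
create table tab (key_id int, source_val varchar);
insert into tab values (1,'a'), (1,'b'), (1,'c'), (2,'d'), (3,'e'), (3,'f');

select key_id,
       listagg(source_val, ',') within group (order by source_val) as output_val
from tab
group by key_id;
-- 1 | a,b,c
-- 2 | d
-- 3 | e,f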

SQL grouping by distinct values in a multi-value string column

I want to perform a group-by based on the distinct values in a string column that holds multiple values.
The column contains a list of strings in a standard comma-separated format. The only possible values are a, b, c and d.
For example the column collection (type: String) contains:
Row 1: ["a","b"]
Row 2: ["b","c"]
Row 3: ["b","c","a"]
Row 4: ["d"]
The expected output is a count of unique values:
collection | count
a | 2
b | 3
c | 2
d | 1
For all of the below I used this table:
create table tmp (
id INT auto_increment,
test VARCHAR(255),
PRIMARY KEY (id)
);
insert into tmp (test) values
("a,b"),
("b,c"),
("b,c,a"),
("d")
;
If the possible values are only a, b, c, d you can try one of these:
Note that this only works if no value is a substring of another (e.g. test and test_new): otherwise test would also be joined with all the test_new rows and the count would not match.
select tb1.collection, COUNT(*) as count
from tmp
JOIN (
    select CONCAT("%", tb.collection, "%") as like_collection, tb.collection
    from (
        select "a" COLLATE utf8_general_ci as collection
        union select "b" COLLATE utf8_general_ci
        union select "c" COLLATE utf8_general_ci
        union select "d" COLLATE utf8_general_ci
    ) tb
) tb1 ON tmp.test LIKE tb1.like_collection
GROUP BY tb1.collection;
This will give you the result you want:
collection | count
a | 2
b | 3
c | 2
d | 1
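A variant that sidesteps the substring caveat is MySQL's FIND_IN_SET(), which matches whole comma-separated items rather than patterns; a sketch against the same tmp table:
select tb1.collection, COUNT(*) as count
from tmp
JOIN (
    select "a" as collection
    union select "b"
    union select "c"
    union select "d"
) tb1 ON FIND_IN_SET(tb1.collection, tmp.test) > 0
GROUP BY tb1.collection;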
Or you can try this one:
SELECT
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%a%') as a_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%b%') as b_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%c%') as c_count,
(SELECT COUNT(*) FROM tmp WHERE test LIKE '%d%') as d_count
;
The result would be like this
a_count | b_count | c_count | d_count
2 | 3 | 2 | 1
What you need to do is first explode the collection column into separate rows (like a flatMap operation). In Redshift the only way to generate new rows is to JOIN, so let's CROSS JOIN your input table with a static table of consecutive numbers and keep only the numbers less than or equal to the number of elements in each collection. Then we'll use the split_part function to read the item at the correct index. Once we have the exploded table, we'll do a simple GROUP BY.
If your items are stored as JSON array strings ('["a", "b", "c"]') then you can use JSON_ARRAY_LENGTH and JSON_EXTRACT_ARRAY_ELEMENT_TEXT instead of REGEXP_COUNT and SPLIT_PART respectively.
with
index as (
select 1 as i
union all select 2
union all select 3
union all select 4 -- could be substituted with 'select row_number() over () as i from arbitrary_table limit 4'
),
agg as (
select 'a,b' as collection
union all select 'b,c'
union all select 'b,c,a'
union all select 'd'
)
select
split_part(collection, ',', i) as item,
count(*)
from index,agg
where regexp_count(agg.collection, ',') + 1 >= index.i -- only get rows where number of items matches
group by 1
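For the JSON-array storage mentioned above, the same pattern would look roughly like this; note that JSON_EXTRACT_ARRAY_ELEMENT_TEXT is 0-indexed, hence the i - 1:
with
index as (
    select 1 as i
    union all select 2
    union all select 3
    union all select 4
),
agg as (
    select '["a","b"]' as collection
    union all select '["b","c"]'
    union all select '["b","c","a"]'
    union all select '["d"]'
)
select
    json_extract_array_element_text(collection, i - 1) as item,
    count(*)
from index, agg
where json_array_length(agg.collection) >= index.i
group by 1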

Combine three columns from different tables into one row

I am new to SQL and am trying to combine a column value from each of three different tables into one row in DB2 Warehouse on Cloud. Each table consists of only one row and a uniquely named column, so what I want is simply to join the three into one row that keeps the original column names.
Each table is built from a statement that looks like this:
SELECT SUM(FUEL_TEMP.FUEL_MLAD_VALUE) AS FUEL
FROM
(SELECT ML_ANOMALY_DETECTION.MLAD_METRIC AS MLAD_METRIC, ML_ANOMALY_DETECTION.MLAD_VALUE AS FUEL_MLAD_VALUE, ML_ANOMALY_DETECTION.TAG_NAME AS TAG_NAME, ML_ANOMALY_DETECTION.DATETIME AS DATETIME, DATA_CONFIG.SYSTEM_NAME AS SYSTEM_NAME
FROM ML_ANOMALY_DETECTION
INNER JOIN DATA_CONFIG ON
(ML_ANOMALY_DETECTION.TAG_NAME =DATA_CONFIG.TAG_NAME AND
DATA_CONFIG.SYSTEM_NAME = 'FUEL')
WHERE ML_ANOMALY_DETECTION.MLAD_METRIC = 'IFOREST_SCORE'
AND ML_ANOMALY_DETECTION.DATETIME >= (CURRENT DATE - 9 DAYS)
ORDER BY DATETIME DESC)
AS FUEL_TEMP
I have tried JOIN, INNER JOIN, UNION/UNION ALL, but can't get it to work as it should. How can I do this?
Use a cross-join like this:
create table table1 (field1 char(10));
create table table2 (field2 char(10));
create table table3 (field3 char(10));
insert into table1 values('value1');
insert into table2 values('value2');
insert into table3 values('value3');
select *
from table1
cross join table2
cross join table3;
Result:
field1 field2 field3
---------- ---------- ----------
value1 value2 value3
A cross join joins all the rows on the left with all the rows on the right. You will end up with a product of rows (table1 rows x table2 rows x table3 rows). Since each table only has one row, you will get (1 x 1 x 1) = 1 row.
Using UNION should solve your problem. Something like this:
SELECT
WarehouseDB1.WarehouseID AS TheID,
'A' AS TheSystem,
WarehouseDB1.TheValue AS TheValue
FROM WarehouseDB1
UNION
SELECT
WarehouseDB2.WarehouseID AS TheID,
'B' AS TheSystem,
WarehouseDB2.TheValue AS TheValue
FROM WarehouseDB2
UNION
SELECT
WarehouseDB3.WarehouseID AS TheID,
'C' AS TheSystem,
WarehouseDB3.TheValue AS TheValue
FROM WarehouseDB3
I'll adapt the code to your table names and rows if you tell me what they are. This kind of query would return something like the following:
TheID TheSystem TheValue
1 A 10
2 A 20
3 B 30
4 C 40
5 C 50
As long as your column names match in each query, you should get the desired results.

Querying Same Lookup Table With Multiple Columns

I'm a bit confused on this. I have a data table structured like this:
Table: Data
DataID Val
1 Value 1
2 Value 2
3 Value 3
4 Value 4
Then I have another table structured like this:
Table: Table1
Col1 Col2
1 2
3 4
4 3
2 1
Both columns from Table1 point to the data in the data table. How can I get this data to show in a query? For example, a query to return this:
Query: Query1
Column1 Column2
Value 1 Value 2
Value 3 Value 4
Value 4 Value 3
Value 2 Value 1
I'm familiar enough with SQL to do a join with one column, but lost beyond that. Any help is appreciated. Sample sql or a link to something to read. Thanks!
PS: This is in sqlite
You can join the same table twice:
Select
d1.val As column1,
d2.val As column2
From table1 t
Join data d1 On ( d1.dataId = t.col1 )
Join data d2 On ( d2.dataId = t.col2 )
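A runnable sketch with the sample data from the question:
create table data (dataid integer primary key, val text);
create table table1 (col1 integer, col2 integer);
insert into data values (1, 'Value 1'), (2, 'Value 2'), (3, 'Value 3'), (4, 'Value 4');
insert into table1 values (1, 2), (3, 4), (4, 3), (2, 1);

select d1.val as column1, d2.val as column2
from table1 t
join data d1 on d1.dataid = t.col1
join data d2 on d2.dataid = t.col2;
-- Value 1 | Value 2
-- Value 3 | Value 4
-- Value 4 | Value 3
-- Value 2 | Value 1
If col1 or col2 can be NULL, use left joins instead so those rows are not dropped.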