Resultset rows into array Json [PrestoDB] - sql

I have a table with multiple rows per id. I want to convert each group of rows into a single entry: an array of key-value maps, in PrestoDB, using SQL.
| id | col1 | col2 | col3 |
|----|------|------|------|
| 1  | 2ad  | ff.  | sdfs |
| 1  | asf. | erew | dsds |
| 1  | vfdv | dfds | sdf  |
and I want the output to be something like this:

| id | value |
|----|-------|
| 1  | {{'col1':'2ad','col2':'ff','col3':'sdfs'},{'col1':'asf','col2':'erew','col3':'dsds'},{'col1':'vfdv','col2':'dfds','col3':'sdf'}} |
| ... | .... |
With the query below I am almost able to achieve this:
select id,
       CAST(MAP(ARRAY['col1','col2','col3'],
                ARRAY[k."col1", k."col2", k."col3"]) AS JSON) as tt
from table k
order by 1;
|id| value|
|--|---- |
| 1| {'col1':'2ad','col2':'ff','col3':'sdfs'}|
|1|{'col1':'asf','col2':'erew','col3':'dsds'}|
|1|{'col1':'vfdv','col2':'dfds','col3':'sdf'}|
|...|....|
but I am still not able to concatenate the rows by id, as array_agg appeared to work only on strings, and I don't know how to proceed.

With the query below I was able to achieve it:
select mm.id, CAST(array_agg(mm.tt) AS JSON)
from (
    select nn."id" as id, nn.tt
    from (
        select k."id",
               CAST(MAP(ARRAY['col1','col2','col3'],
                        ARRAY[k."col1", k."col2", k."col3"]) AS JSON) as tt
        from table k
    ) nn
    group by 1, 2
) mm
group by 1;
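For what it's worth, the two wrapping subqueries can likely be collapsed into a single aggregation, since Presto's array_agg accepts maps directly. A minimal sketch, assuming the same placeholder table and column names as above:

-- Build one map per row and aggregate straight into an array of maps.
select k."id",
       CAST(array_agg(
           MAP(ARRAY['col1', 'col2', 'col3'],
               ARRAY[k."col1", k."col2", k."col3"])
       ) AS JSON) as tt
from table k
group by 1
order by 1;

Note that the order of elements inside the aggregated array is not guaranteed by array_agg.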

Related

How to get substring for filter and group by clause in AWS Redshift database

How to get substring from column which contains records for filter and group by clause in AWS Redshift database.
I have table with records like:
Table_Id | Categories | Value
<ID> | ABC1; ABC1-1; XYZ | 10
<ID> | ABC1; ABC1-2; XYZ | 15
<ID> | XYZ | 5
.....
Now I want to filter records based on individual category like 'ABC1' or 'ABC1 and XYZ'
Expected output from query would like:
Table_Id | Categories | Value
<ID> | ABC1 | 25
<ID> | ABC1-1 | 10
<ID> | ABC1-2 | 15
<ID> | XYZ | 30
.....
So need to group results based on individual categories.
If you have at most 3 values in any "categories" cell you can unnest the cells, get the list of unique values and use that list in a join condition like this:
WITH values as (
    select distinct category
    from (
        select distinct split_part(categories, ';', 1) as category from your_table
        union
        select distinct split_part(categories, ';', 2) from your_table
        union
        select distinct split_part(categories, ';', 3) from your_table
    ) parsed
    where nullif(category, '') is not null
)
SELECT
    t2.category,
    sum(t1.value)
FROM your_table t1
JOIN values t2
  ON split_part(categories, ';', 1) = t2.category
  OR split_part(categories, ';', 2) = t2.category
  OR split_part(categories, ';', 3) = t2.category
GROUP BY t2.category
If you have more than 3 options, just add another split_part level, both in the WITH part and in the join condition.
@JonScott, @AlexYes, and other pals who struggle with similar situations: I found a better approach than the one suggested by @AlexYes. I flatten the categories column, which yields individual records that I can then process further.
Query:
select row_number() over (order by 1) as r1,
       to_char(timestamptz 'epoch' + date_time * interval '1 second', 'yyyy-mm-dd') as day,
       split_part(categories, ';', numbers.n) as catg,
       value
from <TABLE>
join numbers
  on numbers.n <= regexp_count(categories, ';') + 1
<OTHER_CONDITIONS>
Explanation:
Two functions are useful here: first, the split_part function, which takes a string, splits it on ';' delimiter, and returns the first, second, ... , nth value specified from the split string; second, regexp_count, which tells us how many times a particular pattern is found in our string.
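Note that the query above assumes a numbers table (a single integer column n holding 1..N) already exists. A minimal sketch of building one in Redshift, using the stl_scan system table purely as a convenient row source (any sufficiently large table works):

-- Hypothetical helper table of integers 1..100, used to index split_part.
create table numbers as
select row_number() over (order by 1) as n
from stl_scan
limit 100;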
To do this fully dynamically, you need to transpose or pivot the values in the "categories" column into separate rows.
Unfortunately, a "fully dynamic" solution (without knowing the different values beforehand) is NOT possible in Redshift.
Your options are as follows:

1. Use the method suggested by @AlexYes in another answer. This is semi-dynamic and is probably your best option.
2. Outside of Redshift, run some ETL code to perform the column -> multiple rows transformation.
3. Create a hardcoded solution and perform the pivot something like this:
select table_id, 'ABC1' as category,
       case when concat(Categories, ';') ilike '%ABC1;%' then value else 0 end as value
from your_table
union all
select table_id, 'ABC1-1' as category,
       case when concat(Categories, ';') ilike '%ABC1-1;%' then value else 0 end as value
from your_table
union all
etc
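To arrive at the summed figures shown in the expected output, that union would presumably be wrapped in one more aggregation; a sketch with the first two branches:

select table_id, category, sum(value) as value
from (
    select table_id, 'ABC1' as category,
           case when concat(Categories, ';') ilike '%ABC1;%' then value else 0 end as value
    from your_table
    union all
    select table_id, 'ABC1-1' as category,
           case when concat(Categories, ';') ilike '%ABC1-1;%' then value else 0 end as value
    from your_table
) pivoted
group by table_id, category;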

How to get an array in postgres where the array size is greater than 1

I have a table that looks like this:
val | fkey | num
------------------
1 | 1 | 1
1 | 2 | 1
1 | 3 | 1
2 | 3 | 1
What I would like to do is return a set of rows in which values are grouped by 'val', with an array of fkeys, but only where the array of fkeys is greater than 1. So, in the above example, the return would look something like:
1 | [1,2,3]
I have the following query, which aggregates the arrays:
SELECT val, array_agg(fkey)
FROM mytable
GROUP BY val;
But this returns something like:
1 | [1,2,3]
2 | [3]
What would be the best way of doing this? I guess one possibility would be to use my existing query as a subquery, and do a sum / count on that, but that seems inefficient. Any feedback would really help!
Use a HAVING clause to filter the groups that have more than one fkey:
SELECT val, array_agg(fkey)
FROM mytable
GROUP BY val
Having Count(fkey) > 1
Using the HAVING clause as @Fireblade pointed out is probably more efficient, but you can also leverage subqueries:
SQLFiddle: Subquery
SELECT * FROM (
select val, array_agg(fkey) fkeys
from mytable
group by val
) array_creation
WHERE array_length(fkeys,1) > 1
You could also use the array_length function in the HAVING clause, but again, @Fireblade has used count(), which should be more efficient. Still:
SQLFiddle: Having Clause
SELECT val, array_agg(fkey) fkeys
FROM mytable
GROUP BY val
HAVING array_length(array_agg(fkey),1) > 1
This isn't a total loss, though. Using the array_length in the having can be useful if you want a distinct list of fkeys:
SELECT val, array_agg(DISTINCT fkey) fkeys
There may still be other ways, but this method is more descriptive, which may allow your SQL to be easier to understand when you come back to it, years from now.
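Putting those two ideas together, the distinct-fkeys variant would look something like this (same table as above):

SELECT val, array_agg(DISTINCT fkey) AS fkeys
FROM mytable
GROUP BY val
HAVING array_length(array_agg(DISTINCT fkey), 1) > 1;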

Oracle SQL - filter out partitions or row groups that contain rows with specific value

I'm trying to solve the following: the data is organized in the table with column X as the foreign key (the ID that identifies a set of rows in this table as belonging together in a bundle, owned by a particular entity in another table), so each distinct value of X has multiple rows associated with it here. I would like to filter out all distinct values of X that have a row containing the value "ABC" in column Q.
i.e.
data looks like this:
Column X Column Q
-------- ---------
123 ABC
123 AAA
123 ANQ
456 ANQ
456 PKR
579 AAA
579 XYZ
886 ABC
the query should return "456" and "579" because those two distinct values of X have no rows containing the value "ABC" in Column Q.
I was thinking of doing this with a MINUS (select distinct X, minus the result of select distinct X where Q = 'ABC'), as all I want are the distinct values of X. But I was wondering if there is a more efficient way to do this that avoids a subquery, for example if I could partition the table over X and throw out each partition that has a row with the value 'ABC' in Q?
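For reference, a sketch of that MINUS version, using the colx/colq column names and the placeholder table demo that the answers below use:

-- Set difference: all distinct X values, minus those having an 'ABC' row.
select distinct colx from demo
minus
select colx from demo where colq = 'ABC';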
I prefer to answer questions like this (i.e. about groups within groups) using aggregation and the having clause. Here is the solution in this case:
select colx
from data d
group by colx
having max(case when colq = 'ABC' then 1 else 0 end) = 0
If any values of colx have ABC, then the max() expression returns 1 . . . which does not match 0.
This should work:
SELECT DISTINCT t.ColX
FROM mytable t
LEFT JOIN mytable t2 on t.colx = t2.colx and t2.colq = 'ABC'
WHERE t2.colx IS NULL
And here is the SQL Fiddle.
Good luck.
How about this, using IN?
SQLFIDDLE DEMO
select distinct colx
from demo
where colx not in (
    select colx from demo
    where colq = 'ABC'
);
| COLX |
--------
| 456 |
| 579 |
Try this:
select DISTINCT colx
from demo
where colq not like '%A%'
AND colq not like '%B%'
AND colq not like '%C%'
SQL Fiddle

How can I select unique rows in a database over two columns?

I have found similar solutions online but none that I've been able to apply to my specific problem.
I'm trying to "unique-ify" data from one table to another. In my original table, data looks like the following:
| USERIDP1 | USERIDP2 | QUALIFIER | DATA |
|----------|----------|-----------|------|
| 1        | 2        | TRUE      | AB   |
| 1        | 2        |           | CD   |
| 1        | 3        |           | EF   |
| 1        | 3        |           | GH   |
The user IDs are composed of two parts, USERIDP1 and USERIDP2 concatenated. I want to transfer all the rows that correspond to a user who has QUALIFIER=TRUE in ANY row they own, but ignore users who do not have a TRUE QUALIFIER in any of their rows.
To clarify, all of User 12's rows would be transferred, but not User 13's. The output would then look like:
| USERIDP1 | USERIDP2 | QUALIFIER | DATA |
|----------|----------|-----------|------|
| 1        | 2        | TRUE      | AB   |
| 1        | 2        |           | CD   |
So basically, I need to find rows with distinct user ID components (involving two unique fields) that also possess a row with QUALIFIER=TRUE and copy all and only all of those users' rows.
Although this nested query will be very slow for large tables, this could do it.
SELECT DISTINCT X.USERIDP1, X.USERIDP2, X.QUALIFIER, X.DATA
FROM YOUR_TABLE_NAME AS X
WHERE EXISTS (SELECT 1 FROM YOUR_TABLE_NAME AS Y WHERE Y.USERIDP1 = X.USERIDP1
AND Y.USERIDP2 = X.USERIDP2 AND Y.QUALIFIER = TRUE)
It could be written as an inner join with itself too:
SELECT DISTINCT X.USERIDP1, X.USERIDP2, X.QUALIFIER, X.DATA
FROM YOUR_TABLE_NAME AS X
INNER JOIN YOUR_TABLE_NAME AS Y ON Y.USERIDP1 = X.USERIDP1
AND Y.USERIDP2 = X.USERIDP2 AND Y.QUALIFIER = TRUE
For a large table, create a new auxiliary table containing only USERIDP1 and USERIDP2 columns for rows that have QUALIFIER = TRUE and then join this table with your original table using inner join similar to the second option above. Remember to create appropriate indexes.
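A sketch of that auxiliary-table approach, with hypothetical object names (CREATE TABLE AS syntax and indexing options vary by platform):

-- Hypothetical auxiliary table: one row per qualifying user id pair.
CREATE TABLE qualified_users AS
SELECT DISTINCT USERIDP1, USERIDP2
FROM YOUR_TABLE_NAME
WHERE QUALIFIER = TRUE;

-- Index the join key so the inner join stays cheap on a large table.
CREATE INDEX idx_qualified_users ON qualified_users (USERIDP1, USERIDP2);

SELECT X.*
FROM YOUR_TABLE_NAME AS X
INNER JOIN qualified_users AS Q
  ON Q.USERIDP1 = X.USERIDP1
 AND Q.USERIDP2 = X.USERIDP2;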
This should do the trick; if the id fields are stored as integers, you will need to convert/cast them to varchars before concatenating:
SELECT 1 as id1,2 as id2,'TRUE' as qualifier,'AB' as data into #sampled
UNION ALL SELECT 1,2,NULL,'CD'
UNION ALL SELECT 1,3,NULL,'EF'
UNION ALL SELECT 1,3,NULL,'GH'
;WITH data as
(
SELECT
id1
,id2
,qualifier
,data
,SUM(CASE WHEN qualifier = 'TRUE' THEN 1 ELSE 0 END)
-- cast before concatenating, with a separator so (1,23) and (12,3) don't collide
OVER (PARTITION BY CAST(id1 AS VARCHAR(10)) + '|' + CAST(id2 AS VARCHAR(10))) as num_qualifier
from #sampled
)
SELECT
id1
,id2
,qualifier
,data
from data
where num_qualifier > 0
Select *
from yourTable
INNER JOIN (Select UserIDP1, UserIDP2 FROM yourTable WHERE Qualifier=TRUE) B
ON yourTable.UserIDP1 = B.UserIDP1 and YourTable.UserIDP2 = B.UserIDP2
How about a subquery as a where clause?
SELECT *
FROM theTable t1
WHERE CAST(t1.useridp1 AS VARCHAR) + CAST(t1.useridp2 AS VARCHAR) IN
(SELECT CAST(t2.useridp1 AS VARCHAR) + CAST(t2.useridp2 AS VARCHAR)
FROM theTable t2
WHERE t2.qualifier = TRUE
);
This is a solution in mysql, but I believe it should transfer to sql server pretty easily. Use a subquery to pick out groups of (id1, id2) combinations with at least one True 'qualifier' row; then join that to the original table on (id1, id2).
mysql> SELECT u1.*
FROM users u1
JOIN (SELECT id1,id2
FROM users
WHERE qualifier
GROUP BY id1, id2) u2
USING(id1, id2);
+------+------+-----------+------+
| id1 | id2 | qualifier | data |
+------+------+-----------+------+
| 1 | 2 | 1 | aa |
| 1 | 2 | 0 | bb |
+------+------+-----------+------+
2 rows in set (0.00 sec)

oracle - getting 1 or 0 records based on the number of occurrences of a non-unique field

I have a table MYTABLE
| N_REC | MYFIELD |
|-------|---------|
| 1     | foo     |
| 2     | foo     |
| 3     | bar     |
where N_REC is the primary key and MYFIELD is a non-unique field.
I need to query this table on MYFIELD and extract the associated N_REC, but only if there is only one occurrence of MYFIELD; otherwise I need no records returned.
So if I go with MYFIELD='bar' I will get 3, if I go with MYFIELD='foo' I will get no records.
I went with the following query
select * from
(
    select n_rec,
           ( select count(*) from mytable where myfield = my.myfield ) as counter
    from mytable my
    where myfield = ?
)
where counter = 1
While it gives me the desired result I feel like I'm running the same query twice.
Are there better ways to achieve what I'm doing?
I think that this should do what you want:
SELECT my_field, MAX(n_rec)
FROM My_Table
WHERE my_field = ?
GROUP BY my_field
HAVING COUNT(*) = 1
You might also try the analytic or windowing version of count(*) and compare plans to the other options:
select n_rec, my_field
from (select n_rec, my_field,
             count(*) over (partition by my_field) as Counter
      from myTable
      where my_field = ?)
where Counter = 1