Get DISTINCT list of values within GROUP BY in SQL

I have this Table
place subplace
A x
A x
B x
A y
B y
A y
A z
When I do this query
SELECT place, count(distinct subplace) AS count_subplace
from table
GROUP BY place
I get result
place count_subplace
A 3
B 2
Now I want the list of the distinct elements rather than the count.
I know we can use string_agg.
How can I use DISTINCT together with GROUP BY in it?
I want a result something like
place subplace_list
A x,y,z
B x,y
I have tried this, which does not work:
SELECT place, string_agg(distinct subplace,',') AS list_subplace
from table
GROUP BY place

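That fails because string_agg does not support DISTINCT inside the call in your database (SQL Server's STRING_AGG, for example, rejects it, while PostgreSQL accepts it).
You can use a subquery to remove the duplicates first and then aggregate the distinct rows.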
Query 1:
SELECT place, string_agg(subplace,',') AS list_subplace
from (select distinct place,subplace from t) t1
GROUP BY place
Results:
| place | list_subplace |
|-------|---------------|
| A | x,y,z |
| B | x,y |
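The deduplicate-then-aggregate pattern above can be tried locally. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in database and SQLite's group_concat in place of string_agg (both are substitutions for illustration, not what the question uses). The helper name distinct_lists is hypothetical.

```python
import sqlite3

# Sample rows copied from the question's table.
ROWS = [("A", "x"), ("A", "x"), ("B", "x"), ("A", "y"),
        ("B", "y"), ("A", "y"), ("A", "z")]

def distinct_lists(rows):
    """Return {place: sorted list of distinct subplaces}."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (place TEXT, subplace TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    cur = con.execute("""
        SELECT place, group_concat(subplace, ',') AS list_subplace
        FROM (SELECT DISTINCT place, subplace FROM t) AS t1
        GROUP BY place
    """)
    # The order of elements inside group_concat is not guaranteed,
    # so each list is sorted on the Python side.
    result = {place: sorted(joined.split(",")) for place, joined in cur}
    con.close()
    return result

lists = distinct_lists(ROWS)
print(lists)
```

The inner SELECT DISTINCT does the deduplication, so the aggregate in the outer query only ever sees each (place, subplace) pair once.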

Related

Select concatenated columns based on criteria list in other table

I have a table1
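| line | a  | b | c | d   | e | f | g | h |
| ---- | -- | - | - | --- | - | - | - | - |
| 1    | 18 | 2 | 2 | 22  | 0 | 2 | 1 | 2 |
| 2    | 20 | 2 | 2 | 2   | 0 | 0 | 0 | 2 |
| 3    | 10 | 2 | 2 | 222 | 0 | 2 | 1 | 2 |
| 4    | 12 | 2 | 2 | 3   | 0 | 0 | 0 | 0 |
| 5    | 15 | 2 | 2 | 3   | 0 | 0 | 0 | 0 |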
And a table2
| line | criteria  |
| ---- | --------- |
| 1    | a,b       |
| 2    | b,c,f,h   |
| 3    | a,b,e,g,h |
| 4    | c,e       |
I am using this code to select the unique results of concatenated columns, like concat(c,',',d), concat(b,',',d,',',g) and so on from table1, and it works perfectly:
SELECT DISTINCT(CONCAT(c,',',d))
FROM table1
But instead of writing each expression manually, like concat(c,',',d), I want to read table2.criteria to get the column references to concatenate from table1, so that I can see all unique results for each criteria.
I tried this, but I get an error:
SELECT DISTINCT(SELECT criteria FROM table2)
FROM table1
ERROR: more than one row returned by a subquery used as an expression
SQL state: 21000
The expected unique result is something like this:
| criteria | result |
| ------------ | ---------- |
| a,b | 15,2 |
| a,b | 10,2 |
| a,b | 20,2 |
| a,b | 12,2 |
| a,b | 18,2 |
| b,c,f,h | 2,2,2,2 |
| b,c,f,h | 2,2,0,2 |
| b,c,f,h | 2,2,0,0 |
| a,b,e,g,h | 20,2,0,0,2 |
| a,b,e,g,h | 12,2,0,0,0 |
| a,b,e,g,h | 15,2,0,0,0 |
| a,b,e,g,h | 10,2,0,1,2 |
| a,b,e,g,h | 18,2,0,1,2 |
| c,e | 2,0 |
SQL does not allow parameterizing identifiers. There are various ways to work around this restriction.
It's unclear from the question, but according to comments you want to concatenate the given pattern for every row in table1.
1. Dynamic SQL
Create a helper function (once!) that concatenates and executes statements dynamically.
Basics:
Define table and column names as arguments in a plpgsql function?
CREATE OR REPLACE FUNCTION f_concat_cols(_cols text)
RETURNS TABLE (result text)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$q$SELECT concat_ws(',', %s) FROM table1 ORDER BY line$q$, _cols);
END
$func$;
It's a set-returning function (a.k.a. "table function"), to return one result row for every row in table1 for each given pattern.
Warning: Converting user input to code like this is a prime opportunity for SQL injection. You must make sure that table2.criteria can only hold valid strings!
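One way to act on that warning is to check every name in a criteria string against a fixed whitelist before it is interpolated into a statement. The following is only a sketch of that idea, not part of the answer's PostgreSQL solution: it assumes Python's built-in sqlite3 module as a stand-in database, uses the || operator instead of concat_ws, and the names ALLOWED_COLUMNS, build_concat_expr and distinct_results are hypothetical.

```python
import sqlite3

# Assumption: the only legal identifiers are the data columns of table1.
ALLOWED_COLUMNS = {"a", "b", "c", "d", "e", "f", "g", "h"}

def build_concat_expr(criteria):
    """Turn 'a,b' into a SQL expression, rejecting unknown names."""
    cols = [c.strip() for c in criteria.split(",")]
    bad = [c for c in cols if c not in ALLOWED_COLUMNS]
    if bad:
        raise ValueError(f"unexpected column name(s): {bad}")
    # Cast to text so a single column also yields a string.
    return " || ',' || ".join(f"CAST({c} AS TEXT)" for c in cols)

def distinct_results(con, criteria):
    """Distinct concatenated values of table1 for one criteria string."""
    sql = f"SELECT DISTINCT {build_concat_expr(criteria)} FROM table1"
    return {row[0] for row in con.execute(sql)}

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE table1 (line INTEGER, a INTEGER, b INTEGER,
               c INTEGER, d INTEGER, e INTEGER, f INTEGER, g INTEGER,
               h INTEGER)""")
# Rows copied from the question's table1.
con.executemany("INSERT INTO table1 VALUES (?,?,?,?,?,?,?,?,?)", [
    (1, 18, 2, 2, 22, 0, 2, 1, 2),
    (2, 20, 2, 2, 2, 0, 0, 0, 2),
    (3, 10, 2, 2, 222, 0, 2, 1, 2),
    (4, 12, 2, 2, 3, 0, 0, 0, 0),
    (5, 15, 2, 2, 3, 0, 0, 0, 0),
])

print(sorted(distinct_results(con, "b,c,f,h")))
```

Because only whitelisted names ever reach the SQL text, a criteria value such as "a,b); DROP TABLE table1; --" raises ValueError instead of being executed.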
To get the full result matrix (with distinct results per row in table2), the query is simple now:
SELECT DISTINCT line AS t2_line, criteria, t1.*
FROM table2, f_concat_cols(criteria) t1
ORDER BY t2_line;
2. Workaround with conversion to JSON
SELECT DISTINCT t2.line AS t2_line, t2.criteria, c.*
FROM table2 t2
CROSS JOIN (SELECT line, to_json(t) AS js FROM table1 t) t1
CROSS JOIN LATERAL (
SELECT string_agg(t1.js->>sub, ',') AS result
FROM unnest(string_to_array(t2.criteria, ',')) sub
) c
ORDER BY t2_line;
After converting rows from t1 to a JSON record, we can access keys (converted from column names) directly.
I unnest the pattern, access each single key, and aggregate the result in a LATERAL subquery. See:
What is the difference between a LATERAL JOIN and a subquery in PostgreSQL?
You could encapsulate the logic in a function like in 1., but that's optional in this case.
3. Workaround with conversion to Postgres arrays
SELECT DISTINCT t2.line AS t2_line, t2.criteria, c.*
FROM table2 t2
CROSS JOIN (SELECT line, ARRAY [a,b,c,d,e,f,g,h] AS arr FROM table1 t) t1
CROSS JOIN LATERAL (
SELECT string_agg(t1.arr[idx]::text, ',') AS result
FROM unnest(string_to_array(translate(t2.criteria, 'abcdefgh', '12345678'), ',')::int[]) idx
) c
ORDER BY t2_line;
Similar to the "trick" with JSON, we can avoid dynamic SQL by converting columns to a plain Postgres array. Then project column names to integer array indices. I use translate() for the simple case, but that only works for single letters! Use replace() or regexp_replace() or some other method for longer names.
The rest is like the above.
fiddle - showing all.

Handling multiple return values in subquery

I have the following data:
cte
=================
gp_id | m_ids
------|----------
1 | {123}
2 | {432,222}
3 | {123,222}
And a function with a signature like this (which in fact returns not a table but a couple of ids):
FUNCTION foo(m_ids integer[])
RETURNS TABLE (
first_id integer,
second_id integer
)
Now, I've got to iterate over each row and perform some calculations with that function, so I would get something like this:
gp_id | first_id | second_id
------|----------|-----------
1 | 25 | 25
2 | 13 | 24
3 | 25 | 11
To achieve that I tried the following code:
SELECT gp_id,
(
SELECT *
FROM foo(
(
SELECT m_ids
FROM cte c2
WHERE c2.gp_id = c1.gp_id)) limit 1)
FROM cte c1
The problem is in the SELECT * statement. If I use SELECT first_id, everything works well (except that I then have to run two consecutive queries, which I'd obviously like to avoid), but with SELECT * I get the error
subquery must return only one column
which is somewhat expected.
So how can I correctly iterate over the table in one single query?
Use the function in a lateral join:
select gp_id, first_id, second_id
from cte,
lateral foo(m_ids);

Best SQL query to get unique sets from below table

I have the table below
Select X,Y from T
X | Y
------
1 | 2
1 | 3
2 | 1
3 | 5
3 | 1
Columns X and Y hold strings; I used numbers just as an example.
I need the following output from this table
1,2
1,3
3,5
i.e. the unique sets from the table. Out of row 1 (1,2) and row 3 (2,1), I need only one set, because (1,2)=(2,1) in my case. Similarly, (1,3)=(3,1).
So the unique sets in this table are (1,2), (1,3) and (3,5).
I tried the SQL below. Let me know if there is a better way, as I am not sure whether I can use '>' or '<' with ROWID:
SELECT X||','||Y FROM T t1
WHERE NOT EXISTS (SELECT 1 FROM T t2
WHERE t1.X=t2.Y AND t1.Y=t2.X and t1.ROWID>t2.ROWID)
select distinct least(x,y), greatest(x,y)
from the_table;
least() and greatest() put the values into a fixed order, so that 1,2 and 2,1 are both returned as 1,2. The DISTINCT then removes the duplicates.
DISTINCT gets you distinct rows, so all you need to do is to have your pairs ordered, first the smaller then the larger. You do this with LEAST and GREATEST.
select distinct least(x,y) || ',' || greatest(x,y)
from t;
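The normalize-then-deduplicate idea can be checked with a small script. This is a sketch under one assumption: it runs against SQLite through Python's built-in sqlite3 module, where the multi-argument scalar functions min(x, y) and max(x, y) play the role of least() and greatest() from the answers.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x TEXT, y TEXT)")
# Rows copied from the question; the columns hold strings.
con.executemany("INSERT INTO t VALUES (?, ?)",
                [("1", "2"), ("1", "3"), ("2", "1"), ("3", "5"), ("3", "1")])

# min(x, y) / max(x, y) with two arguments are scalar functions in SQLite,
# so every pair is put into (smaller, larger) order before DISTINCT runs.
rows = con.execute("""
    SELECT DISTINCT min(x, y) || ',' || max(x, y) AS pair
    FROM t
    ORDER BY pair
""").fetchall()

unique_pairs = [row[0] for row in rows]
print(unique_pairs)
```

Since the columns are strings, the ordering inside each pair is a text comparison; that is fine here because the pair only needs a consistent order, not a numeric one.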

Multiple row count from single table

How do I get counts of multiple records from a single table using a DB2 query?
To get the count of one record, I am using:
select count(*) from schema.table where record = 'x'
What I need is a count of multiple records from the same table in separate rows for each record. I am trying something like:
select count(*) from schema.table where record in('x','y','z')
The result combines the counts into one single value in a single row, which I don't want.
I almost agree with Mureinik. You can add a WHERE clause to restrict the counts to only the records you want, e.g. (x, y, z):
SELECT record, COUNT(*) AS "count"
FROM schema.table WHERE record IN ('x', 'y', 'z')
GROUP BY record
result:
------------------
| record | count |
------------------
| x | 100 |
| y | 150 |
| z | 50 |
------------------
The GROUP BY clause breaks the table up into groups and lets you apply aggregate functions (COUNT, in your case) to each group separately:
SELECT record, COUNT(*)
FROM schema.table
GROUP BY record
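To see the per-value counts side by side with the filter, here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module rather than DB2, and the table name records and its sample rows are made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical table standing in for schema.table.
con.execute("CREATE TABLE records (record TEXT)")
con.executemany("INSERT INTO records VALUES (?)",
                [("x",), ("x",), ("x",), ("y",), ("y",), ("z",), ("w",)])

# One output row per distinct value that passes the WHERE filter.
counts = con.execute("""
    SELECT record, COUNT(*) AS cnt
    FROM records
    WHERE record IN ('x', 'y', 'z')
    GROUP BY record
    ORDER BY record
""").fetchall()

print(counts)
```

A value in the IN list that matches no rows produces no output row at all, rather than a row with a count of 0.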

SQLite - select the newest row with a certain field value

I have an SQLite question which essentially boils down to the following problem.
id | key | data
1 | A | x
2 | A | x
3 | B | x
4 | B | x
5 | A | x
6 | A | x
New data is appended to the end of the table with an auto-incremented id.
Now, I want to create a query which returns the latest row for each key, like this:
id | key | data
4 | B | x
6 | A | x
I've tried some different queries but I have been unsuccessful. How do you select only the latest rows for each "key" value in the table?
Use this SQL query:
select * from tbl where id in (select max(id) from tbl group by key);
You could split the task into two steps.
First retrieve the latest id for each key (here, for A and B).
Once you have those ids, it is easy to write a query that fetches the corresponding rows:
SELECT *
FROM mytable
JOIN
( SELECT MAX(id) AS maxid
FROM mytable
GROUP BY "key"
) AS grp
ON grp.maxid = mytable.id
Side note: it's best not to use reserved words like key as identifiers (for tables, fields, etc.)
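Both approaches above (the IN subquery and the join on the per-key maximum) can be compared on the question's data. This is a sketch using Python's built-in sqlite3 module with an in-memory database; the table name tbl follows the first answer, and the column is quoted as "key" in line with the side note.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute('''CREATE TABLE tbl (
                   id INTEGER PRIMARY KEY AUTOINCREMENT,
                   "key" TEXT,
                   data TEXT)''')
# Rows are appended in the question's order, so ids run from 1 to 6.
con.executemany('INSERT INTO tbl ("key", data) VALUES (?, ?)',
                [("A", "x"), ("A", "x"), ("B", "x"),
                 ("B", "x"), ("A", "x"), ("A", "x")])

# Variant 1: filter on the set of per-key maximum ids.
via_in = con.execute('''
    SELECT id, "key", data
    FROM tbl
    WHERE id IN (SELECT max(id) FROM tbl GROUP BY "key")
    ORDER BY id
''').fetchall()

# Variant 2: join against the same per-key maximum ids.
via_join = con.execute('''
    SELECT t.id, t."key", t.data
    FROM tbl AS t
    JOIN (SELECT max(id) AS maxid FROM tbl GROUP BY "key") AS grp
      ON grp.maxid = t.id
    ORDER BY t.id
''').fetchall()

print(via_in)
print(via_join)
```

Both variants depend on id growing with insertion order, which is what the question states for its auto-incremented column.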
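Without nested SELECTs or JOINs, you can rely on an SQLite-specific behavior (version 3.7.11 and later): when max() is the only aggregate in the query, the other selected columns are taken from the row that holds the maximum. Here id, the autoincrement primary key, determines which row is "newest":
SELECT max(id) AS id, "key", data FROM tbl GROUP BY "key";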