Select values that exist in all arrays in Postgres - sql

I have a table ignore with a column ignored_entry_ids that contains an array of integers. For example:
id ignored_entry_ids
1 {1,4,6}
2 {6,8,11}
3 {5,6,7}
How can I select the numbers that exist in every row's array? (6 in the example)

If the numbers are unique within each array, you can do something like this. I don't think it can be done without unnest:
with cte as (
select id, unnest(ignored_entry_ids) as arr
from ign
)
select arr
from cte
group by arr
having count(*) = (select count(*) from ign)
If the numbers are not unique within an array, add distinct:
with cte as (
select distinct id, unnest(ignored_entry_ids) as arr
from ign
)
select arr
from cte
group by arr
having count(*) = (select count(*) from ign)
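The same counting idea can be checked outside Postgres. Below is a minimal sketch in Python with an in-memory SQLite database. SQLite has no arrays or unnest, so the already-unnested (id, value) pairs are stored directly; the table name ign_unnested and the function name are made up for illustration. COUNT(DISTINCT id) plays the role of the distinct variant above, so duplicates inside one array do not matter.

```python
import sqlite3


def values_in_every_array(arrays):
    """arrays maps a row id to its list of integers.

    Returns the sorted values that occur in every list.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical table holding what unnest() would produce: one row per element.
    conn.execute("CREATE TABLE ign_unnested (id INTEGER, arr INTEGER)")
    conn.executemany(
        "INSERT INTO ign_unnested VALUES (?, ?)",
        [(row_id, v) for row_id, values in arrays.items() for v in values],
    )
    rows = conn.execute(
        """
        SELECT arr
        FROM ign_unnested
        GROUP BY arr
        HAVING COUNT(DISTINCT id) = ?  -- value must appear in every original row
        ORDER BY arr
        """,
        (len(arrays),),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(values_in_every_array({1: [1, 4, 6], 2: [6, 8, 11], 3: [5, 6, 7]}))
```

The total row count is passed in as a parameter rather than counted from the unnested table, so a row with an empty array still counts as a row (and then no value can be in every array).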

Related

BigQuery Count Unique and Count Distinct

I am looking for SQL to count unique values in the column.
I am aware of DISTINCT - that gives me how many unique values there are. However, I am looking for - how many ONLY unique values there are.
So if my data is Letters: {A,A,A,B,B,B,C,D}. I am looking to get:
Count Distinct = 4 {A,B,C,D} and
Count Unique = 2 {C,D} <== this is what I am looking for
I am working with BigQuery.
Thank You,
Do
The query below returns only the values that occur exactly once in the column.
SELECT col
FROM UNNEST(SPLIT('A,A,A,B,B,B,C,D')) col
GROUP BY 1 HAVING COUNT(1) = 1;
Then, you can simply count rows.
WITH uniques AS (
SELECT col
FROM UNNEST(SPLIT('A,A,A,B,B,B,C,D')) col
GROUP BY 1 HAVING COUNT(1) = 1
)
SELECT COUNT(*) cnt FROM uniques;
Another option
select count(*) from (
select * from your_table
qualify 1 = count(*) over(partition by col)
)
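For comparison, here is a minimal sketch of the GROUP BY ... HAVING COUNT(*) = 1 approach run against an in-memory SQLite database from Python, not BigQuery. The table name letters and the function name are assumptions for illustration only. It returns both numbers from the question so they can be compared side by side.

```python
import sqlite3


def distinct_and_unique_counts(values):
    """Return (count of distinct values, count of values occurring exactly once)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE letters (col TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO letters VALUES (?)", [(v,) for v in values])

    (distinct_count,) = conn.execute(
        "SELECT COUNT(DISTINCT col) FROM letters"
    ).fetchone()

    # Keep only the groups with a single member, then count those groups.
    (unique_count,) = conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT col FROM letters GROUP BY col HAVING COUNT(*) = 1
        )
        """
    ).fetchone()
    conn.close()
    return distinct_count, unique_count


if __name__ == "__main__":
    print(distinct_and_unique_counts(list("AAABBBCD")))
```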

How to union an array when grouping?

I am trying to combine multiple array columns into one with distinct elements and then get a count of distinct elements. How can I do something like that in postgres?
create temp table t as ( select 'james' as fn, array ['bond', 'milner'] as ln );
create temp table tt as ( select 'james' as fn, array ['mcface', 'milner'] as ln );
-- expected value: james, 3
select x.name,
array_length()-- what to do here?
from (
select fn, ln
from t
union
select fn, ln
from tt
) as x
group by x.name
You should unnest the arrays in the inner queries:
select x.fn,
count(elem)
from (
select fn, unnest(ln) as elem
from t
union
select fn, unnest(ln) as elem
from tt
) as x
group by x.fn
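To see why UNION over the unnested elements gives 3 for james, here is a rough sketch using Python and in-memory SQLite. SQLite has no arrays, so each table stores one (fn, elem) row per array element, which is what unnest would produce; the function name and this storage layout are assumptions for illustration.

```python
import sqlite3


def distinct_element_counts(t_rows, tt_rows):
    """Each argument is a list of (fn, [ln, ...]) pairs.

    Returns [(fn, number of distinct elements across both tables), ...].
    """
    conn = sqlite3.connect(":memory:")
    for table, rows in (("t", t_rows), ("tt", tt_rows)):
        # Table names come from the fixed tuple above, never from user input.
        conn.execute(f"CREATE TABLE {table} (fn TEXT, elem TEXT)")
        conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)",
            [(fn, elem) for fn, elems in rows for elem in elems],
        )
    result = conn.execute(
        """
        SELECT fn, COUNT(elem)
        FROM (
            SELECT fn, elem FROM t
            UNION              -- UNION (not UNION ALL) drops duplicate (fn, elem) pairs
            SELECT fn, elem FROM tt
        ) AS x
        GROUP BY fn
        ORDER BY fn
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(distinct_element_counts(
        [("james", ["bond", "milner"])],
        [("james", ["mcface", "milner"])],
    ))
```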
Why do you (want to) use arrays? That's not needed. If ln holds a single value per row instead of an array, simply use a derived table with UNION, which eliminates duplicates, then GROUP BY fn and use count().
SELECT fn,
count(*)
FROM (SELECT fn,
ln
FROM t
UNION
SELECT fn,
ln
FROM tt) AS x
GROUP BY fn;
Side note: 9.3 has been out of support for a while. Consider upgrading.

Array_agg containing distinct structs

I'm attempting to create an array with distinct structs as values for a column, something like this:
select array_agg(distinct struct(field_a, field_b)) as c FROM tables ...
Is that possible?
#standardSQL
SELECT ARRAY_AGG(STRUCT(field_a, field_b)) c
FROM (
SELECT DISTINCT field_a, field_b
FROM `project.dataset.table`
)

Pattern matching SQL on first 5 characters

I'm thinking about a SQL query that returns me all entries from a column whose first 5 characters match. Any ideas?
I'm thinking about entries where ANY first 5 characters match, not specific ones. E.g.
HelloA
HelloB
ThereC
ThereD
Something
would return the first four entries:
HelloA
HelloB
ThereC
ThereD
EDIT: I am using SQL92 so cannot use the left command!
Try this:
SELECT *
FROM YourTable
WHERE LEFT(stringColumn, 5) IN (
SELECT LEFT(stringColumn, 5)
FROM YOURTABLE
GROUP BY LEFT(stringColumn, 5)
HAVING COUNT(*) > 1
)
This selects the first 5 characters, groups by them and returns only the ones that happen more than once.
Or with Substring:
SELECT * FROM YourTable
WHERE substring(stringColumn,1,5) IN (
SELECT substring(stringColumn,1,5)
FROM YOURTABLE
GROUP BY substring(stringColumn,1,5)
HAVING COUNT(*) > 1)
;
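A quick way to check the substring version is to run it against the sample data. The sketch below uses Python with in-memory SQLite, where the function is spelled substr rather than substring; the table name entries, the column name s, and the function name are made up for illustration.

```python
import sqlite3


def rows_sharing_prefix(strings, n=5):
    """Return the strings whose first n characters are shared with another row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (s TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO entries VALUES (?)", [(s,) for s in strings])
    rows = conn.execute(
        """
        SELECT s
        FROM entries
        WHERE substr(s, 1, :n) IN (
            SELECT substr(s, 1, :n)
            FROM entries
            GROUP BY substr(s, 1, :n)
            HAVING COUNT(*) > 1   -- prefix occurs in more than one row
        )
        ORDER BY s
        """,
        {"n": n},
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(rows_sharing_prefix(["HelloA", "HelloB", "ThereC", "ThereD", "Something"]))
```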
Sounds easy enough...
In SQL Server this would be something along the lines of
where Left(ColumnName,5) = '12345'
Try
Select *
From tbl t1
Where exists (
Select 1
From tbl t2
Where left(t1.str, 5) = left(t2.str, 5)
Group by left(t2.str, 5)
Having count(1) > 1
)
You didn't specify your DBMS. If it supports Windowed Aggregate functions it's:
select *
from
(
select
tab.*,
count(*) over (partition by substring(col from 1 for 5)) as cnt
from tab
) as dt
where cnt > 1
You want to work with a CTE approach.
Something like this:
with CountriesCTE(Id, Name)
as (
select Id, Name from Countries
)
select distinct Countries.Name
from CountriesCTE, Countries
where left(CountriesCTE.Name,5) = left(Countries.Name,5) and CountriesCTE.Id <> Countries.Id

How to get Original Rows filtered by a HAVING Condition?

What is the method in T-SQL to select the original rows filtered by a HAVING condition? For example, if I have
A|B
10|1
11|2
10|3
How would I get all the values of B (not an average or some other summary stat), grouped by A, having a count (occurrences of A) greater than or equal to 2?
Actually, you have several options to choose from:
1. You could make a subquery out of your original having statement and join it back to your table
SELECT *
FROM YourTable yt
INNER JOIN (
SELECT A
FROM YourTable
GROUP BY
A
HAVING COUNT(*) >= 2
) cnt ON cnt.A = yt.A
2. Another equivalent solution would be to use a WITH clause
;WITH cnt AS (
SELECT A
FROM YourTable
GROUP BY
A
HAVING COUNT(*) >= 2
)
SELECT *
FROM YourTable yt
INNER JOIN cnt ON cnt.A = yt.A
3. Or you could use an IN subquery
SELECT *
FROM YourTable yt
WHERE A IN (SELECT A FROM YourTable GROUP BY A HAVING COUNT(*) >= 2)
A self join will work:
select t.B
from YourTable t
join (
select A
from YourTable
group by A
having count(*) >= 2
) s
on s.A = t.A;
You can use window function (no joins, only one table scan):
select * from (
select *, cnt=count(*) over(partition by A) from YourTable
) as a
where cnt >= 2
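The window-function variant can be tried on the sample data with the sketch below, which uses Python and in-memory SQLite rather than SQL Server. It assumes an SQLite build with window function support (3.25 or later) and uses the standard AS alias instead of the T-SQL cnt= form; the table name t and the function name are hypothetical.

```python
import sqlite3


def rows_with_repeated_a(pairs, min_count=2):
    """pairs is a list of (A, B) rows.

    Returns the original rows whose A value occurs at least min_count times.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (A INTEGER, B INTEGER)")  # hypothetical table
    conn.executemany("INSERT INTO t VALUES (?, ?)", pairs)
    rows = conn.execute(
        """
        SELECT A, B
        FROM (
            -- Attach the per-A row count to every row without collapsing the rows.
            SELECT A, B, COUNT(*) OVER (PARTITION BY A) AS cnt
            FROM t
        )
        WHERE cnt >= ?
        ORDER BY A, B
        """,
        (min_count,),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(rows_with_repeated_a([(10, 1), (11, 2), (10, 3)]))
```

Unlike GROUP BY with HAVING, the window count keeps every input row, which is why the outer WHERE can filter on it and still return the original B values.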