PostgreSQL: Select unique rows where distinct values are in list - sql

Say that I have the following table:
with data as (
select 'John' "name", 'A' "tag", 10 "count"
union all select 'John', 'B', 20
union all select 'Jane', 'A', 30
union all select 'Judith', 'A', 40
union all select 'Judith', 'B', 50
union all select 'Judith', 'C', 60
union all select 'Jason', 'D', 70
)
I know there are a number of distinct tag values, namely (A, B, C, D).
I would like to select the unique names that only have the tag A
I can get close by doing
-- wrong!
select
distinct("name")
from data
group by "name"
having count(distinct tag) = 1
however, this will include unique names that only have 1 distinct tag, regardless of what tag is it.
I am using PostgreSQL, although having more generic solutions would be great.

You're almost there - you already have groups with one tag, now just test if it is the tag you want:
select
distinct("name")
from data
group by "name"
having count(distinct tag) = 1 and max(tag)='A'
(Note max could be min as well - SQL just doesn't have single() aggregate function but that's different story.)

You can use not exists here:
select distinct "name"
from data d
where "tag" = 'A'
and not exists (
select * from data d2
where d2."name" = d."name" and d2."tag" != d."tag"
);

This is one possible way of solving it:
select
distinct("name")
from data
where "name" not in (
-- create list of names we want to exclude
select distinct name from data where "tag" != 'A'
)
But I don't know if it's the best or most efficient one.

Related

ROW type/constructor in BigQuery

Does BigQuery have the concept of a ROW, for example, similar to MySQL or Postgres or Oracle or Snowflake? I know it sort of implicitly uses it when doing an INSERT ... VALUES (...) , for example:
INSERT dataset.Inventory (product, quantity)
VALUES('top load washer', 10),
('front load washer', 20)
Each of the values would be implicitly be a ROW type of the Inventory table, but is this construction allowed elsewhere in BigQuery? Or is this a feature that doesn't exist in BQ?
I think below is a simplest / naïve example of such constructor in BigQuery
with t1 as (
select 'top load washer' product, 10 quantity, 'a' type, 'x' category union all
select 'front load washer', 20, 'b', 'y'
), t2 as (
select 1 id, 'a' code, 'x' value union all
select 2, 'd', 'z'
)
select *
from t1
where (type, category) = (select as struct code, value from t2 where id = 1)
Besides using in simple queries, it can also be use in BQ scripts - for example (another simplistic example)
declare type, category string;
create temp table t2 as (
select 1 id, 'a' code, 'x' value union all
select 2, 'd', 'z'
);
set (type, category) = (select as struct code, value from t2 where id = 1);

Find missing value in table from given set

Assume there is a table called "allvalues" with a column named "column".
This column contains the values "A" to "J" while missing the "H".
I am given a set of values from "G" to "J".
How can I query the table to see which value of my set is missing in the column?
The following does not work:
select * from allvalues where column not in ('G', 'H', 'I', 'J')
This query would result in A, B, C, D, E, F, H which also contains values not included in the given set.
Obviously in such a small data pool the missing value is noticeable by eye, but imagine more entries in the table and a bigger set.
You need to start with a (derived) table with the values you are checking. One explicit method is:
with testvalues as (
select 'G' as val from dual union all
select 'H' as val from dual union all
select 'I' as val from dual union all
select 'J' as val from dual
)
select tv.val
from testvalues tv
where not exists (select 1 from allvalues av where av.column = tv.val);
Often, the values originate through a query or a table. So explicitly declaring them is unnecessary -- you can replace that part with a subquery.
Depends on which SQL syntax you can use, but basically you want to check your table allvalues + the extra values.
eg:
SELECT *
FROM ALLVALUES
WHERE COLUMN NOT IN (
( select s.column from allvalues s )
and column not in ('G', 'H', 'I', 'J')
this will work:
select * from table1;
G
H
I
J
select * from table1
minus
(select * from table1
intersect
select column from allvalues
)
sample input:
select * from ns_table10;
G
H
I
J
SELECT * FROM ns_table11;
A
B
C
D
E
F
G
J
I
select * from ns_table10
minus
(select * from ns_table10
intersect
select * from ns_table11
);
output:
H

Find duplicate records in database against unique attributes

I want to find duplicate of IRN # entered into a table in database. Here are the unique attributes (logically unique) of the IRN.
ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo
An IRN can have multiple WeldNos meaning the above unique attributes may repeat for one IRN # (with of course one of the 5 attribute values must be unique).
Now I am trying to find out whether there are any duplicate IRNs entered into the system or not? How can I find that through a sql query?
P.S: Due to bad design of database, there is no primary key in the table..
Here is what I have tried so far but this does not give the correct results.
select * from WeldInfo a, WeldInfo b
where a.ProjectNo = b.ProjectNo and
a.DrawingNo = b.DrawingNo and
a.DrawingRev = b.DrawingRev and
a.SpoolNo = b.SpoolNo and
a.WeldNo = b.WeldNo and
a.IrnNo <> b.IrnNo;
But i'm not sure, have i understood your question.
select * from (
select count(*) over ( partition by ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldN) rr,t.* from WeldInfo t)
where rr > 1;
Explanation.
with tab as (
select 1 as id, 'a' as a , 'b' as b , 'c' as c from dual
union all
select 2 , 'a', 'b', 'c' from dual
union all
select 3 , 'x', 'b', 'c' from dual
union all
select 3 , 'x', 'b', 'c' from dual
union all
select 3 , 'x', 'd', 'c' from dual
)
select t.*
, count(*) over (partition by a,b,c) cnt1
, count(distinct id) over (partition by a,b,c) cnt2
from tab t;

SQL: Return a count of 0 with count(*)

I am using WinSQL to run a query on a table to count the number of occurrences of literal strings. When trying to do a count on a specific set of strings, I still want to see if some values return a count of 0. For example:
select letter, count(*)
from table
where letter in ('A', 'B', 'C')
group by letter
Let's say we know that 'A' occurs 3 times, 'B' occurs 0 times, and 'C' occurs 5 times. I expect to have a table returned as such:
letter count
A 3
B 0
C 5
However, the table never returns a row with a 0 count, which results like so:
letter count
A 3
C 5
I've looked around and saw some articles mentioning the use of joins, but I've had no luck in correctly returning a table that looks like the first example.
You can create an in-line table containing all letters that you look for, then LEFT JOIN your table to it:
select t1.col, count(t2.letter)
from (
select 'A' AS col union all select 'B' union all select 'C'
) as t1
left join table as t2 on t1.col = t2.letter
group by t1.col
on many platforms you can now use the values statement instead of union all to create your "in line" table - like this
select t.letter, count(mytable.letter)
from ( values ('A'),('B'),('C') ) as t(letter)
left join mytable on t.letter = mytable.letter
group by t.letter
I'm not that familiar with WinSQL, but it's not pretty if you don't have the values that you want in the left most column in a table somewhere. If you did, you could use a left join and a conditional. Without it, you can do something like this:
SELECT all_letters.letter, IFNULL(letter_count.letter_count, 0)
FROM
(
SELECT 'A' AS letter
UNION
SELECT 'B' AS letter
UNION
SELECT 'C' AS letter
) all_letters
LEFT JOIN
(SELECT letter, count(*) AS letter_count
FROM table
WHERE letter IN ('A', 'B', 'C')
GROUP BY letter) letter_count
ON all_letters.letter = letter_count.letter

use SUM on certain conditions

I have a script that extracts transactions and their details from a database. But my users complain that the file size being generated is too large, and so they asked for certain transactions to be just summed up/consolidated instead if they are of a certain classification, say Checking Accounts. That means there should only be one line in the result set named "Checking" which contains the sum of all transactions under Checking Accounts. Is there a way for an SQL script to go like:
CASE
WHEN Acct_class = 'Checking'
then sum(tran_amount)
ELSE tran_amount
END
I already have the proper GROUP BY and ORDER BY statements, but I can't seem to get my desired output. There is still more than one "Checking" line in the result set. Any ideas would be very much appreciated.
Try This,
Select sum(tran_amount) From tran_amount Where Acct_class = 'Checking'
You can try to achieve this using UNION ALL
SELECT tran_amount, .... FROM table WHERE NOT Acct_class = 'Checking'
UNION ALL
SELECT SUM(tran_amount), .... FROM table WHERE Acct_class = 'Checking' GROUP BY Acct_class, ...;
hi you can try below sql
select account_class,
case when account_class = 'saving' then listagg(trans_detail, ',') within group (order by emp_name) -- will give you all details transactions
when account_class = 'checking' then to_char(sum(trans_detail)) -- will give you only sum of transactions
end as trans_det from emp group by account_class;
Or, if your desired output is getting either the sum, either the actual column value based on another column value, the solution would be to use an analytical function to get the sum together with the actual value:
select
decode(acct_class, 'Checking', tran_amount_sum, tran_amount)
from (
select
sum(tran_amount) over (partition by acct_class) as tran_amount_sum,
tran_amount,
acct_class
from
YOUR_TABLE
)
You can try something like the following, by keeping single rows for some classes, and aggregating for some others:
with test (id, class, amount) as
(
select 1, 'a' , 100 from dual union all
select 2, 'a' , 100 from dual union all
select 3, 'Checking', 100 from dual union all
select 4, 'Checking', 100 from dual union all
select 5, 'c' , 100 from dual union all
select 6, 'c' , 100 from dual union all
select 7, 'c' , 100 from dual union all
select 8, 'd' , 100 from dual
)
select sum(amount), class
from test
group by case
when class = 'Checking' then null /* aggregates elements of class 'b' */
else id /* keeps elements of other classes not aggregated */
end,
class