SQL query to fetch distinct records - sql

Can someone help me with this SQL query on Postgres? I can't come up with it myself. I have simplified the problem as far as I can, down from over 1 million records and several more constraints. I know it looks easy, but I still can't solve it:
Table_name = t
Column_1_name = id
Column_2_name = st
Column_1_elements = [1,1,1,1,2,2,2,3,3]
Column_2_elements = [a,b,c,d,a,c,d,b,d]
Now I want to print the distinct ids that do not have both 'a' and 'b' among their corresponding st values.
For the example above, the output should be [2,3], as 2 has no corresponding 'b' and 3 has no 'a' (3 also has no 'c', but we are not concerned about 'c'). id=1 is not returned because it is related to both 'a' and 'b'.
Let me know if you need more clarity.
Thanks in advance for helping.
edit1: The number of elements for id = 1,2,3 could be anything. I just want those ids whose corresponding st values do not "contain" both 'a' and 'b'.
Say there is an id=4 which has just one st, 'r', and an id=5 which contains 'a','b','c','d','e','f','k','z'.
Then we want id=4 in the output as well, since it does not contain 'a' or 'b'.

You might need to adjust the syntax a little for your SQL engine, but this is a working solution in Google BigQuery:
with temp as (
select 1 as id, 'a' as st union all
select 1 as id, 'b' as st union all
select 1 as id, 'c' as st union all
select 1 as id, 'd' as st union all
select 2 as id, 'a' as st union all
select 2 as id, 'c' as st union all
select 2 as id, 'd' as st union all
select 3 as id, 'b' as st union all
select 3 as id, 'd' as st union all
select 4 as id, 'e' as st union all
select 5 as id, 'g' as st union all
select 5 as id, 'h' as st
)
-- add 2 columns for is_a and is_b flags
, temp2 as (
select *
, case when st = 'a' then 1 else 0 end is_a
,case when st = 'b' then 1 else 0 end as is_b
from temp
)
-- IDs that have both the flags as 1 should be filtered out (like ID = 1)
select id
from temp2
group by 1
having max(is_a) + max(is_b) < 2
This solution also handles the case you mentioned with ID 4. Let me know if this works for you.
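For readers who want to try the flag-and-aggregate idea without BigQuery, here is a minimal sketch that runs the same HAVING logic through Python's sqlite3 module. SQLite is only a stand-in engine here, and the rows for ids 4 and 5 are taken from the question's edit, so treat the setup as an assumption rather than the asker's real schema.

```python
import sqlite3

# Sample rows from the question, plus ids 4 and 5 from the edit.
rows = [(1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'),
        (2, 'a'), (2, 'c'), (2, 'd'),
        (3, 'b'), (3, 'd'),
        (4, 'r'),
        (5, 'a'), (5, 'b'), (5, 'c'), (5, 'd')]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, st TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Keep an id unless it has at least one 'a' row AND at least one 'b' row.
query = """
SELECT id
FROM t
GROUP BY id
HAVING MAX(CASE WHEN st = 'a' THEN 1 ELSE 0 END)
     + MAX(CASE WHEN st = 'b' THEN 1 ELSE 0 END) < 2
ORDER BY id
"""
ids_without_both = [r[0] for r in conn.execute(query)]
print(ids_without_both)
```

The two flags are folded into the HAVING clause instead of a second CTE; the filtering rule is the same.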

See if this works:
create table t (id integer, st varchar);
insert into t values (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (2, 'a'), (2, 'c'), (2, 'd'), (3, 'b'), (3, 'd'), (4, 'r');
insert into t values (5, 'a'), (5, 'b'), (5, 'c'), (5, 'd'), (5, 'e'), (5, 'f'), (5, 'k'), (5, 'z');
select id, array['a', 'b'] <@ array_agg(st)::text[] as tf from t group by id;
id | tf
----+----
3 | f
5 | t
4 | f
2 | f
1 | t
select * from (select id, array['a', 'b'] <@ array_agg(st)::text[] as tf from t group by id) as agg where agg.tf = 'f';
id | tf
----+----
3 | f
4 | f
2 | f
In the first select query, array_agg(st) aggregates all the st values for an id via the group by id. array['a', 'b'] <@ array_agg(st)::text[] then asks whether a and b are both contained in that aggregated array.
The query is then turned into a sub-query, and the outer query selects the rows where tf is 'f' (false), in other words the ids that did not have both a and b among their aggregated st values.
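To make the containment test concrete, here is a small Python sketch that models it with sets. It is only an analogy, assuming that grouping rows into a set per id plays the role of array_agg(st) and that the subset operator `<=` plays the role of the array "is contained by" operator; it does not talk to Postgres.

```python
from collections import defaultdict

rows = [(1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'),
        (2, 'a'), (2, 'c'), (2, 'd'),
        (3, 'b'), (3, 'd'),
        (4, 'r'),
        (5, 'a'), (5, 'b'), (5, 'c'), (5, 'd'),
        (5, 'e'), (5, 'f'), (5, 'k'), (5, 'z')]

required = {'a', 'b'}

# Stand-in for array_agg(st) ... GROUP BY id
values_by_id = defaultdict(set)
for id_, st in rows:
    values_by_id[id_].add(st)

# "required <= values" is True only when every required value is present,
# which is what the tf column reports for each id.
tf = {id_: required <= values for id_, values in values_by_id.items()}

# The outer query keeps the ids whose tf is false.
kept = sorted(id_ for id_, has_both in tf.items() if not has_both)
print(tf, kept)
```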

Related

Multiple conditions on multiple columns

I have table that looks like this
WO | PS | C
----------------
12 | 1 | a
12 | 2 | b
12 | 2 | b
12 | 2 | c
13 | 1 | a
I want to find the values from the WO column where PS has value 1 and C has value a AND PS has value 2 and C has value b. So I need multiple conditions on the same columns, and they have to be matched within a single WO value. If no WO value matches all of these conditions, I don't want it included.
I tried using condition:
WHERE PS = 1 AND C = a AND PS = 2 AND C = b
but it does not work and does not have connection to WO column as mentioned above.
Edit:
I need to find WO which has (PS = 1 AND C = a) and at the same time it also has rows where (PS = 2 and C = b).
The result should be:
WO | PS | C
----------------
12 | 1 | a
12 | 2 | b
12 | 2 | b
If either of rows: (PS = 1 and C = a) or (PS = 2 and C = b) does not exist then nothing should be returned.
WHERE (PS = 1 AND C = a) or (PS = 2 AND C = b)
try this condition
As I understand this, you need two IN clauses or two EXIST clauses, something like this:
SELECT DISTINCT wo, ps, c
FROM yourtable
WHERE wo IN
(SELECT wo FROM yourtable WHERE ps = 1 and c = 'a')
AND wo IN
(SELECT wo FROM yourtable WHERE ps = 2 and c = 'b');
This will produce this outcome:
WO | PS | C
----------------
12 | 1 | a
12 | 2 | b
12 | 2 | c
Please note that in the last row of the result, column C has the value c instead of b as shown in your question. I guess this was a mistake when creating the sample outcome?
If I have misunderstood your question, please let me know what's wrong and I will revise the answer.
Edit: To create the same result as shown in your question, this query would do:
SELECT wo, ps, c
FROM yourtable
WHERE ps IN (1,2) AND c IN ('a','b')
AND wo IN
(SELECT wo FROM yourtable WHERE ps = 1 and c = 'a')
AND wo IN
(SELECT wo FROM yourtable WHERE ps = 2 and c = 'b');
But I really don't believe this is what you were looking for ;)
Try out: db<>fiddle
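As a quick way to check both variants locally, the following sketch runs them through Python's sqlite3 module. The table name yourtable comes from the answer; using SQLite as the engine and adding ORDER BY for a stable output order are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (wo INTEGER, ps INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [(12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a')],
)

# A WO qualifies only if it appears in both subqueries.
both_pairs = """
    wo IN (SELECT wo FROM yourtable WHERE ps = 1 AND c = 'a')
AND wo IN (SELECT wo FROM yourtable WHERE ps = 2 AND c = 'b')
"""

# Variant 1: every distinct row of a qualifying WO.
distinct_rows = conn.execute(
    f"SELECT DISTINCT wo, ps, c FROM yourtable WHERE {both_pairs} ORDER BY wo, ps, c"
).fetchall()

# Variant 2: additionally restrict the returned rows to ps IN (1,2) and c IN ('a','b').
filtered_rows = conn.execute(
    f"""SELECT wo, ps, c FROM yourtable
        WHERE ps IN (1, 2) AND c IN ('a', 'b') AND {both_pairs}
        ORDER BY wo, ps, c"""
).fetchall()

print(distinct_rows)
print(filtered_rows)
```

Note that the second variant filters ps and c independently, so a row such as (12, 1, 'b') would also pass it if it existed in the data.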
I think you can make use of an exists criteria here to filter your rows correctly, I would like to see a wider sample data set to be sure though.
select *
from t
where ps in (1,2) and C in ('a','b')
and exists (
select * from t t2 where t2.WO = t.WO
and t2.PS != t.PS and t2.C != t.C
);
Just to throw in one more solution, you can do this with a single reference to your table, but this may not necessarily mean that it is more efficient. The first part is to filter based on the combinations you want:
DECLARE @T TABLE (WO INT, PS INT, C CHAR(1))
INSERT INTO @T (WO, PS, C)
VALUES (12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a');
SELECT *
FROM @T AS t
WHERE (t.PS = 1 AND t.C = 'a')
OR (t.PS = 2 AND t.C = 'b');
WO | PS | C
----------------
12 | 1 | a
12 | 2 | b
12 | 2 | b
13 | 1 | a
But you want to exclude WO 13 because it doesn't have both combinations, so what we ideally need is a distinct count of PS and C per WO, keeping the rows where that count is 2. You can't do COUNT(DISTINCT ..) in a windowed function directly, but you can do this indirectly with DENSE_RANK():
DECLARE @T TABLE (WO INT, PS INT, C CHAR(1))
INSERT INTO @T (WO, PS, C)
VALUES (12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a');
SELECT *,
CntDistinct = DENSE_RANK() OVER(PARTITION BY t.WO ORDER BY t.PS, t.C) +
DENSE_RANK() OVER(PARTITION BY t.WO ORDER BY t.PS DESC, t.C DESC) - 1
FROM @T AS t
WHERE (t.PS = 1 AND t.C = 'a')
OR (t.PS = 2 AND t.C = 'b');
Which gives:
WO | PS | C | CntDistinct
--------------------------
12 | 1 | a | 2
12 | 2 | b | 2
12 | 2 | b | 2
13 | 1 | a | 1
You can then put this in a subquery and choose only the rows with a count of 2:
DECLARE @T TABLE (WO INT, PS INT, C CHAR(1))
INSERT INTO @T (WO, PS, C)
VALUES (12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a');
SELECT t.WO, t.PS, t.C
FROM ( SELECT t.*,
CntDistinct = DENSE_RANK() OVER(PARTITION BY t.WO ORDER BY t.PS, t.C) +
DENSE_RANK() OVER(PARTITION BY t.WO ORDER BY t.PS DESC, t.C DESC) - 1
FROM @T AS t
WHERE (t.PS = 1 AND t.C = 'a')
OR (t.PS = 2 AND t.C = 'b')
) AS t
WHERE t.CntDistinct = 2;
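The DENSE_RANK() trick is easier to trust once you see the numbers. Within a partition, the ascending rank plus the descending rank minus 1 equals the number of distinct (PS, C) pairs. Below is a minimal sketch of that step using Python's sqlite3 module; it assumes a SQLite build with window function support (3.25 or later) and uses the `expr AS alias` form instead of the T-SQL `alias = expr` syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (wo INTEGER, ps INTEGER, c TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a')],
)

# Ascending rank + descending rank - 1 = distinct (ps, c) pairs per wo.
ranked_sql = """
SELECT wo, ps, c,
       DENSE_RANK() OVER (PARTITION BY wo ORDER BY ps, c)
     + DENSE_RANK() OVER (PARTITION BY wo ORDER BY ps DESC, c DESC) - 1 AS cnt_distinct
FROM t
WHERE (ps = 1 AND c = 'a') OR (ps = 2 AND c = 'b')
"""

with_counts = conn.execute(
    f"SELECT * FROM ({ranked_sql}) AS r ORDER BY wo, ps, c"
).fetchall()

# Keep only the WOs that contain both wanted combinations.
both = conn.execute(
    f"SELECT wo, ps, c FROM ({ranked_sql}) AS r WHERE cnt_distinct = 2 ORDER BY wo, ps, c"
).fetchall()

print(with_counts)
print(both)
```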
Finally, if the combinations are likely to change, or there are a lot more than 2 of them, you may find that building a table of the combinations you are looking for is a more maintainable solution:
DECLARE @T TABLE (WO INT, PS INT, C CHAR(1))
INSERT INTO @T (WO, PS, C)
VALUES (12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a');
DECLARE @Combinations TABLE (PS INT, C CHAR(1), PRIMARY KEY (PS, C));
INSERT INTO @Combinations (PS, C)
VALUES (1, 'a'), (2, 'b');
SELECT t.WO, t.PS, t.C
FROM ( SELECT t.*,
CntDistinct = DENSE_RANK() OVER(PARTITION BY t.WO ORDER BY t.PS, t.C) +
DENSE_RANK() OVER(PARTITION BY t.WO ORDER BY t.PS DESC, t.C DESC) - 1
FROM @T AS t
INNER JOIN @Combinations AS c
ON c.PS = t.PS
AND c.C = t.C
) AS t
WHERE t.CntDistinct = (SELECT COUNT(*) FROM @Combinations);
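To illustrate why the combinations table is easier to maintain, here is a hedged sketch in Python with sqlite3 where the wanted pairs are passed in as data. The helper name matching_rows and the extra (2, 'c') pair in the second call are hypothetical, added only to show that the query text does not change when the list of combinations does. It assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

DATA = [(12, 1, 'a'), (12, 2, 'b'), (12, 2, 'b'), (12, 2, 'c'), (13, 1, 'a')]

QUERY = """
SELECT wo, ps, c
FROM (
    SELECT t.wo, t.ps, t.c,
           DENSE_RANK() OVER (PARTITION BY t.wo ORDER BY t.ps, t.c)
         + DENSE_RANK() OVER (PARTITION BY t.wo ORDER BY t.ps DESC, t.c DESC) - 1 AS cnt_distinct
    FROM t
    JOIN combinations AS k ON k.ps = t.ps AND k.c = t.c
) AS ranked
WHERE cnt_distinct = (SELECT COUNT(*) FROM combinations)
ORDER BY wo, ps, c
"""

def matching_rows(combos):
    """Return rows of every WO that contains all (ps, c) pairs in combos."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (wo INTEGER, ps INTEGER, c TEXT)")
    conn.execute("CREATE TABLE combinations (ps INTEGER, c TEXT, PRIMARY KEY (ps, c))")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", DATA)
    conn.executemany("INSERT INTO combinations VALUES (?, ?)", combos)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows

two_pairs = matching_rows([(1, 'a'), (2, 'b')])
three_pairs = matching_rows([(1, 'a'), (2, 'b'), (2, 'c')])  # hypothetical extra pair
print(two_pairs)
print(three_pairs)
```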
Let's chat about demo data. You provided some useful data that helps us see what your problem is, but no DDL. If you provide your demo data similar to this, it makes it easier for us to understand the issue:
DECLARE @table TABLE (WO INT, PS INT, C NVARCHAR(10))
INSERT INTO @table (WO, PS, C) VALUES
(12, 1, 'a'), (12, 2, 'b'),
(12, 2, 'b'), (12, 2, 'c'),
(13, 1, 'a')
Now on to your question. It looks to me like you just need composite conditions, at least one of which must evaluate fully to true. Consider this:
SELECT *
FROM @table
WHERE (
PS = 1
AND C = 'a'
)
OR (
PS = 2
AND C = 'b'
)
The predicates wrapped in the parens are evaluated as a whole in the WHERE clause. If one of the predicates inside a pair of parens is false, that whole composite is false. If either composite evaluates to true, we return the row.
WO PS C
---------
12 1 a
12 2 b
12 2 b
13 1 a
This result set does include WO 13, as by your definition it should be there. I don't know if there are additional things you wanted to evaluate which may exclude it, but it does have a PS of 1 and a C of a.
Edit:
if the question is as discussed in the comments that a single WO must contain BOTH then this may be the answer:
SELECT *
FROM @table t
INNER JOIN (
SELECT t1.WO
FROM @table t1
INNER JOIN @table t2
ON t1.WO = t2.WO
WHERE t1.PS = 1
AND t1.C = 'a'
AND t2.PS = 2
AND t2.C = 'b'
GROUP BY t1.WO
) a
ON t.WO = a.WO
WHERE (
t.PS = 1
AND t.C = 'a'
)
OR (
t.PS = 2
AND t.C = 'b'
)
WO PS C WO
--------------
12 1 a 12
12 2 b 12
12 2 b 12

Selecting rows from a table with specific values per id

I have the below table
Table 1
Id WFID data1 data2
1 12 'd' 'e'
1 13 '3' '4f'
1 15 'e' 'dd'
2 12 'f' 'ee'
3 17 'd' 'f'
2 17 'd' 'f'
4 12 'd' 'f'
5 20 'd' 'f'
From this table I want to select only the rows whose Id has WFID 12 and/or 17 exclusively. That is, I want to retrieve the distinct ids 2, 3 and 4. 1 is excluded because it has 12 but also has 13 and 15. 5 is excluded because it has 20.
2 is included because it has just 12 and 17.
3 is included because it has just 17.
4 is included because it has just 12.
If you just want the list of distinct ids that satisfy the conditions, you can use aggregation and filter with a having clause:
select id
from mytable
group by id
having max(case when wfid not in (12, 17) then 1 else 0 end) = 0
This filters out groups that have any wfid other than 12 or 17.
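Here is a minimal runnable sketch of that HAVING filter, using Python's sqlite3 module with the sample rows from the question. The table name mytable comes from the answer; SQLite as the engine and the ORDER BY are assumptions for the sake of a reproducible example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, wfid INTEGER, data1 TEXT, data2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, 12, 'd', 'e'), (1, 13, '3', '4f'), (1, 15, 'e', 'dd'),
     (2, 12, 'f', 'ee'), (3, 17, 'd', 'f'), (2, 17, 'd', 'f'),
     (4, 12, 'd', 'f'), (5, 20, 'd', 'f')],
)

# The CASE yields 1 for any "foreign" wfid, so MAX(...) = 0 means
# the group contains nothing except 12 and 17.
exclusive_ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM mytable
    GROUP BY id
    HAVING MAX(CASE WHEN wfid NOT IN (12, 17) THEN 1 ELSE 0 END) = 0
    ORDER BY id
""")]
print(exclusive_ids)
```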
If you want the entire corresponding rows, then window functions are more appropriate:
select id, wfid, data1, data2
from (
select t.*,
max(case when wfid not in (12, 17) then 1 else 0 end) over(partition by id) flag
from mytable t
) t
where flag = 0
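A sketch of the window-function variant, again via Python's sqlite3 module (assuming SQLite 3.25 or later). The same flag is computed per id with OVER (PARTITION BY id), so every row keeps its own columns and can be returned in full.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, wfid INTEGER, data1 TEXT, data2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, 12, 'd', 'e'), (1, 13, '3', '4f'), (1, 15, 'e', 'dd'),
     (2, 12, 'f', 'ee'), (3, 17, 'd', 'f'), (2, 17, 'd', 'f'),
     (4, 12, 'd', 'f'), (5, 20, 'd', 'f')],
)

# flag is 1 on every row of an id that has any wfid outside (12, 17).
full_rows = conn.execute("""
    SELECT id, wfid, data1, data2
    FROM (
        SELECT t.*,
               MAX(CASE WHEN wfid NOT IN (12, 17) THEN 1 ELSE 0 END)
                   OVER (PARTITION BY id) AS flag
        FROM mytable AS t
    ) AS flagged
    WHERE flag = 0
    ORDER BY id, wfid
""").fetchall()
print(full_rows)
```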
You really need to start thinking in terms of sets. And it helps everyone if you provide a script that can be used to experiment and demonstrate. Here is another approach using the EXCEPT operator. The idea is to first generate a set of IDs that we want based on the filter. You then generate a set of IDs that we do not want. Using EXCEPT we can then remove the 2nd set from the 1st.
declare @x table (Id tinyint, WFID tinyint, data1 char(1), data2 varchar(4));
insert @x (Id, WFID, data1, data2) values
(1, 12, 'd', 'e'),
(1, 13, '3', '4f'),
(1, 15, 'e', 'dd'),
(2, 12, 'f', 'ee'),
(3, 17, 'd', 'f'),
(2, 17, 'd', 'f'),
(4, 12, 'd', 'f'),
(2, 12, 'z', 'ef'),
(5, 20, 'd', 'f');
select * from @x;
select id from @x where WFID not in (12, 17);
select id from @x where WFID in (12, 17)
except
select id from @x where WFID not in (12, 17);
Notice the added row to demonstrate what happens when there are "duplicates".
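The set-difference idea can be tried out with the sketch below, which runs the same EXCEPT query through Python's sqlite3 module. The table is called x here because SQLite has no table variables; that name and the ORDER BY are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (id INTEGER, wfid INTEGER, data1 TEXT, data2 TEXT)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?, ?)",
    [(1, 12, 'd', 'e'), (1, 13, '3', '4f'), (1, 15, 'e', 'dd'),
     (2, 12, 'f', 'ee'), (3, 17, 'd', 'f'), (2, 17, 'd', 'f'),
     (4, 12, 'd', 'f'), (2, 12, 'z', 'ef'), (5, 20, 'd', 'f')],
)

# Set 1: ids that have at least one wanted wfid (id 2 appears three times).
wanted = [r[0] for r in conn.execute(
    "SELECT id FROM x WHERE wfid IN (12, 17) ORDER BY id")]

# Set 1 minus set 2; EXCEPT also collapses the duplicate id 2 entries.
result = [r[0] for r in conn.execute("""
    SELECT id FROM x WHERE wfid IN (12, 17)
    EXCEPT
    SELECT id FROM x WHERE wfid NOT IN (12, 17)
    ORDER BY id
""")]
print(wanted, result)
```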

Use CTE to generate test data in SQL

I am trying to use a CTE to generate a data table for a unit test in SQL (postgresql).
WITH temp1 AS (
SELECT
('A', 'A', 'B', 'B') AS grp,
(1, 2, NULL, 1) AS outcome
)
SELECT *
FROM temp1
The above query is generating a single row rather than a 4-row table that would be useful to my unit test. How can I generate the 4-row table in the form:
grp.....outcome
A.......1
A.......2
B.......NULL
B.......1
You could just use the values() syntax to create the rows, like so:
with temp1(grp, outcome) as (values ('A', 1), ('A', 2), ('B', null), ('B', 1))
select * from temp1
Demo on DB Fiddle:
grp | outcome
:-- | ------:
A | 1
A | 2
B | null
B | 1
You don't need a CTE. Use UNION ALL, as in:
select 'A' as grp, 1 as outcome
union all select 'A', 2
union all select 'B', null
union all select 'B', 1
WITH temp1 AS (
SELECT 'A' as grp ,1 as outcome
UNION
SELECT 'A', 2
UNION
SELECT 'B',NULL
UNION
SELECT 'B',1
) SELECT * FROM temp1
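A short sketch comparing these approaches, run through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption; both accept this VALUES-in-CTE syntax). It also shows one practical difference between the last two answers: UNION removes duplicate rows while UNION ALL keeps them, which matters if a unit test needs repeated rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# VALUES inside a CTE yields one row per tuple.
values_rows = conn.execute("""
    WITH temp1(grp, outcome) AS (
        VALUES ('A', 1), ('A', 2), ('B', NULL), ('B', 1)
    )
    SELECT grp, outcome FROM temp1 ORDER BY grp, outcome
""").fetchall()

# Two identical rows: UNION ALL keeps both, UNION collapses them into one.
union_all_count = conn.execute("""
    SELECT COUNT(*) FROM (SELECT 'A' AS grp, 1 AS outcome UNION ALL SELECT 'A', 1)
""").fetchone()[0]
union_count = conn.execute("""
    SELECT COUNT(*) FROM (SELECT 'A' AS grp, 1 AS outcome UNION SELECT 'A', 1)
""").fetchone()[0]

print(values_rows, union_all_count, union_count)
```

SQLite sorts NULL before other values in ascending order, which is why the ('B', NULL) row comes before ('B', 1) here; PostgreSQL places NULLs last by default.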

SQL sort and last record

This is my first post, so pardon me if my question is not in the appropriate place or has a poor title.
I have a table like this
ID DATE Cat VALUE
-------------------------
1 07/07/2018 A 100
2 07/07/2018 A 200
3 07/07/2018 B 300
4 07/07/2018 B 400
5 07/07/2018 C 500
6 07/07/2018 C 600
7 08/07/2018 A 700
8 08/07/2018 A 800
9 08/07/2018 B 900
10 08/07/2018 B 110
11 08/07/2018 C 120
I would like to return
distinct category, sum of value, last record of the category
something like this
Cat sumValue lastrecord
--------------------------
A 1800 800
B 1710 110
C 1220 120
is it possible to do it in a single query
thanks
I am able to find the SUM
SELECT cat, SUM(value) FROM table GROUP BY cat;
and
find the last ID (autonumber key) using MAX
SELECT MAX(ID), cat FROM table GROUP BY cat;
but i just can't get the value for the last record
SQLFiddle
SELECT
t.cat,
SUM(t.value) as sumValue,
(
SELECT
t3.value
FROM
`table` t3
WHERE
t3.id = MAX(t2.id)
) as lastrecord
FROM
`table` t
JOIN
`table` t2 ON t.id = t2.id
GROUP BY
cat
EDIT shorter Version:
SELECT
t.cat,
SUM(t.value) as sumValue,
(SELECT value FROM `table` t2 WHERE t2.id = MAX(t.id)) lastValue
FROM
`table` t
GROUP BY
t.cat
This should do it
declare @t table (id int, cat char, value int);
insert into @t values
(1, 'A', 100),
(2, 'A', 200),
(3, 'B', 300),
(4, 'B', 400),
(5, 'C', 500),
(6, 'C', 600),
(7, 'A', 700),
(8, 'A', 800),
(9, 'B', 900),
(10, 'B', 110),
(11, 'C', 120);
select cat, value, sum
from
( select *
, sum(value) over (partition by cat) as sum
, ROW_NUMBER() over (partition by cat order by id desc) as rn
from @t
) tt
where tt.rn = 1
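Below is a minimal sketch of the same window-function approach run through Python's sqlite3 module (assuming SQLite 3.25 or later). The table name t and the alias sum_value are choices made for this example; the idea is unchanged: compute the per-category total on every row, number the rows from the newest id down, and keep row number 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, cat TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 'A', 100), (2, 'A', 200), (3, 'B', 300), (4, 'B', 400),
     (5, 'C', 500), (6, 'C', 600), (7, 'A', 700), (8, 'A', 800),
     (9, 'B', 900), (10, 'B', 110), (11, 'C', 120)],
)

# rn = 1 marks the row with the highest id in each category.
summary = conn.execute("""
    SELECT cat, sum_value, value AS lastrecord
    FROM (
        SELECT t.*,
               SUM(value) OVER (PARTITION BY cat) AS sum_value,
               ROW_NUMBER() OVER (PARTITION BY cat ORDER BY id DESC) AS rn
        FROM t
    ) AS tt
    WHERE tt.rn = 1
    ORDER BY cat
""").fetchall()
print(summary)
```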
I hope you're looking for something like this,
Please replace the table name with your table name.
SELECT A.id,
A.cat,
A.date,
A.total_value,
A1.value
FROM (SELECT Max(id) AS id,
cat,
Max(date) AS Date,
Sum(value) AS Total_Value
FROM tbl_sof
GROUP BY cat) AS A
INNER JOIN tbl_sof A1
ON A.id = A1.id

I want to output a list of every Case_Number with Code 1 that has a higher UniqueID value than its Code 2 counterpart's UniqueID value

I have a table which looks something like this:
Case_Number | Code | UniqueID
a 1 1372
a 2 1352
a 3 1325
b 1 1642
b 2 1651
b 3 1623
c 1 1743
c 2 1739
c 3 1720
... ... ...
From this table I want to output a list of every Case_Number where the UniqueID value of Code 1 is higher than the UniqueID value of Code 2 (ignoring the UniqueID value of Code 3, or any other Code x that might be in the table). This means that if the UniqueID value of Code 2 is higher than that of Code 1, as is the case with Case_Number b in the example above, it should not show up in the list.
So, querying the above table would result in this:
Case_Number | Code | UniqueID
a 1 1372
c 1 1743
Hmmm . . . You seem to want:
select t.*
from t
where t.code = 1 and
t.uniqueid > (select max(t2.uniqueid)
from t t2
where t2.case_number = t.case_number and t2.code = 2
);
The max() in the subquery is simply to handle the case where there is more than one matching value.
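For anyone wanting to verify this, here is a small sketch using Python's sqlite3 module. The row for case d, which has a Code 1 entry but no Code 2 entry, is a hypothetical addition to show what happens when the subquery finds nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (case_number TEXT, code INTEGER, uniqueid INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [('a', 1, 1372), ('a', 2, 1352), ('a', 3, 1325),
     ('b', 1, 1642), ('b', 2, 1651), ('b', 3, 1623),
     ('c', 1, 1743), ('c', 2, 1739), ('c', 3, 1720),
     ('d', 1, 1800)],  # hypothetical case with no Code 2 row
)

# For each Code 1 row, compare against the largest Code 2 uniqueid
# of the same case.
cases = conn.execute("""
    SELECT t.case_number, t.code, t.uniqueid
    FROM t
    WHERE t.code = 1
      AND t.uniqueid > (SELECT MAX(t2.uniqueid)
                        FROM t AS t2
                        WHERE t2.case_number = t.case_number
                          AND t2.code = 2)
    ORDER BY t.case_number
""").fetchall()
print(cases)
```

Case d is not returned: with no Code 2 row the subquery yields NULL, and a comparison against NULL is never true.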
The query below gives you the expected result
CREATE TABLE CaseTab
(Case_Number VARCHAR(10),
Code INT,
UniqueID INT);
INSERT INTO CaseTab VALUES ('a', 1, 1372);
INSERT INTO CaseTab VALUES ('a', 2, 1352);
INSERT INTO CaseTab VALUES ('a', 3, 1325);
INSERT INTO CaseTab VALUES ('b', 1, 1642);
INSERT INTO CaseTab VALUES ('b', 2, 1651);
INSERT INTO CaseTab VALUES ('b', 3, 1623);
INSERT INTO CaseTab VALUES ('c', 1, 1743);
INSERT INTO CaseTab VALUES ('c', 2, 1739);
INSERT INTO CaseTab VALUES ('c', 3, 1720);
WITH v_code_gt_1 AS
(SELECT Case_Number, MAX(UniqueID) AS UniqueID
FROM CaseTab
WHERE Code > 1
GROUP BY Case_Number)
SELECT c1.Case_Number, c1.UniqueID
FROM CaseTab c1 JOIN
v_code_gt_1 c2
ON (c1.Case_Number = c2.Case_Number)
WHERE c1.UniqueID > c2.UniqueID
AND c1.Code = 1;
Basically the query gets the max UniqueID for all cases where code is greater than 1 and compares against the Unique ID for Code 1.
You haven't stated whether there can be cases with code = 1, but no other codes. If so, use LEFT JOIN as below.
WITH v_code_gt_1 AS
(SELECT Case_Number, MAX(UniqueID) AS UniqueID
FROM CaseTab
WHERE Code > 1
GROUP BY Case_Number)
SELECT c1.Case_Number, c1.UniqueID
FROM CaseTab c1 LEFT JOIN
v_code_gt_1 c2
ON (c1.Case_Number = c2.Case_Number)
WHERE c1.UniqueID > ISNULL(c2.UniqueID, 0)
AND c1.Code = 1;
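The difference between the two joins only shows up when a case has no row with Code greater than 1. The sketch below compares them using Python's sqlite3 module; case d is a hypothetical row added for that purpose, and COALESCE is used as the portable equivalent of ISNULL, which is specific to SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CaseTab (Case_Number TEXT, Code INTEGER, UniqueID INTEGER)")
conn.executemany(
    "INSERT INTO CaseTab VALUES (?, ?, ?)",
    [('a', 1, 1372), ('a', 2, 1352), ('a', 3, 1325),
     ('b', 1, 1642), ('b', 2, 1651), ('b', 3, 1623),
     ('c', 1, 1743), ('c', 2, 1739), ('c', 3, 1720),
     ('d', 1, 1800)],  # hypothetical case with Code 1 only
)

CTE = """
WITH v_code_gt_1 AS (
    SELECT Case_Number, MAX(UniqueID) AS UniqueID
    FROM CaseTab
    WHERE Code > 1
    GROUP BY Case_Number
)
"""

# Inner join: a case without any Code > 1 row has no partner and drops out.
inner_rows = conn.execute(CTE + """
    SELECT c1.Case_Number, c1.UniqueID
    FROM CaseTab c1
    JOIN v_code_gt_1 c2 ON c1.Case_Number = c2.Case_Number
    WHERE c1.UniqueID > c2.UniqueID AND c1.Code = 1
    ORDER BY c1.Case_Number
""").fetchall()

# Left join: the missing partner becomes NULL, which COALESCE turns into 0.
left_rows = conn.execute(CTE + """
    SELECT c1.Case_Number, c1.UniqueID
    FROM CaseTab c1
    LEFT JOIN v_code_gt_1 c2 ON c1.Case_Number = c2.Case_Number
    WHERE c1.UniqueID > COALESCE(c2.UniqueID, 0) AND c1.Code = 1
    ORDER BY c1.Case_Number
""").fetchall()

print(inner_rows)
print(left_rows)
```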