How to select an attribute based on string value within a group - sql

Table name: Copies
+----------+-------+-------+
| group_id | my_id | stuff |
+----------+-------+-------+
| 900      | 1     | Y     |
| 900      | 2     | N     |
| 901      | 3     | Y     |
| 901      | 4     | Y     |
| 902      | 5     | N     |
| 902      | 6     | N     |
| 903      | 7     | N     |
| 903      | 8     | Y     |
+----------+-------+-------+
The output should be:
+----------+-------+-------+
| group_id | my_id | stuff |
+----------+-------+-------+
| 900      | 1     | Y     |
| 903      | 8     | Y     |
+----------+-------+-------+
Hello, I have a table where I have to identify the 'good' record within each group_id based on a positive (Y) value in the stuff field. I need the full record, but only for groups where exactly one row fits this criterion. If both stuff values are Y or both are N, then neither row should be selected. It seems like this should be simple, but I am not sure how to proceed.

One option here is to use conditional aggregation over each group_id and retain a group if it has a mixture of yes and no answers.
WITH cte AS (
SELECT group_id
FROM Copies
GROUP BY group_id
HAVING SUM(CASE WHEN stuff = 'Y' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN stuff = 'N' THEN 1 ELSE 0 END) > 0
)
SELECT c1.*
FROM Copies c1
INNER JOIN cte c2
ON c1.group_id = c2.group_id
WHERE c1.stuff = 'Y'
One advantage of this solution is that it will show all columns of matching records.
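As a quick check, the conditional aggregation query can be run against the sample data. This is a minimal sketch that assumes SQLite through Python's standard sqlite3 module (the question does not name a database); the variable name mixed_rows and the explicit ORDER BY are additions for the example.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Copies (group_id INTEGER, my_id INTEGER, stuff TEXT)")
conn.executemany(
    "INSERT INTO Copies VALUES (?, ?, ?)",
    [(900, 1, "Y"), (900, 2, "N"), (901, 3, "Y"), (901, 4, "Y"),
     (902, 5, "N"), (902, 6, "N"), (903, 7, "N"), (903, 8, "Y")],
)

# Keep only groups that contain both a 'Y' and an 'N', then return their 'Y' rows
query = """
WITH cte AS (
    SELECT group_id
    FROM Copies
    GROUP BY group_id
    HAVING SUM(CASE WHEN stuff = 'Y' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN stuff = 'N' THEN 1 ELSE 0 END) > 0
)
SELECT c1.group_id, c1.my_id, c1.stuff
FROM Copies c1
INNER JOIN cte c2 ON c1.group_id = c2.group_id
WHERE c1.stuff = 'Y'
ORDER BY c1.group_id
"""
mixed_rows = conn.execute(query).fetchall()
print(mixed_rows)  # [(900, 1, 'Y'), (903, 8, 'Y')]
```

Groups 901 (all Y) and 902 (all N) are dropped by the HAVING clause, which matches the expected output in the question.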

-- Oracle syntax: KEEP (DENSE_RANK FIRST ...) picks my_id from the 'Y' row
select group_id,
       min(my_id)
         keep (dense_rank first order by case stuff when 'Y' then 0 end) as my_id,
       'Y' as stuff
from copies
group by group_id
having min(stuff) != max(stuff)

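-- Count the 'Y' rows per group (not per row), then keep the 'Y' row
-- of every group that has exactly one. Assumes two rows per group.
with y_counts as (
  select group_id, sum(case when stuff = 'Y' then 1 else 0 end) as c
  from copies
  group by group_id)
select c.*
from copies c inner join y_counts r on c.group_id = r.group_id
where r.c = 1 and c.stuff = 'Y';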

Try this:
SELECT C.*
FROM COPIES C,
COPIES C2
WHERE C.STUFF='Y'
AND C2.STUFF='N'
AND C.GROUP_ID=C2.GROUP_ID

Try this:
SELECT t1.*
FROM copies t1
JOIN (
SELECT group_id
FROM copies
GROUP BY group_id
HAVING COUNT(CASE WHEN stuff = 'Y' THEN 1 END) = 1 AND
COUNT(CASE WHEN stuff = 'N' THEN 1 END) = 1
) t2 ON t1.group_id = t2.group_id
WHERE t1.stuff = 'Y'
This works as long as each group_id has exactly two rows.

Related

SQL to get count of distinct rows based on different rules

Say you have a table like:
| key | status |
| --- | ------ |
| 3 | A |
| 4 | A |
| 4 | C |
| 5 | B |
| 6 | B |
| 6 | C |
| 7 | A |
| 7 | B |
I want a query that returns, in a single row, the number of keys that have a specific status, but applying some priority rules. The rules would be different for each column, something like:
Column a_count = count of any distinct key that has a status of A
Column b_count = count of any distinct key that has a status of B, but where the same key does not also appear with a status of A
Column c_count = count of any distinct key that has a status of C, but where the same key does not also appear with a status of A or B
The point being that the total of all counts should equal the total number of distinct keys in the source table. In my sample data above, the results should be:
| a_count | b_count | c_count |
| ------- | ------- | ------- |
| 3 | 2 | 0 |
You should be able to do your pivot with CASE expressions and NOT EXISTS. COUNT counts every non-NULL value, so the CASE must return NULL (no ELSE 0) for rows that don't match, and counting the distinct key gives one count per key rather than per row.
SELECT Count (DISTINCT CASE
                WHEN status = 'A' THEN a.KEY
              END) AS a_count,
       Count (DISTINCT CASE
                WHEN status = 'B'
                     AND NOT EXISTS (SELECT 1
                                     FROM mytable b
                                     WHERE a.KEY = b.KEY
                                           AND b.status = 'A') THEN a.KEY
              END) AS b_count,
       Count (DISTINCT CASE
                WHEN status = 'C'
                     AND NOT EXISTS (SELECT 1
                                     FROM mytable c
                                     WHERE a.KEY = c.KEY
                                           AND c.status IN ( 'A', 'B' )) THEN a.KEY
              END) AS c_count
FROM mytable a
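An alternative way to get the same counts is to work out each key's highest-priority status first and then count keys per status. The sketch below is not the query from the answer; it assumes SQLite via Python's sqlite3 module, and it relies on the fact that the status letters sort in priority order ('A' < 'B' < 'C'), so MIN(status) picks the winning status. The names best and top_status are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" is quoted because KEY is a keyword in many SQL dialects
conn.execute('CREATE TABLE mytable ("key" INTEGER, status TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(3, "A"), (4, "A"), (4, "C"), (5, "B"),
     (6, "B"), (6, "C"), (7, "A"), (7, "B")],
)

# Step 1: one row per key with its highest-priority status
# (assumption: alphabetical order of status equals priority order).
# Step 2: count how many keys ended up with each status.
query = """
WITH best AS (
    SELECT "key", MIN(status) AS top_status
    FROM mytable
    GROUP BY "key"
)
SELECT SUM(CASE WHEN top_status = 'A' THEN 1 ELSE 0 END) AS a_count,
       SUM(CASE WHEN top_status = 'B' THEN 1 ELSE 0 END) AS b_count,
       SUM(CASE WHEN top_status = 'C' THEN 1 ELSE 0 END) AS c_count
FROM best
"""
counts = conn.execute(query).fetchone()
print(counts)  # (3, 2, 0)
```

Because every key lands in exactly one bucket, the three counts always add up to the number of distinct keys, which is the property the question asks for.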

Suggest SQL query for given use case

Original Table
Id | Time | Status
------------------
1 | 5 | T
1 | 6 | F
2 | 3 | F
1 | 2 | F
2 | 4 | T
3 | 7 | F
2 | 3 | T
3 | 1 | F
4 | 7 | H
4 | 6 | S
4 | 5 | F
4 | 4 | T
5 | 5 | S
5 | 6 | F
Expected Table
Id | Time | Status
------------------
1 | 6 | F
3 | 7 | F
4 | 5 | F
I want all the distinct ids that have status F, taking the row with the maximum time; if for any id the status at that maximum time is T, then that id should not be picked. Also, only those ids that have at least one T should be picked. For example, 4 will not be picked as it doesn't have any 'T' as status.
Please help in writing the SQL query.
You can use EXISTS and NOT EXISTS in the WHERE clause:
select t.*
from tablename t
where t.status = 'F'
and exists (select 1 from tablename where id = t.id and status = 'T')
and not exists (
select 1
from tablename
where id = t.id and status in ('F', 'T') and time > t.time
)
Results:
| Id | Time | Status |
| --- | ---- | ------ |
| 1 | 6 | F |
| 4 | 5 | F |
Try the below way -
select * from tablename t
where time = (select max(time) from tablename t1 where t.id=t1.id and Status='F')
and Status='F'
The following should work:
select id, max(time) as time, status
from tablename
where status = 'F'
group by id, status
select id, max(time), status
from stuff s
where status = 'F'
and id not in (
select id
from stuff s2
where s2.id = s.id
and s2.time > s.time
and s2.status = 'T')
group by id, status;
As I understand it, you want to find the highest time for each ID (max(time)) where the status is F, but only if there isn't a later record where the status is 'T'. The sub query filters out records where there exists a later record where the status is T.
WITH MAX_TIME_ID AS (
    SELECT
        ID
        ,MAX(TIME) AS MAX_TIME
    FROM
        ORIGINAL_TABLE
    GROUP BY
        ID
)
SELECT
    O.*
FROM
    ORIGINAL_TABLE O
INNER JOIN
    MAX_TIME_ID M
ON
    O.ID = M.ID
    AND O.TIME = M.MAX_TIME  -- keep only the row at each ID's latest time
WHERE
    O.STATUS = 'F'
The CTE will find the max time for each ID and the inner join with the where clause on the status will select it only if the latest is 'F'.
I would just use window functions:
select t.*
from (select t.*,
row_number() over (partition by id order by time desc) as seqnum,
sum(case when status = 'T' then 1 else 0 end) over (partition by id) as num_t
from t
) t
where num_t > 0 and
seqnum = 1 and status = 'F';
There is another fun way to do this just with aggregation:
select id, max(time) as time, 'F' as status
from t
group by id
having sum(case when status = 'T' then 1 else 0 end) > 0 and
       max(time) = max(case when status = 'F' then time end);
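The answers above do not all treat statuses other than F and T the same way, which matters for id 4 in the sample data. The following sketch runs the EXISTS/NOT EXISTS approach and the aggregation approach side by side. It assumes SQLite through Python's sqlite3 module and a table named t; the aliases o, x, y and max_time are only there for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, time INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 5, "T"), (1, 6, "F"), (2, 3, "F"), (1, 2, "F"), (2, 4, "T"),
     (3, 7, "F"), (2, 3, "T"), (3, 1, "F"), (4, 7, "H"), (4, 6, "S"),
     (4, 5, "F"), (4, 4, "T"), (5, 5, "S"), (5, 6, "F")],
)

# Approach 1: latest row among F/T statuses only; later H or S rows are ignored
exists_query = """
SELECT o.id, o.time, o.status
FROM t AS o
WHERE o.status = 'F'
  AND EXISTS (SELECT 1 FROM t AS x WHERE x.id = o.id AND x.status = 'T')
  AND NOT EXISTS (SELECT 1 FROM t AS y
                  WHERE y.id = o.id
                    AND y.status IN ('F', 'T')
                    AND y.time > o.time)
ORDER BY o.id
"""

# Approach 2: the overall latest row (of any status) must be an F row
aggregate_query = """
SELECT id, MAX(time) AS max_time, 'F' AS status
FROM t
GROUP BY id
HAVING SUM(CASE WHEN status = 'T' THEN 1 ELSE 0 END) > 0
   AND MAX(time) = MAX(CASE WHEN status = 'F' THEN time END)
ORDER BY id
"""

exists_rows = conn.execute(exists_query).fetchall()
aggregate_rows = conn.execute(aggregate_query).fetchall()
print(exists_rows)     # [(1, 6, 'F'), (4, 5, 'F')]
print(aggregate_rows)  # [(1, 6, 'F')]
```

Id 4 has an H row at time 7, so it only survives when rows with other statuses are ignored. Which behaviour is wanted depends on how the requirement is read.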

Group by a column and display the value of the column that matches the condition

I have to GROUP BY a column and, if there is more than one entry for it, I need to display the one that satisfies the condition. If there is only one entry, it should be displayed too.
ID | Name | GroupId
1 | A | x
2 | A | y
3 | B | x
4 | C | z
5 | A | z
6 | B | y
Condition: COUNT(GroupId) > 1 then display y
Expected result:
Name | GroupId
A | y
B | y
C | z
I have found answers that use an inner query. Is it possible to do this without inner queries?
Note: If there are two or more records for a name and none of them has 'y', then 'x' has to be displayed even if it is not present.
With this:
select
name,
case
when count(distinct groupid) = 1 then max(groupid)
when sum(case when groupid = 'y' then 1 end) > 0 then 'y'
else 'x'
end groupid
from tablename
group by name
For:
CREATE TABLE tablename (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT,
groupid TEXT
);
INSERT INTO tablename (id,name,groupid) VALUES
(1,'A','x'),
(2,'A','y'),
(3,'B','x'),
(4,'C','z'),
(5,'A','z'),
(6,'B','y'),
(7,'D','k'),
(8,'D','m');
The results are:
| name | groupid |
| ---- | ------- |
| A | y |
| B | y |
| C | z |
| D | x |
You describe:
select t.name,
       (case when count(*) > 1 then 'y'
             else max(groupid)
        end) as groupid
from t
group by t.name;
But I think you really want:
select t.name,
       (case when min(groupid) <> max(groupid) then 'y'
             else max(groupid)
        end) as groupid
from t
group by t.name;
You can try below -
select name, case when count(groupid) > 1 then 'y' else min(groupid) end as groupid
from tablename a
group by name
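To see how the MIN/MAX comparison behaves on the six rows from the question, here is a small sketch. It assumes SQLite via Python's sqlite3 module; the output column name chosen is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, name TEXT, groupid TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(1, "A", "x"), (2, "A", "y"), (3, "B", "x"),
     (4, "C", "z"), (5, "A", "z"), (6, "B", "y")],
)

# If a name has more than one distinct groupid, MIN and MAX differ and we
# show 'y'; otherwise MIN = MAX and that single value is shown.
query = """
SELECT name,
       CASE WHEN MIN(groupid) <> MAX(groupid) THEN 'y'
            ELSE MAX(groupid)
       END AS chosen
FROM tablename
GROUP BY name
ORDER BY name
"""
result = conn.execute(query).fetchall()
print(result)  # [('A', 'y'), ('B', 'y'), ('C', 'z')]
```

Note that this variant returns 'y' for any name with differing groupid values, even when none of them is actually 'y'. If 'x' is required in that case, a separate check for the presence of 'y' is needed.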

SQL select distinct when one column in and another column greater than

Consider the following dataset:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 1 | a | 0.2 |
| 1 | b | 8 |
| 1 | c | 3.5 |
| 1 | d | 2.2 |
| 2 | b | 4 |
| 2 | c | 0.5 |
| 2 | d | 6 |
| 3 | a | 2 |
| 3 | b | 4 |
| 3 | c | 3.6 |
| 3 | d | 0.2 |
+---------------------+
I'm trying to develop a SQL select statement that returns the top or distinct ID where NAME 'a' and 'b' both exist and both of the corresponding VALUEs are >= 1. Thus, the desired output would be:
+----+------+-------+
| ID | NAME | VALUE |
+----+------+-------+
| 3  | a    | 2     |
+----+------+-------+
Appreciate any assistance anyone can provide.
You can try using the MIN window function with some conditions.
SELECT * FROM (
    SELECT *,
        MIN(CASE WHEN NAME = 'a' THEN [value] end) OVER(PARTITION BY ID) aVal,
        MIN(CASE WHEN NAME = 'b' THEN [value] end) OVER(PARTITION BY ID) bVal
    FROM T
) t1
WHERE aVal >= 1 and bVal >= 1 and NAME = 'a'
This seems like a group by and having query:
select id
from t
where name in ('a', 'b')
group by id
having count(*) = 2 and
       min(value) >= 1;
No subqueries or joins are necessary.
The where clause filters the data to only look at the "a" and "b" records. The count(*) = 2 checks that both exist. If you can have duplicates, then use count(distinct name) = 2.
Then, you want the minimum value to be at least 1, so that is the final condition.
I am not sure why your desired results have the "a" row, but if you really want it, you can change the select to:
select id, 'a' as name,
max(case when name = 'a' then value end) as value
You can use IN with a subquery:
select top 1 * from t
where t.id in
(
select id from t
where name in ('a','b')
group by id
having sum(case when value >= 1 then 1 else 0 end) >= 2
)
order by id
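The GROUP BY / HAVING idea can be checked against the sample data. This is a sketch that assumes SQLite via Python's sqlite3 module and a table named t; the aliases row_name and a_value are chosen for the example so they do not clash with the real column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 0.2), (1, "b", 8), (1, "c", 3.5), (1, "d", 2.2),
     (2, "b", 4), (2, "c", 0.5), (2, "d", 6),
     (3, "a", 2), (3, "b", 4), (3, "c", 3.6), (3, "d", 0.2)],
)

# Look only at the 'a' and 'b' rows, require both names per id,
# and require the smaller of the two values to be at least 1.
query = """
SELECT id,
       'a' AS row_name,
       MAX(CASE WHEN name = 'a' THEN value END) AS a_value
FROM t
WHERE name IN ('a', 'b')
GROUP BY id
HAVING COUNT(DISTINCT name) = 2
   AND MIN(value) >= 1
ORDER BY id
"""
matches = conn.execute(query).fetchall()
print(matches)  # [(3, 'a', 2.0)]
```

ID 1 fails because its 'a' value is 0.2, and ID 2 fails because it has no 'a' row, so only ID 3 remains.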

Aggregation for multiple SQL SELECT statements

I've got a table TABLE1 like this:
|--------------|--------------|--------------|
| POS | TYPE | VOLUME |
|--------------|--------------|--------------|
| 1 | A | 34 |
| 2 | A | 2 |
| 1 | A | 12 |
| 3 | B | 200 |
| 4 | C | 1 |
|--------------|--------------|--------------|
I want to get something like this (TABLE2):
|--------------|--------------|--------------|--------------|--------------|
| POS | Amount_A | Amount_B | Amount_C | Sum_Volume |
|--------------|--------------|--------------|--------------|--------------|
| 1 | 2 | 0 | 0 | 46 |
| 2 | 1 | 0 | 0 | 2 |
| 3 | 0 | 1 | 0 | 200 |
| 4 | 0 | 0 | 1 | 1 |
|--------------|--------------|--------------|--------------|--------------|
My Code so far is:
SELECT
(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'A') AS [Amount_A]
,(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'B') AS [Amount_B]
,(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'C') AS [Amount_C]
,(SELECT SUM(VOLUME)
FROM TABLE1) AS [Sum_Volume]
INTO [TABLE2]
Now two Questions:
How can I include the distinction concerning POS?
Is there any better way to count each TYPE?
I am using MSSQLServer.
What you're looking for is to use GROUP BY, along with your Aggregate functions. So, this results in:
USE Sandbox;
GO
CREATE TABLE Table1 (Pos tinyint, [Type] char(1), Volume smallint);
INSERT INTO Table1
VALUES (1,'A',34 ),
(2,'A',2 ),
(1,'A',12 ),
(3,'B',200),
(4,'C',1 );
GO
SELECT Pos,
COUNT(CASE WHEN [Type] = 'A' THEN [Type] END) AS Amount_A,
COUNT(CASE WHEN [Type] = 'B' THEN [Type] END) AS Amount_B,
COUNT(CASE WHEN [Type] = 'C' THEN [Type] END) AS Amount_C,
SUM(Volume) As Sum_Volume
FROM Table1 T1
GROUP BY Pos;
DROP TABLE Table1;
GO
If you have a variable, and undefined, number of values for [Type], then you're most likely going to need to use Dynamic SQL.
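To illustrate the idea of generating the pivot columns at run time, here is a rough sketch. It is not T-SQL dynamic SQL: it assumes SQLite driven from Python's sqlite3 module, builds one COUNT(CASE ...) column per distinct type found in the table, and binds the type values as parameters. The alphanumeric check on the alias is an assumption made for the example, to keep unexpected characters out of the generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (pos INTEGER, type TEXT, volume INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "A", 34), (2, "A", 2), (1, "A", 12), (3, "B", 200), (4, "C", 1)],
)

# Discover the type values at run time instead of hard-coding them
types = [row[0] for row in
         conn.execute("SELECT DISTINCT type FROM table1 ORDER BY type")]

# One conditional COUNT per type; the value is bound with "?",
# only the (validated) column alias is interpolated into the SQL text.
columns = []
for type_value in types:
    if not type_value.isalnum():
        raise ValueError(f"unexpected type value: {type_value!r}")
    columns.append(
        f'COUNT(CASE WHEN type = ? THEN 1 END) AS "Amount_{type_value}"'
    )

query = (
    f"SELECT pos, {', '.join(columns)}, SUM(volume) AS Sum_Volume "
    "FROM table1 GROUP BY pos ORDER BY pos"
)
cursor = conn.execute(query, types)
header = [col[0] for col in cursor.description]
pivot = cursor.fetchall()
print(header)  # ['pos', 'Amount_A', 'Amount_B', 'Amount_C', 'Sum_Volume']
print(pivot)   # [(1, 2, 0, 0, 46), (2, 1, 0, 0, 2), (3, 0, 1, 0, 200), (4, 0, 0, 1, 1)]
```

In SQL Server the same pattern would typically be built as a string and executed with dynamic SQL; the structure of the generated query is the same.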
Your first column should be POS, and you'll GROUP BY POS.
This will give you one row for each POS value, and the aggregates (COUNT and SUM) are calculated per POS accordingly.
You can also use CASE statements instead of subselects. For instance, instead of:
(SELECT COUNT(TYPE)
FROM TABLE1
WHERE TYPE = 'A') AS [Amount_A]
use:
COUNT(CASE WHEN TYPE = 'A' then 1 else NULL END) AS [Amount_A]