SQL for selecting values in a single column by 'AND' condition

I have table data like below:
PersonId | Eat
------------------
111        Carrot
111        Apple
111        Orange
222        Carrot
222        Apple
333        Carrot
444        Orange
555        Apple
I need an SQL query that returns the total number of PersonIds who eat both Carrot and Apple.
In the above example the result is 2 (PersonIds 111 and 222).
I am looking for an MS SQL query along the lines of 'select count(distinct PersonId) from Person where Eat = 'Carrot' and Eat = 'Apple'', although that exact query returns 0 because no single row can hold both values.

You can actually get the count without using a subquery to determine the persons who eat both. Assuming that the rows are unique:
select ( count(distinct case when eat = 'carrot' then personid end) +
count(distinct case when eat = 'apple' then personid end) -
count(distinct personid)
) as num_both
from t
where eat in ('carrot', 'apple')
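To see why this arithmetic gives the right answer, here is a minimal sketch that computes the three counts separately on the sample data. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (my assumption, only so the example runs anywhere), and the literals are capitalised because SQLite compares strings case-sensitively by default. By inclusion-exclusion, (carrot eaters) + (apple eaters) - (eaters of either) leaves exactly the people who were counted twice, i.e. those who eat both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (PersonId INTEGER, Eat TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [(111, "Carrot"), (111, "Apple"), (111, "Orange"),
     (222, "Carrot"), (222, "Apple"), (333, "Carrot"),
     (444, "Orange"), (555, "Apple")],
)

# The three building blocks of the single-query answer, fetched separately.
carrot, apple, either = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE WHEN Eat = 'Carrot' THEN PersonId END),
           COUNT(DISTINCT CASE WHEN Eat = 'Apple'  THEN PersonId END),
           COUNT(DISTINCT PersonId)
    FROM Person
    WHERE Eat IN ('Carrot', 'Apple')
    """
).fetchone()

num_both = carrot + apple - either
print(carrot, apple, either, num_both)  # 3 3 4 2
```

Here 3 people eat carrots, 3 eat apples and 4 eat at least one of the two, so 3 + 3 - 4 = 2 eat both.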

SELECT PersonID FROM Person WHERE Eat = 'Carrot'
INTERSECT
SELECT PersonID FROM Person WHERE Eat = 'Apple'

You can use conditional aggregation of a sort:
select
personid
from <yourtable>
group by
personid
having
count (case when eat = 'carrot' then 1 else null end) >= 1
and count (case when eat = 'apple' then 1 else null end) >= 1
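The query above lists the matching persons rather than counting them. A minimal sketch of how it could be wrapped to get the single number the question asks for, run through Python's built-in sqlite3 module as a stand-in engine (an assumption, not the asker's SQL Server); the table name Person is taken from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (PersonId INTEGER, Eat TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [(111, "Carrot"), (111, "Apple"), (111, "Orange"),
     (222, "Carrot"), (222, "Apple"), (333, "Carrot"),
     (444, "Orange"), (555, "Apple")],
)

# Inner query: one row per person who has at least one Carrot row
# and at least one Apple row. Outer query: count those persons.
both_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        SELECT PersonId
        FROM Person
        GROUP BY PersonId
        HAVING COUNT(CASE WHEN Eat = 'Carrot' THEN 1 END) >= 1
           AND COUNT(CASE WHEN Eat = 'Apple'  THEN 1 END) >= 1
    ) AS both_eaters
    """
).fetchone()[0]

print(both_count)  # 2
```

Because the grouping is per person, duplicate rows for the same person and food do not inflate the result.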

In this example, I use STRING_AGG (available from SQL Server 2017) to simplify the count by collapsing 'Apple' and 'Carrot' into a single string comparison:
create table #EatTemp
(
PersonId int,
Eat Varchar(50)
)
INSERT INTO #EatTemp VALUES
(111, 'Carrot')
,(111, 'Apple')
,(111, 'Orange')
,(222, 'Carrot')
,(222, 'Apple')
,(333, 'Carrot')
,(444, 'Orange')
,(555, 'Apple')
SELECT Count(PersonId) WhoEatCarrotAndApple FROM
(
SELECT PersonId,
STRING_AGG(Eat, ';')
WITHIN GROUP (ORDER BY Eat) Eat
FROM #EatTemp
WHERE Eat IN ('Apple', 'Carrot')
GROUP BY PersonId
) EatAgg
WHERE Eat = 'Apple;Carrot'

You can use EXISTS statements to achieve your goal. Below is a full set of code you can use to test the results. In this case, this returns a count of 2 since PersonId 111 and 222 match the criteria you specified in your post.
CREATE TABLE Person
( PersonId INT
, Eat VARCHAR(10));
INSERT INTO Person
VALUES
(111, 'Carrot'), (111, 'Apple'), (111, 'Orange'),
(222, 'Carrot'), (222, 'Apple'), (333, 'Carrot'),
(444, 'Orange'), (555, 'Apple');
SELECT COUNT(DISTINCT PersonId)
FROM Person AS p
WHERE EXISTS
(SELECT 1
FROM Person e1
WHERE e1.Eat = 'Apple'
AND p.PersonId = e1.PersonId)
AND EXISTS
(SELECT 1
FROM Person e1
WHERE e1.Eat = 'Carrot'
AND p.PersonId = e1.PersonId);
EXISTS statements have a few advantages:
No chance of changing the granularity of your data since you aren't joining in your FROM clause.
Easy to add additional conditions as needed. Just add more EXISTS statements in your WHERE clause.
The condition is cleanly encapsulated in the EXISTS, so code intent is clear.
If you ever need complex conditions like existence of a value in another table based on specific filter conditions, then you can easily add this without introducing table joins in your main query.
Some alternative solutions such as PersonId IN (SUBQUERY) can introduce unexpected behavior in certain conditions, particularly when the subquery returns a NULL value.
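The NULL caveat in the last point is easiest to see with the negated form. A minimal sketch, assuming one extra row with a NULL PersonId (not part of the original data) and using Python's built-in sqlite3 module as a stand-in engine: NOT IN matches nothing once the subquery yields a NULL, while NOT EXISTS still behaves as expected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (PersonId INTEGER, Eat TEXT)")
conn.executemany(
    "INSERT INTO Person VALUES (?, ?)",
    [(111, "Carrot"), (111, "Apple"), (111, "Orange"),
     (222, "Carrot"), (222, "Apple"), (333, "Carrot"),
     (444, "Orange"), (555, "Apple"),
     (None, "Orange")],  # hypothetical row with a NULL PersonId
)

# "People who do not eat Orange", written with NOT IN.
# The subquery returns 111, 444 and NULL; comparing against NULL is
# never true, so no row passes the filter.
not_in_count = conn.execute(
    """
    SELECT COUNT(DISTINCT PersonId)
    FROM Person
    WHERE PersonId NOT IN (SELECT PersonId FROM Person WHERE Eat = 'Orange')
    """
).fetchone()[0]

# The same question with NOT EXISTS is unaffected by the NULL row.
not_exists_count = conn.execute(
    """
    SELECT COUNT(DISTINCT p.PersonId)
    FROM Person AS p
    WHERE NOT EXISTS (SELECT 1
                      FROM Person AS e
                      WHERE e.Eat = 'Orange'
                        AND e.PersonId = p.PersonId)
    """
).fetchone()[0]

print(not_in_count, not_exists_count)  # 0 3
```

The NOT EXISTS version counts 222, 333 and 555, which is the answer most people would expect.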

select
count(PersonID)
from Person
where eat = 'Carrot'
and PersonID in (select PersonID
from Person
where eat = 'Apple');
This first selects the persons who eat apples, and then from that result counts those who eat carrots too.

SELECT COUNT (A.personID) FROM
(SELECT distinct PersonID FROM Person WHERE Eat = 'Carrot'
INTERSECT
SELECT distinct PersonID FROM Person WHERE Eat = 'Apple') as A

Related

Show fields only when other column does not contain nulls

I have a table that stores pets and a certain number of vaccines. In one column the identifier, in another column the name of the vaccine and in the third column, the date of completion. In case the date is null, it means that the pet has not received that vaccine yet.
The structure is as follows:
CREATE TABLE pets (
pet VARCHAR (10),
vaccine VARCHAR (50),
complete_date DATE
);
INSERT INTO pets VALUES ('DOG001', 'Adenovirus', '2021-01-03');
INSERT INTO pets VALUES ('DOG001', 'Parvovirus', '2021-02-03');
INSERT INTO pets VALUES ('DOG001', 'Leptospirosis', null);
INSERT INTO pets VALUES ('CAT774', 'Calcivirosis', '2021-01-06');
INSERT INTO pets VALUES ('CAT774', 'Panleukopenia', null);
INSERT INTO pets VALUES ('DOG002', 'Adenovirus', '2020-12-21');
INSERT INTO pets VALUES ('DOG002', 'Parvovirus', '2021-02-01');
INSERT INTO pets VALUES ('DOG002', 'Leptospirosis', '2021-03-01');
pet    | vaccine       | complete_date
--------------------------------------
DOG001   Adenovirus      2021-01-03
DOG001   Parvovirus      2021-02-03
DOG001   Leptospirosis   null
CAT774   Calcivirosis    2021-01-06
CAT774   Panleukopenia   null
DOG002   Adenovirus      2020-12-21
DOG002   Parvovirus      2021-02-01
DOG002   Leptospirosis   2021-03-01
What I need is a list of all the pets that do not have a null "date", considering all the vaccines.
In this example, the result should be simply 'DOG002' since it is the only animal with all its dates with non-null values.
A conditional aggregate in the HAVING would be one method:
SELECT Pet
FROM dbo.Pets
GROUP BY Pet
HAVING COUNT(CASE WHEN Complete_Date IS NULL THEN 1 END) = 0;
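For anyone who wants to try this outside SQL Server, here is a minimal sketch of the same HAVING clause using Python's built-in sqlite3 module (my choice of engine, so the dbo. schema prefix is dropped and dates are stored as text). The CASE has no ELSE, so it yields NULL for completed vaccines, and COUNT skips NULLs; what remains is the number of missing dates per pet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (pet TEXT, vaccine TEXT, complete_date TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?, ?)",
    [("DOG001", "Adenovirus", "2021-01-03"),
     ("DOG001", "Parvovirus", "2021-02-03"),
     ("DOG001", "Leptospirosis", None),
     ("CAT774", "Calcivirosis", "2021-01-06"),
     ("CAT774", "Panleukopenia", None),
     ("DOG002", "Adenovirus", "2020-12-21"),
     ("DOG002", "Parvovirus", "2021-02-01"),
     ("DOG002", "Leptospirosis", "2021-03-01")],
)

# Keep only the pets whose count of NULL dates is zero.
fully_vaccinated = conn.execute(
    """
    SELECT pet
    FROM pets
    GROUP BY pet
    HAVING COUNT(CASE WHEN complete_date IS NULL THEN 1 END) = 0
    """
).fetchall()

print(fully_vaccinated)  # [('DOG002',)]
```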
I think Larnu posted what you are looking for (+1)... BUT... just in case you want to see the pet's details.
Just another option is WITH TIES.
Select top 1 with ties *
From pets
order by sum(case when complete_date is null then 1 else 0 end) over (partition by pet)
SELECT DISTINCT Pet FROM Pets
WHERE Pet NOT IN (SELECT Pet FROM Pets WHERE Complete_Date IS NULL)
A CTE can also be used to achieve the above result:
with CTE as
(
select pet,
vaccine,
complete_date,
SUM(IIF(complete_date is null ,1,0)) over (PARTITION BY pet) as pet_flag
from pets
)
select distinct Pet from CTE where
pet_flag = 0

Find rows which have never satisfied a condition

Say I have a table of customers with three possible statuses: loan default, open loan, paid in full.
How can I find the customers who never defaulted?
Example: John and Alex had multiple loans with different statuses.
id | customer | status
----------------------
1 john default
1 john open
1 john paid
2 alex open
2 alex paid
John defaulted once and Alex never defaulted. A simple where status <> "default" attempt doesn't work because it incorrectly includes John's non-defaulted cases. The result should give me:
id | customer
-------------
2 alex
How can I find the customers who never defaulted?
You can use aggregation and having:
select id, customer
from t
group by id, customer
having sum(case when status = 'default' then 1 else 0 end) = 0;
The having clause counts the number of defaults for each customer and returns those customers with no defaults.
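A minimal runnable sketch of the aggregation approach on the sample data, using Python's built-in sqlite3 module as the engine (an assumption; the question does not name a database) and the table name t from the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, customer TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "john", "default"), (1, "john", "open"), (1, "john", "paid"),
     (2, "alex", "open"), (2, "alex", "paid")],
)

# Sum a 1 for every 'default' row per customer; keep customers whose sum is 0.
never_defaulted = conn.execute(
    """
    SELECT id, customer
    FROM t
    GROUP BY id, customer
    HAVING SUM(CASE WHEN status = 'default' THEN 1 ELSE 0 END) = 0
    """
).fetchall()

print(never_defaulted)  # [(2, 'alex')]
```

John is excluded because one of his three rows contributes a 1 to the sum, even though his other loans are fine.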
If you have a separate table of customers, I would recommend not exists:
select c.*
from customers c
where not exists (select 1
from t
where t.id = c.id and t.status = 'default'
);
Something like
select distinct `customer` from `customers`
where `customer` not in (
select `customer` from `customers` where `status` = 'default'
);
The ALL() operator with a correlated sub-query works here:
WITH cte AS (
SELECT * FROM (VALUES
(1, 'john', 'default'),
(1, 'john', 'open'),
(1, 'john', 'paid'),
(2, 'alex', 'open'),
(2, 'alex', 'paid')
) AS x(id, customer, status)
)
SELECT *
FROM cte AS a
WHERE 'default' <> ALL (
SELECT status
FROM cte AS b
WHERE a.id = b.id
);
If you want just user and/or id, do select distinct «your desired columns» instead of select *.

How to replace all non-zero values from column in select?

I need to replace non-zero values in a column within a select statement.
SELECT Status, Name, Car from Events;
I can do it like this:
SELECT Replace(Status, '1', 'Ready'), Name, Car from Events;
Or using Case/Update.
But I have numbers from -5 to 10, and writing a Replace or something similar for each case is not a good idea.
How can I combine a comparison with the replacement without updating the database?
Table looks like this:
Status Name Car
0 John Porsche
1 Bill Dodge
5 Megan Ford
The standard method is to use case:
select t.*,
(case when status = 1 then 'Ready'
else 'Something else'
end) as status_string
from t;
I would instead recommend, though, that you have a status reference table:
create table statuses (
status int primary key,
name varchar(255)
);
insert into statuses (status, name)
values (0, 'UNKNOWN'),
(1, 'READY'),
. . . -- for the rest of the statuses
Then use JOIN:
select t.*, s.name
from t join
statuses s
on t.status = s.status;
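A minimal sketch of the reference-table idea on the three sample rows, using Python's built-in sqlite3 module (my choice of engine). Only the two statuses named above are loaded. The LEFT JOIN with COALESCE and the fallback label 'OTHER' are my own additions for illustration: with a plain inner join, a row whose status has no entry in the reference table (Megan, status 5) would disappear from the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (Status INTEGER, Name TEXT, Car TEXT)")
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?)",
    [(0, "John", "Porsche"), (1, "Bill", "Dodge"), (5, "Megan", "Ford")],
)

conn.execute("CREATE TABLE statuses (status INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO statuses VALUES (?, ?)",
    [(0, "UNKNOWN"), (1, "READY")],  # remaining statuses deliberately left out
)

# LEFT JOIN keeps every event; COALESCE supplies a hypothetical fallback
# label for statuses that have no row in the reference table.
labelled = conn.execute(
    """
    SELECT e.Name, COALESCE(s.name, 'OTHER') AS status_name
    FROM Events AS e
    LEFT JOIN statuses AS s ON e.Status = s.status
    ORDER BY e.Status
    """
).fetchall()

print(labelled)  # [('John', 'UNKNOWN'), ('Bill', 'READY'), ('Megan', 'OTHER')]
```

Nothing in Events is updated; the labels exist only in the query result.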
SELECT IF(status =1, 'Approved', 'Pending') FROM TABLENAME

Combining select distinct with group and ordering

A simplified example for illustration: Consider a table "fruit" with 3 columns: name, count and the date purchased. Need an alphabetical list of the fruits and their count the last time they were bought. I am a bit confused by the order of sorting and how distinct is applied. My attempt -
drop table if exists fruit;
create table fruit (
name varchar(8),
count integer,
dateP datetime
);
insert into fruit (name, count, dateP) values
('apple', 4, '2014-03-18 16:24:37'),
('orange', 2, '2013-12-11 11:20:16'),
('apple', 7, '2014-07-05 08:34:21'),
('banana', 6, '2014-06-20 19:10:15'),
('orange', 6, '2014-07-22 17:41:12'),
('banana', 4, '2014-08-15 21:26:37'), -- last
('orange', 5, '2014-12-11 11:20:16'), -- last
('apple', 3, '2014-09-25 18:54:32'), -- last
('apple', 5, '2014-02-05 18:47:18'),
('apple', 12, '2013-09-25 14:18:57'),
('banana', 5, '2013-04-18 15:59:04'),
('apple', 9, '2014-01-29 11:47:45');
-- Expecting:
-- apple 3
-- banana 4
-- orange 5
select distinct name, count
from fruit
group by name
order by name, dateP;
-- Produces:
-- apple 9
-- banana 5
-- orange 5
Try this:-
select f1.name,f1.count
from
fruit f1
inner join
(select name,max(dateP) date_P from fruit group by name) f2
on f1.name = f2.name and f1.dateP = f2.date_P
order by f1.name
Try the following:
SELECT fruit.name, fruit.count, fruit.dateP
FROM fruit
INNER JOIN (
SELECT name, Max(dateP) AS lastPurchased
FROM fruit
GROUP BY name
) AS dt ON (dt.name = fruit.name AND dt.lastPurchased = fruit.dateP )
Here is a demo of this example on SQLFiddle.
When I faced a similar situation before, I resolved it as follows. It requires a primary key; in this case I have added UID.
SELECT a.Name,a.Count FROM Fruit a WHERE a.UID IN
(SELECT b.UID FROM Fruit b
WHERE b.Name = a.Name ORDER BY b.DateP Desc,b.UID DESC LIMIT 1)
This also handles the case where the same fruit was purchased twice at the exact same time, which is unlikely in this example but could come back to haunt you in a large-scale system. Ordering by UID as well means it picks the purchase most recently added to the table (assuming an incrementing primary key).
In SQLite 3.7.11 or later, you can use MAX/MIN to select from which record in a group other values are returned (but this requires that you have that maximum in the result):
SELECT name, count, MAX(dateP)
FROM fruit
GROUP BY name
ORDER BY name
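Since this behaviour is specific to SQLite, here is a minimal sketch that checks it with Python's built-in sqlite3 module (which bundles a much newer SQLite than 3.7.11). The column name count is quoted only as a precaution against clashing with the COUNT function. With a single MAX in the select list, the bare column takes its value from the same row that holds the maximum dateP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE fruit (name TEXT, "count" INTEGER, dateP TEXT)')
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?, ?)",
    [("apple", 4, "2014-03-18 16:24:37"),
     ("orange", 2, "2013-12-11 11:20:16"),
     ("apple", 7, "2014-07-05 08:34:21"),
     ("banana", 6, "2014-06-20 19:10:15"),
     ("orange", 6, "2014-07-22 17:41:12"),
     ("banana", 4, "2014-08-15 21:26:37"),
     ("orange", 5, "2014-12-11 11:20:16"),
     ("apple", 3, "2014-09-25 18:54:32"),
     ("apple", 5, "2014-02-05 18:47:18"),
     ("apple", 12, "2013-09-25 14:18:57"),
     ("banana", 5, "2013-04-18 15:59:04"),
     ("apple", 9, "2014-01-29 11:47:45")],
)

# MAX(dateP) must stay in the select list; it decides which row
# the bare "count" column is read from.
rows = conn.execute(
    """
    SELECT name, "count", MAX(dateP)
    FROM fruit
    GROUP BY name
    ORDER BY name
    """
).fetchall()

last_counts = [(name, cnt) for name, cnt, _ in rows]
print(last_counts)  # [('apple', 3), ('banana', 4), ('orange', 5)]
```

Other databases either reject this query or return an arbitrary row, so it should not be carried over to them.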
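If the nested Select clauses get hard to read, you can write them as Common Table Expressions instead. Whether that improves performance depends on the database engine, so measure before relying on it.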

SQL query getting data

In SQL Server 2000:
hello i have a table with the following structure:
sku brand product_name inventory_count
------ ------ ------------- ---------------
c001 honda honda car 1 3
t002 honda honda truck 1 6
c003 ford ford car 1 7
t004 ford ford truck 1 8
b005 honda honda bike 5 9
b006 ford ford bike 6 18
I'm using the following SQL query
select distinct left(sku,1) from products
this would return the following:
c
t
b
and then ...
c = car
t = truck
b = bike
this works great,
Now I want to get just one product example for each of the categories with the greatest INVENTORY_COUNT
so that it returns the data as:
c, "ford car 1"
t, "ford truck 1"
b, "ford bike 6"
what SQL query would i run to get that data??
i want the item with the greatest INVENTORY_COUNT for each category.. left(sku,1)
thanks!!
You could join the table on itself to filter out the rows with less than maximum inventory:
select left(a.sku,1), max(a.product_name), max(a.inventory_count)
from YourTable a
left join YourTable more_inv
on left(a.sku,1) = left(more_inv.sku,1)
and a.inventory_count < more_inv.inventory_count
where more_inv.sku is null
group by left(a.sku,1)
The WHERE condition on more_inv.sku is null filters out rows that don't have the highest inventory for their one letter category.
Once we're down to rows with the maximum inventory, you can use max() to get the inventory_count (it'll be the same for all rows) and another max() to get one of the products with the highest inventory_count. You could use min() too.
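A minimal sketch of this self-join on the sample rows, run through Python's built-in sqlite3 module. That is my substitution for SQL Server 2000; SQLite has no LEFT() function, so substr(sku, 1, 1) stands in for left(sku, 1), and the table is called products as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products "
    "(sku TEXT, brand TEXT, product_name TEXT, inventory_count INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [("c001", "honda", "honda car 1", 3),
     ("t002", "honda", "honda truck 1", 6),
     ("c003", "ford", "ford car 1", 7),
     ("t004", "ford", "ford truck 1", 8),
     ("b005", "honda", "honda bike 5", 9),
     ("b006", "ford", "ford bike 6", 18)],
)

# For each row a, look for a row in the same category with MORE inventory.
# If none exists (more_inv.sku IS NULL), a is a top row of its category.
top_per_category = conn.execute(
    """
    SELECT substr(a.sku, 1, 1) AS cat,
           MAX(a.product_name),
           MAX(a.inventory_count)
    FROM products AS a
    LEFT JOIN products AS more_inv
           ON substr(a.sku, 1, 1) = substr(more_inv.sku, 1, 1)
          AND a.inventory_count < more_inv.inventory_count
    WHERE more_inv.sku IS NULL
    GROUP BY substr(a.sku, 1, 1)
    ORDER BY cat
    """
).fetchall()

print(top_per_category)
# [('b', 'ford bike 6', 18), ('c', 'ford car 1', 7), ('t', 'ford truck 1', 8)]
```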
im using the following sql query which works,
SELECT DISTINCT left(field1,1) as cat , MAX(sku) as topproduct FROM products where inventory_count > 0 GROUP BY left(sku,1)
i just need to add in there an ..order by inventory_count
Using SQL Server 2005 you can try this
DECLARE @Table TABLE(
sku VARCHAR(50),
brand VARCHAR(50),
product_name VARCHAR(50),
inventory_count INT
)
INSERT INTO @Table SELECT 'c001', 'honda', 'honda car 1', 3
INSERT INTO @Table SELECT 't002', 'honda', 'honda truck 1', 6
INSERT INTO @Table SELECT 'c003', 'ford', 'ford car 1', 7
INSERT INTO @Table SELECT 't004', 'ford', 'ford truck 1', 8
INSERT INTO @Table SELECT 'b005', 'honda', 'honda bike 5', 9
INSERT INTO @Table SELECT 'b006', 'ford', 'ford bike 6', 18
SELECT LEFT(sku,1),
product_name
FROM (
SELECT *,
ROW_NUMBER() OVER( PARTITION BY LEFT(sku,1) ORDER BY inventory_count DESC) ORDERCOUNT
FROM @Table
) SUB
WHERE ORDERCOUNT = 1
OK Then you can try
SELECT LEFT(sku,1),
*
FROM @Table t INNER JOIN
(
SELECT LEFT(sku,1) c,
MAX(inventory_count) MaxNum
FROM @Table
GROUP BY LEFT(sku,1)
) sub ON LEFT(t.sku,1) = sub.c and t.inventory_count = sub.MaxNum
For mysql:
SELECT LEFT(sku,1), product_name FROM Table1 GROUP BY LEFT(sku,1)
For MS SQL 2005 (maybe also works in 2000?):
SELECT LEFT(sku,1), MAX(product_name) FROM Table1 GROUP BY LEFT(sku,1)
Try this
declare @t table (sku varchar(50),brand varchar(50),product_name varchar(50),inventory_count int)
insert into @t
select 'c001','honda','honda car 1',3 union all
select 't002','honda','honda truck 1',6 union all
select 'c004','ford','ford car 1',7 union all
select 't004','ford','ford truck 1',8 union all
select 'b005','honda','honda bike 5',9 union all
select 'b006','ford','ford bike 6',18
Query:
select
x.s + space(2) + ',' + space(2) + '"' + t.product_name + '"' as [Output]
from @t t
inner join
(
SELECT left(sku,1) as s,MAX(inventory_count) ic from @t
group by left(sku,1)
) x
on x.s = left(t.sku,1) and x.ic = t.inventory_count
--order by t.inventory_count desc
Output
c , "ford car 1"
t , "ford truck 1"
b , "ford bike 6"
In general, might there not be more than one item with max(inventory_count)?
To get max inventory per category, use a subquery (syntax will depend on your database):
SELECT LEFT(sku,1) as category, MAX(inventory_count) as c
FROM Table1
GROUP BY LEFT(sku,1)
ORDER BY LEFT(sku,1)
This will give you a table of max_inventory by category, thus:
b,18
c,7
t,8
So now you know the max per category. To get matching products, use this result as
a subquery and find all products in the given category that match the given max(inventory_count):
SELECT t1.*
FROM Table1 AS t1,
(SELECT LEFT(sku,1) AS category, MAX(inventory_count) AS c
FROM Table1
GROUP BY LEFT(sku,1)
) AS t2
WHERE LEFT(t1.sku,1) = t2.category AND t2.c = t1.inventory_count
Sorry, the code above may/may not work in your database, but hope you get the idea.
Bill
PS -- probably not helpful, but the table design isn't really helping you here. If you have control over the schema, would help to separate this into multiple tables.
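To illustrate the point about ties raised above, here is a minimal sketch of the two-step approach (max per category, then join back) using Python's built-in sqlite3 module. The engine is my choice, substr(sku, 1, 1) stands in for LEFT(sku, 1), the table is named products as in the question, and the row 'c009' is a hypothetical addition that ties with 'c003' at 7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products "
    "(sku TEXT, brand TEXT, product_name TEXT, inventory_count INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [("c001", "honda", "honda car 1", 3),
     ("t002", "honda", "honda truck 1", 6),
     ("c003", "ford", "ford car 1", 7),
     ("t004", "ford", "ford truck 1", 8),
     ("b005", "honda", "honda bike 5", 9),
     ("b006", "ford", "ford bike 6", 18),
     ("c009", "honda", "honda car 2", 7)],  # hypothetical tie in category c
)

# Step 1 (subquery): max inventory per one-letter category.
# Step 2 (join): every product whose inventory equals its category's max.
top_products = conn.execute(
    """
    SELECT t1.sku, t1.product_name
    FROM products AS t1
    JOIN (SELECT substr(sku, 1, 1) AS category,
                 MAX(inventory_count) AS c
          FROM products
          GROUP BY substr(sku, 1, 1)) AS t2
      ON substr(t1.sku, 1, 1) = t2.category
     AND t1.inventory_count = t2.c
    ORDER BY t1.sku
    """
).fetchall()

print(top_products)
# [('b006', 'ford bike 6'), ('c003', 'ford car 1'),
#  ('c009', 'honda car 2'), ('t004', 'ford truck 1')]
```

Category c comes back twice, so if exactly one product per category is required, an extra tie-breaker (for example the smallest sku) has to be added.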