Show fields only when other column does not contain nulls - sql

I have a table that stores pets and the vaccines each one needs. One column holds the pet identifier, another the name of the vaccine, and the third the date the vaccine was completed. A null date means the pet has not received that vaccine yet.
The structure is as follows:
CREATE TABLE pets (
pet VARCHAR (10),
vaccine VARCHAR (50),
complete_date DATE
);
INSERT INTO pets VALUES ('DOG001', 'Adenovirus', '2021-01-03');
INSERT INTO pets VALUES ('DOG001', 'Parvovirus', '2021-02-03');
INSERT INTO pets VALUES ('DOG001', 'Leptospirosis', null);
INSERT INTO pets VALUES ('CAT774', 'Calcivirosis', '2021-01-06');
INSERT INTO pets VALUES ('CAT774', 'Panleukopenia', null);
INSERT INTO pets VALUES ('DOG002', 'Adenovirus', '2020-12-21');
INSERT INTO pets VALUES ('DOG002', 'Parvovirus', '2021-02-01');
INSERT INTO pets VALUES ('DOG002', 'Leptospirosis', '2021-03-01');
pet     vaccine        complete_date
DOG001  Adenovirus     2021-01-03
DOG001  Parvovirus     2021-02-03
DOG001  Leptospirosis  null
CAT774  Calcivirosis   2021-01-06
CAT774  Panleukopenia  null
DOG002  Adenovirus     2020-12-21
DOG002  Parvovirus     2021-02-01
DOG002  Leptospirosis  2021-03-01
What I need is a list of all the pets that have no null complete_date across all of their vaccines.
In this example, the result should be just 'DOG002', since it is the only animal whose dates are all non-null.

A conditional aggregate in the HAVING would be one method:
SELECT Pet
FROM dbo.Pets
GROUP BY Pet
HAVING COUNT(CASE WHEN Complete_Date IS NULL THEN 1 END) = 0;
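To try the conditional aggregate quickly, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite and the function name pets_fully_vaccinated are my own choices for illustration (the question targets SQL Server), so the dbo. schema prefix is dropped.

```python
import sqlite3

def pets_fully_vaccinated():
    """Return the pets whose vaccines all have a non-null complete_date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pets (
            pet VARCHAR(10),
            vaccine VARCHAR(50),
            complete_date DATE
        );
        INSERT INTO pets VALUES
            ('DOG001', 'Adenovirus', '2021-01-03'),
            ('DOG001', 'Parvovirus', '2021-02-03'),
            ('DOG001', 'Leptospirosis', NULL),
            ('CAT774', 'Calcivirosis', '2021-01-06'),
            ('CAT774', 'Panleukopenia', NULL),
            ('DOG002', 'Adenovirus', '2020-12-21'),
            ('DOG002', 'Parvovirus', '2021-02-01'),
            ('DOG002', 'Leptospirosis', '2021-03-01');
    """)
    # COUNT ignores NULLs, so the CASE without ELSE counts only the
    # rows where complete_date is missing.
    rows = conn.execute("""
        SELECT pet
        FROM pets
        GROUP BY pet
        HAVING COUNT(CASE WHEN complete_date IS NULL THEN 1 END) = 0
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(pets_fully_vaccinated())  # ['DOG002']
```

The CASE has no ELSE branch on purpose: rows with a date yield NULL, which COUNT skips, so the count is exactly the number of missing dates per pet.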

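I think Larnu posted what you are looking for (+1), but just in case you want to see the pet's details, another option is WITH TIES.
Note that this returns the pets with the fewest NULL dates, so it only matches the requirement when at least one pet has none.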
Select top 1 with ties *
From pets
order by sum(case when complete_date is null then 1 else 0 end) over (partition by pet)

SELECT DISTINCT Pet FROM Pets
WHERE Pet NOT IN (SELECT Pet FROM Pets WHERE Complete_Date IS NULL)

A CTE can also be used to achieve the same result:
with CTE as
(
select pet,
vaccine,
complete_date,
SUM(IIF(complete_date is null ,1,0)) over (PARTITION BY pet) as pet_flag
from pets
)
select distinct Pet from CTE where
pet_flag = 0

Related

SQL for selecting values in a single column by 'AND' condition

I have table data like the following:
PersonId  Eat
111       Carrot
111       Apple
111       Orange
222       Carrot
222       Apple
333       Carrot
444       Orange
555       Apple
I need an SQL query which returns the total number of PersonIds who eat both Carrot and Apple.
In the above example the result is 2 (PersonIds 111 and 222).
I am after an MS SQL query along the lines of 'select count(distinct PersonId) from Person where Eat = 'Carrot' and Eat = 'Apple''.
You can actually get the count without using a subquery to determine the persons who eat both. Assuming that the rows are unique:
select ( count(distinct case when eat = 'carrot' then personid end) +
count(distinct case when eat = 'apple' then personid end) -
count(distinct personid)
) as num_both
from t
where eat in ('carrot', 'apple')
SELECT PersonID FROM Person WHERE Eat = 'Carrot'
INTERSECT
SELECT PersonID FROM Person WHERE Eat = 'Apple'
You can use conditional aggregation of a sort:
select
personid
from <yourtable>
group by
personid
having
count (case when eat = 'carrot' then 1 else null end) >= 1
and count (case when eat = 'apple' then 1 else null end) >= 1
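Since the question asks for a count rather than the list of persons, here is a small sketch that wraps the HAVING query above in an outer COUNT and runs it against the sample data. It uses an in-memory SQLite database via Python's sqlite3 module, which is my assumption for testing only; the function name is hypothetical. SQLite compares strings case-sensitively by default, so the literals are written as 'Carrot' and 'Apple' to match the data.

```python
import sqlite3

def count_carrot_and_apple_eaters():
    """Count the persons who have at least one Carrot row and one Apple row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Person (PersonId INT, Eat VARCHAR(10));
        INSERT INTO Person VALUES
            (111, 'Carrot'), (111, 'Apple'), (111, 'Orange'),
            (222, 'Carrot'), (222, 'Apple'), (333, 'Carrot'),
            (444, 'Orange'), (555, 'Apple');
    """)
    # The inner query keeps one row per person who satisfies both
    # conditions; the outer COUNT(*) turns that list into a number.
    (total,) = conn.execute("""
        SELECT COUNT(*)
        FROM (
            SELECT PersonId
            FROM Person
            GROUP BY PersonId
            HAVING COUNT(CASE WHEN Eat = 'Carrot' THEN 1 END) >= 1
               AND COUNT(CASE WHEN Eat = 'Apple' THEN 1 END) >= 1
        ) AS both_eaters
    """).fetchone()
    conn.close()
    return total

print(count_carrot_and_apple_eaters())  # 2
```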
In this example, I use STRING_AGG to simplify the count by collapsing 'Apple' and 'Carrot' into a single string comparison:
create table #EatTemp
(
PersonId int,
Eat Varchar(50)
)
INSERT INTO #EatTemp VALUES
(111, 'Carrot')
,(111, 'Apple')
,(111, 'Orange')
,(222, 'Carrot')
,(222, 'Apple')
,(333, 'Carrot')
,(444, 'Orange')
,(555, 'Apple')
SELECT Count(PersonId) WhoEatCarrotAndApple FROM
(
SELECT PersonId,
STRING_AGG(Eat, ';')
WITHIN GROUP (ORDER BY Eat) Eat
FROM #EatTemp
WHERE Eat IN ('Apple', 'Carrot')
GROUP BY PersonId
) EatAgg
WHERE Eat = 'Apple;Carrot'
You can use EXISTS statements to achieve your goal. Below is a full set of code you can use to test the results. In this case, this returns a count of 2 since PersonId 111 and 222 match the criteria you specified in your post.
CREATE TABLE Person
( PersonId INT
, Eat VARCHAR(10));
INSERT INTO Person
VALUES
(111, 'Carrot'), (111, 'Apple'), (111, 'Orange'),
(222, 'Carrot'), (222, 'Apple'), (333, 'Carrot'),
(444, 'Orange'), (555, 'Apple');
SELECT COUNT(DISTINCT PersonId)
FROM Person AS p
WHERE EXISTS
(SELECT 1
FROM Person e1
WHERE e1.Eat = 'Apple'
AND p.PersonId = e1.PersonId)
AND EXISTS
(SELECT 1
FROM Person e1
WHERE e1.Eat = 'Carrot'
AND p.PersonId = e1.PersonId);
EXISTS statements have a few advantages:
No chance of changing the granularity of your data since you aren't joining in your FROM clause.
Easy to add additional conditions as needed. Just add more EXISTS statements in your WHERE clause.
The condition is cleanly encapsulated in the EXISTS, so code intent is clear.
If you ever need complex conditions like existence of a value in another table based on specific filter conditions, then you can easily add this without introducing table joins in your main query.
Some alternative solutions such as PersonId IN (SUBQUERY) can introduce unexpected behavior in certain conditions, particularly when the subquery returns a NULL value.
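The last point about NULLs is easiest to see with the negated forms. The sketch below is only an illustration under my own assumptions: it uses SQLite through Python's sqlite3 module, and the Excluded table is hypothetical (it is not part of the question). Because one of its rows is NULL, NOT IN evaluates to unknown for every non-matching person and returns nothing, while NOT EXISTS behaves as most people expect.

```python
import sqlite3

def compare_not_in_and_not_exists():
    """Return (NOT IN result, NOT EXISTS result) for the same exclusion list."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Person (PersonId INT, Eat VARCHAR(10));
        INSERT INTO Person VALUES
            (111, 'Carrot'), (111, 'Apple'), (111, 'Orange'),
            (222, 'Carrot'), (222, 'Apple'), (333, 'Carrot'),
            (444, 'Orange'), (555, 'Apple');
        -- Hypothetical exclusion list that happens to contain a NULL.
        CREATE TABLE Excluded (PersonId INT);
        INSERT INTO Excluded VALUES (111), (NULL);
    """)
    # x NOT IN (111, NULL) is never true: it is false for 111 and
    # unknown (NULL) for everything else, so no rows survive.
    not_in = [row[0] for row in conn.execute("""
        SELECT DISTINCT PersonId
        FROM Person
        WHERE PersonId NOT IN (SELECT PersonId FROM Excluded)
        ORDER BY PersonId
    """)]
    # NOT EXISTS only asks whether a matching row exists, so the NULL
    # row in Excluded simply never matches anything.
    not_exists = [row[0] for row in conn.execute("""
        SELECT DISTINCT p.PersonId
        FROM Person AS p
        WHERE NOT EXISTS (SELECT 1
                          FROM Excluded AS e
                          WHERE e.PersonId = p.PersonId)
        ORDER BY p.PersonId
    """)]
    conn.close()
    return not_in, not_exists

print(compare_not_in_and_not_exists())  # ([], [222, 333, 444, 555])
```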
select
count(PersonID)
from Person
where eat = 'Carrot'
and PersonID in (select PersonID
from Person
where eat = 'Apple');
The subquery selects the persons who eat apples, and the outer query then counts those among them who also eat carrots.
SELECT COUNT (A.personID) FROM
(SELECT distinct PersonID FROM Person WHERE Eat = 'Carrot'
INTERSECT
SELECT distinct PersonID FROM Person WHERE Eat = 'Apple') as A

How to replace all non-zero values from column in select?

I need to replace the non-zero values in a column within a select statement.
SELECT Status, Name, Car from Events;
I can do it like this:
SELECT Replace(Status, '1', 'Ready'), Name, Car from Events;
Or using Case/Update.
But I have numbers from -5 to 10, and writing a Replace (or similar) for each value is not a good idea.
How can I add a comparison to the replacement without updating the database?
Table looks like this:
Status Name Car
0 John Porsche
1 Bill Dodge
5 Megan Ford
The standard method is to use case:
select t.*,
(case when status = 1 then 'Ready'
else 'Something else'
end) as status_string
from t;
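For the "all non-zero values" case in the question, a single comparison in the CASE covers the whole range from -5 to 10. Here is a minimal sketch on the sample rows, run in an in-memory SQLite database through Python's sqlite3 module. SQLite, the function name, and the 'Not ready' label for status 0 are my assumptions; the question only names 'Ready'.

```python
import sqlite3

def label_statuses():
    """Map every non-zero Status to 'Ready' without touching the stored data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Events (Status INT, Name VARCHAR(50), Car VARCHAR(50));
        INSERT INTO Events VALUES
            (0, 'John', 'Porsche'),
            (1, 'Bill', 'Dodge'),
            (5, 'Megan', 'Ford');
    """)
    # One comparison handles every non-zero value, negative or positive.
    # 'Not ready' is a placeholder label for status 0.
    rows = conn.execute("""
        SELECT CASE WHEN Status <> 0 THEN 'Ready' ELSE 'Not ready' END AS StatusText,
               Name,
               Car
        FROM Events
        ORDER BY Status
    """).fetchall()
    conn.close()
    return rows

for row in label_statuses():
    print(row)
# ('Not ready', 'John', 'Porsche')
# ('Ready', 'Bill', 'Dodge')
# ('Ready', 'Megan', 'Ford')
```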
I would instead recommend, though, that you have a status reference table:
create table statuses (
status int primary key,
name varchar(255)
);
insert into statuses (status, name)
values (0, 'UNKNOWN'),
(1, 'READY'),
. . . -- for the rest of the statuses
Then use JOIN:
select t.*, s.name
from t join
statuses s
on t.status = s.status;
SELECT IF(status =1, 'Approved', 'Pending') FROM TABLENAME

SQL to find number of NULL and non-NULL entries for a column

For each POSTAL_CODE, I want to know how many NULL TIME_VISITEDs there are and how many NOT NULL TIME_VISITEDs
CREATE TABLE VISITS
(
ID INTEGER NOT NULL,
POSTAL_CODE VARCHAR(5) NOT NULL,
TIME_VISITED TIMESTAMP,
CONSTRAINT PK_VISITS PRIMARY KEY (ID)
);
Sample data:
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('234', '01910', '21.04.2014, 10:13:33.000');
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('334', '01910', '28.04.2014, 13:13:33.000');
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('433', '01910', '29.04.2014, 13:03:19.000');
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('533', '01910', NULL);
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('833', '01910', NULL);
This is the output I want for the data above:
POSTAL_CODE=01910, NUM_TIME_VISITED_NULL=2, NUM_TIME_VISITED_NOT_NULL=3
I am using the following SQL
SELECT distinct r.POSTAL_CODE,
(select count(*) from VISITS p where p.POSTAL_CODE=r.POSTAL_CODE and p.TIME_VISITED is null) as NUM_TIME_VISITED_NULL,
(select count(*) from VISITS p where p.POSTAL_CODE=r.POSTAL_CODE and p.TIME_VISITED is not null) as NUM_TIME_VISITED_NOT_NULL
FROM VISITS r
ORDER BY r.POSTAL_CODE
The query takes a very long time if there are lots of rows in the table.
What changes do I need to make to be able to get this information more quickly?
Use conditional aggregation instead:
select v.postal_code,
sum(case when v.time_visited is null then 1 else 0
end) as NumTimeVisitedNull,
count(v.time_visited) as NumTimeVisitedNotNull
from visits v
group by v.postal_code;
Note: you can also write this as:
select v.postal_code,
(count(*) - count(v.time_visited) ) as NumTimeVisitedNull,
count(v.time_visited) as NumTimeVisitedNotNull
from visits v
group by v.postal_code;
The count() function specifically counts the number of non-NULL values.
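To check both counts against the sample data, here is a small sketch using an in-memory SQLite database via Python's sqlite3 module. SQLite and the ISO-formatted timestamps are my assumptions for the test; the original inserts use a different timestamp format that depends on the database in use.

```python
import sqlite3

def null_counts_per_postal_code():
    """Return (postal_code, null count, non-null count) rows in a single pass."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE VISITS (
            ID INTEGER NOT NULL,
            POSTAL_CODE VARCHAR(5) NOT NULL,
            TIME_VISITED TIMESTAMP,
            CONSTRAINT PK_VISITS PRIMARY KEY (ID)
        );
        INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES
            (234, '01910', '2014-04-21 10:13:33.000'),
            (334, '01910', '2014-04-28 13:13:33.000'),
            (433, '01910', '2014-04-29 13:03:19.000'),
            (533, '01910', NULL),
            (833, '01910', NULL);
    """)
    # COUNT(*) counts all rows, COUNT(column) only the non-NULL ones,
    # so their difference is the number of NULLs.
    rows = conn.execute("""
        SELECT POSTAL_CODE,
               COUNT(*) - COUNT(TIME_VISITED) AS NUM_TIME_VISITED_NULL,
               COUNT(TIME_VISITED) AS NUM_TIME_VISITED_NOT_NULL
        FROM VISITS
        GROUP BY POSTAL_CODE
        ORDER BY POSTAL_CODE
    """).fetchall()
    conn.close()
    return rows

print(null_counts_per_postal_code())  # [('01910', 2, 3)]
```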
You can do this all in one pass. COUNT counts how many non-NULLs there are. Then use SUM of a CASE statement to count up all the NULLs.
SELECT POSTAL_CODE
,COUNT(TIME_VISITED) AS NUM_TIME_VISITED_NOT_NULL
,SUM(CASE WHEN TIME_VISITED IS NULL THEN 1 ELSE 0 END) AS NUM_TIME_VISITED_NULL
FROM VISITS
GROUP BY POSTAL_CODE

SQL - need to determine implicit end dates for supplied begin dates

Consider the following:
CREATE TABLE Members
(
MemberID CHAR(10)
, GroupID CHAR(10)
, JoinDate DATETIME
)
INSERT Members VALUES ('1', 'A', '2010-01-01')
INSERT Members VALUES ('1', 'C', '2010-09-05')
INSERT Members VALUES ('1', 'B', '2010-04-15')
INSERT Members VALUES ('1', 'B', '2010-10-10')
INSERT Members VALUES ('1', 'A', '2010-06-01')
INSERT Members VALUES ('1', 'D', '2010-11-30')
What would be the best way to select from this table, determining the implied "LeaveDate", producing the following data set:
MemberID GroupID JoinDate LeaveDate
1 A 2010-01-01 2010-04-14
1 B 2010-04-15 2010-05-31
1 A 2010-06-01 2010-09-04
1 C 2010-09-05 2010-10-09
1 B 2010-10-10 2010-11-29
1 D 2010-11-30 NULL
As you can see, a member is assumed to have no lapse in membership. The [LeaveDate] for each member status period is assumed to be the day prior to the next chronological [JoinDate] that can be found for that member in a different group. Of course this is a simplified illustration of my actual problem, which includes a couple more categorization/grouping columns and thousands of different members with [JoinDate] values stored in no particular order.
Something like this perhaps? Self join, and select the minimum joining date that is greater than the joining date for the current row - i.e. the leave date plus one. Subtract one day from it.
You may need to adjust the date arithmetic for your particular RDBMS.
SELECT
m1.*
, MIN( m2.JoinDate ) - INTERVAL 1 DAY AS LeaveDate
FROM
Members m1
LEFT JOIN
Members m2
ON m2.MemberID = m1.MemberID
AND m2.JoinDate > m1.JoinDate
GROUP BY
m1.MemberID
, m1.GroupID
, m1.JoinDate
ORDER BY
m1.MemberID
, m1.JoinDate
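As an example of adjusting the date arithmetic, here is the same self join sketched for SQLite and run through Python's sqlite3 module. SQLite is my choice for a quick test, not something the question specifies; there the subtraction becomes date(..., '-1 day'). The dates are stored as ISO strings, which sort chronologically, and the rows use the join dates from the expected result set.

```python
import sqlite3

def membership_periods():
    """Derive each LeaveDate as the day before the member's next JoinDate."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Members (MemberID CHAR(10), GroupID CHAR(10), JoinDate DATETIME);
        INSERT INTO Members VALUES
            ('1', 'A', '2010-01-01'),
            ('1', 'C', '2010-09-05'),
            ('1', 'B', '2010-04-15'),
            ('1', 'B', '2010-10-10'),
            ('1', 'A', '2010-06-01'),
            ('1', 'D', '2010-11-30');
    """)
    # For each row, m2 holds all later joins of the same member; the
    # earliest of them minus one day is the implied leave date. The
    # LEFT JOIN keeps the last period, whose LeaveDate stays NULL.
    rows = conn.execute("""
        SELECT m1.MemberID,
               m1.GroupID,
               m1.JoinDate,
               date(MIN(m2.JoinDate), '-1 day') AS LeaveDate
        FROM Members m1
        LEFT JOIN Members m2
               ON m2.MemberID = m1.MemberID
              AND m2.JoinDate > m1.JoinDate
        GROUP BY m1.MemberID, m1.GroupID, m1.JoinDate
        ORDER BY m1.MemberID, m1.JoinDate
    """).fetchall()
    conn.close()
    return rows

for row in membership_periods():
    print(row)
# ('1', 'A', '2010-01-01', '2010-04-14')
# ('1', 'B', '2010-04-15', '2010-05-31')
# ('1', 'A', '2010-06-01', '2010-09-04')
# ('1', 'C', '2010-09-05', '2010-10-09')
# ('1', 'B', '2010-10-10', '2010-11-29')
# ('1', 'D', '2010-11-30', None)
```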
Standard (ANSI) SQL solution:
SELECT memberid,
groupid,
joindate,
lead(joindate) OVER (PARTITION BY memberid ORDER BY joindate ASC) - INTERVAL '1' DAY AS leave_date
FROM members
ORDER BY joindate ASC

Add or delete repeated row

I have an output like this:
id name date school school1
1 john 11/11/2001 nyu ucla
1 john 11/11/2001 ucla nyu
2 paul 11/11/2011 uft mit
2 paul 11/11/2011 mit uft
I would like to achieve this:
id name date school school1
1 john 11/11/2001 nyu ucla
2 paul 11/11/2011 mit uft
I am using direct join as in:
select distinct
a.id, a.name,
b.date,
c.school,
a1.id, a1.name,
b1.date,
c1.school
from table a, table b, table c,table a1, table b1, table c1
where
a.id=b.id
and...
Any ideas?
We will need more information such as what your tables contain and what you are after.
One thing I noticed is that you have a school and then a school1. Normalization rules (first normal form in particular) say you should not duplicate fields and append numbers to them to hold more information, even if you think the relationship will only ever need 1 or 2 additional items. You need to create a second table that associates a user with one to many schools.
I agree with everyone else that both your source table and your desired output are poor design. While you probably can't do anything about your source table, I recommend the following code and output:
Select id, name, date, school from MyTable
union
Select id, name, date, school1 from MyTable;
(repeat as necessary)
This will give you results in the format:
id name date school
1 john 11/11/2001 nyu
1 john 11/11/2001 ucla
2 paul 11/11/2011 mit
2 paul 11/11/2011 uft
(Note: in my version of SQL, union queries automatically select distinct records so the distinct flag isn't needed)
With this format, you could easily count the number of schools per student, number of students per school, etc.
If processing time and/or storage space is a factor here, you could then split this into 2 tables, 1 with the id,name & date, the other with the id & school (basically what JonH just said). But if you're just working up some simple statistics, this should suffice.
This problem was just too irresistible, so I took a guess at the data structures we are dealing with. The technology wasn't specified in the question. This is in Transact-SQL.
create table student
(
id int not null primary key identity,
name nvarchar(100) not null default '',
graduation_date date not null default getdate(),
)
go
create table school
(
id int not null primary key identity,
name nvarchar(100) not null default ''
)
go
create table student_school_asc
(
student_id int not null foreign key references student (id),
school_id int not null foreign key references school (id),
primary key (student_id, school_id)
)
go
insert into student (name, graduation_date) values ('john', '2001-11-11')
insert into student (name, graduation_date) values ('paul', '2011-11-11')
insert into school (name) values ('nyu')
insert into school (name) values ('ucla')
insert into school (name) values ('uft')
insert into school (name) values ('mit')
insert into student_school_asc (student_id, school_id) values (1,1)
insert into student_school_asc (student_id, school_id) values (1,2)
insert into student_school_asc (student_id, school_id) values (2,3)
insert into student_school_asc (student_id, school_id) values (2,4)
select
s.id,
s.name,
s.graduation_date as [date],
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s1 where s1.rank_num = 1) as school,
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s2 where s2.rank_num = 2) as school1
from
student s
Result:
id name date school school1
--- ----- ---------- ------- --------
1 john 2001-11-11 nyu ucla
2 paul 2011-11-11 mit uft