sqlite: COUNT with GROUP BY does not give the expected result

On SQLite, I have the following tables:
papers: rero_id, doi, year
writtenby: rero_id, authorid, instid
authors: authorid, name, firstname
inst: inst_id, name, see_id
inst is a table of institutions: universities and so on.
Each row in writtenby gives a paper, an author, and an institution the author was attached to at that time. There can be more than one institution, in which case the (paper, authorid) pair is repeated for each institution.
For a given author, I want a list and a count of the institutions he has co-authored papers with.
For the list I tried
SELECT inst.name as loc
FROM (
(authors INNER JOIN writtenby ON authors.authorid =
writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid =
auth_1.authorid
inner join inst on writtenby_1.instid = inst.inst_id
WHERE (authors.name) ="Doe" AND (authors.firstname)= "Joe"
ORDER BY loc
I got a list that seems ok.
Now I would like to group these institution names and get a count for each.
I tried
SELECT inst.name, count(inst.name)
FROM (
(authors INNER JOIN writtenby ON authors.authorid =
writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid =
auth_1.authorid
inner join inst on writtenby_1.instid = inst.inst_id
GROUP BY inst.name
HAVING (authors.name) ="Doe" AND (authors.firstname)= "John"
I get only three lines, not a count for each institution listed by the first query.
Thanks for correcting me!
François

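Try using WHERE instead of HAVING. WHERE filters rows before they are grouped; HAVING filters the groups afterwards, so it is meant for conditions on aggregates.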
SELECT inst.name, count(inst.name)
FROM (
(authors INNER JOIN writtenby ON authors.authorid =
writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid =
auth_1.authorid
inner join inst on writtenby_1.instid = inst.inst_id
where authors.name ='Doe' AND authors.firstname= 'John'
GROUP BY inst.name
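
To make the WHERE/HAVING difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the alias wb_1, and the threshold of 2 are made up for illustration, and the schema is reduced to the columns the query needs. The author filter goes in WHERE; HAVING is used only for a condition on the aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors   (authorid INTEGER PRIMARY KEY, name TEXT, firstname TEXT);
CREATE TABLE inst      (inst_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE writtenby (rero_id INTEGER, authorid INTEGER, instid INTEGER);

INSERT INTO authors VALUES (1, 'Doe', 'John'), (2, 'Roe', 'Jane'), (3, 'Poe', 'Ann');
INSERT INTO inst VALUES (10, 'Uni A'), (20, 'Uni B'), (30, 'Uni C');
-- paper 100: Doe (Uni A) with Roe (Uni B)
-- paper 101: Doe (Uni A) with Roe (Uni B) and Poe (Uni C)
-- paper 102: Roe (Uni B) with Poe (Uni C); Doe is not an author
INSERT INTO writtenby VALUES
  (100, 1, 10), (100, 2, 20),
  (101, 1, 10), (101, 2, 20), (101, 3, 30),
  (102, 2, 20), (102, 3, 30);
""")

# WHERE selects Doe's papers before grouping; the self-join on rero_id
# then lists every institution on those papers (Doe's own included).
base = """
SELECT inst.name AS loc, COUNT(*) AS c
FROM authors
JOIN writtenby         ON authors.authorid = writtenby.authorid
JOIN writtenby AS wb_1 ON writtenby.rero_id = wb_1.rero_id
JOIN inst              ON wb_1.instid = inst.inst_id
WHERE authors.name = 'Doe' AND authors.firstname = 'John'
GROUP BY inst.name
{having}
ORDER BY c DESC, loc
"""

# All institutions on Doe's papers, with counts.
counts = conn.execute(base.format(having="")).fetchall()

# HAVING filters whole groups after aggregation.
frequent = conn.execute(base.format(having="HAVING COUNT(*) >= 2")).fetchall()

print(counts)    # [('Uni A', 2), ('Uni B', 2), ('Uni C', 1)]
print(frequent)  # [('Uni A', 2), ('Uni B', 2)]
```

Paper 102 never contributes to the counts because the WHERE clause removes it before any grouping happens.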

I got this version to work:
SELECT inst.name as loc, count(*) as c
FROM (
(authors INNER JOIN writtenby ON authors.authorid = writtenby.authorid)
INNER JOIN writtenby AS writtenby_1 ON writtenby.rero_id =
writtenby_1.rero_id
inner join inst on writtenby_1.instid = inst.inst_id
)
INNER JOIN authors AS auth_1 ON writtenby_1.authorid = auth_1.authorid
WHERE (authors.name) ="Doe" AND (authors.firstname)= "John"
GROUP BY inst.name
ORDER BY c DESC
So I can still use a WHERE clause; it is not the same as HAVING.
Thanks to fa6 for the answer.
F.

Related

SQL question on correlated sub-queries using windowing rank functions

I have an SQL homework assignment that asks me to list all the customers and the most recent DVD each of them has rented, including the title, Genre, Rating, DVD and date of rental. This can be solved with a correlated sub-query or a window rank function.
Here is a screenshot of the schema:
Here is what I have tried so far:
Select
Concat(m.MemberFirstName, ' ', m.MemberLastName) as Member
, d.DvdTitle
, g.GenreName
, rt.RatingName
, r.RentalRequestDate
From
Member m
Inner Join
RentalQueue rq on m.MemberId = rq.MemberId
Inner Join
DVD d on d.DVDId = rq.DVDId
Inner Join
Genre g on g.GenreId = d.GenreId
Inner Join
Rating rt on rt.RatingId = d.RatingId
Inner Join
Rental r on r.DVDId = d.DVDId
I am not sure how I can use correlated subqueries to answer the question above as I am quite new to correlated subqueries and I would appreciate some help on this. Thanks in advance.
Try something like this (CROSS APPLY and TOP are SQL Server syntax), but please try to understand it too.
Select
Concat(m.MemberFirstName, ' ', m.MemberLastName) as Member
, mostrecent.*
From
Member m
CROSS APPLY
(select top (1)
d.DvdTitle
, g.GenreName
, rt.RatingName
, r.RentalRequestDate
FROM
RentalQueue rq
Inner Join
DVD d on d.DVDId = rq.DVDId
Inner Join
Genre g on g.GenreId = d.GenreId
Inner Join
Rating rt on rt.RatingId = d.RatingId
Inner Join
Rental r on r.DVDId = d.DVDId
where m.MemberId = rq.MemberId
order by r.RentalRequestDate desc
) mostrecent
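
For comparison, the correlated sub-query route mentioned in the question can be sketched as follows, run here through Python's sqlite3 module. The schema is a guess, since the screenshot is not available: in particular, the sketch assumes Rental has a MemberId column, and it leaves out the Genre and Rating tables. All sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (MemberId INTEGER PRIMARY KEY, MemberFirstName TEXT, MemberLastName TEXT);
CREATE TABLE DVD    (DVDId INTEGER PRIMARY KEY, DvdTitle TEXT);
-- Assumption: Rental carries MemberId; the original schema is not shown.
CREATE TABLE Rental (RentalId INTEGER PRIMARY KEY, MemberId INTEGER,
                     DVDId INTEGER, RentalRequestDate TEXT);

INSERT INTO Member VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO DVD VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
INSERT INTO Rental VALUES
  (1, 1, 1, '2020-01-05'),
  (2, 1, 2, '2020-03-01'),
  (3, 2, 3, '2020-02-10'),
  (4, 2, 1, '2020-01-20');
""")

# The inner query is "correlated": it is evaluated per outer member row
# and returns that member's latest rental date.
latest_rentals = conn.execute("""
SELECT m.MemberFirstName || ' ' || m.MemberLastName AS Member,
       d.DvdTitle,
       r.RentalRequestDate
FROM Member m
JOIN Rental r ON r.MemberId = m.MemberId
JOIN DVD d    ON d.DVDId = r.DVDId
WHERE r.RentalRequestDate = (
    SELECT MAX(r2.RentalRequestDate)
    FROM Rental r2
    WHERE r2.MemberId = m.MemberId
)
ORDER BY m.MemberId
""").fetchall()

print(latest_rentals)
# [('Ann Lee', 'Brazil', '2020-03-01'), ('Bob Ray', 'Casablanca', '2020-02-10')]
```

Two things to keep in mind with this form: members with no rentals are dropped by the inner joins, and a member with two rentals on the same latest date would appear twice.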

SELECT COLUMNS FROM INNER JOIN

I want to select two columns from the inner join of two select queries. I have written a query joining three tables, and from the result I want to get only two columns, but my query shows an error. I am using Oracle SQL Developer.
SELECT firstname,surname
FROM (
SELECT A.firstname,A.surname,I.ACNUM,I.FIELDNUM
FROM ACADEMIC A INNER JOIN INTEREST I
ON (A.ACNUM = I.ACNUM)
INNER JOIN SUBJECT S ON (I.FIELDNUM = S.FIELDNUM) WHERE S.TITLE = 'History' ) ;
I want only the firstname and surname, but I get an error like:
Incorrect syntax near ';'.
Why are you using a subselect? Just use:
SELECT A.firstname, A.surname
FROM ACADEMIC A INNER JOIN
INTEREST I
ON A.ACNUM = I.ACNUM INNER JOIN
SUBJECT S
ON I.FIELDNUM = S.FIELDNUM
WHERE S.TITLE = 'History' ;
When you select from a subquery, you should give it an alias as well; several databases, SQL Server among them, require one.
Try this:
SELECT D.firstname,D.surname
FROM (SELECT A.firstname,A.surname,I.ACNUM,I.FIELDNUM
FROM ACADEMIC A
INNER JOIN INTEREST I ON (A.ACNUM = I.ACNUM)
INNER JOIN SUBJECT S ON (I.FIELDNUM = S.FIELDNUM)
WHERE S.TITLE = 'History') D;
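
As a quick check that the two answers are equivalent, the sketch below runs both forms through Python's sqlite3 module on invented sample rows. SQLite itself does not insist on the alias, so this does not reproduce the original error; it only shows that the direct join and the aliased derived table return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACADEMIC (ACNUM INTEGER PRIMARY KEY, firstname TEXT, surname TEXT);
CREATE TABLE INTEREST (ACNUM INTEGER, FIELDNUM INTEGER);
CREATE TABLE SUBJECT  (FIELDNUM INTEGER PRIMARY KEY, TITLE TEXT);

INSERT INTO ACADEMIC VALUES (1, 'Ada', 'Byron'), (2, 'Carl', 'Gauss');
INSERT INTO SUBJECT  VALUES (100, 'History'), (200, 'Maths');
INSERT INTO INTEREST VALUES (1, 100), (2, 200);
""")

# Form 1: plain three-table join, no subquery needed.
direct = conn.execute("""
SELECT A.firstname, A.surname
FROM ACADEMIC A
INNER JOIN INTEREST I ON A.ACNUM = I.ACNUM
INNER JOIN SUBJECT S  ON I.FIELDNUM = S.FIELDNUM
WHERE S.TITLE = 'History'
""").fetchall()

# Form 2: the same join wrapped in a derived table with the alias D.
derived = conn.execute("""
SELECT D.firstname, D.surname
FROM (SELECT A.firstname, A.surname, I.ACNUM, I.FIELDNUM
      FROM ACADEMIC A
      INNER JOIN INTEREST I ON A.ACNUM = I.ACNUM
      INNER JOIN SUBJECT S  ON I.FIELDNUM = S.FIELDNUM
      WHERE S.TITLE = 'History') D
""").fetchall()

print(direct)   # [('Ada', 'Byron')]
print(derived)  # [('Ada', 'Byron')]
```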

Write the SQL code that will list Physician-Person appointments only ONCE

This one is confusing me.
There are three tables: Appointments (Appointment_ID, Physician_ID and Person_ID), Physician (Physician_ID) and Person (Person_ID and Physician_ID).
This is what I have so far:
SELECT DISTINCT Appointment_date_time FROM Appointment
INNER JOIN Person
ON Appointment.Person_ID = Person.Person_ID
INNER JOIN Physician
ON Physician.Physician_ID = Person.Physician_ID
HAVING COUNT(*) < 1
Given those three tables, join Appointments directly to Person and to Physician:
select *
from Appointments a
inner join Person p
on a.Person_ID = p.Person_ID
inner join Physician ph
on a.Physician_ID = ph.Physician_ID

Nested 'Where'?

I have a table named Actor, with only a column for City (CityId). I want to return the number of actors in a particular State (StateId). The catch, however, is that I have separate tables for City, County, and finally State (City has CountyId, County has StateId). How do I do this in a T-SQL query?
I have a solution that involves nested Select statements, something like:
SELECT COUNT(1)
FROM Actor a
WHERE a.CityId IN
(SELECT CityId FROM City WHERE CountyId IN...)
...but is there a more efficient way to do this? Thanks
You can use this query to get your output:
SELECT s.stateId, COUNT(a.ActorId) AS actorCount
FROM Actor a
INNER JOIN City c ON a.cityId = c.cityId
INNER JOIN County con ON c.countyId = con.countyId
INNER JOIN State s ON con.stateId = s.stateId
GROUP BY s.stateId
Use JOINS to query your data.
I am using INNER JOIN here.
Assuming that you have CountryId in your City table, you can do it the following way.
If you don't have CountryId in your City table, you have to apply one more INNER JOIN on the State table.
SELECT COUNT(1) FROM Actor a INNER JOIN
City b ON a.CityId = b.CityId
WHERE b.CountryId IN (...)
You can join across the tables you have and then use the GROUP BY clause to find the total number of actors in each state.
I guessed the column names; replace them with the actual names in your database.
SELECT State.StateId,
Count(ActorId) AS Total
FROM Actor
INNER JOIN City ON Actor.CityId = City.CityId
INNER JOIN County ON County.CountyId = City.CountyId
INNER JOIN State ON State.StateId = County.StateId
GROUP BY State.StateId
Assuming the relation names, you can do something like this with joins:
select s.ID, s.Name, count(*)
from Actors a
inner join Cities c on c.ID = a.CityID
inner join County cn on cn.ID = c.CountyID
inner join State s on s.ID = cn.StateID
group by s.ID, s.Name
If you only need the StateId you don't even need to join with states, this will do:
select cn.StateID, count(*)
from Actors a
inner join Cities c on c.ID = a.CityID
inner join County cn on cn.ID = c.CountyID
group by cn.StateID
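
The join chain described in the question (Actor to City to County, with StateId on County) can be tried out with a small sketch using Python's sqlite3 module. The table and column names follow the question's description, the ActorId column and all rows are invented, and the syntax shown is plain SQL that should carry over to T-SQL unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE State  (StateId  INTEGER PRIMARY KEY);
CREATE TABLE County (CountyId INTEGER PRIMARY KEY, StateId INTEGER);
CREATE TABLE City   (CityId   INTEGER PRIMARY KEY, CountyId INTEGER);
CREATE TABLE Actor  (ActorId  INTEGER PRIMARY KEY, CityId INTEGER);

INSERT INTO State  VALUES (1), (2);
INSERT INTO County VALUES (1, 1), (2, 1), (3, 2);          -- (CountyId, StateId)
INSERT INTO City   VALUES (1, 1), (2, 1), (3, 2), (4, 3);  -- (CityId, CountyId)
INSERT INTO Actor  VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 4);
""")

# Count per state; County already holds StateId, so State is not joined.
per_state = conn.execute("""
SELECT cn.StateId, COUNT(*) AS Total
FROM Actor a
JOIN City c    ON c.CityId = a.CityId
JOIN County cn ON cn.CountyId = c.CountyId
GROUP BY cn.StateId
ORDER BY cn.StateId
""").fetchall()

# Count for one particular state, passed as a bound parameter.
one_state = conn.execute("""
SELECT COUNT(*)
FROM Actor a
JOIN City c    ON c.CityId = a.CityId
JOIN County cn ON cn.CountyId = c.CountyId
WHERE cn.StateId = ?
""", (1,)).fetchone()[0]

print(per_state)  # [(1, 3), (2, 2)]
print(one_state)  # 3
```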

Query extensibility with WHERE EXISTS with a large table

The following query is designed to find the number of people who went to a hospital, the total number of people who went to a hospital, and then divide the two to find a percentage. The Claims table has more than two million rows and has the correct non-clustered index on patientid, admissiondate, and dischargedate. The query runs quickly enough, but I'm interested in how I could make it more usable. I would like to be able to add another code in the line where (hcpcs.hcpcs ='97001') and have the change in percentRehabNotHomeHealth be reflected in another column. Is this possible without writing a big, fat join statement where I join the results of the two queries together? I know that by adding the extra column the math won't look right, but I'm not worried about that at the moment. Desired sample output: http://imgur.com/BCLrd
database schema
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join
--this join adds the hospitalCounts column
(
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
) as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select distinct p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
You might look into CTEs (Common Table Expressions) to get what you need. A CTE lets you compute summarized data once and join it back to the detail rows on a common key. As an example, I changed your join on the subquery into a CTE.
;with hospitalCounts as (
select h.hospitalname, count(*) as hospitalCounts
from hospitals as h
inner join patient as p on p.hospitalnpi=h.npi
where p.statecode='21' and h.statecode='21'
group by h.hospitalname
)
select h.hospitalname
,count(*) as visitCounts
,hospitalcounts
,round(count(*)/cast(hospitalcounts as float) *100,2) as percentRehabNotHomeHealth
from Patient p
inner join statecounties as sc on sc.countycode = p.countycode
and sc.statecode = p.statecode
inner join hospitals as h on h.npi=p.hospitalnpi
inner join hospitalCounts as t on t.hospitalname=h.hospitalname
--this where exists clause gives the visitCounts column
where h.stateCode='21' and p.statecode='21'
and exists
(
select p2.patientid
from Patient as p2
inner join Claims as c on c.patientid = p2.patientid
and c.admissiondate = p2.admissiondate
and c.dischargedate = p2.dischargedate
inner join hcpcs on hcpcs.hcpcs=c.hcpcs
inner join hospitals as h on h.npi=p2.hospitalnpi
where (hcpcs.hcpcs ='97001' or hcpcs.hcpcs='9339' or hcpcs.hcpcs='97002')
and p2.patientid=p.patientid
)
and hospitalcounts > 10
group by h.hospitalname, t.hospitalcounts
having count(*)>10
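
The CTE pattern above (summarize once, then join the summary back to the detail rows) can be sketched on a much smaller scale with Python's sqlite3 module. The schema here is a hypothetical reduction of the one in the question: only hospitals, patient and claims, with invented rows, no state filters, and no minimum-count thresholds. The percentage column is named pct for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hospitals (npi INTEGER PRIMARY KEY, hospitalname TEXT);
CREATE TABLE patient   (patientid INTEGER PRIMARY KEY, hospitalnpi INTEGER);
CREATE TABLE claims    (patientid INTEGER, hcpcs TEXT);

INSERT INTO hospitals VALUES (1, 'General'), (2, 'Mercy');
INSERT INTO patient VALUES (1, 1), (2, 1), (3, 1), (4, 1),
                           (5, 2), (6, 2), (7, 2), (8, 2);
INSERT INTO claims VALUES (1, '97001'), (1, '97002'), (2, '97001'),
                          (5, '97002'), (6, '12345');
""")

rows = conn.execute("""
WITH hospitalCounts AS (
    -- Summary: total patients per hospital.
    SELECT p.hospitalnpi, COUNT(*) AS hospitalcounts
    FROM patient AS p
    GROUP BY p.hospitalnpi
)
SELECT h.hospitalname,
       COUNT(*) AS visitCounts,
       t.hospitalcounts,
       ROUND(COUNT(*) * 100.0 / t.hospitalcounts, 2) AS pct
FROM patient AS p
JOIN hospitals AS h      ON h.npi = p.hospitalnpi
JOIN hospitalCounts AS t ON t.hospitalnpi = h.npi
-- Detail filter: keep patients with at least one matching claim.
WHERE EXISTS (
    SELECT 1
    FROM claims AS c
    WHERE c.patientid = p.patientid
      AND c.hcpcs IN ('97001', '97002')
)
GROUP BY h.hospitalname, t.hospitalcounts
ORDER BY h.hospitalname
""").fetchall()

print(rows)  # [('General', 2, 4, 50.0), ('Mercy', 1, 4, 25.0)]
```

Patient 1 has two matching claims but is counted once, because EXISTS only tests whether a matching claim is present.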