What Clause would most optimally create this query? - sql

So I don't have much experience with SQL, and am trying to learn. An interview question I came across had this question. I'm trying to learn more SQL but maybe I'm missing a piece of info to solve this? Or maybe I'm approaching the problem wrong.
This is the question:
We have following two tables , below is their info:
POLICY (id as int, policy_content as varchar2)
POLICY_VOTES (vote as boolean, policy_id as int)
Write a single query that returns the policy_id, number of yes(true) votes and number of no(false) votes with a row for each policy up for a vote stored
My first thought when approaching this was to use a WITH clause to get the policy_ids and use an inner join to get the votes for yes and no but I can't find a way to make it work, which is what leads me to believe that there's another clause in SQL I'm not aware of or couldn't find that would make it easier. Either that or I'm thinking of the problem in the wrong way.

Good question.
I cannot answer too specifically, since you did not specify a DBMS, but what you will want to do is count or situationally sum based on criteria. When you use an aggregate function like that, you also need GROUP BY.
Here are two example tables I made with test data:
policy
| id | policy_content |
|----|----------------|
| 1 | foo |
| 2 | foo |
| 3 | foo |
| 4 | foo |
| 5 | foo |
policy votes
| vote | policy_id |
|------|-----------|
| yes | 1 |
| no | 1 |
| yes | 2 |
| yes | 2 |
| no | 3 |
| no | 3 |
| no | 4 |
| yes | 4 |
| yes | 5 |
| yes | 5 |
Using the below query:
SELECT
policy_votes.policy_id,
SUM(CASE WHEN vote = 'yes' THEN 1 ELSE 0 END) AS yes_votes,
SUM(CASE WHEN vote = 'no' THEN 1 ELSE 0 END) AS no_votes
FROM
policy_votes
GROUP BY
policy_votes.policy_id
You get:
| POLICY_ID | YES_VOTES | NO_VOTES |
|-----------|-----------|----------|
| 1 | 1 | 1 |
| 2 | 2 | 0 |
| 4 | 1 | 1 |
| 5 | 2 | 0 |
| 3 | 0 | 2 |
Here is an SQL Fiddle for you to try it out.

Try this:
select p.id, p.content,
Count(case when pv.vote='true' then 1 end) as number_of_yes,
Count(case when pv.vote='false' then 1 end) as number_of_no
From policy p join policy_votes pv
On(p.id = pv.policy_id)
Group by p.id, p.content
Cheers!!

Related

SQL: tricky question for finding lockout dates

Hope you can help. We have a table with two columns Customer_ID and Trip_Date. The customer receives 15% off on their first visit and on every visit where they haven't received the 15% off offer in the past thirty days. How do I write a single SQL query that finds all days where a customer received 15% off?
The table looks like this
+-----+-------+----------+
| Customer_ID | date |
+-----+-------+----------+
| 1 | 01-01-17 |
| 1 | 01-17-17 |
| 1 | 02-04-17 |
| 1 | 03-01-17 |
| 1 | 03-15-17 |
| 1 | 04-29-17 |
| 1 | 05-18-17 |
+-----+-------+----------+
The desired output would look like this:
+-----+-------+----------+--------+----------+
| Customer_ID | date | received_discount |
+-----+-------+----------+--------+----------+
| 1 | 01-01-17 | 1 |
| 1 | 01-17-17 | 0 |
| 1 | 02-04-17 | 1 |
| 1 | 03-01-17 | 0 |
| 1 | 03-15-17 | 1 |
| 1 | 04-29-17 | 1 |
| 1 | 05-18-17 | 0 |
+-----+-------+----------+--------+----------+
We are doing this work in Netezza. I can't think of a way using just window functions, only using recursion and looping. Is there some clever trick that I'm missing?
Thanks in advance,
GF
You didn't tell us what your backend is, nor you gave some sample data and expected output nor you gave a sensible data schema :( This is an example based on guess of schema using postgreSQL as backend (would be too messy as a comment):
(I think you have Customer_Id, Trip_Date and LocationId in trips table?)
select * from trips t1
where not exists (
select * from trips t2
where t1.Customer_id = t2.Customer_id and
t1.Trip_Date > t2.Trip_Date
and t1.Trip_date - t2.Trip_Date < 30
);

SQL Server : get Count() of a related table column where some condition

Given tables CollegeMajors
| Id | Major |
|----|-------------|
| 1 | Accounting |
| 2 | Math |
| 3 | Engineering |
and EnrolledStudents
| Id | CollegeMajorId | Name | HasGraduated |
|----|----------------|-----------------|--------------|
| 1 | 1 | Grace Smith | 1 |
| 2 | 1 | Tony Fabio | 0 |
| 3 | 1 | Michael Ross | 1 |
| 4 | 3 | Fletcher Thomas | 1 |
| 5 | 2 | Dwayne Johnson | 0 |
I want to do a query like
Select
CollegeMajors.Major,
Count(select number of students who have graduated) AS TotalGraduated,
Count(select number of students who have not graduated) AS TotalNotGraduated
From
CollegeMajors
Inner Join
EnrolledStudents On EnrolledStudents.CollegeMajorId = CollegeMajors.Id
and I'm expecting these kind of results
| Major | TotalGraduated | TotalNotGraduated |
|-------------|----------------|-------------------|
| Accounting | 2 | 1 |
| Math | 0 | 1 |
| Engineering | 1 | 0 |
So the question is, what kind of query goes inside the COUNT to achieve the above?
Select CollegeMajors.Major
, COUNT(CASE WHEN EnrolledStudents.HasGraduated= 0 then 1 ELSE NULL END) as "TotalNotGraduated",
COUNT(CASE WHEN EnrolledStudents.HasGraduated = 1 then 1 ELSE NULL END) as "TotalGraduated"
From CollegeMajors
InnerJoin EnrolledStudents On EnrolledStudents.CollegeMajorId = CollegeMajors.Id
GROUP BY CollegeMajors.Major
You can use the CASE statement inside your COUNT to achieve the desired result.Please try the below updated query.
Select CollegeMajors.Major
, COUNT(CASE WHEN EnrolledStudents.HasGraduated= 0 then 1 ELSE NULL END) as "TotalNotGraduated",
COUNT(CASE WHEN EnrolledStudents.HasGraduated = 1 then 1 ELSE NULL END) as "TotalGraduated"
From CollegeMajors
InnerJoin EnrolledStudents On EnrolledStudents.CollegeMajorId = CollegeMajors.Id
GROUP BY CollegeMajors.Major
You can try this for graduated count:
Select Count(*) From EnrolledStudents group by CollegeMajorId having HasGraduated = 1
And change 1 to zero for not graduated ones:
Select Count(*) From EnrolledStudents group by CollegeMajorId having HasGraduated = 0

Window functions limited by value in separate column

I have a "responses" table in my postgres database that looks like
| id | question_id |
| 1 | 1 |
| 2 | 2 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
I want to produce a table with the response and question id, as well as the id of the previous response with that same question id, as such
| id | question_id | lag_resp_id |
| 1 | 1 | |
| 2 | 2 | |
| 3 | 1 | 1 |
| 4 | 2 | 2 |
| 5 | 2 | 4 |
Obviously pulling "lag(responses.id) over (order by responses.id)" will pull the previous response id regardless of question_id. I attempted the below subquery, but I know it is wrong since I am basically making a table of all lag ids for each question id in the subquery.
select
responses.question_id,
responses.id as response_id,
(select
lag(r2.id, 1) over (order by r2.id)
from
responses as r2
where
r2.question_id = responses.question_id
)
from
responses
I don't know if I'm on the right track with the subquery, or if I need to do something more advanced (which may involve "partition by", which I do not know how to use).
Any help would be hugely appreciated.
Use partition by. There is no need for a correlated subquery here.
select id,question_id,
lag(id) over (partition by question_id order by id) lag_resp_id
from responses

Get Data from two different tables on a condition

I got situation like below:
Posts and like tables are respectively:
----------------- --------------------------
| post_id | text | | post_id | like | person|
------------------- ---------------------------
| 1 | hello | | 1 | yes | Jhon |
| 2 | Haii | | 1 | yes | Sham |
| 3 | I am..| | 1 | yes | Ram |
-------------------- | 2 | yes | Mahe |
----------------------------
Now I want to get all posts and I want to know whether each post is liked by Sham or not.
So result will be:
-----------------------------------
| post_id | text | liked_by_Sham |
-----------------------------------
| 1 | hello | yes |
| 2 | Haii | no |
| 3 | I am | no |
------------------------------------
As I am new to SQL can anyone explain me how to do that. I tried it using Inner join but it doesn't work.
I tried with below query:
select posts.*,liketb.like
from posts
inner join liketb
on posts.post_id = liketb.post_id
where liketb.person = 'Sham';
This query is giving only posts liked by sham.
Use left join, with case. like is a keyword. Make sure it is properly escaped.
select p.post_id, p.text,
case when l.like = 'yes' then l.like else 'no' end as liked_by_sham
from posts p
left join liketb l on p.post_id = l.post_id and l.person = 'Sham';
Sql Fiddle Demo
You need put the filter clausule on the ON section to get all post with yes, no or null when no match. And use nvl to convert all NULL to 'no'
SELECT p."post_id", p."text", nvl(l."like", 'no') as liked_by_sham
FROM posts p
LEFT JOIN liketb l
ON p."post_id" = l."post_id"
AND l."person" = 'Sham'
OUTPUT
| post_id | text | LIKED_BY_SHAM |
|---------|--------|---------------|
| 1 | hello | yes |
| 3 | I am.. | no |
| 2 | Haii | no |

Create a pivot table from two tables based on dates

I have two MS Access tables sharing a one to many relationship. Their structures are like the following:
tbl_Persons
+----------+------------+-----------+
| PersonID | PersonName | OtherData |
+----------+------------+-----------+
| 1 | PersonA | etc. |
| 2 | PersonB | |
| 3 | PersonC | |
tbl_Visits
+----------+------------+------------+-----------------------
| VisitID | PersonID | VisitDate | dozens of other fields
+----------+------------+------------+-----------
| 1 | 1 | 09/01/13 |
| 2 | 1 | 09/02/13 |
| 3 | 2 | 09/03/13 |
| 4 | 2 | 09/04/13 | etc...
I wish to create a new table based on the VisitDate field, the column headings of which are Visit-n where n is 1 to the number of visits, Visit-n-Data1, Visit-n-Data2, Visit-n-Data3 etc.
MergedTable
+----------+----------+---------------+-----------------+----------+----------------+
| PersonID | Visit1 | Visit1Data1 | Visit1Data2... | Visit2 | Visit2Data1... |
+----------+----------+---------------+-----------
| 1 | 09/01/13 | | | 09/02/13 |
| 2 | 09/03/13 | | | 09/04/13 |
| 3 | etc. | |
I am really not sure how to do this. Whether SQL query or using DAO then looping through records and columns. It is essential that there is only 1 PersonID per row and all his data appears chronologically into columns.
Start of by ranking the visits with something like
SELECT PersonID, VisitID,
(SELECT COUNT(VisitID) FROM tbl_Visits AS C
WHERE C.PersonID = tbl_Visits.PersonID
AND C.VisitDate < tbl_Visits.VisitDate) AS RankNumber
FROM tbl_Visits
Use this query as a base for the 'pivot'
Since you seem to have some visits of persons on the same day (visit 1 and 2) the WHERE clause needs to be a bit more sophisticated. But I hope you get the basic concept.
Pivoting can be done with multiple LEFT JOINs.
I question if my solution will have a high performance, since I did not test it. It is easier in SQL Server than in MS Access to accomplish.