I'm trying to count rows in a set of data under multiple conditions, but am having trouble with it. I'm a beginner; I tried a simple count function and then looked into CASE WHEN, but had no luck with either. Does anyone know how I should go about this?
Here is an example of my table:
Name | Date | Status | Candy | Soda | Water
Nancy | 10/19/16 | active | 2 | 0 | 1
Lindsy| 10/20/15 | active | 0 | 1 | 0
Erica | 10/20/13 | active | 0 | 2 | 3
Lane | 10/19/14 | active | 0 | 0 | 4
Alexa | 10/19/16 | notactive | 0 | 5 | 1
Jenn | 10/19/16 | active | 0 | 0 | 0
I'm looking to do an overall count of the names under these conditions: at least one of candy, soda, or water is anything other than zero (it doesn't matter which column or how many, just that one of those three is not zero), the account is active, and the date falls within the last two years, 10/2014 - 10/2016.
I would want the query to tell me that the count total was 3 and also show me:
Name | Date | Status | Candy | Soda | Water
Nancy | 10/19/16 | active | 2 | 0 | 1
Lindsy| 10/20/15 | active | 0 | 1 | 0
Lane | 10/19/14 | active | 0 | 0 | 4
These are two different questions. The basic idea to get the rows is:
select t.*
from t
where greatest(candy, soda, water) > 0 and
status = 'active' and
date >= curdate() - interval 2 year;
(In Oracle, you would use sysdate rather than curdate().)
To get the count, you would use count(*) rather than * in the select. SQL queries only return one result set . . . so you either get all the rows or a single count.
SELECT *
FROM yourTable
WHERE (Candy > 0 OR Soda > 0 OR Water > 0) AND
Status = 'active' AND
Date BETWEEN '2014-10-01' AND SYSDATE
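To see both results (the matching rows and the single count) side by side, here is a minimal sketch that runs the same filter against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answers above: the table name t, dates stored as ISO strings, a fixed cutoff of 2014-10-01 in place of curdate()/SYSDATE so the result is repeatable, and <> 0 instead of > 0 to match "anything other than zero" literally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (name TEXT, date TEXT, status TEXT,"
    " candy INTEGER, soda INTEGER, water INTEGER)"
)
# Sample rows from the question, with dates rewritten as ISO strings (assumption).
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Nancy", "2016-10-19", "active", 2, 0, 1),
        ("Lindsy", "2015-10-20", "active", 0, 1, 0),
        ("Erica", "2013-10-20", "active", 0, 2, 3),
        ("Lane", "2014-10-19", "active", 0, 0, 4),
        ("Alexa", "2016-10-19", "notactive", 0, 5, 1),
        ("Jenn", "2016-10-19", "active", 0, 0, 0),
    ],
)

# One shared WHERE clause, used once for the rows and once for the count.
where = (
    "(candy <> 0 OR soda <> 0 OR water <> 0) "
    "AND status = 'active' "
    "AND date >= ?"
)
cutoff = "2014-10-01"  # assumed fixed start of the two-year window

rows = conn.execute(
    f"SELECT name FROM t WHERE {where} ORDER BY rowid", (cutoff,)
).fetchall()
total = conn.execute(
    f"SELECT COUNT(*) FROM t WHERE {where}", (cutoff,)
).fetchone()[0]

print([r[0] for r in rows], total)
```

The two statements share the filter but stay separate queries, which is the point made above: one query returns the rows, the other returns the count.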
I have a database of bets. Each bet has a 'Win', 'Loss', or 'Pending' state. What I want to do is to have an SQL statement that will get the last, say, 20 bets a user has placed, find out their ROI (Total profit / Total staked * 100).
So I'm just wondering if there is a better way to do this. Do I basically have to get the users table, loop over every user, get their last 20 bets, find the ROI and then order it. If my User table gets huge then this process is going to take ages, right?
Is creating a 'View' going to save on this time?
Is there a way to do this in one statement that won't cost my life in processing time?
Here are the tables
Users
| ID | User |
| 1 | Test1 |
| 2 | Test2 |
| 3 | Test3 |
| 4 | Test4 |
Bets
| ID | User | Amount | Odds | Result |
| 1 | 1 | 10 | 1.35 | Win |
| 2 | 1 | 25 | 2.55 | Win |
| 3 | 3 | 15 | 1.65 | Loss |
| 4 | 2 | 11 | 2.12 | Pending |
So essentially I would like a table that ranks them by ROI.
| User | AmountBet | AmountWon | ROI |
| 1 | 35 | 77 | 215 |
| 2 | 11 | 0 | 0 |
| 3 | 15 | 0 | 0 |
| 4 | 0 | 0 | 0 |
Assuming the ID of the bets table represents increasing time such that it can be used to identify "last 20", then
WITH b
AS
(
SELECT id,
user,
CASE WHEN result = 'Pending' THEN 0 ELSE amount END AS amount,
CASE WHEN result = 'Win' THEN amount * odds ELSE 0 END as winnings,
ROW_NUMBER() OVER (PARTITION BY user ORDER BY id DESC) AS rownum
FROM bets
)
SELECT user,
SUM(amount) AS amount_bet,
SUM(winnings) AS amount_won,
CASE
WHEN SUM(amount) > 0
THEN SUM(winnings) * 100 / SUM(amount)
ELSE 0
END AS roi
FROM b
WHERE rownum < 21
GROUP BY user;
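As a way to try this without a database server, here is a rough sketch of the same idea on an in-memory SQLite database driven from Python (window functions need SQLite 3.25 or newer). Two things are my own additions and not part of the query above: the LEFT JOIN from users, so that users without any bets still get a row, and the alias staked in place of amount. Like the query above, it treats pending bets as zero stake and computes winnings as a percentage of the stake.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, user TEXT);
    CREATE TABLE bets (id INTEGER PRIMARY KEY, user INTEGER,
                       amount REAL, odds REAL, result TEXT);
    INSERT INTO users VALUES (1,'Test1'),(2,'Test2'),(3,'Test3'),(4,'Test4');
    INSERT INTO bets VALUES
        (1, 1, 10, 1.35, 'Win'),
        (2, 1, 25, 2.55, 'Win'),
        (3, 3, 15, 1.65, 'Loss'),
        (4, 2, 11, 2.12, 'Pending');
""")

query = """
WITH b AS (
    SELECT user,
           CASE WHEN result = 'Pending' THEN 0 ELSE amount END AS staked,
           CASE WHEN result = 'Win' THEN amount * odds ELSE 0 END AS winnings,
           -- number each user's bets from newest (1) to oldest
           ROW_NUMBER() OVER (PARTITION BY user ORDER BY id DESC) AS rownum
    FROM bets
)
SELECT u.id,
       COALESCE(SUM(b.staked), 0)   AS amount_bet,
       COALESCE(SUM(b.winnings), 0) AS amount_won,
       CASE WHEN SUM(b.staked) > 0
            THEN SUM(b.winnings) * 100 / SUM(b.staked)
            ELSE 0
       END AS roi
FROM users u
LEFT JOIN b ON b.user = u.id AND b.rownum <= 20  -- keep only the last 20 bets
GROUP BY u.id
ORDER BY roi DESC, u.id
"""

ranking = conn.execute(query).fetchall()
for row in ranking:
    print(row)
```

Putting the rownum filter in the ON clause rather than in WHERE is what keeps users with no bets in the result; a WHERE filter on b.rownum would turn the LEFT JOIN back into an inner join.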
I want to group the rows on the basis of a specific condition.
The table structure is something like this
EmpID | EmpName | TaskId | A_Shift_Status | B_Shift_Status | C_Shift_Status | D_Shift_Status
1 | John | 1 | 1 | null | 2 | 1
1 | John | 2 | 1 | null | 1 | 1
2 | Mike | 3 | 1 | 1 | 2 | 1
2 | Mike | 4 | null | 1 | null | 1
3 | Steve | 5 | null | 1 | 2 | 1
3 | Steve | 6 | 1 | null | 2 | 1
The criteria will be
Done 1
Pending 2
NA 3
The expected output is to group the employees by task and the status will be on the following condition
if ALL tasks are done by any employee then the status will be done
(i.e. 1)
if ANY of the tasks is incomplete then the status will be
incomplete/pending (i.e. 2)
So the desired output will be
EmpID | EmpName | A_Shift_Status | B_Shift_Status | C_Shift_Status | D_Shift_Status
1 | John | 1 | null | 2 | 1
2 | Mike | 1 | 1 | 2 | 1
3 | Steve | 1 | 1 | 2 | 1
So in other terms summary/grouping should only show complete/done (i.e. 1) when all the rows of a particular shift column of an employee have status as complete/done (i.e. 1)
Based on your data (where the criteria are 1, 2 and NULL for n/a), a simple 'group by' the employee, and MAX of the columns, should work e.g.,
SELECT
yt.EmpID,
yt.EmpName,
MAX(yt.A_Shift_Status) AS A_Shift_Status,
MAX(yt.B_Shift_Status) AS B_Shift_Status,
MAX(yt.C_Shift_Status) AS C_Shift_Status,
MAX(yt.D_Shift_Status) AS D_Shift_Status
FROM
yourtable yt
GROUP BY
yt.EmpID,
yt.EmpName;
For the shift statuses
If any of them are 2, it returns 2
otherwise if any of them are 1, it returns 1
otherwise it returns NULL
Notes re 1/2/3 (which was specified as criteria) vs 1/2/NULL (which is in the data)
It gets a little trickier if the inputs are supposed to use 1/2/3 instead of 1/2/NULL. Let us know if you are changing the inputs to reflect that.
If the input is fine as NULLs, but you need the output to have '3' for n/a (nulls), you can put an ISNULL or COALESCE around the MAX statements e.g., ISNULL(MAX(yt.A_Shift_Status), 3) AS A_Shift_Status
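For anyone who wants to check this against the sample data, here is a small sketch using an in-memory SQLite database from Python. It runs the MAX version and the COALESCE(MAX(...), 3) variant side by side. The table name yourtable comes from the query above; using SQLite (and COALESCE rather than ISNULL) is my assumption, since the question does not name a DBMS.

```python
import sqlite3

shifts = ["A_Shift_Status", "B_Shift_Status", "C_Shift_Status", "D_Shift_Status"]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (EmpID INTEGER, EmpName TEXT, TaskId INTEGER, "
    + ", ".join(f"{c} INTEGER" for c in shifts)
    + ")"
)
# Sample rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "John", 1, 1, None, 2, 1),
        (1, "John", 2, 1, None, 1, 1),
        (2, "Mike", 3, 1, 1, 2, 1),
        (2, "Mike", 4, None, 1, None, 1),
        (3, "Steve", 5, None, 1, 2, 1),
        (3, "Steve", 6, 1, None, 2, 1),
    ],
)


def summarize(select_list):
    """Group by employee and apply the given aggregate expressions."""
    sql = (
        f"SELECT EmpID, EmpName, {select_list} FROM yourtable "
        "GROUP BY EmpID, EmpName ORDER BY EmpID"
    )
    return conn.execute(sql).fetchall()


# MAX ignores NULLs: 2 wins over 1, and all-NULL stays NULL.
summary = summarize(", ".join(f"MAX({c}) AS {c}" for c in shifts))
# Same aggregation, but all-NULL columns are reported as 3 (NA).
summary_na = summarize(
    ", ".join(f"COALESCE(MAX({c}), 3) AS {c}" for c in shifts)
)

print(summary)
print(summary_na)
```

The first result matches the desired output in the question (John's B shift stays NULL); the second only differs in showing 3 where every row for that employee and shift was NULL.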
I don't have much experience with SQL and am trying to learn. I came across the interview question below, and I can't tell whether I'm missing a piece of information needed to solve it or approaching the problem wrong.
This is the question:
We have the following two tables; below is their info:
POLICY (id as int, policy_content as varchar2)
POLICY_VOTES (vote as boolean, policy_id as int)
Write a single query that returns the policy_id, number of yes(true) votes and number of no(false) votes with a row for each policy up for a vote stored
My first thought when approaching this was to use a WITH clause to get the policy_ids and use an inner join to get the votes for yes and no but I can't find a way to make it work, which is what leads me to believe that there's another clause in SQL I'm not aware of or couldn't find that would make it easier. Either that or I'm thinking of the problem in the wrong way.
Good question.
I cannot answer too specifically, since you did not specify a DBMS, but what you will want to do is count, or conditionally sum, based on criteria. When you use an aggregate function like that alongside a non-aggregated column, you also need GROUP BY.
Here are two example tables I made with test data:
policy
| id | policy_content |
|----|----------------|
| 1 | foo |
| 2 | foo |
| 3 | foo |
| 4 | foo |
| 5 | foo |
policy votes
| vote | policy_id |
|------|-----------|
| yes | 1 |
| no | 1 |
| yes | 2 |
| yes | 2 |
| no | 3 |
| no | 3 |
| no | 4 |
| yes | 4 |
| yes | 5 |
| yes | 5 |
Using the below query:
SELECT
policy_votes.policy_id,
SUM(CASE WHEN vote = 'yes' THEN 1 ELSE 0 END) AS yes_votes,
SUM(CASE WHEN vote = 'no' THEN 1 ELSE 0 END) AS no_votes
FROM
policy_votes
GROUP BY
policy_votes.policy_id
You get:
| POLICY_ID | YES_VOTES | NO_VOTES |
|-----------|-----------|----------|
| 1 | 1 | 1 |
| 2 | 2 | 0 |
| 4 | 1 | 1 |
| 5 | 2 | 0 |
| 3 | 0 | 2 |
Here is an SQL Fiddle for you to try it out.
Try this:
select p.id, p.policy_content,
count(case when pv.vote = 'true' then 1 end) as number_of_yes,
count(case when pv.vote = 'false' then 1 end) as number_of_no
from policy p join policy_votes pv
on (p.id = pv.policy_id)
group by p.id, p.policy_content
Cheers!!
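One thing neither query above covers is a policy that has no votes yet: grouping policy_votes alone, or inner-joining to it, drops that policy entirely. If "a row for each policy" is meant to include those, a LEFT JOIN from policy is one way to do it. Here is a minimal sketch on an in-memory SQLite database from Python. The assumptions: SQLite as the engine, votes stored as 1 (yes) and 0 (no) because SQLite has no boolean type, and a hypothetical policy 6 with no votes added to the test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policy (id INTEGER PRIMARY KEY, policy_content TEXT);
    CREATE TABLE policy_votes (vote INTEGER, policy_id INTEGER);  -- 1 = yes, 0 = no
    -- policy 6 is hypothetical: it exists but has received no votes
    INSERT INTO policy VALUES
        (1,'foo'),(2,'foo'),(3,'foo'),(4,'foo'),(5,'foo'),(6,'foo');
    INSERT INTO policy_votes VALUES
        (1,1),(0,1),(1,2),(1,2),(0,3),(0,3),(0,4),(1,4),(1,5),(1,5);
""")

query = """
SELECT p.id AS policy_id,
       -- COUNT skips NULLs, so a CASE without ELSE counts only the matches
       COUNT(CASE WHEN pv.vote = 1 THEN 1 END) AS yes_votes,
       COUNT(CASE WHEN pv.vote = 0 THEN 1 END) AS no_votes
FROM policy p
LEFT JOIN policy_votes pv ON pv.policy_id = p.id
GROUP BY p.id
ORDER BY p.id
"""

tally = conn.execute(query).fetchall()
for row in tally:
    print(row)
```

For the unvoted policy the joined vote column is NULL, both CASE expressions return NULL, and COUNT reports 0 for yes and no.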
I need to create 7 datasets (local, web, call, local&call, local&web, call&web, all) depending on if the customer has used a channel from the below sample data.
| customer | call | local | web |
|----------|------|-------|-----|
| 1 | 1 | 1 | 1 |
| 1 | | 1 | 1 |
| 1 | | 1 | |
| 2 | 1 | | 1 |
| 2 | | 1 | |
| 2 | 1 | | |
| 3 | | | 1 |
| 3 | 1 | 1 | |
please see this picture for more details on the sample table
So if a customer has used all three channels in one instance and just one of them in another instance, then the rows with Customer=1 should go to the 'all' dataset. Similarly for 3, if he has used local and web in one instance and just web in another instance, then it should go to the local&web dataset.
Customer IDs should not be duplicated across datasets, i.e. customer 1 can belong to only one of the datasets.
I am stuck with this; can anyone give me a snippet of either SAS or SQL code to proceed further?
Thanks !
If all three go to "all", then use aggregation:
select customer,
(case when max(call) > 0 and max(local) > 0 and max(web) > 0 then 'all'
else concat_ws('&', (case when max(call) > 0 then 'call' end),
(case when max(local) > 0 then 'local' end),
(case when max(web) > 0 then 'web' end)
)
end) as grp
from t
group by customer;
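Here is a rough sketch of the same aggregation that can be run locally, using an in-memory SQLite database from Python. Since older SQLite versions have no concat_ws, the SQL only computes one used/not-used flag per channel and the label is assembled in Python; that split is my choice, not part of the query above. Customers 4 and 5 are hypothetical rows added so that something other than 'all' shows up, because all three customers in the sample data have used every channel at some point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (customer INTEGER, call INTEGER, local INTEGER, web INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 1), (1, None, 1, 1), (1, None, 1, None),
        (2, 1, None, 1), (2, None, 1, None), (2, 1, None, None),
        (3, None, None, 1), (3, 1, 1, None),
        (4, None, 1, None), (4, None, 1, 1),  # hypothetical customer
        (5, 1, None, None),                   # hypothetical customer
    ],
)

# One row per customer; each flag is 1 if the channel was ever used.
flags = conn.execute("""
    SELECT customer,
           COALESCE(MAX(call), 0)  > 0,
           COALESCE(MAX(local), 0) > 0,
           COALESCE(MAX(web), 0)   > 0
    FROM t
    GROUP BY customer
    ORDER BY customer
""").fetchall()


def label(call, local, web):
    """Return 'all' for three channels, else the used channels joined by '&'."""
    used = [
        name
        for name, flag in (("call", call), ("local", local), ("web", web))
        if flag
    ]
    return "all" if len(used) == 3 else "&".join(used)


groups = {customer: label(*channel_flags) for customer, *channel_flags in flags}
print(groups)
```

Because the grouping is per customer, each customer gets exactly one label, which satisfies the requirement that a customer ID appears in only one dataset.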
I have the following set of survey responses in a table.
It's not very clear but the numbers represent the 'satisfaction' level where:
0 = happy
1 = neutral
2 = sad
+----------+--------+-------+------+-----------+-------------------------+
| friendly | polite | clean | rate | recommend | booking_date |
+----------+--------+-------+------+-----------+-------------------------+
| 2 | 2 | 2 | 0 | 0 | 2014-02-03 00:00:00.000 |
| 1 | 2 | 0 | 0 | 2 | 2014-02-04 00:00:00.000 |
| 0 | 0 | 0 | 1 | 0 | 2014-02-04 00:00:00.000 |
| 1 | 1 | 2 | 0 | 2 | 2014-02-04 00:00:00.000 |
| 0 | 0 | 1 | 2 | 1 | 2014-02-04 00:00:00.000 |
| 2 | 2 | 0 | 2 | 0 | 2014-02-05 00:00:00.000 |
| 2 | 1 | 1 | 0 | 2 | 2014-02-05 00:00:00.000 |
| 1 | 0 | 1 | 2 | 0 | 2014-02-05 00:00:00.000 |
| 0 | 1 | 1 | 1 | 1 | 2014-02-05 00:00:00.000 |
| 1 | 0 | 2 | 2 | 0 | 2014-02-05 00:00:00.000 |
+----------+--------+-------+------+-----------+-------------------------+
For each day I need the totals of each of the columns matching each response option. This will answer the question: "How many people answered happy, neutral or sad for each of the available question options".
I would then require a recordset returned such as:
+------------+----------+------------+--------+----------+------------+--------+
| Date | FriHappy | FriNeutral | FriSad | PolHappy | PolNeutral | PolSad |
+------------+----------+------------+--------+----------+------------+--------+
| 2014-02-03 | 0 | 0 | 1 | 0 | 0 | 1 |
| 2014-02-04 | 2 | 2 | 0 | 2 | 1 | 1 |
| 2014-02-05 | 1 | 2 | 2 | 2 | 2 | 1 |
+------------+----------+------------+--------+----------+------------+--------+
This shows that on the 4th two responders answered "happy" for the "Polite?" question, one answered "Neutral" and one answered "sad".
On the 5th, one responder answered "happy" for the Friendly option, two chose "neutral" and two chose "sad".
I really wish to avoid doing this in code but my SQL isn't great. I did have a look around but couldn't find anything matching this specific requirement.
Obviously this is never going to work (nice if it did) but this may help explain:
SELECT cast(booking_date as date) [booking_date],
COUNT(friendly=0) [FriHappy],
COUNT(friendly=1) [FriNeutral],
COUNT(friendly=2) [FriSad]
FROM [u-rate-gatwick-qsm].[dbo].[Questions]
WHERE booking_date >= '2014-02-01'
AND booking_date <= '2014-03-01'
GROUP BY cast(booking_date as date)
Any pointers would be much appreciated.
Many thanks.
Here is a working version of your sample query:
SELECT cast(booking_date as date) as [booking_date],
sum(case when friendly = 0 then 1 else 0 end) as [FriHappy],
sum(case when friendly = 1 then 1 else 0 end) as [FriNeutral],
sum(case when friendly = 2 then 1 else 0 end) as [FriSad]
FROM [u-rate-gatwick-qsm].[dbo].[Questions]
WHERE booking_date >= '2014-02-01' AND booking_date <= '2014-03-01'
GROUP BY cast(booking_date as date)
ORDER BY min(booking_date);
Your expression count(friendly = 0) doesn't work in SQL Server. Even if it did, it would be the same as count(friendly) -- that is, the number of non-NULL values in the column. Remember what count() does. It counts the number of non-NULL values.
The above logic says: add 1 when there is a match to the appropriate friendly value.
By the way, SQL Server doesn't guarantee the ordering of results from an aggregation, so I also added an order by clause. The min(booking_date) is just an easy way of ordering by the date.
And, I didn't make the change, but I think the second condition in the where should be < rather than <= so you don't include bookings on March 1st (even one at exactly midnight).
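To check the numbers against the sample data, here is a minimal sketch of the same conditional aggregation on an in-memory SQLite database from Python, extended to the polite column and using < for the upper bound as suggested above. The assumptions: SQLite instead of SQL Server (so DATE() replaces the cast and the bracketed names are dropped), a table simply called questions, and only the friendly, polite and booking_date columns loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE questions (friendly INTEGER, polite INTEGER, booking_date TEXT)"
)
# (friendly, polite, day of February 2014) taken from the sample table.
sample = [
    (2, 2, 3),
    (1, 2, 4), (0, 0, 4), (1, 1, 4), (0, 0, 4),
    (2, 2, 5), (2, 1, 5), (1, 0, 5), (0, 1, 5), (1, 0, 5),
]
conn.executemany(
    "INSERT INTO questions VALUES (?, ?, ?)",
    [(f, p, f"2014-02-{d:02d} 00:00:00.000") for f, p, d in sample],
)

# Build one SUM(CASE ...) column per (question, answer) pair.
answers = [(0, "Happy"), (1, "Neutral"), (2, "Sad")]
columns = ", ".join(
    f"SUM(CASE WHEN {col} = {value} THEN 1 ELSE 0 END) AS {prefix}{name}"
    for col, prefix in (("friendly", "Fri"), ("polite", "Pol"))
    for value, name in answers
)

query = f"""
SELECT DATE(booking_date) AS day, {columns}
FROM questions
WHERE booking_date >= '2014-02-01'
  AND booking_date <  '2014-03-01'   -- half-open range: excludes March 1st
GROUP BY day
ORDER BY day
"""

totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Each row comes back as (day, FriHappy, FriNeutral, FriSad, PolHappy, PolNeutral, PolSad), in the same column order as the recordset requested in the question.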