How to select for all cases within a group? - sql

I’d like to get metrics on the number of cases in my table where my data fits different criteria. Currently, I have groups of 3+ that we’d like to inspect based on certain criteria. For the time being, I’d like to get counts of the following:
The breakdown of records in a group of 3+ where CHESS = White.
The breakdown of records in a group of 3+ where DATE like ‘%16’
The breakdown of records in a group of 3+ with all combinations of chess = white and DATE like ‘%16’
+---------+-------+------+
| GroupID | CHESS | DATE |
+---------+-------+------+
| 1       | White | n16  |
| 1       | Black | n16  |
| 1       | Black | n03  |
| 2       | White | n16  |
| 2       | White | n10  |
| 2       | White | n11  |
| 3       | Black | n12  |
| 3       | White | n14  |
| 3       | Black | n16  |
+---------+-------+------+
The output would be something like:
Chess count of 1 = 2
Chess count of 2 = 0
Chess count of 3 = 1
Date count of 1 = 2
Date count of 2 = 1
Date count of 3 = 0
Cases with Chess count of 1 and Date count of 1 = 1
Cases with Chess count of 2 and Date count of 2 = 0
Cases with Chess count of 3 and Date count of 3= 0
Cases with Chess count of 1 and Date of count 2 = 1
Cases with Chess count of 1 and Date of count 3 = 0
Cases with Chess count of 2 and Date of count 1 = 0
Cases with Chess count of 2 and Date of count 3= 0
Cases with Chess count of 3 and Date of count 1 = 1
Cases with Chess count of 3 and Date of count 2 = 0
Can this be done in a way that takes into account groups of any sizes, or would it have to be specific to the group size (for example would the query only work on groups of 3)?

You could do it like this, using GROUP BY CUBE together with GROUPING_ID:
with groups as (select count(case when chess = 'White' then 1 end) cnt_white,
count(case when tdate = 'n16' then 1 end) cnt_n16
from t group by groupid),
numbers as (select level grp from dual connect by level <= 3)
select white, n16, cnt
from (
select n1.grp white, n2.grp n16, count(cnt_white) cnt, grouping_id(n1.grp, n2.grp) gid
from numbers n1
cross join numbers n2
left join groups on n1.grp = cnt_white and n2.grp = cnt_n16
group by cube(n1.grp, n2.grp)
having grouping_id(n1.grp, n2.grp) <> 3 )
order by case when gid = 1 then 1 when gid = 2 then 2 when gid = 0 then 3 end, white, n16
rextester demo
Change the bound in "level <= 3" in the numbers subquery to handle groups of a different maximum size.
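If you just want to sanity-check the per-group counts outside Oracle, something like the following might help. It is only a sketch under several assumptions that are not in the answer above: it uses Python's built-in sqlite3 module with an in-memory table named t (column tdate, as in the query), matches dates with like '%16' as the question describes, and tallies with a Counter because SQLite has no GROUP BY CUBE. It does not depend on the group size, but it only reports combinations that actually occur (missing ones read as 0).

```python
import sqlite3
from collections import Counter

# Sample rows from the question: (groupid, chess, tdate).
rows = [
    (1, "White", "n16"), (1, "Black", "n16"), (1, "Black", "n03"),
    (2, "White", "n16"), (2, "White", "n10"), (2, "White", "n11"),
    (3, "Black", "n12"), (3, "White", "n14"), (3, "Black", "n16"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (groupid integer, chess text, tdate text)")
conn.executemany("insert into t values (?, ?, ?)", rows)

# Same idea as the "groups" subquery: one row per group holding both counts.
per_group = conn.execute(
    """
    select count(case when chess = 'White' then 1 end) as cnt_white,
           count(case when tdate like '%16' then 1 end) as cnt_n16
    from t
    group by groupid
    """
).fetchall()

# Tally how many groups share each count, separately and in combination.
white_counts = Counter(white for white, _ in per_group)
n16_counts = Counter(n16 for _, n16 in per_group)
combined_counts = Counter(per_group)  # keys are (cnt_white, cnt_n16) tuples

print("Chess:", dict(white_counts))
print("Date:", dict(n16_counts))
print("Combined:", dict(combined_counts))
```

A Counter returns 0 for keys it has never seen, so looking up white_counts[2] works even though no group has exactly two White rows.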

Related

query for column that are within a variable + or 1 of another column

I have a table that has 2 columns, and I am trying to determine a way to select the records where the two columns are CLOSE to one another. Maybe based on standard deviation if i can think about how to do that. But for now, this is what my table looks like:
ID| PCT | RETURN
1 | 20 | 1.20
2 | 15 | 0.90
3 | 0 | 3.00
The value in the pct field is a percentage (for example 20%). The value in the return field is a multiplier that has not been converted to a percentage yet (so 1.20 is supposed to mean 20% above the initial value). The query I am working with so far is this:
select * from TABLE1 where ((pct = ((return - 1)* 100)));
What I'd like to end up with are the rows where both are within a set value of each other. For example If they are within 5 points of each other, then the row would be returned and the output would be:
ID| PCT | RETURN
1 | 20 | 1.20
2 | 15 | 0.90
In the above, ID 1 should work out to be PCT = 20 and Return = 20, and ID 2, is PCT = 15 and RETURN = 10. Because it was within 5 points of each other, it was returned.
ID 3 was not returned because 0 and 200 are way above the 5 point threshold.
Is there any way to set a variable that would return a +- 5 when comparing the two values from the above attributes? Thanks.
RexTester Example:
Use LEAD() OVER (ORDER BY PCT) to look ahead to the next row and LAG() to look back to the previous row, then do the math and evaluate the results:
WITH CTE (ID, PCT , RETURN) as (
SELECT 1 , 20 , 1.20 FROM DUAL UNION ALL
SELECT 2 , 15 , 0.90 FROM DUAL UNION ALL
SELECT 3 , 0 , 3.00 FROM DUAL),
CTE2 as (SELECT A.*, LEAD(PCT) Over (ORDER BY PCT) LEADPCT, LAG(PCT) Over (order by PCT) LAGPCT
FROM CTE A)
SELECT * FROM CTE2
WHERE LEADPCT-PCT <=5 OR PCT-LAGPCT <=5
Order by ID
Giving us:
+----+----+-----+--------+---------+--------+
| | ID | PCT | RETURN | LEADPCT | LAGPCT |
+----+----+-----+--------+---------+--------+
| 1 | 1 | 20 | 1,20 | NULL | 15 |
| 2 | 2 | 15 | 0,90 | 20 | 0 |
+----+----+-----+--------+---------+--------+
or use the return value instead of PCT... just depends on what you're after. But maybe I don't fully understand the question..

SQL - How to get the count of each distinct value?

I have 3 tables:
**room**
room_id | nurse_needed
----------------------
1 | 2
2 | 3
3 | 1
**doctor_schedule**
doctor_schedule_id| room_id
---------------------------
1 | 1
2 | 2
3 | 3
**nurse_schedule**
nurse_schedule_id | doctor_schedule_id
--------------------------------------
1 | 1
2 | 1
3 | 2
Each room needs a certain number of nurses. A doctor works in a room, and a nurse works with a doctor's schedule. I want to count how many nurses are in each room.
The result should be:
room_id | nurse_needed|nurse_have_in_room
---------------------------------------------
1 | 2 | 2
2 | 3 | 1
3 | 1 | 0
Hmmm . . .
select r.*,
(select count(*)
from doctor_schedule ds join
nurse_schedule ns
on ds.doctor_schedule_id = ns.doctor_schedule_id
where ds.room_id = r.room_id
) as nurse_have_in_room
from room r;
select room.*,
(select count(*) from
doctor_schedule docs,
nurse_schedule nurs
where docs.doctor_schedule_id=nurs.doctor_schedule_id
and docs.room_id=room.room_id) as nurse_have_in_room
from room;
Result of the join on doctor_schedule_id between doctor_schedule and nurse_schedule:
nurse_schedule_id | doctor_schedule_id | room_id
------------------+--------------------+--------
1 | 1 | 1
2 | 1 | 1
3 | 2 | 2
For each room, we count the joined rows with that room_id to get the result.
select r.room_id,
r.nurse_needed,
ns.nurses_scheduled,
ns.dist_nurses_scheduled
from room r
left join (select ds.room_id,
count(1) nurses_scheduled,
count(distinct ns.nurse_schedule_id) dist_nurses_scheduled
from doctor_schedule ds
join nurse_schedule ns
on ds.doctor_schedule_id = ns.doctor_schedule_id
group by ds.room_id) as ns
on r.room_id = ns.room_id
Left join so you find rooms with no nurses scheduled.
Count(distinct ns.nurse_schedule_id) if needed to see how many different nurses make up the count.
Normally you have a time component in there too. Something like "where r.roomdate = ns.date"
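To check the expected result from the question, here is a small runnable sketch. It assumes Python's built-in sqlite3 module and an in-memory copy of the three tables (none of that is in the answers above), and it uses two left joins with count() on a nurse column so that rooms without nurses come out as 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table room (room_id integer, nurse_needed integer);
    create table doctor_schedule (doctor_schedule_id integer, room_id integer);
    create table nurse_schedule (nurse_schedule_id integer, doctor_schedule_id integer);

    insert into room values (1, 2), (2, 3), (3, 1);
    insert into doctor_schedule values (1, 1), (2, 2), (3, 3);
    insert into nurse_schedule values (1, 1), (2, 1), (3, 2);
    """
)

# count(ns.nurse_schedule_id) ignores the NULLs produced by the left joins,
# so a room whose doctor has no nurses is reported with 0.
result = conn.execute(
    """
    select r.room_id,
           r.nurse_needed,
           count(ns.nurse_schedule_id) as nurse_have_in_room
    from room r
    left join doctor_schedule ds on ds.room_id = r.room_id
    left join nurse_schedule ns on ns.doctor_schedule_id = ds.doctor_schedule_id
    group by r.room_id, r.nurse_needed
    order by r.room_id
    """
).fetchall()

for row in result:
    print(row)
```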

Looking for SQL-Query to select images with similar colors

I created the following SQLite table and populated it with information about the frequency of the different pixel colors in a set of images that I analyzed. I'd like to select images by similar colors.
I was inspired by a project by Matthew Mueller (http://research.cs.wisc.edu/vision/piximilar/), reengineered a similar website, and am about to change the search pattern he suggests.
Each image consists of 100 pixels and hence the sum of the columns color1 ... color6 is always 100.
id int | filename text | color1 int | color2 int | color3 int | color4 int | color5 int | color6 int |
------------------------------------------------------------------------------------------------------
1 | 1.bmp | 23 | 25 | 50 | 0 | 0 | 0 |
2 | 2.bmp | 25 | 12 | 11 | 2 | 37 | 13 |
3 | 3.bmp | 15 | 16 | 17 | 18 | 19 | 15 |
4 | 4.bmp | 0 | 100 | 0 | 0 | 0 | 0 |
...
I'm trying to write an SQL query to select all tuples where
a) one of any of the columns has a frequency above a certain threshold.
Example with DB above: threshold = 40 --> rows with ids 1 and 4 are selected.
b) the sum of two of any of the columns is above a certain threshold.
Example with DB above: threshold = 60 --> rows with ids 1, 2 and 4 are returned
c) rows are sorted according to how «close» / «similar» they are to a certain tuple.
Example with DB above: «closeness» to id 2 is goal:
Resulting order: 2, 3, 1, 4
I would appreciate your suggestions for good queries a, b and c very much.
Thanks, Dani
I think your queries will be easier to write if you normalize your tables
files
file_id, filename
1, 1.bmp
2, 2.bmp
file_colors
file_id, color_id, color_value
1, 1, 23
1, 2, 25
1, 3, 50
1, 4, 0
1, 5, 0
a) Any 1 color above a certain value
select file_id from file_colors
group by file_id
having count(case when color_value >= 40 then 1 end) > 0
b) Any sum of 2 colors above a certain value
select distinct t1.file_id from file_colors t1
join file_colors t2 on t1.file_id = t2.file_id
where t1.color_id <> t2.color_id
and t1.color_value + t2.color_value >= 60
c) You didn't define 'difference'. The query below calculates it as the sum of the absolute distance for each color.
select t1.file_id
from file_colors t1
join file_colors t2 on t2.file_id = 2 and t2.color_id = t1.color_id
group by t1.file_id
order by sum(abs(t1.color_value - t2.color_value))
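Since the question mentions SQLite, the three queries can be tried against the sample rows roughly like this. Treat it as a sketch: the use of Python's sqlite3 module, the in-memory database, the ids helper and the bound parameters are my assumptions, and file_id is qualified as t1.file_id in query b because the self-join would otherwise make the column ambiguous.

```python
import sqlite3

# Wide rows from the question: id -> [color1 .. color6].
images = {
    1: [23, 25, 50, 0, 0, 0],
    2: [25, 12, 11, 2, 37, 13],
    3: [15, 16, 17, 18, 19, 15],
    4: [0, 100, 0, 0, 0, 0],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table file_colors (file_id integer, color_id integer, color_value integer)"
)
# Unpivot each wide row into one (file_id, color_id, color_value) row per color.
conn.executemany(
    "insert into file_colors values (?, ?, ?)",
    [(fid, cid, val)
     for fid, vals in images.items()
     for cid, val in enumerate(vals, start=1)],
)


def ids(sql, params=()):
    """Hypothetical helper: run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql, params)]


# a) any single color at or above the threshold
single = ids(
    """
    select file_id from file_colors
    group by file_id
    having count(case when color_value >= ? then 1 end) > 0
    order by file_id
    """,
    (40,),
)

# b) any two different colors summing to at least the threshold
pair = ids(
    """
    select distinct t1.file_id
    from file_colors t1
    join file_colors t2 on t1.file_id = t2.file_id
    where t1.color_id <> t2.color_id
      and t1.color_value + t2.color_value >= ?
    order by t1.file_id
    """,
    (60,),
)

# c) order by summed absolute per-color distance to a reference image
closest = ids(
    """
    select t1.file_id
    from file_colors t1
    join file_colors t2 on t2.file_id = ? and t2.color_id = t1.color_id
    group by t1.file_id
    order by sum(abs(t1.color_value - t2.color_value)), t1.file_id
    """,
    (2,),
)

print(single, pair, closest)
```

With the four sample rows this reproduces the selections and the ordering given in the question's examples.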

How to calculate the value of a previous row from the count of another column

I want to create an additional column which calculates the value of a row from count column with its predecessor row from the sum column. Below is the query. I tried using ROLLUP but it does not serve the purpose.
select to_char(register_date,'YYYY-MM') as "registered_in_month"
,count(*) as Total_count
from CMSS.USERS_PROFILE a
where a.pcms_db != '*'
group by (to_char(register_date,'YYYY-MM'))
order by to_char(register_date,'YYYY-MM')
This is what i get
registered_in_month TOTAL_COUNT
-------------------------------------
2005-01 1
2005-02 3
2005-04 8
2005-06 4
But what I would like to display is below, including the months which have count as 0
registered_in_month TOTAL_COUNT SUM
------------------------------------------
2005-01 1 1
2005-02 3 4
2005-03 0 4
2005-04 8 12
2005-05 0 12
2005-06 4 16
To include missing months in your result, you first need a complete list of months. To get it, find the earliest and latest month and then use a hierarchical query to generate the complete list.
SQL Fiddle
with x(min_date, max_date) as (
select min(trunc(register_date,'month')),
max(trunc(register_date,'month'))
from users_profile
)
select add_months(min_date,level-1)
from x
connect by add_months(min_date,level-1) <= max_date;
Once you have all the months, you can outer join that list to your table. To get the cumulative sum, simply add up the counts using SUM as an analytic function.
with x(min_date, max_date) as (
select min(trunc(register_date,'month')),
max(trunc(register_date,'month'))
from users_profile
),
y(all_months) as (
select add_months(min_date,level-1)
from x
connect by add_months(min_date,level-1) <= max_date
)
select to_char(a.all_months,'yyyy-mm') registered_in_month,
count(b.register_date) total_count,
sum(count(b.register_date)) over (order by a.all_months) "sum"
from y a left outer join users_profile b
on a.all_months = trunc(b.register_date,'month')
group by a.all_months
order by a.all_months;
Output:
| REGISTERED_IN_MONTH | TOTAL_COUNT | SUM |
|---------------------|-------------|-----|
| 2005-01 | 1 | 1 |
| 2005-02 | 3 | 4 |
| 2005-03 | 0 | 4 |
| 2005-04 | 8 | 12 |
| 2005-05 | 0 | 12 |
| 2005-06 | 4 | 16 |
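The same two steps (generate every month, then take a running total) can be tried without an Oracle instance. The following is only a rough equivalent under my own assumptions: Python's sqlite3 module, an in-memory users_profile table with made-up registration days that match the monthly counts in the question, a recursive CTE in place of CONNECT BY, and a SQLite build recent enough (3.25 or later) to support window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users_profile (register_date text)")

# Made-up days; only the month of each registration matters here.
dates = (["2005-01-15"] + ["2005-02-03"] * 3
         + ["2005-04-20"] * 8 + ["2005-06-09"] * 4)
conn.executemany("insert into users_profile values (?)", [(d,) for d in dates])

monthly = conn.execute(
    """
    with recursive bounds as (
        -- earliest and latest month present in the data
        select min(date(register_date, 'start of month')) as min_month,
               max(date(register_date, 'start of month')) as max_month
        from users_profile
    ),
    months(month_start) as (
        -- every month between the bounds, including empty ones
        select min_month from bounds
        union all
        select date(month_start, '+1 month')
        from months, bounds
        where month_start < max_month
    ),
    counts as (
        -- outer join so empty months get a count of 0
        select m.month_start, count(u.register_date) as total_count
        from months m
        left join users_profile u
          on date(u.register_date, 'start of month') = m.month_start
        group by m.month_start
    )
    select strftime('%Y-%m', month_start) as registered_in_month,
           total_count,
           sum(total_count) over (order by month_start) as running_sum
    from counts
    order by month_start
    """
).fetchall()

for row in monthly:
    print(row)
```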

SQL join problems - users betting on matches

I have the following table:
scores:
user_id | match_id | points
1 | 110 | 4
1 | 111 | 3
1 | 112 | 3
2 | 111 | 2
Users bet on matches and depending on the result of the match they are awarded with points. Depending on how accurate the bet was you are either awarded with 0, 2, 3 or 4 points for a match.
Now I want to rank the users so that i can see who is in 1st, 2nd place etc...
The ranking order is firstly by total_points. If these are equal its ordered by the amount of times a user has scored 4 points then by the amount of times a user scored 3 points and so on.
For that i would need the following table:
user_id | total_points | #_of_fours | #_of_threes | #_of_twos
1 | 10 | 1 | 2 | 0
2 | 2 | 0 | 0 | 1
But i cant figure out the join statements which would help me get it.
This is as far as i get without help:
SELECT user_id, COUNT( points ) AS #_of_fours FROM scores WHERE points = 4 GROUP BY user_id
Which results in
user_id | #_of_fours
1 | 1
2 | 0
Now I would have to do that for #_of_threes and #_of_twos as well as total points and join it all together, but I can't figure out how.
BTW I'm using MySQL.
Any help would be really appreciated. Thanks in advance.
-- "#" starts a comment in MySQL, so the aliases below avoid it
SELECT user_id
, sum(points) as total_points
, sum(case when points = 4 then 1 else 0 end) AS no_of_fours
, sum(case when points = 3 then 1 else 0 end) AS no_of_threes
, sum(case when points = 2 then 1 else 0 end) AS no_of_twos
FROM scores
GROUP BY
user_id
Using MySQL syntax, you can use SUM to count the matching rows easily, because a comparison such as points=4 evaluates to 1 or 0:
SELECT
user_id,
SUM(points) AS total_points,
SUM(points=4) AS no_of_fours,
SUM(points=3) AS no_of_threes,
SUM(points=2) AS no_of_twos
FROM Table1
GROUP BY user_id;
Demo here.
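Neither query above sorts the result, so for the actual ranking you would still add an ORDER BY on the aggregated columns. Here is a minimal sketch of that, assuming Python's sqlite3 module and an in-memory scores table instead of MySQL (the portable CASE form is used so the same SQL should also run there).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table scores (user_id integer, match_id integer, points integer)")
conn.executemany(
    "insert into scores values (?, ?, ?)",
    [(1, 110, 4), (1, 111, 3), (1, 112, 3), (2, 111, 2)],
)

# One row per user, best-ranked first: total points, then the number of
# 4-point bets, then 3-point bets, then 2-point bets as tie-breakers.
ranking = conn.execute(
    """
    select user_id,
           sum(points) as total_points,
           sum(case when points = 4 then 1 else 0 end) as no_of_fours,
           sum(case when points = 3 then 1 else 0 end) as no_of_threes,
           sum(case when points = 2 then 1 else 0 end) as no_of_twos
    from scores
    group by user_id
    order by total_points desc,
             no_of_fours desc,
             no_of_threes desc,
             no_of_twos desc
    """
).fetchall()

for place, row in enumerate(ranking, start=1):
    print(place, row)
```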