SQL: Add values to STDEVP calculation

I have the following table.
Key | Count | Amount
----| ----- | ------
1 | 2 | 10
1 | 2 | 15
2 | 5 | 1
2 | 5 | 2
2 | 5 | 3
2 | 5 | 50
2 | 5 | 20
3 | 3 | 5
3 | 3 | 4
3 | 3 | 5
Sorry, I couldn't figure out how to format the above as a table.
I'm running this on SQL Server Management Studio 2012.
I'd like the stdevp return of the amount columns but if the number of records is less than some value 'x' (there will never be more than x records for a given key), then I want to add zeros to account for the remainder.
For example, if 'x' is 6:
for key 1, I need stdevp(10,15,0,0,0,0)
for key 2, I need stdevp(1,2,3,50,20,0)
for key 3, I need stdevp(5,4,5,0,0,0)
I just need to be able to add zeros to the calculation. I could insert records to my table, but that seems rather tedious.

This seems complicated -- padding data for each key. Here is one approach:
-- key is a reserved word in SQL Server, so it is bracketed as [key]
with xs as (
      -- recursive CTE: one row per slot, n = 1..6 (x = 6)
      select 0 as val, 1 as n
      union all
      select 0, n + 1
      from xs
      where xs.n < 6
     )
select k.[key], stdevp(coalesce(t.amount, 0))
from xs cross join
     (select distinct [key] from t) k left join
     (select t.*, row_number() over (partition by [key] order by [key]) as seqnum
      from t
     ) t
     on t.[key] = k.[key] and t.seqnum = xs.n
group by k.[key];
The idea is that the cross join generates 6 rows for each key. Then the left join brings in available rows, up to the maximum.
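As a sanity check on what the padded query should return, here is a minimal Python sketch that pads each key's amounts with zeros and takes the population standard deviation. The names `rows`, `X` and `padded_stdevp` are made up for illustration, and `statistics.pstdev` stands in for SQL Server's STDEVP.

```python
from collections import defaultdict
from statistics import pstdev

X = 6  # assumed target size per key (the question's 'x')

# (key, amount) pairs copied from the question's table
rows = [(1, 10), (1, 15), (2, 1), (2, 2), (2, 3), (2, 50), (2, 20),
        (3, 5), (3, 4), (3, 5)]

def padded_stdevp(values, x=X):
    """Population standard deviation of values padded with zeros to length x."""
    padded = list(values) + [0] * (x - len(values))
    return pstdev(padded)

# Collect the amounts that belong to each key
amounts = defaultdict(list)
for key, amount in rows:
    amounts[key].append(amount)

result = {key: padded_stdevp(vals) for key, vals in amounts.items()}
for key, sd in sorted(result.items()):
    print(key, round(sd, 4))
```

Comparing these numbers with the query's output for a few keys is a cheap way to confirm that the join really pads every key to six rows.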

Related

Include zero counts when grouping by multiple columns

I have a table (TCAP) containing the gender (2 categories), race/ethnicity (3 categories), and height (integer in inches) for multiple individuals. For example:
GND RCE HGT
1 3 65
1 2 72
2 1 62
1 2 68
2 1 65
2 2 64
1 3 69
1 1 70
I want to get a count of the number of individuals in each possible gender and race/ethnicity combination. When I group by GND and RCE, however, it doesn't show zero counts. I've tried the following code:
SELECT
GND,
RCE,
COUNT(*) TotalRecords
FROM TCAP
GROUP BY GND, RCE;
This gives me:
GND RCE TotalRecords
1 1 1
1 2 2
1 3 2
2 1 2
2 2 1
I want it to show all possible combinations though. In other words, even though there are no individuals with a gender of 2 and race/ethnicity of 3 in the table, I want that to display as a zero count. So, like this:
GND RCE TotalRecords
1 1 1
1 2 2
1 3 2
2 1 2
2 2 1
2 3 0
I've looked at the responses to similar questions, but they are based on a single group, resolved using an outer join with a table that has all possible values. Would I use a similar process here? Would I create a single table that has all 6 combinations of GND and RCE to join on? Is there another way to accomplish this, especially if the number of combinations increases (for example, 1 group with 5 values and 1 group with 10 values)?
Any help would be much appreciated! Thanks!
You can use a CROSS JOIN to build every combination of the GND and RCE values, then do an OUTER JOIN based on it.
SELECT t1.GND,t1.RCE,COUNT(t3.GND) TotalRecords
FROM (
SELECT GND,RCE
FROM (
SELECT DISTINCT GND
FROM TCAP
) t1 CROSS JOIN
(
SELECT DISTINCT RCE FROM TCAP
) t2
) t1
LEFT JOIN TCAP t3 ON t3.GND = t1.GND and t3.RCE = t1.RCE
group by t1.GND,t1.RCE;
| GND | RCE | TotalRecords |
| --- | --- | ------------ |
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 3 | 2 |
| 2 | 1 | 2 |
| 2 | 2 | 1 |
| 2 | 3 | 0 |
Use a cross join to generate the rows and a left join to bring in the results -- with a final group by:
select g.gnd, r.rce, count(t.gnd) as cnt
from (select distinct gnd from tcap) g cross join
(select distinct rce from tcap) r left join
tcap t
on t.gnd = g.gnd and t.rce = r.rce
group by g.gnd, r.rce;
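To try the cross join plus left join pattern against the question's sample data, here is a small sketch that runs it in an in-memory SQLite database from Python. SQLite is only a stand-in here; the question does not say which database is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tcap (gnd INTEGER, rce INTEGER, hgt INTEGER)")
conn.executemany(
    "INSERT INTO tcap VALUES (?, ?, ?)",
    [(1, 3, 65), (1, 2, 72), (2, 1, 62), (1, 2, 68),
     (2, 1, 65), (2, 2, 64), (1, 3, 69), (1, 1, 70)],
)

query = """
SELECT g.gnd, r.rce, COUNT(t.gnd) AS cnt
FROM (SELECT DISTINCT gnd FROM tcap) g
CROSS JOIN (SELECT DISTINCT rce FROM tcap) r
LEFT JOIN tcap t ON t.gnd = g.gnd AND t.rce = r.rce
GROUP BY g.gnd, r.rce
ORDER BY g.gnd, r.rce
"""
# COUNT(t.gnd) skips the NULLs produced by the left join,
# so combinations with no matching rows come back as 0.
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

Note that the DISTINCT subqueries only see values that occur somewhere in the table. If a category value never appears at all, it would have to come from a separate lookup table instead.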

SQL advanced query counting the max value of a group

I want to create a query that will count the number of times the following condition is met.
I have a table that consists of multiple records with a matching foreign key. For each foreign key group, I want to check whether the highest value of another column occurs more than once within that group. If it does, that will up the count.
Data
--------------------------
ID | Foreign Key | Value
--------------------------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 2
4 | 2 | 0
5 | 2 | 2
6 | 2 | 1
7 | 3 | 0
8 | 3 | 1
9 | 3 | 1
The query I want should return the number 2. This is because the maximum value in group 1(Foreign Key) occurs twice, the value is 2. In group 2 the maximum value is 2 but only occurs once so this will not up the count. Then in group 3 the maximum value is 1 which occurs twice which will up the count. The count therefore ends up as two.
All credit goes to the comment from @Bob, but here is the SQL that solved this problem.
SELECT Count(1)
FROM (SELECT DISTINCT foreign_key
FROM (SELECT foreign_key,
Count(1)
FROM data
WHERE ( foreign_key, value ) IN (SELECT foreign_key,
Max(value)
FROM data
GROUP BY foreign_key)
GROUP BY foreign_key
HAVING Count(1) > 1) AS data) AS data;
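To check the expected answer of 2 against the sample data, here is a sketch that runs an equivalent query in an in-memory SQLite database from Python. It joins to the per-group maximum instead of using the row-value IN, purely as a portability choice on my part; the table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, foreign_key INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 1, 2),
     (4, 2, 0), (5, 2, 2), (6, 2, 1),
     (7, 3, 0), (8, 3, 1), (9, 3, 1)],
)

query = """
SELECT COUNT(*)
FROM (
    -- keep only the rows that hold their group's maximum value,
    -- then keep the groups where that maximum occurs more than once
    SELECT d.foreign_key
    FROM data d
    JOIN (SELECT foreign_key, MAX(value) AS max_value
          FROM data
          GROUP BY foreign_key) m
      ON m.foreign_key = d.foreign_key AND m.max_value = d.value
    GROUP BY d.foreign_key
    HAVING COUNT(*) > 1
) AS groups_with_repeated_max
"""
(n_groups,) = conn.execute(query).fetchone()
print(n_groups)
```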
This is one approach, using a window function to find each group's maximum:
select count(*)
from (select fk
      from (select t.*, max(val) over (partition by fk) as max_val_by_grp
            from tbl t) x
      where val = max_val_by_grp
      group by fk
      having count(*) > 1) y;

How to calculate the value of a previous row from the count of another column

I want to add a column with a running total: each row's count added to the sum from the preceding row. Below is the query. I tried using ROLLUP, but it does not serve the purpose.
select to_char(register_date,'YYYY-MM') as "registered_in_month"
,count(*) as Total_count
from CMSS.USERS_PROFILE a
where a.pcms_db != '*'
group by (to_char(register_date,'YYYY-MM'))
order by to_char(register_date,'YYYY-MM')
This is what I get:
registered_in_month TOTAL_COUNT
-------------------------------------
2005-01 1
2005-02 3
2005-04 8
2005-06 4
But what I would like to display is below, including the months which have count as 0
registered_in_month TOTAL_COUNT SUM
------------------------------------------
2005-01 1 1
2005-02 3 4
2005-03 0 4
2005-04 8 12
2005-05 0 12
2005-06 4 16
To include missing months in your result, you first need a complete list of months. To build it, find the earliest and latest months and then use a hierarchical query to generate every month in between.
with x(min_date, max_date) as (
select min(trunc(register_date,'month')),
max(trunc(register_date,'month'))
from users_profile
)
select add_months(min_date,level-1)
from x
connect by add_months(min_date,level-1) <= max_date;
Once you have all the months, you can outer join the list to your table. To get the cumulative sum, simply add up the counts using SUM as an analytic function.
with x(min_date, max_date) as (
select min(trunc(register_date,'month')),
max(trunc(register_date,'month'))
from users_profile
),
y(all_months) as (
select add_months(min_date,level-1)
from x
connect by add_months(min_date,level-1) <= max_date
)
select to_char(a.all_months,'yyyy-mm') registered_in_month,
count(b.register_date) total_count,
sum(count(b.register_date)) over (order by a.all_months) "sum"
from y a left outer join users_profile b
on a.all_months = trunc(b.register_date,'month')
group by a.all_months
order by a.all_months;
Output:
| REGISTERED_IN_MONTH | TOTAL_COUNT | SUM |
|---------------------|-------------|-----|
| 2005-01 | 1 | 1 |
| 2005-02 | 3 | 4 |
| 2005-03 | 0 | 4 |
| 2005-04 | 8 | 12 |
| 2005-05 | 0 | 12 |
| 2005-06 | 4 | 16 |
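The same three steps (generate every month, outer join, running total) can be sketched outside Oracle as well. The version below is an illustration in Python with an in-memory SQLite database: a recursive CTE stands in for CONNECT BY, and `itertools.accumulate` stands in for the analytic SUM. The individual dates are invented so that they reproduce the counts in the question.

```python
import sqlite3
from itertools import accumulate

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_profile (register_date TEXT)")
# Made-up dates matching the question's counts: 1, 3, 8 and 4 registrations
dates = ([("2005-01-15",)] + [("2005-02-10",)] * 3
         + [("2005-04-03",)] * 8 + [("2005-06-20",)] * 4)
conn.executemany("INSERT INTO users_profile VALUES (?)", dates)

query = """
WITH RECURSIVE
bounds(lo, hi) AS (
    SELECT MIN(date(register_date, 'start of month')),
           MAX(date(register_date, 'start of month'))
    FROM users_profile
),
months(m, hi) AS (
    SELECT lo, hi FROM bounds
    UNION ALL
    SELECT date(m, '+1 month'), hi FROM months WHERE m < hi
)
SELECT strftime('%Y-%m', months.m) AS registered_in_month,
       COUNT(u.register_date) AS total_count
FROM months
LEFT JOIN users_profile u
       ON date(u.register_date, 'start of month') = months.m
GROUP BY months.m
ORDER BY months.m
"""
monthly = conn.execute(query).fetchall()

# Running total over the month-ordered counts
running = list(accumulate(count for _, count in monthly))
report = [(month, count, total) for (month, count), total in zip(monthly, running)]
for line in report:
    print(line)
```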

Why does the IN operator return distinct rows when passed duplicate values (value1, value1, ...)?

Using SQL Server 2008
Why does the IN operator return distinct values when selecting duplicate values?
Table #temp
x | 1 | 2 | 3
--+------------+-------------+------------
1 | first 1 | first 2 | first 3
2 | Second 1 | second 2 | second 3
When I execute this query
SELECT * FROM #temp WHERE x IN (1,1)
it will return
x | 1 | 2 | 3
--+------------+-------------+------------
1 | first 1 | first 2 | first 3
How can I make it so it returns this instead:
x | 1 | 2 | 3
--+------------+-------------+------------
1 | first 1 | first 2 | first 3
1 | first 1 | first 2 | first 3
What is the alternative of IN in this case?
If you want to return duplicates, then you need to phrase the query as a join. The in is simply testing a condition on each row. Whether the condition is met once or twice doesn't matter -- the row either stays in or gets filtered out.
with xes as (
select 1 as x union all
select 1 as x
)
SELECT *
FROM #temp t join
xes
on t.x = xes.x;
EDIT:
If you have a subquery, then it is even simpler:
select *
from #temp t join
(<subquery>) s
on t.x = s.x
This would be a "normal" use of a join.
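The difference between filtering with IN and joining is easy to see on a two-row table. Here is a minimal sketch using an in-memory SQLite database from Python; the table name `temp_rows` and the column names `c1` to `c3` are placeholders for the question's #temp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_rows (x INTEGER, c1 TEXT, c2 TEXT, c3 TEXT)")
conn.executemany(
    "INSERT INTO temp_rows VALUES (?, ?, ?, ?)",
    [(1, "first 1", "first 2", "first 3"),
     (2, "Second 1", "second 2", "second 3")],
)

# IN is a per-row test: the row with x = 1 passes once, however often 1 is listed
with_in = conn.execute(
    "SELECT * FROM temp_rows WHERE x IN (1, 1)"
).fetchall()

# A join pairs the row with every matching value, so duplicates multiply it
with_join = conn.execute("""
    WITH xes(x) AS (SELECT 1 UNION ALL SELECT 1)
    SELECT t.*
    FROM temp_rows t
    JOIN xes ON t.x = xes.x
""").fetchall()

print(len(with_in), len(with_join))
```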

Help with optimising SQL query

Hi, I need some help with this problem.
I am working on a web application and I am using SQLite for the database. Can someone help me with a query that needs to be optimized (that is, made fast)?
I have table x:
ID | ID_DISH | ID_INGREDIENT
1 | 1 | 2
2 | 1 | 3
3 | 1 | 8
4 | 1 | 12
5 | 2 | 13
6 | 2 | 5
7 | 2 | 3
8 | 3 | 5
9 | 3 | 8
10| 3 | 2
....
ID_DISH is the ID of a dish, and ID_INGREDIENT is the ID of an ingredient the dish is made of.
So in my case the dish with ID 1 is made with the ingredients with IDs 2, 3, 8 and 12.
This table has more than 15,000 rows, and my question is:
I need a query that returns dish IDs ordered ascending by the number of ingredients that I haven't yet added to my algorithm.
Example: foo(2,4)
should return rows in this order:
ID_DISH | count(stillMissing)
10 | 2
1 | 3
The dish with ID 10 has ingredients 2 and 4 and is still missing 2 more, followed by the dish with ID 1, which is missing 3.
My query is:
SELECT
t2.ID_dish,
(SELECT COUNT(*) as c FROM dishIngredient as t1
WHERE t1.ID_ingredient NOT IN (2,4)
AND t1.ID_dish = t2.ID_dish
GROUP BY ID_dish) as c
FROM dishIngredient as t2
WHERE t2.ID_ingredient IN (2,4)
GROUP BY t2.ID_dish
ORDER BY c ASC
It works, but it is slow.
select ID_DISH, sum(ID_INGREDIENT not in (2, 4)) stillMissing
from x
group by ID_DISH
having stillMissing != count(*)
order by stillMissing
This is the solution: my previous query took 5 to 20 s, while this one takes about 80 ms.
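The trick in this query is that SQLite evaluates a comparison such as `ID_INGREDIENT NOT IN (2, 4)` to 0 or 1, so SUM over it counts the missing ingredients in a single pass. Here is a small sketch against the ten sample rows from the question, run through Python's `sqlite3` module. The HAVING clause repeats the expression instead of using the alias, which is just a conservative choice for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (ID INTEGER, ID_DISH INTEGER, ID_INGREDIENT INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 3), (3, 1, 8), (4, 1, 12),
     (5, 2, 13), (6, 2, 5), (7, 2, 3),
     (8, 3, 5), (9, 3, 8), (10, 3, 2)],
)

query = """
SELECT ID_DISH,
       SUM(ID_INGREDIENT NOT IN (2, 4)) AS stillMissing
FROM x
GROUP BY ID_DISH
-- drop dishes that contain none of the chosen ingredients
HAVING SUM(ID_INGREDIENT NOT IN (2, 4)) != COUNT(*)
ORDER BY stillMissing
"""
dishes = conn.execute(query).fetchall()
for dish_id, still_missing in dishes:
    print(dish_id, still_missing)
```

With only the sample rows, dish 2 is dropped because it contains neither ingredient 2 nor 4, and dish 3 sorts ahead of dish 1 because it is missing fewer ingredients.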
This is from memory, as I don't know the SQL dialect of sqlite.
SELECT DISTINCT T1.ID_DISH, COUNT(T1.ID_INGREDIENT) as COUNT
FROM dishIngredient as T1 LEFT JOIN dishIngredient as T2
ON T1.ID_DISH = T2.ID_DISH
WHERE T2.ID_INGREDIENT IN (2,4)
GROUP BY T1.ID_DISH
ORDER BY T1.ID_DISH