Ignite: SQL query to calculate probability of a column


Gender
-------
Female
Male
Male
Male
Female
Female
Male
Female
I want to calculate the probability of each value in the gender column. I tried the following query, but it doesn't work:
SELECT (count(*)/(SELECT count(*) from DIABETIC_TOPIC) as probability from DIABETIC_TOPIC group by gender order by gender;
What did I miss?

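Your query is missing the closing parenthesis after the subquery. I'd cross join the grouped query with a non-grouped query and divide the counts. The grouped count needs an alias (cnt_gender), and multiplying by 100.0 before dividing avoids integer division:
SELECT gender, cnt_gender * 100.0 / cnt AS probability
FROM (SELECT gender, COUNT(*) AS cnt_gender
FROM diabetic_topic
GROUP BY gender) a
CROSS JOIN (SELECT COUNT(*) AS cnt
FROM diabetic_topic) b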
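
The cross-join approach above can be tried out with a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in engine rather than Ignite, loads the eight sample rows from the question, and returns a probability between 0 and 1 instead of a percentage:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Ignite (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diabetic_topic (gender TEXT)")

# The eight sample rows from the question.
genders = ["Female", "Male", "Male", "Male", "Female", "Female", "Male", "Female"]
conn.executemany(
    "INSERT INTO diabetic_topic (gender) VALUES (?)", [(g,) for g in genders]
)

# Grouped counts cross-joined with the single-row total.
# Multiplying by 1.0 forces floating-point division.
query = """
SELECT a.gender, a.cnt_gender * 1.0 / b.cnt AS probability
FROM (SELECT gender, COUNT(*) AS cnt_gender
      FROM diabetic_topic
      GROUP BY gender) a
CROSS JOIN (SELECT COUNT(*) AS cnt
            FROM diabetic_topic) b
ORDER BY a.gender
"""
gender_probabilities = conn.execute(query).fetchall()

for gender, probability in gender_probabilities:
    print(gender, probability)
```

With four Female and four Male rows, each gender comes out at 0.5.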

Related

Distinct on specific columns in SQL

I know similar questions have already been asked here. However, most of them still want to return the first or last row when multiple rows share the same attributes. In my case, I simply want to discard the rows that share the same specific attributes.
For example, I have a toy dataset like this:
gender age name
f 20 zoe
f 20 natalia
m 39 tom
f 20 erika
m 37 eric
m 37 shane
f 22 jenn
I want to look only at gender and age and discard every row whose gender/age combination appears more than once, which returns:
gender age name
m 39 tom
f 22 jenn

You could use the window (analytic) variant of COUNT to find the rows that have just one occurrence of the gender/age combination:
SELECT gender, age, name
FROM (SELECT gender, age, name, COUNT(*) OVER (PARTITION BY gender, age) AS cnt
FROM mytable) t
WHERE cnt = 1

Use the HAVING clause in a CTE.
;WITH DistinctGenderAges AS
(
SELECT gender
,age
FROM YourTable
GROUP BY gender
,age
HAVING COUNT(*) = 1
)
SELECT yt.gender, yt.age, yt.name
FROM DistinctGenderAges dga
INNER JOIN YourTable yt ON dga.gender = yt.gender AND dga.age = yt.age

No matter what, you have to tell the database which value to pick for name. If you don't care, an easy solution is to group:
SELECT gender, age, MIN(name) as name FROM mytable GROUP BY gender, age HAVING COUNT(*)=1
You can use any valid aggregate for name, but you have to pick something.
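
As a rough check of the GROUP BY / HAVING approach, here is a small sketch that loads the toy dataset from the question into an in-memory SQLite database (an assumption; the question does not name an engine) and keeps only the gender/age combinations that occur once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (gender TEXT, age INTEGER, name TEXT)")

# Toy dataset copied from the question.
rows = [
    ("f", 20, "zoe"),
    ("f", 20, "natalia"),
    ("m", 39, "tom"),
    ("f", 20, "erika"),
    ("m", 37, "eric"),
    ("m", 37, "shane"),
    ("f", 22, "jenn"),
]
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

# Groups with exactly one row survive; MIN(name) is then that row's name.
query = """
SELECT gender, age, MIN(name) AS name
FROM mytable
GROUP BY gender, age
HAVING COUNT(*) = 1
ORDER BY gender, age
"""
unique_rows = conn.execute(query).fetchall()
print(unique_rows)
```

Because each surviving group contains a single row, the choice of aggregate for name does not change the result here.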

How do you convert to percentage by using COUNT in SQL?

For example, if I want to list the percentages of males and females in the 'Employee' table.
Is this right?
SELECT Sex, COUNT (Sex) AS [%]
FROM Employees
GROUP BY Sex;
And what if I want to list the gender that is less than 50%? Is the following correct?
SELECT Sex, COUNT (Sex) AS [%]
FROM Employees
GROUP BY Sex
HAVING COUNT (Sex) < 50%
Thank you.

I believe this is what you're looking for.
SELECT Sex, (Count(Sex)* 100 / (SELECT Count(*) FROM Employees)) as MyPercentage
FROM Employees
GROUP BY Sex
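Then, to keep only the groups below 50%, add a HAVING clause. Some dialects (MySQL, for example) accept the alias there; others (SQL Server, for example) require repeating the expression:
HAVING Count(Sex) * 100 / (SELECT Count(*) FROM Employees) < 50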

No. Most versions of SQL support window functions. You can calculate the percentages using the following:
SELECT Sex, COUNT(Sex)/(sum(count(sex)) over ()) AS [%]
FROM Employees
GROUP BY Sex;
(I'm leaving out the 100*, because I'm not sure if you want a percentage between 0 and 100 or a probability between 0 and 1.)
Some versions of SQL do integer division, in which case you need to convert this to a decimal or float:
SELECT Sex, cast(COUNT(Sex) as float)/(sum(count(sex)) over ()) AS [%]
FROM Employees
GROUP BY Sex;
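
The integer-division pitfall mentioned above can be seen in a short sketch. It assumes SQLite, which divides integers as integers, and a hypothetical four-row Employees table. To keep it runnable on older SQLite builds it uses a scalar subquery for the total rather than the window function from the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Sex TEXT)")

# Hypothetical sample data: one F and three M.
conn.executemany("INSERT INTO Employees VALUES (?)", [("F",), ("M",), ("M",), ("M",)])

# int_ratio divides two integers; real_ratio casts the numerator first.
query = """
SELECT Sex,
       COUNT(Sex) / (SELECT COUNT(*) FROM Employees) AS int_ratio,
       CAST(COUNT(Sex) AS REAL) / (SELECT COUNT(*) FROM Employees) AS real_ratio
FROM Employees
GROUP BY Sex
ORDER BY Sex
"""
ratios = conn.execute(query).fetchall()

for sex, int_ratio, real_ratio in ratios:
    print(sex, int_ratio, real_ratio)
```

In this engine the uncast ratio truncates to 0 for both groups, while the cast version gives 0.25 and 0.75.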

Simple SQL query for Min and Max

I am trying to find the age of the oldest and youngest male and female patients, along with the average age of male and female patients, in the clinic where I work. I am new to SQL, but I believe it all comes from one table named "Patients". The Patients table has a Gender column containing either M for male or F for female, and there is also an Age column. I am guessing this is really simple and I am just making it too complicated, but could someone help me out?
My query is pretty limited. I know that you can do something along the lines of:
Select
Min(AGE) AS AGEMIN,
MAX(AGE) AS AGEMAX
From Patients

Use the GROUP BY clause:
select * from #MyTable
M 10
M 15
M 20
F 30
F 35
F 40
select Gender, MIN(Age), MAX(Age), AVG(Age)
from #MyTable
group by Gender
F 30 40 35
M 10 20 15

Here you go
SELECT gender, AVG(age) as avgage, MAX(age) as maxage, MIN(age) as minage
FROM patients
group by gender;
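
A minimal sketch of the grouped aggregate, assuming SQLite as the engine and reusing the six sample rows shown in the previous answer (the table and column names follow the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (gender TEXT, age INTEGER)")

# Sample rows taken from the answer above.
rows = [("M", 10), ("M", 15), ("M", 20), ("F", 30), ("F", 35), ("F", 40)]
conn.executemany("INSERT INTO patients VALUES (?, ?)", rows)

# One output row per gender with youngest, oldest, and average age.
query = """
SELECT gender, MIN(age) AS minage, MAX(age) AS maxage, AVG(age) AS avgage
FROM patients
GROUP BY gender
ORDER BY gender
"""
age_stats = conn.execute(query).fetchall()

for gender, minage, maxage, avgage in age_stats:
    print(gender, minage, maxage, avgage)
```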

SQL query that limits only a subset of the records

Let's say I have a table of people, with columns name, gender (M/F), and age.
What would a SQL query look like that returns:
all female people
a maximum of 5 male people
people sorted by age
NB. This is a contrived example. Also, Postgres-specific answers welcome.

SELECT name, gender, age
FROM ( SELECT name, gender, age
FROM people
WHERE gender = 'F'
UNION ALL
( SELECT name, gender, age
FROM people
WHERE gender = 'M'
LIMIT 5
)
) x
ORDER BY age
Note the above solution doesn't pick any particular males. Apply an ordering to the male subquery if you want that.
This one orders the males by age before the pruning takes place:
SELECT name, gender, age
FROM ( SELECT name, gender, age
,ROW_NUMBER() OVER (PARTITION BY gender ORDER BY age) gender_count
FROM people
) x
WHERE gender = 'F'
OR gender_count <= 5
ORDER BY age
BTW, I've found "gender" is usually used for grammatical references. In this case "sex" would have been the terminology I would have used.
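
The UNION ALL approach can be sketched as follows. This assumes SQLite rather than Postgres, so the limited branch is wrapped in a derived table instead of parentheses, and it orders the males by age before applying the limit. The sample people are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, gender TEXT, age INTEGER)")

# Hypothetical data: three females and seven males, all with distinct ages.
rows = [
    ("ann", "F", 31), ("bea", "F", 24), ("cat", "F", 45),
    ("al", "M", 20), ("bob", "M", 22), ("cal", "M", 25), ("dan", "M", 30),
    ("ed", "M", 35), ("fred", "M", 40), ("gus", "M", 50),
]
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

# All females, plus the five youngest males, sorted by age overall.
query = """
SELECT name, gender, age
FROM people
WHERE gender = 'F'
UNION ALL
SELECT name, gender, age
FROM (SELECT name, gender, age
      FROM people
      WHERE gender = 'M'
      ORDER BY age
      LIMIT 5)
ORDER BY age
"""
selected = conn.execute(query).fetchall()

for name, gender, age in selected:
    print(name, gender, age)
```

The final ORDER BY applies to the combined result, so the two oldest males (fred and gus) are dropped and the remaining eight rows come back in age order.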

count columns group by

I have the SQL below:
select a.dept, a.name
from students a
group by dept, name
order by dept, name
And get the result:
dept name
-----+---------
CS | Aarthi
CS | Hansan
EE | S.F
EE | Nikke2
I want to show the number of students in each dept, as below:
dept name count
-----+-----------+------
CS | Aarthi | 2
CS | Hansan | 2
EE | S.F | 2
EE | Nikke2 | 2
Math | Joel | 1
How should I write the SQL?

It appears you are not showing all the tables; I can only assume there is another table of actual enrollment per student. To count the students per department:
select a.Dept, count(*) as TotalStudents
from students a
group by a.Dept
If you want the total count of each department associated with every student (which doesn't make sense), you'll probably have to do it like...
select a.Dept, a.Name, b.TotalStudents
from students a,
( select Dept, count(*) TotalStudents
from students
group by Dept ) b
where a.Dept = b.Dept
My interpretation of your "Name" column is the student's name and not that of the actual instructor of the class hence my sub-select / join. Otherwise, like others, just using the COUNT(*) as a third column was all you needed.

select a.dept, a.name,
(SELECT count(*)
FROM students
WHERE dept = a.dept)
from students a
group by dept, name
order by dept, name
This is a somewhat questionable query, since you get duplicate copies of the department counts. It would be cleaner to fetch the student list and the department counts as separate results. Of course, there may be pragmatic reasons to go the other way, so this isn't an absolute rule.

SELECT dept, name, COUNT(name) as CT from students
group by dept, name
order by dept, name

This should do it (I don't have an environment to test on at the moment):
select a.dept, a.name, count(a.*) as NumOfStudents
from students a
group by dept, name order by dept, name
HTH

Or simply write:
select dept, name, count(name) as nostud from students group by dept, name order by dept, name

This will give the results requested above
select a.dept, a.name, b.cnt
from students a
join (
select dept, count(1) as cnt
from students
group by dept
) b on b.dept = a.dept
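
The join against a per-department count can be checked with a small sketch. It assumes SQLite and a students table holding the five rows implied by the desired output in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (dept TEXT, name TEXT)")

# Rows taken from the desired output in the question.
rows = [
    ("CS", "Aarthi"), ("CS", "Hansan"),
    ("EE", "S.F"), ("EE", "Nikke2"),
    ("Math", "Joel"),
]
conn.executemany("INSERT INTO students VALUES (?, ?)", rows)

# Each student row is paired with the size of its department.
query = """
SELECT a.dept, a.name, b.cnt
FROM students a
JOIN (SELECT dept, COUNT(*) AS cnt
      FROM students
      GROUP BY dept) b ON b.dept = a.dept
ORDER BY a.dept, a.name
"""
dept_counts = conn.execute(query).fetchall()

for dept, name, cnt in dept_counts:
    print(dept, name, cnt)
```

Grouping by both dept and name, as some of the other answers do, would instead count rows per student, which yields 1 for every row in this data.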
