SQL SELECT Percentage - sql

I have three tables: one for students, a second for classes, and a third for gender. I am trying to get the total number of each gender and the percentage. I used the following statement to get the number and it works well:
SELECT Gender.GenderName as Gender, COUNT(*) as cnt
FROM (Student INNER JOIN
Gender ON Student.GenderID = Gender.GenderID)
GROUP BY Gender.GenderName
I could not figure out how to get the percentage, or how to make the class name or ID a selectable item, so that I can get the gender breakdown for one class or for all classes by using @ClassId int.

I don't have access to a SQL Server right now, but you can use either CAST or CONVERT to convert to a floating-point type. You may also want to refer to your SQL Server documentation for its specific type conversion functions.
SELECT a.Gender, a.cnt, CAST(a.cnt as float) / b.cnt * 100 as pct
FROM (SELECT Gender.GenderName as Gender, COUNT(*) as cnt
FROM Student
INNER JOIN Gender
ON Student.GenderID = Gender.GenderID
GROUP BY Gender.GenderName) a
CROSS JOIN (SELECT COUNT(*) as cnt FROM Student) b
PS: Thanks for pointing out my mistake. a.cnt / b.cnt would do integer math and return zero whenever b.cnt > a.cnt (which is the case here). We need to convert either a.cnt or b.cnt to a float so that a.cnt / b.cnt becomes a float, and then multiply it by 100 to get the percentage. Also, I missed that GenderName had been aliased in the inner SELECT.
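The integer-division pitfall described above can be reproduced without a SQL Server at hand. The following is only a sketch using Python's built-in sqlite3 module, which also truncates integer division; the sample rows and the int_pct / pct column names are made up for illustration, and REAL stands in for SQL Server's float.

```python
import sqlite3

# Hypothetical sample data: one male and three female students.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Gender (GenderID INTEGER PRIMARY KEY, GenderName TEXT);
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, GenderID INTEGER);
    INSERT INTO Gender VALUES (1, 'Male'), (2, 'Female');
    INSERT INTO Student (GenderID) VALUES (1), (2), (2), (2);
""")

query = """
    SELECT a.Gender,
           a.cnt,
           a.cnt / b.cnt * 100               AS int_pct,  -- integer math, truncates to 0
           CAST(a.cnt AS REAL) / b.cnt * 100 AS pct       -- cast first, then divide
    FROM (SELECT Gender.GenderName AS Gender, COUNT(*) AS cnt
          FROM Student
          INNER JOIN Gender ON Student.GenderID = Gender.GenderID
          GROUP BY Gender.GenderName) a
    CROSS JOIN (SELECT COUNT(*) AS cnt FROM Student) b
    ORDER BY a.Gender
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('Female', 3, 0, 75.0)
# ('Male', 1, 0, 25.0)
```

The int_pct column is 0 for both rows because 3 / 4 and 1 / 4 are evaluated as integers before the multiplication by 100.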

Related

Group by after a partition by in MS SQL Server

I am working on some car accident data and am stuck on how to get the data in the form I want.
select
sex_of_driver,
accident_severity,
count(accident_severity) over (partition by sex_of_driver, accident_severity)
from
SQL.dbo.accident as accident
inner join SQL.dbo.vehicle as vehicle on
accident.accident_index = vehicle.accident_index
This is my code, which counts the accidents for each sex and each severity. I know I can do this with group by, but I wanted to use a partition by in order to work out % too.
However, I get a very large table (I assume one row for each source row, rather than one per sex/severity pair). When I do the following:
select
sex_of_driver,
accident_severity,
count(accident_severity) over (partition by sex_of_driver, accident_severity)
from
SQL.dbo.accident as accident
inner join SQL.dbo.vehicle as vehicle on
accident.accident_index = vehicle.accident_index
group by
sex_of_driver,
accident_severity
I get this:
sex_of_driver  accident_severity  (No column name)
---------------------------------------------------
1              1                  1
1              2                  1
-1             2                  1
-1             1                  1
1              3                  1
I won't give you the whole table, but basically, the group by has caused the count to just be 1.
I can't figure out why group by isn't working. Is this an MS SQL-Server thing?
I want to get the same result as below (obviously without the CASE expressions etc.)
select
accident.accident_severity,
count(accident.accident_severity) as num_accidents,
vehicle.sex_of_driver,
CASE vehicle.sex_of_driver WHEN '1' THEN 'Male' WHEN '2' THEN 'Female' end as sex_col,
CASE accident.accident_severity WHEN '1' THEN 'Fatal' WHEN '2' THEN 'Serious' WHEN '3' THEN 'Slight' end as serious_col
from
SQL.dbo.accident as accident
inner join SQL.dbo.vehicle as vehicle on
accident.accident_index = vehicle.accident_index
where
sex_of_driver != 3
and
sex_of_driver != -1
group by
accident.accident_severity,
vehicle.sex_of_driver
order by
accident.accident_severity
You seem to have a misunderstanding here.
GROUP BY will reduce your rows to a single row per grouping (i.e. per pair of sex_of_driver, accident_severity values). Any normal aggregates you use with this, such as COUNT(*), will return the aggregate value within that group.
OVER, on the other hand, gives you a windowed aggregate, which is calculated after GROUP BY has reduced your rows. Therefore when you write count(accident_severity) over (partition by sex_of_driver, accident_severity), the aggregate only receives a single row in each partition, because the rows have already been reduced.
You say "I know I can do this with group by but I wanted to use a partition by in order to work out % too." but you are misunderstanding how to do that. You don't need PARTITION BY to work out percentage. All you need to calculate a percentage over the whole resultset is COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (), in other words a windowed aggregate over a normal aggregate.
Note also that count(accident_severity) does not give you the number of distinct accident_severity values; it gives you the number of non-null values, which is probably not what you intend. You also have a very strange join predicate. You probably want something like a.vehicle_id = v.vehicle_id
So you want something like this:
select
sex_of_driver,
accident_severity,
count(*) as Count,
count(*) * 1.0 /
sum(count(*)) over (partition by sex_of_driver) as PercentOfSex,
count(*) * 1.0 /
sum(count(*)) over () as PercentOfTotal
from
dbo.accident as a
inner join dbo.vehicle as v on
a.vehicle_id = v.vehicle_id
group by
sex_of_driver,
accident_severity;
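To see the difference between the two forms on a small data set, here is a sketch using Python's built-in sqlite3 module. It assumes a SQLite build of 3.25 or later (needed for window functions), a made-up two-table schema joined on accident_index as in the question, and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 8 accidents, vehicles 1-4 driven by sex 1, vehicles 5-8 by sex 2.
conn.executescript("""
    CREATE TABLE accident (accident_index INTEGER PRIMARY KEY, accident_severity INTEGER);
    CREATE TABLE vehicle (accident_index INTEGER, sex_of_driver INTEGER);
    INSERT INTO accident VALUES (1,1),(2,2),(3,2),(4,2),(5,3),(6,3),(7,3),(8,3);
    INSERT INTO vehicle VALUES (1,1),(2,1),(3,1),(4,1),(5,2),(6,2),(7,2),(8,2);
""")

# The window runs after GROUP BY, so each partition holds exactly one row.
windowed_only = conn.execute("""
    SELECT v.sex_of_driver, a.accident_severity,
           COUNT(a.accident_severity) OVER (PARTITION BY v.sex_of_driver, a.accident_severity)
    FROM accident AS a
    INNER JOIN vehicle AS v ON a.accident_index = v.accident_index
    GROUP BY v.sex_of_driver, a.accident_severity
    ORDER BY v.sex_of_driver, a.accident_severity
""").fetchall()

# A windowed SUM over the normal COUNT(*) aggregate gives the percentages.
percentages = conn.execute("""
    SELECT v.sex_of_driver, a.accident_severity,
           COUNT(*) AS cnt,
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER (PARTITION BY v.sex_of_driver) AS pct_of_sex,
           COUNT(*) * 1.0 / SUM(COUNT(*)) OVER () AS pct_of_total
    FROM accident AS a
    INNER JOIN vehicle AS v ON a.accident_index = v.accident_index
    GROUP BY v.sex_of_driver, a.accident_severity
    ORDER BY v.sex_of_driver, a.accident_severity
""").fetchall()

print(windowed_only)  # every count is 1
print(percentages)    # per-group count plus both ratios
```

The ratios are fractions between 0 and 1; multiply by 100 if a percentage is wanted.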

Count Distinct values in one column based on other column

I am trying to count distinct values on Z_l based on value by using with clause. Sample data exercise included below.
please look at the picture, the distinct values of Z_l based on X='ny'
with distincz_l as (select ny.X, ny.z_l o.cnt From HOPL ny join (select X, count(*) as cnt from HOPL group by X) o on (ny.X = o.Z_l)) select * from HOPL;
You don't even need a WITH clause, since a single statement is enough:
SELECT z_l, count(1)
FROM hopl
WHERE x='ny'
GROUP BY z_l
;
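Since the picture from the question is not available, it is unclear whether the goal is the number of rows per z_l value or the number of distinct z_l values. The sketch below shows both on invented rows, using Python's built-in sqlite3 module; the table contents are an assumption made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows for the hopl table.
conn.executescript("""
    CREATE TABLE hopl (x TEXT, z_l TEXT);
    INSERT INTO hopl VALUES ('ny', 'a'), ('ny', 'a'), ('ny', 'b'), ('nj', 'c');
""")

# Rows per z_l value among the rows where x = 'ny'.
per_value = conn.execute("""
    SELECT z_l, COUNT(1)
    FROM hopl
    WHERE x = 'ny'
    GROUP BY z_l
    ORDER BY z_l
""").fetchall()

# Number of distinct z_l values among the rows where x = 'ny'.
distinct_count = conn.execute("""
    SELECT COUNT(DISTINCT z_l)
    FROM hopl
    WHERE x = 'ny'
""").fetchone()[0]

print(per_value)       # [('a', 2), ('b', 1)]
print(distinct_count)  # 2
```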

Multiply an array from a column with the result of a count from another query

I am new to Postgres and databases so I am sorry. I ran a query to get a count of students per school from one table. Now I have the table below:
school_probs:
school_code(PK bigint) schoolName(text) probs(numeric)
1 CAA {0.05,0.08,0.18,0.3,0.11,0.28}
2 CAS {0.06,0.1,0.295,0.36,0.12,0.065}
3 CBA {0.05,0.11,0.35,0.32,0.12,0.05}
4 CL {0.07,0.09,0.24,0.4,0.06,0.09}
How would I go about multiplying the count from each school with each number in the probs column? For example, if the total number of students at school "CAA" is 198, then the scaled distribution will be
(0.05*198, 0.08*198, 0.18*198, 0.3*198, 0.11*198, 0.28*198). With the results I can then assign grades to students.
My query to get the count is as follows:
SELECT simulated_records.school, COUNT(simulated_records.school) as studentCount INTO CountSchool
FROM simulated_records, school_probs
WHERE simulated_records.school = school_probs.school
GROUP BY simulated_records.school;
To multiply elements of an array with a constant, you need to unnest, multiply, and aggregate again. Some caveats lie in wait. Consider:
PostgreSQL unnest() with element number
And best use an ARRAY constructor.
That said, and making some assumptions about your undisclosed table design, I would also simplify the count:
Aggregate a single column in query with many columns
Arriving at:
SELECT *, ARRAY(SELECT unnest(p.probs) * r.student_ct) AS scaled_probs
FROM school_probs p
LEFT JOIN (
SELECT school, COUNT(*)::int AS student_ct
FROM simulated_records
GROUP BY 1
) r USING (school);
Or, to represent NULL arrays as NULL arrays:
SELECT *
FROM school_probs p
LEFT JOIN (
SELECT school, COUNT(*)::int AS student_ct
FROM simulated_records
GROUP BY 1
) r USING (school)
LEFT JOIN LATERAL (
SELECT ARRAY(SELECT unnest(p.probs) * r.student_ct) AS scaled_probs
) p1 ON p.probs IS NOT NULL;
I suggest this simple form with a set-returning function in the SELECT list only for Postgres 10 or later, because of:
What is the expected behaviour for multiple set-returning functions in select clause?

update existing column with results of select query using sql

I am trying to update a column called Number_Of_Marks in our Results table using the results we get from our SELECT statement. Our select statement is used to count the numbers of marks per module in our results table. The SELECT statement works and the output is correct, which is
ResultID ModuleID cnt
-------------------------
111 ART3452 2
114 ART3452 2
115 CSC3039 3
112 CSC3039 3
113 CSC3039 3
The table in use is:
Results: ResultID, ModuleID, Number_Of_Marks
We need the results of cnt to be updated into our Number_Of_Marks column. This is our code below...
DECLARE @cnt INT
SELECT @cnt
SELECT C.cnt
FROM Results S
INNER JOIN (SELECT ModuleID, count(ModuleID) as cnt
FROM Results
GROUP BY ModuleID) C ON S.ModuleID = C.ModuleID
UPDATE Results
SET [Number_Of_Marks] = (@cnt)
You can do this in SQL Server using the update/join syntax:
UPDATE s
SET [Number_Of_Marks] = c.cnt
FROM Results S INNER JOIN
(SELECT ModuleID, count(ModuleID) as cnt
FROM Results
GROUP BY ModuleID
) C
ON S.ModuleID = C.ModuleID;
I assume that you want the count from the subquery, not from the uninitialized variable.
EDIT:
In general, when you change the question it is better to ask another question. Sometimes, though, the changes are really small. The revised query looks something like:
UPDATE s
SET [Number_Of_Marks] = c.cnt,
Marks = avgmarks
FROM Results S INNER JOIN
(SELECT ModuleID, count(ModuleID) as cnt, avg(marks * 1.0) as avgmarks
FROM Results
GROUP BY ModuleID
) C
ON S.ModuleID = C.ModuleID;
Note that I multiplied the marks by 1.0. This is a quick-and-dirty way to convert an integer to a numeric value. SQL Server takes averages on integers and produces an integer. Usually you want some sort of decimal or floating value.
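The update/join syntax above is specific to SQL Server. As a rough sketch of the same idea in a more portable form, a correlated subquery can compute the per-module count for each row. The example uses Python's built-in sqlite3 module with the sample rows from the question; it is an alternative technique, not the query from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows taken from the question; Number_Of_Marks starts out NULL.
conn.executescript("""
    CREATE TABLE Results (ResultID INTEGER PRIMARY KEY, ModuleID TEXT, Number_Of_Marks INTEGER);
    INSERT INTO Results (ResultID, ModuleID) VALUES
        (111, 'ART3452'), (114, 'ART3452'),
        (115, 'CSC3039'), (112, 'CSC3039'), (113, 'CSC3039');
""")

# For each row, count how many rows share its ModuleID.
conn.execute("""
    UPDATE Results
    SET Number_Of_Marks = (SELECT COUNT(*)
                           FROM Results AS r2
                           WHERE r2.ModuleID = Results.ModuleID)
""")

updated = conn.execute(
    "SELECT ResultID, ModuleID, Number_Of_Marks FROM Results ORDER BY ResultID"
).fetchall()
for row in updated:
    print(row)
```

Each ART3452 row ends up with 2 and each CSC3039 row with 3, matching the cnt column shown in the question.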

SQL Query: Largest number of guns

Schema is below:
Ships(name, yearLaunched, country, numGuns, gunSize, displacement)
Battles(ship, battleName, result)
where name and ship are equal. By this I mean if 'Missouri' was one of the tuple
results for name, 'Missouri' would also appear as a tuple result for ship.
(i.e. name = 'Missouri', ship = 'Missouri')
They are the same
Now the question I have is: what SQL statement would I write in order to list
the battleship, from a list of battleships, that has the largest number
of guns (i.e. gunSize)?
I tried:
SELECT name, max(gunSize)
FROM Ships
But this gave me the wrong result.
I then tried:
SELECT s.name
FROM Ships s,
(SELECT MAX(gunSize) as "Largest # of Guns"
FROM Ships
GROUP BY name) maxGuns
WHERE s.name = maxGuns.name
But then SQLite Admin gave me an error saying that no such column 'maxGuns' exists
even though I assigned it as an alias: maxGuns
Do any of you know what the correct query for this problem would be?
Thanks!
The problem in your query is that the subquery has no column named name.
Anyway, to find the largest amount of guns, just use SELECT MAX(gunSize) FROM Ships.
To get all ships with that number of guns, you need nothing more than a simple comparison with that value:
SELECT name
FROM Ships
WHERE gunSize = (SELECT MAX(gunSize)
FROM Ships)
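Since the question mentions SQLite, the comparison against the maximum can be tried directly with Python's built-in sqlite3 module. This is only a sketch: the table is cut down to two columns and the ships and gun sizes are invented, with two ships sharing the maximum to show that all of them are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced Ships table; two ships tie for the largest gunSize.
conn.executescript("""
    CREATE TABLE Ships (name TEXT PRIMARY KEY, gunSize INTEGER);
    INSERT INTO Ships VALUES ('Missouri', 16), ('Iowa', 16), ('Kongo', 14);
""")

# Compare every row against the single maximum value from the subquery.
names = [row[0] for row in conn.execute("""
    SELECT name
    FROM Ships
    WHERE gunSize = (SELECT MAX(gunSize)
                     FROM Ships)
    ORDER BY name
""")]

print(names)  # ['Iowa', 'Missouri']
```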
The column does not exist because maxGuns is an alias for the subquery as a whole, and that subquery exposes only the aggregate column, not name. In order to identify the ship with the most guns you could try something like:
with cte as (select *
, ROW_NUMBER() over (order by s.gunsize desc) seq
from ships s)
select * from cte
where seq = 1
Another approach is to select only the first row, which contains the ship with the highest number of guns (TOP is SQL Server syntax; SQLite uses LIMIT 1 at the end of the query instead):
select Top 1 *
from ships s
order by s.gunsize desc
WITH TAB_SHIPS(NAME, NUMGUNS,DISPLACEMENT) AS (SELECT NAME, NUMGUNS,DISPLACEMENT FROM SHIPS AS S
LEFT JOIN CLASSES AS C
ON S.CLASS=C.CLASS
WHERE C.NUMGUNS >=ALL(SELECT NUMGUNS FROM CLASSES C1 WHERE C1.DISPLACEMENT = C.DISPLACEMENT )
UNION
SELECT SHIP, NUMGUNS,DISPLACEMENT FROM OUTCOMES AS O
LEFT JOIN CLASSES AS C
ON C.CLASS=O.SHIP
WHERE C.NUMGUNS >=ALL(SELECT NUMGUNS FROM CLASSES C1 WHERE C1.DISPLACEMENT = C.DISPLACEMENT ) )
SELECT NAME FROM TAB_SHIPS
WHERE NUMGUNS IS NOT NULL