Get total from child table with parent table Id - sql

I'm trying to query data from the population table through the members
table, using only the first member that appears for each state id (sid),
so that repeated sid values in the members table are not counted more than once.
I want the male and female total for each state by sid, but when I run my query I get the total of every record
in the population table.
Example:
male_child(20) + female_child(70) for sid = 1
male_child(10) + female_child(12) for sid = 3
total = 112
Here is my SQL query:
SELECT SUM(p.number) AS total FROM population p
JOIN members m ON p.mid = m.mid
states
+-----+------+
| sid | name |
+-----+------+
| 11  | A    |
| 23  | B    |
+-----+------+
members
+-----+-----+-----------+
| mId | sid | date      |
+-----+-----+-----------+
| 1   | 11  | 10-2-2021 |
| 2   | 11  | 15-2-2021 |
| 3   | 23  | 12-2-2021 |
| 4   | 23  | 16-2-2021 |
+-----+-----+-----------+
population
+-----+-----+--------+-------+--------+
| pid | mid | gender | type  | number |
+-----+-----+--------+-------+--------+
| 1   | 1   | male   | child | 20     |
| 2   | 1   | female | child | 50     |
| 3   | 2   | male   | child | 20     |
| 4   | 2   | female | child | 20     |
| 5   | 3   | male   | child | 10     |
| 6   | 3   | female | child | 12     |
| 7   | 4   | female | child | 30     |
| 8   | 4   | female | child | 25     |
+-----+-----+--------+-------+--------+
Expected result: take the first member for each `members`.`sid` (row 1 for sid 11 and
row 3 for sid 23 of the members table), then sum their population rows.
That will be (20 + 50) + (10 + 12) = 92

You haven't actually shown what your desired results are, but I suspect you just need to group by the relevant columns
select
s.name,
p.gender,
Sum(p.number) as total
from population p
join members m on p.mid = m.mid
join states s on s.sid = m.sid
group by s.name, p.gender
Edit
To get the desired total, take the minimum mid for each sid in members and sum the population rows for those members
select Sum(number) as Total
from population
where mid in (select Min(mid) from members group by sid)
Example DB Fiddle
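As a rough check of the difference between the two queries, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the column types are assumptions made for illustration; the original post does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (mid INTEGER PRIMARY KEY, sid INTEGER, date TEXT);
CREATE TABLE population (pid INTEGER PRIMARY KEY, mid INTEGER,
                         gender TEXT, type TEXT, number INTEGER);
INSERT INTO members VALUES
  (1, 11, '10-2-2021'), (2, 11, '15-2-2021'),
  (3, 23, '12-2-2021'), (4, 23, '16-2-2021');
INSERT INTO population VALUES
  (1, 1, 'male', 'child', 20), (2, 1, 'female', 'child', 50),
  (3, 2, 'male', 'child', 20), (4, 2, 'female', 'child', 20),
  (5, 3, 'male', 'child', 10), (6, 3, 'female', 'child', 12),
  (7, 4, 'female', 'child', 30), (8, 4, 'female', 'child', 25);
""")

# The question's join keeps every member, so it sums every population row.
total_all = conn.execute(
    "SELECT SUM(p.number) FROM population p JOIN members m ON p.mid = m.mid"
).fetchone()[0]

# Restricting to the smallest mid per sid keeps only members 1 and 3.
total_first = conn.execute("""
    SELECT SUM(number) FROM population
    WHERE mid IN (SELECT MIN(mid) FROM members GROUP BY sid)
""").fetchone()[0]

print(total_all, total_first)
```

With the sample data, the plain join sums all eight rows (187), while the MIN(mid) filter gives the 92 the question asks for. This treats "first" as "smallest mid"; if "first" should mean the earliest date instead, the subquery would need to change.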


Is there an easier way to find the row with a max value?

I have a schema where these two tables exist (among others)
participation
+------+--------+------------------+
| movie| person | role |
+------+--------+------------------+
| 1 | 1 | "Regisseur" |
| 1 | 1 | "Schauspieler" |
| 1 | 2 | "Schauspielerin" |
| 2 | 3 | "Regisseur" |
| 3 | 4 | "Regisseur" |
| 3 | 5 | "Schauspieler" |
| 3 | 6 | "Schauspieler" |
| 4 | 7 | "Schauspielerin" |
| 4 | 8 | "Schauspieler" |
| 5 | 1 | "Schauspieler" |
| 5 | 8 | "Schauspieler" |
| 5 | 14 | "Schauspieler" |
+------+--------+------------------+
movie
+----+------------------------------+------+-----+
| id | title | year | fsk |
+----+------------------------------+------+-----+
| 1 | "Die Bruecke am Fluss" | 1995 | 12 |
| 2 | "101 Dalmatiner" | 1961 | 0 |
| 3 | "Vernetzt - Johnny Mnemonic" | 1995 | 16 |
| 4 | "Waehrend Du schliefst..." | 1995 | 6 |
| 5 | "Casper" | 1995 | 6 |
| 6 | "French Kiss" | 1995 | 6 |
| 7 | "Stadtgespraech" | 1995 | 12 |
| 8 | "Apollo 13" | 1995 | 6 |
| 9 | "Schlafes Bruder" | 1995 | 12 |
| 10 | "Assassins - Die Killer" | 1995 | 16 |
| 11 | "Braveheart" | 1995 | 16 |
| 12 | "Das Netz" | 1995 | 12 |
| 13 | "Free Willy 2" | 1995 | 6 |
+----+------------------------------+------+-----+
I want to get the movie with the highest number of people that participated. I figured out an SQL statement that actually does this, but looks super complicated. It looks like this:
SELECT title
FROM movie.movie
JOIN (SELECT *
      FROM (SELECT Max(count_person) AS max_count_person
            FROM (SELECT movie,
                         Count(person) AS count_person
                  FROM movie.participation
                  GROUP BY movie) AS countPersons) AS maxCountPersons
      JOIN (SELECT movie,
                   Count(person) AS count_person
            FROM movie.participation
            GROUP BY movie) AS countPersons
        ON maxCountPersons.max_count_person = countPersons.count_person)
  AS maxPersonsmovie
  ON maxPersonsmovie.movie = movie.id
The main problem is, that I can't find an easier way to select the row with the highest value. If I simply could make a selection on the inner table and pick the row with the highest value on count_person without losing the information about the movie itself, this would look so much simpler. Is there a way to simplify this, or is this really the easiest way to do this?
Here is a way without subqueries:
SELECT m.title
FROM movie.movie m JOIN
movie.participation p
ON m.id = p.movie
GROUP BY m.title
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
You can use LIMIT 1 instead of FETCH, if you prefer.
Note: In the event of ties, this only returns one value. That seems consistent with your question.
You can use the rank window function to do this.
SELECT title
FROM (SELECT m.title, rank() OVER (ORDER BY count(p.person) DESC) AS rnk
      FROM movie.movie m
      LEFT JOIN movie.participation p ON m.id = p.movie
      GROUP BY m.title
     ) t
WHERE rnk = 1
A scalar subquery with LIMIT 1 also works:
SELECT title
FROM movie.movie
WHERE id = (SELECT movie
FROM movie.participation
GROUP BY movie
ORDER BY count(*) DESC
LIMIT 1);
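The answers above differ in how they treat ties, which matters for the sample data because movies 1, 3 and 5 each have three participation rows. The following is a minimal sketch using Python's sqlite3 module with an in-memory database. It assumes SQLite 3.25 or later for window functions, drops the movie schema prefix, and loads only the movies that have participation rows. The rank is applied over a pre-aggregated inner query, which is a small restructuring of the answer's statement rather than a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE participation (movie INTEGER, person INTEGER, role TEXT);
INSERT INTO movie VALUES
  (1, 'Die Bruecke am Fluss'), (2, '101 Dalmatiner'),
  (3, 'Vernetzt - Johnny Mnemonic'), (4, 'Waehrend Du schliefst...'),
  (5, 'Casper');
INSERT INTO participation VALUES
  (1, 1, 'Regisseur'), (1, 1, 'Schauspieler'), (1, 2, 'Schauspielerin'),
  (2, 3, 'Regisseur'),
  (3, 4, 'Regisseur'), (3, 5, 'Schauspieler'), (3, 6, 'Schauspieler'),
  (4, 7, 'Schauspielerin'), (4, 8, 'Schauspieler'),
  (5, 1, 'Schauspieler'), (5, 8, 'Schauspieler'), (5, 14, 'Schauspieler');
""")

# ORDER BY ... LIMIT 1 returns a single title even when several movies tie.
limit_rows = conn.execute("""
    SELECT m.title
    FROM movie m JOIN participation p ON m.id = p.movie
    GROUP BY m.title
    ORDER BY COUNT(*) DESC
    LIMIT 1
""").fetchall()

# RANK() gives every tied movie rank 1, so all of them are returned.
tied = [row[0] for row in conn.execute("""
    SELECT title
    FROM (SELECT title, RANK() OVER (ORDER BY cnt DESC) AS rnk
          FROM (SELECT m.title, COUNT(p.person) AS cnt
                FROM movie m LEFT JOIN participation p ON m.id = p.movie
                GROUP BY m.title) AS counts) AS ranked
    WHERE rnk = 1
    ORDER BY title
""")]

print(limit_rows)
print(tied)
```

Which of the three tied titles the LIMIT query returns is not specified. Also note that person 1 appears twice for movie 1 (as director and as actor), so counting rows is not the same as counting distinct people; COUNT(DISTINCT p.person) may be closer to what the question means.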

how to split table in postgresql

I have a data set of about 70 thousand rows, and I want to split this table into three tables with exact numbers of rows (the code was first written in SAS and is now being moved to PostgreSQL): the first with rows 1-5000, the second with rows 5001-25000, and the last with the remaining rows, with no row duplicated in any of them.
For example:
+--------+-----+--------+-----+
| cst_id | age | salary | sex |
+--------+-----+--------+-----+
| 1 | 44 | 2000 | M |
| 2 | 23 | 3000 | F |
| 3 | 34 | 4000 | M |
| 4 | 51 | 5000 | M |
| 5 | 26 | 6000 | F |
| 6 | 28 | 7000 | F |
| 7 | 39 | 8000 | M |
+--------+-----+--------+-----+
In the end I want three tables with exactly the number of rows I assign (such as 3 rows, 2 rows, and the rest), with no row appearing in more than one table, like:
table1:
+--------+-----+--------+-----+
| cst_id | age | salary | sex |
+--------+-----+--------+-----+
| 1 | 44 | 2000 | M |
| 2 | 23 | 3000 | F |
| 3 | 34 | 4000 | M |
+--------+-----+--------+-----+
table2:
+--------+-----+--------+-----+
| cst_id | age | salary | sex |
+--------+-----+--------+-----+
| 4 | 51 | 5000 | M |
| 5 | 26 | 6000 | F |
+--------+-----+--------+-----+
table3:
+--------+-----+--------+-----+
| cst_id | age | salary | sex |
+--------+-----+--------+-----+
| 6 | 28 | 7000 | F |
| 7 | 39 | 8000 | M |
+--------+-----+--------+-----+
How can I do this in PostgreSQL?
The window function NTILE can do this. Note that NTILE(3) splits the rows into three batches of nearly equal size:
-- add a column to help split
create temp table help_table as
select *
      ,NTILE(3) OVER (ORDER BY cst_id) as batch_nbr
from your_table;
create table table_1 as
select * from help_table where batch_nbr = 1;
create table table_2 as
select * from help_table where batch_nbr = 2;
create table table_3 as
select * from help_table where batch_nbr = 3;
You can split this process into steps as a function:
1. Get the total number of distinct rows.
2. Divide that value by 3 and store the result in a DECLARED variable (_size).
3. Create table_1, table_2, and table_3.
4. INSERT INTO table_1 with LIMIT (_size).
5. INSERT INTO table_2 with LIMIT (_size) WHERE id > table_1's greatest id.
6. INSERT INTO table_3 with LIMIT (_size) WHERE id > table_2's greatest id.
Hopefully this helps.
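For exact, unequal sizes such as 3 rows, 2 rows, and the rest, one option is to number the rows once with ROW_NUMBER() and cut at explicit boundaries. The sketch below is an illustration only: it runs the SQL in an in-memory SQLite database (3.25 or later) through Python's sqlite3 module, reuses the names your_table, help_table and table_1 to table_3 from the answer above, and uses the 7-row sample with cut points 3 and 5. For the real table the boundaries would be 5000 and 25000, and the statements should need little or no change for PostgreSQL, though that has not been verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (cst_id INTEGER, age INTEGER, salary INTEGER, sex TEXT);
INSERT INTO your_table VALUES
  (1, 44, 2000, 'M'), (2, 23, 3000, 'F'), (3, 34, 4000, 'M'),
  (4, 51, 5000, 'M'), (5, 26, 6000, 'F'), (6, 28, 7000, 'F'),
  (7, 39, 8000, 'M');

-- Number every row once so the three slices cannot overlap.
CREATE TEMP TABLE help_table AS
SELECT *, ROW_NUMBER() OVER (ORDER BY cst_id) AS rn
FROM your_table;

CREATE TABLE table_1 AS
SELECT cst_id, age, salary, sex FROM help_table WHERE rn <= 3;
CREATE TABLE table_2 AS
SELECT cst_id, age, salary, sex FROM help_table WHERE rn BETWEEN 4 AND 5;
CREATE TABLE table_3 AS
SELECT cst_id, age, salary, sex FROM help_table WHERE rn > 5;
""")

# Row counts of the three result tables, in order.
sizes = [
    conn.execute(f"SELECT COUNT(*) FROM table_{i}").fetchone()[0]
    for i in (1, 2, 3)
]
# Ids that ended up in the middle table.
ids_2 = [r[0] for r in conn.execute("SELECT cst_id FROM table_2 ORDER BY cst_id")]

print(sizes, ids_2)
```

Because each row gets exactly one row number, every row lands in exactly one of the three tables, which covers the "no duplicated rows" requirement as long as the ORDER BY column gives a stable ordering.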

Joining two tables while linking with a crosswalk table

I am trying to write a query to de-identify one of my tables. To make distinct IDs for people, I used name, age and sex. However, the data in my main table has been collected for years, and the sex code changed from 1 meaning male and 2 meaning female to M meaning male and F meaning female. To make this uniform in my distinct individuals table, I used a crosswalk table to convert the sex code into the correct format before placing it into the distinct patients table.
I am now trying to write the query that matches the distinct patient IDs to their correct rows in the main table. The issue is that the sex code for some rows has now been changed. I know I could use an update statement on my main table and change all of the 1 and 2 values to M and F. However, I was wondering if there is a way to match the old sex codes to the new ones so that I would not have to make the update. I did not know if there is a way to join the main and distinct IDs tables in the query while using the sex code table to convert the sex codes again. Below are the example tables I am currently using.
This is my main table that I want to de-identify
----------------------------
| Name | age | sex | Toy |
----------------------------
| Stacy| 30 | 1 | Bat |
| Sue | 21 | 2 | Ball |
| Jim | 25 | 1 | Ball |
| Stacy| 30 | M | Ball |
| Sue | 21 | F | glove |
| Stacy| 18 | F | glove |
----------------------------
Sex code crosswalk table
-------------------
| SexOld | SexNew |
-------------------
| M | M |
| F | F |
| 1 | M |
| 2 | F |
-------------------
This is the table I used to populate IDs for people I found to be distinct in my main table
--------------------------
| ID | Name | age | sex |
--------------------------
| 1 | Stacy| 30 | M |
| 2 | Jim | 25 | M |
| 3 | Stacy| 18 | F |
| 4 | Sue | 21 | F |
--------------------------
This is what I want my de-identified table to look like
---------------
| ID | Toy |
---------------
| 1 | Bat |
| 4 | Ball |
| 2 | Ball |
| 1 | Ball |
| 4 | glove |
| 3 | glove |
---------------
select c.ID, a.Toy
from maintable a
left join sexcodecrosswalk b on b.sexold = a.sex
left join peopleids c on c.Name = a.Name and c.age = a.age and c.Sex = b.sexNew
Here's a demonstration that this works:
http://sqlfiddle.com/#!3/a2d26/1
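The same join can be tried locally. This is a minimal sketch using Python's sqlite3 module with an in-memory database; the table names maintable, sexcodecrosswalk and peopleids come from the query above, while the column types and the ORDER BY on SQLite's implicit rowid (used only to keep the main table's insertion order) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maintable (Name TEXT, age INTEGER, sex TEXT, Toy TEXT);
CREATE TABLE sexcodecrosswalk (SexOld TEXT, SexNew TEXT);
CREATE TABLE peopleids (ID INTEGER, Name TEXT, age INTEGER, sex TEXT);
INSERT INTO maintable VALUES
  ('Stacy', 30, '1', 'Bat'),  ('Sue', 21, '2', 'Ball'),
  ('Jim', 25, '1', 'Ball'),   ('Stacy', 30, 'M', 'Ball'),
  ('Sue', 21, 'F', 'glove'),  ('Stacy', 18, 'F', 'glove');
INSERT INTO sexcodecrosswalk VALUES
  ('M', 'M'), ('F', 'F'), ('1', 'M'), ('2', 'F');
INSERT INTO peopleids VALUES
  (1, 'Stacy', 30, 'M'), (2, 'Jim', 25, 'M'),
  (3, 'Stacy', 18, 'F'), (4, 'Sue', 21, 'F');
""")

# Translate each old sex code through the crosswalk, then match on
# name, age and the translated code to find the person's ID.
rows = conn.execute("""
    SELECT c.ID, a.Toy
    FROM maintable a
    LEFT JOIN sexcodecrosswalk b ON b.SexOld = a.sex
    LEFT JOIN peopleids c
           ON c.Name = a.Name AND c.age = a.age AND c.sex = b.SexNew
    ORDER BY a.rowid
""").fetchall()

for row in rows:
    print(row)
```

Because these are LEFT JOINs, a main-table row whose sex code is missing from the crosswalk would still appear, with a NULL ID, rather than being dropped silently.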

SQL Query to do grouping across a join

I've an enrollment table containing student IDs, course IDs and teacher IDs.
___________________
| sID | cID | tID |
___________________
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 3 | 3 |
| 2 | 1 | 1 |
| 2 | 3 | 5 |
| 3 | 1 | 1 |
| 3 | 2 | 2 |
I would like to get a table that can tell me how many students are in each course with a given professor. In other words, I'd like this:
_____________________________
| cID | tID | numOfStudents |
____________________________
| 1 | 1 | 3 |
| 2 | 2 | 2 |
| 3 | 3 | 1 |
| 3 | 5 | 1 |
I've tried
SELECT cID, tID, count(sID)
FROM enrollment
GROUP BY tID
but this type of formula, with different combinations is not working for me. Does anyone have any other suggestions?
Just add cid to the GROUP BY:
SELECT cID, tID, count(*)
FROM enrollment
GROUP BY cID, tID
sqlfiddle demo
From the docs:
When GROUP BY is present, it is not valid for the SELECT list
expressions to refer to ungrouped columns except within aggregate
functions, since there would be more than one possible value to return
for an ungrouped column.
SELECT cID, tID, count(sID)
FROM enrollment
GROUP BY 1,2
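To see the grouped counts against the sample rows, here is a small sketch using Python's sqlite3 module and an in-memory database. The ORDER BY is added only to make the output order predictable and is not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (sID INTEGER, cID INTEGER, tID INTEGER)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 2), (1, 3, 3),
     (2, 1, 1), (2, 3, 5),
     (3, 1, 1), (3, 2, 2)],
)

# Grouping by both columns yields one row per (course, teacher) pair.
rows = conn.execute("""
    SELECT cID, tID, COUNT(*) AS numOfStudents
    FROM enrollment
    GROUP BY cID, tID
    ORDER BY cID, tID
""").fetchall()

for row in rows:
    print(row)
```

Course 3 shows up twice because it is taught by teachers 3 and 5, which is exactly the distinction that grouping by tID alone would lose.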

postgresql aggregate of aggregate (sum of sum)

I've got workers who have many sales and who belong to departments. I'd like to see how many sales a department is making per day.
For simplicity, let's say a worker belongs to only one department.
Example:
departments:
| id | name |
| 1 | Men's Fashion |
| 2 | Women's Fashion |
workers:
| id | name |
| 1 | Timmy |
| 2 | Sally |
| 3 | Johnny |
sales:
| id | worker_id | datetime | amount |
| 1 | 1 | 2013-1-1 08:00:00 | 1 |
| 2 | 1 | 2013-1-1 09:00:00 | 3 |
| 3 | 3 | 2013-1-2 08:00:00 | 8 |
department_employees
| id | worker_id | department_id |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 2 |
I'd like to get
| department | amount |
| Men's Fashion | 4 |
| Women's Fashion | 8 |
To get the individual worker's total sales, I can do
SELECT worker_id, SUM(amount) FROM sales
GROUP BY worker_id
How do I take those sums (the total amount sold per worker) and aggregate it by department?
Don't sum the sum, rather join from sales through the department_employees table to the department:
select d.name, sum(s.amount)
from sales s
join department_employees de on de.worker_id = s.worker_id
join departments d on d.id = de.department_id
group by d.name
Aggregate functions and GROUP BY work in a statement with joins too.
Try something like:
SELECT name, SUM(amount) FROM departments, department_employees, sales
WHERE departments.id = department_employees.department_id
AND sales.worker_id = department_employees.worker_id
GROUP BY name
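Both answers can be checked against the sample data. The following is a minimal sketch with Python's sqlite3 module and an in-memory database, running the explicit-join version; the column types and the ORDER BY are assumptions added for a runnable, predictable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (id INTEGER PRIMARY KEY, worker_id INTEGER,
                    datetime TEXT, amount INTEGER);
CREATE TABLE department_employees (id INTEGER PRIMARY KEY,
                                   worker_id INTEGER, department_id INTEGER);
""")
# Parameter binding avoids having to escape the apostrophes in the names.
conn.executemany("INSERT INTO departments VALUES (?, ?)",
                 [(1, "Men's Fashion"), (2, "Women's Fashion")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)",
                 [(1, 1, "2013-1-1 08:00:00", 1),
                  (2, 1, "2013-1-1 09:00:00", 3),
                  (3, 3, "2013-1-2 08:00:00", 8)])
conn.executemany("INSERT INTO department_employees VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 2, 1), (3, 3, 2)])

# Join each sale to its worker's department, then sum once per department.
rows = conn.execute("""
    SELECT d.name, SUM(s.amount)
    FROM sales s
    JOIN department_employees de ON de.worker_id = s.worker_id
    JOIN departments d ON d.id = de.department_id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()

for row in rows:
    print(row)
```

A department whose workers have no sales at all would not appear in this result, since the inner joins start from sales. Starting from departments with LEFT JOINs would be one way to include such departments, if that is wanted.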