Populating relationship table randomly from two entity tables Postgresql - sql

I'm building a simple E-R model with two entity sets, Student and University, each with about 200 rows, and a relationship "Applies" (a student applies to a university).
The Student table has a few columns, with Student_ID as the primary key.
The University table has University_Name as the primary key.
The relationship table "applies" has Student_ID and University_Name as foreign keys and application_ID as the primary key.
I have to populate "applies" with about 10% of the possible relationships, chosen at random, using a single INSERT statement and perhaps the random() function. How do I populate the table randomly with values from the other two tables?

You would use:
insert into applies (student_id, university_name)
select s.student_id, u.university_name
from students s cross join
universities u
order by random()
limit 4000;
Alternatively, you can do this without sorting using:
insert into applies (student_id, university_name)
select s.student_id, u.university_name
from students s cross join
universities u
where random() < 0.1;
Note that this is an approximate 10% sample of the rows rather than an exact count.
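To try the approximate sample end to end, here is a minimal sketch for PostgreSQL. The temporary tables and the generated names ('University 1', 'University 2', ...) are assumptions made up for illustration; only the final insert comes from the answer above.

```sql
-- Hypothetical throwaway schema; column names follow the question.
create temporary table students (student_id int primary key);
create temporary table universities (university_name text primary key);
create temporary table applies (
    application_id serial primary key,
    student_id int references students,
    university_name text references universities
);

-- 200 rows in each entity table, as described in the question.
insert into students
select n from generate_series(1, 200) as g(n);

insert into universities
select 'University ' || n from generate_series(1, 200) as g(n);

-- Each of the 40000 possible pairs is kept with probability 0.1.
insert into applies (student_id, university_name)
select s.student_id, u.university_name
from students s
cross join universities u
where random() < 0.1;

-- The count should land near 4000 but differs from run to run.
select count(*) from applies;
```

Because every pair is evaluated independently, no pair can be inserted twice, but the total is not fixed.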

You can cross join the two tables to get all combinations of student and university. Then number your rows in random order and keep those with a number <= the number of total rows divided by ten to keep 10% of those combinations:
insert into applies (student_id, university_name)
select student_id, university_name
from
(
select
s.student_id,
u.university_name,
row_number() over (order by random()) as rn,
count(*) over () as cnt
from student s
cross join university u
) randomized
where rn <= cnt / 10.0;
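If an exact 10% is wanted without a window function, one possible variation (a sketch, assuming PostgreSQL and the same student, university and applies tables) is to keep the order by random() form and compute the limit with a scalar subquery instead of hard-coding it:

```sql
-- Assumes the tables student, university and applies from the question.
insert into applies (student_id, university_name)
select s.student_id, u.university_name
from student s
cross join university u
order by random()
-- Scalar subquery: 10% of all pairs; integer division rounds down.
limit (select count(*) / 10 from student cross join university);
```

This still sorts every combination, so it costs about the same as the row_number() version; the only difference is where the 10% is computed.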

I replicated your case with
create table students (student_id serial primary key, name varchar);
create table university (University_Name varchar primary key);
insert into students (name) values ('Francesco');
insert into students (name) values ('Laura');
insert into students (name) values ('Christian');
insert into students (name) values ('Ugo');
insert into students (name) values ('Maria');
insert into students (name) values ('Antonietta');
insert into university values ('Bocconi');
insert into university values ('Universita di Pisa');
insert into university values ('Universita di Padova');
insert into university values ('Universita di Perugia');
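If there is no limit on how many universities a student can apply to, you can pick the random pairs with
select student_id, university_name from students cross join university order by random() limit 3;
Here limit 3 keeps only 3 of the 24 possible pairs. To store them, put insert into applies (student_id, university_name) in front of the select.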

Related

Inserting dynamic amount of rows based off Amount of IDS found from search

I am trying to insert a row into the table TAKES(ID, COURSEID, SEC_ID, SEMESTER, YEAR, GRADE) for every student who has not taken a certain course.
I correctly get the IDs I need with this query:
select ID from student
where dept_name = 'Computer Science'
minus
select ID from takes
where course_id = 'CS-347';
Then I try the actual insert using the IDs I retrieved; all the other fields are static.
insert into TAKES
SELECT ID,'CS-347' as COURSE_ID,1 as SEC_ID,'Spring' as SEMESTER,2021 as YEAR,NULL as GRADE
from student
where dept_name = 'Computer Science'
minus
select ID from takes
where course_id = 'CS-347';
I then get an error:
Incorrect number of result columns.
I know that I am only pulling the ID from the student table, but I'm not sure how to work around this; I also tried selecting the IDs individually, and that didn't work either.
You can use this query instead:
insert into TAKES (column names)
SELECT
ID,
'CS-347' as COURSE_ID,
1 as SEC_ID,
'Spring' as SEMESTER,
2021 as YEAR,
NULL as GRADE
from
student
where
dept_name = 'Computer Science'
and ID NOT IN (select
ID
from
takes
where
course_id = 'CS-347');
When you use minus, both sides of the operation must return the same number of columns. Also make sure you are inserting into the right columns; it is better to list the column names explicitly.
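If you would rather keep minus, a sketch of one way to do it is to run the set operation on the ID column alone inside a derived table and add the constant columns outside it. The column list below is an assumption taken from the aliases in the question's own query; adjust it to the real definition of TAKES.

```sql
-- Oracle syntax (minus). Column names are assumed from the question's aliases.
insert into takes (id, course_id, sec_id, semester, year, grade)
select id, 'CS-347', 1, 'Spring', 2021, null
from (
    -- Both sides return exactly one column, so minus is valid here.
    select id from student where dept_name = 'Computer Science'
    minus
    select id from takes where course_id = 'CS-347'
);
```

Unlike NOT IN, this form is not affected by NULL values in takes.ID.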

How to insert new row without duplicating existing data

I want to insert rows in my table like so:
my columns are student,subject,class,teacher,level. Primary key is (student,subject). The table contains all the students, but the Math subject is missing for some of them, so I want to add it without duplicating the ones that already have it.
I've tried this but it gives me unique constraint violated:
insert into table (student,subject,class,teacher,level)
select a.student, 'math', null, null, null
from table a
where a.student in (select distinct student from table where subject not in 'math')
and (a.student,a.subject) not in (select student,subject from table);
I think you basically need select distinct:
insert into table (student, subject)
select distinct a.student, 'math'
from table a
where not exists (select 1
from table a2
where a2.student = a.student and
a2.subject = 'math'
);
One approach would be to use minus:
insert into course_students (student, subject)
select student, 'Math' from course_students
minus
select student, subject from course_students;
This would need extending a little if you wanted to include other columns in the insert:
insert into course_students (student, subject, class, teacher, course_level)
select student, subject, '101', 'Naomi', 1
from ( select student, 'Math' as subject from course_students
minus
select student, subject from course_students );

Inserting Data into Two Tables in one Query

Student Table
id|student_num|name|surname
Course Table
id|course_name
Student_course Table
course_id|student_id|mark
When I insert data into the Student Table (id, student_num, name, surname), the id should be inserted into Student_course Table, student_id column.
Assuming you want to insert the new student for all courses, then in Postgres you can do this:
with new_student as (
insert into student
(id, student_num, name, surname)
values
(1, 42, 'Dent', 'Arthur')
returning id
)
insert into Student_course (student_id, course_id)
select (select id from new_student),
id
from course;
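If the new student should be enrolled in only some courses rather than all of them, the same data-modifying CTE can be filtered. This is a sketch for PostgreSQL; the student values and the course name 'Mathematics' are made up for illustration.

```sql
with new_student as (
    -- Insert the student and hand the new id to the outer statement.
    insert into student (id, student_num, name, surname)
    values (2, 43, 'Ford', 'Prefect')
    returning id
)
insert into student_course (student_id, course_id)
select ns.id, c.id
from new_student ns
cross join course c
where c.course_name = 'Mathematics';  -- hypothetical course name
```

The cross join pairs the single returned id with every course row, and the where clause then narrows the courses.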

Removing redundancies in sql query that contains subquery

Suppose we have a table with the schema
student(id (primary key), name, math_score, english_score)
I am trying to get the information (id and name) of the top-ranked students, ranked by the highest sum of math score and English score. Several students may tie, and we want all of them. My idea is to use a subquery to build a table with the sum of scores, then find the ids and names that have the highest sum.
SELECT s.id, s.name
FROM (SELECT s.id, s.name, s.math_score+s.english_score as sum
FROM student s) s
WHERE s.sum = (SELECT max(s.sum)
FROM (SELECT s.id, s.name, s.math_score+s.english_score as sum
FROM student s) s)
This works, but seems very redundant and not efficient.
I just started learning sql language, and I would appreciate some insight on this problem!
Use WITH TIES
create table #student(
id int primary key identity(1,1),
name varchar(50),
math_score decimal,
english_score decimal
)
insert into #student
values
('Tom', 90, 90),
('Dick', 70, 70),
('Harry', 80, 100)
select TOP(1) WITH TIES
id,
name,
math_score,
english_score,
math_score + english_score as ScoreRank
from #student
order by
math_score + english_score desc
Gives the answer:
id|name|math_score|english_score|ScoreRank
1|Tom|90|90|180
3|Harry|80|100|180
This should accomplish it, you're adding in an unnecessary step.
select id,
name,
math_score+english_score as total_score
from student
where math_score+english_score=(select max(math_score+english_score)
from student)
SELECT id, name, math_score+english_score as 'sum'
FROM student
Order by math_score+english_score DESC;
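TOP ... WITH TIES is specific to SQL Server. A more portable sketch of the same idea uses the rank() window function, which gives every tied student the same rank; it assumes the student table from the question and a database with window functions (for example PostgreSQL, SQL Server, Oracle, or MySQL 8).

```sql
select id, name
from (
    select id,
           name,
           -- Students with the same total share the same rank.
           rank() over (order by math_score + english_score desc) as rnk
    from student
) ranked
where rnk = 1;
```

The sum is written only once, which addresses the redundancy in the original query.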

find MIN without using min()

I am trying to find the student who has the minimum score, which is the result of the query below. However, I was asked to write the query without using MIN(). I spent several hours on it but can't find an alternative solution :'(.
select s.sname
from student s
where s.score =
(select min(s2.score)
from score s2)
This is one way, which will work even if two students have same lowest score.
SELECT distinct s1.sname
FROM student s1
LEFT JOIN student s2
ON s2.score < s1.score
WHERE s2.score IS NULL
The method below uses limit. It returns a student with the lowest score, but only one of them if several students share that score.
select sname
from student
order by score asc
limit 1
Here's a possible alternative to the JOIN approach:
select sname from student where score in
(select score from student order by score asc limit 1)
create table student (name varchar(10), score int);
insert into student (name, score) values('joe', 30);
insert into student (name, score) values('jim', 88);
insert into student (name, score) values('jack', 22);
insert into student (name, score) values('jimbo', 15);
insert into student (name, score) values('jo bob',15);
/* folks with lowest score */
select name, score from student where not exists(select 1 from student s where s.score < student.score);
/* the actual lowest score */
select distinct score from student
where not exists(select 1 from student s where s.score < student.score);
Note that not exists can be brutally inefficient, but it'll do the job on a small set.
One way of doing it is to order the results in ascending order and take the first row.
If you need a more general solution, where a student can have more than one mark, you need to compute the total marks for each student and then find the student with the lowest total.
This is the first scenario, where a student has only one row in the table.
CREATE TABLE Student
(
SLNO INT,
MARKS FLOAT,
NAME NVARCHAR(MAX)
)
INSERT INTO Student VALUES(1, 80, 't1')
INSERT INTO Student VALUES(2, 90, 't2')
INSERT INTO Student VALUES(3, 76, 't3')
INSERT INTO Student VALUES(4, 98, 't4')
INSERT INTO Student VALUES(5, 55, 't5')
SELECT * From Student ORDER BY MARKS ASC
In the second scenario, a student has multiple rows in the table, so we insert two more rows for existing students.
Then we sum the marks grouped by name and order the results by the total:
INSERT INTO Student VALUES(6, 55, 't1')
INSERT INTO Student VALUES(6, 90, 't5')
SELECT SUM(MARKS) AS TOTAL, NAME FROM Student
GROUP BY NAME
ORDER BY TOTAL
Hope the above is what you are looking for.
You can also try a stored procedure to find the student with the minimum score.
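Another option that avoids min() entirely is a quantified comparison with ALL. This is a sketch assuming the student(sname, score) table from the question; the NULL filter is an added precaution, since a single NULL score in the subquery would otherwise make the comparison return no rows.

```sql
select s.sname
from student s
where s.score <= all (
    -- Compare against every non-NULL score in the table.
    select s2.score
    from student s2
    where s2.score is not null
);
```

Like the LEFT JOIN and NOT EXISTS answers, this returns every student tied for the lowest score.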