I'm trying to define a table to store student grades for an online report card. I can't decide how to do it, though.
The grades are given by subject, per trimester. Every trimester has an average grade, the total missed classes and a "recovering grade" (I don't know the right term in English, but it's an extra test you take to try to raise your grade if you're below the average). I also have to store the year average and the final "recovering grade". Basically, it's like this:
|1st Trimester |2nd Trimester |3rd Trimester
Subj. |Avg. |Mis. |Rec |Avg. |Mis. |Rec |Avg. |Mis. |Rec |Year Avg. |Final Rec.
Math |5.33 |1 |4 |8.0 |0 |7.0 |2 |6.5 |7.0
Sci. |5.33 |1 |4 |8.0 |0 |7.0 |2 |6.5 |7.0
I could store this information in a single DB row, with each row like this:
1tAverage | 1tMissedClasses | 1tRecoveringGrade | 2tAverage | 2tMissedClasses | 2tRecoveringGrade
And so on, but I figured this would be a pain to maintain if the school ever decides to grade by bimester or some other period (like it used to up until 3 years ago).
I could also generalize the table fields, and use a tinyint to flag which trimester those grades belong to, or whether they're the year finals.
But this one would require a lot of subqueries to write the report card, which is also a pain to maintain.
Which of the two is better, or is there some other way?
Thanks
You could try structuring it like this with your tables. I didn't have all the information so I made some guesses at what you might need or do with it all.
TimePeriods:
ID(INT)
PeriodTimeStart(DateTime)
PeriodTimeEnd(DateTime)
Name(VARCHAR(50))
Students:
ID(INT)
FirstName(VARCHAR(60))
LastName(VARCHAR(60))
Birthday(DateTime)
[any other relevant student field information added, like contact info, etc.]
Grading:
ID(INT)
StudentID(INT)
GradeValue(float)
TimePeriodID(INT)
IsRecoveringGrade(boolean)
MissedClasses:
ID(INT)
StudentID(INT)
ClassID(INT)
TimePeriodID(INT)
DateMissed(DateTime)
Classes:
ID(INT)
ClassName (VARCHAR(50))
ClassDescription (TEXT)
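A minimal sketch of how this layout could be exercised, using Python's built-in sqlite3 module so it runs anywhere. The ClassID column on Grading is an assumption (it is not in the list above) added so grades can be reported per subject, only a few of the columns are created, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TimePeriods (ID INTEGER PRIMARY KEY, Name VARCHAR(50));
CREATE TABLE Classes     (ID INTEGER PRIMARY KEY, ClassName VARCHAR(50));
CREATE TABLE Grading (
    ID INTEGER PRIMARY KEY,
    StudentID INT,
    ClassID INT REFERENCES Classes(ID),   -- assumption: not in the list above
    GradeValue FLOAT,
    TimePeriodID INT REFERENCES TimePeriods(ID),
    IsRecoveringGrade BOOLEAN
);
CREATE TABLE MissedClasses (
    ID INTEGER PRIMARY KEY, StudentID INT, ClassID INT,
    TimePeriodID INT, DateMissed DATETIME
);
INSERT INTO TimePeriods VALUES (1, '1st Trimester'), (2, '2nd Trimester');
INSERT INTO Classes VALUES (1, 'Math');
-- made-up sample data for student 1
INSERT INTO Grading (StudentID, ClassID, GradeValue, TimePeriodID, IsRecoveringGrade)
VALUES (1, 1, 5.0, 1, 0), (1, 1, 6.0, 1, 0), (1, 1, 4.0, 1, 1), (1, 1, 8.0, 2, 0);
INSERT INTO MissedClasses (StudentID, ClassID, TimePeriodID, DateMissed)
VALUES (1, 1, 1, '2010-03-02');
""")

# One report-card line per subject and period: the average of the regular
# grades, the recovering grade (if any), and the number of missed classes.
report = conn.execute("""
SELECT c.ClassName, p.Name,
       AVG(CASE WHEN g.IsRecoveringGrade = 0 THEN g.GradeValue END) AS Average,
       MAX(CASE WHEN g.IsRecoveringGrade = 1 THEN g.GradeValue END) AS Recovering,
       (SELECT COUNT(*) FROM MissedClasses m
         WHERE m.StudentID = g.StudentID
           AND m.ClassID = g.ClassID
           AND m.TimePeriodID = g.TimePeriodID) AS Missed
FROM Grading g
JOIN Classes c     ON c.ID = g.ClassID
JOIN TimePeriods p ON p.ID = g.TimePeriodID
WHERE g.StudentID = 1
GROUP BY g.StudentID, g.ClassID, g.TimePeriodID, c.ClassName, p.Name
ORDER BY g.TimePeriodID
""").fetchall()

for line in report:
    print(line)
```

Switching from trimesters to bimesters would then only mean different rows in TimePeriods, not different columns.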
I think the best solution is to store one row per period. So you'd have a table like:
grades
------
studentID
periodNumber
averageGrade
missedClasses
recoveringGrade
So if it's 2 semesters, you'd have periods 1 and 2. I'd suggest using period 0 to mean "overall for the year".
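To show that the report card does not need a pile of subqueries with this layout, here is a rough sketch using Python's sqlite3 module. The subject column, the report_card helper and the sample values are assumptions added for illustration; the pivot itself is plain conditional aggregation, one MAX(CASE ...) per period and field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (
    studentID INT,
    subject TEXT,            -- assumption: not in the table above
    periodNumber INT,        -- 0 means "overall for the year"
    averageGrade REAL,
    missedClasses INT,
    recoveringGrade REAL,
    PRIMARY KEY (studentID, subject, periodNumber)
);
-- made-up sample rows
INSERT INTO grades VALUES
    (1, 'Math', 1, 5.33, 1, 4.0),
    (1, 'Math', 2, 8.0, 0, NULL),
    (1, 'Math', 3, 7.0, 2, NULL),
    (1, 'Math', 0, 6.5, NULL, 7.0);
""")

def report_card(conn, student_id, periods):
    """Return one row per subject with avg/missed/recovering for each period."""
    fields = ("averageGrade", "missedClasses", "recoveringGrade")
    # Each (period, field) pair becomes one output column.
    cols = ", ".join(
        f"MAX(CASE WHEN periodNumber = {int(p)} THEN {field} END)"
        for p in periods
        for field in fields
    )
    sql = (f"SELECT subject, {cols} FROM grades "
           "WHERE studentID = ? GROUP BY subject ORDER BY subject")
    return conn.execute(sql, (student_id,)).fetchall()

# Trimesters 1..3 followed by the year row (period 0).
rows = report_card(conn, 1, periods=[1, 2, 3, 0])
print(rows)
```

If the school moved to bimesters, only the `periods` list passed to the helper would change.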
It's better to have a second table representing trimester, and have a foreign key reference to the trimester from the grades table (and store individual grades in the grades table). Then do the averages, missed classes, etc using SQL functions SUM and AVG.
This comes to mind.
(But seriously, err on the side of too many tables, not too few. Handruin has the best solution I see so far).
Related
Due to partial duplicates in some of my database, after some LEFT JOINs I wind up with several (but not all) rows where I have partial data, along with NULLs. For a unique user, one row may have a ZIP code, and another row may have the STATE of that same user.
Let me show you an example:
|email |state |zip |
|-----------------|------|------|
|unique#email.com |NULL |40502 |
|unique#email.com |KY |NULL |
|other#email.com |FL |34744 |
|other#email.com |FL |34744 |
|third#email.com |OH |NULL |
Rows with full duplicates (such as other#email.com in my example) are easy enough to clean up with a GROUP BY clause, and some people, like third#email.com in my example, have NULLs, which is OK. But for unique#email.com I have the state in one row and the zip in another. What is the best way to combine those two into one row?
A desired result would be:
|email |state |zip |
|-----------------|------|------|
|unique#email.com |KY |40502 |
|other#email.com |FL |34744 |
|third#email.com |OH |NULL |
For the data you have provided, you can use aggregation:
select email, max(state) as state, max(zip) as zip
from t
group by email;
That said, you can probably fix this in the query used to generate the data. Also, if you want multiple rows for a given email in the result set, then you should ask a new question with a clearer example of data.
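The aggregation works because MAX ignores NULLs, so each column keeps whichever non-NULL value the group has. A small check against the sample data from the question, run through Python's sqlite3 module (the table name t follows the query above; the ORDER BY is added only to make the output order predictable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (email TEXT, state TEXT, zip TEXT);
INSERT INTO t VALUES
    ('unique#email.com', NULL, '40502'),
    ('unique#email.com', 'KY', NULL),
    ('other#email.com',  'FL', '34744'),
    ('other#email.com',  'FL', '34744'),
    ('third#email.com',  'OH', NULL);
""")

rows = conn.execute("""
SELECT email, MAX(state) AS state, MAX(zip) AS zip
FROM t
GROUP BY email
ORDER BY email
""").fetchall()

for row in rows:
    print(row)
```

One caveat: if an email ever has two different non-NULL states or zips, MAX silently keeps the greatest one, so the columns of the combined row may come from different source rows.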
I came upon a problem with source data that I need to solve.
I have a table with All Employees column that has several names that I would like to extract and fill other columns with. Please see example below where I have the All Employees from raw data and I have to fill all other ones on the right.
Task|All Employees |Lead Employee1|Lead Employee2|Lead Employee|Reg Employee1|Reg Employee2|Reg Employee
1 |Mark Emily Robert|Mark |Emily |Multiple |Robert |NULL |Robert
2 |Mark Robert |Mark |NULL |Mark |Robert |NULL |Robert
3 |Robert |NULL |NULL |NULL |Robert |NULL |Robert
There's around 50 employees and a small rotation (people come and go).
The easiest solution would be to use several nested IIFs for every group (more or less 20 employees per group). That would mean changing the IIF every time there is a change in the team. I was thinking of streamlining it a bit and using an additional table where I could keep track of current and previous employees, as below.
Team members table
Employee|Position
Mark |Lead Employee
Emily |Lead Employee
Robert |Reg Employee
There should be one employee per group assigned to a task so I have to keep track of all situations where there is a multiple of them (handing over a task to a colleague for vacation for example).
I don't have a problem with getting data for a group (simple WHERE clause), but I don't know if there is a way to use some LIKE expression that would check whether there is any occurrence of (for example) a Lead Employee and fill a column with it. I know that filling the second column would be easier, because I would use a similar query and just exclude the already found employee (replace it with an empty string).
Can you tell me if this is doable (if yes, please give me some hint or direction), or should I stick with nested IIFs?
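One possible direction, sketched with Python's sqlite3 module rather than the asker's actual database: join the tasks to the team members table with a LIKE on the space-padded name list, then pick the first and second match per position. The table names tasks and team, the summarize helper, and the assumption that names are single words separated by single spaces are all mine.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (task INT, all_employees TEXT);
CREATE TABLE team  (employee TEXT, position TEXT);
INSERT INTO tasks VALUES (1, 'Mark Emily Robert'), (2, 'Mark Robert'), (3, 'Robert');
INSERT INTO team VALUES
    ('Mark', 'Lead Employee'), ('Emily', 'Lead Employee'), ('Robert', 'Reg Employee');
""")

# Padding both sides with spaces makes the LIKE match whole names only.
# INSTR gives the position of the name in the list, used to keep list order.
matches = conn.execute("""
SELECT t.task, m.position, m.employee
FROM tasks t
JOIN team m
  ON (' ' || t.all_employees || ' ') LIKE ('% ' || m.employee || ' %')
ORDER BY t.task,
         INSTR(' ' || t.all_employees || ' ', ' ' || m.employee || ' ')
""").fetchall()

found = defaultdict(list)
for task, position, employee in matches:
    found[(task, position)].append(employee)

def summarize(task, position):
    """Return (Employee1, Employee2, summary) for one task and one position."""
    names = found.get((task, position), [])
    first = names[0] if names else None
    second = names[1] if len(names) > 1 else None
    summary = "Multiple" if len(names) > 1 else first
    return first, second, summary

print(summarize(1, "Lead Employee"))
print(summarize(1, "Reg Employee"))
```

With this shape, a change in the team is a row change in the team members table rather than an edit to nested IIFs.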
Quite a long-winded question:
As a hypothetical situation, I am trying to split a table of data between two companies: OPM and MON.
|NAME |ACCOUNT |BALANCE |COMPANY
_______________________________________________
|SMITH |11111 |100 |
|SMITH |11111 |150 |
|HUNTER |11121 |200 |
|HUNTER |11131 |250 |
|LITTLE |11141 |300 |
|RIDLEY |11151 |300 |
|RIDLEY |11151 |100 |
|ARMSTRONG |11161 |150 |
|ARMSTRONG |11171 |150 |
|HENRY |11181 |100 |
There are several scenarios in the customer data:
1. Customer has two accounts, both with the same account number but with different balances.
2. Customer has two accounts, with different account numbers and different balances.
3. Customer has one account, one account number, one balance.
I need to write out logic in SQL / PL-SQL that enables the data to fulfill an allocation to either of the two different companies and that also follows rules.
A customer, regardless of how many accounts, must be allocated to the same company.
The value of accounts must be roughly equal.
The volume of accounts must be roughly equal.
I accept the limitations of the data I have provided, but the logic must extrapolate to larger datasets. What is the best method to achieve this?
What you are trying to do is a bin-packing problem, and this is generally hard. However, you simply state that the two groups need to be approximately equal. So, I would suggest adding up the balances for each customer and taking a stratified sample:
select name, balance,
       (case when mod(seqnum, 2) = 0 then 'Company1' else 'Company2' end) as company
from (select name, sum(balance) as balance,
             row_number() over (order by sum(balance)) as seqnum
      from table t  -- "table" stands in for your table name
      group by name
     ) n
Note: this is an approximate approach. It puts half the customers in each group, and the groups should have similar total balances. There are many cases where this will not produce an optimal solution (such as when one "name" has very large balances compared to everyone else), but it might be good enough.
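To see how the alternating assignment behaves on the sample data from the question, here is a small sketch using Python's sqlite3 module. The table name accounts is an assumption, the company labels are taken from the question, and the alternation is done with enumerate in Python instead of row_number(), which plays the same role as mod(seqnum, 2) in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, account INT, balance INT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", [
    ("SMITH", 11111, 100), ("SMITH", 11111, 150),
    ("HUNTER", 11121, 200), ("HUNTER", 11131, 250),
    ("LITTLE", 11141, 300),
    ("RIDLEY", 11151, 300), ("RIDLEY", 11151, 100),
    ("ARMSTRONG", 11161, 150), ("ARMSTRONG", 11171, 150),
    ("HENRY", 11181, 100),
])

# One row per customer, smallest total first; name breaks ties deterministically.
customers = conn.execute("""
    SELECT name, SUM(balance) AS balance, COUNT(*) AS n_rows
    FROM accounts
    GROUP BY name
    ORDER BY SUM(balance), name
""").fetchall()

allocation = {}
totals = {"OPM": [0, 0], "MON": [0, 0]}  # company -> [total balance, account rows]
for seqnum, (name, balance, n_rows) in enumerate(customers, start=1):
    # Even sequence numbers go to one company, odd ones to the other.
    company = "OPM" if seqnum % 2 == 0 else "MON"
    allocation[name] = company
    totals[company][0] += balance
    totals[company][1] += n_rows

print(allocation)
print(totals)
```

On this data each company ends up with three customers and five account rows, with total balances of 1000 and 800. Every customer lands in exactly one company because the split is done on the per-customer totals.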
I am working through a group by problem and could use some direction at this point. I want to summarize a number of variables by a grouping level which is different (but the same domain of values) for each of the variables to be summed. In pseudo-pseudo code, this is my issue: For each empYEAR variable (there are 20 or so employment-by-year variables in wide format), I want to sum it by the county in which the business was located in that particular year.
The data is a bunch of tables representing business establishments over a 20-year period from Dun & Bradstreet/NETS.
More details on the database, which is a number of flat files, all with the same primary key.
The primary key is DUNSNUMBER, which is present in several tables. There are tables detailing, for each year:
employment
county
sales
credit rating (and others)
all organized as follows (this table shows employment, but the other variables are similarly structured, with a year postfix).
dunsnumber|emp1990 |emp1991|emp1992|... |emp2011|
a | 12 |32 |31 |... | 35 |
b | |2 |3 |... | 5 |
c | 1 |1 | |... | |
d | 40 |86 |104 |... | 350 |
...
I would ultimately like to have a table that is structured like this:
county |emp1990|emp1991|emp1992|...|emp2011|sales1990|sales1991|sales1992|sales2011|...
A
B
C
...
My main challenge right now is this: How can I sum employment (or sales) by county by year, as in the example table above, given that county as a grouping variable sometimes changes from year to year and is specified in another table?
It seems like something that would be fairly straightforward to do in, say, R with a long data format, but there are millions of records, so I prefer to keep the initial processing in postgres.
As I understand your question, this sounds relatively straightforward. While I normally prefer normalized data to work with, I don't see that normalizing things beforehand will buy you anything specific here.
It seems to me you want something relatively simple like:
SELECT c.name, c.state, sum(emp1990), sum(emp1991), ....
FROM county c
JOIN emp e ON c.dunsnumber = e.dunsnumber
JOIN sales s ON c.dunsnumber = s.dunsnumber
JOIN ....
GROUP BY c.name, c.state;
I don't see a simpler way of doing this. Very likely you could query the system catalogs or information schema to generate a list of columns to sum up. The rest is a straight group by and join process as far as I can tell.
If the variable changes by name, the best thing to do in my experience is to put together a location view based on that union and join against it. This lets you hide the complexity from your main queries, and as long as you don't also join the underlying tables it should perform quite well.
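A rough sketch of the union idea for the case where the county differs by year: pair each empYYYY column with the county column of the same year, stack the years with UNION ALL, then pivot back with conditional sums. It uses Python's sqlite3 module to generate and run the SQL; the county table layout (county1990, county1991, ...), the two-year range and the sample rows are assumptions based on the "year postfix" description in the question. The generated SQL is plain enough that it should carry over to postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp    (dunsnumber TEXT PRIMARY KEY, emp1990 INT, emp1991 INT);
CREATE TABLE county (dunsnumber TEXT PRIMARY KEY, county1990 TEXT, county1991 TEXT);
-- made-up sample rows: "d" moves from county A to county B in 1991
INSERT INTO emp VALUES ('a', 12, 32), ('b', NULL, 2), ('d', 40, 86);
INSERT INTO county VALUES ('a', 'A', 'A'), ('b', NULL, 'B'), ('d', 'A', 'B');
""")

years = [1990, 1991]

# Long format: one row per establishment and year, using that year's county.
long_sql = " UNION ALL ".join(
    f"SELECT c.county{y} AS county, {y} AS year, e.emp{y} AS emp "
    f"FROM emp e JOIN county c ON c.dunsnumber = e.dunsnumber"
    for y in years
)

# Pivot back to one column per year, grouped by county.
pivot_cols = ", ".join(
    f"SUM(CASE WHEN year = {y} THEN emp END) AS emp{y}" for y in years
)
sql = (f"SELECT county, {pivot_cols} FROM ({long_sql}) AS yearly "
       "WHERE county IS NOT NULL GROUP BY county ORDER BY county")

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The UNION ALL part could be saved as a view, so the main queries only ever see (county, year, emp) rows; sales and the other yearly variables would follow the same pattern.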
I want a table structure which can store student details in the format below.
If the student is in:
10th standard -> I need his aggregate % from 1st standard to 9th standard.
5th standard -> I need his aggregate % from 1st standard to 4th standard.
1st standard -> No aggregate % has to be displayed.
And the most important thing is that we need to use only one table. Please suggest a table structure with no redundant values.
Any ideas will be greatly appreciated.
No, friends, this is not homework. This was asked in an Oracle interview conducted in Hyderabad the day before yesterday (24th July, 2010). He asked me for the table structure.
He did not even ask me for the query. He asked me how I would design the table. Please advise me.
id | name | grade | aggregate
This would do the trick: id is your primary key, name is the student's first and last name, grade is the grade he is in, and aggregate is the aggregate % based on the grade.
For example, some rows might be:
10 | Bill Cosby | 10 | 90
11 | Jerry Seinfeld | 4 | 60
Bill Cosby would have an aggregate percent of 90 in grades 1-9, and Jerry would have 60 in grades 1-3. In this case it is one table, and it boils down to you managing the rule of aggregation for this table, since it has to be one table.
If this is an interview question, it looks like they would like to check your knowledge of Nested Tables. Essentially you would have one column for the roll number, and another column which is a nested table of Class and Percentage.