How to stop redundancy - sql

Here is my little SQL table which contains sequels of movies:
CREATE TABLE "films"
(
"title" TEXT NOT NULL,
"year" INTEGER NOT NULL,
"predecessor_title" TEXT NOT NULL,
"predecessor_year" INTEGER NOT NULL,
"increase" NUMERIC NOT NULL
)
Here are some of my values:
Toy Story 2 1999 Toy Story 1995 28%
Toy Story 3 2010 Toy Story 2 1999 69%
As you can see, it is somewhat redundant: Toy Story 2 shows up both as a title and as a predecessor. How can I create a table that avoids this?

You can:
CREATE TABLE FILMS (
ID INTEGER NOT NULL PRIMARY KEY,
TITLE TEXT NOT NULL,
YEAR INTEGER NOT NULL,
PREDECESSOR_ID INTEGER,
INCREASE NUMERIC
)
ID is the primary key.
PREDECESSOR_ID is the ID of the predecessor movie. Can be null because the first movie of the series doesn't have a predecessor.
INCREASE is the increase. Can be null because the first movie doesn't have an increase.
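A minimal sketch of how such a self-referencing table could be queried, using Python's built-in sqlite3 module and an in-memory database. The foreign key on predecessor_id and the increase values stored as fractions (0.28 for 28%) are assumptions made for illustration; they are not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE films (
        id             INTEGER NOT NULL PRIMARY KEY,
        title          TEXT    NOT NULL,
        year           INTEGER NOT NULL,
        predecessor_id INTEGER REFERENCES films (id),  -- assumed foreign key
        increase       NUMERIC
    );
    INSERT INTO films VALUES (1, 'Toy Story',   1995, NULL, NULL);
    INSERT INTO films VALUES (2, 'Toy Story 2', 1999, 1,    0.28);
    INSERT INTO films VALUES (3, 'Toy Story 3', 2010, 2,    0.69);
""")

# Self-join: each film row is paired with its predecessor row, if it has one.
rows = conn.execute("""
    SELECT f.title, f.year, p.title, p.year, f.increase
    FROM films AS f
    LEFT JOIN films AS p ON p.id = f.predecessor_id
    ORDER BY f.year
""").fetchall()

for row in rows:
    print(row)
```

Each title is now stored once, and the predecessor's title and year are looked up through the join instead of being copied into every sequel's row.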

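If you map the table with an ORM tool such as Hibernate, the entity could look like this:
import javax.persistence.*; // jakarta.persistence.* in newer versions
import javax.validation.constraints.NotNull;

@Entity
public class Film {
    @Id
    @GeneratedValue
    int id;
    @NotNull
    String title;
    @NotNull
    Integer year;
    // Reference to the preceding film; null for the first film of a series.
    @ManyToOne
    @JoinColumn(name = "predecessor_id")
    Film predecessor;
    Integer increase;
}
Here the predecessor field keeps a reference to the preceding film. It does not matter if a film has no predecessor: the field is simply null in that case.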

Related

Multiple inner joins in a query

Players{
Pid int primary key,
tid int not null references Teams,
name text not null,
age int not null
}
Teams{
tid int primary key,
name text not null,
location not null
}
Possessions{
id int primary key,
pid int not null references Players,
time int not null, //the time the possession started for a player
held int not null //for how much time he had the ball
}
I would like to create a view called Teampasses from which I can select (passer, passee) pairs as follows: the passer and passee must be on the same team, and the passee's possession must start at the passer's possession start time plus held (the time the passer had the ball). What I have done so far is this:
SELECT x.name AS passer,y.name as Pasee
FROM player x
INNER JOIN player y ON x.tid=y.tid
INNER JOIN possesions p ON p.pid=x.pid AND p.pid=y.pid AND ...
In the ... part after AND, I would like to write something like x.time + x.held = y.time. How can I refer to those two possessions?
I see an issue with your data:
Your Possessions table only has a single foreign key to the Players table for the passer. It needs to also include the Pid of the Passee. Otherwise, there's no way to filter out which player on the passer team is the Passee for a given Possession.
I would suggest changing the Possessions table as follows:
CREATE TABLE Possessions (
id int primary key,
pid_passer int not null references Players,
pid_passee int not null references Players,
time int not null, -- the time the possession started for the passer
held int not null -- for how long the passer had the ball
);
With this change, your data will work and the query becomes trivial.
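As a sketch of what that query might look like under the revised schema, the example below builds the tables in an in-memory SQLite database through Python's sqlite3 module. The sample teams, players and possessions are invented for illustration, and the tables are trimmed to the columns the view needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams (tid INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Players (
        pid  INTEGER PRIMARY KEY,
        tid  INTEGER NOT NULL REFERENCES Teams,
        name TEXT NOT NULL
    );
    CREATE TABLE Possessions (
        id         INTEGER PRIMARY KEY,
        pid_passer INTEGER NOT NULL REFERENCES Players,
        pid_passee INTEGER NOT NULL REFERENCES Players,
        time       INTEGER NOT NULL,
        held       INTEGER NOT NULL
    );

    -- Join Players twice, once per role, and keep same-team pairs only.
    CREATE VIEW Teampasses AS
    SELECT x.name AS passer, y.name AS passee
    FROM Possessions AS p
    INNER JOIN Players AS x ON x.pid = p.pid_passer
    INNER JOIN Players AS y ON y.pid = p.pid_passee
    WHERE x.tid = y.tid;

    INSERT INTO Teams VALUES (1, 'Reds'), (2, 'Blues');
    INSERT INTO Players VALUES (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cid');
    INSERT INTO Possessions VALUES (1, 1, 2, 0, 5);  -- Ann to Bob, same team
    INSERT INTO Possessions VALUES (2, 2, 3, 5, 3);  -- Bob to Cid, other team
""")

passes = conn.execute("SELECT passer, passee FROM Teampasses").fetchall()
print(passes)
```

Because each possession row names both players, the view no longer needs any arithmetic on time and held to work out who received the ball.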

SQL: Modelling template inheritance

I wanted to ask whether it is possible to model a templated data structure that can be overridden where necessary.
Suppose we have a list with the following items:
Template List
Item 1 Position 0
Item 2 Position 1
Item 3 Position 2
Item 4 Position 3
Now I want to create a list which uses Template List as a base, but modifies some parts of it:
Concrete List, based on Template List
Item 1 Position 0 // Inherited from Template List
Item 5 Position 1 // New and only available in Concrete List
Item 4 Position 2 // Inherited from Template List, but with a different position
Item 3 Position 3 // Inherited from Template List, but with a different position
In this list, Item 2 from Template List is missing and should not be part of the resulting list.
Is it possible to model these relations in SQL? (We are using PostgreSQL)
It's possible to do something like what you want, but it's not necessarily a good solution or what you need. What you're asking for looks like a metamodel, but relational databases were designed for first-order logical models, and while SQL can go beyond that somewhat, it's usually better not to go too abstract.
That said, here's an example. I assumed the identity of a list item is position-based, that is, determined by its slot.
CREATE TABLE template_list (
template_list_id SERIAL NOT NULL,
PRIMARY KEY (template_list_id)
);
CREATE TABLE template_list_items (
template_list_id INTEGER NOT NULL,
slot_number INTEGER NOT NULL,
item_number INTEGER NOT NULL,
PRIMARY KEY (template_list_id, slot_number),
FOREIGN KEY (template_list_id) REFERENCES template_list (template_list_id)
);
CREATE TABLE concrete_list (
concrete_list_id SERIAL NOT NULL,
template_list_id INTEGER NOT NULL,
FOREIGN KEY (template_list_id) REFERENCES template_list (template_list_id),
UNIQUE (concrete_list_id, template_list_id)
);
CREATE TABLE concrete_list_items (
concrete_list_id INTEGER NOT NULL,
template_list_id INTEGER NOT NULL,
slot_number INTEGER NOT NULL,
item_number INTEGER NULL,
PRIMARY KEY (concrete_list_id, slot_number),
FOREIGN KEY (concrete_list_id, template_list_id) REFERENCES concrete_list (concrete_list_id, template_list_id),
FOREIGN KEY (template_list_id, slot_number) REFERENCES template_list_items (template_list_id, slot_number)
);
Now, to get the items in a concrete list, you would use a query like:
SELECT c.concrete_list_id, x.slot_number, x.item_number
FROM concrete_list c
LEFT JOIN (
SELECT ci.concrete_list_id,
COALESCE(ci.template_list_id, ti.template_list_id) AS template_list_id,
COALESCE(ci.slot_number, ti.slot_number) AS slot_number,
COALESCE(ci.item_number, ti.item_number) AS item_number
FROM concrete_list_items AS ci
FULL JOIN template_list_items AS ti ON ci.template_list_id = ti.template_list_id
AND ci.slot_number = ti.slot_number
) x ON c.concrete_list_id = x.concrete_list_id OR c.template_list_id = x.template_list_id;
Here's a SQL fiddle for demonstration. Note that I replaced the serial types with integers and hardcoded values for simplicity of demonstration.
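The override rule expressed by the COALESCE calls can also be illustrated outside SQL. The sketch below is plain Python rather than the query itself, and it assumes a single concrete list whose rows override template rows slot by slot. The dictionaries and the resolve function are hypothetical names introduced for this illustration.

```python
def resolve(template_items, concrete_items):
    """Return slot -> item, preferring the concrete list's entry for a slot."""
    merged = dict(template_items)            # start from the inherited slots
    for slot, item in concrete_items.items():
        if item is not None:                 # like COALESCE: NULL falls back
            merged[slot] = item
    return dict(sorted(merged.items()))

# Slot number -> item number, using the lists from the question.
template_items = {0: 1, 1: 2, 2: 3, 3: 4}
concrete_items = {1: 5, 2: 4, 3: 3}

resolved = resolve(template_items, concrete_items)
print(resolved)
```

Slot 0 is inherited unchanged, while slots 1 to 3 take the concrete list's items. Note that under this rule, as in the query, a NULL item in the concrete list falls back to the template's item instead of removing the slot.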

GroupBy query for billion records - Vertica

I am working on an application whose tables hold billions of records, and I need a query that uses a GROUP BY clause.
Table Schema:
CREATE TABLE event (
eventId INTEGER PRIMARY KEY,
eventTime INTEGER NOT NULL,
sourceId INTEGER NOT NULL,
plateNumber VARCHAR(10) NOT NULL,
plateCodeId INTEGER NOT NULL,
plateCountryId INTEGER NOT NULL,
plateStateId INTEGER NOT NULL
);
CREATE TABLE source (
sourceId INTEGER PRIMARY KEY,
sourceName VARCHAR(32) NOT NULL
);
Scenario:
The user selects sources, for example source IDs (1,2,3).
We need all events that occurred more than once for those sources within an event time range.
Two events count as the same when plateNumber, plateCodeId, plateStateId and plateCountryId all match.
I have written a query for this, but it takes a long time to execute.
select plateNumber, plateCodeId, plateStateId,
plateCountryId, sourceId,count(1) from event
where sourceId in (1,2,3)
group by sourceId, plateCodeId, plateStateId,
plateCountryId, plateNumber
having count(1) > 1 limit 10 offset 0
Can you recommend an optimized query for it?
Since you didn't supply the projection DDL, I'll assume the projection is the default one created by the CREATE TABLE statement.
Your goal is to get Vertica to use the GROUPBY PIPELINED algorithm instead of GROUPBY HASH, which is usually slower and consumes more memory.
To do so, the table's projection needs to be sorted by the columns in the GROUP BY clause.
More info here: GROUP BY Implementation Options
CREATE TABLE event (
eventId INTEGER PRIMARY KEY,
eventTime INTEGER NOT NULL,
sourceId INTEGER NOT NULL,
plateNumber VARCHAR(10) NOT NULL,
plateCodeId INTEGER NOT NULL,
plateCountryId INTEGER NOT NULL,
plateStateId INTEGER NOT NULL
)
ORDER BY sourceId,
plateCodeId,
plateStateId,
plateCountryId,
plateNumber;
You can see which algorithm is being used by adding EXPLAIN before your query.
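To illustrate why the sort order matters, here is a plain-Python sketch (not Vertica code) of the two aggregation strategies. The function names and sample rows are hypothetical. A hash-based aggregation keeps a counter for every distinct group until all input has been read, whereas a pipelined aggregation over input that is already sorted by the grouping columns only has to track the current group.

```python
from collections import Counter
from itertools import groupby

def group_hash(rows):
    """Hash aggregation: one counter per distinct key, held until the end."""
    counts = Counter(rows)
    return {key: n for key, n in counts.items() if n > 1}

def group_pipelined(sorted_rows):
    """Streaming aggregation: requires input sorted by the grouping key."""
    for key, run in groupby(sorted_rows):
        n = sum(1 for _ in run)
        if n > 1:                  # same filter as HAVING COUNT(1) > 1
            yield key, n

# (sourceId, plateCodeId, plateStateId, plateCountryId, plateNumber)
rows = [
    (1, 7, 3, 44, "ABC123"),
    (2, 7, 3, 44, "XYZ999"),
    (1, 7, 3, 44, "ABC123"),
    (3, 1, 1, 44, "KLM555"),
]

hashed = group_hash(rows)
pipelined = dict(group_pipelined(sorted(rows)))
print(hashed == pipelined, pipelined)
```

Both strategies return the same groups. In this sketch the sorted() call stands in for the work that the projection's ORDER BY clause does ahead of time in the database.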

Oracle - Trigger [closed]

I am new to PL/SQL. I have the following two tables: UserGame and Game.
CREATE TABLE Game (
GameID INT NOT NULL,
Name CHAR(100) NOT NULL,
Description CHAR(100) NOT NULL,
Publisher CHAR(100) NOT NULL,
AgeRating INT NOT NULL,
ImageLink CHAR(100),
WebsiteUrl CHAR(100),
AverageRating FLOAT,
OverallRanking INT,
CONSTRAINT pkGameId
PRIMARY KEY (GameID),
CONSTRAINT AgeRating CHECK (AgeRating >= 0),
CONSTRAINT OverallRankingMin CHECK (OverallRanking >= 0)
);
CREATE TABLE UserGame (
PlayerID INT NOT NULL,
GameID INT NOT NULL,
Rating INT,
RatingComment CHAR(100),
LastPlayed DATE,
HighestScore INT,
InProgress CHAR(1),
CONSTRAINT pkUserGame
PRIMARY KEY (PlayerID, GameID),
CONSTRAINT fkPlayerID
FOREIGN KEY (PlayerID)
REFERENCES Player (PlayerID),
CONSTRAINT fkGameIdTer
FOREIGN KEY (GameID)
REFERENCES Game (GameID),
CONSTRAINT RatingMin CHECK (Rating >= 0),
CONSTRAINT RatingMax CHECK (Rating <= 5),
CONSTRAINT HighestScore CHECK (HighestScore >= 0),
CONSTRAINT InProgress CHECK (InProgress IN (0,1))
);
I would like to update the average rating of a game, every time a player updates a rating in UserGame.
This is what I came up with.
CREATE OR REPLACE TRIGGER averageUpdate
AFTER UPDATE OF Rating ON UserGame
BEGIN
FOR r1 in (SELECT DISTINCT GameID FROM UserGame)
LOOP
UPDATE Game
SET Game.AverageRating = (SELECT AVG(Rating) FROM UserGame WHERE GameID = r1.GameID GROUP BY GameID)
WHERE Game.GameID = r1.GameID;
END LOOP;
END averageUpdate;
But it does not work and I get this error:
Error: ORA-00900: invalid SQL statement
SQLState: 42000
ErrorCode: 900
Error occured in:
END LOOP
Could anyone explain to me what I am doing wrong?
It seems the code I had posted was correct, as confirmed by Justin Cave. There must be something wrong with my set-up then.
To make sure of this, I ran the queries using SQL Fiddle, with success.
Are you sure that what you posted is actually what you're running? It works fine for me (once I remove the foreign keys to tables that you haven't provided).
SQL> CREATE TABLE Game (
2 GameID INT NOT NULL,
3 Name CHAR(100) NOT NULL,
4 Description CHAR(100) NOT NULL,
5 Publisher CHAR(100) NOT NULL,
6 AgeRating INT NOT NULL,
7 ImageLink CHAR(100),
8 WebsiteUrl CHAR(100),
9 AverageRating FLOAT,
10 OverallRanking INT,
11 CONSTRAINT pkGameId
12 PRIMARY KEY (GameID),
13 CONSTRAINT AgeRating CHECK (AgeRating >= 0),
14 CONSTRAINT OverallRankingMin CHECK (OverallRanking >= 0)
15 );
Table created.
SQL> ed
Wrote file afiedt.buf
1 CREATE TABLE UserGame (
2 PlayerID INT NOT NULL,
3 GameID INT NOT NULL,
4 Rating INT,
5 RatingComment CHAR(100),
6 LastPlayed DATE,
7 HighestScore INT,
8 InProgress CHAR(1),
9 CONSTRAINT fkGameIdTer
10 FOREIGN KEY (GameID)
11 REFERENCES Game (GameID),
12 CONSTRAINT RatingMin CHECK (Rating >= 0),
13 CONSTRAINT RatingMax CHECK (Rating <= 5),
14 CONSTRAINT HighestScore CHECK (HighestScore >= 0),
15 CONSTRAINT InProgress CHECK (InProgress IN (0,1))
16* )
SQL> /
Table created.
SQL> CREATE OR REPLACE TRIGGER averageUpdate
2 AFTER UPDATE OF Rating ON UserGame
3 BEGIN
4 FOR r1 in (SELECT DISTINCT GameID FROM UserGame)
5 LOOP
6 UPDATE Game
7 SET Game.AverageRating = (SELECT AVG(Rating) FROM UserGame WHERE GameID = r1.GameID GROUP BY GameID)
8 WHERE Game.GameID = r1.GameID;
9 END LOOP;
10 END averageUpdate;
11 /
Trigger created.
Can you cut and paste from a SQL*Plus session just as I did here showing exactly what you are doing?
This doesn't have any impact on your current question. But I would strongly suggest that you not use char(100) or float data types in this data model. All these strings are variable length so you should be using varchar2. char is a fixed-width data type. A char(100) will always store exactly 100 bytes of data. If your actual data is less than that, Oracle will add spaces at the end. If you try to search for a particular value in the table and you end up with char comparison semantics, you'll need to ensure that the search string is space-padded to 100 bytes. A varchar2 is a variable-width data type. It uses only as much space as is required for the actual data. It doesn't do pointless and wasteful space-padding of data. And you never need to worry about space-padding search strings.
I can also all but guarantee that you want your ratings to be number data types of some length and precision, not float. Floating point numbers are inherently imprecise, so a game that might average a score of 4.4 might be represented in a float as 4.3999999999865 or 4.4000000000107 (making the numbers up). It's very unlikely that is the sort of score that your users want to see. If you use a number(4,3), you'll get 3 decimal digits of precision and you won't have to deal with errors (or imprecision if you prefer) in the least significant bits of the data. A game that averages a score of 4.4 will have a value of 4.4, not something very very close to 4.4.
From a performance standpoint, I would strongly suggest that you not use a trigger to meet this requirement. Particularly not a trigger that recomputes the score for every game every time any game gets rated. That will not scale well and you will be spending gobs of time constantly recalculating scores. Assuming you need to store the computed score, you probably want it to be refreshed periodically not immediately when a rating is entered. If you do want to recompute the score every time a game is rated, only recompute the score for the game that was rated not for every game in the system.
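As a rough sketch of the last suggestion, the example below uses a row-level trigger that recomputes the average only for the game whose rating changed. It is written for SQLite and run through Python's sqlite3 module, so it illustrates the idea and is not Oracle syntax. In Oracle, a row-level trigger that queries UserGame while UserGame is being updated would typically fail with a mutating-table error (ORA-04091), so a different mechanism would be needed there. The tables are trimmed to the relevant columns and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Game (
        GameID        INTEGER PRIMARY KEY,
        AverageRating NUMERIC
    );
    CREATE TABLE UserGame (
        PlayerID INTEGER NOT NULL,
        GameID   INTEGER NOT NULL REFERENCES Game (GameID),
        Rating   INTEGER,
        PRIMARY KEY (PlayerID, GameID)
    );

    -- Row-level trigger: NEW.GameID identifies the one game that was rated.
    CREATE TRIGGER averageUpdate
    AFTER UPDATE OF Rating ON UserGame
    FOR EACH ROW
    BEGIN
        UPDATE Game
        SET AverageRating = (SELECT AVG(Rating) FROM UserGame
                             WHERE GameID = NEW.GameID)
        WHERE GameID = NEW.GameID;
    END;

    INSERT INTO Game (GameID) VALUES (1), (2);
    INSERT INTO UserGame VALUES (10, 1, 3), (11, 1, 5), (10, 2, 2);
""")

conn.execute("UPDATE UserGame SET Rating = 4 WHERE PlayerID = 10 AND GameID = 1")
averages = conn.execute(
    "SELECT GameID, AverageRating FROM Game ORDER BY GameID").fetchall()
print(averages)
```

Only game 1 is recomputed by the update. Game 2 keeps its previous AverageRating (NULL in this sample, since its rating was never updated).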

SQL - Selecting random rows and combining into a new table

Here's the creation of my tables...
CREATE TABLE questions(
id INTEGER PRIMARY KEY AUTOINCREMENT,
question VARCHAR(256) UNIQUE NOT NULL,
rangeMin INTEGER,
rangeMax INTEGER,
level INTEGER NOT NULL,
totalRatings INTEGER DEFAULT 0,
totalStars INTEGER DEFAULT 0
);
CREATE TABLE games(
id INTEGER PRIMARY KEY AUTOINCREMENT,
level INTEGER NOT NULL,
inclusive BOOL NOT NULL DEFAULT 0,
active BOOL NOT NULL DEFAULT 0,
questionCount INTEGER NOT NULL,
completedCount INTEGER DEFAULT 0,
startTime DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE gameQuestions(
gameId INTEGER,
questionId INTEGER,
asked BOOL DEFAULT 0,
FOREIGN KEY(gameId) REFERENCES games(id),
FOREIGN KEY(questionId) REFERENCES questions(id)
);
I'll explain the full steps that I'm doing, and then I'll ask for input.
I need to...
1. Using a games.id value, look up the games.questionCount and games.level for that game.
2. Now that I have games.questionCount and games.level, look at all of the rows in the questions table with questions.level = games.level and select games.questionCount of them at random.
3. Put the rows (i.e. the questions) from step 2 into the gameQuestions table, using the games.id value and the questions.id value.
What's the best way to accomplish this? I could do it with several different SQL queries, but I feel like someone really skilled with SQL could make it happen more efficiently. I am using sqlite3.
This does it in one statement. Let's assume :game_id to be the game id you want to process.
insert into gameQuestions (gameId, questionId)
select :game_id, id
from questions
where level = (select level from games where id = :game_id)
order by random()
limit (select questionCount from games where id = :game_id);
@Tony: The SQLite documentation says LIMIT takes an expression. The above statement works fine with SQLite 3.8.0.2 and produces the desired results. I have not tested an older version.
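For anyone who wants to try the statement, here is a small self-contained sketch using Python's sqlite3 module with an in-memory database. The tables are trimmed to the columns the statement touches, and the sample questions and the single game are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        question VARCHAR(256) UNIQUE NOT NULL,
        level    INTEGER NOT NULL
    );
    CREATE TABLE games (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        level         INTEGER NOT NULL,
        questionCount INTEGER NOT NULL
    );
    CREATE TABLE gameQuestions (
        gameId     INTEGER REFERENCES games (id),
        questionId INTEGER REFERENCES questions (id),
        asked      BOOL DEFAULT 0
    );
""")
conn.executemany(
    "INSERT INTO questions (question, level) VALUES (?, ?)",
    [(f"question {n}", 1 + n % 2) for n in range(10)],  # five per level
)
conn.execute("INSERT INTO games (level, questionCount) VALUES (2, 3)")

game_id = 1  # id of the game inserted above
conn.execute("""
    INSERT INTO gameQuestions (gameId, questionId)
    SELECT :game_id, id
    FROM questions
    WHERE level = (SELECT level FROM games WHERE id = :game_id)
    ORDER BY random()
    LIMIT (SELECT questionCount FROM games WHERE id = :game_id)
""", {"game_id": game_id})

picked = conn.execute("""
    SELECT q.id, q.level
    FROM gameQuestions AS gq
    JOIN questions AS q ON q.id = gq.questionId
    WHERE gq.gameId = :game_id
""", {"game_id": game_id}).fetchall()
print(picked)
```

Which three questions are picked varies from run to run, but they are always distinct and always match the game's level.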