Can I use SQL to model my data?

I am trying to develop a bidding system, where an item is listed, and bidders can place a bid, which includes a bid amount and a message. An item may have an arbitrary number of bids on it. Bidders should also be able to see all the bids they have made across different items.
I am unfamiliar with SQL, so am a little unsure how to model this scenario. I was thinking the following:
A User table, which stores information about bidders, such as name, ID number, etc.
A Bid table, which contains all the bids in the system, which stores the bidder's user ID, the bid amount, the bid description.
A Job table, which contains the User ID of the poster, an item description, and then references to the various bids.
The problem I am seeing is how can I store these references to the Bid table entries in the Job table entries?
Is this the right way to go about approaching this problem? Should I be considering a document-oriented database, such as Mongo, instead?

You're describing a many-to-many relationship. In very simplified form, your tables would look something like this:
user:
id int primary key
job:
id int primary key
bids:
user_id int
job_id int
primary key(user_id, job_id)
foreign key (user_id) references user (id)
foreign key (job_id) references job (id)
Basically, the bids table would contain fields to represent both the user and the job, along with whatever other fields you'd need, such as bid amount, date/time stamp, etc.
Now, I've made the user_id/job_id pair the primary key of the bids table, which limits each user to one bid per job. To remove that limit, drop the primary key and put a regular index on each of the two fields instead.

SQL will work fine like you have it set up... I would do:
create table usertable (
userID integer unsigned not null auto_increment primary key,
userName varchar(64) );
create table jobtable (
jobID integer unsigned not null auto_increment primary key,
jobDesc text,
posterUserRef integer not null );
create table bidtable (
bidID integer unsigned not null auto_increment primary key,
bidAmount integer,
bidDesc text,
bidTime datetime,
bidderUserRef integer not null references usertable(userID),
biddingOnJobRef integer not null references jobtable(jobID) );
Now you can figure out whatever you want with various joins (maximum bid per user, all bids for job, all bids by user, highest bidder for job, etc).
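The joins mentioned above are easy to try out. Here is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory SQLite database standing in for MySQL. The table and column names follow the answer above; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE usertable (
    userID   INTEGER PRIMARY KEY,
    userName TEXT
);
CREATE TABLE jobtable (
    jobID         INTEGER PRIMARY KEY,
    jobDesc       TEXT,
    posterUserRef INTEGER NOT NULL REFERENCES usertable(userID)
);
CREATE TABLE bidtable (
    bidID           INTEGER PRIMARY KEY,
    bidAmount       INTEGER,
    bidDesc         TEXT,
    bidderUserRef   INTEGER NOT NULL REFERENCES usertable(userID),
    biddingOnJobRef INTEGER NOT NULL REFERENCES jobtable(jobID)
);
-- Made-up sample data
INSERT INTO usertable VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO jobtable  VALUES (1, 'paint fence', 1), (2, 'mow lawn', 1);
INSERT INTO bidtable (bidAmount, bidDesc, bidderUserRef, biddingOnJobRef) VALUES
    (100, 'can start today', 2, 1),
    (120, 'premium paint',   3, 1),
    (40,  'own mower',       2, 2);
""")

# All bids for one job, highest first
bids_for_job = conn.execute(
    "SELECT u.userName, b.bidAmount FROM bidtable b "
    "JOIN usertable u ON u.userID = b.bidderUserRef "
    "WHERE b.biddingOnJobRef = ? ORDER BY b.bidAmount DESC", (1,)
).fetchall()

# All bids one user has made, across jobs
bids_by_user = conn.execute(
    "SELECT j.jobDesc, b.bidAmount FROM bidtable b "
    "JOIN jobtable j ON j.jobID = b.biddingOnJobRef "
    "WHERE b.bidderUserRef = ? ORDER BY j.jobID", (2,)
).fetchall()

# Highest bid per job
highest_per_job = conn.execute(
    "SELECT biddingOnJobRef, MAX(bidAmount) FROM bidtable "
    "GROUP BY biddingOnJobRef ORDER BY biddingOnJobRef"
).fetchall()
```

Note that the job table never stores references to its bids; each bid row points back at its job, and the join recovers the list.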

Related

Two postgresql tables referencing each other

Question may be basic, I don't have any experience with databases.
I have a postgres db with some tables. Two of them are dates and accounts.
The dates table has an account_id field referencing the id column of the accounts table, and a balance field that represents the balance that account had at that date. So many date entities may reference one account entity: many-to-one, okay.
But the accounts table also has an actual_date field, which must reference the date entity holding the actual balance this account has. One account entity may reference one actual date entity, but a date entity can have one or zero account entities referencing it. And if it does have an account referencing it through its actual_date, it will always be the same account that the date itself references with account_id.
What kind of relationship is this? Is it even possible to implement? And if it is, how do I do it?
I came up with this piece of code, but I have no clue if it does what I think it does.
CREATE TABLE accounts (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users,
actual_date_id DATE UNIQUE REFERENCES dates
);
CREATE TABLE dates (
id SERIAL PRIMARY KEY,
account_id INT REFERENCES accounts,
date DATE,
balance INT,
unconfirmed_balance INT
);
P.S. I create the tables with init.sql but work with them through SQLAlchemy, and it would be great if someone could also show how to define such a model with it.
As written the SQL script would never work for two reasons:
a foreign key must reference a primary key (or another unique key) of the target table, and the column types must match. REFERENCES dates without a column list points at the primary key of dates, which is an integer, so actual_date_id should be an integer rather than a DATE.
you can't reference a table that hasn't been created yet, so the foreign key between accounts and dates must be created after both tables are created.
With circular foreign keys it's usually easier to define at least one of them as deferrable, so that you can insert the rows without needing e.g. an intermediate NULL value.
So something along these lines (assuming that users already exists):
CREATE TABLE accounts (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users,
actual_date_id integer UNIQUE -- note the data type
);
CREATE TABLE dates (
id SERIAL PRIMARY KEY,
account_id INT REFERENCES accounts,
date DATE,
balance INT,
unconfirmed_balance INT
);
-- now we can add the foreign key from accounts to dates
alter table accounts
add foreign key (actual_date_id)
references dates (id)
deferrable initially deferred;
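To see what the deferred key buys you, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example is self-contained; unlike Postgres it accepts the forward reference directly in CREATE TABLE, so no ALTER TABLE is needed, and the users table is left out. The sample ids and balance are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE accounts (
    id             INTEGER PRIMARY KEY,
    actual_date_id INTEGER UNIQUE
        REFERENCES dates (id) DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE dates (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER REFERENCES accounts (id),
    date       TEXT,
    balance    INTEGER
);
""")

# Both rows go into one transaction. The deferred key on accounts is only
# checked at COMMIT, so the account may point at a dates row that is
# inserted a moment later.
conn.execute("INSERT INTO accounts (id, actual_date_id) VALUES (1, 10)")
conn.execute(
    "INSERT INTO dates (id, account_id, date, balance) "
    "VALUES (10, 1, '2024-01-31', 500)"
)
conn.commit()

linked = conn.execute(
    "SELECT a.id, d.balance FROM accounts a "
    "JOIN dates d ON d.id = a.actual_date_id"
).fetchall()

# A reference that is still dangling at COMMIT time is rejected.
conn.execute("INSERT INTO accounts (id, actual_date_id) VALUES (2, 99)")
try:
    conn.commit()
    dangling_rejected = False
except sqlite3.IntegrityError:
    conn.rollback()
    dangling_rejected = True

account_count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
```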
It might be better to avoid the circular reference to begin with. As you want to make sure that only one "current balance" exists for each account, this could be achieved by adding a flag in the dates table and getting rid of the actual_date_id in the accounts table.
CREATE TABLE accounts (
id SERIAL PRIMARY KEY,
user_id INT REFERENCES users
);
CREATE TABLE dates (
id SERIAL PRIMARY KEY,
account_id INT REFERENCES accounts,
is_current_balance boolean not null default false,
date DATE,
balance INT,
unconfirmed_balance INT
);
-- this ensures that there is at most one row with "is_current_balance = true"
-- for each account
create unique index only_one_current_balance
on dates (account_id)
where is_current_balance;
Before you change a row in dates to be the "current one", you need to reset the existing one to false.
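The reset-then-set step can be sketched as follows. This assumes Python's built-in sqlite3 module (SQLite also supports partial unique indexes) rather than Postgres, stores the flag as 0/1 because SQLite has no boolean type, and uses a hypothetical helper named set_current; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dates (
    id                 INTEGER PRIMARY KEY,
    account_id         INTEGER NOT NULL,
    is_current_balance INTEGER NOT NULL DEFAULT 0,  -- 0/1 instead of boolean
    date               TEXT,
    balance            INTEGER
);
CREATE UNIQUE INDEX only_one_current_balance
    ON dates (account_id) WHERE is_current_balance = 1;
INSERT INTO dates VALUES (1, 7, 1, '2024-01-31', 500);
INSERT INTO dates VALUES (2, 7, 0, '2024-02-29', 650);
""")

# Flagging a second row without clearing the first violates the index.
try:
    with conn:  # rolls back automatically if the statement fails
        conn.execute("UPDATE dates SET is_current_balance = 1 WHERE id = 2")
    naive_update_rejected = False
except sqlite3.IntegrityError:
    naive_update_rejected = True


def set_current(conn, account_id, date_id):
    """Hypothetical helper: make date_id the current row for account_id."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "UPDATE dates SET is_current_balance = 0 "
            "WHERE account_id = ? AND is_current_balance = 1",
            (account_id,),
        )
        conn.execute(
            "UPDATE dates SET is_current_balance = 1 "
            "WHERE id = ? AND account_id = ?",
            (date_id, account_id),
        )


set_current(conn, 7, 2)
current_rows = conn.execute(
    "SELECT id, balance FROM dates "
    "WHERE account_id = 7 AND is_current_balance = 1"
).fetchall()
```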
Unrelated, but:
With modern Postgres versions it's recommended to use identity columns instead of serial

Database - Am i doing this right?

I'm trying to make a simple database for a personal project, and I'm not sure whether I'm using primary keys properly.
Basically, the database contains users who have voted yes/no on many different items.
Example :
User "JOHN" voted YES on item_1 and item_2, but voted NO on item_3.
User "BOB" voted YES on item_1 and item_6.
User "PAUL" voted NO on item_55 and item_76 and item_45.
I want to use the following 3 tables (PK means Primary Key) :
1) table_users, which contains the columns "PK_userID" and "name"
2) table_items, which contains the columns "PK_itemID" and "item_name"
3) table_votes, which contains the columns "PK_userID", "PK_itemID", and "vote"
and the columns with the same name will be linked.
Does it look like a proper way to use primary keys? (So table_votes will have two primary keys, linked to the two other tables.)
Thanks :)
Since a user can vote on multiple items and multiple users can vote on a single item, you should not make each of the following two columns its own primary key in the third table, table_votes. Just create them as ordinary fields; otherwise each userId or itemId could appear only once in the table. You should, however, make them NOT NULL:
"PK_userID", "PK_itemID",
No, that's not correct.
There can only be one primary key per table, although it may span several columns. You can have other columns with unique indexes that could have been candidate keys, but they are not the primary key.
I think you'd have three tables:
create table PERSON (
PERSON_ID int not null identity,
primary key(PERSON_ID)
);
create table ITEM (
ITEM_ID int not null identity,
primary key(ITEM_ID)
);
create table VOTE (
PERSON_ID int,
ITEM_ID int,
primary key(PERSON_ID, ITEM_ID),
foreign key(PERSON_ID) references PERSON(PERSON_ID),
foreign key(ITEM_ID) references ITEM(ITEM_ID)
);
It's a matter of cardinality. A person can vote on many items; an item can be voted on by many persons.
select p.PERSON_ID, i.ITEM_ID, COUNT(*) as vote_count
from PERSON as p
join VOTE as v
on p.PERSON_ID = v.PERSON_ID
join ITEM as i
on i.ITEM_ID = v.ITEM_ID
group by p.PERSON_ID, i.ITEM_ID
This looks reasonable. However, I would not advise you to name your primary keys with a "PK_" prefix. This can be confusing, especially because I advise giving foreign keys and primary keys the same name (the relationship is then obvious). Instead, just name it after the table with Id as a suffix. I would recommend a table structure such as this:
create table Users (
UserId int not null auto_increment primary key,
Name varchar(255) -- Note: you probably want this to be unique
);
create table Items (
ItemId int not null auto_increment primary key,
ItemName varchar(255) -- Note: you probably want this to be unique
);
create table Votes (
UserId int not null references Users(UserId),
ItemId int not null references Items(ItemId),
Votes int,
constraint pk_UserId_ItemId primary key (UserId, ItemId)
);
Actually, I would be inclined to have an auto-incremented primary key in Votes, with UserId, ItemId declared as unique. However, there are good arguments for doing this either way, so that is more a matter of preference.
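To check how the composite primary key behaves, here is a minimal sketch assuming Python's built-in sqlite3 module in place of MySQL. The vote column is named Vote here and stored as 1 (yes) or 0 (no), which is an assumption for illustration; the sample votes are loosely based on the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (UserId INTEGER PRIMARY KEY, Name TEXT UNIQUE);
CREATE TABLE Items (ItemId INTEGER PRIMARY KEY, ItemName TEXT UNIQUE);
CREATE TABLE Votes (
    UserId INTEGER NOT NULL REFERENCES Users(UserId),
    ItemId INTEGER NOT NULL REFERENCES Items(ItemId),
    Vote   INTEGER NOT NULL CHECK (Vote IN (0, 1)),  -- 1 = yes, 0 = no
    PRIMARY KEY (UserId, ItemId)                     -- one vote per user per item
);
INSERT INTO Users VALUES (1, 'JOHN'), (2, 'BOB');
INSERT INTO Items VALUES (1, 'item_1'), (2, 'item_2'), (3, 'item_3');
INSERT INTO Votes VALUES (1, 1, 1), (1, 2, 1), (1, 3, 0), (2, 1, 1);
""")

# A second vote by the same user on the same item hits the primary key.
try:
    conn.execute("INSERT INTO Votes VALUES (1, 1, 0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Yes/no tally per item
tally = conn.execute(
    "SELECT i.ItemName, SUM(v.Vote) AS yes_votes, "
    "       COUNT(*) - SUM(v.Vote) AS no_votes "
    "FROM Items i JOIN Votes v ON v.ItemId = i.ItemId "
    "GROUP BY i.ItemId, i.ItemName ORDER BY i.ItemId"
).fetchall()
```

The same user can still appear in many rows and so can the same item; only the pair has to be unique.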

Enforcing existence of many-to-many and one-to-one information at the same time (PostgreSQL)

The relation "astronomers discover stars" is m:n, as an astronomer can discover many stars, and a star can likewise be discovered by many astronomers.
In any case however, there will be a single date of discovery for every star. If there are many astronomers working on it, they should do it simultaneously, otherwise, only the first one gets the credits of the discovery.
So, in PostgreSQL, we have:
CREATE TABLE astronomer (
astronomer_id serial PRIMARY KEY,
astronomer_name text NOT NULL UNIQUE
);
CREATE TABLE star (
star_id serial PRIMARY KEY,
star_name text NOT NULL UNIQUE,
discovery_date date
);
CREATE TABLE discovered (
star_id integer NOT NULL REFERENCES star,
astronomer_id integer NOT NULL REFERENCES astronomer,
CONSTRAINT pk_discovered PRIMARY KEY (star_id, astronomer_id)
);
Now the trick question: I want to enforce that whenever there is a discovery date for a star, there will be at least one entry in the discovered table, and vice versa.
I could create a unique key (star_id, discovery_date) in table star, and then use that combination as a foreign key in table discovered instead of the star_id. That would solve half of the problem, but still leave it possible to have a discovery_date without astronomers for it.
I can't see any simple solution other than using a trigger to check, or only allowing data to be entered via a stored procedure that will at the same time insert a discovery_date and entries into the table discovered.
Any ideas?
Thank you!
I would just move the discovery_date column to the discovered table
CREATE TABLE astronomer (
astronomer_id serial PRIMARY KEY,
astronomer_name text NOT NULL UNIQUE
);
CREATE TABLE star (
star_id serial PRIMARY KEY,
star_name text NOT NULL UNIQUE
);
CREATE TABLE discovered (
star_id integer NOT NULL REFERENCES star,
astronomer_id integer NOT NULL REFERENCES astronomer,
discovery_date date not null,
CONSTRAINT pk_discovered PRIMARY KEY (star_id, astronomer_id)
);
Now you have the problem of multiple dates for the same star, but as you say in the question, only the first one(s) will get the credit.
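Picking out "the first one(s)" is then a query rather than a constraint. A minimal sketch, assuming Python's built-in sqlite3 module instead of PostgreSQL, dates stored as ISO-8601 text, and made-up astronomer and star names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE astronomer (
    astronomer_id   INTEGER PRIMARY KEY,
    astronomer_name TEXT NOT NULL UNIQUE
);
CREATE TABLE star (
    star_id   INTEGER PRIMARY KEY,
    star_name TEXT NOT NULL UNIQUE
);
CREATE TABLE discovered (
    star_id        INTEGER NOT NULL REFERENCES star,
    astronomer_id  INTEGER NOT NULL REFERENCES astronomer,
    discovery_date TEXT NOT NULL,  -- ISO-8601 text sorts chronologically
    PRIMARY KEY (star_id, astronomer_id)
);
-- Made-up sample data: Cy reports the same star more than a year later
INSERT INTO astronomer VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
INSERT INTO star VALUES (1, 'Star A');
INSERT INTO discovered VALUES
    (1, 1, '2020-01-01'),
    (1, 2, '2020-01-01'),
    (1, 3, '2021-06-01');
""")

# Only rows carrying the earliest date for their star get the credit.
credited = conn.execute("""
    SELECT s.star_name, d.discovery_date, a.astronomer_name
    FROM discovered d
    JOIN star s       ON s.star_id = d.star_id
    JOIN astronomer a ON a.astronomer_id = d.astronomer_id
    WHERE d.discovery_date = (
        SELECT MIN(d2.discovery_date)
        FROM discovered d2
        WHERE d2.star_id = d.star_id
    )
    ORDER BY s.star_name, a.astronomer_name
""").fetchall()
```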

MySQL Lookup table and id/keys

Hoping someone can shed some light on this: Do lookup tables need their own ID?
For example, say I have:
Table users: user_id, username
Table categories: category_id, category_name
Table users_categories: user_id, category_id
Would each row in "users_categories" need an additional ID field? What would the primary key of said table be? Thanks.
You have a choice. The primary key can be either:
A new, otherwise meaningless INTEGER column.
A key made up of both user_id and category_id.
I prefer the first solution but I think you'll find a majority of programmers here prefer the second.
You could create a composite key that uses both keys.
Normally, if there is no suitable single key to be found in a table, you want to create a composite key made up of two or more fields, for example (the code below was found elsewhere):
CREATE TABLE topic_replies (
topic_id int unsigned not null,
id int unsigned not null auto_increment,
user_id int unsigned not null,
message text not null,
PRIMARY KEY(topic_id, id));
Therefore, in your case you could add code that does the following:
ALTER TABLE users_categories ADD PRIMARY KEY (user_id, category_id);
After that, to reference a particular row, all you need to do is pass the two key values from your other tables. To actually link the tables, however, each of the two columns needs to be declared as a foreign key:
ALTER TABLE users_categories ADD CONSTRAINT fk_1 FOREIGN KEY (category_id) REFERENCES categories (category_id);
ALTER TABLE users_categories ADD CONSTRAINT fk_2 FOREIGN KEY (user_id) REFERENCES users (user_id);
Creating a new, separate primary key in your users_categories table is also an option. Just know that it's not always necessary.
If your users_categories table has a unique primary key over (user_id, category_id), then - no, not necessarily.
Only if you
want to refer to single rows of that table from someplace else easily
have more than one equal user_id, category_id combination
you could benefit from a separate ID field.
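The "refer to single rows from someplace else" case might look like this. It is a minimal sketch assuming Python's built-in sqlite3 module instead of MySQL; the membership_notes table and the sample rows are hypothetical and only there to show another table pointing at one specific users_categories row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users      (user_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE users_categories (
    users_categories_id INTEGER PRIMARY KEY,  -- separate ID field
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    category_id INTEGER NOT NULL REFERENCES categories(category_id),
    UNIQUE (user_id, category_id)             -- still one row per pair
);
-- Hypothetical table that refers to a single users_categories row
CREATE TABLE membership_notes (
    note_id             INTEGER PRIMARY KEY,
    users_categories_id INTEGER NOT NULL
        REFERENCES users_categories(users_categories_id),
    note                TEXT
);
""")

conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO categories VALUES (1, 'sql')")
link_id = conn.execute(
    "INSERT INTO users_categories (user_id, category_id) VALUES (1, 1)"
).lastrowid
conn.execute(
    "INSERT INTO membership_notes (users_categories_id, note) VALUES (?, ?)",
    (link_id, "joined via invite"),
)

# The UNIQUE constraint still rejects a repeated (user_id, category_id) pair.
try:
    conn.execute(
        "INSERT INTO users_categories (user_id, category_id) VALUES (1, 1)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

notes = conn.execute("""
    SELECT u.username, c.category_name, n.note
    FROM membership_notes n
    JOIN users_categories uc ON uc.users_categories_id = n.users_categories_id
    JOIN users u      ON u.user_id = uc.user_id
    JOIN categories c ON c.category_id = uc.category_id
""").fetchall()
```

With a composite primary key instead, membership_notes would need to carry both user_id and category_id to point at the same row.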
Every table should have a primary key. SQL does not strictly force you to declare one, but I would always add a unique ID anyway. Just make it users_categories_id; you technically never have to use it, but it should be there.

How do you store business activities in a SQL database?

The goal is to store activities such as inserting, updating, and deleting business records.
One solution I'm considering is to use one table per record to be tracked. Here is a simplified example:
CREATE TABLE ActivityTypes
(
TypeId int IDENTITY(1,1) NOT NULL,
TypeName nvarchar(50) NOT NULL,
CONSTRAINT PK_ActivityTypes PRIMARY KEY (TypeId),
CONSTRAINT UK_ActivityTypes UNIQUE (TypeName)
)
INSERT INTO ActivityTypes (TypeName) VALUES ('WidgetRotated');
INSERT INTO ActivityTypes (TypeName) VALUES ('WidgetFlipped');
INSERT INTO ActivityTypes (TypeName) VALUES ('DingBatPushed');
INSERT INTO ActivityTypes (TypeName) VALUES ('ButtonAddedToDingBat');
CREATE TABLE Activities
(
ActivityId int IDENTITY(1,1) NOT NULL,
TypeId int NOT NULL,
AccountId int NOT NULL,
TimeStamp datetime NOT NULL,
CONSTRAINT PK_Activities PRIMARY KEY (ActivityId),
CONSTRAINT FK_Activities_ActivityTypes FOREIGN KEY (TypeId)
REFERENCES ActivityTypes (TypeId),
CONSTRAINT FK_Activities_Accounts FOREIGN KEY (AccountId)
REFERENCES Accounts (AccountId)
)
CREATE TABLE WidgetActivities
(
ActivityId int NOT NULL,
WidgetId int NOT NULL,
CONSTRAINT PK_WidgetActivities PRIMARY KEY (ActivityId),
CONSTRAINT FK_WidgetActivities_Activities FOREIGN KEY (ActivityId)
REFERENCES Activities (ActivityId),
CONSTRAINT FK_WidgetActivities_Widgets FOREIGN KEY (WidgetId)
REFERENCES Widgets (WidgetId)
)
CREATE TABLE DingBatActivities
(
ActivityId int NOT NULL,
DingBatId int NOT NULL,
ButtonId int,
CONSTRAINT PK_DingBatActivities PRIMARY KEY (ActivityId),
CONSTRAINT FK_DingBatActivities_Activities FOREIGN KEY (ActivityId)
REFERENCES Activities (ActivityId),
CONSTRAINT FK_DingBatActivities_DingBats FOREIGN KEY (DingBatId)
REFERENCES DingBats (DingBatId),
CONSTRAINT FK_DingBatActivities_Buttons FOREIGN KEY (ButtonId)
REFERENCES Buttons (ButtonId)
)
This solution seems good for fetching all activities given a widget or dingbat record id, however it doesn't seem so good for fetching all activities and then trying to determine to which record they refer.
That is, in this example, all the account names and timestamps are stored in a separate table, so it's easy to create reports focused on users and focused on time intervals without the need to know what the activity is in particular.
However, if you did want to report on the activities by type in particular, this solution would require determining to which type of activity the general activity table refers.
I could put all my activity types in one table; however, the IDs could not then be constrained by a foreign key. Instead, the table name might be used as an ID, which would lead me to use dynamic queries.
Note in the example that a DingBatActivity has an optional button Id. If the button name were to have been edited after being added to the dingbat, the activity would be able to refer to the button and know its name, so if a report listed all activities by dingbat and by button by name, the button name change would automatically be reflected in the activity description.
Looking for some other ideas and how those ideas compromise between programming effort, data integrity, performance, and reporting flexibility.
The way that I usually architect a solution to this problem is similar to inheritance in objects. If you have "activities" that are taking place on certain entities and you want to track those activities then the entities involved almost certainly have something in common. There's your base table. From there you can create subtables off of the base table to track things specific to that subtype. For example, you might have:
CREATE TABLE Objects -- Bad table name, should be more specific
(
object_id INT NOT NULL,
name VARCHAR(20) NOT NULL,
CONSTRAINT PK_Objects PRIMARY KEY CLUSTERED (object_id)
)
CREATE TABLE Widgets
(
object_id INT NOT NULL,
height DECIMAL(5, 2) NOT NULL,
width DECIMAL(5, 2) NOT NULL,
CONSTRAINT PK_Widgets PRIMARY KEY CLUSTERED (object_id),
CONSTRAINT FK_Widgets_Objects
FOREIGN KEY (object_id) REFERENCES Objects (object_id)
)
CREATE TABLE Dingbats
(
object_id INT NOT NULL,
label VARCHAR(50) NOT NULL,
CONSTRAINT PK_Dingbats PRIMARY KEY CLUSTERED (object_id),
CONSTRAINT FK_Dingbats_Objects
FOREIGN KEY (object_id) REFERENCES Objects (object_id)
)
Now for your activities:
CREATE TABLE Object_Activities
(
activity_id INT NOT NULL,
object_id INT NOT NULL,
activity_type INT NOT NULL,
activity_time DATETIME NOT NULL,
account_id INT NOT NULL,
CONSTRAINT PK_Object_Activities PRIMARY KEY CLUSTERED (activity_id),
CONSTRAINT FK_Object_Activities_Objects
FOREIGN KEY (object_id) REFERENCES Objects (object_id),
CONSTRAINT FK_Object_Activities_Activity_Types
FOREIGN KEY (activity_type) REFERENCES Activity_Types (activity_type)
)
CREATE TABLE Dingbat_Activities
(
activity_id INT NOT NULL,
button_id INT NOT NULL,
CONSTRAINT PK_Dingbat_Activities PRIMARY KEY CLUSTERED (activity_id),
CONSTRAINT FK_Dingbat_Activities_Object_Activities
FOREIGN KEY (activity_id) REFERENCES Object_Activities (activity_id),
CONSTRAINT FK_Dingbat_Activities_Buttons
FOREIGN KEY (button_id) REFERENCES Buttons (button_id)
)
You can add a type code to the base activity if you want to for the type of object which it is affecting or you can just determine that by looking for existence in a subtable.
Here's the big caveat though: Make sure that the objects/activities really do have something in common which relates them and requires you to go down this path. You don't want to store disjointed, unrelated data in the same table. For example, you could use this method to create a table that holds both bank account transactions and celestial events, but that wouldn't be a good idea. At the base level they need to have something in common.
Also, I assumed that all of your activities were related to an account, which is why it's in the base table. Anything in common to ALL activities goes in the base table. Things relevant to only a subtype go in those tables. You could even go multiple levels deep, but don't get carried away. The same goes for the objects (again, bad name here, but I'm not sure what you're actually dealing with). If all of your objects have a color then you can put it in the Objects table. If not, then it would go into sub tables.
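Determining the object type "by looking for existence in a subtable" can be sketched with LEFT JOINs. This assumes Python's built-in sqlite3 module instead of SQL Server, leaves out the Activity_Types and account foreign keys to stay short, and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Objects (
    object_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE Widgets (
    object_id INTEGER PRIMARY KEY REFERENCES Objects (object_id),
    height    REAL NOT NULL,
    width     REAL NOT NULL
);
CREATE TABLE Dingbats (
    object_id INTEGER PRIMARY KEY REFERENCES Objects (object_id),
    label     TEXT NOT NULL
);
CREATE TABLE Object_Activities (
    activity_id   INTEGER PRIMARY KEY,
    object_id     INTEGER NOT NULL REFERENCES Objects (object_id),
    activity_type INTEGER NOT NULL,
    activity_time TEXT NOT NULL,
    account_id    INTEGER NOT NULL
);
-- Made-up sample data: object 1 is a widget, object 2 is a dingbat
INSERT INTO Objects  VALUES (1, 'gear'), (2, 'bell');
INSERT INTO Widgets  VALUES (1, 2.5, 3.0);
INSERT INTO Dingbats VALUES (2, 'ring me');
INSERT INTO Object_Activities VALUES
    (1, 1, 1, '2024-03-01 09:00:00', 42),
    (2, 2, 3, '2024-03-01 09:05:00', 42),
    (3, 1, 2, '2024-03-02 14:30:00', 43);
""")

# One pass over the base activity table; the subtype is inferred from
# which subtable has a matching row.
activities = conn.execute("""
    SELECT oa.activity_id, o.name,
           CASE WHEN w.object_id IS NOT NULL THEN 'widget'
                WHEN d.object_id IS NOT NULL THEN 'dingbat'
                ELSE 'unknown'
           END AS object_type
    FROM Object_Activities oa
    JOIN Objects o       ON o.object_id = oa.object_id
    LEFT JOIN Widgets w  ON w.object_id = oa.object_id
    LEFT JOIN Dingbats d ON d.object_id = oa.object_id
    ORDER BY oa.activity_id
""").fetchall()
```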
I'm going to go out on a limb and take a few wild guesses about what you're really trying to accomplish.
You say you're trying to track 'store activities'. I'm going to assume you have the following activities:
Purchase new item
Sell item
Write off item
Hire employee
Pay employee
Fire employee
Update employee record
Ok, for these activities, you need a few different tables: one for inventory, one for departments, and one for employees.
The inventory table could have the following information:
inventory:
item_id (pk)
description (varchar)
number_in_stock (number)
cost_wholesale (number)
retail_price (number)
dept_id (fk)
department:
dept_id (pk)
description (varchar)
employee:
emp_id (pk)
first_name (varchar)
last_name (varchar)
salary (number)
hire_date (date)
fire_date (date)
So, when you buy new items, you will either update the number_in_stock in the inventory table, or create a new row if it is an item you've never had before. When you sell an item, you decrement the number_in_stock for that item (and likewise when you write off an item).
When you hire a new employee, you add a record for them to the employee table. When you pay them, you grab their salary from the salary column. When you fire them, you fill in the fire_date column for their record (and stop paying them).
In all of this, the doing is not done by the database. SQL should be used for keeping track of information. It's fine to write procedures for doing these updates (a new invoice procedure that updates all the items from an invoice record). But you don't need a table to do stuff. In fact, a table can't do anything.
When designing a database, the question you need to ask is not "what do I need to do?" it is "What information do I need to keep track of?"
New answer, based on a different interpretation of the question.
Are you just trying to keep a list of what has happened? If you just need an ordered list of past events, you only need one log table for it, plus a lookup table for the action types:
action_list:
action_list_id (pk)
action_desc (varchar)
event_log:
event_log_id (pk)
event_time (timestamp)
action_list_id (fk)
new_action_added (fk)
action_details_or_description (varchar)
In this, the action_list would be something like:
1 'WidgetRotated'
2 'WidgetFlipped'
3 'DingBatPushed'
4 'AddNewAction'
5 'DeleteExistingAction'
The event_log would be a list of what activities happened, and when. One of your actions would be "add new action" and would require the 'new_action_added' column to be filled in on the event table anytime the action taken is "add new action".
You can create actions for update, remove, add, etc.
EDIT:
I added the action_details_or_description column to event_log. In this way, you can give further information about an action. For example, if you have a "product changes color" action, the description could be "Red" for the new color.
More broadly, you'll want to think through and map out all the different types of actions you'll be taking ahead of time, so you can set up your table(s) in a way that can accurately contain the data you want to put into them.
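The two-table event log described above can be sketched like this, assuming Python's built-in sqlite3 module. The new_action_added column is left out to keep the example short, and the logged events are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE action_list (
    action_list_id INTEGER PRIMARY KEY,
    action_desc    TEXT NOT NULL UNIQUE
);
CREATE TABLE event_log (
    event_log_id   INTEGER PRIMARY KEY,
    event_time     TEXT NOT NULL,
    action_list_id INTEGER NOT NULL REFERENCES action_list (action_list_id),
    action_details_or_description TEXT
);
INSERT INTO action_list VALUES
    (1, 'WidgetRotated'),
    (2, 'WidgetFlipped'),
    (3, 'DingBatPushed');
-- Made-up events, deliberately inserted out of chronological order
INSERT INTO event_log (event_time, action_list_id, action_details_or_description) VALUES
    ('2024-03-01 10:15:00', 3, 'pushed twice'),
    ('2024-03-01 09:00:00', 1, '90 degrees'),
    ('2024-03-01 09:30:00', 2, NULL);
""")

# Ordered history with readable action names
history = conn.execute("""
    SELECT e.event_time, a.action_desc, e.action_details_or_description
    FROM event_log e
    JOIN action_list a ON a.action_list_id = e.action_list_id
    ORDER BY e.event_time
""").fetchall()
```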
How about the SQL logs?
The last time I needed a database transaction logger, I used an INSTEAD OF trigger so that, instead of just updating the record, the database would insert a new record into a log table. This technique meant that I needed an additional log table for each table in my database, and each log table had an additional column with a time stamp. Using this technique you can even store the pre- and post-update state of the record if you want to.
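A rough sketch of the trigger idea, assuming Python's built-in sqlite3 module. SQLite only allows INSTEAD OF triggers on views, so this example swaps in an AFTER UPDATE trigger: the update still happens and the log row is written in addition to it, rather than in its place. The widgets table, its log table, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE widgets (
    widget_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- One log table per tracked table, with a time stamp column
CREATE TABLE widgets_log (
    log_id     INTEGER PRIMARY KEY,
    widget_id  INTEGER NOT NULL,
    old_name   TEXT,
    new_name   TEXT,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER widgets_after_update AFTER UPDATE ON widgets
BEGIN
    -- OLD and NEW hold the pre- and post-update state of the row
    INSERT INTO widgets_log (widget_id, old_name, new_name)
    VALUES (OLD.widget_id, OLD.name, NEW.name);
END;
""")

conn.execute("INSERT INTO widgets VALUES (1, 'sprocket')")
conn.execute("UPDATE widgets SET name = 'cog' WHERE widget_id = 1")
conn.commit()

log_rows = conn.execute(
    "SELECT widget_id, old_name, new_name FROM widgets_log"
).fetchall()
current_name = conn.execute(
    "SELECT name FROM widgets WHERE widget_id = 1"
).fetchone()[0]
```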