Consider the following postgres (version 9.4) database:
testbase=# select * from employee;
id | name
----+----------------------------------
1 | johnson, jack
2 | jackson, john
(2 rows)
testbase=# select * from worklog;
id | activity | employee | time
----+----------------------------------+----------+----------------------------
1 | department alpha | 1 | 2018-01-27 20:32:16.512677
2 | department beta | 1 | 2018-01-27 20:32:18.112356
5 | break | 1 | 2018-01-27 20:32:22.255563
3 | department gamma | 2 | 2018-01-27 20:32:20.073173
4 | department gamma | 2 | 2018-01-27 20:32:21.05962
(5 rows)
The column 'name' in table 'employee' is of type character(32) and unique; the column 'employee' in 'worklog' references 'id' from the table 'employee'. The column 'id' is the primary key in both tables.
I can see all activities from a certain employee by issuing:
testbase=# select * from worklog where employee=(select id from employee where name='johnson, jack');
id | activity | employee | time
----+----------------------------------+----------+----------------------------
1 | department alpha | 1 | 2018-01-27 20:32:16.512677
2 | department beta | 1 | 2018-01-27 20:32:18.112356
5 | break | 1 | 2018-01-27 20:32:22.255563
(3 rows)
I would rather like to simplify the query to
testbase=# select * from worklog where employee='johnson, jack';
For this I would change 'employee' to type character(32) in 'worklog' and declare 'name' as primary key in table 'employee'. Column 'employee' in 'worklog' would, of course, reference 'name' from table 'employee'.
My question:
Will every new row in 'worklog' require an additional 32 bytes for the name of the employee, or will postgres internally just keep a pointer to the foreign field without duplicating the name for every new row?
I suppose the answer to my question is somewhere in the documentation, but I could not find it. It would be very helpful if someone could provide a relevant link.
PS: I did find this thread; however, there was no link to any official documentation. The behaviour might also have changed, since the thread is now over seven years old.
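Postgres will store the data that you tell it to store. A foreign key is a constraint, not a pointer, so every row in 'worklog' would hold its own copy of the name, and character(32) values are padded with blanks to the full declared length. There are some new databases that will do compression under the hood -- and Postgres might have features to enable that (I do not know all Postgres features).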
But, you shouldn't do this. Integer primary keys are more efficient than strings for three reasons:
They are fixed length in bytes.
They are shorter.
Collations are not an issue.
Stick with your original query, but write it using a join:
select wl.*
from worklog wl join
employee e
on wl.employee = e.id
where e.name = 'johnson, jack';
I suggest this because this is more consistent with how SQL works and makes it easier to choose multiple employees.
If you want to see the name and not the id, create a view (say v_worklog) and add in the employee name.
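A minimal sketch of such a view, run against an in-memory SQLite database through Python's sqlite3 module as a stand-in for Postgres (the view definition itself is plain SQL). The column name employee_name is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Postgres; only the view definition matters here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE worklog (
        id INTEGER PRIMARY KEY,
        activity TEXT,
        employee INTEGER REFERENCES employee(id),
        time TEXT
    );
    INSERT INTO employee VALUES (1, 'johnson, jack'), (2, 'jackson, john');
    INSERT INTO worklog VALUES
        (1, 'department alpha', 1, '2018-01-27 20:32:16.512677'),
        (2, 'department beta',  1, '2018-01-27 20:32:18.112356'),
        (5, 'break',            1, '2018-01-27 20:32:22.255563'),
        (3, 'department gamma', 2, '2018-01-27 20:32:20.073173'),
        (4, 'department gamma', 2, '2018-01-27 20:32:21.05962');

    -- employee_name is a hypothetical column name; worklog still stores only the integer key.
    CREATE VIEW v_worklog AS
    SELECT wl.id, wl.activity, wl.employee, e.name AS employee_name, wl.time
    FROM worklog wl
    JOIN employee e ON wl.employee = e.id;
""")

# The short query from the question now works against the view.
rows = conn.execute(
    "SELECT id, activity FROM v_worklog WHERE employee_name = ? ORDER BY id",
    ("johnson, jack",),
).fetchall()
print(rows)
```

The name is stored once in 'employee', while queries against v_worklog can filter on it directly.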
Related
Basically, each user has a team and each team has 11 players, so whenever a player scores they earn some points. Is there an automated way to do the following?
Whenever there is an update/entry in the USER_TEAM_PLAYERS table, sum the points of all players into the USER_TEAM table for the corresponding user in some column (in this case the TEAM_TOTAL column).
I have two tables:
USER_TEAM with columns USER_ID, TEAM_TOTAL
USER_TEAM_PLAYERS with columns PLAYER_NAME, PLAYER_POINTS, USER_ID
Example:
TABLE - USER_TEAM
USER_ID | TEAM_TOTAL
---------------------
1 | 40
2 | 50
TABLE - USER_TEAM_PLAYERS
PLAYER_NAME | PLAYER_POINTS | USER_ID
-------------------------------------
Adam | 10 | 1
Alex | 30 | 1
Botas | 40 | 2
Pepe | 5 | 2
Diogo | 5 | 2
The first table should be only a view of the second one:
CREATE VIEW USER_TEAM2 AS
SELECT USER_ID, SUM(PLAYER_POINTS) AS TEAM_TOTAL
FROM USER_TEAM_PLAYERS
GROUP BY USER_ID
ORDER BY USER_ID;
Doing this, you have no duplicate data, and a view can be used in a SELECT (and in most other queries) just like a table.
Note 1: I used the name USER_TEAM2 because your first table still exists, but you can delete it.
Note 2: If you want to keep some data that is specific to the USER_TEAM table, keep both names and modify your view as needed by adding fields through a JOIN with that first table.
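To illustrate that the view stays in sync without any trigger, here is a small sketch using Python's sqlite3 module with an in-memory database (an assumption; the original question does not name its database). The sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USER_TEAM_PLAYERS (PLAYER_NAME TEXT, PLAYER_POINTS INTEGER, USER_ID INTEGER);
    INSERT INTO USER_TEAM_PLAYERS VALUES
        ('Adam', 10, 1), ('Alex', 30, 1),
        ('Botas', 40, 2), ('Pepe', 5, 2), ('Diogo', 5, 2);

    -- The total is computed on read, so there is nothing to keep up to date.
    CREATE VIEW USER_TEAM2 AS
    SELECT USER_ID, SUM(PLAYER_POINTS) AS TEAM_TOTAL
    FROM USER_TEAM_PLAYERS
    GROUP BY USER_ID;
""")

before = conn.execute("SELECT * FROM USER_TEAM2 ORDER BY USER_ID").fetchall()

# A player scores: only the player row changes.
conn.execute("UPDATE USER_TEAM_PLAYERS SET PLAYER_POINTS = 15 WHERE PLAYER_NAME = 'Pepe'")

after = conn.execute("SELECT * FROM USER_TEAM2 ORDER BY USER_ID").fetchall()
print(before, after)
```

The second read reflects the new score even though nothing wrote to a totals table.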
How can we auto-generate columns/fields in a Microsoft Access table?
Scenario:
I have a table with the personal details of my employees (EmployDetails).
I want to put their everyday attendance in another table.
Rather than using separate records for every day, I want to use a single record per employee.
E.g., I want to create a table with fields like the ones below:
EmployID, 01Jan2020, 02Jan2020, 03Jan2020,.........25May2020 and so on.......
It means that every day I have to generate a column automatically.
Can anybody help me?
Generally you would define columns manually (whether that is through a UI or SQL).
With the information given I think the proper solution is to have two tables.
You have your "EmployDetails" table, in which you would put their general info (name, contact information, etc.) and the key, which would be the employee ID (it can be autogenerated or manual, it just needs to be unique).
You would have a second table with a foreign key to the employee ID in "EmployDetails", a column called Date, and another called details (or whatever you are trying to capture in your date column idea).
Then you simply add a row for each day and do a join query between the tables to look up all the "days" for an employee. This is called normalisation, and it is how relational databases (such as Access) are designed to be used.
Employee Table:
EmpID | NAME | CONTACT
----------------------
1 | Jim | 222-2222
2 | Jan | 555-5555
Detail table:
DetailID | EmpID (foreign key) | Date | Hours_worked | Notes
-------------------------------------------------------------
10231 | 1 | 01Jan2020| 5 | Lazy Jim took off early
10233 | 2 | 02Jan2020| 8 | Jan is a hard worker
10240 | 1 | 02Jan2020| 7.5 | Finally he stays a full day
To find what Jim worked you do a join:
SELECT Employee.EmpID, Employee.Name, Details.Date, Details.Hours_worked, Details.Notes
FROM Employee
INNER JOIN Details ON Employee.EmpID = Details.EmpID
WHERE Employee.Name = 'Jim';
Of course this will give you a normalised result (which is generally what's wanted so you can iterate over it):
EmpID | NAME | Date | Hours_worked | Notes
-----------------------------------------------
1 | Jim | 01Jan2020 | 5 | ......
1 | Jim | 02Jan2020 | 7.5 | .......
If you want the results denormalised you'll have to look into pivot tables (crosstab queries in Access).
See more on creating foreign keys
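The two-table layout can be tried out with the sketch below. It uses SQLite through Python's sqlite3 module rather than Access, and the column name WorkDate (instead of Date) is an assumption chosen to avoid a reserved word. Recording a new day is an INSERT, not a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, Name TEXT, Contact TEXT);
    CREATE TABLE Details (
        DetailID INTEGER PRIMARY KEY,
        EmpID INTEGER NOT NULL REFERENCES Employee(EmpID),
        WorkDate TEXT,          -- hypothetical name, stored as ISO yyyy-mm-dd text
        Hours_worked REAL,
        Notes TEXT
    );
    INSERT INTO Employee VALUES (1, 'Jim', '222-2222'), (2, 'Jan', '555-5555');
    INSERT INTO Details VALUES
        (10231, 1, '2020-01-01', 5,   'Lazy Jim took off early'),
        (10233, 2, '2020-01-02', 8,   'Jan is a hard worker'),
        (10240, 1, '2020-01-02', 7.5, 'Finally he stays a full day');
""")

# A new day of attendance is just another row.
conn.execute(
    "INSERT INTO Details (EmpID, WorkDate, Hours_worked, Notes) VALUES (?, ?, ?, ?)",
    (1, "2020-01-03", 8, ""),
)

# All days for one employee, via the join described above.
jim_days = conn.execute("""
    SELECT d.WorkDate, d.Hours_worked
    FROM Employee e
    JOIN Details d ON e.EmpID = d.EmpID
    WHERE e.Name = 'Jim'
    ORDER BY d.WorkDate
""").fetchall()
print(jim_days)
```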
I'm aware there is an almost identical question here, but that covers the SQL query required rather than the mechanism of event triggering.
Let's say I have two tables. One table contains performance data for each staff member each week. The other table holds the staff members' information. What I want is to update a value in the first table to Y or N based on whether that staff member had left by the week date.
staffTable
+----------+----------------+------------+
| staff_id | staff_name | leave_date |
+----------+----------------+------------+
| 1 | Joseph Blogges | 2020-01-24 |
| 2 | Joe Bloggs | 9999-12-31 |
| 3 | Joey Blogz | 9999-12-31 |
+----------+----------------+------------+
targetTable
+------------+----------+--------+-----------+
| week_start | staff_id | target | left_flag |
+------------+----------+--------+-----------+
| 2020-01-13 | 1 | 10 | N |
| 2020-01-20 | 1 | 10 | N |
| 2020-01-27 | 1 | 8 | Y |
+------------+----------+--------+-----------+
What I am trying to do is have the left_flag automatically change from 'N' to 'Y' when the week_start value is greater than leave_date of the staff member (in the other table).
I have successfully put this into a view, which works, but the problem is that existing applications, views and queries would all need to reference a new view instead of a table, and I want to be able to query the data table because my front-end has issues interacting live with a view instead of a table.
I have also successfully used a UDF to return the leave_date and then created a computed column that checks this UDF value against the week_start column. This worked fine until I realised that the UDF is the most resource-consuming query on the entire server, which is completely disproportionate.
Is there a way that I can trigger an update to the staffTable when a criterion is met in another table, or is there a totally different and better way of doing this? If it can't be done easily, I'll try to switch to a view and work around it in the front-end.
I'm going to describe the process rather than writing the code.
What you are describing can be accomplished using a trigger on staffTable. When a row is inserted or updated, the trigger would change the matching rows in targetTable. This would be an after insert/update trigger.
The heart of the trigger would be:
update tt
set left_flag = 'Y'
from targettable tt join
inserted i
on tt.staff_id = i.staff_id
where i.leave_date < tt.week_start and
tt.left_flag <> 'Y';
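The statement above uses SQL Server's inserted pseudo-table. As a runnable illustration of the same idea, the sketch below swaps in SQLite (through Python's sqlite3 module), where the row-level NEW reference plays that role. The trigger name trg_staff_leave is hypothetical, and only the UPDATE case is covered; an AFTER INSERT trigger would look the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staffTable (staff_id INTEGER PRIMARY KEY, staff_name TEXT, leave_date TEXT);
    CREATE TABLE targetTable (
        week_start TEXT, staff_id INTEGER, target INTEGER, left_flag TEXT DEFAULT 'N'
    );

    -- Fires once per updated staff row; NEW holds the row after the update.
    CREATE TRIGGER trg_staff_leave
    AFTER UPDATE OF leave_date ON staffTable
    BEGIN
        UPDATE targetTable
        SET left_flag = 'Y'
        WHERE staff_id = NEW.staff_id
          AND week_start > NEW.leave_date
          AND left_flag <> 'Y';
    END;

    INSERT INTO staffTable VALUES (1, 'Joseph Blogges', '9999-12-31');
    INSERT INTO targetTable VALUES
        ('2020-01-13', 1, 10, 'N'),
        ('2020-01-20', 1, 10, 'N'),
        ('2020-01-27', 1, 8,  'N');
""")

# Recording the leave date flips the flag on later weeks only.
conn.execute("UPDATE staffTable SET leave_date = '2020-01-24' WHERE staff_id = 1")

flags = conn.execute(
    "SELECT week_start, left_flag FROM targetTable ORDER BY week_start"
).fetchall()
print(flags)
```

Note that this only ever sets the flag to 'Y'. If a leave date can be moved later or cleared, or if weeks are added to targetTable after the staff member has left, those cases would need their own handling.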
I have this random table with random contents.
id | name| mission
1 | aaaa | kitr
2 | bbbb | etre
3 | ccccc| qwqw
4 | dddd | qwert
5 | eeee | potentials
6 | ffffffff | toto
What I want is to add to the above table a row with id=3 that has a different name and a different mission, BUT I want the OLD id=3 to get id=4 while keeping the name and mission it had before, the OLD id=4 to become id=5 while keeping its own name and mission, and so on.
It's like I want to insert a row between the existing rows and increase the id of every row below it by 1, while the rest of each row stays the same. Example below:
id | name| mission
1 | aaaa | kitr
2 | bbbb | etre
3 | zzzzzz| zzzzz
4 | ccccc| qwqw
5 | dddd | qwert
6 | eeee | potentials
7 | ffffffff | toto
Why do I want to do this? I have a table that has 2 CLOBs. Inside those CLOBs there are different queries, e.g. id=1 has the CLOB for the creation of a table, id=2 has the inserts for its columns, id=3 has the creation of another table, and id=4 has functions.
If you concatenate all of these ids into one text (or CLOB), it runs the create, then the inserts, then the next create, then the functions. That table is like a huge script.
Why am I doing this? The developers are building their application and they want the SQL to run in a specific order. I have 6 developers, and I am organizing the data modeling, the performance, and how the scripts are run. So the above table is there to organize the calling of the scripts that they want.
Simply put, don't do it.
This case highlights why you should never use any business value, i.e. any 'real world values' for a Primary Key.
In your case I would recommend primary keys not be used for any other purposes.
I recommend you add an extra column 'order' and then change THAT column in order to re-order the rows. That way your primary key and all the other records will not need to be touched.
This avoids the issue that your approach would need to change ALL the database records below the current record, which seems like a really bad approach. Just imagine trying to undo that update ;)
Some more info here: https://stackoverflow.com/a/8777574/631619
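A minimal sketch of the extra-column approach, using SQLite through Python's sqlite3 module (an assumption; the question looks like Oracle). The column is called sort_order here because ORDER is a reserved word in SQL; that name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE random_table (
        id INTEGER PRIMARY KEY,      -- never changes once assigned
        name TEXT,
        mission TEXT,
        sort_order INTEGER NOT NULL  -- hypothetical column that controls script order
    );
    INSERT INTO random_table (id, name, mission, sort_order) VALUES
        (1, 'aaaa', 'kitr', 1), (2, 'bbbb', 'etre', 2), (3, 'ccccc', 'qwqw', 3),
        (4, 'dddd', 'qwert', 4), (5, 'eeee', 'potentials', 5), (6, 'ffffffff', 'toto', 6);
""")

# Make room at position 3 and insert the new row in one transaction.
with conn:
    conn.execute("UPDATE random_table SET sort_order = sort_order + 1 WHERE sort_order >= 3")
    conn.execute(
        "INSERT INTO random_table (name, mission, sort_order) VALUES (?, ?, ?)",
        ("zzzzzz", "zzzzz", 3),
    )

# Scripts are read back in sort_order; the primary keys were left alone.
ordered = conn.execute("SELECT id, name FROM random_table ORDER BY sort_order").fetchall()
print(ordered)
```

Numbering sort_order with gaps (10, 20, 30, ...) would let most inserts slot in without shifting any existing rows.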
-- Shift every row from the old id 3 onwards down by one position.
UPDATE random_table
SET id = id + 1
WHERE id > 2;
Then insert the new value.
I'm trying to convert a product table that contains all the details of each product into separate tables in SQL. I've got everything done except for the duplicated descriptor details.
The problem I am having is that all the products have a size/color/style/other that many other products share. I want to have only one size or color descriptor for all the items and reuse its "ID" for every product; I believe that ID is a parent key to the product ID, which is a foreign key. The only problem is that every descriptor would have multiple foreign keys assigned to it. So I was thinking of skipping the parent key for each descriptor and just checking on the fly whether the descriptor exists and, if it does, using its key.
Data Table
PI | Colo | Sz | OTHER
1 | Blue | 5 | Vintage
2 | Blue | 6 | Vintage
3 | Blac | 5 | Simple
4 | Blac | 6 | Simple
===================================
Its destination table is this
===================================
DI | Description
1 | Blue
2 | Blac
3 | 5
4 | 6
6 | Vintage
7 | Simple
=============================
Select Data.Table
Unique.Data.Table.Colo
Unique.Data.Table.Sz
Unique.Data.Table.Other
=======================================
Then the second part of the question: after we create all the descriptors, how do I write a new query that assigns the product ID to the descriptors?
PI| DI
1 | 1
1 | 3
1 | 6
2 | 1
2 | 4
2 | 6
By figuring out how to do this I should be able to repeat this pattern for all 300+ columns in the product table. Some of these fields are 60+ characters long, so it's going to save a ton of space.
Do I use an array?
Okay, if I understand you correctly, you want all unique attributes converted from columns into rows in a single table (detailstable) that has an id and a description field:
Assuming the schema:
datatable
------------------
PI [PK]
Colo
Sz
OTHER
detailstable
------------------
DI [PK]
Description
You can first get all of the unique attributes into their own table with:
INSERT INTO detailstable (Description)
SELECT
a.description
FROM
(
SELECT DISTINCT Colo AS description
FROM datatable
UNION
SELECT DISTINCT Sz AS description
FROM datatable
UNION
SELECT DISTINCT OTHER AS description
FROM datatable
) a
Then to link up the datatable to the detailstable, I'm assuming you have a cross-reference table defined like:
datadetails
------------------
PI [PK]
DI [PK]
You can then do:
INSERT INTO datadetails (PI, DI)
SELECT
a.PI,
b.DI
FROM
datatable a
INNER JOIN
detailstable b ON b.Description IN (a.Colo, a.Sz, a.OTHER)
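Both steps can be checked end to end with the sketch below, which runs the two INSERT ... SELECT statements against the sample data in an in-memory SQLite database via Python's sqlite3 module. Storing Sz as text and letting DI be auto-assigned are assumptions made for the example, so the generated DI values will not necessarily match the numbers in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sz is kept as TEXT (an assumption) so all three columns share one type.
    CREATE TABLE datatable (PI INTEGER PRIMARY KEY, Colo TEXT, Sz TEXT, OTHER TEXT);
    CREATE TABLE detailstable (DI INTEGER PRIMARY KEY, Description TEXT UNIQUE);
    CREATE TABLE datadetails (PI INTEGER, DI INTEGER, PRIMARY KEY (PI, DI));

    INSERT INTO datatable VALUES
        (1, 'Blue', '5', 'Vintage'), (2, 'Blue', '6', 'Vintage'),
        (3, 'Blac', '5', 'Simple'),  (4, 'Blac', '6', 'Simple');

    -- Step 1: one row per distinct descriptor (UNION removes duplicates).
    INSERT INTO detailstable (Description)
    SELECT Colo FROM datatable
    UNION SELECT Sz FROM datatable
    UNION SELECT OTHER FROM datatable;

    -- Step 2: link each product to the descriptors it uses.
    INSERT INTO datadetails (PI, DI)
    SELECT a.PI, b.DI
    FROM datatable a
    JOIN detailstable b ON b.Description IN (a.Colo, a.Sz, a.OTHER);
""")

descriptor_count = conn.execute("SELECT COUNT(*) FROM detailstable").fetchone()[0]
link_count = conn.execute("SELECT COUNT(*) FROM datadetails").fetchone()[0]

# Read product 1 back through the cross-reference table.
product_1 = conn.execute("""
    SELECT b.Description
    FROM datadetails x
    JOIN detailstable b ON b.DI = x.DI
    WHERE x.PI = 1
    ORDER BY b.Description
""").fetchall()
print(descriptor_count, link_count, product_1)
```

Matching on the description alone assumes a value never appears in two different columns with different meanings (for example a size '5' and a style '5').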
I reckon you want to split the description table into separate tables per category, such as colorDescription, sizeDescription, etc.
If that is not practical, then I would recommend adding an extra column holding a category attribute:
DI | Description | Category
1 | Blue | Color
2 | Blac | Color
3 | 5 | Size
4 | 6 | Size
6 | Vintage | Other
7 | Simple | Other
Then make the primary key of this table the combination of the ID and Category columns.
This leaves less chance of introducing data errors and makes any that do occur easier to track down.