Is it possible to use the SqlReader plugin with multiple parameters for the primary key? Unfortunately, I'm using a third party database of images which has a primary key of two Int columns. I can't amend the table to add my own key field because the database is updated every night so the change would just be overwritten.
I thought of using a stored procedure which took a string and split and cast it into the two integer IDs, but the SP would also disappear every time the database was updated, as the update routine just deletes and recreates the database objects.
I would suggest forking SqlReader (ImageResizer v4+!), renaming it, and directly implementing the parsing changes you need.
There's no configuration to support your scenario, and the entire plugin is < 400loc. Take the direct approach.
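As a rough illustration of the parsing change, here is a minimal sketch in Python rather than the plugin's own C#. The `images` table, its `set_id` and `image_id` columns, and the `12_34` id format are all assumptions made up for the example; none of this is SqlReader's actual API.

```python
import sqlite3
from typing import Optional, Tuple


def parse_composite_id(raw: str, sep: str = "_") -> Tuple[int, int]:
    """Split an id such as '12_34' into its two integer key parts."""
    first, second = raw.split(sep)  # raises ValueError unless exactly two parts
    return int(first), int(second)


# Hypothetical stand-in for the third-party image table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (set_id INTEGER, image_id INTEGER, content BLOB, "
    "PRIMARY KEY (set_id, image_id))"
)
conn.execute("INSERT INTO images VALUES (12, 34, ?)", (b"fake-image-bytes",))


def load_image(raw_id: str) -> Optional[bytes]:
    """Look up an image by its two-part key, passed in as a single string."""
    set_id, image_id = parse_composite_id(raw_id)
    row = conn.execute(
        "SELECT content FROM images WHERE set_id = ? AND image_id = ?",
        (set_id, image_id),
    ).fetchone()
    return row[0] if row else None
```

Because the split happens in application code, nothing has to be added to the database, so the nightly rebuild cannot remove it.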
Related
I have a quote status which is either DRAFT, SENT, WON or LOST. Normally I would create a lookup table with Id, Name and then link across via quote_status_id but in this case as the statuses are system managed e.g. users can't create statuses, is it better to use the string instead?
FYI in my particular case I am developing a rest api and multiple front ends e.g. SPA and mobile app so I need to keep the statuses in sync between backend and front end but there are some user workflows depending on status. This probably doesn't affect my initial question but I'm wondering if the string might simplify my development even slightly.
You absolutely want to favour lookup tables instead of fixed strings. There are a whole raft of maintainability reasons for using lookup tables, but for me it comes down to one core value which is data security.
In any full-stack application, you have to trust your database data the most, as it's the most persistent. In line with this ideology, you want to build your database in such a way that makes it as difficult as possible for data to end up in an invalid state. Foreign keys are incredibly powerful at supporting this, because you can do a number of things including:
Ensuring that quotes always have a valid status (if that's your desire)
Preventing any user/developer spelling errors from persisting to the database (e.g. creating a "droft" quote status)
Easily updating the display name in a single place (so "Won" could be changed to "Acquired" if you needed it to and you wouldn't have to check for every reference in your codebase)
Free-text fields in databases should be primarily reserved for exactly that, free-text. If the values are pre-defined and should never change, then lookup tables are the way to go.
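A minimal sketch of the lookup-table approach, using Python's built-in sqlite3 so it runs anywhere. The table and column names (`quote_status`, `quote`, `quote_status_id`) are assumptions based on the question, and other databases enforce foreign keys without the PRAGMA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
    CREATE TABLE quote_status (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    INSERT INTO quote_status (id, name)
    VALUES (1, 'DRAFT'), (2, 'SENT'), (3, 'WON'), (4, 'LOST');

    CREATE TABLE quote (
        id              INTEGER PRIMARY KEY,
        quote_status_id INTEGER NOT NULL REFERENCES quote_status (id)
    );
""")

# A valid status is accepted.
conn.execute("INSERT INTO quote (id, quote_status_id) VALUES (1, 3)")

# A status that does not exist in the lookup table is rejected.
try:
    conn.execute("INSERT INTO quote (id, quote_status_id) VALUES (2, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Renaming the display value happens in exactly one place.
conn.execute("UPDATE quote_status SET name = 'ACQUIRED' WHERE id = 3")
status_name = conn.execute(
    "SELECT s.name FROM quote q JOIN quote_status s ON s.id = q.quote_status_id "
    "WHERE q.id = 1"
).fetchone()[0]
```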
The string is fine. An id and reference table has the nice advantage that you can verify the spelling and values that go into the field. In addition, if the status is stored in multiple tables, then a reference table ensures that the same values are used in all referencing tables.
However, for values in a single table, a check constraint works almost as well:
alter table t add constraint chk_t_status check (status in ('DRAFT', 'SENT', 'WON', 'LOST'));
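To see the constraint reject a misspelled value, here is a small runnable sketch using Python's sqlite3. SQLite cannot add a constraint with ALTER TABLE, so the same check is declared inline when the hypothetical table `t` is created; the effect is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        id     INTEGER PRIMARY KEY,
        status TEXT NOT NULL,
        CONSTRAINT chk_t_status
            CHECK (status IN ('DRAFT', 'SENT', 'WON', 'LOST'))
    )
""")

conn.execute("INSERT INTO t (status) VALUES ('DRAFT')")  # accepted

try:
    conn.execute("INSERT INTO t (status) VALUES ('DROFT')")  # typo
    typo_rejected = False
except sqlite3.IntegrityError:
    typo_rejected = True

stored = [row[0] for row in conn.execute("SELECT status FROM t")]
```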
Given a project I'm working on, we have an old database structure we're migrating data from into a new database structure, and we need to preserve the old keys for a few tables for backwards compatibility with some existing application functionality.
Currently, there are two approaches we are considering for addressing this need:
Create an extra nullable field for each table and insert the old key into that new field
Create companion table(s) that contain the old and new key mappings
Note: new data will not generate old ID keys, so in approach #1, eventually the nullable field will contain nulls over time for new records.
Which approach is better for a cleaner database design, and data management long-term?
Do you see any issues with either approach, and if so, what issues?
Is there a #3 approach that I haven't thought of yet?
You mention sql, but is it SQL-Server?
If SQL Server, look into SET IDENTITY_INSERT. Turning it ON for a table allows you to explicitly insert values into the auto-increment (identity) column, which is otherwise protected.
With that setting ON, I believe an insert statement that explicitly includes the PK column and its value will respect that value, saving the original key in the original column you are hoping to retain without having to add yet another column for backward compatibility purposes.
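The idea of keeping the old keys in the key column itself can be sketched with Python's sqlite3. This is an illustration under assumptions: SQLite accepts explicit values for an auto-increment key without any switch, whereas SQL Server requires IDENTITY_INSERT to be ON for the table first. The `customer` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)

# Migrated rows carry their old keys explicitly.
legacy_rows = [(17, "Acme"), (42, "Globex"), (500, "Initech")]
conn.executemany("INSERT INTO customer (id, name) VALUES (?, ?)", legacy_rows)

# New rows omit the key and let the database generate it;
# generation continues above the highest migrated key.
cur = conn.execute("INSERT INTO customer (name) VALUES (?)", ("New customer",))
new_id = cur.lastrowid

ids = [row[0] for row in conn.execute("SELECT id FROM customer ORDER BY id")]
```

With this approach there is no nullable "old key" column and no mapping table, at the cost of the new system inheriting the old numbering.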
I'm starting a new application and was wondering what the best method of logging is. Some tables in the database will need to have every change recorded, and the user that made the change. Other tables may just need to have the last modified time recorded.
In previous applications I've used different methods to do this but want to hear what others have done.
I've tried the following:
Add a "modified" date-time field to the table to record the last time it was edited.
Add a secondary table just for recording changes in a primary table. Each row in the secondary table represents a changed field in the primary table. So one record update in the primary could create several records in the secondary table.
Add a table similar to no. 2, but one that records edits across three or four tables, referencing the table it relates to in an additional field.
What methods do you use and would recommend?
Also what is the best way to record deleted data? I never like the idea that a user can permanently delete a record from the DB, so usually I have a boolean field 'deleted' which is changed to true when it's deleted, and then it'll be filtered out of all queries at model level. Any other suggestions on this?
Last one.. What is the best method for recording user activity? At the moment I have a table which records logins/logouts/password changes etc, and depending what the action is, gives it a code either 1,2, 3 etc.
Hope I haven't crammed too much into this question. thanks.
I know it's a very old question, but I wanted to add a more detailed answer, as this is the first link I got when googling about db logging.
There are basically two ways to log data changes:
on application server layer
on database layer.
If you can, just use logging on server side. It is much more clear and flexible.
If you need to log on the database layer you can use triggers, as @StanislavL said. But triggers can slow down your database and limit you to storing the change log in the same database.
Also, you can look at the transaction log monitoring.
For example, in PostgreSQL you can use the logical replication mechanism to stream changes in JSON format from your database to anywhere.
In the separate service you can receive, handle and log changes in any form and in any database (for example just put json you got to Mongo)
You can add triggers to any tracked table to listen for insert/update/delete. In the triggers, just check the NEW and OLD values and write them to a special table with the columns
table_name
entity_id
modification_time
previous_value
new_value
user
It's hard to figure out which user made a change, but it is possible if you add a changed_by column to the table you are tracking.
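A minimal sketch of such a trigger, written in SQLite syntax and driven from Python so it can be run as-is. Trigger syntax differs between databases, and the `customer` table, the `change_log` table name, and the `user_name` column (standing in for "user") are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        changed_by TEXT NOT NULL       -- set by the application on every write
    );

    CREATE TABLE change_log (
        table_name        TEXT,
        entity_id         INTEGER,
        modification_time TEXT DEFAULT CURRENT_TIMESTAMP,
        previous_value    TEXT,
        new_value         TEXT,
        user_name         TEXT
    );

    -- One log row per change to the tracked column.
    CREATE TRIGGER customer_name_audit
    AFTER UPDATE OF name ON customer
    BEGIN
        INSERT INTO change_log
            (table_name, entity_id, previous_value, new_value, user_name)
        VALUES
            ('customer', OLD.id, OLD.name, NEW.name, NEW.changed_by);
    END;
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme', 'alice')")
conn.execute(
    "UPDATE customer SET name = 'Acme Ltd', changed_by = 'bob' WHERE id = 1"
)

log = conn.execute(
    "SELECT table_name, entity_id, previous_value, new_value, user_name "
    "FROM change_log"
).fetchall()
```

Similar triggers for INSERT and DELETE would log only NEW or only OLD values respectively.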
In our DB (on SQL Server 2005) we have a "Customers" table, whose primary key is Client Code, a surrogate, bigint IDENTITY(1,1) key; the table is referenced by a number of other tables in our DB thru a foreign key.
A new CR implementation we are estimating would require us to change ID column type to varchar, Client Code generation algorithm being shifted from a simple numeric progression to a strict 2-char representation, with codes ranging from 01 to 99, then progressing like this:
1A -> 2A -> ... -> 9A -> 1B -> ... 9Z
I'm fairly new to database design, but I smell some serious problems here. First of all, what about this client code generation algorithm? What if I need a Client Code to go beyond 9Z code limit?
Then I have some questions: would this change be feasible, the table being already filled with a fair amount of data, and referenced by multiple entities? If so, how would you approach this problem, and how would you implement Client Code generation?
I would leave the primary key as it is and would create another key (unique) on the client code generated.
I would do that anyway. It's always better to have a short number primary key instead of long char keys.
In some situations you might prefer a GUID (for replication purposes) but a numeric int/bigint is always preferable.
You can read more here and here.
My biggest concern with what you are proposing is that you will be limited to 360 primary records. That seems like a small number.
Performing the change is a multi-step operation. You need to create the new field in the core table and all its related tables.
To do an in-place update, you need to generate the code in the core table. Then you need to update all the related tables to have the code based on the old id. Then you need to add the foreign key constraint to all the related tables. Then you need to remove the old key field from all the related tables.
We only did that in our development server. When we upgraded the live databases, we created a new database for each and copied the data over using a python script that queried the old database and inserted into the new database. I now update that script for every software upgrade so the core engine stays the same, but I can specify different tables or data modifications. I get the bonus of having a complete backup of the original database if something unexpected happens when upgrading production.
One strong argument in favor of a non-identity/guid code is that you want a human readable/memorable code and you need to be able to move records between two systems.
Performance is not necessarily a concern in SQL Server 2005 and 2008. We recently went through a change where we moved from int ids everywhere to 7 or 8 character "friendly" record codes. We expected to see some kind of performance hit, but we in fact saw a performance improvement.
We also found that we needed a way to quickly generate a code. Our codes have two parts, a 3 character alpha prefix and a 4 or 5 digit suffix. Once we had a large number of codes (15000-20000) we were finding it too slow to parse the code into prefix and suffix and find the lowest unused code (it took several seconds). Because of this, we also store the prefix and the suffix separately (in the primary key table) so that we can quickly find the next available lowest code with a particular prefix. The cached prefix and suffix made the search almost free.
We allow changing of the codes, and the changed values propagate by cascade update rules on the foreign key relationship. We keep an identity key on the core code table to simplify the update of the code.
We don't use an ORM, so I don't know what specific things to be aware of with that. We also have on the order of 60,000 primary keys in our biggest instance, but have hundreds of tables related and tables with millions of related values to the code table.
One big advantage that we got was, in many cases, we did not need to do a join to perform operations. Everywhere in the software the user references things by friendly code. We don't have to do a lookup of the int ID (or a join) to perform certain operations.
The new code generation algorithm isn't worth thinking about. You can write a program to generate all possible codes in just a few lines of code. Put them in a table, and you're practically done. You just need to write a function to return the smallest one not yet used. Here's a Ruby program that will give you all the possible codes.
# test.rb -- generate a peculiar sequence of two-character codes.
i = 1
('A'..'Z').each do |c|
  (1..9).each do |n|
    printf("'%d%s', %d\n", n, c, i)
    i += 1
  end
end
The program prints CSV rows to standard output; redirect them to a file and you should be able to import it easily into a table. You need two columns to control the sort order. The new values don't naturally sort the way your requirements specify.
I'd be more concerned about the range than the algorithm. If you're right about the requirement, you're limited to 234 client codes. If you're wrong, and the range extends from "1A" to "ZZ", you're limited to less than a thousand.
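The "smallest one not yet used" function can be sketched in a few lines of Python. It assumes the ordering described in the question (01 to 99 first, then 1A to 9A, 1B to 9B, and so on up to 9Z); the function names are made up for the example.

```python
import string

# 01..99, exactly as described in the question.
NUMERIC_CODES = [f"{n:02d}" for n in range(1, 100)]

# 1A..9A, then 1B..9B, ... up to 9Z: 9 digits x 26 letters.
LETTERED_CODES = [
    f"{digit}{letter}"
    for letter in string.ascii_uppercase
    for digit in range(1, 10)
]


def all_codes():
    """Return every allowed client code in its required sort order."""
    return NUMERIC_CODES + LETTERED_CODES


def next_unused(used):
    """Return the first code not in `used`, or None when the range is exhausted."""
    for code in all_codes():
        if code not in used:
            return code
    return None
```

The list position of each code doubles as the explicit sort column mentioned above, since plain string ordering would not put "99" before "1A".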
To implement this requirement in an existing table, you need to follow a careful procedure. I'd try it several times in a test environment before trying it on a production table. (This is just a sketch. There are a lot of details.)
Create and populate a two-column table to map existing bigints to the new CHAR(2).
Create new CHAR(2) columns in all the tables that need them.
Update all the new CHAR(2) columns.
Create new NOT NULL UNIQUE or PRIMARY KEY constraints and new FOREIGN KEY constraints on the new CHAR(2) columns.
Rewrite user interface code (?) to target the new columns. (Might not be necessary if you rename the new CHAR(2) and old BIGINT columns.)
Set a target date to drop the old BIGINT columns and constraints.
And so on.
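The first few steps of that procedure could look roughly like this. It is a sketch on SQLite via Python, with made-up `customers` and `orders` tables and a made-up `code_map` mapping table; SQLite cannot add foreign key constraints to an existing table, so that step and the final drop are only noted in comments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Existing schema keyed on a numeric id.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (id)
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 2);

    -- Step 1: two-column table mapping old ids to new CHAR(2) codes.
    CREATE TABLE code_map (
        old_id   INTEGER PRIMARY KEY,
        new_code CHAR(2) NOT NULL UNIQUE
    );
    INSERT INTO code_map VALUES (1, '01'), (2, '02');

    -- Step 2: new CHAR(2) columns in every table that needs them.
    ALTER TABLE customers ADD COLUMN client_code CHAR(2);
    ALTER TABLE orders ADD COLUMN client_code CHAR(2);

    -- Step 3: populate the new columns from the mapping.
    UPDATE customers
       SET client_code = (SELECT new_code FROM code_map
                           WHERE old_id = customers.id);
    UPDATE orders
       SET client_code = (SELECT new_code FROM code_map
                           WHERE old_id = orders.customer_id);

    -- Step 4 (partial): enforce uniqueness on the parent's new key.
    -- New FOREIGN KEY constraints and dropping the old columns would follow.
    CREATE UNIQUE INDEX ux_customers_client_code ON customers (client_code);
""")

order_codes = conn.execute(
    "SELECT id, client_code FROM orders ORDER BY id"
).fetchall()
```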
Not really addressing whether this is a good idea or not, but you can change your foreign keys to cascade the updates. What will happen once you're done doing that is that when you update the primary key in the parent table, the corresponding key in the child table will be updated accordingly.
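A small runnable sketch of cascading updates, using SQLite through Python with hypothetical `customers` and `orders` tables. The `ON UPDATE CASCADE` clause itself is standard SQL and is also available in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK actions in SQLite
conn.executescript("""
    CREATE TABLE customers (client_code TEXT PRIMARY KEY);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        client_code TEXT NOT NULL
            REFERENCES customers (client_code) ON UPDATE CASCADE
    );
    INSERT INTO customers VALUES ('01');
    INSERT INTO orders VALUES (1, '01'), (2, '01');
""")

# Changing the parent key rewrites the matching child rows automatically.
conn.execute("UPDATE customers SET client_code = '1A' WHERE client_code = '01'")

child_codes = [
    row[0] for row in conn.execute("SELECT client_code FROM orders ORDER BY id")
]
```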
Right now I have all my mappings as hbm.xml. I want to switch dynamically the type of Id generator for certain entities from 'identity' to 'assigned' at runtime (application start).
This is because I need to support importing data from previous system and keep existing ids.
Is this possible? How?
The generator is part of the mappings, so you need to change the mappings before creating the session factory.
This is easy to do with Fluent or ConfORM. It's possible to change XML mappings before feeding them to the configuration, but it's cumbersome.
Just check for a configuration flag (that you'll change when starting the app), and call the appropriate generator.
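To show what "changing the XML mappings before feeding them to the configuration" amounts to, here is a language-neutral sketch of the transformation written in Python. A real application would do the equivalent in .NET before building the session factory; the sample mapping, the `Customer` entity, and the `importing` flag are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

HBM_NS = "urn:nhibernate-mapping-2.2"

# Hypothetical mapping document; a real one would be read from the .hbm.xml file.
SAMPLE_MAPPING = """<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <class name="Customer" table="Customers">
    <id name="Id" column="Id">
      <generator class="identity" />
    </id>
    <property name="Name" />
  </class>
</hibernate-mapping>"""


def switch_generator(hbm_xml, entity_names, new_generator="assigned"):
    """Return the mapping XML with the id generator replaced for the given entities."""
    ET.register_namespace("", HBM_NS)  # keep the default namespace on output
    root = ET.fromstring(hbm_xml)
    ns = {"hbm": HBM_NS}
    for cls in root.findall("hbm:class", ns):
        if cls.get("name") in entity_names:
            for generator in cls.findall("hbm:id/hbm:generator", ns):
                generator.set("class", new_generator)
    return ET.tostring(root, encoding="unicode")


importing = True  # hypothetical configuration flag, set at application start
mapping_xml = (
    switch_generator(SAMPLE_MAPPING, {"Customer"}) if importing else SAMPLE_MAPPING
)
```

With a code-based mapping such as Fluent or ConfORM the same decision is a single conditional, which is why the XML route is described as cumbersome.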
It's not clear why you would need to keep the existing IDs. I don't think you should need to. Maybe you need to keep alternate IDs instead?
If the previous system has its own database, then you:
1) Need another mapping for the other table in the other database
2) Copy the data to your existing database (with key identity)
Which means you will need new id's anyway.
Example: Suppose you want to copy a table of 'airlines' and the previous system uses the 'airline-code' as the primary key. You could use an integer as primary key in your new database and the airlinecode as your alternate key.
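The airline example can be sketched as follows, using SQLite through Python. The `airlines` table, its column names, and the sample rows are assumptions; the point is only that the old key survives as a unique alternate key while the new database generates its own integer ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE airlines (
        id           INTEGER PRIMARY KEY,     -- new surrogate key
        airline_code TEXT NOT NULL UNIQUE,    -- old primary key, kept as alternate key
        name         TEXT NOT NULL
    )
""")

# Rows copied from the previous system; ids are generated on insert.
legacy_rows = [("KL", "KLM"), ("BA", "British Airways")]
conn.executemany(
    "INSERT INTO airlines (airline_code, name) VALUES (?, ?)", legacy_rows
)

# Existing functionality can still look records up by the old key.
row = conn.execute(
    "SELECT id, name FROM airlines WHERE airline_code = ?", ("BA",)
).fetchone()

# The alternate key still guards against duplicates.
try:
    conn.execute("INSERT INTO airlines (airline_code, name) VALUES ('BA', 'Dup')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```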