Get number of attached constraints on a variable in MiniZinc - optimization

I have two sets of variables in my MiniZinc program. Each variable from the first set necessarily has several constraints placed on it, but the variables in the second set are only implicitly constrained via their interactions with variables in the first set. This means that each of the variables in the second set may have anywhere from 0 to ~8 constraints placed on it, depending on the values taken by the variables in the first set.
I see that there is a way to reference the number of constraints placed on a variable at search time via the dom_w_deg search annotation, but I was wondering if there is any way to access this information at runtime? I want to do this because I would like to specify additional constraints related to the number of constraints already placed on the variables.
I realize this is a weird question, and I may be approaching this whole thing the wrong way, but I've been banging my head against this problem for a while now, so figured I'd ask.

As a general rule, I think that you are approaching your problem erroneously. There are several misconceptions in the approach that I can identify which lead me to this conclusion:
Different solver back-ends might do very different things with the model and how it is solved
"A constraint" is not a meaningful concept for the solver. A single constraint might be multiple propagators in the back-end solver, a single propagator, or even just part of a propagator covering several constraints (assuming that it is a propagator based back-end).
Constraint models have monotonic behavior, so you cannot change the model based on the number of constraints connected to a variable in a well-defined and meaningful way.
Given that a constraint maps to a single propagator, it may still have very different propagation strength, meaning that it might be done early or very late in the solving process.
Without knowing what you are actually trying to achieve, as a general technique you might be interested in using reification, where the truth of a constraint is reflected onto a binary Boolean variable. In general, it is good practice to have as little reification as possible, since it does not propagate much, but sometimes it is needed.
As a very simple example of using reification, this is a (probably not very good) model that tries to maximize the number of constraints satisfied.
set of int: Domain = 1..10;
var Domain: x;
var Domain: y;
var Domain: z;
% One Boolean per constraint: holds[i] is true iff constraint i is satisfied.
array[1..3] of var bool: holds;
constraint holds[1] <-> x < y;
constraint holds[2] <-> y < z;
constraint holds[3] <-> z < x;
% Count the satisfied constraints (the Booleans are coerced to 0/1 in the sum).
var int: goal;
constraint goal = sum(holds);
solve maximize goal;

Related

Optaplanner - Multiple planning entities with same blockId- How to "move all" or "chain" or "shadow" to the same planning variable?

I am trying to assign Timeslots to planning entities
(containing room, groups, persons, handled by constraint streams).
Some of these planning entities have a blockId.
When an entity has a blockId, the goal is to share the timeslot with other entities that have the same blockId.
I defined a constraint for this, but I can see that the solver performs an extremely large number of unnecessary moves.
public Constraint groupBlockConstraint(ConstraintFactory constraintFactory) {
    return constraintFactory.forEachUniquePair(Lesson.class,
                Joiners.equal(Lesson::getSequenceGroup),
                Joiners.filtering((a, b) ->
                        !Lesson.withoutBlock(a, b)
                        && !Lesson.sameTimeslot(a, b)))
            .penalize("BlockSequence not in same timeslot", HardSoftScore.ofHard(15));
}
Is there a way to handle this more efficiently?
Constraints do not determine which moves the solver will be trying. Constraints are only used to score solutions, which happens after the moves have already been performed.
Therefore, if you're seeing moves which in your opinion should not be performed, you need to configure your selectors. Using tabu search could, perhaps, also help here.
That said, without a more detailed question I can not provide a less generic answer.

Best practices for referencing natural and/or surrogate key values in code

I'm modifying some stored procedures that manage status changes when records are updated.
For example, if I have these two tables
Request(RequestID, StatusID)
Status(StatusID, StatusName)
I'm trying to determine the best way to handle referencing the statuses in code.
Do I use StatusID or StatusName?
It's not guaranteed that StatusID will match between environments (DEV, PRE, PROD, etc).
Also, StatusName could be changed and I wouldn't want to have to alter code because I needed to change a StatusName.
I could create a 2nd unique column, which would sort of closely resemble StatusID.
I'd make sure this column was matched between regions, but that doesn't seem that clean either and sort of repetitive.
Can anyone suggest a cleaner, simpler way?
The difficulty of matching code to data can only partially be handled with a second column. When someone adds an item, what does this mean? If they re-use a known constant, what does it mean if you don't require this column to be unique?
Oftentimes we will have user-modifiable lookup tables, but they will have to be associated with a number of other flags indicating how to interpret the status - "IsTreatedAsExpired", "IsTreatedAsActive" - or perhaps other tables which hold the statuses which are treated as certain things.
I think you really need to figure out the scope of what you want to allow with this table first. Because if you have a LOT of code references, you would be better off using a natural key which is in sync with your code on all installations. A possibility to handle this is to use negative numbers for unmovable codes (identity insert to add new unmovable codes) and then have your sequence only add positive ones. But again, this doesn't address the semantics of how your program would handle or use the user-entered extensions.
Again, it's hard to say without getting the full scope sorted out here.
From the information you've given, StatusID may have different values in different databases, presumably because your keys are generated automatically and are not specified by you. If so then obviously it's impossible to use StatusID consistently in your code anyway (without standardizing the values). Therefore the question becomes "is it acceptable/practical/desirable to hard-code StatusName values in my code?"
The obvious answer is yes, what's the alternative? If you have a certain status that represents 'ready' and you want to reference that in code then you must put something in your code that identifies the status unambiguously.
If you add a second key of some kind (as Carlos suggested) you still have the same basic problem that changing a natural key value is changing the identity of the status and therefore changes the meaning of your code. If you change the 'real' natural key (READY) without changing the second key (RDY) then your code will become more confusing and difficult to maintain.
If you do something more complex like extracting 'constants' or 'configuration parameters' into a configuration file or table or even writing a custom preprocessor to insert key values into your scripts at deployment time, you add lots of complexity for very little gain (unless you have other good reasons for doing it). I've seen this approach used, and it was a huge, unmaintainable mess.
In practice, StatusName is most likely to change because a) someone thinks another name would be 'more accurate' or 'look better', or b) you discover that it doesn't correctly represent your requirements. If you're forced to spend time on a) then just change the display name in your front end or reports and leave the database and code alone. If b) comes up then by definition your current data model and code are inaccurate and must be revised and possibly modified anyway. And when b) does happen, it often results in adding a new code, not changing the existing one (e.g. because someone defined a new process step that there is no existing code for).
And if you are open to changing your development and deployment practices there are other ways to look at this issue too, as others have suggested. Can you make your StatusID values the same everywhere? Technically it's possible, so what are the organizational reasons not to? Can you reduce the probability and impact of StatusName changes through change management and code reviews? Can you improve your requirements process to capture certain information more effectively?
Write a user-defined function that accepts the status name and returns the status ID, and use it wherever you refer to the status ID:
select * from resources where statusid = dbo.getStatusId('COMPLETED');
This makes sure that resolving the status ID always happens within the function that you have defined.
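A rough sketch of such a function, assuming the Status(StatusID, StatusName) table from the question (the varchar length is arbitrary):
CREATE FUNCTION dbo.getStatusId (@statusName varchar(50))
RETURNS int
AS
BEGIN
    DECLARE @statusId int;
    -- Look up the surrogate key by its (assumed unique) natural name.
    SELECT @statusId = StatusID
    FROM   Status
    WHERE  StatusName = @statusName;
    RETURN @statusId;
END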
As a rule of thumb, when I have id/value tables (Status, Result, Area, etc.) I usually add a third field that holds the record's mnemonic value, and I always use that, rather than the name or the id.
The mnemonic value is like a business key (well, it is a business key) in the sense that it is a business value and does not depend on the database (for the id) or on the way it is displayed (the description). So, for example, for your status table you may have
StatusID  StatusName  StatusMnemo
1         COMPLETED   COM
2         REJECTED    REJ
and so forth.
And in your queries you always join by StatusID, but you add a clause against the status table filtering on StatusMnemo. This is a value that is consistent across environments and remains constant.
Also, in inserts, you always use StatusID.
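A minimal sketch of such a query, using the Request and Status tables from the question:
select r.*
from   Request r
join   Status s on s.StatusID = r.StatusID
where  s.StatusMnemo = 'COM'   -- the mnemonic is stable across DEV/PRE/PROD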
If you have statusID values that need special treatment then they should be the same across environments.
Why would you introduce a statusID that needs special treatment in Prod that has not gone through Pre and Dev?
What I often do is start the identity seed at 100 and use that range for generic statuses that don't need special treatment.
DEV then owns the space under 100 for special treatment, using SET IDENTITY_INSERT ON.
When deploying from DEV to PRE, insert any records under 100.
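A rough sketch of that convention, reusing the Status table from the question:
-- Seed the identity at 100 so generated IDs never collide with the reserved range.
CREATE TABLE Status
(
    StatusID   int IDENTITY(100, 1) PRIMARY KEY,
    StatusName varchar(50) NOT NULL
);
-- Reserved, code-referenced statuses get explicit IDs (the same in every environment).
SET IDENTITY_INSERT Status ON;
INSERT INTO Status (StatusID, StatusName) VALUES (1, 'COMPLETED');
SET IDENTITY_INSERT Status OFF;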

Type to use for "Status" columns in a sql table

I have a (dummy) table structure as follows:
ticket
id: int(11) PK
name: varchar(255)
status: ?????????
The question is, what data type should I use for status? Here are my options, as I see them:
varchar representing the status - BAD because there's no integrity
enum representing the status - BAD because to change the value, I'd have to alter the table, and then any code with dropdowns for the values, etc etc etc
int FK to a status table - GOOD because it's dynamic, BAD because it's harder to inspect by sight (which may be useful)
varchar FK to a status table - GOOD because it's dynamic, and visible on inspection. BAD because the keys are meaningful, which is generally frowned upon. Interestingly, in this case it's entirely possible for the status table to have just 1 column, making it a glorified enum
Have I got an accurate read of the situation? Is having a meaningful key really that bad? Because while it does give me goosebumps, I don't have any reason for it doing so...
Update:
For option 4, the proposed structure would be status: char(4) FK, to a status table. So,
OPEN => "Open"
CLOS => "Closed"
PEND => "Pending Authorization"
PROG => "In Progress"
What's the disadvantage in this case? The only benefit I can see of using int over char here is a slight performance gain.
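For concreteness, a minimal sketch of what option 4 could look like (the status table name and column lengths are made up for illustration):
CREATE TABLE ticket_status
(
    status      char(4) PRIMARY KEY,   -- 'OPEN', 'CLOS', 'PEND', 'PROG'
    description varchar(60) NOT NULL   -- 'Open', 'Closed', 'Pending Authorization', ...
);
CREATE TABLE ticket
(
    id     int PRIMARY KEY,
    name   varchar(255),
    status char(4) NOT NULL REFERENCES ticket_status (status)
);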
I would go with number 4, but I'd use a char(x) column. If you're worried about performance, a char(4) takes up as much space (and, or so one would think, disk i/o, bandwidth, and processing time) as an int, which also takes 4 bytes to store. If you're really worried about performance, make it a char(2) or even char(1).
Don't think of it as "meaningful data", think of it as an abbreviation of the natural key. Yes, the data has meaning, but as you've noticed that can be a good thing when working with the data--it means you don't always have to join (even if to a trivially small table) to extract meaning from the database. And of course the foreign key constraint ensures that the data is valid, since it must be in the lookup table. (This can be done with CHECK constraints as well, but Lookup tables are generally easier to manage and maintain over time.)
The downside is that you can get caught up with trying to find meaning. char(1) has a strong appeal, but if you get to ten or more values, it can get hard to come up with good meaningful values. Less of a problem with char(4), but still a possible issue. Another downside: if the data is likely to change, then yes, your meaningful data ("PEND" = "Pending Authorization") can lose its meaning ("PEND" = "Forward to home office for initial approval"). That's a poor example; if codes like that do change, you're probably much better off refactoring your system to reflect the change in business rules. I guess my point should be, if it's a user-entered lookup value, surrogate keys (integers) will be your friend, but if they're internally defined and maintained you should definitely consider more human-friendly values. That, or you'll need post-em notes on your monitor to remind you what the heck Status = 31 is supposed to mean. (I've got three on mine, and the stickum wears out every few months. Talk about cost to maintain...)
Go with number 3. Create a view that joins in the status value if you want something inspectable.
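A minimal sketch of such a view for option 3 (an int status_id foreign key to a status(id, name) table; the names are assumptions):
CREATE VIEW ticket_with_status AS
SELECT t.id,
       t.name,
       s.name AS status_name
FROM   ticket t
JOIN   status s ON s.id = t.status_id;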
I would use an INT, and create a foreign key relationship to the status table. An INT should definitely be safe for an enumerated status column.
May I recommend you go with a statusID field instead, and have a separate table mapping the ID to a varchar?
EDIT: I guess that's exactly what you outlined in point 3. I think that is the best option.
I'm assuming that your database has a front end of some description, and that regular users are not exposed to the status code.
So, your convenience is only for programmers and DBAs - important people, but I wouldn't optimize my design for them.
Stronger: I would be very careful about using "meaningful" abbreviations - the most egregious data foul-up I've ever seen happened when a developer was cleansing some data and interpreted the "meaningful" key incorrectly; it turned out that "PROG" did not mean "programmed", but "in progress".
Go with option 3.
I've been working with a lot of databases recently that require a lot of statuses AND I've got a few notes that might be worth adding to the conversation.
INT: One thing I found is that if an application has a lot of tracking going on, the number of reference tables can quickly get unwieldy and, as you've mentioned, make inspecting the database at a glance impractical. (Which, for some of my clients, has mattered much more than the scant milliseconds it's saved in processing time.)
VARCHAR: Terrible idea for programming, but it's important to consider if a given status is actually going to be used by the code, or just human eyes. For the latter, you get unlimited range and don't have to maintain any relationships.
CHAR(4): Using a descriptive char column can actually be a very good approach. I'd typically only consider it if the value range were going to be low and obvious, but only because I consider this a nonstandard approach (risking confusion to new devs). Realistically, you could use a CHAR value as a foreign key just the same as an INT, gain legibility and maintain performance parity.
The one thing you couldn't do that I'd miss is mathematical operations (like "<" and ">").
INT Range: A hybrid strategy I've tried out is to use INT, but adding a degree of semantics to the numbers. So, for instance,
1-10 being for initial stages,
11-20 being in progress, and
21-30 being the final stages.
60-69 for errors, rejections
The problem here is that if you discover you need more numbers, you're SOL, since the next range is already taken. So, what I ended up doing was (sort of) mimicking HTTP responses:
100-199 being for initial stages,
200-299 being in progress, and
300-399 being the final stages.
500-599 for errors, rejections
I prefer this to a simple INT, and while it can be less descriptive than CHAR, it can also be less ambiguous. Whereas "PROG" could mean a number of things - good, bad or benign - if I can see something is in the 500 range, I may not know what the problem is, but I will be able to tell you there is a problem.
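A small sketch of how such ranges might be used in queries (the ticket table and the integer values are made up for illustration):
-- Anything in the 500 range is some kind of error or rejection.
SELECT * FROM ticket WHERE status BETWEEN 500 AND 599;
-- Everything that has not yet reached the final stages.
SELECT * FROM ticket WHERE status < 300;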
Creating a separate status table is a good idea when you want to show the list of statuses in an HTML form. You can show the verbose description from the lookup table, which helps the user choose a status if the requirements call for that.
From the development perspective, I would go with an integer as the primary key. You can optimize it by using a small/tiny integer if you know it will not exceed the limit.
If you use an abbreviation as a foreign key then you have to take care every time to keep it unique, as @Philip Kelley mentioned as a downside.
Lastly, you can declare the table type MYISAM if you like.
Update:
Reflecting @Philip Kelley's opinion: if there are too many statuses, then it's better to use an integer as the foreign key. If there are only a couple of statuses, then maybe use the abbreviation as the foreign key.

How important are lookup tables?

A lot of the applications I write make use of lookup tables, since that was just the way I was taught (normalization and such). The problem is that the queries I make are often more complicated because of this. They often look like this
get all posts that are still open
"SELECT * FROM posts WHERE status_id = (SELECT id FROM statuses WHERE name = 'open')"
Oftentimes, the lookup tables themselves are very short. For instance, there may only be 3 or so different statuses. In this case, would it be okay to search for a certain type by using a constant in the application? Something like
get all posts that are still open
"SELECT * FROM posts WHERE status_id = ".Status::OPEN
Or, what if instead of using a foreign id, I set it as an enum and queried off of that?
Thanks.
The answer depends a little on whether you are limited to freeware such as PostgreSQL (not fully SQL compliant), or whether you are thinking about SQL (i.e. SQL compliant) and large databases.
In SQL compliant, Open Architecture databases, where there are many apps using one database, and many users using different report tools (not just the apps) to access the data, standards, normalisation, and open architecture requirements are important.
Despite the people who attempt to change the definition of "normalisation", etc. to suit their ever-changing purpose, Normalisation (the science) has not changed.
if you have data values such as {Open; Closed; etc} repeated in data tables, that is data duplication, a simple Normalisation error: if those values change, you may have to update millions of rows, which is a very limited design.
Such values should be Normalised into a Reference or Lookup table, with a short CHAR(2) PK:
O Open
C Closed
U [NotKnown]
The data values {Open;Closed;etc} are no longer duplicated in the millions of rows. It also saves space.
The second point is ease of change: if Closed were changed to Expired, again, only one row needs to be changed, and that is reflected in the entire database; whereas in the un-normalised files, millions of rows would need to be changed.
Adding new data values, eg. (H, HalfOpen), is then simply a matter of inserting one row (see the sketch after this list).
in Open Architecture terms, the Lookup table is an ordinary table. It exists in the [SQL compliant] catalogue; as long as the FOREIGN KEY relation has been defined, the report tool can find that as well.
ENUM is non-SQL; do not use it. In SQL the "enum" is a Lookup table.
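A minimal sketch of such a Reference table and its use, assuming a posts table as in the question (the names are illustrative):
CREATE TABLE post_status
(
    status_code CHAR(2) PRIMARY KEY,   -- 'O', 'C', 'U'
    name        VARCHAR(30) NOT NULL   -- 'Open', 'Closed', 'NotKnown'
);
CREATE TABLE posts
(
    post_id     INT PRIMARY KEY,
    status_code CHAR(2) NOT NULL REFERENCES post_status (status_code)
);
-- Adding a new value, e.g. HalfOpen, is one insert; no data rows change.
INSERT INTO post_status (status_code, name) VALUES ('H', 'HalfOpen');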
The next point relates to the meaningfulness of the key.
If the Key is meaningless to the user, fine, use an {INT;BIGINT;GUID;etc} or whatever is suitable; do not number them incrementally; allow "gaps".
But if the Key is meaningful to the user, do not use a meaningless number, use a meaningful Relational Key.
Now some people will get in to tangents regarding the permanence of PKs. That is a separate point. Yes, of course, always use a stable value for a PK (not "immutable", because no such thing exists, and a system-generated key does not provide row uniqueness).
{M,F} are unlikely to change
if you have used {0,1,2,4,6}, well, don't change it; why would you want to? Those values were supposed to be meaningless, remember; only a meaningful Key needs to be changed.
if you do use meaningful keys, use short alphabetic codes, that developers can readily understand (and infer the long description from). You will appreciate this only when you code SELECT and realise you do not have to JOIN every Lookup table. Power users too, appreciate it.
Since PKs are stable, particularly in Lookup tables, you can safely code:
WHERE status_code = 'O' -- Open
You do not have to JOIN the Lookup table and obtain the data value Open, as a developer, you are supposed to know what the Lookup PKs mean.
Last, if the database were large, and supported BI or DSS or OLAP functions in addition to OLTP (as properly Normalised databases can), then the Lookup table is actually a Dimension or Vector, in Dimension-Fact analyses. If it was not there, then it would have to be added in, to satisfy the requirements of that software, before such analyses can be mounted.
If you do that to your database from the outset, you will not have to upgrade it (and the code) later.
Your Example
SQL is a low-level language, thus it is cumbersome, especially when it comes to JOINs. That is what we have, so we need to just accept the encumbrance and deal with it. Your example code is fine. But simpler forms can do the same thing.
A report tool would generate:
SELECT p.*,
       s.name
FROM   posts p,
       status s
WHERE  p.status_id = s.status_id
AND    p.status_id = 'O'
Another Example
For banking systems, where we use short codes which are meaningful (since they are meaningful, we do not change them with the seasons, we just add to them), given a Lookup table such as (carefully chosen, similar to ISO Country Codes):
Eq Equity
EqCS Equity/Common Share
OTC OverTheCounter
OF OTC/Future
Code such as this is common:
WHERE InstrumentTypeCode LIKE 'Eq%'
And the users of the GUI would choose the value from a drop-down that displays
{Equity/Common Share;Over The Counter},
not {Eq;OTC;OF}, not {M;F;U}.
Without a lookup table, you can't do that, either in the apps, or in the report tool.
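A sketch of the query that would populate such a drop-down (the InstrumentType table name and its columns are assumptions):
SELECT InstrumentTypeCode,
       Name
FROM   InstrumentType
ORDER BY Name;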
For look-up tables I use a sensible primary key -- usually just a CHAR(1) that makes sense in the domain with an additional Title (VARCHAR) field. This can maintain relationship enforcement while "keeping the SQL simple". The key to remember here is the look-up table does not "contain data". It contains identities. Some other identities might be time-zone names or assigned IOC country codes.
For instance gender:
ID Label
M Male
F Female
N Neutral
select * from people where gender = 'M'
Alternatively, an ORM could be used and manual SQL generation might never have to be done -- in this case the standard "int" surrogate key approach is fine because something else deals with it :-)
Happy coding.
Create a function for each lookup.
There is no easy way: you want both performance and query simplicity. Ensure the following is kept in sync with the lookup data. You could also create an SP_TestAppEnums procedure to compare existing lookup values against the function and look for anything out of sync (zero rows returned); a sketch follows the function below.
CREATE FUNCTION [Enum_Post](@postname varchar(10))
RETURNS int
AS
BEGIN
    DECLARE @postId int
    SET @postId =
        CASE @postname
            WHEN 'Open' THEN 1
            WHEN 'Closed' THEN 2
        END
    RETURN @postId
END
GO
/* Calling the function */
SELECT dbo.Enum_Post('Open')
SELECT dbo.Enum_Post('Closed')
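A rough sketch of what such a consistency check could look like, assuming the statuses(id, name) table from the question:
CREATE PROCEDURE SP_TestAppEnums
AS
BEGIN
    -- Report any lookup row whose id disagrees with (or is unknown to) the function.
    SELECT s.id, s.name
    FROM   statuses s
    WHERE  dbo.Enum_Post(s.name) IS NULL
        OR dbo.Enum_Post(s.name) <> s.id
END
GO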
Question is: do you need to include the lookup tables (domain tables 'round my neck of the woods) in your queries? Presumably, these sorts of tables are usually
pretty static in nature — the domain might get extended, but it probably won't get shortened.
their primary key values are pretty unlikely to change as well (e.g., the status_id for a status of 'open' is unlikely to suddenly get changed to something other than what it was created as).
If the above assumptions are correct, there's no real need to add all those extra tables to your joins just so your where clause can use a friendly name instead of an id value. Just filter on status_id directly where you need to. I'd suspect the non-key attribute ('name' in your example above) is more likely to change than the key attribute (the status_id): you're more protected by referencing the desired key value(s) of the domain table in your query.
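For the example in the question, that would simply be (the literal 1 here is a made-up id for the 'open' status):
-- Assuming 1 is the id of 'open', and is the same in every environment.
SELECT * FROM posts WHERE status_id = 1;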
Domain tables serve
to limit the domain of the variable via a foreign key relationship,
to allow the domain to be expanded by adding data to the domain table,
to populate UI controls and the like with user-friendly information,
Naturally, you'd need to suck domain tables into your queries where you actually require the non-key attributes from the domain table (e.g., the descriptive name of the value).
YMMV: a lot depends on context and the nature of the problem space.
The answer is "whatever makes sense".
Lookup tables involve joins or subqueries, which are not always efficient. I make use of enums a lot for this job; they're efficient and fast.
Where possible (and it is not always...), I use this rule of thumb: if I need to hard-code a value into my application (vs. let it remain a record in the database), and also store that value in my database, then something is amiss with my design. It's not ALWAYS true, but basically, whatever the value in question is, it either represents a piece of DATA, or a piece of PROGRAM LOGIC. It is a rare case that it is both.
NOT that you won't find yourself discovering which one it is halfway into the project. But as the others said above, there can be trade-offs either way. Just as we don't always achieve "perfect" normalization in a database design (for reasons of performance, or simply because you CAN take things too far in pursuit of academic perfection...), we may make some conscious choices about where we locate our "look-up" values.
Personally, though, I try to stand by my rule above. It is either DATA, or PROGRAM LOGIC, and rarely both. If it ends up as (or IN) a record in the database, I try to keep it out of the application code (except, of course, to retrieve it from the database...). If it is hard-coded in my application, I try to keep it out of my database.
In cases where I can't observe this rule, I DOCUMENT THE CODE with my reasoning, so three years later, some poor soul will be able to figure out how it broke, if that happens.
The commenters have convinced me of the error of my ways. This answer and the discussion that went along with it, however, remain here for reference.
I think a constant is appropriate here, and a database table is not. As you design your application, you expect that table of statuses to never, ever change, since your application has hard-coded into it what those statuses mean, anyway. The point of a database is that the data within it will change. There are cases where the lines are fuzzy (e.g. "this data might change every few months or so…"), but this is not one of the fuzzy cases.
Statuses are a part of your application's logic; use constants to define them within the application. It's not only more strictly organized that way, but it will also allow your database interactions to be significantly speedier.

Is this schema design good?

I inherited a system that stores default values for some fields in some tables in the database. These default values are used in the application to prepopulate control values. So, essentially, every field in every table in the database can potentially have a default value. The previous developer decided to store these values in a single table that had a key/value pair combo. The key represented by the source table + field name (as a varchar) and the default value as a varchar field as well. The Business layer would then cast the varchar field to the appropriate data type.
Somehow, I feel this is brittle. Though the application works as expected, there appears to be a flaw in the design.
Any suggestions on how this requirement could have been handled earlier? Is there anything that can be done now to make it more robust?
EDIT: I should have defined what the term "default" meant. This is NOT related to the default value of a field in the table. Instead, it's a default value that will be used by the application in the front end.
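For concreteness, a sketch of the kind of table being described (the table and column names here are illustrative, not from the actual system):
CREATE TABLE AppDefaults
(
    FieldKey     varchar(200) PRIMARY KEY,  -- e.g. 'Orders.Freight' (source table + field name)
    DefaultValue varchar(200) NULL          -- cast to the appropriate type in the business layer
);
INSERT INTO AppDefaults (FieldKey, DefaultValue) VALUES ('Orders.Freight', '0');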
That schema design is fine. I've seen it used in commercial apps and I've also used it in a few apps of my own where the users needed to be able to change the defaults or other parameters around fields in the application (limits, allowable characters etc.) or the application allowed the users to add new fields for use in the app.
Having it in a single table (not separate default tables for each table) protects it from schema changes in the tables it supports. Those schema changes become simple configuration changes in this model.
The single table makes it easy to encapsulate in a Class to serve as the "defaults" configuration object.
Some general advice:
When you inherit a working system and don't understand why something was designed the way it is - the problem is most likely your understanding, not the system. If it isn't broken, do not fix it.
Specific advice on the only improvements I would recommend (if they become necessary):
You can use the newer SQL_VARIANT data type for the value rather than a varchar - it can hold any of the regular data types - you will need to add support for casting them to the correct data type when using the value, though.
Refactoring the schema now would be risky and disruptive so I would not recommend it (unless you absolutely need to do that to fix some pressing issue, but from what you say it doesn't look like you do).
Were you doing the design from scratch, I'd recommend one defaults-table per real-table, with a single row recording the defaults with their real column names and types. Having several tiny tables scares some DBAs, but it's not really any substantial performance hit in my experience, and it sure does make the system sounder and more robust as you desire.
If you want to use SQL's own DEFAULT clauses as other answers recommend, be sure to name those explicitly, otherwise altering them when a default changes can be a doozy. Personally, I like to keep the default values separate from the schema's metadata, especially in a system where updating or tweaking a default value is a much more common and should-be-innocuous operation than the momentous undertaking of metadata/schema changes!
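A sketch of the difference an explicit name makes when a default needs to change later (the table and constraint names are illustrative):
-- With an explicitly named constraint, changing the default is two simple statements:
ALTER TABLE Orders DROP CONSTRAINT DF_Orders_Freight;
ALTER TABLE Orders ADD CONSTRAINT DF_Orders_Freight DEFAULT (10) FOR Freight;
-- With a system-generated name, you first have to look the name up before you can drop it.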
A better way to go would be using SQL Server's built-in DEFAULT constraint.
e.g.
CREATE TABLE Orders
(
    OrderID     int IDENTITY NOT NULL,
    OrderDate   datetime NULL CONSTRAINT DF_Orders_OrderDate DEFAULT(GETDATE()),
    Freight     money NULL CONSTRAINT DF_Orders_Freight DEFAULT(0) CHECK(Freight >= 0),
    ShipAddress nvarchar(60) NULL CONSTRAINT DF_Orders_ShipAddress DEFAULT('NO SHIPPING ADDRESS'),
    EnteredBy   nvarchar(60) NOT NULL CONSTRAINT DF_Orders_EnteredBy DEFAULT(SUSER_SNAME())
)
If the requirement was that the default selection of a given control be configurable and the "application works as expected" then I don't see a problem. You didn't elaborate on the "flaw" in the design.
If you want (and should!) use default values on the database, I would strongly urge to use the built-in DEFAULT constraint that's available on any field. Only that is really guaranteed to work properly - anything else is a hack solution at best.....
CREATE TABLE MyTable
(
    ID INT IDENTITY(1,1),
    NumericField INT CONSTRAINT DF_MyTable_Numeric DEFAULT(42),
    StringID VARCHAR(20) CONSTRAINT DF_MyTable_StringID DEFAULT 'rubbish',
    .......
)
and so on - you get the idea.
Just learn this mantra: DRY - DON'T REPEAT YOURSELF - don't go out re-inventing stuff that's already there and has been heavily tested and used - just use it.
Marc
I think the real answer here depends heavily on how often these default values change. If default values are set once when the database is designed, then DEFAULT constraints make sense. If some non-technical person needs to change them every couple of months, I really like the design presented.
Where it becomes brittle is when you have a mismatch between the column names or data types and the default values in the Defaults table. If you code a careful interface to manage the Defaults table values, this shouldn't be a problem.
If it's a case of UI defaults, the following questions come up.
How 'dynamic' or generic is your schema? Does the same schema support multiple front-ends - i.e. does the same column in the DB table support 2 front-ends, each with multiple defaults?
Do multiple apps use your DB? In that case, having the default defined in the DB could still help.
It's possible to query the data dictionary to get the default info for each column.
If a UI field does not have a corresponding DB column, then your current implementation is justified in such cases.
One downside is that more code is needed to handle and use this table.
If it was a one-off application and this default 'intelligence' was not leveraged across multiple apps, that's a consideration.
It's more of a 'frameworky' kind of thing to do - though I'd say it's quite non-standard, and would normally be done in the web layer.
If the table of default values is what irks you, here's some food for thought:
Rather than sticking to dogma about varchar(max), casting strings, or key/value tables, a good approach is to ask: what would a better solution be?
From your description, it seems like this table contains few rows, and has only two columns: key and value.
I should ask - is the data in this table controlled from an administrative UI? Perhaps this is the reason behind the original design decision to make it a table.
If type-safety is an issue, you could consider the existence of a "type" column and analyze how the code would need to be changed.
I wouldn't jump to conclusions about "good" or "bad" until you really analyze WHY the system is implemented this way.
The idea (not necessarily the implementation) makes sense if you want to keep the application defaults separate from the data, allowing different apps to have different defaults.
This is generally a good thing, because many databases inevitably spawn secondary applications (import jobs, if not anything else), where you do NOT want the same defaults (or any defaults at all); and in principle, a defaults table can support this.
What I think makes this implementation less than ideal is that while the defaults are MOSTLY data-driven, the calling application either needs its own set of defaults (in case the defaults are not specified in the table) or must terminate.
If the former is employed, this could introduce a number of headaches when you're trying to track down bugs, especially if you don't have good audit tables keeping track of which user/application inserted/updated which rows on which tables.
Disclaimer: I'm generally of the thought that columns ought to be NULLable and w/o defaults, except where it absolutely makes sense from a data point of view (id/primary key, custom timestamp, etc.). If a column should never be NULL introduce a constraint forbidding NULLs, not a concrete default.