'-999' used for all condition - sql

I have a sample of a stored procedure like this (from my previous working experience):
Select * from table where (id = @id or id = '-999')
Based on my understanding of this query, the '-999' is used to avoid an exception when no value is transferred from the user. So far in my research, I have not found this usage on the internet or in other companies' implementations.
@id is transferred from the user.
Any help will be appreciated in providing some links related to it.

I'd like to add my two guesses on this, although please note that, to my disadvantage, I'm one of the very youngest in the field, so this is not coming from much history or experience.
Also, please note that whatever reason anybody provides, you might not be able to confirm it 100%. Your oven might simply not hold any leftover evidence in and of itself (as in the old story of the roast trimmed to fit grandmother's pan).
Now, per another question I read before, extreme integers were used in some systems to denote missing values, since text and NULL weren't options in those systems. Say I'm looking for ID #84, and I cannot find it in the table:
Not Found Is Unlikely:
Perhaps in some systems it's far more likely that a record exists with a missing/incorrect ID than for it not to exist at all? Hence, when no match is found, the designers preferred that all records without valid IDs be returned?
This, however, has a few problems. First, depending on the design, the user might not recognize that the results are a set of records with missing IDs, especially if only one was returned. Second, the current query poses a problem, as it will always return the missing-ID records in addition to the normal matches. Perhaps they relied on ORDERing to ease readability?
Exception Above SQL:
AFAIK, SQL is fine with a zero-row result, but maybe whatever called (or used to call) it wasn't as robust, and something went wrong (hard exception, soft UI bug, etc.) when zero rows were returned? Perhaps, then, this ID represented a dummy row (e.g. blanks and zeroes) to keep things running.
Then again, this also suffers from the same arguments above regarding "the record is always output" and ORDER, with the added possibility that the SQL caller might have had dedicated logic for when the -999 record is the only record returned, which I doubt was the most practical approach even in whatever era this was built.
... the more I type, the more I think this is the oven, and only the great grandmother can explain this to us.

If you want to avoid an exception when no value is transferred from the user, declare the parameter with a NULL default in your stored procedure, like @id int = null.
For instance:
CREATE PROCEDURE [dbo].[TableCheck]
    @id int = null
AS
BEGIN
    Select * from [table] where (id = @id)
END
Now you can execute it either way:
exec [dbo].[TableCheck] 2 or exec [dbo].[TableCheck]
Remember, it's a separate matter if you want to return the whole table when your input parameter is NULL; a sketch of that variant follows.
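A common way to get that "whole table when the parameter is NULL" behaviour is the optional-parameter pattern. A minimal sketch, reusing the placeholder table name from the question (the procedure name is hypothetical):
CREATE PROCEDURE [dbo].[TableCheckOptional]
    @id int = null
AS
BEGIN
    -- When @id is omitted (NULL), the second condition matches every row;
    -- otherwise only rows with the given id are returned.
    Select * from [table] where (id = @id or @id is null)
END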
To answer your id = -999 condition: I tried it your way, and it doesn't prevent any exception.

Autoincrement varchar column optionally

I have a table with a GUID identifier and one field that is a 5-character string that can be specified by the user, but it is optional, and it should be unique per user. I'm looking for a way to have this field always populated, even if the user doesn't specify it. The easiest approach is to store it like "00001", "00002", etc. when the user doesn't specify it. I'm using SQL and Entity Framework Core. What is the best way to achieve this?
EDIT: Maybe a trigger that checks after insert whether that field was not specified, and then just takes the current row number and converts it to a string? Does this make sense?
Cheers
Setting a default value of '00001' can be done by defining the field with:
NOT NULL DEFAULT right('0000' || to_char(SomeSequence.nextval),5) (pseudo-code to be adapted to the DBMS you are connected to).
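For example, a rough SQL Server adaptation of that pseudo-code could look like this (all object names are hypothetical, and the unique constraint is simplified to a single column):
CREATE SEQUENCE dbo.SomeSequence AS int START WITH 1 INCREMENT BY 1;

CREATE TABLE dbo.UserCodes (
    Id   uniqueidentifier NOT NULL DEFAULT NEWID() PRIMARY KEY,
    Code varchar(5) NOT NULL
         -- '0000' + sequence value, keep the last 5 characters: '00001', '00002', ...
         DEFAULT RIGHT('0000' + CAST(NEXT VALUE FOR dbo.SomeSequence AS varchar(10)), 5),
    CONSTRAINT UQ_UserCodes_Code UNIQUE (Code)
);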
Compared to the solution in your EDIT, this will at least guarantee that 2 inserts at the same time from 2 different users get assigned different values.
The real problem comes with the unique constraint on the column. This does not work nicely when mixing manual input with calculated values.
If, as a user, I manually input 00005, then an insertion will fail when SomeSequence reaches 5.
I think this problem will exist regardless of how you implement the generation of values (sequence, trigger, external code, ...).
Even if you are fine with coding some additional (and probably complicated) logic to manage that, it will probably decrease concurrency.
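To make that collision concrete, using the hypothetical dbo.UserCodes sketch above:
INSERT INTO dbo.UserCodes (Code) VALUES ('00005');  -- value typed in manually by a user

-- ...later, once dbo.SomeSequence has advanced to 5:
INSERT INTO dbo.UserCodes DEFAULT VALUES;           -- generated Code is '00005' -> unique constraint violation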

Behavior of Selecting Non-Existent Columns [closed]

Given a table with this structure:
create table person (
    id int identity primary key,
    name varchar(256),
    birthday datetime
)
For this query:
select id,name,birthday,haircolor from person
What is the rationale behind SQL throwing an error in this situation? I was considering how flexible it would be if queries simply returned null for non-existent columns. What would be concrete reasons for or against such a SQL language design?
No, because then you'd assume there is a haircolour column, which could have the following implications, just as an example:
You'd think you could insert into the field since you would assume that the column exists,
The returned result would not be an accurate representation of the database schema.
Errors are given so that you have a clear understanding and indication of what you can and
cannot do. They're also there so you can catch exceptions, bugs, spiders, and of course creepy crawlies as soon as possible. :)
We don't want to accommodate lazy developers.
Do the right thing, be a man - Russel Peters.
This would be inconsistent for many reasons. One simple example is the use of * to select all columns: when you do this
select id, name, birthday, hair_color from person
and get back a table with all four columns present (hair_color set to null on all rows), you would reasonably expect that the query below
select * from person
returned at least four columns, and that hair_color is among the columns returned. However, this wouldn't be the case if SQL allowed non-existent columns to return nulls.
This would also create hard-to-find errors when a column gets renamed in the schema, but some of the queries or stored procedures do not get re-worked to match the new names.
Generally, though, SQL engine developers make tradeoffs between usability and "hard" error checking. Sometimes, they would relax a constraint or two, to accommodate some of the most common "simple" mistakes (such as forgetting to use an aggregate function in a grouped query; it is an error, but MySql lets you get away with it). When it comes to schema checks, though, SQL engines do not allow any complacency, treating all missing schema elements as errors.
MySQL has one horrible bug in it...
select field1, field2, field3
from table
group by field1
Any database engine would return an error: 'wtf do I do with field2 and field3 in the select line when they are not an aggregate nor in the group by statement?'.
MySQL on the other hand will return two random values for field2 and field3, and not return an error (which I refer to as 'doing the wrong thing and not returning an error'). This is a horrid bug; the number of scripts I've troubleshooted to discover that MySQL is not handling GROUP BYs correctly is absurd... give an error before giving unintentional results and this huge bug won't be such an issue.
Doesn't it seem to you that you are requesting more of this stupid behaviour to be propagated...just in the select clause instead of the group by clause?
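For what it's worth, MySQL can be told to reject that query by adding ONLY_FULL_GROUP_BY to the SQL mode (newer versions enable it by default); a sketch, with a hypothetical table name:
SET SESSION sql_mode = CONCAT(@@sql_mode, ',ONLY_FULL_GROUP_BY');

select field1, field2, field3
from my_table
group by field1;
-- now fails with an error instead of silently returning arbitrary field2/field3 values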
edit:
typo propagation as well.
select age,gender, haricolour from...
I'd prefer to get an error back saying 'great typo silly' instead of a misnamed field full of nulls.
This would result in a silent errors problem. The whole reason you have compile time errors and runtime exceptions is to catch bugs as soon as possible.
If you had a typo in a column name and it returned null instead of an error, your application will continue to work, but various things would misbehave as a result. You might have a column that saves a user setting indicating they don't want to receive emails. However your query has a typo and so it always returns null. Gradually a few people begin to report that they are setting the "Do not send emails" setting but are still getting emails. You have to hunt through all your code to figure out the cause. First you look at the form that edits this setting, then at the code that calls the database to save the setting, then you verify the data exists and the setting is getting persisted, then you look at the system that sends emails, and work your way up to the DB layer there that retrieves settings and painstakingly look through the SQL for that typo.
How much easier would that process be if it had just thrown an error in the first place? No users frustrated. No wasting time with support requests. No wasting time troubleshooting.
Returning null would be too generic. I like it this way much better. In your flow you can probably catch errors and return null if you want. But giving more info about why the query failed is better than receiving a null (and probably having you scratch your head over your JOIN and WHERE clauses).

Standard use of 'Z' instead of NULL to represent missing data?

Outside of the argument of whether or not NULLs should ever be used: I am responsible for an existing database that uses NULL to mean "missing or never entered" data. It is different from empty string, which means "a user set this value, and they selected 'empty'."
Another contractor on the project is firmly on the "NULLs do not exist for me; I never use NULL and nobody else should, either" side of the argument. However, what confuses me is that since the contractor's team DOES acknowledge the difference between "missing/never entered" and "intentionally empty or indicated by the user as unknown," they use a single character 'Z' throughout their code and stored procedures to represent "missing/never entered" with the same meaning as NULL throughout the rest of the database.
Although our shared customer has asked for this to be changed, and I have supported this request, the team cites this as "standard practice" among DBAs far more advanced than I; they are reluctant to change to use NULLs based on my ignorant request alone. So, can anyone help me overcome my ignorance? Is there any standard, or small group of individuals, or even a single loud voice among SQL experts which advocates the use of 'Z' in place of NULL?
Update
I have a response from the contractor to add. Here's what he said when the customer asked for the special values to be removed to allow NULL in columns with no data:
Basically, I designed the database to avoid NULLs whenever possible. Here is the rationale:
• A NULL in a string [VARCHAR] field is never necessary because an empty (zero-length) string furnishes exactly the same information.
• A NULL in an integer field (e.g., an ID value) can be handled by using a value that would never occur in the data (e.g., -1 for an integer IDENTITY field).
• A NULL in a date field can easily cause complications in date calculations. For example, in logic that computes date differences, such as the difference in days between a [RecoveryDate] and an [OnsetDate], the logic will blow up if one or both dates are NULL -- unless an explicit allowance is made for both dates being NULL. That's extra work and extra handling. If "default" or "placeholder" dates are used for [RecoveryDate] and [OnsetDate] (e.g., "1/1/1900"), mathematical calculations might show "unusual" values -- but date logic will not blow up.
NULL handling has traditionally been an area where developers make mistakes in stored procedures.
In my 15 years as a DBA, I've found it best to avoid NULLs wherever possible.
This seems to validate the mostly negative reaction to this question. Instead of applying an accepted 6NF approach to designing out NULLs, special values are used to "avoid NULLs wherever possible." I posted this question with an open mind, and I am glad I learned more about the "NULLs are useful / NULLs are evil" debate, but I am now quite comfortable labeling the 'special values' approach to be complete nonsense.
an empty (zero-length) string furnishes exactly the same information.
No, it doesn't; in the existing database we are modifying, NULL means "never entered" and empty string means "entered as empty".
NULL handling has traditionally been an area where developers make mistakes in stored procedures.
Yes, but those mistakes have been made thousands of times by thousands of developers, and the lessons and caveats for avoiding those mistakes are known and documented. As has been mentioned here: whether you accept or reject NULLs, representation of missing values is a solved problem. There is no need to invent a new solution just because developers continue to make easy-to-overcome (and easy-to-identify) mistakes.
As a footnote: I have been a DBE and developer for more than 20 years (which is certainly enough time for me to know the difference between a database engineer and a database administrator). Throughout my career I have always been in the "NULLs are useful" camp, though I was aware that several very smart people disagreed. I was extremely skeptical about the "special values" approach, but not well-versed enough in the academics of "How To Avoid NULL the Right Way" to make a firm stand. I always love learning new things—and I still have lots to learn after 20 years. Thanks to all who contributed to make this a useful discussion.
Sack your contractor.
Okay, seriously, this isn't standard practice. This can be seen simply because all RDBMS that I have ever worked with implement NULL, logic for NULL, take account of NULL in foreign keys, have different behaviour for NULL in COUNT, etc, etc.
I would actually contend that using 'Z' or any other placeholder is worse. You still require code to check for 'Z'. But you also need to document that 'Z' doesn't mean 'Z'; it means something else. And you have to ensure that such documentation is read. And then what happens if 'Z' ever becomes a valid piece of data? (Such as a field for an initial?)
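To make the COUNT point above concrete (hypothetical Person table with a MiddleInitial column): the engine already skips NULLs in column aggregates for free, whereas a 'Z' placeholder gets counted as real data.
-- COUNT(*) counts all rows; COUNT(MiddleInitial) silently skips NULLs.
SELECT COUNT(*)             AS all_people,
       COUNT(MiddleInitial) AS people_with_an_initial
FROM Person;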
At a basic level, even without debating the validity of NULL vs 'Z', I would insist that the contractor conforms to the standard practices that exist within your company, not his. Instituting his standard practice in an environment with an alternative standard practice will cause confusion, maintenance overheads, misunderstanding, and in the end increased costs and mistakes.
EDIT
There are cases where using an alternative to NULL is valid in my opinion. But only where doing so reduces code, rather than creating special cases which require accounting for.
I've used that for date bound data, for example. If data is valid between a start-date and an end-date, code can be simplified by not having NULL values. Instead a NULL start-date could be replaced with '01 Jan 1900' and a NULL end-date could be replaced with '31 Dec 2079'.
This still can change behaviour from what may be expected, and so should be used with care:
WHERE end-date IS NULL no longer gives rows that are still valid
You just created your own millennium bug
etc.
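A minimal sketch of that sentinel-date pattern, with hypothetical table and column names, trading the caveats above for a simpler predicate:
CREATE TABLE dbo.PriceHistory (
    ProductId int          NOT NULL,
    Price     decimal(9,2) NOT NULL,
    StartDate date         NOT NULL DEFAULT '19000101',  -- "since forever"
    EndDate   date         NOT NULL DEFAULT '20791231'   -- "until further notice"
);

-- Rows valid today, with no IS NULL / ISNULL branches in the range check:
SELECT ProductId, Price
FROM dbo.PriceHistory
WHERE CAST(GETDATE() AS date) BETWEEN StartDate AND EndDate;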
This is equivalent to reforming abstractions such that all properties can always have valid values. It is markedly different from implicitly encoding specific meaning into arbitrarily chosen values.
Still, sack the contractor.
This is easily one of the weirdest opinions I've ever heard. Using a magic value to represent "no data" rather than NULL means that every piece of code that you have will have to post-process the results to account/discard the "no-data"/"Z" values.
NULL is special because of the way that the database handles it in queries. For instance, take these two simple queries:
select * from mytable where name = 'bob';
select * from mytable where name != 'bob';
If name is ever NULL, it obviously won't show up in the first query's results. More importantly, neither will it show up in the second query's results. NULL doesn't match anything other than an explicit search for NULL, as in:
select * from mytable where name is NULL;
And what happens when the data could have Z as a valid value? Let's say you're storing someone's middle initial? Would Zachary Z Zonkas be lumped in with those people with no middle initial? Or would your contractor come up with yet another magic value to handle this?
Avoid magic values that require you to implement database features in code that the database is already fully capable of handling. This is a solved and well understood problem, and it may just be that your contractor never really grokked the notion of NULL and therefore avoids using it.
If the domain allows missing values, then using NULL to represent 'undefined' is perfectly OK (that's what it is there for). The only downside is that code that consumes the data has to be written to check for NULLs. This is the way I've always done it.
I have never heard of (or seen in practice) the use of 'Z' to represent missing data. As to "the contractor cites this as 'standard practice' among DBAs", can he provide some evidence of that assertion? As @Dems mentioned, you also need to document that 'Z' doesn't mean 'Z': what about a MiddleInitial column?
Like Aaron Alton and many others, I believe that NULL values are an integral part of database design, and should be used where appropriate.
Even if you somehow manage to explain the "Z"-instead-of-NULL convention to all your current and future developers and DBAs, and even if they code everything perfectly, you will still confuse the optimizer, because it will not know that you've cooked this up.
Using a special value to represent NULL (which is already a special value to represent NULL) will result in skews in the data. E.g., so many things happened on 1-Jan-1900 that it will throw off the optimizer's ability to understand the actual range of dates that is really relevant to your application.
This is like a manager deciding: "Wearing a tie is bad for productivity, so we're all going to wear masking tape around our necks. Problem solved."
I've never heard about the wide-spread use of 'Z' as a substitute for NULL.
(Incidentally, I'd not particularly like to work with a contractor who tells you to your face that they and other "advanced" DBAs are so much more knowledgeable and better than you.)
+=================================+
|         FavoriteLetters         |
+=================================+
|    Person    |  FavoriteLetter  |
+--------------+------------------+
| 'Anna'       | 'A'              |
| 'Bob'        | 'B'              |
| 'Claire'     | 'C'              |
| 'Zaphod'     | 'Z'              |
+---------------------------------+
How would your contractor interpret the data from the last row?
Probably he would choose a different "magic value" in this table to avoid collision with the real data 'Z'? Meaning you'd have to remember several magic values and also which one is used where... how is this better than having just one magic token NULL, and having to remember the three-valued logic rules (and pitfalls) that go with it? NULL at least is standardized, unlike your contractor's 'Z'.
I don't particularly like NULL either, but mindlessly substituting it with an actual value (or worse, with several actual values) everywhere is almost definitely worse than NULL.
Let me repeat my above comment here for better visibility: If you want to read something serious and well-grounded by people who are against NULL, I would recommend the short article "How to handle missing information without using NULLs" (links to a PDF from The Third Manifesto homepage).
Nothing in principle requires nulls for correct database design. In fact there are plenty of databases designed without using null and there are plenty of very good database designers and whole development teams who design databases without using nulls. In general it's a good thing to be cautious about adding nulls to a database because they inevitably lead to incorrect or ambiguous results later on.
I've not heard of using Z as a placeholder value instead of nulls being called "standard practice", but I expect your contractor is referring to the concept of sentinel values in general, which are sometimes used in database design. However, a much more common and flexible way to avoid nulls without using "dummy" data is simply to design them out. Decompose the table such that each type of fact is recorded in a table that doesn't have "extra", unspecified attributes.
In reply to the contractor's comments:
Empty string <> NULL
Empty string requires 2 bytes storage + an offset read
NULL uses null bitmap = quicker
IDENTITY doesn't always start at 1 (why waste half your range?)
The whole concept is flawed as per most other answers here
While I have never seen 'Z' as a magic value to represent null, I have seen 'X' used to represent a field that has not been filled in. That said, I have only ever seen this in one place, and my interface to it was not a database, but rather an XML file… so I would not be prepared to use this as an argument for it being common practice.
Note that we do have to handle the 'X' specially, and, as Dems mentioned, we do have to document it, and people have been confused by it. In our defence, this is forced on us by an external supplier, not something that we cooked up ourselves!

Move SELECT to SQL Server side

I have an SQLCLR trigger. It contains a large and messy SELECT inside, with parts like:
(CASE WHEN EXISTS(SELECT * FROM INSERTED I WHERE I.ID = R.ID)
THEN '1' ELSE '0' END) AS IsUpdated -- Is selected row just added?
as well as JOINs etc. I'd like to have the result as a single table with everything included.
Question 1. Can I move this SELECT to SQL Server side? If yes, how to do this?
Saying "move", I mean to create a stored procedure or something else that can be executed before reading dataset in while cycle.
The 2 following questions make sense only if answer is "yes".
Why do I want to move the SELECT? First off, I don't like mixing SQL with C# code. Second, I suppose that server-side queries run faster, since the server has more chances to cache them.
Question 2. Am I right? Is this some sort of optimization?
Also, the SELECT contains constant strings, but they are localizable. For instance,
WHERE R.Status = "Enabled"
"Enabled" should be changed for French, German etc. So, I want to write 2 static methods -- OnCreate and OnDestroy -- then mark them as stored procedures. When registering/unregistering my assembly on server side, just call them respectively. In OnCreate format the SELECT string, replacing {0}, {1}... with required values from the assembly resources. Then I can localize resources only, not every script.
Question 3. Is it good idea? Is there an existing attribute to mark methods to be executed by SQL Server automatically after (un)registartion an assembly?
Regards,
Well, the SQL-CLR trigger will also execute on the server, inside the server process - so that's server-side as well, no benefit there.
But I agree - triggers ought to be written in T-SQL whenever possible - no real big benefit in having triggers in C#.... Can you show us the whole trigger code?? Unless it contains really oddball stuff, it should be pretty easy to convert to T-SQL.
I don't see how you could "move" the SELECT to the SQL side and keep the rest of the code in C# - either your trigger is in T-SQL (my preference), or then it is in C#/SQL-CLR - I don't think there's any way to "mix and match".
To start with, you probably do not need that type of subquery inside of whatever query you are doing. The INSERTED table only has rows that have been updated (or inserted, but we can assume this is an UPDATE trigger based on the comment in your code). So you can either INNER JOIN, in which case you will only match the rows in the table aliased "R" that were updated, or you can LEFT JOIN, in which case the rows in R that show NULL for all INSERTED columns were not updated.
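A rough sketch of that reshaping (the base table name dbo.SomeTable and the alias R are assumptions based on your snippet):
-- Inside the trigger: derive the flag from a LEFT JOIN to INSERTED
-- instead of running a correlated EXISTS per row.
SELECT R.*,
       CASE WHEN I.ID IS NULL THEN '0' ELSE '1' END AS IsUpdated
FROM dbo.SomeTable AS R
LEFT JOIN INSERTED AS I
       ON I.ID = R.ID;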
Question 1) As marc_s said below, the trigger executes in the context of the database. But it goes beyond that. ALL database-related code, including SQLCLR, executes in the database. There is no client side here. This is the issue that most people have with SQLCLR: it runs inside of the SQL Server context. And regarding wanting to call a stored proc from the trigger: it can be done, BUT the INSERTED and DELETED tables only exist within the context of the trigger itself.
Question 2) It appears that this question should have started with the words "Also, the SELECT". There are two things to consider here. First, when testing for "Status" values (or any lookup values), since they are not displayed to the user, you should be using numeric values: a "status" of "Enabled" should be something like "1", so that language is not relevant. A side benefit is that not only does storing Status values as numbers take up a lot less space, they also compare much faster. Second, any text that is displayed to the user and needs to be sensitive to language differences should be in a table, so that you can pass in a LanguageId or LocaleId to get the appropriate French, German, etc. strings to display. You can set the LocaleId of the user or the system in general in another table.
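One possible shape for that lookup-plus-translation idea (all names here are hypothetical):
CREATE TABLE dbo.Status (
    StatusId tinyint NOT NULL PRIMARY KEY   -- e.g. 1 = Enabled, 2 = Disabled
);

CREATE TABLE dbo.StatusTranslation (
    StatusId tinyint      NOT NULL REFERENCES dbo.Status (StatusId),
    LocaleId varchar(10)  NOT NULL,   -- e.g. 'en-US', 'fr-FR', 'de-DE'
    Label    nvarchar(50) NOT NULL,   -- 'Enabled', 'Activé', 'Aktiviert'
    PRIMARY KEY (StatusId, LocaleId)
);

-- The trigger's query then compares numbers (language-independent):
-- WHERE R.StatusId = 1
-- and only the display layer joins to dbo.StatusTranslation for the user's locale.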
Question 3) If by "registration" you mean that the Assembly is either CREATED or DROPPED, then you can trap those events via DDL Triggers. You can look here for some basics:
http://msdn.microsoft.com/en-us/library/ms175941(v=SQL.90).aspx
But CREATE ASSEMBLY and DROP ASSEMBLY are events that are trappable.
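A bare-bones example of trapping those two events with a DDL trigger (the body here is only illustrative):
CREATE TRIGGER trgAssemblyDdlAudit
ON DATABASE
FOR CREATE_ASSEMBLY, DROP_ASSEMBLY
AS
BEGIN
    -- EVENTDATA() returns XML describing the DDL statement that fired the trigger
    PRINT CONVERT(nvarchar(max), EVENTDATA());
END;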
If you are speaking of when Assemblies are loaded and unloaded from memory, then I do not know of a way to trap that.
Question 1.
http://www.sqlteam.com/article/stored-procedures-returning-data
Question 3.
It looks like there are no appropriate attributes, at least in Microsoft.SqlServer.Server Namespace.

Oracle9i: Filter Expression Fails to Exclude Data at Runtime

I have a relatively simple select statement in a VB6 program that I have to maintain. (Suppress your natural tendency to shudder; I inherited the thing, I didn't write it.)
The statement is straightforward (reformatted for clarity):
select distinct
    b.ip_address
from
    code_table a,
    location b
where
    a.code_item = b.which_id and
    a.location_type_code = '15' and
    a.code_status = 'R'
The query returns a list of IP addresses from the database. The key column in question is code_status. Some time ago, we realized that one of the IP addresses was no longer valid, so we changed its status to I (invalid) to exclude it from appearing in the query's results.
When you execute the query above in SQL Plus, or in SQL Developer, everything is fine. But when you execute it from VB6, the check against code_status is ignored, and the invalid IP address appears in the result set.
My first guess was that the results were cached somewhere. But, not being an Oracle expert, I have no idea where to look.
This is ancient VB6 code. The SQL is embedded in the application. At the moment, I don't have time to rewrite it as a stored procedure. (I will some day, given the chance.) But, I need to know what would cause this disparity in behavior and how to eliminate it. If it's happening here, it's likely happening somewhere else.
If anyone can suggest a good place to look, I'd be very appreciative.
Some random ideas:
Are you sure you committed the changes that invalidate the ip-address? Can someone else (using another db connection / user) see the changed code_status?
Are you sure that the results are not modified after they are returned from the database?
Are you sure that you are using the "same" database connection in SQLPlus as in the code (database, user etc.)?
Are you sure that that is indeed the SQL sent to the database? (You may check by tracing on the Oracle server or by debugging the VB code). Reformatting may have changed "something".
Off the top of my head I can't think of any "caching" that might "re-insert" the unwanted IP. Hope something from the above gives you some ideas on where to look.
In addition to the suggestions that IronGoofy has made, have you tried swapping round the last two clauses?
where
    a.code_item = b.which_id and
    a.code_status = 'R' and
    a.location_type_code = '15'
If you get a different set of results then this might point to some sort of wrangling going on that results in dodgy SQL actually being sent to the database.
There are Oracle bugs that result in incorrect answers. This surely isn't one of those times. Usually they involve some bizarre combination of views and functions and dblinks and lunar phases...
It's not cached anywhere. Oracle doesn't cache query results until 11g, and even then it knows to invalidate the cache when the answer may change.
I would guess this is a data issue. You have a DISTINCT on the IP address in the query, why? If there's no unique constraint, there may be more than one copy of your IP address and you only fixed one of them.
And your Code_status is in a completely different table from your IP addresses. You set the status to "I" in the code table and you get the list of IPs from the Location table.
Stop thinking zebras and start thinking horses. This is almost certainly just data you do not fully understand.
Run this
select
    a.location_type_code,
    a.code_status
from
    code_table a,
    location b
where
    a.code_item = b.which_id and
    b.ip_address = <the one you think you fixed>
I bet you get one row with an 'I' and another row with an 'R'
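A follow-up check along the same lines (purely illustrative) is to look for duplicates hiding behind the DISTINCT:
select
    b.ip_address,
    count(*)
from
    code_table a,
    location b
where
    a.code_item = b.which_id and
    a.location_type_code = '15'
group by
    b.ip_address
having
    count(*) > 1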
I'd suggest you have a look at the V$SQL system view to confirm that the query you believe the VB6 code is running is actually the query it is running.
Something along the lines of
select sql_text, fetches
from v$sql
where sql_text like '%ip_address%'
Verify that the SQL_TEXT is the one you expect and that the FETCHES count goes up as you execute the code.