SQL- limit number of words in a varchar field - sql

I'm creating a database, and I would like to limit one of the table's fields to contain no more than 50 words, but I'm not sure how to create this constraint.

You can add a CHECK constraint to your column. The question is, how do you define 'a word'?
In a rather simplistic approach we could assume that words are 'split' by spaces. In MSSQL you'd then have to add a check like this:
ALTER TABLE [myTable] ADD CONSTRAINT [chk_max_words] CHECK (Len(Replace([myField], N' ', N'')) > (Len([myField]) - 3))
When you try to insert or update a record and set [myField] to 'test', it will pass, but if you set it to 'test test test test' it will fail because the number of spaces is 3 and our check will not let that pass.
Of course this approach is far from perfect. It does not consider double spaces, trailing spaces, etc.
From a practical point of view you probably want to write a function that counts the number of words according to your (elaborate) rules and then use that in the check.
ALTER TABLE [myTable] ADD CONSTRAINT [chk_max_words] CHECK (dbo.fn_number_of_words([myField]) <= 3)
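To see why the space-counting check is fragile, here is a minimal Python sketch (not T-SQL) that compares it with a whitespace-splitting word count. The names `passes_space_check`, `count_words` and `passes_word_check` are hypothetical, and treating any run of whitespace as a separator is only one possible definition of a word.

```python
def passes_space_check(text: str, max_words: int = 3) -> bool:
    """Mirror of the first CHECK: allow fewer than max_words spaces.

    Note: Python's len() counts trailing spaces, whereas T-SQL LEN()
    ignores them, so this is an approximation of the SQL behaviour.
    """
    spaces = len(text) - len(text.replace(" ", ""))
    return spaces < max_words


def count_words(text: str) -> int:
    """Count words as runs of non-whitespace characters (an assumption)."""
    return len(text.split())


def passes_word_check(text: str, max_words: int = 3) -> bool:
    """What a word-counting function used in the CHECK would decide."""
    return count_words(text) <= max_words


if __name__ == "__main__":
    for sample in ["test", "test test test test", "a  b  c"]:
        print(repr(sample), passes_space_check(sample), passes_word_check(sample))
```

The last sample has three words separated by double spaces: the space-counting rule rejects it, while the word-counting rule accepts it.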

Related

Include wildcards in sql server in the values themselves - not when searching with LIKE

Is there a way to include wildcards in sql server in the values themselves - not when searching with LIKE?
I have a database that users search on. They search for model numbers that contain different wildcard characters but do not know that these wildcard characters exist.
For example, a model number may be 123*abc in the database, but the user will search for 1234abc because that's what they see for their model number on their unit at home.
I'm looking for a way to allow users to search without knowledge of wildcards but have a systematic way to include model numbers with wildcard characters in the database.
What you could do is add a PERSISTED computed column to your table with a valid pattern expression for SQL Server. You stated that * should be any letter or numerical character, and comma-delimited values in parentheses can be any one of those characters. Provided that neither commas nor parentheses appear in your main data, this should work:
USE Sandbox;
GO
CREATE TABLE SomeTable (SomeString varchar(15));
GO
INSERT INTO SomeTable
VALUES('123abc'),
('abc*987'),
('def(q,p,r,1)555');
GO
ALTER TABLE SomeTable ADD SomeString_Exp AS REPLACE(REPLACE(REPLACE(REPLACE(SomeString,'*','[0-9A-Za-z]'),'(','['),')',']'),',','') PERSISTED; --What you're interested in
SELECT *
FROM SomeTable;
GO
DECLARE @String varchar(15) = 'defp555';
SELECT *
FROM SomeTable
WHERE @String LIKE SomeString_Exp; --And how to search
GO
DROP TABLE SomeTable;
If * is any character, and not just any alphanumeric, then you could shorten the whole thing to this (provided you're on SQL Server 2017 or later):
ALTER TABLE SomeTable ADD SomeString_Exp AS REPLACE(TRANSLATE(SomeString,'*()','_[]'),',','') PERSISTED;
I'm thinking either:
where @model_number like replace(model_number, '*', '%')
or
where @model_number like replace(model_number, '*', '_')
Depending on whether '*' means any string (first example) or exactly one character (second example).
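The pattern-translation idea in these answers can be tried outside the database. The following is a rough Python sketch, assuming the same conventions as the first answer: `*` stands for exactly one letter or digit, and a parenthesised, comma-separated group stands for exactly one of the listed characters. The function names are hypothetical, and the sketch uses regular expressions rather than SQL Server LIKE patterns.

```python
import re


def stored_value_to_regex(stored: str) -> str:
    """Translate a stored model number into a regular expression.

    Assumes every '(' has a matching ')'; str.index raises ValueError
    otherwise.
    """
    parts = []
    i = 0
    while i < len(stored):
        ch = stored[i]
        if ch == "*":
            # '*' means one alphanumeric character
            parts.append("[0-9A-Za-z]")
        elif ch == "(":
            # '(q,p,r,1)' means one character out of q, p, r, 1
            end = stored.index(")", i)
            choices = stored[i + 1:end].split(",")
            parts.append("[" + "".join(re.escape(c) for c in choices) + "]")
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return "".join(parts)


def matches(user_input: str, stored: str) -> bool:
    """True if what the user typed fits the stored pattern exactly."""
    return re.fullmatch(stored_value_to_regex(stored), user_input) is not None


if __name__ == "__main__":
    print(matches("defp555", "def(q,p,r,1)555"))
    print(matches("abc4987", "abc*987"))
```

As in the SQL version, the user's input is the value being tested and the stored column supplies the pattern.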

SQL Server stored procedure to search list of values without special characters

What is the most efficient way to search a column and return all matching values while ignoring special characters?
For example if a table has a part_number column with the following values '10-01' '14-02-65' '345-23423' and the user searches for '10_01' and 140265 it should return '10-01' and '14-02-65'.
Processing the input with a regex to remove those characters is possible, so the stored procedure could be passed a parameter '1001 140265' and then split that input to form a SQL statement like
SELECT *
FROM MyTable
WHERE part_number IN ('1001', '140265')
The problem here is that this will not match anything. In this case the following would work
SELECT *
FROM MyTable
WHERE REPLACE(part_number,'-','') IN ('1001', '140265')
But I need to remove all special characters, or at the very least all of these characters: ~!@#$%^&*()_+?/\{}[]; with a replace for each of those characters, the query takes several minutes when the number of parts in the IN clause is less than 200.
Performance is improved by creating a function that does the replaces, so the query takes less than a minute. But without removals the query takes around 1 second. Is there any way to create some kind of functional index that will work on multiple SQL Server engines?
You could use a computed column and index it:
CREATE TABLE MyTable (
part_number VARCHAR(10) NOT NULL,
part_number_int AS CAST(replace(part_number, '-', '') AS int)
);
ALTER TABLE dbo.MyTable ADD PRIMARY KEY (part_number);
ALTER TABLE dbo.MyTable ADD UNIQUE (part_number_int);
INSERT INTO dbo.MyTable (part_number)
VALUES ('100-1'), ('140265');
SELECT *
FROM dbo.MyTable AS MT
WHERE MT.part_number_int IN ('1001', '140265');
Of course your replace statement will be more complex, and you'll have to sanitize user input the same way you sanitize column values. But this is going to be the most efficient way to do it.
This query can now seek on the indexed computed column efficiently.
But to be honest, I'd just create a separate column to store cleansed values for querying purposes and keep the actual values for display. You'll have to take care of it in your update/insert statements, but that's minimal overhead.
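The separate-column approach can be sketched end to end in Python with an in-memory SQLite database standing in for SQL Server. The column name `part_number_clean`, the index name and the helper functions are hypothetical, and stripping everything except ASCII letters and digits is an assumption about what counts as a special character.

```python
import re
import sqlite3


def cleanse(part_number: str) -> str:
    """Drop everything except ASCII letters and digits."""
    return re.sub(r"[^0-9A-Za-z]", "", part_number)


def build_db(part_numbers):
    """Create a table holding both the display value and the cleansed value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable ("
        " part_number TEXT PRIMARY KEY,"
        " part_number_clean TEXT NOT NULL)"
    )
    # The index on the cleansed column is what keeps the lookup cheap.
    conn.execute("CREATE INDEX ix_clean ON MyTable (part_number_clean)")
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?)",
        [(p, cleanse(p)) for p in part_numbers],
    )
    return conn


def search(conn, user_terms):
    """Cleanse the user's terms the same way, then match on the clean column."""
    cleaned = [cleanse(t) for t in user_terms]
    placeholders = ", ".join("?" for _ in cleaned)
    rows = conn.execute(
        "SELECT part_number FROM MyTable"
        f" WHERE part_number_clean IN ({placeholders})"
        " ORDER BY part_number",
        cleaned,
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_db(["10-01", "14-02-65", "345-23423"])
    print(search(db, ["10_01", "140265"]))
```

The key point is that the cleansing happens once per row at write time and once per search term at query time, never once per row per query.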

Alter column from varchar to decimal when nulls exist

How do I alter a sql varchar column to a decimal column when there are nulls in the data?
I thought:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6)
But I just get an error, I assume because of the nulls:
Error converting data type varchar to numeric. The statement has been terminated.
So I thought to remove the nulls I could just set them to zero:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6) NOT NULL DEFAULT 0
but I don't seem to have the correct syntax.
What's the best way to convert this column?
Edit:
People have suggested it's not the nulls that are causing the problem, but non-numeric data. Is there an easy way to find the non-numeric data and either disregard it or highlight it so I can correct it?
If it were just the presence of NULLs, I would just opt for doing this before the alter column:
update table1 set data = '0' where data is null
That would ensure all nulls are gone and you could successfully convert.
However, I wouldn't be too certain of your assumption. It seems to me that your new column is perfectly capable of handling NULL values since you haven't specified not null for it.
What I'd be looking for is values that aren't NULL but also aren't something you could turn in to a real numeric value, such as what you get if you do:
insert into table1 (data) values ('paxdiablo is good-looking')
though some may argue that should be treated as 0, a false-y value :-)
The presence of non-NULL, non-numeric data seems far more likely to be causing your specific issue here.
As to how to solve that, you're going to need a where clause that can recognise whether a varchar column is a valid numeric value and, if not, change it to '0' or NULL, depending on your needs.
I'm not sure if SQL Server has regex support but, if so, that'd be the first avenue I'd investigate.
Alternatively, provided you understand the limitations (a), you could use isnumeric() with something like:
update table1 set data = NULL where isnumeric(data) = 0
This will force all non-numeric values to NULL before you try to convert the column type.
And, please, for the love of whatever deities you believe in, back up your data before attempting any of these operations.
If none of those above solutions work, it may be worth adding a brand new column and populating bit by bit. In other words set it to NULL to start with, and then find a series of updates that will copy data to this new column.
Once you're happy that all data has been copied, you should then have a series of updates you can run in a single transaction if you want to do the conversion in one fell swoop. Drop the new column and then do the whole lot in a single operation:
create new column;
perform all updates to copy data;
drop old column;
rename new column to old name.
(a) From the linked page:
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).
Possible solution:
CREATE TABLE test
(
data VARCHAR(100)
)
GO
INSERT INTO test VALUES ('19.01');
INSERT INTO test VALUES ('23.41');
ALTER TABLE test ADD data_new decimal(19,6)
GO
UPDATE test SET data_new = CAST(data AS decimal(19,6));
ALTER TABLE test DROP COLUMN data
GO
EXEC sp_RENAME 'test.data_new' , 'data', 'COLUMN'
As people have said, that error doesn't come from nulls; it comes from varchar values that can't be converted to decimal. The most typical reason I've found (after checking that the column doesn't contain any logically invalid values, like non-digit characters or values with two commas) is that the varchar values use a comma as the decimal separator, as opposed to a period.
For instance, if you run the following:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
SELECT @T, CAST(@T AS DEC(32,2))
You will get an error.
Instead:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
-- Let's change the comma to a period
SELECT @T = REPLACE(@T,',','.')
SELECT @T, CAST(@T AS DEC(32,2)) -- Now it works!
Should be easy enough to look if your column has these cases, and run the appropriate update before your ALTER COLUMN, if this is the cause.
You could also use a similar idea and run a regex search on the column for all values that don't match the digit / digit+'.'+digit pattern, but I'm not good with regex, so someone else can help with that. :)
Also, the American system uses a comma as a thousands separator, so the number '123100.5' would appear as '123,100.5'. In those cases you might want to just replace the commas with empty strings and then try the conversion.
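The regex idea mentioned above could look roughly like the following Python sketch, which flags values that are not plain decimals and then guesses a fix for comma-formatted input. The function names are hypothetical. The heuristic in `normalize` is an assumption: a value containing both a comma and a period is taken to use commas as thousands separators, and a value with a single comma and no period is taken to use a decimal comma, which is ambiguous for inputs like '123,100'.

```python
import re

# Optional sign, digits, optional fractional part: digit or digit+'.'+digit
PLAIN_DECIMAL = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def is_plain_decimal(value) -> bool:
    """True for strings such as '19.01' or '-3'; False for None and the rest."""
    return value is not None and PLAIN_DECIMAL.fullmatch(value.strip()) is not None


def find_suspect_values(values):
    """Return the non-NULL values that would not convert cleanly."""
    return [v for v in values if v is not None and not is_plain_decimal(v)]


def normalize(value: str):
    """Try to repair comma-formatted input; return None if still invalid."""
    text = value.strip()
    if "," in text and "." in text:
        text = text.replace(",", "")   # assumed thousands separators
    elif text.count(",") == 1:
        text = text.replace(",", ".")  # assumed decimal comma
    return text if is_plain_decimal(text) else None


if __name__ == "__main__":
    column = ["19.01", None, "5,6", "abc", "-3", "$"]
    for bad in find_suspect_values(column):
        print(repr(bad), "->", normalize(bad))
```

Unlike ISNUMERIC, this pattern rejects a lone currency symbol or sign, and NULLs are left alone because they convert without trouble.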

Duplicate value in a postgresql table

I'm trying to modify a table in my PostgreSQL database, but it says there is a duplicate! What is the best way to find a duplicate value in a table? Some kind of SELECT query?
Try this:
SELECT count(column_name), column_name
from table_name
group by column_name having count(column_name) > 1;
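To try the GROUP BY / HAVING query above without a PostgreSQL server, here is a small Python sketch that runs the same statement against an in-memory SQLite database. SQLite is only a stand-in here, the helper name `find_duplicates` is hypothetical, and `table_name` / `column_name` are the placeholders from the answer.

```python
import sqlite3


def find_duplicates(values):
    """Return (value, occurrences) for every value stored more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (column_name TEXT)")
    conn.executemany(
        "INSERT INTO table_name VALUES (?)", [(v,) for v in values]
    )
    rows = conn.execute(
        "SELECT column_name, count(column_name) FROM table_name"
        " GROUP BY column_name HAVING count(column_name) > 1"
        " ORDER BY column_name"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(find_duplicates(["a", "b", "a", "c", "b", "a"]))
```

Because count(column_name) skips NULLs, rows with a NULL in that column are never reported as duplicates by this query.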
If you try to change a value in a column that is part of the PRIMARY KEY or has a UNIQUE constraint and get this error there, then you should be able to find the conflicting row by
SELECT *
FROM your_table
WHERE conflicting_column = conflicting_value;
If conflicting_value is a character type, put it in single quotes (').
EDIT: To find out which columns are affected by the constraint, check this post.
First of all, determine which fields in your table have to be unique. This may be something marked as a Primary Key, a unique index based on one or more fields or a check constraint, again based on one or more fields.
Once you've done that, look at what you're trying to insert and work out whether it busts any of the unique rules.
And yes, SELECT statements will help you determine what's wrong here. Use those to determine whether you are able to commit the row.

CHECK CONSTRAINT of string to contain only digits. (Oracle SQL)

I have a column, say PROD_NUM that contains a 'number' that is left padded with zeros. For example 001004569. They are all nine characters long.
I do not use a numeric type because the normal operation on numbers do not make sense on these "numbers" (For example PROD_NUM * 2 does not make any sense.) And since they are all the same length, the column is defined as a CHAR(9)
CREATE TABLE PRODUCT (
PROD_NUM CHAR(9) NOT NULL
-- ETC.
)
I would like to constrain PROD_NUM so it can only contain nine digits. No spaces, no other characters besides '0' through '9'
REGEXP_LIKE(PROD_NUM, '^[[:digit:]]{9}$')
You already received some nice answers on how to continue on your current path. Please allow me to suggest a different path: use a number(9,0) datatype instead.
Reasons:
You don't need an additional check constraint to confirm it contains a real number.
You are not fooling the optimizer. For example, how many prod_num's are "BETWEEN '000000009' and '000000010'"? Lots of character strings fit in there. Whereas "prod_num between 9 and 10" obviously selects only two numbers. Cardinalities will be better, leading to better execution plans.
You are not fooling future colleagues who have to maintain your code. Naming it "prod_num" will have them automatically assume it contains a number.
Your application can use lpad(to_char(prod_num),9,'0'), preferably exposed in a view.
Regards,
Rob.
(update by MH) The comment thread has a discussion which nicely illustrates the various things to consider about this approach. If this topic is interesting you should read them.
Works in all versions:
TRANSLATE(PROD_NUM,'123456789','000000000') = '000000000'
I think Codebender's regexp will work fine but I suspect it is a bit slow.
You can do (untested)
replace(translate(prod_num,'0123456789','NNNNNNNNNN'),'N',null) is null
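The three checks suggested so far can be compared with a short Python sketch that imitates them using `re` and `str.translate`. This is an imitation rather than Oracle code, and the function names are hypothetical. It relies on Oracle treating an empty string as NULL, so the "is null" test is modelled as a comparison with the empty string.

```python
import re


def regex_check(prod_num: str) -> bool:
    """Imitates REGEXP_LIKE(PROD_NUM, '^[[:digit:]]{9}$')."""
    return re.fullmatch(r"[0-9]{9}", prod_num) is not None


def translate_to_zero_check(prod_num: str) -> bool:
    """Imitates TRANSLATE(PROD_NUM,'123456789','000000000') = '000000000'."""
    table = str.maketrans("123456789", "000000000")
    return prod_num.translate(table) == "000000000"


def translate_and_strip_check(prod_num: str) -> bool:
    """Imitates the untested replace(translate(...),'N',null) is null test."""
    table = str.maketrans("0123456789", "NNNNNNNNNN")
    return prod_num.translate(table).replace("N", "") == ""


if __name__ == "__main__":
    for sample in ["001004569", "00100 569", "NNNNNNNNN"]:
        print(
            repr(sample),
            regex_check(sample),
            translate_to_zero_check(sample),
            translate_and_strip_check(sample),
        )
```

One thing the sketch brings out: a value made up of the letter N passes the third check, because the placeholder character is indistinguishable from a translated digit. Picking a placeholder that cannot occur in the data would avoid that.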
Cast it to integer, cast it back to varchar, and check that it equals the original string?
In MSSQL, I might use something like this as the constraint test:
PROD_NUM NOT LIKE '%[^0-9]%'
I'm not an Oracle person, but I don't think they support bracketed character lists.
in MS SQL server I use this command:
alter table mytable add constraint [cc_mytable_myfield] check (cast(myfield as bigint) > 0)
Not sure about performance but if you know the range, the following will work.
Uses a CHECK constraint at the time of creating the DDL.
alter table test add jz2 varchar2(4)
check ( jz2 between 1 and 2000000 );
as will
alter table test add jz2 varchar2(4)
check ( jz2 in (1,2,3) );
this will also work
alter table test add jz2 varchar2(4)
check ( jz2 > 0 );