Finding Non-Numeric Characters in a string

Finding Non-Numeric Characters in a string - sql

I found a couple threads on this topic but the answers seemed a little too complex for what I'm trying to do.
I am using SQL Developer and have a table, lets call it HR_Employees. This table I have to use to populate a fact table, we will call it Fact_Key. This fact table has 3 columns, including a description as well as the Employee_Key column (which is the primary key).
The task I have is to figure out a way to create the employee_key off of a Department_ID field. For half of the company this is automatically generated based on some payroll information and pushed into the Fact_Key table however for the other half (which is merging into the first half) they do not use the same payroll system so this wouldn't work for us. I have to take the Department_ID and use that to create an employee_key by concatenating it with other fields (employee_number for instance). See below:
INSERT INTO FACT_KEY (Employee_Key, Description)
SELECT RPAD(Employee_Number||Deptartment_ID, 15,'0'), EmployeeJob_Code
FROM HR_EMPLOYEES
I am padding this to keep all keys at a consistent 15 character length. The problem I am experiencing is the Employee_Key field is a numeric data type but the Department_ID has non-numeric characters in it at random places. I need a way to remove those characters or preferably change them into their ASCII counterparts. Is there a simple way to do this?
Everything I've seen appears too complex and I was looking for a simpler way to do this (involving just one or two functions) as this is going into a stored procedure and I was directed to keep it simple. Any help would be appreciated, Thanks!

Just use regexp_replace, example:
select REGEXP_REPLACE('123A45B67CCC89', '[^[:digit:]]','') from dual
returns: 123456789

Related

DB2 creating a random but unique character identifier upon row insert

I currently have a column in a DB2 table which is being passed through web calls and procedure by a character-encrypted value. It is type CHARACTER(13) with a CSSID for encryption.
This has become a huge pain to accommodate through multiple APIs but was initially intended to allow us a unique ID to use in calls that wasn't the primary key.
In DB2-400, what would be the next best thing as far as a 13 or more character string that is unique and randomly created upon insert, but doesn't require decryption (just a plain string)?
Is there a commonly-gravitated-to method for this? We aren't passing secure data, so there's no need for encryption, but we just want a randomly created and unique character

Try hex(generate_unique()). It's unique CHAR(26) string.
Or to_char(timestamp(generate_unique()), 'YYYYMMDDHH24MISSFF6'). You may play with format of the to_char function as well. May be useful to use, let's say, reverse format like FF6SSMIHH24DDMMYYYY to avoid unique index page contention upon heavy insert activity.

This is a comment that doesn't fit in the comments section.
I don't have access to a DB2-400 (anymore), but I tested the code below in DB2 10.5 for Linux.
create sequence seq1;
select concat('A', varchar_format(next value for seq1, '000000000000')) as my_id
from sysibm.sysdummy1;
Result, if you run it 4 times in a row:
A0000000000001
A0000000000002
A0000000000003
A0000000000004
Maybe there's something equivalent in DB2-400.

Sounds like you might be using GENERATE_UNIQUE()
GENERATE_UNIQUE function returns a bit data character string 13 bytes
long (CHAR(13) FOR BIT DATA)
Doesn't really have anything to do with encryption...
And pretty much the ideal solution in my opinion generating a unique value other than a simple numeric identity. So what the problem you are having?

How to add brackets in heading of a table column in database?

I want to create a database containing multiple tables using postgres 11 and i'm currently creating a table which contain brackets in the heading of the column (as shown as follows).
Table - supp_details
supp_id|supp_name | supp_weight(Kg)|
Can i add units to the heading and what is the proper way to do so with sql?
I'm a fresher to query writing, so please help me with this.

You could place the column name in quotes, e.g. use "supp_weight(Kg)", but it is best to avoid placing special characters or keywords as object names. Instead, I suggest using the following name:
supp_weight_kg
It is just a single string requiring no escaping, and makes it clear what the units are. A better option might be to just use supp_weight, and maybe just keep a note somewhere that the column uses kilograms as the unit by default.

You will need to use quoted identifiers but I strongly recommend not to do that:
create table supp_details
(
supp_id integer,
supp_name text,
"supp_weight(kg)" integer
);

Adding bracket symbols into your column names is possible but probably a bad idea. If you want to micromanage the name just for display purposes of the end result, you should probably do it in an alias, using the AS keyword.
SELECT supp_id, supp_name, supp_weight AS "supp_weight(Kg)" FROM ...
Or add the decorations on the client side before it displays the results.

Replace single quote to regular abbreviation

I have a table with a state column. Inside the state column I have a value like TX` I want to replace that ending character to make the State read TX. How would I do that please give examples

You already have answers for replacing the quote, but I wanted to provide methods for avoiding this problem in the first place.
As noted in #SeanLange's answer, you can use define your State field as a CHAR(2) , so you know that you'll never have a dummy character following a valid state code. You could also handle this in your client code, sanitizing the input before even sending to the database.
One could argue that it would even be a good idea to define a lookup table with a foreign key constraint, so users could only input valid values. You could also use this lookup table client-side (e.g. to provide a list of states).
Of course, you also have to consider internationalization: What about when/if you need to store locations outside of the United States, that may have > 2 characters?

You can escape a single quote by doubling it and including it in quotes. So:
select replace(state, '''', ''')
Of course, if the problem is just a bad third character, then LEFT(state, 2) might do the trick as well.

As a Sean Lang's comment said, you can do this in many different ways. For simplicity, you can just use LEFT(string, #) function for the whole typecasting as long as your raw values are all in the TX` format (**two letter abbrev and one ` , so three characters total for every value in that field.
If that is the case then just do:
SELECT CAST(LEFT(t.state_column, 2) As Varchar(2)) As state,
t.column_2,
t.column_3
/* and so on for all the columns you want */
FROM table t;
--
Further Reference:
- https://w3resource.com/PostgreSQL/left-function.php

Storing multiple values in one column vs. Multiple columns for multiple values

With regards to Performance & 'The proper way of doing things' and trying to figure out which way is better to store configuration data in a SQL database. Assume you have website configuration data for setting a minimum and maximum age that a person must be to access the site.
CREATE TABLE SiteConfig
(
featureName varchar(100),
value varchar(100),
)
Which is better:
To store it all in a single row and process it in PHP with explode();.
featureName: "ageRequirement"
value: "13|60"
To store it in separate rows for each feature and just SELECT the feature you need when needed.
featureName: "minAge"
value: "13"
featureName: "maxAge"
value: "60"
Due to the amount of features of the website, this is a difference of having 60 ROWS of data vs. about 25.

Store it has two separate features.
You may want to access the feature in the database, rather than in the application code.
There is no reason to parse a comma delimited string to get two features.
Admittedly, in the case of just two numbers separated by a comma, not much can go wrong. This is, however, a slippery slope and you might soon find yourself trying to stuff multiple fields into a single string, and then devoting a lot of resources to parsing the string.

If those are values that you never will use by themselves in the database, either would work.
If you suspect that you would ever want to use the value in the database, you would not want them in the same value. That would make querying the data complicated and inefficient.

Theory of storing a number and text in same SQL field

I have a three tables
Results:
TestID
TestCode
Value
Tests:
TestID
TestType
SysCodeID
SystemCodes
SysCodeID
ParentSysCodeID
Description
The question I have is for when the user is entering data into the results table.
The formatting code when the row gets the focus changes the value field to a dropdown combobox if the testCode is of type SystemList. The drop down has a list of all the system codes that have a parentsyscodeID of the test.SysCodeID. When the user chooses a value in the list it translates into a number which goes into the value field.
The datatype of the Results.Value field is integer. I made it an integer instead of a string because when reporting it is easier to do calculations and sorting if it is a number. There are issues if you are putting integer/decimal value into a string field. As well, when the system was being designed they only wanted numbers in there.
The users now want to put strings into the value field as well as numbers/values from a list and I'm wondering what the best way of doing that would be.
Would it be bad practice to convert the field over to a string and then store both strings and integers in the same field? There are different issues related to this one but i'm not sure if any are a really big deal.
Should I add another column into the table of string datatype and if the test is a string type then put the data the user enters into the different field.
Another option would be to create a 1-1 relationship to another table and if the user types in a string into the value field it adds it into the new table with a key of a number.
Anyone have any interesting ideas?

What about treating Results.Value as if it were a numeric ValueCode that becomes an foreign key referencing another table that contains a ValueCode and a string that matches it.
CREATE TABLE ValueCodes
(
Value INTEGER NOT NULL PRIMARY KEY,
Meaning VARCHAR(32) NOT NULL UNIQUE
);
CREATE TABLE Results
(
TestID ...,
TestCode ...,
Value INTEGER NOT NULL FOREIGN KEY REFERENCES ValueCodes
);
You continue storing integers as now, but they are references to a limited set of values in the ValueCodes table. Most of the existing values appear as an integer such as 100 with a string representing the same value "100". New codes can be added as needed.

Are you saying that they want to do free-form text entry? If that's the case, they will ruin the ability to do meaningful reporting on the field, because I can guarantee that they will not consistently enter the strings.
If they are going to be entering one of several preset strings (for example, grades of A, B, C, etc.) then make a lookup table for those strings which maps to numeric values for sorting, evaluating, averaging, etc.
If they really want to be able to start entering in free-form text and you can't dissuade them from it, add another column along the lines of other_entry. Have a predefined value that means "other" to put in your value column. That way, when you're doing reporting you can either roll up all of those random "other" values or you can simply ignore them. Make sure that you add the "other" into your SystemCodes table so that you can keep a foreign key between that and the Results table. If you don't already have one, then you should definitely consider adding one.
Good luck!

The users now want to put strings into
the value field as well as
numbers/values from a list and I'm
wondering what the best way of doing
that would be.
It sounds like the users want to add new 'testCodes'. If that is the case why not just add them to your existing testcode table and keep your existing format.
Would it be bad practice to convert
the field over to a string and then
store both strings and integers in
the same field? There are different
issues related to this one but i'm not
sure if any are a really big deal.
No it's not a big deal. Often PO numbers or Invoice numbers have numbers or a combination of letters and numbers. You are right however about the performance of the database on a number field as opposed to a string, but if you index the string field you end up with the database doing it's scans on numeric indexes anyway.
The problems you may have had with your decimals as strings probably have to do with the floating point data types in which the server essentially estimates the value of the field and only retains accuracy to a certain number of digits. This can lead to a whole host of rounding errors if you are concerned about the digits. You can avoid that issue by using currency fields or the like that have static accuracy of the decimals. lol I learned this the hard way.
Tom H. did a great job addressing everything else.

I think the easiest way to do it would be to convert Results.Value to a "string" (char, varchar, whatever). Yes, this ruins the ability to do numeric sorting (and you won't be able to do a cast or convert on the column any longer since text will be intermingled with integer values), but I think any other method would be too complex to maintain properly. (For example, in the 1-1 case you mentioned, is that integer value the actual value or a foreign key to the string table? Now we need another column to determine that.)

I would create the extra column for string values. It's not true normalization but it's the easiest to implement and to work with.
Using the same field for both numbers and strings would work to as long as you don't plan on doing anything with the numbers like summing or sorting.
The extra table approach while good from a normalization standpoint is probably overly complex.

I'd convert the value field to string and add a column indicating what the datatype should be treated as for post processing and reporting.

Sql Server at least has an IsNumeric function you can use:
ORDER BY IsNumeric(Results.Value) DESC,
CASE WHEN IsNumeric(Results.Value) = 1 THEN Len(Results.Value) ELSE 99 END,
Results.Value

One of two solutions comes to mind. It kind of depends on what you're doing with the numbers. If they just represent a choice of some kind, then pick one. If you need to do math on it (sorting, conversion, etc..) then pick another.
Change the column to be a varchar, and then either put numbers or text in it. Sorting numerically will suck, but hey, it's one column.
Have both a varchar column for the text, and an int column for the number. Use a view to hide the differences, and to control the sorting if necessary. You can coalesce the two columns together if you don't care about whether you're looking at numbers or text.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas