SQL Parse NVARCHAR Field - sql

I am loading data from Excel files into a database on SQL Server 2008. One column has the nvarchar data type. This field contains data such as
Text text text text text text text text text text.
(ABC-2010-4091, ABC-2011-0586, ABC-2011-0587, ABC-2011-0604)
Text text text text text text text text text text.
(ABC-2011-0562, ABC-2011-0570, ABC-2011-0575, ABC-2011-0588)
so it is text with many sentences of this kind.
For each row I need to extract the values of the form ABC-####-####, or more precisely only their last part. For example, for ABC-2010-4091 I need to obtain 4091. I will need to join this number to another table. I think it would be enough to get the last part of each ABC-####-#### value; from there I should be able to handle the rest.
So for the example given above, the result should be 4091, 0586, 0587, 0604, 0562, 0570, 0575, 0588 in the row instead of the whole nvarchar value.
Is this possible somehow? The text in the nvarchar field differs, but the format I want to work with (ABC-####-####) is always the same. Only the number of digits in the last part may vary: it is not always 4 digits, but could be 5 or more.
What is the best approach to get this data? Should I parse it in SSIS or on the SQL Server side with a SQL query? And how?
I am aware this is a tough task. I appreciate any help or advice on how to deal with it. I have not tried anything yet as I do not know where to start. I have read articles about SQL parsing, but I want to ask for the best approach to this task.

Stack Overflow is about programming.
Sit down and start programming.
OK, seriously. This is string parsing, and the bracketed part with multiple values means no bulk import: it is not a standard CSV file.
Either you use SSIS in SQL Server and program the parsing there, or you write a program for that.
String manipulation in SQL is the worst part of the language and I would avoid it.
So, yes, sit down and program a routine. That is probably the fastest way.
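A minimal sketch of such a routine in Python, assuming the codes always look like the literal prefix ABC-, four digits, a hyphen, and then one or more digits. The names CODE_PATTERN and extract_last_parts are made up for illustration. The last parts are kept as strings so that leading zeros (0586) survive.

```python
import re
from typing import List

# Assumed format: "ABC-", four digits, "-", then one or more digits (captured).
CODE_PATTERN = re.compile(r"ABC-\d{4}-(\d+)")


def extract_last_parts(text: str) -> List[str]:
    """Return the trailing number of every ABC-####-#### code found in text."""
    return CODE_PATTERN.findall(text)


sample = (
    "Text text text text text text text text text text.\n"
    "(ABC-2010-4091, ABC-2011-0586, ABC-2011-0587, ABC-2011-0604)\n"
    "Text text text text text text text text text text.\n"
    "(ABC-2011-0562, ABC-2011-0570, ABC-2011-0575, ABC-2011-0588)\n"
)

if __name__ == "__main__":
    print(", ".join(extract_last_parts(sample)))
```

Each extracted value could then be written out as its own row (source row id plus value) before joining to the other table.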

If I understand correctly, "ABC-####-####" will be the value coming through in the column and the numeric part is variable in length.
If that is the case, maybe this will work.
Use a "Derived Column" transformation.
Let's say the column holding "ABC-####-####" is called Column1. Note that in an SSIS expression the column name is written without double quotes; a quoted "Column1" would be treated as a string literal.
SUBSTRING(Column1, FINDSTRING(Column1,"-",2)+1, LEN(Column1)-FINDSTRING(Column1,"-",2))
If I am not mistaken, that should give you the digits after the second hyphen in a new column, no matter how long that value is.
HTH
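To check the index arithmetic of that expression, here is a small Python sketch that mirrors it step by step. It assumes the value is a single code such as ABC-2010-4091 with exactly two hyphens; the function name last_part is hypothetical.

```python
def last_part(token: str) -> str:
    """Mirror SUBSTRING(c, FINDSTRING(c,"-",2)+1, LEN(c)-FINDSTRING(c,"-",2))."""
    first = token.find("-")
    # 1-based position of the second hyphen, like FINDSTRING(token, "-", 2).
    second = token.find("-", first + 1) + 1
    start = second + 1            # 1-based start position passed to SUBSTRING
    length = len(token) - second  # number of characters after the second hyphen
    return token[start - 1 : start - 1 + length]


if __name__ == "__main__":
    print(last_part("ABC-2010-4091"))
```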

I have worked this problem out with the following guides:
Split Multi Value Column into Multiple Records &
Remove Multiple Spaces with Only One Space

Related

How to find Bad characters in the column

I am trying to pull the 'COURSE_TITLE' column value from the 'PS_TRAINING' table in PeopleSoft and write it into a UTF-8 text file to be loaded into the Workday system. The file errors out while loading because of bad characters (Ã, â and many more) present in the column. I have used a procedure which converts non-ASCII values into spaces. But because of this procedure, course titles written in non-English languages such as Chinese, Korean and Spanish are also replaced with spaces.
I even tried using regular expressions (regexp_like(course_title, 'Ã')) to find only the bad characters, but since the table has hundreds of thousands of rows, it would be difficult to find all bad characters. Please suggest a way to solve this.
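If you change your approach, this may work.
Define which characters you want to allow, and retrieve every row that contains anything else. The pattern is anchored so that a row matches only when the whole title consists of allowed characters:
select *
from PS_TRAINING
where not regexp_like(course_title, '^[0-9A-Za-z]*$')
If this returns too many rows, add the extra characters you want to allow (spaces, punctuation and so on) to the character class.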
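Characters like Ã and â often appear when UTF-8 bytes have been read as Windows-1252. The source does not confirm that this is what happened here, so the following Python sketch is only an assumption-based way to flag such rows without touching genuine Chinese, Korean or Spanish titles. The function names are hypothetical.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Windows-1252 mix-up; return text unchanged otherwise."""
    try:
        # Re-encode to the bytes the wrong decoder saw, then decode as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp1252 (e.g. Chinese) or not valid UTF-8
        # afterwards (e.g. a genuine "ñ"): treat the text as already correct.
        return text


def is_suspect(text: str) -> bool:
    """True when the round trip changes the text, i.e. it looks like mojibake."""
    return repair_mojibake(text) != text


if __name__ == "__main__":
    for title in ["CafÃ© Basics", "Español", "中文课程", "Intro to SQL"]:
        print(title, is_suspect(title), repair_mojibake(title))
```

Rows flagged by is_suspect could be reviewed or repaired individually instead of blanking every non-ASCII character.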

SQL query that prevents Excel from converting long integer to scientific notation

So it's been a long time since I've done anything fancy with SQL, so I'm going to do my best to explain. Please be nice, I'm trying my best here.
Basically, I'm pulling information from a database in Snowflake and putting it into a new XML file, and that data is input exactly as-written into a form email.
One of the values is an ID number that's 14 characters long (example: 12345678912345), which is stored in the database as an integer (or so I'm told), but Excel keeps automatically converting it into scientific notation. Since it's an ID number, it needs to look like an ID number, not scientific notation.
Right now, my query just selects & inputs the regular ol' value, and then we manually change it in the Excel sheet. Like literally just SELECT ID_Number from TheThing
One thing I thought might work is:
SELECT CAST(ID_Number as bigint) as ID_Number
... But it doesn't work. Most other solutions I've found don't seem to address my specific scenario of unwanted integer-to-string conversion & I'm distraught.
I'm just an intern and this might have a very obvious answer, but my fellow interns have given up on it and I need to find the answer for my own sanity. It's been a minute since I did anything fancy with SQL so please be nice to me and sorry if this is a dumb question.
In Snowflake, BIGINT and INT(EGER) are the same thing; what you want is VARCHAR. As Ross mentioned in his comment, this is likely just a formatting issue within Excel. In Excel, any value can be treated as text by putting a single quote ' at the beginning of the value, or by using the Text to Columns feature.
If you want to format it as a string on the Snowflake side, casting alone might not do the trick unless you include some kind of additional string character.
To get this type of formatting out of Snowflake, you can try:
SELECT '\'' || CAST(ID_Number AS VARCHAR) AS ID_Number FROM TheThing;

Quickly Convert Text To Numbers or Dates Excel VBA

Is there any way to QUICKLY convert numbers/dates stored as text (without knowing exactly which cells are affected) to their correct type using VBA.
I get data in an ugly text-delimited format, and I wrote a macro that basically does text-to-columns on it, but is more robust (regular text-to-columns will not work on my data, and I also don't want to waste time going through the wizard every time...). But, since I have to use arrays to process the data efficiently, everything gets stored as a String (and is thus transferred to the worksheet as text).
I don't want to have to cycle through every cell, as this takes a LONG time (these are huge data files - I need to use arrays to process them). Is there a simple command I can apply to the entire range to do this?
Thanks!
This has to do with the data type of the columns. Change the column format from General to the correct data type before the text is placed in it, and the data should be converted automatically. Here's an example where I pasted the text 012345 into different columns having different data types. Note how the displayed value is different for the different types but the value is retained (except for Number and General, which drop the leading 0).
However if you don't know what field is of what type... you're really out of luck.
There is a way. Multiply the values in the column by 1: numbers stored as text are converted to real numbers, and values that are already numbers stay unchanged.
See the following link for more.
http://chandoo.org/wp/2014/09/02/convert-numbers-stored-as-text-tip/

Getting long 'dirty' strings from SQL Server database into a 'clean' excel file

I have a table in which comments about clients are kept. This is an open field that can be very long and include line breaks.
When I try and export this to Excel, the data is misaligned. I'd like to return as much of the comment as possible in an excel cell, without anything like a line break.
Is there a way I could do this in Excel? (Find and replace)
Is there a way to structure my SQL query to only return what I can fit?
Or is there a better way?
I found the best way to deal with this is to enclose all suspect string columns in speech marks "" and then, in Excel, under the Text to Columns option, make sure to select speech marks as the text qualifier.
This always worked for me.
Just be sure to remove any speech marks from inside the string column in question, otherwise it will be split again.
Another method I used was an obscure delimiter such as a pipe | which was not likely to be found in my data. Again using the Text to Columns option, I specified the pipe as the column separator, which did just what I needed.
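As a rough Python sketch of the same idea: flatten line breaks inside each comment, cap the length at the 32,767 characters Excel stores per cell, and quote every field so embedded commas and speech marks do not split columns. The function names and the (client_id, comment) row shape are assumptions for illustration.

```python
import csv
import io
import re

EXCEL_CELL_LIMIT = 32767  # maximum number of characters Excel keeps in one cell


def clean_comment(comment: str, max_len: int = EXCEL_CELL_LIMIT) -> str:
    """Replace line breaks and runs of whitespace with one space, then truncate."""
    flattened = re.sub(r"\s+", " ", comment).strip()
    return flattened[:max_len]


def to_csv(rows) -> str:
    """Write (client_id, comment) pairs as CSV with every field in speech marks."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    for client_id, comment in rows:
        writer.writerow([client_id, clean_comment(comment)])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv([(1, "First line\r\nsecond line"), (2, 'He said "call back"\nlater')]))
```

The csv module doubles any speech marks inside a field, so they do not need to be removed beforehand when Excel opens the file as CSV.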

Full Text Searching for single characters

I have a table with a TEXT column whose contents are just strings of comma-separated numbers, for example ",1,76,77,115,". Each string can hold an arbitrary number of numbers.
I am trying to set up Full Text Indexing so that I can search this column rapidly. This works great. Instead of running queries with
where MY_COL LIKE '%,77,%' and MY_COL LIKE '%,115,%'
I can do
where CONTAINS(MY_COL,'77 and 115')
However, when I try to search for a single character it doesn't work.
where CONTAINS(MY_COL,'1')
But I know that there should be records returned! I quickly found that I need to edit the Noise file and rebuild the index. But even after doing that it still doesn't work.
Working with relational databases that way is going to hurt.
Use a proper schema. Either store the values in different rows or use an array datatype for the column.
That will make solving the problem trivial.
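A small sketch of the "different rows" option, using Python's built-in sqlite3 only so that the example is self-contained; the table and column names (record_value, record_id, value) and the sample data are made up. Each number becomes its own row, and a search for several numbers turns into a plain GROUP BY query.

```python
import sqlite3
from typing import List


def split_values(csv_text: str) -> List[int]:
    """Turn ',1,76,77,115,' into [1, 76, 77, 115], ignoring empty pieces."""
    return [int(piece) for piece in csv_text.split(",") if piece.strip()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record_value (record_id INTEGER, value INTEGER)")

# Hypothetical source rows: record id -> original comma-separated string.
source = {1: ",1,76,77,115,", 2: ",12,15,33,", 3: ",77,200,"}
for record_id, text in source.items():
    conn.executemany(
        "INSERT INTO record_value VALUES (?, ?)",
        [(record_id, value) for value in split_values(text)],
    )


def records_with_all(connection, wanted: List[int]) -> List[int]:
    """Return ids of records containing every value in wanted (assumed distinct)."""
    placeholders = ",".join("?" * len(wanted))
    sql = (
        f"SELECT record_id FROM record_value WHERE value IN ({placeholders}) "
        "GROUP BY record_id HAVING COUNT(DISTINCT value) = ? ORDER BY record_id"
    )
    return [row[0] for row in connection.execute(sql, (*wanted, len(wanted)))]


if __name__ == "__main__":
    print(records_with_all(conn, [77, 115]))
    print(records_with_all(conn, [1]))
```

With this layout a search for 1 matches only the exact value 1, not 12 or 15, and an ordinary index on value keeps the lookup fast.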
I fixed my own problem, although I'm not exactly sure what fixed it.
I dropped my table and populated a new one (my program does batch processing) and created a new Full Text Index. Maybe I wasn't being patient enough to allow the indexing to fully rebuild.
Agreed. How does 12,15,33 not return that record for a search for 1 with fulltext? Use an actual table schema to accomplish this.