filling space in value according to length of field - sql

I have sql query as:
select UJIdentifyer,RecordIdentity,Location,DeptTime,BayNumber,TimingPointIndicatior,
FareStageIndicatior
from QO where UJIdentifyer='139013'
Every column in this query's result must have a fixed length.
E.g. the Location column must be exactly 12 characters long.
If a value has 11 characters, one space should be appended; if it has 10, two spaces.
For this I wrote:
select UJIdentifyer,case when len(RecordIdentity)=11 then RecordIdentity+' ' when len(RecordIdentity)=10 then RecordIdentity+'  ' .... from QO where UJIdentifyer='139013'
This way I would have to write 10-12 cases for the length matching.
Is there any way to avoid repeating those similar cases?

Try this:
case when datalength(Location) < 12
then Location + replicate(' ', 12 - datalength(Location))
else Location
END
Example:
Declare @var varchar(12) = 'SQL'
Select datalength(@var) InitialLength,
case when datalength(@var) < 12
then @var + replicate(' ', 12 - datalength(@var))
else @var
END as PaddingExtraSpaces,
Datalength(case when datalength(@var) < 12
then @var + replicate(' ', 12 - datalength(@var))
else @var
END) FinalLength
Result:
InitialLength PaddingExtraSpaces FinalLength
3 SQL 12
Use DATALENGTH rather than LEN, because LEN ignores trailing blanks. DATALENGTH returns the number of bytes, which equals the number of characters for a single-byte varchar value like this one.
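The CASE expression can also be avoided entirely. The following is a minimal sketch, assuming a varchar value and a target width of 12; @loc is a hypothetical stand-in for the Location column. It appends 12 spaces and keeps the leftmost 12 characters, or alternatively casts to char(12), which pads with trailing spaces.

```sql
-- @loc is a hypothetical stand-in for the Location column.
DECLARE @loc varchar(12) = 'SQL';

SELECT LEFT(@loc + REPLICATE(' ', 12), 12)             AS PaddedWithLeft,
       CAST(@loc AS char(12))                          AS PaddedWithCast,
       DATALENGTH(LEFT(@loc + REPLICATE(' ', 12), 12)) AS FinalLength;  -- 12
```

Both forms truncate values longer than 12 characters and return NULL for a NULL input, so they only fit when that behaviour is acceptable.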

Related

In SQL Server, how can I search for column with 1 or 2 whitespace characters?

I need to filter a column that contains one, two, or three whitespace characters.
CREATE TABLE a
(
[col] [char](3) NULL,
)
and some inserts like
INSERT INTO a VALUES (' '), ('  '), ('   ')
How do I get only the row with one white space?
Simply writing
SELECT *
FROM a
WHERE col = ' '
returns all rows, regardless of how many whitespace characters they contain.
Is there a way to escape the space, or to search for a specific number of whitespace characters in a column? Regex?
Use a LIKE clause, e.g. where col like '%[ ]%'
The brackets are important; LIKE provides only a very limited form of pattern matching. If that is not enough, you can add a regex function written in C# to the database and use it to check each row, but it won't be indexed and thus will be very slow.
The other alternative, if you need speed, is to look into full text search indexes.
Here is one approach you can take:
DECLARE @data table ( txt varchar(50), val varchar(50) );
INSERT INTO @data VALUES ( 'One Space', ' ' ), ( 'Two Spaces', '  ' ), ( 'Three Spaces', '   ' );
;WITH cte AS (
SELECT
txt,
DATALENGTH ( val ) - ( DATALENGTH ( REPLACE ( val, ' ', '' ) ) ) AS CharCount
FROM @data
)
SELECT * FROM cte WHERE CharCount = 1;
RETURNS
+-----------+-----------+
| txt | CharCount |
+-----------+-----------+
| One Space | 1 |
+-----------+-----------+
You need to use DATALENGTH because LEN ignores trailing blank spaces. This is a method I have used before.
NOTE:
This example assumes the use of a varchar column.
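The difference between LEN and DATALENGTH that this answer relies on can be seen in a short sketch. The variables @v and @n are hypothetical; SPACE(2) is used only to make the two trailing spaces explicit.

```sql
-- Hypothetical variables: the same text stored as varchar and as nvarchar.
DECLARE @v varchar(10)  = 'ab' + SPACE(2);
DECLARE @n nvarchar(10) = N'ab' + SPACE(2);

SELECT LEN(@v)        AS LenVarchar,    -- 2: trailing spaces are not counted
       DATALENGTH(@v) AS BytesVarchar,  -- 4: one byte per character
       LEN(@n)        AS LenNvarchar,   -- 2
       DATALENGTH(@n) AS BytesNvarchar; -- 8: two bytes per character
```

For an nvarchar column, the byte difference in the query above would therefore need to be divided by 2 to give a count of spaces.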
Trailing spaces are often ignored in string comparisons in SQL Server. They are treated as significant on the LHS of the LIKE though.
To search for values that are exactly one space you can use
select *
from a
where ' ' LIKE col AND col = ' '
/*The second predicate is required in case col contains % or _ and for index seek*/
Note that with your example table all the values will be padded out to three characters with trailing spaces anyway. You would need a variable-length datatype (varchar/nvarchar) to avoid this.
The advantage this has over checking value + DATALENGTH is that it is agnostic to how many bytes per character the string is using (dependent on datatype and collation).
How to get only rows with one space?
SELECT *
FROM a
WHERE col LIKE SPACE(1) AND col NOT LIKE SPACE(2)
;
Though this will only work for variable length datatypes.
Thanks guys for answering.
So I converted the char(3) column to varchar(3).
This seemed to work for me. It seems SQL Server applies ANSI padding, which stores three white spaces in a char(3) column for any empty or single-space input, so any search, LEN, or REPLACE sees the padded value.
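The padding behaviour described here can be checked with a small sketch, assuming the default ANSI_PADDING ON setting; the table variable @t and its columns are hypothetical.

```sql
-- Hypothetical table variable holding the same input as char(3) and as varchar(3).
DECLARE @t table (c char(3), v varchar(3));
INSERT INTO @t VALUES (SPACE(1), SPACE(1)), (SPACE(2), SPACE(2));

SELECT DATALENGTH(c) AS CharBytes,     -- 3 for both rows: char(3) is padded to full width
       DATALENGTH(v) AS VarcharBytes   -- 1 and 2: varchar keeps what was inserted
FROM @t;
```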

Filter rows by whether a text column contains any words in a string in SQL

My SQL Server database table has a column text which is a long string of text.
The search list is a string of words separated by comma. I want to grab those rows where the text column contains any one of words in the string.
DECLARE @words_to_search nvarchar(50)
SET @words_to_search = 'apple, pear, orange'
SELECT *
FROM myTbl
WHERE text ??? --how to specify text contains @words_to_search
Thanks a lot in advance.
If you're running SQL Server 2016 or later, you can use STRING_SPLIT to convert the words to search into a single column table, and then JOIN that to your table using LIKE:
DECLARE @words_to_search nvarchar(50)
SET @words_to_search = 'apple,pear,orange'
SELECT *
FROM myTbl
JOIN STRING_SPLIT(@words_to_search, ',') ON text LIKE '%' + value + '%';
Note that as the query is written it will (for example) match apple within Snapple. You can work around that by making the JOIN condition a bit more complex:
SELECT *
FROM myTbl t
JOIN STRING_SPLIT(@words_to_search, ',') v
ON t.text LIKE '%[^A-Za-z]' + value + '[^A-Za-z]%'
OR t.text LIKE value + '[^A-Za-z]%'
OR t.text LIKE '%[^A-Za-z]' + value;
First, I would use exists, unless you want to return the matching word.
Second, you can do this with a single LIKE comparison if words are separated by spaces:
select t.*
from t
where exists (select 1
from string_split(@words_to_search, ',') s
where ' ' + t.text + ' ' like '% ' + value + ' %'
);
For more generic separators, you can use:
select t.*
from t
where exists (select 1
from string_split(@words_to_search, ',') s
where ' ' + t.text + ' ' like '%[^A-Za-z]' + value + '[^A-Za-z]%'
);
Or whatever describes your separators.
Note that your list of words is separated by a comma-space, not just a comma. However, based on your description (not the sample data), I have only used a ',' for the separator.

SQL Select query to pick the field value which has more than one empty space?

In my LastName column, I have either one name or two names. In some records, there is more than one space between the two names.
I need to select the records that have more than one space in that field.
declare @nam nvarchar(4000)
declare @nam1 nvarchar(4000)
set @nam = 'sam' + ' ' + 'Dev'
set @nam1 = 'ed' + ' ' + ' ' + 'Dev'
In the sample query, I expect the output value to be @nam1.
You can do this with LEN and REPLACE: remove the spaces from the string, subtract the resulting length from the original length, and check the difference in the WHERE clause:
SELECT *
FROM
myTable
WHERE
LEN(LastName)-LEN(REPLACE(LastName, ' ', '')) > 1
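If the goal is specifically two or more consecutive spaces between the names, a LIKE pattern is an alternative. This is a minimal sketch with a hypothetical table variable @names built from the question's sample values; SPACE(n) is used so the number of spaces is explicit.

```sql
-- Hypothetical sample data mirroring @nam and @nam1 from the question.
DECLARE @names table (LastName nvarchar(4000));
INSERT INTO @names VALUES (N'sam' + SPACE(1) + N'Dev'),
                          (N'ed'  + SPACE(2) + N'Dev');

-- Matches only values containing at least two adjacent spaces.
SELECT LastName
FROM @names
WHERE LastName LIKE N'%' + SPACE(2) + N'%';
```

Unlike the length difference, this does not flag a value that merely contains two separate single spaces, which may or may not be what is wanted.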

How to count words in specific column against matching words in another table

I want to be able to:
extract specific words from column1 in Table1 - but only the words that are matched from Table2 from a column called word,
perform a(n individual) count of the number of words that have been found, and
put this information into a permanent table with a format, that looks like:
Final
Word | Count
--------+------
Test | 7
Blue | 5
Have | 2
Currently I have tried this:
INSERT INTO final (word, count)
SELECT
extext
, SUM(dbo.WordRepeatedNumTimes(extext, 'test')) AS Count
FROM [dbo].[TestSite_Info], [dbo].[table_words]
WHERE [dbo].[TestSite_Info].ExText = [dbo].[table_words].Words
GROUP BY ExText;
The function dbo.WordRepeatedNumTimes is:
ALTER function [dbo].[WordRepeatedNumTimes]
(@SourceString varchar(8000),@TargetWord varchar(8000))
RETURNS int
AS
BEGIN
DECLARE @NumTimesRepeated int
,@CurrentStringPosition int
,@LengthOfString int
,@PatternStartsAtPosition int
,@LengthOfTargetWord int
,@NewSourceString varchar(8000)
SET @LengthOfTargetWord = len(@TargetWord)
SET @LengthOfString = len(@SourceString)
SET @NumTimesRepeated = 0
SET @CurrentStringPosition = 0
SET @PatternStartsAtPosition = 0
SET @NewSourceString = @SourceString
WHILE len(@NewSourceString) >= @LengthOfTargetWord
BEGIN
SET @PatternStartsAtPosition = CHARINDEX (@TargetWord,@NewSourceString)
IF @PatternStartsAtPosition <> 0
BEGIN
SET @NumTimesRepeated = @NumTimesRepeated + 1
SET @CurrentStringPosition = @CurrentStringPosition + @PatternStartsAtPosition +
@LengthOfTargetWord
SET @NewSourceString = substring(@NewSourceString, @PatternStartsAtPosition +
@LengthOfTargetWord, @LengthOfString)
END
ELSE
BEGIN
SET @NewSourceString = ''
END
END
RETURN @NumTimesRepeated
END
When I run the above INSERT statement, no record is inserted.
In the table TestSite_Info is a column called Extext. Within this column, there is random text - one of the words being 'test'.
In the other table called Table_Words, I have a column called Words and one of the words in there is 'Test'. So in theory, as the word is a match, I would pick it up, put it into the table Final, and then next to the word (in another column) the count of how many times the word has been found within TestSite_Info.Extext.
Table_Words
id|word
--+----
1 |Test
2 |Onsite
3 |Here
4 |As
TestSite_Info
ExText
-------------------------------------------------
This is a test, onsite test , test test i am here
The expected Final table has been given at the top.
-- Update
Now that I have run Abecee's block of code, it works in the sense that it returns a count column and the id of each word.
Here are the results :
id|total
--+----
169 |3
170 |0
171 |5
172 |7
173 |1
174 |3
Taken from the following text which it is extracting from :
Test test and I went to and this was a test I'm writing rubbish hello
but I don't care about care and care seems care to be the word that you will see appear
four times as well as word word word word word, but a .!
who knows whats going on here.
As you can see, the count for ID 172 is 7 (see below for which word each ID refers to), which is incorrect: the word appears 6 times, so 1 has been added for some reason. Likewise ID 171, the word care, appears 4 times but shows a count of 5. Any ideas why this would be?
Also, what I was really after was a final table showing not only the ID and count, as you have kindly done, but also the word each row relates to, so I don't have to link back through the ID table to see what the actual word is.
Word|id
--+----
as |174
here |173
word |172
care |171
hello |170
test |169
You could work along the updated
WITH
Detail AS (
SELECT
W.id
, W.word
, T.extext
, (LEN(REPLACE(T.extext, ' ', ' ')) + 2
- LEN(REPLACE(' '
+ UPPER(REPLACE(REPLACE(REPLACE(REPLACE(T.extext, ' ', ' '), ':', ' '), '.', ' '), ',', ' '))
+ ' ', ' ' + UPPER(W.word) + ' ', '')) - 1
) / (LEN(W.word) + 2) count
FROM Table_Words W
JOIN TestSite_Info T
ON CHARINDEX(UPPER(W.word), UPPER(T.extext)) > 0
)
INSERT INTO Result
SELECT
id
, SUM(count) total
FROM Detail
GROUP BY id
;
(Had forgotten to count in the blanks added to the front and the end, missed a sign change, and got mixed up as for the length of the word(s) surrounded by blanks. Sorry about that. Thanks for testing it more thoroughly than I did originally!)
Tested on SQL Server 2008: Updated SQL Fiddle and 2012: Updated SQL Fiddle.
And with your test case as well.
It:
is pure SQL (no UDF required),
has room for some tuning:
Store words all lower / all upper case, unless case matters (Which would require to adjust the suggested solution.)
Store strings to check with all punctuation marks removed.
Please comment if and as further detail is required.
From what I could understand, this might do the job. However, things will be more clear if you post the schema
create table final(
word varchar(100),
count integer
);
insert into final (word, count)
select column1, count(*)
from table1, table2
where table1.column1 = table2.words
group by column1;
Thanks for all your help.
The best approach for this solution was :
WITH
Detail AS (
SELECT
W.id
, W.word
, T.extext
, (LEN(REPLACE(T.extext, ' ', ' ')) + 2
- LEN(REPLACE(' '
+ UPPER(REPLACE(REPLACE(REPLACE(REPLACE(T.extext, ' ', ' '), ':', ' '), '.', ' '), ',', ' '))
+ ' ', ' ' + UPPER(W.word) + ' ', '')) - 1
) / (LEN(W.word) + 2) count
FROM Table_Words W
JOIN TestSite_Info T
ON CHARINDEX(UPPER(W.word), UPPER(T.extext)) > 0
)
INSERT INTO Result
SELECT
id
, SUM(count) total
FROM Detail
GROUP BY id
Re-reading the problem description, the presented SELECT seems to try to align/join full "exText" strings with individual "words" - but based on equality. Then, it has "exText" in the SELECT list whilst "Result" seems to wait for individual words. (That would not fail the INSERT, though, as long as the field is not guarded by a foreign key constraint. But as the WHERE/JOIN is not likely to let any data through, this is probably never coming up as an issue anyway.)
For an alternative to the pure declarative approach, you might want to try along
INSERT INTO final (word, count)
SELECT
word
, SUM(dbo.WordRepeatedNumTimes(extext, word)) AS Count
FROM [dbo].[TestSite_Info], [dbo].[table_words]
WHERE CHARINDEX(UPPER([dbo].[table_words].Word), UPPER([dbo].[TestSite_Info].ExText)) > 0
GROUP BY word;
You have "Word" in your "Table_Words" description - but use "[dbo].[table_words].Words" in your WHERE condition…
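The counting idea behind both approaches can be reduced to a single expression: remove every occurrence of the word and divide the number of bytes removed by the word's length. The sketch below assumes varchar data (one byte per character) and a case-insensitive collation; @text and @word are hypothetical variables holding the sample row and word from the question.

```sql
-- Hypothetical variables holding the sample text and one word to count.
DECLARE @text varchar(200) = 'This is a test, onsite test , test test i am here';
DECLARE @word varchar(50)  = 'Test';

-- Bytes removed by REPLACE divided by the word's byte length gives the number of occurrences.
SELECT (DATALENGTH(@text) - DATALENGTH(REPLACE(@text, @word, ''))) / DATALENGTH(@word) AS Occurrences;
```

Under those assumptions this should return 4 for the sample row. It counts substrings rather than whole words, so 'test' inside 'testing' would also be counted.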

SQL Server : find percentage match of LIKE string

I'm trying to write a query to find the percentage match of a search string in a notes or TEXT column.
This is what I'm starting with:
SELECT *
FROM NOTES
WHERE UPPER(NARRATIVE) LIKE 'PAID CALLED RECEIVED'
Ultimately, what I want to do is:
Split the search string by spaces and search individually for all words in the string
Order the results descending based on percentage match
For example, in the above scenario, each word in the search string would constitute 33.333% of the total. A NARRATIVE with 3 matches (100%) should be at the top of the results, while a match containing 2 of the keywords (66.666%) would be lower, and a match containing 1 of the keywords (33.333%) would be even lower.
I then want to display the resulting percentage match for that row in a column, along with all the other columns from that table (*).
Hopefully, this makes sense and can be done. Any thoughts on how to proceed? This MUST all be done in SQL Server, and I would prefer not to write any CTEs.
Thank you in advance for any guidance.
Here is what I came up with:
DECLARE @VISIT VARCHAR(25) = '999232'
DECLARE @KEYWORD VARCHAR(100) = 'PAID,CALLED,RECEIVED'
DECLARE SPLIT_CURSOR CURSOR FOR
SELECT RTRIM(LTRIM(VALUE)) FROM Rpt_Split(@KEYWORD, ',')
IF OBJECT_ID('tempdb..#NOTES_FF_SEARCH') IS NOT NULL DROP TABLE #NOTES_FF_SEARCH
SELECT N.VISIT_NO
,N.CREATE_DATE
,N.CREATE_BY
,N.NARRATIVE
,0E8 AS PERCENTAGE
INTO #NOTES_FF_SEARCH
FROM NOTES_FF AS N
WHERE N.VISIT_NO = @VISIT
DECLARE @KEYWORD_VALUE AS VARCHAR(255)
OPEN SPLIT_CURSOR
FETCH NEXT FROM SPLIT_CURSOR INTO @KEYWORD_VALUE
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE #NOTES_FF_SEARCH
SET PERCENTAGE = PERCENTAGE + ( 100.0 / @@CURSOR_ROWS )
WHERE UPPER(NARRATIVE) LIKE '%' + UPPER(@KEYWORD_VALUE) + '%'
FETCH NEXT FROM SPLIT_CURSOR INTO @KEYWORD_VALUE
END
CLOSE SPLIT_CURSOR
DEALLOCATE SPLIT_CURSOR
SELECT * FROM #NOTES_FF_SEARCH
WHERE PERCENTAGE > 0
ORDER BY PERCENTAGE DESC, CREATE_DATE DESC
There may be a more efficient way to do this but every other road I started down ended in a dead-end. Thanks for your help
If you want to do a "percentage" match, you need to do two things: calculate the number of words in the string and calculate the number of words you care about. Before giving some guidance, I will say that full text search probably does everything you want and much more efficiently.
Assuming the search string has space delimited words, you can count the words with the expression:
(len(narrative) - len(replace(narrative, ' ', '')) + 1) as NumWords
You can count the matching words with successive replaces. So, for keywords, it would be something like removing each keyword, fixing the spaces, and counting the remaining words.
The overall code is best represented with subqueries. The resulting query is something like:
select n.*
from (select n.*,
(len(narrative) - len(replace(narrative, ' ', '')) + 1.0) as NumWords,
ltrim(rtrim(replace(replace(replace(narrative + ' ', @keyword1 + ' ', ''),
@keyword2 + ' ', ''),
@keyword3 + ' ', ''))) as NoKeywords
from notes n
) n
order by 1 - (len(NoKeywords) - len(replace(NoKeywords, ' ', '')) + 1.0) / NumWords desc;
SQL Server -- as with many databases -- is not particularly good at parsing strings. You can do that outside the query and assign the @keyword variables accordingly.
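For the percentage defined in the question (matched keywords divided by total keywords), a set-based sketch without a cursor or CTE is possible on SQL Server 2016 or later, where STRING_SPLIT is available. The table variable @notes and its rows are hypothetical sample data, not the real NOTES table, and the match is a plain substring match as in the cursor version.

```sql
DECLARE @keywords varchar(100) = 'PAID,CALLED,RECEIVED';
-- Total number of keywords, computed once.
DECLARE @total int = (SELECT COUNT(*) FROM STRING_SPLIT(@keywords, ','));

-- Hypothetical sample data standing in for the NOTES table.
DECLARE @notes table (id int, narrative varchar(200));
INSERT INTO @notes VALUES (1, 'Patient called and paid'),
                          (2, 'Invoice received'),
                          (3, 'No contact made');

-- One joined row per (note, matching keyword); notes with no match drop out of the join.
SELECT n.id, n.narrative,
       100.0 * COUNT(*) / @total AS PercentMatch
FROM @notes AS n
JOIN STRING_SPLIT(@keywords, ',') AS s
  ON UPPER(n.narrative) LIKE '%' + UPPER(LTRIM(RTRIM(s.value))) + '%'
GROUP BY n.id, n.narrative
ORDER BY PercentMatch DESC;
```

With this sample data, note 1 matches two of the three keywords and sorts first, note 2 matches one, and note 3 is excluded. A repeated keyword in the list would be counted twice, so the list is assumed to contain distinct words.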