How to select a substring till Nth space based on fixed character in SQL Server and Oracle SQL

I need to select a substring based on fixed character length till Nth space. Let me explain the problem.
Let's assume I have three different strings:
Lorem ipsum dolor sit amet, consectetur adipiscing elit
Lorem ipsumdolor sit amet, consectetur adipiscing elit
Lorem ipsum dolorsitamet, consectetur adipiscing elit
If I select the first 20 characters of each string, I get the following substrings, respectively:
Lorem ipsum dolor si
Lorem ipsumdolor sit
Lorem ipsum dolorsit
But I want my substring (which is at most 20 characters long) like this
Lorem ipsum dolor
Lorem ipsumdolor sit
Lorem ipsum
That is, I do not want a partial word at the end of the substring.
Please help me write the query.

Oracle:
select substr(substr(MyField,1,20), 1, instr(substr(MyField,1,20), ' ',-1,1))
from MyTable
SQL Server:
SELECT LEFT(MyField, 20 - CHARINDEX (' ' ,REVERSE(LEFT(MyField,20))))
FROM MyTable

For Oracle (it should be possible to translate this to SQL Server, but I don't know SQL Server):
If the first "token" (before the first space) is more than 20 characters, this will return NULL.
If the 21st character is a space, return the first 20 characters.
If the 21st character is not a space, but there is a space among the first 20 characters, take the first 20 characters, then find the last space and delete it and everything after it.
If the whole string is at most 20 characters, it is returned as is.
In the test data below I added two more examples to test if this is working as needed.
with
  inputs ( str ) as (
    select 'Lorem ipsum dolor sit amet, consectetur adipiscing elit' from dual union all
    select 'Lorem ipsumdolor sit amet, consectetur adipiscing elit' from dual union all
    select 'Loremipsumdolorsitametconsedtetur' from dual union all
    select 'Lorem ipsumdolorsit amet, consectetur etc.' from dual union all
    select 'Lorem ipsum dolorsitamet, consectetur adipiscing elit' from dual union all
    select 'abcdef ghijk lmno' from dual
  ),
  prep ( str, flag, fragment ) as (
    select str,
           case when length(str) <= 20 or substr(str, 21, 1) = ' ' then 1 end,
           substr(str, 1, 20)
    from   inputs
  )
select str,
       case flag when 1 then fragment
                 else substr(fragment, 1, instr(fragment, ' ', -1) - 1) end
         as new_str
from   prep;
STR                                                     NEW_STR
------------------------------------------------------- --------------------
Lorem ipsum dolor sit amet, consectetur adipiscing elit Lorem ipsum dolor
Lorem ipsumdolor sit amet, consectetur adipiscing elit  Lorem ipsumdolor sit
Loremipsumdolorsitametconsedtetur
Lorem ipsumdolorsit amet, consectetur etc.              Lorem ipsumdolorsit
Lorem ipsum dolorsitamet, consectetur adipiscing elit   Lorem ipsum
abcdef ghijk lmno                                       abcdef ghijk lmno
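For readers who want to check the four rules outside a database, here is a minimal Python sketch of the same logic. The function name `trim_to_word` and the use of `None` to stand in for Oracle's NULL are assumptions made for illustration; they are not part of the answer above.

```python
from typing import Optional


def trim_to_word(text: str, max_len: int = 20) -> Optional[str]:
    """Cut text to at most max_len characters without splitting a word."""
    # Short strings, or a cut that lands exactly on a word boundary,
    # keep their first max_len characters unchanged.
    if len(text) <= max_len or text[max_len] == " ":
        return text[:max_len]
    fragment = text[:max_len]
    last_space = fragment.rfind(" ")
    # No usable space in the fragment: the first word is too long,
    # so return None in place of Oracle's NULL.
    if last_space <= 0:
        return None
    # Drop the last space and everything after it.
    return fragment[:last_space]


samples = [
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit",
    "Lorem ipsumdolor sit amet, consectetur adipiscing elit",
    "Loremipsumdolorsitametconsedtetur",
    "Lorem ipsumdolorsit amet, consectetur etc.",
    "Lorem ipsum dolorsitamet, consectetur adipiscing elit",
    "abcdef ghijk lmno",
]
for s in samples:
    print(repr(trim_to_word(s)))
```

Run against the six test strings, this should reproduce the NEW_STR column shown above, with `None` for the row that has no space in its first 20 characters.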

For SQL Server, if you don't mind a UDF:
Declare @YourTable table (SomeText varchar(500))
Insert Into @YourTable values
('Lorem ipsum dolor sit amet, consectetur adipiscing elit.'),
('Lorem ipsumdolor sit amet, consectetur adipiscing elit'),
('Lorem ipsum dolorsitamet, consectetur adipiscing elit')
Declare @MaxLen int = 20
Select *,Trimmed = [dbo].[udf-Str-TrimToWord](SomeText,@MaxLen)
From @YourTable
Returns
SomeText                                                  Trimmed
Lorem ipsum dolor sit amet, consectetur adipiscing elit.  Lorem ipsum dolor
Lorem ipsumdolor sit amet, consectetur adipiscing elit    Lorem ipsumdolor
Lorem ipsum dolorsitamet, consectetur adipiscing elit     Lorem ipsum
The UDF
CREATE FUNCTION [dbo].[udf-Str-TrimToWord] (@String varchar(max),@MaxLen int)
Returns varchar(max)
AS
Begin
    Return LEFT(@String,@MaxLen-CharIndex(' ' ,Reverse(Left(@String,@MaxLen))))
End
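To see how the UDF's LEFT / REVERSE / CharIndex expression behaves on edge cases, the following Python sketch mirrors it step by step. The name `trim_like_udf` is hypothetical, and the translation assumes CharIndex returns 0 when no space is found.

```python
def trim_like_udf(text: str, max_len: int = 20) -> str:
    """Mirror LEFT(s, n - CHARINDEX(' ', REVERSE(LEFT(s, n))))."""
    head = text[:max_len]                 # LEFT(s, n)
    # 1-based position of the first space in the reversed prefix,
    # or 0 when the prefix contains no space.
    pos = head[::-1].find(" ") + 1
    return text[:max_len - pos]           # LEFT(s, n - pos)


for s in [
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
    "Lorem ipsumdolor sit amet, consectetur adipiscing elit",
    "Lorem ipsum dolorsitamet, consectetur adipiscing elit",
    "Loremipsumdolorsitametconsedtetur",
    "abcdef ghijk lmno",
]:
    print(repr(trim_like_udf(s)))
```

If this translation is faithful, two behaviours are worth knowing: a string with no space in its first 20 characters comes back cut at 20 characters rather than as NULL, and a string already shorter than the limit (such as 'abcdef ghijk lmno') is shortened further, because the subtraction starts from the maximum length and not from the actual length of the string.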

Related

Separate and split a string with tags

I have a column with values that are separated by tags. How can I split it into three columns in SQL Server? The columns should be named 'Application', 'Access Level', and 'Restrictions' and contain the corresponding text. I am using SQL Server Management Studio 2012.
Below is what the value looks like that I am trying to split:
<b>Application</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.<br/><b>Access Level</b> : CLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod. <br/><b>Restrictions</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.
My code attempt:
select
substring(ColumnName,1,charindex('br/><b>Access Level</b> : ',ColumnName)-1) as 'Application',
substring(ColumnName,charindex('<br/><b>Access Level</b> : ',ColumnName)+1,len(ColumnName)) as 'Access Level',
substring(ColumnName,charindex('<br/> <b>Restrictions</b> : ',ColumnName)+1,len(ColumnName)) as 'Restrictions'
from TableName
Please try the following solution, assuming that you have SQL Server 2016 or later.
Your data closely resembles JSON, so:
1) Convert the input column into legitimate JSON.
2) Use the JSON_VALUE() function to retrieve the name/value pairs one by one.
SQL:
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, unstructured NVARCHAR(MAX));
INSERT INTO @tbl (unstructured) VALUES
('<b>Application</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.<br/>
<b>Access Level</b> : CLorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod. <br/>
<b>Restrictions</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.');
-- DDL and sample data population, end
SELECT ID, unstructured
    , JSON_VALUE(JSON, '$.Application') AS [Application]
    , JSON_VALUE(JSON, '$."Access Level"') AS [Access Level]
    , JSON_VALUE(JSON, '$.Restrictions') AS [Restrictions]
FROM @tbl
    CROSS APPLY (VALUES ('{' + REPLACE(REPLACE(REPLACE(unstructured
        ,'<b>', '"')
        ,'</b> : ', '":"')
        ,'<br/>', '",') + '"}'
    )) AS R(JSON);
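The replace chain is easier to follow when you can inspect the intermediate JSON text. Here is a rough Python sketch of the same three replacements followed by a JSON parse. The function name `tagged_to_dict` and the shortened sample value are made up for illustration, and the approach assumes the field text never contains a double quote or a backslash.

```python
import json


def tagged_to_dict(unstructured: str) -> dict:
    """Turn '<b>Name</b> : value<br/>...' into a dict via a JSON string."""
    body = unstructured.replace("<b>", '"')   # opens each key
    body = body.replace("</b> : ", '":"')     # closes the key, opens the value
    body = body.replace("<br/>", '",')        # closes the value, separates pairs
    as_json = "{" + body + '"}'               # close the last value and the object
    return json.loads(as_json)


# Shortened, hypothetical sample in the same shape as the question's data.
sample = (
    "<b>Application</b> : Lorem ipsum.<br/>"
    "<b>Access Level</b> : Dolor sit. <br/>"
    "<b>Restrictions</b> : Amet."
)
fields = tagged_to_dict(sample)
print(fields["Application"])
print(fields["Access Level"])
print(fields["Restrictions"])
```

Note that any whitespace before a `<br/>` stays in the extracted value, so a trim on the result may still be wanted.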
You were pretty much there; you just needed to debug your logic and add/subtract the right amounts in your substring calls to match your data. That's assuming your data always follows that pattern, of course.
select ColumnName
, substring(ColumnName,22,charindex('<br/><b>Access Level</b> : ',ColumnName)-22) as 'Application'
, substring(ColumnName,charindex('<br/><b>Access Level</b> : ',ColumnName)+27,charindex('<br/><b>Restrictions</b> : ',ColumnName)-charindex('<br/><b>Access Level</b> : ',ColumnName)-27) as 'Access Level'
, substring(ColumnName,charindex('<br/><b>Restrictions</b> : ',ColumnName)+27,len(ColumnName)) as 'Restrictions'
from (
values ('<b>Application</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.<br/><b>Access Level</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.<br/><b>Restrictions</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.')
) x (ColumnName);
I would tend to split this out using cross apply to avoid repeating the same calculation.
select a.ColumnName
, substring(a.ColumnName, c.StartFirstString, d.LengthFirstString)
, substring(a.ColumnName, c.StartSecondString, d.LengthSecondString)
, substring(a.ColumnName, c.StartThirdString, d.LengthThirdString)
from (
values ('<b>Application</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.<br/><b>Access Level</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.<br/><b>Restrictions</b> : Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.')
) a (ColumnName)
cross apply (
values (
'<b>Application</b> : '
, '<br/><b>Access Level</b> : '
, '<br/><b>Restrictions</b> : '
)
) b (FirstString, SecondString, ThirdString)
cross apply (
values (
len(b.FirstString) + 2
, charindex('<br/><b>Access Level</b> : ', a.ColumnName) + len(b.SecondString) + 1
, charindex('<br/><b>Restrictions</b> : ', a.ColumnName) + len(b.ThirdString) + 1
)
) c (StartFirstString, StartSecondString, StartThirdString)
cross apply (
values (
c.StartSecondString-len(b.FirstString)-len(b.SecondString)-3
, c.StartThirdString-c.StartSecondString-len(b.ThirdString)-1
, len(a.ColumnName)-c.StartThirdString+1
)
) d (LengthFirstString, LengthSecondString, LengthThirdString);
Both return (showing columns as rows due to the data length):
Column        Data
Application   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.
Access Level  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.
Restrictions  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod.
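The constants 22 and 27 in the queries come from the marker lengths: the first marker is 21 characters long, so the first value starts at position 22, and the other two markers are 27 characters each. A small Python sketch of the same offset arithmetic, with hypothetical names `FIRST`, `SECOND`, `THIRD` and `split_fields`, makes that explicit:

```python
FIRST = "<b>Application</b> : "
SECOND = "<br/><b>Access Level</b> : "
THIRD = "<br/><b>Restrictions</b> : "


def split_fields(value: str) -> dict:
    """Slice the three values out of a string that follows the fixed pattern."""
    second_at = value.find(SECOND)
    third_at = value.find(THIRD)
    if not value.startswith(FIRST) or second_at < 0 or third_at < second_at:
        raise ValueError("value does not follow the expected pattern")
    return {
        "Application": value[len(FIRST):second_at],
        "Access Level": value[second_at + len(SECOND):third_at],
        "Restrictions": value[third_at + len(THIRD):],
    }


text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod."
row = FIRST + text + SECOND + text + THIRD + text
print(len(FIRST), len(SECOND), len(THIRD))
print(split_fields(row))
```

Unlike Python's `len`, SQL Server's `len()` ignores trailing spaces, which appears to be why the cross apply version adds small constants on top of `len(b.FirstString)` and the other marker lengths.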

Padding around tick labels

I'm doing a bar chart.
sns_plot = sns.barplot(years, yields, ax=axes[0, 0])
sns_plot.set_xticklabels(years, rotation=90)
sns_plot.xaxis.set_tick_params(pad=10)
But the labels are too bunched up, i.e. I need more separation between them. How can I do this? The pad=10 seems to push the labels away from the axis rather than separate them from each other.
Even after the labels are rotated to vertical (90 degrees), they are still a bit bunched up. I guess it's a matter of finding the right lever to pull...
Test Data:
In [1]:
import numpy as np
import pandas as pd
import re
x = """Lorem ipsum dolor sit amet, consectetur adipiscing
elit, sed do eiusmod tempor incididunt ut labore et dolore
magna aliqua. Cras nibh turpis, ullamcorper ac lectus vel,
aliquet consectetur odio. Cras vel scelerisque tortor.
Interdum et malesuada fames ac ante ipsum primis in faucibus.
Proin id dignissim ante, a dictum ipsum. Fusce at lacus ac purus
pulvinar dignissim eget a quam. Sed quis mollis ligula, sed
ullamcorper velit. Curabitur vel congue metus. Ut placerat
ipsum non leo posuere, non vestibulum eros posuere.
Donec eu viverra augue, sit amet tempus ex. Vivamus
sit amet tempus ipsum. Fusce consequat, augue a mollis
hendrerit, quam neque dapibus ligula, vitae blandit ipsum
lorem eu mauris. """
x = pd.Series(x.split(' '))
x = x.apply(lambda x: re.sub(r'\W+', '', x))
y = np.random.randn(x.shape[0])
df = pd.DataFrame({'X': x, 'Y': y})
df.head()
Out [1]:
X Y
0 Lorem -0.562246
1 ipsum 1.085094
2 dolor 1.044887
3 sit -1.424002
4 amet -0.87682
Test plot:
In [2]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1, figsize=(9, 6))
ax.plot(df.X, df.Y, 'ks')
ax.tick_params(axis='x', rotation=90)
Out [2]:
To summarize the comments on the original post, there are several ways to declutter the tick labels:
1) Smaller label text: ax.tick_params(axis='x', labelsize=6)
2) Fewer labels:
for label in ax.xaxis.get_ticklabels()[::2]:
    label.set_visible(False)
3) Longer axis: fig.set_size_inches((15, 4))
And so on...
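As a variation on option 2, the thinning can be done on the label list itself before it reaches matplotlib. The helper below is only a sketch; the name `keep_every` is made up, and it keeps every `step`-th label and blanks the rest, instead of hiding every other label as the loop above does.

```python
def keep_every(labels, step=2):
    """Keep the labels at positions 0, step, 2*step, ... and blank the others."""
    return [label if i % step == 0 else "" for i, label in enumerate(labels)]


words = ["Lorem", "ipsum", "dolor", "sit", "amet"]
print(keep_every(words))      # every second label survives
print(keep_every(words, 3))   # every third label survives
```

Assuming the `ax` and `df` from the test plot, calling `ax.set_xticks(range(len(df.X)))` followed by `ax.set_xticklabels(keep_every(list(df.X), 3))` should then show only every third word.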

How to end multi-line pipes in Haml?

I would like to write multiple multi-line paragraphs one after another but am not able to separate them nicely.
The following code results in one paragraph where the other %p is included in the content:
%p Lorem ipsum dolor sit amet, |
consetetur sadipscing elitr, sed diam |
%p Lorem ipsum dolor sit amet, |
consetetur sadipscing elitr, sed diam |
results in:
<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam %p Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam</p>
which is clearly not what I wanted.
The only "solution" I've come up with is:
%p Lorem ipsum dolor sit amet, |
consetetur sadipscing elitr, sed diam |
\
%p Lorem ipsum dolor sit amet, |
consetetur sadipscing elitr, sed diam |
this results in:
<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam</p>
<p>Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam</p>
As you can see the two paragraphs are separated by an empty line, which does not look great, especially when you know that can't be the right way for doing this.
You don't need the pipe for plain text. From the reference:
A substantial portion of any HTML document is its content, which is
plain old text. Any Haml line that’s not interpreted as something else
is taken to be plain text, and passed through unmodified.
If you need it on the same line in HTML you'll need to keep it on the same line in HAML. But browsers normally don't care much about newlines so it would be fine anyway in most cases. Also in this case you need to put the text on a new line after the p like so:
%p
  Lorem ipsum dolor sit amet,
  consetetur sadipscing elitr, sed diam
%p
  Lorem ipsum dolor sit amet,
  consetetur sadipscing elitr, sed diam

Indenting plain text in a .text.haml file in a rails app

In my rails 3.1 app I am working on a text email backup and want it to show up in the email client as text separated into new lines like this:
Lorem ipsum dolor sit amet
Consectetur adipisicing elit
Sed do eiusmod tempor incididunt
Ut labore et dolore magna aliqua
In my .text.haml file I have tried to use this:
:plain
  Lorem ipsum dolor sit amet
  Consectetur adipisicing elit
  Sed do eiusmod tempor incididunt
  Ut labore et dolore magna aliqua
However, when I check it in gmail it appears condensed into one paragraph like this:
Lorem ipsum dolor sit amet Consectetur adipisicing elit Sed do eiusmod tempor incididunt Ut labore et dolore magna aliqua
What can I do to get this to work? When I copy and paste this code into a view file, and view source, it appears as text as I want it in the view source, but gets condensed into one paragraph in the browser. Does this perhaps indicate that gmail is taking the text and formatting it that way and I don't actually have a problem?
I don't have a setup to test this, but I believe you want either :escaped or :preserve (probably the latter).
If those don't work, see http://haml.info/docs/yardoc/file.HAML_REFERENCE.html#filters for others (including info on how to create your own).

inline tag in haml

In html, you can do something like this
<p>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent eget
aliquet odio. Fusce id quam eu augue sollicitudin imperdiet eu ac eros.
<em>Etiam nec nisi lorem</em>, ac venenatis ipsum. In sollicitudin,
lectus eget varius tincidunt, felis sapien porta eros, non
pellentesque dui quam vitae tellus.
</p>
It is nice, because the paragraph of text still looks like a paragraph in the markup. In haml, it looks like this
%p
  Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent eget
  aliquet odio. Fusce id quam eu augue sollicitudin imperdiet eu ac eros.
  %em Etiam nec nisi lorem
  , ac venenatis ipsum. In sollicitudin,
  lectus eget varius tincidunt, felis sapien porta eros, non
  pellentesque dui quam vitae tellus.
Is there any way to totally inline a tag in haml?
Haml excels for structural markup, but it's not really intended for inline markup. Read: Haml Sucks for Content. Just put your inline tags as HTML:
.content
  %p
    Lorem ipsum <em>dolor</em> sit amet.
Or else use a filter:
.content
  :markdown
    Lorem ipsum *dolor* sit amet.
I know this is old, but I figured I'd post this in case anyone lands here. You can also do this sort of thing in Haml (and maybe it's closer to what the OP was looking for?).
%p Here is some text I want to #{content_tag(:em, "emphasize!")}, and here the word #{content_tag(:strong, "BOLD")} is in bold. and #{link_to("click here", "url")} for a link.
This is useful in situations where splitting the markup over multiple lines adds spaces you don't want,
i.e. when you have a link at the end of a sentence and don't want that stupid space between the link and the period (or, as in the OP's example, a space between the emphasized text and the comma).
Just don't get carried away like I did in the example :)
You can inline HTML in any HAML doing
%p!= "Lorem ipsum <em>dolor</em> sit amet"
The != operator evaluates the Ruby expression on its right and outputs the result without HTML-escaping it.
As a hybrid of these nice answers by others, I think you can define a Helper method in your application_helper.rb for some inline markups you'd frequently use. You don't need to mix HTML with HAML, nor do you have to type much.
In your helper:
def em(text)
  content_tag(:em, text)
end
#def em(text)
#  "<em>#{text}</em>".html_safe
#end
In your Haml:
%p
  Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent eget
  aliquet odio. Fusce id quam eu augue sollicitudin imperdiet eu ac eros.
  #{em 'Etiam nec nisi lorem'}, ac venenatis ipsum. In sollicitudin,
  lectus eget varius tincidunt, felis sapien porta eros, non
  pellentesque dui quam vitae tellus.
It's all about indentation:
%p
  Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent eget aliquet odio. Fusce id quam eu augue sollicitudin imperdiet eu ac eros.
  %em
    Etiam nec nisi lorem, ac venenatis ipsum. In sollicitudin, lectus eget varius tincidunt, felis sapien porta eros, non pellentesque dui quam vitae tellus.