Parse JSON string in SQL

I have a column of JSON strings in my table. I'm using SQL Server 2008.
Any ideas on how to parse the JSON string and extract a specific value?
Here's my JSON string:
{"id":1234,"name":"Lasagne al frono","description":"Placerat nisi turpis, dictumst nasceture ete etiam mus nec cum platea tempori zillest. Nisi niglue, augue adipiscing ete dignissim sed mauris augue, eros!","image":"images/EmptyProduct.jpg","cartImage":"images/ArabianCoffee.jpg","basePrice":54.99,"priceAttribute":"itemPrice","attributes":[{"type":"Addons","label":"Side","attributeName":"Side","display":"Small","lookupId":8},{"type":"Addons","label":"Drink","attributeName":"drink","display":"Small","lookupId":5},{"label":"add note","type":"Text","attributeName":"notes","display":"Wide","lookupId":null}]}
I need to extract the value of "name". Any help?

Since SQL Server 2008 has no built-in JSON support, you'd need to parse this manually, which gets complicated.
However, you could always use somebody else's JSON parsing library.
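If parsing in the application layer is an option, a general-purpose JSON library does the work. Below is a minimal sketch in Python, assuming the column value has already been fetched into a string. The name extract_name and the shortened sample value are illustrative only.

```python
import json

# Hypothetical, shortened sample of one value from the JSON column.
row_value = '{"id":1234,"name":"Lasagne al frono","basePrice":54.99,"attributes":[]}'


def extract_name(json_text):
    """Return the top-level "name" value, or None if it cannot be read."""
    try:
        document = json.loads(json_text)
    except json.JSONDecodeError:
        # The column may contain malformed JSON; treat that as "no name".
        return None
    if not isinstance(document, dict):
        return None
    return document.get("name")


print(extract_name(row_value))
```

Returning None for malformed rows keeps a batch job running; raising an error instead would be the better choice if bad rows should stop the import.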

For parsing JSON, you can write a simple CLR function in C# or VB.NET.

SQL Server 2016 and later have built-in JSON support, including the JSON_VALUE and OPENJSON functions:
https://msdn.microsoft.com/en-us/library/dn921897.aspx

Related

How to Detect Question Mark Invalid Character in SQL

I am working in a database that accepts imported files. When the client enters a registered trademark, copyright, or another invalid symbol, the database imports the symbol as an invalid character in the form of a question mark, like the following:
lorem ipsum dolor sit amet, consectetur � lorem ipsum dolor sit amet, consectetur
When printing this character, it appears as such:
lorem ipsum dolor sit amet, consectetur ? lorem ipsum dolor sit amet, consectetur
Is there a way to detect that symbol? A LIKE statement doesn't detect it.
The desired result is to be able to send a warning in a stored procedure that asks the user to check the inserted data to ensure validity.
Note: It is not enough to insert the string into a temp table and then check the temp table for question marks, as a question mark in the string is not uncommon and would create far more false positives than helpful alerts.
Thank you
That special character is NCHAR(65533), but it evades normal pattern matching with LIKE, CHARINDEX, PATINDEX, etc. I did find one way to detect it using TRANSLATE (available in SQL Server 2017 and later): swap the Unicode replacement character for a different Unicode character that can't possibly be in the data already. I picked an 8-pointed star (✵, NCHAR(10037)), but there are many to choose from.
CREATE TABLE dbo.whatever(things nvarchar(32));
INSERT dbo.whatever(things) VALUES
(N'this row is just fine.'),
(N'well, here there is a � rhombus.'),
(N'this row is just fine too.');
SELECT things
FROM dbo.whatever
WHERE TRANSLATE(things, nchar(65533), N'✵') LIKE N'%✵%';
Output:
well, here there is a � rhombus.
Also note the difference between print 'hi � there'; and print N'hi � there'; - don't be lazy, if your string is (or could contain) Unicode, always use the N'prefix'.
As Martin suggests, though, SQL Server can store whatever character is leading to the �. The loss most likely happens because you are treating the file as ASCII, inserting the data into a varchar column, or losing the character somewhere else along the way.
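The same check can also be made before the data reaches the database. The sketch below is in Python and assumes the import passes through application code. The function name and the sample text are hypothetical. It also shows how decoding non-ASCII bytes as ASCII produces the replacement character in the first place.

```python
REPLACEMENT_CHAR = "\ufffd"  # same code point as NCHAR(65533)


def find_replacement_chars(text):
    """Return the index of every U+FFFD in text (empty list if none)."""
    return [i for i, ch in enumerate(text) if ch == REPLACEMENT_CHAR]


# Hypothetical import: UTF-8 bytes wrongly decoded as ASCII.
raw = "Acme® widgets".encode("utf-8")
damaged = raw.decode("ascii", errors="replace")

positions = find_replacement_chars(damaged)
if positions:
    print("Warning: invalid characters at positions", positions)
```

An ordinary question mark is a different code point, so text such as 'why? fine' does not trigger the warning.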

Which approaches exist for quickly surrounding text in IntelliJ IDEA family IDEs?

Basic example
Consider the code below. I took the Pug preprocessor as an example, but it could be any other declarative language like HTML, HAML, etc.
p.
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua.
I need:
p.
Lorem ipsum dolor sit amet, #[+ImprovedUnderline consectetur] adipiscing elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua. #[+ImprovedUnderline ]
The content of the last #[+ImprovedUnderline ] has not been entered yet.
Target
Provide two ways to quickly add the #[+ImprovedUnderline TARGET_WORDS] surrounding without typing it directly:
Before TARGET_WORDS has been typed.
After TARGET_WORDS has been typed (select TARGET_WORDS and surround it with #[+ImprovedUnderline TARGET_WORDS]).
Why Live Templates cannot handle it
Consider the Live Template below:
#[+ImprovedUnderline $SELECTION$]
In version 2020.2, if the line already contains text and we apply the above Live Template with Ctrl + Alt + J, all preceding characters on the line get wrapped.
So Live Templates do not satisfy the first target.
Which other methods does IntelliJ IDEA offer?
IntelliJ IDEA makes it very easy to surround a code block with if, while, for and other statements or to make a code block become part of such statements as try/catch or synchronized. Simply select the code block to surround (don’t forget to use Ctrl + W to increase the current selection) and then press Ctrl + Alt + T (or right-click the selection and select Surround with… from the menu). IntelliJ IDEA will show a list of options to choose from.

Tag content in pdf

I have a PDF that looks like the sample below. I want to tag the paragraph as 'paragraph'. I have searched a lot about this, and there are ways to create a tagged PDF from scratch, or to convert HTML content to a tagged PDF, but I have not had success in tagging an existing PDF.
Given the coordinates, can I tag content in a PDF? In this example, I want to tag the paragraph with a paragraph tag. Thanks.
**A sample pdf**
1. Lorem ipsum dolor sit amet, consectetuer adipiscing elit,
sed diam nonum- my nibh euismod ncidunt ut laoreet dolore magna aliquam erat volutpat.
Ut wisi enim ad minim veniam, quis nostrud exerci taon ullamcorper
sus- cipit lobors nisl ut aliquip ex ea commodo consequat.
PDF is not a WYSIWYG format.
Just because you see a paragraph does not mean a computer program can see it.
In fact, an untagged PDF might look like this (pseudo-pdf-code):
go to location 10, 700
set the active font to Times New Roman
set the fontsize to 12
set the color to black
draw the glyph 'H'
go to coordinate 10, 680
draw the glyphs 'Lorem'
As you can tell from the example, instructions don't need to draw the text in reading order.
So the first challenge you're facing is to identify paragraphs.
I worked at iText, and I've talked to various people at Adobe.
Being able to recognize structure in an untagged PDF document is not considered an easy problem.
Once you do have this structure (to the level of 'these glyphs make up a line' and 'these lines make up a paragraph' etc.), it's a matter of creating a StructureTree.
But since this use case (re-tagging a PDF) was never thought possible, iText (or any other PDF library to my knowledge) isn't really designed to allow you to (easily) do this.
A tag itself is part of a separate data structure inside the PDF.
Tags can have children (for instance to indicate 'this paragraph contains these lines').
A tag itself will reference the objects (groups of instructions) that are part of it.
So you might have:
these instructions (to render a line of text) make up a word and form an object
these word objects are aggregated (by a tag) into a line object
a few line tags are aggregated into a paragraph object
For a thorough understanding, I recommend reading the PDF spec.
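To make the shape of that data structure concrete, here is a small Python sketch of tags that have children and reference groups of instructions. It is only an in-memory model. The class names, the "Line" role, and the fields are hypothetical and are not the iText API or the on-disk PDF representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentRef:
    """Hypothetical pointer to one group of drawing instructions on a page."""
    page: int
    marked_content_id: int


@dataclass
class Tag:
    """Hypothetical structure element: a role, child tags, and content references."""
    role: str
    children: List["Tag"] = field(default_factory=list)
    content: List[ContentRef] = field(default_factory=list)


def collect_content(tag):
    """Return every content reference under a tag, depth-first, in tree order."""
    refs = list(tag.content)
    for child in tag.children:
        refs.extend(collect_content(child))
    return refs


# Two lines, each pointing at instruction groups, aggregated into one paragraph.
line1 = Tag("Line", content=[ContentRef(page=1, marked_content_id=0),
                             ContentRef(page=1, marked_content_id=1)])
line2 = Tag("Line", content=[ContentRef(page=1, marked_content_id=2)])
paragraph = Tag("P", children=[line1, line2])

print([ref.marked_content_id for ref in collect_content(paragraph)])
```

Walking the tree gives the reading order, independent of the order in which the page instructions draw the glyphs.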

Replacing Linefeeds in SQL database

I'm in a bit over my head here.
I have an SQL database, and I'm trying to replace all linefeeds (LF) that are NOT preceded by whitespace with a whitespace + the linefeed. I'm using SQLiteStudio for this. What I have right now is the following:
UPDATE table
SET column = replace( column, '%' + char(10) + '%', ' ' )
When I run the above query, the following data:
<br><strong><font color="2018283286c3">
Lorem ipsum dolor sit amet, consectetur adipiscing[LF]
elit, sed do eiusmod tempor incididunt ut labore et[LF]
<hr size="1px" noshade style="clear:both;margin-top:10px;height:1px;">
... Becomes:
<br><strong><font color="2% %18283286c3">
Lorem ipsum dolor sit amet, consectetur adipiscing[LF]
elit, sed do eiusmod tempor incididunt ut labore et[LF]
<hr size="1px" noshade style="clear:both;margin-top:1% %px;height:1px;">
I have added the [LF]'s in the above for clarity. As can be seen, my query only replaces the zeroes, for some reason, and doesn't match the linefeeds.
What I need to end up with is this:
<br><strong><font color="2018283286c3">
Lorem ipsum dolor sit amet, consectetur adipiscing[WHITESPACE][LF]
elit, sed do eiusmod tempor incididunt ut labore et[WHITESPACE][LF]
<hr size="1px" noshade style="clear:both;margin-top:10px;height:1px;">
... so that only LF's NOT already preceded by a whitespace are matched and replaced with a whitespace + LF. LF's already preceded by a whitespace are left alone, ideally.
Any ideas what I'm doing wrong, or if there is a better method for this? I found the above query online and have tried to tweak it. Not used to working with these things. Thanks for reading!
Not sure if your DB setup supports regular expressions, but if so, you can try to do your search/replace with them. Take a look at this link:
replace a part of a string with REGEXP in sqlite3
Once you get your regexp replace function in place, you can use this as your search pattern:
(?P<mystring>.*\S+)\n$
This will match strings that end with a LF, but no whitespace directly preceding it. You can then use the named group "mystring" to construct the string you want.
You can test/revise your regexp here: https://regex101.com/
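As one possible way to put a regex replace function in place, the sketch below uses Python's sqlite3 module to register a user-defined function and call it from an UPDATE. It assumes the database can be opened from a Python script. The table name pages, the column body, and the function name pad_linefeeds are hypothetical. It uses a lookbehind rather than the named group above, so every linefeed in a value is handled, not only a trailing one.

```python
import re
import sqlite3

# A linefeed that directly follows a non-whitespace character.
LF_WITHOUT_SPACE = re.compile(r"(?<=\S)\n")


def pad_linefeeds(text):
    """Insert a space before each LF that is not already preceded by whitespace."""
    if text is None:
        return None
    return LF_WITHOUT_SPACE.sub(" \n", text)


# In-memory database for the demo; a real script would open the database file.
conn = sqlite3.connect(":memory:")
conn.create_function("pad_linefeeds", 1, pad_linefeeds)

conn.execute("CREATE TABLE pages (body TEXT)")
conn.execute("INSERT INTO pages (body) VALUES (?)",
             ("adipiscing\nlabore et \n<hr>",))

# The registered function is callable from SQL like a built-in.
conn.execute("UPDATE pages SET body = pad_linefeeds(body)")
result = conn.execute("SELECT body FROM pages").fetchone()[0]
print(repr(result))
```

The second linefeed in the sample already follows a space, so it is left untouched.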

Convert URLs to Hyperlinks

I have a table in my database with around 3000 records. One of the columns in this table contains data including URLs. I wish to convert these URLs into Hyperlinks so that when the content is rendered onto a web page, it is an anchor element linking to the URL.
For example the content may be like:
Lorem ipsum http://domain.com dolor sit amet, consectetur adipiscing elit. Cras consequat nisl vitae leo pellentesque tempus et id nunc. Vestibulum varius facilisis fringilla
And I want to change it to:
Lorem ipsum <a href='http://domain.com' target='_blank'>http://domain.com</a> dolor sit amet, consectetur adipiscing elit. Cras consequat nisl vitae leo pellentesque tempus et id nunc. Vestibulum varius facilisis fringilla
I've tried doing:
UPDATE TableA
SET Content=REPLACE(Content, "http://domain.com", "<a href='http://domain.com' target='_blank'>http://domain.com</a>")
But this only works for that one exact URL, whereas I need it to work for any URL starting with http://
Is this possible in SQL Server?
You could use a programming language of your choice, SELECT all entries, manipulate them with a regex that replaces the URLs in each row and UPDATE each row.
If you want to use SQL Server directly, you could try implementing a CLR function on your DB server. The following link explains how to do it:
http://weblogs.sqlteam.com/jeffs/archive/2007/04/27/SQL-2005-Regular-Expression-Replace.aspx
Then you'd use a pattern to match the URLs, like
^http://([a-zA-Z0-9_\-]+)([\.][a-zA-Z0-9_\-]+)+([/][a-zA-Z0-9\~\(\)_\-]*)+([\.][a-zA-Z0-9\(\)_\-]+)*$
(that regex works, but is probably not complete)
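For the first option (fetch the rows, rewrite them in application code, then UPDATE), a minimal Python sketch of the rewriting step could look like the one below. The pattern is a deliberately simpler one than the regex above, and it has no ^ or $ anchors, so it finds URLs embedded in longer text. The name linkify and the pattern are assumptions for illustration, not a complete URL grammar.

```python
import re

# Simplified pattern: "http://" followed by anything up to whitespace, quotes, or angle brackets.
URL_PATTERN = re.compile(r"http://[^\s<>\"']+")


def linkify(text):
    """Wrap every http:// URL in text in an anchor element."""
    def to_anchor(match):
        url = match.group(0)
        # Keep sentence punctuation that directly follows the URL outside the link.
        trimmed = url.rstrip(".,;:!?")
        trailing = url[len(trimmed):]
        return "<a href='{0}' target='_blank'>{0}</a>{1}".format(trimmed, trailing)

    return URL_PATTERN.sub(to_anchor, text)


print(linkify("Lorem ipsum http://domain.com dolor sit amet"))
```

This sketch is not idempotent: running it over content that has already been converted would wrap the URLs inside the existing anchors again, so it should be applied only once per row.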
Yes, it's possible. You will have to parse your string and split off the parts you need. Have a look at the T-SQL character functions, mainly "charindex" and "substring".