How to select the immediate characters BEFORE a specific string in SQL - sql

So imagine I have a SQL tempTable with a text field with 2 pages worth of text in it.
select * from tempTable
pkid | text
0 | This is an example text with some images names like image1.svg or another one like image2.svg
1 | This is another example text image3.svg and several images more like image4 and image5
What I want to know is if it's possible to select the characters before the .svg extension, so that the select result would look like
result
ike image1.svg
ike image2.svg
ext image3.svg
and so on.
I've already read about CHARINDEX and SUBSTRING, but I've only been able to find queries that return ALL text before my filter (.svg).

So I found a way to do it. This is the query I used using PATINDEX().
select pkid, SUBSTRING (text, PATINDEX('%.svg%',text)-60,65)
from tempTable
where text like '%.svg%'
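This way you can either return all the text before the desired word or expression, or get a fixed number of characters before it; you just need to change the SUBSTRING start and length. Note that PATINDEX returns the position of the first match only, so this yields one result per row.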
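The same "N characters before the marker" idea, applied to every occurrence in a row, can be sketched outside the database. This is only an illustration in Python: the function name snippets_before and the 10-character window are assumptions chosen to match the sample output in the question, not part of the answer's query.

```python
import re


def snippets_before(text, marker=".svg", width=10):
    """Return every occurrence of marker with up to `width` characters before it."""
    results = []
    for match in re.finditer(re.escape(marker), text):
        # Clamp at 0 so a marker near the start of the text cannot produce a negative index.
        start = max(0, match.start() - width)
        results.append(text[start:match.end()])
    return results


rows = [
    (0, "This is an example text with some images names like image1.svg or another one like image2.svg"),
    (1, "This is another example text image3.svg and several images more like image4 and image5"),
]

for pkid, text in rows:
    for snippet in snippets_before(text):
        print(pkid, snippet)
```

With the two sample rows this prints "ike image1.svg" and "ike image2.svg" for row 0 and "ext image3.svg" for row 1.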

Here's what I came up with. This uses the string_split and LAG functions in MSSQL, but other database engines have similar features.
--create the table variable (T-SQL table variables are prefixed with @)
DECLARE @temp AS TABLE (
    pkid int,
    text nvarchar(max)
)
--populate the table variable
INSERT INTO @temp (pkid, text) VALUES
(0, 'This is an example text with some images names like image1.svg or another one like image2.svg'),
(1, 'This is another example text image3.svg and several images more like image4 and image5')
--run the query to get the desired results
--note: STRING_SPLIT does not guarantee the order of its output rows,
--so priorValue is only reliable if the fragments happen to come back in order
SELECT
    CONCAT(RIGHT(split.priorValue, 10), '.svg') AS result
FROM (
    SELECT
        ss.value,
        LAG(ss.value, 1) OVER (ORDER BY pkid) AS priorValue
    FROM @temp
    CROSS APPLY string_split(text, '.') ss
) AS split
WHERE split.value LIKE 'svg%'

Related

Get maximum value in a column in sql query if the column is alphanumeric

This is the table which I have by name project and it contains 3 columns:
estimateId
name
projectName
I want to fetch data from SQL database based on maximum value of estimateId
but here estimateid is alphanumeric. How can I achieve this.
I need a SQL query to achieve this:
For example estimateId contains values like:
Elooo1
Elooo2
......
Elooo10
and so on. So how can I achieve this?
Setup Testing Data
DECLARE @tmpTable TABLE (estimateId NVARCHAR(MAX));
INSERT INTO @tmpTable (estimateId) VALUES ('Elooo1'),('Elooo2'),('Elooo3'),('Elooo4'),('Elooo5'),('Elooo6');
Split data based on the pattern
-- Compare the suffix as a number so that 'Elooo10' ranks above 'Elooo9'
SELECT T.prefix AS prefix,
       MAX(TRY_CAST(T.suffix AS int)) AS suffix,
       CONCAT(T.prefix, MAX(TRY_CAST(T.suffix AS int))) AS estimateId
FROM (
    SELECT estimateId,
           LEFT(estimateId, PATINDEX('%[a-zA-Z][^a-zA-Z]%', estimateId)) AS prefix,
           LTRIM(RIGHT(estimateId, LEN(estimateId) - PATINDEX('%[a-zA-Z][^a-zA-Z]%', estimateId))) AS suffix
    FROM @tmpTable
) T
GROUP BY T.prefix
Result
prefix suffix estimateId
Elooo 6 Elooo6
Reference
split alpha and numeric using sql
I only started learning SQL today, so I'm a complete newbie, but I think I can solve your problem. I would do something like this:
SELECT name, projectName FROM project ORDER BY estimateId ASC
or (I think you will need ORDER BY ... DESC)
SELECT name, projectName FROM project ORDER BY estimateId DESC
You seem to be looking to extract the numeric part of the strings. Assuming that the strings have variable length, and that the numbers are always at the end, you can do:
try_cast(
substring(estimateId, patindex('%[0-9]%', estimateId), len(estimateId))
as int
)
This captures everything from the first digit in the string to the end of the string and attempts to convert it to a number (if the conversion fails, try_cast() returns null rather than raising an error).
It is not very clear what you want to use this information for. For example, if you wanted to sort your data accordingly, you would do:
select *
from mytable
order by try_cast(
substring(estimateId, patindex('%[0-9]%', estimateId), len(estimateId))
as int
)
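To see why the numeric cast matters, here is a small Python sketch of the same idea. It is only an illustration: the helper names numeric_suffix and max_estimate_id are made up for this example, and unlike the patindex() expression it looks only at the digits at the very end of the string.

```python
import re


def numeric_suffix(estimate_id):
    """Return the trailing digits of estimate_id as an int, or None if there are none."""
    match = re.search(r"(\d+)$", estimate_id)
    return int(match.group(1)) if match else None


def max_estimate_id(ids):
    """Pick the id with the largest numeric suffix; ids without trailing digits are ignored."""
    candidates = [i for i in ids if numeric_suffix(i) is not None]
    return max(candidates, key=numeric_suffix) if candidates else None


sample = ["Elooo1", "Elooo2", "Elooo9", "Elooo10"]
print(max(sample))              # plain string comparison
print(max_estimate_id(sample))  # comparison on the numeric suffix
```

A plain string comparison picks Elooo9, because '9' sorts after '1'; comparing on the numeric suffix picks Elooo10.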

Redshift - Extract value matching a condition in Array

I have a Redshift table with the following column
How can I extract the value starting by cat_ from this column please (there is only one for each row and at different position in the array)?
I want to get those results:
cat_incident
cat_feature_missing
cat_duplicated_request
Thanks!
There is no easy way to extract multiple values from within one column in SQL (or at least not in the SQL used by Redshift).
You could write a User-Defined Function (UDF) that returns a string containing those values, separated by newlines. Whether this is acceptable depends on what you wish to do with the output (eg JOIN against it).
Another option is to pre-process the data before it is loaded into Redshift, to put this information in a separate one-to-many table, with each value in its own row. It would then be trivial to return this information.
You can do this using a tally table (a table of numbers). See this link for information on how to create one: http://www.sqlservercentral.com/articles/T-SQL/62867/
Here is an example of how you would use it. In real life you should replace the temporary #tally table with a permanent one.
--create sample table with data
create table #a (tags varchar(500));
insert into #a
select 'blah,cat_incident,mcr_close_ticket'
union
select 'blah-blah,cat_feature_missing,cat_duplicated_request';
--create tally table
create table #tally(n int);
insert into #tally
select 1
union select 2
union select 3
union select 4
union select 5
;
--get tags
select * from
(
select TRIM(SPLIT_PART(a.tags, ',', t.n)) AS single_tag
from #tally t
inner join #a a ON t.n <= REGEXP_COUNT(a.tags, ',') + 1 and n<1000
)
where single_tag like 'cat%'
;
Thanks!
In the end I managed to do it with the following query:
SELECT SUBSTRING(SUBSTRING(tags, charindex('cat_', tags), len(tags)), 0, charindex(',', SUBSTRING(tags, charindex('cat_', tags), len(tags)))) tags
FROM table
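One thing to watch with the nested SUBSTRING/charindex approach: as far as I can tell, when the cat_ tag is the last item in the list there is no comma after it, charindex(',') returns 0, and the result comes back empty. The sketch below shows the intended extraction in Python, including that edge case. The function name first_cat_tag and the extra sample strings are assumptions made for illustration.

```python
import re


def first_cat_tag(tags):
    """Return the first comma-separated item that starts with 'cat_', or None."""
    # Require the start of the string or a comma before 'cat_' so that
    # an item such as 'concat_x' is not mistaken for a cat_ tag.
    match = re.search(r"(?:^|,)\s*(cat_[^,]*)", tags)
    return match.group(1).strip() if match else None


print(first_cat_tag("blah,cat_incident,mcr_close_ticket"))
print(first_cat_tag("blah,mcr_close_ticket,cat_feature_missing"))
```

The second call covers the case where the cat_ tag sits at the end of the list.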

Insert data with a SUBSTRING in the SELECT

How do I INSERT data with a SUBSTRING in the SELECT?
I tried the following, but it didn't work:
INSERT INTO PHONE(
ID_CLIENT,
COD,
PHONE)
SELECT (ID_CLI,
SUBSTRING(PHONE,0,3), '',
SUBSTRING(PHONE,3,9), '')
FROM [dbo].TABLE_1
WHERE ID_PHONE = 2
What am I doing wrong?
INSERT INTO PHONE(
ID_CLIENT,
COD,
PHONE)
SELECT ID_CLI,
SUBSTRING(PHONE,0,3),
SUBSTRING(PHONE,3,9)
FROM [dbo].TABLE_1
WHERE ID_PHONE = 2
SQL is not like some other languages. String positions do not start at zero, as they do in Python and similar languages; they start at 1. I noticed that your SUBSTRING() calls use 0 as the starting position. That would be correct in some languages, but in SQL the first character is at position 1. Also note that the third argument is a length, not an end position. Finally, the parenthesis you opened after SELECT, before the ID_CLI column, is never closed; a SELECT list does not need parentheses at all, so it is simplest to remove it. I am by no means a professional with this language, but I hope my input is helpful. I believe this may work (again, I am a novice with SQL, so forgive me if it is not correct):
INSERT INTO PHONE(
ID_CLIENT,
COD,
PHONE)
SELECT ID_CLI,
SUBSTRING(PHONE,1,3),
SUBSTRING(PHONE,4,9)
FROM [dbo].TABLE_1
WHERE ID_PHONE = 2
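For anyone used to zero-based slicing, the following Python sketch emulates how T-SQL's SUBSTRING(expression, start, length) behaves as I understand it: positions are 1-based, the third argument is a length, and a start below 1 eats into that length. The function name tsql_substring and the sample phone number are assumptions for illustration only.

```python
def tsql_substring(s, start, length):
    """Emulate T-SQL SUBSTRING(s, start, length) with 1-based positions."""
    if length < 0:
        raise ValueError("length must not be negative")
    end = start + length - 1  # 1-based position of the last character requested
    begin = max(start, 1)     # positions before 1 do not exist, so clamp the start
    if end < begin:
        return ""
    return s[begin - 1:end]   # convert to Python's 0-based, end-exclusive slice


phone = "5551234567"
print(tsql_substring(phone, 0, 3))  # start 0: one of the three positions is wasted
print(tsql_substring(phone, 1, 3))  # first three characters
print(tsql_substring(phone, 4, 9))  # from the fourth character, up to nine characters
```

With this sample value the three calls give "55", "555" and "1234567".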
It appears you are selecting five columns and trying to insert them into three columns of another table. Make sure the number of columns matches. If you are trying to concatenate the substrings with empty strings, you would need to use the CONCAT function for that.
You can do something like this:
WITH CteData
AS
(
SELECT ID_CLI,SUBSTRING(PHONE,0,3) AS COD,SUBSTRING(PHONE,3,9) AS PHONE
FROM [dbo].TABLE_1
WHERE ID_PHONE = 2
)
INSERT INTO PHONE(ID_CLIENT,COD,PHONE)
SELECT CteData.ID_CLI,CteData.COD,CteData.PHONE FROM CteData
But first you need to remove the empty strings ('') from the SELECT list; they make the SELECT return more columns than the INSERT expects. I used a CTE just to make the code more readable.

Get each <tag> in String - stackexchange database

Mockup code for my problem:
SELECT Id FROM Tags WHERE TagName IN '<osx><keyboard><security><screen-lock>'
The problem in detail
I am trying to get tags used in 2011 from apple.stackexchange data. (this query)
As you can see, tags in tag changes are stored as plain text in the Text field.
<tag1><tag2><tag3>
<osx><keyboard><security><screen-lock>
How can I create a unique list of the tags, to look them up in the Tags table, instead of this hardcoded version:
SELECT * FROM Tags
WHERE TagName = 'osx'
OR TagName = 'keyboard'
OR TagName = 'security'
Here is an interactive example.
Stackexchange uses T-SQL, my local copy is running under postgresql using Postgres app version 9.4.5.0.
Assuming this table definition:
CREATE TABLE posthistory(post_id int PRIMARY KEY, tags text);
Depending on what you want exactly:
To convert the string to an array, trim leading and trailing '<>', then treat '><' as separator:
SELECT *, string_to_array(trim(tags, '><'), '><') AS tag_arr
FROM posthistory;
To get list of unique tags for whole table (I guess you want this):
SELECT DISTINCT tag
FROM posthistory, unnest(string_to_array(trim(tags, '><'), '><')) tag;
The implicit LATERAL join requires Postgres 9.3 or later.
This should be substantially faster than using regular expressions. If you want to try regexp, use regexp_split_to_table() instead of regexp_split_to_array() followed by unnest() like suggested in another answer:
SELECT DISTINCT tag
FROM posthistory, regexp_split_to_table(trim(tags, '><'), '><') tag;
Also with implicit LATERAL join. Related:
Split column into multiple rows in Postgres
What is the difference between LATERAL and a subquery in PostgreSQL?
To search for particular tags:
SELECT *
FROM posthistory
WHERE tags LIKE '%<security>%'
AND tags LIKE '%<osx>%';
SQL Fiddle.
Applied to your search in T-SQL in our data explorer:
SELECT TOP 100
PostId, UserId, Text AS Tags FROM PostHistory
WHERE year(CreationDate) = 2011
AND PostHistoryTypeId IN (3 -- initial tags
, 6 -- edit tags
, 9) -- rollback tags
AND Text LIKE ('%<' + ##TagName:String?postgresql## + '>%');
(T-SQL syntax uses the non-standard + instead of ||.)
https://data.stackexchange.com/apple/query/edit/417055
I've simplified the data to the relevant column only and called it tags to present the example.
Sample data
create table posthistory(tags text);
insert into posthistory values
('<lion><backup><time-machine>'),
('<spotlight><alfred><photo-booth>'),
('<lion><pdf><preview>'),
('<pdf>'),
('<asd>');
Query to get unique list of tags
SELECT DISTINCT
unnest(
regexp_split_to_array(
trim('><' from tags), '><'
)
)
FROM
posthistory
First we remove all leading and trailing > and < characters from each row, then use the regexp_split_to_array() function to split the values into arrays, and then unnest() to expand each array into a set of rows. Finally, DISTINCT eliminates duplicate values.
Presenting SQLFiddle to preview how it works.
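The trim-then-split logic is easy to check outside the database. Here is a minimal Python sketch of the same steps, run against the sample rows above; the function names split_tags and unique_tags are made up for this illustration.

```python
def split_tags(tags):
    """Strip the outer '<' and '>' and split on '><', like trim + split in the queries."""
    inner = tags.strip("<>")
    return inner.split("><") if inner else []


def unique_tags(rows):
    """Collect the distinct tags across all rows, sorted for stable output."""
    seen = set()
    for row in rows:
        seen.update(split_tags(row))
    return sorted(seen)


sample_rows = [
    "<lion><backup><time-machine>",
    "<spotlight><alfred><photo-booth>",
    "<lion><pdf><preview>",
    "<pdf>",
    "<asd>",
]

print(unique_tags(sample_rows))
```

The sorting is only there to make the output predictable; the SQL queries with DISTINCT return the tags in no particular order.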

How to combine IN operator with LIKE condition (or best way to get comparable results)

I need to select rows where a field begins with one of several different prefixes:
select * from table
where field like 'ab%'
or field like 'cd%'
or field like 'ef%'
or...
What is the best way to do this using SQL in Oracle or SQL Server? I'm looking for something like the following statements (which are incorrect):
select * from table where field like in ('ab%', 'cd%', 'ef%', ...)
or
select * from table where field like in (select foo from bar)
EDIT:
I would like to see how this is done either by giving all the prefixes in one SELECT statement or by having all the prefixes stored in a helper table.
Length of the prefixes is not fixed.
Joining your prefix table with your actual table would work in both SQL Server & Oracle.
DECLARE @Table TABLE (field VARCHAR(32))
DECLARE @Prefixes TABLE (prefix VARCHAR(32))
INSERT INTO @Table VALUES ('ABC')
INSERT INTO @Table VALUES ('DEF')
INSERT INTO @Table VALUES ('ABDEF')
INSERT INTO @Table VALUES ('DEFAB')
INSERT INTO @Table VALUES ('EFABD')
INSERT INTO @Prefixes VALUES ('AB%')
INSERT INTO @Prefixes VALUES ('DE%')
SELECT t.*
FROM @Table t
INNER JOIN @Prefixes pf ON t.field LIKE pf.prefix
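If you want to try the join-on-LIKE idea without a SQL Server or Oracle instance, the sketch below runs the same kind of join in an in-memory SQLite database through Python's standard sqlite3 module. The table names data and prefixes and the function name rows_matching_prefixes are assumptions for this example. DISTINCT is added because a row that matches more than one prefix would otherwise be returned once per match.

```python
import sqlite3


def rows_matching_prefixes(values, patterns):
    """Return the values that match at least one LIKE pattern, using a join."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data (field TEXT)")
        conn.execute("CREATE TABLE prefixes (prefix TEXT)")
        conn.executemany("INSERT INTO data VALUES (?)", [(v,) for v in values])
        conn.executemany("INSERT INTO prefixes VALUES (?)", [(p,) for p in patterns])
        rows = conn.execute(
            "SELECT DISTINCT d.field "
            "FROM data d "
            "JOIN prefixes p ON d.field LIKE p.prefix "
            "ORDER BY d.field"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


print(rows_matching_prefixes(["ABC", "DEF", "ABDEF", "DEFAB", "EFABD"], ["AB%", "DE%"]))
```

Only EFABD is filtered out here, since it starts with neither AB nor DE.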
You can try a regular expression (REGEXP_LIKE is available in Oracle):
SELECT * from table where REGEXP_LIKE ( field, '^(ab|cd|ef)' );
If your prefix is always two characters, could you not just use the SUBSTRING() function to get the first two characters of "field", and then see if it's in the list of prefixes?
select * from table
where SUBSTRING(field, 1, 2) IN (prefix1, prefix2, prefix3...)
That would be "best" in terms of simplicity, if not performance. Performance-wise, you could create an indexed virtual column that generates your prefix from "field", and then use the virtual column in your predicate.
Depending on the size of the dataset, the REGEXP solution may or may not be the right answer. If you're trying to get a small slice of a big dataset,
select * from table
where field like 'ab%'
or field like 'cd%'
or field like 'ef%'
or...
may be rewritten behind the scenes as
select * from table
where field like 'ab%'
union all
select * from table
where field like 'cd%'
union all
select * from table
where field like 'ef%'
That is, the database does three index scans instead of one full scan.
If you know you're only going after the first two characters, creating a function-based index could be a good solution as well. If you really really need to optimize this, use a global temporary table to store the values of interest, and perform a semi-join between them:
select * from data_table
where transform(field) in (select pre_transformed_field
from my_where_clause_table);
You can also try it like this. Here tmp is a derived table populated with the required prefixes. It's a simple approach, and it does the job.
select * from emp join
(select 'ab%' as Prefix
union
select 'cd%' as Prefix
union
select 'ef%' as Prefix) tmp
on emp.Name like tmp.Prefix