substring query - sql

I want to extract a substring from a cell value, as in the following example:
Input: "J.H.Ambani.School"-----------School
Output: "H.Ambani"-----------------MidName
That is, all the text that comes between the first and the last dots. The string can be any length and contain any number of dots. I am trying to write a query that takes the input column "School" and produces the output column "MidName". What would the SQL query be?

For Oracle Database:
SELECT
REGEXP_REPLACE(yourColumn, '^[^.]*\.|\.[^.]*$', '') AS yourAlias
FROM yourTable
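To check the pattern outside the database, here is a minimal Python sketch of the same substitution. It assumes the dots in the pattern are meant literally (escaped as \.), and the helper name mid_name is hypothetical, not part of the answer above.

```python
import re

# Strip everything up to and including the first dot, and everything
# from the last dot to the end. Both dots are escaped so they match
# a literal '.' rather than "any character".
_EDGES = re.compile(r"^[^.]*\.|\.[^.]*$")


def mid_name(value: str) -> str:
    """Return the text between the first and the last dot (hypothetical helper)."""
    return _EDGES.sub("", value)


print(mid_name("J.H.Ambani.School"))  # H.Ambani
```

Inputs with fewer than two dots are not covered by the question, so the sketch makes no promise about them.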

If I have understood your problem correctly from your statement
"That is all the text that comes between the first and the last dots", then the solution is given below. It works in SQL Server; I could not check other databases for lack of time.
@SourceString : this is your input
@DestinationString : this is your output
declare @SourceString varchar(100)='J.H.Ambani.School'
declare @DestinationString varchar(100)
;with result as
(
select ROW_NUMBER()over (order by (select 100))SNO,d from(
select t.c.value('.','varchar(100)')as d from
(select cast('<a>'+replace(@SourceString,'.','</a><a>')+'</a>' as xml)data)as A cross apply data.nodes('/a') as t(c))B
)
select @DestinationString=COALESCE(@DestinationString+'.','')+ISNULL(d,'') from result where SNO>(select top 1 SNO from result order by SNO)
and SNO<(select top 1 SNO from result order by SNO desc)
select @DestinationString
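The query above splits the string on dots, drops the first and last pieces, and joins the rest back together. A rough Python sketch of that same idea, with a hypothetical function name, might look like this:

```python
def between_first_and_last_dot(source: str) -> str:
    """Split on '.', discard the first and last pieces, rejoin the rest."""
    parts = source.split(".")      # one element per dot-separated piece
    return ".".join(parts[1:-1])   # the slice drops the first and last piece


print(between_first_and_last_dot("J.H.Ambani.School"))  # H.Ambani
```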

Related

How to generate control number for every input "CN202201" in sql? [duplicate]

I am using SQL Server 2014 and want to select a column in a table with the row number concatenated to the column value in the result set.
For example:
DemoField
---------
Apple
Ball
Cat
Should return this result set:
DemoField
---------
Row1 Apple
Row2 Ball
Row3 Cat
I went through a few similar questions where ROW_NUMBER() is used, but I find that it is selected as a separate column and not concatenated to an existing column being returned.
When I try to concatenate the ROW_NUMBER() to the column, I get an error:
Error converting data type varchar to bigint.
Please let me know.
Thanks
If 2012+ you can use concat()
Example
Declare @YourTable Table ([DemoField] varchar(50))
Insert Into @YourTable Values
('Apple')
,('Ball')
,('Cat')
Select concat('Row',Row_Number() over(Order By DemoField),' ',DemoField)
from @YourTable
Returns
(No column name)
Row1 Apple
Row2 Ball
Row3 Cat
This is just basic ROW_NUMBER with some concatenation. Seems the desired output is pretty strange but the concept is simple.
select DemoField
from
(
select DemoField = 'Row' + convert(varchar(4), ROW_NUMBER() over (order by DemoField)) + ' ' + DemoField
from YourTable
) x
This happens because you're trying to add a number to a string, so SQL Server tries to convert the string to a number instead. You have to CAST the row number to VARCHAR, i.e.
'Row' + CAST(ROW_NUMBER() OVER (ORDER BY DemoField) AS VARCHAR)
If you want to get row number concatenated to any column, check the other answers. But IMO, that is really odd.
If you want to get row number in a separate column,
you can use ROW_NUMBER:
SELECT *, ROW_NUMBER() OVER(ORDER BY some_column) as RowNumber FROM YourTable
You can use the following -
SELECT CAST(ID AS NVARCHAR)+' '+CAST(ColName AS NVARCHAR)
FROM tb_Table
You can also use CONCAT(), together with CAST() to convert the row number to a character type:
SELECT CONCAT('Row',CAST(ROW_NUMBER() OVER (ORDER BY demofield) AS VARCHAR),' ', demofield)
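To try the cast-then-concatenate idea without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module. It is an approximation, not the T-SQL above: SQLite concatenates with || rather than +, calls the type TEXT, and needs version 3.25 or later for ROW_NUMBER(). The function name numbered_rows is hypothetical.

```python
import sqlite3


def numbered_rows(values):
    """Prefix each value with 'Row<n> ', numbering rows in DemoField order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (DemoField TEXT)")
    conn.executemany("INSERT INTO YourTable VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT 'Row'
               || CAST(ROW_NUMBER() OVER (ORDER BY DemoField) AS TEXT)
               || ' ' || DemoField
        FROM YourTable
        ORDER BY DemoField
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(numbered_rows(["Apple", "Ball", "Cat"]))
```

SQLite would convert the number implicitly here; the explicit CAST is kept only to mirror the SQL Server answers.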

How to select the best item in each group?

I have a table reports:
id  file_name
--  ---------
1   jan.xml
2   jan.csv
3   feb.csv
In plain language: there are reports for each month. Each report can be in XML or CSV format. There can be 1-2 reports for each month, each in a different format.
I want to select the reports for all months, picking only 1 file for each month. The XML format is preferred.
So, expected output is:
id  file_name
--  ---------
1   jan.xml
3   feb.csv
Explanation: the file jan.csv was excluded since there is a preferred report for that month: jan.xml.
As mentioned in the comments, your data structure has a number of challenges. It really needs a column such as ReportDate, of type date/datetime, so you know which month the report belongs to. That would also give you something to sort by when you get your data back. Aside from those much-needed improvements, you can get the desired results from your sample data with something like this.
create table SomeFileTable
(
id int
, file_name varchar(10)
)
insert SomeFileTable
select 1, 'jan.xml' union all
select 2, 'jan.csv' union all
select 3, 'feb.csv'
select s.id
, s.file_name
from
(
select *
, FileName = parsename(file_name, 2)
, FileExtension = parsename(file_name, 1)
, RowNum = ROW_NUMBER() over(partition by parsename(file_name, 2) order by case parsename(file_name, 1) when 'xml' then 1 else 2 end)
from SomeFileTable
) s
where s.RowNum = 1
--ideally you would want to order the results but you don't have much of anything to work with in your data as a reliable sorting order since the dates are implied by the file name
You may want to use a window function that ranks your rows by partitioning on the month and ordering by the format name, by working on the file_name field.
WITH ranked_reports AS (
SELECT
id,
file_name,
ROW_NUMBER() OVER(
PARTITION BY LEFT(file_name, 3)
ORDER BY RIGHT(file_name, 3) DESC
) AS rank
FROM
reports
)
SELECT
id,
file_name
FROM
ranked_reports
WHERE
rank = 1
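The partition-and-rank approach can be tried locally with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite 3.25+ for window functions, substr() standing in for LEFT(), and a CASE expression that puts .xml files first. The function name preferred_reports is hypothetical.

```python
import sqlite3


def preferred_reports(rows):
    """Keep one report per month (first 3 characters), preferring .xml files."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reports (id INTEGER, file_name TEXT)")
    conn.executemany("INSERT INTO reports VALUES (?, ?)", rows)
    picked = conn.execute(
        """
        WITH ranked AS (
            SELECT id,
                   file_name,
                   ROW_NUMBER() OVER (
                       PARTITION BY substr(file_name, 1, 3)
                       ORDER BY CASE WHEN file_name LIKE '%.xml' THEN 1 ELSE 2 END
                   ) AS rn
            FROM reports
        )
        SELECT id, file_name FROM ranked WHERE rn = 1 ORDER BY id
        """
    ).fetchall()
    conn.close()
    return picked


print(preferred_reports([(1, "jan.xml"), (2, "jan.csv"), (3, "feb.csv")]))
```

Ranking with an explicit CASE rather than sorting the extension text keeps the preference readable if more formats are added later.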

Specific string matching

I am working in SQL Server 2012. In my table, there is a column called St_Num and its data is like this:
St_Num status
------------------------------
128 TIMBER RUN DR EXP
128 TIMBER RUN DRIVE EXP
Notice that there are spelling variations in the data above. If the number (128 in this case) and the first 3 letters in the St_Num column are the same, both rows should be considered the same, so the output should be:
St_Num status
-----------------------------
128 TIMBER RUN DR EXP
I searched for this and found that the LEFT or SUBSTRING function could be handy here, but I have no idea how to use them to get what I need, or whether they can solve my issue at all. Any help on how to get the desired output would be great.
This will output only the first of the matching rows:
with cte as (
select *,
row_number() over (order by (select null)) rn
from tablename
)
select St_Num, status from cte t
where not exists (
select 1 from cte
where
left(St_Num, 7) = left(t.St_Num, 7)
and
rn < t.rn
)
This could be done with a subquery, in the same way that you would eliminate duplicates in a table:
SELECT St_Num, status
FROM <your_table> a
WHERE NOT EXISTS (SELECT 1
FROM <your_table> b
WHERE SUBSTRING(b.St_Num, 1, 7) = SUBSTRING(a.St_Num, 1, 7)
AND b.St_Num < a.St_Num);
However, this only works if the number is guaranteed to be 3 characters long, or if you don't mind it comparing more characters of the street name when the number is shorter.
You can use grouping by status and substring(St_Num,1,3)
with t(St_Num, status) as
(
select '128 TIMBER RUN DR' ,'EXP' union all
select '128 TIMBER RUN DRIVE','EXP'
)
select min(St_Num) as St_Num, status
from t
group by status, substring(St_Num,1,3);
St_Num status
----------------- ------
128 TIMBER RUN DR EXP
I don't really approve of your matching logic, but that is not your question. The main issue is how long the number before the street name is. You can get the shortest of the matching addresses using:
select distinct t.*
from t
where not exists (select 1
from t t2
where left(t2.st_num, patindex('%[a-zA-Z]%', t2.st_num) + 2) = left(t.st_num, patindex('%[a-zA-Z]%', t.st_num) + 2) and
len(t2.St_Num) < len(t.St_Num)
);
I still have an odd feeling that your criteria are not enough to match the same addresses, but this might help, since it also takes the length of the number into account:
WITH ParsedAddresses(st_num, status, number)
AS
(
SELECT st_num,
status,
number = ROW_NUMBER() OVER(PARTITION BY LEFT(st_num, CHARINDEX(' ', st_num) + 3) ORDER BY LEN(st_num))
FROM <table_name>
)
SELECT st_num, status FROM ParsedAddresses
WHERE number = 1
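To experiment with the matching key (the number, the space, and the first three letters), here is a sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25+ for window functions and uses instr(), substr(), and length() in place of CHARINDEX, LEFT, and LEN. The table name addresses and the function name are hypothetical.

```python
import sqlite3


def dedupe_addresses(rows):
    """Keep the shortest St_Num per (number + first 3 letters) key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE addresses (st_num TEXT, status TEXT)")
    conn.executemany("INSERT INTO addresses VALUES (?, ?)", rows)
    kept = conn.execute(
        """
        WITH parsed AS (
            SELECT st_num,
                   status,
                   ROW_NUMBER() OVER (
                       -- key: everything up to the first space, plus 3 letters
                       PARTITION BY substr(st_num, 1, instr(st_num, ' ') + 3)
                       ORDER BY length(st_num)
                   ) AS rn
            FROM addresses
        )
        SELECT st_num, status FROM parsed WHERE rn = 1 ORDER BY st_num
        """
    ).fetchall()
    conn.close()
    return kept


sample = [
    ("128 TIMBER RUN DR", "EXP"),
    ("128 TIMBER RUN DRIVE", "EXP"),
    ("5 OAK ST", "EXP"),
]
print(dedupe_addresses(sample))
```

Because the key length depends on where the first space is, house numbers of different lengths are handled without hard-coding 7 characters.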

Access SQL Query from another Query

I have a little problem with a piece of SQL code. I have a table Paiements_17_18 and I would like to create a single-line query that calculates:
the total of the Amount field,
the first date of Date_Regulation field,
the last date of Date_Regulation field,
the number of distinct values of the N_Facture field.
All this from a subquery of the form SELECT TOP n FROM ....
I tried this:
SELECT Sum(P.Montant) AS TotalMontant,
First(P.Date_Regulation) AS PremièreDate,
Last(P.Date_Regulation) AS DernièreDate,
First(P.N_Facture) AS PremièreFacture,
Last(P.N_Facture) AS DernièreFacture,
(SELECT Count(N_Facture)
FROM (SELECT DISTINCT N_Facture FROM Paiements_17_18)) AS NombreFactures
FROM (SELECT TOP 5 Paiements_17_18.*
FROM Paiements_17_18
ORDER BY Paiements_17_18.ID_Paiement DESC) AS P;
But I get an error about "P":
(The Microsoft Access database engine cannot find the input table or
query "P". Make sure it exists and that its name is spelled
correctly)
Can you help me please?
The two lines that generate the NombreFactures field are causing the error:
(SELECT Count(N_Facture)
FROM (SELECT DISTINCT N_Facture FROM Paiements_17_18)) AS
NombreFactures
I replaced those two lines, giving the inner subquery an alias (n). See below.
SELECT Sum(P.Montant) AS TotalMontant,
First(P.Date_Regulation) AS PremièreDate,
Last(P.Date_Regulation) AS DernièreDate,
First(P.N_Facture) AS PremièreFacture,
Last(P.N_Facture) AS DernièreFacture,
(SELECT Count(n.N_Facture_distinct)
FROM (SELECT DISTINCT N_Facture as N_facture_distinct FROM Paiements_17_18 ) AS n)
AS NombreFacture
FROM (SELECT TOP 5 Paiements_17_18.*
FROM Paiements_17_18
ORDER BY Paiements_17_18.ID_Paiement DESC) AS P;

How to split and display distinct letters from a word in SQL?

Yesterday in a job interview I was asked this question and I had no clue about it. Suppose I have the word "Manhattan" and I want to display only the letters 'M','A','N','H','T' in SQL. How can I do it?
Any help is appreciated.
Well, here is my solution. It aims to use "relational SQL" operations, which may be what the interviewer was going for conceptually.
Most of the work goes into turning the string into a set of (pos, letter) records; the final query is then a plain SELECT with grouping and ordering applied.
select letter
from (
-- All of this just to get a set of (pos, letter)
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
-- Or use another form to create a "numbers table"
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
) as pairs
group by letter -- guarantees distinctness
order by min(pos) -- ensure output is ordered MANHT
The above query works in SQL Server 2008, but the "numbers table" may have to be altered for other vendors. Otherwise, nothing used is vendor-specific: no CTE, no cross application of a function, no procedural language code.
That being said, the above is meant to show a conceptual approach. SQL is designed for use with sets and relations and multiplicity across records; the above example is, in some sense, merely a perversion of such.
Examining the intermediate relation,
select ns.n as pos, substring(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s) as ss
cross join (
select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9)) as X(n)
) as ns
uses a cross join to generate the Cartesian product of the string (1 row) with the numbers (9 rows); the substring function is then applied to the string and each number to obtain the character at that position. The resulting set contains the records:
POS LETTER
1 M
2 A
3 N
..
9 N
Then the outer select groups the records by letter, and the resulting groups are ordered by the minimum (first) position at which each letter occurs. (Without the ORDER BY, the letters would still be distinct, but their final order would not be guaranteed.)
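The numbers-table approach can be reproduced with Python's built-in sqlite3 module. This is a sketch, not the SQL Server query itself: SQLite has no column-list alias for a VALUES derived table, so the numbers live in a small real table named numbers, which is an assumption of this example, as is the function name.

```python
import sqlite3


def distinct_letters(word):
    """Return the distinct characters of word in order of first occurrence."""
    conn = sqlite3.connect(":memory:")
    # Numbers table: one row per character position in the word.
    conn.execute("CREATE TABLE numbers (n INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO numbers (n) VALUES (?)",
        [(i,) for i in range(1, len(word) + 1)],
    )
    rows = conn.execute(
        """
        SELECT substr(ss.s, ns.n, 1) AS letter
        FROM (SELECT ? AS s) AS ss
        CROSS JOIN numbers AS ns
        GROUP BY substr(ss.s, ns.n, 1)   -- guarantees distinctness
        ORDER BY min(ns.n)               -- keeps first-occurrence order
        """,
        (word,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(distinct_letters("MANHATTAN"))
```

SQLite compares text case-sensitively by default, so the sketch passes the word in upper case, as the answer does.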
One way (if using SQL Server) is with a recursive CTE (Common Table Expression).
DECLARE @source nvarchar(100) = 'MANHATTAN'
;
WITH cte AS (
SELECT SUBSTRING(@source, 1, 1) AS c1, 1 as Pos
WHERE LEN(@source) > 0
UNION ALL
SELECT SUBSTRING(@source, Pos + 1, 1) AS c1, Pos + 1 as Pos
FROM cte
WHERE Pos < LEN(@source)
)
SELECT DISTINCT c1 from cte
I had to inline @source to run this on SqlFiddle, but the code above works fine in SQL Server.
The first SELECT generates the initial row (in this case 'M', 1). The second SELECT is the recursive part that generates the subsequent rows, with the Pos column incremented each time until the condition WHERE Pos < LEN(@source) no longer holds. The final select removes the duplicates. Internally, SELECT DISTINCT sorts the rows in order to facilitate the removal of duplicates, which is why the final output happens to be in alphabetic order. Since you didn't specify order as a requirement, I left it as-is. But you could modify it to use a GROUP BY instead, ordered on MIN(Pos), if you needed the output in the characters' original order.
This same technique can be used for things like generating all the Bigrams for a string, with just a small change to the general structure above.
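The GROUP BY variant mentioned above, ordered on MIN(Pos), can be sketched with Python's built-in sqlite3 module. SQLite spells it WITH RECURSIVE and uses substr() and length() instead of SUBSTRING and LEN; the function name is hypothetical.

```python
import sqlite3


def distinct_letters_recursive(source):
    """Walk the string with a recursive CTE, then group to remove duplicates."""
    conn = sqlite3.connect(":memory:")
    rows = conn.execute(
        """
        WITH RECURSIVE cte(c1, pos) AS (
            -- anchor row: first character, only if the string is non-empty
            SELECT substr(:src, 1, 1), 1
            WHERE length(:src) > 0
            UNION ALL
            -- recursive step: next character until the end of the string
            SELECT substr(:src, pos + 1, 1), pos + 1
            FROM cte
            WHERE pos < length(:src)
        )
        SELECT c1
        FROM cte
        GROUP BY c1
        ORDER BY min(pos)
        """,
        {"src": source},
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(distinct_letters_recursive("MANHATTAN"))
```

Grouping on the character and ordering by its smallest position keeps the original letter order, which SELECT DISTINCT alone does not promise.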
declare @charr varchar(99)
declare @lp int
set @charr='Manhattan'
set @lp=1
DECLARE @T1 TABLE (
FLD VARCHAR(max)
)
while(@lp<=LEN(@charr))
begin
if(not exists(select * from @T1 where FLD=(select SUBSTRING(@charr,@lp,1))))
begin
insert into @T1
select SUBSTRING(@charr,@lp,1)
end
set @lp=@lp+1
end
select * from @T1
Check this; it may help you.
Here's an Oracle version of @user2864740's answer. The only difference is how you construct the "numbers table" (plus slight differences in aliasing).
select letter
from (
select ns.n as pos, substr(ss.s, ns.n, 1) as letter
from (select 'MANHATTAN' as s from dual) ss
cross join (
SELECT LEVEL as n
FROM DUAL
CONNECT BY LEVEL <= 9
ORDER BY LEVEL) ns
) pairs
group by letter
order by min(pos)