I need to return matches that are within a range of serial numbers, but the prefix and suffix need to be removed first.
I.e. I need to search between the serial numbers below, where only the middle part is sequential.
G4A41103801702 - G4A41113171702
G4A [4110380] 1702 - G4A [4111317] 1702
I need to exclude the first 3 and last 4 characters and then search between
4110380-4111317
thanks
baton, try a variation of the query below:
Replace id with the column you want to select.
Replace tablename with your actual table name.
Assumes serialnumber is the name of the column with the serial to be queried against.
Assumes the length of the serial number is constant.
SELECT id FROM tablename WHERE
CAST(SUBSTRING(serialnumber, 4, 7) as int) >= 4110380 AND
CAST(SUBSTRING(serialnumber, 4, 7) as int) <= 4111317
As @ADyson mentioned, this will not utilize an index; for a more performant query, extract this number into a separate indexed column. Hope this helps!
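A minimal sketch of the "separate indexed column" idea, assuming SQL Server and the placeholder names tablename, id and serialnumber from the answer above. The column name serial_seq and the index name ix_tablename_serial_seq are hypothetical, and the CAST assumes characters 4 to 10 of every serial are digits:

```sql
-- Assumption: SQL Server. serial_seq and ix_tablename_serial_seq are made-up names.
-- GO is the batch separator used by SSMS/sqlcmd; the new column must exist
-- before later batches can reference it.
ALTER TABLE tablename
    ADD serial_seq AS CAST(SUBSTRING(serialnumber, 4, 7) AS int) PERSISTED;
GO

-- Index the extracted number so range predicates can seek instead of scan.
CREATE INDEX ix_tablename_serial_seq ON tablename (serial_seq);
GO

-- The range search now targets the indexed column directly.
SELECT id
FROM tablename
WHERE serial_seq BETWEEN 4110380 AND 4111317;
```

If some serials might not be numeric in the middle, the ALTER TABLE would fail on those rows, so they would need to be cleaned up or handled first.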
If the length is not constant, you can apply the reverse function twice to exclude the first 3 and last 4 characters:
SELECT id FROM tablename WHERE
substring(reverse(substring(reverse(serialnumber),5)),4) >= 4110380 AND
substring(reverse(substring(reverse(serialnumber),5)),4) <= 4111317
test :
substring(reverse(substring(reverse('G4A41103801702'),5)),4) ==> '4110380'
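On SQL Server specifically, SUBSTRING requires a length argument, so the two-argument form above will not run there. A possible alternative, sketched under the assumption that the prefix is always 3 characters and the suffix always 4, is to compute the middle length with LEN. The table variable @serials and its sample rows are made up for illustration, and TRY_CAST needs SQL Server 2012 or later:

```sql
-- Hypothetical sample data; only the first and third rows fall in the range.
DECLARE @serials TABLE (serialnumber varchar(50));
INSERT INTO @serials (serialnumber)
VALUES ('G4A41103801702'), ('G4A411131701702'), ('G4A41113171702'), ('G4A1702');

SELECT s.serialnumber, m.middle
FROM @serials AS s
CROSS APPLY (VALUES (
    -- Drop the first 3 and last 4 characters; guard against serials that are
    -- too short, and return NULL when the middle part is not a number.
    CASE WHEN LEN(s.serialnumber) > 7
         THEN TRY_CAST(SUBSTRING(s.serialnumber, 4, LEN(s.serialnumber) - 7) AS int)
    END)) AS m(middle)
WHERE m.middle BETWEEN 4110380 AND 4111317;
```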
I'd handle this like so:
-- Sample data
DECLARE @table TABLE (col1 VARCHAR(100));
INSERT @table (col1)
VALUES ('G4A41103801702 - G4A41113171702');
-- solution
SELECT
c1=SUBSTRING(s.s1,PATINDEX(p.P,s.s1),7),
c2=SUBSTRING(s.s2,PATINDEX(p.P,s.s2),7)
FROM @table AS t
CROSS JOIN (VALUES('%'+REPLICATE('[0-9]',7)+'%')) AS p(P) -- pattern: first run of 7 digits
CROSS APPLY (VALUES(CHARINDEX('-',t.col1))) AS br(b) -- position of the range separator
CROSS APPLY (VALUES(SUBSTRING(t.col1,1,br.b-1),
SUBSTRING(t.col1,br.b+1,8000))) AS s(s1,s2);
Returns:
c1 c2
------- -------
4110380 4111317
You can then use c1 and c2 elsewhere.
Say I have a table with an incrementing id column and a random positive non zero number.
id  rand
--  ----
1   12
2   5
3   99
4   87
Write a query to return the rows which add up to a given number.
A couple rules:
Rows must be "consumed" in order, even if a later row would make a perfect match. For example, querying for 104 would be a perfect match for rows 1, 2, and 4, but rows 1-3 would still be returned.
You can use a row partially if it holds more than is needed to cover whatever is left of the number. E.g. rows 1, 2, and 3 would be returned if your max number is 50, because 12 + 5 + 33 equals 50 and row 3 (99) is only partially used.
If there are not enough rows to satisfy the amount, then return ALL the rows. E.g. in the above example a query for 1,000 would return rows 1-4. In other words, the sum of the rows should be less than or equal to the queried number.
It's possible for the answer to be "no this is not possible with SQL alone" and that's fine but I was just curious. This would be a trivial problem with a programming language but I was wondering what SQL provides out of the box to do something as a thought experiment and learning exercise.
You didn't mention which RDBMS, but assuming SQL Server:
-- Drop the temp table only if a previous run left it behind
IF OBJECT_ID('tempdb..#t') IS NOT NULL DROP TABLE #t;
CREATE TABLE #t (id int, rand int);
INSERT INTO #t (id,rand)
VALUES (1,12),(2,5),(3,99),(4,87);
DECLARE @target int = 104;
WITH dat
AS
(
-- running total of rand in id order
SELECT id, rand, SUM(rand) OVER (ORDER BY id) as runsum
FROM #t
),
dat2
as
(
SELECT id, rand
, runsum
, COALESCE(LAG(runsum,1) OVER (ORDER BY id),0) as prev_runsum
from dat
)
SELECT id, rand
FROM dat2
WHERE @target >= runsum
OR @target BETWEEN prev_runsum AND runsum;
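To also see how much of each row is consumed (the "partial" rule in the question), a possible extension is sketched below. It assumes SQL Server 2012 or later and uses a table variable in place of the temp table; the column name used is made up. Here runsum - rand is the running total before the current row, so no LAG is needed:

```sql
DECLARE @t TABLE (id int, rand int);
INSERT INTO @t (id, rand)
VALUES (1,12),(2,5),(3,99),(4,87);
DECLARE @target int = 50;

WITH running AS
(
    SELECT id, rand,
           SUM(rand) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS runsum
    FROM @t
)
SELECT id, rand,
       -- take the whole row if it fits, otherwise only what is still missing
       CASE WHEN runsum <= @target THEN rand
            ELSE @target - (runsum - rand)
       END AS used
FROM running
WHERE runsum - rand < @target   -- keep rows that start before the target is reached
ORDER BY id;
```

With a target of 50 this returns rows 1, 2 and 3 with used values 12, 5 and 33, matching the example in the question.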
I have a table with a phone number column where the area code, in brackets, has 1 to 3 digits.
For example:
(342) 342-9324
(1) 234-3424
(04) 234-7744 etc
But I am not sure how to write the query. I use
SUBSTRING(x, 2, 3)
where x is the name of the column, but that always returns 3 characters. Does anyone have any ideas for extracting the digits in brackets, which could be 1, 2 or 3 digits long? This is on SQL Server, and the table has more than 5 million rows of phone numbers.
If the area code is always within (), some simple string functions should do.
Example
Declare @YourTable Table ([Phone] varchar(50))
Insert Into @YourTable Values
('(342) 342-9324')
,('(1) 234-3424')
,('(04) 234-7744')
-- Take everything left of the first ')' and strip the '('
Select *
,AreaCode = replace(left(Phone,charindex(')',Phone+')')-1),'(','')
From @YourTable
Returns
Phone            AreaCode
(342) 342-9324   342
(1) 234-3424     1
(04) 234-7744    04
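The SUBSTRING attempt from the question can also be adapted by computing the length from the position of the closing bracket. This is only a sketch, assuming SQL Server; the table variable @phones and the extra row without brackets are made up to show the guard:

```sql
DECLARE @phones TABLE (Phone varchar(50));
INSERT INTO @phones (Phone)
VALUES ('(342) 342-9324'), ('(1) 234-3424'), ('(04) 234-7744'), ('234-7744');

SELECT Phone,
       -- start after the '(' and stop just before the ')';
       -- rows that do not start with a bracketed code yield NULL
       AreaCode = CASE WHEN Phone LIKE '(%)%'
                       THEN SUBSTRING(Phone, 2, CHARINDEX(')', Phone) - 2)
                  END
FROM @phones;
```

Unlike the replace/left version, a number with no brackets gives NULL here rather than the whole string.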
I have data like this:
AB
ABC
ABCD
ABCDE
EF
EFG
IJ
IJK
IJKL
and I just want to get ABCDE, EFG, IJKL. How can I do that in Oracle SQL?
The strings are at least 2 characters long but don't have a fixed length; they can be from 2 to 100 characters.
In the event that you mean "longest string for each sequence of strings", the answer is a little different -- you are not guaranteed that all have a length of 4. Instead, you want to find the strings where adding a letter isn't another string.
select t.str
from table t
where not exists (select 1
from table t2
where substr(t2.str, 1, length(t.str)) = t.str and
length(t2.str) = length(t.str) + 1
);
Do note that performance of this query will not be great if you have even a moderate number of rows.
Select all rows where the string is not a substring of any other row. It's not clear if this is what you want though.
select t.str
from table t
where not exists (
select 1
from table t2
where t2.str <> t.str and
instr(t2.str, t.str) > 0
);
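For reference, a self-contained sketch against the sample data from the question, assuming Oracle 11gR2 or later. The CTE name strings is made up, and the condition used here is "no other string starts with this one", which is a variation on the two answers above rather than a copy of either:

```sql
with strings (str) as (
  select 'AB'    from dual union all
  select 'ABC'   from dual union all
  select 'ABCD'  from dual union all
  select 'ABCDE' from dual union all
  select 'EF'    from dual union all
  select 'EFG'   from dual union all
  select 'IJ'    from dual union all
  select 'IJK'   from dual union all
  select 'IJKL'  from dual
)
select t.str
from strings t
where not exists (
        select 1
        from strings t2
        where t2.str <> t.str
          -- t2 starts with t.str, so t.str is not the longest in its sequence
          and substr(t2.str, 1, length(t.str)) = t.str
      )
order by t.str;
```

On this sample it returns ABCDE, EFG and IJKL.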
Let me try this one more time... I'm not a sql guy so please bear with me as I try to explain this... I have a table called t_recordkeepingleg with three columns of data. Column1 is named LEGTRIPNUMBER that happens to be a string that starts with the letter Q followed by 4 numbers. I need to strip off the Q and convert the remaining 4 characters (numbers) to an integer. Everyone with me so far? Column2 of this table is named LEGDATE. Column3 is named LEGGROUP.
Here's the input scenario
LEGTRIPNUMBER  LEGDATE   LEGGROUP
Q1001          08/12/12  0001
Q1001          09/15/12  0002
Q1002          09/01/12  0001
Q1002          09/08/12  0003
Q1002          09/09/12  0002
As you can see the input table has rows where LEGTRIPNUMBER occurs more than once. I only want the first occurrence.
This is my current select statement - it works but returns all rows.
SELECT *,
CAST(
substring("t_RecordkeepingLeg"."LEGTRIPNUMBER",2,4) as INT
) as Num_Trip_Num
FROM "1669"."dbo"."t_RecordkeepingLeg" "t_RecordkeepingLeg"
Where left("t_RecordkeepingLeg"."LEGTRIPNUMBER",1) = 'Q'
I want to modify this so that it only selects ONE occurrence of the Qnnnn. When the row gets selected I want to have LEGDATE and LEGGROUP available to me. How do I do this?
Thank you,
Can it be as simple as below? I've just added a condition on LEGGROUP being 0001.
SELECT *,
CAST(substring("t_RecordkeepingLeg"."LEGTRIPNUMBER",2,4) as INT) as Num_Trip_Num
FROM "1669"."dbo"."t_RecordkeepingLeg" "t_RecordkeepingLeg"
Where left ("t_RecordkeepingLeg"."LEGTRIPNUMBER",1) = 'Q'
and "t_RecordkeepingLeg"."LEGGROUP"='0001'
If you have a unique primary key in your table, you can do something like the below:
SELECT CAST(
substring("t_RecordkeepingLeg"."LEGTRIPNUMBER",2,4) as INT
) as Num_Trip_Num
FROM "1669"."dbo"."t_RecordkeepingLeg" "t_RecordkeepingLeg"
Where "t_RecordkeepingLeg"."ID" In(
Select Min("t_RecordkeepingLeg"."ID")
From "1669"."dbo"."t_RecordkeepingLeg" "t_RecordkeepingLeg"
Where left ("t_RecordkeepingLeg"."LEGTRIPNUMBER",1) = 'Q'
Group By "t_RecordkeepingLeg"."LEGTRIPNUMBER"
)
Which values of LEGDATE & LEGGROUP do you want for the distinct LEGTRIPNUMBER? There are multiple non-distinct possibilities, and the concept of "first occurrence" is only valid with an explicit order.
To get the values where LEGDATE is the earliest for example;
select Num_Trip_Num, LEGDATE, LEGGROUP from (
select
cast(substring(t_RecordkeepingLeg.LEGTRIPNUMBER, 2, 4) as INT) as Num_Trip_Num,
row_number() over (partition by substring(t_RecordkeepingLeg.LEGTRIPNUMBER, 2, 4) order by t_RecordkeepingLeg.LEGDATE asc) as row,
t_RecordkeepingLeg.LEGDATE,
t_RecordkeepingLeg.LEGGROUP
from t_RecordkeepingLeg
where left (t_RecordkeepingLeg.LEGTRIPNUMBER, 1) = 'Q'
) T
where row = 1
I have a table with 1,000,000+ records and I would like to find the most common sub string that is at least 5 characters long.
If I have the following entries:
KDHFOUDHGOENWFIJ 1114H4363SDFHDHGFDG
GSDLGJSLJSKJDFSG 1114H20SDGDSSFHGSLD
SLSJDHLJKSSDJFKD 1114HJSDHFJKSDKFSGG
I would like to write a SQL statement that selects 1114H as the most common substring. How can I do this?
Notes:
The substring does not have to be in the same location.
The substrings must be of length 5
The maximum length of each record is 50 characters
There is no requirement to find the longest substring, and every substring longer than 5 characters contains a 5-character substring that occurs at least as often. So we only have to check substrings of length 5.
In the sample data there are three strings that occur three times. _1114H, _1114 and 1114H (_ is to show the location of a space ).
In this solution master..spt_values is used in place of a numbers table.
declare @T table
(
ID int identity,
Data varchar(50)
)
insert into @T values
('KDHFOUDHGOENWFIJ 1114H4363SDFHDHGFDG'),
('GSDLGJSLJSKJDFSG 1114H20SDGDSSFHGSLD'),
('SLSJDHLJKSSDJFKD 1114HJSDHFJKSDKFSGG')
-- One row per (record, start position); count the records each 5-character word appears in
select top 1 substring(T.Data, N.Number, 5) as Word
from @T as T
cross apply (select N.Number
from master..spt_values as N
where N.type = 'P' and
N.number between 1 and len(T.Data)-4) as N
group by substring(T.Data, N.Number, 5)
order by count(distinct id) desc
Result:
Word
------
1114
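Since master..spt_values is an undocumented system table, a small inline numbers list may be preferable. The sketch below assumes SQL Server and builds the numbers 1 to 100 (enough for 50-character records) from a VALUES list; the CTE names digits and tally and the column name Records are made up. It also uses TOP (1) WITH TIES so that all substrings sharing the top count are returned rather than an arbitrary one:

```sql
DECLARE @T table (ID int identity, Data varchar(50));
INSERT INTO @T (Data) VALUES
    ('KDHFOUDHGOENWFIJ 1114H4363SDFHDHGFDG'),
    ('GSDLGJSLJSKJDFSG 1114H20SDGDSSFHGSLD'),
    ('SLSJDHLJKSSDJFKD 1114HJSDHFJKSDKFSGG');

WITH digits AS
(
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
),
tally AS
(
    -- 10 x 10 combinations give the start positions 1..100
    SELECT tens.d * 10 + ones.d + 1 AS n
    FROM digits AS tens
    CROSS JOIN digits AS ones
)
SELECT TOP (1) WITH TIES
       SUBSTRING(T.Data, N.n, 5) AS Word,
       COUNT(DISTINCT T.ID)      AS Records   -- number of records containing the word
FROM @T AS T
JOIN tally AS N
  ON N.n <= LEN(T.Data) - 4                    -- only full 5-character windows
GROUP BY SUBSTRING(T.Data, N.n, 5)
ORDER BY COUNT(DISTINCT T.ID) DESC;
```

On a table with 1,000,000+ records this still expands every record into up to 46 windows, so it is likely to be slow and is meant as an illustration rather than a tuned solution.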
This doesn't answer your question in full, but here is an article from a book about advanced search techniques where it mentions a user-defined function "LCS" (longest common substring) that might be helpful:
http://books.google.com/books?id=wGwVkAt79bEC&pg=PA248&lpg=PA248&dq=sql+full+text+common+substring&source=bl&ots=fveHa8an08&sig=VTWHQDTA6gqSNylY9oR0mPhcP6Y&hl=en&ei=iALcTd_AB-j00gG3iZ3lDw&sa=X&oi=book_result&ct=result&resnum=1&ved=0CBoQ6AEwAA#v=onepage&q&f=false