how can function SUBSTR can help me remove vlues from my column - sql

25779101724|GTG1105-Kibimba .These are the telefone numbers, there is more like this.After the numbers the following characters are a location of the telefone number. so i want to remove everything after these numbers(the bar,the location, and the space) so that i can query in the db the numbers which are active among these. and then i dont want to lose these location after numbers because i will need them to report the active numbers AND their location. how can i remove these location and query the active numbers and then replace the location.
I am hoping a response.

From my point of view, your by far the best option is to normalize that table and store each piece of information into its own column. Because, as long as you can easily split that string into several parts, joining it to another table will suffer as number of rows gets higher.
Anyway, here you are.
Sample data:
SQL> with test (msisdn) as
2 (select '25779101724|GTG1105-Kibimba' from dual union all
3 select '25776030896|BRR1351-Kaberenge2' from dual
4 )
Query begins here:
5 select
6 substr(msisdn, 1, instr(msisdn, '|') - 1) phone_number,
7 substr(msisdn, instr(msisdn, '|') + 1) the_rest
8 from test;
PHONE_NUMBER THE_REST
------------------------------ ------------------------------
25779101724 GTG1105-Kibimba
25776030896 BRR1351-Kaberenge2
SQL>

Related

Extracting number of specific length from a string in Postgres

I am trying to extract a set of numbers from comments like
"on april-17 transactions numbers are 12345 / 56789"
"on april-18 transactions numbers are 56789"
"on may-19 no transactions"
Which are stored in a column called "com" in table comments
My requirement is to get the numbers of specific length. In this case length of 5, so 12345 and 56789 from the above string separately, It is possible to to have 0 five digit number or more more than 2 five digit number.
I tried using regexp_replace with the following result, I am trying the find a efficient regex or other method to achieve it
select regexp_replace(com, '[^0-9]',' ', 'g') from comments;
regexp_replace
----------------------------------------------------
17 12345 56789
I expect the result to get only
column1 | column2
12345 56789
There is no easy way to create query which gets an arbitrary number of columns: It cannot create one column for one number and at the next try the query would give two.
For fixed two columns:
demo:db<>fiddle
SELECT
matches[1] AS col1,
matches[2] AS col2
FROM (
SELECT
array_agg(regexp_matches[1]) AS matches
FROM
regexp_matches(
'on april-17 transactions numbers are 12345 / 56789',
'\d{5}',
'g'
)
) s
regexp_matches() gives out all finds in one row per find
array_agg() puts all elements into one array
The array elements can be give out as separate columns.

How can I select the Nth row of a group of fields?

I have a very very small database that I am needing to return a field from a specific row.
My table looks like this (simplified)
Material_Reading Table
pointID Material_Name
123 WoodFloor
456 Carpet
789 Drywall
111 Drywall
222 Carpet
I need to be able to group these together and see the different kinds (WoodFloor, Carpet, and Drywall) and need to be able to select which one I want and have that returned. So my select statement would put the various different types in a list and then I could have a variable which would select one of the rows - 1, 2, 3 for example.
I hope that makes sense, this is somewhat a non-standard implementation because its a filemaker database unfortunately, so itstead of one big SQL statement doing all I need I will have several that will each select an individual row that I indicate.
What I have tried so far:
SELECT DISTINCT Material_Name FROM MATERIAL_READING WHERE Room_KF = $roomVariable
This works and returns a list of all my material names which are in the room indicated by the room variable. But I cant get a specific one by supplying a row number.
I have tried using LIMIT 1 OFFSET 1. Possibly not supported by Filemaker or I am doing it wrong, I tried it like this - it gives an error:
SELECT DISTINCT Material_Name FROM MATERIAL_READING WHERE _Room_KF = $roomVariable ORDER BY Material_Name LIMIT 1 OFFSET 1
I am able to use ORDER BY like this:
SELECT DISTINCT Material_Name FROM MATERIAL_READING WHERE Room_KF = $roomVariable ORDER BY Material_Name
In MSSQL
SELECT DISTINCT Material_Name
FROM MATERIAL_READING
WHERE _Room_KF = 'roomVariable'
ORDER BY Material_Name
OFFSET N ROWS
FETCH NEXT 5 ROWS ONLY
where N->from which row does to start
X->no.of rows to retrieve which were started from (N+1 row)

Comparing values in oracle when one value is partially masked

Here is what I am trying to do in a Oracle SQL query:
I have an account number that is X characters long (Example: 6001055555). I have a table that has part of the same account number but most of the number is masked (Examples: 600##########, 6001######, 600244####).
I am trying to match the number passed in 6001055555 to one of the following values 600##########, 6001######, 600244####.
In this example, account number 6001055555 should return 6001###### (from the above list). I can get to the point where the lengths are the same but am not sure how to address the match - I am looking at using REGEX expressions but am not sure if that' the correct path.
You can use the regular LIKE comparison in this case:
SQL> WITH DATA AS (
2 SELECT '600##########' acct FROM dual UNION ALL
3 SELECT '6001######' acct FROM dual UNION ALL
4 SELECT '600244####' acct FROM dual
5 )
6 SELECT *
7 FROM DATA
8 WHERE '6001055555' LIKE REPLACE (acct, '#', '_');
ACCT
-------------
6001######
We're used to seeing COLUMN LIKE :var but switching terms is also valid (:var LIKE column).
If my understanding is rite, this is what u may be expecting...
select regexp_substr('6001055555',replace('600##########','#'),1) from dual;
If you got any value from this query you may conclude that the account number is matched with the masking values

Finding what words a set of letters can create?

I am trying to write some SQL that will accept a set of letters and return all of the possible words it can make. My first thought was to create a basic three table database like so:
Words -- contains 200k words in real life
------
1 | act
2 | cat
Letters -- contains the whole alphabet in real life
--------
1 | a
3 | c
20 | t
WordLetters --First column is the WordId and the second column is the LetterId
------------
1 | 1
1 | 3
1 | 20
2 | 3
2 | 1
2 | 20
But I'm a bit stuck on how I would write a query that returns words that have an entry in WordLetters for every letter passed in. It also needs to account for words that have two of the same letter. I started with this query, but it obviously does not work:
SELECT DISTINCT w.Word
FROM Words w
INNER JOIN WordLetters wl
ON wl.LetterId = 20 AND wl.LetterId = 3 AND wl.LetterId = 1
How would I write a query to return only words that contain all of the letters passed in and accounting for duplicate letters?
Other info:
My Word table contains close to 200,000 words which is why I am trying to do this on the database side rather than in code. I am using the enable1 word list if anyone cares.
Ignoring, for the moment, the SQL part of the problem, the algorithm I'd use is fairly simple: start by taking each word in your dictionary, and producing a version of it with the letters in sorted order, along with a pointer back to the original version of that word.
This would give a table with entries like:
sorted_text word_id
act 123 /* we'll assume `act` was word number 123 in the original list */
act 321 /* we'll assume 'cat' was word number 321 in the original list */
Then when we receive an input (say, "tac") we sort it's letters, look it up in our table of sorted letters joined to the table of the original words, and that gives us a list of the words that can be created from that input.
If I were doing this, I'd have the tables for that in a SQL database, but probably use something else to pre-process the word list into the sorted form. Likewise, I'd probably leave sorting the letters of the user's input to whatever I was using to create the front-end, so SQL would be left to do what it's good at: relational database management.
If you use the solution you provide, you'll need to add an order column to the WordLetters table. Without that, there's no guarantee that you'll retrieve the rows that you retrieve are in the same order you inserted them.
However, I think I have a better solution. Based on your question, it appears that you want to find all words with the same component letters, independent of order or number of occurrences. This means that you have a limited number of possibilities. If you translate each letter of the alphabet into a different power of two, you can create a unique value for each combination of letters (aka a bitmask). You can then simply add together the values for each letter found in a word. This will make matching the words trivial, as all words with the same letters will map to the same value. Here's an example:
WITH letters
AS (SELECT Cast('a' AS VARCHAR) AS Letter,
1 AS LetterValue,
1 AS LetterNumber
UNION ALL
SELECT Cast(Char(97 + LetterNumber) AS VARCHAR),
Power(2, LetterNumber),
LetterNumber + 1
FROM letters
WHERE LetterNumber < 26),
words
AS (SELECT 1 AS wordid, 'act' AS word
UNION ALL SELECT 2, 'cat'
UNION ALL SELECT 3, 'tom'
UNION ALL SELECT 4, 'moot'
UNION ALL SELECT 5, 'mote')
SELECT wordid,
word,
Sum(distinct LetterValue) as WordValue
FROM letters
JOIN words
ON word LIKE '%' + letter + '%'
GROUP BY wordid, word
As you'll see if you run this query, "act" and "cat" have the same WordValue, as do "tom" and "moot", despite the difference in number of characters.
What makes this better than your solution? You don't have to build a lot of non-words to weed them out. This will constitute a massive savings of both storage and processing needed to perform the task.
There is a solution to this in SQL. It involves using a trick to count the number of times that each letter appears in a word. The following expression counts the number of times that 'a' appears:
select len(word) - len(replace(word, 'a', ''))
The idea is to count the total of all the letters in the word and see if that matches the overall length:
select w.word, (LEN(w.word) - SUM(LettersInWord))
from
(
select w.word, (LEN(w.word) - LEN(replace(w.word, wl.letter))) as LettersInWord
from word w
cross join wordletters wl
) wls
having (LEN(w.word) = SUM(LettersInWord))
This particular solution allows multiple occurrences of a letter. I'm not sure if this was desired in the original question or not. If we want up to a certain number of occurrences, then we might do the following:
select w.word, (LEN(w.word) - SUM(LettersInWord))
from
(
select w.word,
(case when (LEN(w.word) - LEN(replace(w.word, wl.letter))) <= maxcount
then (LEN(w.word) - LEN(replace(w.word, wl.letter)))
else maxcount end) as LettersInWord
from word w
cross join
(
select letter, count(*) as maxcount
from wordletters wl
group by letter
) wl
) wls
having (LEN(w.word) = SUM(LettersInWord))
If you want an exact match to the letters, then the case statement should use " = maxcount" instead of " <= maxcount".
In my experience, I have actually seen decent performance with small cross joins. This might actually work server-side. There are two big advantages to doing this work on the server. First, it takes advantage of the parallelism on the box. Second, a much smaller set of data needs to be transfered across the network.

How to increase non-numeric column value?

I have a column that is used for a sequence number.
The values in the column look like 'WT0000004568'
I need to find the maximum value and increment the counter part by 1 to create a new sequence number.
For this example, the resultant value would be 'WT0000004569'.
How do I do that?
As Thilio pointed out, this is generally a bad design both from a performance and from a concurrency issue. If we assume that you have a single user system and aren't particularly concerned about performance, you could do something like
SQL> ed
Wrote file afiedt.buf
1 select 'WT' ||
2 to_char(
3 to_number(substr('WT0000004568',3)) + 1,
4 'fm0000000000')
5* from dual
SQL> /
'WT'||TO_CHAR
-------------
WT0000004569