select a number of strings inside a clob - sql

Sorry, I know you expect code examples, but I have absolutely no idea how to start on this issue.
I have a database with about 100,000 entries with this structure:
ID | LONGARG
0 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
1 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
2 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
ECLONG_TEXT_INSIDE_THIS3|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
3 ECLONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
ECLONG_TEXT_INSIDE_THIS2|LONG_TEXT_INSIDE_THIS|LONG_TEXT_INSIDE_THIS
Longarg is of type CLOB.
My question: is it possible to select all the text between EC and the first | on every line, for all data rows and without using a stored procedure, to get a result like this?
Result:
LONG_TEXT_INSIDE_THIS
LONG_TEXT_INSIDE_THIS2
LONG_TEXT_INSIDE_THIS
LONG_TEXT_INSIDE_THIS
LONG_TEXT_INSIDE_THIS2
LONG_TEXT_INSIDE_THIS3
LONG_TEXT_INSIDE_THIS
LONG_TEXT_INSIDE_THIS2
Thanks in advance for your help
Stefan
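One way to think about the extraction is as a per-line pattern match: on every line of the CLOB, capture whatever sits between a leading EC and the first |. The sketch below only illustrates that logic in Python, not in SQL; the function name `extract_ec_tokens` and the sample values are made up for the example, and it says nothing about how the database itself would perform on 100,000 rows.

```python
import re

# A line that starts with "EC"; capture everything up to the first "|".
EC_PATTERN = re.compile(r"^EC([^|\n]*)\|", re.MULTILINE)


def extract_ec_tokens(longarg):
    """Return the text between a leading 'EC' and the first '|' on each line."""
    return EC_PATTERN.findall(longarg)


# Illustrative (ID, LONGARG) rows shaped like the ones in the question.
rows = [
    (0, "ECAAA|x|y\nECAAA2|x|y"),
    (1, "ECBBB|x|y"),
]

for row_id, longarg in rows:
    for token in extract_ec_tokens(longarg):
        print(row_id, token)
```

If the database offers regular-expression functions, the same pattern (anchor on EC at the start of a line, stop at the first |) is the part that would carry over.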

Related

SQL different null values in different rows

I have a quick question regarding writing a SQL query to obtain a complete entry from two or more entries where the data is missing in different columns.
This is the example, suppose I have this table:
Client Id | Name | Email
1234 | John | (null)
1244 | (null) | john#example.com
Would it be possible to write a query that would return the following?
Client Id | Name | Email
1234 | John | john#example.com
I am finding this particularly hard because these are 2 entries in the same table.
I apologize if this is trivial; I am still studying SQL. I wasn't able to come up with a solution, and although I've tried looking online, I suppose I couldn't phrase the question properly, so I couldn't find the answer I was after.
Many thanks in advance for the help!
Yes, but actually no.
It is possible to write a query that works with your example data.
But only under the assumption that the part of the email before the # is always equal to the name.
SELECT clients.id,clients.name,bclients.email FROM clients
JOIN clients bclients ON upper(clients.name) = upper(substring(bclients.email from 0 for position('#' in bclients.email)));
db<>fiddle
Explanation:
We join the table onto itself, to get the information into one row.
For this we first find the position of the '#' in the email, then take the substring from the start of the string up to the '#'. Because string positions start at 1, starting at 0 with a length equal to the result of position yields exactly the characters before the '#'.
To avoid case problems, the name and the substring are both converted to uppercase for the comparison.
(lowercase would work the same)
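The self-join described above can be tried out without a database server. The following is a minimal sketch using Python's built-in sqlite3 module, so the string functions are SQLite's `substr` and `instr` rather than the `substring ... from ... for` and `position` used in the answer; the table contents are the two rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO clients VALUES (?, ?, ?)",
    [(1234, "John", None), (1244, None, "john#example.com")],
)

# instr() gives the 1-based position of '#', so the part before it
# has length instr - 1. Rows with a NULL name or email never match.
query = """
SELECT c.id, c.name, b.email
FROM clients AS c
JOIN clients AS b
  ON upper(c.name) = upper(substr(b.email, 1, instr(b.email, '#') - 1))
"""
merged = conn.execute(query).fetchall()
print(merged)
```

As in the answer, this only works while the text before the # really equals the name.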
The design is flawed
How can a client have multiple ids and different kinds of information about the same user at the same time?
I think you want to split the table into clients and users, so that a user can have multiple clients.
I recommend that you read about database normalization, as it gives you the knowledge necessary for successful database design.

Why can't I use the column name as the alias when I operate on dates

Currently I am migrating a database from SQL Server to Spark using Hive SQL.
I had an issue when trying to convert a number to a date format. The answer I found is:
from_unixtime(unix_timestamp(cast(DATE as string) , 'dd-MM-yyyy'))
When I execute this query it returns the data. Notice that I used an alias different from the column name FECHA:
SELECT FROM_UNIXTIME(UNIX_TIMESTAMP(CAST(FECHA AS STRING ) ,'yyyyMMdd'), 'yyyy-MM-dd') AS FECHA_1
FROM reportes_hechos_avisos_diarios
LIMIT 1
| FECHA_1 |
| -------- |
| 2019-01-01 |
But when I use the same alias as the column name, it returns inconsistent information:
SELECT FROM_UNIXTIME(UNIX_TIMESTAMP(CAST(FECHA AS STRING ) ,'yyyyMMdd'), 'yyyy-MM-dd') AS FECHA
FROM reportes_hechos_avisos_diarios
LIMIT 1
| FECHA |
| -------- |
| 2.019 |
I know the trivial answer is to use an alias that differs from the column name, but I have an implementation in Tableau that feeds from this query, and changing the column would mean changing the whole implementation, so I need to preserve the column name. This query works for me in SQL Server, but I don't know why it doesn't work in Hive.
PS: Thanks for your attention. This is the first question I have asked on Stack Overflow and my native language is not English, so sorry for any grammatical errors.
LIMIT 1 without ORDER BY can produce non-deterministic results from run to run, because the order of rows is effectively random under parallel execution; some factors may influence it, but getting the same row is not guaranteed.
What is happening, I guess, is that you are receiving a different row and the date is corrupted in that row, which is why a weird result is returned.
Also, you can use another method of conversion:
select date(regexp_replace(cast(20200101 as string),'(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3')) --put your column instead of constant.
Result:
2020-01-01
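To make the two conversion routes concrete outside Hive, here is a small Python sketch of the same transformations: one parses the number as a date, the other rewrites the digits with the regular expression from the answer. The function names are hypothetical, and this does not reproduce Hive's own behaviour or its alias handling.

```python
import re
from datetime import datetime


def yyyymmdd_to_iso(value):
    """Parse an integer such as 20190101 as a date and format it as yyyy-MM-dd."""
    return datetime.strptime(str(value), "%Y%m%d").strftime("%Y-%m-%d")


def yyyymmdd_to_iso_regex(value):
    """Same output via a digit-group rewrite; no check that the date is valid."""
    return re.sub(r"(\d{4})(\d{2})(\d{2})", r"\1-\2-\3", str(value))


print(yyyymmdd_to_iso(20190101))
print(yyyymmdd_to_iso_regex(20200101))
```

The parsing version rejects impossible dates with an error, while the regex version rewrites any eight digits, which matters if some rows hold corrupted values.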

Concatenate /,: in a database row value after every 2 characters

I created a table in SQL Server and inserted values into its columns. In the time column I stored a long string value which I retrieved from a log.
That log returns a time string like this '1103873704755'. Now I want to separate every 2 characters with /, (empty space) and : like this
'11/03/87 37:04:755'
Current query:
select top 1 Time
from tbl_ModBus
order by id desc
Output:
Time
-------------
1103873704755
Expected:
Time
-------------
11/03/87 37:04:755
So how can I get this string like I want using a SQL query?
I don't think there is a built-in function to do this job in SQL.
It is always advisable to use 'DATETIME' when you are storing dates.
Note that in SQL Server 'TIMESTAMP' is a row-version type and does not hold a date and time, so for date and time you are better off with 'DATETIME' or 'DATETIME2'.
As for reshaping the retrieved value, you can do it in your program code, in whatever language you want!
I would not store a date and time as a string, because it will be unstable.
If you only want to insert /, a space and : into a database row value after every 2 characters,
the easiest way is to try the FORMAT function.
SELECT FORMAT(CAST('1103873704755' AS BIGINT),'##/##/## ##:##:###')
sqlfiddle
| |
|--------------------|
| 11/03/87 37:04:755 |
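For the route suggested earlier in this thread, doing the reshaping in program code, a minimal Python sketch could look like the following. The function name `format_log_time` is hypothetical, and it assumes the log value is always exactly 13 characters long, as in the example.

```python
def format_log_time(raw):
    """Insert '/', ' ' and ':' into a 13-character string laid out as AABBCCDDEEFFF."""
    if len(raw) != 13:
        raise ValueError("expected exactly 13 characters")
    return (
        f"{raw[0:2]}/{raw[2:4]}/{raw[4:6]} "
        f"{raw[6:8]}:{raw[8:10]}:{raw[10:13]}"
    )


print(format_log_time("1103873704755"))
```

Working on the string directly also sidesteps the question of what happens to a leading zero once the value is cast to a number.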

Search Blob and find multiple matches along with identity field

I have an identity column and a string value stored as a blob in 2 columns in my db.
What I’m trying to do is search for multiple values within the string and return the results in different rows for each match.
For example:
ID | String
1000 | ChrisBobTomSteve
I want to search the string for both Bob and Tom and return the results like this:
1000 | Bob
1000 | Tom
This is a simplified example but I have a very large db and I need to match on 39 different values to parse out the results so a union isn’t exactly efficient for this.
This is being done in Oracle 11g. Any thoughts would be greatly appreciated. Thank you.
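One general pattern for this kind of search, offered only as a sketch, is to keep the values to look for in their own table and join it against the data, so each match becomes its own row without a long UNION. The example below runs in SQLite through Python's sqlite3 module and stores the string as plain text; the table and column names (`docs`, `body`, `search_terms`) are hypothetical, and the equivalent call for an Oracle BLOB column is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, body TEXT)")
conn.execute("INSERT INTO docs VALUES (1000, 'ChrisBobTomSteve')")

# Hypothetical lookup table holding the values to search for.
conn.execute("CREATE TABLE search_terms (term TEXT)")
conn.executemany(
    "INSERT INTO search_terms VALUES (?)", [("Bob",), ("Tom",), ("Ann",)]
)

# One output row per (document, term) pair where the term occurs in the body.
query = """
SELECT d.id, s.term
FROM docs AS d
JOIN search_terms AS s
  ON instr(d.body, s.term) > 0
ORDER BY d.id, s.term
"""
matches = conn.execute(query).fetchall()
print(matches)
```

With 39 values, only the contents of the lookup table grow; the query text stays the same.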

user defined pseudocolumn oracle

I have a large dataset in an oracle database that is currently accessed from Java one item at a time. For example if a user is trying to do a bulk get of 50 items it will process them sequentially, calling a stored procedure for each one. I am now trying to implement a bulk get, but am having some difficulty due to the way the user can pass in a range query:
An example table:
prim_key | identifier | start | end
----------+--------------+---------+-------
1 | aaa | 1 | 3
2 | aaa | 3 | 7
3 | bbb | 1 | 5
The way it works is that if you have a query like (id='aaa' and pos=1) it will find prim_key = 1, but if you query (id='aaa' and pos=2) it won't find anything. If you do (id='aaa' and pos=-2) then it will again find prim_key=1 because the stored proc converts the -2 into a range scan equivalent to start<=2 and end>2.
(Extra context: the start/end are actually dates, and this querying mechanism allows efficient "latest as of date" queries, as opposed to doing something like
select prim_key, start
from myTable
where start = (select max(start) from myTable where start <= 2))
This is all fine and works correctly for single gets, but now I'm trying to do bulk gets so that we can speed up the batch considerably. The first attempt was to multithread the individual calls, but it put too much stress on the database to be doing so many parallel queries on the same table. To solve this I've been trying to create a query like
select prim_key
from myTable
where (identifier='aaa' and start=3)
or (identifier='aaa' and start<=2 and end>2)
building this up from the list of input parameters ('aaa',3 ; 'bbb',-2), which works well and produces an explain plan using all of the indexes I would expect.
My Problem: I need to know which input parameters retrieved each row in order to do further processing and return the relevant prim_key. I need to use something like a pseudocolumn that I can define myself:
select prim_key, PSUEDO
from myTable
where (identifier='aaa' and start=3 and PSUEDO='a3')
or (identifier='aaa' and start<=2 and end>2 and PSUEDO='a-2')
but I can't find any way to return a value from the where clause, and I think subqueries would lose the indexing efficiencies gained by doing it all in one select.
Try something like:
select
prim_key,
case when start = 3 then 'a3' else 'a-2' end pseudo
from
your_table
where
...
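The CASE expression above tags rows well when the conditions cannot overlap. A related way to carry the input parameter through to the result, sketched here under assumptions, is to put the parameters in a small table and join against it, so the tag comes straight from the parameter row. The example runs in SQLite through Python's sqlite3 module; the `params` table, its `tag` column and the renamed columns `start_pos` and `end_pos` are hypothetical, and whether Oracle would keep the same index usage is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table "
    "(prim_key INTEGER, identifier TEXT, start_pos INTEGER, end_pos INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [(1, "aaa", 1, 3), (2, "aaa", 3, 7), (3, "bbb", 1, 5)],
)

# Hypothetical parameter table: one row per requested item plus a tag to report back.
conn.execute("CREATE TABLE params (tag TEXT, identifier TEXT, pos INTEGER)")
conn.executemany(
    "INSERT INTO params VALUES (?, ?, ?)",
    [("a3", "aaa", 3), ("b-2", "bbb", -2)],
)

# Non-negative pos: exact match on start. Negative pos: range scan on -pos.
query = """
SELECT p.tag, t.prim_key
FROM params AS p
JOIN my_table AS t
  ON t.identifier = p.identifier
 AND (
       (p.pos >= 0 AND t.start_pos = p.pos)
    OR (p.pos < 0 AND t.start_pos <= -p.pos AND t.end_pos > -p.pos)
 )
ORDER BY p.tag
"""
tagged = conn.execute(query).fetchall()
print(tagged)
```

Each result row then names the parameter that produced it, even if two parameters happen to hit the same prim_key.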