Data field - search and write value in new data field (Oracle) - sql

Sorry, I don't know how to describe that as a title.
With a query (example: SELECT PKEY, TRUNC(CREATEDFORMAT), STATISTICS FROM BUSINESS_DATA WHERE STATISTICS LIKE '% business_%'), I can display all data that contains the value "business_xxxxxx".
For example, the data field can have the following content: c01_ad; concierge_beendet; business_start; or also skill_my; pre_initial_markt; business_request; topIntMaster; concierge_start; c01_start;
Is it possible to output only the corresponding value in another, temporary column?
So that the output looks like this, for example:
PKEY | TRUNC(CREATEDFORMAT) | NEW_STATISTICS
1 | 13.06.2020 | business_start
2 | 14.06.2020 | business_request
That means removing everything that does not start with business_xxx? Is this possible in an SQL query? RegEx would not be the right one, I think.

I think you want:
select
pkey,
trunc(createdformat) createddate,
regexp_substr(statistics, 'business_\S*') new_statistics
from business_data
where statistics like '% business_%'

You can also use the following regexp_substr:
select regexp_substr(str, 'business_[^;]+') as result
from
-- sample data
(select 'skill_my; pre_initial_markt; business_request; topIntMaster; concierge_start; c01_start;' as str from dual
union all
select 'c01_ad; concierge_beendet; business_start;' from dual);
RESULT
--------------------------------------------------------------------------------
business_request
business_start
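The two patterns suggested above differ in where the match stops. The sketch below uses Python's `re` module as a stand-in for Oracle's regex engine (an assumption: both treat `\S` and `[^;]` the same way for this input) and the sample strings from the question. The helper name `first_match` is made up for illustration.

```python
import re

# Sample values copied from the question.
samples = [
    "skill_my; pre_initial_markt; business_request; topIntMaster; concierge_start; c01_start;",
    "c01_ad; concierge_beendet; business_start;",
]

def first_match(pattern, text):
    """Return the first substring matching `pattern`, or None (like regexp_substr)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# \S* runs up to the next whitespace, so the trailing ';' is included.
greedy_nonspace = [first_match(r"business_\S*", s) for s in samples]

# [^;]+ stops right before the ';' separator.
up_to_semicolon = [first_match(r"business_[^;]+", s) for s in samples]

print(greedy_nonspace)   # ['business_request;', 'business_start;']
print(up_to_semicolon)   # ['business_request', 'business_start']
```

If the trailing semicolon is unwanted, the `[^;]+` form is probably the safer choice for this data.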

Related

Split non-atomic value into multiple rows with PostgreSQL

I have some non-atomic data in a database like this:
ID | Component ID List
1 | 123, 456
2 | 123, 345
I need to transform this table into a view that provides the "Component ID List" in a way that lets me use joins. Expected result:
ID | Component ID List
1 | 123
1 | 456
2 | 123
2 | 345
Because I have this case in quite a few tables, I am looking for a reusable way to perform this action, e.g. with an SQL function. The tables have different column names, so the function would need a parameter, like this:
SELECT *, split_values("Component ID List") FROM xyz
I know the best way would be to fix the problem in the raw-data but that's not possible in this case.
Any suggestions how to solve this the best way possible?
You can use unnest(string_to_array(Component_ID_List, ', ')):
SELECT ID,
unnest(string_to_array(Component_ID_List, ', ')) as Component_ID_List
FROM table_name;
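To make the row expansion concrete, here is a small Python sketch of what `unnest(string_to_array(...))` does to each row. It is not PostgreSQL code; the helper name `split_values` is borrowed from the question and is hypothetical, and the sample rows are the ones shown there.

```python
def split_values(rows, column, sep=", "):
    """Yield one copy of each row per separated value found in `column`."""
    for row in rows:
        for part in row[column].split(sep):
            # Copy the row and replace the list with a single value.
            yield {**row, column: part.strip()}

rows = [
    {"ID": 1, "Component ID List": "123, 456"},
    {"ID": 2, "Component ID List": "123, 345"},
]

expanded = list(split_values(rows, "Component ID List"))
for r in expanded:
    print(r["ID"], r["Component ID List"])
```

Passing the column name as a parameter is what makes the helper reusable across tables with different column names.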

Return only ALL CAPS strings in BigQuery

Pretty simple question, specific to BigQuery. I'm sure there's a command I'm missing. I'm used to using "collate" in another query which doesn't work here.
| email |
| -------- |
| eric#email.com |
| JOHN#EMAIL.COM |
| STACY#EMAIL.COM |
| tanya#email.com |
Desired return:
JOHN#EMAIL.COM,STACY#EMAIL.COM
Consider the query below:
select *
from your_table
where upper(email) = email
If applied to the sample data in your question, the output contains only the rows JOHN#EMAIL.COM and STACY#EMAIL.COM.
In case you want the output as a comma-separated list, use the query below:
select string_agg(email) emails
from your_table
where upper(email) = email
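with output such as JOHN#EMAIL.COM,STACY#EMAIL.COM (the order is not guaranteed unless you add an ORDER BY inside string_agg).
You can use the CTE below (which is the exact data sample from your question) for testing purposes: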
with your_table as (
select 'eric#email.com' email union all
select 'JOHN#EMAIL.COM' union all
select 'STACY#EMAIL.COM' union all
select 'tanya#email.com'
)
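The same comparison can be tried locally. The sketch below runs it through Python's built-in `sqlite3` rather than BigQuery, so it rests on two assumptions: SQLite's default string comparison is case-sensitive like BigQuery's, and `group_concat` stands in for `string_agg`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (email text)")
conn.executemany(
    "insert into your_table values (?)",
    [("eric#email.com",), ("JOHN#EMAIL.COM",), ("STACY#EMAIL.COM",), ("tanya#email.com",)],
)

# Keep only rows that are unchanged by upper(), i.e. already all caps.
upper_only = [
    row[0]
    for row in conn.execute(
        "select email from your_table where upper(email) = email order by email"
    )
]

# Aggregate the same rows into one comma-separated string.
emails = conn.execute(
    "select group_concat(email) from your_table where upper(email) = email"
).fetchone()[0]

print(upper_only)
print(emails)
```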

How do I select a SQL dataset where values in the first row are the column names?

I have data that looks like this:
ID RowType Col_1 Col_2 Col_3 ... Col_n
1 HDR FirstName LastName Birthdate
2 DTL Steve Bramblet 1989-01-01
3 DTL Bob Marley 1967-03-12
4 DTL Mickey Mouse 1921-04-25
And I want to return a table or dataset that looks like this:
ID FirstName LastName Birthdate
2 Steve Bramblet 1989-01-01
3 Bob Marley 1967-03-12
4 Mickey Mouse 1921-04-25
where n = 255 (so there's a limit of 255 Col_ fields)
***EDIT: The data in the HDR row is arbitrary so I'm just using FirstName, LastName, Birthdate as examples. This is why I thought it will need to be dynamic SQL since the column names I want to end up with will change based on the values in the HDR row. THX! ***
If there's a purely SQL solution that is what I'm after. It's going into an ETL process (SSIS) so I could use a Script task if all else fails.
Even if I could return a single row that would be a solution. I was thinking there might be a dynamic sql solution for something like this:
select Col_1 as FirstName, Col_2 as LastName, Col_3 as Birthdate
Not sure whether your first data snippet is already in an Oracle table or not, but if it is in a CSV file, you have the option to skip headers during loading.
If the data is already in a table, you can use UNION to get the desired result:
select * from table_name where rowtype = 'HDR'
union
select * from table_name where rowtype = 'DTL'
If you need FirstName etc. as column headers, you do not need to do anything else. Design the destination table columns as per your requirement.
Sorry, posted an answer but I completely misread that you had your desired column headers as data in the source table.
One trivial solution (though it requires more IO) would be to dump the table data to a flat file without headers, then read it back in, but this time tell SSIS that the first row has headers, and ignore the RowType column. Make sure you sort the data correctly before writing it out to the intermediate file!
To dump to a file without headers, you have to set ColumnNamesInFirstDataRow to false. Set this in the properties window, not by editing the connection. More info in this thread
If you have a lot of data, this is obviously very inefficient.
Try the following using row_number:
with cte as
(
select
*,
row_number() over (order by id) as rn
from myTable
)
select
ID,
Col_1 as FirstName,
Col_2 as LastName,
Col_3 as Birthdate
from cte
where rn > 1
output:
| id | firstname | lastname | birthdate |
| --- | --------- | -------- | ---------- |
| 2 | Steve | Bramblet | 1989-01-01 |
| 3 | Bob | Marley | 1967-03-12 |
| 4 | Mickey | Mouse | 1921-04-25 |
Oh, well. There is a pure SSIS approach, assuming the source is a SQL table. Here it is, rather sketchy.
1. Create a variable oColSet of type Object, and 255 variables of type String named sColName_1, sColName_2 ... sColName_255.
2. Create a SQL Task with a query like select top(1) Col_1, Col_2, ... Col_255 from Src where RowType = 'HDR'. Set the task property ResultSet = Full Result Set, and on the Result Set tab set Result Name to 0 and Variable Name to oColSet.
3. Add a ForEach Loop container, set its enumerator to ForEach ADO Enumerator, set the ADO object source variable to oColSet, and set Enumeration mode = Rows in the first table. Then, on the Variable Mappings tab, define the mappings (Variable - Index) like this: sColName_1 - 0, sColName_2 - 1, ... sColName_255 - 254.
4. Create a variable sSQLQuery of type String with a Variable Expression like
"SELECT Col_1 AS ["+@[User::sColName_1]+"],
Col_2 AS ["+@[User::sColName_2]+"],
...
Col_255 AS ["+@[User::sColName_255]+"]
FROM Src WHERE RowType='DTL'"
5. In the ForEach Loop, add your data flow. In the OLEDB Source, set Data access mode to SQL command from variable and provide the variable name User::sSQLQuery. On the Data Flow itself, set DelayValidation=true.
The main idea of this design: retrieve all column names and store them in a temp variable (step 2). Then step 3 parses that result and places each value into the corresponding variable, the first column (index 0) into sColName_1, and so on. Step 4 defines a SQL command as an expression, which is evaluated every time the variable is read. Finally, in the ForEach Loop (where the parsing is done), you perform your data flow.
Limitations of SSIS - data types and column names should be the same at runtime as at design time. If you need to further store your dataset into SQL - let me know, so I could adjust the proposed solution.
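The idea behind the design above (read the HDR row, then build a SELECT whose aliases come from it) can be sketched outside SSIS. The example below is only an illustration using Python's built-in `sqlite3` with three Col_ columns instead of 255. The table name `Src` follows the answer, while `quote_ident` and `data_cols` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Src (ID integer, RowType text, Col_1 text, Col_2 text, Col_3 text)"
)
conn.executemany(
    "insert into Src values (?, ?, ?, ?, ?)",
    [
        (1, "HDR", "FirstName", "LastName", "Birthdate"),
        (2, "DTL", "Steve", "Bramblet", "1989-01-01"),
        (3, "DTL", "Bob", "Marley", "1967-03-12"),
        (4, "DTL", "Mickey", "Mouse", "1921-04-25"),
    ],
)

data_cols = ["Col_1", "Col_2", "Col_3"]  # would be Col_1 .. Col_255 in the real table

# Read the header row once to learn the desired column names.
header = conn.execute(
    f"select {', '.join(data_cols)} from Src where RowType = 'HDR' limit 1"
).fetchone()

def quote_ident(name):
    """Quote an identifier so arbitrary header text is safe as a column alias."""
    return '"' + name.replace('"', '""') + '"'

# Build the dynamic SELECT, skipping columns that have no header value.
select_list = ", ".join(
    f"{col} as {quote_ident(alias)}" for col, alias in zip(data_cols, header) if alias
)
query = f"select ID, {select_list} from Src where RowType = 'DTL' order by ID"

cur = conn.execute(query)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```

Quoting the aliases matters because the header values are arbitrary data and may contain spaces or reserved words.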

SQL LIKE using the same row value

I'm wondering how can I use a row value as a variable for my like statement? For example
ID | PID | DESCRIPTION
1 | 4124 | Hi4124
2 | 2451 | Test
3 | 1467 | Hello
4 | 9642 | Me9642
I have a table above, I want to return IDs 1 and 4 since DESCRIPTION contains PID.
I'm thinking it would be SELECT * from TABLE WHERE DESCRIPTION LIKE '%PID%' but I can't get it.
You can use CONCAT() to assemble the matching pattern, as in:
select *
from t
where description like concat('%', PID, '%')
We could also try using CHARINDEX here:
SELECT ID, PID, DESCRIPTION
FROM yourTable
WHERE CHARINDEX(PID, DESCRIPTION) > 0;
Note that I assume in the demo that the PID column is actually text, and not a numeric column. If PID is numeric, we might have to first use a cast in order to use CHARINDEX (or any of the methods given in the other answers).
Use the CONCAT SQL function
SELECT *
FROM TABLE
WHERE DESCRIPTION LIKE CONCAT('%', PID, '%')
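For a quick local check of both approaches, here is a sketch using Python's built-in `sqlite3`. It makes two substitutions that are assumptions of this example, not part of the answers: the `||` operator replaces `CONCAT()`, and SQLite's `instr(haystack, needle)` replaces `CHARINDEX(needle, haystack)`. PID is stored as an integer to show the cast mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, pid integer, description text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, 4124, "Hi4124"), (2, 2451, "Test"), (3, 1467, "Hello"), (4, 9642, "Me9642")],
)

# Build the LIKE pattern from the row's own PID value.
like_ids = [
    row[0]
    for row in conn.execute(
        "select id from t where description like '%' || pid || '%' order by id"
    )
]

# Substring search instead of LIKE; PID is cast to text first.
instr_ids = [
    row[0]
    for row in conn.execute(
        "select id from t where instr(description, cast(pid as text)) > 0 order by id"
    )
]

print(like_ids)   # [1, 4]
print(instr_ids)  # [1, 4]
```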

What are the technical differences between "select * from table_name" and "select a.* from table_name a"?

This might be a basic question, but I can't find explanations after googling around.
Anyway, a short background story. I have this table that I don't have the permission to alter on DB2:
other_field | date_field | time_field
---------------------------------------
1 | 180101 | 101010
2 | 180102 | 202020
3 | 180103 | 303030
4 | 180104 | 404040
I tried to use:
select *, concat(date_field, time_field) as TIME
from Table_Name
My expected result is displaying something like this:
other_field | date_field | time_field | TIME
--------------------------------------------------------
1 | 180101 | 101010 | 180101101010
2 | 180102 | 202020 | 180102202020
3 | 180103 | 303030 | 180103303030
4 | 180104 | 404040 | 180104404040
But I can't use that query for some reason. It gave me an error ...Token , was not valid. Valid tokens: FROM INTO that basically said a comma (,) after * is invalid.
Then I tried tweaking it a little into:
select a.*, concat(a.date_field, a.time_field) as TIME
from Table_Name a
And it works!
I understand that Table_Name a are often used for joining tables, but I'm curious about the underlying mechanism.
What are the technical differences between using Table_Name and Table_Name a? And what is this a called?
Technically there is no difference between the output of
SELECT * FROM TAB_NAME and SELECT a.* FROM TAB_NAME a.
Here you are just specifying an alias name.
But you will see the difference when you try to fetch another column together with * from TAB_NAME.
That means if you want to get data as below
SELECT *,COL_1,COL2...
FROM TAB_NAME
or
SELECT *,CONCAT(...)
FROM TAB_NAME
or anything else together with *, you have to specify the alias name.
But the question is why? Let me try to explain.
As you know, SELECT * means you are selecting all columns. Once you put * directly after SELECT, you have already told the database to select everything, so the parser expects only the FROM clause next, since there would be nothing left to select. That is why it throws an error every time.
But now the question is how the queries below work internally:
SELECT a.*,COL_1,COL2...
FROM TAB_NAME a
or
SELECT a.*,a.COL_1,a.COL2...
FROM TAB_NAME a
or
SELECT a.*,CONCAT(c1,c2)
FROM TAB_NAME a
or
SELECT a.*,CONCAT(a.c1,a.c2)
FROM TAB_NAME a
or anything else like that.
Here your system understands that you are selecting all columns from table a, which means you may still select any other column, function, etc. from table a or from any other table. That is why the database allows you to add other columns or functions after a.* if required, or to follow a.* directly with the FROM clause.
Db2 (LUW) 11.1 supports this syntax:
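The alias form from the question can be tried locally with Python's built-in `sqlite3`. This is only an illustration under an important assumption: SQLite is not Db2, it also accepts an unqualified `*` followed by other expressions, so it cannot reproduce the original error. The `||` operator stands in for `concat()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Table_Name (other_field integer, date_field text, time_field text)"
)
conn.executemany(
    "insert into Table_Name values (?, ?, ?)",
    [(1, "180101", "101010"), (2, "180102", "202020"),
     (3, "180103", "303030"), (4, "180104", "404040")],
)

# "a" is a correlation name (table alias); a.* expands to all columns of that table.
cur = conn.execute(
    "select a.*, a.date_field || a.time_field as TIME "
    "from Table_Name a order by a.other_field"
)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows[0])
```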
create table Table_Name (
other_field int not null primary key
, date_field date not null
, time_field time not null
)
;
insert into Table_Name values
(1,'2018-01-01', '10.10.10')
, (2,'2018-01-01', '20.20.20')
, (3,'2018-01-01', '13.13.13')
, (4,'2018-01-01', '14.14.14')
;
select *, timestamp(date_field, time_field) as TIME from Table_Name
;
which will return
OTHER_FIELD DATE_FIELD TIME_FIELD TIME
----------- ---------- ---------- ---------------------
1 2018-01-01 10:10:10 2018-01-01 10:10:10.0
2 2018-01-01 20:20:20 2018-01-01 20:20:20.0
3 2018-01-01 13:13:13 2018-01-01 13:13:13.0
4 2018-01-01 14:14:14 2018-01-01 14:14:14.0
BTW I took the liberty of using sensible data types for your example. Use DATE, TIME and TIMESTAMP (or TIMESTAMP(0)) for working with date and time values...