A follow on from a previous question
I'm working with an Oracle 11g database and need to manipulate a string column in it. The column contains multiple email addresses in this format:
jgoozooll#gmail.com;dzhookep#gmail.com;admzmoore#outlook.com
What I want to do is remove any address that does not end in '#gmail.com' (in this example, admzmoore#outlook.com would be removed). However, admzmoore#outlook.com may be the first address in the next row of the column, so there is no fixed position; the only rule is that addresses are separated by semicolons.
Is there any way of doing this in one command that runs through every row in the column and removes anything that's not #gmail.com? I'm not really sure if this kind of processing is possible in SQL. Just looking for your thoughts!
I'm getting the above 'FROM' error in the following code and I can't for the life of me figure out why. Someone will probably make me look stupid, but it's a chance I have to take. There may also be other errors :) Here's my code:
SELECT REMIT_TO.ID
, LISTAGG(EMAIL, ';') WITHIN GROUP(ORDER BY REMIT_TO.ID) REMIT_TO.EFT_EMAIL_ADDR
FROM (SELECT REMIT_TO.ID
, regexp_substr(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+', 1, RN) email
FROM IQMS.REMIT_TO
CROSS JOIN (SELECT ROWNUM RN
FROM(SELECT MAX (REGEXP_COUNT(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+')) ML
FROM IQMS.REMIT_TO
)
CONNECT BY LEVEL <= ML
)
)
WHERE EMAIL LIKE '%#gmail.com%'
GROUP BY REMIT_TO.ID
Anything stick out for anyone?
Thank you, guys.
It appears you are missing a few aliases on your subqueries:
SELECT REMIT_TO.ID
, LISTAGG(EMAIL, ';') WITHIN GROUP(ORDER BY REMIT_TO.ID) AS EFT_EMAIL_ADDR -- a column alias cannot be qualified with a table name
FROM
(
SELECT REMIT_TO.ID
, regexp_substr(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+', 1, RN) email
FROM IQMS.REMIT_TO REMIT_TO
CROSS JOIN
(
SELECT ROWNUM RN
FROM
(
SELECT MAX (REGEXP_COUNT(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+')) ML
FROM IQMS.REMIT_TO
) x2 -- alias needed
CONNECT BY LEVEL <= ML
) x1 -- alias needed
) REMIT_TO -- alias needed
WHERE EMAIL LIKE '%#gmail.com%'
GROUP BY REMIT_TO.ID
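As a cross-check of what the query is meant to produce, the same split, filter, and re-join logic can be sketched outside the database. This is a minimal Python sketch, not Oracle code: the function name `keep_gmail` is hypothetical, and it assumes addresses are separated by `;` and that the test should be on the `#gmail.com` suffix rather than anywhere in the address.

```python
def keep_gmail(addresses: str, suffix: str = "#gmail.com") -> str:
    """Keep only the ';'-separated addresses that end with `suffix`."""
    # Split on ';' and drop empty pieces, similar to regexp_substr(..., '[^;]+', 1, n).
    parts = [p for p in addresses.split(";") if p]
    # Keep the matching addresses in their original order and re-join them with ';'.
    return ";".join(p for p in parts if p.endswith(suffix))


row = "jgoozooll#gmail.com;dzhookep#gmail.com;admzmoore#outlook.com"
print(keep_gmail(row))  # jgoozooll#gmail.com;dzhookep#gmail.com
```

In the SQL, the equivalent suffix test would be `LIKE '%#gmail.com'` without the trailing `%`; with the trailing `%`, an address that merely contains `#gmail.com` also matches.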
My Oracle is a bit rusty, but at first look you're missing a comma after the LISTAGG function.
SELECT REMIT_TO.ID
, LISTAGG(EMAIL, ';') WITHIN GROUP(ORDER BY REMIT_TO.ID)
, REMIT_TO.EFT_EMAIL_ADDR...
I have a table with a column that contains the path to SSIS packages located in a drive. The entire folder path is populated in the column. I need a SQL query to get a section of the string within the folder path.
An example of a record in Column_1:
/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E
All I am interested in extracting is the "SSIS_Packages_Source_to_Target_Data_Snowflake". Everything I have tried so far throws errors. The latest code I tried is:
SELECT SUBSTRING(Column_1, LEFT(CHARINDEX('dtsx', Column_1)), LEN(Column_1) - CHARINDEX('dtsx', Column_1)).
I would really appreciate some help with this.
Thanks!
Given that you know the extension and it's unlikely to appear elsewhere in the string, find it and truncate the string just before it. Do that in a CROSS APPLY so the truncated value can be used multiple times.
Then find the last backslash (using REVERSE) and use SUBSTRING from there to the end.
SELECT
SUBSTRING(Y.[VALUE], LEN(Y.[VALUE]) - PATINDEX('%\%', REVERSE(Y.[VALUE])) + 2, LEN(Y.[VALUE]))
FROM (
VALUES ('/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
) AS X ([Value])
CROSS APPLY (
VALUES (SUBSTRING(X.[Value], 1, PATINDEX('%.dtsx%', X.[Value])-1))
) AS Y ([Value]);
Returns:
SSIS_Packages_Source_to_Target_Data_Snowflake
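The two steps (truncate at the extension, then keep what follows the last backslash) can be sanity-checked outside SQL Server. The following is a minimal Python sketch of the same logic, not T-SQL; the function name `package_name` is hypothetical, and it assumes the extension occurs in the string (it raises `ValueError` otherwise).

```python
def package_name(command: str, ext: str = ".dtsx") -> str:
    """Return the file name (without extension) that precedes `ext`."""
    # Step 1: truncate just before the first occurrence of the extension,
    # like SUBSTRING(value, 1, PATINDEX('%.dtsx%', value) - 1).
    head = command[: command.index(ext)]
    # Step 2: keep everything after the last backslash,
    # which the T-SQL locates with REVERSE + PATINDEX.
    return head[head.rfind("\\") + 1 :]


cmd = r'/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E'
print(package_name(cmd))  # SSIS_Packages_Source_to_Target_Data_Snowflake
```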
Another possible way is this; I'm not sure about its performance, though. Note that the value it returns still includes the .dtsx extension.
SELECT vt.[value]
FROM (
VALUES ('/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
) AS X ([Value])
OUTER APPLY (
SELECT * FROM STRING_SPLIT(x.Value,'\')
) vt
WHERE vt.[value] LIKE '%.dtsx'
Thank you #Dale K for your response and solutions provided. I was able to replicate the same for my query to obtain the result. Below is how I modified the query in my environment to fetch only the new string column1 after applying the string manipulations based on your solution:
SELECT SUBSTRING(Y.column1, LEN(Y.column1) - PATINDEX('%\%', REVERSE(Y.column1)) +2, LEN(Y.column1))
FROM (SELECT column1 FROM CTE1) AS X ([column2])
CROSS APPLY (SELECT SUBSTRING(X.column2, 1, PATINDEX('%.dtsx%', X.column2)-1) FROM CTE1) AS Y ([column1])
I am actually querying a CTE (CTE1) to get my desired result. The issue is that I have other columns in CTE1 that I need to include in the final SELECT, alongside the string-manipulated result from Column1. Currently, I get errors when I try to include other columns from CTE1 along with the result set from the query above.
Example of final query:
Select Jobname,job_step,job_date,job_duration,Column1 (this will be the result set from the string manipulation) FROM CTE1;
So, what I'm currently doing that is not working is as follows:
SELECT C1.Jobname,C1.job_step,C1.job_date,C1.job_duration,Column1 =(SELECT SUBSTRING(Y.column1, LEN(Y.column1) - PATINDEX('%\%', REVERSE(Y.column1)) +2, LEN(Y.column1)) FROM (SELECT column1 FROM CTE1) AS X ([column2]) CROSS APPLY (SELECT SUBSTRING(X.column2, 1, PATINDEX('%.dtsx%', X.column2)-1) FROM CTE1) AS Y ([column1])) FROM CTE1 C1
Please, how can I obtain the final results with all the above columns present in the resultsets?
Thank you.
I have the following data in a table:
GROUP1|FIELD
Z_12TXT|111
Z_2TXT|222
Z_31TBT|333
Z_4TXT|444
Z_52TNT|555
Z_6TNT|666
I then engineer a field that removes the leading digits after the '_':
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_31TBT|Z_TBT|333 <- to be removed
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
How can I easily query the original table for only those GROUP1 values whose GROUP_ALIAS has more than one FIELD in it, that is, dropping the aliases that have only one?
Desired result:
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
This is how I get all the GROUP_ALIAS values I don't want:
SELECT GROUP_ALIAS
FROM
(SELECT
GROUP1,FIELD,
case when instr(GROUP1, '_') = 2
then
substr(GROUP1, 1, 2) ||
ltrim(substr(GROUP1, 3), '0123456789')
else
substr(GROUP1 , 1, 1) ||
ltrim(substr(GROUP1, 2), '0123456789')
end GROUP_ALIAS
FROM MY_TABLE)
GROUP BY GROUP_ALIAS
HAVING COUNT(FIELD)=1
I could probably compute the engineered field a second time on the original table and check that it isn't in the result of the query above, but I want to avoid so much nesting. I don't know how to partition, or do anything more sophisticated, on the CASE expression that builds this engineered field, though.
UPDATE
Thanks for all the great replies below. Something about the SQL used must differ from what I thought because I'm getting info like:
GROUP1|GROUP_ALIAS|FIELD
111,222|,111|111
111,222|,222|222
etc.
Not sure why since the solutions work on my unabstracted data in db-fiddle. If anyone can spot what db it's actually using that would help but I'll also check on my end.
Here is one way, using analytic count. If you are not familiar with the with clause, read up on it - it's a very neat way to make your code readable. The way I declare column names in the with clause works since Oracle 11.2; if your version is older than that, the code needs to be re-written just slightly.
I also computed the "engineered field" in a more compact way. Use whatever you need to.
I used sample_data for the table name; adapt as needed.
with
add_alias (group1, group_alias, field) as (
select group1,
substr(group1, 1, instr(group1, '_')) ||
ltrim(substr(group1, instr(group1, '_') + 1), '0123456789'),
field
from sample_data
)
, add_counts (group1, group_alias, field, ct) as (
select group1, group_alias, field, count(*) over (partition by group_alias)
from add_alias
)
select group1, group_alias, field
from add_counts
where ct > 1
;
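To see what the analytic count is doing, the same logic can be run on the sample rows from the question. This is a minimal Python sketch rather than Oracle code; the helper name `alias` is hypothetical, and it assumes every GROUP1 value contains an underscore.

```python
from collections import Counter

# Sample rows from the question: (GROUP1, FIELD).
rows = [
    ("Z_12TXT", 111), ("Z_2TXT", 222), ("Z_31TBT", 333),
    ("Z_4TXT", 444), ("Z_52TNT", 555), ("Z_6TNT", 666),
]


def alias(group1: str) -> str:
    """Drop the digits that directly follow the first underscore."""
    head, sep, tail = group1.partition("_")
    # Same idea as substr(...) || ltrim(substr(...), '0123456789').
    return head + sep + tail.lstrip("0123456789")


# Equivalent of count(*) over (partition by group_alias).
counts = Counter(alias(g) for g, _ in rows)

# Equivalent of "where ct > 1": keep rows whose alias occurs more than once.
kept = [(g, alias(g), f) for g, f in rows if counts[alias(g)] > 1]
for line in kept:
    print(line)
```

Only the `Z_31TBT` row is dropped, because `Z_TBT` is the one alias that occurs a single time.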
With Oracle you can use REGEXP_REPLACE and analytic functions:
select Group1, group_alias, field
from (select group1, REGEXP_REPLACE(group1,'_\d+','_') group_alias, field,
count(*) over (PARTITION BY REGEXP_REPLACE(group1,'_\d+','_')) as count from test) a
where count > 1
db-fiddle
I have a table of records and am trying to concatenate multiple rows per group using the XMLAGG function. When I run the query for a particular group that has 2000 records, I get this error message:
Select failed 9134 : Intermediate aggregate storage limit for aggregation has been exceeded during computation
SELECT
H.GROUP_id,
H.Group_name,
TRIM(
TRAILING ',' FROM (
XMLAGG(TRIM(COALESCE(H.Group_desc, -1) || '') ORDER BY H.LINE_NBR) (VARCHAR(7000))
)
) AS Group_detail
I even increased the VARCHAR size, but I still have the same issue.
XMLAGG() adds overhead. However, you can get a sense for how large the result set is by using:
SELECT H.GROUP_id, H.Group_name,
SUM(LENGTH(COALESCE(H.Group_Desc, '-1'))) as total_string_length,
COUNT(*) as cnt
FROM . . .
GROUP BY H.GROUP_id, H.Group_name
ORDER BY total_string_length DESC
You will probably find that some of the groups have a total string length close to or more than 7000 characters.
I'm not sure if you want to fix the data or do something else. But this should at least identify the problem.
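The diagnostic above boils down to summing the description lengths per group and comparing them with the VARCHAR size. A minimal Python sketch of that check follows; the rows are hypothetical sample data, `total_lengths` is a made-up helper name, and separators and XMLAGG overhead are deliberately not counted.

```python
from collections import defaultdict

# Hypothetical rows: (group_id, group_desc); None stands for a NULL description.
rows = [(1, "a" * 4000), (1, "b" * 3500), (2, "short"), (2, None)]


def total_lengths(rows, null_text="-1"):
    """Sum the description lengths per group, substituting `null_text` for NULLs."""
    totals = defaultdict(int)
    for group_id, desc in rows:
        totals[group_id] += len(desc if desc is not None else null_text)
    return dict(totals)


LIMIT = 7000  # the VARCHAR(7000) size used in the question
over = {g: n for g, n in total_lengths(rows).items() if n > LIMIT}
print(over)  # {1: 7500}
```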
The problem is that the concatenation is repeated for every row in the dataset; you need to get the distinct Group_desc values first. Try this:
WITH BASE AS(
SEL
H.GROUP_id,
H.Group_name,
H.Group_desc,
MAX(H.LINE_NBR) AS LINE_NBR
FROM TABLE_NAME H -- alias H is needed for the H. prefixes above
GROUP BY 1,2,3
)
SELECT
BASE.GROUP_id,
BASE.Group_name,
TRIM(
TRAILING ',' FROM (
XMLAGG(TRIM(COALESCE(BASE.Group_desc, -1) || '') ORDER BY BASE.LINE_NBR) (VARCHAR(7000)) -- You probably won't need the varchar to be that large.
)
) AS Group_detail
FROM BASE
GROUP BY 1,2
I have used LISTAGG to concatenate data from two different tables to form the following output:
How do I display the above output neatly like this:
I am using Oracle PL/SQL. I am wondering whether this can be done with a cursor, but I am not sure how. Or is there another way to achieve this? Thanks.
It looks like the NATION.N_NAME column's datatype is CHAR, as those names are blank-padded. I'd switch to VARCHAR2 (if possible) or try TRIM, e.g.
select ...
listagg(trim(n.n_name), ', ') within group ...  -- note the TRIM around n.n_name
WITH CTE AS
(SELECT r.R_REGION_KEY AS REGION_KEY
,r.R_NAME
,LISTAGG(trim(n.N_NAME),',') WITHIN GROUP (ORDER BY R_NAME) AS REGION_NATION
FROM REGION r
INNER JOIN NATION n
ON r.R_REGION_KEY = n.N_REGIONKEY
GROUP BY r.R_REGION_KEY
,r.R_NAME
)
SELECT REGION_KEY
,R_NAME || ':' || REGION_NATION as REGION_TEXT
FROM CTE
I am looking for some help in separating scientific names in my data. I want to take only the genus names and group them, but genus and species are combined in the same column. I saw that SQL Server has a CHARINDEX function, but PostgreSQL does not. Does a function need to be created for this? If so, how would it look?
I want to change 'Mallotus philippensis' to just 'Mallotus' or to just 'philippensis'
I am currently using Postgres 11, 12.
Use SPLIT_PART:
WITH yourTable AS (
SELECT 'Mallotus philippensis'::text AS genus
)
SELECT
SPLIT_PART(genus, ' ', 1) AS genus,
SPLIT_PART(genus, ' ', 2) AS species
FROM yourTable;
Demo
Probably string_to_array will be slightly more efficient than split_part here because string splitting will be done only once for each row.
SELECT
val_arr[1] AS genus,
val_arr[2] AS species
FROM (
SELECT string_to_array(val, ' ') as val_arr
FROM (
VALUES
('aaa bbb'),
('cc dddd'),
('e fffff')
) t (val)
) tt;
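For readers who want to check the field-lookup behaviour outside PostgreSQL, here is a minimal Python sketch of a `split_part`-style helper. It is an illustration only: the Python function is hypothetical, it assumes a positive field number, and it mirrors the PostgreSQL behaviour of returning an empty string when the requested field does not exist.

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) field of `text`, or '' if there is no such field."""
    fields = text.split(delimiter)
    return fields[n - 1] if 1 <= n <= len(fields) else ""


name = "Mallotus philippensis"
genus = split_part(name, " ", 1)
species = split_part(name, " ", 2)
print(genus, species)  # Mallotus philippensis
```

A name with no second word yields an empty species here, whereas indexing past the end of the `string_to_array` result in PostgreSQL yields NULL, which matters if the value is later grouped or filtered.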