SQL: How to make a nested query with substring, case statement, and trim

New to SQL and trying to understand nested queries and how to use them. I have a substring, a case statement, and a trim that I'm trying to put together, but I'm unsure how. The substring has to be done first, then the case statement, then the trim. This is what I have at the moment, but I'm unsure how to get it working. The code uses made-up names and tables as an example:
SELECT dtXYZ.*
FROM
(
SELECT dt,
SUBSTRING_INDEX(SUBSTRING_INDEX(dt, ..................... ) as lioness,
SUBSTRING_INDEX(SUBSTRING_INDEX(dt, .....................) as tiger,
SUBSTRING_INDEX(dt, .................) as bear
FROM Animaltab
) dtXYZ
SELECT
CASE WHEN length(bear) = 4 THEN bear
ELSE concat('0', bear)
END AS bear_corr,
CASE WHEN length(lion) = 7 THEN lioness
ELSE concat('0', lioness)
END AS lion_corr
trim(lion_corr) || '_' || trim(tiger) || '_' || trim(bear_corr) as new_imp_animal

Spark supports CTEs: https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-cte.html
This also works on Databricks; see "Common Table Expressions (CTEs) in Databricks and Spark".
And you can chain them like this, with each CTE reading from the previous one:
WITH dtXYZ(dt,lioness,tiger,bear) AS ( SELECT dt,
SUBSTRING_INDEX(SUBSTRING_INDEX(dt, ..................... ) as lioness,
SUBSTRING_INDEX(SUBSTRING_INDEX(dt, .....................) as tiger,
SUBSTRING_INDEX(dt, .................) as bear
FROM Animaltab),
dtcorrected (dt,bear_corr,lion_corr,tiger) as (
SELECT
dt,
CASE WHEN length(bear) = 4 THEN bear
ELSE concat('0', bear)
END AS bear_corr,
CASE WHEN length(lioness) = 7 THEN lioness
ELSE concat('0', lioness)
END AS lion_corr
,tiger
FROM dtXYZ)
SELECT
dt,
trim(lion_corr) || '_' || trim(tiger) || '_' || trim(bear_corr) as new_imp_animal FROM dtcorrected

Order of operations can be tricky in SQL if you're used to sequencing steps in a procedure. As nbk commented, CTEs (Common Table Expressions) are your best bet. CTEs are introduced by the 'with' keyword and are very similar to nested subqueries (you could write the same query nested if you wanted), but they are better suited to a case like this, where the nesting of the code doesn't mirror any nesting in the data. I always use CTEs when I'm joining tables that each need independent grouping or filtering. The SQL in the parentheses essentially creates a view, and the outer SQL is a second, higher-order select statement that produces the result set. If I'm working with hierarchical data (parent, child, grandchild), I'll nest the query to follow that path, but usually a CTE makes it easier to organize your ideas. Here's how that would work:
with dtXYZ as
(
SELECT dt,
SUBSTRING_INDEX(SUBSTRING_INDEX(dt, ..................... ) as lioness,
SUBSTRING_INDEX(SUBSTRING_INDEX(dt, .....................) as tiger,
SUBSTRING_INDEX(dt, .................) as bear
FROM Animaltab
)
SELECT
CASE WHEN length(bear) = 4 THEN bear
ELSE concat('0', bear)
END AS bear_corr,
CASE WHEN length(lioness) = 7 THEN lioness
ELSE concat('0', lioness)
END AS lion_corr,
trim(lion_corr) || '_' || trim(tiger) || '_' || trim(bear_corr) as new_imp_animal
from
dtXYZ
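And in terms of 'order of operations': in engines that support lateral column aliases (recent Spark and Databricks releases do), case expressions and functions in a select list can be referenced by later items of the same select list as inputs. Many other databases do not allow this; there, compute bear_corr and lion_corr in a second CTE and reference them from the final select. Things can get hairy when you use 'if' logic that resolves to illogical or error-causing conditions. Otherwise, I've had no issues with having many parts of a select refer to each other. It's an excellent way to test out nesting functions.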
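For anyone who wants to try the staged approach end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Spark. Several things here are assumptions and not from the question: the '-' delimiter and the sample dt values are made up, the split_first CTE is a hypothetical helper, and substr/instr replace SUBSTRING_INDEX, which SQLite does not have. The point is only the ordering: one CTE per step, each reading from the previous one.

```python
import sqlite3

# Hypothetical sample data: three '-'-separated fields packed into dt.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Animaltab (dt TEXT)")
conn.executemany(
    "INSERT INTO Animaltab (dt) VALUES (?)",
    [("1234567-ab -123",), ("234567-cd-4567",)],
)

QUERY = """
WITH split_first (dt, lioness, rest) AS (
    -- step 1a: peel off the first field, keep the remainder
    SELECT dt,
           substr(dt, 1, instr(dt, '-') - 1),
           substr(dt, instr(dt, '-') + 1)
    FROM Animaltab
),
dtXYZ (dt, lioness, tiger, bear) AS (
    -- step 1b: split the remainder into the last two fields
    SELECT dt,
           lioness,
           substr(rest, 1, instr(rest, '-') - 1),
           substr(rest, instr(rest, '-') + 1)
    FROM split_first
),
dtcorrected (dt, lion_corr, tiger, bear_corr) AS (
    -- step 2: left-pad with '0' when the length is not the expected one
    SELECT dt,
           CASE WHEN length(lioness) = 7 THEN lioness ELSE '0' || lioness END,
           tiger,
           CASE WHEN length(bear) = 4 THEN bear ELSE '0' || bear END
    FROM dtXYZ
)
-- step 3: trim each part and join them
SELECT dt,
       trim(lion_corr) || '_' || trim(tiger) || '_' || trim(bear_corr) AS new_imp_animal
FROM dtcorrected
ORDER BY dt
"""

animal_keys = conn.execute(QUERY).fetchall()
conn.close()

for dt, key in animal_keys:
    print(dt, "->", key)
```

Because every alias is defined in an earlier CTE than the one that uses it, this shape does not depend on the engine supporting references between items of the same select list.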

Related

ListAgg Over ListAgg - Oracle

I wanted to collect data from multiple rows of the same column (SRAV.XYZ) and concatenate it with another column, so I used LISTAGG:
SELECT LISTAGG (
REGEXP_SUBSTR (SRAV.XYZ,
'[^:]+$'),
';')
WITHIN GROUP (ORDER BY
REGEXP_SUBSTR (
SRAV.XYZ,
'[^:]+$')) ||';'||SRA.ABC
/*(CASE
WHEN SRA.ABC like 'PROF.TMP' THEN SRA.ABC = 'TMP'
WHEN SRA.ABC like 'PROF' THEN SRA.ABC ='PROF'
ELSE SRA.ABC='EMPLOYEES' END) */
FROM TEST1 SPAEM,
TEST2 SRAV,
TEST3 srm,
TEST4 SRA
WHERE SRAV.RID = srm.RGID
AND SRAV.PID IN
('123RTU23',
'456U43',
'AB4577Y')
AND SRAV.XYZ IS NOT NULL
AND SPAEM.EMPID = srm.SEC_UUID
AND SRAV.PID = SRA.PRID
AND SPAEM.EMPID = 139806
group by ABC
I am able to get the output in the below format:
physics;PROF.TMP
bio;EMPLOYEES
Now I have two issues that I am unable to handle.
First, I want the output on a single row, in the format below:
physics;PROF.TMP,bio;EMPLOYEES
Second, my CASE WHEN does not work when I try to concatenate it (hence it is commented out).
The ideal output would be:
physics;TMP,bio;EMPLOYEES
Any help in this regard would be appreciated.
Regards,
CASE probably doesn't work because of LIKE: without wildcards it behaves exactly like =. The syntax also looks wrong to me: each THEN (and the ELSE) has to return a value, not an assignment such as SRA.ABC = 'TMP'. Perhaps you meant something like this:
CASE
WHEN SRA.ABC like '%PROF.TMP%' THEN 'TMP'
WHEN SRA.ABC like '%PROF%' THEN 'PROF'
ELSE 'EMPLOYEES'
END
As for listagg over listagg: use your current query as a subquery or as a CTE, and then apply another listagg to its result:
with your_current_query as
(select listagg(...) within group (order by ...) as result_1
from ...
where ...
group by ...
)
-- apply listagg to result_1
select listagg(result_1, ',') within group (order by ...) as final_result
from your_current_query
That's theory. If you want something more, provide a simple test case.
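In the absence of a test case, here is a small runnable sketch of the same two-level idea, using Python's built-in sqlite3 module. This is an assumption-heavy stand-in, not Oracle: SQLite's group_concat plays the role of LISTAGG (without the WITHIN GROUP ordering), and the roles table with its two rows is hypothetical, shaped only to reproduce the output from the question.

```python
import sqlite3

# Hypothetical table standing in for the four joined tables in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roles (subject TEXT, abc TEXT)")
conn.executemany(
    "INSERT INTO roles VALUES (?, ?)",
    [("physics", "PROF.TMP"), ("bio", "EMPLOYEES")],
)

QUERY = """
WITH per_role AS (
    -- first aggregation: one string per abc value, with the CASE mapping applied
    SELECT group_concat(subject, ';') || ';' ||
           CASE
               WHEN abc LIKE '%PROF.TMP%' THEN 'TMP'
               WHEN abc LIKE '%PROF%' THEN 'PROF'
               ELSE 'EMPLOYEES'
           END AS result_1
    FROM roles
    GROUP BY abc
)
-- second aggregation: collapse the per-group strings into a single row
SELECT group_concat(result_1, ',') AS final_result
FROM per_role
"""

final_result = conn.execute(QUERY).fetchone()[0]
conn.close()
print(final_result)
```

SQLite does not promise an element order for group_concat here, so the two parts may come out in either order; in Oracle the WITHIN GROUP (ORDER BY ...) clause is what pins the order down.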

How to easily remove count=1 on aliased field in SQL?

I have the following data in a table:
GROUP1|FIELD
Z_12TXT|111
Z_2TXT|222
Z_31TBT|333
Z_4TXT|444
Z_52TNT|555
Z_6TNT|666
And I engineer a field that strips the digits immediately following the '_':
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_31TBT|Z_TBT|333 <- to be removed
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
How can I easily query the original table for only the GROUP1 values whose GROUP_ALIAS occurs more than once, that is, drop every GROUP_ALIAS with a count of 1?
Desired result:
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
This is how I get all the GROUP_ALIAS values I don't want:
SELECT GROUP_ALIAS
FROM
(SELECT
GROUP1,FIELD,
case when instr(GROUP1, '_') = 2
then
substr(GROUP1, 1, 2) ||
ltrim(substr(GROUP1, 3), '0123456789')
else
substr(GROUP1 , 1, 1) ||
ltrim(substr(GROUP1, 2), '0123456789')
end GROUP_ALIAS
FROM MY_TABLE)
GROUP BY GROUP_ALIAS
HAVING COUNT(FIELD)=1
Probably I could make the engineered field a second time simply on the original table and check that it isn't in the result from the latter, but want to avoid so much nesting. I don't know how to partition or do anything more sophisticated on my case statement making this engineered field, though.
UPDATE
Thanks for all the great replies below. Something about the SQL used must differ from what I thought because I'm getting info like:
GROUP1|GROUP_ALIAS|FIELD
111,222|,111|111
111,222|,222|222
etc.
Not sure why since the solutions work on my unabstracted data in db-fiddle. If anyone can spot what db it's actually using that would help but I'll also check on my end.
Here is one way, using analytic count. If you are not familiar with the with clause, read up on it - it's a very neat way to make your code readable. The way I declare column names in the with clause works since Oracle 11.2; if your version is older than that, the code needs to be re-written just slightly.
I also computed the "engineered field" in a more compact way. Use whatever you need to.
I used sample_data for the table name; adapt as needed.
with
add_alias (group1, group_alias, field) as (
select group1,
substr(group1, 1, instr(group1, '_')) ||
ltrim(substr(group1, instr(group1, '_') + 1), '0123456789'),
field
from sample_data
)
, add_counts (group1, group_alias, field, ct) as (
select group1, group_alias, field, count(*) over (partition by group_alias)
from add_alias
)
select group1, group_alias, field
from add_counts
where ct > 1
;
With Oracle you can use REGEXP_REPLACE and analytic functions:
select Group1, group_alias, field
from (select group1, REGEXP_REPLACE(group1,'_\d+','_') group_alias, field,
count(*) over (PARTITION BY REGEXP_REPLACE(group1,'_\d+','_')) as count from test) a
where count > 1
db-fiddle
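If analytic functions are not available, the GROUP BY / HAVING idea from the question also works once the engineered field lives in a CTE, because the CTE can be referenced twice without repeating the expression. Below is a minimal sketch of that variant, run through Python's built-in sqlite3 module as an assumption (the thread is about Oracle); the table name sample_data follows the first answer and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_data (group1 TEXT, field INTEGER)")
conn.executemany(
    "INSERT INTO sample_data VALUES (?, ?)",
    [("Z_12TXT", 111), ("Z_2TXT", 222), ("Z_31TBT", 333),
     ("Z_4TXT", 444), ("Z_52TNT", 555), ("Z_6TNT", 666)],
)

QUERY = """
WITH add_alias (group1, group_alias, field) AS (
    -- keep everything up to and including '_', then strip the leading digits
    SELECT group1,
           substr(group1, 1, instr(group1, '_')) ||
           ltrim(substr(group1, instr(group1, '_') + 1), '0123456789'),
           field
    FROM sample_data
)
SELECT group1, group_alias, field
FROM add_alias
WHERE group_alias IN (
    -- aliases that occur more than once
    SELECT group_alias
    FROM add_alias
    GROUP BY group_alias
    HAVING count(*) > 1
)
ORDER BY group1
"""

kept_rows = conn.execute(QUERY).fetchall()
conn.close()

for row in kept_rows:
    print(row)
```

The analytic count in the answers above does the same job in a single pass over the data; this form is mainly useful where window functions are unavailable.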

How to put together data from different tables without duplicating

I'm feeling a bit stupid now, but I can't seem to make this work. I have some tables with data, and the problem with one SELECT is that the data is sometimes duplicated. (Sorry, English is not my first language; ask if anything is unclear.)
SELECT (IM_FAKTUROR.FAKT_NUMMER || ' ' || IM_FAKTURA_GRUPPER.FAKT_TYP) AS 'ProjektNrNamn',
But sometimes those two columns hold exactly the same data, and in those cases I only want the data from one of them, not both. How can I do that?
If the data in the two differs, I want all of it.
Use a case expression. If the two columns have the same value, just return one of them. Else return both of them:
SELECT case when IM_FAKTUROR.FAKT_NUMMER = IM_FAKTURA_GRUPPER.FAKT_TYP
then IM_FAKTUROR.FAKT_NUMMER
else (IM_FAKTUROR.FAKT_NUMMER || ' ' || IM_FAKTURA_GRUPPER.FAKT_TYP)
end AS 'ProjektNrNamn',
Try this:
SELECT DISTINCT ProjektNrNamn FROM
(
SELECT (IM_FAKTUROR.FAKT_NUMMER || ' ' || IM_FAKTURA_GRUPPER.FAKT_TYP) AS 'ProjektNrNamn'
FROM ...
) as t
Try SELECT DISTINCT:
SELECT DISTINCT
IM_FAKTUROR.FAKT_NUMMER || ' ' || IM_FAKTURA_GRUPPER.FAKT_TYP AS 'ProjektNrNamn'
FROM ...
But this answer assumes that the project name is the only thing in your select list. If you have other columns, it gets more complicated.
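To see the difference between the approaches, here is a small sketch of the case expression from the first answer, run with Python's built-in sqlite3 module. The single invoices table and its two rows are hypothetical; they stand in for the already-joined IM_FAKTUROR and IM_FAKTURA_GRUPPER columns, whose join the question does not show.

```python
import sqlite3

# Hypothetical stand-in for the joined result of the two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (fakt_nummer TEXT, fakt_typ TEXT)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("1001", "Bygg"), ("Service", "Service")],
)

QUERY = """
SELECT CASE WHEN fakt_nummer = fakt_typ
            THEN fakt_nummer                      -- same value: show it once
            ELSE fakt_nummer || ' ' || fakt_typ   -- different values: show both
       END AS ProjektNrNamn
FROM invoices
ORDER BY rowid
"""

project_names = [row[0] for row in conn.execute(QUERY)]
conn.close()
print(project_names)
```

Note that SELECT DISTINCT removes duplicate result rows; it does not collapse a repeated value inside one concatenated string, which is what the case expression handles.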

SQL: conditioning in aggregation function

I have the following SQL:
SELECT LISTAGG(TO_CHAR(ch.count), '|') WITHIN GROUP (ORDER BY ch.Count)
FROM ChG cg
JOIN Ch ch on ch.GroupID = cg.GroupID
WHERE cg.PartyID = cp.PartyID
I would like to add a condition. Pseudocode:
if (ch.TYPECODE = 1) then ch.count = 'A' + ch.count. What is the best way to achieve this in a stored procedure?
listagg(case when ch.typecode = 1 then 'A' end || to_char(ch.count), '|') .....
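Before aggregation, each row is inspected for the condition ch.typecode = 1. If it is true, an 'A' is prepended to (concatenated in front of) to_char(ch.count); otherwise the case expression yields NULL, and in Oracle concatenating NULL leaves the string unchanged. I am just guessing that's what you need.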
If you need this 'A' to be pre-pended to ch.count for the ORDER BY condition also, then you can do the same thing there. You will need to wrap within to_char as well. (If you don't, Big Brother Oracle will do it for you anyway, but try to avoid implicit conversions whenever possible.)
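A quick way to try the conditional prefix is the sketch below, which uses Python's built-in sqlite3 module with group_concat standing in for LISTAGG. The ch table layout and the column name cnt are assumptions made for the example. One difference matters when porting: in SQLite (and most non-Oracle engines) NULL || 'x' is NULL, so the case expression needs an explicit ELSE '' there.

```python
import sqlite3

# Hypothetical layout: two rows in one group, only the first has typecode 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ch (group_id INTEGER, typecode INTEGER, cnt INTEGER)")
conn.executemany("INSERT INTO ch VALUES (?, ?, ?)", [(1, 1, 5), (1, 2, 7)])

QUERY = """
SELECT group_concat(
           -- prefix 'A' only for typecode 1; ELSE '' avoids NULL || text = NULL
           CASE WHEN typecode = 1 THEN 'A' ELSE '' END || CAST(cnt AS TEXT),
           '|')
FROM ch
GROUP BY group_id
"""

labels = conn.execute(QUERY).fetchone()[0]
conn.close()
print(labels)
```

The element order is not guaranteed in this SQLite form; in Oracle the WITHIN GROUP (ORDER BY ...) clause controls it.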

How to avoid multiple function executions in case expression without nesting

So let's assume I have the following query:
SELECT
CASE
WHEN SOME_PACKAGE.SOME_EXPENSIVE_FUNCTION IS NOT NULL
THEN SOME_PACKAGE.SOME_EXPENSIVE_FUNCTION || ', random string'
ELSE
'something_else'
END
FROM
SOME_TABLE;
Is there a way to prevent this package function from executing more than once without nested queries?
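This should work, since the function appears only once. In Oracle, concatenating NULL with a string yields the string, so NULLIF returns NULL exactly when the function returned NULL, and NVL then substitutes 'something_else':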
SELECT NVL(NULLIF(SOME_PACKAGE.SOME_EXPENSIVE_FUNCTION
||', random string',', random string'),'something_else')
FROM SOME_TABLE;