How to easily remove count=1 on aliased field in SQL?

I have the following data in a table:
GROUP1|FIELD
Z_12TXT|111
Z_2TXT|222
Z_31TBT|333
Z_4TXT|444
Z_52TNT|555
Z_6TNT|666
I then engineer a field that strips the leading digits after the '_':
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_31TBT|Z_TBT|333 <- to be removed
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
How can I easily query the original table for only the GROUP1 values whose GROUP_ALIAS contains more than one distinct FIELD (that is, drop every alias that occurs only once)?
Desired result:
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
This is how I get all the GROUP_ALIAS values I don't want:
SELECT GROUP_ALIAS
FROM
  (SELECT
     GROUP1, FIELD,
     -- keep the prefix up to '_' and strip the digits that follow it
     case when instr(GROUP1, '_') = 2
       then
         substr(GROUP1, 1, 2) ||
         ltrim(substr(GROUP1, 3), '0123456789')
       else
         substr(GROUP1, 1, 1) ||
         ltrim(substr(GROUP1, 2), '0123456789')
     end GROUP_ALIAS
   FROM MY_TABLE
  ) -- closing parenthesis of the inline view
GROUP BY GROUP_ALIAS
HAVING COUNT(FIELD)=1
I could probably build the engineered field a second time directly on the original table and check that it isn't in the result of the query above, but I want to avoid that much nesting. However, I don't know how to partition, or do anything more sophisticated, on the CASE expression that produces this engineered field.
UPDATE
Thanks for all the great replies below. Something about the SQL used must differ from what I thought because I'm getting info like:
GROUP1|GROUP_ALIAS|FIELD
111,222|,111|111
111,222|,222|222
etc.
Not sure why since the solutions work on my unabstracted data in db-fiddle. If anyone can spot what db it's actually using that would help but I'll also check on my end.

Here is one way, using analytic count. If you are not familiar with the with clause, read up on it - it's a very neat way to make your code readable. The way I declare column names in the with clause works since Oracle 11.2; if your version is older than that, the code needs to be re-written just slightly.
I also computed the "engineered field" in a more compact way. Use whatever you need to.
I used sample_data for the table name; adapt as needed.
with
add_alias (group1, group_alias, field) as (
select group1,
substr(group1, 1, instr(group1, '_')) ||
ltrim(substr(group1, instr(group1, '_') + 1), '0123456789'),
field
from sample_data
)
, add_counts (group1, group_alias, field, ct) as (
select group1, group_alias, field, count(*) over (partition by group_alias)
from add_alias
)
select group1, group_alias, field
from add_counts
where ct > 1
;
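To check what the analytic-count query should return, the same logic can be sketched outside the database. This is only an illustrative Python sketch, not part of the answer: the helper name group_alias is made up here, the rows are the sample data from the question, and it assumes every GROUP1 value contains an underscore.

```python
import re
from collections import Counter

# Sample rows copied from the question: (GROUP1, FIELD)
rows = [
    ("Z_12TXT", 111),
    ("Z_2TXT", 222),
    ("Z_31TBT", 333),
    ("Z_4TXT", 444),
    ("Z_52TNT", 555),
    ("Z_6TNT", 666),
]


def group_alias(group1: str) -> str:
    """Drop the digits that directly follow the first underscore.

    Hypothetical helper; assumes the value contains an underscore.
    """
    return re.sub(r"_\d+", "_", group1, count=1)


# Equivalent of count(*) over (partition by group_alias)
counts = Counter(group_alias(g) for g, _ in rows)

# Equivalent of WHERE ct > 1
kept = [(g, group_alias(g), f) for g, f in rows if counts[group_alias(g)] > 1]

for row in kept:
    print(*row, sep="|")
```

With the sample data this keeps the five Z_TXT and Z_TNT rows and drops Z_31TBT, which matches the desired result in the question.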

With Oracle you can use REGEXP_REPLACE and analytic functions:
select Group1, group_alias, field
from (select group1, REGEXP_REPLACE(group1,'_\d+','_') group_alias, field,
count(*) over (PARTITION BY REGEXP_REPLACE(group1,'_\d+','_')) as count from test) a
where count > 1

Related

ORACLE TO_CHAR SPECIFY OUTPUT DATA TYPE

I have column with data such as '123456789012'
I want to insert a '/' after every 3 characters so that the output looks like "123/456/789/012".
I tried "SELECT TO_CHAR(DATA, '999/999/999/999') FROM TABLE_1", but it does not print the output I wanted. Previously I ran "SELECT TO_CHAR(DATA, '$999,999,999,999.99') FROM TABLE_1" and it printed "$123,456,789,012.00", so I thought I could do the same here, but I guess that's not the case.
There is also a case where I want to put '#' in front of the data, so the output would be something like #12345678901234. Can I use TO_CHAR for this too?
Is this possible? When I went through the Oracle documentation for TO_CHAR, it listed the formats that can be used, and the format I want is not among them.
Thank you in advance. :D
Here is one option with varchar2 datatype:
with test as (
select '123456789012' a from dual
)
select listagg(substr(a,(level-1)*3+1,3),'/') within group (order by rownum) num
from test
connect by level <=length(a)
or
with test as (
select '123456789012.23' a from dual
)
select '$'||listagg(substr((regexp_substr(a,'[0-9]{1,}')),(level-1)*3+1,3),',') within group (order by rownum)||regexp_substr(a,'[.][0-9]{1,}') num
from test
connect by level <=length(a)
output:
1st query
123/456/789/012
2nd query
$123,456,789,012.23
If you want groups of three then you can use the group separator G, and specify the character to use:
SELECT TO_CHAR(DATA, 'FM999G999G999G999', 'NLS_NUMERIC_CHARACTERS=./') FROM TABLE_1
123/456/789/012
If you want a leading # then you can use the currency indicator L, and again specify the character to use:
SELECT TO_CHAR(DATA, 'FML999999999999', 'NLS_CURRENCY=#') FROM TABLE_1
#123456789012
Or combine both:
SELECT TO_CHAR(DATA, 'FML999G999G999G999', 'NLS_CURRENCY=# NLS_NUMERIC_CHARACTERS=./') FROM TABLE_1
#123/456/789/012
The data type is always a string; only the format changes.
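For comparison, the grouping that the G format element performs can be sketched in plain Python. This is a minimal illustration rather than anything Oracle-specific: the function name group_digits and its parameters are invented for this sketch, and it assumes the input is already a string of digits.

```python
def group_digits(digits: str, size: int = 3, sep: str = "/", prefix: str = "") -> str:
    """Split a digit string into chunks of `size`, counted from the right.

    Hypothetical helper; counting from the right mirrors how a group
    separator is applied to a number.
    """
    chunks = []
    end = len(digits)
    while end > 0:
        start = max(0, end - size)
        chunks.append(digits[start:end])
        end = start
    return prefix + sep.join(reversed(chunks))


print(group_digits("123456789012"))              # 123/456/789/012
print(group_digits("123456789012", prefix="#"))  # #123/456/789/012
```

As in the SQL versions, the result is a string; only the presentation of the digits changes.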

SQL Server 2017 - Concatenate values based on order number

I am concatenating values where the grouping count is greater than 1. This works fine, but now I am trying to figure out how to use a sequence/order number so that the values are concatenated in that order. Is there a way I can do this?
So for example, I have a table which has the following:
I need the ability to concatenate Field1 and Field4 since the StagingCol is the same name and I also need to be able to concatenate in the order provided in the ConcatenateOrder column. I can't have it out of sequence i.e. Field4 + Field1
This is a snippet of the code I have so far which is working fine, it concatenates the two LandingZoneCol values...
--DECLARATION OF LANDING ZONE FIELD NAMES CONCATENATED TOGETHER AND DELIMITED BY A COMMA WHERE VALUE NEEDS TO BE CONCATENATED (I.E. SUBSCRIBER + SEQ# = MEMBER_ID)
SELECT @ConcatLandingZoneFieldNames = ISNULL(@ConcatLandingZoneFieldNames,'') + ISNULL(LandZoneMapping.LandingZoneFieldName,'') + ', '
FROM #LandingZoneMapping AS LandZoneMapping
WHERE LandZoneMapping.StagingColumnName <> 'BreadCrumb'
AND LandZoneMapping.StagingColumnName IN (SELECT StagingColumnName
FROM #TEST
WHERE Total > 1)
--DECLARATION OF VARIABLES
SET @ConcatLandingZoneFieldNames = CONCAT('CONCAT(',SUBSTRING(@ConcatLandingZoneFieldNames,1,LEN(@ConcatLandingZoneFieldNames)-1),')')
Current Results
CONCAT(Field1, Field4)
Expected Results
CONCAT(Field1, Field4)
Although both Current and Expected are the same right now, I want to ensure that the values are concatenated in the correct order. If I were to flip the ConcatenateOrder numbers in the above table, the outcome should be different: the "Current Results" would still be CONCAT(Field1, Field4), but the "Expected Results" should be CONCAT(Field4, Field1).
Any help would be greatly appreciated.
Your code looks like SQL Server. You can use string_agg():
select string_agg(lzm.landingzonecol, ',') within group (order by lzm.concatenateorder)
from #LandingZoneMapping lzm
where lzm.stagingcol = 'ID';
You can control the ordering with the order by clause.
As mentioned by Gordon, you can use the string_agg function if you are using SQL Server 2017 or above.
If you need the same functionality on a SQL Server version below 2017, use this:
SELECT STUFF((
SELECT CONCAT (
','
,LandingZoneCol
)
FROM #LandingZoneMapping LZM
WHERE StagingCol = 'ID'
ORDER BY ConcatenateOrder
FOR XML PATH('')
), 1, 1, '') AS Result
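The effect of ordering by ConcatenateOrder before concatenating can be sketched in Python. The mapping rows below are made up for illustration (the original table was not preserved in the question), and concat_expression is a hypothetical helper, not part of either answer.

```python
# Hypothetical mapping rows: (StagingCol, LandingZoneCol, ConcatenateOrder).
# Field1 and Field4 are given flipped order numbers, as in the question's example.
mapping = [
    ("ID", "Field1", 2),
    ("ID", "Field4", 1),
    ("Name", "Field2", 1),
]


def concat_expression(rows, staging_col):
    """Build a CONCAT(...) string with columns listed in ConcatenateOrder."""
    ordered = sorted(
        (r for r in rows if r[0] == staging_col),
        key=lambda r: r[2],  # sort by ConcatenateOrder, like ORDER BY in the SQL
    )
    return "CONCAT(" + ", ".join(r[1] for r in ordered) + ")"


print(concat_expression(mapping, "ID"))  # CONCAT(Field4, Field1)
```

Sorting first and joining afterwards is the same idea as WITHIN GROUP (ORDER BY ...) in string_agg or ORDER BY inside the FOR XML PATH subquery.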

Sort varchar datatype with numeric characters

SQL Server 2005, sorting a varchar column.
The values should sort as:
1.aaaa
5.xx
11.bbbbbb
12
15.
How can I get this sort order?
What I currently get (wrong):
1.aaaa
11.bbbbbb
12
15.
5.xx
On Oracle, this would work.
SELECT
*
FROM
table
ORDER BY
to_number(regexp_substr(COLUMN,'^[0-9]+')),
regexp_substr(column,'\..*');
You could do this by calculating a column based on what's on the left hand side of the period('.').
However this method will be very difficult to make robust enough to use in a production system, unless you can make a lot of assertions about the content of the strings.
Also, handling strings without periods could cause some grief.
with r as (
select '1.aaaa' as string
union select '5.xx'
union select '11.bbbbbb'
union select '12'
union select '15.' )
select *
from r
order by
CONVERT(int, left(r.string, case when ( CHARINDEX('.', r.string)-1 < 1)
then LEN(r.string)
else CHARINDEX('.', r.string)-1 end )),
r.string
If all the entries have this form, you could split them into two parts and sort by these, for example like this:
ORDER BY
CONVERT(INT, SUBSTRING(fieldname, 1, CHARINDEX('.', fieldname) - 1)), -- stop before the '.'; assumes every value contains one
SUBSTRING(fieldname, CHARINDEX('.', fieldname) + 1, LEN(fieldname))
This should do a numeric sort on the part before the . and an alphanumeric sort for the part after the ., but may need some tuning, as I haven't actually tried it.
Another way (and faster) might be to create computed columns that contain the part before the . and after the . and sort by them.
A third way (if you can't create computed columns) could be to create a view over the table that has two additional columns with the respective parts of the field and then do the select on that view.
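The split-and-sort idea behind all of these answers can be sketched in Python with the sample values from the question. The function sort_key is a hypothetical helper, and placing values without leading digits first is an assumption made only for this sketch.

```python
import re

# Sample values from the question, in the "wrong" string order
values = ["1.aaaa", "11.bbbbbb", "12", "15.", "5.xx"]


def sort_key(value: str):
    """Return (numeric prefix, remainder) so the prefix sorts as a number."""
    match = re.match(r"(\d+)(.*)", value)
    if match is None:
        # No leading digits: sort such values first, by their text (an assumption)
        return (-1, value)
    return (int(match.group(1)), match.group(2))


print(sorted(values, key=sort_key))
# ['1.aaaa', '5.xx', '11.bbbbbb', '12', '15.']
```

A computed column or view, as suggested above, would store these two key parts so the database can sort on them directly.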

SQL 'FROM' not where expected

A follow-on from a previous question:
I'm working with an Oracle 11g DB and need to manipulate a string column within it. The column contains multiple email addresses in this format:
jgoozooll@gmail.com;dzhookep@gmail.com;admzmoore@outlook.com
What I want to do is take out anything that does not have '@gmail.com' at the end (in this example admzmoore@outlook.com would be removed). However, admzmoore@outlook.com may be the first email in the next row of the column, so there is no real fixed format; the only rule is that each address is separated by a semicolon.
Is there any way of implementing this through one command that runs through every row in the column and removes anything that's not @gmail.com? I'm not really sure if this kind of processing is possible in SQL. Just looking for your thoughts!
I'm getting the 'FROM' error from the title in the following code, and I can't for the life of me figure out why. Someone will probably make me look stupid, but it's a chance I have to take. There may also be other errors :) Here's my code:
SELECT REMIT_TO.ID
, LISTAGG(EMAIL, ';') WITHIN GROUP(ORDER BY REMIT_TO.ID) REMIT_TO.EFT_EMAIL_ADDR
FROM (SELECT REMIT_TO.ID
, regexp_substr(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+', 1, RN) email
FROM IQMS.REMIT_TO
CROSS JOIN (SELECT ROWNUM RN
FROM(SELECT MAX (REGEXP_COUNT(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+')) ML
FROM IQMS.REMIT_TO
)
CONNECT BY LEVEL <= ML
)
)
WHERE EMAIL LIKE '%@gmail.com%'
GROUP BY REMIT_TO.ID
Anything stick out for anyone?
Thank you, guys.
It appears you are missing a few aliases on your subqueries:
SELECT REMIT_TO.ID
, LISTAGG(EMAIL, ';') WITHIN GROUP(ORDER BY REMIT_TO.ID) EFT_EMAIL_ADDR -- a column alias cannot be table-qualified
FROM
(
SELECT REMIT_TO.ID
, regexp_substr(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+', 1, RN) email
FROM IQMS.REMIT_TO REMIT_TO
CROSS JOIN
(
SELECT ROWNUM RN
FROM
(
SELECT MAX (REGEXP_COUNT(REMIT_TO.EFT_EMAIL_ADDR, '[^;]+')) ML
FROM IQMS.REMIT_TO
) x2 -- alias needed
CONNECT BY LEVEL <= ML
) x1 -- alias needed
) REMIT_TO -- alias needed
WHERE EMAIL LIKE '%@gmail.com%'
GROUP BY REMIT_TO.ID
My Oracle is a bit rusty, but at first glance you're missing a comma after the LISTAGG function.
SELECT REMIT_TO.ID
, LISTAGG(EMAIL, ';') WITHIN GROUP(ORDER BY REMIT_TO.ID)
, REMIT_TO.EFT_EMAIL_ADDR...
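The split, filter and re-aggregate steps that the query performs with regexp_substr and LISTAGG can be sketched in Python for a single column value. This is only an illustration: keep_domain is a hypothetical helper, the sample addresses are assumed to use '@' as the separator before the domain, and the sketch matches the suffix at the end of each address, which is stricter than the LIKE '%...%' test in the SQL.

```python
def keep_domain(addresses: str, suffix: str = "@gmail.com", sep: str = ";") -> str:
    """Keep only the addresses in a separated list that end with `suffix`."""
    parts = [p.strip() for p in addresses.split(sep)]
    kept = [p for p in parts if p.lower().endswith(suffix.lower())]
    return sep.join(kept)


row = "jgoozooll@gmail.com;dzhookep@gmail.com;admzmoore@outlook.com"
print(keep_domain(row))  # jgoozooll@gmail.com;dzhookep@gmail.com
```

The position of the unwanted address within the list does not matter, because each address is tested on its own after the split.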

SQL Using ORDER BY with UNION doesn't sort numbers correctly (e.g. 10 before 8)

I've tried looking for the answer, and read many threads on this site, but still can't find the answer I'm looking for.
I am trying to sort a series of numbers that are real look-ups together with one * row which isn't. The sort works fine when I don't add the fake * row, but not once I add it.
I have tried
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear
FROM MasterTable
UNION ALL
SELECT DISTINCT "*" As [ClassName], "1" As [ClassYear]
FROM MasterTable
ORDER BY MasterTable.ClassYear;
And
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear
FROM (
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear FROM MasterTable
UNION
SELECT DISTINCT "*" As [ClassName], "1" As [ClassYear] FROM MasterTable
)
ORDER BY MasterTable.ClassYear;
But both return the ClassYear as 1, 10, 12, 8... rather than 1, 8, 10, 12....
Any help would be much appreciated,
Thanks :)
MasterTable.ClassYear is varchar so it will sort as a string.
You'll have to convert it in the query or fix the column type.
For the 2nd clause, you also need only:
SELECT "*" As [ClassName], "1" As [ClassYear] --No FROM MasterTable
However, you can "cheat" and do this. Here 1 will be int and will force a conversion to int for the 1st clause:
SELECT "*" As [ClassName], 1 As [ClassYear] --force int. And fixed on edit
UNION ALL
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear
FROM MasterTable
ORDER BY ClassYear; --no table ref needed
It's properly sorting those values as strings. If you want them in numerical order, try something like Cast(MasterTable.ClassYear AS int), either in the select or in the order by, or both, depending on how you end up structuring your query.
And instead of SELECT ..., "1" As [ClassYear], write SELECT ..., 1 As [ClassYear].
You are returning the year as a string, not a number. That means that it's sorted as text, not numerically.
Either return the year as a number, or convert the value into a number when sorting it.
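The difference between sorting the year as text and as a number can be reproduced with a small script. This sketch uses Python's built-in sqlite3 module as a stand-in database, so the SQL dialect differs from the one in the question; the class names A, B and C are made up, and ClassYear is assumed to be stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# ClassYear is deliberately stored as text, as in the question
conn.execute("CREATE TABLE MasterTable (ClassName TEXT, ClassYear TEXT)")
conn.executemany(
    "INSERT INTO MasterTable VALUES (?, ?)",
    [("A", "8"), ("B", "10"), ("C", "12")],  # hypothetical class names
)

# The real look-ups plus the fake '*' row
union_sql = """
    SELECT '*' AS ClassName, '1' AS ClassYear
    UNION ALL
    SELECT DISTINCT ClassName, ClassYear FROM MasterTable
"""

# Sorting the text column compares character by character
as_text = [r[1] for r in conn.execute(
    f"SELECT ClassName, ClassYear FROM ({union_sql}) ORDER BY ClassYear")]

# Casting in ORDER BY sorts by numeric value instead
as_number = [r[1] for r in conn.execute(
    f"SELECT ClassName, ClassYear FROM ({union_sql}) "
    "ORDER BY CAST(ClassYear AS INTEGER)")]

print(as_text)    # ['1', '10', '12', '8']
print(as_number)  # ['1', '8', '10', '12']
```

Casting only in the ORDER BY leaves the returned values unchanged, while changing the column type or the selected expression would make every consumer of the query see numbers.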