SQL Group by with string concat and CASE statements - sql

I am trying to understand the SQL GROUP BY concept when the SELECT statement contains:
String concat operation
CASE statement
I understand that the SELECT clause is evaluated after the GROUP BY clause and that any non-aggregated columns in the SELECT list also need to be present in the GROUP BY list. I will try to explain my question with an example ORDERS table that has one record for each order:
SELECT FIRSTNAME +' '+LASTNAME
,EMP_TITLE
,EMAILADDR
,INACTIVE
,CASE WHEN ISNULL(EMAILADDR,'')='' THEN 'ADMIN#MYCOMPANY.COM' ELSE EMAILADDR END AS ALTEMAIL
,CASE WHEN AGE>60 THEN 'SENIOR' ELSE 'EMP' END AS EMPTYPE
,COUNT(*) AS TOTAL_ORDERS
FROM ORDERS
GROUP BY FIRSTNAME + ' '+ LASTNAME
,EMP_TITLE
,EMAILADDR
,INACTIVE
,CASE WHEN ISNULL(EMAILADDR,'')='' THEN 'ADMIN#MYCOMPANY.COM' ELSE EMAILADDR END
,CASE WHEN AGE>60 THEN 'SENIOR' ELSE 'EMP' END
Given that the SELECT clause is evaluated after the GROUP BY clause, what I am trying to get my head around is this: in what situations do we put the plain column names in the GROUP BY clause, and in what situations the full expression from the SELECT statement (the concatenation or CASE in this example)?
Example: In the GROUP BY above, we could get rid of FIRSTNAME + ' ' + LASTNAME and replace it with FIRSTNAME, LASTNAME. We could also replace the CASE expressions with EMAILADDR and AGE.
Question:
I am trying to understand in which situations this is OK, and in which situations the CASE expression (or the string concatenation) from the SELECT clause absolutely must be repeated in the GROUP BY clause.
Where either form is OK, what is the best practice? Repeat the CASE expression from the SELECT as is in the GROUP BY, or list only the columns the CASE expression uses?

The GROUP BY clause defines expressions that can be referenced without aggregation in the SELECT. So, this is allowed:
SELECT FIRSTNAME + ' ' + LASTNAME
. . .
GROUP BY FIRSTNAME, LASTNAME
because FIRSTNAME and LASTNAME can be referenced without aggregation functions, so any expression built only from them is allowed as well.
On the other hand, this is not allowed:
SELECT FIRSTNAME, LASTNAME
. . .
GROUP BY FIRSTNAME + ' ' + LASTNAME
In this case, FIRSTNAME and LASTNAME are not themselves in the GROUP BY, so they cannot be selected without aggregation.
This is also allowed:
SELECT HONORIFIC + ' ' + FIRSTNAME + ' ' + LASTNAME
. . .
GROUP BY FIRSTNAME + ' ' + LASTNAME, HONORIFIC
However, this is not allowed, because the SELECT expression is not the same as the GROUP BY expression:
SELECT HONORIFIC + ' ' + FIRSTNAME + ' ' + LASTNAME
. . .
GROUP BY FIRSTNAME + ' ' + LASTNAME + HONORIFIC
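As for when the two GROUP BY forms are interchangeable, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. That is an assumption made so the example is runnable: SQLite concatenates with || rather than +, and the table and rows are invented for illustration. Grouping by the columns and grouping by the concatenated expression give the same groups only when the expression never maps two different column combinations to the same value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Mary Ann", "Lee"), ("Mary", "Ann Lee"), ("Mary", "Ann Lee")],
)

# Group on the raw columns: two distinct (firstname, lastname) pairs.
by_columns = conn.execute(
    "SELECT firstname || ' ' || lastname AS fullname, COUNT(*) "
    "FROM orders GROUP BY firstname, lastname ORDER BY firstname"
).fetchall()

# Group on the expression: both pairs concatenate to 'Mary Ann Lee',
# so they collapse into a single group.
by_expression = conn.execute(
    "SELECT firstname || ' ' || lastname AS fullname, COUNT(*) "
    "FROM orders GROUP BY firstname || ' ' || lastname"
).fetchall()

print(by_columns)
print(by_expression)
```

So the column form keeps the two people apart (two rows that happen to display the same name), while the expression form merges them. Which one is right depends on whether you want one row per person or one row per displayed value.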

Related

How to do not null check on LEFT function on the select query

I have a query that returns demographics such as firstName, lastName and MiddleName, and I need to apply the LEFT function to each to get its first letter, e.g. LEFT(firstName, 1). This works fine as long as none of the columns is NULL:
select otherColumns, LEFT(sub.LastName, 1) + ',' + LEFT(sub.FirstName, 1) + ' ' + LEFT(sub.MiddleName, 1) as patientInitials from <table> <inner joins> <some where conditions>;
But when one of the columns, such as MiddleName, is NULL and FirstName and LastName are not, patientInitials evaluates to NULL. I am not sure why.
I resolved my issue by adding COALESCE
LEFT(sub.LastName, 1) + ',' + LEFT(sub.FirstName, 1) + ' ' + COALESCE((LEFT(sub.MiddleName, 1)),'') as patientInitials
But is there a better way to handle NULL with the LEFT function?
Help Appreciated!
The CONCAT function treats NULL arguments as empty strings:
SELECT CONCAT(LEFT(sub.LastName, 1), ',' ,
LEFT(sub.FirstName, 1),
' ' + LEFT(sub.MiddleName, 1)) patientInitials
FROM tab;
Writing ' ' + LEFT(sub.MiddleName, 1) with + means the space is dropped as well when MiddleName is NULL: the + expression yields NULL, which CONCAT then ignores.
The CONCAT_WS function behaves similarly and skips NULL arguments: CONCAT_WS (Transact-SQL)
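The NULL propagation the question ran into, and the trick of attaching the space to the nullable piece, can be reproduced with a small sketch. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption: SQLite uses || for concatenation and SUBSTR(x, 1, 1) in place of LEFT(x, 1); the sample rows are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sub (LastName TEXT, FirstName TEXT, MiddleName TEXT)")
conn.executemany(
    "INSERT INTO sub VALUES (?, ?, ?)",
    [("Doe", "Jane", "Quinn"), ("Roe", "Rick", None)],
)

# Plain concatenation: a single NULL operand makes the whole result NULL.
naive = conn.execute(
    "SELECT SUBSTR(LastName, 1, 1) || ',' || SUBSTR(FirstName, 1, 1) "
    "|| ' ' || SUBSTR(MiddleName, 1, 1) FROM sub ORDER BY LastName"
).fetchall()

# Wrap only the nullable piece, space included, so it collapses to ''.
safe = conn.execute(
    "SELECT SUBSTR(LastName, 1, 1) || ',' || SUBSTR(FirstName, 1, 1) "
    "|| COALESCE(' ' || SUBSTR(MiddleName, 1, 1), '') FROM sub ORDER BY LastName"
).fetchall()

print(naive)
print(safe)
```

Keeping the separator inside the COALESCE avoids a dangling trailing space for rows without a middle name.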

Manipulating duplicate values?

I have a table, with an ID, FirstName & Lastname.
I'm selecting that using the following query:
SELECT USER_ID as [ID], First_P + ' ' + Last_P as FullName FROM Persons
It's working fine. I'm basically having a list of ID's and full names.
Full names could be the same. How can I find those duplicates and append the ID to the full name, but only when the names are the same?
Example:
1 John Wick (1)
50 John Wick (50)
I haven't found any similar questions to be honest, at least not for MSSQL. So If there are any, feel free to link me.
Please take a look at my answer. I used a nested query to count how many rows share each name:
SELECT
ID,
IIF(NUMBEROFDUPS =1, NAME, CONCAT(NAME, ' (', ID, ')')) AS NAME
FROM
(
SELECT
ID,
CONCAT(First_P, ' ', Last_P) AS NAME,
COUNT(*) OVER (PARTITION BY First_P,Last_P) AS NUMBEROFDUPS
FROM
Table1
) tmp;
You can use OUTER APPLY to group the rows by First_P + ' ' + Last_P and then add a CASE for names that occur more than once.
The SELECT should look like:
SELECT p1.USER_ID AS [ID],
p1.First_P + ' ' + p1.Last_P
-- cnt returns a row only when the name occurs more than once
+ CASE WHEN cnt.FullName IS NOT NULL
THEN ' (' + CAST(p1.USER_ID AS varchar(20)) + ')' ELSE '' END AS FullName
FROM Persons p1
OUTER APPLY (SELECT p2.First_P + ' ' + p2.Last_P AS FullName, COUNT(1) AS [sum]
FROM Persons p2
WHERE p2.First_P + ' ' + p2.Last_P = p1.First_P + ' ' + p1.Last_P
GROUP BY p2.First_P + ' ' + p2.Last_P
HAVING COUNT(1) > 1) cnt
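A third way to get the same result is to join each row to a grouped derived table that carries the per-name count. The sketch below is not either answerer's query; it is a portable variant run through Python's sqlite3 module as a stand-in for SQL Server (an assumption: || replaces +, and the three sample rows are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (USER_ID INTEGER, First_P TEXT, Last_P TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?)",
    [(1, "John", "Wick"), (7, "Ada", "Byron"), (50, "John", "Wick")],
)

rows = conn.execute(
    """
    SELECT p.USER_ID,
           p.First_P || ' ' || p.Last_P
           -- append the ID only when the name occurs more than once
           || CASE WHEN d.cnt > 1 THEN ' (' || p.USER_ID || ')' ELSE '' END
    FROM Persons AS p
    JOIN (SELECT First_P, Last_P, COUNT(*) AS cnt
          FROM Persons
          GROUP BY First_P, Last_P) AS d
      ON d.First_P = p.First_P AND d.Last_P = p.Last_P
    ORDER BY p.USER_ID
    """
).fetchall()

for row in rows:
    print(row)
```

Note that the equality join silently drops rows whose First_P or Last_P is NULL, so this variant assumes both name columns are populated.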

Choose the non null column per row

I'm working with a table which has "Name", "FirstName", and "Surname" fields.
"Name" was used years ago but is still supported.
Today though the "FirstName" and "Surname" fields are used instead.
Here's what I'd like to do for my query on this table. If FirstName and/or Surname are not Null then I want them returned for the row, otherwise I want Name.
FirstName and Surname can be concatenated with a space between, giving just a single field in the returned result. Or maybe you have a better solution.
Thank you!
Barry
This can be done by either using COALESCE
SELECT COALESCE(Firstname + ' ' + Surname, Name)
FROM ATable
or using ISNULL
SELECT ISNULL(Firstname + ' ' + Surname, Name)
FROM ATable
If Firstname or Surname can be NULL and you want to return whichever one is present, you would use COALESCE with more arguments:
SELECT COALESCE(Firstname + ' ' + Surname, Firstname, Surname, Name)
FROM ATable
COALESCE returns the first non-NULL expression among its arguments.
SELECT CASE
WHEN [FirstName] IS NULL OR [Surname] IS NULL THEN [Name]
ELSE [FirstName] + ' ' + [Surname]
END AS [Name]
FROM MyTable
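The difference between the two-argument and four-argument COALESCE forms shows up only for rows where exactly one of Firstname and Surname is filled in. Here is a small sketch of that, using Python's sqlite3 module as a stand-in for SQL Server (an assumption: || replaces +, and the rows are invented for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ATable (Name TEXT, Firstname TEXT, Surname TEXT)")
conn.executemany(
    "INSERT INTO ATable VALUES (?, ?, ?)",
    [
        ("B. Legacy", None, None),   # old-style row: only Name is filled in
        (None, "Barry", "Stone"),    # new-style row: both parts present
        ("Old Name", "Cher", None),  # only a first name
    ],
)

# Falls back to Name as soon as either part is NULL.
two_arg = [r[0] for r in conn.execute(
    "SELECT COALESCE(Firstname || ' ' || Surname, Name) "
    "FROM ATable ORDER BY rowid")]

# Tries the full name, then each part alone, and only then Name.
four_arg = [r[0] for r in conn.execute(
    "SELECT COALESCE(Firstname || ' ' || Surname, Firstname, Surname, Name) "
    "FROM ATable ORDER BY rowid")]

print(two_arg)
print(four_arg)
```

For the third row the two-argument form returns the legacy Name, while the four-argument form returns the lone first name. The CASE version above behaves like the two-argument form.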

How to convert two nvarchar rows into one when one of the row is null

Sorry if this is a duplicate of an existing question (it's so simple, but I can't figure it out; I'm new).
I need to migrate some data from one table to another (different structures).
Table A has Firstname and LastName columns.
Table B has a Name column.
I want to do
SELECT Firstname + ' ' + LastName As Name FROM TableA
But the problem is that in table A, some rows have a NULL value for firstname or lastname, but not both (lazy users).
When I import them into table B, the query fails because the Name column is non-nullable in my new design, and when I test the statement above, the concatenated value is NULL if firstname or lastname is NULL.
From the reading that I've done, this is expected behavior but what can I do to get around this?
I want to save firstname or lastname if the other is null.
SELECT RTRIM(LTRIM(ISNULL(Firstname ,'') + ' ' + ISNULL(LastName,''))) AS Name
FROM TableA
This can be done with a single COALESCE (which returns its first non-NULL argument), with no need to mess about with spaces.
select
coalesce(firstname + ' ' + lastname, firstname, lastname)
from TableA
Use COALESCE or ISNULL:
select COALESCE(FirstNAme, '') + ' ' + COALESCE(LastName, '') as name from TableA
SELECT isnull (Firstname, '') + ' ' + isnull (LastName, '') as Name
FROM TableA
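The migration itself can be sketched end to end. This uses Python's sqlite3 module as a stand-in for SQL Server (an assumption: || replaces +, and the sample rows are invented). The naive INSERT ... SELECT violates the NOT NULL constraint, while the single-COALESCE form goes through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Firstname TEXT, LastName TEXT)")
conn.execute("CREATE TABLE TableB (Name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [("Ann", "Lee"), ("Bob", None), (None, "Cruz")],
)

# Naive migration: a NULL name violates the NOT NULL constraint,
# and the failed statement inserts nothing.
try:
    conn.execute(
        "INSERT INTO TableB SELECT Firstname || ' ' || LastName FROM TableA")
    naive_failed = False
except sqlite3.IntegrityError:
    naive_failed = True

# Fall back to whichever single part is present.
conn.execute(
    "INSERT INTO TableB "
    "SELECT COALESCE(Firstname || ' ' || LastName, Firstname, LastName) "
    "FROM TableA ORDER BY rowid"
)
migrated = [r[0] for r in conn.execute("SELECT Name FROM TableB ORDER BY rowid")]
print(naive_failed, migrated)
```

A row with both parts NULL would still fail the constraint. The question says that case does not occur, but a final literal fallback such as '' could be added to the COALESCE if it did.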

How do I perform a GROUP BY on an aliased column in SQL Server?

I'm trying to perform a group by action on an aliased column (example below) but can't determine the proper syntax.
SELECT LastName + ', ' + FirstName AS 'FullName'
FROM customers
GROUP BY 'FullName'
What is the correct syntax?
Extending the question further (I had not expected the answers I received): would the solution still apply to an aliased CASE expression?
SELECT
CASE
WHEN LastName IS NULL THEN FirstName
WHEN LastName IS NOT NULL THEN LastName + ', ' + FirstName
END AS 'FullName'
FROM customers
GROUP BY
LastName, FirstName
And the answer is yes it does still apply.
You pass the expression you want to group by rather than the alias
SELECT LastName + ', ' + FirstName AS 'FullName'
FROM customers
GROUP BY LastName + ', ' + FirstName
This is what I do.
SELECT FullName
FROM
(
SELECT LastName + ', ' + FirstName AS FullName
FROM customers
) as sub
GROUP BY FullName
This technique applies in a straightforward way to your "edit" scenario:
SELECT FullName
FROM
(
SELECT
CASE
WHEN LastName IS NULL THEN FirstName
WHEN LastName IS NOT NULL THEN LastName + ', ' + FirstName
END AS FullName
FROM customers
) as sub
GROUP BY FullName
Unfortunately you can't reference your alias in the GROUP BY clause; you'll have to write the logic again, amazing as that seems.
SELECT LastName + ', ' + FirstName AS 'FullName'
FROM customers
GROUP BY LastName + ', ' + FirstName
Alternately you could put the select into a subselect or common table expression, after which you could group on the column name (no longer an alias.)
Sorry, this is not possible with MS SQL Server (possible though with PostgreSQL):
select lastname + ', ' + firstname as fullname
from person
group by fullname
Otherwise just use this:
select x.fullname
from
(
select lastname + ', ' + firstname as fullname
from person
) as x
group by x.fullname
Or this:
select lastname + ', ' + firstname as fullname
from person
group by lastname, firstname -- no need to put the ', '
The above query can be faster: it groups on the raw columns first and then computes the expression once per group.
The following query can be slower, because it has to compute the expression for every row before grouping on the result:
select lastname + ', ' + firstname as fullname
from person
group by lastname + ', ' + firstname
Given your edited problem description, I'd suggest using COALESCE() instead of that unwieldy CASE expression:
SELECT FullName
FROM (
SELECT COALESCE(LastName+', '+FirstName, FirstName) AS FullName
FROM customers
) c
GROUP BY FullName;
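To check that the COALESCE form really groups the same way as the CASE expression, here is a small sketch that runs both derived-table queries side by side. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption: || replaces +, a COUNT(*) column is added to make the groups visible, and the sample rows are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (LastName TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Smith", "Al"), ("Smith", "Al"), (None, "Cher"), ("Jones", "Bo")],
)

# The expression is named once inside the derived table,
# so the outer query can group on the plain column FullName.
case_sql = """
SELECT FullName, COUNT(*) FROM (
    SELECT CASE WHEN LastName IS NULL THEN FirstName
                WHEN LastName IS NOT NULL THEN LastName || ', ' || FirstName
           END AS FullName
    FROM customers
) AS sub
GROUP BY FullName ORDER BY FullName
"""

coalesce_sql = """
SELECT FullName, COUNT(*) FROM (
    SELECT COALESCE(LastName || ', ' || FirstName, FirstName) AS FullName
    FROM customers
) AS c
GROUP BY FullName ORDER BY FullName
"""

case_rows = conn.execute(case_sql).fetchall()
coalesce_rows = conn.execute(coalesce_sql).fetchall()
print(case_rows)
print(coalesce_rows)
```

Both queries produce the same groups here, including the row with a NULL LastName.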
My guess is:
SELECT LastName + ', ' + FirstName AS 'FullName'
FROM customers
GROUP BY LastName + ', ' + FirstName
Oracle has a similar limitation, which is annoying. I'm curious if there exists a better solution.
To answer the second half of the question, this limitation applies to more complex expressions such as your CASE expression as well. The best suggestion I've seen is to use a sub-select to name the complex expression.
You can use CROSS APPLY to create an alias and use it in the GROUP BY clause, like so:
SELECT FullName
FROM Customers
CROSS APPLY (SELECT LastName + ', ' + FirstName AS FullName) Alias
GROUP BY FullName
SELECT
CASE
WHEN LastName IS NULL THEN FirstName
WHEN LastName IS NOT NULL THEN LastName + ', ' + FirstName
END AS 'FullName'
FROM
customers
GROUP BY
LastName,
FirstName
This works because the formula you use (the CASE statement) can never give the same answer for two different inputs.
This is not the case if you used something like:
LEFT(FirstName, 1) + ' ' + LastName
In such a case "James Taylor" and "John Taylor" would both result in "J Taylor".
If you wanted your output to have "J Taylor" twice (one for each person):
GROUP BY LastName, FirstName
If, however, you wanted just one row of "J Taylor" you'd want:
GROUP BY LastName, LEFT(FirstName, 1)
If you want to avoid the mess of the case statement being in your query twice, you may want to place it in a User-Defined-Function.
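The "J Taylor" example can be run as a quick sketch. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption: || replaces + and SUBSTR(FirstName, 1, 1) stands in for LEFT(FirstName, 1)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (FirstName TEXT, LastName TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("James", "Taylor"), ("John", "Taylor")],
)

# One group per person: the same display value appears twice.
per_person = conn.execute(
    "SELECT SUBSTR(FirstName, 1, 1) || ' ' || LastName "
    "FROM customers GROUP BY LastName, FirstName"
).fetchall()

# One group per initial: both people collapse into one row.
per_initial = conn.execute(
    "SELECT SUBSTR(FirstName, 1, 1) || ' ' || LastName "
    "FROM customers GROUP BY LastName, SUBSTR(FirstName, 1, 1)"
).fetchall()

print(per_person)
print(per_initial)
```

The choice between the two GROUP BY lists is therefore a question of intent, not of syntax.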
Sorry, but SQL Server does not evaluate the SELECT list before the GROUP BY clause, so the column alias is not yet available there. You can use it in the ORDER BY, though.
In the old FoxPro (I haven't used it since version 2.5), you could write something like this:
SELECT LastName + ', ' + FirstName AS 'FullName', Birthday, Title
FROM customers
GROUP BY 1,3,2
I really liked that syntax. Why isn't it implemented anywhere else? It's a nice shortcut, but I assume it causes other problems?
SELECT
CASE WHEN LastName IS NULL THEN FirstName
WHEN LastName IS NOT NULL THEN LastName + ', ' + FirstName
END AS 'FullName'
FROM customers GROUP BY 1
For anyone who finds themselves with the following problem (grouping by ensuring zero and null values are treated as equals)...
SELECT AccountNumber, Amount AS MyAlias
FROM Transactions
GROUP BY AccountNumber, ISNULL(Amount, 0)
(I.e. SQL Server complains that you haven't included the field Amount in your Group By or aggregate function)
...remember to place the exact same function in your SELECT...
SELECT AccountNumber, ISNULL(Amount, 0) AS MyAlias
FROM Transactions
GROUP BY AccountNumber, ISNULL(Amount, 0)
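The effect of grouping on the wrapped expression can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption: COALESCE(Amount, 0) stands in for ISNULL(Amount, 0), a COUNT(*) column is added to show group sizes, and the rows are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Transactions (AccountNumber TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?)",
    [("A1", None), ("A1", 0), ("A1", 5)],
)

# Grouping on the raw column keeps NULL and 0 in separate groups.
raw = conn.execute(
    "SELECT AccountNumber, Amount, COUNT(*) "
    "FROM Transactions GROUP BY AccountNumber, Amount"
).fetchall()

# The same expression in SELECT and GROUP BY merges NULL with 0.
merged = conn.execute(
    "SELECT AccountNumber, COALESCE(Amount, 0) AS MyAlias, COUNT(*) "
    "FROM Transactions GROUP BY AccountNumber, COALESCE(Amount, 0) "
    "ORDER BY MyAlias"
).fetchall()

print(raw)
print(merged)
```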