SQL Grouping by Like using select statement - sql

I have a table with unique names and a combination of those same names separated by commas in the same field as below:
Bill
Mark
Steve
Bill, Mark
Mark, Steve
Bill,Mark, Steve
I would like to group by the names that are not separated by a comma and count the rows in which each name appears, such as:
Bill 3
Mark 4
Steve 3
In the future someone may add another name to the table so I can't use a Case statement with static names. I would like something like this:
SELECT
Name
FROM
My_Table
Group By
Name Like (SELECT Name FROM My_Table Where Name Not Like '%,%')
Is that possible?

Select N.Name, COUNT(*)
FROM (
SELECT Name
FROM My_Table
WHERE Name NOT LIKE '%,%'
) N
JOIN My_Table MT
ON ',' + REPLACE(MT.Name, ', ', ',') + ',' LIKE '%,' + N.Name + ',%'
GROUP BY N.Name

I don't have an instance of SQL Server to test this on, but one approach that may work is to select only the single-name records and use a nested SELECT expression that counts all records containing that name. Something like this:
SELECT
Name,
(SELECT COUNT(*) FROM YourTable Y2 WHERE Y2.Name LIKE ('%' + Y1.Name + '%')) AS NameCount
FROM YourTable Y1 WHERE NAME NOT LIKE '%,%'
It will fail, of course, on names nested inside other names (Bob and Bobby, for instance, if they're different people). A more robust approach would require removing the spaces around the commas and expanding the LIKE expression into three LIKEs ORed together. If you can't build a LIKE pattern in-line the way I did, you can substitute SQL Server's string-position function, CHARINDEX.
But, honestly, I'd project a temporary normalized table and base your report off that.
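A minimal sketch of the normalized-table idea, run against SQLite through Python's sqlite3 module rather than SQL Server (an assumption made only so the example is self-contained). The Name_Parts table and its columns are hypothetical names; the sample rows are the ones from the question.

```python
import sqlite3

# Sample rows copied from the question; My_Table and Name are the question's names.
rows = ["Bill", "Mark", "Steve", "Bill, Mark", "Mark, Steve", "Bill,Mark, Steve"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE My_Table (Name TEXT)")
conn.executemany("INSERT INTO My_Table (Name) VALUES (?)", [(r,) for r in rows])

# Hypothetical normalized table: one row per (source row, individual name).
conn.execute("CREATE TEMP TABLE Name_Parts (SourceRow INTEGER, Part TEXT)")
for row_id, name in conn.execute("SELECT rowid, Name FROM My_Table").fetchall():
    # Split on commas and trim the optional space after each comma.
    parts = [p.strip() for p in name.split(",") if p.strip()]
    conn.executemany(
        "INSERT INTO Name_Parts VALUES (?, ?)", [(row_id, p) for p in parts]
    )

# With one name per row, the report is a plain GROUP BY.
counts = conn.execute(
    "SELECT Part, COUNT(*) FROM Name_Parts GROUP BY Part ORDER BY Part"
).fetchall()
print(counts)
```

Because each name sits in its own row, a new name added later is picked up without changing the query, and Bob is never confused with Bobby.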

Related

I need a query for children that start with my name but doesn't start with any in that set

Suppose I have a table with text primary key called "name". Given a name (that may contain any arbitrary characters including %), I need all of the rows from that table that start with that name, are longer than that name, and that don't start with anything else in the table that is longer than the given name.
For example, suppose my table contains names ad, add, adder, and adage. If I query for "children of ad", I want to get back add, adage. (adder is a child of add). Can this be done efficiently, as I have several million rows? Recursive queries are certainly available.
I have a different approach at present where I maintain a "parent" column. The code to maintain this column is quite painful, and it would be unnecessary if this other approach were reasonable.
I can't speak to its efficiency, but I think it works:
with cte as (
select name
from tablename
where name like 'ad' || '_%'
)
select c.name
from cte c
where not exists (
select 1 from cte
where c.name like name || '_%'
);
See the demo.
Equivalent to the above query with a self LEFT JOIN:
with cte as (
select name
from tablename
where name like 'ad' || '_%'
)
select c.name
from cte c left join cte cc
on c.name like cc.name || '_%'
where cc.name is null
See the demo.
Results:
| name |
| ----- |
| add |
| adage |
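The question notes that a name may contain arbitrary characters, including %, which LIKE would treat as a wildcard. The following is a sketch of the same NOT EXISTS idea using substr() and length() comparisons instead of LIKE, so no escaping is needed. It runs on SQLite through Python's sqlite3 module (an assumption for the sake of a runnable example); children_of is a hypothetical helper name, and the names containing % are made-up test data.

```python
import sqlite3

def children_of(conn, prefix):
    """Return names that extend `prefix` and do not extend another such name."""
    sql = """
        WITH cte AS (
            -- every name that starts with the prefix and is longer than it
            SELECT name FROM tablename
            WHERE substr(name, 1, length(:p)) = :p
              AND length(name) > length(:p)
        )
        SELECT c.name FROM cte c
        WHERE NOT EXISTS (
            -- drop c if a shorter name in the same set is a prefix of it
            SELECT 1 FROM cte o
            WHERE length(c.name) > length(o.name)
              AND substr(c.name, 1, length(o.name)) = o.name
        )
        ORDER BY c.name
    """
    return [row[0] for row in conn.execute(sql, {"p": prefix})]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (name TEXT PRIMARY KEY)")
names = ["ad", "add", "adder", "adage", "a%", "a%b", "a%bc", "axb"]
conn.executemany("INSERT INTO tablename VALUES (?)", [(n,) for n in names])

print(children_of(conn, "ad"))
print(children_of(conn, "a%"))
```

Whether this form can use an index on several million rows depends on the database; that part of the question is not answered by this sketch.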
You could use LEFT on the column like this:
select *
from sometable
where lower(left(name,2)) = 'ad' and length(name) > 2

SQL Server : returning multiple rows by ROWID by using only one where / LIKE OPERATOR

Here are rows in my database which I want to get:
I want to get all 3 rows by executing one query. The ids are received as procedure parameters; sometimes I receive 1 id and sometimes 10 of them, depending on what the user sends to the database.
I wrote something like this:
SELECT *
FROM Products
WHERE CONVERT(NVARCHAR(MAX), ProductId) LIKE '%' + 'A9472294-CFDD-40AC-BC2D-00E39AF4A300, 9A817E40-E4B1-4487-A376-010DD6377E38, 078A3C75-C442-4D88-A1E0-0118B8706667' + '%'
SELECT *
FROM Products
WHERE 'A9472294-CFDD-40AC-BC2D-00E39AF4A300, 9A817E40-E4B1-4487-A376-010DD6377E38, 078A3C75-C442-4D88-A1E0-0118B8706667' LIKE '%' + CONVERT(NVARCHAR(MAX), ProductId) + '%'
How come the second example returns rows as expected and the first example does not return any rows?
The definition says: %or% finds any values that have "or" in any position
How does this actually work? Could anyone explain?
EDIT:
I thought the LIKE operator should be applied to my column, like this:
Select *
From Products
Where ProductId Like '%' + 'SomeStringId' + '%';
because the LIKE operator is used in a WHERE clause to search for a specified pattern in a column, and my column is ProductId. So I can't understand how an example like this works:
Select *
From Products
Where 'SomeStringId' Like '%' + ProductId + '%';
How come the example above works when LIKE is applied not to my column but to some string?
You should split the string and then use an inner join.
For SQL Server below 2016 with the cumbersome string split via XML:
SELECT p.*
FROM products p
INNER JOIN (SELECT arg_xml_node.xml_node.value('(.)[1]', 'uniqueidentifier') uniqueidenfier
FROM (SELECT convert(xml,
concat('<x>',
replace('A9472294-CFDD-40AC-BC2D-00E39AF4A300, 9A817E40-E4B1-4487-A376-010DD6377E38, 078A3C75-C442-4D88-A1E0-0118B8706667',
', ',
'</x><x>'),
'</x>')) xml) arg_xml
CROSS APPLY arg_xml.xml.nodes('x') arg_xml_node (xml_node)) arg_uniqueidenfier
ON arg_uniqueidenfier.uniqueidenfier = p.productid;
For SQL Server 2016 and above with the elegant way using string_split():
SELECT p.*
FROM products p
INNER JOIN (SELECT convert(uniqueidentifier, ltrim(value)) uniqueidenfier
FROM string_split('A9472294-CFDD-40AC-BC2D-00E39AF4A300, 9A817E40-E4B1-4487-A376-010DD6377E38, 078A3C75-C442-4D88-A1E0-0118B8706667',
',')) arg_uniqueidenfier
ON arg_uniqueidenfier.uniqueidenfier = p.productid;
db<>fiddle
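On the question of why only the second form matches: LIKE is a binary operator that tests the value on its left against the pattern on its right, and either side can be a column or a string. A small sketch of both directions follows, run on SQLite through Python's sqlite3 module (an assumption for runnability; SQLite concatenates with || where SQL Server uses +). The two ids are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
id_list = "A9472294-CFDD-40AC-BC2D-00E39AF4A300, 9A817E40-E4B1-4487-A376-010DD6377E38"
one_id = "9A817E40-E4B1-4487-A376-010DD6377E38"

# Value = one id, pattern = '%<whole list>%':
# a single id cannot contain the whole list, so this is false (0).
id_like_list = conn.execute(
    "SELECT ? LIKE '%' || ? || '%'", (one_id, id_list)
).fetchone()[0]

# Value = whole list, pattern = '%<one id>%':
# the list does contain the id, so this is true (1).
list_like_id = conn.execute(
    "SELECT ? LIKE '%' || ? || '%'", (id_list, one_id)
).fetchone()[0]

print(id_like_list, list_like_id)
```

The first query in the question has the shape of id_like_list and the second has the shape of list_like_id, which is why only the second returns rows.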

Count values in a table taken from a list in SQL

I have one list that is produced by the following query:
SELECT
chklRefTo
FROM CSART.DBO.tblMaintenance
and returns the following:
chklRefTo
----------
SRH
STI
GP/Walk-in
ED/UCC
Other
and another column of values
Ref to
-------
STI
STI,GP/Walk-in,ED/UCC
GP/Walk-in,ED/UCC
SRH,STI,ED/UCC
STI,Other
That is instantiated by this:
SELECT
ReferredTo AS "Reason Not Admitted"
FROM CSART.DBO.tblPhoneConsult
WHERE ReferredTo != '' AND ReferredTo IS NOT NULL
For each value in the first list, I need a count of the number of times each value appears in the second list. Ideally, the result of the query would look something like the below:
Ref | Num
-----------
STI | 3
SRH | 1
Other | 1
I haven't been having much luck trying to work through this problem so any help or advice would be greatly appreciated. Thank you.
On SQL Server:
SELECT chklRefTo Ref,
COUNT(ReferredTo) Num
FROM tblMaintenance
LEFT JOIN tblPhoneConsult
ON ',' + ReferredTo + ',' LIKE '%,' + chklRefTo + ',%'
GROUP BY chklRefTo
ORDER BY 2 DESC
By adding the commas in the JOIN expression you ensure that a substring is not considered a match. For example, it prevents a value like "P/W" from being considered a match with the list "GP/Walk-in,ED/UCC".
The LEFT JOIN in combination with COUNT(ReferredTo) will ensure that every value of chklRefTo is in the result, even when the count is 0.
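A runnable sketch of the comma-wrapping join, using SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption; the only syntax change is || in place of + for concatenation). The table contents are the sample values from the question, plus a hypothetical "P/W" entry added to show the zero count and the substring case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblMaintenance (chklRefTo TEXT)")
conn.execute("CREATE TABLE tblPhoneConsult (ReferredTo TEXT)")
conn.executemany(
    "INSERT INTO tblMaintenance VALUES (?)",
    # "P/W" is not in the question's list; it is added to test substring handling.
    [("SRH",), ("STI",), ("GP/Walk-in",), ("ED/UCC",), ("Other",), ("P/W",)],
)
conn.executemany(
    "INSERT INTO tblPhoneConsult VALUES (?)",
    [("STI",), ("STI,GP/Walk-in,ED/UCC",), ("GP/Walk-in,ED/UCC",),
     ("SRH,STI,ED/UCC",), ("STI,Other",)],
)

# Wrapping both sides in commas makes every list item look like ",item,".
counts = dict(conn.execute("""
    SELECT m.chklRefTo, COUNT(p.ReferredTo)
    FROM tblMaintenance m
    LEFT JOIN tblPhoneConsult p
      ON ',' || p.ReferredTo || ',' LIKE '%,' || m.chklRefTo || ',%'
    GROUP BY m.chklRefTo
"""))
print(counts)
```

COUNT(p.ReferredTo) ignores the NULL produced by an unmatched LEFT JOIN row, which is what gives "P/W" a count of 0 instead of 1.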
You could try something like this:
select chklRefTo as Ref, count(*) as Num
from CSART.DBO.tblMaintenance m
inner join CSART.DBO.tblPhoneConsult p on p.ReferredTo like '%' + chklRefTo + '%'
group by chklRefTo

How to split a single column into multiple columns in SQL Server select statement?

I am new to SQL Server. I have a single long column with names starting with a, b, c and d. I want to show them in separate columns NameA, NameB, NameC and NameD. I tried UNION, but it shows them in one column. I tried CASE, but I don't know how to use it. Please help.
Existing column
names
A1
B1
A2
C1
A3
A4
Desired output
A_names | B_names | C_names
A1 | B1 | C1
A2 | |
A3 | |
A4 | |
Just for fun (and I'm curious why you want that):
select *
from
( select idx = left(names,1)
, names
, rn = row_number() over (partition by left(names,1) order by names)
from
( values ('A1'),('B1'),('A2'),('C1'),('A3'),('A4'),('B2'))
v(names)
) dat
pivot
( max(names)
for idx in ([A],[B],[C],[D])
) p
http://sqlfiddle.com/#!6/9eecb/4013/0
I don't think this is something that should be solved in SQL. It's a representational thing that should probably be done in the application.
However, if you insist on using SQL for this, this is how you could do it. The main drawback is that each ROW_NUMBER() call has to order the matching rows, which may hurt performance on a large table.
WITH nameA AS
(
SELECT name, ROW_NUMBER() OVER(ORDER BY name) AS rn
FROM t1
WHERE name LIKE 'a%'
), nameB AS
(
SELECT name, ROW_NUMBER() OVER(ORDER BY name) AS rn
FROM t1
WHERE name LIKE 'b%'
)
-- Qualify the columns: an unqualified "name" is ambiguous after the join
SELECT nameA.name AS NameA, nameB.name AS NameB
FROM nameA
FULL OUTER JOIN nameB
ON nameA.rn = nameB.rn
ORDER BY COALESCE(nameA.rn, nameB.rn);
You can use CASE (https://msdn.microsoft.com/en-us/library/ms181765.aspx)
This query should work:
SELECT
CASE WHEN users.name like 'a%' THEN users.name ELSE NULL END AS NameA,
CASE WHEN users.name like 'b%' THEN users.name ELSE NULL END AS NameB,
CASE WHEN users.name like 'c%' THEN users.name ELSE NULL END AS NameC,
CASE WHEN users.name like 'd%' THEN users.name ELSE NULL END AS NameD
FROM users
Look at this post, it is very close to your problem.
Itzik Ben-Gan | SQL Server Pro
This isn't really the way relational databases work. When you have data in the same row, it's supposed to be related in some way - a common ID, at the least. What is it that would connect the person whose name happens to begin with A to the person whose name happens to begin with B? Why would you ever want the RDBMS to make such an arbitrary connection?
If you have a requirement to display users in such a way, you'd just want to write several queries and have your presentation layer deal with laying them out properly, e.g.
SELECT name FROM users WHERE name LIKE 'a%'
SELECT name FROM users WHERE name LIKE 'b%'
SELECT name FROM users WHERE name LIKE 'c%'
etc...
The presentation layer could run each query and then populate a table appropriately. Even better would be to have the presentation layer just run this query:
SELECT name FROM users
And then appropriately sort and display the data, which is probably going to be less expensive than multiple scans on your users table by SQL Server.
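As a rough sketch of that last suggestion, the following runs the single query and does the column layout in application code. Python with an in-memory SQLite table stands in for the real presentation layer and database (both are assumptions); the sample names are the ones from the question.

```python
import sqlite3
from itertools import groupby, zip_longest

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("A1",), ("B1",), ("A2",), ("C1",), ("A3",), ("A4",)],
)

# One scan of the table; all layout happens in the application.
names = sorted(row[0] for row in conn.execute("SELECT name FROM users"))

# Group the sorted names by first letter, one list per display column.
columns = {letter: list(group) for letter, group in groupby(names, key=lambda n: n[0])}

# Zip the four columns into display rows, padding short columns with None.
rows = list(zip_longest(*(columns.get(letter, []) for letter in "ABCD")))

for row in rows:
    print(" | ".join(cell or "" for cell in row))
```

The rows pair up names only by their position within each column, which is the arbitrary connection the answer above warns about; doing it in the display code keeps that arbitrariness out of the database.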

ORDER BY items must appear in the select list if SELECT DISTINCT is specified

I added the columns in the select list to the order by list, but it is still giving me the error:
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
Here is the stored proc:
CREATE PROCEDURE [dbo].[GetRadioServiceCodesINGroup]
@RadioServiceGroup nvarchar(1000) = NULL
AS
BEGIN
SET NOCOUNT ON;
SELECT DISTINCT rsc.RadioServiceCodeId,
rsc.RadioServiceCode + ' - ' + rsc.RadioService as RadioService
FROM sbi_l_radioservicecodes rsc
INNER JOIN sbi_l_radioservicecodegroups rscg
ON rsc.radioservicecodeid = rscg.radioservicecodeid
WHERE rscg.radioservicegroupid IN
(select val from dbo.fnParseArray(@RadioServiceGroup,','))
OR @RadioServiceGroup IS NULL
ORDER BY rsc.RadioServiceCode,rsc.RadioServiceCodeId,rsc.RadioService
END
Try this:
ORDER BY 1, 2
or:
ORDER BY rsc.RadioServiceCodeId, rsc.RadioServiceCode + ' - ' + rsc.RadioService
While they are not the same thing, in one sense DISTINCT implies a GROUP BY, because every DISTINCT could be re-written using GROUP BY instead. With that in mind, it doesn't make sense to order by something that's not in the aggregate group.
For example, if you have a table like this:
col1 col2
---- ----
1 1
1 2
2 1
2 2
2 3
3 1
and then try to query it like this:
SELECT DISTINCT col1 FROM [table] WHERE col2 > 2 ORDER BY col1, col2
That would make no sense, because there could end up being multiple col2 values per row. Which one should it use for the order? Of course, in this query you know the results wouldn't be that way, but the database server can't know that in advance.
Now, your case is a little different. You included all the columns from the order by clause in the select clause, and therefore it would seem at first glance that they were all grouped. However, some of those columns were included in a calculated field. When you do that in combination with distinct, the distinct directive can only be applied to the final results of the calculation: it doesn't know anything about the source of the calculation any more.
This means the server doesn't really know it can count on those columns any more. It knows that they were used, but it doesn't know if the calculation operation might cause an effect similar to my first simple example above.
So now you need to do something else to tell the server that the columns are okay to use for ordering. There are several ways to do that, but this approach should work okay:
SELECT rsc.RadioServiceCodeId,
rsc.RadioServiceCode + ' - ' + rsc.RadioService as RadioService
FROM sbi_l_radioservicecodes rsc
INNER JOIN sbi_l_radioservicecodegroups rscg
ON rsc.radioservicecodeid = rscg.radioservicecodeid
WHERE rscg.radioservicegroupid IN
(SELECT val FROM dbo.fnParseArray(@RadioServiceGroup,','))
OR @RadioServiceGroup IS NULL
GROUP BY rsc.RadioServiceCode,rsc.RadioServiceCodeId,rsc.RadioService
ORDER BY rsc.RadioServiceCode,rsc.RadioServiceCodeId,rsc.RadioService
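The shape of this GROUP BY rewrite can be tried on a small data set. The sketch below uses SQLite through Python's sqlite3 module, so it does not reproduce the SQL Server error message; it only illustrates that grouping by the underlying columns removes the duplicates while leaving those columns available to ORDER BY. The codes and code_groups tables and their contents are hypothetical stand-ins for the two tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codes (id INTEGER, code TEXT, service TEXT);
    CREATE TABLE code_groups (id INTEGER, group_id INTEGER);
    INSERT INTO codes VALUES (1, 'ZZ', 'Zeta'), (2, 'AA', 'Alpha'), (3, 'MM', 'Mu');
    -- Codes 1 and 2 belong to two groups each, so the join yields duplicates.
    INSERT INTO code_groups VALUES (1, 10), (1, 20), (2, 10), (2, 20), (3, 10);
""")

# Group by the source columns of the calculated label, then order by them.
result = conn.execute("""
    SELECT c.id, c.code || ' - ' || c.service AS label
    FROM codes c
    INNER JOIN code_groups g ON g.id = c.id
    GROUP BY c.code, c.id, c.service
    ORDER BY c.code, c.id, c.service
""").fetchall()
print(result)
```

Each code appears once even though the join produced five rows, and the output is sorted by code rather than by id.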
Try one of these:
Use column alias:
ORDER BY RadioServiceCodeId,RadioService
Use column position:
ORDER BY 1,2
You can only order by columns that actually appear in the result of the DISTINCT query - the underlying data isn't available for ordering on.
DISTINCT and GROUP BY generally do the same kind of thing, for different purposes. Both build a "working" table in memory based on the columns being grouped on (or selected in the SELECT DISTINCT clause), and then populate that working table as the query reads data, adding a new "row" only when the values call for one.
The only difference is that with GROUP BY the working table has additional "columns" for any calculated aggregates, such as SUM(), COUNT(), and AVG(), which need to be updated for each original row read. DISTINCT doesn't have to do this. In the special case where you GROUP BY only to get distinct values (and there are no aggregate columns in the output), the query plan is probably exactly the same. It would be interesting to compare the execution plans for the two options.
DISTINCT is certainly the way to go for readability when your purpose is to eliminate duplicate rows and you are not calculating any aggregate columns.
When you select a concatenated expression with DISTINCT, you need to give the new column an alias (or refer to it by position) if you want to order by it.
Some examples on SQL Server 2008:
--this works
SELECT DISTINCT (c.FirstName + ' ' + c.LastName) as FullName
from SalesLT.Customer c
order by FullName
--this works too
SELECT DISTINCT (c.FirstName + ' ' + c.LastName)
from SalesLT.Customer c
order by 1
-- this doesn't
SELECT DISTINCT (c.FirstName + ' ' + c.LastName) as FullName
from SalesLT.Customer c
order by c.FirstName, c.LastName
-- the problem: with DISTINCT, ORDER BY must use the new concatenated column, but here I order by the individual columns
-- this works
SELECT DISTINCT (c.FirstName + ' ' + c.LastName)
as FullName, CustomerID
from SalesLT.Customer c
order by 1, CustomerID
-- this doesn't
SELECT DISTINCT (c.FirstName + ' ' + c.LastName) as FullName
from SalesLT.Customer c
order by 1, CustomerID
You could try a subquery:
SELECT DISTINCT TEST.* FROM (
SELECT rsc.RadioServiceCodeId,
rsc.RadioServiceCode + ' - ' + rsc.RadioService as RadioService
FROM sbi_l_radioservicecodes rsc
INNER JOIN sbi_l_radioservicecodegroups rscg ON rsc.radioservicecodeid = rscg.radioservicecodeid
WHERE rscg.radioservicegroupid IN
(select val from dbo.fnParseArray(@RadioServiceGroup,','))
OR @RadioServiceGroup IS NULL
) as TEST
-- ORDER BY is not allowed inside a derived table without TOP, so sort the outer query
ORDER BY TEST.RadioService, TEST.RadioServiceCodeId