Search from comma separated values in column - sql

I have a table with 2 columns - id and pets.
The pets column contains abbreviated pet names separated by commas, as shown below:
+----+-------------+
| id | pets |
+----+-------------+
| 1 | CAT,DOG |
+----+-------------+
| 2 | CAT,DOG,TIG |
+----+-------------+
| 3 | ZEB,MOU |
+----+-------------+
Now I want to list all ids whose pets include CAT, likewise all ids whose pets include DOG, and so on.
My initial attempt was to run the following SQL once for every pet (CAT, DOG, etc.):
select id
from "favpets"
where pets like '%CAT%'
The limitation of this simple solution is that the actual table is not as small as the example above.
There are more than 200 distinct pets, so 200 SQL statements have to be executed to list the ids for every pet.
Is there a good alternative? I'm using Doctrine, so does Doctrine provide a good implementation?

With this query you will obtain every id paired with each of its pets, ordered by pet:
SELECT id, unnest(string_to_array(pets, ',')) AS mypet
FROM favpets
ORDER BY mypet;
Using it as a subquery makes it easy to group and count:
SELECT mypet, COUNT(*) FROM
(
SELECT id, unnest(string_to_array(pets, ',')) AS mypet
FROM favpets
ORDER BY mypet
) AS a
GROUP BY mypet;
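To see what the unnest/string_to_array approach computes without a PostgreSQL instance at hand, here is a minimal Python sketch of the same split-then-group logic. The rows mirror the sample table in the question; the helper name ids_by_pet is made up for illustration and is not part of any library.

```python
from collections import defaultdict

# Sample rows copied from the question: (id, comma-separated pets).
favpets = [
    (1, "CAT,DOG"),
    (2, "CAT,DOG,TIG"),
    (3, "ZEB,MOU"),
]

def ids_by_pet(rows):
    """Emulate unnest(string_to_array(pets, ',')) followed by grouping on the pet."""
    grouped = defaultdict(list)
    for row_id, pets in rows:
        # One output entry per (id, pet) pair, like the unnest step.
        for pet in pets.split(","):
            grouped[pet].append(row_id)
    return dict(grouped)

result = ids_by_pet(favpets)
for pet in sorted(result):
    print(pet, result[pet])
```

A single pass produces the id list for every pet, which is the point of the answer: one statement instead of one LIKE query per pet.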


Novice seeking help, Max Aggregate not returning expected results

I'm still very new to MS-SQL. I have a simple table and query that is getting the best of me. I know it will be something fundamental I'm overlooking.
I've changed the field names but the idea is the same.
So the idea is that every time someone signs up they get a RegID, Name, and Team. The names are unique, so for below yes John changed teams. And that's my trouble.
Football Table
+------------+----------+---------+
| Max_RegID | Name | Team |
+------------+----------+---------+
| 100 | John | Red |
| 101 | Bill | Blue |
| 102 | Tom | Green |
| 103 | John | Green |
+------------+----------+---------+
With the query at the bottom using the Max_RegID, I was expecting to get back only one record.
+------------+----------+---------+
| Max_RegID | Name | Team |
+------------+----------+---------+
| 103 | John | Green |
+------------+----------+---------+
Instead I get back the rows below, which seem to include the Max_RegID but also a row for each team. What am I doing wrong?
+------------+----------+---------+
| Max_RegID | Name | Team |
+------------+----------+---------+
| 100 | John | Red |
| 103 | John | Green |
+------------+----------+---------+
My Query
SELECT
Max(Football.RegID) AS Max_RegID,
Football.Name,
Football.Team
FROM
Football
GROUP BY
Football.RegID,
Football.Name,
Football.Team
EDIT* Removed the WHERE statement
The reason you're getting the results that you are is because of the way you have your GROUP BY clause structured.
When you're using any aggregate function, MAX(X), SUM(X), COUNT(X), or what have you, you're telling the SQL engine that you want the aggregate value of column X for each unique combination of the columns listed in the GROUP BY clause.
In your query as written, you're grouping by all three of the columns in the table, telling the SQL engine that each tuple is unique. Therefore the query is returning ALL of the values, and you aren't actually getting the MAX of anything at all.
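The effect of the GROUP BY column list can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption made purely so the example is self-contained), with the sample rows from the question and RegID as the column name used in the asker's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Football (RegID INTEGER, Name TEXT, Team TEXT)")
conn.executemany(
    "INSERT INTO Football VALUES (?, ?, ?)",
    [(100, "John", "Red"), (101, "Bill", "Blue"),
     (102, "Tom", "Green"), (103, "John", "Green")],
)

# Grouping by every column: each row is its own group, so MAX changes nothing.
by_all = conn.execute(
    "SELECT MAX(RegID), Name, Team FROM Football "
    "GROUP BY RegID, Name, Team ORDER BY 1"
).fetchall()

# Grouping by Name only: one group per person, MAX picks the highest RegID.
by_name = conn.execute(
    "SELECT Name, MAX(RegID) FROM Football GROUP BY Name ORDER BY Name"
).fetchall()

print(by_all)
print(by_name)
```

by_all comes back with all four rows, while by_name has exactly one row per person.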
What you actually want in your results is the maximum RegID for each distinct value in the Name column and also the Team that goes along with that (RegID,Name) combination.
To accomplish that you need to find the MAX(RegID) for each Name in an initial data set, and then use that list of RegIDs to add the values for Name and Team in a secondary data set.
Caveat (per comments from @HABO): This is premised on the assumption that RegID is a unique number (an IDENTITY column, value from a SEQUENCE, or something of that sort). If there are duplicate values, this will fail.
The most straightforward way to accomplish that is with a sub-query. The sub-query below gets your unique RegIDs, then joins to the original table to add the other values.
SELECT
f.RegID
,f.Name
,f.Team
FROM
Football AS f
JOIN
(--The sub-query, sq, gets the list of IDs
SELECT
MAX(f2.RegID) AS Max_RegID
FROM
Football AS f2
GROUP BY
f2.Name
) AS sq
ON
sq.Max_RegID = f.RegID;
EDIT: Sorry. I just re-read the question. To get just the single record for the MAX(RegID), just take the GROUP BY out of the sub-query, and you'll just get the current maximum value, which you can use to find the values in the rest of the columns.
SELECT
f.RegID
,f.Name
,f.Team
FROM
Football AS f
JOIN
(--The sub-query, sq, now gets the MAX ID
SELECT
MAX(f2.RegID) AS Max_RegID
FROM
Football AS f2
) AS sq
ON
sq.Max_RegID = f.RegID;
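As a quick check of both sub-query variants, here is a sketch that runs them against the question's sample rows. It uses Python's sqlite3 module instead of SQL Server (an assumption for portability); the ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Football (RegID INTEGER, Name TEXT, Team TEXT)")
conn.executemany(
    "INSERT INTO Football VALUES (?, ?, ?)",
    [(100, "John", "Red"), (101, "Bill", "Blue"),
     (102, "Tom", "Green"), (103, "John", "Green")],
)

# Variant 1: sub-query grouped by Name, giving the latest row per person.
latest_per_name = conn.execute("""
    SELECT f.RegID, f.Name, f.Team
    FROM Football AS f
    JOIN (SELECT MAX(f2.RegID) AS Max_RegID
          FROM Football AS f2
          GROUP BY f2.Name) AS sq
      ON sq.Max_RegID = f.RegID
    ORDER BY f.RegID
""").fetchall()

# Variant 2: no GROUP BY, giving only the row with the overall highest RegID.
latest_overall = conn.execute("""
    SELECT f.RegID, f.Name, f.Team
    FROM Football AS f
    JOIN (SELECT MAX(f2.RegID) AS Max_RegID
          FROM Football AS f2) AS sq
      ON sq.Max_RegID = f.RegID
""").fetchall()

print(latest_per_name)
print(latest_overall)
```

The first variant drops John's old Red row and keeps one row per name; the second returns only the single row the asker expected.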
Use row_number()
select * from
(SELECT
Football.RegID AS Max_RegID,
Football.Name,
Football.Team, row_number() over(partition by name order by Football.RegID desc) as rn
FROM
Football
WHERE
Football.Name = 'John')a
where rn=1
You can simply edit your query as follows:
SELECT *
FROM
Football f
WHERE
f.Name = 'John' and
f.Max_RegID = (SELECT Max(f2.Max_RegID) FROM Football f2 WHERE f2.Name = 'John'
)
Or, if you are on SQL Server, simply use this:
select top 1 * from Football f
where f.Name = 'John'
order by Max_RegID desc
Or, if you are on MySQL:
select * from Football f
where f.Name = 'John'
order by Max_RegID desc
Limit 1
You need a self join:
select f1.*
from Football f inner join
Football f1
on f1.name = f.name
where f.Max_RegID = 103;
After revisiting the question, the sample data suggests a subquery:
select f.*
from Football f
where name = (select top (1) f1.name
from Football f1
order by f1.Max_RegID desc
);

Is there a way to "unflatten" a table in SQL into multiple normalized tables? (SQL Server)

I have some data generated by an SQL XML query that uses one cross apply and one outer apply to create a table of the following form:
COL 1 | COL 2 | COL 3
abe | dog | ball
abe | dog | stick
abe | cat | yarn
ben | cow | NULL
ben | dog | water
ben | dog | stick
In this example, col 1 is people, col 2 is their pets, and col 3 is a list of things their pets like (the pets may not like anything). In reality, columns 1,2 and 3 are each represented by multiple columns.
I want to "unflatten" this data into 3 tables, person, pet and pet_interests. In doing this I would also like to create a many to one relationship from pet to person and a many to one relationship from pet_interests to pet.
I am unable to find a way to do this without iterating through the data manually with C#, but I feel like there must be an easier way. I was hoping someone would be able to help me with the best way to do this.
Thanks in advance.
I will show the approach. First, create the reference tables:
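-- IDENTITY() in SELECT ... INTO needs a data type as its first argument
select identity(int, 1, 1) as personId, col1 as name
into persons
from t
group by col1; -- group by the source column; the alias is not visible here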
Repeat for the rest of the tables.
Then use the reference tables for your 1-1 tables:
select identity(int, 1, 1) as personPetId, p.personId, pe.petId
into personPets
from t join
persons p
on t.col1 = p.name join
pets pe
on t.col2 = pe.species
group by p.personId, pe.petId; -- group by the selected keys
Repeat as necessary.
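The whole unflattening can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module instead of SQL Server, it stores the many-to-one links as foreign key columns (person_id on pet, pet_id on pet_interest) as the question describes, and every table and column name other than t, col1, col2 and col3 is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT);
    INSERT INTO t VALUES
        ('abe', 'dog', 'ball'), ('abe', 'dog', 'stick'), ('abe', 'cat', 'yarn'),
        ('ben', 'cow', NULL), ('ben', 'dog', 'water'), ('ben', 'dog', 'stick');

    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE pet (
        pet_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        species TEXT
    );
    CREATE TABLE pet_interest (
        pet_id INTEGER REFERENCES pet(pet_id),
        interest TEXT
    );

    -- One row per distinct person; the key is assigned automatically.
    INSERT INTO person (name) SELECT DISTINCT col1 FROM t;

    -- One row per distinct (person, pet) pair, keyed back to person.
    INSERT INTO pet (person_id, species)
    SELECT DISTINCT p.person_id, t.col2
    FROM t JOIN person AS p ON p.name = t.col1;

    -- One row per interest; pets whose col3 is NULL get no rows.
    INSERT INTO pet_interest (pet_id, interest)
    SELECT pe.pet_id, t.col3
    FROM t
    JOIN person AS p ON p.name = t.col1
    JOIN pet AS pe ON pe.person_id = p.person_id AND pe.species = t.col2
    WHERE t.col3 IS NOT NULL;
""")

counts = {
    name: conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
    for name in ("person", "pet", "pet_interest")
}
print(counts)
```

Each level is loaded with a set-based INSERT ... SELECT that joins back to the level above it, so no row-by-row iteration in application code is needed.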

Select and count attribute for non-unique column

I have a table of persons and activities - neither column is unique.
I need to rank every user by the count of distinct activities, e.g.:
_________________
|PERSON|ACTIVITY|
-----------------
|Lars | Sleep |
|James | Eat |
|Lars | Sleep |
|Lars | Sleep |
|Kirk | Shred |
|James | Shred |
-----------------
Lars appears thrice, but performs the same activity repeatedly.
Kirk appears once, so he is identical to Lars in number of activities.
James performs two distinct activities, so he should be ranked the highest.
The expected output:
James - 2
Kirk - 1
Lars - 1
(ordering of identical counts is irrelevant)
The solution I have come up with involves applying DISTINCT to the person column and iterating over the names, selecting the activities for each and applying DISTINCT followed by COUNT. It feels like there must be a better way.
I think you just want count(distinct):
select person, count(distinct activity) as num_activities
from t
group by person
order by num_activities desc;
SELECT Person, COUNT(DISTINCT Activity) AS ActivityCount
FROM MyTable
GROUP BY Person
ORDER BY 2 DESC
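You could use a GROUP BY clause. Replace your_table with the actual table name, and sort descending so the person with the most activities comes first:
SELECT PERSON, COUNT(DISTINCT ACTIVITY) AS activity_count FROM your_table GROUP BY PERSON ORDER BY activity_count DESC;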
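The COUNT(DISTINCT ...) approach can be verified against the sample data with a short sketch using Python's sqlite3 module (an assumption, since the question does not name a database). The secondary sort on person is added here only to make the order of ties predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (person TEXT, activity TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Lars", "Sleep"), ("James", "Eat"), ("Lars", "Sleep"),
     ("Lars", "Sleep"), ("Kirk", "Shred"), ("James", "Shred")],
)

# COUNT(DISTINCT activity) counts each activity once per person,
# so Lars's three identical rows contribute a single activity.
ranking = conn.execute(
    "SELECT person, COUNT(DISTINCT activity) AS num_activities "
    "FROM t GROUP BY person "
    "ORDER BY num_activities DESC, person"
).fetchall()

for person, num_activities in ranking:
    print(person, "-", num_activities)
```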

SQL - Group by Elements of Comma Delineation

How can I group by the elements of a comma-delimited list within a row?
Situation:
I have a view that shows me information on support tickets. Each ticket is assigned to an indefinite number of resources. It might have one name in the resource list, it might have 5.
I would like to aggregate by individual names, so:
| Ticket ID | resource list
+-----------+----------
| 1 | Smith, Fred, Joe
| 2 | Fred
| 3 | Smith, Joe
| 4 | Joe, Fred
Would become:
| Name | # of Tickets
+-----------+----------
| Fred | 3
| Smith | 2
| Joe | 3
I did not design the database, so I am stuck with this awkward resource list column.
I've tried something like this:
SELECT DISTINCT resource_list
, Count(*) AS '# of Tickets'
FROM IEG.vServiceIEG
GROUP BY resource_list
ORDER BY '# of Tickets' DESC
...which gives me ticket counts based on particular combinations, but I'm having trouble getting this one step further to separate that out.
I also have access to a list of these individual names that I could do a join from, but I'm not sure how I would make that work. Previously in reports, I've used WHERE resource_list LIKE '%' + @tech + '%', but I'm not sure how I would iterate through this for all names.
EDIT:
This is my final query that gave me the information I was looking for:
select b.Item, Count(*) AS 'Ticket Count'
from IEG.vServiceIEG a
cross apply (Select * from dbo.Split(REPLACE(a.resource_list, ' ', ''),',')) b
Group by b.Item
order by 2 desc
Check this Post (Function Definition by Romil) for splitting strings into a table:
How to split string and insert values into table in SQL Server
Use it this way :
select b.Item, Count(*) from IEG.vServiceIEG a
cross apply (
Select * from dbo.Split (a.resource_list,',')
) b
Group by b.Item
order by 2 desc
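One detail worth checking with this sample data is the space after each comma: splitting on ',' alone leaves ' Fred' and 'Fred' as different group keys. The sketch below illustrates the split-then-count logic in Python with and without trimming; count_tickets is a hypothetical helper written for this example, not the dbo.Split function.

```python
from collections import Counter

# Sample rows copied from the question: (ticket id, resource list).
tickets = [
    (1, "Smith, Fred, Joe"),
    (2, "Fred"),
    (3, "Smith, Joe"),
    (4, "Joe, Fred"),
]

def count_tickets(rows, trim):
    """Split each resource list on commas and count tickets per name."""
    counts = Counter()
    for _ticket_id, resource_list in rows:
        for name in resource_list.split(","):
            counts[name.strip() if trim else name] += 1
    return counts

untrimmed = count_tickets(tickets, trim=False)
trimmed = count_tickets(tickets, trim=True)

print(dict(untrimmed))
print(dict(trimmed))
```

Without trimming, Fred's three tickets are spread across two keys; with trimming, the counts match the expected output in the question.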

Sybase SQL Select Distinct Based on Multiple Columns with an ID

I'm trying to query a sybase server to get examples of different types of data we hold for testing purposes.
I have a table that looks like the below (abstracted)
Animals table:
id | type | breed | name
------------------------------------
1 | dog | german shepard | Bernie
2 | dog | german shepard | James
3 | dog | husky | Laura
4 | cat | british blue | Mr Fluffles
5 | cat | other | Laserchild
6 | cat | british blue | Sleepy head
7 | fish | goldfish | Goldie
As I mentioned, I want one example of each type and breed combination, so for the above table I would like a result set like this (in reality I just want the IDs):
id | type | breed
---------------------------
1 | dog | german shepard
3 | dog | husky
4 | cat | british blue
5 | cat | other
7 | fish | goldfish
I've tried multiple combinations of queries such as the below but they are either invalid SQL (for sybase) or return invalid results
SELECT id, DISTINCT ON type, breed FROM animals
SELECT id, DISTINCT(type, breed) FROM animals
SELECT id FROM animals GROUP BY type, breed
I've found other questions such as SELECT DISTINCT on one column, but these only deal with one column.
Do you have any idea how to implement this query?
Maybe you have to use an aggregate function, max or min, on the id column. It will return only one ID for each combination of the grouped columns.
select max(Id), type, breed
from animals
group by type, breed
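A small sketch makes the choice between max and min concrete. It runs the grouped query against the question's sample rows using Python's sqlite3 module (an assumption; the question targets Sybase), and orders by the aggregated id only so the output is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (id INTEGER, type TEXT, breed TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?, ?)",
    [
        (1, "dog", "german shepard", "Bernie"),
        (2, "dog", "german shepard", "James"),
        (3, "dog", "husky", "Laura"),
        (4, "cat", "british blue", "Mr Fluffles"),
        (5, "cat", "other", "Laserchild"),
        (6, "cat", "british blue", "Sleepy head"),
        (7, "fish", "goldfish", "Goldie"),
    ],
)

# One id per (type, breed) group; MAX picks the highest id in each group.
max_ids = [row[0] for row in conn.execute(
    "SELECT MAX(id) FROM animals GROUP BY type, breed ORDER BY 1"
)]

# MIN picks the lowest id in each group instead.
min_ids = [row[0] for row in conn.execute(
    "SELECT MIN(id) FROM animals GROUP BY type, breed ORDER BY 1"
)]

print(max_ids)
print(min_ids)
```

Both return one id per group; with this data, min reproduces the ids listed in the question's expected result, while max returns the other member of each duplicated pair.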
EDIT:
Other different ways to do it:
With having and aggregate function
select id, type, breed
from animals
group by type, breed
having id = max(Id)
With having and aggregate subquery
select id, type, breed
from animals a1
group by type, breed
having id = (
select max(id)
from animals a2
where a2.type = a1.type
and a2.breed = a1.breed
)
Try this and let me know if it works:
select breed, max(id) as id, max(type) as type
from animals
group by breed
You may have to play around with max().
The choice of max() here is arbitrary; you could just as well use min() instead.
max() returns the largest value in that column, min() the smallest.