SQL: pull distinct values from 1 column with all values from 2nd column - sql

It's easier to explain what I need to do with an example.
The table looks like this:
Col 1, Col 2
1, a
1, b
2, a
2, b
2, c
I need a query to return something like
1,a,b
2,a,b,c

You would want a query such as:
UPDATE t
SET t.dupcustodians = dt.custadmin
FROM tbldoc t
INNER JOIN (SELECT t1._dupid,
(SELECT DISTINCT custadmin + ', '
FROM tbldoc t2
WHERE t2._dupid = t1._dupid
ORDER BY custadmin + ', '
FOR XML PATH('')) AS custadmin
FROM tbldoc t1
GROUP BY _dupid) AS dt
ON t._dupid = dt._dupid
;
I had a similar problem where everything had a name in the "CustAdmin" field and then they all had potentially duplicate _DupID values. I wanted it to list out in a new field "DupCustodians" all the names that were there when the _DupID values were alike from one record to the next. So swap those names with the field names you need (and don't forget to change the table names, of course) and you should be good.

Well, if you are using MySQL, then you can do this:
SELECT Col1, GROUP_CONCAT(Col2)
FROM MyTable
GROUP BY Col1
Other databases that don't have the MySQL specific GROUP_CONCAT function might require a more complex query.
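For anyone who wants to try the GROUP_CONCAT idea without a MySQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite happens to ship a similar group_concat aggregate; the in-memory database is purely an assumption for illustration, while MyTable, Col1 and Col2 come from the answer above. SQLite does not guarantee the order of the concatenated values, so the sketch sorts them in Python.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Col1 INTEGER, Col2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "a"), (2, "b"), (2, "c")],
)

# One output row per distinct Col1, with its Col2 values joined by commas.
rows = conn.execute(
    "SELECT Col1, group_concat(Col2) FROM MyTable GROUP BY Col1 ORDER BY Col1"
).fetchall()

# Concatenation order is not guaranteed by SQLite, so sort before printing.
result = [(col1, ",".join(sorted(joined.split(",")))) for col1, joined in rows]
for col1, joined in result:
    print(f"{col1},{joined}")
conn.close()
```

In MySQL itself the ordering can be fixed inside the aggregate, e.g. GROUP_CONCAT(Col2 ORDER BY Col2).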

Related

Query not reading the quoted string values stored in the table

I have stored some quoted values in a separate table, and I am trying to filter the rows in another table by using those values in a subquery. But the subquery does not seem to read the values, and the query returns an empty result.
The value is in column override and resolves to 'HCC11','HCC12'.
When I copy the value from the column and paste it in place of the subquery, the query fetches the data correctly. I don't understand what the issue is. I have tried using the trim() function, but it still doesn't work.
Note: I have attached a picture for reference:
select *
from table1
where column1 in (select override from table2)
Storing comma-separated values in a single column is a really poor database design to begin with; enclosing them in quotes makes things even worse. The proper solution to your problem is a better design.
However, if you are forced to work with that bad design, you can convert them to a proper list of values using
select *
from table1
where column1 in (select trim(both '''' from w.word)
from table2 t2
cross join unnest(string_to_array(t2.override, ',')) as w(word))
This assumes that table1.column1 only contains a single value without any quotes and that the override values never contain a comma in the real value (e.g. the above would break on a value like 'A,B', 'C').
Your override column holds the single string 'HCC11','HCC12', which cannot match a single value such as 'HCC11'. It is better to use the LIKE operator, as follows:
select * from table1 t1
where exists
(select 1 from table2 t2
where t2.override like concat('%''', t1.column1, '''%'));
According to your image, the value of table1.column1 has to be 'HCC11','HCC12' (one string) to get the match from subquery.
If the table1 has 2 rows with values HCC11 and HCC12 then you might use the exists keyword in your subquery.
Something like
select *
from table1 t1
where exists
(select 1
from table2 t2
where instr( t2.override, concat("'",t1.column1,"'") ) >=1
);
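To see why wrapping the value in quotes before searching matters, here is a small sketch of the same EXISTS/instr idea, run against SQLite through Python's sqlite3 module. SQLite and the sample rows (including the extra values HCC13 and HCC1) are assumptions made only for this illustration; table1, table2, column1 and override are the names from the question. SQLite uses || for concatenation where the answer above uses concat().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT)")
conn.execute("CREATE TABLE table2 (override TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("HCC11",), ("HCC12",), ("HCC13",), ("HCC1",)],
)
# The whole quoted list is stored as ONE string, as in the question.
conn.execute("INSERT INTO table2 VALUES (?)", ("'HCC11','HCC12'",))

# Search for the value wrapped in single quotes, so that HCC1 does not
# accidentally match inside 'HCC11'.
query = """
SELECT t1.column1
FROM table1 t1
WHERE EXISTS (
    SELECT 1
    FROM table2 t2
    WHERE instr(t2.override, '''' || t1.column1 || '''') >= 1
)
ORDER BY t1.column1
"""
matched = [row[0] for row in conn.execute(query)]
print(matched)
conn.close()
```

Only HCC11 and HCC12 come back; HCC1 is rejected because the search string includes the closing quote.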
You can do this like -
1.
select * from table1
where column1 in
(select regexp_replace(unnest(string_to_array(override, ',')),'''', '', 'g') from table2)
Or
2.
select * from table1
where '''' || column1 || '''' in
(select unnest(string_to_array(override, ',')) from table2)
Although, I would just recommend not storing your data like this, since you want to query using it.

SQL Select query rows using explode on a string

I need to query a MariaDB database based on which IDs are contained in one column of a row. The IDs in the column 'children' are stored as a string of concatenated numbers like this:
123;32523;436;241;345;234;
... or:
23;45;324;56;2141;5464;2342;
I need a query, something like this:
Select * from testTbl WHERE ID in (Explode(";", Select children from testTbl WHERE ID = 1))
I need the query to return the rows whose IDs are listed in the children column of the row with ID = 1. What I am looking for is the equivalent of my hypothetical Explode command.
You should not store lists of things as delimited lists. Here are some reasons:
Numbers should be stored as numbers, not strings.
Ids should have foreign key relationships to the tables they refer to.
A column should contain a single item of information, not a list.
SQL has this great data structure for storing lists. It is called a "table".
That said, sometimes you are stuck with other people's really bad designs. If so, you can do what you want with replace() and find_in_set():
select t.*
from testtbl t
where exists (select 1
from testtbl t2
where t2.id = 1 and
find_in_set(t.id, replace(t2.children, ';', ',')) > 0
);
Try this (using CONCAT, since + does not concatenate strings in MariaDB):
Select *
from testTbl
WHERE CONCAT(';', (Select children from testTbl WHERE ID = 1), ';') LIKE CONCAT('%;', ID, ';%')
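The delimiter-wrapping trick in the last answer is easy to get subtly wrong, so here is a minimal sketch of it that can be run locally with Python's sqlite3 module. SQLite and the sample rows are assumptions for illustration only; testTbl, ID and children are the names from the question. Putting a ';' on both sides of each ID is what stops ID 3 from matching inside 23.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testTbl (ID INTEGER PRIMARY KEY, children TEXT)")
conn.executemany(
    "INSERT INTO testTbl VALUES (?, ?)",
    [(1, "23;5;"), (3, ""), (5, ""), (23, "")],
)

# Prefix the list with ';' so that every ID in it, including the first,
# is surrounded by delimiters; then look for ';<ID>;' inside it.
query = """
SELECT ID
FROM testTbl
WHERE ';' || (SELECT children FROM testTbl WHERE ID = 1)
      LIKE '%;' || ID || ';%'
ORDER BY ID
"""
child_ids = [row[0] for row in conn.execute(query)]
print(child_ids)
conn.close()
```

With the sample data this returns rows 5 and 23 but not row 3.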

Concatenating multiple values into a single string as a subquery without using a variable

I have two row values from table C:
Select Name FROM Table C Where AccountID = 123
COL1
Row 1 |Ricky|
Row 2 |Roxy |
I want to be able to select both of these values in a subquery that will be used in a larger query, so that it displays "Ricky, Roxy".
How can this be done without declaring a variable?
SELECT COL1 = STUFF ((SELECT ',' + COL1 FROM tableC WHERE AccountID=123
FOR XML PATH(''), Type).value('.[1]','nvarchar(max)'),
1,1,'')
This will return all account 123 COL1 values as one column, with commas separating values.

SQL: Dedupe table data and manipulate merged data

I have an SQL table with:
Id INT, Name NVARCHAR(MAX), OldName NVARCHAR(MAX)
There are multiple duplicates in the name column.
I would like to remove these duplicates, keeping only one master copy of 'Name'. When the dedupe happens I want to concatenate the old names into the OldName field.
E.G:
Dave | Steve
Dave | Will
Would become
Dave | Steve, Will
After merging.
I know how to de-dupe data using something like:
with x as (select *,rn = row_number()
over(PARTITION BY OrderNo,item order by OrderNo)
from #temp1)
select * from x
where rn > 1
But not sure how to update the new 'master' record whilst I am at it.
This is really too complicated to do in a single update, because you need to update and delete rows.
select n.name,
stuff((select ',' + t2.oldname
from sqltable t2
where t2.name = n.name
for xml path (''), type
).value('/', 'nvarchar(max)'
), 1, 1, '') as oldnames
into _temp
from (select distinct name from sqltable) n;
truncate table sqltable;
insert into sqltable(name, oldname)
select name, oldnames
from _temp;
Of course, test, test, test before truncating the original table (copy it for safekeeping). Note that _temp is deliberately a regular table rather than a temporary one. That way, if something happens -- like a server reboot -- before the insert is finished, you still have all the data.
Your question doesn't specify what to do with the id column. You can add min(id) or max(id) to the _temp if you want to use one of those values.
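The merge-then-reload steps above can be rehearsed on a throwaway database first. The following is only a sketch under stated assumptions: it uses SQLite via Python's sqlite3 module rather than SQL Server, substitutes SQLite's group_concat for the FOR XML PATH construct, keeps MIN(Id) as the surviving id (one of the options mentioned above), and adds an invented third row (Anna) as sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sqltable (Id INTEGER, Name TEXT, OldName TEXT)")
conn.executemany(
    "INSERT INTO sqltable VALUES (?, ?, ?)",
    [(1, "Dave", "Steve"), (2, "Dave", "Will"), (3, "Anna", "Ann")],
)
conn.commit()

# Step 1: build the merged rows in a regular side table.
conn.execute(
    """
    CREATE TABLE _temp AS
    SELECT MIN(Id) AS Id, Name, group_concat(OldName, ', ') AS OldName
    FROM sqltable
    GROUP BY Name
    """
)

# Step 2: the DELETE and INSERT are committed together,
# or rolled back if either statement raises an error.
with conn:
    conn.execute("DELETE FROM sqltable")
    conn.execute("INSERT INTO sqltable SELECT Id, Name, OldName FROM _temp")

final_rows = conn.execute(
    "SELECT Id, Name, OldName FROM sqltable ORDER BY Id"
).fetchall()
# group_concat order is not guaranteed, so compare the names as sorted lists.
merged = {name: sorted(old.split(", ")) for _, name, old in final_rows}
print(merged)
conn.close()
```

On SQL Server 2017 and later, STRING_AGG plays the role that group_concat plays in this sketch.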

Using the distinct function in SQL

I have a SQL query I am running. What I want to know is whether there is a way of selecting the rows in a table where the value in one of the columns is distinct. When I use the distinct function, it returns all of the distinct rows, so...
select distinct teacher from class etc.
This works fine, but I am selecting multiple columns, so...
select distinct teacher, student etc.
but I don't want to retrieve the distinct rows, I want the distinct rows where the teacher is distinct. So this query would probably return the same teacher's name multiple times because the student value is different but what I would like is to return rows where the teachers are distinct, even if it means returning the teacher and one student name (because I don't need all the students).
I hope what I am trying to ask is clear but is it possible to use the distinct function on a single column even when selecting multiple columns or is there any other solution to this problem? Thanks.
The above is just an example I am giving. I don't know if using 'distinct' is the solution to my problem. I am not using teacher etc.; that was just an example to get the idea across. I am selecting multiple columns (about 10) from different tables. I have a query to get the tabled result I want. Now I want to query that table to find the unique values in one particular column. So using the teacher example again, say I have written a query and I have all the teachers and all the pupils they teach. Now I want to go through each row in this table and email the teacher a message. But I don't want to email the teacher numerous times, just the once, so I want to return all the columns from the table I have, where only the teacher value is distinct.
Col A Col B Col C Col D
a b c d
a c d b
b a a c
b c c c
A query I have produces the above table. Now I want only those rows where Col A values are unique. How would I go about it?
You have misunderstood the DISTINCT keyword. It is not a function and it does not modify a column. You cannot SELECT a, DISTINCT(b), c, DISTINCT(d) FROM SomeTable. DISTINCT is a modifier for the query itself, i.e. you don't select a distinct column, you make a SELECT DISTINCT query.
In other words: DISTINCT tells the server to go through the whole result set and remove all duplicate rows after the query has been performed.
If you need a column to contain every value once, you need to GROUP BY that column. Once you do that, the server needs to know which student to select for each teacher if there are several, so you need to provide a so-called aggregate function like COUNT(). Example:
SELECT teacher, COUNT(student) AS amountStudents
FROM ...
GROUP BY teacher;
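The difference between SELECT DISTINCT and GROUP BY with an aggregate is easiest to see side by side. Here is a minimal sketch using Python's sqlite3 module; SQLite, the class table layout and the sample rows are assumptions for illustration, and MIN(student) is just one arbitrary way of picking a single student per teacher.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE class (teacher TEXT, student TEXT)")
conn.executemany(
    "INSERT INTO class VALUES (?, ?)",
    [
        ("Teacher 1", "Student 1"),
        ("Teacher 1", "Student 2"),
        ("Teacher 2", "Student 3"),
        ("Teacher 2", "Student 4"),
    ],
)

# DISTINCT applies to the whole row: every (teacher, student) pair is
# already unique, so all four rows come back.
distinct_rows = conn.execute(
    "SELECT DISTINCT teacher, student FROM class"
).fetchall()

# GROUP BY collapses to one row per teacher; the aggregates tell the
# server what to do with the several students in each group.
grouped_rows = conn.execute(
    """
    SELECT teacher, MIN(student) AS one_student, COUNT(student) AS amountStudents
    FROM class
    GROUP BY teacher
    ORDER BY teacher
    """
).fetchall()

print(len(distinct_rows))
print(grouped_rows)
conn.close()
```

The grouped query yields exactly one row per teacher, which is what the email use case in the question needs.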
One option is to use a GROUP BY on Col A. Example:
SELECT * FROM table_name
GROUP BY Col A
That should return:
a b c d
b a a c
Based on the limited details you provided in your question (you should explain how/why your data is in different tables, what DB server you are using, etc) you can approach this from 2 different directions.
Reduce the number of columns in your query to only return the "teacher" and "email" columns but using the existing WHERE criteria. The problem with your current attempt is that neither DISTINCT nor GROUP BY understands that you only want 1 row for each value of the column that you are trying to be distinct about. From what I understand, MySQL has support for what you are doing using GROUP BY but MSSQL does not support result columns not included in the GROUP BY statement. If you don't need the "student" columns, don't put them in your result set.
Convert your existing query to use column based sub-queries so that you only return a single result for non-grouped data.
Example:
SELECT t1.a
, (SELECT TOP 1 b FROM Table1 t2 WHERE t1.a = t2.a) AS b
, (SELECT TOP 1 c FROM Table1 t2 WHERE t1.a = t2.a) AS c
, (SELECT TOP 1 d FROM Table1 t2 WHERE t1.a = t2.a) AS d
FROM dbo.Table1 t1
WHERE (your criteria here)
GROUP BY t1.a
This query will not be fast if you have a lot of data, but it will return a single row per teacher with a somewhat random value for the remaining columns. You can also add an ORDER BY to each sub-query to further tweak the values returned for the additional columns.
I'm not sure if I am understanding this right but couldn't you do
SELECT * FROM class WHERE teacher IN (SELECT DISTINCT teacher FROM class)
This would return all of the data in each row where the teacher is distinct
distinct requires a unique result-set row. This means that whatever values you select from your table will need to be distinct together as a row from any other row in the result-set.
Using distinct can return the same value more than once from a given field as long as the other corresponding fields in the row are distinct as well.
As soulmerge and Shiraz have mentioned you'll need to use a GROUP BY and subselect. This worked for me.
DECLARE @table TABLE (
[Teacher] [NVarchar](256) NOT NULL ,
[Student] [NVarchar](256) NOT NULL
)
INSERT INTO @table VALUES ('Teacher 1', 'Student 1')
INSERT INTO @table VALUES ('Teacher 1', 'Student 2')
INSERT INTO @table VALUES ('Teacher 2', 'Student 3')
INSERT INTO @table VALUES ('Teacher 2', 'Student 4')
SELECT
T.[Teacher],
(
SELECT TOP 1 T2.[Student]
FROM @table AS T2
WHERE T2.[Teacher] = T.[Teacher]
) AS [Student]
FROM @table AS T
GROUP BY T.[Teacher]
Results
Teacher 1, Student 1
Teacher 2, Student 3
You need to do it with a sub select where you take TOP 1 of student where the teacher is the same.
You may try "GROUP BY teacher" to return what you need.
What is the question your query is trying to answer?
Do you need to know which classes have only one teacher?
select class_name, count(teacher)
from class group by class_name having count(teacher)=1
Or are you looking for teachers with only one student?
select teacher, count(student)
from class group by teacher having count(student)=1
Or is it something else? The question you've posed assumes that using DISTINCT is the correct approach to the query you're trying to construct. It seems likely this is not the case. Could you describe the question you're trying to answer with DISTINCT?
You will need to say how your data is stored in-memory for us to say how you can query it.
But you could do a separate query to just get the distinct teachers.
select distinct teacher from class
I am struggling to understand exactly what you wish to do.. but you can do something like this:
SELECT DISTINCT ColA FROM Table WHERE ...
If you only select a singular column, the distinct will only grab those.
If you could clarify a little more, I could try to help a bit more.
You could use GROUP BY to separate the return values based on a single column value.
All you have to do is select just the column you want and do a SELECT DISTINCT:
Select Distinct column1 -- where your criteria...
The following might help you get to your solution. The other poster did point to this but his syntax for group by was incorrect.
Get all teachers that teach any classes.
Select teacher_table.teacher_id, count(*)
from teacher_table inner join classes_table
on teacher_table.teacher_id = classes_table.teacher_id
group by teacher_table.teacher_id
No one seems to understand what you want. I will take another guess.
Select * from tbl
Where ColA in (Select ColA from tbl Group by ColA Having Count(ColA) = 1)
This will return all data from rows where ColA is unique -i.e. there isn't another row with the same ColA value. Of course, that means zero rows from the sample data you provided.
select cola,colb,colc
from yourtable
where cola in
(
select cola from yourtable where your criteria group by cola having count(*) = 1
)
declare @temp as table (colA nchar, colB nchar, colC nchar, colD nchar, rownum int)
insert @temp (colA, colB, colC, colD, rownum)
select Test.ColA, Test.ColB, Test.ColC, Test.ColD, ROW_NUMBER() over (order by ColA) as rownum
from Test
select t1.ColA, ColB, ColC, ColD
from @temp as t1
join (
select ColA, MIN(rownum) [min]
from @temp
group by Cola)
as t2 on t1.Cola = t2.Cola and t1.rownum = t2.[min]
This will return a single row for each value of the colA.
CREATE FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR(MAX)
AS
BEGIN
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET @list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT @rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END
GO