Grouping values and changing values which do not allow the rest of the row to group - sql

Not sure how to describe this, but I want to collapse several rows into one: where one field has two or more different values, I want to combine them (by concatenating or changing the values) so that I get just one single row.
For example:
I have a simple table (all fields are Strings) of people next to their departments. But some people belong to more than one department.
select department_ind, name
from jobs
;
department_ind name
1 Michael
2 Michael
2 Sarah
3 Dave
2 Sally
4 Sally
I want to group by name, and concatenate the department_ind. So the results should look like:
department_ind name
1,2 Michael
2 Sarah
3 Dave
2,4 Sally
Thanks

Use string_agg()
select string_agg(department_ind::text, ',') as departments,
name
from jobs
group by name;
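To try the grouping without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite's group_concat() as a stand-in for string_agg(); the table and sample rows are the ones from the question. SQLite does not guarantee the order of the concatenated values, so the sketch sorts them in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (department_ind TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [("1", "Michael"), ("2", "Michael"), ("2", "Sarah"),
     ("3", "Dave"), ("2", "Sally"), ("4", "Sally")],
)

# group_concat() is SQLite's counterpart to PostgreSQL's string_agg().
rows = conn.execute(
    "SELECT name, group_concat(department_ind, ',') FROM jobs GROUP BY name"
).fetchall()

# Concatenation order is not guaranteed, so sort the pieces per name.
departments = {name: sorted(joined.split(",")) for name, joined in rows}
print(departments)
```

Each name ends up with exactly one entry holding all of its department values.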

Related

Aggregate values in columns ONLY where there is a difference

Where the values of one column match, I want to:
aggregate the data in the other columns where there is a difference between the two values
if the values are the same, then take the value
Example data
Name Surname Age
Ryan Smith 28
Ryan Smith 29
Sean Johnson 37
Desired result:
Name Surname Age
Ryan Smith 28, 29
Sean Johnson 37
The name Ryan appears twice, so I want to aggregate the data for the other fields, surname and age, ONLY where the data is different for the two rows.
Surname is Smith in both rows so no need to aggregate, just want to populate as Smith in one row.
Age is different so want to aggregate the ages for the two rows into one row
Sean Johnson record is unique for all columns so no need to aggregate or amend anything
I have tried string_agg function but this gives the result:
Name Surname Age
Ryan Smith, Smith 28,29
Sean Johnson 37
It aggregates all fields irrespective of whether the data between the two rows is different or not.
You can use:
select name, string_agg(distinct surname, ',') as surname, string_agg(age, ',')
from t
group by name;
This assumes that a name identifies exactly one person, which seems like a strong assumption for most datasets.
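A minimal sketch of the DISTINCT idea, runnable with Python's built-in sqlite3 module. It assumes SQLite's group_concat() as a stand-in for string_agg(), and it applies DISTINCT to age as well so that identical ages would also collapse, which is an addition on my part. The table name t and the sample rows follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, surname TEXT, age TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("Ryan", "Smith", "28"), ("Ryan", "Smith", "29"), ("Sean", "Johnson", "37")],
)

# DISTINCT inside the aggregate drops repeated values before concatenating,
# so "Smith, Smith" becomes "Smith" while differing ages are both kept.
rows = conn.execute(
    "SELECT name, group_concat(DISTINCT surname), group_concat(DISTINCT age) "
    "FROM t GROUP BY name"
).fetchall()

# Concatenation order is not guaranteed in SQLite, so sort the ages.
people = {name: (surname, sorted(ages.split(","))) for name, surname, ages in rows}
print(people)
```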

Crystal Reports SQL - comparing data in multiple fields for grouping purposes

I have a database table which contains the following fields (among others):
Person 1
Person 2
Person 3
I need to group by Person. Unfortunately the same person can appear in any one of the above three fields. I can't simply concatenate the 3 fields, as there may be a name in each field, i.e. 3 names in a particular entry. For example:
Record 1 Joe Bloggs (Person 1), Person 2 null, Person 3 null
Record 2 Jane Black (Person 1), Joe Bloggs (Person 2), Person 3 null
Record 3 Jane Black (Person 1), James Blue (Person 2), Joe Bloggs (Person 3)
Yes I know the database table is set up incorrectly - ideally there should be 1 field for Person's name and multiple entries but this is not the case and this is what I have to work with.
How on earth can I group by name when the answer is in one of several fields?
Please note I am not looking to pull out a particular name (that would be easy, as I would search for the name in any of the 3 fields). I need to group by name to show a report containing everyone. The report would look like:
Group 1: Joe Bloggs
Record 1 fields
Record 2 Fields
Record 3 fields
Group 2: Jane Black
Record 2 Fields
Record 3 Fields
Group 3: James Blue
Record 3 fields
Any help would be much appreciated, thank you!
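One possible direction, sketched under assumptions: unpivot the three person fields into a single column with UNION ALL, then group on that column. The table name records and the columns id, person1, person2, person3 are hypothetical, and the sketch runs against SQLite through Python's sqlite3 module rather than inside Crystal Reports, so it only shows the shape of the query.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER, person1 TEXT, person2 TEXT, person3 TEXT)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        (1, "Joe Bloggs", None, None),
        (2, "Jane Black", "Joe Bloggs", None),
        (3, "Jane Black", "James Blue", "Joe Bloggs"),
    ],
)

# One output row per (person, record) pair; null person slots are dropped.
unpivot = """
SELECT person, id FROM (
    SELECT person1 AS person, id FROM records
    UNION ALL
    SELECT person2, id FROM records
    UNION ALL
    SELECT person3, id FROM records
)
WHERE person IS NOT NULL
ORDER BY person, id
"""

# Collect the record ids under each person, mimicking the report groups.
groups = defaultdict(list)
for person, record_id in conn.execute(unpivot):
    groups[person].append(record_id)
print(dict(groups))
```

A record appears once under every person it mentions, which matches the report layout described above.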

Populating column for Oracle Text search from 2 tables

I am investigating the benefits of Oracle Text search, and currently am looking at collecting search text data from multiple (related) tables and storing the data in the smaller table in a 1-to-many relationship.
Consider these 2 simple tables, house and inhabitants, and there are NEVER any uninhabited houses:
HOUSE
ID Address Search_Text
1 44 Some Road
2 31 Letsby Avenue
3 18 Moon Crescent
INHABITANT
ID House Name Nickname
1 1 Jane Doe Janey
2 1 John Doe JD
3 2 Jo Smythe Smithy
4 2 Percy Plum PC
5 3 Apollo Lander Moony
I want to write SQL that updates the HOUSE.Search_Text column with text from INHABITANT. Because this is a 1-to-many relationship, the SQL needs to collate the data in INHABITANT for each matching row in HOUSE, combine the data (comma separated), and update the Search_Text field.
Once done, the Oracle Text search index on HOUSE.Search_Text will return me HOUSEs that match the search criteria, and I can look up INHABITANTs accordingly.
Of course, this is a very simplified example, I want to pick up data from many columns and Full Text Search across fields in both tables.
With the help of a colleague we've got:
select id, ADDRESS||'; '||Names||'; '||Nicknames as Search_Text
from house left join(
SELECT distinct house_id,
LISTAGG(NAME, ', ') WITHIN GROUP (ORDER BY NAME) OVER (PARTITION BY house_id) as Names,
LISTAGG(NICKNAME, ', ') WITHIN GROUP (ORDER BY NICKNAME) OVER (PARTITION BY house_id) as Nicknames
FROM INHABITANT)
i on house.id = i.house_id;
which returns:
1 44 Some Road; Jane Doe, John Doe; JD, Janey
2 31 Letsby Avenue; Jo Smythe, Percy Plum; PC, Smithy
3 18 Moon Crescent; Apollo Lander; Moony
Some questions:
Is this an efficient query to return this data? I'm slightly concerned about the distinct.
Is this the right way to use Oracle Text search across multiple text fields?
How to update House.Search_Text with the results above? I think I need a correlated subquery, but can't quite work it out.
Would it be more efficient to create a new table containing House_ID and Search_Text only, rather than update House?
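For the update question, a correlated subquery could look roughly like the sketch below. It is not Oracle: it runs on SQLite through Python's sqlite3 module and assumes group_concat() as a stand-in for LISTAGG, which in Oracle can also be used as an ordinary aggregate with WITHIN GROUP (ORDER BY ...). Table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE house (id INTEGER PRIMARY KEY, address TEXT, search_text TEXT);
CREATE TABLE inhabitant (
    id INTEGER PRIMARY KEY, house_id INTEGER, name TEXT, nickname TEXT
);
INSERT INTO house (id, address) VALUES
    (1, '44 Some Road'), (2, '31 Letsby Avenue'), (3, '18 Moon Crescent');
INSERT INTO inhabitant VALUES
    (1, 1, 'Jane Doe', 'Janey'), (2, 1, 'John Doe', 'JD'),
    (3, 2, 'Jo Smythe', 'Smithy'), (4, 2, 'Percy Plum', 'PC'),
    (5, 3, 'Apollo Lander', 'Moony');
""")

# Each subquery is evaluated per house row (correlated on house.id).
# A house without inhabitants would get NULL here; the question rules that out.
conn.execute("""
UPDATE house
SET search_text = address
    || '; ' || (SELECT group_concat(name, ', ')
                FROM inhabitant i WHERE i.house_id = house.id)
    || '; ' || (SELECT group_concat(nickname, ', ')
                FROM inhabitant i WHERE i.house_id = house.id)
""")

search_text = dict(conn.execute("SELECT id, search_text FROM house"))
print(search_text)
```

Because each subquery already aggregates per house, no DISTINCT is needed in this form.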

Table Join issue

Right now I've got a Main table in which I am uploading data. Because the Main table has many different duplicates, I Append various data out of the Main table into other tables such as, username, phone number, and locations in order to keep things optimized. Once I have everything stripped down from the Main table, I then append what's left into a final optimized Main table. Before this happens though, I run a select query joining all the stripped tables with the original Main table in order to connect the IDs from each table, with the correct data. For example:
Original Main Table
--Name---------Number------Due Date-------Location-------Charges Monthly-----Charges Total--
John Smith 111-1111 4/3 Chicago 234.56 500.23
Todd Jones 222-2222 4/3 New York 174.34 323.56
John Smith 111-1111 4/3 Chicago 274.56 670.23
Bill James 333-3333 4/3 Orlando 100.00 100.00
This gets split into 3 tables (name, number, location) and then there is a date table with all the dates for the year:
Name Table Number Table Location Table Due Date Table
--ID---Name------ -ID--Number--------- ---ID---Location---- --Date---
1 John Smith 1 111-1111 1 Chicago 4/1
2 Todd Jones 2 222-2222 2 New York 4/2
3 Bill James 3 333-3333 3 Orlando 4/3
Before the original Main table gets stripped, I run a select query that grabs the ID from the 3 new tables, and joins them based on the connection they have with the original Main table.
Select Output
--Name ID----Number ID---Location ID---Due Date--
1 1 1 4/3
2 2 2 4/3
1 1 1 4/3
3 3 3 4/3
My issue comes when I need to introduce a new table that can't be tied into the original Main table. I have an inventory table that, much like the original Main table, has duplicates and needs to be optimized. I do this by creating a secondary table that takes all the duplicated devices out and puts them in their own table, and then strips the username and number out and puts them into their tables. I would like to add the IDs from this new device table into the select output that I have above, resulting in:
Select Output
--Name ID----Number ID---Location ID---Due Date--Device ID---
1 1 1 4/3 1
2 2 2 4/3 1
1 1 1 4/3 2
3 3 3 4/3 1
Unlike the previous tables, the device table has no relationship to the original Main table, which is what is causing me so much headache. I can't seem to find a way to make this happen. Is there any way to accomplish this?
Any two tables can be joined. A table represents an application relationship. In some versions (not the original) of Entity-Relationship Modelling (notice that the "R" in E-R stands for "(application) relationship"!) a foreign key is sometimes called a "relationship". You do not need other tables or FKs to join any two tables.
Explain, in terms of its column names and the values for those names, exactly when a row should turn up in the result. Maybe you want:
SELECT *
FROM <the stripped-and-ID'd version of the Original> AS o
JOIN <the stripped-and-ID'd version of the Device> AS d
USING (NameID, NumberID, LocationID, DueDate)
i.e.
SELECT *
FROM <the stripped-and-ID'd version of the Original> AS o
JOIN <the stripped-and-ID'd version of the Device> AS d
ON o.NameID=d.NameID AND o.NumberID=d.NumberID
AND o.LocationID=d.LocationID AND o.DueDate=d.DueDate
Suppose p(a,...) is some statement parameterized by a,... .
If o holds the rows where o(NameID,NumberID,LocationID,DueDate) and d holds the rows where d(NameID,NumberID,LocationID,DueDate,DeviceID) then the above holds the rows where o(NameID, NumberID, LocationID, DueDate) AND d(NameID,NumberID,LocationID,DueDate,DeviceID). But you really have not explained what rows you want.
The only way to "join" tables that have no relation is by unioning them together:
select attribute1, attribute2, ... , attributeN
from table1
where <predicate>
union -- or union all
select attribute1, attribute2, ... , attributeN
from table2
where <predicate>
The where clauses are obviously optional.
EDIT
Optionally, you could join the tables together by stating ON true, which will act like a cross product.
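To illustrate the cross-product remark, here is a small sketch using Python's sqlite3 module. The tables select_output and device and their columns are hypothetical, and the always-true condition is written as 1 = 1 so that it also runs on older SQLite builds that lack the TRUE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE select_output (name_id INTEGER, due_date TEXT);
CREATE TABLE device (device_id INTEGER);
INSERT INTO select_output VALUES (1, '4/3'), (2, '4/3'), (3, '4/3');
INSERT INTO device VALUES (1), (2);
""")

# A join whose condition is always true pairs every row with every row.
joined = conn.execute(
    "SELECT name_id, due_date, device_id "
    "FROM select_output JOIN device ON 1 = 1"
).fetchall()

# The explicit CROSS JOIN spelling produces the same set of rows.
crossed = conn.execute(
    "SELECT name_id, due_date, device_id "
    "FROM select_output CROSS JOIN device"
).fetchall()

print(len(joined), sorted(joined) == sorted(crossed))
```

With 3 and 2 rows the result has 6 rows, so a cross product only helps if every combination really is wanted; otherwise some column has to decide which device goes with which row.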

SQL: How to select rows from a table while ignoring the duplicate field values?

How to select rows from a table while ignoring the duplicate field values?
Here is an example:
id user_id message
1 Adam "Adam is here."
2 Peter "Hi there this is Peter."
3 Peter "I am getting sick."
4 Josh "Oh, snap. I'm on a boat!"
5 Tom "This show is great."
6 Laura "Textmate rocks."
What I want to achieve is to select the recently active users from my db. Let's say I want to select the 5 most recently active users. The problem is that the following script selects Peter twice.
mysql_query("SELECT * FROM messages ORDER BY id DESC LIMIT 5 ");
What I want is to skip the row when it gets to Peter again, and select the next result, in our case Adam. So I don't want to show my visitors that the recently active users were Laura, Tom, Josh, Peter, and Peter again. That does not make any sense. Instead I want to show them this way: Laura, Tom, Josh, Peter, (skipping Peter) and Adam.
Is there an SQL command I can use for this problem?
Yes. "DISTINCT".
SELECT DISTINCT(user_id) FROM messages ORDER BY id DESC LIMIT 5
Maybe you could exclude duplicate users using GROUP BY, ordering by each user's latest id.
SELECT user_id, MAX(id) AS last_id FROM messages GROUP BY user_id ORDER BY last_id DESC LIMIT 5;
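Once the rows are collapsed to one per user, ORDER BY id has to say which id is meant for each user. A minimal sketch that takes each user's highest id with MAX(id), run against the sample data from the question. It uses SQLite through Python's sqlite3 module rather than MySQL, which is an assumption made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "Adam", "Adam is here."),
        (2, "Peter", "Hi there this is Peter."),
        (3, "Peter", "I am getting sick."),
        (4, "Josh", "Oh, snap. I'm on a boat!"),
        (5, "Tom", "This show is great."),
        (6, "Laura", "Textmate rocks."),
    ],
)

# One row per user, ranked by that user's most recent message id.
query = """
SELECT user_id, MAX(id) AS last_id
FROM messages
GROUP BY user_id
ORDER BY last_id DESC
LIMIT 5
"""
recent = [user for user, _last_id in conn.execute(query)]
print(recent)  # ['Laura', 'Tom', 'Josh', 'Peter', 'Adam']
```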