Select records distinct in one column in a PostgreSQL database

I have the following records, where different people share the same name:
id  name  category_id  birthdate   family_id
1   joe   2            2014-05-01  1
2   jack  3            2013-04-01  2
3   joe   2            1964-05-01  1
4   jack  5            1982-05-01  2
5   emma  1            2014-05-01  1
6   joe   3            2003-07-06  3
Now I need a query that returns the following: each name only once per family_id, together with all values of each record, including the id. If the table is extended later I need that data too, so the result should include all values.
id  name  category_id  birthdate   family_id
1   joe   2            2014-05-01  1
2   jack  3            2013-04-01  2
5   emma  1            2014-05-01  1
6   joe   3            2003-07-06  3
I tried several approaches (GROUP BY, DISTINCT, DISTINCT ON, etc.), but none of them worked for me.
When I use a GROUP BY clause (GROUP BY name) I get ERROR: column "deals.id" must appear in the GROUP BY clause or be used in an aggregate function. But when I put id inside the clause, I get all records back.
It is the same with DISTINCT. There I have to list all fields on which the result set should be distinct, but I need all values of the record, and because of the primary key every record is distinct as soon as I include the id.
I also tried a subquery that selects the distinct names and used it in a WHERE clause. But I got all rows back, including (of course) the records whose name/family_id combination is not distinct.
Can anybody lend me a helping hand?

You might not have specified all of the fields in your GROUP BY, and if you included id, that would make every row unique.
Try something like:
SELECT
    name, category_id, birthdate, family_id
FROM family
GROUP BY
    name, category_id, birthdate, family_id;

It worked with DISTINCT ON.
The following worked quite well:
SELECT DISTINCT ON (table.name, table.family_id) table.* FROM table;
The only thing I still have to check is the ordering: without an ORDER BY, it is unpredictable which row of each (name, family_id) group is kept. But I wanted to share my solution so far.
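To make the ordering concern concrete, here is a minimal sketch that keeps the lowest id per (name, family_id) group. It assumes a hypothetical table name `people` and runs on an in-memory SQLite database via Python's `sqlite3` module so it can be executed anywhere. SQLite has no DISTINCT ON, so the runnable query uses ROW_NUMBER() instead; the PostgreSQL DISTINCT ON form with an explicit ORDER BY is shown in a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a hypothetical table name; the question never names the table.
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, "
    "category_id INTEGER, birthdate TEXT, family_id INTEGER)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        (1, "joe", 2, "2014-05-01", 1),
        (2, "jack", 3, "2013-04-01", 2),
        (3, "joe", 2, "1964-05-01", 1),
        (4, "jack", 5, "1982-05-01", 2),
        (5, "emma", 1, "2014-05-01", 1),
        (6, "joe", 3, "2003-07-06", 3),
    ],
)

# PostgreSQL form with a deterministic tie-break (not executed here):
#   SELECT DISTINCT ON (name, family_id) *
#   FROM people
#   ORDER BY name, family_id, id;
#
# Window-function form, which also runs on SQLite 3.25+:
query = """
SELECT id, name, category_id, birthdate, family_id
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (PARTITION BY name, family_id ORDER BY id) AS rn
    FROM people AS p
) AS ranked
WHERE rn = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps ids 1, 2, 5 and 6, which matches the desired result in the question. Changing the ORDER BY inside the window (for example to birthdate) changes which row of each group survives.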

Related

count different column values after grouping by

Consider this table:
id name department email
1 Alex IT blah#gmail.com
1 Alex IT blah#gmail.com
2 Jay HR jay#gmail.com
2 Jay Marketing zou#gmail.com
If I group by id, name and count, I get:
id name count(*)
1 Alex 2
2 Jay 2
With this query:
select id,name,count(*) from tb group by id,name;
However, I would like to count only the records that differ in department, email, so as to get:
id name count(*)
1 Alex 0
2 Jay 1
This time the count for the first group (1, Alex) is 0, because its department, email values are identical (duplicated). For (2, Jay) it is 1, because department, email has one differing value.
If you meant "two different values" for "Jay", you can use DISTINCT (note that the subquery needs an alias):
select id,name,count(*) from (SELECT distinct * FROM tb) t group by id,name;
Use count(*) - 1 instead of count(*) to get the numbers shown in your question.
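As a quick check of the count(*) - 1 idea, the following sketch loads the sample rows from the question into an in-memory SQLite database (an assumption made only so the example is runnable; the column alias `extra_variants` is hypothetical) and runs the de-duplicating query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb (id INTEGER, name TEXT, department TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO tb VALUES (?, ?, ?, ?)",
    [
        (1, "Alex", "IT", "blah#gmail.com"),
        (1, "Alex", "IT", "blah#gmail.com"),
        (2, "Jay", "HR", "jay#gmail.com"),
        (2, "Jay", "Marketing", "zou#gmail.com"),
    ],
)

# The inner SELECT DISTINCT collapses fully identical rows; the outer query
# then counts the remaining variants per (id, name) and subtracts one.
query = """
SELECT id, name, COUNT(*) - 1 AS extra_variants
FROM (SELECT DISTINCT * FROM tb) AS t
GROUP BY id, name
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this yields 0 for Alex and 1 for Jay, the numbers the question asks for.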

Display DISTINCT value on SQL statement by column condition

I have a problem with DISTINCT values under a column condition and have no idea how to solve it.
The problem is that Stephen appears twice here, but I don't want duplicates.
The problem:
id  vehicle_id  worker_id  user_type       user_fullname
9   1           NULL       external_users  John Dalton
10  1           16         employees       Mike
11  1           1          employees       Stephen
12  2           173        employee        Nicholas
13  2           1          employee        Stephen
14  1           NULL       external_users  Peter
The desired output:
id  vehicle_id  worker_id  user_type       user_fullname
9   1           NULL       external_users  John Dalton
10  1           16         employees       Mike
12  2           173        employee        Nicholas
13  2           1          employee        Stephen
14  1           NULL       external_users  Peter
I have tried CASE statements, but without success. When I group by worker_id, it also removes other rows I want to keep, so I figured it needs to be grouped by some special condition?
If anyone can give me a hint on how to solve this problem, I would be very grateful.
Thanks!
There are no duplicate rows in this table. Just because Stephen appears twice doesn't make them duplicates because the ID, VEHICLE_ID, and USER_TYPE are different.
What you need to do is decide how you want to identify the Stephen record you wish to see in the output. Is it the one with the highest VEHICLE_ID? The "latest" record, i.e. the one with the highest ID?
You will use that rule in a window function to order the rows within your criteria, and then use that row number to filter down to the results you want. Something like this:
select id, vehicle_id, worker_id, user_type, user_fullname
from (
    select id, vehicle_id, worker_id, user_type, user_fullname,
           row_number() over (partition by worker_id, user_fullname order by id desc) n
    from user_vehicle
) t
where t.n = 1
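The query above can be tried against the sample rows with a small sketch like the following. It assumes the rule "keep the row with the highest id" and uses an in-memory SQLite database through Python's `sqlite3` module purely as a stand-in for the real server; the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_vehicle (id INTEGER, vehicle_id INTEGER, "
    "worker_id INTEGER, user_type TEXT, user_fullname TEXT)"
)
conn.executemany(
    "INSERT INTO user_vehicle VALUES (?, ?, ?, ?, ?)",
    [
        (9, 1, None, "external_users", "John Dalton"),
        (10, 1, 16, "employees", "Mike"),
        (11, 1, 1, "employees", "Stephen"),
        (12, 2, 173, "employee", "Nicholas"),
        (13, 2, 1, "employee", "Stephen"),
        (14, 1, None, "external_users", "Peter"),
    ],
)

# Number the rows inside each (worker_id, user_fullname) group, newest id
# first, then keep only the first row of every group.
query = """
SELECT id, vehicle_id, worker_id, user_type, user_fullname
FROM (
    SELECT id, vehicle_id, worker_id, user_type, user_fullname,
           ROW_NUMBER() OVER (
               PARTITION BY worker_id, user_fullname ORDER BY id DESC
           ) AS n
    FROM user_vehicle
) AS t
WHERE t.n = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Partitioning by user_fullname as well as worker_id keeps the two external users (both with a NULL worker_id) in separate groups, so only the older Stephen row (id 11) is dropped.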

Finding distinct count of combination of columns values in sql

Currently I have a table like this:
Roll no. Names
------------------
1 Sam
1 Sam
2 Sasha
2 Sasha
3 Joe
4 Jack
5 Jack
5 Julie
I want to write a query that gives me the count of each combination in another column.
Required output
Combination distinct count
-----------------------------
2-Sasha 1
5-Jack 1
5-Julie 1
Basically, you could group by these columns and use a count function:
SELECT rollno, name, COUNT(*)
FROM mytable
GROUP BY rollno, name
You could also concat the two columns:
SELECT CONCAT(rollno, '-', name), COUNT(*)
FROM mytable
GROUP BY CONCAT(rollno, '-', name)
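For reference, this is roughly what the grouping returns on the sample rows. The sketch assumes the column names `rollno` and `name` from the answer and runs on in-memory SQLite via Python's `sqlite3`; because CONCAT() is not available in older SQLite versions, it builds the label with the `||` operator, which PostgreSQL also supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (rollno INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "Sam"), (1, "Sam"),
        (2, "Sasha"), (2, "Sasha"),
        (3, "Joe"),
        (4, "Jack"),
        (5, "Jack"), (5, "Julie"),
    ],
)

# Group on the two columns and build the "rollno-name" label for display.
query = """
SELECT rollno || '-' || name AS combination, COUNT(*) AS occurrences
FROM mytable
GROUP BY rollno, name
ORDER BY rollno, name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each distinct combination appears once in the result, with the number of rows it covers. If only some combinations are wanted, as in the required output, a HAVING clause on the count could be added.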

Identifying Records Where a String Appears More Than Once

I have a dataset that looks like this:
ID Medication Dose
1 Aspirin 4
1 Tylenol 7
1 Aspirin 2
1 Ibuprofen 1
2 Aspirin 6
2 Aspirin 2
2 Ibuprofen 6
2 Tylenol 4
3 Tylenol 3
3 Tylenol 7
3 Tylenol 2
I would like to write code that identifies patients who have been administered a medication more than once. For example, ID 1 had Aspirin twice, ID 2 had Aspirin twice and ID 3 had Tylenol three times.
I could be wrong, but I think the easiest way to do this would be to concatenate the medications for each ID using code similar to the one below. I'm not quite sure what to do after that, though: is it possible to count whether a string appears twice within a cell?
SELECT DISTINCT ST2.[ID],
SUBSTRING(
(
SELECT ','+ST1.Medication AS [text()]
FROM ED_NOTES_MASTER ST1
WHERE ST1.[ID] = ST2.[ID]
Order BY [ID]
FOR XML PATH ('')
), 1, 200000) [Result]
FROM ED_NOTES_MASTER ST2
I would like the output to look like the following:
ID  MEDICATION                   Aspirin2x  Tylenol2x  Ibuprofen2x
1   Aspirin, Tylenol, Aspirin    YES        NO         NO
2   Ibuprofen, Aspirin, Aspirin  YES        NO         NO
3   Tylenol, Tylenol, Tylenol    NO         YES        NO
For the first part of your question (identify patients that have had a particular medication more than once), you can do this using GROUP BY to group by the ID and medication, and then using COUNT to get how many times each medication was given to each patient. For example:
SELECT ID, Medication, COUNT(*) AS amount
FROM ED_NOTES_MASTER
GROUP BY ID, Medication
This will give you a list of all ID - Medication combinations that appear in the table and a count of how many times each combo appears. To limit these results to combinations that appear at least twice, add a condition on the count using HAVING (the aggregate has to be repeated there, because SQL Server does not accept the column alias in HAVING):
SELECT ID, Medication, COUNT(*) AS amount
FROM ED_NOTES_MASTER
GROUP BY ID, Medication
HAVING COUNT(*) >= 2
The problem now is formatting the results in the way you want. What you will get from the query above is a list of all patient - medication combinations that came up in the table more than once, like this:
ID | Medication | Count
------+---------------+-------
1 | Aspirin | 2
2 | Aspirin | 2
3 | Tylenol | 3
I'd suggest that you try to work with this format if possible because, as you have found, returning multiple values as a comma-delimited list like the one in your Medication column requires some hacks (although SQL Server 2017 and later provide STRING_AGG for proper group concatenation). If you really need the Aspirin2x etc. columns, take a look at the PIVOT operator in SQL Server.
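The following sketch runs the HAVING query on the sample data and, as one possible alternative to PIVOT, derives the YES/NO columns with conditional aggregation. Conditional aggregation is a swapped-in technique, not what the answer above proposes, and the example uses in-memory SQLite through Python's `sqlite3` rather than SQL Server, so treat it as an illustration of the logic only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ED_NOTES_MASTER (ID INTEGER, Medication TEXT, Dose INTEGER)")
conn.executemany(
    "INSERT INTO ED_NOTES_MASTER VALUES (?, ?, ?)",
    [
        (1, "Aspirin", 4), (1, "Tylenol", 7), (1, "Aspirin", 2), (1, "Ibuprofen", 1),
        (2, "Aspirin", 6), (2, "Aspirin", 2), (2, "Ibuprofen", 6), (2, "Tylenol", 4),
        (3, "Tylenol", 3), (3, "Tylenol", 7), (3, "Tylenol", 2),
    ],
)

# Patient/medication pairs that occur at least twice.
repeated = conn.execute("""
    SELECT ID, Medication, COUNT(*) AS amount
    FROM ED_NOTES_MASTER
    GROUP BY ID, Medication
    HAVING COUNT(*) >= 2
    ORDER BY ID, Medication
""").fetchall()

# One row per patient with a YES/NO flag per medication, built by counting
# matching rows with SUM(CASE ...) instead of PIVOT.
flags = conn.execute("""
    SELECT ID,
           CASE WHEN SUM(CASE WHEN Medication = 'Aspirin' THEN 1 ELSE 0 END) >= 2
                THEN 'YES' ELSE 'NO' END AS Aspirin2x,
           CASE WHEN SUM(CASE WHEN Medication = 'Tylenol' THEN 1 ELSE 0 END) >= 2
                THEN 'YES' ELSE 'NO' END AS Tylenol2x,
           CASE WHEN SUM(CASE WHEN Medication = 'Ibuprofen' THEN 1 ELSE 0 END) >= 2
                THEN 'YES' ELSE 'NO' END AS Ibuprofen2x
    FROM ED_NOTES_MASTER
    GROUP BY ID
    ORDER BY ID
""").fetchall()

print(repeated)
print(flags)
```

The flag columns have to be listed by hand, which is fine for a short, fixed list of medications but does not scale to an open-ended one.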

SQL field default "count(another_field) +1"

I need to create a field COUNT whose default value is the automatically generated count of times NAME has appeared in that table so far, as shown in the example below. Since I am adding the field to an existing table, I also need to populate existing rows. How best to go about this, please?
ID NAME COUNT
1 peter 1
2 jane 1
3 peter 2
4 peter 3
5 frank 1
6 jane 2
7 peter 4
You would do this when you are querying the table, using the ANSI-standard row-number function:
select id, name, row_number() over (partition by name order by id) as seqnum
from t;
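A minimal sketch of that query on the sample rows, assuming the table is called `t` as in the answer and using in-memory SQLite (3.25 or later, for window function support) through Python's `sqlite3`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "peter"), (2, "jane"), (3, "peter"), (4, "peter"),
        (5, "frank"), (6, "jane"), (7, "peter"),
    ],
)

# seqnum restarts at 1 for every name and grows in id order, so it equals
# the number of times that name has appeared up to and including this row.
query = """
SELECT id, name,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS seqnum
FROM t
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Computing the number at query time means nothing has to be stored or backfilled, and the values stay correct when rows are inserted or deleted.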