SQL: aggregate rows inside a join

Table: ID, Person_ID, Name
Each Person_ID can have several rows because a person can have several names (first name, last name, nickname, etc.).
I have another table that contains one row per person, along with some other data.
I want to join both tables into one row per person, with the last column aggregating all of that person's names into one string, like this: "Thomas, anderson, neo"
Something like this:
SELECT A.*,
B.PERSON_ID,
B.(aggregated names here)
FROM USERS A, USERS_NAMES B;
How do I do this?

I would do this in the following way:
select u.*, un.names
from users u left outer join
(select un.person_id, listagg(un.name, ',') within group (order by un.id) as names
from users_names un
group by un.person_id
) un
on u.person_id = un.person_id;
Note that the list aggregation is being done in a subquery. That allows the use of u.* in the outer query with no aggregation. Otherwise, you have to group by each column in users explicitly.
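To see the "aggregate in a subquery, then join" shape in action, here is a minimal sketch that runs against an in-memory SQLite database from Python. It is only an illustration: SQLite has no listagg, so group_concat stands in for it (and, unlike listagg ... within group, this call does not fix the order of the names), and the sample rows and the city column are made up.

```python
import sqlite3

def names_per_person():
    """Return one row per user, with all of that user's names in one string."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data; only the table and key names follow the question.
    con.executescript("""
        CREATE TABLE users (person_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE users_names (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
        INSERT INTO users VALUES (1, 'Zion'), (2, 'Nowhere');
        INSERT INTO users_names VALUES
            (1, 1, 'Thomas'), (2, 1, 'anderson'), (3, 1, 'neo');
    """)
    rows = con.execute("""
        SELECT u.*, un.names
        FROM users u
        LEFT OUTER JOIN (
            -- aggregate first, so the outer query needs no GROUP BY
            SELECT person_id, group_concat(name, ', ') AS names
            FROM users_names
            GROUP BY person_id
        ) un ON u.person_id = un.person_id
        ORDER BY u.person_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in names_per_person():
        print(row)
```

Because of the left outer join, a user with no rows in users_names still appears, with NULL (None in Python) in the names column.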

Related

Remove duplicates from result in sql

I have the following SQL in a Java project:
select distinct * from drivers inner join licenses on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false
And the result in the database looks like this:
I would like to get only one result instead of two duplicated results.
How can I do that?
Solution 1 - That's because the rows differ in at least one column, so DISTINCT * keeps them all. Use DISTINCT with only the columns you want (the keyword is written once and applies to the whole select list), like this:
select distinct id, creation_date, modification_date from
YourTable
Solution 2 - Apply DISTINCT only to the ID; once you have the IDs, you can fetch all the data with an IN query:
select * from yourtable where id in (select distinct id from drivers inner join
licenses
on drivers.user_id=licenses.issuer_id
inner join users on drivers.user_id=users.id
where (licenses.state='ISSUED' or drivers.status='WAITING')
and users.is_deleted=false )
Enumerate the field names in the SELECT, using COALESCE for fields whose value is null.
Usually you don't query DISTINCT with * (all columns), because rows that share a value in one column but differ in the others are treated as different rows. So apply DISTINCT only to the columns you need, then fetch the rest of the data.
I suspect that you want left joins like this:
select *
from users u left join
drivers d
on d.user_id = u.id and d.status = 'WAITING' left join
licenses l
on d.user_id = l.issuer_id and l.state = 'ISSUED'
where u.is_deleted = false and
(d.user_id is not null or l.issuer_id is not null);
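The point made above about DISTINCT applying to the whole row, not to a single column, can be checked with a small sketch. It uses Python's sqlite3 module with an invented two-column table t; none of these names come from the question.

```python
import sqlite3

def distinct_demo():
    """Compare DISTINCT over all columns with DISTINCT over one column."""
    con = sqlite3.connect(":memory:")
    # Hypothetical data: three rows share the same id, two are exact duplicates.
    con.executescript("""
        CREATE TABLE t (id INTEGER, note TEXT);
        INSERT INTO t VALUES (1, 'a'), (1, 'a'), (1, 'b');
    """)
    # Only the exact duplicate collapses; (1, 'b') differs in one column and stays.
    all_cols = con.execute("SELECT DISTINCT * FROM t ORDER BY id, note").fetchall()
    # With a single column in the select list, one row per id remains.
    id_only = con.execute("SELECT DISTINCT id FROM t").fetchall()
    con.close()
    return all_cols, id_only

if __name__ == "__main__":
    print(distinct_demo())
```

So when a joined result still shows "duplicates" under DISTINCT *, at least one of the selected columns differs between those rows.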

Subtracting values of columns from two different tables

I would like to take values from one table column and subtract those values from another column from another table.
I was able to achieve this by joining those tables and then subtracting both columns from each other.
Data from first table:
SELECT max_participants FROM courses ORDER BY id;
Data from second table:
SELECT COUNT(id) FROM participations GROUP BY course_id ORDER BY course_id;
Here is some code:
SELECT max_participants - participations AS free_places FROM
(
SELECT max_participants, COUNT(participations.id) AS participations
FROM courses
INNER JOIN participations ON participations.course_id = courses.id
GROUP BY courses.max_participants, participations.course_id
ORDER BY participations.course_id
) AS course_places;
In general it works, but I was wondering whether there is a way to make it simpler, or whether my approach is incorrect and this code will fail under some conditions. Maybe it needs to be optimized.
I've read that you should not rely on the natural order of a result set in databases, and that raised my doubts.
If you want the values per course, I would recommend:
SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
FROM courses c LEFT JOIN
participations p
ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY 1;
Note the LEFT JOIN to be sure all courses are included, even those with no participants.
The overall number is a little trickier. One method is to use the above as a subquery. Alternatively, you can pre-aggregate each table:
select c.max_participants - p.num_participants
from (select sum(max_participants) as max_participants from courses) c cross join
(select count(*) as num_participants from participations) p;
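As a quick check of both queries, the sketch below runs them against an in-memory SQLite database from Python. The table and column names follow the question; the sample rows are invented, and course 3 deliberately has no participations so the effect of the LEFT JOIN is visible.

```python
import sqlite3

def free_places():
    """Return (free places per course, overall free places)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data: course 3 has no participations at all.
    con.executescript("""
        CREATE TABLE courses (id INTEGER PRIMARY KEY, max_participants INTEGER);
        CREATE TABLE participations (id INTEGER PRIMARY KEY, course_id INTEGER);
        INSERT INTO courses VALUES (1, 10), (2, 5), (3, 8);
        INSERT INTO participations (course_id) VALUES (1), (1), (1), (2);
    """)
    # COUNT(p.id) ignores the NULLs produced by the LEFT JOIN, so an empty
    # course counts 0 participants instead of disappearing from the result.
    per_course = con.execute("""
        SELECT c.id, c.max_participants - COUNT(p.id) AS free_places
        FROM courses c
        LEFT JOIN participations p ON p.course_id = c.id
        GROUP BY c.id, c.max_participants
        ORDER BY c.id
    """).fetchall()
    # Pre-aggregate each table to a single row, then combine the two rows.
    overall = con.execute("""
        SELECT c.max_participants - p.num_participants
        FROM (SELECT SUM(max_participants) AS max_participants FROM courses) c
        CROSS JOIN (SELECT COUNT(*) AS num_participants FROM participations) p
    """).fetchone()[0]
    con.close()
    return per_course, overall

if __name__ == "__main__":
    print(free_places())
```

With an INNER JOIN, course 3 would be missing from the per-course result, which is the condition under which the original query silently gives an incomplete answer.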

Inner join of table a and table b, selecting only one of multiple rows in table b (the row with column = max value)

I am a total beginner in SQL. I have two tables (table a and table b).
Table a holds people with unique IDs; table b holds multiple rows for each person in table a, together with the person's ID (for a possible join). The rows in table b are sorted by the column row_number.
How can I select all people, but only the row of table b with the highest row_number?
I hope that makes sense.
Cheers
If I understood you correctly:
SELECT a.persons_ID
,sub.rn
FROM A
INNER JOIN
(
SELECT MAX(row_number_column) AS rn
,persons_ID
FROM B
GROUP BY persons_ID
) sub
ON sub.persons_ID = A.persons_ID
The subselect in the inner join groups the data of table B, so there is just one row per persons_ID, carrying the highest row_number_column for that person.
Finally just a simple join on persons_ID.
If you don't need any other information than the person ID and the last row_number per person, then it is quite trivial. Let's call the first table person and the second visit:
select person_id,
max(row_number) max_row_number
from visit
group by person_id
If you need some other information from the first table, like person.name, then perform the join:
select person.person_id,
person.name,
max(visit.row_number) max_row_number
from person
inner join visit on visit.person_id = person.person_id
group by person.person_id,
person.name
If you need some other information from the second table, like visit.present, then modern databases support the row_number() window function (not to be confused with the column that you have):
select name,
base.row_number,
present
from (
select person.name,
row_number() over (partition by visit.person_id
order by visit.row_number desc) rn,
visit.row_number,
visit.present
from person
inner join visit on visit.person_id = person.person_id
) base
where rn = 1
NB: I would strongly advise to rename the column row_number to some other name, as row_number is an analytic function in many databases.
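For databases without window functions, the same "latest row per person" result can be obtained by joining the grouped maximum back to the detail table. This is a different technique from the row_number() query above, shown here only as a sketch: it runs in SQLite from Python, the sample rows are invented, and the numbering column is called row_no (a hypothetical name, following the advice to avoid row_number).

```python
import sqlite3

def latest_visit_per_person():
    """Return name, highest row_no and the matching 'present' value per person."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data for the person / visit tables.
    con.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE visit (person_id INTEGER, row_no INTEGER, present TEXT);
        INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO visit VALUES (1, 1, 'no'), (1, 2, 'yes'), (2, 1, 'yes');
    """)
    rows = con.execute("""
        SELECT p.name, v.row_no, v.present
        FROM person p
        INNER JOIN (
            -- one row per person: the highest row_no
            SELECT person_id, MAX(row_no) AS max_row_no
            FROM visit
            GROUP BY person_id
        ) latest ON latest.person_id = p.person_id
        -- join back to visit to pick up the other columns of that row
        INNER JOIN visit v
            ON v.person_id = latest.person_id AND v.row_no = latest.max_row_no
        ORDER BY p.person_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    print(latest_visit_per_person())
```

This assumes row_no is unique within each person; if two visits tie for the maximum, both rows are returned, whereas the row_number() version keeps exactly one.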

PostgreSQL - Query with aggregate functions

I need some help for a PostgreSQL query.
I have 4 tables involved: customer, organisation_complete, entity and address. I retrieve some data from each of them with this query:
SELECT distinct ON (c.customer_number, trim(lower(o.name)), a.street, a.zipcode, a.area, a.country)
c.xid AS customer_xid, o.xid AS entity_xid, c.customer_number, c.deleted, o.name, o.vat, 'organisation' AS customer_type, a.street, a.zipcode, a.city, a.country
FROM customer c
INNER JOIN organisation_complete o ON (c.xid = o.customer_xid AND c.deleted = 'FALSE')
INNER JOIN entity e ON e.customer_xid = o.customer_xid
INNER JOIN address a ON (a.contact_info_xid = e.contact_info_xid and a.address_type = 'delivery')
WHERE c.account_xid = '<value>'
This gives me the distinct customers, split by customer_number, name, street, zipcode, area and country (the columns listed after DISTINCT ON).
What I need now is the distinct set of customers that have duplicate rows in the DB, but I also need to retrieve customer_xid and entity_xid, which are the primary keys of their respective tables and therefore unique. For this reason they can't be included in an aggregate function. All I need is to count how many rows share the same customer_number, name, street, zipcode, area and country for each distinct tuple, and to select only tuples with a count greater than 1.
For each selected tuple I also need to take one customer_xid and one entity_xid, at random, the way MySQL would do with a_key in a query like this:
SELECT COUNT(*), tab.a_key, tab.b, tab.c from tab
WHERE 1
GROUP BY tab.b
I know MySQL is quite an exception in this regard; I just want to know whether it is possible to obtain the same result in PostgreSQL.
Thanks,
L.
This MySQL query uses a nonstandard (see note below) MySQL GROUP BY extension: http://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html
SELECT COUNT(*), tab.a_key, tab.b, tab.c
from tab
WHERE 1
GROUP BY tab.b
Note: This is a feature defined in the SQL:2003 standard as T301, Functional dependencies. It is not required by the standard, and many RDBMSs don't support it, including PostgreSQL (see this link for version 9.3 - unsupported features: http://www.postgresql.org/docs/9.3/static/unsupported-features-sql-standard.html ).
The above query could be expressed in PostgreSQL in this way:
SELECT tab.a_key, tab.b, tab.c,
q.cnt
FROM (
SELECT tab.b,
COUNT(*) As cnt,
MIN(tab.unique_id) As unique_id /* could be also MAX */
from tab
WHERE true /* PostgreSQL requires a boolean here; WHERE 1 is MySQL-only */
GROUP BY tab.b
) q
JOIN tab ON tab.unique_id = q.unique_id
where unique_id is a column that uniquely identifies each row in tab (usually a primary key).
MIN or MAX picks one row per group (the one with the lowest or highest unique_id), which serves as the arbitrary choice that MySQL would otherwise make for you.
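The question also asks for only those groups that occur more than once. One way to sketch that is to add a HAVING clause to the grouped subquery before joining back on the unique key. The example below runs in SQLite from Python rather than PostgreSQL, and the rows in tab are invented for illustration.

```python
import sqlite3

def duplicated_groups():
    """Return one representative row plus the count for each duplicated b."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data: b = 'x' occurs twice, b = 'y' once.
    con.executescript("""
        CREATE TABLE tab (unique_id INTEGER PRIMARY KEY, a_key TEXT, b TEXT, c TEXT);
        INSERT INTO tab VALUES
            (1, 'k1', 'x', 'c1'),
            (2, 'k2', 'x', 'c2'),
            (3, 'k3', 'y', 'c3');
    """)
    rows = con.execute("""
        SELECT tab.a_key, tab.b, tab.c, q.cnt
        FROM (
            SELECT b, COUNT(*) AS cnt, MIN(unique_id) AS unique_id
            FROM tab
            GROUP BY b
            HAVING COUNT(*) > 1      -- keep only groups with duplicates
        ) q
        JOIN tab ON tab.unique_id = q.unique_id
    """).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    print(duplicated_groups())
```

The row returned for each group is the one with the smallest unique_id, so a_key and c come from a single real row rather than being mixed from different rows.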

how to use count with where clause in join query

SELECT
DEPTMST.DEPTID,
DEPTMST.DEPTNAME,
DEPTMST.CREATEDT,
COUNT(USRMST.UID)
FROM DEPTMASTER DEPTMST
INNER JOIN USERMASTER USRMST ON USRMST.DEPTID=DEPTMST.DEPTID
WHERE DEPTMST.CUSTID=1000 AND DEPTMST.STATUS='ACT'
I have tried several combinations, but I keep getting this error:
Column 'DEPTMASTER.DeptID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
I also added a GROUP BY, but it's still not working.
When using COUNT like that, you need to group by the other selected columns, i.e.:
SELECT
DEPTMST.DEPTID,
DEPTMST.DEPTNAME,
DEPTMST.CREATEDT,
COUNT(USRMST.UID)
FROM DEPTMASTER DEPTMST
INNER JOIN USERMASTER USRMST ON USRMST.DEPTID=DEPTMST.DEPTID
WHERE DEPTMST.CUSTID=1000 AND DEPTMST.STATUS='ACT'
GROUP BY DEPTMST.DEPTID,
DEPTMST.DEPTNAME,
DEPTMST.CREATEDT
You are missing the GROUP BY:
SELECT DEPTMST.DEPTID,
DEPTMST.DEPTNAME,
DEPTMST.CREATEDT,
COUNT(USRMST.UID)
FROM DEPTMASTER DEPTMST
INNER JOIN USERMASTER USRMST ON USRMST.DEPTID=DEPTMST.DEPTID
WHERE DEPTMST.CUSTID=1000 AND DEPTMST.STATUS='ACT'
group by DEPTMST.DEPTID,
DEPTMST.DEPTNAME,
DEPTMST.CREATEDT
Aggregate functions such as AVG, COUNT and SUM need a GROUP BY clause whenever the select list also contains non-aggregated columns. If you don't use a GROUP BY clause, the function is applied to all the rows of the table.
E.g.
Select count(*) from table;
This returns the count of all the rows in the table.
Select count(*) from table group by name
This will first group the table data based on name and then return the count of each of these groups.
So in your case, if you want the count of USRMST.UID, group by all the other columns in the select list.
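The difference between the two COUNT queries can be seen in a short sketch run against SQLite from Python. The people table and its rows are made up for illustration.

```python
import sqlite3

def count_examples():
    """Return (count over all rows, count per name)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical sample data: 'ann' appears twice, 'bob' once.
    con.executescript("""
        CREATE TABLE people (name TEXT);
        INSERT INTO people VALUES ('ann'), ('ann'), ('bob');
    """)
    # No GROUP BY: the aggregate runs over the whole table, giving one row.
    total = con.execute("SELECT COUNT(*) FROM people").fetchone()[0]
    # GROUP BY name: one result row, and one count, per distinct name.
    per_name = con.execute(
        "SELECT name, COUNT(*) FROM people GROUP BY name ORDER BY name"
    ).fetchall()
    con.close()
    return total, per_name

if __name__ == "__main__":
    print(count_examples())
```

Selecting name next to COUNT(*) without the GROUP BY is what triggers the "not contained in either an aggregate function or the GROUP BY clause" error in SQL Server.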