How do I get a query of clients with many contact numbers? SQL

I have two tables, clients and contact numbers. Each client has one or more contact numbers; it's a one-to-many relationship. I need to produce an Excel document in which each row holds one client and its contact numbers. For example:
client name | contact_number_1 | contact_number_2 | ...
I want to do this in PostgreSQL so that it is fast. How the Excel file gets built doesn't matter; I just need the query and can handle the rest.
Thank you!

If you can parse the result and create the Excel file from there, the most flexible solution is to aggregate the numbers into an array:
select c.client_id,
c.client_name,
array_agg(cn.number) as contact_numbers
from client c
join contact_number cn on cn.client_id = c.client_id
group by c.client_id, c.client_name;
An alternative is to use string_agg(cn.number, ',') to get a comma-separated list (but the array is more robust against embedded commas in the numbers).
If you really do need the numbers in separate columns, decide on a sensible upper limit for the number of columns; you can then wrap the first query and extract the array elements as columns:
select client_id,
client_name,
contact_numbers[1] as contact_number_1,
contact_numbers[2] as contact_number_2,
contact_numbers[3] as contact_number_3,
...
from (
select c.client_id,
c.client_name,
array_agg(cn.number) as contact_numbers
from client c
join contact_number cn on cn.client_id = c.client_id
group by c.client_id, c.client_name
) t
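If the result is post-processed in client code anyway, the column limit does not have to be hard-coded in SQL: the arrays can be padded to the widest row on the client side. The following is only a sketch of that idea in Python. The function name widen and the sample rows are hypothetical; they stand in for whatever the array_agg query returns through your database driver.

```python
import csv
import io


def widen(rows):
    """Turn (client_name, [numbers]) pairs into a header plus padded rows."""
    rows = list(rows)
    # The widest client decides how many contact_number_N columns are needed.
    width = max((len(numbers) for _, numbers in rows), default=0)
    header = ["client_name"] + [f"contact_number_{i}" for i in range(1, width + 1)]
    # Pad shorter rows with empty strings so every row has the same length.
    body = [[name, *numbers] + [""] * (width - len(numbers)) for name, numbers in rows]
    return header, body


# Hypothetical sample data standing in for the array_agg result.
sample = [("Alice", ["555-0100", "555-0101"]), ("Bob", ["555-0200"])]
header, body = widen(sample)

# CSV is used here because Excel opens it directly; any writer would do.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(body)
csv_text = buffer.getvalue()
```

The trade-off is that the column count now depends on the data, so two exports can have different widths.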

If you actually want a dynamic number of columns returned, it gets a bit complicated, because you either have to know the maximum number of columns in the result or hard-code the highest number you think will exist.
If you can live with having one column represent all of the possible contacts, then you can aggregate them all into a single column:
select c.clientName, STRING_AGG(COALESCE(con.contact_number,''),'|') as contact_numbers
from clients c
left join contacts con on c.clientId = con.clientId
group by c.clientName
order by c.clientName

Related

SQL refusing to do a join even when every identifier is valid? (ORA-00904)

Made this account just to ask about this question after being unable to find/expending the local resources I have, so I come to you all.
I'm trying to join two tables - ORDERS and CUSTOMER - as per a question on my assignment
For every order, list the order number and order date along with the customer number, last name, and first name of the customer who placed the order.
So I'm looking for the order number, date, customer number, and the full name of customers.
The code goes as such
SELECT ORDERS.ORDR_ORDER_NUMBER, ORDERS.ORDR_ORDER_DATE, ORDERS.ORDR_CUSTOMER_NUMBER, CUSTOMER.CUST_LAST, CUSTOMER.CUST_FIRST
FROM ORDERS, CUSTOMER
WHERE ORDERS.ORDR_CUSTOMER_NUMBER = CUSTOMER.CUST_CUSTOMER_NUMBER;
I've done this code without the table identifiers, putting quotation marks around ORDERS.ORDR_CUSTOMER_NUMBER, aliases for the two tables, and even putting a space after ORDR_ in both SELECT & WHERE for laughs and nothing's working. All of them keep coming up with the error in the title (ORA-00904), saying [ORDERS.]ORDR_CUSTOMER_NUMBER is the invalid identifier even though it shouldn't be.
Here also are the tables I'm working with, in case that context is needed for help.
Anyway, the query that produces the result you want should take the form:
select
o.ordr_order_number,
o.ordr_order_date,
c.cust_customer_number,
c.cust_last,
c.cust_first
from orders o
join customer c on c.cust_customer_number = o.ordr_customer_number
As you see the query becomes a lot easier to read and write if you use modern join syntax, and if you use table aliases (o and c).
You can also write this with JOIN or INNER JOIN. The comma-separated FROM list with the join condition in WHERE is logically equivalent, but the explicit form keeps the join condition next to the tables it relates:
FROM ORDERS INNER JOIN CUSTOMER ON ORDERS.ORDR_CUSTOMER_NUMBER = CUSTOMER.CUST_CUSTOMER_NUMBER
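The comma-style join and the explicit INNER JOIN return the same rows, which can be checked on a small dataset. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for Oracle, and the sample rows are made up. It illustrates the equivalence only; it does not reproduce or explain the ORA-00904 error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the question; the rows are invented.
conn.executescript("""
CREATE TABLE CUSTOMER (
    CUST_CUSTOMER_NUMBER INTEGER PRIMARY KEY,
    CUST_LAST TEXT,
    CUST_FIRST TEXT
);
CREATE TABLE ORDERS (
    ORDR_ORDER_NUMBER INTEGER PRIMARY KEY,
    ORDR_ORDER_DATE TEXT,
    ORDR_CUSTOMER_NUMBER INTEGER
);
INSERT INTO CUSTOMER VALUES (1, 'Smith', 'Ann'), (2, 'Jones', 'Bob');
INSERT INTO ORDERS VALUES
    (100, '2020-01-05', 1), (101, '2020-01-06', 2), (102, '2020-01-07', 1);
""")

columns = """o.ORDR_ORDER_NUMBER, o.ORDR_ORDER_DATE, o.ORDR_CUSTOMER_NUMBER,
             c.CUST_LAST, c.CUST_FIRST"""

# Old-style join: tables listed with commas, condition in WHERE.
implicit = conn.execute(f"""
    SELECT {columns}
    FROM ORDERS o, CUSTOMER c
    WHERE o.ORDR_CUSTOMER_NUMBER = c.CUST_CUSTOMER_NUMBER
    ORDER BY o.ORDR_ORDER_NUMBER
""").fetchall()

# Explicit join: same condition, moved into ON.
explicit = conn.execute(f"""
    SELECT {columns}
    FROM ORDERS o
    INNER JOIN CUSTOMER c ON c.CUST_CUSTOMER_NUMBER = o.ORDR_CUSTOMER_NUMBER
    ORDER BY o.ORDR_ORDER_NUMBER
""").fetchall()
```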

How to select * from multiple tables in SQL when "self-joining"?

This query tries to get information about a company and its parent company:
select c.*, p.*
from companies c, companies p
where c.parent_id = p.id and c.name ilike '%google%'
but this seems to return data from the parent company (the one specified later) only, and the c.* columns are missing.
Perhaps the reason is that, because this is a self-join, the second set of columns overrides the first?
I'm using this via the Sequel gem.
What you observe is not what Postgres does for this query. It returns all columns of the table companies twice, once for each instance, effectively duplicating column names, which can be a problem for some clients that would expect unique column names.
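The effect can be reproduced outside Postgres. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, with LIKE in place of ilike and two made-up rows. It shows that the query returns six columns with repeated names, and that a client which builds one hash per row keyed by column name (assumed here to be what happens in the question) keeps only the last value for each name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
INSERT INTO companies VALUES (1, 'Alphabet', NULL), (2, 'Google', 1);
""")

# SQLite's LIKE is case-insensitive for ASCII, so it plays the role of ilike.
cur = conn.execute("""
    SELECT c.*, p.*
    FROM companies c, companies p
    WHERE c.parent_id = p.id AND c.name LIKE '%google%'
""")
names = [d[0] for d in cur.description]  # column names as the client sees them
row = cur.fetchone()                     # all six values are present

# A hash keyed by column name can hold each name only once,
# so the parent's values overwrite the child's.
as_dict = dict(zip(names, row))

# Giving every column a distinct alias avoids the collision.
cur = conn.execute("""
    SELECT c.id AS company_id, c.name AS company_name,
           p.id AS parent_company_id, p.name AS parent_company_name
    FROM companies c
    JOIN companies p ON c.parent_id = p.id
    WHERE c.name LIKE '%google%'
""")
aliased = dict(zip([d[0] for d in cur.description], cur.fetchone()))
```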

Oracle - Join multiple columns trying different combinations

I'll try to explain my problem:
I need to find the most efficient way to join two tables on 4 columns, but the data is really poor, so there could be cases where I can join on only 3 or 2 columns because the fourth and/or third were stored badly (with spaces, zeros, dashes, ...).
I should try to achieve something like this:
select * from table a
join table b
on a.key1=b.key1
and a.key2=b.key2
or a.key3=b.key3
or a.key4=b.key4
I have already done some data-quality cleanup, but the number of records is really high (table a has 300k records and table b about 25M).
I know that the example I provided is not efficient and it would be better making separate joins and then "union" them, but I'm asking you if there could be some better way to do it.
Thanks in advance
You haven't explained your problem very well, so let's create an example:
There is a table of clients and a table of orders. The two are not related via keys, because they are imported from different systems. Your task is now to find the client for each order.
Both tables contain the client's last name, first name, city, and a client number. However, these columns are optional in the order table (but either the last name or the client number is always given). And sometimes a first name or city may be abbreviated or misspelled (e.g. J./James, NY/New York, Cris/Chris).
So, if the order contains a client number, we have a match and are done. Otherwise the last name must match. In the latter case we look at first name and city, too. Do both match? Only one? Neither?
We use RANK to rank the clients per order and pick the best matches. Some orders will end up with exactly one match, others will have ties and we must examine the data manually then (the worst case being no client number and no last name match because of a misspelled name).
select *
from
(
select
o.*,
c.*,
rank() over
(
partition by o.order_number
order by
case
when c.client_number = o.client_number then 1
when c.last_name = o.last_name and c.first_name = o.first_name and c.city = o.city then 2
when c.last_name = o.last_name and (c.first_name = o.first_name or c.city = o.city) then 3
when c.last_name = o.last_name then 4
else 5
end
) as rnk
from orders o
left join clients c on c.client_number = o.client_number or c.last_name = o.last_name
) ranked
where rnk = 1
order by order_number;
I hope this gives you an idea of how to write such a query, and that you will be able to adapt the concept to your case.
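To see the ranking at work, here is a small runnable sketch of the same query shape. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later), and all rows are invented. Instead of o.* and c.* it selects only the order number and the matched client number, to keep the output readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (client_number INTEGER, last_name TEXT, first_name TEXT, city TEXT);
CREATE TABLE orders (order_number INTEGER, client_number INTEGER,
                     last_name TEXT, first_name TEXT, city TEXT);
INSERT INTO clients VALUES
    (1, 'Smith', 'James', 'New York'),
    (2, 'Smith', 'John',  'Boston'),
    (3, 'Brown', 'Chris', 'Boston');
INSERT INTO orders VALUES
    (100, 3,    NULL,    NULL,    NULL),  -- client number given
    (101, NULL, 'Smith', 'James', 'NY'),  -- first name matches, city abbreviated
    (102, NULL, 'Smith', NULL,    NULL);  -- last name only: ambiguous
""")

rows = conn.execute("""
    SELECT order_number, matched_client
    FROM (
        SELECT o.order_number,
               c.client_number AS matched_client,
               rank() OVER (
                   PARTITION BY o.order_number
                   ORDER BY CASE
                       WHEN c.client_number = o.client_number THEN 1
                       WHEN c.last_name = o.last_name AND c.first_name = o.first_name
                            AND c.city = o.city THEN 2
                       WHEN c.last_name = o.last_name
                            AND (c.first_name = o.first_name OR c.city = o.city) THEN 3
                       WHEN c.last_name = o.last_name THEN 4
                       ELSE 5
                   END
               ) AS rnk
        FROM orders o
        LEFT JOIN clients c
               ON c.client_number = o.client_number OR c.last_name = o.last_name
    ) ranked
    WHERE rnk = 1
    ORDER BY order_number, matched_client
""").fetchall()
# Orders 100 and 101 each get a single best client; order 102 ties between
# both Smiths and has to be resolved manually.
```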

How to glue two dependent tables?

I have Customer and Application tables. I want to write a select query that returns information about each customer along with the number of applications that customer has in the system.
select distinct c.id, c.region, c.city, count(a.customer_id_id)
from customers c
join applications a on c.id=a.customer_id_id
group by c.id;
But I get an error that I need to group by region and city but I want to display info about each application not to group by region and city. Because in such a way I will get not a number of applications for each user but for each group of users.
I read that it's possible to do with nested queries and full outer join but I tried and it didn't work. Can you explain to me how to do that?
You are close.
Use a LEFT OUTER JOIN so that customers with 0 records in Applications are also included (assuming that is your intent).
Don't use DISTINCT and GROUP BY together. DISTINCT means "If all the fields have the same values across multiple records in the record set produced by this SELECT statement, then give back only distinct records, dropping the duplicates". GROUP BY instead means "Group by this list of fields. Any remaining fields not in this list will be aggregated using a formula in your SELECT clause, like count(a.customer_id_id)." They are similar, but you can't aggregate a field with DISTINCT alone.
When using GROUP BY, if you are not going to aggregate a field with an aggregation formula (count, sum, avg, etc.) then you must include it in your GROUP BY. Some RDBMS (older versions of MySQL, for example) don't require this, but relying on that is poor practice: a field that is neither aggregated nor listed in the GROUP BY tells the RDBMS "just pick whichever value you wish from the matching records", which can have unexpected consequences.
SELECT c.id, c.region, c.city, count(a.customer_id_id)
FROM customers c
LEFT OUTER JOIN applications a on c.id=a.customer_id_id
GROUP BY c.id, c.region, c.city;
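The difference the LEFT OUTER JOIN makes is easiest to see on a customer with no applications. The sketch below runs the query above and its inner-join variant against SQLite through Python's sqlite3 module; the rows are made up, and the applications table is reduced to the two columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT, city TEXT);
CREATE TABLE applications (id INTEGER PRIMARY KEY, customer_id_id INTEGER);
INSERT INTO customers VALUES (1, 'North', 'Oslo'), (2, 'South', 'Rome');
INSERT INTO applications VALUES (10, 1), (11, 1);  -- customer 2 has none
""")

query = """
    SELECT c.id, c.region, c.city, count(a.customer_id_id)
    FROM customers c
    {join} applications a ON c.id = a.customer_id_id
    GROUP BY c.id, c.region, c.city
    ORDER BY c.id
"""

# LEFT OUTER JOIN keeps customer 2; count() ignores the NULL and yields 0.
left_rows = conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall()

# A plain JOIN drops customers that have no matching application row.
inner_rows = conn.execute(query.format(join="JOIN")).fetchall()
```

Counting a.customer_id_id rather than * matters here: count(*) would report 1 for the customer without applications, because the outer join still produces one row for that customer.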
Not sure what your problem is. I assume that region and city are functionally dependent on id (that is, id is a candidate key). Newer versions of PostgreSQL will therefore accept your query, provided id is declared as the primary key. However, if you're on an older version you can expand your group by clause to:
select c.id, c.region, c.city, count(a.customer_id_id)
from customers c
join applications a
on c.id=a.customer_id_id
group by c.id, c.region, c.city;
You say that you would like to display information about each application, but why are you then counting the number of applications per customer? Do you mean something like:
select c.id, c.region, c.city, a.customer_id_id, a.<other attributes>
from customers c
join applications a
on c.id=a.customer_id_id;

Select based on the number of appearances of an id in another table

I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I currently know the count and have hard-coded it in the HAVING clause.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine whether EVERY row in a table has a certain value for a field. Declaring a variable, counting the rows in a sub-selection, and filtering my results down to the IDs that appear that many times works, but it's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood your problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> b.cid);
Probably you will need some indexes on these columns.
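For the "without explicitly counting rows" part of the question, one common pattern is relational division written as a double NOT EXISTS: keep an agent if there is no customer in the city for whom that agent has no order. This is not taken from the answers above, and it assumes the goal is "agents who have placed orders for every customer in Duluth". The sketch below uses SQLite through Python's sqlite3 module; the ordno column and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Agents (aid TEXT PRIMARY KEY);
CREATE TABLE Customers (cid TEXT PRIMARY KEY, city TEXT);
CREATE TABLE Orders (ordno INTEGER PRIMARY KEY, aid TEXT, cid TEXT);
INSERT INTO Agents VALUES ('a01'), ('a02'), ('a03');
INSERT INTO Customers VALUES ('c1', 'Duluth'), ('c2', 'Duluth'), ('c3', 'Dallas');
INSERT INTO Orders VALUES
    (1, 'a01', 'c1'), (2, 'a01', 'c2'),  -- a01 covers both Duluth customers
    (3, 'a02', 'c1'), (4, 'a02', 'c3'),  -- a02 misses c2
    (5, 'a03', 'c1'), (6, 'a03', 'c1');  -- a03 has two orders, one customer
""")

# Double NOT EXISTS: no Duluth customer exists without an order from this agent.
all_duluth = conn.execute("""
    SELECT a.aid
    FROM Agents a
    WHERE NOT EXISTS (
        SELECT 1
        FROM Customers c
        WHERE c.city = 'Duluth'
          AND NOT EXISTS (
              SELECT 1
              FROM Orders o
              WHERE o.aid = a.aid AND o.cid = c.cid
          )
    )
    ORDER BY a.aid
""").fetchall()

# The counting query from the question, for comparison.
counted = conn.execute("""
    SELECT Agents.aid
    FROM Agents
    JOIN Orders ON Agents.aid = Orders.aid
    JOIN Customers ON Customers.cid = Orders.cid
    WHERE Customers.city = 'Duluth'
    GROUP BY Agents.aid
    HAVING count(Agents.aid) > 1
    ORDER BY Agents.aid
""").fetchall()
```

With this data the counting query also returns a03, who has two orders for the same Duluth customer, while the NOT EXISTS form returns only a01. One caveat: if the city has no customers at all, the NOT EXISTS form returns every agent.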