postgres - retain duplicates from query - sql

I just ran into a bug in our code based on the way we're matching data against a Postgres array.
Here's the issue:
In the database, we have a surcharge column that lives in the appointment table where the ids of all surcharges attached to an appointment are stored. In some cases, a surcharge may be added twice and therefore would look like {1,1} in the database.
I was using a simple query like the following to retrieve the surcharge data associated with the surcharges that have been attached to the appointment:
SELECT s.* FROM surcharge s, appointment a WHERE s.id = ANY(a.surcharges);
Unfortunately, I didn't test this case where more than one of the same surcharge exists. The result is that only 1 record is being returned rather than the 2 I really need.
To clarify, an example would be an appointment with appointment.surcharges = {1,1}, I'd actually want to be able to retrieve the record from surcharges TWICE because it has two surcharges associated with it. I then use those records to create billing-related transactions associated with the appointment.
All that said, is there a simple way to allow a query to return the duplicates that I need?
I can handle this situation in code, but it would be great to run a simple query to handle this situation.
I'm running Postgres 9.5.
Thank you in advance!

I think a plain inner join to an unnested table of surcharges should give you what you want:
SELECT s.*
FROM surcharge s
INNER JOIN
(
SELECT UNNEST(surcharges) AS surcharges
FROM appointment
) a
ON s.id = a.surcharges;

Related

Refer to different id's within the same query in SQL

I need to brind a lot of columns from several tables using LEFT JOIN. My starting point is the orders table and I bring the vendor name from the "Address_table". Then I add another table with order details and then the shipping information of each order detail.
My problem is that I need to bring a different record from "Address_table" to refer onether id's detailed in shipment table as of "origin_id" and "destination_id".
In other words, "address_id", "origin_id" and "destination_id" are all records from "Address_table". I brought the first one related to the vendor, how can I retrieve the other two?
Example
Thanks in advance
Your question is not exactly clear in terms of the tables and their relationships. It is, however, clear what the problem is. You need to join against the same table twice using different columns.
In order to do that you need to use table aliases. For example, you can do:
select *
from shipment s
left join address_table a on a.address_id = s.origin_id
left join address_table b on b.address_id = s.destination_id
In this example the table address_table is joined twice against the table shipment; the first time we use a as an alias, the second time b. This way you can differentiate how to pick the right columns and make the joins work as you need them to.

poor report performance when using with clause

What is the best way to join humongous table two or more times?
For example In this fictitious example, we have students table, student record who has TUTOR_ID column is a reference to the same table to another student.
What I am trying to accomplish is a report that will find all tutors that share the same students.
So I have built a query like this:
First time to find parent child relationships
Second time to find all tutors that share the same students
I tried to use "with clause" but query performance is really bad
Is there a better approach to take?
WITH STUDENT_TMP_TABLE AS (
select
TUTORS.STUDENT_IDENTIFIER as TUTOR_STUDENT_IDENTIFIER,
CADETS.STUDENT_IDENTIFIER as CADET_STUDENT_IDENTIFIER
FROM V_STUDENTS_RELATIONS TUTORS
INNER JOIN V_STUDENT_RELATIONS CADETS ON CADETS.TUTOR_ID = TUTORS.STUDENT_IDENTIFIER AND TUTORS.TUTOR_ID IS NULL
WHERE -- ? conditions truncated
)
SELECT
*
FROM STUDENT_TMP_TABLE STUD_TMP1
JOIN STUDENT_TMP_TABLE STUD_TMP2
ON STUD_TMP1.TUTOR_STUDENT_IDENTIFIER <> STUD_TMP2.TUTOR_STUDENT_IDENTIFIER
AND STUD_TMP1.CADET_STUDENT_IDENTIFIER = STUD_TMP2.CADET_STUDENT_IDENTIFIER;

What's the most efficient way to exclude possible results from an SQL query?

I have a subscription database containing Customers, Subscriptions and Publications tables.
The Subscriptions table contains ALL subscription records and each record has three flags to mark the status: isActive, isExpire and isPending. These are Booleans and only one flag can be True - this is handled by the application.
I need to identify all customers who have not renewed any magazines to which they have previously subscribed and I'm not sure that I've written the most efficient SQL query. If I find a lapsed subscription I need to ignore it if they already have an active or pending subscription for that particular magazine.
Here's what I have:
SELECT DISTINCT Customers.id, Subscriptions.publicationName
FROM Subscriptions
LEFT JOIN Customers
ON Subscriptions.id_Customer = Customers.id
LEFT JOIN Publications
ON Subscriptions.id_Publication = Publications.id
WHERE Subscriptions.isExpired = 1
AND NOT EXISTS
( SELECT * FROM Subscriptions s2
WHERE s2.id_Publication = Subscriptions.id_Publication
AND s2.id_Customer = Subscriptions.id_Customer
AND s2.isPending = 1 )
AND NOT EXISTS
( SELECT * FROM Subscriptions s3
WHERE s3.id_Publication = Subscriptions.id_Publication
AND s3.id_Customer = Subscriptions.id_Customer
AND s3.isActive = 1 )
I have just over 50,000 subscription records and this query takes almost an hour to run which tells me that there's a lot of looping or something going on where for each record the SQL engine is having to search again to find any 'isPending' and 'isActive' records.
This is my first post so please be gentle if I've missed out any information in my question :) Thanks.
I don't have your complete database structure, so I can't test the following query but it may contain some optimization. I will leave it to you to test, but will explain why I have changed, what I have changed.
select Distinct Customers.id, Subscriptions.publicationName
from Subscriptions
join Customers on Subscriptions.id_Customer = Customer.id
join Publications
ON Subscriptions.id_Publication = Publications.id
Where Subscriptions.isExpired = 1
And Not Exists
(select * from Subscriptions s2
join Customers on s2.id_Customer = Customer.id
join Publications
ON s2.id_Publication = Publications.id
where s2.id_Customer = s2.id_customer and
(s2.isPending = 1 or s2.isActive = 1))
If you have no resulting data in Customer or Publications DB, then the Subscription information isn't useful, so I eliminated the LEFT join in favor of simply join. Combine the two Exists subqueries. These are pretty intensive if I recall so the fewer the better. Last thing which I did not list above but may be worth looking into is, can you run a subquery with specific data fields returned and use it in an Exists clause? The use of Select * will return all data fields which slows down processing. I'm not sure if you can limit your result unfortunately, because I don't have an equivalent DB available to me that I can test on (the google probably knows).
I suspect there are further optimizations that could be made on this query. Eliminating the Exists clause in favor of an 'IN' clause may help, but I can't think of a way right now, seeing how you've got to match two unique fields (customer id and the relevant subscription). Let me know if this helps at all.
With a table of 50k rows, you should be able to run a query like this in seconds.

SQL Join Statement based on date range

New SQL dev here. I'm writing a call log application for our Asterisk server. In one table (CDRLogs), I have the call logs from the phone system (src, dst, calldate, duration). In another table I have (Employees) I have empName, empExt, extStartDate extEndDate). I want to join the two together on src and empExt based on who was using a particular ext on the date of the call. One user per extension in a given time frame.
For example, we have had 3 different users sitting at x100 during the month of July. In the Employees table, I have recorded the dates each of these people started and ended their use of that ext. How do I get the join to reflect that?
Thanks in advance
Perhaps something like:
SELECT A.*, B.*
FROM CDRLOGS A
INNER JOIN Employees B
ON A.SRC = B.EmpExt
AND A.CallDate between B.extStartDate and coalesce(B.extEndDate,getdate())
Please replace the * with relevant fields needed
and there may be a better way as a join on a between seems like it would possibly cause some overhead, but I can't think of a better way presently.

How can I compare two tables and delete on matching fields (not matching records)

Scenario: A sampling survey needs to be performed on membership of 20,000 individuals. Survey sample size is 3500 of the total 20000 members. All membership individuals are in table tblMember. Same survey was performed the previous year and members whom were surveyed are in tblSurvey08. Membership data can change over the year (e.g. new email address, etc.) but the MemberID data stays the same.
How do I remove the MemberID/records contained tblSurvey08 from tblMember to create a new table of potential members to be surveyed (lets call it tblPotentialSurvey09). Again the record for a individual member may not match from the different tables but the MemberID field will remain constant.
I am fairly new at this stuff but I seem to be having a problem Googling a solution - I could use the EXCEPT function but the records for the individuals members are not necessarily the same from one table to next - just the MemberID may be the same.
Thanks
SELECT
* (replace with column list)
FROM
member m
LEFT JOIN
tblSurvey08 s08
ON m.member_id = s08.member_id
WHERE
s08.member_id IS NULL
will give you only members not in the 08 survey. This join is more efficient than a NOT IN construct.
A new table is not such a great idea, since you are duplicating data. A view with the above query would be a better choice.
I apologize in advance if I didn't understand your question but I think this is what you're asking for. You can use the insert into statement.
insert into tblPotentialSurvey09
select your_criteria from tblMember where tblMember.MemberId not in (
select MemberId from tblSurvey08
)
First of all, I wouldn't create a new table just for selecting potential members. Instead, I would create a new true/false (1/0) field telling if they are eligible.
However, if you'd still want to copy data to the new table, here's how you can do it:
INSERT INTO tblSurvey00 (MemberID)
SELECT MemberID
FROM tblMember m
WHERE NOT EXISTS (SELECT 1 FROM tblSurvey09 s WHERE s.MemberID = m.MemberID)
If you just want to create a new field as I suggested, a similar query would do the job.
An outer join should do:
select m_09.MemberID
from tblMembers m_09 left outer join
tblSurvey08 m_08 on m_09.MemberID = m_08.MemberID
where
m_08.MemberID is null