How to make operations on rows that resulted from a self join query?

I have a table containing many rows of financial data. The columns are as follows:
unixtime, open, high, low, close, timeframe, sourceId.
Given two assets with the same timeframe but different sourceId, how can I show a table which has
unixtime, asset1open/asset2open, asset1close/asset2close as columns?
Each resulting row should be computed from prices that have the same unixtime, and the rows should be ordered by unixtime ascending.
How can I do it with a self join?

You don't mention the specific database, so I'll assume this is for Sybase.
You can do:
select
a.unixtime,
a.open / b.open as open_ratio,
a.close / b.close as close_ratio
from t a
join t b on a.unixtime = b.unixtime and a.timeframe = b.timeframe
where a.sourceid = 123
and b.sourceid = 456
order by a.unixtime
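
To try the self join on a small data set, here is a minimal sketch using Python's built-in sqlite3 module rather than Sybase. The table name t comes from the answer; the sample prices, the "1h" timeframe value and the REAL column types are assumptions made for illustration. Real-valued columns are used because some databases perform integer division when both operands are integers.

```python
import sqlite3

# In-memory database; the schema follows the columns listed in the question.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        unixtime  INTEGER,
        open      REAL,
        high      REAL,
        low       REAL,
        close     REAL,
        timeframe TEXT,
        sourceid  INTEGER
    )
""")

# Made-up sample data: two assets (123 and 456) on the same timeframe.
sample = [
    (1000, 10.0, 12.0,  9.0, 11.0, "1h", 123),
    (1000,  5.0,  6.0,  4.0,  5.5, "1h", 456),
    (2000, 11.0, 13.0, 10.0, 12.0, "1h", 123),
    (2000,  5.5,  6.5,  5.0,  6.0, "1h", 456),
    (3000, 12.0, 13.0, 11.0, 12.5, "1h", 123),  # no row for 456 at this time
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?)", sample)

# Self join: alias a is asset 123, alias b is asset 456, matched on time and timeframe.
ratios = conn.execute("""
    SELECT a.unixtime,
           a.open  / b.open  AS open_ratio,
           a.close / b.close AS close_ratio
    FROM t a
    JOIN t b ON a.unixtime = b.unixtime AND a.timeframe = b.timeframe
    WHERE a.sourceid = 123
      AND b.sourceid = 456
    ORDER BY a.unixtime
""").fetchall()

print(ratios)
```

Because this is an inner join, the timestamp 3000, which exists for only one of the two assets, does not appear in the result.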

Related

Expand Join to not limit data

I have a weird question. I understand that joins return matching data based on the ON condition. However, the problem I am facing is that I need the business date back from both tables, but at the same time I need to join on the date in order to get the totals correct.
See the code below:
Select
o.Resort,
o.Business_Date,
Occupied,
Comps,
House,
ADR,
Room_Revenue,
Occupied-(Comps+House) AS DandT,
Coalesce(gd.Projected_Occ1,0) AS Projected_Occ1,
Occupied-(Comps+House)+Coalesce(gd.Projected_Occ1,0) as Total
from Occupancy o
left join Group_Details_HF gd
on o.Business_Date = gd.Business_Date
and o.Resort = gd.resort
UNION ALL
select
o.Resort,
o.Business_Date,
Occupied,
Comps,
House,
ADR,
Room_Revenue,
Occupied-(Comps+House) AS DandT,
Coalesce(gd.Projected_Occ1,0) AS Projected_Occ1,
Coalesce(Occupied-(Comps+House),0)+Coalesce(gd.Projected_Occ1,0) as Total
from Occupancy_Forecast o
FULL OUTER JOIN Group_Details_HF gd
on o.Business_Date = gd.Business_Date
and o.Resort = gd.resort
Currently, this gives me the desired results from the Occupancy and Occupancy_Forecast tables. However, when the business date does not exist in the Occupancy_Forecast table, it ignores the Group_Details_HF table. I need the results to combine the dates when they exist in both, or give the unique results for each when there is no match.

I have decided to create another pivot table storing the details from Group_Details_HF and then UNION the two tables together, which has given me the desired result rather than fiddling with the join :)

returning one row, with the max date from two different columns from two different tables

Using Report Builder 3.0 for SQL Server 2008 R2. I'm trying to get the most recent dates from 2 different columns in 2 different tables, but I'm getting 4 rows instead of 1. In the picture below, there should be one row per patient.
The script I'm using is this:
SELECT "Patient"."PatientID", "PatientLastName", "PatientFirstName", "DischargeDate", "PatVisitPayable"."ContactDate"
FROM "BTI"."Patient"
JOIN "BTI"."PatAdmissions" ON "Patient"."PatientID" = "PatAdmissions"."PatientID"
JOIN "BTI"."PatVisitPayable" ON "PatAdmissions"."PatientID" = "PatVisitPayable"."PatientID"
JOIN "BTI"."PatAdmissionDivision" ON "PatAdmissions"."AdmissionID" = "PatAdmissionDivision"."AdmissionID"
GROUP BY "Patient"."PatientID", "PatientLastName", "PatientFirstName", "DischargeDate", "ContactDate"
I've tried putting max(contactdate) and max(dischargedate) in the select statement but still get 4 rows. I wasn't sure if this is something I should include in the initial query or something I can add to the report afterwards.
[Image: 4 rows for one patient]

Try removing ContactDate from the GROUP BY clause and using the MAX() function in the SELECT:
SELECT pt.PatientID,
pt.PatientLastName, pt.PatientFirstName, pt.DischargeDate,
MAX(py.ContactDate) AS ContactDate
FROM BTI.Patient pt
JOIN BTI.PatAdmissions pa ON pt.PatientID = pa.PatientID
JOIN BTI.PatVisitPayable py ON pa.PatientID = py.PatientID
JOIN BTI.PatAdmissionDivision pd ON pa.AdmissionID = pd.AdmissionID
GROUP BY pt.PatientID, pt.PatientLastName,
pt.PatientFirstName, pt.DischargeDate;
Always define table aliases; they make the query easier to read and write.
This assumes ContactDate is stored in a reasonable (sortable date) format.
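
If a patient can also have several discharge dates, grouping by DischargeDate still yields one row per distinct date, so it may help to aggregate both date columns. The following is a minimal sketch of that idea using Python's built-in sqlite3 module. The table and column names (patient, admissions, visits and so on) are simplified stand-ins for the BTI tables, and the assumption that the discharge date lives in the admissions table is mine, not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified versions of the Patient / PatAdmissions / PatVisitPayable tables.
conn.executescript("""
    CREATE TABLE patient    (patient_id INTEGER, last_name TEXT);
    CREATE TABLE admissions (admission_id INTEGER, patient_id INTEGER, discharge_date TEXT);
    CREATE TABLE visits     (patient_id INTEGER, contact_date TEXT);

    INSERT INTO patient    VALUES (1, 'Smith');
    INSERT INTO admissions VALUES (10, 1, '2020-01-10'), (11, 1, '2020-03-05');
    INSERT INTO visits     VALUES (1, '2020-02-01'), (1, '2020-04-15');
""")

# Without aggregation, 2 admissions x 2 visits produce 4 rows for one patient.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM patient p
    JOIN admissions a ON a.patient_id = p.patient_id
    JOIN visits v     ON v.patient_id = p.patient_id
""").fetchone()[0]

# Aggregating both dates and grouping only by the patient columns gives one row.
latest = conn.execute("""
    SELECT p.patient_id,
           p.last_name,
           MAX(a.discharge_date) AS last_discharge,
           MAX(v.contact_date)   AS last_contact
    FROM patient p
    JOIN admissions a ON a.patient_id = p.patient_id
    JOIN visits v     ON v.patient_id = p.patient_id
    GROUP BY p.patient_id, p.last_name
""").fetchall()

print(joined_rows, latest)
```

The dates are stored as ISO 8601 strings here, so MAX() on text gives the latest date; with a real date type the same query applies.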

SQL Filtering duplicate rows due to bad ETL

The database is Postgres but any SQL logic should help.
I am retrieving the set of sales quotations that contain a given product within the bill of materials. I'm doing that in two steps: step 1, retrieve all DISTINCT quote numbers which contain a given product (by product number).
The second step, retrieve the full quote, with all products listed for each unique quote number.
So far, so good. Now the tough bit. Some rows are duplicates, some are not. Those that are duplicates (quote number & quote version & line number) might or might not have maintenance on them. I want to pick the row that has maintenance greater than 0. The duplicate rows I want to exclude are those that have a 0 maintenance. The problem is that some rows, which have no duplicates, have 0 maintenance, so I can't just filter on maintenance.
To make this exciting, the database holds quotes spanning 20+ years. And the data science guys have just admitted that maybe the ETL process has some bugs...
--- step 0
--- cleanup the workspace
SET CLIENT_ENCODING TO 'UTF8';
DROP TABLE IF EXISTS product_quotes;
--- step 1
--- get list of Product Quotes
CREATE TEMPORARY TABLE product_quotes AS (
SELECT DISTINCT master_quote_number
FROM w_quote_line_d
WHERE item_number IN ( << model numbers >> )
);
--- step 2
--- Now join on that list
SELECT
d.quote_line_number,
d.item_number,
d.item_description,
d.item_quantity,
d.unit_of_measure,
f.ref_list_price_amount,
f.quote_amount_entered,
f.negtd_discount,
--- need to calculate discount rate based on list price and negtd discount (%)
CASE
WHEN ref_list_price_amount > 0
THEN 100 - (ref_list_price_amount + negtd_discount) / ref_list_price_amount *100
ELSE 0
END AS discount_percent,
f.warranty_months,
f.master_quote_number,
f.quote_version_number,
f.maintenance_months,
f.territory_wid,
f.district_wid,
f.sales_rep_wid,
f.sales_organization_wid,
f.install_at_customer_wid,
f.ship_to_customer_wid,
f.bill_to_customer_wid,
f.sold_to_customer_wid,
d.net_value,
d.deal_score,
f.transaction_date,
f.reporting_date
FROM w_quote_line_d d
INNER JOIN product_quotes pq ON (pq.master_quote_number = d.master_quote_number)
INNER JOIN w_quote_f f ON
(f.quote_line_number = d.quote_line_number
AND f.master_quote_number = d.master_quote_number
AND f.quote_version_number = d.quote_version_number)
WHERE d.net_value >= 0 AND item_quantity > 0
ORDER BY f.master_quote_number, f.quote_version_number, d.quote_line_number
The logic to filter the duplicate rows is like this:
For each master_quote_number / version_number pair, check to see if there are duplicate line numbers. If so, pick the one with maintenance > 0.
Even in a CASE statement, I'm not sure how to write that.
Thoughts? The database is Postgres but any SQL logic should help.

I think you will want to use Window Functions. They are, in a word, awesome.
Here is a query that would "dedupe" based on your criteria:
select *
from (
select
f.* -- simplifying here to show the important parts; add the d columns you need
,row_number() over (
-- one partition per quote / version / line, i.e. per group of duplicates
partition by f.master_quote_number, f.quote_version_number, f.quote_line_number
order by f.maintenance_months desc) as seqnum
from w_quote_line_d d
inner join product_quotes pq
on (pq.master_quote_number = d.master_quote_number)
inner join w_quote_f f
on (f.quote_line_number = d.quote_line_number
and f.master_quote_number = d.master_quote_number
and f.quote_version_number = d.quote_version_number)
) x
where seqnum = 1
The use of row_number() with this partition by and order by guarantees that only ONE row for each combination of quote number / version number / line number gets the value 1, and it will be the one with the highest value in maintenance_months (if your colleagues are right, there would only be one with a value > 0 anyway). A line without duplicates forms a partition of a single row, so it is kept even when its maintenance is 0.
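
The row_number() approach can be tried on a toy table. This is a minimal sketch using Python's built-in sqlite3 module instead of Postgres; it assumes the bundled SQLite is version 3.25 or later (needed for window functions). The single table quote_lines and its sample rows are hypothetical; the partition uses quote number, version number and line number, which is the duplicate key described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single table standing in for the joined quote data.
conn.execute("""
    CREATE TABLE quote_lines (
        master_quote_number  TEXT,
        quote_version_number INTEGER,
        quote_line_number    INTEGER,
        maintenance_months   INTEGER
    )
""")
conn.executemany(
    "INSERT INTO quote_lines VALUES (?, ?, ?, ?)",
    [
        ("Q1", 1, 1, 0),    # duplicate of line 1, no maintenance: should be dropped
        ("Q1", 1, 1, 12),   # duplicate of line 1, with maintenance: should be kept
        ("Q1", 1, 2, 0),    # no duplicate, zero maintenance: should be kept
        ("Q1", 1, 3, 24),   # no duplicate: should be kept
    ],
)

# Number the rows inside each (quote, version, line) group, highest maintenance first,
# then keep only the first row of every group.
deduped = conn.execute("""
    SELECT master_quote_number, quote_version_number,
           quote_line_number, maintenance_months
    FROM (
        SELECT q.*,
               ROW_NUMBER() OVER (
                   PARTITION BY master_quote_number,
                                quote_version_number,
                                quote_line_number
                   ORDER BY maintenance_months DESC
               ) AS seqnum
        FROM quote_lines q
    ) x
    WHERE seqnum = 1
    ORDER BY quote_line_number
""").fetchall()

print(deduped)
```

Line 2 survives even though its maintenance is 0, because it is the only row in its partition.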

Can you do something like...
select
*
from
w_quote_line_d d
inner join
(
select
...
,max(maintenance)
from
w_quote_line_d
group by
...
) d1
on
d1.id = d.id
and d1.maintenance = d.maintenance;
Am I understanding your problem correctly?
Edit: Forgot the group by!

I'm not sure, but maybe you could Group By all other columns and use MAX(Maintenance) to get only the greatest.
What do you think?

SUM(a*b) not working

I have a PHP page running against Postgres. I have 3 tables: workorders, wo_parts and part2vendor. I am trying to multiply the values of two columns from different tables, i.e. wo_parts has a field called qty and part2vendor has a field called cost. The two tables are joined by wo_parts.pn and part2vendor.pn. I have created a query like this:
$scoreCostQuery = "SELECT SUM(part2vendor.cost*wo_parts.qty) as total_score
FROM part2vendor
INNER JOIN wo_parts
ON (wo_parts.pn=part2vendor.pn)
WHERE workorder=$workorder";
But if I add up the costs of the parts multiplied by the quantities supplied by hand, I get a different number than what the script produces. Help... I am new to this, but if someone can show me in SQL I can modify it for Postgres. Thanks

Without seeing example data, there's no way for us to know why your query totals are coming out differently than when you do the math by hand. It could be a bad join, so you are getting more or fewer records than you expected. It's also possible that your calculations are off. Pick an example with the smallest number of associated records and compare.
My suggestion is to add a GROUP BY to the query:
SELECT SUM(p.cost * wp.qty) as total_score
FROM part2vendor p
JOIN wo_parts wp ON wp.pn = p.pn
WHERE workorder = $workorder
GROUP BY workorder
FYI: MySQL was designed to allow flexibility in the GROUP BY, while no other db I've used does - it's a source of numerous questions on SO "why does this work in MySQL when it doesn't work on db x...".
To Check that your Quantities are correct:
SELECT wp.qty,
p.cost
FROM WO_PARTS wp
JOIN PART2VENDOR p ON p.pn = wp.pn
WHERE p.workorder = $workorder
Check that the numbers are correct for a given order.
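
One way a join can return more records than expected is a part that has more than one row in part2vendor. The sketch below uses Python's built-in sqlite3 module to illustrate that effect on made-up data. The vendor column and the placement of workorder in wo_parts are assumptions for the sake of the example; the real schema is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a part may be offered by several vendors at different costs.
conn.executescript("""
    CREATE TABLE part2vendor (pn TEXT, vendor TEXT, cost REAL);
    CREATE TABLE wo_parts    (workorder INTEGER, pn TEXT, qty INTEGER);

    INSERT INTO part2vendor VALUES ('A', 'v1', 10.0), ('A', 'v2', 12.0), ('B', 'v1', 4.0);
    INSERT INTO wo_parts    VALUES (1, 'A', 3), (1, 'B', 2);
""")

# Number of part lines on the work order (what a manual calculation would use).
part_lines = conn.execute(
    "SELECT COUNT(*) FROM wo_parts WHERE workorder = 1"
).fetchone()[0]

# Number of rows the join actually produces: part A matches two vendor rows.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM part2vendor p
    JOIN wo_parts wp ON wp.pn = p.pn
    WHERE wp.workorder = 1
""").fetchone()[0]

# The SUM therefore counts part A twice: 3*10 + 3*12 + 2*4.
total = conn.execute("""
    SELECT SUM(p.cost * wp.qty)
    FROM part2vendor p
    JOIN wo_parts wp ON wp.pn = p.pn
    WHERE wp.workorder = 1
""").fetchone()[0]

print(part_lines, joined_rows, total)
```

If the joined row count is higher than the number of part lines, the join needs an extra condition (for example, a specific vendor) before the sum will match the manual total.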

You could try a sub-query instead.
(Note, I don't have a Postgres installation to test this on so consider this more like pseudo code than a working example... It does work in MySQL tho)
SELECT
SUM(p.`score`) AS 'total_score'
FROM part2vendor AS p2v
INNER JOIN (
SELECT pn, cost * qty AS `score`
FROM wo_parts
) AS p
ON p.pn = p2v.pn
WHERE p2v.workorder=$workorder

In the question, you say the cost column is in part2vendor, but in the query you reference wo_parts.cost. If the wo_parts table has its own cost column, that's the source of the problem.

outer query to list only if its rowcount equates to inner subquery

I need help with a query using SQL Server 2005.
I have two tables:
code (chargecode, chargeid, orgid)
entry (chargeid, itemNo, rate)
I need to list all the chargeids in the entry table when it contains multiple entries with different chargeids
that are listed in the code table under the same chargecode.
data :
code
100,1,100
100,2,100
100,3,100
101,11,100
101,12,100
entry
1,x1,1
1,x2,2
2,x3,2
11,x4,1
11,x5,1
Using the above data, the query should list chargeids 1 and 2 and not 11.
I found a way to count how many rows in entry satisfy the criteria, but I am failing to get the chargeids:
select count (distinct chargeId)
from entry where chargeid in (select chargeid from code where chargecode = (SELECT A.chargecode
from code as A join code as B
ON A.chargecode = B.chargeCode and A.chargetype = B.chargetype and A.orgId = B.orgId AND A.CHARGEID = b.CHARGEid
group by A.chargecode,A.orgid
having count(A.chargecode) > 1)
)

First off: I apologise for my completely inaccurate original answer.
The solution to your problem is a self-join. Self-joins are used when you want to relate rows of a table to other rows of the same table. In our case we want to select two charge IDs that have the same charge code:
SELECT DISTINCT c1.chargeid, c2.chargeid FROM code c1
JOIN code c2 ON c1.chargeid != c2.chargeid AND c1.chargecode = c2.chargecode
JOIN entry e1 ON e1.chargeid = c1.chargeid
JOIN entry e2 ON e2.chargeid = c2.chargeid
WHERE c1.chargeid < c2.chargeid
Explanation of this:
First we pick any two charge IDs from 'code'. The DISTINCT avoids duplicates. We make sure they're two different IDs and that they map to the same chargecode.
Then we join on 'entry' (twice) to make sure they both appear in the entry table.
Without the WHERE clause, this approach gives (for your example) the pairs (1,2) and (2,1). So we also insist on an ordering; this cuts the result set down to just (1,2), as you described.
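
To check the self-join against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server 2005. The column types are assumptions; the table names, column names and rows are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Tables and rows as given in the question (column types are guesses).
conn.executescript("""
    CREATE TABLE code  (chargecode INTEGER, chargeid INTEGER, orgid INTEGER);
    CREATE TABLE entry (chargeid INTEGER, itemNo TEXT, rate INTEGER);

    INSERT INTO code VALUES
        (100, 1, 100), (100, 2, 100), (100, 3, 100),
        (101, 11, 100), (101, 12, 100);
    INSERT INTO entry VALUES
        (1, 'x1', 1), (1, 'x2', 2), (2, 'x3', 2),
        (11, 'x4', 1), (11, 'x5', 1);
""")

# Pair up different chargeids sharing a chargecode, keeping only pairs
# where both ids occur in entry; c1 < c2 removes the mirrored pair.
pairs = conn.execute("""
    SELECT DISTINCT c1.chargeid, c2.chargeid
    FROM code c1
    JOIN code c2   ON c1.chargeid != c2.chargeid
                  AND c1.chargecode = c2.chargecode
    JOIN entry e1  ON e1.chargeid = c1.chargeid
    JOIN entry e2  ON e2.chargeid = c2.chargeid
    WHERE c1.chargeid < c2.chargeid
""").fetchall()

print(pairs)
```

Chargeid 11 is excluded because its only partner under chargecode 101 (chargeid 12) never appears in entry.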
