Compare columns between two tables (greater than or equal to) - sql

I have two tables, and I want to compare a column from each. The column reflow in table f_product must be greater than or equal to the column lreflow in table f_line. The query I used is:
SELECT f_product.oiv,f_product.product,f_product.passive,f_product.pitch,f_product.reflow,f_line.lreflow,f_product.spi,f_product.scomp,f_product.pallet,f_product.printer,f_line.line
FROM f_product,f_line
WHERE f_product.passive=f_line.passive
AND f_product.pitch=f_line.pitch
AND f_product.spi=f_line.spi
AND f_product.pallet=f_line.pallet
AND f_product.printer=f_line.printer
AND f_product.reflow >= f_line.lreflow
AND oiv='PMLE4720A'
However, the result does not seem to apply the comparison between f_product.reflow and f_line.lreflow. For example, it still lists rows with reflow=8 and lreflow=10, where reflow is less than lreflow.
Is there an error in my SQL?

I'm guessing this is Oracle? Sometimes it seems to get confused by the ambiguity between real WHERE conditions and an implicit join written in the WHERE clause. I would recast it using ANSI SQL joins:
SELECT
.....
FROM
f_product a INNER JOIN f_line b ON
(a.passive = b.passive AND
a.pitch =b.pitch AND
a.spi=b.spi AND
a.pallet=b.pallet)
where oiv='PMLE4720A'
and a.reflow >= b.lreflow
Assuming the relationship between product and line is such that it makes sense to join on these four fields...
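One other thing worth ruling out, offered as a guess rather than a diagnosis: if reflow and lreflow are stored as character columns instead of numbers, the comparison is made character by character, and '8' >= '10' is true. The sketch below reproduces that effect with Python's built-in sqlite3 module rather than your database; the table t and its two TEXT columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: both columns declared as TEXT, so values compare as strings.
conn.execute("CREATE TABLE t (reflow TEXT, lreflow TEXT)")
conn.execute("INSERT INTO t VALUES ('8', '10')")

# String comparison: '8' sorts after '1', so the row passes the filter.
as_text = conn.execute(
    "SELECT COUNT(*) FROM t WHERE reflow >= lreflow"
).fetchone()[0]

# Numeric comparison after an explicit cast: 8 >= 10 is false, so no rows.
as_number = conn.execute(
    "SELECT COUNT(*) FROM t "
    "WHERE CAST(reflow AS INTEGER) >= CAST(lreflow AS INTEGER)"
).fetchone()[0]

print(as_text, as_number)
```

If that turns out to be the cause, converting the columns to a numeric type (or casting both sides in the comparison) should make the filter behave as expected.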

Related

How do I join two dataframes, based on conditions, with no common variable?

I am trying to recreate the following SAS code in R
PROC SQL;
create table counts_2018 as
select a.*, b.cell_no
from work.universe201808 a, work.selpar17 b
where a.newregionxx = b.lower_region2
and a.froempment >= b.lower_size
and a.froempment <= b.upper_size
and a.frosic07_2 >= b.lower_class2
and a.frosic07_2 <= b.upper_class2;
QUIT;
What this does, in effect, is assign the cell_no found in selpar17 to the data in universe201808, based on the fulfillment of all 6 conditions outlined in the code. Data which does not fulfill these conditions, and thus won't have a cell_no assigned to it, is not included in the final table.
The documentation/answers I have found so far all start with a step where the two dataframes are merged by a common variable, then an sqldf select is carried out. I do not have a common column, and thus I cannot merge my dataframes.
Currently, you are running an implicit join between the two tables, which is not advised in SQL. The ANSI SQL-92 standard (now more than 25 years old) made the explicit JOIN the standard way of joining relations, so consider revising your SQL query accordingly.
Contrary to your statement, you do in fact have a common column between the tables, as shown in your equality condition a.newregionxx = b.lower_region2, which can serve as the JOIN condition. You can also use the BETWEEN operator for concision:
new_df <- sqldf('select u.*, s.cell_no
from universe201808 u
inner join selpar17 s
on u.newregionxx = s.lower_region2
where u.froempment between s.lower_size and s.upper_size
and u.frosic07_2 between s.lower_class2 and s.upper_class2')
In fact, you can remove the WHERE clause altogether and place all conditions in the ON clause:
...
on u.newregionxx = s.lower_region2
and u.froempment between s.lower_size and s.upper_size
and u.frosic07_2 between s.lower_class2 and s.upper_class2
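To see the range join behave as described, here is a minimal sketch using Python's built-in sqlite3 module (sqldf runs on SQLite by default, so the SQL should carry over). The id column and all row values are invented for illustration; only the other column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE universe201808 (id INTEGER, newregionxx TEXT,
                                 froempment INTEGER, frosic07_2 INTEGER);
    CREATE TABLE selpar17 (cell_no INTEGER, lower_region2 TEXT,
                           lower_size INTEGER, upper_size INTEGER,
                           lower_class2 INTEGER, upper_class2 INTEGER);
    -- Made-up cells: region 'AA' split into two size bands.
    INSERT INTO selpar17 VALUES (1, 'AA', 0, 9, 10, 20),
                                (2, 'AA', 10, 49, 10, 20);
    INSERT INTO universe201808 VALUES
        (100, 'AA', 5, 15),    -- falls in cell 1
        (101, 'AA', 30, 12),   -- falls in cell 2
        (102, 'AA', 500, 15),  -- size outside every band: dropped
        (103, 'BB', 5, 15);    -- no matching region: dropped
""")

# Inner join on the equality plus both ranges; unmatched rows are excluded.
rows = conn.execute("""
    SELECT u.id, s.cell_no
    FROM universe201808 u
    INNER JOIN selpar17 s
        ON u.newregionxx = s.lower_region2
       AND u.froempment BETWEEN s.lower_size AND s.upper_size
       AND u.frosic07_2 BETWEEN s.lower_class2 AND s.upper_class2
    ORDER BY u.id
""").fetchall()

print(rows)
```

Note that BETWEEN is inclusive at both ends, which matches the >= and <= pairs in the original SAS code.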

Duplicate Results in Oracle SQL Plus

When I run the script below, I keep getting duplicate results, even when using distinct.
select distinct
a.SDT, a.fNo, b.IDType, b.pNo, b.pfName, b.plName, b.PDoB, b.Street, b.City, c.Phone
from Scheduled_Flight a, Passenger b, pass_Phone c
where fNo = '0000021'
and
a.SDT = '08-sep-2017 17:30';
I am new to SQL, and any help in solving this issue would be much appreciated.
"I keep getting duplicate results, even when using distinct"
You are not getting duplicates in your result set. Rather, you have a Cartesian product, which is a combination of ONE flight, THREE passengers and THREE phone numbers. Each record in the set is unique, so distinct doesn't have any effect.
The problem is that you have no join conditions in your query. There should be a column on passenger which is a foreign key to flight, and a column on pass_phone which is a foreign key to passenger.
It is easy to fix: you just need to join the tables. Assuming your data model is consistent, your query should look like this (and you don't need DISTINCT):
select a.SDT, a.fNo, b.IDType, b.pNo, b.pfName, b.plName, b.PDoB, b.Street, b.City,c.Phone
from Scheduled_Flight a
join Passenger b on b.fNo = a.fNo
join pass_Phone c on c.pNo = b.pNo
where a.fNo = '0000021'
and a.SDT = '08-sep-2017 17:30';
However, I notice that in your version of the query you didn't prefix fNo. That makes me think you don't have a column of that name on passenger (otherwise the query would have failed on ORA-00918: column ambiguously defined). So, either the foreign key columns are named differently or you haven't got them.
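To make the Cartesian product concrete, here is a small sketch using Python's built-in sqlite3 module instead of Oracle. The foreign key columns (fNo on Passenger, pNo on pass_Phone) and all the row values are assumptions for illustration; your real schema may name them differently or lack them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Scheduled_Flight (fNo TEXT, SDT TEXT);
    -- Assumption: Passenger carries fNo as a foreign key to the flight.
    CREATE TABLE Passenger (pNo INTEGER, pfName TEXT, fNo TEXT);
    -- Assumption: pass_Phone carries pNo as a foreign key to the passenger.
    CREATE TABLE pass_Phone (pNo INTEGER, Phone TEXT);
    INSERT INTO Scheduled_Flight VALUES ('0000021', '2017-09-08 17:30');
    INSERT INTO Passenger VALUES (1, 'Ann', '0000021'), (2, 'Bob', '0000021'),
                                 (3, 'Cy', '0000021');
    INSERT INTO pass_Phone VALUES (1, '555-0001'), (2, '555-0002'),
                                  (3, '555-0003');
""")

# No join conditions: every passenger is paired with every phone number.
cross = conn.execute("""
    SELECT DISTINCT a.fNo, b.pNo, c.Phone
    FROM Scheduled_Flight a, Passenger b, pass_Phone c
    WHERE a.fNo = '0000021'
""").fetchall()

# Explicit joins: each passenger is paired only with their own phone number.
joined = conn.execute("""
    SELECT a.fNo, b.pNo, c.Phone
    FROM Scheduled_Flight a
    JOIN Passenger b ON b.fNo = a.fNo
    JOIN pass_Phone c ON c.pNo = b.pNo
    WHERE a.fNo = '0000021'
""").fetchall()

print(len(cross), len(joined))
```

With one flight, three passengers and three phone numbers, the unjoined query returns 1 x 3 x 3 = 9 distinct rows, while the joined one returns 3.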
"Is it possible to specify only the date without the time?"
Yep. Use an ANSI date literal e.g. date '2017-09-08'
"Is it possible to specify only the date without the time to still produce results from the database?"
That depends on how the data is stored. Oracle dates are stored with a time element. If no time is specified (or the time element is truncated) then the time element defaults to midnight. This often catches beginners out, for instance because the pseudo-column sysdate returns the current date and time, not just the current date.
So, if you know the dates are stored in your table without a time element you can do this:
where a.sdt = date '2017-09-08'
But if you don't know that, you can truncate ...
where trunc(a.sdt) = date '2017-09-08'
or test for a range
where a.sdt >= date '2017-09-08'
and a.sdt < date '2017-09-09'
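The three variants can be compared side by side in a rough analogue built on Python's sqlite3 module. This is not Oracle: SQLite has no DATE type, so ISO-formatted text timestamps stand in for Oracle dates, SQLite's date() function stands in for trunc(), and the table and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: sdt holds ISO-8601 text as a stand-in for an Oracle DATE.
conn.execute("CREATE TABLE flights (fno TEXT, sdt TEXT)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("0000021", "2017-09-08 17:30:00"),   # on the day, with a time element
     ("0000022", "2017-09-09 00:00:00")],  # midnight of the following day
)

def count(where):
    """Return how many rows satisfy the given WHERE condition."""
    return conn.execute(f"SELECT COUNT(*) FROM flights WHERE {where}").fetchone()[0]

# Plain equality misses the row because of its time element.
exact = count("sdt = '2017-09-08'")
# Stripping the time element (the analogue of trunc) finds it.
truncated = count("date(sdt) = '2017-09-08'")
# A half-open range also finds it, and excludes midnight of the next day.
ranged = count("sdt >= '2017-09-08' AND sdt < '2017-09-09'")

print(exact, truncated, ranged)
```

The range form leaves the column untouched, which is usually the safer choice when the column is indexed.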
"How come the following code is still producing duplicate results?
select distinct r.sNo, r.tCode, s.fNo, s.SDT
from Airplane r, Scheduled_Flight s
where SDT >= SYSDATE -1;
The airplane attribute cannot have the s.SDT attribute."
Without seeing the output I can't be sure but I would bet that this query does not produce duplicate records either. What you have is a product combining all your AIRPLANE records with all your FLIGHT records matching the sdt filter.
This is another data modelling problem. Of course aeroplanes don't have a flight time: one aeroplane makes many flights. But it makes perfect sense for a flight to be assigned to a plane. In fact that's crucial to ensuring that you don't have more flights than you have planes to fly them, and that one plane isn't planned to take off from London for Madrid at a time when it's planned to be half-way to Hong Kong.
You really should use the ANSI 92 syntax, as I showed in my answer to your previously posted code. The explicit joins not only make it easier to understand the query but they prevent mistakes like this. The fact that you apparently don't have any candidate columns to make the join immediately highlights the flaw in the data model.
select distinct r.sNo, r.tCode, s.fNo, s.SDT
from Airplane r
INNER JOIN Scheduled_Flight s ON ????
where SDT >= SYSDATE -1;
I don't see any duplicated rows. If you compare every column of each row, each row is uniquely identified. Since you are doing a Cartesian product you are getting multiple records, but each row is unique.

SQL - LEFT JOIN and WHERE statement to show just first row

I read many threads but didn't find the right solution to my problem. It's comparable to this thread.
I have a query which gathers data and writes it, via a shell script, into a CSV file:
SELECT
'"Dose History ID"' = d.dhs_id,
'"TxFieldPoint ID"' = tp.tfp_id,
'"TxFieldPointHistory ID"' = tph.tph_id,
...
FROM txfield t
LEFT JOIN txfielpoint tp ON t.fld_id = tp.fld_id
LEFT JOIN txfieldpoint_hst tph ON fh.fhs_id = tph.fhs_id
...
WHERE d.dhs_id NOT IN ('1000', '10000')
AND ...
ORDER BY d.datetime,...;
This is based on a very big database with lots of tables and machine values. I picked my columns of interest and linked them by their built-in table IDs. Now I have to reduce my result, where I get many rows with the same values and only the IDs differ. I just need one (first) row of "tph.tph_id", with a mechanism like
WHERE "Rownumber" is 1
or something like this. So far I couldn't implement a proper subquery or use the ROW_NUMBER() SQL function. Your help would be much appreciated. The result looks like this and, based on the last ID, I just need one row for each of these numbers (the IDs are not strictly consecutive).
A01";261511;2843119;714255;3634457;
A01";261511;2843113;714256;3634457;
A01";261511;2843113;714257;3634457;
A02";261512;2843120;714258;3634464;
A02";261512;2843114;714259;3634464;
....
I think GROUP BY may suit your needs.
You can group rows with the same values for a set of columns into a single row.
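As a rough sketch of that idea, the example below loads the five sample rows into an in-memory SQLite table via Python's sqlite3 module and keeps, for each value of the last ID, the row with the smallest fourth ID. The table name result and the column names label, id_a, id_b, id_c and last_id are placeholders, since the real column mapping is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Placeholder names: the question does not say which output column is which.
conn.execute(
    "CREATE TABLE result (label TEXT, id_a INTEGER, id_b INTEGER, "
    "id_c INTEGER, last_id INTEGER)"
)
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?, ?, ?)",
    [("A01", 261511, 2843119, 714255, 3634457),
     ("A01", 261511, 2843113, 714256, 3634457),
     ("A01", 261511, 2843113, 714257, 3634457),
     ("A02", 261512, 2843120, 714258, 3634464),
     ("A02", 261512, 2843114, 714259, 3634464)],
)

# GROUP BY finds the smallest id_c per last_id; joining back on that pair
# returns the complete first row of each group.
first_rows = conn.execute("""
    SELECT r.*
    FROM result r
    JOIN (SELECT last_id, MIN(id_c) AS first_id_c
          FROM result
          GROUP BY last_id) f
      ON f.last_id = r.last_id AND f.first_id_c = r.id_c
    ORDER BY r.id_c
""").fetchall()

for row in first_rows:
    print(row)
```

In the real query, the grouped subquery would wrap the existing SELECT; which ID defines "first" is your call.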

How can I get non repeating value from select?

This is my select statement, it returns duplicate rows (see screen shot).
How can I prevent the duplicated rows?
SELECT
A.TOTAL_PRESENT,
A."LIMIT",
A.COST_CENTER,
A.ID,
A.PLANT,
A.BUDGET_YEAR,
A."VERSION",
B.BUDGET_YEAR,
B."VERSION",
B.PLANT,
B.CHARGE_CC,
B.YEAR_DATE_USD
FROM
CMS.SUM_REPANDMAINT A,
CMS.V_SUM_REPANDMAINT B
WHERE
(A.BUDGET_YEAR = B.BUDGET_YEAR(+)) AND
(A."VERSION" = B."VERSION"(+)) AND
(A.PLANT = B.PLANT(+)) AND
(A.COST_CENTER = B.CHARGE_CC(+)) AND
(B.USERNAME = '[usr_name]')
Duplicate entries mean the filter criteria are not precise enough. One of your data sources produces multiple rows and the WHERE clause doesn't offer sufficient restriction.
You haven't posted any raw data, so we can't tell you what additional criteria you need. However, you should look at your use of outer joins. An outer join returns rows from the left-hand table even when no row in the right-hand table matches the join criteria. Why are you doing that?
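One detail in the posted query is worth checking: B.USERNAME = '[usr_name]' has no (+), so any row of A without a match in B (where B.USERNAME is NULL) fails that test, and the outer join effectively behaves like an inner join. The sketch below shows the same effect with ANSI LEFT JOIN syntax in SQLite via Python's sqlite3 module, since SQLite does not support Oracle's (+) notation; the tables a and b and their rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical cut-down versions of the two sources in the question.
    CREATE TABLE a (cost_center TEXT);
    CREATE TABLE b (charge_cc TEXT, username TEXT);
    INSERT INTO a VALUES ('CC1'), ('CC2');
    INSERT INTO b VALUES ('CC1', 'alice');
""")

# Filter in WHERE: the NULL-extended row for CC2 is discarded afterwards.
where_filter = conn.execute("""
    SELECT a.cost_center, b.username
    FROM a LEFT JOIN b ON a.cost_center = b.charge_cc
    WHERE b.username = 'alice'
    ORDER BY a.cost_center
""").fetchall()

# Filter in ON: CC2 is kept, with NULL for the columns from b.
on_filter = conn.execute("""
    SELECT a.cost_center, b.username
    FROM a LEFT JOIN b ON a.cost_center = b.charge_cc
                      AND b.username = 'alice'
    ORDER BY a.cost_center
""").fetchall()

print(where_filter)
print(on_filter)
```

Whether the unmatched rows should appear at all depends on what the report is meant to show, so this may or may not be the behaviour you intended.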

Remove duplicate column after SQL query

I have this query but I'm getting two columns of houseid:
How do I only get one?
SELECT vehv2pub.houseid, vehv2pub.vehid, vehv2pub.epatmpg,
dayv2pub.houseid, dayv2pub.trpmiles
FROM vehv2pub, dayv2pub
WHERE vehv2pub.vehid >= 1
AND dayv2pub.trpmiles < 15
AND dayv2pub.houseid = vehv2pub.houseid;
And also, how do I get the average of the epatmpg? So the query would just return the value?
The most elegant way would be to use the USING clause in an explicit join condition:
SELECT houseid, v.vehid, v.epatmpg, d.trpmiles
FROM vehv2pub v
JOIN dayv2pub d USING (houseid)
WHERE v.vehid >= 1
AND d.trpmiles < 15;
This way, the column houseid is in the result only once, even if you use SELECT *.
Per documentation:
USING is a shorthand notation: it takes a comma-separated list of
column names, which the joined tables must have in common, and forms a
join condition specifying equality of each of these pairs of columns.
Furthermore, the output of JOIN USING has one column for each of the
equated pairs of input columns, followed by the remaining columns from each table.
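The column merging can be observed directly. The sketch below uses SQLite through Python's sqlite3 module rather than PostgreSQL (SQLite also drops the right-hand copy of a USING column from SELECT *), with one made-up row per table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehv2pub (houseid INTEGER, vehid INTEGER, epatmpg REAL);
    CREATE TABLE dayv2pub (houseid INTEGER, trpmiles REAL);
    -- Made-up sample rows.
    INSERT INTO vehv2pub VALUES (1, 1, 20.0);
    INSERT INTO dayv2pub VALUES (1, 5.0);
""")

def column_names(sql):
    """Return the output column names of a query."""
    return [col[0] for col in conn.execute(sql).description]

# USING: houseid appears once in the output.
cols_using = column_names(
    "SELECT * FROM vehv2pub v JOIN dayv2pub d USING (houseid)"
)
# ON: houseid appears once per table.
cols_on = column_names(
    "SELECT * FROM vehv2pub v JOIN dayv2pub d ON d.houseid = v.houseid"
)

print(cols_using)
print(cols_on)
```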
To get the average epatmpg for the selected rows:
SELECT avg(v.epatmpg) AS avg_epatmpg
FROM vehv2pub v
JOIN dayv2pub d USING (houseid)
WHERE v.vehid >= 1
AND d.trpmiles < 15;
If there are multiple matches in dayv2pub, the derived table can hold multiple instances of each row in vehv2pub after the join. avg() is based on the derived table.
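To illustrate that last point, here is a small sketch in SQLite via Python's sqlite3 module, with invented rows: one household has three qualifying trips and another has one. The EXISTS variant is just one possible way to count each vehicle once, not necessarily what you need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehv2pub (houseid INTEGER, vehid INTEGER, epatmpg REAL);
    CREATE TABLE dayv2pub (houseid INTEGER, trpmiles REAL);
    -- Made-up data: household 1 has three short trips, household 2 has one.
    INSERT INTO vehv2pub VALUES (1, 1, 20.0), (2, 1, 40.0);
    INSERT INTO dayv2pub VALUES (1, 5.0), (1, 6.0), (1, 7.0), (2, 8.0);
""")

# Average over the joined rows: the 20 mpg vehicle is counted three times.
avg_joined = conn.execute("""
    SELECT avg(v.epatmpg)
    FROM vehv2pub v
    JOIN dayv2pub d USING (houseid)
    WHERE v.vehid >= 1 AND d.trpmiles < 15
""").fetchone()[0]

# Average over vehicles that have at least one qualifying trip.
avg_per_vehicle = conn.execute("""
    SELECT avg(v.epatmpg)
    FROM vehv2pub v
    WHERE v.vehid >= 1
      AND EXISTS (SELECT 1 FROM dayv2pub d
                  WHERE d.houseid = v.houseid AND d.trpmiles < 15)
""").fetchone()[0]

print(avg_joined, avg_per_vehicle)
```

Here the joined average is (20 + 20 + 20 + 40) / 4 = 25, while the per-vehicle average is (20 + 40) / 2 = 30.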
Not 100% sure this works in PostgreSQL, but something like this gets the average in SQL Server:
SELECT vehv2pub.houseid, avg(vehv2pub.epatmpg)
FROM vehv2pub, dayv2pub
WHERE vehv2pub.vehid >= 1
AND dayv2pub.trpmiles < 15
AND dayv2pub.houseid = vehv2pub.houseid
GROUP BY vehv2pub.houseid