Subselect a Summed col in Oracle - sql

this is an attempted fix to a crystal reports use of 2 sub reports!
I have a query that joins 3 tables, and I wanted to use a pair of sub selects that bring in the same new table.
Here is the first of the two columns in script:
SELECT ea."LOC_ID", lo."DESCR", ea."PEGSTRIP", ea."ENTITY_OWNER"
, ea."PCT_OWNERSHIP", ea."BEG_BAL", ea."ADDITIONS", ea."DISPOSITIONS"
, ea."EXPLANATION", ea."END_BAL", ea."NUM_SHARES", ea."PAR_VALUE"
, ag."DESCR", ea."EOY", ea."FAKEPEGSTRIP",
(select sum(htb.END_FNC_CUR_US_GAAP)
from EQUITY_ACCOUNTS ea , HYPERION_TRIAL_BALANCE htb
where
htb.PEGSTRIP = ea.PEGSTRIP and
htb.PRD_NBR = 0 and
htb.LOC_ID = ea.LOC_ID and
htb.PRD_YY = ea.EOY
) firstHyp
FROM ("TAXPALL"."ACCOUNT_GROUPING" ag
INNER JOIN "TAXP"."EQUITY_ACCOUNTS" ea
ON (ag."ACCT_ID"=ea."PEGSTRIP") AND (ag."EOY"=ea."EOY"))
INNER JOIN "TAXP"."LOCATION" lo ON ea."LOC_ID"=lo."LOC_ID"
WHERE ea."EOY"=2009
ORDER BY ea."LOC_ID", ea."PEGSTRIP"
When this delivers the dataset the value of "firstHyp" fails to change by pegstrip value. It returns a single total for the join and fails to put the proper by value by pegstrip.
I thought that the where clause would have picked up the joins line by line.
I don't do Oracle syntax often so what am I missing here?
TIA

Your SQL is equivilent to the following:
SELECT ea."LOC_ID", lo."DESCR", ea."PEGSTRIP",
ea."ENTITY_OWNER" , ea."PCT_OWNERSHIP",
ea."BEG_BAL", ea."ADDITIONS", ea."DISPOSITIONS" ,
ea."EXPLANATION", ea."END_BAL", ea."NUM_SHARES",
ea."PAR_VALUE" , ag."DESCR", ea."EOY", ea."FAKEPEGSTRIP",
(select sum(htb.END_FNC_CUR_US_GAAP)
from EQUITY_ACCOUNTS iea
Join HYPERION_TRIAL_BALANCE htb
On htb.PEGSTRIP = iea.PEGSTRIP
and htb.LOC_ID = iea.LOC_ID
and htb.PRD_YY = iea.EOY
where htb.PRD_NBR = 0 ) firstHyp
FROM "TAXPALL"."ACCOUNT_GROUPING" ag
JOIN "TAXP"."EQUITY_ACCOUNTS" ea
ON ag."ACCT_ID"=ea."PEGSTRIP"
AND ag."EOY"=ea."EOY"
JOIN "TAXP"."LOCATION" lo
ON ea."LOC_ID"=lo."LOC_ID"
WHERE ea."EOY"=2009
ORDER BY ea."LOC_ID", ea."PEGSTRIP"
Notice that the subquery that generates firstHyp is not in any way dependant on the tables in the outer query... It is therefore not a Correllated SubQuery... meaning that the value it generates will NOT be different for each row in the outer query's resultset, it will be the same for every row. You need to somehow put something in the subquery that makes it dependant on the value of some row in the outer query so that it will become a correllated subquery and run over and over once for each outer row....
Also, you mention a pair of subselects, but I only see one. Where is the other one ?

Use:
SELECT ea.LOC_ID,
lo.DESCR,
ea.PEGSTRIP,
ea.ENTITY_OWNER,
ea.PCT_OWNERSHIP,
ea.BEG_BAL,
ea.ADDITIONS,
ea.DISPOSITIONS,
ea.EXPLANATION,
ea.END_BAL,
ea.NUM_SHARES,
ea.PAR_VALUE,
ag.DESCR,
ea.EOY,
ea.FAKEPEGSTRIP,
NVL(SUM(htb.END_FNC_CUR_US_GAAP), 0) AS firstHyp
FROM TAXPALL.ACCOUNT_GROUPING ag
JOIN TAXP.EQUITY_ACCOUNTS ea ON ea.PEGSTRIP = ag.ACCT_ID
AND ea.EOY = ag.EOY
AND ea.EOY = 2009
JOIN TAXP.LOCATION lo ON lo.LOC_ID = ea.LOC_ID
LEFT JOIN HYPERION_TRIAL_BALANCE htb ON htb.PEGSTRIP = ea.PEGSTRIP
AND htb.LOC_ID = ea.LOC_ID
AND htb.PRD_YY = ea.EOY
AND htb.PRD_NBR = 0
GROUP BY ea.LOC_ID,
lo.DESCR,
ea.PEGSTRIP,
ea.ENTITY_OWNER,
ea.PCT_OWNERSHIP,
ea.BEG_BAL,
ea.ADDITIONS,
ea.DISPOSITIONS,
ea.EXPLANATION,
ea.END_BAL,
ea.NUM_SHARES,
ea.PAR_VALUE,
ag.DESCR,
ea.EOY,
ea.FAKEPEGSTRIP,
ORDER BY ea.LOC_ID, ea.PEGSTRIP
I agree with Charles Bretana's assessment that the original SELECT in the SELECT clause was not correlated, which is why the value never changed per row. But the sub SELECT used the EQUITY_ACCOUNTS table, which is the basis for the main query. So I removed the join, and incorporated the HYPERION_TRIAL_BALANCE table into the main query, using a LEFT JOIN. I wrapped the SUM in an NVL rather than COALESCE because I didn't catch what version of Oracle this is for.

Related

Two almost identical queries returning different results

I am getting different results for the following two queries and I have no idea why. The only difference is one has an IN and one has an equals.
Before I go into the queries you should know that I found a better way to do it by moving the subquery into a common table expression, but this is still driving me crazy! I really want to know what caused the issue in the first place, I am asking out of curiosity
Here's the first query:
use [DB.90_39733]
Select distinct x.uniqproducer, cn.Firstname,cn.lastname,e.code,
ecn.FirstName, ecn.LastName, ecn.entid, x.uniqline
from product x
join employ e on e.EmpID=x.uniqproducer
join contactname cn on cn.uniqentity=e.uniqentity
join [ETL_GAWR92]..idlookupentity ide on ide.enttype='EM'
and ide.UniqEntity=e.UniqEntity
left join [ETL_GAWR92]..EntConName ecn on ecn.entid=ide.empid
and ecn.opt='Y'
Where x.UniqProducer =(SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM')
And the second one:
use [DB.90_39733]
Select distinct x.uniqproducer, cn.Firstname,cn.lastname,e.code,
ecn.FirstName, ecn.LastName, ecn.entid, x.uniqline
from product x
join employ e on e.EmpID=x.uniqproducer
join contactname cn on cn.uniqentity=e.uniqentity
join [ETL_GAWR92]..idlookupentity ide on ide.enttype='EM'
and ide.UniqEntity=e.UniqEntity
left join [ETL_GAWR92]..EntConName ecn on ecn.entid=ide.empid
and ecn.opt='Y'
Where x.UniqProducer IN (SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM')
The first query returns 0 rows while the second query returns 2 rows.The only difference is x.UniqProducer = versus x.UniqProducer IN for the last where clause.
Thanks for your time
SELECT TOP 1 doesn't guarantee that the same record will be returned each time.
Add an ORDER BY to your select to make sure the same record is returned.
(SELECT TOP 1 idl.UniqEntity
FROM [ETL_GAWR92]..IDLookupEntity idl
LEFT JOIN [ETL_GAWR92]..Employ e2 ON e2.ProdID = ''
WHERE idl.empID = e2.EmpID AND
idl.EntType = 'EM' ORDER BY idl.UniqEntity)
I would guess (with strong emphasis on the word “guess”) that the reason is based on how equals and in are processed by the query engine. For equals, SQL knows it needs to do a comparison with a specific value, where for in, SQL knows it needs to build a subset, and find if the "outer" value is in that "inner" subset. Yes, the end results should be the same as there’s only 1 row returned by the subquery, but as #RickS pointed out, without any ordering there’s no guarantee of which value ends up “on top” – and the (sub)query plan used to build the in - driven subquery might differ from that used by the equals pull.
A follow-up question: which is the correct dataset? When you analyze the actual data, should you have gotten zero, two, or a different number of rows?

COUNT is outputting more than one row

I am having a problem with my SQL query using the count function.
When I don't have an inner join, it counts 55 rows. When I add the inner join into my query, it adds a lot to it. It suddenly became 102 rows.
Here is my SQL Query:
SELECT COUNT([fmsStage].[dbo].[File].[FILENUMBER])
FROM [fmsStage].[dbo].[File]
INNER JOIN [fmsStage].[dbo].[Container]
ON [fmsStage].[dbo].[File].[FILENUMBER] = [fmsStage].[dbo].[Container].[FILENUMBER]
WHERE [fmsStage].[dbo].[File].[RELATIONCODE] = 'SHIP02'
AND [fmsStage].[dbo].[Container].DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'
GROUP BY [fmsStage].[dbo].[File].[FILENUMBER]
Also, I have to do TOP 1 at the SELECT statement because it returns 51 rows with random numbers inside of them. (They are probably not random, but I can't figure out what they are.)
What do I have to do to make it just count the rows from [fmsStage].[dbo].[file].[FILENUMBER]?
First, your query would be much clearer like this:
SELECT COUNT(f.[FILENUMBER])
FROM [fmsStage].[dbo].[File] f INNER JOIN
[fmsStage].[dbo].[Container] c
ON v.[FILENUMBER] = c.[FILENUMBER]
WHERE f.[RELATIONCODE] = 'SHIP02' AND
c.DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08';
No GROUP BY is necessary. Otherwise you'll just one row per file number, which doesn't seem as useful as the overall count.
Note: You might want COUNT(DISTINCT f.[FILENUMBER]). Your question doesn't provide enough information to make a judgement.
Just remove GROUP BY Clause
SELECT COUNT([fmsStage].[dbo].[File].[FILENUMBER])
FROM [fmsStage].[dbo].[File]
INNER JOIN [fmsStage].[dbo].[Container]
ON [fmsStage].[dbo].[File].[FILENUMBER] = [fmsStage].[dbo].[Container].[FILENUMBER]
WHERE [fmsStage].[dbo].[File].[RELATIONCODE] = 'SHIP02'
AND [fmsStage].[dbo].[Container].DELIVERYDATE BETWEEN '2016-10-06' AND '2016-10-08'

How can I do a SQL join to get a value 4 tables farther from the value provided?

My title is probably not very clear, so I made a little schema to explain what I'm trying to achieve. The xxxx_uid labels are foreign keys linking two tables.
Goal: Retrieve a column from the grids table by giving a proj_uid value.
I'm not very good with SQL joins and I don't know how to build a single query that will achieve that.
Actually, I'm doing 3 queries to perform the operation:
1) This gives me a res_uid to work with:
select res_uid from results where results.proj_uid = VALUE order by res_uid asc limit 1"
2) This gives me a rec_uid to work with:
select rec_uid from receptor_results
inner join results on results.res_uid = receptor_results.res_uid
where receptor_results.res_uid = res_uid_VALUE order by rec_uid asc limit 1
3) Get the grid column I want from the grids table:
select grid_name from grids
inner join receptors on receptors.grid_uid = grids.grid_uid
where receptors.rec_uid = rec_uid_VALUE;
Is it possible to perform a single SQL that will give me the same results the 3 I'm actually doing ?
You're not limited to one JOIN in a query:
select grids.grid_name
from grids
inner join receptors
on receptors.grid_uid = grids.grid_uid
inner join receptor_results
on receptor_results.rec_uid = receptors.rec_uid
inner join results
on results.res_uid = receptor_results.res_uid
where results.proj_uid = VALUE;
select g.grid_name
from results r
join resceptor_results rr on r.res_uid = rr.res_uid
join receptors rec on rec.rec_uid = rr.rec_uid
join grids g on g.grid_uid = rec.grid_uid
where r.proj_uid = VALUE
a small note about names, typically in sql the table is named for a single item not the group. thus "result" not "results" and "receptor" not "receptors" etc. As you work with sql this will make sense and names like you have will seem strange. Also, one less character to type!

Ignore null values in select statement

I'm trying to retrieve a list of components via my computer_system, BUT if a computer system's graphics card is set to null (I.e. It has an onboard), the row isn't returned by my select statement.
I've been trying to use COALESCE without results. I've also tried with and OR in my WHERE clause, which then just returns my computer system with all different kinds of graphic cards.
Relevant code:
SELECT
computer_system.cs_id,
computer_system.cs_name,
motherboard.name,
motherboard.price,
cpu.name,
cpu.price,
gfx.name,
gfx.price
FROM
public.computer_case ,
public.computer_system,
public.cpu,
public.gfx,
public.motherboard,
public.ram
WHERE
computer_system.cs_ram = ram.ram_id AND
computer_system.cs_cpu = cpu.cpu_id AND
computer_system.cs_mb = motherboard.mb_id AND
computer_system.cs_case = computer_case.case_id AND
computer_system.cs_gfx = gfx.gfx_id; <-- ( OR computer_system.cs_gfx IS NULL)
Returns:
1;"Computer1";"Fractal Design"; 721.00; "MSI Z87"; 982.00; "Core i7 I7-4770K "; 2147.00; "Crucial Gamer"; 1253.00; "ASUS GTX780";3328.00
Should I use Joins? Is there no easy way to say return the requested row, even if there's a bloody NULL value. Been struggling with this for at least 2 hours.
Tables will be posted if needed.
EDIT: It should return a second row:
2;"Computer2";"Fractal Design"; 721.00; "MSI Z87"; 982.00; "Core i7 I7-4770K "; 2147.00; "Crucial Gamer"; 1253.00; "null/nothing";null/nothing
You want a LEFT OUTER JOIN.
First, clean up your code so you use ANSI joins so it's readable:
SELECT
computer_system.cs_id,
computer_system.cs_name,
motherboard.name,
motherboard.price,
cpu.name,
cpu.price,
gfx.name,
gfx.price
FROM
public.computer_system
INNER JOIN public.computer_case ON computer_system.cs_case = computer_case.case_id
INNER JOIN public.cpu ON computer_system.cs_cpu = cpu.cpu_id
INNER JOIN public.gfx ON computer_system.cs_gfx = gfx.gfx_id
INNER JOIN public.motherboard ON computer_system.cs_mb = motherboard.mb_id
INNER JOIN public.ram ON computer_system.cs_ram = ram.ram_id;
Then change the INNER JOIN on public.gfx to a LEFT OUTER JOIN:
LEFT OUTER JOIN public.gfx ON computer_system.cs_gfx = gfx.gfx_id
See PostgreSQL tutorial - joins.
I very strongly recommend reading an introductory tutorial to SQL - at least the PostgreSQL tutorial, preferably some more material as well.
It looks like it's just a bracket placement issue. Pull the null check and the graphics card id comparison into a clause by itself.
...
computer_system.cs_case = computer_case.case_id AND
(computer_system.cs_gfx IS NULL OR computer_system.cs_gfx = gfx.gfx_id)
Additionally, you ask if you should use joins. You are in fact using joins, by virtue of having multiple tables in your FROM clause and specifying the join criteria in the WHERE clause. Changing this to use the JOIN ON syntax might be a little easier to read:
FROM sometable A
JOIN someothertable B
ON A.somefield = B.somefield
JOIN somethirdtable C
ON A.somefield = C.somefield
etc
Edit:
You also likely want to make the join where you expect the null value to be a left outer join:
SELECT * FROM
first_table a
LEFT OUTER JOIN second_table b
ON a.someValue = b.someValue
If there is no match in the join, the row from the left side will still be returned.

Whats wrong with this nested query?

I am trying to write a query to return the id of the latest version of a market index stored in a database.
SELECT miv.market_index_id market_index_id from ref_market_index_version miv
INNER JOIN ref_market_index mi ON miv.market_index_id = mi.id
WHERE mi.short_name='dow30'
AND miv.version_num = (SELECT MAX(m1.version_num) FROM ref_market_index_version m1 INNER JOIN ref_market_index m2 ON m1.market_index_id = m2.id )
The above SQL statement can be (roughly) translated into the form:
SELECT some columns FROM SOME CRITERIA MATCHED TABLES
WHERE mi.short_name='some name'
AND miv.version_num = SOME NUMBER
What I don't understand is that when I supply an actual number (instead of a sub query), the SQL statement works - also, when I test the SUB query used to determine the latest version number, that also works - however, when I attempt to use the result returned by sub query in the outer (parent?) query, it returns 0 rows - what am I doing wrong here?
Incidentally, I also tried an IN CLAUSE instead of the strict equality match i.e.
... AND miv.version_num IN (SUB QUERY)
That also resulted in 0 rows, although as before, when running the parent query with a hard coded version number, I get 1 row returned (as expected).
BTW I am using postgeresql, but I prefer the solution to be db agnostic.
The problem is probably that the max(version_num) doesn't exist for 'dow30'.
Try the following correlated subquery:
SELECT miv.market_index_id market_index_id
from ref_market_index_version miv INNER JOIN
ref_market_index mi
ON miv.market_index_id = mi.id
WHERE mi.short_name='dow30' AND
miv.version_num = (SELECT MAX(m1.version_num)
FROM ref_market_index_version m1 INNER JOIN
ref_market_index m2
ON m1.market_index_id = m2.id
where m1.short_name = 'dow30'
)
I added the where clause in the subquery.