Oracle Join View - which rowid is used

CREATE VIEW EVENT_LOCATION ("EVENT_ID", "STREET", "TOWN") AS
SELECT A.EVENT_ID, A.STREET, A.TOWN
FROM TBLEVENTLOCATION A
JOIN TBLEVENTS B
ON A.EVENT_ID = B.EVENT_ID
WHERE B.REGION = 'South';
If I run
SELECT ROWID, STREET, TOWN FROM EVENT_LOCATION
then which ROWID should I get back?
Reason I'm asking is:
In the database there are many views that follow the above pattern. Which ROWID is returned seems to differ from view to view, i.e. I get A.ROWID from some views and B.ROWID from others.
UPDATE:
I have resolved this using the following view, which essentially guarantees the ROWID comes from the right table. Thanks for your replies!
CREATE VIEW EVENT_LOCATION ("EVENT_ID", "STREET", "TOWN") AS
SELECT A.EVENT_ID, A.STREET, A.TOWN
FROM TBLEVENTLOCATION A
WHERE A.EVENT_ID IN (SELECT EVENT_ID FROM TBLEVENTS WHERE REGION = 'South');

Try looking at
select * from user_updatable_columns where table_name = 'EVENT_LOCATION'
The columns that are updatable should indicate the table (and hence the rowid) which Oracle says is the child.
Bear in mind that, if you use multi-table clusters (not common, but possible), then different tables in the same cluster can have records with the same ROWID.
Personally, I'd recommend (a) don't use ROWID in your code anywhere and (b) if you do, then include an explicit evt.rowid evt_rowid column in the view.
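Recommendation (b) can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module as a stand-in for Oracle: SQLite's integer rowid is not the same thing as an Oracle ROWID, and the column name LOC_ROWID and the sample rows are made up for the example. The point is just that naming the rowid column in the view removes any doubt about which table it comes from.

```python
import sqlite3

def explicit_rowid_demo():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- EVENT_ID doubles as the rowid of TBLEVENTS (INTEGER PRIMARY KEY)
        CREATE TABLE TBLEVENTS (EVENT_ID INTEGER PRIMARY KEY, REGION TEXT);
        CREATE TABLE TBLEVENTLOCATION (EVENT_ID INTEGER, STREET TEXT, TOWN TEXT);
        INSERT INTO TBLEVENTS VALUES (10, 'South'), (20, 'North');
        INSERT INTO TBLEVENTLOCATION VALUES
            (20, 'High St', 'Leeds'),   -- rowid 1
            (10, 'Mill Rd', 'Dover');   -- rowid 2
        -- LOC_ROWID is a hypothetical column name: the rowid of table A, stated explicitly
        CREATE VIEW EVENT_LOCATION AS
        SELECT A.ROWID AS LOC_ROWID, A.EVENT_ID, A.STREET, A.TOWN
        FROM TBLEVENTLOCATION A
        JOIN TBLEVENTS B ON A.EVENT_ID = B.EVENT_ID
        WHERE B.REGION = 'South';
    """)
    rows = con.execute(
        "SELECT LOC_ROWID, STREET, TOWN FROM EVENT_LOCATION"
    ).fetchall()
    con.close()
    return rows
```

In this sample data the rowid of the matching TBLEVENTS row is 10 and the rowid of the matching TBLEVENTLOCATION row is 2, so the value returned by the view shows unambiguously that it came from the location table.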

Since you get ORA-01445 if none of the tables you use is key-preserved, I think it will return the rowid of one of the key-preserved tables. I don't know what will happen if several tables are key-preserved.

Related

Oracle SQL query to pull data and add new created date column

I'm quite new to Oracle SQL so please excuse me if my question actually has a relatively schoolboy answer.
So I have 2 tables, Apps and Apps_history with the definitions below.
Apps            Apps_history
ID              ID
Other           APP_ID
DATE_MODIFIED   STATUS
                DATE_MODIFIED
Apps_history has app_id, which is a foreign key to the primary key ID in Apps. Records in Apps are frequently updated and Apps_history keeps track of this. I want a new column to show when an ID in Apps was created; this can be derived from the column date_modified in Apps_history where status is equal to 'initialized'.
Currently this is what I have
select *, t.date_modified as create_date
(select app_history.date_modified
from apps
inner join app_history on
apps.id=app_history.app_id where
status='initialized') T
from apps;
But I'm getting some errors, any help to nudge me in the right direction is much appreciated,
Thanks
There are multiple ways to accomplish this. You seem to have started down the road of a correlated subquery, so to continue with that:
select a.*,
(select ah.date_modified
from app_history ah
where a.id = ah.app_id and ah.status = 'initialized'
) as created_date
from apps a;
For performance, I would recommend an index on app_history(app_id, status, date_modified).
Your subquery is not correlated. You could solve this using a correlated subquery as suggested by @Gordon. Note that it would throw an error if the correlated subquery returned multiple rows, but that's fixable.
I'd use a left join to the history table for this.
select a.*,
t.date_modified as create_date
from apps a
left join app_history t on a.id = t.app_id
and t.status = 'initialized';
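A small runnable sketch of the two approaches from the answers above (correlated subquery and left join), using Python's sqlite3 module instead of Oracle. The table name app_history follows the answers, and the sample rows are invented. With at most one 'initialized' row per app, both queries should return the same result, with NULL for apps that have no such row.

```python
import sqlite3

def created_dates():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE apps (id INTEGER PRIMARY KEY, other TEXT, date_modified TEXT);
        CREATE TABLE app_history (
            id INTEGER PRIMARY KEY, app_id INTEGER, status TEXT, date_modified TEXT);
        INSERT INTO apps VALUES
            (1, 'a', '2020-03-01'), (2, 'b', '2020-04-01'), (3, 'c', '2020-05-01');
        INSERT INTO app_history VALUES
            (1, 1, 'initialized', '2020-01-01'),
            (2, 1, 'updated',     '2020-03-01'),
            (3, 2, 'initialized', '2020-02-01');  -- app 3 has no history rows
    """)
    # Correlated subquery: the inner query refers to a.id from the outer query
    correlated = con.execute("""
        SELECT a.id,
               (SELECT ah.date_modified
                FROM app_history ah
                WHERE ah.app_id = a.id AND ah.status = 'initialized') AS create_date
        FROM apps a
        ORDER BY a.id
    """).fetchall()
    # Left join: apps without an 'initialized' row are kept, with NULL
    joined = con.execute("""
        SELECT a.id, t.date_modified AS create_date
        FROM apps a
        LEFT JOIN app_history t
               ON a.id = t.app_id AND t.status = 'initialized'
        ORDER BY a.id
    """).fetchall()
    con.close()
    return correlated, joined
```

One difference to keep in mind: if an app had several 'initialized' rows, the left join would return one output row per match, and Oracle would raise an error for the scalar subquery, whereas SQLite silently uses the first row.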

Query optimization for postgresql

I have to resolve a problem in my class about query optimization in postgresql.
I have to optimize the following query.
"The query determines the yearly loss in revenue if orders just with a quantity of more than the average quantity of all orders in the system would be taken and shipped to customers."
select sum(ol_amount) / 2.0 as avg_yearly
from orderline, (select i_id, avg(ol_quantity) as a
from item, orderline
where i_data like '%b'
and ol_i_id = i_id
group by i_id) t
where ol_i_id = t.i_id
and ol_quantity < t.a
Is it possible through indices or something else to optimize that query (Materialized view is possible as well)?
Execution plan can be found here. Thanks.
First, if you have to search from the end of the data, simply create an index on the reverse of the data:
create index on item (reverse(i_data));
Then query it like so:
select sum(ol_amount) / 2.0 as avg_yearly
from orderline, (select i_id, avg(ol_quantity) as a
from item, orderline
where reverse(i_data) like 'b%'
and ol_i_id = i_id
group by i_id) t
where ol_i_id = t.i_id
and ol_quantity < t.a
Remember that creating indexes may not speed up the query when you have to retrieve something like 30% of the table. In this case a bitmap index might help you, but as far as I remember it is not available in Postgres. So think about which table to index. It may be worth indexing the big table by ol_i_id, as the join you are making only needs to match less than 10% of the big table and the small table is loaded into RAM (I might be mistaken here, but at least in SAS a hash join means that you load the smaller table into RAM).
You may try aggregating the data before doing any joins and reusing the grouped data. I assume that you need to do everything in one query without explicitly creating any staging tables by hand. Also, I have recently been working a lot on SQL Server, so I may mix up the syntax, but give it a try. I have made many assumptions about the data and the structure of the tables, but hopefully it will work.
WITH GrOrderline AS (
-- pre-aggregate order lines per item and quantity
SELECT ol_i_id, ol_quantity, SUM(ol_amount) AS Yearly, COUNT(*) AS cnt
FROM orderline
GROUP BY ol_i_id, ol_quantity
),
AvgOrderline AS (
-- average quantity per item, weighted by the number of lines in each group
SELECT
o.ol_i_id, SUM(o.ol_quantity * o.cnt) / SUM(o.cnt) AS AvgQ
FROM GrOrderline AS o
INNER JOIN item AS i ON (o.ol_i_id = i.i_id AND RIGHT(i.i_data, 1) = 'b')
GROUP BY o.ol_i_id
)
SELECT SUM(o.Yearly)/2.0 AS avg_yearly
FROM GrOrderline o INNER JOIN AvgOrderline a ON (o.ol_i_id = a.ol_i_id AND o.ol_quantity < a.AvgQ)
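The pre-aggregation idea can be checked on a tiny data set. The sketch below uses Python's sqlite3 module rather than PostgreSQL, so a few details are assumptions made for the demo: the sample rows are invented, LIKE '%b' replaces RIGHT(i_data, 1) = 'b' (SQLite has no RIGHT function), and the average is multiplied by 1.0 because SQLite divides integers with truncation. The per-item average is recovered from the grouped rows as sum(quantity * cnt) / sum(cnt).

```python
import sqlite3

def compare_queries():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item (i_id INTEGER PRIMARY KEY, i_data TEXT);
        CREATE TABLE orderline (ol_i_id INTEGER, ol_quantity INTEGER, ol_amount INTEGER);
        INSERT INTO item VALUES (1, 'xb'), (2, 'xa');
        INSERT INTO orderline VALUES
            (1, 1, 10), (1, 2, 20), (1, 2, 30), (1, 4, 40),  -- avg quantity 2.25
            (2, 1, 100);                                     -- item 2 is filtered out
    """)
    # Query from the question
    original = con.execute("""
        SELECT SUM(ol_amount) / 2.0
        FROM orderline,
             (SELECT i_id, AVG(ol_quantity) AS a
              FROM item, orderline
              WHERE i_data LIKE '%b' AND ol_i_id = i_id
              GROUP BY i_id) t
        WHERE ol_i_id = t.i_id AND ol_quantity < t.a
    """).fetchone()[0]
    # Pre-aggregated variant: group once, then reuse the grouped rows
    rewritten = con.execute("""
        WITH GrOrderline AS (
            SELECT ol_i_id, ol_quantity, SUM(ol_amount) AS Yearly, COUNT(*) AS cnt
            FROM orderline
            GROUP BY ol_i_id, ol_quantity
        ),
        AvgOrderline AS (
            SELECT o.ol_i_id,
                   1.0 * SUM(o.ol_quantity * o.cnt) / SUM(o.cnt) AS AvgQ
            FROM GrOrderline o
            JOIN item i ON o.ol_i_id = i.i_id AND i.i_data LIKE '%b'
            GROUP BY o.ol_i_id
        )
        SELECT SUM(o.Yearly) / 2.0
        FROM GrOrderline o
        JOIN AvgOrderline a ON o.ol_i_id = a.ol_i_id AND o.ol_quantity < a.AvgQ
    """).fetchone()[0]
    con.close()
    return original, rewritten
```

Whether the rewrite is actually faster depends on how much the GROUP BY shrinks orderline, so it is worth comparing the execution plans on the real data.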

Check the query efficiency

I would like an opinion on the SQL query below: can I improve it using temp tables or something else, or is it good enough? Basically, I am just feeding the result set from each inner query to the outer one.
SELECT S.SolutionID
,S.SolutionName
,S.Enabled
FROM dbo.Solution S
WHERE s.SolutionID IN (
SELECT DISTINCT sf.SolutionID
FROM dbo.SolutionToFeature sf
WHERE sf.SolutionToFeatureID IN (
SELECT sfg.SolutionToFeatureID
FROM dbo.SolutionFeatureToUsergroup SFG
WHERE sfg.UsergroupID IN (
SELECT UG.UsergroupID
FROM dbo.Usergroup UG
WHERE ug.SiteID = @SiteID
)
)
)
It's going to depend largely on the indexes you have on those tables. Since you are only selecting data out of the Solution table, you can put everything else in an exists clause, do some proper joins, and it should perform better.
The exists clause will allow you to remove the distinct you have on the SolutionToFeature table. Distinct will cause a performance hit because it is basically creating a temp table behind the scenes to do the comparison on whether or not the record is unique against the rest of the result set. You take a pretty big hit as your tables grow.
It will look something similar to what I have below, but without sample data or anything I can't tell if it's exactly right.
Select S.SolutionID, S.SolutionName, S.Enabled
From dbo.Solution S
Where Exists (
select 1
from dbo.SolutionToFeature sf
Inner Join dbo.SolutionFeatureToUsergroup SFG on sf.SolutionToFeatureID = SFG.SolutionToFeatureID
Inner Join dbo.Usergroup UG on sfg.UsergroupID = UG.UsergroupID
Where S.SolutionID = sf.SolutionID
and UG.SiteID = @SiteID
)
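To check that the EXISTS form returns the same rows as the nested IN form, here is a minimal sketch with Python's sqlite3 module in place of SQL Server. The dbo schema prefix is dropped, the @SiteID variable becomes the named parameter :site_id, and the sample rows are invented. It only compares results and says nothing about the performance of either form on SQL Server.

```python
import sqlite3

NESTED_IN = """
    SELECT S.SolutionID, S.SolutionName, S.Enabled
    FROM Solution S
    WHERE S.SolutionID IN (
        SELECT DISTINCT sf.SolutionID
        FROM SolutionToFeature sf
        WHERE sf.SolutionToFeatureID IN (
            SELECT sfg.SolutionToFeatureID
            FROM SolutionFeatureToUsergroup sfg
            WHERE sfg.UsergroupID IN (
                SELECT ug.UsergroupID FROM Usergroup ug WHERE ug.SiteID = :site_id)))
    ORDER BY S.SolutionID
"""

WITH_EXISTS = """
    SELECT S.SolutionID, S.SolutionName, S.Enabled
    FROM Solution S
    WHERE EXISTS (
        SELECT 1
        FROM SolutionToFeature sf
        JOIN SolutionFeatureToUsergroup sfg
          ON sf.SolutionToFeatureID = sfg.SolutionToFeatureID
        JOIN Usergroup ug ON sfg.UsergroupID = ug.UsergroupID
        WHERE S.SolutionID = sf.SolutionID AND ug.SiteID = :site_id)
    ORDER BY S.SolutionID
"""

def solutions_for_site(site_id):
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Solution (SolutionID INTEGER, SolutionName TEXT, Enabled INTEGER);
        CREATE TABLE SolutionToFeature (SolutionToFeatureID INTEGER, SolutionID INTEGER);
        CREATE TABLE SolutionFeatureToUsergroup (SolutionToFeatureID INTEGER, UsergroupID INTEGER);
        CREATE TABLE Usergroup (UsergroupID INTEGER, SiteID INTEGER);
        INSERT INTO Solution VALUES (1, 'A', 1), (2, 'B', 1), (3, 'C', 0);
        INSERT INTO SolutionToFeature VALUES (1, 1), (2, 1), (3, 2), (4, 3);
        -- solution 1 reaches user group 100 through two features
        INSERT INTO SolutionFeatureToUsergroup VALUES (1, 100), (2, 100), (3, 200);
        INSERT INTO Usergroup VALUES (100, 7), (200, 8);
    """)
    params = {"site_id": site_id}
    by_in = con.execute(NESTED_IN, params).fetchall()
    by_exists = con.execute(WITH_EXISTS, params).fetchall()
    con.close()
    return by_in, by_exists
```

Solution 1 matches site 7 through two different features, yet EXISTS still yields it only once, which is why no DISTINCT is needed in that form.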

Referring to other SQL scripts from a SQL script?

I'm currently converting MS Access queries to SQL queries and noticed that an Access query can join another saved query to other tables. So I looked around, and it seems that the saved query mainly makes the outer query look cleaner, without needing all sorts of subqueries in the same script.
Something like
FROM [query name] INNER JOIN [some other table]
Is there something like this in SQL?
You are probably looking for VIEWS.
A view is basically a stored version of a SELECT query. It allows you to reference the result set without rewriting the query every time.
You can create a VIEW as a query, then reference the view in another query.
CREATE VIEW <viewname> AS <SELECT STATEMENT>
then
SELECT * FROM <viewname> INNER JOIN <other table>
Yes. They are called views.
You can create a view like
CREATE VIEW vw_some_query AS
SELECT * FROM
table_a INNER JOIN table_b ON table_a.id = table_b.id2
then you can write a select like:
SELECT * FROM vw_some_query INNER JOIN table_c ON vw_some_query.id = table_c.id3
Is there something like this in SQL?
Yes. In SQL you would probably use the WITH clause:
WITH someData AS
(
select a.col1, b.col2
from tableA a join tableB b
on (a.someKey = b.someKey)
),
...
select data1.col1, data1.col2, data2.col3
from someData data1 join tableC data2
on (data1.col1 = data2.anotherKey)
where ...
Views are ok too, but another db object to keep track of, and if using a materialized view, need to worry about refreshing snapshot table, etc. My suggestion is to use WITH along with plenty of comments where possible.
EDIT: If you find yourself asking the same question of the db over and over, then a view (or mat view) would be more appropriate. But otherwise, keep logic in the query.
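The two options discussed in these answers, a stored view and an inline WITH clause, can be compared side by side. The sketch below uses Python's sqlite3 module; the table and column names follow the WITH example above, while the view name vw_someData and the sample rows are made up. Both queries wrap the same SELECT, so they should return the same rows.

```python
import sqlite3

def view_vs_cte():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE tableA (someKey INTEGER, col1 INTEGER);
        CREATE TABLE tableB (someKey INTEGER, col2 TEXT);
        CREATE TABLE tableC (anotherKey INTEGER, col3 TEXT);
        INSERT INTO tableA VALUES (1, 10), (2, 20);
        INSERT INTO tableB VALUES (1, 'x'), (2, 'y');
        INSERT INTO tableC VALUES (10, 'p'), (30, 'q');
        -- Option 1: store the query as a view (a separate database object)
        CREATE VIEW vw_someData AS
        SELECT a.col1, b.col2
        FROM tableA a JOIN tableB b ON a.someKey = b.someKey;
    """)
    via_view = con.execute("""
        SELECT d.col1, d.col2, c.col3
        FROM vw_someData d JOIN tableC c ON d.col1 = c.anotherKey
    """).fetchall()
    # Option 2: keep the same query inline with a WITH clause
    via_cte = con.execute("""
        WITH someData AS (
            SELECT a.col1, b.col2
            FROM tableA a JOIN tableB b ON a.someKey = b.someKey
        )
        SELECT d.col1, d.col2, c.col3
        FROM someData d JOIN tableC c ON d.col1 = c.anotherKey
    """).fetchall()
    con.close()
    return via_view, via_cte
```

The view stays in the database until it is dropped, while the WITH clause exists only for the duration of the statement that contains it.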

I would like a simple example of a sub-query using T-SQL 2008

Can anyone give me a good example of a subquery using TSQL 2008?
Maximilian Mayer believes that, due to referencing MS documentation, my assertion that there is a difference between a subquery and a subSelect is incorrect. Frankly, I'd consider MSDN's "Subquery Fundamentals" a better choice. Quote:
You are making distinctions between terms that actually mean the same.
O RLY?
A subQUERY...
IE:
WHERE id IN (SELECT n.id FROM TABLE n)
OR id = (SELECT MAX(m.id) FROM TABLE m)
OR EXISTS(SELECT 1/0 FROM TABLE) --won't return a math error for division by zero
...affects the WHERE or HAVING clauses -- the filtering of data -- for a SELECT, INSERT, UPDATE or DELETE statement. The value from a subquery is never directly visible in the SELECT clause.
A subSELECT...
IE:
SELECT t.column,
(SELECT x.col FROM TABLE x) AS col2
FROM TABLE t
...does not affect the filtering of data in the main query, and the value is exposed directly in the SELECT clause. But it's only one value - you can't return two or more columns into a single column in the outer query.
A subselect is a consistent means of performing a LEFT JOIN in ANSI-89 join syntax - if there is no supporting row, the column will be null. Additionally, a non-correlated subselect will return the same value for every row of the main query.
Correlation
If a subquery or subselect is correlated, that query runs once for every record of the main query returned -- which doesn't scale well as the number of rows in the result set increases.
Derived Table/Inline View
IE:
SELECT x.*,
y.max_date,
y.num
FROM TABLE x
JOIN (SELECT t.id,
t.num,
MAX(t.date) AS max_date
FROM TABLE t
GROUP BY t.id, t.num) y ON y.id = x.id
...is a JOIN to a derived table (AKA inline view).
"Inline view" is a better term, because that is all that happens when you reference a non-materialized view -- a view is just a prepared SQL statement. There's no performance or efficiency difference if you create a view with a query like the one in the example, and reference the view name in place of the SELECT statement within the brackets of the JOIN. The example has the same information as a correlated subquery, but the performance benefit of using a join and none of the subquery detriments. And you can return more than one column, because it is a view/derived table.
Conclusion
It should be obvious why I and others make distinctions. The concept of relying on the word "subquery" to categorize any SELECT statement that isn't the main clause is fatally flawed, because it's also a specific case under a categorization of the same word (IE: subquery-subselect, subquery-subquery, subquery-join...). Now think of helping someone who says "I've got a problem with a subquery..."
Maximilian Mayer's idea of "official" documentation was written by technical writers, who often have no experience in the subject and are only summarizing what they've been told by knowledgeable people who have simplified things. Ultimately, it's just text on a page or screen -- like what you're reading now -- and the decision is up to you if the details I've laid out make sense to you.
For variety's sake, here's one in the where clause:
select
a.firstname,
a.lastname
from
employee a
where
a.companyid in (
select top 10
c.companyid
from
company c
where
c.num_employees > 1000
)
...returns all employees in the top ten companies with over 1000 employees.
SELECT
*,
(SELECT TOP 1 SomeColumn FROM dbo.SomeOtherTable)
FROM
dbo.MyTable
SELECT a.*, b.*
FROM TableA AS a
INNER JOIN
(
SELECT *
FROM TableB
) as b
ON a.id = b.id
That's a normal subquery, running once for the whole result set.
On the other hand
SELECT a.*, (SELECT b.somecolumn FROM TableB AS b WHERE b.id = a.id)
FROM TableA AS a
is a correlated subquery, running once for every row in the result set.