I have a view that is now joining two other views with some extra tables.
It's very slow.
My experience tells me it's because views are not indexed by default. I tried to create an index on each of them, but that's not possible since they have self joins or inner queries.
My question is:
It appears to me that, in general, joining views is not recommended. So, in short, is there no way to reuse code from one view in another? Example: view A calculates the percentage, and view B calculates something else that uses the percentage from view A plus other information from other tables/views. What would be the best approach? Do you really have to replicate the code from view A in view B so that it uses the original table's indexes?
Views (simplified view, to show the issue):
View A (calculates the percentage):
SELECT dbo.tblPopAgeGrp.RevID, dbo.tblPopAgeGrp.VarID, dbo.tblPopAgeGrp.LocID,
dbo.tblPopAgeGrp.TimeID, dbo.tblPopAgeGrp.AgeID, tPAGT.AgeID AS AgeTotal,
100 * dbo.tblPopAgeGrp.PopMale / tPAGT.PopMale AS PopMalePerc,
100 * dbo.tblPopAgeGrp.PopFemale / tPAGT.PopFemale AS PopFemalePerc,
100 * dbo.tblPopAgeGrp.PopTotal / tPAGT.PopTotal AS PopTotalPerc
FROM dbo.tblPopAgeGrp
INNER JOIN dbo.tblPopAgeGrp tPAGT
ON dbo.tblPopAgeGrp.GroupID = tPAGT.GroupID
AND dbo.tblPopAgeGrp.AgeID = 700
View A by itself takes a long time to execute, since there are so many records. However, in view B the records are filtered according to the VersionID.
View B (gets the percentage from view A with additional info from another view):
SELECT vPAGP.VersionID,
vPAGP.LocationID AS LocID,
vPAGP.PopTotalPerc AS pPopTot,
vPAGP.PopMalePerc AS pMale,
vPAGP.PopFemalePerc AS pFemale,
vPAGPSR.PopMaleSexRatio AS SexRatio,
vPAGPSR.PopFemaleSexRatio AS FemRatio
FROM dbo.vwA AS vPAGP
INNER JOIN dbo.vwOther AS vPAGPSR
ON vPAGPSR.GroupID = vPAGP.GroupID
WHERE vPAGP.VersionID=10
Executing View A without filters takes about 10 minutes. Executing it for VersionID=10 only takes 10 seconds. The view vwOther executes very quickly.
Thanks!
You are not correct when you state "It appears to me that in general the join of views is not recommended."
Views can be combined with other views and will perform well, provided that all JOINs are optimizable and backed by appropriate indexes, and that any filtering done within the view is likewise optimizable and indexed.
A view based on other views should perform as well as the same query written to factor out the views. If you want further help, please post the definition of all views involved in your problem.
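The point can be checked on a small scale. The following is only a sketch, and it runs on SQLite through Python rather than on SQL Server, so it illustrates the principle and says nothing about the asker's actual plans. The table `pop`, the index `idx_pop_version` and the views `vw_a` and `vw_b` are hypothetical stand-ins. It shows a filter applied to a view built on another view still reaching the base table's index, and returning the same rows as the hand-inlined query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pop (version_id INTEGER, group_id INTEGER,
                  pop_total REAL, grp_total REAL);
CREATE INDEX idx_pop_version ON pop (version_id);

-- vw_a computes a percentage; vw_b is built on top of vw_a.
CREATE VIEW vw_a AS
  SELECT version_id, group_id,
         100.0 * pop_total / grp_total AS pop_total_perc
  FROM pop;
CREATE VIEW vw_b AS
  SELECT version_id, group_id, pop_total_perc AS p_pop_tot
  FROM vw_a;
""")
conn.executemany(
    "INSERT INTO pop VALUES (?, ?, ?, ?)",
    [(v, g, 10.0 * g, 100.0) for v in range(1, 21) for g in range(1, 6)],
)

# Ask the planner how it would run a filtered query against the outer view.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM vw_b WHERE version_id = 10"
).fetchall()
plan_text = " ".join(row[3] for row in plan)  # column 3 holds the plan detail
print(plan_text)

# Same rows whether we go through both views or inline the logic by hand.
via_views = conn.execute(
    "SELECT * FROM vw_b WHERE version_id = 10 ORDER BY group_id"
).fetchall()
inlined = conn.execute(
    "SELECT version_id, group_id, 100.0 * pop_total / grp_total "
    "FROM pop WHERE version_id = 10 ORDER BY group_id"
).fetchall()
```

On SQL Server the equivalent check would be to compare the actual execution plans of the view-based query and the hand-inlined query.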
Related
I have to build a view by fetching data from 7-8 tables, and some fields are calculated from other calculated fields. For example, the first calculation is if(indicator='H', amount*20, amount) as deliAmt. And then
If(isnull(deliAmt),0 else deliAmt)
This is just an example but for this view i have 5-6 such calculations required.
Also, the final view has around 7-8 main tables, plus other tables that supply the columns for these calculations. In total there will be 57 columns.
Please guide what is the best approach to implement this.
To write a view that selects data from 7-8 tables, write the SQL that selects from the 7-8 tables and put it "in a view".
As for the other part of your question, how to do IF-like logic: in Snowflake, use the IFF function. Your example if(indicator='H', amount*20, amount) as deliAmt
would be written
IFF(indicator='H', amount*20, amount) as deliAmt
and If(isnull(deliAmt),0 else deliAmt) would be:
IFF(isnull(deliAmt), 0, deliAmt)
which can also be done with ZEROIFNULL:
ZEROIFNULL(deliAmt)
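IFF and ZEROIFNULL are Snowflake functions. The same logic can be written with the standard CASE and COALESCE expressions, and a calculation that depends on an earlier calculated field can be layered with a subquery. The sketch below runs that portable form on SQLite through Python purely so it can be tried without a Snowflake account. The table `deliveries` and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deliveries (id INTEGER PRIMARY KEY, indicator TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO deliveries (id, indicator, amount) VALUES (?, ?, ?)",
    [(1, "H", 5.0), (2, "X", 5.0), (3, "H", None)],
)

query = """
SELECT id,
       COALESCE(deli_amt, 0) AS deli_amt_clean   -- second step: NULL becomes 0
FROM (
    -- first step: the IFF(indicator='H', amount*20, amount) logic as CASE
    SELECT id,
           CASE WHEN indicator = 'H' THEN amount * 20 ELSE amount END AS deli_amt
    FROM deliveries
) AS step1
ORDER BY id
"""
results = conn.execute(query).fetchall()
print(results)
```

Each further calculated field that builds on `deli_amt_clean` can be added as another layer in the same way.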
I am using Excel's Power Query to test a SQL query that I will eventually use to build a pivot table that stays up to date with the database. The database is accessed through ODBC.
The problem is not related to Power Query itself but simply the SQL request.
Here I am trying to select all bills from the "facturation" table (French database) that are from the current year (2021). I am naming this selected data FACTURES_ANNEE_COURANTE.
Then I want to also select some attributes of those items from 2021 in order to display them in the pivot table, but only on the selection that I just made in order to only select (and show) bills from the current year.
select * as FACTURES_ANNEE_COURANTE
from facturation
where year(date_fact)=2021 limit 3, select date_fact from FACTURES_ANNEE_COURANTE
I only have very basic knowledge of SQL, and the second part of my request does not seem to work (the first part does). I'm trying to do this so I can show these specific attributes in the pivot table. What's the proper way to select attributes only from my first selection of rows from the facturation table?
Thank you for your help.
A major advantage of Power Query is being able to generate complex logic without needing to be able to code in SQL. So I would abandon writing hand coded SQL - there's no need.
Before PQ came out I had 2 decades of experience writing complex SQL. After PQ came out I've written almost none - the SQL code generated by PQ is good enough, you can easily add complex transformations that are hard/impossible in SQL, and overall developing and debugging is 10x easier.
For your scenario, I would build a PQ query just using the navigation to select your facturation table. Then I would use the PQ UI to Filter (instead of a SQL where clause) and Choose Columns (to restrict the columns returned).
Whatever other transformations you need are likely met using a button in the PQ UI.
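If the SQL route is still wanted, the "select from my first selection" idea from the question is usually expressed as a named subquery (a common table expression). This is only a sketch: it runs on SQLite through Python, so the year test uses strftime instead of the year() function from the question, and whether the ODBC source accepts WITH is an assumption to verify. The `montant` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facturation (id INTEGER PRIMARY KEY, date_fact TEXT, montant REAL)"
)
conn.executemany(
    "INSERT INTO facturation (date_fact, montant) VALUES (?, ?)",
    [("2020-12-31", 10.0), ("2021-01-15", 20.0), ("2021-06-30", 30.0)],
)

query = """
WITH factures_annee_courante AS (
    -- first selection: every bill from 2021
    SELECT * FROM facturation
    WHERE strftime('%Y', date_fact) = '2021'
)
-- second selection: only the wanted attributes of those bills
SELECT date_fact
FROM factures_annee_courante
ORDER BY date_fact
"""
dates = conn.execute(query).fetchall()
print(dates)
```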
Suppose that I have a view in BigQuery, e.g. [views.myview] defined as follows:
SELECT
Id AS Id,
MAX(Time) AS MostRecentTime
FROM
[dataset.mytable]
GROUP BY
Id
And then another query that queries that view:
SELECT
*
FROM
[dataset.mytable] tbl
JOIN [views.myview] mview ON tbl.Time = mview.MostRecentTime
Is there a way to automatically generate a query where the [views.myview] in the second query is replaced with the query that generates it - basically "unpacking" the views so you have just one query that queries tables directly?
(The underlying problem: I have a query which queries many different views, including several layers of views-querying-other-views, and I want to put this query in my application. I don't want a user to be able to mess with the results of the query by changing the definition of one of the views, so I want to put the whole query in a fixed form in the application.)
This is not possible to do 'automatically'. You could try writing a script or some code to do this through the BigQuery APIs - https://cloud.google.com/bigquery/docs/managing-views
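As a rough sketch of what such a script might look like, the function below textually replaces each bracketed legacy-SQL view reference with its defining query in parentheses, repeating until nothing changes so that views built on views are also unpacked. Everything here is an assumption for illustration: `inline_views` is a hypothetical helper, the view definitions are passed in as a plain dictionary (in practice they would have to be fetched through the BigQuery API first), and plain text substitution does not cope with standard-SQL backtick names, comments or string literals that happen to contain brackets.

```python
import re

# Matches legacy-SQL table references such as [views.myview].
TABLE_REF = re.compile(r"\[([^\[\]]+)\]")

def inline_views(sql, view_definitions, max_depth=10):
    """Replace every [dataset.view] reference with its query in parentheses."""
    def substitute(match):
        name = match.group(1)
        if name in view_definitions:
            return "(" + view_definitions[name].strip() + ")"
        return match.group(0)  # a real table: leave the reference alone

    for _ in range(max_depth):
        expanded = TABLE_REF.sub(substitute, sql)
        if expanded == sql:  # nothing left to unpack
            return expanded
        sql = expanded
    raise ValueError("views nested deeper than max_depth (possible cycle)")

# Hypothetical input mirroring the question.
views = {
    "views.myview": (
        "SELECT Id AS Id, MAX(Time) AS MostRecentTime "
        "FROM [dataset.mytable] GROUP BY Id"
    ),
}
query = (
    "SELECT * FROM [dataset.mytable] tbl "
    "JOIN [views.myview] mview ON tbl.Time = mview.MostRecentTime"
)
unpacked = inline_views(query, views)
print(unpacked)
```

The existing alias (`mview`) ends up directly after the inserted subquery, so the rest of the query keeps working unchanged.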
Have a very odd situation. For various reasons we have a bunch of tables where this was done:
TableA was renamed to TableASource and a view was created called TableA
TableB was renamed to TableBSource and a view was created called TableB
This all works fine and has done for a fair while. Yesterday I added a new field to TableASource, TableBSource (and the others). This was a field called 'createDate', a smallDateTime, no nulls, with a default value of getDate(). You would think this should have zero impact on anything. First thing today, users are saying they can see these dates in place of the intended data. E.g. let's say we had a page extracting 10 fields out of the view called TableA: instead of showing these 10 fields, it's now showing 9 correctly and randomly showing the value of the new createDate field that was added to TableASource (if it matters, it is placing it in the second field on the page).
I quickly dropped all these new createDate fields and the issue went away. How can this possibly happen? How can adding a field to TableASource affect the results of the TableA view?
Using SQLServer2008 r2
In response to the extra questions below, here is the view called 'bayTrainCourses', which has been described above as TableA:
SELECT btc.course_id, btc.course_name, btc.course_status, btc.course_expiry,
btc.course_startdate, btc.course_enddate, btc.course_type, btc.site_id,
btc.channelid, btc.exam_type, btc.enroll_type, btc.last_updated, btc.compid,
btc.TOC, btc.rdone, btc.autoqualify, btc.courseIntroId, btc.coursePurposeId,
btc.courseBackgroundId, btc.courseObjectivesId,
intro.[content] AS course_intro, purpose.[content] AS course_purpose,
background.[content] AS course_background,
objectives.[content] AS course_objectives
FROM dbo.bayTrainCoursesSource AS btc
LEFT OUTER JOIN dbo.RTE AS intro ON btc.courseIntroId = intro.pk
LEFT OUTER JOIN dbo.RTE AS purpose ON btc.coursePurposeId = purpose.pk
LEFT OUTER JOIN dbo.RTE AS background ON btc.courseBackgroundId = background.pk
LEFT OUTER JOIN dbo.RTE AS objectives ON btc.courseObjectivesId = objectives.pk
The select is:
select * from baytraincourses....
EDITING - due to some trial and error. It turns out that the trial was actually the advice from JohnS below: recreate the view. I renamed the view to baytraincourses_broken. Then right click > script view as > create to > then I changed 'baytraincourses_broken' to 'baytraincourses' to recreate another version with the original name. Now the page works.
I am not a big fan of views. It's very rare that I create them, to be honest; I am working with code that I did not originally write. Do I really have to re-create views every time I add a new column? Surely not. How can I deal with this?
After adding the new field to TableASource you need to recreate the view. (i.e. DROP VIEW TableA ... then CREATE VIEW TableA ...).
Using SQLite I have a large database split into years:
DB_2006_thru_2007.sq3
DB_2008_thru_2009.sq3
DB_current.sq3
They all have a single table called hist_tbl with two columns (key, data).
The requirements are:
1. to be able to access all the data at once.
2. inserts only go to the current version.
3. the data will continue to be split as time goes on.
4. access is through a single program that has exclusive access.
5. the program can accept some setup SQL but needs to run the same when accessing one database or multiple databases.
To view them cohesively I do the following (really in a program but command line shown here):
sqlite3 DB_current.sq3
attach database 'DB_2006_thru_2007.sq3' as hist1;
attach database 'DB_2008_thru_2009.sq3' as hist2;
create temp view hist_tbl as
select * from hist1.hist_tbl union
select * from hist2.hist_tbl union
select * from main.hist_tbl;
There is now a temp.hist_tbl (view) and a main.hist_tbl (table).
When I select without qualifying the table I get the data thru the view.
This is desirable since I can use my canned SQL queries against either the joined view or the individual databases, depending on how I set up. Additionally, I can always insert into main.hist_tbl.
Question 1: What are the downsides?
Question 2: Is there a better way?
Thanks in advance.
Question 1: What are the downsides?
You have to update the view EVERY. FISCAL. year.
Question 2: Is there a better way?
Add a date column so you can search for things within a given timespan, like a fiscal year.
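If the split files have to stay, the yearly maintenance of the view can at least be scripted. The sketch below is one possible approach, not a tested production pattern: a hypothetical helper `build_union_view` reads the attached schemas from PRAGMA database_list and rebuilds the temp view over all of them, keeping the UNION from the question. It assumes every attached database contains a hist_tbl with the same columns, and it uses in-memory databases in place of the .sq3 files.

```python
import sqlite3

def build_union_view(conn, table="hist_tbl"):
    """(Re)create temp.<table> as a UNION over <table> in main and all attached DBs."""
    # PRAGMA database_list rows are (seq, schema name, file path).
    schemas = [row[1] for row in conn.execute("PRAGMA database_list")
               if row[1] != "temp"]
    selects = [f'SELECT * FROM "{schema}"."{table}"' for schema in schemas]
    conn.execute(f'DROP VIEW IF EXISTS temp."{table}"')
    conn.execute(f'CREATE TEMP VIEW "{table}" AS ' + " UNION ".join(selects))
    return schemas

# Stand-ins for DB_current.sq3 and the two history files.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("ATTACH DATABASE ':memory:' AS hist1")
conn.execute("ATTACH DATABASE ':memory:' AS hist2")
for schema, row in [("hist1", ("k2006", "a")),
                    ("hist2", ("k2008", "b")),
                    ("main", ("k2021", "c"))]:
    conn.execute(f"CREATE TABLE {schema}.hist_tbl (key TEXT, data TEXT)")
    conn.execute(f"INSERT INTO {schema}.hist_tbl VALUES (?, ?)", row)

schemas = build_union_view(conn)

# Unqualified reads go through the temp view; inserts target main explicitly.
count_before = conn.execute("SELECT COUNT(*) FROM hist_tbl").fetchone()[0]
conn.execute("INSERT INTO main.hist_tbl VALUES (?, ?)", ("k2022", "d"))
count_after = conn.execute("SELECT COUNT(*) FROM hist_tbl").fetchone()[0]
print(schemas, count_before, count_after)
```

Calling `build_union_view` again after attaching a new history file picks it up without editing any SQL by hand.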