Does Netezza store table->procedure dependency metadata?
I am trying to compose a query that would return such data.
Technically, a procedure is dynamic, so it could build the list of tables it wants to act upon on the fly, which could be 10 tables one time and 1000 different tables the next. So trying to "parse the SP" would be (a) very, very hard and (b) not conclusive.
So you would likely be best off running the procedure and then using the query history database to see which tables were used on a given run of the procedure.
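For example, a sketch of that approach, assuming query history collection is enabled and a history database (called HISTDB here) using the version-3 history tables; the exact table and column names depend on your history configuration, so treat all of them as placeholders to verify against your own history database:

-- Sketch only: HISTDB and the history table/column names are assumptions.
-- Lists every table touched by queries submitted in a given time window.
SELECT p.SUBMITTIME,
       p.QUERYTEXT,
       a.DBNAME,
       a.TABLENAME
FROM   HISTDB.."$hist_query_prolog_3"  p
JOIN   HISTDB.."$hist_table_access_3"  a
       ON  a.NPSID         = p.NPSID
       AND a.NPSINSTANCEID = p.NPSINSTANCEID
       AND a.OPID          = p.OPID
       AND a.SESSIONID     = p.SESSIONID
WHERE  p.SUBMITTIME BETWEEN '2017-01-01 09:00:00' AND '2017-01-01 10:00:00'
ORDER  BY p.SUBMITTIME;

Narrow the time window (or filter on the session or user that ran the procedure) to isolate a single run.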
I am looking to see whether it is possible to import a custom SQL query (the kind I would write in SSMS) into SPSS (Statistical Package for the Social Sciences). I want to build syntax that runs this query and uses the result as my new dataset, which I can then feed into my scripted analysis. I see the basic capability to query a single table from SQL Server, but I would like to run a query that joins many tables; I expect it to be a bit complex, with many joins and perhaps some data transformations.
Has anybody had experience with this, or a solution to this situation?
I know I could take the query and materialize it as a table that SPSS can then connect to, but my data changes daily, so I would need a job in another application to refresh that table before my SPSS syntax pulls it. I would like to eliminate that first step by just having the query that grabs the data at the beginning of my syntax.
Ultimately I am looking to build out my SPSS syntax and schedule it in the Production Facility to run daily.
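One workaround worth noting (a sketch only, with made-up object names): keep the complex query on the server as a view. SPSS then sees it as a single table, and because a view is evaluated each time it is queried, it always reflects the current data without any separate refresh job.

-- Sketch with illustrative names: wrap the multi-join query in a view so it
-- can be read like one table and always returns current data.
CREATE VIEW dbo.vw_AnalysisDataset
AS
SELECT  c.CustomerID,
        c.Region,
        o.OrderDate,
        oi.Quantity,
        oi.UnitPrice
FROM    dbo.Customers  AS c
JOIN    dbo.Orders     AS o  ON o.CustomerID = c.CustomerID
JOIN    dbo.OrderItems AS oi ON oi.OrderID   = o.OrderID;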
I am modifying an Access 2010 application so that it uses SQL Server to run its queries. The data was transferred to the server some time ago and is used via linked tables, but that has proven a bit slow and suboptimal. So I'm trying to put the queries on the server.
I have no problem with simple queries, views, etc., and I'm using stored functions when there is a need for simple parameters (dates, IDs, ...).
But now I have a process in that application that selects a bunch of ids in the database, stores them in a local table, does a bunch of actions on them (report with sub report, print preview, print, update of the original records with the date of print when the user confirms that everything printed OK), and empties the local table if all actions succeed.
I can't simply use a SQL Server table to store the IDs, since many people use the application at the same time and the same table is used in several processes; I can't use temporary tables, since they disappear as soon as Access goes to the next action; and I can't find a way to use a local table as a parameter to server stored procedures. Basically I'm stuck.
Can anyone help? Is there a way to do that (pass a bunch of values as a table to a server stored function)? Or another process that would achieve the same result (have a table on the server specific to the current user, or a general table and somehow identify the lines belonging to current user, or anything else)?
There are 2 methods that I use. Both work very well for multi-user apps. Here are the basics. You'll need to work out the details.
Create a table (tblSessions) in SQL Server with an identity column SessID (INT NOT NULL).
Create a function in Access to generate and retrieve a new SessID.
Create a second SS table tblWork with 2 columns, SessID and YourID. Add appropriate indexes and keys. Link this table to your Access app. Then, instead of inserting the IDs for your query into an Access temp table, insert them into tblWork along with a new SessID. You can now join tblWork to other SS tables/views to use as the datasource for your reports. Be sure to add the SessID that you generated to your where clause.
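A rough sketch of those two tables (data types, keys, and the optional Created column are assumptions; adjust to taste):

-- Sketch of the supporting tables described above; adjust types and keys as needed.
CREATE TABLE dbo.tblSessions (
    SessID  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Created DATETIME NOT NULL DEFAULT GETDATE()   -- optional, handy for cleanup jobs
);

CREATE TABLE dbo.tblWork (
    SessID INT NOT NULL,
    YourID INT NOT NULL,
    CONSTRAINT PK_tblWork PRIMARY KEY (SessID, YourID),
    CONSTRAINT FK_tblWork_Sessions FOREIGN KEY (SessID)
        REFERENCES dbo.tblSessions (SessID)
);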
Create a stored procedure to generate the data for your reports. Include a parameter @YourIDList VARCHAR(MAX). Call the proc via a pass-through query and pass the list of your IDs as a comma (or whatever you prefer) separated string to @YourIDList. In the proc, split @YourIDList into a temp table. SS2016+ has a STRING_SPLIT function. For older versions, roll your own. There are plenty of examples available. Then join the temp table to the other tables you need to generate your output. Use the PT query as your report datasource, or dump it into an Access temp table and use that as your report datasource.
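A minimal sketch of that proc (the proc name, the joined table, and the column names are placeholders), using STRING_SPLIT on SQL Server 2016+:

-- Sketch only; replace dbo.YourTable and the column list with what your report needs.
CREATE PROCEDURE dbo.usp_GetReportData
    @YourIDList VARCHAR(MAX)          -- e.g. '101,205,317'
AS
BEGIN
    SET NOCOUNT ON;

    -- Split the comma-separated list into a temp table (SQL Server 2016+).
    SELECT TRY_CAST(value AS INT) AS YourID
    INTO   #IDs
    FROM   STRING_SPLIT(@YourIDList, ',');

    -- Join the temp table to the real tables to build the report output.
    SELECT t.*
    FROM   dbo.YourTable AS t
    JOIN   #IDs          AS i ON i.YourID = t.YourID;
END;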
So I have a summary I need to return to the end-user application.
It should accept 3 parameters: DateType, StartDate, and EndDate.
DateType will determine which date field I use to filter the data.
The way I accomplished this was putting all the IDs of the records for a DateType into a TEMP table and then joining my summary to the list of IDs.
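Roughly, the pattern looks like this (table and column names are simplified stand-ins):

-- Simplified illustration of the approach: pick the date column based on
-- @DateType, stage the matching IDs in a temp table, then join the summary to it.
CREATE TABLE #IDs (RecordID INT NOT NULL);

IF @DateType = 'Created'
    INSERT INTO #IDs (RecordID)
    SELECT RecordID FROM dbo.Records
    WHERE  CreatedDate BETWEEN @StartDate AND @EndDate;
ELSE IF @DateType = 'Closed'
    INSERT INTO #IDs (RecordID)
    SELECT RecordID FROM dbo.Records
    WHERE  ClosedDate BETWEEN @StartDate AND @EndDate;

SELECT s.*
FROM   dbo.SummaryView AS s
JOIN   #IDs            AS i ON i.RecordID = s.RecordID;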
This worked fine when running the query on the SQL Server that houses the data.
However, that is a replicated server, so when I compiled it into a stored proc on the server with the rest of the application data, the query slowed down, i.e. 2 seconds vs. 50 seconds.
I think the cross join from the temp table, which is created on the local SQL Server, to the tables on the replication server is causing the slowdown.
Are there any methods or techniques that I can use to get around this and build this all in one stored procedure?
If I create 3 stored procedures with their own date range, then they are fast again. However, this means maintaining multiple stored procs for the same thing.
First off, if you are running a version of SQL Server older than 2012 SP1, one problem is that users who aren't allowed to run DBCC SHOW_STATISTICS (which is most users who aren't sysadmins, see the "Permissions" section in the documentation) don't get access to statistics on remote tables. This can severely cripple the optimizer's ability to generate a good execution plan. Upgrading SQL Server or granting more permissions can help there.
If your query involves filtering or joining on a character column, make sure the remote server is flagged in the linked server options as "collation compatible". If this option is off, SQL Server can't assume strings can be compared across the servers and it will start pumping entire tables up and down just to make sure the data ends up where the comparison has to be made.
If the execution plan is as good as it gets and it's still not good enough, one general (lame) technique is to transfer all data locally first (SELECT * INTO #localtable FROM remote.db.schema.table), then run the query as a non-distributed query. Obviously, in order for this to work, the remote table cannot be "too big" and in some cases this actually has worse performance, depending on how many rows are involved. But it's always worth considering, because the optimizer does a better job with local tables.
Another approach that avoids pulling tables together across servers is packing up data in parameters to remote stored procedure calls. Entire tables can be passed as XML serialized into an NVARCHAR(MAX) parameter (neither XML columns nor table-valued parameters are supported in distributed queries). The basic idea is the same: avoid the need for the optimizer to figure out an efficient distributed query. The best approach greatly depends on your data and your query, obviously.
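A rough illustration of that idea (server, database, proc, and table names are all made up): serialize the IDs to XML on the local server, pass the string to a proc on the remote server, and shred it back into a temp table there, so the join happens entirely on the remote side.

-- Local side: build the XML and call the remote proc (requires RPC Out on the linked server).
DECLARE @ids NVARCHAR(MAX) =
    (SELECT RecordID AS [@id]
     FROM   #IDs
     FOR XML PATH('row'), ROOT('ids'));

EXEC RemoteServer.RemoteDb.dbo.usp_Summary @IdXml = @ids;

-- Remote side (inside usp_Summary): shred the string and join locally.
-- DECLARE @x XML = @IdXml;
-- SELECT n.value('@id', 'int') AS RecordID
-- INTO   #RemoteIDs
-- FROM   @x.nodes('/ids/row') AS t(n);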
@GregGalloway was able to answer the question I should have asked. I am adding a more concise question here, while maintaining the original lengthy text.
How do I use a table-valued function as the query for a partition, when the function is in a separate database from my fact and referenced dimensions?
Overview: I am building an SSAS multidimensional cube off of a single fact table in our application's data warehouse, and I want to use the result set from a table-valued function as my fact table's partition query. We are using SQL Server (and SSAS) 2014.
Condition: For each environment (Dev, Tst, Prd) there are 2 separate databases on the same server, one for the application data warehouse [DW_App], the other for custom objects [DW_Custom]. I cannot create any objects in [DW_App], but I have a lot of freedom in [DW_Custom].
Background info: I have not been able to find much information on using a TVF and partitions in this way. My thinking is that it will help streamline future development by giving me a single place to update the SQL if/when I modify the fact table.
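For context, the kind of function I mean is just a simple inline TVF along these lines (heavily simplified; the fact table and column names below are placeholders, not the real schema):

-- Simplified illustration of the TVF; the fact table and columns are placeholders.
-- The function lives in [DW_Custom] but selects from the fact table in [DW_App].
CREATE FUNCTION [dbo].[CubePartition] (@StartDate DATE, @EndDate DATE)
RETURNS TABLE
AS
RETURN
    SELECT f.*
    FROM   [DW_App].[dbo].[FactSales] AS f
    WHERE  f.TransactionDate >= @StartDate
      AND  f.TransactionDate <  @EndDate;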
So in testing out my crazy idea of using a TVF as the query for my partitions, I have run into a bit of a conundrum. I am able to use my TVF when I explicitly state the database in my FROM clause.
SELECT * FROM [DW_Custom].[dbo].[CubePartition](@StartDate, @EndDate)
However, that will not work, because the cube will be deployed in multiple environments before production, and it needs to point to a different DB for each. So I tried adding a new data source, setting my partition query to point to the new data source, and then removing the database name, i.e.:
SELECT * FROM [dbo].[CubePartition](@StartDate, @EndDate)
I get an error that
The SQL syntax is not valid. The relational database returned the following error message: Deferred prepare could not be completed. Invalid object name 'dbo.CubePartition'
If I click through this error and the subsequent warnings (about the cube not being able to process if I continue), I am able to build and deploy the cube. However, I cannot process it, because I get an error that one of my dimensions does not exist.
Looking into the query that was generated, it is clear that it is querying my dimensions as well as the fact table, which do not exist inside [DW_Custom], which explains that error perfectly well.
So I guess 2 questions:
Is it possible to query another DB (on the same server) from inside of an SSAS partition query?
If not, is there any way I can use a variable as the database name in the query, and update that variable based on the project configuration (Dev, Tst, Prd)?
Bonus question: Is the reason that I cannot find much about doing it this way because it is an obviously bad idea that I am overlooking, and if so, why?
How about creating a second SSAS Data Source pointing to the DW_Custom database (or whatever it's called in the particular environment you're deploying to)? Then when you deploy from Dev to Prod, you need only change that connection string. When you create your partitions, specify the DW_Custom data source and then specify the query without the database name:
SELECT * FROM [dbo].[CubePartition](@StartDate, @EndDate)
As long as the query plan for that table-valued function is efficient compared to a plain SELECT statement, then I don't see a problem with that.
I'm working on a project where I need to make changes in the database to create a history of project managers and statuses, and to use that history instead of the existing manager and status columns in the projects table. The problem is that there are several applications that may or may not be using those fields in the old table, and it would be almost impossible to find all the applications that use those fields and make them use the new tables.
So I have been wondering: is there any way to create a stored procedure in a DB2 database that can make the values in the manager and status columns of the projects table come from a query against the new history tables?
Sorry for my bad English.
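One possible direction (a sketch only, with guessed table and column names, and using a view rather than a stored procedure): rename the old table and expose a view under its former name, so applications that still read the manager and status columns from PROJECTS transparently get the current values from the new history tables.

-- Sketch only: all names are guesses; each history row is assumed to carry a
-- PROJECT_ID and a VALID_FROM timestamp marking when it took effect.
RENAME TABLE PROJECTS TO PROJECTS_BASE;

CREATE VIEW PROJECTS AS
SELECT p.PROJECT_ID,
       p.PROJECT_NAME,
       (SELECT m.MANAGER
        FROM   MANAGER_HISTORY m
        WHERE  m.PROJECT_ID = p.PROJECT_ID
          AND  m.VALID_FROM = (SELECT MAX(m2.VALID_FROM)
                               FROM   MANAGER_HISTORY m2
                               WHERE  m2.PROJECT_ID = p.PROJECT_ID)) AS MANAGER,
       (SELECT s.STATUS
        FROM   STATUS_HISTORY s
        WHERE  s.PROJECT_ID = p.PROJECT_ID
          AND  s.VALID_FROM = (SELECT MAX(s2.VALID_FROM)
                               FROM   STATUS_HISTORY s2
                               WHERE  s2.PROJECT_ID = p.PROJECT_ID)) AS STATUS
FROM   PROJECTS_BASE p;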