Running SQL via SQLWorkbench versus via Tableau Prep

I have developed some SQL that reads from a Redshift table, does some manipulation (in particular, LISTAGG on some fields), and then writes to another Redshift table.
When I run the SQL using SQLWorkbench it executes successfully. When I embed it in a Tableau Prep flow (as "Complex SQL") I get several of these errors: "System error: AqlProcessor evaluation failed: [Amazon][Support] (40550) Invalid character value for cast specification." Presumably these relate to my treatment of data types. What I don't understand is what is so different about the two environments that it would cause different results like this? Is it because SQLWorkbench and Tableau Prep use different SQL interpreters? Or is my question too broad to even speculate about without going through the actual code?

Best guess is that Tableau, which has knowledge of the DDL, is adding some CAST() operations to the SQL. SQLWorkbench is simpler and is pushing the SQL to Redshift as written. This is based on there being no explicit CASTs in your SQL but an error message that identifies a CAST().
Look at stl_querytext for these two queries and see if they are being given to Redshift differently by the two benches. I suspect this will give you some clues to go on.
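For example, a minimal sketch of that lookup (assuming both runs happened recently; stl_query holds per-query metadata and stl_querytext stores the submitted SQL in 200-character segments):
SELECT q.query, q.starttime, t.sequence, t.text
FROM stl_query q
JOIN stl_querytext t ON q.query = t.query
WHERE q.starttime > DATEADD(hour, -1, GETDATE())
ORDER BY q.query, t.sequence;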
If there are no differences in the SQL then the issue may be with user / connection differences and more info will likely be needed about the issue.

Related

How do I copy data from one Azure database table to a different Azure database table and also convert data types?

I have to copy data from one table to another; the tables are held in two different databases within Azure. I did a quick search for answers to this, and whilst a query seems fairly straightforward, i.e.
INSERT INTO table1 (make, model, type, serial)
SELECT the_make, the_model, the_type, ref_no
FROM database2.dbo.table2
I encountered issues because I'm using Azure.
Msg 40515, Level 15, State 1, Line 16 Reference to database and/or
server name in 'database2.dbo.table2' is not supported in this version of
SQL Server.
The above issue led me to the Cross-Database Queries articles. My requirements are a little more complicated than some of the scenarios provided and I need some help in making it work.
I also need to convert some columns, such as ref_no, which is a 'string', to an 'int', and then copy the value to the 'serial' column.
My question is, what is the best way to create a script for this that allows me to reference both databases without any errors, copy the data, and convert the columns at the same time? I tried the simple approach of exporting the data and importing it while editing the column mappings, but I found it wasn't very good and was causing problems all over the place.
Any guidance is appreciated on this.
You're getting this error because there's no linked server by default. You'll need to add it in order to access the secondary db server. Here's a link about how to do it:
https://www.sqlshack.com/create-linked-server-azure-sql-database/
In terms of the transformation, it depends on many factors, e.g. the number of rows, frequency, etc.
Usually the best alternative is to use an external (ETL) tool such as SSIS or Azure Data Factory, because you can schedule its execution and get the status of each run.
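For the cross-database read itself, here is a hedged sketch of the elastic query setup (run in the first database; the server, credential, and column definitions below are placeholders you'd adjust), with TRY_CAST doing the string-to-int conversion so invalid values become NULL instead of raising an error:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';
CREATE DATABASE SCOPED CREDENTIAL Db2Cred
    WITH IDENTITY = '<username>', SECRET = '<password>';
CREATE EXTERNAL DATA SOURCE Db2Source
    WITH (TYPE = RDBMS,
          LOCATION = '<server>.database.windows.net',
          DATABASE_NAME = 'database2',
          CREDENTIAL = Db2Cred);
CREATE EXTERNAL TABLE dbo.table2 (
    the_make nvarchar(50),   -- placeholder types; match the remote table
    the_model nvarchar(50),
    the_type nvarchar(50),
    ref_no nvarchar(20)
) WITH (DATA_SOURCE = Db2Source);

INSERT INTO table1 (make, model, type, serial)
SELECT the_make, the_model, the_type, TRY_CAST(ref_no AS int)
FROM dbo.table2;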

Generic SQL that both Access and ODBC/Oracle can understand

I have a MS Access query that is based on a linked ODBC table (Oracle).
I'm troubleshooting the poor performance of the query here: Access not properly translating TOP predicate to ODBC/Oracle SQL.
SELECT ri.*
FROM user1_road_insp AS ri
WHERE ri.insp_id = (
    select
        top 1 ri2.insp_id
    from
        user1_road_insp ri2
    where
        ri2.road_id = ri.road_id
        and year(insp_date) between [Enter a START year:] and [Enter a END year:]
    order by
        ri2.insp_date desc,
        ri2.length desc,
        ri2.insp_id
);
The documentation says:
When you spot a problem, you can try to resolve it by changing the local query. This is often difficult to do successfully, but you may be able to add criteria that are sent to the server, reducing the number of rows retrieved for local processing.
In many cases you will find that, despite your best efforts, Office Access still retrieves some entire tables unnecessarily and performs final query processing locally.
However, it's occurred to me that I don't really understand what sort of SQL I should be writing to make both Access and ODBC/Oracle happy.
Should I be writing some sort of generic SQL that Access can understand in a local query AND that can be easily translated to ODBC/Oracle SQL? Is generic SQL a real thing?
What kind of SQL does the ODBC driver use? It depends: MS Access typically has three types of external data connections, each interfacing with a different SQL dialect through the ODBC API.
Linked tables that act like local tables but are ODBC-connected data sources, not stored locally. Once they are incorporated in an Access app, these tables can only use MS Access' SQL dialect. They can be joined with local tables or even other backend tables from other sources.
Hence why TOP is available in MS Access and not Oracle: you are essentially using Access SQL to manipulate Oracle data. ODBC serves as the origin point of the data, while Access' Jet/ACE SQL engine does the processing and result-set viewing in cached memory.
Pass-through queries that do not see local tables or anything else in local app's environment. Such queries use the SQL dialect of the connected database here being Oracle.
Hence why TOP is NOT available in Oracle and double quotes are allowed in column identifiers (such quoting would fail in MS Access): essentially, you are using Oracle SQL to manipulate Oracle data in an Access app, as the side-by-side sketch below shows. You can take the output of the sqlout.txt log and run it in a pass-through query ODBC-connected to your Oracle database.
ADO/DAO recordsets that are run entirely via code such as VBA, are direct connections to data sources, and use the connected database's dialect.
Here, you are using Oracle SQL to manipulate Oracle data in an Access app via the ODBC API.
In each of these types, you will have to connect to a backend ODBC data source. You do not even need to use the GUI; you can use Access' object library to create linked tables (see DoCmd.TransferDatabase) and pass-through querydefs (see QueryDef.Connect or .Execute).
I suspect the sqlout.txt log you see contains translations of the ODBC calls into the native dialect.
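To make the dialect split concrete, here is a side-by-side sketch using the table from the question (FETCH FIRST requires Oracle 12c or later; older versions need the ROWNUM form):
-- Access SQL (linked table, Jet/ACE dialect):
SELECT TOP 1 insp_id FROM user1_road_insp ORDER BY insp_date DESC;

-- Oracle SQL (pass-through query, 12c+):
SELECT insp_id FROM user1_road_insp ORDER BY insp_date DESC FETCH FIRST 1 ROW ONLY;

-- Oracle SQL (pass-through query, pre-12c):
SELECT insp_id
FROM (SELECT insp_id FROM user1_road_insp ORDER BY insp_date DESC)
WHERE ROWNUM = 1;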
To build on @Parfait's point #1:
From Microsoft Access Developer's Guide to SQL Server by Mary Chipman and Andy Baron:
Optimizing Access Queries:
There's a common misconception that the Jet engine always retrieves all the data in linked SQL Server tables and then processes the data locally. This is not usually true. Jet is perfectly capable of sending efficient queries to SQL Server over ODBC and retrieving only the rows required. However, in some cases, Jet will in fact be forced to fetch all the data in certain tables first and then process it. You should be aware of when you are forcing Jet to do this and be sure that it is justified. The following are some general guidelines to follow when creating your Access queries:
Using expressions that can't be evaluated by the server will cause Jet to retrieve all the data required to evaluate those expressions locally. The impact of using Access-specific expressions, such as domain aggregate functions, Access financial functions, or custom VBA functions will vary depending on where in your query the expressions are used. Using such an expression in the SELECT clause will usually not cause a problem because no extra data will be returned. However, if the expression is in the WHERE clause, that criterion cannot be applied on the server, and all the data evaluated by the expression will have to be returned.
With multiple criteria, as many as possible will be processed on the server. This means that even if you use criteria that you know include functions that will need to be processed by Jet, adding other criteria that can be handled by the server will reduce the number of records that Jet has to process. Adding criteria on indexed columns is especially helpful.
Query syntax that includes an Access-specific extension to SQL, not supported by the ODBC driver, may force processing to be done on the client by Access. For example, even though SELECT TOP 5 PERCENT is now supported by SQL Server, it is not supported by the ODBC driver. If you use that syntax in an Access query, Jet will need to retrieve all the records and calculate which ones are in the top 5 percent. On the other hand, even though crosstab queries are specific to Access, Jet will translate them into simple GROUP BY queries and fetch just the required data in one trip to the server unless problematic criteria is used.
Heterogeneous joins between local and remote tables or between remote tables that are in different data sources will, of course, have to be processed by Jet after the source data is retrieved. However, if the remote join field is indexed and the table is large, Jet will often use the index to retrieve only the required rows by making multiple calls to the remote table, one for each row required.
Jet allows you to mix data types within [sic] of UNION queries and within expressions, but SQL Server doesn't. Such mixing of data types will force processing to be done locally.
Multiple outer joins in one query will be processed locally.
The most important factor is reducing the total number of records being fetched. Jet will retrieve multiple batches of records in the background until the result set is complete, so even though you may seem to get results back immediately, a continuing load is being placed on the server for large result sets.
Note: this book is quite old (published in 2000) and is in reference to Jet Engine. I imagine things might be slightly different in newer versions of Access which use ACE, although I don't have a source to back this up.
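As a small illustration of the WHERE-clause guideline above (MyRate is a hypothetical VBA function and linked_orders a hypothetical linked table):
-- Jet must fetch every row to evaluate the VBA function locally:
SELECT * FROM linked_orders WHERE MyRate([amount]) > 100;

-- Adding a server-evaluable criterion lets the server filter first,
-- so Jet only applies MyRate to the rows that come back:
SELECT * FROM linked_orders
WHERE order_date >= #1/1/2020# AND MyRate([amount]) > 100;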

SSAS Multidimensional - Table-Valued Function as a Query for a Partition

@GregGalloway was able to answer the question I should have asked. I am adding a more concise question here, while maintaining the original lengthy text.
How do I use a table-valued function as the query for a partition, when the function is in a separate database from my fact and referenced dimensions?
Overview: I am building an SSAS multidimensional cube off of a single fact table in our application's data warehouse, and I want to use the result set from a table-valued function as my fact table's partition query. We are using SQL Server (and SSAS) 2014.
Condition: For each environment (Dev,Tst,Prd) there are 2 separate databases on the same server, one for the application data warehouse [DW_App], the other for custom objects [DW_Custom]. I cannot create any objects in [DW_App], but have a lot of freedom in [DW_Custom]
Background info: I have not been able to find much information on using a TVF and partitions in this way. My thinking is that it will help streamline future development by giving me a single place to update the SQL if/when I modify the fact table.
So in testing out my crazy idea of using a TVF as the query for my partitions, I have run into a bit of a conundrum: I am able to use my TVF when I explicitly state the database in my FROM clause.
SELECT * FROM [DW_Custom].[dbo].[CubePartition](@StartDate, @EndDate)
However, that will not work, because the cube will be deployed in multiple environments before production, and it needs to point to a different DB in each. So I tried adding a new data source, setting my partition query to point to the new data source, and then removing the database name, i.e.:
SELECT * FROM [dbo].[CubePartition](@StartDate, @EndDate)
I get an error that
The SQL syntax is not valid. The relational database returned the following error message: Deferred prepare could not be completed. Invalid object name 'dbo.CubePartition'
If I click through this error and the subsequent warnings that the cube will not be able to process if I continue, I am able to build and deploy the cube. However, I cannot process it, because I get an error that one of my dimensions does not exist.
Looking into the query that was generated, it is clear that it is querying my dimensions as well as the fact table, and those do not exist inside [DW_Custom], which explains that error perfectly well.
So I guess 2 questions:
Is it possible to query another DB (on the same server) from inside of an SSAS partition query?
If not, is there any way I can use a variable as the database name in the query and update that variable based on the project configuration (Dev, Tst, Prd)?
Bonus question: is the reason I cannot find much about doing it this way that it is an obviously bad idea I am overlooking, and if so, why?
How about creating a second SSAS Data Source pointing to the DW_Custom database (or whatever it's called in the particular environment you're deploying to)? Then when you deploy from Dev to Prod, you need only change that connection string. When you create your partitions, then specify the DW_Custom data source and then specify the query without database name:
SELECT * FROM [dbo].[CubePartition](@StartDate, @EndDate)
As long as the query plan for that table-valued function is efficient compared to a plain SELECT statement, then I don't see a problem with that.
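For completeness, a sketch of what such a function in [DW_Custom] might look like (FactSales and its date column are placeholder names); because both databases sit on the same server, the three-part name inside the function resolves in every environment:
USE [DW_Custom];
GO
CREATE FUNCTION dbo.CubePartition (@StartDate datetime, @EndDate datetime)
RETURNS TABLE
AS
RETURN
(
    SELECT f.*                        -- placeholder column list
    FROM [DW_App].dbo.FactSales f     -- hypothetical fact table
    WHERE f.TransactionDate >= @StartDate
      AND f.TransactionDate < @EndDate
);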

Dynamic SQL for updating any table!

How do I create a dynamic SQL statement that will update any table given as a parameter? I believe I couldn't use "SET Column1 = Value ..." here, as the columns will differ according to the table.
This is an extremely poor idea. You can create massive havoc with your database doing such a thing, and I can't imagine any DBA who would allow it. You need to know the specifics of a table to insert into it properly: you need to be aware of which fields are required and which fields have default values. You need to know what kind of information and data types should be in each field so that you do not send bad data to the database. One proc that does it all cannot properly check these things and certainly can't ever be properly tested. Further, it means permissions must be granted at the table level, which is a poor choice for internal security as well as for resisting SQL injection attacks.
Could you provide more context? Are you executing arbitrary SQL statements from within scripts, such as Perl, PHP, or Python? Are you just trying to get a command-line .sql script working? What database server are you working on?
The solution can vary widely depending on your situation.
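If, despite the warnings above, you still need the mechanics, a minimal T-SQL sketch uses QUOTENAME for the identifiers and sp_executesql so the values travel as typed parameters (the table, column, and id key column below are example inputs, and this still bypasses the per-table validation described above):
DECLARE @table sysname = N'Customers',
        @column sysname = N'City',
        @id int = 42,
        @value nvarchar(100) = N'Lisbon',
        @sql nvarchar(max);

-- QUOTENAME guards the identifiers; the values are bound as parameters
SET @sql = N'UPDATE ' + QUOTENAME(@table) +
           N' SET ' + QUOTENAME(@column) + N' = @value' +
           N' WHERE id = @id;';

EXEC sp_executesql @sql,
     N'@value nvarchar(100), @id int',
     @value = @value, @id = @id;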

Microsoft T-SQL to Oracle SQL translation

I've worked with T-SQL for years, but I've just moved to an organisation that is going to require writing some Oracle stuff, probably just simple CRUD operations, at least until I find my feet. I'm not going to be migrating databases from one to the other, simply interacting with existing Oracle databases from an application development perspective. Is there a tool or utility available to easily translate T-SQL into Oracle SQL? A keyword mapper is the sort of thing I'm looking for.
P.S. I'm too lazy to RTFM, besides it's not going to be a big part of my role so I just want something to get me up to speed a little faster.
The language differences listed so far are trivial compared to the logical differences. Anyone can look up NVL. What's hard to look up is:
DDL
In SQL Server you manipulate your schema anywhere, anytime, with little or no fuss.
In Oracle, we don't like DDL in stored procedures, so you have to jump through hoops: you need to use EXECUTE IMMEDIATE to perform a DDL operation.
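For example, a minimal sketch (the table name is arbitrary; note that DDL in Oracle also commits the current transaction implicitly):
BEGIN
   EXECUTE IMMEDIATE 'CREATE TABLE work_results (id NUMBER, label VARCHAR2(50))';
END;
/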
Temp Tables
In SQL Server, when the logic becomes a bit tough, the common thing is to shortcut the SQL, resolve an intermediate result to a temp table, and then do the next step using that temp table.
MSSS makes it very easy to do this.
In Oracle we don't like that: by forcing an intermediate result you completely prevent the optimizer from finding a shortcut for you. But if you must stop halfway and persist the intermediate results, Oracle wants you to make the temp table in advance, not on the fly.
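The Oracle counterpart of an ad-hoc #temp table is a global temporary table created once, up front (a sketch; the name and columns are placeholders):
CREATE GLOBAL TEMPORARY TABLE intermediate_results (
    id    NUMBER,
    total NUMBER
) ON COMMIT DELETE ROWS;  -- rows are private to each session and vanish on commit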
Locks
In MSSS you worry about locking: you have NOLOCK hints to apply to DML, and you have lock escalation to reduce the count of locks.
In Oracle we don't worry about these in that way.
Read Committed
Until recently, MSSS didn't fully handle Read Committed isolation, so you worried about dirty reads.
Oracle has been that way for decades.
etc
MSSS has no concept of bitmap indexes, IOTs, table clusters, single-table hash clusters, non-unique indexes enforcing unique constraints...
I get the impression most answers focus on migrating an entire database or just point to some differences between T-SQL and PL/SQL. I recently had the same problem. The Oracle database exists, but I need to convert a whole load of T-SQL scripts to PL/SQL.
I installed Oracle SQL Developer and ran the Translation Scratch Editor (Tools > Migration > Scratch Editor).
Then, just enter your T-SQL, choose the correct translation in the dropdown-list (it should default to 'T-SQL to PL/SQL'), and convert it.
I have two things to mention.
1) When I worked on Oracle 8, you could not do "SELECT @Result"; you had to use the dummy table instead, as in "SELECT @Result FROM dual". Not sure if that ridiculousness still exists.
2) In the Oracle world they seem to love cursors, and you'd better read up on them; they use them all the time, AFAICS.
Good luck and enjoy; it is not that different to MS SQL. Thankfully, I do not have to work with it anymore, and I am back in the warm comfort of MS tools.
If you replace your ISNULL and NVL nonsense with COALESCE, it'll work in T-SQL and PL/SQL!
It's not trivial to map them back and forth, so I doubt there's a tool that does it automatically. But this link might help you out: http://vyaskn.tripod.com/oracle_sql_server_differences_equivalents.htm
The most important differences for plain T-SQL are:
NVL replaces ISNULL
SYSDATE replaces GETDATE()
CONVERT is not supported
Identity columns must be replaced with sequences <-- not technically T- or PL/ but just SQL
Note: I assume you do not use the deprecated SQL Server *= syntax for joins.
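A couple of those mappings as runnable sketches (emp and emp_seq are placeholder names):
-- T-SQL:
SELECT ISNULL(name, 'n/a') AS name, GETDATE() AS now FROM emp;
-- Oracle:
SELECT NVL(name, 'n/a') AS name, SYSDATE AS now FROM emp;

-- T-SQL identity column vs. Oracle sequence:
CREATE SEQUENCE emp_seq;
INSERT INTO emp (id, name) VALUES (emp_seq.NEXTVAL, 'Jones');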
@jodonell: The table you link to is a bit outdated; Oracle has become somewhat more standards-compliant after 9i, supporting things like CASE and ANSI outer joins.
I have done a few SQL Server to Oracle migrations. There is no way to migrate without rewriting the backend code; there are too many differences between the two databases and, more importantly, between the two mindsets of the programmers. Many managers think the two are interchangeable; I have had managers ask me to copy the stored procedures from SQL Server and compile them in Oracle, without a clue! Toad is by far the best tool on the market for supporting an Oracle application. SQL Developer is OK but was disappointing compared to Toad. I hope that Oracle will catch their product up to Toad one day, but it is not there yet. Have a good day :) Chances are, if you are migrating to Oracle it is for a reason, and in order to meet that requirement you will need to rewrite the backend code, or you will have many issues.
In Oracle SQL Developer, there is a tool called Translation Scratch Editor. You can find it from Tools > Migration.
The Oracle SQL Developer is a free download from Oracle and it is an easy install.
If you're doing a one-off conversion, rather than trying to support two versions, you must look at Oracle Migration Workbench. This tool works with Oracle's SQL Developer (which you really should have if you are working with Oracle). It does a conversion of the schema, data, and some of the T-SQL to PL/SQL. Knowing both well, I found it did about an 80% job: good enough to make it worthwhile to convert the bulk of the procedures, then hand-convert the remaining, tougher, unknown parts.
Not cheap ($995) but this tool works great: http://www.swissql.com/products/sql-translator/sql-converter.html
A few people here have mentioned converting back and forth. I do not know of a tool to convert from MSSQL to Oracle, but I used the free MS tool to convert an Oracle db to MSSQL, and it worked for me; it converted a large db with no problems I can recall. It is similar to the Access-to-MSSQL tool that MS also provides for free. Enjoy.
jOOQ has a publicly available, free translator, which can be accessed from the website here: https://www.jooq.org/translate
It supports DML, DDL, and a few procedural syntax elements. If you want to run the translation locally via command line, a license can be purchased and the command line works as follows:
$ java -cp jooq-3.11.9.jar org.jooq.ParserCLI -t ORACLE -s "SELECT substring('abcde', 2, 3)"
select substr('abcde', 2, 3) from dual;
See: https://www.jooq.org/doc/latest/manual/sql-building/sql-parser/sql-parser-cli
(Disclaimer, I work for the vendor)