Do databases besides Postgres have features comparable to foreign data wrappers?

I'm very excited by several of the more recently-added Postgres features, such as foreign data wrappers. I'm not aware of any other RDBMS having this feature, but before I try to make the case to my main client that they should begin preferring Postgres over their current cocktail of RDBMSs, and include in my case that no other database can do this, I'd like to verify that.
I've been unable to find evidence of any other database supporting SQL/MED, and I have found things like this short note stating that Oracle does not support SQL/MED.
The main thing that gives me doubt is a statement on http://wiki.postgresql.org/wiki/SQL/MED:
SQL/MED is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database.
If FDWs are based on SQL/MED, and SQL/MED is an open standard, then it seems likely that other RDBMSs have implemented it too.
TL;DR:
Does any database besides Postgres support SQL/MED?

IBM DB2 claims compliance with SQL/MED (including full FDW API);
MySQL's FEDERATED storage engine can connect to another MySQL database, but NOT to other RDBMSs;
MariaDB's CONNECT engine allows access to various file formats (CSV, XML, Excel, etc.), gives access to "any" ODBC data source (Oracle, DB2, SQL Server, etc.) and can access data stored in the MyISAM and InnoDB storage engines;
Farrago has some of it too;
PostgreSQL implements parts of it (notably it does not implement routine mappings, and has a simplified FDW API). Foreign tables have been readable since PG 9.1 and writable since 9.3; prior to that there was DBI-Link.
The PostgreSQL community has plenty of nice FDWs, such as the NoSQL ones (couchdb_fdw, mongo_fdw, redis_fdw), Multicorn (for writing the wrapper itself in Python instead of C), and the wild PG-Strom (which uses the GPU for some operations!)

SQL Server has the concept of Linked Servers (http://technet.microsoft.com/en-us/library/ms188279.aspx), which allows you to connect to external data sources (Oracle, other SQL instances, Active Directory, File system data via the Indexing Service provider, etc.) and, if you really needed to, you can create your own Providers that can be used by a SQL Server Linked Server.
Another option within SQL Server is the CLR, in which you can write code to retrieve data from web services or other data sources as needed.
While this may not technically be "SQL/MED", it seems to accomplish the same thing.
Distributed query using a local table joined to a 4-part linked server name. I think that in this case the remotetable filter might not be applied until after the entire table is pulled local (the documentation is fuzzy on this and I've found articles with conflicting opinions):
SELECT *
FROM LocalDB.dbo.table t
INNER JOIN LinkedServer1.RemoteDB.dbo.remotetable r on t.val = r.val
WHERE r.val < 1000
;
Using OpenQuery, remotetable filter is applied on the remote server, as long as the filter is passed into the OpenQuery 2nd parameter:
SELECT *
FROM LocalDB.dbo.table t
INNER JOIN OPENQUERY(LinkedServer1, 'SELECT * FROM RemoteDB.dbo.remotetable r WHERE r.val < 1000') r on t.val = r.val
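Because the remote statement is passed to OPENQUERY as a string literal, any single quotes inside it have to be doubled. The following is a minimal Python sketch of a helper that assembles such a statement; the function name build_openquery and the extra name <> 'n/a' filter are made up for illustration, and only values you trust should be spliced into SQL text this way.

```python
def build_openquery(linked_server, remote_sql, alias):
    """Return an OPENQUERY(...) table expression for use in a T-SQL FROM clause.

    The remote SQL is embedded as a T-SQL string literal, so every single
    quote in it is doubled. The linked server name is wrapped in brackets.
    """
    escaped_sql = remote_sql.replace("'", "''")
    quoted_server = "[" + linked_server.replace("]", "]]") + "]"
    return f"OPENQUERY({quoted_server}, '{escaped_sql}') AS {alias}"


# Hypothetical remote query: the filter travels inside the string, so it is
# evaluated by the remote server rather than after the rows arrive locally.
remote = "SELECT * FROM RemoteDB.dbo.remotetable WHERE val < 1000 AND name <> 'n/a'"

statement = (
    "SELECT * FROM LocalDB.dbo.[table] t INNER JOIN "
    + build_openquery("LinkedServer1", remote, "r")
    + " ON t.val = r.val"
)
print(statement)
```

The resulting text would then be run as dynamic SQL on the local SQL Server.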

Data Migration Assistant - Cross-database queries

I am running an assessment using the Azure SQL Data Migration Assistant (3.4.3948.1). In my initial assessment, I had a function that was calling fn_varbintohexstr, so I fixed it (read: removed the function). I also deleted all our synonyms.
Now I run the assessment more times and it continues to give the 'cross-database queries' error but without listing any more specifics. How can I find out what particular objects it means? Or is it possible that it has somehow cached my result and I need to invalidate it somehow?
This means you have programming objects in that database that reference another database. For example a query like this:
SELECT * FROM Database1.dbo.Table1
One of the options you have is to import those external objects into your database and change the three- and four-part name references, which SQL Azure does not support.
You can also use CREATE EXTERNAL DATA SOURCE and CREATE EXTERNAL TABLE on SQL Azure to query tables that belong to other databases that you have to migrate to SQL Azure too.
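To track down which objects contain such references, one rough approach is to export the object definitions as text and scan them for three- and four-part names. The Python sketch below is only a heuristic under stated assumptions: the function name find_cross_db_refs and the database name MyDb are hypothetical, and a plain regex will also flag schema.table.column references and text inside string literals, so the hits need manual review. SQL Server's own dependency metadata (for example sys.sql_expression_dependencies) is likely a more reliable source if you can query it.

```python
import re

# One identifier part: either [bracketed] or a bare word.
PART = r"(?:\[[^\]]+\]|\w+)"
# Exactly three or four dot-separated parts, e.g. Database1.dbo.Table1.
MULTIPART = re.compile(rf"(?<![\w.\]]){PART}(?:\.{PART}){{2,3}}(?![\w.\[])")


def find_cross_db_refs(sql_text, current_db):
    """Return multi-part names in sql_text that point outside current_db."""
    refs = []
    for match in MULTIPART.finditer(sql_text):
        parts = [p.strip("[]") for p in re.findall(PART, match.group(0))]
        # In db.schema.object and server.db.schema.object the database
        # is always the third part counted from the end.
        database = parts[-3]
        if len(parts) == 4 or database.lower() != current_db.lower():
            refs.append(match.group(0))
    return refs


definition = "CREATE VIEW dbo.v AS SELECT * FROM Database1.dbo.Table1"
print(find_cross_db_refs(definition, "MyDb"))
```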

Generic SQL that both Access and ODBC/Oracle can understand

I have a MS Access query that is based on a linked ODBC table (Oracle).
I'm troubleshooting the poor performance of the query here: Access not properly translating TOP predicate to ODBC/Oracle SQL.
SELECT ri.*
FROM user1_road_insp AS ri
WHERE ri.insp_id = (
select
top 1 ri2.insp_id
from
user1_road_insp ri2
where
ri2.road_id = ri.road_id
and year(insp_date) between [Enter a START year:] and [Enter a END year:]
order by
ri2.insp_date desc,
ri2.length desc,
ri2.insp_id
);
The documentation says:
When you spot a problem, you can try to resolve it by changing the local query. This is often difficult to do successfully, but you may be able to add criteria that are sent to the server, reducing the number of rows retrieved for local processing.
In many cases you will find that, despite your best efforts, Office Access still retrieves some entire tables unnecessarily and performs final query processing locally.
However, it's occurred to me that I don't really understand what sort of SQL I should be writing to make both Access and ODBC/Oracle happy.
Should I be writing some sort of generic SQL that Access can understand in a local query AND that can be easily translated to ODBC/Oracle SQL? Is generic SQL a real thing?
What kind of SQL does the ODBC driver use? It depends. MS Access typically has three types of external data connections, and each works with a different SQL dialect over the ODBC API.
1. Linked tables, which act like local tables but are ODBC-connected data sources and are not stored locally. Once they are incorporated in an Access app, these tables can only be queried with MS Access' SQL dialect. They can be joined with local tables or even with backend tables from other sources.
Hence TOP is available in MS Access but not in Oracle. You are essentially using Access SQL to manipulate Oracle data. ODBC serves as the origin point of the data while Access' Jet/ACE SQL engine does the processing and holds the result set in cached memory.
2. Pass-through queries, which do not see local tables or anything else in the local app's environment. Such queries use the SQL dialect of the connected database, here Oracle.
Hence TOP is NOT available and double quotes are allowed around column identifiers. Such quoting would fail in MS Access. Essentially, you are using Oracle SQL to manipulate Oracle data in an Access app. You can take the output of the sqlout.txt log and run it in a pass-through query that is ODBC-connected to your Oracle database.
3. ADO/DAO recordsets, which are run entirely via code such as VBA, are direct connections to data sources, and use the connected database's dialect.
Here, you are using Oracle SQL to manipulate Oracle data in an Access app via the ODBC API.
For each of these types, you will have to connect to a backend ODBC data source. You do not even need to use the GUI; you can use Access' object library to create linked tables (see DoCmd.TransferDatabase) and pass-through querydefs (see QueryDef.Connect or .Execute).
I suspect the sqlout.txt log you see is the translation of the ODBC calls into the backend's native dialect.
To build on @Parfait's point #1:
From Microsoft Access Developer's Guide to SQL Server by Mary Chipman and Andy Baron:
Optimizing Access Queries:
There's a common misconception that the Jet engine always retrieves all the data in linked SQL Server tables and then processes the data locally. This is not usually true. Jet is perfectly capable of sending efficient queries to SQL Server over ODBC and retrieving only the rows required. However, in some cases, Jet will in fact be forced to fetch all the data in certain tables first and then process it. You should be aware of when you are forcing Jet to do this and be sure that it is justified. The following are some general guidelines to follow when creating your Access queries:
Using expressions that can't be evaluated by the server will cause Jet to retrieve all the data required to evaluate those expressions locally. The impact of using Access-specific expressions, such as domain aggregate functions, Access financial functions, or custom VBA functions will vary depending on where in your query the expressions are used. Using such an expression in the SELECT clause will usually not cause a problem because no extra data will be returned. However, if the expression is in the WHERE clause, that criterion cannot be applied on the server, and all the data evaluated by the expression will have to be returned.
With multiple criteria, as many as possible will be processed on the server. This means that even if you use criteria that you know include functions that will need to be processed by Jet, adding other criteria that can be handled by the server will reduce the number of records that Jet has to process. Adding criteria on indexed columns is especially helpful.
Query syntax that includes an Access-specific extension to SQL, not supported by the ODBC driver, may force processing to be done on the client by Access. For example, even though SELECT TOP 5 PERCENT is now supported by SQL Server, it is not supported by the ODBC driver. If you use that syntax in an Access query, Jet will need to retrieve all the records and calculate which ones are in the top 5 percent. On the other hand, even though crosstab queries are specific to Access, Jet will translate them into simple GROUP BY queries and fetch just the required data in one trip to the server unless problematic criteria is used.
Heterogeneous joins between local and remote tables or between remote tables that are in different data sources will, of course, have to be processed by Jet after the source data is retrieved. However, if the remote join field is indexed and the table is large, Jet will often use the index to retrieve only the required rows by making multiple calls to the remote table, one for each row required.
Jet allows you to mix data types within [typo - fix later] of UNION queries and within expressions, but SQL server doesn't. Such mixing of data types will force processing to be done locally.
Multiple outer joins in one query will be processed locally.
The most important factor is reducing the total number of records being fetched. Jet will retrieve multiple batches of records in the background until the result set is complete, so even though you may seem to get results back immediately, a continuing load is being placed on the server for large result sets.
Note: this book is quite old (published in 2000) and is in reference to Jet Engine. I imagine things might be slightly different in newer versions of Access which use ACE, although I don't have a source to back this up.
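One way to apply the advice about adding criteria the server can handle is to replace a function wrapped around a column, such as year(insp_date) between two years, with a plain range comparison on the column itself. The Python sketch below uses an in-memory SQLite table as a stand-in for the Oracle table (the road_insp table and its rows are invented) and only demonstrates that the two predicates select the same rows. Whether Access actually sends either form to the server depends on the driver and version, so treat this as an illustration rather than a guarantee.

```python
import sqlite3
from datetime import date


def year_range_bounds(start_year, end_year):
    """Return [lower, upper) ISO dates covering start_year..end_year inclusive."""
    return date(start_year, 1, 1).isoformat(), date(end_year + 1, 1, 1).isoformat()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE road_insp (insp_id INTEGER, insp_date TEXT)")
conn.executemany(
    "INSERT INTO road_insp VALUES (?, ?)",
    [(1, "2014-12-31"), (2, "2015-01-01"), (3, "2016-06-15"),
     (4, "2017-12-31"), (5, "2018-01-01")],
)

# Function applied to the column: it has to be evaluated for every row.
by_function = conn.execute(
    "SELECT insp_id FROM road_insp "
    "WHERE CAST(strftime('%Y', insp_date) AS INTEGER) BETWEEN ? AND ? "
    "ORDER BY insp_id",
    (2015, 2017),
).fetchall()

# Bare column compared against constants: a simple, index-friendly range.
lower, upper = year_range_bounds(2015, 2017)
by_range = conn.execute(
    "SELECT insp_id FROM road_insp "
    "WHERE insp_date >= ? AND insp_date < ? ORDER BY insp_id",
    (lower, upper),
).fetchall()

print(by_function == by_range)
```

The half-open upper bound (strictly less than 1 January of the following year) avoids having to know the last representable moment of the end year.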

How can I perform the SQL query against different database systems?

I have a question about how to perform database queries against different database systems.
For example, I have the following SQL query string:
SELECT A.F1, A.F2, B.F3, B.F4
FROM TableA A, TableB B
WHERE A.ID=B.ID AND B.ID=xyz
Is there any solution that lets me perform the above query when:
1. TableA is from an Oracle database, and TableB is from an Oracle database on another instance
2. TableA is from a SQL Server database, and TableB is from an Oracle database
3. TableA is from an Oracle database, and TableB is from a SQL Server database
4. TableA is from a SQL Server database, and TableB is from a SQL Server database on another SQL Server instance
I know that for situation #1 I can use the ORACLE DATABASE LINK feature (also maybe #3). But is there any common solution which can address all of the scenarios above? For instance, maybe there is another scenario that I want to join two tables from MySQL and SQL Server databases.
For coding I am using C#/.NET, any recommendation is welcome, including joining the data in the code.
Thanks!
This is known as a federated query.
You can use SQL Server's federation ability (linked servers) and run the query in SQL Server, or you can use Oracle's federation ability (Oracle Heterogeneous Services) and run the query in Oracle.
Either way you need to pick a database server, make the other (foreign) database known to that server, then execute the query on that server.
Keep in mind that you cannot expect good performance from this, as in most cases the executing server loads all the source records and joins them locally.
Otherwise you need to write your own 'federation' ability in your app. This is non-trivial: data types don't match exactly between vendors, so you need to build in some smarts. Also, if you are joining particularly large tables, you'd better make sure your algorithm is optimised or you'll run out of memory. Beyond that, the federated query ability in existing products (SQL Server, Oracle, Cognos etc.) is the result of very large companies doing a lot of development.
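For small result sets, joining in application code can be as simple as the following Python sketch. It is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the two vendor databases, the helper names fetch_rows and hash_join are made up, and the sample rows are invented. The same shape carries over to C# with two connections and a dictionary. Note that the filter is sent to each source separately so that only matching rows are fetched.

```python
import sqlite3


def fetch_rows(conn, sql, params=()):
    """Run a query and return the rows as dicts keyed by column name."""
    cur = conn.execute(sql, params)
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur]


def hash_join(left_rows, right_rows, key):
    """Inner join: index the left side by key, then probe with the right side."""
    index = {}
    for row in left_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for right in right_rows:
        for left in index.get(right[key], []):
            joined.append({**left, **right})
    return joined


# Stand-ins for the two source systems.
db_a = sqlite3.connect(":memory:")
db_a.execute("CREATE TABLE TableA (ID INTEGER, F1 TEXT, F2 TEXT)")
db_a.executemany("INSERT INTO TableA VALUES (?, ?, ?)",
                 [(1, "a1", "a2"), (2, "b1", "b2")])

db_b = sqlite3.connect(":memory:")
db_b.execute("CREATE TABLE TableB (ID INTEGER, F3 TEXT, F4 TEXT)")
db_b.executemany("INSERT INTO TableB VALUES (?, ?, ?)",
                 [(2, "b3", "b4"), (3, "c3", "c4")])

wanted_id = 2
# Push the filter down to each source instead of filtering after the join.
rows_a = fetch_rows(db_a, "SELECT ID, F1, F2 FROM TableA WHERE ID = ?", (wanted_id,))
rows_b = fetch_rows(db_b, "SELECT ID, F3, F4 FROM TableB WHERE ID = ?", (wanted_id,))

result = hash_join(rows_a, rows_b, "ID")
print(result)
```

This keeps one whole side in memory, which is exactly the limitation mentioned above for large tables.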
Can you tell us how many records you expect to join, and are the various source database servers mostly fixed or is this for an ad hoc query tool?

I'd like to merge data sets using an SQL query from different servers (one Sybase the other MS)

Is that possible? I'm using Aquadesk and I can't get it to work. The tables have a matching unique identifier and I'm wondering if I can match them up in some way.
What you need, I think, are "federated servers" (federated databases); you can look this up.
The basic idea is that you can create (catalog) a table in your local database that actually resides in another database (or on another server, or even in another DB system, though that depends on your SQL system and version). That is definitely a question for your DBAs.
You get a local table name like "MYSQL"."PERSONS" for a table that resides remotely (e.g. "BASE"."PERSDATA"), so you can use it in a
SELECT *
FROM "LOCALNAME"."USERS" usr
JOIN "MYSQL"."PERSONS" pers
ON usr.user_id = pers.id
So you can select and join across different databases (and servers).
I have only used this with IBM/UDB, but it works really well and has fair performance (although that depends heavily on your statement).

Listing all tables in a database

Is there a SQL command that will list all the tables in a database and which is provider independent (works on MSSQLServer, Oracle, MySQL)?
The closest option is to query the INFORMATION_SCHEMA for tables.
SELECT *
FROM INFORMATION_SCHEMA.Tables
WHERE table_schema = 'mydatabase';
The INFORMATION_SCHEMA is part of standard SQL, but not all vendors support it. As far as I know, the only RDBMS vendors that support it are:
MySQL
PostgreSQL
Microsoft SQL Server 2000/2005/2008
Some brands of database, e.g. Oracle, IBM DB2, Firebird, Derby, etc. have similar "catalog" views that give you an interface where you can query metadata on the system. But the names of the views, the columns they contain, and their relationships don't match the ANSI SQL standard for INFORMATION_SCHEMA. In other words, similar information is available, but the query you would use to get that information is different.
(footnote: the catalog views in IBM DB2 UDB for System i are different from the catalog views in IBM DB2 UDB for Windows/*NIX -- so much for the Universal in UDB!)
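Some other brands (e.g. SQLite) don't offer INFORMATION_SCHEMA or comparable catalog views at all; SQLite instead exposes its metadata through the sqlite_master table and PRAGMA statements.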
No. They all love doing it their own little way.
No, the SQL standard does not constrain where the table names are listed (if at all), so you'll have to perform different statements (typically SELECT statements on specially named tables) depending on the SQL engine you're dealing with.
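In practice this usually ends up as a small lookup of per-engine catalog queries in application code. The Python sketch below is one possible shape under stated assumptions: the dictionary keys and the list_tables helper are made up for illustration, the schema filters (DATABASE(), 'public', base tables only) are choices you may need to change, and only the SQLite entry is actually exercised here.

```python
import sqlite3

# One catalog query per engine. Only the SQLite entry is run in this sketch;
# the others are the usual catalog locations for each product.
LIST_TABLES_SQL = {
    "mysql": "SELECT table_name FROM information_schema.tables "
             "WHERE table_schema = DATABASE()",
    "postgresql": "SELECT table_name FROM information_schema.tables "
                  "WHERE table_schema = 'public'",
    "sqlserver": "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
                 "WHERE TABLE_TYPE = 'BASE TABLE'",
    "oracle": "SELECT table_name FROM user_tables",
    "sqlite": "SELECT name FROM sqlite_master WHERE type = 'table'",
}


def list_tables(conn, engine):
    """Return the sorted table names visible through conn for the given engine."""
    try:
        sql = LIST_TABLES_SQL[engine]
    except KeyError:
        raise ValueError(f"no catalog query known for engine {engine!r}")
    # sqlite3 connections offer execute() directly; other DB-API drivers
    # generally need conn.cursor().execute(sql) instead.
    return sorted(row[0] for row in conn.execute(sql))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER)")

tables = list_tables(conn, "sqlite")
print(tables)
```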
If you are OK with using a non-SQL approach and you have an ODBC driver for the database and it implements the SQLTables entry-point, you possibly might get the information you want!
pjjH
details on the API at:
http://msdn.microsoft.com/en-us/library/ms711831.aspx