Does Informix support several threads creating the same-named temporary table? - eclipselink

I am in the process of fixing EclipseLink's InformixPlatform database support class to work with our Informix 11.70 installation.
One of the contracts that needs work is expressed by the following code. The documentation explains what is needed:
/**
* INTERNAL:
* Indicates whether the platform supports local temporary tables.
* "Local" means that several threads may create
* temporary tables with the same name.
[snip]
*/
public boolean supportsLocalTempTables() {
return true; // is this correct?
}
/**
* INTERNAL:
* Indicates whether the platform supports global temporary tables.
* "Global" means that an attempt to create temporary table with the same
* name for the second time results in exception.
[snip]
* Note that this method is ignored in case supportsLocalTempTables() returns true.
*/
public boolean supportsGlobalTempTables() {
return false; // is this correct?
}
My sense is that I should return true from my implementation of supportsLocalTempTables(), because I think that Informix does indeed support the ability to create, for example, a temporary table named FRED from session 1, and a temporary table also named FRED from session 2. Is my assumption correct?
I consulted the Informix 11.70 InfoCenter topic but did not see anything specific there.

Yes. Temporary tables in Informix are local to the session and visible only to the session.
In addition, when the session closes, all its temporary tables are dropped.
So, in different sessions you can create temporary tables with the same name.
There is no way to share access to temporary tables between sessions.
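A quick sketch (the table name and columns here are invented) of the behavior being relied on:
-- Session 1
CREATE TEMP TABLE fred (id INT, note VARCHAR(50)) WITH NO LOG;
INSERT INTO fred VALUES (1, 'visible only to session 1');

-- Session 2, at the same time, with no name collision
CREATE TEMP TABLE fred (id INT) WITH NO LOG;

-- Each session sees only its own fred, and both copies are dropped
-- automatically when their session ends.
That matches the "local" behavior described in the supportsLocalTempTables() Javadoc, so returning true there (with supportsGlobalTempTables() then being ignored) lines up with it.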

Related

Object Variable in a query

I am using SSDT 2017 and I am working on a solution that basically gets a full result set from a query into a variable (1 column only: AccountID), and I need to include the values in that object variable in a query, something like this:
"SELECT * FROM dbo.account WHERE AccountID IN (" + #AccountIDObjectVariable + ")"
I tried with an expression but I get an error, so I am not sure if there's a better way. I also tried a For Each Loop Container, but since I have millions of records in the object variable I don't think that's the best way.
Any idea?
It doesn't work that way, where "it" covers a host of things.
The SSIS data types are primitive types (boolean, date, numbers) or Object. The only supported operations for an Object are a null check and enumeration.
SSIS parameterization is only for equality-based substitutions. There is no concept of a list data type in SQL, so there's no analog in SSIS.
I have millions of records in the object variable
Even if you converted your list to a string and used string concatenation, the next problem you're going to run into is the string length limit of 4000 characters.
What is the way?
Let's reset the problem: You have a non-trivial set of identities from a source system. That set of ids needs to be used as the basis for a subsequent extract.
Are the source of identities and the actual data on the same server?
While you can empty the ocean with a teaspoon, it's not the correct tool. Same holds true here. Move the query that identifies the recordset to be extracted into a filter condition for your source.
i.e., load the dataset into #AccountIDObjectVariable with:
SELECT
OA.AccountId
FROM
dbo.OutstandingAccount AS OA;
Extract that isn't working
"SELECT * FROM dbo.account WHERE AccountID IN (" + #AccountIDObjectVariable + ")"
is rewritten as
SELECT * FROM dbo.account AS A WHERE EXISTS (SELECT * FROM dbo.OutstandingAccount AS OA WHERE OA.AccountID = A.AccountID);
There are two reasonable approaches for solving this
Pull it all
If the source ids list and the source table are of similar orders of magnitude, it might be easier to just bring it all down and use the account id generating query in a Lookup Task. If AccountID exists, then it's the data you want. Yes you pulled more than you wanted but you likely would have burned more cycles and complexity trying to selectively pull what you wanted.
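Roughly, and reusing the hypothetical dbo.OutstandingAccount table from above, the two queries the "pull it all" approach would use look like this:
-- Data flow source: pull the whole account table
SELECT A.* FROM dbo.account AS A;

-- Lookup component query: the set of ids you actually care about
SELECT OA.AccountID FROM dbo.OutstandingAccount AS OA;

-- Rows whose AccountID matches in the Lookup are the ones to keep;
-- redirect or ignore the non-matching rows.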
Push and pull
This approach is going to work for SQL Server and I have no idea about any other database. Well, I suppose Sybase would be the same given database paternity.
Open SSMS and create a global temporary table on the database where dbo.account lives. Do not disconnect from SSMS.
IF OBJECT_ID('tempdb..##SO_66961235') IS NOT NULL
BEGIN
DROP TABLE ##SO_66961235;
END
GO
CREATE TABLE ##SO_66961235
(
AccountID int NOT NULL
);
Modify the Connection manager to set the RetainSameConnection Property to true for the database connection to dbo.account
Execute SQL Task - Make Temp Table
Use the connection to the account database and the above query. This will ensure the table exists when the package actually runs.
DataFlow Load IDs
In the dataflow properties, set DelayValidation to True
Use your source query to generate the list of IDs and select the temporary table as the destination. You might need to have a second connection manager to this system running and pointed at tempdb; it's been a long time since I've done this. The same rule about RetainSameConnection will hold true, though.
When this data flow completes, then we will have a temporary table on the data source server that we can reference.
Dataflow 2 Get Data
Again, DelayValidation to true.
Source will be a query
SELECT * FROM dbo.account AS A WHERE EXISTS (SELECT * FROM ##SO_66961235 AS OA WHERE OA.AccountID = A.AccountID);
What's with all the delay validation?
When the SSIS package starts, the first thing it does is ensure all the pieces are in place for it to run successfully, and not only that the pieces are in place but that the shape of the data is still the same. A temporary table won't exist when the package starts, so the package will fail with a VS_NEEDSNEWMETADATA error. Setting DelayValidation tells SSIS not to check metadata until the component actually gets the signal to start. Since we defined the precursor Execute SQL Task to create the table, the validation should then succeed.
I used global temporary tables here. You can use local scoped temporary tables but it makes the already fiddly design process much more so. Were it me, I'd have a package parameter controlling a boolean that uses a global temp table for development sessions and local temp table for actual run-time operations but that's beyond the scope of this question.

Use schema name in a JOIN in Redshift

Our database is set up so that each of our clients is hosted in a separate schema (the organizational level above a table in Postgres/Redshift, not the database structure definition). We have a table in the public schema that has metadata about our clients. I want to use some of this metadata in a view I am creating.
Say I have 2 tables:
public.clients
  name_of_schema_for_client
  metadata_of_client
client_name.usage_info
  whatever columns (this isn't that important)
I basically want to get the metadata for the client I'm running my query on and use it later:
SELECT *
FROM client_name.usage_info
INNER JOIN public.clients
ON CURRENT_SCHEMA() = public.clients.name_of_schema_for_client
This is not possible because CURRENT_SCHEMA() is a leader-node function. This function returns an error if it references a user-created table, an STL or STV system table, or an SVV or SVL system view. (see https://docs.aws.amazon.com/redshift/latest/dg/r_CURRENT_SCHEMA.html)
Is there another way to do this? Or am I just barking up the wrong tree?
Your best bet is probably to just manually set the search path within the transaction from whatever source you call this from. See this:
https://docs.aws.amazon.com/redshift/latest/dg/r_search_path.html
Let's say you only want to use the table matching your best client:
set search_path to your_best_clients_schema, whatever_other_schemas_you_need_for_this;
Then you can just do:
select * from clients;
This will resolve to the first clients table found on the search path, which, conveniently, you just set to your client's schema!
You can manually revert afterwards if need be, or just reset the connection to return to the default; up to you.
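For instance, something along these lines should put the session back to the stock Redshift default afterwards:
reset search_path;   -- back to the default ('$user', public)
show search_path;    -- confirm what is now in effect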

SQL dot notation

Can someone please explain to me how SQL Server uses dot notation to identify
the location of a table? I always thought that the location is Database.dbo.Table
But I see code that has something else in place of dbo, something like:
DBName.something.Table
Can someone please explain this?
This is the database schema. The full three-part name of a table is:
databasename.schemaname.tablename
For the user's default schema, you can also omit the schema name:
databasename..tablename
You can also specify a linked server name:
servername.databasename.schemaname.tablename
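For example, assuming a linked server named ReportingServer has been configured (the database, schema, and table names are likewise made up):
SELECT * FROM ReportingServer.SalesDb.dbo.Orders;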
You can read more about using identifiers as table names on MSDN:
The server, database, and owner names are known as the qualifiers of the object name. When you refer to an object, you do not have to specify the server, database, and owner. The qualifiers can be omitted by marking their positions with a period. The valid forms of object names include the following:
server_name.database_name.schema_name.object_name
server_name.database_name..object_name
server_name..schema_name.object_name
server_name...object_name
database_name.schema_name.object_name
database_name..object_name
schema_name.object_name
object_name
An object name that specifies all four parts is known as a fully qualified name. Each object that is created in Microsoft SQL Server must have a unique, fully qualified name. For example, there can be two tables named xyz in the same database if they have different owners.
Most object references use three-part names. The default server_name is the local server. The default database_name is the current database of the connection. The default schema_name is the default schema of the user submitting the statement. Unless otherwise configured, the default schema of new users is the dbo schema.
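To make the xyz example concrete, a small sketch (the schema names are made up):
CREATE SCHEMA sales;
GO
CREATE SCHEMA staging;
GO
CREATE TABLE sales.xyz (id int);
CREATE TABLE staging.xyz (id int);  -- no conflict: the fully qualified names differ
GO
SELECT * FROM sales.xyz;            -- schema-qualified, so there is no ambiguity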
What @Szymon said. You should also make a point of always schema-qualifying object references (whether table, view, stored procedure, etc.). Unqualified object references are resolved in the following manner:
Probe the namespace of the current database for an object of the specified name belonging to the default schema of the credentials under which the current connection is running.
If not found, probe the namespace of the current database for an object of the specified name belonging to the dbo schema.
And if the object reference is to a stored procedure whose name begins with sp_, it's worse, as two more steps are added to the resolution process (unless the reference is database-qualified): the above two steps are repeated, but this time looking in the master database instead of the current database.
So a query like
select *
from foo
requires two probes of the namespace to resolve foo (assuming that the table/view is actually dbo.foo): first under your default schema (john_doe.foo) and then, not being found, under dbo (dbo.foo), whereas
select *
from dbo.foo
is immediately resolved with a single probe of the namespace.
This has 3 implications:
The redundant lookups are expensive.
It inhibits query plan caching, as every execution has to be re-evaluated, meaning the query has to be recompiled for every execution (and that takes out compile-time locks).
You will, at one point or another, shoot yourself in the foot, and inadvertently create something under your default schema that is supposed to exist (and perhaps already does) under the dbo schema. Now you've got two versions floating around.
At some point, you, or someone else (usually it happens in production) will run a query or execute a stored procedure and get...unexpected results. It will take you quite some time to figure out that there are two [differing] versions of the same object, and which one gets executed depends on their user credentials and whether or not the reference was schema-qualified.
Always schema-qualify unless you have a real reason not to.
That being said, it can sometimes be useful, for development purposes, to be able to maintain the "new" version of something under your personal schema and the "current" version under the dbo schema. It makes it easy to do side-by-side testing. However, it's not without risk (see above).
When SQL Server sees an unqualified table name, it will first look in the current user's schema to see if the table exists, and will use that one if it does.
If it doesn't exist there, it then looks in the dbo schema and uses the table from there.

Possible to perform cross-database queries with PostgreSQL?

I'm going to guess that the answer is "no" based on the below error message (and this Google result), but is there anyway to perform a cross-database query using PostgreSQL?
databaseA=# select * from databaseB.public.someTableName;
ERROR: cross-database references are not implemented:
"databaseB.public.someTableName"
I'm working with some data that is partitioned across two databases although data is really shared between the two (userid columns in one database come from the users table in the other database). I have no idea why these are two separate databases instead of schema, but c'est la vie...
Note: As the original asker implied, if you are setting up two databases on the same machine you probably want to make two schemas instead - in that case you don't need anything special to query across them.
postgres_fdw
Use postgres_fdw (foreign data wrapper) to connect to tables in any Postgres database - local or remote.
Note that there are foreign data wrappers for other popular data sources. At this time, only postgres_fdw and file_fdw are part of the official Postgres distribution.
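A minimal sketch of the postgres_fdw setup, using the databases from the question (the server name, credentials, and local schema name are placeholders, and IMPORT FOREIGN SCHEMA needs Postgres 9.5 or later):
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER databaseb_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'databaseB', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
SERVER databaseb_server
OPTIONS (user 'myuser', password 'mypass');

-- Map databaseB's public schema into a local schema, then query it like any local table
CREATE SCHEMA IF NOT EXISTS databaseb;
IMPORT FOREIGN SCHEMA public FROM SERVER databaseb_server INTO databaseb;

SELECT * FROM databaseb.someTableName;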
For Postgres versions before 9.3
Versions this old are no longer supported, but if you need to do this in a pre-2013 Postgres installation, there is a function called dblink.
I've never used it, but it is maintained and distributed with the rest of PostgreSQL. If you're using the version of PostgreSQL that came with your Linux distro, you might need to install a package called postgresql-contrib.
dblink() -- executes a query in a remote database
dblink executes a query (usually a SELECT, but it can be any SQL
statement that returns rows) in a remote database.
When two text arguments are given, the first one is first looked up as
a persistent connection's name; if found, the command is executed on
that connection. If not found, the first argument is treated as a
connection info string as for dblink_connect, and the indicated
connection is made just for the duration of this command.
For example:
SELECT *
FROM table1 tb1
LEFT JOIN (
    SELECT *
    FROM dblink('dbname=db2', 'SELECT id, code FROM table2')
        AS tb2(id int, code text)
) AS tb2 ON tb2.id = tb1.id;
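The named, persistent connection form mentioned above would look roughly like this (the connection name is arbitrary):
SELECT dblink_connect('db2_conn', 'dbname=db2');   -- open once, reuse for many queries
SELECT * FROM dblink('db2_conn', 'SELECT id, code FROM table2') AS tb2(id int, code text);
SELECT dblink_disconnect('db2_conn');              -- close it when done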
Note: I am giving this information for future reference. Reference
I have run into this before and came to the same conclusion about cross-database queries as you. What I ended up doing was using schemas to divide the table space. That way I could keep the tables grouped but still query them all.
Just to add a bit more information.
There is no way to query a database other than the current one. Because PostgreSQL loads database-specific system catalogs, it is uncertain how a cross-database query should even behave.
contrib/dblink allows cross-database queries using function calls. Of course, a client can also make simultaneous connections to different databases and merge the results on the client side.
PostgreSQL FAQ
Yes, you can, by using DBlink (PostgreSQL only), DBI-Link (allows foreign cross-database queries), and TDS_Link, which allows queries to be run against MS SQL Server.
I have used DB-Link and TDS-link before with great success.
I have checked and tried to create foreign key relationships between 2 tables in 2 different databases using both dblink and postgres_fdw, but with no result.
Having read other people's feedback on this, for example here and here and in some other sources, it looks like there is no way to do that currently:
The dblink and postgres_fdw extensions indeed enable one to connect to and query tables in other databases, which is not possible with standard Postgres, but they do not allow establishing foreign key relationships between tables in different databases.
If performance is important and most queries are read-only, I would suggest replicating data over to another database. While this seems like unneeded duplication of data, it might help if indexes are required.
This can be done with simple on-insert triggers which in turn call dblink to update another copy. There are also full-blown replication options (like Slony) but that's off-topic.
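A minimal sketch of such a trigger (the table, its columns, and the connection string are all hypothetical):
CREATE EXTENSION IF NOT EXISTS dblink;

CREATE OR REPLACE FUNCTION replicate_users() RETURNS trigger AS $f$
BEGIN
    -- push the new row to the other database; an error here rolls back the local insert too
    PERFORM dblink_exec('dbname=databaseB user=myuser password=mypass',
        format('INSERT INTO users (id, name) VALUES (%s, %L)', NEW.id, NEW.name));
    RETURN NEW;
END;
$f$ LANGUAGE plpgsql;

CREATE TRIGGER users_replicate
AFTER INSERT ON users
FOR EACH ROW EXECUTE PROCEDURE replicate_users();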
see https://www.cybertec-postgresql.com/en/joining-data-from-multiple-postgres-databases/ [published 2017]
These days you also have the option to use https://prestodb.io/
You can run SQL on that PrestoDB node and it will distribute the SQL query as required. It can connect to the same node twice for different databases, or it might be connecting to different nodes on different hosts.
It does not support:
DELETE
ALTER TABLE
CREATE TABLE (CREATE TABLE AS is supported)
GRANT
REVOKE
SHOW GRANTS
SHOW ROLES
SHOW ROLE GRANTS
So you should only use it for SELECT and JOIN needs. Connect directly to each database for the above needs. (It looks like you can also INSERT or UPDATE which is nice)
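A sketch of what such a federated query looks like (the catalog names are whatever you configured for each connector; these two are made up):
SELECT u.id, u.name, o.total
FROM pg_crm.public.users AS u
JOIN pg_billing.public.orders AS o ON o.user_id = u.id;

-- pg_crm and pg_billing are two PrestoDB catalogs, each pointing at a different
-- PostgreSQL database; Presto pushes down what it can and joins the rest itself.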
Client applications connect to PrestoDB primarily using JDBC, but other types of connections are possible, including a Tableau-compatible web API.
This is an open source tool governed by the Linux Foundation and Presto Foundation.
The founding members of the Presto Foundation are: Facebook, Uber,
Twitter, and Alibaba.
The current members are: Facebook, Uber, Twitter, Alibaba, Alluxio,
Ahana, Upsolver, and Intel.
In case someone needs a more involved example on how to do cross-database queries, here's an example that cleans up the databasechangeloglock table on every database that has it:
CREATE EXTENSION IF NOT EXISTS dblink;
DO
$$
DECLARE
    database_name TEXT;
    conn_template TEXT;
    conn_string TEXT;
    table_exists BOOLEAN;
BEGIN
    conn_template = 'user=myuser password=mypass dbname=';
    FOR database_name IN
        SELECT datname FROM pg_database WHERE datistemplate = false
    LOOP
        conn_string = conn_template || database_name;
        table_exists = (SELECT table_exists_ FROM dblink(conn_string, 'select count(*) > 0 from information_schema.tables where table_name = ''databasechangeloglock''') AS t(table_exists_ BOOLEAN));
        IF table_exists THEN
            PERFORM dblink_exec(conn_string, 'delete from databasechangeloglock');
        END IF;
    END LOOP;
END
$$;