SQL statement has no effect when using pyodbc

I am running a Python script on the server that should update an existing table, 'loading_log', over an ODBC connection.
The issue is that my script has no effect on the table in the database, i.e. it neither deletes records nor inserts new ones.
At the same time, I don't see any errors after execution.
If I run the same SQL query from my desktop using the same credentials, it works fine.
My question:
Why doesn't it work inside the Python script?
Here's an excerpt from my code:
curs.execute('''
delete from loading_log
''')
#
#record loaded record ids into loading_log table
#
#logging.info('insert loaded record id data into loading_log table')
curs.execute('''
insert into loading_log (catalog_sample_events_id,ShippingId)
select top 500
cs.catalog_sample_events_id,
cs.shipping_id ShippingId
from catalog_sample_events cs
join event_type et on et.event_type_id = cs.event_type_id
join event_source es on es.event_source_id = cs.event_source_id
join etl_status esi on esi.etl_status_id = cs.etl_status_id
where cs.catalog_sample_events_id > ?
order by cs.catalog_sample_events_id
''', max_id)

You need to commit the transaction:
curs.commit()
or tell pyodbc to use autocommit mode (for example, by passing autocommit=True to pyodbc.connect). See the pyodbc wiki for more details.
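The effect of a missing commit can be reproduced in a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for pyodbc (an assumption made only so the example runs without a database server), and the helper names count_rows and demo are hypothetical. A second connection plays the role of the desktop client that checks the table.

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Open a separate connection (a second session) and count the rows."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("select count(*) from loading_log").fetchone()[0]
    finally:
        conn.close()


def demo():
    # Hypothetical throwaway database file; sqlite3 stands in for the ODBC source.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    writer = sqlite3.connect(path)
    writer.execute(
        "create table loading_log (catalog_sample_events_id integer, ShippingId integer)"
    )
    writer.commit()

    curs = writer.cursor()
    curs.execute("insert into loading_log values (?, ?)", (1, 100))

    # The insert is still inside an open transaction, so other sessions
    # do not see it yet.
    before = count_rows(path)

    # Committing makes the change visible to other sessions.
    writer.commit()
    after = count_rows(path)

    writer.close()
    return before, after


if __name__ == "__main__":
    print(demo())
```

With pyodbc the pattern is the same: run the statements, then call commit() before closing the connection, or open the connection in autocommit mode.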

Related

Optimization when merging from Oracle datalink

I am trying to write an Oracle procedure to merge data from a remote datalink into a local table. Individually the pieces work quickly, but together they time out. Here is a simplified version of what I am trying.
What works:
Select distinct ProjectID from Project where LastUpdated < (sysdate - 6/24);
--Works in split second.
Merge into project
using (select /*+DRIVING_SITE(rd)*/
rd.projectID,
rd.otherdata
FROM Them.Remote_Data@DBLink rd
WHERE rd.projectID in (1,2,3)) sourceData -- hardcoded IDs
On (sourceData.projectID = project.projectID)
When matched...
-- Merge statement works quickly when the IDs are hard coded
What doesn't work: Combining the two statements above.
Merge into project
using (select /*+DRIVING_SITE(rd)*/ -- driving site helps when this piece is extracted from the larger statement
rd.projectID,
rd.otherdata
FROM Them.Remote_Data@DBLink rd
WHERE rd.projectID in -- IN subquery that works quickly by itself
(Select distinct ProjectID from Project where LastUpdated < (sysdate - 6/24))
-- The subquery in the IN clause returns 10 rows. It's a test database.
) sourceData
On (sourceData.projectID = project.projectID)
When matched...
When I run this statement in SQL Developer, this is all I get, and the data is not updated:
Connecting to the database local.
Process exited.
Disconnecting from the database local.
I also tried pulling the IN subquery out into a WITH clause, hoping it would execute differently, but it had no effect.
Any direction for paths to pursue would be appreciated.
Thanks.
The /*+DRIVING_SITE(rd)*/ hint doesn't work with MERGE because the operation must run in the database where the merged table sits, which in this case is the local database. That means the whole result set from the remote table is pulled across the database link and then filtered against the data from the local table.
So, discard the hint. I also suggest you convert the IN clause into a join:
Merge into project p
using (select rd.projectID,
rd.otherdata
FROM Project ld
inner join Them.Remote_Data@DBLink rd
on rd.projectID = ld.projectID
where ld.LastUpdated < (sysdate - 6/24)) q
On (q.projectID = p.projectID)
When matched...
Please bear in mind that answers to performance tuning questions without sufficient detail are just guesses.
I found your question while having the same problem. Yes, the hint is ignored when the query is placed in the USING clause of a MERGE statement.
In my case I created a work table, say w_remote_data for your example, and split the merge into two commands: (1) fill the work table, (2) run the MERGE using the work table.
The pitfall is that we cannot simply fill the work table with either create table w_remote_data as select /*+DRIVING_SITE(rd)*/ ... or insert into w_remote_data select /*+DRIVING_SITE(rd)*/ .... Both commands are valid, but they are slow: the hint is not applied there either, so the problem remains. The solution is in PL/SQL: fetch the result of the USING-clause query into an intermediate collection. See the example (for simplicity I assume w_remote_data has the same structure as remote_data; otherwise we would have to define a custom record type instead of using %rowtype):
declare
type ct is table of w_remote_data%rowtype;
c ct;
begin
execute immediate 'truncate table w_remote_data';
-- fetch the remote rows with a plain SELECT, where the hint is applied
select /*+DRIVING_SITE(rd)*/ *
bulk collect into c
from Them.Remote_Data@DBLink rd ...;
-- copy the fetched rows into the local work table
if c.count > 0 then
forall i in c.first..c.last
insert into w_remote_data values c(i);
end if;
merge into project p using (select * from w_remote_data) ...;
execute immediate 'truncate table w_remote_data';
end;
My case was an ETL script that I could rely on not running in parallel. Otherwise we would have to use temporary (session-private) tables; I didn't test whether the approach works with them.

hive doesn't support merge function

I am trying to update values in one table from another table. Both tables have the same field names but different values. The query should work on any conventional database, but here it returns:
Error while compiling statement: FAILED: ParseException line 1:0
cannot recognize input near 'MERGE' 'INTO' 'FINAL'
MERGE
INTO FINAL
USING FIRST_STAGE
ON IMSI = FIRST_STAGE.IMSI and Site = FIRST_STAGE.Site
WHEN MATCHED THEN UPDATE SET
Min_Date = least(FIRST_STAGE.Min_Date, Min_Date),
Max_Date = greatest(FIRST_STAGE.Max_Date, Max_Date),
NoofDays = FIRST_STAGE.NoofDays + NoofDays,
Down_Link = FIRST_STAGE.Down_Link + Down_Link,
up_Link = FIRST_STAGE.up_Link + up_Link,
connection = FIRST_STAGE.connection + connection
WHEN NOT MATCHED THEN INSERT ( Min_Date,
Max_Date,
NoofDays,
IMSI,
Site,
Down_Link,
Up_Link,
Connection )
VALUES ( FIRST_STAGE.Min_Date,
FIRST_STAGE.Max_Date,
FIRST_STAGE.NoofDays,
FIRST_STAGE.IMSI,
FIRST_STAGE.Site,
FIRST_STAGE.Down_Link,
FIRST_STAGE.Up_Link,
FIRST_STAGE.Connection )
The Hive MERGE statement was introduced in the Hortonworks distribution.
The prerequisite for running this MERGE statement is:
The target table (FINAL) must be created as transactional, stored in ORC format, and bucketed.
AFAIK, in the Cloudera distribution we need to use Kudu to perform upsert operations, starting from CDH 5.10.
Note: the UPSERT statement only works for Impala tables that use the Kudu storage engine.
I don't think the MERGE statement from the post can be run on CDH distributions as of now.
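For readers on a platform without MERGE, it may help to see the row-level logic that the statement in the question expresses. The following is a minimal Python sketch of that logic only, not Hive code: the function name merge_into_final and the dictionary layout (rows keyed by an (IMSI, Site) tuple) are assumptions made for illustration.

```python
def merge_into_final(final, first_stage):
    """Apply first_stage rows to final, matching on the (IMSI, Site) key.

    Both arguments are dicts that map an (IMSI, Site) tuple to a dict of
    the remaining columns. final is modified in place and returned.
    """
    summed_columns = ("NoofDays", "Down_Link", "Up_Link", "Connection")
    for key, src in first_stage.items():
        tgt = final.get(key)
        if tgt is None:
            # WHEN NOT MATCHED THEN INSERT: copy the staged row as-is.
            final[key] = dict(src)
        else:
            # WHEN MATCHED THEN UPDATE: widen the date range, add the counters.
            tgt["Min_Date"] = min(tgt["Min_Date"], src["Min_Date"])
            tgt["Max_Date"] = max(tgt["Max_Date"], src["Max_Date"])
            for col in summed_columns:
                tgt[col] += src[col]
    return final
```

The same two branches can be expressed in SQL as a separate UPDATE for matching keys and an INSERT for the rest when no MERGE or UPSERT statement is available.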

Creating SQL Connection in Excel that creates and drops temporary tables

I've created a query that runs perfectly in Microsoft SQL Server; it uses temporary tables that it creates and drops. I now want to create a data connection in Microsoft Excel that runs my query and displays the results, so that business partners who do not know SQL can use the report.
However, when I try to run the query in a data connection in Excel, I get errors like the following:
"Database name 'database' ignored, referencing object in tempdb."
OR
"The query did not run, or the database table could not be opened. Check the database server or contact your database administrator. Make sure the external database is available and hasn't been moved or reorganized, then try the operation again."
I then tried putting "SET NOCOUNT ON" before creating my temporary tables, but it did not resolve the issue and I still have no report.
Here is how my code is set up
select
x.PRODUCT_ID,
y.SETID,
LW = ISNULL (SUM(y.SALES_QTY) ,0)
INTO database.#tmpLastWeek
--#tmpLastWeek is a temporary table I am creating here and inserting values into
from
database.sales_table y
inner join
database.product_table x
on y.PRODUCT_ID = x.PRODUCT_ID
and x.SETID = y.SETID
where
y.stores IN (
'storeslist'
)
and y.sales_week = @this_week - 1
and y.sales_year = @this_year
group by
x.PRODUCT_ID,
y.SETID,
x.DESCR
Then I select from the temporary table #tmpLastWeek
select * from #tmpLastWeek
Finally, I drop this temporary table so that the query can be run again
drop table #tmpLastWeek
Any advice or suggestions on getting this query to run through a SQL connection in Excel would be much appreciated! Thanks!

Multiple success messages from SQL Server Update statement

I have the following query that SHOULD update 716 records:
USE db1
GO
UPDATE SAMP
SET flag1 = 'F', flag2 = 'F'
FROM samp INNER JOIN result ON samp.samp_num = result.samp_num
WHERE result.status != 'X'
AND result.name = 'compound'
AND result.alias = '1313'
AND samp.standard = 'F'
AND samp.flag2 = 'T';
However, when this query is run on a SQL Server 2005 database from a query window in SSMS, I get the following THREE messages:
716 row(s) affected
10814 row(s) affected
716 row(s) affected
So WHY am I getting 3 messages (instead of the normal one for a single update statement) and WHAT does the 10814 likely refer to? This is a production database I need to update so I don't want to commit these changes without knowing the answer :-) Thanks.
This is likely caused by a trigger on the [samp] table. If you go to Query -> Query Options -> Execution -> Advanced and check SET STATISTICS IO, you will see which other tables are being updated when you run the query.
You can also use Object Explorer in SSMS to look for the triggers. Open the Tables node, find the table, expand it, and then open its Triggers node. The nice thing about this method is that you can script the trigger to a new query window and see what the trigger is doing.
It's probably because there is a trigger on your table.
This command will show you what is happening:
SET STATISTICS IO { ON | OFF }
https://msdn.microsoft.com/en-us/library/ms184361.aspx
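How a trigger inflates the number of affected rows can be illustrated with a small runnable sketch. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption made so the example is self-contained), and the audit table and trigger name are hypothetical. SQLite reports the extra work through a connection-wide change counter, whereas SSMS prints a separate "row(s) affected" message for each statement the trigger runs.

```python
import sqlite3


def run_update_with_trigger():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table samp (samp_num integer primary key, flag1 text);
        create table audit (samp_num integer, note text);
        -- Hypothetical trigger: log every updated samp row
        create trigger samp_after_update after update on samp
        begin
            insert into audit values (new.samp_num, 'flag changed');
        end;
        insert into samp values (1, 'T'), (2, 'T'), (3, 'T');
    """)

    changes_before = conn.total_changes
    cur = conn.execute("update samp set flag1 = 'F'")

    # Rows changed directly by the UPDATE statement itself.
    direct = cur.rowcount
    # All rows changed on this connection, including the trigger's inserts.
    total = conn.total_changes - changes_before
    audit_rows = conn.execute("select count(*) from audit").fetchone()[0]

    conn.close()
    return direct, total, audit_rows


if __name__ == "__main__":
    print(run_update_with_trigger())
```

In this sketch the UPDATE touches three rows directly, while the connection records six changes in total because the trigger inserts one audit row per updated row.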

Select Query on 2 tables, on different database servers

I am trying to generate a report by querying 2 databases (Sybase) in classic ASP.
I have created 2 connection strings:
connA for databaseA
connB for databaseB
Both databases are present on the same server (don't know if this matters)
Queries:
q1 = SELECT column1 INTO #temp FROM databaseA..table1 WHERE xyz="A"
q2 = SELECT columnA,columnB,...,columnZ FROM table2 a, #temp b WHERE b.column1=a.columnB
followed by:
response.Write(rstsql)
set rstSQL = CreateObject("ADODB.Recordset")
rstSQL.Open q1, connA
rstSQL.Open q2, connB
When I try to open up this page in a browser, I get error message:
Microsoft OLE DB Provider for ODBC Drivers error '80040e37'
[DataDirect][ODBC Sybase Wire Protocol driver][SQL Server]#temp not found. Specify owner.objectname or use sp_help to check whether the object exists (sp_help may produce lots of output).
Could anyone please help me understand what the problem is and help me fix it?
Thanks.
With both queries, it looks like you are trying to insert into #temp. #temp is located in one of the databases (for argument's sake, databaseA). So when you try to insert into #temp from databaseB, it reports that it does not exist.
Try changing "Into #temp From" to "Into databaseA.dbo.#temp From" in both statements.
Also, make sure that the connection strings have permissions on the other DB, otherwise this will not work.
Update: relating to the temp table going out of scope - if you have one connection string that has permissions on both databases, then you could use this for both queries (while keeping the connection alive). While querying the table in the other DB, be sure to use [DBName].[Owner].[TableName] format when referring to the table.
Your temp table is out of scope: it is only 'alive' during the first connection and will not be available in the second connection.
Just move all of it into one block of code and execute it inside one connection.
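The connection-scoped lifetime of a temporary table can be demonstrated with a short sketch. It uses Python's built-in sqlite3 module instead of Sybase and ADO (an assumption made so the example runs anywhere), and the table name temp_ids is hypothetical. Two connections to the same database file stand in for connA and connB.

```python
import os
import sqlite3
import tempfile


def temp_table_visibility():
    # Two connections to the same (hypothetical) database file.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn_a = sqlite3.connect(path)
    conn_b = sqlite3.connect(path)

    # The temporary table is created on the first connection only.
    conn_a.execute("create temp table temp_ids (column1 integer)")
    conn_a.execute("insert into temp_ids values (1)")
    rows_seen_by_a = conn_a.execute("select count(*) from temp_ids").fetchone()[0]

    # The second connection cannot see it and raises an error instead.
    try:
        conn_b.execute("select count(*) from temp_ids")
        visible_to_b = True
    except sqlite3.OperationalError:
        visible_to_b = False

    conn_a.close()
    conn_b.close()
    return rows_seen_by_a, visible_to_b


if __name__ == "__main__":
    print(temp_table_visibility())
```

Running both statements on a single connection, as suggested above, avoids the problem because the temporary table stays in scope.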
#temp is out of scope in q2.
All your work can be done in one query:
SELECT a.columnA, a.columnB,..., a.columnZ
FROM table2 a
INNER JOIN (SELECT databaseA..table1.column1
FROM databaseA..table1
WHERE databaseA..table1.xyz = 'A') b
ON a.columnB = b.column1