When does SQLAlchemy execute the query? - sql

I have only recently started learning SQLAlchemy, and the result of the following code leaves me confused about when SQLAlchemy executes the query:
query = db.session.query(MyTable)
query = query.filter(...)
query = query.limit(...)
query = query.offset(...)
records = query  # records = query.all()
for r in records:
    pass  # do something
Note the line:
records = query  # records = query.all()
It seems to produce the same correct result (stored in the variable "records") whether I use "query" or "query.all()", so I wonder when the query is actually executed.
If it is executed on the first line, "db.session.query(MyTable)", the result set may be large at that point; if it is executed on the fifth line, "records = query", how can that happen when there is no function call at all?

In your example, the query is executed at for r in records. Iterating over the query object triggers the execution. (Normally, only then is it compiled into a SELECT statement.)
Up until that point, the query is only being built (via filter, limit, etc.).
Please also read the ORM Tutorial section on querying.
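To make the "build now, execute on iteration" idea concrete, here is a minimal sketch using only the standard library. LazyQuery is a hypothetical stand-in written for this illustration; it is not SQLAlchemy's implementation, and the table and column names are assumptions. sqlite3's trace callback records every statement that actually reaches the database.

```python
import sqlite3


class LazyQuery:
    """Hypothetical stand-in for an ORM query object (not SQLAlchemy's code)."""

    def __init__(self, conn, table, where=None, params=(), limit=None, offset=None):
        self.conn, self.table = conn, table
        self.where, self.params = where, tuple(params)
        self._limit, self._offset = limit, offset

    def _clone(self, **changes):
        state = {"where": self.where, "params": self.params,
                 "limit": self._limit, "offset": self._offset}
        state.update(changes)
        return LazyQuery(self.conn, self.table, **state)

    # Builder methods only record state; nothing is sent to the database.
    def filter(self, where, *params):
        return self._clone(where=where, params=params)

    def limit(self, n):
        return self._clone(limit=n)

    def offset(self, n):
        return self._clone(offset=n)

    def _sql(self):
        sql = f"SELECT id, name FROM {self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        sql += " ORDER BY id"
        if self._limit is not None or self._offset is not None:
            # SQLite needs a LIMIT before OFFSET; -1 means "no limit".
            sql += f" LIMIT {-1 if self._limit is None else int(self._limit)}"
        if self._offset is not None:
            sql += f" OFFSET {int(self._offset)}"
        return sql

    def __iter__(self):
        # Iteration is the trigger: the SQL is compiled and executed here.
        return iter(self.conn.execute(self._sql(), self.params).fetchall())

    def all(self):
        return list(self)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO my_table (name) VALUES (?)",
                 [(f"row{i}",) for i in range(10)])
conn.commit()

executed = []                             # statements that reached the database
conn.set_trace_callback(executed.append)

query = LazyQuery(conn, "my_table").filter("id > ?", 2).limit(3).offset(1)
before = len(executed)                    # nothing has run yet
records = [r for r in query]              # the for-loop equivalent: runs the SELECT
after = len(executed)
```

In this sketch, all() is just list(self), which is why iterating the query and calling all() give the same rows: both end up in __iter__.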

Related

Getting parameters error when using ibm_db_dbi sql query in python

I'm trying to use the results from one query in the WHERE clause of another and cannot get it to work. At the moment I'm getting this error:
ProgrammingError: ibm_db_dbi::ProgrammingError: Exception('Statement Execute Failed: [IBM][CLI Driver] CLI0100E Wrong number of parameters. SQLSTATE=07001 SQLCODE=-99999')
My code is below (eventually, 'result' will be a variable assigned the results of another query, but for now I'm just trying to get it to work with a static variable). Thanks in advance!
import ibm_db_dbi as db
result = ['c80fS4Pn1', '9f*hzNT21']
conn = db.connect('DRIVER=DB2 zOS;'
'DATABASE=xxxx;'
'HOSTNAME=xxxx.com;'
'PORT=xxx;'
'PROTOCOL=xxxx;'
'UID=id;'
'PWD=passord;', '', '')
cur = conn.cursor()
sql = "SELECT * FROM SCHEMA.TABLE WHERE PRIM_KEY IN (?)"
cur.execute(sql, (result))
conn.close()
You get error CLI0100E because your code sample passes a list (called result) with two entries, while your query contains a single parameter marker (?).
The number of parameters to be bound (as done by cur.execute()) must exactly match the number of parameter markers in the query.
As you usually do not know in advance the number of rows returned from a query, you usually don't know the number of parameter markers in advance either.
You could dynamically generate the parameter markers to match the number of rows in the previous result set. Or you could generate the SQL string in full without parameter markers, which is inefficient and might not scale.
It is wise to do in SQL the things that SQL is good at, such as passing the results of a sub-query into another query. Trying to do that in client-side code (instead of inside the SQL engine) may be inelegant and slow.
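As a rough sketch of the "generate the parameter markers dynamically" option, the helper below builds one ? per value. The helper name in_clause_query and the table demo_table are made up for illustration, and sqlite3 is used as a stand-in so the example runs without a DB2 connection; it assumes your driver accepts the same ?-style markers, as the query in the question suggests.

```python
import sqlite3


def in_clause_query(base_sql, values):
    """Return (sql, params) with one '?' marker per value (hypothetical helper)."""
    if not values:
        raise ValueError("IN () needs at least one value")
    markers = ", ".join("?" for _ in values)
    return base_sql.format(markers=markers), tuple(values)


result = ['c80fS4Pn1', '9f*hzNT21']
sql, params = in_clause_query(
    "SELECT prim_key FROM demo_table "
    "WHERE prim_key IN ({markers}) ORDER BY prim_key",
    result,
)

# Stand-in database so the sketch is runnable; replace with your own connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_table (prim_key TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO demo_table VALUES (?)",
                 [("c80fS4Pn1",), ("9f*hzNT21",), ("other",)])
rows = conn.execute(sql, params).fetchall()
conn.close()
```

The values still travel as bound parameters; only the markers are generated as text, so the number of markers and the number of bound values always match.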

Get a Big Query script to output a table

I am using the new scripting feature of BigQuery to declare a variable, and I then use that variable in a standard SQL query.
The structure of the query is:
DECLARE {name of variable} {data type};
SET {name of variable} = {Value};
(A SQL QUERY THEN FOLLOWS USING THE ABOVE VARIABLE)
I understand that this is now a script and no longer a typical query, so when I run it, it runs as a sequence of executable tasks. But is there any way in the script to explicitly state that I only want to output the resulting table of the SQL query, as opposed to both the result of declaring the variable and the SQL query?
What BQ Outputs
It depends on how you "capture" the output. If you send the query from Python, Java, or the CLI, the last SELECT statement in the script is the only output that you receive through the API.
Please also note that each "output" you see comes with a cost (bytes billed), which is another reason for them to be visible at all times.
Update:
If you need to capture the output of a SELECT statement in a table, then depending on your intention you may use:
CREATE OR REPLACE TABLE <your_destination_table> AS SELECT ...
or
INSERT INTO <your_destination_table> SELECT ...

Rails: Get SQL generated by delete_all

I'm not particularly familiar with Ruby on Rails, but I'm troubleshooting an issue we're experiencing with a rake job that is supposed to be cleaning database tables. The tables grow very large very quickly, and the query generated by ActiveRecord doesn't seem to be efficient enough to handle it.
The Ruby calls look like this:
Source.where("id not IN (#{Log.select('DISTINCT source_id').to_sql})").delete_all
and this:
Log.joins(:report).where(:report_id => Report.where(cond)).delete_all
I'm trying to get at the SQL so that our DBAs can attempt to optimize it. I've noticed that if I drop the ".delete_all" I can add a ".to_sql", which gives me the SELECT statement of the query prior to the call to ".delete_all". I'd like to see the SQL generated by the delete_all method itself, though.
Is there a way to do that?
Another option is to use raw Arel syntax, similar to a simplified version of what ActiveRecord::Relation#delete_all does.
relation = Model.where(...)
arel = relation.arel
stmt = Arel::DeleteManager.new
stmt.from(arel.join_sources.empty? ? Model.arel_table : arel.source)
stmt.wheres = arel.constraints
sql = Model.connection.to_sql(stmt, relation.bound_attributes)
print sql
This will give you the generated DELETE SQL. Here's an example using Postgres as the SQL adapter:
relation = User.where('email ilike ?', '%@gmail.com')
arel = relation.arel
stmt = Arel::DeleteManager.new
stmt.from(arel.join_sources.empty? ? User.arel_table : arel.source)
stmt.wheres = arel.constraints
sql = User.connection.to_sql(stmt, relation.bound_attributes)
=> DELETE FROM "users" WHERE (email ilike '%@gmail.com')
From the fine manual:
delete_all(conditions = nil)
Deletes the records matching conditions without instantiating the records first, and hence not calling the destroy method nor invoking callbacks. This is a single SQL DELETE statement that goes straight to the database, much more efficient than destroy_all.
So a Model.delete_all(conditions) ends up as
delete from models where conditions
When you say Model.where(...).delete_all, the conditions for the delete_all come from the where calls so these are the same:
Model.delete_all(conditions)
Model.where(conditions).delete_all
Applying that to your case:
Source.where("id not IN (#{Log.select('DISTINCT source_id').to_sql})").delete_all
you should see that you're running:
delete from sources
where id not in (
select distinct source_id
from logs
)
If you run your code in a development console you should see the SQL in the console or the Rails logs but it will be as above.
As far as optimization goes, my first step would be to drop the DISTINCT. DISTINCT usually isn't cheap and IN doesn't care about duplicates anyway, so not in (select distinct ...) is probably pointless busy work. Then maybe an index on source_id would help; the query optimizer might be able to slurp the source_id list straight out of the index without having to do a table scan to find them. Of course, query optimization is a bit of a dark art, so these simple steps may or may not work.
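A quick way to convince yourself that dropping DISTINCT does not change which rows get deleted is to run both statements against identical toy data. The sketch below uses Python's sqlite3 rather than your production database; the schema and index name are assumptions made up for the demonstration, and source_id is declared NOT NULL because NOT IN behaves differently when the subquery can return NULL. It says nothing about speed on your real tables.

```python
import sqlite3


def surviving_sources(delete_sql):
    """Run one DELETE against a fresh toy database and return the ids left."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sources (id INTEGER PRIMARY KEY);
        CREATE TABLE logs (id INTEGER PRIMARY KEY, source_id INTEGER NOT NULL);
        CREATE INDEX index_logs_on_source_id ON logs (source_id);
        INSERT INTO sources (id) VALUES (1), (2), (3), (4);
        INSERT INTO logs (source_id) VALUES (1), (1), (3), (3), (3);
    """)
    conn.execute(delete_sql)
    remaining = [row[0] for row in
                 conn.execute("SELECT id FROM sources ORDER BY id")]
    conn.close()
    return remaining


with_distinct = surviving_sources(
    "DELETE FROM sources WHERE id NOT IN (SELECT DISTINCT source_id FROM logs)")
without_distinct = surviving_sources(
    "DELETE FROM sources WHERE id NOT IN (SELECT source_id FROM logs)")
```

Both runs leave the same sources behind (the ones that have at least one log row), so the DISTINCT only affects how much work the database does, not the outcome.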
ActiveRecord::Base.logger = Logger.new(STDOUT) should show you all the SQL generated by rails on your console.

SQL queries in batch don't execute

My project is in Visual FoxPro and I use MS SQL Server 2008. When I fire SQL queries in a batch, some of the queries don't execute, yet no error is thrown. I haven't used BEGIN TRAN and ROLLBACK yet. What should be done?
That all depends... You haven't posted a sample of your queries that would give us an indication of the possible failure. However, one thing I've had good results with from VFP to SQL is to build the command into a string (I prefer using TEXT/ENDTEXT for readability), then send that entire value to SQL. If there are any "parameter" values that come from VFP locally, you can use "?" to indicate that the value will come from a variable. Then you can batch everything in a single call instead of multiple individual queries...
vfpField = 28
vfpString = 'Smith'
text to lcSqlCmd noshow
select
YT.blah,
YT.blah2
into
#tempSqlResult
from
yourTable YT
where
YT.SomeKey = ?vfpField
select
ost.Xblah,
t.blah,
t.blah2
from
OtherSQLTable ost
join #tempSqlResult t
on ost.Xblah = t.blahKey;
drop table #tempSqlResult;
endtext
nHandle = sqlconnect( "your connection string" )
nAns = sqlexec( nHandle, lcSqlCmd, "LocalVFPCursorName" )
No, I don't have error trapping in here; this is just to show the principle and readability. I know the sample query could easily have been done via a join, but if you are working with some pre-aggregations and want to put them into temp work areas (like localized VFP cursors from a query) to be used in your next step, this works via #tempSqlResult, as "#" indicates a temporary table on SQL Server for the current connection handle.
If you want to return MULTIPLE RESULT SETS from a single SQL call, you can do that too: just add another query that doesn't have an "into #tmpSQLblah" clause. Then all of those result cursors will be brought back down to VFP, named after the "LocalVFPCursorName" prefix. If you are returning 3 result sets, then VFP will have 3 cursors open called
LocalVFPCursorName
LocalVFPCursorName1
LocalVFPCursorName2
and they will follow the sequence of the queries in the SqlExec() call. But if you can provide more on what you ARE trying to do, along with samples, we can offer more specific help too.

LINQ-to-SQL IN ()

Currently I use a block of code like this to fetch a set of DB objects with matching IDs.
List<subjects> getSubjectsById(List<long> subjectIDs){
return ctx.tagSubjects.Where(t => subjectIDs.Contains(t.id)).ToList();
}
But this is really inefficient, because it requires the entire table to be read from the database and then filtered inside of C#.
What I would rather do is something equivalent to:
SELECT * FROM subjects WHERE subjects.id IN (1,2,3,4,5,...);
The big difference is that in the first example the filtering is happening inside the C# code, and in the second the filtering is done on the SQL server (where the data is).
Is there a [better] way to do this with LINQ?
Where did you find out that it downloads the entire table from SQL Server?
I'm sure it does what you want. It translates the query to a parameterized IN clause like:
... IN (@p1, @p2, @p3)
and passes the contents of the list as values to those parameters. You can confirm this with tools such as SQL Profiler or the LINQ to SQL debugger visualizer, or by setting the DataContext.Log property to the console (before executing the query) and reading the generated SQL:
dataContext.Log = Console.Out;