Hibernate and use of criteria method setFirstResult - sql

I'm trying to learn the hibernate criteria API but I'm puzzled by the criteria method setFirstResult.
I don't understand why I would want to use it except in the rarest of circumstances. It seems to me that when I retrieve information from a database, I'm only interested in establishing some criteria and then executing the query against the criteria. Why do I care from which index number in the database the results should be read? It is not something I normally do when I write SQL queries, yet I see this method all over the Hibernate literature. Is this method something I always have to invoke when writing Hibernate queries, or can I safely ignore it?
Thank you,
Elliott

This is typically used when displaying paginated results of a query. The first page goes from 0 to 19, the second page from 20 to 39, etc.

Well, I use it in a bunch of places. It's either unfortunate or outright lucky that you haven't run into a case where you needed to page your results, in which case you generally write queries that pick rows from one index to another. Consider the case where you want to display the audit log of an app, with an entry stored for every write action on the page. In that case you show 20 results at a time, based on which page the user is on and which field the audit log is sorted by.
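To make the paging arithmetic from these answers concrete, here is a minimal sketch in Python with SQLite. The audit_log table and the fetch_page helper are made up for illustration. In Hibernate's Criteria API the same idea is expressed by pairing setFirstResult(offset) with setMaxResults(pageSize); LIMIT and OFFSET play that role here.

```python
import sqlite3

def fetch_page(conn, page, page_size=20):
    """Return one page of audit rows; page is zero-based."""
    first_result = page * page_size  # the value you would pass to setFirstResult
    cur = conn.execute(
        "SELECT id, action FROM audit_log ORDER BY id LIMIT ? OFFSET ?",
        (page_size, first_result),
    )
    return cur.fetchall()

# Hypothetical audit log with 50 entries, ids 0 to 49.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit_log (id INTEGER PRIMARY KEY, action TEXT)")
conn.executemany(
    "INSERT INTO audit_log (id, action) VALUES (?, ?)",
    [(i, f"write-{i}") for i in range(50)],
)

first_page = fetch_page(conn, 0)   # rows 0 to 19
second_page = fetch_page(conn, 1)  # rows 20 to 39
```

If you never page results, you never need the offset, which is why it can be ignored for ordinary queries.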

Related

How to keep SQL data and Elasticsearch in-sync, and which to search from?

I've seen two solutions mentioned, and was wondering what most people do.
Use logstash
Code your application to make writes to Elasticsearch alongside SQL. For example,
public void saveRecord() {
    saveToElasticsearch();
    saveToSQL();
}
Another question is how to handle actually searching the entity? Do you ONLY use Elasticsearch?
If not, I would assume you fetch from Elasticsearch based on keywords and use the IDs returned to filter your SQL query. My question then, is how do you handle pagination? For example let's say you only want results 50 to 100. First you query Elasticsearch which returns 50-100. Then the SQL query reduces that to 20 results - the other 30 results are in what would've been the next Elasticsearch query (100 - 150 for example). Do you keep going back and forth?
As for your first question check here
As for the second question, if you plan to use Elasticsearch as your search layer, then you had better use it for all the searchable/filterable fields. As you've described, the alternative will get very messy very soon. Use Elasticsearch for all your searches and filters, and even aggregations if it suits your needs. Use the SQL database as your point of truth and just get the full payload from there.
In general, if you need to paginate, your search had better live in one place; otherwise it will get ugly.
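A rough sketch of that split, in Python. The SEARCH_INDEX dictionary is only a stand-in for the search layer (it is not the Elasticsearch client API), and the records table and both helper functions are invented for the example. The point is that filtering and pagination happen in one place, and SQL is only asked for the payload of the IDs on the current page.

```python
import sqlite3

# Hypothetical stand-in for the search index: id -> searchable text.
SEARCH_INDEX = {1: "red apple", 2: "green apple", 3: "red car", 4: "red apple pie"}

def search_ids(keyword, offset, limit):
    """Filter and paginate entirely in the search layer; return matching IDs."""
    hits = sorted(doc_id for doc_id, text in SEARCH_INDEX.items() if keyword in text)
    return hits[offset:offset + limit]

def load_records(conn, ids):
    """Fetch full payloads from SQL by primary key, keeping the search order."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, payload FROM records WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    return [by_id[i] for i in ids if i in by_id]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(i, f"full record {i}") for i in SEARCH_INDEX],
)

# One page of results: the search layer decides which IDs, SQL supplies the rows.
page = load_records(conn, search_ids("apple", offset=0, limit=2))
```

Because SQL never filters anything out, a page of N hits from the search layer stays a page of N rows, and there is no going back and forth.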

Access dynamic query - Better to build one conditional SQL query or multiple queries with VBA?

I have a Microsoft Access 2010 form with dropdown boxes and a checkbox which represent certain parameters. I need to run a query with conditions based on these parameters. It should also be possible to set no criteria in the dropdown boxes and checkbox in order to pull all data.
I have two working ways of implementing this:
I build a query with IIf statements in the WHERE clause, nesting statements until I have accounted for every combination of criteria. I reference the criteria in the SQL logic by using Forms!frmMyFrm!checkbox1 for example or by using a function FormFieldValue(formName,fieldName) which returns the value of a control with the input of the form and control name (This is because of previous issues). I set this query to run with the press of the form's button.
I set a vba sub to run with the press of the button. I check the conditions and set the query SQL to a predetermined SQL string based on the control criteria (referenced in the same way as the previous method). This also involves many If...Else statements, but is a little easier to read than a giant query.
What is the preferred method? Which is more efficient?
I don't believe you would find one way is more efficient over the other, at least not noticeably. For the most part it is simply personal preference.
I generally use VBA and check the value of each dropdown/checkbox and build pieces of the SQL query then put together at the end. The issue that you may run into with this method though is that if you have a large number of dropdowns and checkboxes the code is easy to get "lost" in.
If run time is critical, though, you could always use some of the tips in "How do you test running time of VBA code?" to see which way is faster.
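The "build the pieces, then put them together" approach from this answer looks roughly like the sketch below. It is written in Python rather than VBA only to keep it short and runnable; the orders table and the region, status and active columns are invented for illustration. Each control contributes one clause only when it has a value, so there is no need to enumerate every combination.

```python
def build_query(region=None, status=None, active_only=False):
    """Assemble a SELECT from whichever filters are set; None means 'no filter'."""
    clauses, params = [], []
    if region is not None:
        clauses.append("region = ?")
        params.append(region)
    if status is not None:
        clauses.append("status = ?")
        params.append(status)
    if active_only:
        clauses.append("active = 1")
    sql = "SELECT * FROM orders"
    if clauses:
        # Only add WHERE when at least one control was filled in.
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

sql_all, params_all = build_query()  # nothing selected: pull all data
sql_some, params_some = build_query(region="West", active_only=True)
```

Keeping the values in a separate parameter list instead of concatenating them into the string also avoids quoting problems.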
After a lot of experimentation, and a bit of new information indicating having a pre-built query is faster than having SQL compiled in VBA, the most efficient and clear solution in the context of Microsoft Access is to build and save a number of dependent queries beforehand.
Essentially, build a chain of queries, each with an IIf dependent on a different criterion. Then you only need to run the final query. The only case where you would have to incorporate a VBA If...Else is if you need to query something more complicated than SELECT...WHERE(IIf(...)).
This has a few advantages:
The SQL is already compiled in the saved query, speeding things up.
No more getting lost in code:
There is no giant, nearly-impossible-to-edit query with way too many IIfs.
The minimal VBA code is even easier to follow.
At least for me, who's not an expert in SQL, it's convenient that I can often use the MS Access visual query builder for each part.

Get last few query results in SQL

I frequently do a static analysis of SQL databases, during which I have the luxury of nobody being able to change the data except me.
However, I have not found a way to 'tell' this to SQL in order to prevent running the same query multiple times.
Here is what I would like to do, first I start with a complicated query that has a very small output.
SELECT * FROM MYTABLE WHERE MYPROPERTY = 1234
Then I run a simple query from the same window (Mostly using SQL server studio if that is relevant)
SELECT 1
Now I suddenly realize that I forgot to save the results from my first complicated (slow) query.
As I know the underlying data did not change (or even if it did) I would like to look one step back and simply get the result. However at the moment I don't know any trick to do this and I have to run the entire query again.
So the question summary is: How can I (automatically store/)get the results from recently executed queries.
I am particularly interested in simple select queries, and would be happy to allocate, say, 100MB of memory for automated result storage. I would prefer a solution that works in SQL Server Studio with T-SQL, but other SQL solutions are also welcome.
EDIT: I am not looking for a way to manually prevent this from happening. In the cases where I can anticipate the problem it will not happen.
This can't be done in Microsoft SQL Server. SQL Server does not cache results; instead, it caches the data pages that were accessed by your query. This should make your query go a lot faster the second time around, so it won't be as painful to re-run it.
In other databases, such as Oracle and MySQL, they do have a query caching mechanism that will allow you to retrieve the results directly the second time around.
I run into this frequently, I often just throw the results of longer-running queries into a temp table:
SELECT *
INTO #results1
FROM MYTABLE WHERE MYPROPERTY = 1234
SELECT *
FROM #results1
If the query is very long-running I might use a 'real' table. It's a good way to save on re-run time.
Downside is that it adds to your query.
You can also send query results to a file in SSMS, info on formatting the output is here: SSMS Results to File
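For anyone who wants to try the temp-table pattern from this answer without a SQL Server instance, here is a small sketch using Python and SQLite. It is an analogue, not T-SQL: SQLite's CREATE TEMP TABLE ... AS SELECT stands in for SELECT ... INTO #results1, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, myproperty INTEGER)")
conn.executemany(
    "INSERT INTO mytable (myproperty) VALUES (?)", [(1234,), (1234,), (99,)]
)

# Run the slow query once and keep its output in a temporary table.
conn.execute(
    "CREATE TEMP TABLE results1 AS SELECT * FROM mytable WHERE myproperty = 1234"
)

# Unrelated queries can run in between without losing the saved rows.
conn.execute("SELECT 1").fetchall()

# Later: read the saved result instead of re-running the slow query.
saved = conn.execute("SELECT * FROM results1 ORDER BY id").fetchall()
```

As in SQL Server, the temporary table only lives as long as the connection that created it.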
The easiest way to do this is to run each query in its own SSMS window, the results will stay there until you close it, or run out of memory - besides that, I am not sure there is a way to accomplish what you want.
Once you close the SSMS window, I don't believe there is a way to get back 'cached' results.
This isn't a technical answer to your question. Having written queries and looking at results for many years, I am in the habit of saving the results in Excel, regardless of the database/query tool I'm using.
The format in Excel is rather methodical:
Each worksheet has the date. (Called something like "1 Jul".)
Each spreadsheet contains one month. (Typically with the month name like "work-201307".)
In the "B" column I copy and paste the query.
Underneath, in the "C" column, I copy and paste the results.
The next query starts a few lines after, one after the other.
I put the queries in the "B" column, so I can go to the "A" column and use to get to the first row. I put the results in the "C" column, so I can go to the "B" column and use to move between queries.
I mostly do this so I can go back and see the work I did many months ago. For instance, someone sends an email from February and says "do this again". I can go back to the February spreadsheet, go to the day it was created, and see what I was doing at that time.
In your question, though, I realize that I now instinctively solve this problem, because the "right click on the grid, copy with column headers, alt-tab to excel, alt-V" is a behavior that comes quite naturally to me.
I was going to suggest you to run each query into a script with a counter (stored in a table) increased each time the query is executed (i.e. i++) and storing each query in a Temp Table called "tmpTable" + i, but it sounds very complicated to manage. Am I right?
Then I googled and I've found this Tool Pack: I didn't try it but you could take a look:
http://www.ssmstoolspack.com/Features
Hope it helps.
EDIT: added the following link. There's the option to output as an XML file, and they mention SQL Server Integration Services as a possible solution too.
http://michaeljswart.com/2012/03/sending-query-results-to-others/#method5
SECOND EDIT: There's this DBMS-Independent tool too, it sounds interesting:
http://www.sql-workbench.net/
I am not sure this is what you want. Anyway, check my answer.
In SQL Server Management Studio you can open multiple tabs for executing queries. Open a new tab for each query; the result of each executed query will then stay available under its tab.
After executing a query in a tab, don't reuse that tab for a new query; open a new tab for that job.
Have you considered using some kind of offline SQL client such as Excel? Specifically, Excel will retrieve the results into the spread sheet (using the Data ribbon/menus) where they are stored pretty much permanently as results. It will prompt you to refresh when necessary or you can do it on demand.
Whether it can be done in T-SQL or in other databases depends on the database and its results cache, and even then these are options that the query processor can use, not guarantees for an individual query.

How could i write this code in a more performant way?

In our app people have 1 or multiple projects. These projects have a start and an end date. People have a limited amount of available days.
Now we have a page that displays the availability of a given person on a week by week basis. It currently shows 18 weeks.
The way we currently calculate the available time for a given week is like this:
def days_available(query_date=Date.today)
  days_engaged = projects.current.where("start_date < ? AND finish_date > ?", query_date, query_date).sum(:days_on_project)
  available = days_total - hours_engaged
end
This means that to display the page described above, the app will fire 18(!) queries into the database. We have pages that list the availability of multiple people in a table. For these pages the number of queries quickly becomes staggering.
It is also quite slow.
How could we handle the availability retrieval in a more performant manner?
This is quite a common scenario when working with date ranges in an entity. The easiest and fastest way is in SQL:
Join your events to a number generated date table (see generate days from date range) so that you have a row for each day a person or people are occupied. Once you have the data in this form it is simply a matter of grouping by the week date part of the date and counting the rows per grouping.
You can extend this to group by person for multiple person queries.
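Here is a minimal sketch of that idea, run through Python and SQLite so it can be tried locally. Several things are assumptions: the projects table layout, the sample rows, the use of a recursive CTE as the generated date table, and counting every calendar day inside a project as one engaged day (the question sums days_on_project instead). The date range is also treated as inclusive here, unlike the strict comparison in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (person_id INTEGER, start_date TEXT, finish_date TEXT)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "2024-01-03"),  # Mon to Wed of the first week
        (1, "2024-01-08", "2024-01-09"),  # Mon to Tue of the second week
        (2, "2024-01-02", "2024-01-02"),  # a single Tuesday
    ],
)

# One row per calendar day in the window, joined to every project covering
# that day, then counted per person and per week.
WEEKLY_SQL = """
WITH RECURSIVE days(d) AS (
    SELECT :first_day
    UNION ALL
    SELECT date(d, '+1 day') FROM days WHERE d < :last_day
)
SELECT p.person_id, strftime('%Y-W%W', days.d) AS week, COUNT(*) AS days_engaged
FROM days
JOIN projects AS p ON days.d BETWEEN p.start_date AND p.finish_date
GROUP BY p.person_id, week
ORDER BY p.person_id, week
"""

# A single query covers the whole window and every person.
rows = conn.execute(
    WEEKLY_SQL, {"first_day": "2024-01-01", "last_day": "2024-01-14"}
).fetchall()
```

Widening the window to 18 weeks only changes the two date parameters, so the page needs one query instead of one per week per person. Availability is then days_total minus days_engaged for each returned row.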
From a SQL point of view, I'd advise using a stored procedure and pass in your date/range requirement, you can then return a recordset for a user or possibly multiple users. This way your code just has to access db once.
You can then output recordset data in one go, by iterating through.
Hope this helps.
Use a stored procedure to run your query and get the data.
Pass parameters to the SQL query; in your case that is today's date.
Apply your conditions and logic in the stored procedure. Using a procedure is a good and fast way to retrieve data from SQL, and it also helps protect your code from SQL injection.
Call that stored procedure from your code. As I don't know Ruby on Rails, I can't provide steps for calling a stored procedure from it.
After that, the data fetched by your stored procedure will be available in a data table or something similar.
Once you have the data, you can do whatever you need with it.
Hope this helps.
See what query is executed. You can then run the EXPLAIN command on your query:
explain select * from project where start_date < any_date and end_date> any_date2
You will see the query plan. Use this plan to optimize your query.
For example:
If you have an index on the end_date field, reorder the condition (end_date > any_date2 and start_date < any_date). This step will use the index if you have one on this field. But this step is database dependent; the example is for MySQL. If you want MySQL to use an index, the indexed condition must be in the left part of the WHERE clause.
There's not really enough information in your question to know exactly what you're trying to achieve here, e.g. the code snippet doesn't make use of the returned database query, so you could just remove it to make it faster. Perhaps this is just a bug in the code you posted?
Having said that, there are some techniques you should look into to implement your functionality.
I would take a look at using data warehouse techniques. I would think of your 'availability information' as a Fact table in a star schema, with 'Dates' and 'People' as Dimension tables.
You can then use queries to get stuff like - list of users for this projects for this week, and their availability.
Data warehousing has a whole bunch of resources you can tap into to help make this perform well, but there's also a lot of terminology that can be confusing, but for this type of 'I need to slice and dice my data across several sets of things (people and time)', Data Warehousing techniques can be quite powerful.
As I don't understand Ruby on Rails, from a SQL point of view I suggest you write a stored procedure and return a dataset. Then do the necessary table operations on the dataset from the front end. It will reduce the unnecessary calls to the DB.

Using Greater Than In FQL Queries

Can anyone tell me why this works:
https://graph.facebook.com/fql
?q=SELECT display_name FROM application WHERE app_id=0&access_token=...
(and returns 0 results, obviously)
but this doesn't:
https://graph.facebook.com/fql
?q=SELECT display_name FROM application WHERE app_id>=0&access_token=...
(HTTP 500)
The FQL pages on Facebook itself only ever give the simplest of queries - they never give samples of more complex queries involving strpos() and anything other than =.
I am aware of the need to work on an indexed column, but app_id is definitely one of those :)
If the column is indexable you still need to provide specific values for it.
If you could provide vague ranges (e.g. '>0'), it would defeat the restriction that requires you to specify the target objects first and filter later.