How could i write this code in a more performant way? - sql

In our app people have 1 or multiple projects. These projects have a start and an end date. People have a limited amount of available days.
Now we have a page that displays the availability of a given person on a week by week basis. It currently shows 18 weeks.
The way we currently calculate the available time for a given week is like this:
def days_available(query_date=Date.today)
days_engaged = projects.current.where("start_date < ? AND finish_date > ?", query_date, query_date).sum(:days_on_project)
available = days_total - hours_engaged
end
This means that to display the page descibed above the app will fire 18(!) queries into the database. We have pages that lists the availability of multiple people in a table. For these pages the amount of queries is quickly becomes staggering.
It is also quite slow.
How could we handle the availability retrieval in a more performant manner?

This is quite a common scenario when working with date ranges in an entity. Easy and fastest way is in SQL:
Join your events to a number generated date table (see generate days from date range) so that you have a row for each day a person or people are occupied. Once you have the data in this form it is simply a matter of grouping by the week date part of the date and counting the rows per grouping.
You can extend this to group by person for multiple person queries.

From a SQL point of view, I'd advise using a stored procedure and pass in your date/range requirement, you can then return a recordset for a user or possibly multiple users. This way your code just has to access db once.
You can then output recordset data in one go, by iterating through.
Hope this helps.

USE Stored procedure to fire your query to SQL to get data.
Pass paramerts in your case it is today's date to the SQl query.
Apply your conditions and Logic in the SQL Stored procedure , Using procedure is the goood and fastest way to retrieve data from the SQL , also it will prevent your code from the SQL injection too.
Call that SP from your Code as i dont know the Ruby on raisl I cant provide you steps about how to Call the Stored procedure from it.
After that the data fdetched as per you stored procedure will be available in Data table or something like that.
After getting the data you can perform all you need
Hope this helps

see what query is executed. further you may make comand explain to your query
explain select * from project where start_date < any_date and end_date> any_date2
you see the plan of query . Use this plan to optimized your query.
for example :
if you have index using field end_date replace a condition(end_date> any_date2 and start_date < any_date) . this step will using index if you have index on this field. But it step is db dependent . example is for nysql. if you want use index in mysql you must have using index condition on left part of where

There's not really enough information in your question to know exactly what you're trying to achieve here, e.g. the code snippet doesn't make use of the returned database query, so you could just remove it to make it faster. Perhaps this is just a bug in the code you posted?
Having said that, there are some techniques you should look into to implement your functionality.
I would take a look at using data warehouse techniques. I would think of your 'availability information' as a Fact table in a star schema, with 'Dates' and 'People' as Dimension tables.
You can then use queries to get stuff like - list of users for this projects for this week, and their availability.
Data warehousing has a whole bunch of resources you can tap into to help make this perform well, but there's also a lot of terminology that can be confusing, but for this type of 'I need to slice and dice my data across several sets of things (people and time)', Data Warehousing techniques can be quite powerful.

As I dont understand ruby on rails,from sql point of view i suggest you to write a stored procedure and return a dataset.And do the necessary table operations on the dataset from front end.It will reduce the unnecessary calls to DB.

Related

ms access query (ms access freezes)

I have this report and need to add totals for each person (the red circle)
existing report
new report
I cannot change the existing report so I export data from MS SQL to MS Access and create a new report there. I got it working for one employee but have trouble with a query which would for multiple employees.
This query extract data use as input:
SELECT [TIME].[RCD_NUM], [TIME].[EMP_ID], [TIME].[PPERIOD], [TIME].[PRUN], [TIME].[TDATE], [TIME].[PC], [TIME].[RATE], [TIME].[HOURS], [TIME].[AMOUNT], [TIME].[JOB_ID], [TIME].[UPDATED], [TIME].[UPDATED_BY], [TIME].[LOG_DATE], [TIME].[ORIGINAL_REC_NUM]
FROM [TIME]
WHERE ((([TIME].[EMP_ID])=376) And (([TIME].[TDATE])<=#12/31/2006# And ([TIME].[TDATE])>=#1/1/2006#) And (([TIME].[PC])<599));
this query populates the report:
SELECT *
FROM TIME1
WHERE RCD_NUM = (SELECT Max(RCD_NUM) FROM [TIME1] UQ WHERE UQ.PPERIOD = [TIME1].PPERIOD AND UQ.PC = [TIME1].PC);
the problem is if I remove EMP_ID from the first query like this
SELECT [TIME].[RCD_NUM], [TIME].[EMP_ID], [TIME].[PPERIOD], [TIME].[PRUN], [TIME].[TDATE], [TIME].[PC], [TIME].[RATE], [TIME].[HOURS], [TIME].[AMOUNT], [TIME].[JOB_ID], [TIME].[UPDATED], [TIME].[UPDATED_BY], [TIME].[LOG_DATE], [TIME].[ORIGINAL_REC_NUM]
FROM [TIME]
WHERE ((([TIME].[TDATE])<=#12/31/2006# And ([TIME].[TDATE])>=#1/1/2006#) And (([TIME].[PC])<599));
then the second query doesn't work and ms access freezes when running this query.
any help/idea please?
Caveat: I won't pretend to know the precise cause of the problem, but I have had to repeatedly refactor queries in Access to get them working even though the original SQL statements are completely valid in regards to syntax and logic. Sometimes I've had to convolute a sequence of queries just to avoid bugs in Access. Access is often rather dumb and will simply (re)execute queries and subqueries exactly as given without optimization. At other times Access will attempt to combine queries by performing some internal optimizations, but sometimes those introduce frustrating bugs. Something as simple as a name change or column reordering can be the difference between a functioning query and one that crashes or freezes Access.
First consider:
Can you leave the data on SQL Server and link to the results in Access (rather than export/importing it into Access)? Even if you need or prefer to use Access for creating the actual report, you could use all the power of SQL Server for querying the data--it is likely less buggy and more efficient.
Common best practice is to create SQL Server stored procedures that return just what data you need in Access. A pass-through query is created in Access to retrieve the data, but all data operations are performed on the server.
Perhaps this is just a performance issue where limiting the set by [EMP_ID] selects a small subset, but the full table is large enough to "freeze" Access.
How long have you let Access remain frozen before killing the process? Be patient... like many, many minutes (or hours). Start it in the morning and check after lunch. :) It might eventually return a result set. This does not imply it is tolerable or that there is no other solution, but it can be useful to know if it eventually returns data or not.
How many possible records are there?
Are the imported data properly indexed? Add indexes to all key fields and those which are used in WHERE clauses.
Is the database located on a network share or is it local? Try copying the database to a local drive.
Other hints:
Try the BETWEEN operator for dates in the WHERE clause.
Try refactoring the "second" query by performing a join in the FROM clause rather than the WHERE clause. In doing this, you may also want to save the subquery as a named query (just as [TIME1] is saved). Whether or not a query is saved or embedded in another statement CAN change the behavior of Access (see caveat) even though the results should be identical.
Here's a version with the embedded aggregate query. Notice how all column references are qualified with the source. Some of the original query's columns do not have a source alias prefixing the column name. Remember the caveat... such picky details can affect Access behavior.:
SELECT TIME1.*
FROM TIME1 INNER JOIN
(SELECT UQ.PPERIOD, UQ.PC, Max(UQ.RCD_NUM) As Max_RCD_NUM
FROM [TIME1] UQ
GROUP BY UQ.PPERIOD, UQ.PC) As TIMEAGG
ON (TIME1.PPERIOD = TIMEAGG.PPERIOD) And (TIME1.PC = TIMEAGG.PC)
AND (TIME1.RCD_NUM = TIMEAGG.Max_RCD_NUM)

t-sql append data to an existing dataset

I have a report writer that will allow me to query my db directly, however the query has to select from a SQL view. I have to develop a lot tracking solution but because of the logic involved, the only way I have been able to accomplish this so far is to hijack the SQL statement before it leaves the report writer and point it towards a function (instead of the view).
I need to develop a more user friendly way to accomplish this. My first though was to populate the view that the report writer sees with an item and lot number from one of my tables, call my function with the item and lot number and then somehow append the original view with usage and consumption transactions for that item/lot. Because of how the report writer is designed, the original view that returns just the item/lot must be the same object as the view that is eventually populated with the transactions.
Is there a way to use an alter view statement as part of a query? Is there a better way to accomplish this goal? I am a bit lost here.
Well not having the reputation to comment, and seeing that this is SQL Server could you do the following?:
SELECT st.*
, dbo.ufn_usage_and_consumption(st.item_number, st.lot_number)
from some_table st
Basically, you are calling both the view and FOR EACH ROW of the view, calling the SQL Server Function.
Please note that this is NOT OPTIMAL. You are essentially doing RBAR processing (Row by Agonizing Row) and calling the function for every row.
Really, I would see about creating a stored procedure, if your report writer supports that and pass parameters to call the query and pass results back.
I'm making the following assumption:
1) the data coming back from the function is a scalar (one value only), if it's not you can return it as a comma delimitted string
Don't know if that helps or not but good luck with your query!

Script to compare two tables in database, from user input

I am very new to VBA and SQL and am trying to learn. I have a MS Access project that requires a VBA script that prompts the user to input two table names and numerous field names and create a SQL query utilizing those the names.
The specific SQL query I'm trying to use is below.
SELECT
A.user_index, A.input1, B.input1, A.input2, B.input2, A.input3, B.input3, B.input4,
A.input4, A.input5, B.input5
FROM
table1 AS A
LEFT JOIN
table2 AS B ON A.user_index = B.user_index
WHERE
(((A.input1) <> [B].[input1)) OR
(((A.input2) <> [B].[input2])) or
(((A.input4) <> [B].[input4]));
The overall purpose of this is to have a script that will be able to list fields for comparison that is applicable with any database. I know this is probably a relatively easy solution. However, I have no idea where to start.
My first instinct is to say "What have you tried so far?", but as you said, you don't know where to start.
It sounds like you need to first prompt the user for several field and table names, then build a query based on those values. I recommend first outlining exactly what you want your script to do. Maybe something like:
Declare variables to hold the values.
Prompt the user for each of the values and store them in the variables.
2a. After the user enters a value, make sure it is valid. If not, do something accordingly.
Declare a variable to hold your SQL query.
Construct the query.
Run the query.
This is obviously just an example. Break down each step into "baby steps" as much as possible.
It's a good idea to ask yourself how unique these baby steps are to your particular situation (hint: they almost certainly are not unique). If they aren't, then they have probably been solved tens of thousands of times already, so you have a very good chance of googling your questions.
If you still can't find an answer to how to do a particular step, feel free to ask here. Just remember to include your code even if it is broken :)

Executing multiple queries in a single DB Call - SQL

I'm creating pdf reports with Data retrieved from Database (Oracle). For each report I'm making a DB Call. I want to optimize the call for multiple reports ( can have max of 500 reports). In current scenario, I am making 500 DB calls and this results in timeout of the Server.
I'm looking for solutions and answers.
1. Can I pass a list of data as input to a query ? (The query required 2 inputs.)
2. The entire set of data retrieval should happen in 1 DB Call not 500 separate calls.
3. The response should be accumulated result of 500 inputs.
Please suggest ways to solve or directions to the solve the issue ?
It is a Java based system. The DB call is from a Web App. DB : Oracle.
Thanks!!
If you want to get the data for an arbitrary number of "reports" in a single database call, then I would imagine you need to be calling a stored procedure that returns a very large nugget of XML or JSON text that you can then parse and display in your application. Oracle has built-in functions for constructing XML, and JSON is pretty easy to structure yourself (though I believe a 3rd party PL/SQL JSON package may be available).
there are a few ways of combining results from multiple queries.
UNION ALL -- lets you literally combine results between query1 UNION ALL query2
Make 1 more general query. -- this is the best answer if it can be done.
Join Sub queries and print the data horizontally if you can join them. select a., b. from (querya) a join (queryb) b on (id)
There are probably other ways as well. Such as a stored procedure etc.

Consolidated: SQL Pass comma separated values in SP for filtering

I'm here to share a consolidated analysis for the following scenario:
I've an 'Item' table and I've a search SP for it. I want to be able to search for multiple ItemCodes like:
- Table structure : Item(Id INT, ItemCode nvarchar(20))
- Filter query format: SELECT * FROM Item WHERE ItemCode IN ('xx','yy','zz')
I want to do this dynamically using stored procedure. I'll pass an #ItemCodes parameter which will have comma(',') separated values and the search shud be performed as above.
Well, I've already visited lot of posts\forums and here're some threads:
Dynamic SQL might be a least complex way but I don't want to consider it because of the parameters like performance,security (SQL-Injection, etc..)..
Also other approaches like XML, etc.. if they make things complex I can't use them.
And finally, no extra temp-table JOIN kind of performance hitting tricks please.
I've to manage the performance as well as the complexity.
T-SQL stored procedure that accepts multiple Id values
Passing an "in" list via stored procedure
I've reviewed the above two posts and gone thru some solutions provided, here're some limitations:
http://www.sommarskog.se/arrays-in-sql-2005.html
This will require me to 'declare' the parameter-type while passing it to the SP, it distorts the abstraction (I don't set type in any of my parameters because each of them is treated in a generic way)
http://www.sqlteam.com/article/sql-server-2008-table-valued-parameters
This is a structured approach but it increases complexity, required DB-structure level changes and its not abstract as above.
http://madprops.org/blog/splitting-text-into-words-in-sql-revisited/
Well, this seems to match-up with my old solutions. Here's what I did in the past -
I created an SQL function : [GetTableFromValues] (returns a temp table populated each item (one per row) from the comma separated #ItemCodes)
And, here's how I use it in my WHERE caluse filter in SP -
SELECT * FROM Item WHERE ItemCode in (SELECT * FROM[dbo].[GetTableFromValues](#ItemCodes))
This one is reusable and looks simple and short (comparatively of course). Anything I've missed or any expert with a better solution (obviously 'within' the limitations of the above mentioned points).
Thank you.
I think using dynamic T-SQL will be pragmatic in this scenario. If you are careful with the design, dynamic sql works like a charm. I have leveraged it in countless projects when it was the right fit. With that said let me address your two main concerns - performance and sql injection.
With regards to performance, read T-SQL reference on parameterized dynamic sql and sp_executesql (instead of sp_execute). A combination of parameterized sql and using sp_executesql will get you out of the woods on performance by ensuring that query plans are reused and sp_recompiles avoided! I have used dynamic sql even in real-time contexts and it works like a charm with these two items taken care of. For your satisfaction you can run a loop of million or so calls to the sp with and without the two optimizations, and use sql profiler to track sp_recompile events.
Now, about SQL-injection. This will be an issue if you use an incorrect user widget such as a textbox to allow the user to input the item codes. In that scenario it is possible that a hacker may write select statements and try to extract information about your system. You can write code to prevent this but I think going down that route is a trap. Instead consider using an appropriate user widget such as a listbox (depending on your frontend platform) that allows multiple selection. In this case the user will just select from a list of "presented items" and your code will generate the string containing the corresponding item codes. Basically you do not pass user text to the dynamic sql sp! You can even use slicker JQuery based selection widgets but the bottom line is that the user does not get to type any unacceptable text that hits your data layer.
Next, you just need a simple stored procedure on the database that takes a param for the itemcodes (for e.g. '''xyz''','''abc'''). Internally it should use sp_executesql with a parameterized dynamic query.
I hope this helps.
-Tabrez