Reduce Load on SQL Server DB

I have a third-party application whose queries hit a SQL Server 2008 database to fetch data as quickly as possible (near real time). The same query can be issued by multiple users at different times.
Is there a way to store the latest result and serve it to subsequent queries without actually hitting the database again and again for the same piece of data?

You could get the results from a procedure that caches the data in a global temporary table, or in a permanent table if you regularly drop connections (change tempdb..##Results to Results). Passing @refresh = 1 rebuilds the cached data:
Create procedure [getresults] (@refresh int = 0)
as
begin
-- Drop the cached copy when a refresh is requested
IF @refresh = 1 and OBJECT_ID('tempdb..##Results') IS NOT NULL
drop table ##Results
-- (Re)build the cache on first use or after a refresh
IF OBJECT_ID('tempdb..##Results') IS NULL
select * into ##Results from [INSERT SQL HERE]
SELECT * FROM ##Results
END
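For example, callers would hit the cached table by default and force a rebuild only when fresh data is required (the procedure name and parameter are the ones from the sketch above):
-- Serve whatever is already cached (builds ##Results on first call)
EXEC [getresults]
-- Force the cache to be rebuilt from the underlying query
EXEC [getresults] @refresh = 1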

Can you create an indexed view for the data?
When the underlying data is updated, the view is updated with it, so when the third party makes a call the view contents can be returned without needing to hit the base tables.
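A minimal sketch of such an indexed view, assuming a hypothetical dbo.Orders table with a non-nullable OrderTotal column; indexed views need SCHEMABINDING, two-part names, and a unique clustered index, and on non-Enterprise editions queries must reference the view WITH (NOEXPAND) to read the stored rows:
-- Schema-bound view whose result set will be persisted by the index
CREATE VIEW dbo.vw_OrderTotals
WITH SCHEMABINDING
AS
SELECT CustomerID,
       COUNT_BIG(*)    AS OrderCount,   -- required when the view uses GROUP BY
       SUM(OrderTotal) AS TotalAmount   -- OrderTotal assumed NOT NULL
FROM dbo.Orders
GROUP BY CustomerID
GO
-- Materializes the view; SQL Server maintains it as dbo.Orders changes
CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderTotals ON dbo.vw_OrderTotals (CustomerID)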

Unfortunately, the version of SQL Server you are using doesn't have a result cache like, for example, the MySQL query cache. But it does cache data pages, as per the documentation here: Buffer Management
Data pages which are read during a SELECT are first brought into the buffer cache. Subsequent requests reading the same data can thus be served quicker than the initial request, without needing to access the disk.

Related

SQL Table Comparison Taking Extended Periods of Time

I am building an application that dynamically takes data from different sources (files and emails, usually CSV and Excel, plus APIs and other SQL databases), processes it, and moves it to a central SQL server. All of the tables are uploaded to the main server and processed to insert new rows into the destination table and to update rows whose data has changed. The main server is Microsoft SQL Server.
When the data is uploaded to the main server for comparison, it is stored in a temporary table that is dropped once the comparison is done. The statement I am using is created dynamically by the program so it can adapt to different datasets. I have been using a NOT EXISTS query, which takes 2+ hours to process a table of 380k+ rows. I have also tried EXCEPT, but I can't use it because some of the tables contain text fields, which are not allowed in an EXCEPT statement. The datasets being uploaded to the server are written to and read from at different intervals based on schedules built into the program.
I am looking for a more efficient approach, or any improvements I could make, to bring down the run time for this table. The program that manages the uploads runs on a different server from the SQL instance, which is part of the organization's SQL farm. I am not very experienced with SQL, so I appreciate any help I can get. Below are links to the code and an example statement produced by the system when it runs the comparison.
C# Code: https://pastebin.com/8PeUvekG
SQL Statement: https://pastebin.com/zc9kshJw
INSERT INTO vewCovid19_SP
(Street_Number,Street_Dir,Street_Name,Street_Type,Apt,Municipality,County,x_cord,y_cord,address,Event_Number,latitude,longitude,Test_Type,Created_On_Date,msg)
SELECT A.Street_Number,A.Street_Dir,A.Street_Name,A.Street_Type,A.Apt,A.Municipality,A.County,A.x_cord,A.y_cord,A.address,A.Event_Number,A.latitude,A.longitude,A.Test_Type,A.Created_On_Date,A.msg
FROM #TEMP_UPLOAD A
WHERE NOT EXISTS (
SELECT * FROM vewCovid19_SP B
WHERE ISNULL(CONVERT(VARCHAR,A.Street_Number), 'NULL') = ISNULL(CONVERT(VARCHAR,B.Street_Number), 'NULL')
AND ISNULL(CONVERT(VARCHAR,A.Street_Dir), 'NULL') = ISNULL(CONVERT(VARCHAR,B.Street_Dir), 'NULL')
AND ISNULL(CONVERT(VARCHAR,A.Apt), 'NULL') = ISNULL(CONVERT(VARCHAR,B.Apt), 'NULL')
AND ISNULL(CONVERT(VARCHAR,A.Street_Name), 'NULL') = ISNULL(CONVERT(VARCHAR,B.Street_Name), 'NULL')
AND ISNULL(CONVERT(VARCHAR,A.Street_Type), 'NULL') = ISNULL(CONVERT(VARCHAR,B.Street_Type), 'NULL'));
DROP TABLE #TEMP_UPLOAD
One simple query form would be to load a new table from a UNION (which includes de-duplication).
eg
insert into vewCovid19_SP_new
select *
from vewCovid19_SP
union
select *
from #temp_upload
then swap the tables with ALTER TABLE ... SWITCH or drop and sp_rename.
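A rough sketch of the rename swap, assuming the union load above went into vewCovid19_SP_new (the _old name is just a placeholder for the retired copy):
-- Retire the current table and promote the freshly loaded one
BEGIN TRAN
EXEC sp_rename 'vewCovid19_SP', 'vewCovid19_SP_old'
EXEC sp_rename 'vewCovid19_SP_new', 'vewCovid19_SP'
COMMIT
-- Once nothing reads the old copy any more, it can be dropped
DROP TABLE vewCovid19_SP_old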

How to fix MS SQL linked server slowness issue for MariaDB

I'm using a SQL Server linked server to fetch data from MariaDB.
But I am seeing slowness when I query MariaDB through the linked server.
I used the scenarios below to fetch the results (the time taken by each query is noted).
Please suggest if you have any solutions.
Total number of rows in the patient table: 62520
SELECT count(1) FROM [MariaDB]...[webimslt.Patient] -- 2.6 second
SELECT * FROM OPENQUERY([MariaDB], 'select count(1) from webimslt.patient') -- 47ms
SELECT * FROM OPENQUERY([MariaDB], 'select * from webimslt.patient') -- 20 second
This isn’t really a fair comparison...
SELECT COUNT(1) is only returning a single number and will probably be using an index to count rows.
SELECT * is returning ALL data from the table.
Returning data is an expensive (slow) process, so it will obviously take time to return all of it. Then there is the question of data transfer: are the servers connected by a high-speed link? That is also a factor. It will never be as fast to query over a linked server as it is to query your database directly.
How can you improve the speed? I would start by returning only the data you need, by specifying the columns and adding a WHERE clause. After that, you can probably use indexes on the MariaDB side to speed things up.
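For example, a hedged sketch of pushing the projection and filter down to MariaDB via OPENQUERY (the column names and date filter here are invented for illustration):
-- The inner query runs entirely on the MariaDB side, so only the
-- needed columns and rows travel across the linked server
SELECT *
FROM OPENQUERY([MariaDB],
    'SELECT patient_id, first_name, last_name
     FROM webimslt.patient
     WHERE updated_at >= ''2020-01-01''')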

SQL select by field acting weird

I am writing this post because I have encountered something truly weird with an SQL statement I am trying to make.
Context:
I am developing an app which uses JPA in the backend to persist / retrieve objects to/from a postgres database.
Problem:
During some tests I have noticed that when a particular user adds entries in the database and later I try to fetch them by his facebook id, the result is an empty list, even though the entries are there in the database. Doing a select statement on the database returns no rows. This does not happen with other users.
I have noticed that the mentioned user's facebook id is slightly longer than the others'. I do not know if or how this affects the situation.
Interesting part:
When during debugging I created an entry not programmatically, but manually with a SQL INSERT statement directly on the database (marked red on the 1st screenshot), I could fetch the data by facebook id both in my app and with a select statement.
Do you have any ideas what is going on here?
Please check the screenshots:
result of select * from table:
result of select * from table where user_facebook_id = 10215905779020408:
Please help,
Thanks

Make an SQL request that builds a 'result cache' without returning results

I have a stateless webserver that requires 2 sets of user input to do a computation:
Page 1: GET INPUT A
Page 2: GET INPUT B
Page 3: Results calculated from user input A and B
It so happens that the bottleneck in my application is a lookup related to user input A.
As a speed-up hack, I make the SQL request on A that "Page 3" will later make while I wait for the user to input B, so that when the user clicks submit on 'Page 2' the lookup result for 'data A' is already cached (saving impatient users 2-5 seconds).
My question
Is it possible to make my SQL lookup in such a way that the server runs the query and caches it without returning anything, as I only need it to be in the cache to make the final request 2-5 seconds faster?
PostgreSQL doesn't have a result cache, so you can't pre-warm it.
It does have a disk cache. To pre-warm that, you can just run SELECT statements that discard the results. Use a PL/PgSQL PERFORM statement, a pointless aggregate, etc. A search for "postgresql pre-warm cache" may be informative.
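A minimal sketch of such a warm-up, assuming the expensive lookup reads a hypothetical big_table filtered by the value from input A:
-- Option 1: a pointless aggregate issued from the application; the single
-- number returned is thrown away, but the pages are now in cache
SELECT count(*) FROM big_table WHERE lookup_key = 'value-from-input-A';

-- Option 2: the same idea inside PL/pgSQL, discarding the rows with PERFORM
DO $$
BEGIN
    PERFORM * FROM big_table WHERE lookup_key = 'value-from-input-A';
END;
$$;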
You may also want to look into materialized views. PostgreSQL doesn't support materialized views yet, but you can simulate them with triggers and scripts. There are also patches in progress to add materialized view support.
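A very rough sketch of that simulation, again assuming the hypothetical big_table: a cache table stands in for the materialized view and a statement-level trigger naively rebuilds it on any change.
-- Cache table standing in for a materialized view
CREATE TABLE a_lookup_cache AS
SELECT lookup_key, count(*) AS n
FROM big_table
GROUP BY lookup_key;

-- Trigger function that rebuilds the cache whenever the base table changes
CREATE OR REPLACE FUNCTION refresh_a_lookup_cache() RETURNS trigger AS $$
BEGIN
    DELETE FROM a_lookup_cache;
    INSERT INTO a_lookup_cache
    SELECT lookup_key, count(*) FROM big_table GROUP BY lookup_key;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_refresh_a_lookup_cache
AFTER INSERT OR UPDATE OR DELETE ON big_table
FOR EACH STATEMENT EXECUTE PROCEDURE refresh_a_lookup_cache();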
If the result is not big, the best option would be to save it as user state in a server-side session.
Update:
This is one of the cases (multiple web servers vs. a single DB server) where a DB-stored session fits.
Save the result:
insert into temp_result (session_id, a, b)
select %(session_id)s, a, b
from t
Retrieve it:
select a, b
from temp_result
where session_id = %(session_id)s
When the session expires, or after a timeout, delete it.
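For example, the timeout cleanup could be a small periodic job; the created_at column is an assumed addition to the temp_result table used above:
-- Purge cached results that have not been touched for 30 minutes
delete from temp_result
where created_at < now() - interval '30 minutes';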

How can I efficiently compare my data with a remote database?

I need to update my contacts database in SQL Server with changes made in a remote database (also SQL Server, on a different server on the same local network). I can't make any changes to the remote database, which is a commercial product. I'm connected to the remote database using a linked server. Both tables contain around 200K rows.
My logic at this point is very simple: [simplified pseudo-SQL follows]
/* Get IDs of new contacts into local temp table */
Select remote.ID into #NewContactIDs
From Remote.Contacts remote
Left Join Local.Contacts local on remote.ID=local.ID
Where local.ID is null
/* Get IDs of changed contacts */
Select remote.ID into #ChangedContactIDs
From Remote.Contacts remote
Join Local.Contacts local on remote.ID=local.ID
Where local.ModifyDate < remote.ModifyDate
/* Pull down all new or changed contacts */
Select ID, FirstName, LastName, Email, ...
Into #NewOrChangedContacts
From Remote.Contacts remote
Where remote.ID in (
Select ID from #NewContactIDs
union
Select ID from #ChangedContactIDs
)
Of course, doing those joins and comparisons over the wire is killing me. I'm sure there's a better way - advice?
Consider maintaining a lastCompareTimestamp (the last time you did the compare) in your local system. Grab all the remote records with a ModifyDate greater than lastCompareTimestamp and throw them in a local temp table, as sketched below. Work with them locally from there.
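A rough sketch of that approach, using the linked-server layout from the question and an assumed local SyncLog helper table that stores the last compare time:
/* Read the last successful compare time (assumed SyncLog helper table) */
Declare @lastCompareTimestamp datetime
Select @lastCompareTimestamp = LastCompareTimestamp
From SyncLog Where TableName = 'Contacts'

/* Pull only the remote rows changed since then into a local temp table */
Select ID, FirstName, LastName, Email, ModifyDate
Into #ChangedRemoteContacts
From Remote.Contacts
Where ModifyDate > @lastCompareTimestamp

/* Compare/merge #ChangedRemoteContacts locally, then advance the watermark */
Update SyncLog Set LastCompareTimestamp = GETDATE() Where TableName = 'Contacts'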
The last compare date is a great idea.
One other method I have had great success with is SSIS (though it has a learning curve, and might be overkill unless you do this type of thing a lot):
Make a package
Set a data source for each of the two tables. If you expect a lot of change, pull the whole tables; if you expect only incremental changes, filter by mod date. Make sure the results are ordered.
Funnel both sets into a Full Outer Join
Split the results of the join into three buckets: unchanged, changed, new
Discard the unchanged records, send the new records to an insert destination, and send the changed records to either a staging table for a SQL-based update, or - for few rows - an OLEDB command with a parameterized update statement.
OR, if you are on SQL Server 2008, use MERGE, as sketched below.
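A hedged sketch of the MERGE route, reusing the Contacts columns from the question and assuming the changed remote rows were first staged locally (e.g. in #ChangedRemoteContacts as above):
Merge Local.Contacts As target
Using #ChangedRemoteContacts As source
    On target.ID = source.ID
When Matched And target.ModifyDate < source.ModifyDate Then
    Update Set FirstName  = source.FirstName,
               LastName   = source.LastName,
               Email      = source.Email,
               ModifyDate = source.ModifyDate
When Not Matched By Target Then
    Insert (ID, FirstName, LastName, Email, ModifyDate)
    Values (source.ID, source.FirstName, source.LastName, source.Email, source.ModifyDate);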