I have an API that exposes a WebSocket connection. To keep the connection alive, my ReactJS frontend sends an echo message over the WebSocket every second. Whenever the server receives that message, it runs a database query (a SELECT), so I am effectively querying the database once per second. Will this kill the system over time? Is it poor practice to query a database that frequently? Any explanation would help me improve the code. My system goes to production soon and I'd like to avoid any silly problems.
From what you describe, a query is executed every second, and doing this will eventually cause problems with server resources.
In my opinion, there are two possible solutions:
1- Reduce the number of requests to the database by using a design pattern such as data caching.
2- Change your WebSocket design so that the server reads from the database only when an event or data change occurs, and then sends the new data to the user.
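A minimal sketch of option 1, assuming Python on the server (the question does not say which backend is used). The heartbeat handler reads through a small time-based cache, so the SELECT runs at most once per `ttl_seconds` instead of once per message. `TTLCache`, `run_select`, `on_heartbeat` and the 30-second TTL are hypothetical names and values chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Keeps the result of an expensive call for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # still fresh: no database round trip
        value = loader()               # stale or missing: run the real query
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


query_log = []  # records each time the (hypothetical) SELECT actually runs


def run_select() -> list:
    """Stand-in for the real SELECT; replace with the actual database call."""
    query_log.append(time.monotonic())
    return [("example", "row")]


cache = TTLCache(ttl_seconds=30)


def on_heartbeat() -> list:
    """Called for every keep-alive message received over the WebSocket."""
    return cache.get("heartbeat-data", run_select)
```

Calling `cache.invalidate("heartbeat-data")` from the code path that writes to the database moves this closer to option 2: the next heartbeat after a change reloads the data, and every other heartbeat is served from memory.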
I have a data conversion and caching service running as a self-hosted WCF service.
It currently polls the database at constant short intervals to update its data.
I think that's unnecessary. The data can change only if one of the tables changes, and when that happens depends on user actions.
There is no problem in setting a trigger on specific tables, but I would need an action outside SQL Server to update my cache. My WCF service could perform the update when it receives a specific URI via HTTP. So all I need is a command in the table trigger that sends a request. Is that even possible?
I'm thinking of a hack I used back in the day with HTTP requests: I held the HTTP response at the server until a data packet arrived from somewhere else. There was no delay between polling requests, so I achieved fully asynchronous, "real-time" updates.
Maybe this approach can be applied to SQL? I'm thinking of a query that blocks until it receives a signal. It would eventually time out, but it's good enough to try. Then how do I signal and wait in SQL? By locking and unlocking a shared resource, like a cursor or a dummy table?
Any other options?
I need the cache update to run at the lowest possible frequency (it's pretty expensive, so once per minute is great), but I need an immediate update when the data changes.
To answer your question, have you looked at xp_cmdshell?
https://msdn.microsoft.com/en-us/library/ms175046.aspx
However, the security/performance implications of such a decision could be non-trivial depending on your use case.
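For the requirement of refreshing rarely but reacting immediately to changes, the waiting side can be sketched as below, assuming the change notification reaches the service somehow (for example through the HTTP endpoint mentioned in the question). This is a Python illustration of the wait-with-timeout idea, not WCF code; `CacheRefresher`, `notify_changed` and `run_once` are hypothetical names.

```python
import threading
from typing import Callable


class CacheRefresher:
    """Rebuilds a cache when signalled, or after max_age_seconds at the latest."""

    def __init__(self, rebuild: Callable[[], None], max_age_seconds: float = 60.0):
        self._rebuild = rebuild
        self._max_age = max_age_seconds
        self._changed = threading.Event()

    def notify_changed(self) -> None:
        """Called by whatever receives the change notification (e.g. an HTTP handler)."""
        self._changed.set()

    def run_once(self) -> str:
        """Block until a change is signalled or the timeout expires, then rebuild."""
        signalled = self._changed.wait(timeout=self._max_age)
        self._changed.clear()
        self._rebuild()
        return "signal" if signalled else "timeout"

    def run_forever(self) -> None:
        while True:
            self.run_once()
```

Several notifications that arrive while a rebuild is pending collapse into a single rebuild, which keeps the expensive update infrequent.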
We have data stored in a data warehouse as follows:
Price
Date
Product Name (varchar(25))
We currently only have four products. That changes very infrequently (on average once every 10 years). Once every business day, four new data points are added representing the day's price for each product.
On the website, a user can request this information by entering a date range and selecting one or more product names. Analytics shows that the feature is not heavily used (about 10 user requests per week).
It was suggested that the data warehouse should push (via SFTP) a daily CSV file containing all data (currently 6718 rows and growing by four each day) to the web server. The web server would then read the data from the file and display it whenever a user made a request.
Usually, the push would happen only once a day, but more than one push per day could be used to communicate (infrequent) price corrections. Even in the price correction scenario, all data would be delivered in the file. What are the problems with this approach?
Would it be better to have the web server make a request to the data warehouse per user request? Or does this have issues such as a greater chance for network errors or performance issues?
Would it be better to have the web server make a request to the data warehouse per user request?
Yes, it would. You have very little data, so there is no need to try to 'cache' it in some way. (Apart from the fact that CSV might not be the best way to do this.)
There is nothing stopping you from making these requests from the web server to the database server. With as little data as this, performance will not be an issue, and even if it becomes one as everything grows, there is a lot to be gained on the database side (indexes, etc.) that will help you survive the next 100 years in this fashion.
The number of requests from your users (also extremely small) does not need any special treatment, so again, a direct query would be best.
Or does this have issues such as a greater chance for network errors or performance issues?
Well, it might, but that would not justify your CSV method. Some examples, and why you need not worry about them:
The connection to the database server is down.
This is an issue for both methods. With only one connection per day, the chance of hitting a 1-in-10,000 failure might seem lower for the once-a-day method. But these issues should not come up very often, and if they do, you should be able to handle them (retry the request, show a message to the user). This is what an enormous number of websites do, so trust me when I say this will not be an issue. Also, think of what it would mean if your daily update failed. That would be a bigger problem!
Performance issues
As said, given the amount of data and the number of requests, this is not a problem. And even if it becomes one, it is a problem you should be able to solve at a different level. Use a caching system (not CSV) on the database server. Use a caching system on the web server. Fix your indexes to stop performance from being a problem.
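The "retry the request, show a message to the user" handling mentioned above could look roughly like this. It is only a sketch: `query_with_retry`, the attempt count and the delay are hypothetical, and `run_query` stands for whatever function actually talks to the database (assumed here to raise `ConnectionError` when the server is unreachable).

```python
import time
from typing import Any, Callable


def query_with_retry(
    run_query: Callable[[], Any],
    attempts: int = 3,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run the query, retrying a few times before giving up with a user-facing error."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds * attempt)  # wait a little longer after each failure
    # All attempts failed: surface something the web page can show to the user.
    raise RuntimeError(
        "The price data is temporarily unavailable; please try again."
    ) from last_error
```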
BUT:
It is far from strange to want your data warehouse separated from your web system. If this is a requirement, and it surely could be, the best thing you can do is re-create your warehouse database (the one I just defended as being good enough to query directly) on another machine. You might get good results with a master-slave setup:
your data warehouse is the master database: it sends all changes to the slave but is otherwise inaccessible
your second database (even on your web server) gets all updates from the master and is read-only; you can only query it for data
your web server cannot connect to the data warehouse, but it can connect to the slave to read information. Even if there were an injection hack, it wouldn't matter, as the slave is read-only.
Now there is no single moment at which you update the queried database (the master-slave replication keeps it up to date at all times), and there is no chance that queries from the web server put your warehouse in danger. Profit!
I don't really see how SQL injection could be a real concern. I assume you have some calendar-type field that the user fills in to get data out. If this is the only form, just ensure that the only value it accepts is a valid date; then something like DROP TABLE isn't possible. As for getting access to the database, that is another issue. However, a separate file containing just the connection function should do fine in most cases, so that a user can't, say, open your webpage in an HTML viewer and see your database connection string.
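As a rough illustration of validating the date input, combined with a parameterized query so that user input is never concatenated into the SQL text, here is a sketch using Python's built-in `sqlite3` as a stand-in for the real warehouse. The table name `prices`, its columns, and the product names are assumptions; the question does not give them.

```python
import sqlite3
from datetime import date
from typing import Iterable, List, Tuple

# Hypothetical product names; the question only says there are four products.
ALLOWED_PRODUCTS = {"A", "B", "C", "D"}


def fetch_prices(
    conn: sqlite3.Connection, start: str, end: str, products: Iterable[str]
) -> List[Tuple]:
    # date.fromisoformat raises ValueError for anything that is not a date,
    # so a string like "2020-01-01'; DROP TABLE prices;--" is rejected here.
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)

    # Only product names from the known list are accepted.
    chosen = [p for p in products if p in ALLOWED_PRODUCTS]
    if not chosen:
        return []

    # Values are passed as parameters ("?"); only the number of placeholders varies.
    placeholders = ",".join("?" for _ in chosen)
    sql = (
        "SELECT price_date, product_name, price FROM prices "
        f"WHERE price_date BETWEEN ? AND ? AND product_name IN ({placeholders}) "
        "ORDER BY price_date, product_name"
    )
    params = [start_d.isoformat(), end_d.isoformat(), *chosen]
    return conn.execute(sql, params).fetchall()
```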
As for the CSV, I would say that querying the database per user request, especially when the feature is only used ~10 times a week, would be much more efficient than the CSV. I see the CSV as overkill because, again, you only have ~10 users attempting to get some information; exporting an updated CSV every day would be too much effort for so little payoff.
EDIT:
Also, if an attack is a big concern, which really depends on the nature of the business, the data being stored, and the visitors you receive, you could always create a backup as another option. I don't really see a reason for this as your question is currently stated, but even with the best security an attack could happen. That mainly depends on whether attackers want the information you have.
I am currently beginning a new personal project. I have a database that keeps track of users as they log in to my webpage. It shows when they log on and log off. It uses SQL Server 2008.
What I would like to do is, whenever a user logs in, a scrolling bar along the top of my webpage alerts me to this. I have created a dashboard to keep track of a lot of my website statistics and this is something I think would be really cool. Useless, ultimately - but it would produce a "heheh" from me every so often, so why not ?
Now, I have never attempted to build something like this (which is the reason I am building it!) so I am torn between a few different design approaches. It seems like I could poll the database server repeatedly using http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqldependency.aspx, just writing a query to find the set of currently logged in users and display any additions to that pool. If this is the right path to go down, then I would appreciate some more in-depth commentary on how this could be used.
From a high level perspective it seems like, rather than repeatedly polling the database, it would be more efficient to have the DB push the message out to my web server when there is a change. Would this be possible? If so, how ?
For the sake of argument, and to give this discussion a bit more specificity, let's assume our SQL Server tables are structured as follows (but feel free to make any improvements or changes as you see fit!):
Users {
ID Primary Key
Username(Varchar 100)
Password
}
LogInOrOutLogs {
SessionID Primary Key
UserID (Foreign Key)
TimeLoggedIn (DateTime)
TimeLoggedOut (DateTime)
CurrentlyLoggedIn(Bool)
}
Open to all technologies, all database structures, all design ideas. Go crazy! Only requirements : You have a DB of users which updates as they log in and out. Display the information on a web server as meaningfully, elegantly and simply as you can.
Thanks a lot, looking forward to reading peoples solutions for this problem.
Have you had a look at Hibernate? It is an elegant object layer over a SQL database.
Then you can put triggers on your database to push the event. When there is an event on your data, you send it to your web application via long polling (an AJAX request with a very long timeout; the request is re-sent after each event is received).
A crazy design could also use a two-way messaging system: one channel for messages coming into the DB and one for messages going out of the DB.
If you really like crazy things, you could think of a cache using a DB4O database (a cache for your SQL Server) embedded into ServiceMix / Red Hat Fuse. This is easy with ServiceMix because of its predeployed broker (ActiveMQ), and Fuse comes with its nice Fabric system.
I am using c#.net, the db is MS SQL 2008 R2.
I have a question that seems to have been asked a lot in the forums here. I want to use a database table as a queue, but the processing of these messages cannot be done from the database.
I have a table that stores the requests I get from a .NET component. I now have to read the data from this table and make HTTP calls to two web services. Based on the responses received from the web services, the data gets archived or deleted.
I had a few specific questions:
1. How do I make sure that, if I pick a record for processing and the HTTP call fails, I can go on to the next record and then come back to this record at the end of the run?
2. Is there an alternative to using the database as a queue (like MSMQ, etc.), and which option is better?
3. I want to maintain an audit trail of the record status. Is creating a trigger to log the changes before the edit the best way to do it?
Regards
Leo
Use Service Broker!
I have been using it for a while and think it's a great thing, although it takes time to understand how it works. I used a book to learn it.
Service Broker solves:
concurrency
application state
... many, many other things
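Independently of Service Broker, the "skip a failed record and come back to it at the end of the run" behaviour from question 1 can be sketched in application code. This is a plain in-memory illustration in Python, not Service Broker and not C#; `process_queue`, `send` and `max_attempts` are hypothetical, and `send` is assumed to raise `OSError` when the HTTP call fails.

```python
from collections import deque
from typing import Any, Callable, Iterable, List, Tuple


def process_queue(
    records: Iterable[Any],
    send: Callable[[Any], None],
    max_attempts: int = 2,
) -> Tuple[List[Any], List[Any]]:
    """Process records in order; failed ones are retried after all the others."""
    pending = deque((record, 1) for record in records)
    done: List[Any] = []
    failed: List[Any] = []
    while pending:
        record, attempt = pending.popleft()
        try:
            send(record)                 # e.g. the HTTP call to the web service
            done.append(record)          # caller can archive or delete these rows
        except OSError:
            if attempt < max_attempts:
                # Put the record at the back so the rest of the run goes first.
                pending.append((record, attempt + 1))
            else:
                failed.append(record)    # leave these rows for the next run
    return done, failed
```

In a real table-backed queue, the attempt count and status would live in columns of the queue table so that they survive a restart.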
I need to access a Sybase database (12.5) from overseas. The high latency is definitely a problem.
I have already optimized the connection parameters to make better use of the network and achieved a 20x performance increase, but it's still not enough: it takes 1 minute to get 3Mb of data.
We need another 10x or 20x increase for our application.
Technical data :
the data are flowing through a single TCP connection using the TDS protocol
the client app is an excel sheet with macros, using the default Sybase driver
the corporate environment makes it difficult to push big changes to an architecture that is 10+ years old, so solutions need to be as unintrusive as possible. Some changes may be negotiable, though, given the importance of this project.
Can anyone give me pointers ?
I already thought of :
splitting SQL requests over several concurrent connections to the database. The problem is data consistency : what if records are modified at the same time since requests will not be exactly executed at the same time ? Is there an existing mechanism to spread a request over several calls on different connections ?
using some kind of database "cache" or "local replication" overseas, but I don't know what is possible.
Thanks.
Try installing a local database (ASE or ASA) and synchronizing the two databases with Sybase MobiLink (or Sybase Replication Server if you need low replication latency and have a lot of money).
(I know I am answering my own question.)
Eventually, we settled on designing our own remote database access protocol. It's not complicated, since we only use a basic subset of SQL (SELECT and UPDATE), and the protocol doesn't have to understand SQL anyway.
By using our own protocol, we'll be able to use compression, let the client use several TCP links at the same time, maximize network utilisation, and add some functional caching specific to our application.
The client will be our app and the server will be a "proxy" to the real database, sitting next to it (like #Tim suggested in the comments).
It's not the only solution, but we feel that it's a good balance between enormous replication price, development complexity and expected benefits.
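To give an idea of the compression part only, here is a minimal sketch of length-prefixed, zlib-compressed frames that a client and proxy could exchange over a TCP link. It is written in Python for illustration and is not the protocol we actually built; `encode_frame` and `decode_frames` are hypothetical names, and the 4-byte big-endian length prefix is an assumption.

```python
import struct
import zlib
from typing import List, Tuple

HEADER = struct.Struct("!I")  # 4-byte big-endian length of the compressed body


def encode_frame(payload: bytes) -> bytes:
    """Compress a payload (e.g. SQL text or a result set) and prefix its length."""
    body = zlib.compress(payload)
    return HEADER.pack(len(body)) + body


def decode_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Extract every complete frame from buffer; return (payloads, leftover bytes)."""
    payloads: List[bytes] = []
    offset = 0
    while offset + HEADER.size <= len(buffer):
        (size,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + size
        if end > len(buffer):
            break  # frame not fully received yet; wait for more data
        payloads.append(zlib.decompress(buffer[offset + HEADER.size:end]))
        offset = end
    return payloads, buffer[offset:]
```

Because every frame carries its own length, frames can be sent over several TCP links and decoded independently; reassembling results in the right order would need an additional sequence number, which this sketch leaves out.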