Dashboard that updates in real time, how to structure - SQL

I am trying to create a dashboard app for my company that displays data from a few different sources they use. I am starting with an in-house system that stores its data in MSSQL. I'm struggling to decide how to display real-time (or at least regularly updated) data based on this database.
I was thinking of writing a Node server that polls the company database, checks for updates, and stores a copy of the relevant tables in my own database. I would then create another Node service that computes metrics (average delivery time, turnover, etc.) from my database, and a frontend (probably React) to display these metrics nicely and trigger the backend logic whenever a user loads the page.
This is my first project, so I just need some guidance on whether this is the right way to go about it or whether I'm overcomplicating things.
Thanks

One solution is to implement a cron job in Node.js (or on your frontend side) that periodically retrieves new data inserted into your database.
You can refer to this link for more information about the cron package:
https://www.npmjs.com/package/cron
If you are using MySQL, you can use the mysql-events listener: it watches a MySQL database and runs callbacks on matched events.
https://www.npmjs.com/package/mysql-events
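For the MSSQL database in the question, a minimal sketch of this cron-based polling could look like the following; it assumes the cron and mssql npm packages, and the dbo.Orders table with an incrementing Id column is made up for illustration:

    // Poll the company database every 30 seconds and copy any new rows.
    const { CronJob } = require('cron');
    const sql = require('mssql');

    let lastSeenId = 0;

    async function main() {
      // One shared connection pool for all polls (connection settings are placeholders).
      const pool = await sql.connect({
        server: 'company-db-host',
        database: 'CompanyDb',
        user: 'dashboard_reader',
        password: process.env.DB_PASSWORD,
        options: { trustServerCertificate: true },
      });

      const job = new CronJob('*/30 * * * * *', async () => {
        // Fetch only the rows added since the last poll.
        const result = await pool.request()
          .input('lastId', sql.Int, lastSeenId)
          .query('SELECT Id, OrderedAt, DeliveredAt FROM dbo.Orders WHERE Id > @lastId ORDER BY Id');

        for (const row of result.recordset) {
          lastSeenId = row.Id;
          // ...insert/update the row in your own database here, or recompute metrics directly.
        }
      });

      job.start();
    }

    main().catch(console.error);

The same shape works with mysql2 if the source happens to be MySQL; the mysql-events package linked above avoids polling altogether by watching the MySQL binary log instead.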

Related

Update data in NodeJS in real time from a database when new data is added

I have data in a SQL database and I'm starting to build a dashboard to monitor certain attributes of the data. My idea is to build a NodeJS server that can take the data and perform calculations, and then use socket.io or something similar to update a React frontend in real-time.
My problem is that I'm not sure how to handle fetching data from the database in real time. I want to re-calculate and display new data as soon as new entries are added to the database. I could just fetch the whole database every minute or so, but that seems like a waste of resources and wouldn't really be real time. Optimally, new data would be pushed to the NodeJS server every time it is added to the database.
I've briefly looked at cube.js but it doesn't seem to do exactly what I need and I'd prefer something free/open source. If anyone has any idea of how to do this or what sort of tools to look into I'd appreciate it!
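One way to make this cheaper than refetching everything, while you look for a true push mechanism, is to remember the last row you have seen, fetch only newer rows, recalculate, and push the result to the React clients over socket.io. A minimal sketch of that flow, in which the mssql client, the dbo.Events table, and the metric are all assumptions:

    const { Server } = require('socket.io');
    const sql = require('mssql');

    const io = new Server(3001, { cors: { origin: '*' } });
    let lastSeenId = 0;

    // Placeholder metric: average of the Value column in the new batch.
    function computeMetrics(rows) {
      const sum = rows.reduce((acc, r) => acc + r.Value, 0);
      return { averageValue: sum / rows.length, sampleSize: rows.length };
    }

    async function main() {
      const pool = await sql.connect({
        server: 'db-host',
        database: 'Monitoring',
        user: 'reader',
        password: process.env.DB_PASSWORD,
        options: { trustServerCertificate: true },
      });

      setInterval(async () => {
        // Fetch only rows added since the previous check.
        const result = await pool.request()
          .input('lastId', sql.Int, lastSeenId)
          .query('SELECT Id, Value FROM dbo.Events WHERE Id > @lastId ORDER BY Id');

        if (result.recordset.length === 0) return;
        lastSeenId = result.recordset[result.recordset.length - 1].Id;

        // Recalculate and push to every connected dashboard.
        io.emit('metrics:update', computeMetrics(result.recordset));
      }, 5000);
    }

    main().catch(console.error);

On the React side, a socket.io-client listener for the 'metrics:update' event can feed the new values straight into component state.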

How to handle the data store in an application with tasks (real time)?

I am studying Vue and Vuex. The official documentation has a simple example of using Vuex with data saved to localStorage.
To consolidate what I have learned, I decided to put it into practice and write a mini application: a Trello clone (SPA).
Namely:
Create three routes (a possible router layout is sketched after this list):
A general dashboard (/dashboard) where the boards live
A board (/board) containing one or several columns; each column has a button for creating a task in it
Tasks (/:task-id) that live inside columns; tasks can be moved between columns
A sidebar in which all notifications for the board are displayed (CRUD on tasks and columns, changes in a task's status, and so on)
Sockets, so that other users can see changes on the board in real time.
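A minimal Vue Router sketch of those three routes might look like this (the components are placeholders and the exact paths are assumptions):

    import Vue from 'vue';
    import VueRouter from 'vue-router';

    Vue.use(VueRouter);

    // Placeholder components; in the real app these would be single-file components.
    const Dashboard = { template: '<div>All boards</div>' };
    const Board = { template: '<div>One board with its columns</div>' };
    const Task = { template: '<div>A single task</div>' };

    export default new VueRouter({
      mode: 'history',
      routes: [
        { path: '/dashboard', component: Dashboard },
        { path: '/board/:boardId', component: Board },
        { path: '/board/:boardId/task/:taskId', component: Task },
      ],
    });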
Questions!
What data should I store exclusively in the Vuex store? (Excluding authorization; that one is obvious.)
For what data in this application could localStorage be useful?
What should I use so that data is not lost when I refresh the page or navigate? I could use localStorage, but hypothetically there can be a lot of data. The fourth question follows from this.
Is it a better solution to use persistent remote storage on a server or in the cloud? If so, could you point me to information on how to do this? In that case I'm also interested in the interaction with the database: at what point is it best to save data to the database?
I'm interested in how to properly build such an application, as in a real commercial application.
I am using and learning the MEVN stack.
1 - You can store any type of data in your store. 2 - I don't think localStorage is useful here: if users clear their browser cache, all of that data is gone, so you need to set up a database for this. 3 - You need a database and some backend to serve your data. 4 - It depends. If you only need it for development, you can install everything on your own machine. If you need something more robust, you could use a cloud server, but configuring the server requires a bit of system administration skill.
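To make points 1 and 3 concrete, here is a minimal sketch of a Vuex store whose state comes from a backend API, with a socket.io-client listener so other users' changes show up in real time. The endpoint paths, event names, and state shape are all assumptions:

    import Vue from 'vue';
    import Vuex from 'vuex';
    import { io } from 'socket.io-client';
    import axios from 'axios';

    Vue.use(Vuex);

    const store = new Vuex.Store({
      state: {
        boards: [],          // kept in Vuex, not localStorage: the server is the source of truth
        notifications: [],
      },
      mutations: {
        setBoards(state, boards) { state.boards = boards; },
        upsertTask(state, { boardId, task }) {
          const board = state.boards.find(b => b.id === boardId);
          if (!board) return;
          const i = board.tasks.findIndex(t => t.id === task.id);
          if (i === -1) board.tasks.push(task); else Vue.set(board.tasks, i, task);
        },
        addNotification(state, note) { state.notifications.unshift(note); },
      },
      actions: {
        // Load everything from the backend on page load / refresh,
        // so nothing important depends on localStorage.
        async fetchBoards({ commit }) {
          const { data } = await axios.get('/api/boards');
          commit('setBoards', data);
        },
        async saveTask({ commit }, payload) {
          // Persist to the database first, then update local state.
          const { data } = await axios.post('/api/tasks', payload);
          commit('upsertTask', { boardId: payload.boardId, task: data });
        },
      },
    });

    // Real-time updates from other users (event name and payload shape are assumptions).
    const socket = io('/');
    socket.on('task:updated', payload => {
      store.commit('upsertTask', payload);
      store.commit('addNotification', { type: 'task:updated', at: Date.now() });
    });

    export default store;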

Service that does advanced queries on a data set, and automatically returns relevant updated results every time new data is added to the set?

I'm looking for a cloud service that can do advanced statistical calculations on a large number of votes submitted by users, in "real time".
In our app, users can submit different kind of votes like picking a favorite, rating 1-5, say yes/no etc. on various topics.
We also want to show "live" statistics to the user, such as the popularity of a person. These are generated by a rather complex SQL query in which we calculate the average number of times a person was picked as a favorite, divided by the total number of votes and the number of games the person has participated in, etc. The score for the latest X games should also count more than the overall score for all games. This is just an example; there are several other SQL queries of similar complexity.
All our presentable data (including calculated statistics) is served from Firestore documents, and the votes will be saved as Firestore documents.
Ideally, the Firebase-backend (functions, firestore etc) should not need to know about the query logic.
What I wish for is a pay as you go cloud service that does the following:
I define some schemas and set up the queries we need for the statistics we have (15-20 different SQL queries), much like setting up views in MySQL.
On every vote, we push the vote data to this service, which stores it as a row.
The service should then, based on its knowledge of the defined queries and the content of the pushed vote data, determine which statistics are affected by the newly added row and recalculate them. A specific vote type can affect one or more statistics.
Every time a statistic is recalculated, the result should be automatically pushed back to our Firebase backend (for instance by calling an HTTPS endpoint that hits a cloud function) - so we can update the relevant Firestore documents.
The service should be able to throttle the calculations, like only regenerating new statistics every 1 minute despite having several votes per second on the same topic.
Is there any product like this in the market? Or can it be built by combining available cloud services? And what is the official term for such a product, if I should search for it myself?
I know that I can probably build a solution like this myself, and run it on a cloud hosted database server, which can scale as our need grows - but I believe that I'm not the first developer with a need of this, so I hope that someone has solved it before me :)
You can leverage the existing cloud services available on the Google Cloud Platform.
Google BigQuery, Google Cloud Firestore, Google App Engine (CRON Jobs), Google Cloud Tasks
The services can be used to solve the problems mentioned above:
1) Google BigQuery: here you can define the schema for the data you're going to run the SQL queries on. BigQuery supports both standard and legacy SQL.
2) Every vote can be pushed to the defined BigQuery tables using its streaming insert service.
3) Every pushed vote can trigger the recalculation service, which calculates the statistics by executing the defined SQL queries; the query results can then be stored as documents in collections in Google Cloud Firestore.
4) Google Cloud Firestore: here you can store the live statistics for each user. It is a real-time database, so you can configure listeners for modifications to the statistics and show them as soon as the statistics are recalculated.
5) In the same service that inserts every vote, also create a record with a "syncId" in another table. The idea is to group the votes cast in a particular interval under a corresponding syncId (which can be suffixed with a timestamp). Choose a time interval that fits your requirements and use the CRON jobs service to invoke the recalculation service once per interval; when the recalculation for a particular syncId is completed, mark the record for that syncId as completed.
We are using the above technologies to build a web application on Google Cloud Platform, where the inputs are recorded in Google Cloud Firestore and then stream-inserted into Google BigQuery. The data stored in BigQuery is queried 30 seconds after each update using SQL queries, and the results are stored in Google Cloud Firestore to serve dashboards, which update automatically via listeners configured on the collection that holds the dashboard data.
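For illustration, a rough Node.js sketch of the streaming-insert and recalculation steps described above, using the @google-cloud/bigquery and firebase-admin client libraries; the dataset, table, query, and collection names are made up:

    const { BigQuery } = require('@google-cloud/bigquery');
    const admin = require('firebase-admin');

    admin.initializeApp();
    const bigquery = new BigQuery();
    const firestore = admin.firestore();

    // Step 2: stream each vote into BigQuery as it is cast.
    async function recordVote(vote) {
      await bigquery.dataset('votes').table('raw_votes').insert([{
        personId: vote.personId,
        gameId: vote.gameId,
        voteType: vote.voteType,   // 'favorite', 'rating', 'yes_no', ...
        value: vote.value,
        createdAt: new Date().toISOString(),
      }]);
    }

    // Steps 3-4: recalculate one (made-up) statistic and push the result back to Firestore.
    // In practice this would be triggered on an interval (CRON job / Cloud Tasks),
    // not on every single vote.
    async function recalcFavoriteStats() {
      const [rows] = await bigquery.query({
        query: `
          SELECT personId,
                 COUNTIF(voteType = 'favorite') / COUNT(*) AS favoriteShare,
                 COUNT(DISTINCT gameId) AS games
          FROM votes.raw_votes
          GROUP BY personId`,
      });

      const batch = firestore.batch();
      for (const row of rows) {
        // Assumes personId is a string usable as a document id.
        batch.set(firestore.collection('statistics').doc(row.personId), row, { merge: true });
      }
      await batch.commit();
    }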

How to get data changes from SQL to another system?

My current system, I'll call it B, gets data from an old legacy SQL database, A, via replication. I do not have any control over this legacy system other than access to the database. Replication was initially chosen to keep the two systems separate as well as to reduce any chances of performance hits on the legacy system, A, from my system, B. In B, there is a task that runs every 15 minutes that loads data from the replicated source tables and transforms it into the format I need and saves it to a SQL database that I own. This task is currently very slow as it loads up all possible data and checks for changes before it decides if any updates need to be made.
I'm creating a new system, C, that also needs to use the data from A as well as interact with B. The data is needed in a completely different format for C, so I am not able to reuse much of B, so speeding up B is not an option. Ideally, I'd like to switch B over to whatever solution chosen for C, but in the meantime, whatever is picked needs to play well with replication.
I am researching new options to get data from A as well as some way to get change notifications. Ideally, I'd love for A to send messages when there are changes but this is not possible due to the fragility of A.
I've looked into SQL Query Notifications, specifically SQLDependency and SQLTableDependency. I need to be able to see what data has changed, so SQLTableDependency might be better, but it only listens while the application is running, so nothing is listening when it stops. I'd like to be able to cache the data instead of stitching it into the format I need every time my website loads.
I've also looked into Change Tracking and Change Data Capture. Both seem like they could work in my current setup, but they both seem heavier than what I need. I'm also concerned about how they behave alongside replication. For example, if replication is re-initializing and all the data is truncated, will it look like I have a ton of changes?
Am I going in the right direction given the constraints of my system? Does anyone have any other ideas of some way to get data changes? Is there some other messaging system that can be used with SQL?
Thanks!
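For what it's worth, Change Tracking is lighter-weight than CDC (it records only that a row changed, not the old data) and can be consumed with plain queries. A rough sketch of reading everything that changed since a stored version, shown with the Node.js mssql client purely for illustration; the dbo.Orders table, its key column, and the connection settings are assumptions, and the concern about re-initialization resetting everything still applies:

    const sql = require('mssql');

    // Reads all rows of dbo.Orders that changed since `lastVersion`,
    // using SQL Server Change Tracking (CHANGETABLE).
    async function readChangesSince(lastVersion) {
      const pool = await sql.connect({
        server: 'replica-host',
        database: 'ReplicatedDb',
        user: 'reader',
        password: process.env.DB_PASSWORD,
        options: { trustServerCertificate: true },
      });

      // Version to persist once this batch has been processed.
      const versionResult = await pool.request()
        .query('SELECT CHANGE_TRACKING_CURRENT_VERSION() AS currentVersion');

      const changes = await pool.request()
        .input('lastVersion', sql.BigInt, lastVersion)
        .query(`
          SELECT ct.SYS_CHANGE_OPERATION, ct.OrderId, o.*
          FROM CHANGETABLE(CHANGES dbo.Orders, @lastVersion) AS ct
          LEFT JOIN dbo.Orders AS o ON o.OrderId = ct.OrderId`);

      return {
        nextVersion: versionResult.recordset[0].currentVersion,
        rows: changes.recordset,
      };
    }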

SQL Server: Using triggers for workflow automation

In a media management system, my task is to create workflow automation. Currently I have built it using SQL Server triggers, with the UI in ASP.NET and jQuery.
For example:
When a new file enters the system, the trigger fires and updates the metadata table with some data for that file.
Millions of assets go through the system. Is it ideal to have triggers doing this processing?
Is there a better way to create this automation?
Is there a "best practice" for this kind of work?
I'm having the same issue: data enters my central asset database in several ways (which may differ from client to client).
So I also want to create an easily customizable workflow in the data layer (with no other dependencies).
As the others mention, triggers may affect the parent activity.
That is overcome by having the trigger write the action that should be performed to a queue table.
Example: a trigger fires when Hardware.Status = 'Issue Work Order' and queues the work:
INSERT INTO Queue (Created, Task, Completed) VALUES (GETUTCDATE(), 'EXEC dbo.IssueWorkOrder 123', 0);
Inserting a record into your queue table avoids the problems highlighted in the other users' comments.
You then build a scheduling tool (Hangfire, SQL Server Agent jobs, or whatever) that executes the tasks in the queue in the order they were added.
Now, of course, in practice it's not as simple as that. You will have to address the following:
1) What if a step fails?
2) Dependencies on previous steps having completed first
3) Multiple operators changing a record (the delay between a job step being executed and another person updating the same record)
I guess #2 and #3 are an issue with any workflow engine / pipeline. To address them, a locking mechanism must be put in place.
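A minimal sketch of such a queue processor, shown here as a Node.js interval worker against the Queue table from the insert above (Hangfire or a SQL Agent job would fill the same role in a .NET shop); the Id column, polling interval, and connection settings are assumptions, and the locking shown is deliberately simplified:

    const sql = require('mssql');

    async function main() {
      const pool = await sql.connect({
        server: 'localhost',
        database: 'Assets',
        user: 'queue_worker',
        password: process.env.DB_PASSWORD,
        options: { trustServerCertificate: true },
      });

      setInterval(async () => {
        // Take the oldest unfinished task. READPAST + UPDLOCK reduces the chance
        // of two workers picking up the same row, but is not a full locking scheme.
        const next = await pool.request().query(`
          SELECT TOP (1) Id, Task
          FROM Queue WITH (UPDLOCK, READPAST)
          WHERE Completed = 0
          ORDER BY Created`);

        if (next.recordset.length === 0) return;
        const { Id, Task } = next.recordset[0];

        // Run the queued command, e.g. 'EXEC dbo.IssueWorkOrder 123'.
        await pool.request().query(Task);

        await pool.request()
          .input('id', sql.Int, Id)
          .query('UPDATE Queue SET Completed = 1 WHERE Id = @id');
      }, 10000);
    }

    main().catch(console.error);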