I have a table A that contains 500,000 rows. It is related to 5 other tables (B, C, D, E, F), each of roughly 100,000 rows.
I'd like to quickly extract structured information of the form (key A, value A+B+C+D+E+F), where the value is the column result of an inner join across these 6 tables (which is currently a bit slow).
Updates/inserts/deletes occur on these tables every 10 minutes (asynchronously).
I need to present the result of this selection as a binding source, so a search can be performed against it.
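For context, the extraction described above might look roughly like the following in C#/ADO.NET; every table, column, and key name (A.Id, BId, Value, and so on) is a placeholder for whatever the real schema uses, and the dictionary merely stands in for the searchable/bindable structure:

// Rough sketch of the (key, value) extraction described above.
// All table/column names are placeholders, not the real schema.
using System.Collections.Generic;
using System.Data.SqlClient;

class JoinExtractor
{
    public static Dictionary<int, string> Load(string connectionString)
    {
        const string sql = @"
            SELECT A.Id,
                   CONCAT(A.Value, ' ', B.Value, ' ', C.Value, ' ',
                          D.Value, ' ', E.Value, ' ', F.Value) AS CombinedValue
            FROM A
            INNER JOIN B ON B.Id = A.BId
            INNER JOIN C ON C.Id = A.CId
            INNER JOIN D ON D.Id = A.DId
            INNER JOIN E ON E.Id = A.EId
            INNER JOIN F ON F.Id = A.FId;";

        var result = new Dictionary<int, string>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    result[reader.GetInt32(0)] = reader.GetString(1);   // (key A, combined value)
            }
        }
        return result;
    }
}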
My IDEA
Materialize this selection in a NoSQL database and perform the search there. My questions are:
Which NoSQL database should I use?
How do I synchronize the two, so the data in the NoSQL database is updated any time a SQL Server record changes?
SCENARIO
Desktop application written in C# on .NET Framework 4.0
Database: SQL Server 2012
Related
For the purposes of my application I have created an Azure Function that connects to my Dataverse environment, queries data with SELECT from different tables (I create new records based on records in tables A, B, C), stores the result in a list, and then saves those records into the same Dataverse environment but into another table (let's say D). I decided on this solution because Power Automate was creating those new records too slowly.
It works fine; however, when there are too many requests (more than 2-3 users working with the application and running the Azure Functions), the save into Dataverse becomes too slow as well.
So I am thinking about another way to save and store those records. What is important is that the records in table D are only for calculation purposes; users do not work with them or edit them. This is why I am thinking about creating a SQL Database table, storing those records (only from table D) there, and just changing the connection in my application where needed.
Can you suggest the most efficient way to do this? In a nutshell, what I need is:
connect to Dataverse and query data from tables A, B, C; the result of this query will be the records for table D (a rough sketch of this step follows after the list)
save the result of the query into a SQL Database table (table D)
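A minimal sketch of the query step, assuming the Microsoft.PowerPlatform.Dataverse.Client SDK; the logical table/column names (new_tablea, new_name, new_amount) and the DataTable shape are placeholders for the real A/B/C schema:

// Sketch of the Dataverse query step: pull the source rows and shape them into a
// DataTable that is ready to be written to SQL. Logical names are placeholders.
using System.Data;
using Microsoft.PowerPlatform.Dataverse.Client;
using Microsoft.Xrm.Sdk.Query;

class DataverseExtract
{
    public static DataTable QuerySourceRows(string dataverseConnectionString)
    {
        var rows = new DataTable("D");
        rows.Columns.Add("Name", typeof(string));
        rows.Columns.Add("Amount", typeof(decimal));

        using (var service = new ServiceClient(dataverseConnectionString))
        {
            var query = new QueryExpression("new_tablea")   // placeholder logical name for table A
            {
                ColumnSet = new ColumnSet("new_name", "new_amount")
            };

            foreach (var entity in service.RetrieveMultiple(query).Entities)
            {
                rows.Rows.Add(
                    entity.GetAttributeValue<string>("new_name"),
                    entity.GetAttributeValue<decimal>("new_amount"));
            }
        }
        return rows;
    }
}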
There are quite a few things to consider here.
If users don't use the data in table D, could you maybe run this operation overnight, or at a time when there is low traffic and the slow performance of this operation is acceptable?
Have you considered using SQL views? Do you really need to store the computed data?
Perhaps you are inserting 1 item at a time? Are you using the SqlBulkCopy class?
Bulk Insert In SQL Server From C#
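As a rough sketch of what that set-based save could look like (the destination table name dbo.D, the batch size, and mapping columns by name are assumptions; the DataTable is whatever your query step produced):

// Push a whole DataTable into dbo.D in one bulk operation instead of
// inserting row by row. Connection string and table name are placeholders.
using System.Data;
using System.Data.SqlClient;

class BulkSave
{
    public static void WriteToSql(DataTable rows, string sqlConnectionString)
    {
        using (var connection = new SqlConnection(sqlConnectionString))
        {
            connection.Open();
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "dbo.D";
                bulkCopy.BatchSize = 5000;          // commit in chunks
                bulkCopy.BulkCopyTimeout = 120;     // seconds

                // Map source DataTable columns to destination columns by name.
                foreach (DataColumn column in rows.Columns)
                    bulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);

                bulkCopy.WriteToServer(rows);
            }
        }
    }
}

In general, one bulk copy call like this is far cheaper than issuing an individual INSERT per record.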
Observe the CPU utilisation of your server during this operation. It probably shoots to 100%. You want to hit around 70% average utilisation for a good trade-off between performance and cost. So another option is to scale up.
My primary data source gets 50M records per day. I need to see records in the report data source with a maximum delay of about 5 minutes.
What is the best way to transfer data from the primary SQL Server data source to the report SQL Server data source?
At the moment I run a merge join every 30 seconds, but it seems to affect the performance of the primary data source.
The most common approach to minimize the load on your source server is to do periodic extracts using a timestamp, i.e. a simple SELECT ... WHERE timestamp > previous-max-timestamp-extracted.
The source table(s) need to provide a column that allows you to filter on un-extracted records. If that's completely impossible, you might extract e.g. the last hour's data into staging tables and deduplicate against previously extracted records.
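A minimal sketch of that watermark pattern in C#/ADO.NET; the table and column names (dbo.SourceTable, ModifiedAt) and how the last watermark is persisted between runs are all assumptions:

// Incremental extract: only ask the source for rows newer than the last
// timestamp already pulled, then remember the new high-water mark.
using System;
using System.Data;
using System.Data.SqlClient;

class IncrementalExtract
{
    public static DataTable ExtractNewRows(string sourceConnectionString,
                                           DateTime lastExtracted,
                                           out DateTime newWatermark)
    {
        const string sql = @"
            SELECT Id, Payload, ModifiedAt
            FROM dbo.SourceTable
            WHERE ModifiedAt > @lastExtracted;";

        var rows = new DataTable();
        using (var connection = new SqlConnection(sourceConnectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.Add("@lastExtracted", SqlDbType.DateTime2).Value = lastExtracted;
            using (var adapter = new SqlDataAdapter(command))
            {
                adapter.Fill(rows);   // opens and closes the connection itself
            }
        }

        // Compute the new high-water mark from what was actually extracted.
        newWatermark = lastExtracted;
        foreach (DataRow row in rows.Rows)
        {
            var modifiedAt = (DateTime)row["ModifiedAt"];
            if (modifiedAt > newWatermark) newWatermark = modifiedAt;
        }
        return rows;   // bulk copy / merge these into the report server, persist newWatermark
    }
}

Each 30-second run then only touches the rows inserted or modified since the previous run, instead of re-scanning the whole table on the primary.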
Yes, you could use CDC, but that's often more involved, and usually adds some restrictions.
Cheers, Kristian
I have a curious question, and as my name suggests I am a novice, so please bear with me. Oh, and hi to you all; I have learned so much using this site already.
I have an MSSQL database for customers where I am trying to track their status on a daily basis, with various attributes being recorded in several tables, which are then joined together using a data table to create a master table yielding approximately 600 million rows.
As you can imagine, querying this beast on a middling server (Intel i5, SSD for the OS, 2 TB 7200 rpm HDD, SQL Server 2017 Standard) is really slow. I was using Google BigQuery, but that got expensive very quickly. I have implemented indexes, which have somewhat sped up the process, but it is still not fast enough. A simple SELECT DISTINCT on customer ID for a given attribute still takes 12 minutes on average on a first run.
The whole point of having a daily view is to make it easier to have something like Tableau or Qlik connect to a single table, so the end user can create reports by just dragging the required columns. I have thought of taking the main query that creates the master table and parameterizing it, but visualization tools aren't great at passing many variables.
As a snapshot of the table: there are approximately 300,000 customers, and a row per day is created for each customer who joined between 2010 and 2017. They fall off the list if they leave.
My questions are:
1) Should I even bother creating a flat file, or should I just parameterize the query?
2) Are there any techniques I can use, aside from setting the smallest data types for each column, to keep the DB size to a minimum?
3) There are in fact over a hundred attribute columns; many of them, once set to either 0 or 1, seldom change. Is there another way to achieve this and save space?
4) What types of indexes should I have on the master table if many of the attributes are binary?
Any ideas would be gratefully received.
I have a system built with ASP.NET Web Forms, and it includes an accounts records generation form. In some specific situations I need to fetch all records, which comes to nearly 1 million rows.
One solution could be to reduce the number of records fetched, but when we need to fetch records for more than a year, or for 5 years, the result is half a million records, a million records, etc. How can I decrease the time this takes?
What are some things I can do to reduce the time? I can't show the full query here; it's a big view that calls some other views inside it.
Would it take less time if I wrote it as a LINQ query? That's why I asked about LINQ vs. views.
I have executed a "SELECT * FROM TableName" query, and after 40 minutes it is still executing; the table has 117,000 records. Can we decrease this time?
I started this as a comment but ran out of room.
Use the server to do as much filtering for you as possible and return as few rows as possible. Client-side filtering is always going to be much slower than server-side filtering; for example, the client does not have access to the indexes and optimisation techniques that exist on the server.
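As a minimal illustration of the difference (table and column names are placeholders): push the restriction into the WHERE clause rather than pulling every row and filtering it in the application.

// Server-side filtering: the WHERE clause (and any index on AccountDate) limits
// the rows before they ever cross the network.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

static class ServerSideFilter
{
    public static List<long> GetAccountIds(string connectionString, DateTime from, DateTime to)
    {
        const string sql = @"
            SELECT AccountId
            FROM dbo.Accounts
            WHERE AccountDate >= @from AND AccountDate < @to;";

        var ids = new List<long>();
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            command.Parameters.AddWithValue("@from", from);
            command.Parameters.AddWithValue("@to", to);
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    ids.Add(reader.GetInt64(0));   // only the matching rows come back
            }
        }
        return ids;
    }
}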
LINQ uses "lazy evaluation", which means that it builds up a method for filtering but does not execute it until it is forced to. I've used it and was initially impressed with the speed ... until I started to access the data it returned. When you use the data you want from LINQ, that triggers the actual selection process, which you'll find is slow.
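A small, self-contained illustration of that deferred behaviour (plain LINQ to Objects here, but the same principle applies to LINQ providers that hit a database):

// Deferred execution: defining the query costs almost nothing; the work happens
// only when the results are actually enumerated.
using System;
using System.Linq;

class DeferredExecutionDemo
{
    static void Main()
    {
        var numbers = Enumerable.Range(1, 10000000);

        // This line only builds the query; no filtering has happened yet.
        var evens = numbers.Where(n => n % 2 == 0);

        // The filtering actually runs here, when the sequence is consumed.
        Console.WriteLine(evens.Count());
    }
}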
Use the server to return a series of small result sets and then process those. If you need to join these result sets on a key, save them into dictionaries keyed on that value so you can join them quickly.
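A short sketch of that dictionary-keyed join (the Customer/Order shapes are invented for illustration):

// Join two small result sets in memory: index one side by its key in a
// Dictionary so each lookup is O(1) instead of scanning a list per row.
using System;
using System.Collections.Generic;

class Customer { public int Id; public string Name; }
class Order    { public int CustomerId; public decimal Total; }

class DictionaryJoin
{
    public static void PrintOrders(List<Customer> customers, List<Order> orders)
    {
        // Build the lookup once.
        var customersById = new Dictionary<int, Customer>();
        foreach (var c in customers)
            customersById[c.Id] = c;

        // Join each order to its customer with a constant-time lookup.
        foreach (var o in orders)
        {
            Customer customer;
            if (customersById.TryGetValue(o.CustomerId, out customer))
                Console.WriteLine(customer.Name + ": " + o.Total);
        }
    }
}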
Another approach is to look at Entity Framework to create a mirror of the server database structure along with indexes so that the subset of data you retrieve can be joined quickly.
For example, a website offers the ability to create mobile surveys. Each survey ID is a FK in the survey response table, which contains ALL of the survey responses.
What is the size limitation of this table in a SQL Server 2008 database if the table contains, say, 20 varchar(255) fields including the bigint PK & FK?
I realize this would depend on the file size limitation as well, but I would like a more educated answer than my own guess on this.
In terms of searchability, some fields that contain geo-related details, such as the survey ID, city, state, and two comments fields, would have to be searchable and thus indexed ... should I index only these fields?
Also, aged responses would expire after a given amount of time and thus be deleted from the table. Does the table, by that point being very large, need to be re-indexed/cleaned up after the deletions (which would be an automated process)?
Thanks.
Maximum Capacity Specifications for SQL Server
Bytes per row: 8,060
Rows per table: Limited by available storage
Note: SQL Server supports row-overflow storage, which enables variable-length columns to be pushed off-row. Only a 24-byte root is stored in the main record for variable-length columns pushed out of row; because of this, the effective row limit is higher than in previous releases of SQL Server. For more information, see the "Row-Overflow Data Exceeding 8 KB" topic in SQL Server Books Online.
You mention 'table size' -- does this mean number of rows?
Maximum Capacity Specifications for SQL Server
Rows per table: Limited by available storage
As per this Reference, the max size of a table is limited by the available storage.
It sounds like you are going to have a high-traffic, high-volume table. You should consider performance and storage enhancements like Table Partitioning. Also, because this table will be subject to frequent INSERTs/UPDATEs/DELETEs, plan your indexing carefully, as indexes add overhead for DML statements on the table.