I have been wondering if there is an option to run a scheduled query in BigQuery manually. I've got a report in Google Data Studio whose source is a BigQuery table, which is rebuilt from a BigQuery view by a scheduled query every hour. Sometimes when I am working on the query I would like to check whether the changes I have made are correct, but I have to wait up to an hour to find out. I read that a backfill can do it, but if I set the start date and end date to today I can't get any further. How can I solve this problem?
If you want real-time reports, just create a View with your query and create a Report in Data Studio that consumes this View.
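As a minimal sketch of that approach (the project, dataset, and table names below are invented), the report reads the view directly, so any edit to the view is picked up on the next refresh; you can also rebuild the scheduled query's destination table on demand when you want to preview a change without waiting for the hourly run:

    -- Hypothetical names: replace my_project.reporting.* with your own objects.
    -- Data Studio reads the view, so changes to it are visible on the next refresh.
    CREATE OR REPLACE VIEW `my_project.reporting.sales_view` AS
    SELECT order_date, SUM(amount) AS total_amount
    FROM `my_project.raw.orders`
    GROUP BY order_date;

    -- Optional: rebuild the scheduled query's destination table on demand,
    -- instead of waiting for the next hourly run.
    CREATE OR REPLACE TABLE `my_project.reporting.sales_table` AS
    SELECT * FROM `my_project.reporting.sales_view`;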
Another approach would be to put the custom query directly in Data Studio. This way you can change the query in Data Studio and it will reprocess your data every time you refresh the report.
Obviously, this is not the most cost-effective or efficient solution, but it is a good workaround if you just want to test something while developing.
For a production scenario (with lots of concurrent users), if you're able to pre-process your data as you already do, your reports will be faster and they'll probably consume fewer BigQuery resources.
I have an SSRS Sales report that will be run many times a day by users, but with different parameters selected for the branch and product types.
The SQL query uses some large tables and is quite complex, therefore, running it many times is going to have a performance cost.
I assumed the best solution would be to create a dataset for the report with all permutations, run once overnight, and then apply filters when the users run the report.
I tried creating a snapshot in SSRS which doesn’t consider the parameters and therefore has all the required data, then filtering the Tablix using the parameters that the users selected. The snapshot works fine but it appears to be refreshed when the report is run with different parameters.
My next solution would be to create a table for the dataset which the report would then point to. I could recreate the table every night using a stored procedure. With a couple of small indexes the report would be lightning fast.
This solution would seem to work really well but my knowledge of SQL is limited, and I can’t help thinking this is not the right solution.
Is this suitable? Are there better ways? Can anybody confirm either way?
SSRS datasets have caching capabilities. I think you'll find this more useful than having to create extra db tables and such.
Please see here https://learn.microsoft.com/en-us/sql/reporting-services/report-server/cache-shared-datasets-ssrs?view=sql-server-ver15
If the rate of change of the data is low enough, and SSRS caching doesn't suit your needs, then you could manually cache the record set from the report query (without the filtering) into its own table, and then modify the report to query from that table.
Oracle and most Data Warehouse implementations have a formal mechanism specifically for this called Materialized Views. There is no such luck in SQL Server, though you can easily implement the same pattern yourself.
There are 2 significant drawbacks to this:
The data in the new table is a snapshot at the point in time that it was loaded, so this technique is better suited to slow moving datasets or reports where it is not critical that the data is 100% accurate.
You will need to manage the lifecycle of the data in this table. Ideally you should set up a Job or Scheduled Task to automate this refresh, but you could trigger a refresh as part of the logic in your report (not recommended, but possible).
Though it is possible, you should NOT consider using a TRIGGER to update the data: as you have already indicated, the query takes some time to execute, and this could have a major impact on the rest of your LOB application.
If you do go down this path you should write the refresh logic into a stored procedure so that it can be executed when needed and from other internal and external automation mechanisms.
You should also add a column that records the date and time of when the dataset was executed, then replace any references in your report that display the date and time the report was printed, with the time the data was prepared.
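As a minimal sketch of that pattern (the table, column, and procedure names are invented, and the SELECT is just a stand-in for your real report query; CREATE OR ALTER needs SQL Server 2016 SP1 or later, so use a plain CREATE PROCEDURE on older versions):

    -- Rebuilds the cached record set for the report and stamps when it was loaded.
    CREATE OR ALTER PROCEDURE dbo.RefreshSalesReportCache
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Throw away the previous snapshot.
        TRUNCATE TABLE dbo.SalesReportCache;

        -- Re-run the expensive report query once, without any parameter filtering.
        INSERT INTO dbo.SalesReportCache (Branch, ProductType, SalesAmount, LoadedAtUtc)
        SELECT  s.Branch,
                s.ProductType,
                SUM(s.Amount)       AS SalesAmount,
                SYSUTCDATETIME()    AS LoadedAtUtc   -- when the dataset was prepared
        FROM    dbo.Sales AS s                       -- stand-in for the real query
        GROUP BY s.Branch, s.ProductType;
    END;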
It is also worth pointing out that performance issues with expensive queries in SSRS reports can often be overcome by reducing the functions and value formatting in the SQL query itself and moving that logic into the report definition. The same goes for filtering operations: you can easily add computed columns in the dataset definition or on the design surface, and you can implement filtering directly in the tablix too. There is no requirement that every record from the SQL query be displayed in the report at all, just as we do not need to show every column.
Sometimes some well-crafted indexes can help too; for complicated reports we can often find a balance between what the SQL engine can do efficiently and what the RDL can do for us.
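For instance, on the cache table sketched above, a covering index like this (names again hypothetical) lets the report's branch/product filters seek instead of scan:

    -- Covering index for the parameter filters used by the report.
    CREATE NONCLUSTERED INDEX IX_SalesReportCache_Branch_ProductType
        ON dbo.SalesReportCache (Branch, ProductType)
        INCLUDE (SalesAmount, LoadedAtUtc);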
Disclaimer: this is hypothetical advice; you should evaluate each report on a case-by-case basis.
I recently created a fairly lengthy SQL query script that takes information from some base tables like forecast, bill of materials, part information and so on to automatically create a production schedule.
The script itself works well, but whenever something in those base tables changes, the script needs to be rerun (it first drops the tables created by the previous run, and then runs the longer query that creates and drops tables to get to the final schedule).
To make things easier on the front-end user, my intention was to create a front end through Access to allow the users to update the necessary base data.
My question is, is there a way to set something up either through Microsoft SSMS or Access (2016) that would run this script automatically whenever these tables were updated?
My initial search showed a lot of people talking about SQL Server Agent being able to automate queries, but I was not able to find anything regarding running a script when a table is updated, only scheduling things based on time frequency.
Ideally I think the easiest option would be if it were possible on the Access front end to allow the user to run this script by just pushing a button on a form, but I am open to whatever options would achieve the same goal.
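To make that concrete, my rough idea is to wrap the whole script in a stored procedure like the placeholder below (the table names and logic are just stand-ins for my real script), so that either a button on an Access form or a scheduled job could call it:

    -- Placeholder wrapper: an Access pass-through query or an Agent job could
    -- simply EXEC this. DROP TABLE IF EXISTS and CREATE OR ALTER need SQL Server
    -- 2016 or later; use IF OBJECT_ID(...) checks on older versions.
    CREATE OR ALTER PROCEDURE dbo.RebuildProductionSchedule
    AS
    BEGIN
        SET NOCOUNT ON;

        -- Drop the tables created by the previous run.
        DROP TABLE IF EXISTS dbo.ProductionSchedule;
        DROP TABLE IF EXISTS dbo.ScheduleStaging;

        -- Rebuild the intermediate and final tables (stand-ins for the real logic).
        SELECT f.PartNumber, f.ForecastQty, b.ComponentPart
        INTO   dbo.ScheduleStaging
        FROM   dbo.Forecast        AS f
        JOIN   dbo.BillOfMaterials AS b ON b.PartNumber = f.PartNumber;

        SELECT PartNumber, SUM(ForecastQty) AS TotalQty
        INTO   dbo.ProductionSchedule
        FROM   dbo.ScheduleStaging
        GROUP BY PartNumber;
    END;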
Thanks in advance.
I am using a SQL Server database for my Power BI reports in Import mode. Is there any way to refresh the reports every hour (without using DirectQuery)?
Thanks
You can go to the Power BI Service, open your dataset (the one related to your DB), schedule a refresh, and then add every hour of the day to your scheduled refresh. I'm totally aware that it is not the best option, but it works. Let me know if it helps you.
One possibility would be to have a job on your SQL Server that runs every hour and saves the data, for the BI, to a table. Then the BI would not have to do any processing; rather, it would just read (i.e. DirectQuery) the latest data from a table. That should be a lot faster than running the query, in DirectQuery, every time.
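As a rough sketch of that idea (the job, schedule, database, and procedure names are all invented; double-check the msdb parameters against your SQL Server version before relying on them):

    -- Create an hourly Agent job that refreshes a reporting table the BI reads from.
    EXEC msdb.dbo.sp_add_job
         @job_name = N'Refresh BI reporting table';

    EXEC msdb.dbo.sp_add_jobstep
         @job_name      = N'Refresh BI reporting table',
         @step_name     = N'Rebuild table',
         @subsystem     = N'TSQL',
         @database_name = N'ReportingDb',                        -- hypothetical DB
         @command       = N'EXEC dbo.RefreshBiReportingTable;';  -- hypothetical proc

    EXEC msdb.dbo.sp_add_schedule
         @schedule_name        = N'Every hour',
         @freq_type            = 4,   -- daily
         @freq_interval        = 1,   -- every day
         @freq_subday_type     = 8,   -- repeat in units of hours
         @freq_subday_interval = 1;   -- every 1 hour

    EXEC msdb.dbo.sp_attach_schedule
         @job_name      = N'Refresh BI reporting table',
         @schedule_name = N'Every hour';

    EXEC msdb.dbo.sp_add_jobserver
         @job_name = N'Refresh BI reporting table';  -- target the local server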
I am sorry to say that the scheduled refresh functionality is only present in the Power BI Service and not in Power BI Desktop. If you want to know how to implement scheduled refresh in the Power BI Service, I can provide you the solution; please get back to me. I have also written a blog about it, which you can go through at this link.
I've been working with PostgreSQL for a few months now. Before going live we usually used the live database for almost everything (creating new columns in the live database tables, executing update and insert queries, etc.). But now we want to go live and we have to do things differently before we do that. The best way is to have a test database and a live database.
Now I created a copy of the live database so we have a test database to run tests on. The problem is that the data is old after 24 hours, so we actually need to create a fresh copy every 24 hours, which is not really smart to do manually.
So my question is, are there people over here who know a proper way to handle this issue?
I think the most ideal way is:
- copy a selection of tables from the live database to the test database (skip tables like users).
- make it possible to add, rename, or even delete columns, and when we deploy a new version of the website, transfer those changes from the test database to the live database (not necessary, but it would be a good feature).
If your database structure is changing, you do NOT want it automatic. You will blow away dev work and data. You want it manual.
I once managed a team that had a similar situation: a multi-TiB database, updated daily, and a need to do testing and development against that up-to-date data. Here is how we solved it:
In our database, we defined a function called TODAY(). In our live system, this was a wrapper for NOW(). In our test system, it called out to a one-column table whose only row was a date that we could set. This meant that our test system was a time machine that could pretend any date was the current one.
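A minimal sketch of that idea in PostgreSQL (the table, column, and sample date are invented):

    -- Test system: a one-row table holding the pretend "current" date.
    CREATE TABLE current_test_date (pretend_date date NOT NULL);
    INSERT INTO current_test_date VALUES (DATE '2020-01-01');

    -- TODAY() reads the configurable date, so the test system becomes a time machine.
    CREATE OR REPLACE FUNCTION today() RETURNS date AS $$
        SELECT pretend_date FROM current_test_date;
    $$ LANGUAGE sql STABLE;

    -- Live system: the same function is just a wrapper around NOW().
    -- CREATE OR REPLACE FUNCTION today() RETURNS date AS $$
    --     SELECT now()::date;
    -- $$ LANGUAGE sql STABLE;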
This meant that every function or procedure we wrote had to be time-aware. Should I care about future-scheduled events? How far in the future? This made our functions extremely robust, and made it dead simple to test them against a huge variety of historical data. This helped catch a large number of bugs that we would have never thought would happen, but we saw would indeed occur in our historical data. It's like functional programming for your database!
We would still schedule database updates from a live backup, about every month or so. This had the benefit of more data AND testing our backup/restore procedure. Our DBA would run a "post-test-sync" script that would set permissions for developers, so we were damn sure that anything we ran on the test system would work on the live one as well. This is what helped us build our deployment database scripts.
I wasn't sure how to word this question, so I'll try to explain. I have a third-party database on SQL Server 2005. I have another SQL Server 2008, to which I want to "publish" some of the data in the third-party database. This database I shall then use as the back-end for a portal and reporting services - it shall be the data warehouse.
On the destination server I want store the data in different table structures to that in the third-party db. Some tables I want to denormalize and there are lots of columns that aren't necessary. I'll also need to add additional fields to some of the tables which I'll need to update based on data stored in the same rows. For example, there are varchar fields that contain info I'll want to populate other columns with. All of this should cleanse the data and make it easier to report on.
I can write the query(s) to get all the info I want in a particular destination table. However, I want to be able to keep it up to date with the source on the other server. It doesn't have to be updated immediately (although that would be good), but I'd like it to be updated perhaps every 10 minutes. There are hundreds of thousands of rows of data, but the changes to the data and the addition of new rows etc. aren't huge.
I've had a look around but I'm still not sure of the best way to achieve this. As far as I can tell, replication won't do what I need. I could manually write the T-SQL to do the updates, perhaps using the MERGE statement, and then schedule it as a job with SQL Server Agent. I've also been having a look at SSIS, and that looks to be geared at the ETL kind of thing.
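For example, something along these lines is what I had in mind, scheduled as an Agent job (the server, database, table, and column names are all made up):

    -- Pull changed/new rows from the source server into the denormalized
    -- warehouse table. [SourceServer] would be a linked server here.
    MERGE dbo.DimProduct AS tgt
    USING (
        SELECT p.ProductID,
               p.ProductName,
               c.CategoryName
        FROM   [SourceServer].[ThirdPartyDb].dbo.Products   AS p
        JOIN   [SourceServer].[ThirdPartyDb].dbo.Categories AS c
               ON c.CategoryID = p.CategoryID
    ) AS src
        ON tgt.ProductID = src.ProductID
    WHEN MATCHED AND (tgt.ProductName <> src.ProductName
                   OR tgt.CategoryName <> src.CategoryName) THEN
        UPDATE SET tgt.ProductName  = src.ProductName,
                   tgt.CategoryName = src.CategoryName
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ProductID, ProductName, CategoryName)
        VALUES (src.ProductID, src.ProductName, src.CategoryName);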
I'm just not sure what to use to achieve this, and I was hoping to get some advice on how one should go about doing this kind of thing. Any suggestions would be greatly appreciated.
For those tables whose schemas/relations are not changing, I would still strongly recommend Replication.
For the tables whose data and/or relations are changing significantly, I would recommend that you develop a Service Broker implementation to handle that. The high-level approach with Service Broker (SB) is:
Table-->Trigger-->SB.Service >====> SB.Queue-->StoredProc(activated)-->Table(s)
I would not recommend SSIS for this, unless you wanted to go to something like daily exports/imports. It's fine for that kind of thing, but IMHO far too kludgey and cumbersome for either continuous or short-period incremental data distribution.
Nick, I have gone the SSIS route myself. I have jobs that run every 15 minutes that are based in SSIS and do the exact thing you are trying to do. We have a huge relational database and then we wanted to do complicated reporting on top of it using a product called Tableau. We quickly discovered that our relational model wasn't really so hot for that so I built a cube over it with SSAS and that cube is updated and processed every 15 minutes.
Yes, SSIS does give the aura of being mainly for straight ETL jobs, but I have found that it can be used for simple quick jobs like this as well.
I think staging and partitioning will be too much for your case. I am implementing the same thing in SSIS now, but with a frequency of 1 hour, as I need to give some time for support activities. I am sure that using SSIS is a good way of doing it.
During the design, I had thought of another way to achieve custom replication, by customizing the Change Data Capture (CDC) process. This way you can get near real-time replication, but it is a tricky thing.
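For reference, turning CDC on for a source table looks roughly like this (CDC requires SQL Server 2008 or later, and the schema/table names here are placeholders); building the actual replication logic on top of the change tables is the tricky part:

    -- Run in the source database (the capture jobs need SQL Server Agent running).
    EXEC sys.sp_cdc_enable_db;

    -- Start capturing changes for one source table into a CDC change table.
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'Orders',   -- hypothetical source table
         @role_name     = NULL;        -- no gating role

    -- Changed rows can then be read between two LSNs, for example:
    -- SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');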