Can you please help me with the timestamps of a summary index?
We are having disk space issues and are clearing out old logs, but we want to keep some of the field data. If I schedule a summary index (SI), will it add the data from the last month all at once? If so, why do we need to schedule it at all? I have gone through the Splunk documentation but am unable to understand the steps and the logic.
The idea of a summary index is to store the results of a search until they are needed by a later search. The classic example is the end-of-month report. Rather than run one huge search over thirty days to crunch the thousands of events of each day into a final report, a daily search crunches that day's events into the SI; the monthly report then runs on day 30 and only has to read the 30 summary events from the SI, so it completes quickly. The same SI can then be used for end-of-week reports and to populate a dashboard with the daily sales (or whatever) figures.
The key is to make the summary smaller than the original data. One cannot dump 1 month of data into a SI and hope to save space - it won't happen.
A summary index can help save disk space by retaining a smaller set of summary data long after the original events have been discarded.
Summaries do not have to be scheduled, but that is the most common way to produce them. It means no one has to remember to run the daily sales report every day in order to be able to get the monthly sales report. That said, one can also write events to a summary index from an ad-hoc search using the collect command.
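To make the mechanics concrete, here is a rough sketch only; the index names, sourcetype, and field names below are invented for illustration, not taken from your environment. A search scheduled to run each night over the previous day (earliest=-1d@d latest=@d) could condense that day's events into one row per host and write it to the summary index (which must already exist):

    index=web sourcetype=access_combined earliest=-1d@d latest=@d
    | stats count AS daily_events sum(bytes) AS daily_bytes BY host
    | collect index=my_summary

The end-of-month report then only has to read the small daily rows back out of the summary index:

    index=my_summary earliest=-30d@d
    | stats sum(daily_events) AS monthly_events sum(daily_bytes) AS monthly_bytes BY host

The same effect can be achieved by enabling summary indexing on the scheduled saved search instead of calling collect explicitly. This is why the search is scheduled: each run summarizes only one day's worth of events, rather than trying to summarize a whole month in one pass.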
When you are providing a report to an end user and you want to validate the report against the system and database (checking that your SQL code is pulling the details accurately), what is considered enough validation on the output Excel file?
For example:
10% of 100 would be 10.
10% of 1,000 would be 100, which seems reasonable.
10% of 1,000,000 would be 100,000, which seems completely unreasonable.
Is there a template or scale for human validation across large datasets? Has anyone done or seen something like this before that I could use as a guide?
Now, when you say validate against the database: I am assuming the data you are delivering has already been pulled from the database. So is the question how to validate data that was input, or how to validate that the queries are accurate?
If it's the first, the only real way to do this is to have controls on the type of entry (i.e. string, date, and integer validation).
If it's the second, then you should evaluate it against some quantitative criteria. For example, if I am trying to validate that I sold 100 computers last month and I have hard receipts for those 100, then I can query to ensure that the report reflects the actual truth. Beyond that, you could have controls to make sure reports don't contain duplicates, and so on, but that has more to do with general administration of the input.
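As a hedged illustration of that kind of quantitative check, assuming a hypothetical sales table and report extract (all table and column names here are made up):

    -- Reconcile a headline number against the source system
    SELECT COUNT(*) AS computers_sold
    FROM   sales
    WHERE  product_type = 'computer'
      AND  sale_date >= :month_start
      AND  sale_date <  :month_end;

    -- Catch duplicate rows before they reach the report
    SELECT order_id, COUNT(*) AS copies
    FROM   report_extract
    GROUP  BY order_id
    HAVING COUNT(*) > 1;

Checks like these validate the whole population cheaply, which usually beats eyeballing a fixed percentage of rows as the dataset grows.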
I have two datasets in my SSRS tool: the first table contains 12,000 records and the second one 26,000 records, with 40 columns in each table.
While building the report, each time I go to Preview it takes forever to display.
Is there any way to avoid that, so I at least don't spend so much time building this report?
Thank you in advance.
Add a dummy parameter to limit your dataset, or just change your SELECT to SELECT TOP 100 while building the report.
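A minimal sketch of that idea against a SQL Server source; the table, columns, and the @RowLimit parameter name are assumptions for illustration only:

    -- @RowLimit is a report parameter: set it to 100 while designing,
    -- then to a very large value (or drop the TOP clause) for production.
    SELECT TOP (@RowLimit)
           OrderID, CustomerName, OrderDate, TotalAmount
    FROM   dbo.SalesOrders
    ORDER  BY OrderDate DESC;

Because the limit is a parameter, you don't have to edit the query text again when the report goes live.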
#vercelli's answer is a good one. In addition, you can change your cache options in the designer (for all result sets, including parameters) so that the queries are not rerun each time.
This is really useful. Plus, a couple of tips for you:
1. I don't recommend caching until you are happy with your dataset results.
2. If you are using the cache and you want to do a quick refresh, the data is stored in a ".data" file in the same location as your .rdl. You can delete this file to query the database again if required.
I have a ~20 million row dataset of addresses in the US. Currently I run a query for each state's addresses by manually changing the WHERE clause to the proper state abbreviation (e.g. WHERE STATE_ABV = 'NY').
I have to query state by state because I export these files as CSVs, which have a low row limit and cannot handle the entire US dataset.
Is there a way to automate the process so that after each run the results are exported as a CSV, the WHERE clause is changed to the next state abbreviation, and the process runs again? I would like to have 50 CSVs saved somewhere (one for each state) by the end of the automation.
All my searching about automation turned up recurring queries scheduled by time rather than driven by variables, so I understand if this is not possible.
The software I am using is SQL Developer
Thanks as always
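One possible approach, sketched under assumptions: the table name addresses and the column state_abv are placeholders, and it relies on SQL Developer's script engine (SQLcl) supporting SPOOL and SET SQLFORMAT csv. The idea is to generate a driver script containing one SPOOL/SELECT block per state, then run the generated script:

    SET HEADING OFF
    SET FEEDBACK OFF
    SET PAGESIZE 0
    SET LINESIZE 300
    SET TRIMSPOOL ON
    SPOOL export_by_state.sql
    SELECT 'SET SQLFORMAT csv'                       || CHR(10) ||
           'SPOOL ' || state_abv || '_addresses.csv' || CHR(10) ||
           'SELECT * FROM addresses WHERE state_abv = ''' || state_abv || ''';' || CHR(10) ||
           'SPOOL OFF'
    FROM   (SELECT DISTINCT state_abv FROM addresses ORDER BY state_abv);
    SPOOL OFF
    -- then run the generated file: @export_by_state.sql

Each generated block spools one state's rows to its own CSV, so a single run of the generated script should leave you with one file per state.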
This is quite a long and complicated question; I will do my best to explain exactly what I need to do.
This applies to a flight department. Let's start with what I have: we use spreadsheets to track flight time, landings, and engine cycles. Currently we're using two spreadsheets: one is our "trip" sheet, and the other is our flight "log".
The trip sheet can be one to three worksheets long; it is used to track each flight flown during the trip. The trip could range from one flight (leg) up to 25 flights (legs), and could last from 1 day to 21 days. Each DAY of the trip has its own Log #, i.e. if there are 3 flights on one day, they all share the same Log #. The trip #'s are not in order: one trip could be #672, the next #264543, the next #689. The creation date is the only thing that could be used to keep the trip workbooks in order.
The flight log is the FAA-required logbook for the aircraft. The Log #'s run in order, e.g. 459, 460, 461. A flight log is required for each day that the aircraft flies. Some, but not all, of the information from the trip sheet is required on the flight log. The most important thing is that the times, landings, and cycles calculate in order.
Now here is what I'm looking for. I'd like a spreadsheet that contains the three trip sheet worksheets as we have now, but when a flight (leg) is entered, it creates a 4th worksheet, which would be the flight log. Each leg flown that day would have its information transferred to that flight log. Then, when we fly on a NEW day, a 5th worksheet would be created for the new day's flight log. Times, landings, and cycle totals need to transfer over from the previous day's flight log, along with the other information needed from the trip sheet, just like the previous log. And so on, and so on, till the end of the TRIP.
Now here's the REAL tricky part: when we start a new TRIP and create a new workbook for that trip, I need the totals from the previous trip to transfer to the new workbook, so a legal, running total of aircraft times can be kept.
So basically, what I want to do is take the two separate workbooks we use for each trip now and cram them into one, but each time a new trip workbook is created I need to grab info from the LAST workbook created to keep the running total going.
I'm new to this forum; if there's a way to attach a copy of the two workbooks we use now, please tell me. Looking at what we are using would probably make a lot of this clearer.
Thank you!!! PQ
It sounds like you have a working solution using Excel, which is very good. Oftentimes the biggest challenge is figuring out the process flow and all its branches. Further, it seems like you just want to make your solution more routine and sustainable to work with.
Although making a souped-up, macro-enabled Excel document sounds like the right way to go, the features you are asking for are really better suited to a relational database. Not to say it can't be done, but implementing an Excel-based solution is going to be messy. The crucial difficulty, I believe, is maintaining the logical link between the trip sheet and the log sheet.
For example, if I understand correctly, you will have to create several Excel files, and they are going to need a naming convention in order for the computer to know which ones to look for. This exposes the data to the most basic mishaps like mistakenly renaming or moving a file. If you will be the only one to maintain the system, then perhaps that won't be an issue, but experience tells me that a lot of effort can be instantly undermined by something as simple as opening a file and editing it.
This also means that you will have to maintain a "builder" file that contains the code you develop. Not every machine is set up for macros, and a lot of end users will get scary notices that "this document contains macros which could be a danger... blah blah blah." That means every output file should probably be macro-free.
Instead, I would recommend recreating your system using a relational database like MS-Access. You can create an unlimited number of records/tables and use any number of variables to maintain the logical link (by log #, by date, by flight #, etc.). You can also set rules so data is recorded and reported in a consistent manner. And if you have the need and the programming expertise, VBA macros can also be introduced to an MS-Access based solution.
Lastly, all the data could easily be kept in one central *.mdb file, which would be far easier to maintain and back up than several overlapping Excel files.
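To make the idea concrete, here is a very rough sketch of what the tables could look like; every table and column name below is an illustrative assumption, not a prescribed design:

    CREATE TABLE Trips (
        TripNumber   INTEGER PRIMARY KEY,  -- the non-sequential trip # (672, 264543, ...)
        CreatedOn    DATE
    );

    CREATE TABLE FlightLogs (
        LogNumber    INTEGER PRIMARY KEY,  -- the sequential FAA log # (459, 460, 461, ...)
        TripNumber   INTEGER REFERENCES Trips (TripNumber),
        LogDate      DATE
    );

    CREATE TABLE Legs (
        LegID        INTEGER PRIMARY KEY,
        LogNumber    INTEGER REFERENCES FlightLogs (LogNumber),
        FlightTime   DOUBLE,               -- hours flown on this leg
        Landings     INTEGER,
        EngineCycles INTEGER
    );

    -- A legal running total of aircraft time, landings, and cycles then
    -- comes from a query instead of being copied from workbook to workbook:
    SELECT SUM(FlightTime)   AS TotalTime,
           SUM(Landings)     AS TotalLandings,
           SUM(EngineCycles) AS TotalCycles
    FROM   Legs;

The keys enforce the trip-to-log-to-leg relationships, and the running totals never have to be carried forward by hand.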