When you are providing a report to an end user and want to validate it against the system and database (i.e., checking that your SQL code is pulling the details accurately), what is considered enough validation of the output Excel file?
For example:
10% of 100 would be 10.
10% of 1,000 would be 100, which seems reasonable.
10% of 1,000,000 would be 100,000, which seems completely unreasonable.
Is there a template or scale for human validation across large datasets? Has anyone done or seen something like this before that I could use as a guide?
When you say validate against the database, I am assuming the data you are providing has already been pulled from the database. So is the question how to validate data that was input, or how to validate that the queries are accurate?
If it's the first, the only real way to do this is to have controls on the type of entry (i.e., string, date, and integer validation).
If it's the second, then you should evaluate it against some quantitative criteria. For example, if I am trying to validate that I sold 100 computers last month and I have hard receipts for those 100, I can query to ensure that the report reflects the actual truth. Beyond that, you could have controls to make sure reports don't include duplicates, and so on, but that has more to do with general administration of input.
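For illustration, here is a minimal sketch of those kinds of quantitative checks in SQL. The table and column names (Sales, ReceiptNo, SaleDate) are placeholders, not from the original question:

-- 1) Totals check: does the source row count for the period match the report?
SELECT COUNT(*) AS SourceRowCount
FROM Sales
WHERE SaleDate >= '2024-01-01' AND SaleDate < '2024-02-01';

-- 2) Duplicate check: the same receipt should not appear more than once.
SELECT ReceiptNo, COUNT(*) AS Occurrences
FROM Sales
GROUP BY ReceiptNo
HAVING COUNT(*) > 1;

-- 3) Spot check: pull a small random sample to verify by hand against
--    the Excel output (SQL Server syntax).
SELECT TOP 50 *
FROM Sales
ORDER BY NEWID();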
I am looking for a way to download daily WTI crude oil (CL) term structure data from Bloomberg in the Excel Add-in.
My goal is to create a table that looks like this: for each futures contract, I want the last price and the days to maturity.
Date    | CL1                        | CL2                        | ...
        | PX_Last  FUT_ACT_DAYS_EXP  | PX_Last  FUT_ACT_DAYS_EXP  |
1.1.12  |                            |                            |
I have tried the FUT_ACT_DAYS_EXP field, but it is not available for historical price series. Is there maybe a different way to get the term structure data with days to maturity from Bloomberg?
Many thanks!
The generic contract CL1 Comdty has a historical field, FUT_CUR_GEN_TICKER, so you can pull back a time series of PX_LAST and FUT_CUR_GEN_TICKER.
Then you can feed these underlying contract tickers into a BDP call for LAST_TRADEABLE_DT and subtract the PX_LAST date. You can hide the intermediate columns if they are not needed.
The final result then has the intermediate column D hidden.
NB: I'm using array functions here (note the # symbols in the formulae) rather than hard-coding ranges; it makes things more flexible if you want to change the history range. The multiple BDP calls are unnecessary, but the Bloomberg add-in may be caching them in any case. If performance is an issue, you can use the UNIQUE() function to get a list of the underlying contract names into a lookup table.
It's been a few years since I looked at this, but you are using generic tickers, whereas FUT_ACT_DAYS_EXP would most likely expect an actual contract, e.g. CLU1. There is a field you can use to convert generic to actual as a time series, i.e. with BDH, but you would have to check that on FLDS. Once you have that, you can use BDH to pull in the price and days to expiry. Bear in mind that, with one BDH per date, this would be a very inefficient approach with regard to data limits.
Alternatively, ask HELP HELP and they should give you a solid answer. Unfortunately I don't have access to the terminal anymore, but I used to be very involved in building such analytics.
I am trying to create a report that displays 3 different numbers for each of my projects:
Contract Hours - stored in the projects table; one-to-one relationship.
Worked Hours - stored in a linked table that is updated using an external website's reporting feature and contains data only for the dates to be displayed in the report; one-to-many relationship, needs to be summed.
Allocated Hours - stored in a table in my database called allocations, which contains data for all dates; one-to-many relationship, needs to be summed.
Right now I have it set up so that the user has to type the date range for the report every time it is run; however, the date range only actually applies to the allocation data, because the worked hours data comes pre-filtered and the contract data is one-to-one.
What I would like to do is set up a query that can determine the date range of the worked hours data and apply it as the date criteria for the allocated hours.
I have attempted to use the max and min values of the worked hours dates and tried to get creative, but I'm not even sure this is possible, because I cannot see a simple solution (although I suspect it should be possible and fairly simple).
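Something along the lines of the sketch below is what I am imagining; the table and field names (WorkedHours, Allocations, WorkDate, AllocationDate) are placeholders rather than my real schema:

SELECT a.ProjectID, Sum(a.AllocatedHours) AS TotalAllocated
FROM Allocations AS a
WHERE a.AllocationDate >= (SELECT Min(w.WorkDate) FROM WorkedHours AS w)
  AND a.AllocationDate <= (SELECT Max(w.WorkDate) FROM WorkedHours AS w)
GROUP BY a.ProjectID;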
Any help, suggestions, or recommendations are appreciated.
We have a system that records data captured from field equipment to a SQL Server database every minute. This data is used for a number of purposes, one of which is for charting in reports via SSRS.
The issue is that with such a high volume of data, when a report is run for a period of, for example, 3 months, the volume of data returned obviously causes excessive report rendering times.
I've been thinking of finding a way to dynamically reduce the amount of data returned, based on the start and end periods chosen. Something along the lines of a sliding scale based on the duration between the start and end dates: the larger the period chosen, the more filtering is applied, while smaller periods get little or no filtering.
There is still a need to be able to produce higher resolution (as in more data points returned) reports for troubleshooting purposes.
For example:
Scenario 1:
A user executes a report for a period of 3 months. The result set returned by the query is reduced for performance reasons without adversely affecting the information the user wants to see (the chart is still representative of the changes over time).
Scenario 2:
User executes the report for a period of 1 hour, in order to look for potential indicator(s) of problems with field devices while troubleshooting the system. For this short time period, no filtering is applied.
My first thought was to use a modulo operation on the primary key of the data (which is an identity field), whereby the divisor is chosen depending on the difference between the start and end dates.
For example: if the difference between the start and end dates of the report period is 5 weeks, choose a divisor of 5, apply a modulo to the PK, and select only the rows where the result equals zero.
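A rough sketch of that idea in T-SQL is below; the table, column, and divisor thresholds (EquipmentReadings, ReadingID, ReadingTime) are illustrative assumptions only:

DECLARE @StartDate datetime = '2024-01-01';
DECLARE @EndDate   datetime = '2024-04-01';

-- Pick a divisor from the length of the requested period: longer periods
-- get heavier thinning, short troubleshooting periods get none.
DECLARE @Divisor int =
    CASE
        WHEN DATEDIFF(DAY, @StartDate, @EndDate) <= 1  THEN 1
        WHEN DATEDIFF(DAY, @StartDate, @EndDate) <= 7  THEN 5
        WHEN DATEDIFF(DAY, @StartDate, @EndDate) <= 35 THEN 15
        ELSE 60
    END;

SELECT ReadingID, ReadingTime, ReadingValue
FROM EquipmentReadings
WHERE ReadingTime >= @StartDate
  AND ReadingTime <  @EndDate
  AND ReadingID % @Divisor = 0
ORDER BY ReadingTime;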
I would love to get feedback as to whether this sounds like a valid approach or whether there is a better way to do this.
Thanks.
I have what I consider a real need to create a query with several hundred columns.
We are working on a mailing for our client. In this mailing, they are listing out several locations where their customers can go to get information. As our designers create the template for this mailing, they are setting up "Slots" for each address. The number of slots on the mailing varies from one mailing to the other, from 6 to possibly 50.
My need for the query is to set up the merge of data into the mailing. I need to provide a query where each mailing is 1 record containing all the information they need for that mailing. I am dynamically creating the SQL statement with the max number of slots on that mailing. With up to 50 slots, my query needs to look like this:
MailingID,
LogoLocation,
APNCode,
TFN,
CopyVersion,
Slot1_Name,
Slot1_Address,
Slot1_City,
Slot1_State,
Slot1_DateTime,
...
Slot50_Name,
Slot50_Address,
Slot50_City,
Slot50_State,
Slot50_DateTime
My first attempt was to create a table with all these fields, but I got this error:
The table has been created, but its maximum row size exceeds the allowed maximum of 8060 bytes. INSERT or UPDATE to this table will fail if the resulting row exceeds the size limit.
They only want the data in a CSV file, so I don't need to create a temp table for it.
My problem is that I'm trying to create a standard process, and with the number of fields varying like that, I want to set this up in a way that won't blow up the system every time we run it.
I've looked at a few pages and found details on the size limitations of SQL Server and several comments saying a table like this shows a bad database design.
http://msdn.microsoft.com/en-us/library/ms143432(v=sql.105).aspx
http://social.msdn.microsoft.com/Forums/en-US/fec1efbb-94ff-4fe9-8d69-12e95c48587d/its-maximum-row-size-exceeds-the-allowed-maximum-of-8060-bytes-insert-or-update-to-this-table-will?forum=transactsql
Work around SQL Server maximum columns limit 1024 and 8kb record size
I'm hoping that someone out there has some experience doing this and can share some insights on how to make this efficient. Is there another way to accomplish this that I don't know about?
UPDATE:
Thanks for all the quick replies.
More detail on my scenario: you get a flyer in the mail, and when you turn the flyer over, it lists 50 locations in your county where you could go take a class or attend a meeting or something. All the details for that flyer need to be in 1 record so they can map the fields on the one page. If that county has 50 address/date/time combinations, they need them all included in the 1 record so they can properly slot the flyer. Think giant mail merge where there might only be 100 counties (100 flyers), but each flyer has tons of information.
When the data is actually stored in the database, I'm storing an id for the specific flyer (MailingID) and each address/date/time combo is its own record. It's just the file they need to merge the details onto the creative piece that has to be denormalized like this.
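To illustrate the shape of the data (with made-up table and column names such as Mailings and MailingSlots), the merge file is essentially a pivot of those per-slot rows into one wide row per MailingID:

SELECT m.MailingID,
       m.LogoLocation,
       MAX(CASE WHEN s.SlotNo = 1 THEN s.LocationName END) AS Slot1_Name,
       MAX(CASE WHEN s.SlotNo = 1 THEN s.Address      END) AS Slot1_Address,
       MAX(CASE WHEN s.SlotNo = 2 THEN s.LocationName END) AS Slot2_Name,
       MAX(CASE WHEN s.SlotNo = 2 THEN s.Address      END) AS Slot2_Address
       -- ... repeated (built dynamically) up to the maximum slot number
FROM Mailings AS m
JOIN MailingSlots AS s ON s.MailingID = m.MailingID
GROUP BY m.MailingID, m.LogoLocation;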
I haven't been able to find any details on limitations on views. Does a View have the same limitations as a table? Would it work to create a view for them that they can download when they need the data?
"All the details for that flyer needs to be in 1 record so they can map the fields on the one page" - that is a questionable assumption. Why can't the data be stored in 50 rows in a 2nd table?
Anyway, if you insist on storing everything in one row, you should probably use XML or JSON. That makes all these problems go away. SQL Server has great support for XML. You can even generate XML on the fly, so you could properly store the 50 items in a 2nd table and only combine them into one XML value for query purposes.
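As a rough sketch of that suggestion, assuming hypothetical tables named Mailings and MailingSlots, you could keep the locations as normal rows and combine them into a single XML value per mailing at query time:

SELECT m.MailingID,
       (SELECT s.SlotNo       AS "@slot",
               s.LocationName AS "Name",
               s.Address      AS "Address",
               s.City         AS "City"
        FROM MailingSlots AS s
        WHERE s.MailingID = m.MailingID
        ORDER BY s.SlotNo
        FOR XML PATH('Location'), ROOT('Locations'), TYPE) AS LocationsXml
FROM Mailings AS m;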
At my workplace there is a "Daily Feedback" database where details are entered of any errors made by Customer Service Officers (CSOs), such as who made the mistake, when, and what service it was for. This is so we can gather data showing areas where CSOs are repeatedly making mistakes, so we can feed this back to them and train them in those areas if need be.
I currently have a report where a CSO's name is entered along with a date range, and it produces a report showing the number of errors made for each service in that date range.
The code used is:
=Sum(IIf([Service]="Housing",1,0))
=Sum(IIf([Service]="Environmental Health",1,0))
and so on for each service.
The problem I have is that not every CSO does every service, so there are usually a few results showing as "0", and I cannot be sure whether that's because they don't do the service or because they are just very good at that service.
Being absolutely useless at SQL (or any other possible way of fixing this), I cannot figure out how to hide the entries that produce a zero value.
Any help here would be greatly appreciated!
Assuming you have a table with the fields CSO, Service, and FeedbackComments, you could modify the report's record source to:
SELECT [CSO], [Service], Count([FeedbackComments]) AS ErrorCount
FROM [FeedbackTable]
GROUP BY [CSO], [Service];
Then services which have no records will not appear on the report.
I don't understand exactly what you want, but I want to mention that you can use the COUNT() function along with SUM(). A count > 0 will reveal whether a 0 means zero instances or zero errors.
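For example, something along these lines, assuming a hypothetical table that logs every transaction with a yes/no error flag rather than only the errors:

SELECT [CSO], [Service],
       Count(*) AS Instances,
       Sum(IIf([IsError], 1, 0)) AS Errors
FROM [TransactionLog]
GROUP BY [CSO], [Service];

Services with no instances simply will not appear, while a row with Errors = 0 means the CSO handled that service and made no mistakes.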