Impact of creating calculated fields in Workday - workday-api

What is the impact of creating calculated fields in Workday?
Will it make Workday slower? If so, what changes can we make to avoid that, especially for integrations?
Is it better to use report-based calculated fields? If yes, will the impact then be limited to that report only?
Does Workday discourage calculated fields, or do they have any special recommendations on this?
Asif

The calculated field itself doesn't make Workday as a whole slower, not in any noticeable way anyway. If your calculated fields are complex, that can sometimes make the report run slower. Try using report filters when you can if you find a calculated field slowing things down. A lot of long-running reports might slow your system down somewhat; you can contact Workday about that question.
As far as report-based calculated fields vs. global calculated fields: only use global calculated fields if you think you will reuse them in more than one report; otherwise, use report-based calculated fields. Make sure you have a good naming convention for your calculated fields; the more you have, the harder it gets to keep track of them all. But whether your calculated field is report-based or global, they both perform the same way; one isn't slower than the other.

Report performance is not based on calculated fields. It completely depends on the Datasource and business object you are using within the report.

Report run time depends mostly on the data source and business object used in the report. But sometimes we create a calculated field to fetch specific data instead of changing the data source or business object, and that is when performance can suffer.
Wherever possible, give priority to using Workday-delivered fields.
Check report run times for the various business objects and data sources.
Try to use an indexed data source.
Consider using the Optimized Performance check box when choosing the business object or data source while creating the report.

Related

The more tablixes the slower the report runs?

I apologize if this has already been answered, but did a search and could not find anything.
So, I've been working in SSRS and have made some parameters that display a different tablix depending on the options selected; for example, if the user does not want to display both the original AND actual expiration date, I have a parameter to display one of the two tablixes. I may need to add one or two more tablixes though because I wanted to implement an "order by" parameter (I couldn't get it to work in my actual SQL code).
Does having numerous tablixes affect performance? I do not want my reports to be bogged down! Thank you.
The number of tablixes is not a significant factor in performance, imo. The main causes of performance issues would be
1) SQL performance
2) slow performing library functions or procedures

Is it possible to have text measures in SSAS tabular?

The question pretty much sums it up.
I am creating a model that involves textual status information on some processes. I would like to show these as text but can't for the life of me figure out how.
Tried FirstNonBlank(textualcolumn, 1) without luck. Anyone know if this is possible?
Rather than having a text measure physically in any fact table, I would suggest you go for a calculated measure. As per your post, the measure has to represent some process status (Open or Closed, I suppose), so you can easily write an MDX expression for the calculated measure.

Row Inserted and Updated Time in Fact Table

I see the importance of having row-inserted and row-last-updated fields in a fact table. But I could not find any standard data warehouse reference which says that this is a good thing to do. I am uncertain whether that is because it is a bad practice; if so, why should it be? If it is because of the data size, I see it is only 8 bytes for a full date field.
Any help is greatly appreciated!!!
There's nothing written about whether it's a good or bad practice, because we include creation time and updated time only if we need them or ever will need them.
It's a "good thing to do" if you need to access those columns and a "bad thing to do" if your table will never require those columns.
The inclusion of insert and update timestamps in your data warehouse tables allows you to report from an "as was" and/or "as is" perspective with regard to the data warehouse. These timestamps would be in addition to any timestamps captured from the source.
They also make troubleshooting easier and, in a worst-case scenario, give you the ability to back out the set of data from a specific run of an ETL process.
At a previous client, the data model we implemented included upwards of 6 different timestamps to provide slowly changing history, as is/ as was reporting, and source related time stamps. It made for very flexible reporting but also increased the learning curve of how to get exactly what you wanted from the table(s).
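To make the idea concrete, here is a minimal sketch in Python with SQLite; the fact table, column names, and batch id are purely illustrative and not taken from any particular warehouse design.

```python
# Minimal sketch (SQLite) of a fact table carrying ETL audit timestamps.
# Table, column, and batch names are illustrative, not from a specific warehouse.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_sales (
        sale_id        INTEGER PRIMARY KEY,
        amount         REAL,
        source_sale_ts TEXT,   -- timestamp captured from the source system
        etl_batch_id   TEXT,   -- which ETL run loaded the row
        row_insert_ts  TEXT,   -- when the warehouse first saw the row
        row_update_ts  TEXT    -- when the warehouse last changed the row
    )
""")

now = datetime.now(timezone.utc).isoformat()
conn.execute(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
    (1, 99.50, "2009-12-31T17:00:00", "batch_2010_01_01", now, now),
)

# Backing out a bad load becomes a single statement keyed on the batch id.
conn.execute("DELETE FROM fact_sales WHERE etl_batch_id = ?", ("batch_2010_01_01",))
conn.commit()
```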

What are your best practices for ensuring the correctness of the reports from SQL?

Part of my work involves creating reports and data from SQL Server to be used as information for decision. The majority of the data is aggregated, like inventory, sales and costs totals from departments, and other dimensions.
When I am creating the reports, and more specifically, I am developing the SELECTs to extract the aggregated data from the OLTP database, I worry about mistaking a JOIN or a GROUP BY, for example, returning incorrect results.
I try to use some "best practices" to prevent myself from "generating" wrong numbers:
When creating an aggregated data set, always explode this data set without the aggregation and look for any obvious error.
Export the exploded data set to Excel and compare the SUM(), AVG(), etc, from SQL Server and Excel.
Involve the people who would use the information and ask for some validation (ask people to help to identify mistakes on the numbers).
Never deploy those things in the afternoon - when possible, try to take a look at the T-SQL on the next morning with a refreshed mind. I had many bugs corrected using this simple procedure.
Even with those procedures, I always worry about the numbers.
What are your best practices for ensuring the correctness of the reports?
Have you considered filling your tables with test data that produces known results, and comparing your query results with your expected results?
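For example, here is a minimal sketch of that approach in Python with SQLite; the sales table and its columns are hypothetical, and the "report" query stands in for whatever aggregation you are validating.

```python
# Sketch: load rows whose totals are known by construction, then assert the
# report query reproduces them. Table and column names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (department TEXT, amount REAL)")
test_rows = [("Toys", 10.0), ("Toys", 20.0), ("Books", 5.0)]
conn.executemany("INSERT INTO sales VALUES (?, ?)", test_rows)

# The "report" query under test.
actual = dict(conn.execute(
    "SELECT department, SUM(amount) FROM sales GROUP BY department"
))

# Expected totals computed independently of SQL, from the same fixture.
expected = {}
for dept, amount in test_rows:
    expected[dept] = expected.get(dept, 0.0) + amount

assert actual == expected, f"report totals {actual} != expected {expected}"
print("aggregation matches the known test data")
```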
Signed, in writing
I've found that one of the best practices is that both the reader/client and the developers are on the same (documented) page. That way, when mysterious numbers appear (and they do), I can point to the specification in writing and say, "This is why you see this number. Would you like it to be different?".
Test, test, test
For seriously complicated reports, we went through test data up and down with the client until all the numbers were correct and the client was satisfied.
Edge Cases
We discovered a seriously complicated case in our reporting system that turned everything upside down (on our end). What if the user generates a report (say, Year-End 2009), enters data for the new year, and then comes back to generate the same report? The data has changed, but that report should not. Thinking and working these cases out can save a lot of heartache.
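One common way to handle that case is to persist the report's figures the first time it is generated and serve the stored snapshot on later runs. A minimal sketch in Python with SQLite, with all table and key names hypothetical:

```python
# Sketch: persist the figures the first time "Year-End 2009" is generated,
# and return the stored snapshot on any later run. Names are hypothetical.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report_snapshot (report_key TEXT PRIMARY KEY, payload TEXT)")
conn.execute("CREATE TABLE orders (order_year INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(2009, 100.0), (2009, 250.0)])

def year_end_report(year):
    key = f"year_end_{year}"
    row = conn.execute(
        "SELECT payload FROM report_snapshot WHERE report_key = ?", (key,)
    ).fetchone()
    if row:                      # already generated: later data changes don't affect it
        return json.loads(row[0])
    total = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE order_year = ?", (year,)
    ).fetchone()[0]
    payload = {"year": year, "total": total}
    conn.execute("INSERT INTO report_snapshot VALUES (?, ?)", (key, json.dumps(payload)))
    return payload

first = year_end_report(2009)
conn.execute("INSERT INTO orders VALUES (2009, 999.0)")   # new data arrives later
assert year_end_report(2009) == first                      # the report stays frozen
```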
Write some automated tests.
We have quite a lot of reporting services reports - we test them using Selenium. We use a test data page to squirt some known data into an empty database, then run the report and assert that the numbers are as expected.
The builds run every time we check in, so we know we haven't done anything too stupid
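Here is a rough sketch of that pattern using Selenium's Python bindings; the report URL, the element id, the expected total, and the seed_test_data() helper are all placeholders, since the real setup depends on how your reports are hosted and how your test data page works.

```python
# Rough sketch of a Selenium-driven report check. The URL, element id, expected
# value, and seed_test_data() helper are placeholders for your own environment.
from selenium import webdriver
from selenium.webdriver.common.by import By

def seed_test_data():
    # Placeholder: push a known fixture into an empty test database,
    # e.g. via a "test data page" or a direct SQL script.
    pass

def test_sales_report_total():
    seed_test_data()
    driver = webdriver.Chrome()
    try:
        driver.get("http://reports.example.local/sales-summary")  # placeholder URL
        total = driver.find_element(By.ID, "grand-total").text     # placeholder element id
        assert total == "1,234.00", f"unexpected report total: {total}"
    finally:
        driver.quit()
```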

BASIC Object-Relation Mapping question asked by a noob

I understand that, in the interest of efficiency, when you query the database you should only return the columns that are needed and no more.
But given that I like to use objects to store the query result, this leaves me in a dilemma:
If I only retrieve the column values that I need in a particular situation, I can only partially populate the object. This, I feel, leaves my object in a non-ideal state where only some of the properties and methods are available. Later, if a situation arises where I would like to reuse the object but find that the new situation requires a different but overlapping set of columns to be returned, I am faced with a choice.
Should I reuse the existing SQL and add to the list of selected columns the additional fields that are required by the new situation so that the same query and object mapping logic can be reused for both? Or should I create another method that results in the execution of a slightly different SQL which results in the populating of only those object properties that were returned by the 2nd query?
I strongly suspect that there is no magic answer and that the answer really "depends" on the situation, but I guess I am looking for general advice. In general, my approach has been to either return all columns from the queried table or to add the additional columns to the query as they are needed, but to reuse the same SQL (and mapping code) - that is, until performance becomes a concern. In general, I find that unless you are retrieving a large number of rows - and I usually am not - the cost of adding additional columns to the output does not have a noticeable effect on performance, and that the savings in development time and the simplified API that result are a good trade-off.
But how do you deal with this situation when performance does become a factor? Do you create methods like
Employees.GetPersonalInfo
Employees.GetLittleMorePersonlInfoButMinusSalary
etc, etc etc
Or do you somehow end up creating an API where the user of your API has to specify which columns/properties he wants populated/returned, thereby adding complexity and making your API less friendly/easy to use?
Let's say you want to get Employee info. How many objects would typically be involved?
1) an Employee object
2) An Employees collection object containing one Employee object for each Employee row returned
3) An object, such as EmployeeQueries, that contains methods such as "GetHiredThisWeek" which returns an Employees collection of 0 or more records.
I realize all of this is very subjective, but I am looking for suggestions on what you have found works best for you.
I would say make your application correct first, then worry about performance in this case.
You could be optimizing away your queries only to realize you won't use that query anyway. Create the most generalized queries that your entire app can use, and then as you are confident things are working properly, look for problem areas if needed.
It is likely that you won't have a great need for huge performance up front. Some people say the lazy programmers are the best programmers. Don't over-complicate things up front, make a single Employee object.
If you find a need to optimize, you'll create a method/class, or however your ORM library does it. This should be an exception to the rule; only do it if you have reason to do so.
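As a minimal sketch of that "one generalized query, one fully populated object" approach, here is a Python example using SQLite; the employees table and its columns are hypothetical.

```python
# Sketch of the generalized approach: one query, one fully populated object.
# Table and column names are hypothetical.
import sqlite3
from dataclasses import dataclass

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, hired TEXT, salary REAL)")
conn.execute("INSERT INTO employees VALUES (1, 'Ada', '2009-06-01', 85000.0)")

@dataclass
class Employee:
    id: int
    name: str
    hired: str
    salary: float

def get_employee(employee_id):
    # Always select every column, so any caller gets a complete Employee.
    row = conn.execute(
        "SELECT id, name, hired, salary FROM employees WHERE id = ?", (employee_id,)
    ).fetchone()
    return Employee(*row) if row else None

print(get_employee(1))
```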
...the cost of adding additional columns to the output does not have a noticeable effect on performance...
Right. I don't quite understand what "new situation" could arise, but either way, it would be a much better idea (IMO) to get all the columns rather than run multiple queries. There isn't much of a performance penalty at all for getting more columns than you need (although the queries will take more RAM, but that shouldn't be a big issue; besides, hardware is cheap). Also, you'd save yourself quite a bit of development time.
As for the second part of your question, it's really up to you. As an example, Rails takes more of a "usability first, performance last" approach, but that may not be what you want. It just depends on your needs. If you're willing to sacrifice a little usability for performance, by all means, go for it. I would.
If you are using your objects in a "row at a time" CRUD-type application, then, by all means, copy all the columns into your object; the extra overhead is minimal, and your object becomes truly reusable for any program wanting row access to the table.
However, if your SQL is doing a complex join or returning a large set of rows, then request precisely and only what you want. You get two performance penalties here: one, handling each unused column for every row eats up CPU for no benefit; and two, most DBMS systems have a bag of tricks for optimising queries (such as index-only access) that can only be used if you specify precisely which columns you want.
There is no reuse issue in most of these cases, as scan/search processes tend to be very specific to a particular use case.
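And a companion sketch of that second case, where a scan over many rows requests only the columns it actually needs (again Python with SQLite, hypothetical table and columns):

```python
# Sketch of the "request precisely what you want" case: a scan over many rows
# pulls only the two columns the report needs. Names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, hired TEXT, salary REAL, bio TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [(i, f"emp{i}", "2009-01-01", 50000.0, "long text...") for i in range(1000)],
)

# Narrow projection: the wide bio column is never read, and an index on
# (hired, name) could satisfy this query without touching the base rows.
hired_this_week = conn.execute(
    "SELECT name, hired FROM employees WHERE hired >= ?", ("2009-01-01",)
).fetchall()
print(len(hired_this_week), "rows, 2 columns each")
```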