sum/merge multiple data source in google data studio - data-visualization

I have added 5 google ad data sources to my dashboard. I want to merge them all together. I tried to blend the data. But for each dataset, it takes its own metrics. As a result, I am getting individual metrics for each of them like 5 different "Link Clicks". But how can I combine/merge them together? Is there any functionality or I need to write some code for summing up the metrics?

I think you are in the correct path.
Just blend your data as you already did. Doing this, you'll have 5 fields with the same information, one from each data source (like link_clicks_sourceA, link_clicks_sourceB etc).
Then, create a new field in this blended source called blended_link_clicks with this formula (the field should be created in the chart level, since there's no way to add custom fields in a blended data source):
CASE
WHEN link_clicks_sourceA IS NOT NULL THEN link_clicks_sourceA
WHEN link_clicks_sourceB IS NOT NULL THEN link_clicks_sourceB
WHEN link_clicks_sourceC IS NOT NULL THEN link_clicks_sourceC
WHEN link_clicks_sourceD IS NOT NULL THEN link_clicks_sourceD
WHEN link_clicks_sourceE IS NOT NULL THEN link_clicks_sourceE
ELSE NULL
END
PS: For some reason, I tried to reproduce the steps here before posting and, in fact, this solution didn't work. In my tests, the expression anyField IS NOT NULL always evaluates to false in CASE statements and I don't know why. I would say this is due to a bug in DataStudio, so I decided to post the answer anyway.

Well, Data Studio doesn't allow merging columns from different data sources.
Data Blending operation lets you blend different data sources and use them for the charts. The possible way could be creating new fields but there's no way to add custom fields in blended data. The only way here could be using formulas in charts to sum up the data. Like you will have Link_Clicks_A, Link_Clicks_B etc. You can sum them up.
SUM: Try using the sum function
SUM(Link_Clicks_A) + SUM(Link_Clicks_B) + SUM(Link_Clicks_C)
NARY_MAX: If the sum function doesn't work, you could try this one.
NARY_MAX(Link_Clicks_A, 0) + NARY_MAX(Link_Clicks_B, 0) + NARY_MAX(Link_Clicks_C, 0)

Related

Comparing two datasets in SSRS

I'm looking to compare two datasets with each other. In an ideal world, I'd like to have it to show a green item if the data matches between the two. I have created two different GDocs files to get the code out there, to prevent SO from dinging me on formatting.
The first dataset is from our program itself, it pulls everything from our application, and displays the information, based on company code. The second dataset is from an external source requiring validation. The main fields I am matching are "NPI Number (Type 1)" from DS1 vs. "NPI" from DS2. If there is a match to highlight in green the row from both sides of data.
Dataset 1
Dataset 2
You may need to use LookUp function and set that as a expression to fill the background color of a text box or row of a table
Sample Expression: =iif(Len(Lookup(Fields!NPI.Value, Fields!NPI.Value, Fields!ProviderName.Value, "DS1"))>0,"Green","Red")
I have created a sample here. Download entire content and run it.

(Excel-VBA) Specific data import (on the background) in the active sheet

Would you please help me (total beginner) to prepare a VBA macro that would open a sheet on the background and import specific selection as shown below:
Let's say we have downloaded wordcount analysis (xlsx) like this downloaded from a CAT tool for testing.
Now I would need to add a macro to my main sheet that would read lines starting (Column A) with "All". If "All" then I'd need to record columns of that line (specficilly Columns A - O) in array / hashtable?.
Please take a look at this image that summs it all (better than explaining it for me :-)
Let me know in case you need to know more details.
All tips / suggestions are greatly appreciated.
Many thanks!
My suggestion (I'm a beginner too) would be to use the Macro Recorder. Great tool to learn (example).
start recording
filter for 'ALL'
copy/past the Cells
stop Recording
Then have a look at the recorded code and adjust it :)
Looking at your data and the final layout you are looking for, using a Pivot Table would provide you with all of the flexibility you need.
You can:
filter which data to display
generate calculated values based on data in other columns
choose what order your columns are displayed
dynamically change the layout if you decide you want a different view
From your data, I was able to generate the following Pivot Table in about 15 minutes.
There are several good, simple tutorials on building Pivot Tables. A Google search will turn up plenty.
Things you will need to learn about for your particular problem:
Classic display (I used the classic display to get this particular layout)
Calculated Fields (many of the columns in the pivot table are calculated based on your spec). There is a maximum string length of 255 characters for a field calculation, so you may need to rename some of the columns in the original data set.
Of course, basics of Pivot Tables
Loading new data and updating your pivot table
Good Luck!

Adding two extra columns to input data - Pentaho Kettle

I am working on a transformation step for Pentaho Kettle. It selects several input columns and based on that adds two new columns during transformation. I am unable to understand (based on code from other plugins), how I can add the two new columns so that 1) steps downstream are aware of these columns and 2) i can push the transformed data into these columns.
Thanks in advance.
You might need to override meta.getStepFields() to add new ValueMetaInterface objects to the RowMetaInterface passed in. This is the standard way to add columns at runtime; however, the row's metadata (i.e. list of ValueMetaInterface objects) must be the same from row to row or else the next step in your transformation will complain.
Often when doing data-driven custom plugins, you consume as many rows as you need (using getRow()) in order to figure out what the outgoing row format/metadata will be, then you can construct a RowMetaInterface (usually using meta.getStepFields()) that will be passed into the putRow() call. If you intend to pass through the incoming fields, do something like:
RowMetaInterface outputRowMeta = getInputRowMeta().clone();
If you're creating new rows use this:
RowMetaInterface outputRowMeta = new RowMeta();
Either way when you call meta.getStepFields(outputRowMeta, ...) it should populate outputRowMeta with the appropriate fields, by adding/changing/removing ValueMetaInterface objects from outputRowMeta.
I've got a blog post using Groovy to add/replace fields in the incoming rows here:
http://funpdi.blogspot.com/2014/10/flatten-json-to-key-value-pairs-in-pdi.html
Not sure if that is similar to your use case or not. If you have more questions, feel free to find me on IRC at ##pentaho (my nick is usually mburgess_pdi)
IF i have understood your question correctly, i think you are trying to create an output file with dynamic column. So you can do this by checking on the "fast dumping" option in Text File Output Step. While doing so , donot define any column names in the "Fields" tab
Check my image below:
Hope it helps :)

Reporting Services - Two filters on the same chart Category Group?

I have sales data that I'd like to plot on my chart. However, at a specific point in time, we had a change taking place I'd like to ensure is clearly visible in the chart, preferably by dividing the sales data (which is stored in a single SQL Server column) into two different chunks, which would allow me to then treat them as different data series.
I used to solve this in Excel by storing the post-event data in a different column (by simply dragging them to a different column), and thus I was able to treat them as a different series (the blue and green line in the chart below. The red and orange line are pre-event and post-event averages):
I'd like to reproduce this effect in SSRS, but am not sure how to tackle it. I've tried using an approach where I added two category groups, both pointing to the date-time column, and applying filters to them (one <= the cutoff date, the other >=).
I then added my sales data twice, with the idea I could somehow connect them to the individual category groups, but that does not seem possible.
Has anyone tried anything like this before, or would have a different approach to achieve what I'm trying to get?
Thanks!
I managed to get this to work, and figured I'd share how to do it.
My dataset contains a field called DATEKEY, which stores the date in the format YYYYMMDD. It's possible to use this in an expression and evaluate the date for a specific row. In case the expression evaluates to true, we display the value. If not, we display a blank string.
In case we want to show the values prior to the date, the expression would be:
=IIF(Fields!DATEKEY.Value <= 20130601, Avg(Fields!My_NUMBER.Value), "")
The second series can then be made by reversing the symbol:
=IIF(Fields!DATEKEY.Value >= 20130601, Avg(Fields!My_NUMBER.Value), "")
The graph then looks like this:

Summing different parts of a column in SQL

I have a database extract in excel and want to create a custom value in Tablue using their create calculation, which I believe is SQL based.
Basically I have a large number of feeds which all show up different amounts in a column. For example:
feed 1
feed 1
feed 2
feed 3
feed 4
feed 4
feed 4
And I want to have a sum for feed 1, feed 2, and feed 4. But in my actual DB there's about 100 feeds all with different number of appearances. I'm having troubles finding a good way to do this. If there even is one. Any help or direction would be appreciated!
I'm assuming that your list is a single column and you need a count of the number of occurrences of each feed. For the sake of example, since a column or table names were not supplied, let's call them colname and tablename.
select colname, count(*) as Ct from tablename group by colname
It would be easier to give an exact answer if you posted a small simplified subset of your spreadsheet. But assuming you have a column called "feed_name" which takes on values like "feed 1", "feed 2" etc depending on the row. Then the feed_name column should be a discrete dimension in Tableau.
Then just put the feed_name pill on a shelf, say the row shelf. And put the "Number of Records" field on another shelf, say the column shelf.
You don't need to write SQL to do this (or most tasks) in Tableau. It helps to understand SQL concepts and its very helpful to drop down to the SQL level when needed to solve tricky issues. But for most situations, you can just interactively explore the data by moving fields around and writing some simple calculations -- and let Tableau take care of generating the SQL necessary to retrieve the data needed to build the visualization you requested.
Tableau supports SQL and some NO-SQL data sources, along with some cubes too. It does that quite well and in multiple ways. You just can work more quickly and efficiently by using Tableau's visual based manipulations in most cases, and then drop to the lower level detail when needed. It just takes getting used to how Tableau operates.