Different results in BigQuery and Tableau

Hello, when I calculate the median and average in BigQuery and in Tableau, I get different results even though I am using the same numbers and rows. Is there something I should know?
For example:
In BigQuery:
select district, avg(sales) from table_name group by district
In Tableau:
Using district as a dimension and selecting the average of sales from the Marks card drop-down menu.
Surprisingly, the output from the two is not the same.
Does anyone know what the problem might be?
Thanks!

Hi matt_black, I just found the solution: in Tableau I had joined the data with polygon data, and that join duplicates rows in Tableau, hence the different results.
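To illustrate the effect described above, here is a minimal sketch using Python's built-in sqlite3 module instead of BigQuery or Tableau. The table and column names (sales, polygons, area, point_id) and all values are made up for the example. It assumes the polygon table has a different number of rows per area, which is what makes the per-district average shift after the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical sales data: one row per area
    CREATE TABLE sales (district TEXT, area TEXT, sales REAL);
    INSERT INTO sales VALUES ('North', 'A', 10), ('North', 'B', 30), ('South', 'C', 50);
    -- hypothetical polygon data: several points per area
    CREATE TABLE polygons (area TEXT, point_id INTEGER);
    INSERT INTO polygons VALUES ('A', 1), ('B', 1), ('B', 2), ('B', 3), ('C', 1), ('C', 2);
""")

# Average on the raw rows, as in the BigQuery query.
plain = dict(conn.execute(
    "SELECT district, AVG(sales) FROM sales GROUP BY district"
).fetchall())

# Average after the join: each sales row is repeated once per polygon point.
joined = dict(conn.execute("""
    SELECT s.district, AVG(s.sales)
    FROM sales s
    JOIN polygons p ON p.area = s.area
    GROUP BY s.district
""").fetchall())

print(plain)
print(joined)
```

Area B is repeated three times after the join, so it weighs more heavily in the North average. Comparing row counts before and after the join is a quick way to spot this kind of duplication.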

Related

Getting rid of date filter on BigQuery?

I'm trying to get rid of this particular date filter in BigQuery and have a date column instead (see the red circle in the screenshot).
This data was extracted from Microsoft (Bing) Ads using Supermetrics. Any idea how to tackle this? Let me know if you need more information. Thank you.
AK
So the string in the red circle is part of the table name, i.e. 2021111.
To view all the data, you will want a query that reads every table through a wildcard, as below, but be careful, as datasets can get BIG fast.
SELECT *
FROM `BINGADS_AGE_GENDER_*`
If you wanted to only combine a certain date range you could use
SELECT *
FROM `BINGADS_AGE_GENDER_*`
WHERE _table_suffix BETWEEN '20211001' and '20211101'
If you use this data source within Data Studio, you can use the @DS_START_DATE and @DS_END_DATE parameters with the date picker.
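Since the question also asks for a date column, here is a small sketch of a Python helper that builds the wildcard query for a date range and adds a column parsed from the table suffix. The helper name wildcard_query and the column alias report_date are hypothetical, and it assumes the suffix is always a full YYYYMMDD date (PARSE_DATE fails otherwise).

```python
from datetime import date


def wildcard_query(table_prefix: str, start: date, end: date) -> str:
    """Build a BigQuery wildcard query limited to a range of table suffixes."""
    return (
        # Turn the YYYYMMDD suffix into a proper DATE column.
        "SELECT PARSE_DATE('%Y%m%d', _TABLE_SUFFIX) AS report_date, *\n"
        f"FROM `{table_prefix}*`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'"
    )


query = wildcard_query("BINGADS_AGE_GENDER_", date(2021, 10, 1), date(2021, 11, 1))
print(query)
```

The printed SQL can be pasted into the BigQuery console. Filtering on the suffix keeps the scan limited to the tables in the range.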

How to populate all possible combination of values in columns, using Spark/normal SQL

I have a scenario, where my original dataset looks like below
Data:
Country,Commodity,Year,Type,Amount
US,Vegetable,2010,Harvested,2.44
US,Vegetable,2010,Yield,15.8
US,Vegetable,2010,Production,6.48
US,Vegetable,2011,Harvested,6
US,Vegetable,2011,Yield,18
US,Vegetable,2011,Production,3
Argentina,Vegetable,2010,Harvested,15.2
Argentina,Vegetable,2010,Yield,40.5
Argentina,Vegetable,2010,Production,2.66
Argentina,Vegetable,2011,Harvested,10
Argentina,Vegetable,2011,Yield,90
Argentina,Vegetable,2011,Production,9
Bhutan,Vegetable,2010,Harvested,7
Bhutan,Vegetable,2010,Yield,35
Bhutan,Vegetable,2010,Production,5
Bhutan,Vegetable,2011,Harvested,2
Bhutan,Vegetable,2011,Yield,6
Bhutan,Vegetable,2011,Production,3
There is also a very small country lookup table listing every country the source data can contain (judging by the output header below: US, Argentina, Bhutan, India, Nepal, Bangladesh).
I want the number of output columns to be always fixed (so that the reporting/visualization tool doesn't get a varying number of columns with each day's ingestion, depending on how many distinct countries are present).
So I have to somehow join the source data with the country_lookup csv and populate all those columns with a default value of F. Every country column is binary, with T or F as the possible values.
The original dataset above has to be converted into the following:
Data (for rows whose Type is Derived Yield, I've left the Amount field as an unevaluated expression rather than a computed number, so that you can match it against the formula):
Country,Commodity,Year,Type,Amount,US,Argentina,Bhutan,India,Nepal,Bangladesh
US,Vegetable,2010,Harvested,2.44,T,F,F,F,F,F
US,Vegetable,2010,Yield,15.8,T,F,F,F,F,F
US,Vegetable,2010,Production,6.48,T,F,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
US,Vegetable,2011,Harvested,6,T,F,F,F,F,F
US,Vegetable,2011,Yield,18,T,F,F,F,F,F
US,Vegetable,2011,Production,3,T,F,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
US,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Argentina,Vegetable,2010,Harvested,15.2,F,T,F,F,F,F
Argentina,Vegetable,2010,Yield,40.5,F,T,F,F,F,F
Argentina,Vegetable,2010,Production,2.66,F,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Argentina,Vegetable,2011,Harvested,10,F,T,F,F,F,F
Argentina,Vegetable,2011,Yield,90,F,T,F,F,F,F
Argentina,Vegetable,2011,Production,9,F,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Bhutan,Vegetable,2010,Harvested,7,F,F,T,F,F,F
Bhutan,Vegetable,2010,Yield,35,F,F,T,F,F,F
Bhutan,Vegetable,2010,Production,5,F,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Bhutan,Vegetable,2011,Harvested,2,F,F,T,F,F,F
Bhutan,Vegetable,2011,Yield,6,F,F,T,F,F,F
Bhutan,Vegetable,2011,Production,3,F,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Formula for populating the Amount field for the Derived Yield type:
Derived Amount = (sum of Harvested over all countries marked T (True), grouped by Year and Commodity) divided by (sum of Production over all countries marked T (True), grouped by Year and Commodity).
So the target is to build every combination of the countries in the source, sum the respective Harvested and Production values, and divide the two sums. In the actual scenario a country can have more than one commodity, but that should not matter because the amounts are summed per commodity and year.
Note: Users in the frontend can select any combination of countries. The sole reason for doing this in the backend rather than dynamically in the frontend is that AWS QuickSight (our visualisation tool) can compute sums over the selected column filters but doesn't yet support calculations on those derived sums. Hence every combination of countries has to be pre-computed (a very naive approach) to make it available in the report for whatever countries the user selects.
Also, if you have a better approach than the naive one described in the note, you are most welcome to guide me. I've also posted a question on the same problem without describing my intended approach, so that experts can suggest a better way to solve this kind of problem. If you want to help solve it with some other technique, here is the link to that question.
Any help will be greatly appreciated.
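For reference, the naive pre-computation described above can be sketched in plain Python with itertools.combinations. This is not Spark code, only an illustration of the combination logic. The function name derived_yields is made up, and only the 2010 Harvested and Production figures from the sample are used. It emits one row per country combination of size two or more, not one row per country within each combination as in the expected output.

```python
from itertools import combinations

# 2010 Harvested/Production rows taken from the sample data in the question.
rows = [
    ("US", "Vegetable", 2010, "Harvested", 2.44),
    ("US", "Vegetable", 2010, "Production", 6.48),
    ("Argentina", "Vegetable", 2010, "Harvested", 15.2),
    ("Argentina", "Vegetable", 2010, "Production", 2.66),
    ("Bhutan", "Vegetable", 2010, "Harvested", 7),
    ("Bhutan", "Vegetable", 2010, "Production", 5),
]


def derived_yields(rows):
    """Return (commodity, year, countries, harvested_sum / production_sum) rows."""
    # (commodity, year) -> country -> {type: summed amount}
    grouped = {}
    for country, commodity, year, typ, amount in rows:
        per_country = grouped.setdefault((commodity, year), {}).setdefault(country, {})
        per_country[typ] = per_country.get(typ, 0) + amount

    out = []
    for (commodity, year), by_country in grouped.items():
        countries = sorted(by_country)
        for size in range(2, len(countries) + 1):
            for combo in combinations(countries, size):
                harvested = sum(by_country[c].get("Harvested", 0) for c in combo)
                production = sum(by_country[c].get("Production", 0) for c in combo)
                if production:  # skip combinations with no production to divide by
                    out.append((commodity, year, combo, harvested / production))
    return out


results = derived_yields(rows)
for row in results:
    print(row)
```

The number of combinations grows as 2^n: the six countries in the lookup already give 57 combinations of two or more per commodity and year, so this approach only stays practical while the country list is small.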

How to create a sales query in SAP Business One

I would like to create a sales query for the past 12 months that can be rerun once a month. I would like to do the whole thing with the Query Generator, but I cannot find a table that holds the sales. I am familiar with the sales analysis, but I would like to simplify the whole thing. Is there a corresponding table at all, or is it possible to create one?
If I understand correctly,
are you looking for ORDR? This is the sales order header table.
There is also RDR1, which contains the row data.
If you go to View and turn on 'System Information', table/field details will be displayed in the bottom left.
regards,

SSRS: Repeat Down Total from Matrix in all rows

I need some help with SSRS.
I'm new to this, so apologies in advance if my answers are not entirely up to scratch.
I am trying to get the total of a column and repeat that total (Total Checks, see image) across all the rows.
I have tried multiple different ways, each time without success.
Any help would be much appreciated.
Thanks,
Depending on where you are getting your data from, you have different options. For example, within SQL you could use windowed functions:
select Date
,Checker
,sum(Checker) over () as TotalChecks
,sum(Checker) over (partition by Date) as TotalChecksByDate
from table
or within SSRS expressions by explicitly stating your scope, either for the dataset as a whole:
=sum(Fields!Checker.Value, "YourDataset")
or just for the grouping within your table:
=sum(Fields!Checker.Value, "YourTableGroupName")
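To see what the windowed version returns, here is a small sketch run against an in-memory SQLite database (window functions need SQLite 3.25 or newer). The table name checks, the column names, and the values are invented for the example. SQL Server syntax for the OVER clauses is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical data: number of checks per row
    CREATE TABLE checks (check_date TEXT, checker INTEGER);
    INSERT INTO checks VALUES ('2020-01-01', 3), ('2020-01-01', 2), ('2020-01-02', 5);
""")

rows = conn.execute("""
    SELECT check_date,
           checker,
           SUM(checker) OVER () AS total_checks,
           SUM(checker) OVER (PARTITION BY check_date) AS total_checks_by_date
    FROM checks
    ORDER BY check_date, checker
""").fetchall()

for row in rows:
    print(row)
```

Every row carries the grand total (10) in total_checks, so the report can show it without any aggregate expression of its own.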

SQL Server How to Build Query?

I'm kind of new to this forum, but I'm stuck on a problem and I need your help.
I have one table with several rows, where each row represents one project. In another table I have the many tasks that need to be done in each project; each task has a percentage showing how far along it is. The result of these two tables should be one table with the process ID and the percentage accomplished, computed as the average of the latest entry of every task.
I can't figure out the query needed to get the result I want.
Can anyone help me? You can follow the link below to see the tables and the result that I want.
Table images
I didn't understand the colors of the rows you used, but from your description, I think this is the query you are looking for:
select P.id_Proceso, P.SubProceso, avg(R.estado)
from Processos P
join Registros R
on P.ID = R.Id_processo
group by P.id_Proceso, P.SubProceso
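Note that the query above averages every entry, while the question asks for the average of the latest entry of each task. A possible variation is sketched below with Python's sqlite3 module. The table names come from the answer, but the columns task_id and entry_seq are assumptions (the real schema is only visible in the linked images), and the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Processos (ID INTEGER PRIMARY KEY, SubProceso TEXT);
    -- task_id and entry_seq are hypothetical columns
    CREATE TABLE Registros (Id_processo INTEGER, task_id INTEGER, entry_seq INTEGER, estado REAL);
    INSERT INTO Processos VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO Registros VALUES
        (1, 10, 1, 20), (1, 10, 2, 60),   -- task 10: latest entry is 60
        (1, 11, 1, 100),                  -- task 11: latest entry is 100
        (2, 20, 1, 0),  (2, 20, 2, 50);   -- task 20: latest entry is 50
""")

result = conn.execute("""
    SELECT P.ID, P.SubProceso, AVG(R.estado) AS avg_latest
    FROM Processos P
    JOIN Registros R ON R.Id_processo = P.ID
    -- keep only the most recent entry of each task
    WHERE R.entry_seq = (
        SELECT MAX(R2.entry_seq)
        FROM Registros R2
        WHERE R2.Id_processo = R.Id_processo AND R2.task_id = R.task_id
    )
    GROUP BY P.ID, P.SubProceso
    ORDER BY P.ID
""").fetchall()

print(result)
```

The correlated subquery also works unchanged in SQL Server. If the table has a timestamp column instead of a sequence number, the same pattern applies with MAX on that column.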