Postgres crosstab mixes up categories in result set - sql

I have a query that returns three columns: 'Date', 'District' and 'Total'. There are four districts, but sometimes a district has no record for a given date. I want the districts as columns and their totals in the rows, like this:
Date, District1, District2, District3, District4
Each row would hold a date along with the total value per district.
I am using crosstab to get the desired result, but in the output the total for one district is listed under another district, although the date-wise grand total is the same.
I don't know why the crosstab query mixes up the categories. Please help.
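One likely cause, assuming the one-argument form crosstab(source_sql) is in use: that form fills the output columns from left to right with whatever values exist for each date, so when a district has no row, the later districts shift into the wrong column. The two-argument form crosstab(source_sql, category_sql) assigns each value to a fixed category instead. The same idea can be sketched with plain conditional aggregation, which names each district explicitly. The example below is only an illustration: it uses Python's built-in sqlite3 module, a hypothetical totals table and made-up sample rows, not the original query.

```python
import sqlite3

# Hypothetical sample data: District3 is missing on the first date,
# District2 and District4 are missing on the second.
rows = [
    ("2024-01-01", "District1", 10),
    ("2024-01-01", "District2", 20),
    ("2024-01-01", "District4", 40),
    ("2024-01-02", "District1", 5),
    ("2024-01-02", "District3", 15),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (date TEXT, district TEXT, total INTEGER)")
conn.executemany("INSERT INTO totals VALUES (?, ?, ?)", rows)

# Each output column is tied to one district by name, so a missing
# district yields NULL in its own column instead of shifting the others.
pivot_sql = """
SELECT date,
       SUM(CASE WHEN district = 'District1' THEN total END) AS district1,
       SUM(CASE WHEN district = 'District2' THEN total END) AS district2,
       SUM(CASE WHEN district = 'District3' THEN total END) AS district3,
       SUM(CASE WHEN district = 'District4' THEN total END) AS district4
FROM totals
GROUP BY date
ORDER BY date
"""
pivot = conn.execute(pivot_sql).fetchall()
for row in pivot:
    print(row)
```

Missing districts come back as None (NULL) in their own column, which is the behaviour the question asks for.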

Related

Calculating the percentage whilst grouping SQL

With reference to the table below, I'm supposed to group by campaign_name and media_plan_id. How can I query the table to include the percentage of creative_names associated with a media_plan_id that pass a threshold of 8 (banner_np_count)?
I.e., some media_plan_ids have two creative names associated with them, and I need the percentage of all creative_names that meet the condition banner_np_count >= 8.
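A minimal sketch of one way to compute such a percentage, assuming one row per creative_name and a hypothetical table called creatives (the original table is not shown here, so the sample rows are invented). It runs against an in-memory SQLite database through Python's sqlite3 module; the GROUP BY query itself is ordinary SQL.

```python
import sqlite3

# Invented sample rows: (campaign_name, media_plan_id, creative_name, banner_np_count)
rows = [
    ("campA", 1, "creative_1", 9),
    ("campA", 1, "creative_2", 3),
    ("campA", 2, "creative_3", 8),
    ("campB", 3, "creative_4", 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE creatives ("
    "campaign_name TEXT, media_plan_id INTEGER, "
    "creative_name TEXT, banner_np_count INTEGER)"
)
conn.executemany("INSERT INTO creatives VALUES (?, ?, ?, ?)", rows)

# Count the rows that meet the threshold, divide by all rows in the group.
# Multiplying by 100.0 forces floating-point division.
pct_sql = """
SELECT campaign_name,
       media_plan_id,
       100.0 * SUM(CASE WHEN banner_np_count >= 8 THEN 1 ELSE 0 END) / COUNT(*)
           AS pct_passing
FROM creatives
GROUP BY campaign_name, media_plan_id
ORDER BY campaign_name, media_plan_id
"""
percentages = conn.execute(pct_sql).fetchall()
for row in percentages:
    print(row)
```

If a creative_name can appear on several rows within the same group, the counts would need COUNT(DISTINCT creative_name) or a de-duplicating subquery instead.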

Google Sheets Query Function. How can I get only Unique or Distinct Rows?

I am trying to answer a question on a case using the Query function on Google Sheets and am stuck on a particular problem.
I need to get the total number of unique orders per year. I used the formula below and managed to get the total orders per year.
=QUERY(raw_data!$A$1:$U$9995, "select YEAR(C), COUNT(B) group by YEAR(C)", 1)
Where column C is the date and B is the order_id.
The problem is that this returns a total of 9994 orders and includes duplicates of the same order. For example, if a customer purchased 3 different products, they would each be given a line in the database and would count as 3 of the 9994 orders. However, they all have the same order_id.
I need to get the number of unique orders per year. I know this number is 5009 since I did some manual research through Excel, but wanted to find that same total, separated by year, using the Query Function since this is a case to test my SQL Knowledge.
Is this possible? Does the Query Function have a way to get the count for unique order_ids? Thank you very much for your help!
See if this helps
=QUERY(UNIQUE(raw_data!$B$1:$C$9995), "select YEAR(Col2), COUNT(Col1) where Col2 is not null group by YEAR(Col2)", 1)
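For comparison, since the case is meant to test SQL knowledge: in a SQL database the same result is usually written with COUNT(DISTINCT ...). The sketch below assumes a hypothetical raw_data table with order_date and order_id columns and a few invented rows, and runs in SQLite via Python's sqlite3 module.

```python
import sqlite3

# Invented rows: order A1 has three product lines, B1 has two.
rows = [
    ("2019-03-01", "A1"),
    ("2019-03-01", "A1"),
    ("2019-03-01", "A1"),
    ("2019-07-10", "A2"),
    ("2020-01-05", "B1"),
    ("2020-01-05", "B1"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_data (order_date TEXT, order_id TEXT)")
conn.executemany("INSERT INTO raw_data VALUES (?, ?)", rows)

# COUNT(DISTINCT order_id) counts each order once per year,
# no matter how many product lines it has.
orders_sql = """
SELECT strftime('%Y', order_date) AS order_year,
       COUNT(DISTINCT order_id)   AS unique_orders
FROM raw_data
GROUP BY order_year
ORDER BY order_year
"""
unique_orders = conn.execute(orders_sql).fetchall()
for row in unique_orders:
    print(row)
```

The UNIQUE(...) wrapper in the Sheets formula plays the same role: it removes duplicate (order_id, date) pairs before the count.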

SUM results is multiplying by number of rows

I'm creating a crystal report from several tables.
One table has fields that I want to have sum totals on, but these sum fields are being distorted by number of rows from another table. There are no fields other than DocEntry that I can link with between the two tables.
Here, total bales is repeating 4 times:
If I sum the field total bales, instead of showing 12 the result is 48:
Please assist.
Insert a Group on DocEntry.
Add a Running Total that sums Bales but evaluates only on change of group.
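The effect behind this, and the reason evaluating once per DocEntry fixes it, can be reproduced outside Crystal Reports. The sketch below is an analogy in SQL (run through Python's sqlite3 module), not Crystal syntax; the docs and doc_lines tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (doc_entry INTEGER, total_bales INTEGER)")
conn.execute("CREATE TABLE doc_lines (doc_entry INTEGER, item TEXT)")

# One document with 12 bales, linked to four rows in the other table.
conn.execute("INSERT INTO docs VALUES (1, 12)")
conn.executemany(
    "INSERT INTO doc_lines VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (1, "d")],
)

# Naive sum over the join: total_bales is repeated on every joined row.
naive_sum = conn.execute(
    "SELECT SUM(d.total_bales) "
    "FROM docs d JOIN doc_lines l ON l.doc_entry = d.doc_entry"
).fetchone()[0]

# Take the value once per doc_entry first, then sum the groups.
grouped_sum = conn.execute(
    "SELECT SUM(total_bales) FROM ("
    "  SELECT d.doc_entry, MAX(d.total_bales) AS total_bales"
    "  FROM docs d JOIN doc_lines l ON l.doc_entry = d.doc_entry"
    "  GROUP BY d.doc_entry"
    ")"
).fetchone()[0]

print(naive_sum, grouped_sum)
```

The inner GROUP BY corresponds roughly to a running total that evaluates only on change of the DocEntry group.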

Trying to create a well count to compare to BOE using the on production date and comparing it to Capital spends and total BOE

I have data that includes the below columns:
Date
Total Capital
Total BOED
On Production Date
UWI
I'm trying to create a well count based on the unique UWI for each On Production Date and graph it against the Total BOED/Total Capital with Date as the x-axis.
I've tried unique count by UWI but it then populates ALL rows of that UWI with the same well count total, so when it is summed the numbers are multiplied by the row count.
The plot should have Date on the x-axis and Total BOED and Well Count on the y-axis.
Add a calculated column to create a row id using the rowid() function. Then, in the calculation you already have, the one that populates all rows of the UWI with the same well count, add the following logic...
if([rowid] = min([rowid]) over [UWI], uniquecount([UWI]) over [Production Date], null)
This will make it so that the count only populates once.
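The logic of that expression can be sketched in plain Python to see what it produces. This is an illustration with invented rows, not Spotfire code: the position in the list stands in for rowid(), and the well count is written only on the first row of each UWI.

```python
from collections import defaultdict

# Invented rows: (date, uwi, on_production_date), one row per well per month.
rows = [
    ("2020-01", "W1", "2020-01"),
    ("2020-02", "W1", "2020-01"),
    ("2020-02", "W2", "2020-02"),
    ("2020-03", "W2", "2020-02"),
]

# Equivalent of uniquecount([UWI]) over [Production Date].
wells_by_prod_date = defaultdict(set)
for _date, uwi, prod_date in rows:
    wells_by_prod_date[prod_date].add(uwi)

# Equivalent of the if(...) wrapper: keep the count on the first row
# of each UWI (lowest row id) and leave every later row empty.
seen = set()
well_count = []
for _date, uwi, prod_date in rows:
    if uwi in seen:
        well_count.append(None)
    else:
        seen.add(uwi)
        well_count.append(len(wells_by_prod_date[prod_date]))

total_wells = sum(c for c in well_count if c is not None)
print(well_count, total_wells)
```

One thing this sketch suggests checking: if several UWIs share the same production date, each of their first rows carries the full count for that date, so the summed value may still be too high in that case.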

SQL GROUPING SETS averages with multiple many-to-many dimensions

I have a table of data with the following:
User,Platform,Dt,Activity_Flag,Total_Purchases
1,iOS,05/05/2016,1,1
1,Android,05/05/2016,1,2
2,iOS,05/05/2016,1,0
2,Android,05/05/2016,1,2
3,iOS,05/05/2016,1,1
3,Android,06/05/2016,1,3
1,iOS,06/05/2016,1,2
4,Android,06/05/2016,1,2
1,Android,06/05/2016,1,0
3,iOS,07/05/2016,1,2
2,iOS,08/05/2016,1,0
I want to do a GROUPING SETS (Platform,Dt,(Platform,Dt),()) aggregation to be able to find for each combination of Platform and Dt the following:
Total Purchases
Total Unique Users
Average Purchases per User per Day
The first two are simple as these can be achieved via a sum(Total_Purchases) and count(distinct user) respectively.
The problem I have is with the last metric. The result set should look like this but I don't know how to get the last column to be calculated correctly:
Platform,Dt,Total_Purchases,Total_Unique_Users,Average_Purchases_Per_User_Per_Day
Android,05/05/2016,4,2,2.0
iOS,05/05/2016,2,3,0.7
Android,06/05/2016,5,3,1.7
iOS,06/05/2016,2,1,2.0
iOS,07/05/2016,2,1,2.0
iOS,08/05/2016,0,1,0.0
,05/05/2016,6,3,2.0
,06/05/2016,7,3,2.3
,07/05/2016,2,1,2.0
,08/05/2016,0,1,0.0
Android,,9,4,1.8
iOS,,6,3,1.2
,,15,4,1.6
For the first ten rows, the average purchases per user per day is a simple division of the first two metrics, because the dimensions in these rows cover a single date only. For the final 3 rows, however, that division does not give the desired result: the average has to be taken for each day in turn and then averaged across days to get the overall per-day amount.
If this isn't clear please let me know and I'll be happy to explain better. This is my first post on this site!
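To make the intended calculation concrete, here is a small Python sketch of the two-step average using the rows from the post: first purchases per distinct user for each day, then the mean of those daily values. It is only a check of the arithmetic, not a GROUPING SETS query; the function name is made up for the illustration.

```python
from collections import defaultdict

# (user, platform, dt, total_purchases) from the post; Activity_Flag is
# always 1 there and is left out.
data = [
    (1, "iOS", "05/05/2016", 1),
    (1, "Android", "05/05/2016", 2),
    (2, "iOS", "05/05/2016", 0),
    (2, "Android", "05/05/2016", 2),
    (3, "iOS", "05/05/2016", 1),
    (3, "Android", "06/05/2016", 3),
    (1, "iOS", "06/05/2016", 2),
    (4, "Android", "06/05/2016", 2),
    (1, "Android", "06/05/2016", 0),
    (3, "iOS", "07/05/2016", 2),
    (2, "iOS", "08/05/2016", 0),
]


def avg_per_user_per_day(rows):
    """Mean over days of (purchases that day / distinct users that day)."""
    purchases = defaultdict(int)
    users = defaultdict(set)
    for user, _platform, dt, n in rows:
        purchases[dt] += n
        users[dt].add(user)
    daily = [purchases[dt] / len(users[dt]) for dt in purchases]
    return sum(daily) / len(daily)


android = round(avg_per_user_per_day([r for r in data if r[1] == "Android"]), 1)
ios = round(avg_per_user_per_day([r for r in data if r[1] == "iOS"]), 1)
overall = round(avg_per_user_per_day(data), 1)
print(android, ios, overall)
```

These match the last three rows of the expected result (1.8, 1.2 and 1.6). In SQL terms this points towards aggregating to the day level in a subquery first and taking AVG over those daily values in the outer GROUPING SETS query, though that is one possible approach rather than a tested solution.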