How to bypass default parameter to include a range or better SQL? - google-bigquery

EDITED (AGAIN): added tables and two screenshots (one of a Google Sheets chart and another showing multiple issues in DS) to help demonstrate what I am seeing.
Short Version: I have created a parameter to help me score trending topics based on the date range filter. However, I want to be able to show a range of dates' worth of data, not just a specific date's worth of data. In theory, I could make the parameter a checklist with a huge range, but that doesn't seem efficient or sustainable down the road.
Disclaimer: I am about a week into SQL and Data Studio.
Long Version: We are tracking trends over time from a specific customer data set. I'd like to make it so that when a user adjusts the time range, each topic's "score" depends on the end date. For instance, every time the topic "Recession" is brought up, it is given a score. That score is weighted based on when it was said. I was using 365 as the highest possible score so that anything over a year is null. So if "Recession" is referenced twice, once a week ago and once today, the avg score for recession is 361.5, but if a reference is made to the topic "Talent Management" twice today, then it would have a score of 365, and so forth across a growing list of 50+ topics pertaining to the 50+ specific communities we are tracking the topics across.
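To make the arithmetic above concrete, here is a minimal Python sketch of the scoring rule as I read it: a reference scores 365 on the point-in-time date and loses one point per day of age. The function names, the example dates, and the choice to treat future-dated references as null are my own assumptions, not part of the original setup.

```python
from datetime import date

MAX_SCORE = 365  # highest possible score, per the description above


def time_score(entry_date, point_in_time):
    """Score one reference: 365 on the point-in-time date, minus 1 per day of age.

    Returns None when the reference is more than MAX_SCORE days old
    ("anything over a year is null"). Treating future-dated references
    as None is an assumption made for this sketch.
    """
    age_days = (point_in_time - entry_date).days
    if age_days < 0 or age_days > MAX_SCORE:
        return None
    return MAX_SCORE - age_days


def average_score(entry_dates, point_in_time):
    """Average the non-null scores for one topic; None if nothing qualifies."""
    scores = [time_score(d, point_in_time) for d in entry_dates]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


today = date(2022, 11, 24)               # hypothetical point in time
recession = [date(2022, 11, 17), today]  # once a week ago, once today
talent = [today, today]                  # twice today

print(average_score(recession, today))  # 361.5
print(average_score(talent, today))     # 365.0
```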
Here is an example:
topics       groups   entry_date
recession    A        2022-11-24
talent mgt   A        2022-11-24
recession    B        2022-11-22
economy      A        2022-11-22
recession    C        2022-11-15
talent mgt   B        2022-11-08
This score would then affect the bubble size on a chart where the Y-axis is the count of unique groups referencing the topics, and an x-axis based on the range of average scores.
The goal is to be able to see which topics are the most common across groups, which ones are emerging trends, and which ones are dated trends by having a range slider. That way users (colleagues in other departments) can play with the date range and "see" the bubbles moving in location and size.
[Screenshot: example of a static chart in Google Sheets]
I could then also use the same data and fields to measure the percentage of topics being discussed across groups based on the weighted averages against a time range.
In Google Sheets I can do this with an XLOOKUP against a tab called 'scales' that has a column of 0-365 and, next to it, a column of 365-0. A cell on another sheet holds any date as the point in time, and changing it affects all the scores, tables, charts, etc. I used: =xlookup((point_in_time - entry_date), 'scales'!A:A, 'scales'!B:B, "0")
In Data Studio's custom SQL I used:
SELECT
  *
FROM
  `qRaw_data`
WHERE
  DATE(_entry_dates_) BETWEEN
    PARSE_DATE('%Y%m%d', @DS_START_DATE) AND
    PARSE_DATE('%Y%m%d', @DS_END_DATE)
  AND
  @pit_date_diff = DATE_DIFF(
    PARSE_DATE('%Y%m%d', @DS_END_DATE),
    _entry_dates_,
    DAY
  )
Then I created a field that is time_score of:
avg((Pit_Date_Diff-365)*(-1))
I have been googling and youtubing like crazy and think I either have to come up with a way to override the @pit_date_diff default value OR I need to use a CASE WHEN in the custom query (when the date_diff is 1 then 365, and so on), but when I try that I get all sorts of errors.
I would like the chart below to include all topics averaged over all entry dates, not just those that match the value entered in the parameter field.
[Screenshot: currently, I can only show specific entry dates due to the parameter]
I appreciate any and all help. I am a week into using data studio and am going cross-eyed Googling and YouTubing things. There is likely a better logical path to accomplish all this. Hoping for a holiday miracle.
Thanks in advance.

It turns out this was much easier than I realized... I used AS to add a computed column to the query and then created a field that reproduces the same metric I had in Google Sheets:
SELECT
  *,
  DATE_DIFF(PARSE_DATE('%Y%m%d', @DS_END_DATE), _entry_dates_, DAY) AS q_time_diff
FROM
  `qRaw_data`
Then the score field is: (avg(q_time_diff)-365)*(-1)
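For anyone who wants to sanity-check the logic outside Data Studio, here is a rough sketch in Python using the standard-library sqlite3 module. It is not the BigQuery query itself: SQLite's julianday() stands in for DATE_DIFF, the rows are the sample table from the question, and the column name group_name, the end_date value, and the per-topic grouping are assumptions made for illustration.

```python
import sqlite3

# Hypothetical rows standing in for qRaw_data (sample table from the question).
rows = [
    ("recession", "A", "2022-11-24"),
    ("talent mgt", "A", "2022-11-24"),
    ("recession", "B", "2022-11-22"),
    ("economy", "A", "2022-11-22"),
    ("recession", "C", "2022-11-15"),
    ("talent mgt", "B", "2022-11-08"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE qRaw_data (topics TEXT, group_name TEXT, entry_date TEXT)")
conn.executemany("INSERT INTO qRaw_data VALUES (?, ?, ?)", rows)

end_date = "2022-11-24"  # plays the role of the report's end-date parameter

# The CTE mirrors the custom query: every row keeps its own day difference.
# The outer SELECT mirrors the score field: (AVG(q_time_diff) - 365) * (-1).
query = """
WITH scored AS (
    SELECT *,
           CAST(julianday(:end_date) - julianday(entry_date) AS INTEGER) AS q_time_diff
    FROM qRaw_data
)
SELECT topics,
       COUNT(DISTINCT group_name) AS group_count,
       (AVG(q_time_diff) - 365) * (-1) AS time_score
FROM scored
GROUP BY topics
ORDER BY topics
"""
scores = {t: (n, s) for t, n, s in conn.execute(query, {"end_date": end_date})}
for topic, (group_count, time_score) in scores.items():
    print(topic, group_count, time_score)
```

Because the day difference is a column rather than a filter, every row contributes to the average, which is the behavior the original WHERE clause was preventing.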
In case that helps any others in the future... ¯\(ツ)/¯
Happy Holidays!

Related

Power BI: Measure for Date difference depending on other columns

I hope everybody is doing fine! :)
I have a table like the one in my "Example" picture. Let's say it is data about certain products and a certain assembly status (i.e. column "Status"). In "Status Date" I can see the date on which the product has been in the specific status. I only added dates for ID 1 to make the table easier.
Table
What I am looking for is a measure in Power BI to calculate the difference (in days or months, it doesn't matter) between the dates. I don't want to use the number in the Status (e.g. 1 for Stat 1) to identify the order of the dates. To make it even harder, I may want to filter out Stat 2 for some reason. In that case I want the measure to automatically adapt and calculate the difference between Stat 3 and Stat 1.
I have the feeling that this is possible in a single formula using a measure, which would be the optimal solution from my point of view.
I hope there's someone who can help me!
Thanks in advance.
Daniel

SQL Method for Cascading Workload Based on Rank and Available Hours

Recently I created an automated production scheduling tool through Excel that assigns a rank to items being produced in the same process, and then uses that rank in combination with the workload to create a schedule.
It functions exactly the way it is intended to, but because of the large amount of data and the fact that it runs in Excel, it performs very slowly, which is why I am looking to move the calculations over to SQL.
The general logic is like this:
-Always produce everything from the first day before the second day
-Always produce items from an earlier rank before items from a later rank
You can see how this plays out in the image below, where the line has 21.5 hours today, so items will be produced on day 1 until it equals 21.5, where the remainder is then carried over to day 2 and so on.
I was able to do this in Excel using lengthy position-based formulas, but I am trying to think of a way to get the same result in SQL without having to rely on looking at the row above.
I am not sure how to convey something like 'Subtract from the available time production time of higher priority items produced on the same day'.
I apologize if the question is unclear, but any advice would be appreciated.
Image of Production Hours Cascading by Priority and Day
Example of Position-Based Formula
Thanks to shawnt00, who put me in the right direction. Ultimately I had to modify the case statements a bit to go off of the cumulative total instead, but I was able to get the desired results using a SUM() OVER (PARTITION BY ... ORDER BY ...) window expression.
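The cumulative-total idea can be sketched in Python as follows. This is not the poster's actual query, just an illustration of the same logic: the running total of hours before an item says where it starts, and comparing that with each day's capacity window says how many hours land on each day. The function name, the item list, and the assumption that every day has the same capacity are all hypothetical.

```python
from itertools import accumulate


def cascade(items, hours_per_day):
    """Spread ranked work across days using running totals.

    items: list of (name, hours), already sorted by planned day, then rank.
    hours_per_day: capacity of the line, assumed identical for every day.
    Returns {name: {day_number: hours}} with day numbers starting at 1.
    """
    # Running total of hours through each item (the SUM() OVER ... equivalent).
    ends = list(accumulate(hours for _, hours in items))
    schedule = {}
    for (name, hours), end in zip(items, ends):
        start = end - hours  # hours consumed by all higher-priority items
        per_day = {}
        day = int(start // hours_per_day)  # zero-based day where the item begins
        while day * hours_per_day < end:
            day_start = day * hours_per_day
            day_end = day_start + hours_per_day
            # Overlap between the item's span and this day's capacity window.
            worked = min(end, day_end) - max(start, day_start)
            if worked > 0:
                per_day[day + 1] = worked
            day += 1
        schedule[name] = per_day
    return schedule


# Hypothetical workload; 21.5 hours per day is the figure from the question.
plan = cascade([("A", 10.0), ("B", 8.0), ("C", 6.0), ("D", 20.0)], 21.5)
for name, per_day in plan.items():
    print(name, per_day)
```

Item C straddles the boundary: 3.5 hours fit on day 1 and the remaining 2.5 hours carry over to day 2, without ever looking at "the row above".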

Excel - VBA - Access: Date Selection Solution

I'm looking for a direction, assuming that surely someone has had to do something similar and I'm making this more difficult than it is.
We have an Access DB that feeds a pivot table in Excel, which is in turn used to supply charts for a "user dashboard." This is 2010, so no slicers.
My problem is that the DB is updated by adding months to a field. There is a listbox in the dashboard that will allow the user to select a specific month and see stats for that time. I'm having a couple of problems even getting started and would like to make sure I'm going about this the simplest/most efficient way.
My thought was to populate the listbox with the 'month' fields from the pivot table. I'm not quite sure how I'm going to do that with VBA (I have a couple ideas), but if that's the best route then I'll figure it out.
But, has anyone had a similar need, and found a better solution? I have a bunch of buttons to handle other fields, but I would really like to allow for the user to select a date/month/range...whatever. Surely this is a common, easily managed desire, no?
I'd put this in with the conversation you're having with a couple of people above, but I don't have enough rep to do that yet.
I had a similar dashboard issue years ago. Resolved it by adding a dropdown beside the month box (which was a dropdown in my case, not a listbox) with the options "Year to date" and "Month to date". By definition selecting a past month and MTD gave you the whole month, whereas selecting the current month can only ever give you MTD. Same thing with YTD - it would give you the combined stats for the current year to date instead of just one month.
The month dropdown in my dashboard was populated based on the current data in the pivot, which in turn was controlled from the database. We used a 25-month rolling select for the data and showed only the last 13 months in the month dropdown. That gave us a full 12-month spread of historical data to work from if someone chose the oldest month we offered them, yet kept the size of the pivot cache manageable.
I used a dropdown for the options instead of option buttons or a checkbox, because I had a suspicion that delivering what was asked for would lead to additional requests. I was right. Eventually we had options for "Last year to date" (how we were tracking this day last year), "Quarter to date", "Financial year to date", and so on. Adding extra choices to the dropdown box was easier than rearranging the dashboard to accommodate the proliferating requirements.
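The month-plus-option logic described above might look roughly like this in Python (the dashboard itself was Excel/VBA, so this only illustrates the date arithmetic). The function name, the "MTD"/"YTD" strings, and the reading of YTD as "1 January of the selected year up to the end of the selected month" are my assumptions.

```python
import calendar
from datetime import date


def period_range(year, month, mode, today):
    """Return (start, end) for a selected month and an 'MTD' or 'YTD' option."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # A past month yields the whole month; the current month stops at today.
    end = min(last_day, today)
    if mode == "MTD":
        start = date(year, month, 1)
    elif mode == "YTD":
        start = date(year, 1, 1)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return start, end


today = date(2023, 5, 10)  # hypothetical "current" date
print(period_range(2023, 2, "MTD", today))  # whole of February
print(period_range(2023, 5, "MTD", today))  # 1 May to today
print(period_range(2023, 5, "YTD", today))  # 1 January to today
```

Adding a further option such as "Quarter to date" would then be one more branch rather than a layout change, which matches the reasoning for using a dropdown.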

DAX sum different DateTime

I have a problem here: I would like to sum my employees' work time based on the daily data (time2 - time1). Here is my measure:
Effective Minute Work Time = 24. * 60 * (LASTNONBLANK(time2,0) -FIRSTNONBLANK(time1,0))
It works at the daily level, but if I drill up to weekly / monthly data it shows the wrong sum, as shown below:
What I want is the sum of the minutes between the two times (time2 - time1) for each day.
Thanks for your help :)
You have several approaches you can take: the hard way or the easier way :). The harder (at least for me :)) is to use DAX to do this. You would:
1) create a date table,
2) Use the DAX calculate function to evaluate your last non-blank and first non-blank values (you might need to use calculate table, but I'm not sure; DAX experts jump in). Then subtract one vs. the other.
This will give you correct values for a given day for a given person. You can enforce the latter condition by putting a 'has one value' guard on the person name so that your measure informs the report author if they're not using it right.
Doing the same for dates is a little trickier. In the example you show you are including the date in the row grouping. But if you change your mind and want instead to have 'total hours worked by person' or 'total hours worked by everyone' you're not done with modelling yet.
Your next step is to use calculate table in combination with calculate to create a measure that returns the total. You'll use calculate table so you evaluate each date and the hours worked on that date by person. Then you'll use calculate to summarize that all down to a single number. If you're not careful with your DAX (or report authoring) you might mix which person you're summarizing for so that your first/last non blank are not at the person level. It gets intense quickly.
Your easier solution, though it might be more limited in its application (it depends on your scenario), is to use the query to transform the data into a summary by day and person using the group by command. This will give you a row per person per day with their start and end times. Then you can quickly calculate the hours worked on that day and quite easily build visuals on top of the summary data. Of course you give up some of the flexibility of having a proper data model. However, if you have a date table, a person table, and your summary table, and then set up your relationships correctly, you can answer the most common questions.
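As a rough illustration of the "summarize first, then sum" approach, here is a Python sketch (in Power BI this would be done in the query editor, not in Python). The records, names, and column layout are hypothetical; the point is that once there is one row per person per day, weekly or monthly totals are plain sums of the daily minutes.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical punch records: (person, time1, time2); several rows per day are allowed.
records = [
    ("Ann", datetime(2023, 1, 2, 8, 0), datetime(2023, 1, 2, 12, 0)),
    ("Ann", datetime(2023, 1, 2, 13, 0), datetime(2023, 1, 2, 17, 0)),
    ("Ann", datetime(2023, 1, 3, 9, 0), datetime(2023, 1, 3, 17, 30)),
    ("Bob", datetime(2023, 1, 2, 7, 30), datetime(2023, 1, 2, 16, 0)),
]

# Step 1: group by person and day, keeping the earliest time1 and the latest time2.
daily = {}
for person, t1, t2 in records:
    key = (person, t1.date())
    first, last = daily.get(key, (t1, t2))
    daily[key] = (min(first, t1), max(last, t2))

# Step 2: minutes worked per person per day (last time2 minus first time1).
daily_minutes = {
    key: (last - first).total_seconds() / 60 for key, (first, last) in daily.items()
}

# Step 3: any higher-level total is a simple sum of the daily values.
total_by_person = defaultdict(float)
for (person, _day), minutes in daily_minutes.items():
    total_by_person[person] += minutes

print(dict(total_by_person))
```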

Qlikview line chart with multiple expressions over time period dimension

I am new to Qlikview and after several failed attempts I have to ask for some guidance regarding charts in Qlikview. I want to create Line chart which will have:
One dimension – time period of one month broken down by day
One expression – Number of created tasks per day
Second expression – Number of closed tasks per day
Third expression – Number of open tasks per day
This is a very basic example, but I couldn't find a solution for it, and to be honest I don't think I understand how I should set up my time period dimension and expressions. Every time I try to introduce more than one expression, things go south. Maybe it's because I have multiple dates or my dimension is wrong.
Here is my simple data:
http://pastebin.com/Lv0CFQPm
I have been reading about helper tables like Master Calendar or “Date Island” but I couldn’t grasp it. I have tried to follow the guide from here: https://community.qlik.com/docs/DOC-8642 but that only worked for one date (for me at least).
How should I set up the dimension and expressions on my chart, so that I can count the ID field when Created Date matches a date from the dimension and Status is appropriate?
I have the Personal Edition, so I am unable to open qvw files from other authors.
Thank you in advance, kind regards!
My solution to this would be to change from a single line per Call with associated dates to a concatenated list of Call Events with a single date each. i.e. each Call will have a creation event and a resolution event. This is how I achieve that. (I turned your data into a spreadsheet but the concept is the same for any data source.)
Calls:
// One 'New' event per call, dated on its creation date
LOAD Type,
     Id,
     Priority,
     'New' as Status,
     date(floor(Created)) as [Date],
     time(Created) as [Time]
FROM
[Calls.xlsx]
(ooxml, embedded labels, table is Sheet1) where Created>0;

// One event per resolved call; identical field names let QlikView auto-concatenate
LOAD Type,
     Id,
     Priority,
     Status,
     date(floor(Resolved)) as [Date],
     time(Resolved) as [Time]
FROM
[Calls.xlsx]
(ooxml, embedded labels, table is Sheet1) where Resolved>0;
There are three key concepts here. The first is allowing QlikView's auto-concatenate to do its job by making the field names of both load statements exactly the same, including capitalisation. The second is splitting the timestamp into a Date and a Time. This allows you to have a dimension of Date only and group the events for the day. (In big data sets the resource saving is also significant.) The third is creating the dummy 'New' status for each call on its creation date.
With just this data and these expressions
Created = count(if(Status='New',Id))
Resolved = count(if(Status='Resolved',Id))
and then
Created-Resolved
all with full accumulation ticked for Open (to give you a running total rather than a daily total which might go negative and look odd) you could draw this graph.
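To show what the event list and the accumulated Created-Resolved expression compute, here is a small Python sketch of the same idea (not QlikView script). The call data is made up; the status labels follow the expressions above.

```python
from collections import Counter
from datetime import date
from itertools import accumulate

# Hypothetical calls: (id, created, resolved); resolved is None while the call is open.
calls = [
    (1, date(2023, 3, 1), date(2023, 3, 2)),
    (2, date(2023, 3, 1), None),
    (3, date(2023, 3, 2), date(2023, 3, 4)),
    (4, date(2023, 3, 3), None),
]

# One event per date: a 'New' event on creation, a 'Resolved' event on resolution.
events = [(created, "New", cid) for cid, created, _ in calls]
events += [(resolved, "Resolved", cid) for cid, _, resolved in calls if resolved]

created_per_day = Counter(d for d, status, _ in events if status == "New")
resolved_per_day = Counter(d for d, status, _ in events if status == "Resolved")

days = sorted(set(created_per_day) | set(resolved_per_day))
# Daily Created - Resolved can go negative; accumulating it gives the open count.
daily_net = [created_per_day[d] - resolved_per_day[d] for d in days]
open_running = list(accumulate(daily_net))

for d, net, still_open in zip(days, daily_net, open_running):
    print(d, created_per_day[d], resolved_per_day[d], net, still_open)
```

On the last day the daily net is -1 while the running total is 2, which is why the accumulated version is the one worth plotting as Open.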
For extra completeness you could add this to the script to fill in the missing dates and create the Master Calendar you spoke of. There are many other ways of achieving this:
MINMAX:
// Earliest and latest event dates as integers
LOAD floor(num(min([Date]))) as MINTRANS,
     floor(num(max([Date]))) as MAXTRANS
Resident Calls;

let zDateMin=FieldValue('MINTRANS',1);
let zDateMax=FieldValue('MAXTRANS',1);

//complete calendar
Dates:
LOAD
     Date($(zDateMin) + IterNo() - 1, '$(DateFormat)') as [Date]
AUTOGENERATE 1
WHILE $(zDateMin)+IterNo()-1<= $(zDateMax);
Then you could draw this chart. Don't forget to turn off Suppress Zero Values on the Presentation tab.
But my suggestion would be to use a combo chart rather than a line chart, so that the calls per day are shown as discrete buckets (bars) while the running total of open calls is a line.
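For readers who don't use QlikView, the calendar fill above boils down to generating every date between the earliest and latest event and defaulting the gaps to zero. A minimal Python sketch, with a hypothetical function name and made-up counts:

```python
from datetime import date, timedelta


def full_calendar(dates):
    """Return every date from min(dates) to max(dates), inclusive."""
    start, end = min(dates), max(dates)
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


# Hypothetical daily counts with a gap on 2023-03-03.
counts = {date(2023, 3, 1): 2, date(2023, 3, 2): 1, date(2023, 3, 4): 1}
filled = {d: counts.get(d, 0) for d in full_calendar(counts)}

for d, n in filled.items():
    print(d, n)
```

Days without events now appear with a zero, which is why zero-value suppression has to be off for them to show on the chart.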