Commodity Term Structure Data from Bloomberg

I am looking for a way to download daily WTI crude oil (CL) term structure data from Bloomberg via the Excel Add-in.
My goal is to create a table that looks like this: for each futures contract, I want the last price and the days to maturity.
Date     CL1                          CL2                          ...
         PX_LAST   FUT_ACT_DAYS_EXP   PX_LAST   FUT_ACT_DAYS_EXP
1.1.12   ...       ...                ...       ...
I have tried the FUT_ACT_DAYS_EXP field, but it is not available for historical price series. Is there perhaps a different way to retrieve the term structure data with days to maturity from Bloomberg?
Many thanks!

The generic contract CL1 Comdty has a historical field, FUT_CUR_GEN_TICKER, so you can pull back a time series of PX_LAST and FUT_CUR_GEN_TICKER.
Then you can feed these underlying contract tickers into a BDP call for LAST_TRADEABLE_DT and subtract the PX_LAST date. You can hide the intermediate columns if they are not needed.
And the final result, with the intermediate column D hidden:
NB. I'm using array functions here (note the # symbols in the formulae) rather than hard-coding ranges, which makes it more flexible if you want to change the history range. The multiple BDP calls are unnecessary, but the Bloomberg add-in may be caching them in any case. If performance is an issue, you can use the UNIQUE() function to get a list of the underlying contract names into a lookup table.
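To make that concrete, here is a minimal sketch of the formulae (the cell layout and date range are my assumptions, and it assumes the add-in's dynamic-array support):

In A2: =BDH("CL1 Comdty", "PX_LAST,FUT_CUR_GEN_TICKER", "20120101", "20121231")
In D2: =BDP(C2# & " Comdty", "LAST_TRADEABLE_DT")
In E2: =D2# - A2#

The BDH spills Date / PX_LAST / FUT_CUR_GEN_TICKER into columns A:C, the BDP in D resolves each underlying contract's expiry, and E is the days to maturity (expiry minus observation date). Column D is the intermediate one you can hide.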

It's been a few years since I looked at this, but you are using generic tickers, whereas FUT_ACT_DAYS_EXP would most likely expect an actual contract, e.g. CLU1. There is a field that you can use to convert generic to actual as a time series, i.e. with BDH, but you would have to check that on FLDS. Once you have that, you can use BDH to pull in price and days to expiry. Bear in mind that with one BDH per date this would be a very inefficient approach with regard to data limits.
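For a single actual contract, the call might look like this (a sketch only; the date range is made up and you would need to confirm on FLDS that the field is history-enabled):

=BDH("CLU1 Comdty", "PX_LAST,FUT_ACT_DAYS_EXP", "20110101", "20110822")

i.e. one BDH per actual contract, with the results stitched together by date.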
Alternatively, ask HELP HELP and they should give you a solid answer. Unfortunately I don't have access to the terminal anymore, but I used to be very involved in building such analytics.

Related

Insert zeros instead of interpolating with ARIMA_PLUS in BigQuery

I want to do ARIMA_PLUS forecasting on a series of sale records. The problem is that the sale records only contain actual sales. When doing the forecast, we need to insert, for every product, the "non-sales": rows with the import column set to zero for every day the product has not been sold. We have two options here:
Fill the database with those zero rows (uses a lot of space).
When forecasting with ARIMA_PLUS in BigQuery, tell the model to fill gaps with zeros instead of interpolating (interpolation seems to be the default and only option).
I want to follow the second option, yet I don't see how. Here you can see a screenshot of the documentation: Google info about interpolation
The first option could be carried out with a merge; nevertheless, I would prefer to discard it, since it increases the size of the sales table.
I have scanned the documentation and haven't seen any solution.
You need to provide an input dataset covering the missing values with the right method for your use case.
In other words, the SQL query must solve the interpolation so that the input for the model already contains the expected data.
You can, for example, create a query that adds a linear interpolation step for your use case.
So the first approach you mentioned can be solved using that input SQL (rather than adding the data to the source table), and the second approach is not valid in BigQuery, as far as I know.
Here you have an example: https://justrocketscience.com/post/interpolation_sql/
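For the zero-fill case you actually want, the input query itself can generate the missing rows. A minimal BigQuery sketch, assuming a sales table with product_id, sale_date and import columns (names are placeholders), whose result you would feed to CREATE MODEL instead of the raw table:

WITH calendar AS (
  SELECT day
  FROM UNNEST(GENERATE_DATE_ARRAY('2022-01-01', '2022-12-31')) AS day
)
SELECT p.product_id,
       c.day AS sale_date,
       IFNULL(s.import, 0) AS import   -- zero wherever no sale happened
FROM (SELECT DISTINCT product_id FROM sales) AS p
CROSS JOIN calendar AS c
LEFT JOIN sales AS s
  ON s.product_id = p.product_id
 AND s.sale_date = c.day

This avoids permanently storing the zero rows while still giving the model a gap-free series.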

Need to divide a DataFrame into various tables using multiple categories and datetime

This is my first time asking a question here, so if I'm doing something wrong please guide me to the right place. I have a big, clean dataset (29,000+ rows, 24 columns). The thing is that I have to calculate the churn rate based on 4 different categorical columns, and I'm given just 1 column that contains the subscribers for a given period. I have a date column too. My idea for calculating the churn is:
churn_rate = (Sub_start_period - Sub_end_period) / Sub_start_period * 100
The Problem
I don't know how to group the data using these 4 different categorical variables. Even if I manage to do so, I would end up with more than 200 different tables, so I don't believe that would be a good approach.
My goal is to be able to predict the churn rate using the information in the table, but first I have to determine the churn rate based on these variables. The churn is not given; it has to be calculated, so I'm having problems here, as I can't think of a way of working through this.
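If I've read this right, you don't need 200 separate tables: a single groupby over all four categories plus a period key yields one long table with a churn figure per combination. A hedged pandas sketch, where cat1..cat4, date and subs are placeholder column names:

import pandas as pd

# df: the cleaned dataset, with 'date' already parsed as datetime
df["period"] = df["date"].dt.to_period("M")   # monthly periods, as an example

g = df.sort_values("date").groupby(["cat1", "cat2", "cat3", "cat4", "period"])["subs"]
start, end = g.first(), g.last()              # subs at the start and end of each period

churn = ((start - end) / start * 100).rename("churn_rate").reset_index()

The long format also feeds straight into a prediction model later, since the categories stay as columns rather than being split across tables.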

How to populate all possible combination of values in columns, using Spark/normal SQL

I have a scenario where my original dataset looks like the below.
Data:
Country,Commodity,Year,Type,Amount
US,Vegetable,2010,Harvested,2.44
US,Vegetable,2010,Yield,15.8
US,Vegetable,2010,Production,6.48
US,Vegetable,2011,Harvested,6
US,Vegetable,2011,Yield,18
US,Vegetable,2011,Production,3
Argentina,Vegetable,2010,Harvested,15.2
Argentina,Vegetable,2010,Yield,40.5
Argentina,Vegetable,2010,Production,2.66
Argentina,Vegetable,2011,Harvested,15.2
Argentina,Vegetable,2011,Yield,40.5
Argentina,Vegetable,2011,Production,2.66
Bhutan,Vegetable,2010,Harvested,7
Bhutan,Vegetable,2010,Yield,35
Bhutan,Vegetable,2010,Production,5
Bhutan,Vegetable,2011,Harvested,2
Bhutan,Vegetable,2011,Yield,6
Bhutan,Vegetable,2011,Production,3
Now there is a very small country lookup table which lists all possible countries the source data can come with.
I want the output data's number of columns to always be fixed (this is to ensure the reporting/visualization tool doesn't get a dynamic number of columns with every day's new source data ingestion, depending on the varying number of distinct countries present).
So I have to somehow join the source data with the country lookup CSV and populate all those columns, with F as the default value. Every country column would be binary, with T or F as the possible values.
The original dataset above has to be converted into the following:
Data (for rows where Type is Derived Yield, I've left the Amount field as an unevaluated expression rather than calculating it, for better understanding and so you can match it against the formula below):
Country,Commodity,Year,Type,Amount,US,Argentina,Bhutan,India,Nepal,Bangladesh
US,Vegetable,2010,Harvested,2.44,T,F,F,F,F,F
US,Vegetable,2010,Yield,15.8,T,F,F,F,F,F
US,Vegetable,2010,Production,6.48,T,F,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
US,Vegetable,2011,Harvested,6,T,F,F,F,F,F
US,Vegetable,2011,Yield,18,T,F,F,F,F,F
US,Vegetable,2011,Production,3,T,F,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
US,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Argentina,Vegetable,2010,Harvested,15.2,F,T,F,F,F,F
Argentina,Vegetable,2010,Yield,40.5,F,T,F,F,F,F
Argentina,Vegetable,2010,Production,2.66,F,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Argentina,Vegetable,2011,Harvested,10,F,T,F,F,F,F
Argentina,Vegetable,2011,Yield,90,F,T,F,F,F,F
Argentina,Vegetable,2011,Production,9,F,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Bhutan,Vegetable,2010,Harvested,7,F,F,T,F,F,F
Bhutan,Vegetable,2010,Yield,35,F,F,T,F,F,F
Bhutan,Vegetable,2010,Production,5,F,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Bhutan,Vegetable,2011,Harvested,2,F,F,T,F,F,F
Bhutan,Vegetable,2011,Yield,6,F,F,T,F,F,F
Bhutan,Vegetable,2011,Production,3,F,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Formula for populating the Amount field for the Derived Yield type:
Derived Amount = (sum of Harvested across all countries flagged T) / (sum of Production across all countries flagged T), with both sums grouped by the Year and Commodity columns.
So the target is to take every combination of the countries from the source and calculate the sums of the respective Harvested and Production values, which then have to be divided. There can be more than one commodity for any given country in the actual scenario, but that should not matter, as the summation of Amount happens per grouped Commodity and Year.
Note: users in the frontend can select any combination of countries. The sole reason for doing this in the backend rather than dynamically in the frontend is that AWS QuickSight (our visualization tool), even though it can compute sums over selected column filters, doesn't yet support calculations on those derived summed fields. Hence the calculations for all combinations of countries have to be pre-populated (a very naive approach) in order to make them available in the report on users' dynamic selection of countries.
If you have a better approach than the naive one mentioned in the note, you are most welcome to guide me. I've also posted a question on the same problem, without describing my expected approach, for experts to show how this kind of problem can be solved better than with this naive approach; if you want to help solve it with some other technique, you're most welcome, here is the link to that question.
Any help shall be greatly acknowledged.
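A rough PySpark sketch of the naive pre-computation described above (the file names, the lookup's Country column name, and the loop-and-union approach are all assumptions; it is only workable because the country list is very small):

from itertools import combinations
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
src = spark.read.csv("source_data.csv", header=True, inferSchema=True)
lookup = spark.read.csv("country_lookup.csv", header=True)
all_countries = [r["Country"] for r in lookup.collect()]

# 1. Original rows: add one fixed T/F column per lookup country.
flagged = src.select(
    "*", *[F.when(F.col("Country") == c, F.lit("T")).otherwise("F").alias(c)
           for c in all_countries])

# 2. Harvested and Production sums per Country/Commodity/Year, side by side.
base = (src.groupBy("Country", "Commodity", "Year")
           .pivot("Type", ["Harvested", "Production"])
           .agg(F.sum("Amount")))

# 3. One Derived Yield block per country combination, all unioned together.
present = sorted(r["Country"] for r in base.select("Country").distinct().collect())
parts = []
for n in range(2, len(present) + 1):
    for combo in combinations(present, n):
        agg = (base.filter(F.col("Country").isin(list(combo)))
                   .groupBy("Commodity", "Year")
                   .agg((F.sum("Harvested") / F.sum("Production")).alias("Amount")))
        parts.append(agg.select(
            F.lit(",".join(combo)).alias("Country"),
            "Commodity", "Year",
            F.lit("Derived Yield").alias("Type"),
            "Amount",
            *[F.lit("T" if c in combo else "F").alias(c) for c in all_countries]))

derived = parts[0]
for p in parts[1:]:
    derived = derived.unionByName(p)
result = flagged.unionByName(derived)

The expected output above repeats each Derived Yield row once per member country; that variant can be produced by joining each combination back to its member countries instead of concatenating them into one Country value.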

Use domain of one table as criteria in another in an MS Access query?

I am trying to create a report that displays 3 different numbers for each of my projects.
Contract Hours - Stored in the Projects table; one-to-one relationship.
Worked Hours - Stored in a linked table that will be updated using an external website's reporting feature and will contain only data for the dates to be displayed in the report; one-to-many relationship, needs to be summed.
Allocated Hours - Stored in a table in my database called Allocations, which contains data for all dates; one-to-many relationship, needs to be summed.
Right now I have it set up so that the user has to type the date range for the report every time it is run; however, the date range only actually applies to the allocation data, because the worked-hours data comes pre-filtered and the contract data is one-to-one.
What I would like to do is set up a query that can see the domain (date range) of the worked hours and apply it as a date criterion for the allocated hours.
I have attempted to use the max and min values of the worked hours and tried to get creative, but I'm not even sure whether this is possible, because I cannot see any simple solution (although I suspect it should be possible and fairly simple).
Any help, suggestions, or recommendations are appreciated.
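If I've understood the setup, you can derive the date range from the worked-hours table inside the query itself, either with subqueries or with the DMin/DMax domain functions. A sketch with guessed table and field names:

SELECT p.ProjectID,
       p.ContractHours,
       Sum(a.Hours) AS AllocatedHours
FROM Projects AS p
INNER JOIN Allocations AS a ON a.ProjectID = p.ProjectID
WHERE a.AllocDate >= (SELECT Min(w.WorkDate) FROM WorkedHours AS w)
  AND a.AllocDate <= (SELECT Max(w.WorkDate) FROM WorkedHours AS w)
GROUP BY p.ProjectID, p.ContractHours;

The equivalent criteria row in the query designer would be Between DMin("WorkDate","WorkedHours") And DMax("WorkDate","WorkedHours"), which spares the user from typing the range each time.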

Track sales for week/month and find the best sellers

Let's say I have a website that sells widgets. I would like to do something similar to a tag cloud for tracking best sellers. However, because I'm constantly acquiring and selling new widgets, I would like the sales to decay on a weekly time scale.
I'm having trouble puzzling out how to store and manipulate this data and have it decay properly over time, so that something that was an ultra-hot item 2 months ago but has since tapered off doesn't rank above the current best sellers. What would be the logic and database design for this?
Part 1: You have to have tables storing the data that you want to report on. Date/time sold is obviously key. If you need to work in decay factors, that raises the question: for how long is the data good and/or relevant? At what point in time has the "value" of the data decayed so much that you no longer care about it? When this point is reached for any given entry in the database, what do you do: keep it there but ensure it gets factored out of all subsequent computations? Or do you archive it, i.e. copy it to a "history" table and delete it from your main "sales" table? This is relevant, as it has to be factored into your decay formula (as well as your capacity planning, annual reporting requirements, and who knows what all else).
Part 2: How much thought has been given to the decay formula that you want to use? There's no end of detail you can work into this. Options and factors to wade through include but are not limited to:
Simple age-based. Everything before the cutoff date counts as 1; everything after counts as 0. Sum and you're done.
What's the cutoff date? Precisely 14 days ago, to the minute? Midnight as of two Saturdays ago from now?
Does the cutoff date depend on the item that was sold? If some items are hot but some are not, does that affect things? What if you want to emphasize some things (the expensive/hard to sell ones) over others (the fluff you'd sell anyway)?
Simple age-based decays are trivial, but can be insufficient. Time to go nuclear.
Perhaps you want some kind of half-life, Dr. Freeman?
Everything sold is "worth" X, where the value of X is either always the same or varies with the item sold. And the value of X can decay over time.
Perhaps the value of X decreases by one-half every week. Or every day. Or every month. Or (again) it may vary depending on the item.
If you do half-lives, the value of X may never reach zero, and you're stuck tracking it forever (which is why I wrote "part 1" first). At some point you probably need some kind of cut-off, a point after which you just don't care. X has decreased to one-tenth of its initial value? Three months have passed? Either/or, but the "range" depends on the inherent value of the item?
My real point here is that how you calculate your decay rate is far more important than how you store it in the database. So long as the data the formula needs for its calculations is there, you should be good. And if you only need the last month's data for this, you should perhaps move everything older to some kind of archive table.
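To make the half-life variant concrete, here is a sketch of the scoring query in MySQL-flavoured SQL (the table and column names, the one-week half-life and the 90-day cut-off are all assumptions):

SELECT item_id,
       -- each sale is worth 0.5^(age in weeks): 1 today, 0.5 a week ago, ...
       SUM(POWER(0.5, DATEDIFF(CURRENT_DATE, sold_at) / 7.0)) AS score
FROM sales
WHERE sold_at >= CURRENT_DATE - INTERVAL 90 DAY   -- the cut-off from part 1
GROUP BY item_id
ORDER BY score DESC;

The WHERE clause is where the archiving decision from part 1 shows up: past the cut-off, a sale's contribution is small enough to ignore, so it can be excluded or moved to a history table.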
You could just count the sales for the last month/week/whatever, and sort your items according to that.
If you want, you can always add the total amount of sold items into your formula.
You might have a table which contains the definitions of the scoring criteria (most sales, most this, most that, etc.), and then, for a given period, store in another table the points attributed for each criterion defined in the criteria table. Obviously a historical table will be used to store the score for each seller for a given period or promotion; call it whatever you want.
Does it help a little?