Bloomberg - get real-world prices from the API

For a number of financial instruments, Bloomberg scales the prices that are shown in the Terminal - for example:
FX Futures at CME: e.g. ADZ3 Curncy (Dec-2013 AUD Futures at CME) shows as 93.88 (close on 04-Oct-2013), whereas the actual (CME) market/settlement price was 0.9388
FX Rates: sometimes FX rates are scaled - this may vary by which way round the FX rate is asked for, so EURJPY Curncy (i.e. JPY per EUR) has a BGN close of 132.14 on 04-Oct-2013. The inverse (EUR per JPY) would be 0.007567. However, for JPYEUR Curncy (i.e. EUR per JPY), BGN has a close of 0.75672 for 04-Oct-2013.
FX Forwards: Depending on whether you are asking for rates or forward points (which can be set by overrides)... if you ask for rates, you might get these in terms of the original rate, so for EURJPY1M Curncy, BGN has a close of 132.1174 on 04-Oct-2013. But if you ask for forward points, you would get these scaled by some factor - i.e. -1.28 for EURJPY1M Curncy.
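To make the scaling concrete, here is the arithmetic on the closes quoted above (just a quick Python illustration using those same numbers):
adz3_screen, adz3_cme = 93.88, 0.9388
print(adz3_screen / adz3_cme)        # 100.0 -> the future is scaled by 100

eurjpy = 132.14                      # BGN close, JPY per EUR
print(1 / eurjpy)                    # ~0.007567, i.e. EUR per JPY
jpyeur_screen = 0.75672              # what BGN shows for JPYEUR Curncy
print(jpyeur_screen * eurjpy)        # ~100 -> the inverse quote is scaled by 100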
Now, I am not trying to criticise Bloomberg for the way that they represent this data in the Terminal. Goodness only knows when they first wrote these systems, and they have to maintain the functionality that market practitioners have come to know and perhaps love... In that context, scaling to a consistent number of significant figures might make sense.
However, when I am using the API, I want to get real-world, actual prices: the actual price at the exchange, or the actual rate at which you can trade EUR for JPY.
So... how can I do that?
Well... the approach that I have come to use is to find the fields (via FLDS) that communicate this scaling information, and then fetch those values to reverse the scaling that has been applied. For futures, that's PX_SCALING_FACTOR. For FX, I've found PX_POS_MULT_FACTOR most reliable. For FX forward points, it's FWD_SCALE.
(It's also worth mentioning that how these are applied varies - PX_SCALING_FACTOR is what futures prices should be divided by, PX_POS_MULT_FACTOR is what FX rates should be multiplied by, and FWD_SCALE is the number of decimal places by which to shift (divide) the forward points so that they can be added to the actual FX rate.)
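For what it's worth, the un-scaling ends up looking something like this (a Python-ish sketch rather than my actual Java; fetch stands in for whatever reference-data call you make, and the is_* helpers are placeholders for however you classify the security):
def real_world_price(security, quoted_price, fetch):
    if is_future(security):
        # futures: divide the screen price by the scaling factor
        return quoted_price / fetch(security, "PX_SCALING_FACTOR")
    if is_fx_rate(security):
        # FX rates: multiply the screen rate by the position multiplier
        return quoted_price * fetch(security, "PX_POS_MULT_FACTOR")
    if is_fx_forward_points(security):
        # forward points: shift by FWD_SCALE decimal places so the result
        # can be added to the actual spot rate
        return quoted_price / 10 ** fetch(security, "FWD_SCALE")
    return quoted_price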
The problem with that is that it doubles the number of fetches I have to make, which adds a significant overhead to my use of the API (reference data fetches also seem to take longer than historical data fetches). (FWIW, I'm using the API in Java, but the question should be equally applicable to using the API in Excel or any of the other supported languages.)
I've thought about finding out this information and storing it somewhere... but I'd really like not to have to hard-code that. Also, it would require me to spend a very long time finding out the right scaling factors for all the different instruments I'm interested in. Even then, I would have no guarantee that they wouldn't change their scale on me at some point!
What I would really like to be able to do is apply an override in my fetch that would allow me to specify what scale should be used. (And no, the fields above do not seem to be override-able.) I've asked the "helpdesk" about this on lots and lots of occasions - I've been badgering them about it for about 12 months - but, as ever with Bloomberg, nothing seems to have happened.
So...
has anyone else in the SO community faced this problem?
has anyone else found a way of setting this as an override?
has anyone else worked out a better solution?

Short answer: you seem to have all the available information at hand and there is not much more you can do. But these conventions are stable over time, so it is fine to store the scales/factors instead of fetching the data every time (the scale of EURGBP points will always be 4).
For FX, I have a file with:
number of decimals (for spot, points and the all-in forward rate)
points scale
spot date
To answer your specific questions:
FX Futures at CME: on ADZ3 Curncy > DES > 3:
For this specific contract, the price is quoted in cents/AUD instead of exchange convention USD/AUD in order to show greater precision for both the futures and options. Calendar spreads are also adjusted accordingly. Please note that the tick size has been adjusted by 0.01 to ensure the tick value and contract value are consistent with the exchange.
Not sure there is much you can do about this, apart from manually checking the factor...
FX Rates: PX_POS_MULT_FACTOR is your best bet indeed - note that the value of that field for a given pair is extremely unlikely to change. Alternatively, you could follow market conventions for pairs, and AFAIK the rates will then always be the actual rate, so use EURJPY instead of JPYEUR. The major currencies, in order, are: EUR, GBP, AUD, NZD, USD, CAD, CHF, JPY (see the sketch after these points). For pairs that don't involve any of those you will have to fetch the info.
FX Forwards: the points follow the market conventions, but the scale can vary (it is 4 most of the time, but it is 3 for GBPCZK for example). However it should not change over time for a given pair.
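To illustrate the last two points, a small sketch (Python; the precedence list is the one above, and the numeric values are purely illustrative):
PRECEDENCE = ["EUR", "GBP", "AUD", "NZD", "USD", "CAD", "CHF", "JPY"]

def conventional_pair(ccy1, ccy2):
    # assumes both currencies are in the precedence list; otherwise you
    # have to fetch the convention, as noted above
    return tuple(sorted([ccy1, ccy2], key=PRECEDENCE.index))

def all_in_forward(spot, points, points_scale):
    # outright forward = spot + points shifted by the stored points scale
    return spot + points / 10 ** points_scale

print(conventional_pair("JPY", "EUR"))   # ('EUR', 'JPY')
print(all_in_forward(1.3600, -12.5, 4))  # 1.35875 (made-up numbers)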

Total coin supply, how it works, and what the code means?

I am currently studying the Bitcoin and Litecoin codebases to try to get a better understanding of cryptocurrencies, and blockchains in general - and I have spotted something in the code that I have a question about.
In src/amount.h I see the following code...
/** No amount larger than this (in satoshi) is valid.
 *
 * Note that this constant is *not* the total money supply, which in Bitcoin
 * currently happens to be less than 21,000,000 BTC for various reasons, but
 * rather a sanity check. As this sanity check is used by consensus-critical
 * validation code, the exact value of the MAX_MONEY constant is consensus
 * critical; in unusual circumstances like a(nother) overflow bug that allowed
 * for the creation of coins out of thin air modification could lead to a fork.
 * */
static const CAmount MAX_MONEY = 84000000 * COIN;
Now, the comment here seems to suggest that this code does not actually define what the total supply of the currency will be, even though the amount of Litecoin available is in fact 84,000,000...
So, my real question:
Is the real total supply held in another piece of code? If so, what am I missing, where can I find this code, and if I were to be trying to edit this (I'm not - but I want to understand what is going on here) - would I need to edit code in multiple places?
NOTE: Tagged bitcoin even though this is litecoin source in the question, because litecoin doesn't appear to have a stackoverflow tag, and the two codebases are similar anyway.
EDIT : I also wanted to add, that I performed a grep for "84000000" - and only really found that one line of code to be relevant... So I must be missing something...
EDIT 2 : According to literally every coin out there on git that I have looked at - this is the number that they change when adjusting the total supply - so is the comment just wrong - or did I misunderstand it?
I realise this is an old question, but since it hasn't been updated I'll provide an answer.
As the source suggests, MAX_MONEY is simply a sanity check. If someone tries to create a transaction spending 500 million Bitcoin, and it somehow manages to bypass all other sanity checks, the network will still reject it because the amount exceeds MAX_MONEY. So MAX_MONEY is not directly related to total supply, but as you have observed, many alts will set MAX_MONEY to the expected total supply over the lifetime of the coin.
For a pure proof-of-work coin with consistent reward scheme (eg halving every X blocks) the total supply can be pre-calculated, but a future fork could change that.
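(The emission schedule itself lives in the block-subsidy code - GetBlockValue in older codebases, GetBlockSubsidy in current Bitcoin Core - not in MAX_MONEY.) As a rough sketch of that pre-calculation, assuming Litecoin's parameters from memory (50-coin initial subsidy, halving every 840,000 blocks, 8 decimal places per coin - check your own source tree):
COIN = 10 ** 8                    # smallest units per coin
subsidy = 50 * COIN               # initial block reward
halving_interval = 840000         # blocks between halvings

total = 0
while subsidy > 0:
    total += subsidy * halving_interval
    subsidy //= 2                 # integer halving, as in the C++ code

print(total / COIN)               # a touch under 84,000,000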
For a typical proof-of-stake or hybrid proof-of-work and proof-of-stake coin, the maximum supply can be estimated by simulation, but the exact amount will vary depending on network activity.
(This assumes there is not another part of the code that cuts off all rewards after a limit is reached.)

Can AMPL handle this recursively or is a remodeling necessary?

I'm using AMPL to model a production where I have two particular constraints that I am not very sure how to handle.
subject to Constraint1 {t in T}:
    prod[t] = sum{i in I} x[i,t]*u[i] + Recycle[f]*RecycledU[f];

subject to Constraint2 {t in T}:
    Solditems[t] + Recycle[t] = prod[t];
EDIT: where x[i,t] is the amount of products from supply point i. u[i] denotes the "exchange rate" of the raw material from supply point i to create the product, i.e. a percentage of the raw material will become the finished product, whereas some raw material will go to waste. The same is true for RecycledU[f], where f is in F, which denotes the refinement station where it has been refined. The difference is that RecycledU[f] has a much lower percentage that will go to waste, due to Recycle already being a finished product from f (albeit a much less profitable one). I.e. Recycle has already gone through the process of being a raw material, x, but has become a finished product in some earlier stage, or hopefully (if it can be modelled) in the same time period as this. In the actual model, things such as "products" and "refinement stations" exist as well, but I figured those could be left out of this question to keep it simpler.
What I want to accomplish is that the amount of products produced is the sum of all items sold in time period t and the amount of products recycled in time period t (by recycled I mean that the finished product is kept at the production site for further refinement in some timestep g, g>t).
Is it possible to write two equality constraints on prod[t] like I have done? Also, how do I handle Recycle[t]? Can AMPL "understand" that, since these are represented at the same time step, it must handle the constraints recursively, i.e. compute a solution for Recycle[t] and subsequently try to improve that solution in every timestep?
EDIT: The time periods are expressed in years which is why I want to avoid having an expression with Recycle[t-1].
EDIT2: prod and x are parameters and Recycle and Solditems are variables.
Hope someone can shed some light on this!
Cenderze
The two constraints will be considered simultaneously (unless you explicitly exclude one from the problem). AMPL and optimization solvers don't have a notion of time steps: the complete problem is considered at once, so you might need to add linking constraints between time periods yourself. In particular, you might need to make sure that the inventory (such as the amount of finished product kept at the production site for further refinement) is carried over from one period to another, something like:
subject to RecycleBalance {t in T: t + 1 in T}:
    Recycle[t + 1] = Recycle[t] - RecycleDecrease[t] + RecycleIncrease[t];
You have to figure out the expressions for the amounts by which Recycle is increased (RecycleIncrease) and decreased (RecycleDecrease).
Also if you want some kind of an iterative procedure with one constraint considered at a time instead, then you should use AMPL script.

Neural Network Input and Output Data formatting

Hello, and thanks for reading my thread.
I have read some of the previous posts on formatting/normalising input data for a Neural Network, but cannot find something that addresses my queries specifically. I apologise for the long post.
I am attempting to build a radial basis function network for analysing horse racing data. I realise that this has been done before, but the data that I have is "special" and I have a keen interest in racing/sportsbetting/programming so would like to give it a shot!
Whilst I think I understand the principles for the RBFN itself, I am having some trouble understanding the normalisation/formatting/scaling of the input data so that it is presented in a "sensible manner" for the network, and I am not sure how I should formulate the output target values.
For example, in my data I look at the "Class change", which compares the class of the race that the horse is running in now to the race before, and can have a value between -5 and +5. I expect that I need to rescale these to between -1 and +1 (right?!), but I have noticed that many more runners have a class change of 1, 0 or -1 than any other value, so I am worried about "over-representation". It is not possible to gather more data for the higher/lower class changes because that's just 'the way the data comes'. Would it be best to use the data as-is after scaling, or should I trim extreme values, or something else?
Similarly, there are "continuous" inputs - like the "Days Since Last Run". It can have a value between 1 and about 1000, but values in the range of 10-40 vastly dominate. I was going to scale these values to be between 0 and 1, but even if I trim the most extreme values before scaling, I am still going to have a huge representation of a certain range - is this going to cause me an issue? How are problems like this usually dealt with?
Finally, I am having trouble understanding how to present the "target" values for training to the network. My existing results data has the "win/lose" (0 or 1?) and the odds at which the runner won or lost. If I just use the "win/lose", it treats all wins and losses the same when really they're not - I would be quite happy with a network that ignored all the small winners but was highly profitable from picking 10-1 shots. Similarly, a network could be forgiven for "losing" on a 20-1 shot, but losing a bet at 2/5 would be a bad loss. I considered making the results (+1 * odds) for a winner and (-1 / odds) for a loser to capture the issue above, but this will mean that my results are not a continuous function, as there will be a "discontinuity" between short-price winners and short-price losers.
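Just to show what I mean about the jump, here is the proposed encoding on a couple of made-up runners (quick Python, fractional odds):
def target(won, odds):
    # my proposed encoding: +1 * odds for a winner, -1 / odds for a loser
    return odds if won else -1.0 / odds

print(target(True, 0.4), target(False, 0.4))    # 2/5 shot: +0.4 if it wins, -2.5 if it loses
print(target(True, 10.0), target(False, 10.0))  # 10-1 shot: +10.0 if it wins, -0.1 if it loses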
Should I have two outputs to cover this - one for bet/no bet, and another for "stake"?
I am sorry for the flood of questions and the long post, but this would really help me set off on the right track.
Thank you for any help anyone can offer me!
Kind regards,
Paul
The documentation that came with your RBFN is a good starting point to answer some of these questions.
Trimming data, aka "clamping" or "winsorizing", is something I use for similar data. For example, "days since last run" for a horse could be anything from just one day to several years but tends to centre in the region of 20 to 30 days. Some experts use a figure of, say, 63 days to indicate a "spell", so you could have an indicator variable like "> 63 = 1, else 0", for example. One clue is to look at the outliers, say the upper or lower 5% of any variable, and clamp these.
If you use odds/dividends anywhere, make sure you use the probabilities, i.e. 1/(odds+1), and a useful idea is to normalize these to 100%.
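A small sketch of both ideas (Python; the 63-day cut-off and the odds are only illustrative):
def clamp(value, lower, upper):
    return max(lower, min(upper, value))

days_since_run = [7, 21, 35, 900]
clamped = [clamp(d, 1, 63) for d in days_since_run]   # winsorize at the 63-day "spell"

odds = [2.5, 4.0, 9.0, 20.0]                  # fractional odds for one race
implied = [1.0 / (o + 1.0) for o in odds]     # probabilities, 1/(odds+1)
book = sum(implied)                           # usually > 1 (the bookmaker's over-round)
normalized = [p / book for p in implied]      # now they sum to 100%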
The odds or parimutuel prices tend to swamp other predictors, so one technique is to develop separate models: one for the market variables (the market model) and another for the non-market variables (often called the "fundamental" model).

Optimizing lumber purchasing

I would like to know how to classify the following optimization problem.
A lumber yard sells 2x4's in various stock lengths. For example, an 8ft could be $3 and a 10ft could be $4, while a 14ft might be $5.50. Importantly, the lengths are not linearly related to price and not all discrete lengths can be purchased as stock. It can be assumed that the available stock units are inexhaustible in these discrete lengths.
length   cost
7.7ft    $2.75
8ft      $3.00
10ft     $4.00
14ft     $5.50
I need to create a set of 2x4's with given lengths by cutting them from the above stock (say I need lengths of 2ft, 2.5ft, 6ft once all is said and done). Also, each "cut" incurs a material cost of 1/8" (i.e. 0.0104ft). The solution of the problem is an assignment of each desired length to a piece of stock with the total cost of all stock minimized. In this example, the optimal solution minimizing cost is to buy a 14ft board at $5.50. (A runner-up solution is to buy two 8ft boards and allocate as {6ft} and {2ft, 0.0104ft, 2.5ft} for a cost of $6.)
It does not seem to be a Knapsack-class problem. It does not seem to be a cutting stock problem (because I would like to minimize cost rather than minimize waste). What sort of problem is this, and how can I go about efficiently solving it?
(As an after-note, this is a non-fictional problem I have solved in the obvious, inefficient way using multiset partitions and iteration in Haskell. The runtime is prohibitive to practical use with more than 23 desired lengths and 6 available stock sizes.)
I believe that this is a cutting stock problem, except that it's a multi-objective or multi-criteria cutting stock problem (where you want to minimize monetary cost as well as material cost), see for example this article. Unfortunately almost all of the online resources I found for this breed of cutting stock problem were behind paywalls; in addition, I haven't done any integer-linear programming in several years, but if I remember correctly multi-objective problems are much more difficult than single-objective problems.
One option is to implement a two-pass algorithm. The first pass completely ignores the material cost of cutting the boards, and only uses the monetary cost (in place of the waste cost in a standard cutting stock problem) in a single-objective problem. This may leave you with an invalid solution, at which point you perform a local search to e.g. replace two 10-foot boards with a 14-foot board and an 8-foot board until you reach a valid solution. Once you find a valid solution, you can continue the local search for several more iterations to see if you can improve on the solution. This algorithm will likely be sub-optimal when compared to a one-pass multi-objective solution, but it ought to be much easier to implement.
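To make the first pass concrete, here is a sketch of that single-objective model in Python using the PuLP library (my choice, not something from the question); the 1/8" kerf is ignored and the local-search repair pass is left out:
import pulp

stock = [(7.7, 2.75), (8.0, 3.00), (10.0, 4.00), (14.0, 5.50)]  # (length, cost)
pieces = [2.0, 2.5, 6.0]                                        # required lengths

# Worst case, every piece gets its own board, so allow that many copies per type.
boards = [(s, k) for s in range(len(stock)) for k in range(len(pieces))]

prob = pulp.LpProblem("lumber", pulp.LpMinimize)
buy = {b: pulp.LpVariable("buy_%d_%d" % b, cat="Binary") for b in boards}
cut = {(p, b): pulp.LpVariable("cut_%d_%d_%d" % ((p,) + b), cat="Binary")
       for p in range(len(pieces)) for b in boards}

# Objective: money spent on purchased boards.
prob += pulp.lpSum(stock[s][1] * buy[(s, k)] for (s, k) in boards)

# Each required piece is cut from exactly one board.
for p in range(len(pieces)):
    prob += pulp.lpSum(cut[(p, b)] for b in boards) == 1

# Pieces assigned to a board must fit on it, and only purchased boards count.
for b in boards:
    prob += pulp.lpSum(pieces[p] * cut[(p, b)]
                       for p in range(len(pieces))) <= stock[b[0]][0] * buy[b]

prob.solve()
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))
On the toy data from the question this first pass already lands on a $5.50 purchase; the repair pass then only has to fix boards whose pieces no longer fit once the kerf is added back in.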

How would I calculate EXPECTED income if I have PAST income data in mySQL?

Ok, I'm just curious what the formula would be for calculating expected income over the next X weeks/months/etc., if the only data I have in my MySQL DB is past transactions (dates of transactions, amounts, etc.).
I am thinking of taking some averages and whatnot, but I can't think of a specific formula (there must be something along those lines) to take, say, the average rise of income over time (weekly/monthly) and then apply it to a selected future period and display it weekly/monthly/etc.
Any suggestions?
Use AVG() on the past income and divide it into the proper weekly/monthly amounts if necessary.
See http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html#function_avg for more info on AVG()
Linear regression + simple integration is probably sufficient for your needs. I leave sorting out the exact implementation for your DB up to you, but follow that link to the "Estimation Methods" section, and probably use Ordinary Least Squares.
Alternatively, you can always slurp your data into something like R where the details are already implemented.
EDIT:
For more detail: you're trying to model INCOME = BASE + SCALING*T, where we are assuming that a linear model is "good" (it's probably not great, but it's probably good enough on a short time scale). For two-variable linear regression, you're pretty much just taking averages; follow that link to "Fitting the Regression Line" and you'll see which things you need to average (y = INCOME and x = T). There are some tricks you can play to simplify the calculation for the computer if you can enforce some other conditions (e.g., having equally spaced time periods + no missing data), but you'll need to do a bit more math yourself first if you want to do that (and you'll be less flexible in the face of changing DB assumptions).
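For illustration, a minimal version in plain Python (assuming you've already pulled weekly totals out of MySQL; the numbers are made up):
def fit_line(ts, incomes):
    # two-variable ordinary least squares - literally "just taking averages"
    n = len(ts)
    t_bar = sum(ts) / float(n)
    y_bar = sum(incomes) / float(n)
    scaling = (sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, incomes))
               / sum((t - t_bar) ** 2 for t in ts))
    base = y_bar - scaling * t_bar
    return base, scaling

weeks = [1, 2, 3, 4, 5, 6]
income = [900.0, 950.0, 1010.0, 980.0, 1100.0, 1150.0]   # made-up weekly totals
base, scaling = fit_line(weeks, income)

# EXPECTED income for the next four weeks under INCOME = BASE + SCALING*T:
forecast = [base + scaling * t for t in range(7, 11)]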