How to compute Technical Indicators with 1 Minute stock price data? - stock

I am using TA-Lib to compute various technical indicators. My dataset is stock price data at 1-minute intervals. The easiest approach seems to be multiplying the period by 390 (the number of minutes in a trading day), e.g. to compute a 5-day SMA: SMA(inputs, timeperiod=5*390)
Is there a library for this purpose, or a better solution?

This really depends on what you're seeking. If you're seeking average daily price, you'd have to convert your quote history from minute to daily-sized bars, then run it through in that new format. TA-Lib is an older array-based library, so you'd have to do this work while the data still has date/time context, before you put it in the array.
Here's an example of how to "quantize" quote history in C#.
history
.OrderBy(x => x.Date)
.GroupBy(x => x.Date.RoundDown(newBarSize))
.Select(x => new Quote
{
Date = x.Key,
Open = x.First().Open,
High = x.Max(t => t.High),
Low = x.Min(t => t.Low),
Close = x.Last().Close,
Volume = x.Sum(t => t.Volume)
});
This is also available in the open-source Skender.Stock.Indicators library and can be used as simply history.Aggregate(PeriodSize.Day), for example.
This library can replace TA-Lib if you're looking for a more modern technical indicators library. For example, it has a history.GetSma(5) method, among others.
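Since the question uses the Python-style TA-Lib call, here is a minimal Python sketch of the same idea: collapse minute bars into daily bars first, then average the daily closes. It uses only the standard library; the function names to_daily_bars and sma and the sample prices are made up for illustration and are not part of TA-Lib or Skender.Stock.Indicators.

```python
from datetime import datetime

def to_daily_bars(minute_bars):
    """Collapse (timestamp, open, high, low, close, volume) bars into one bar per date."""
    days = {}
    for ts, o, h, l, c, v in sorted(minute_bars, key=lambda bar: bar[0]):
        day = ts.date()
        if day not in days:
            # First bar of the day supplies the open.
            days[day] = {"open": o, "high": h, "low": l, "close": c, "volume": v}
        else:
            bar = days[day]
            bar["high"] = max(bar["high"], h)
            bar["low"] = min(bar["low"], l)
            bar["close"] = c  # last bar seen so far supplies the close
            bar["volume"] += v
    return days

def sma(values, period):
    """Simple moving average; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out

# Two minute bars per day for three days (made-up numbers).
minute_bars = [
    (datetime(2021, 3, 1, 9, 30), 10.0, 11.0, 9.5, 10.5, 100),
    (datetime(2021, 3, 1, 9, 31), 10.5, 12.0, 10.0, 11.0, 150),
    (datetime(2021, 3, 2, 9, 30), 11.0, 12.5, 10.8, 12.0, 200),
    (datetime(2021, 3, 2, 9, 31), 12.0, 13.5, 11.9, 13.0, 100),
    (datetime(2021, 3, 3, 9, 30), 13.0, 13.2, 12.0, 12.5, 50),
    (datetime(2021, 3, 3, 9, 31), 12.5, 12.6, 11.8, 12.0, 70),
]

daily = to_daily_bars(minute_bars)
closes = [bar["close"] for bar in daily.values()]
daily_sma = sma(closes, 2)  # 2-day SMA over daily closes
```

Note that this is not the same number as an SMA over 5*390 minute closes: the daily version averages one close per day, while the minute version averages every minute's close in the window.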

How to convert a for loop to NHibernate Futures for performance

NHibernate Version: 3.4.0.4000
I'm currently working on optimizing our code so that we can reduce the number of round trips to the database and am looking at a for loop that is one of the culprits. I'm having a hard time figuring out how to batch all of these iterations into a future that gets executed once when sent to SQL Server. Essentially each iteration of the loop causes 2 queries to hit the database!
foreach (var choice in lineItem.LineItemChoices)
{
choice.OptionVersion = _session.Query<OptionVersion>().Where(x => x.Option.Id == choice.OptionId).OrderByDescending(x => x.OptionVersionNumber).FirstOrDefault();
choice.ChoiceVersion = _session.Query<ChoiceVersion>().OrderByDescending(x => x.ChoiceVersionIdentity.ChoiceVersionNumber).Where(x => x.Choice.Id == choice.ChoiceId).FirstOrDefault();
}
One option is to extract OptionId and ChoiceId from all the LineItemChoices into two lists in local memory. Then issue just two queries, one for options and one for choices, passing these lists in .Where(x => optionIds.Contains(x.Option.Id)). This corresponds to the SQL IN operator. This requires some postprocessing. You will get two result lists (transform them to a dictionary or lookup if you expect many results) that you need to process to populate the choice objects. This postprocessing is local and tends to be very cheap compared to database round trips. This option can be a bit tricky if the existing FirstOrDefault part is absolutely necessary. Do you expect there to be more than one result for a single optionId? If not, this code could instead have used SingleOrDefault, which could just be dropped when converting to IN queries.
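The postprocessing step can be sketched in a few lines. This is a language-neutral illustration written in Python, not NHibernate code: VersionRow is a hypothetical stand-in for the rows returned by one IN query, and latest_by_key is a made-up helper that keeps the highest version per id, which is what OrderByDescending(...).FirstOrDefault() picks in the original loop.

```python
from collections import namedtuple

# Hypothetical stand-in for rows returned by a single IN query.
VersionRow = namedtuple("VersionRow", ["option_id", "version_number"])

def latest_by_key(rows, key, version):
    """Return {key(row): row}, keeping the row with the highest version per key."""
    latest = {}
    for row in rows:
        k = key(row)
        if k not in latest or version(row) > version(latest[k]):
            latest[k] = row
    return latest

rows = [VersionRow(1, 1), VersionRow(1, 3), VersionRow(2, 2), VersionRow(1, 2)]
latest = latest_by_key(
    rows,
    key=lambda r: r.option_id,
    version=lambda r: r.version_number,
)

# dict.get returns None for an unknown id, mirroring FirstOrDefault.
missing = latest.get(99)
```

Building the dictionary is a single pass over the results, so populating each choice afterwards is a constant-time lookup rather than a re-sort per choice.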
The other option is to use futures (https://nhibernate.info/doc/nhibernate-reference/performance.html#performance-future). For Linq it means to use ToFuture or ToFutureValue at the end, which also conflicts with FirstOrDefault I believe. The important thing is that you need to loop over all line item choices to initialize ALL queries BEFORE you access the value of any of them. So this is likely to also result in some postprocessing, where you would first store the future values in some list, and then in a second loop access the real value from each query to populate the line item choice.
If you do expect that the queries can yield more than one result (before applying FirstOrDefault), I think you can just use Take(1) instead, as that will still return an IQueryable to which you can apply the future method.
The first option is probably the most efficient, since it will just be two queries and allow the database engine to make just one pass over the tables.
Keep the limit on the maximum number of parameters that can be given in an SQL query in mind. If there can be thousands of line item choices, you may need to split them in batches and query for at most 2000 identifiers per round trip.
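Splitting the identifiers into batches is straightforward. A minimal sketch in Python (the helper name chunked and the id values are made up; the 2000 default simply follows the figure mentioned above):

```python
def chunked(ids, size=2000):
    """Yield consecutive slices of ids with at most `size` items each."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

option_ids = list(range(1, 4502))  # 4501 made-up identifiers
batches = list(chunked(option_ids, 2000))
batch_sizes = [len(batch) for batch in batches]
```

Each batch then becomes one IN query, and the per-batch results are merged in memory before the choices are populated.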
Adding to Oskar's answer: NHibernate Futures were implemented in NHibernate 2.1. They are exposed as Future for collections and FutureValue for single values on criteria and HQL queries, and as ToFuture and ToFutureValue on LINQ queries.
In your case, you could first collect the IDs in memory ...
var optionIds = lineItem.LineItemChoices.Select(x => x.OptionId).ToList();
var choiceIds = lineItem.LineItemChoices.Select(x => x.ChoiceId).ToList();
... and then execute two queries using ToFuture to get both lists in one round trip to the database.
var optionVersions = _session.Query<OptionVersion>()
.Where(x => optionIds.Contains(x.Option.Id))
.OrderByDescending(x => x.OptionVersionNumber)
.ToFuture();
var choiceVersions = _session.Query<ChoiceVersion>()
.Where(x => choiceIds.Contains(x.Choice.Id))
.OrderByDescending(x => x.ChoiceVersionIdentity.ChoiceVersionNumber)
.ToFuture();
Once everything you need is in memory, you can loop over the original collection and look up the data in memory to fill in each choice object.
foreach (var choice in lineItem.LineItemChoices)
{
choice.OptionVersion = optionVersions.OrderByDescending(x => x.OptionVersionNumber).FirstOrDefault(x => x.Option.Id == choice.OptionId);
choice.ChoiceVersion = choiceVersions.OrderByDescending(x => x.ChoiceVersionIdentity.ChoiceVersionNumber).FirstOrDefault(x => x.Choice.Id == choice.ChoiceId);
}

How can I reduce the database call time and number (Rails)?

So I'm working on a rails app for a building that keeps track of water usage/collection and electricity use/solar generation, etc. These are stored as measurement rows, attached to sensors, which are attached to programs (location in the building, essentially) and subtypes (attached to types - water, electricity).
I'm doing some graphing with chartkick, and the database calls related to this are way too slow. They'll be much faster on the production servers, but there will also be far more data.
Here's the helper method that has the chart generation and database call in it:
def stackedSubtypeChart(grouping)
  rsubs = @resource.subtypes
    .order(:usage?) #add usage types after gen types
    .map{|stype| [
      stype.name,stype.measurements #this takes too long!
        .where("date >= ?", params[:start]) #(4 calls!!)
        .where("date <= ?", params[:stop])
        .group_by_period(grouping, :date).maximum(:amount)]}
  rsubs = rsubs.map {|stype|
    {name: stype[0],
     data: stype[1]}}
  ret = column_chart rsubs,
    stacked: true,
    library: { :series => {0 => { type: "line"}}}
end
@resource is defined in the controller as:
@resource = Type.includes(:subtypes => :sensors).find_by_resource('electricity')
I've commented the line that's responsible for there being multiple calls, which is definitely part of the problem. This takes two seconds to load on my (admittedly very very old) computer with a month of data.
I could really use help with both changing the map so that this is one call instead of however-many-subtypes calls, and with reducing what I'm pulling in so each call isn't taking half a second. I don't have a ton of experience optimizing this sort of thing and I'm not really sure how to start doing more than I have here already.
Might be helpful to look into ActiveRecord Explain to dig into the SQL. There's a good screencast that explains (pun totally intended) pretty well.
After a lot of bashing my head against a wall, I stumbled across this, which is a much faster single query that grabs all the data + data connections I need. It's a little hard to format but it works.
rsubs = Measurement
  .where("measurements.date >= ? AND measurements.date <= ?",
         offset(params[:start], -1, grouping),
         offset(params[:stop], 1, grouping))
  .joins(sensor: {subtype: :type})
  .where("types.resource = ?", @rname)
  .order('subtypes."usage?"')
  .group_by_period(grouping, :date).group("subtypes.id, subtypes.name").maximum(:amount)

Printing a PDF of more than 5000 pages takes a long time using the Prawn PDF gem

I am using the Prawn PDF gem to generate a PDF.
I format the data into tables and then print it to the PDF. I have around 5000 pages (about 50000 entries) to print and it takes forever. For a small number of pages it's quick. Is there any way I can improve the speed?
Also, printing the data without the table format was quick. Please help me out with this.
Code for this:
format.pdf {
pdf = Prawn::Document.new(:margin => [20,20,20,20])
pdf.font "Helvetica"
pdf.font_size 12
@test_points_all = Hash.new
dataset_id = Dataset.where(collection_success: true).order('created_at DESC').first
if(inode.leaf?)
meta=MetricInstance.where(dataset_id: dataset_id, file_or_folder_id: inode.id).includes(:test_points,:file_or_folder,:dataset).first
@test_points_all[inode.name] = meta.test_points
else
nodes2 = []
nodes2 = inode.leaves
if(!nodes2.nil?)
nodes2.each do |node|
meta=MetricInstance.where(dataset_id: dataset_id, file_or_folder_id: node.id).includes(:test_points,:file_or_folder,:dataset).first
@test_pointa = meta.test_points
if(!@test_pointa.nil?)
@test_points_all[node.name] = @test_pointa
end
end
end
end
@test_points_all.each do |key, points|
table_data = [["<b> #{key} </b>", "<b>433<b>","xyz","xyzs"]]
points.each do |test|
td=TestDescription.find(:first, :conditions=>["test_point_id=?", test.id])
if (!td.nil?)
table_data << ["#{test.name}","#{td.header_info}","#{td.comment_info}","#{td.line_number}"]
end
pdf.move_down(5)
pdf.table(table_data, :width => 500, :cell_style => { :inline_format => true ,:border_width => 0}, :row_colors => ["FFFFFF", "DDDDDD"])
pdf.text ""
pdf.stroke do
pdf.horizontal_line(0, 570)
end
pdf.move_down(5)
end
end
pdf.number_pages("<page> of <total>", {
:start_count_at => 1,
:page_filter => lambda{ |pg| pg > 0 },
:at => [pdf.bounds.right - 50, 0],
:align => :right,
:size => 9
})
pdf.render_file File.join(Rails.root, "app/reports", "x.pdf")
filename = File.join(Rails.root, "app/reports", "x.pdf")
send_file filename, :filename => "x.pdf", :type => "application/pdf",:disposition => "inline"
end
The first of those two lines is pointless, take it out!
nodes2 = []
nodes2 = inode.leaves
Based on your information, I understand that the query below is performed around 50000 times. Depending on the volume and content of your table, it might be very reasonable to perform one single query (fetching the whole table) at the start of your script, keep that data in memory, and perform all following operations on it in pure Ruby without talking to the database. Then again, if the table you are working with is insanely huge, loading it all might clog up your memory and not be a good idea at all. It really depends, so measure both.
TestDescription.find(:first, :conditions=>["test_point_id=?", test.id])
Also, if, as you say, printing without tables was very quick, you might be able to achieve a major speedup by reimplementing the small part of the table functionality you actually use, with only low-level functions from Prawn. Why? Prawn's table function is built to cover as many use cases as possible, and therefore carries a lot of overhead (at least from the perspective of someone who needs only barebones functionality; for everyone else this "overhead" is gold). Implementing just the little part of tables you need yourself might give you a major performance boost. Give it a shot!
If you're using a recent version of ActiveRecord, I'd suggest using pluck in your inner loop. Instead of this:
td=TestDescription.find(:first, :conditions=>["test_point_id=?", test.id])
if (!td.nil?)
table_data << ["#{test.name}","#{td.header_info}","#{td.comment_info}","#{td.line_number}"]
end
Try this instead (pluck with several columns needs Rails 4 or later; name comes from the test point, as in your original code, not from TestDescription):
td = TestDescription.where(test_point_id: test.id)
.pluck(:header_info, :comment_info, :line_number).first
table_data << [test.name, *td] unless td.blank?
Instead of instantiating an ActiveRecord object for each TestDescription, you'll just get back an array of field values that you can splice directly into the table_data row, which is really all you need here. This means less memory usage, and less time spent in GC.
It might also be worth trying to use pluck to retrieve all the entries at once, in which case you'd have an array of arrays to loop over. This would take more memory than fetching one at a time, but a lot less than an array of AR objects, and you'd save doing separate db queries.
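The "fetch everything at once" idea can be sketched independently of Rails. Below is a minimal, self-contained Python illustration using an in-memory SQLite database; the table layout, column names, and sample rows are assumptions modeled on the question, not the asker's real schema. It replaces one query per test point with a single IN query plus a dictionary lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_descriptions ("
    "test_point_id INTEGER, header_info TEXT, comment_info TEXT, line_number INTEGER)"
)
conn.executemany(
    "INSERT INTO test_descriptions VALUES (?, ?, ?, ?)",
    [(1, "h1", "c1", 10), (2, "h2", "c2", 20), (4, "h4", "c4", 40)],
)

# (id, name) pairs standing in for the test points; made-up data.
test_points = [(1, "alpha"), (2, "beta"), (3, "gamma")]

# One query for every id instead of one query per test point.
ids = [tp_id for tp_id, _ in test_points]
placeholders = ",".join("?" for _ in ids)
rows = conn.execute(
    "SELECT test_point_id, header_info, comment_info, line_number "
    "FROM test_descriptions WHERE test_point_id IN (" + placeholders + ")",
    ids,
).fetchall()

# Index by test_point_id; setdefault keeps the first row seen per id.
by_point = {}
for tp_id, header, comment, line in rows:
    by_point.setdefault(tp_id, (header, comment, line))

# Build table rows only for test points that have a description.
table_data = [
    [name, *by_point[tp_id]] for tp_id, name in test_points if tp_id in by_point
]
```

With tens of thousands of ids, the IN list would need to be split into batches, since databases limit the number of bound parameters per statement.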

How to get charges(transactions) details in Stripe based on date range

I want to get a list of charges (transactions) for a date range I specify, i.e. all transactions between my specified start date and end date.
But in the Charges API, I cannot see any start date or end date arguments.
How can I get this?
I had a chat with Stripe staff through their online chat and found that there is a way to get a list of charges based on a date range.
The Stripe Charges API actually has some arguments that are not yet listed in their documentation.
Arguments like created[lte] and created[gte] with a Unix timestamp can be used, just like in the Events API call.
EG: https://api.stripe.com/v1/charges?created[gte]=1362171974&created[lte]=1362517574
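The timestamps in that URL are plain Unix seconds, so they can be computed from calendar dates. A minimal Python sketch using only the standard library; the helper name created_range and the chosen dates are illustrative, and nothing here calls Stripe:

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

def created_range(start, end):
    """Build created[gte]/created[lte] query parameters from two aware datetimes."""
    return {
        "created[gte]": int(start.timestamp()),
        "created[lte]": int(end.timestamp()),
    }

params = created_range(
    datetime(2013, 3, 1, tzinfo=timezone.utc),
    datetime(2013, 3, 5, tzinfo=timezone.utc),
)

# urlencode percent-encodes the square brackets in the parameter names.
query = urlencode(params)
url = "https://api.stripe.com/v1/charges?" + query
```

Using timezone-aware datetimes avoids the range silently shifting with the local timezone of the machine that builds the request.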
Try this one. It's working for me
$pcharges = Charge::all(
array(
'limit' => 100,
'created' => array(
'gte' => strtotime('-15 day'),
'lte' => strtotime('-1 day')
)
)
);
This returns the last 15 days of data, excluding today's transactions (up to the limit of 100 per call). You can set a custom date range as required.
Here's a Ruby-based hack that pages through all charges:
Stripe.api_key = ENV['STRIPE_SECRET']
stripe_charges = []
# Fetch the first page, then keep paging with starting_after
# until a page comes back empty.
page = Stripe::Charge.all(limit: 100).data
until page.empty?
  stripe_charges.concat(page)
  page = Stripe::Charge.all(limit: 100, starting_after: page.last.id).data
end
I was looking into this today and here is what I found:
https://stripe.com/docs/api/curl#list_charges
curl https://api.stripe.com/v1/charges?limit=3 \
-u sk_test_BQokikJOvBiI2HlWgH4olfQ2:
This is Stripe's curl example; there are more examples on their website.
-James Harrington
In case anyone is using Ruby on Rails and is looking for a way to list all refunds that have been created after a Unix timestamp, using the created.gte syntax, here's a working example that I got from Stripe Support.
Stripe::Refund.list({limit: 100, created: {gte: 1614045880}})
You can change that Unix timestamp to fit your situation.
Resources: Stripe API Reference, List all refunds and Stripe Support
To get the data for a particular date, the code looks like this:
$mydata= \Stripe\Charge::all(array('limit'=>50,'starting_after'=>null ,"created" => array("gt" => strtotime("2020-02-17"),"lt" => strtotime("2020-02-19"))));
print_r($mydata);
This returns the data for 2020-02-18 with a limit of 50. If you want more records, pass the ID of the last charge in the starting_after parameter.
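The starting_after pattern generalizes to a small loop. The sketch below is in Python and deliberately does not call Stripe: fake_fetch_page is an in-memory stand-in for the real list call, and it is assumed to return a dict with data and has_more keys, which is the shape Stripe list responses use. The helper name fetch_all and the ids are made up.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect every item by following a starting_after cursor until has_more is false."""
    items = []
    cursor = None
    while True:
        page = fetch_page(limit=page_size, starting_after=cursor)
        items.extend(page["data"])
        if not page["has_more"] or not page["data"]:
            return items
        # The id of the last item on this page becomes the next cursor.
        cursor = page["data"][-1]["id"]

# Fake in-memory backend standing in for the real API call.
_CHARGES = [{"id": "ch_%03d" % n} for n in range(250)]

def fake_fetch_page(limit, starting_after=None):
    start = 0
    if starting_after is not None:
        ids = [charge["id"] for charge in _CHARGES]
        start = ids.index(starting_after) + 1
    data = _CHARGES[start:start + limit]
    return {"data": data, "has_more": start + limit < len(_CHARGES)}

all_charges = fetch_all(fake_fetch_page)
```

In real use, the created filter would be passed on every page request so that each page stays within the same date range.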

Grouping items into first occurrences

I totally can't get my head around writing a correct query for my problem this morning so here's hoping that someone out there can help me out.
I have a database table called Sessions which basically looks like this
Sessions:
SessionID
SessionStarted
IPAddress
..other meta data..
I have a requirement where I am to show how many new Sessions (where new is defined as from a previously unseen IPAddress) arrive each day over a given period. Basically, each IPAddress should count only once in the results, namely for the day of the first session from the IPAddress. So I'm looking for a result like:
[Date] [New]
2009-10-01 : 11
2009-10-02 : 6
2009-10-03 : 19
..and so on
...which I can plot on some nice chart and show to important people. I would very much prefer a Linq2SQL query as that is what we are currently using for data access, but if I'm out of luck I may be able to go with some raw SQL (accessed via stored procedure, but I would really, really, really prefer Linq2SQL).
(As a bonus my next step will very likely be qualifying which sessions should be included by filtering on some of the other meta data)
Hoping that someone clever will help me out here...
I would use something like this.
var result = data.OrderBy(x => x.SessionStarted)
.GroupBy(x => x.IPAddress)
.Select(x => x.First())
.GroupBy(x => x.SessionStarted.Date)
.Select(x => new { Date = x.Key, New = x.Count() });
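To make the logic of that query explicit, here is the same two-step idea as a minimal Python sketch: find the earliest session per IP address, then count those first sessions per calendar day. The function name and the sample sessions are made up for illustration; this is not a Linq2SQL translation.

```python
from collections import Counter
from datetime import datetime

def new_sessions_per_day(sessions):
    """sessions: iterable of (started_at, ip). Returns {date: number of first-time IPs}."""
    first_seen = {}
    for started_at, ip in sessions:
        # Keep only the earliest session start for each IP address.
        if ip not in first_seen or started_at < first_seen[ip]:
            first_seen[ip] = started_at
    counts = Counter(ts.date() for ts in first_seen.values())
    return dict(sorted(counts.items()))

sessions = [
    (datetime(2009, 10, 1, 8, 0), "10.0.0.1"),
    (datetime(2009, 10, 1, 9, 0), "10.0.0.2"),
    (datetime(2009, 10, 2, 8, 0), "10.0.0.1"),  # repeat visitor, not new
    (datetime(2009, 10, 2, 9, 0), "10.0.0.3"),
]

result = new_sessions_per_day(sessions)
```

Any filtering on the other metadata needs care: filtering before the first-seen step changes which session counts as an IP's first, while filtering after it only drops days or IPs from the final counts.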