Optimization Performance with Intraday Data
Author: ronc
Creation Date: 10/17/2013 1:13 AM

ronc

#1
I have been running optimizations using a 1-Day time scale. I just started doing intra-day simulations/optimizations.

1. The intra-day optimizations run much slower than expected. Using a 1-minute time scale they run much slower than a simulation that uses a 1-day time scale but the same number of bars, i.e. the number of days in the daily-scale simulation is roughly the same as the number of minutes in the 1-minute simulation. Is this expected behavior?

To speed things up I tried coarser time scales - 3 min, 5 min, 10 min, 60 min, thinking that these would yield fewer bars and thus a faster optimization. However, the optimization time seems to remain roughly constant regardless of the time scale. Am I doing something wrong?

2. Again in an effort to speed things up, I would like to optimize over just a single trading day. However, the UI requires the start date to differ from the end date, which makes the minimum simulation duration 2 days. Is there any way to do a 1-day simulation?

Eugene

#2
First, have you disabled on-demand data updates and updated your DataSet before running optimizations? You should, as the on-demand update feature is the primary source of performance degradation.

Naturally, working with large sets of intraday data is slower, but how much is "much", i.e. what exactly did you expect? Wealth-Lab has to load the entire 1-minute Bars object into memory (before cropping it to 1440 bars), which is much bigger than a Daily data file, and I think it does so for every parameter combination being optimized.

Finally, don't rule out what you're optimizing. The source of the slowdown can be in your algorithms, their implementation in code, some open issue, etc.

ronc

#3
Yes, on-demand data update is disabled. I am using synthetic symbols I created myself, so there is nothing to update.

The algorithms/code are the same ones I use for daily-time-scale optimizations. I am using the same strategy file and just switching between daily and intra-day time scales in the UI, so I would think any algorithm issue would be common to both daily and intra-day. Once the simulation gets going on intra-day data it runs so slowly that the optimizations I have been doing on daily time scales become infeasibly slow with any intra-day time scale.

Eugene

#4
I would not assume anything just because it works for you on Daily. We don't know a thing about your algorithms/code, and in my experience, user code and incorrect expectations are primary sources of problems.

Help us help you. How slow is "so slow"? How can we duplicate it? Show us the code you're running. Is there a dependence on the amount of data being loaded (e.g. 1-minute data of a recent IPO vs. 1-minute data of a stock with a long history such as MSFT or AAPL)? Which optimizer are you using, and what settings are applied? Don't hesitate to be verbose.

ronc

#5
Sorry to be out of touch on this thread. I am using the genetic optimizer and loading 1-minute data for Dow stocks. I am (almost) convincing myself that there is no technical issue here, just a lot more data compared to daily simulations. My challenge is that I typically train/optimize on about 6 months of data on a daily time scale, which is about 184 bars. With 1-minute data, each day is 900 bars, and the minimum time span the WL UI lets me set is 2 days, so my minimum 1-minute test comprises 1800 bars, which is 10X larger than my typical daily tests and hence much slower. I could go to larger intraday time increments, but is there not a way to have WL test on only 1 day, i.e. start day = end day? I really do not have a need to go beyond 1 day (in fact it interferes with my approach, which is aimed at day trading).

Thanks!

Eugene

#6
Well, I see no problem here as switching to 1-min scale naturally loads much more data, explaining the relative slowness.

To load 1 (the last) day of data you could try using the "Number of Bars" option. Of course, should a symbol not have traded for even a minute that day, choosing this option would also "grab" data from the previous trading session.

However, I don't think you're moving in the right direction by trying to load exactly 1 day of data, because the overhead of doing so is about the same as loading, say, 2 days. To speed things up you might want to use an SSD; if you're not an SSD user by now, I'm a proponent of replacing the HDD with an SSD.

If it's your strategy that processes so slowly that a single day counts, then you can still load up 2 days of data but avoid processing the previous day by using a trick like this:

How do I eliminate sell alerts from buys prior to activating my strategy
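
Roughly, the idea looks like this. A minimal sketch (not the code from that thread; the class name and entry rule are placeholders) that loads 2 days of data but only generates signals on bars belonging to the last calendar date:

CODE:
using System;
using WealthLab;

public class LastDayOnly : WealthScript
{
	protected override void Execute()
	{
		// Last calendar date present in the loaded data
		DateTime lastDay = Bars.Date[Bars.Count - 1].Date;

		for (int bar = 1; bar < Bars.Count; bar++)
		{
			// Skip the warm-up session(s): no signals before the final day
			if (Bars.Date[bar].Date < lastDay)
				continue;

			if (!IsLastPositionActive)
			{
				// Placeholder entry rule
				if (Close[bar] > Open[bar])
					BuyAtMarket(bar + 1);
			}
			else
			{
				SellAtClose(bar + 1, LastPosition);
			}
		}
	}
}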

Finally, consider limiting the number of optimizable parameters (I see you're a fan of optimizations) and/or increasing the step.

akuzn

#7
Let me suggest some ideas that I use.
Nothing special; I am not a pro programmer, just an experienced manager.

All trading begins with intraday, and it is absolutely right to develop algo solutions on an intraday timeframe and then simplify or adjust them with portfolio management on EOD and higher periods.
For example, I use real-time intraday strategies with stop and take-profit orders computed and placed in any case.
That is good insurance when there is a probability of connection problems, which may happen at any unpredictable moment (old trading school).
I found it much more comfortable, and perhaps more portable, to compute stop levels and check whether price is above/below these levels before placing stops, instead of checking the bool value returned by the StopAt methods.

1. I've rewritten some methods to improve execution and backtest speed.
For example, I don't use BBUpper and BBLower; it is better to compute the SMA and then build the bbUp and bbDown series as bbUp = sma + StdDev.
Btw, if you need StdDev it will be computed only once too.
2.1. I found that, for example, av = (H+L+C+O)/4 computes 2-4 times more slowly than another method.
Something like:
CODE:
Please log in to see this code.

If you want to compute ds = ds1 +/- ds2, it is better to replace it by analogy.
Maybe Eugene will override some operators later.
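
As an illustration of the idea (a sketch only, not the snippet referenced above; the class and series names are made up), a single loop writing into one DataSeries avoids the intermediate series that each overloaded operator allocates:

CODE:
using System.Drawing;
using WealthLab;

public class AvgPriceDemo : WealthScript
{
	protected override void Execute()
	{
		// Operator form: every '+' and the '/' builds a whole intermediate DataSeries
		// DataSeries av = (High + Low + Close + Open) / 4;

		// Single-pass form: one allocation, one loop over the price series
		DataSeries av = new DataSeries(Bars, "(H+L+C+O)/4");
		for (int bar = 0; bar < Bars.Count; bar++)
			av[bar] = (High[bar] + Low[bar] + Close[bar] + Open[bar]) / 4.0;

		PlotSeries(PricePane, av, Color.Blue, LineStyle.Solid, 1);
	}
}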

You may also try to use the Parallel methods for computing; that may improve speed too.
At least my 45,000 bars are computed in 800-2000 ms depending on how many DataSeries are used - there may be up to 50-60 of them.
And optimizations still take 13-18 hours, so now you may see why I had to find solutions.

I also tested Tasks, but I'm not sure about a breakthrough because I usually optimize 6-8 strategies on a 4-core CPU, so I'm not sure it can greatly improve speed under Wealth-Lab. I think Wealth-Lab works with threads too. Not yet sure I can improve anything there.
But I've tried some GA computing examples in parallel threads started outside Wealth-Lab - they really do run 2 times faster in parallel execution (Microsoft examples and SmartThreadPool in multicore mode). But these examples don't involve any large DataSeries.

You can find these examples and source code on MSDN, CodePlex and CodeProject.

But it seems some improvements to DataSeries computation may really help.
Don't forget to use Clear.Cache() or assign null to arrays - it frees memory and sometimes really improves speed.

I even tried to place some values and series in a struct outside the classes (in general a value type is placed on the stack, which is itself quicker and is quickly freed too), or even to turn the Strategy class into a struct - just to see what would happen. Not sure, but maybe a 3-5% additional speed improvement.))

I hope Eugene won't criticize me.

Eugene

#8
Alexey,
Thank you for your thoughtful post, as always. Some quick comments, at the risk of straying into off-topic territory:

QUOTE:
Btw, if you need StdDev it will be computed only once too.

Bars.Cache won't let it compute more than once anyway.

QUOTE:
Maybe Eugene will override some operators later.

If only it took just me. ;) I've already submitted a suggestion to the Fidelity developers to parallelize the DataSeries math operators. The ball is in their court.

akuzn

#9
I just tested SMA, AMA, StdDev, RSI and some other indicators with optimized code execution.
The real result impressed me a lot - 2 to 8 times faster.
And that was without any unsafe code, inline C++ or assembler methods.
I just improved the moving window in the indicators and tried to replace excessive DataSeries and array member accesses by keeping their values in single variables.
Just as an example, the following code runs much faster than the standard indicator (1.5-5 times faster depending on how loaded the CPU is).
CODE:
Please log in to see this code.
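
As a generic sketch of the moving-window idea only (an illustration, not the posted snippet; the class and variable names are made up): keep a running sum, add the newest value and drop the oldest, instead of re-summing the whole period on every bar.

CODE:
using System.Drawing;
using WealthLab;

public class RollingSmaDemo : WealthScript
{
	protected override void Execute()
	{
		int period = 20;
		DataSeries sma = new DataSeries(Bars, "rollingSMA(" + period + ")");

		double sum = 0;
		for (int bar = 0; bar < Bars.Count; bar++)
		{
			sum += Close[bar];                 // newest value enters the window
			if (bar >= period)
				sum -= Close[bar - period];    // oldest value leaves the window
			if (bar >= period - 1)
				sma[bar] = sum / period;
		}

		PlotSeries(PricePane, sma, Color.Red, LineStyle.Solid, 1);
	}
}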

The optimized StdDev code, for example, runs 3 to 8 times faster.
It seems to me there are many hidden possibilities for optimization.

akuzn

#10
The suggested SMA example is not good, because actually my SMA version gives the same result.
Also, the timing in that example started after memory allocation rather than at method entry. I moved stopwatch.Start() to before the memory allocation and the result is equal to the standard SMA.Series.


But I have done some work and created some oscillator indicators, and for that I have rewritten many standard indicators: StdDev, RSIs, some averages - about 12 methods in all. The 100-200% speed improvement comes from parallelizing loops where there is no dependency between consecutive data points, plus some code improvements that avoid excessive memory addressing by using only local block variables, etc.
In general each series computes 4-8 times faster depending on CPU load.
Funny, but my StdDev version showed just 3 ms while the standard Wealth-Lab method took 45 ms (10 ms versus 120 ms, 6 versus 90, and so on) on 45,000 bars during 5 parallel optimizations. It seems there are too many memory accesses and the CPU can't optimize them.
There is an additional effect - half of the memory is free now, where previously it was occupied, so now I can use many more DataSeries than before.
In my opinion, you should improve the standard indicator computations and the optimizers.
It seems to me Wealth-Lab doesn't use parallel threading for DataSeries computation. For example, you could use parallel invocation, which works better than Tasks and doesn't require thread pool management. I use it in my code and it gives about a 40% speed improvement.

I've tested some examples from MSDN with parallel evolutionary optimization - they really do compute 100% faster relative to computing without parallel threads.
Maybe it is a difficult task to manage threads and optimize parallel GA execution, but if you could recompile Wealth-Lab with parallel invocation of strategies, I think it would be great.
It would take just seconds if you used separate methods (blocks) for DataSeries computation and their drawing.
Just as an example:
CODE:
Please log in to see this code.
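
As a rough sketch of the kind of parallel invocation meant here (an illustration, not the posted code; the series names are made up, and note the thread-safety caveat below - each action should only read the shared price data and write to its own output series):

CODE:
using System.Threading.Tasks;
using WealthLab;

public class ParallelFillDemo : WealthScript
{
	protected override void Execute()
	{
		int n = Bars.Count;
		DataSeries avg = new DataSeries(Bars, "avgPrice");
		DataSeries range = new DataSeries(Bars, "range");

		// Two independent fills: each action reads the price series and
		// writes only to its own output, so they can run on separate cores.
		Parallel.Invoke(
			() => { for (int b = 0; b < n; b++) avg[b] = (High[b] + Low[b] + Close[b] + Open[b]) / 4.0; },
			() => { for (int b = 0; b < n; b++) range[b] = High[b] - Low[b]; }
		);

		// ... use avg and range in the usual trading loop ...
	}
}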

Eugene

#11
QUOTE:
Maybe it is a difficult task to manage threads and optimize parallel GA execution, but if you could recompile Wealth-Lab with parallel invocation of strategies, I think it would be great.

It's not that simple: not all internal methods are thread-safe.

QUOTE:
It seems to me Wealth-Lab doesn't use parallel threading for DataSeries computation.

That's right. The Parallel methods were not available before .NET 4.0, and I won't be far from the truth in saying that the Wealth-Lab indicators were coded somewhere between late 2006 and mid 2007. Since then, they've seen only bug fixes but no groundbreaking changes.

In reality, many traders prefer to stick with KISS systems and therefore won't necessarily share your need to squeeze out every bit of RAM and every CPU cycle. However, at MS123 we really appreciate your suggestions on optimization and parallel calculations and want to make Wealth-Lab even faster than it currently is. In fact, thanks to you, our product backlog includes tasks for parallelizing Community Components, Community Indicators and TASCIndicators, and Fidelity's backlog now has an official feature request for parallelizing the DataSeries class and the built-in WealthLab.Indicators library. I expect our part to be finished sometime during 2014, while the Wealth-Lab thick client modification is currently not being considered for the upcoming release.

akuzn

#12
It wasn't too difficult to modify some of the series computations. But what I can't do is improve the optimization speed.
I'm not sure, but it seems to me Wealth-Lab could run another 2-4 times faster.

Eugene

#13
QUOTE:
But what I can't do is improve the optimization speed.


From the horse's mouth: the developers would really like to improve it as well but it's a major enhancement effort that should be treated accordingly (would require management approval and prioritization).

akuzn

#14
I've heard many conversations about C++, native C, etc.
But I was always sure that is not so important. Thanks to you, I now know that C# in the right hands can be a dangerous weapon.)
So I'm waiting for the new look of Wealth-Lab and especially for improved optimization speed. It's needed especially for WFO.