Access to the name of the dataset
Author: sedelstein
Creation Date: 3/13/2012 6:51 PM

sedelstein

#1
I have two different datasets, say NAS100 and SP100. I am writing some results to a file, and I would like the filename to be based on the dataset, say NAS100_results and SP100_results.

I see methods for getting the symbols within a dataset, e.g. DataSetSymbols[0], but not the name of the dataset itself.

Is it available?

Can it also be set within a script so a single script can run first on dataset 1 and then on dataset 2?

Thanks

Eugene

#2
QUOTE:
Is it available?

No, it's not available natively, but it is available as a "hack" in Community Components:

GetDataSetName
GetDataSource

QUOTE:
Can it also be set within a script so a single script can run first on dataset 1 and then on dataset 2?

What is the purpose? How do you plan to use this, Steve?

sedelstein

#3
Hi Eugene

Sure thing. It's a little involved, so bear with me.

I've got some code using the runDonor methods in Community Components. For a given dataset, it lets me execute a loop over multiple strategies and write to .csv files many of the performance stats often found in visualizers, plus some of my own that I've written. (I do this for newly minted alerts this morning and for all symbols in the datasets over differing time periods.) I then do some statistical analysis to determine whether any of the past measures provide any predictability for future results, which helps me choose the trades I care to take, e.g. by calculating the "Information Coefficient" (see the papers of Grinold and Kahn mentioned in a different thread).

I've found that as the datasets grow in size (number of symbols), performance noticeably bogs down (hence my question in another thread about the loop over the 4 result types, long, short, all, and buy and hold, which writes my .csv files 4 times). The code runs perfectly well for small (10-symbol) datasets but gives me "object not found" messages in the debug window for larger datasets. (My Visual Studio C# debugging skills, while improving, are not quite there yet.) My thought was that if I could break the larger datasets down into smaller ones, I could execute an outer loop over these smaller datasets and write the results to files that have the smaller dataset names embedded in the file name,

e.g.
System1_Dataset1_results.csv,
System1_Dataset2_results.csv,
System2_Dataset1_results.csv,
System2_Dataset2_results.csv, etc...

Perhaps a bit convoluted but it gets the job done
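
For what it's worth, the splitting and naming idea above could be sketched in plain C# roughly like this. It is only a sketch and uses no Wealth-Lab calls: the class and method names (ResultFileNaming, SplitIntoChunks, BuildResultFileName) are made up for illustration, and the symbol list is assumed to be available already as a list of strings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of Wealth-Lab or Community Components.
public static class ResultFileNaming
{
    // Splits a long symbol list into consecutive chunks of at most chunkSize symbols.
    public static List<List<string>> SplitIntoChunks(IList<string> symbols, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        var chunks = new List<List<string>>();
        for (int i = 0; i < symbols.Count; i += chunkSize)
        {
            int count = Math.Min(chunkSize, symbols.Count - i);
            var chunk = new List<string>(count);
            for (int j = 0; j < count; j++) chunk.Add(symbols[i + j]);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Builds a name such as "System1_Dataset2_results.csv",
    // replacing any character that is not allowed in a file name.
    public static string BuildResultFileName(string systemName, string dataSetName)
    {
        string name = systemName + "_" + dataSetName + "_results.csv";
        foreach (char c in Path.GetInvalidFileNameChars())
            name = name.Replace(c, '-');
        return name;
    }
}
```

Each chunk would then play the role of one "smaller dataset", with its own result file per system.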

sedelstein

#4
I'm happy to share the basic code if you have interest and can help me "fix" the problem.
I'd also welcome thoughts from the user community about pursuing and improving these methodologies

sedelstein

#5
Hi Eugene

Even though skalman99 and I were talking about slightly different topics here, is there a way to have this code write the files only once (and not 4x)?

Eugene

#6
Thanks for the explanation.
QUOTE:
Can it also be set within a script so a single script can run first on dataset 1 and then on dataset 2?

Yes. On your side, it will take a modification of our open source code. If you examine the familiar runDonorNew() method, there's a block, commented as being responsible for executing as a Multi-Symbol Backtest, that builds a List<Bars>.

I think the following has a chance to work out: find a way to pass either a) the DataSet returned by GetDataSource above or b) a List<string> to the runDonorNew() method. The implementation is entirely up to you.

The idea is to call the runDonorNew() method each time with a new list of symbols representing your DataSet1, DataSet2 etc. Voilà.
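
A rough sketch of that outer loop, with no Wealth-Lab calls in it: the RunOnSymbols delegate is a hypothetical stand-in for whatever your modified runDonorNew() ends up looking like (its real signature is not shown in this thread), and the dataset names and symbol lists are assumed to be supplied by you.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical driver, not part of Wealth-Lab or Community Components.
public static class DataSetLoop
{
    // Stand-in for a user-modified runDonorNew() that accepts a symbol list (assumption).
    public delegate void RunOnSymbols(string strategyName, List<string> symbols, string outputFile);

    // Calls run once per (strategy, dataset) pair and returns the number of runs made.
    public static int RunAll(
        IList<string> strategyNames,
        IDictionary<string, List<string>> dataSets,
        RunOnSymbols run)
    {
        int runs = 0;
        foreach (string strategy in strategyNames)
        {
            foreach (KeyValuePair<string, List<string>> dataSet in dataSets)
            {
                // One result file per strategy and dataset, e.g. System1_Dataset1_results.csv
                string outputFile = strategy + "_" + dataSet.Key + "_results.csv";
                run(strategy, dataSet.Value, outputFile);
                runs++;
            }
        }
        return runs;
    }
}
```

Passing the runner in as a delegate keeps the loop independent of how the symbol list is turned into a List<Bars> inside the modified method.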

QUOTE:
The code runs perfectly well for small (10-symbol) datasets but gives me "object not found" messages in the debug window for larger datasets.

As Cone and I pointed out here (I'm getting perplexed by all these crosslinks caused by splitting a discussion across different threads...), it looks like you're erroneously trying to execute DataSetSymbols-looping code as a Multi-Symbol Backtest. This may well be the reason behind both the error messages and the extra looping.

QUOTE:
is there a way to have this code write the files only once (and not 4x)?

Steve, you know perfectly well how to confuse me. :) In skalman99's topic you were talking about the WealthScript (sic!) solution; here you've shifted gears towards a performance visualizer solution.

My reply here from 3/12/2012 10:50 AM still stands, and there's nothing to add. Please re-read it carefully. I believe it contains enough food for thought and for your experiments. Good luck with your venture.

sedelstein

#7
Seems like we confuse each other! ;->) Not intentionally, of course, but I think we are looking at things from different perspectives, and only one of us knows how WL really works internally. I'm not intentionally splitting the topics: each is a sub-topic in the broader project, the different parts seem (to me at least) to be related to different parts of WL, and I was told before to keep to the right sub-forum.

I feel as if I can get to where I'm going because most of the tools are there; I just need a little more than the hint that things are self-explanatory if you read quite a few lines of code. I appreciate your efforts to date.

Regards.

sedelstein

#8
Eugene

Where can I learn more about what you mentioned above: DataSetSymbols-looping code versus Multi-Symbol Backtest code?

Eugene

#9
For example, in the Wiki FAQ:

FAQ | Strategies and WealthScript > "My code with DataSetSymbols runs incorrectly or can't run on more than a few symbols."

Although it isn't a rotation Strategy, the principle applies.

sedelstein

#10
Eugene,

I think your last comment was helpful. I might suggest that the answer to that FAQ question also point out that examining the SetContext method would be helpful.

sedelstein

#11
Hi Eugene

Is it possible to "Set" the dataset that the execute method will use within the execute method?

I have some code here, which is not quite working because it returns only one dataset name multiple times (once per symbol).

I am looking for all the dataset names in the directory, and then I might want to run the strategy on a few of these datasets.

CODE:
Please log in to see this code.

Eugene

#12
QUOTE:
Is there a reason the code above doesn't return all the names of the datasets properly?

Because the code does not make sense. You don't need GetDataSetName at all, just:

CODE:
Please log in to see this code.


Edit: corrected a string.

sedelstein

#13
Thank you. Will do

Eugene

#14
QUOTE:
Is it possible to "Set" the dataset that the execute method will use within the execute method?


Here's where I answered this duplicate question:

Accessing SystemResults

Namely my reply from 2/2/2012 8:21 AM.

sedelstein

#15
I see the code

CODE:
Please log in to see this code.


Does this mean I will need to create my own modified copy of the runDonor method (with the code above changed) to make this work?

sedelstein

#16
Or can mySymbol below, which I've been using as a single symbol,

CODE:
Please log in to see this code.


be a List<Bars>() ?

Eugene

#17
Congratulations, you found the right code spot. It's entirely up to you how to modify it so it suits your needs.