This is a simple implementation of a data miner for traders with a graphical user interface. It is mostly aimed at forex traders but can be applied to any time series. It classifies the future price movement as buy, hold, or sell. To do this, four simple machine learning methods are implemented:
The features of the models are different technical trading indicators as well as principal components derived from them. The methods are based on the scikit-learn library. The implementation follows a walk-forward optimization scheme.
I developed the basic version for a client who ended up going in another direction, so I have permission to release this little tool to the public. I hope it helps some people with their trading.
Download (86.3MB, Windows):
Once you have downloaded and unpacked the .zip file, run AlgoMinr.exe. Please give the program some time to open, as it loads many libraries. After a little while, the following GUI will open:
Please note that a console window opens as well, where additional information is displayed. The first step is to load some data to work with. The program takes simple .csv files in which the first column holds the time stamps and the second column holds the mid-price. See this screenshot if the format isn’t clear.
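As a rough illustration of the expected input, the sketch below parses such a two-column file with Python's standard csv module. The sample rows, the timestamp format, and the absence of a header row are assumptions made for illustration; check the bundled example file for the actual layout.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows: time stamp in column 1, mid-price in column 2.
# The timestamp format is an assumption, not AlgoMinr's documented format.
SAMPLE = """2016-01-04 00:00:00,1.08550
2016-01-04 01:00:00,1.08612
2016-01-04 02:00:00,1.08590
"""

def read_prices(handle, time_format="%Y-%m-%d %H:%M:%S"):
    """Return a list of (timestamp, mid_price) tuples from a two-column CSV."""
    rows = []
    for record in csv.reader(handle):
        if len(record) < 2:
            continue  # skip blank or malformed lines
        rows.append((datetime.strptime(record[0], time_format), float(record[1])))
    return rows

prices = read_prices(io.StringIO(SAMPLE))
print(len(prices), prices[0])
```

To read a real file, pass an open file handle instead of the `io.StringIO` object and adjust `time_format` to match your data.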
The folder where AlgoMinr.exe is located also contains a folder “data”. It contains an example “.csv” file.
In the GUI, click on “Load file”, navigate to this folder, and load “input.csv”. (Note that a single click is sufficient to open a folder.)
Now you have to specify the periodicity of the loaded file. Please set it to 1 hour if you are using the example file provided.
You can now hit the “Run” button in the top right corner. Training and testing the models takes some time; once it has finished, the results appear on the right side of the window.
The top part shows the accuracy of each predictor in the most recent testing window. In the example screenshot, LDA predicted the correct price movement 61 percent of the time. If we assume that there is some stationarity in the forex time series, we expect this to be the best predictor for the next prediction.
The lower part shows the predictions for the next observation (meaning the one following the last observation in the data file). Because training can take quite a bit of time, you can reuse the trained models to make a new prediction by loading an up-to-date data file and clicking on the “Make prediction” button. This only applies the already trained models to the new data and might be helpful in deciding on a discretionary trade.
Please let me know if you have any critique or questions in the comments. Would you like to have further features?
The walk-forward optimization uses a training and a testing window. Initially, the predictors are trained for X training days and then tested on Y testing days. Once this is done, both windows are rolled forward by Y days and the process is repeated. The results of this operation can be found in the console window that is opened when starting the AlgoMinr application. Additionally, the folder containing AlgoMinr.exe will also contain a results.csv file which lists the accuracy of each predictor for the different windows. This can help to see if there is any kind of stability in the results.
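The rolling scheme described above can be sketched as a small generator. This is a minimal illustration rather than AlgoMinr's actual code: the function name is hypothetical, and window sizes are given in observations instead of days to keep the example short.

```python
def walk_forward_windows(n_obs, train_size, test_size):
    """Yield (train_start, train_end, test_end) index triples.

    Training covers [train_start, train_end) and testing covers
    [train_end, test_end). Both windows advance by test_size each cycle.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train_end = start + train_size
        yield start, train_end, train_end + test_size
        start += test_size

# Hypothetical sizes: 10 observations, train on 4, test on 2.
for train_start, train_end, test_end in walk_forward_windows(10, 4, 2):
    print(f"train [{train_start}, {train_end})  test [{train_end}, {test_end})")
```

With these numbers the loop produces three cycles. If the two windows together exceed the data, the generator simply yields nothing.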
The slider “Prediction time” defines how many observations into the future the prediction should reach. So if the frequency of the data is 5 minutes and the “Prediction time” is 5, the classifiers will look 25 minutes into the future.
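The sketch below shows the horizon arithmetic and one possible way to turn a price series into buy, hold, and sell labels for a given prediction time. The `threshold` parameter that defines the hold band is an assumption; this post does not say how AlgoMinr separates hold from buy and sell.

```python
def horizon_minutes(bar_minutes, prediction_time):
    """Minutes into the future covered by a prediction of `prediction_time` bars."""
    return bar_minutes * prediction_time

def label_moves(prices, prediction_time, threshold):
    """Label each bar by the price change `prediction_time` bars ahead.

    `threshold` is a hypothetical hold band: changes within +/- threshold
    are labelled "hold".
    """
    labels = []
    for i in range(len(prices) - prediction_time):
        change = prices[i + prediction_time] - prices[i]
        if change > threshold:
            labels.append("buy")
        elif change < -threshold:
            labels.append("sell")
        else:
            labels.append("hold")
    return labels

print(horizon_minutes(5, 5))  # 25
mid = [1.1000, 1.1004, 1.1012, 1.1003, 1.0990, 1.0991]  # made-up mid-prices
print(label_moves(mid, prediction_time=2, threshold=0.0005))
```

Note that the last `prediction_time` bars get no label, because their future price is not yet known.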
The slider “Train size (days)” sets the number of days in each training window.
The slider “Test size (days)” sets the number of days in each testing window. If the training size plus the testing size are too large for multiple cycles, only one cycle is performed. Be careful, however: if the sum is larger than what is available in the input file, an error occurs and the program terminates.
The input sliders define the look-back windows for the different technical indicators. You can test which sizes work best for you.
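To make the role of a look-back window concrete, here is a minimal simple moving average. It is only one example of an indicator with a look-back parameter; this post does not list which indicators AlgoMinr computes.

```python
def simple_moving_average(prices, lookback):
    """Mean of the last `lookback` prices, for each bar that has enough history."""
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    return [
        sum(prices[i - lookback + 1 : i + 1]) / lookback
        for i in range(lookback - 1, len(prices))
    ]

closes = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up prices
print(simple_moving_average(closes, 2))  # [1.5, 2.5, 3.5, 4.5]
print(simple_moving_average(closes, 4))  # [2.5, 3.5]
```

A longer look-back gives a smoother indicator but leaves fewer usable observations at the start of the file.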
The tick box at the bottom “Use non-linear transformations” is highly experimental and should probably not be used. It creates non-linear transformations from the technical indicators. This extends the feature space by several hundred inputs. Most of the time, this will only lead to overfitting and the accuracy in the testing windows suffers. Additionally, the large feature space slows down the training process significantly.
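To see how quickly such a feature space can grow, consider pairwise products as one possible non-linear transformation. This is an assumption chosen for illustration; the transformations AlgoMinr actually applies are not described here.

```python
from itertools import combinations_with_replacement

def pairwise_products(features):
    """Return all products x_i * x_j (i <= j) of one row of feature values."""
    return [a * b for a, b in combinations_with_replacement(features, 2)]

row = [0.5, 2.0, 4.0]  # three made-up indicator values
print(pairwise_products(row))  # [0.25, 1.0, 2.0, 4.0, 8.0, 16.0]
print(len(pairwise_products([1.0] * 20)))  # 210 extra inputs from 20 indicators
```

Even this single transformation turns 20 hypothetical indicators into 210 additional inputs, which illustrates why overfitting and slow training become a concern.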
The models and their parameters are saved to the hard disk, so they will be available the next time the program starts. If you only want to make a prediction, specify the new .csv file and hit the “Make prediction” button.
If the .csv file gets updated, you do not need to select it again with the load button. The program will just use the new data.
Training and testing windows to large for data file…
This error appears if the training and testing windows are too large for the specified file. Please also make sure that you specified the right periodicity for your data, as otherwise the training and testing sizes are not converted correctly from days into observations.
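A quick way to check your settings before running is to convert the window sizes from days into observations. The sketch below assumes a 24-hour trading day and hypothetical function names; how AlgoMinr performs this conversion internally is not documented here.

```python
def bars_per_day(bar_minutes):
    """Number of bars in one day, assuming a 24-hour trading day."""
    return (24 * 60) // bar_minutes

def windows_fit(n_rows, bar_minutes, train_days, test_days):
    """True if one training window plus one testing window fits in the file."""
    needed = (train_days + test_days) * bars_per_day(bar_minutes)
    return needed <= n_rows

# 2,000 hourly rows: 60 + 20 days need 1,920 bars, so the windows fit.
print(windows_fit(2000, 60, 60, 20))  # True
# The same file declared as 5-minute data: 80 days would need 23,040 bars.
print(windows_fit(2000, 5, 60, 20))   # False
```

The second call shows how a wrong periodicity setting alone can trigger the error, even when the file holds enough data.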