Moving Quandl data into MySQL

I like using MySQL to power my backtests. It’s just the data store that I’m fastest with. So, I naturally wanted to load data that we bought from into MySQL for analysis. The problem is that the file is just too big to input (via LOAD DATA INFILE) directly, and the solution is breaking it up into chunks. It’s easy using GNU split – and here, 100,000 rows is a good size.

You’ll also notice how I designed the table with indexes:

CREATE TABLE `adj_eod` ( `id` int(11) NOT NULL AUTO_INCREMENT, `symbol` varchar(10) DEFAULT NULL, `date` date DEFAULT NULL, `open` decimal(35,20) DEFAULT NULL, `high` decimal(35,20) DEFAULT NULL, `low` decimal(35,20) DEFAULT NULL, `close` decimal(35,20) DEFAULT NULL, `volume` int(11) DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE KEY `unique_index` (`symbol`,`date`), KEY `symbol` (`symbol`), KEY `date` (`date`) )

I generally like to look up information by symbol and date, so I added keys there. The unique index key is descriptive and also valuable for decreasing lookup times (though there is an insertion penalty).

Two more notes: sometimes I was only interested in using data from a certain time period, so I filtered it with the awk utility. And I wasn’t interested in certain rows in the Quandl CSV, so I told MySQL’s LOAD DATA INFILE to discard them by marking them @dummy

Anyway, here’s the script.

I welcome comments and suggestions for improvement

Calculating Internal Rate of Return For a Portfolio of Stocks

In evaluating business decisions, it is often valuable to calculate the Internal Rate of Return (IRR) for an investment strategy. The IRR is the rate of return where the strategy’s Net Present Value (NPV) equals zero.

Now, say that you’re given a stock portfolio and the stocks have appreciated over time. How do you calculate the IRR? It’s not entirely clear. Most of the formulas available rely on cash flows, and that’s not necessarily an intuitive jump from a series of total portfolio values. So what do we do?

The answer is to transform the difference in portfolio value over time into a series of cash transactions: appreciation or depreciation of the portfolio. Here’s an example:

Say your portfolio is worth $5,000 in January 2000, then $7,000 in January 2001, then $4,000 in January 2002, and then $5,000 in January 2003. To transform this into a series of cashflows, for manipulation inside our favorite programming environment, our task is to transform: [5000, 7000, 4000, 5000] into something appropriate for analysis.

When your portfolio is constructed with $5,000 it must be funded with a $5,000 deposit. Therefore, the first value is transformed to -5,000 to reflect that your sum total worth decreased by 5,000. The next value is 7,000, because we have to reflect that your sum total worth increased by 7,000 – that’s what our 5,000 turned into. Each subsequent item gets mapped to the difference between the previous value.

Therefore, in the end, your cashflows will look like this: [-5000, 7000, -3000, 1000]. From there, you can use any number of freely available tools for determining your Internal Rate of Return. In the Ruby programming language, I recommend the gems finance and xirr, available on GitHub. Then you can just do something like this: