[python] Logarithmic returns in pandas dataframe

Python pandas has a pct_change function which I use to calculate the returns for stock prices in a dataframe:

ndf['Return']= ndf['TypicalPrice'].pct_change()

I am using the following code to get logarithmic returns, but it gives the exact same values as the pct.change() function:

ndf['retlog']=np.log(ndf['TypicalPrice'].astype('float64')/ndf['TypicalPrice'].astype('float64').shift(1))
#np is for numpy

This question is related to python pandas

The answer is


@poulter7: I cannot comment on the other answers, so I post it as new answer: be careful with

np.log(df.price).diff() 

as this will fail for indices which can become negative as well as risk factors e.g. negative interest rates. In these cases

np.log(df.price/df.price.shift(1)).dropna()

is preferred and based on my experience generally the safer approach. It also evaluates the logarithm only once.

Whether you use +1 or -1 depends on the ordering of your time series. Use -1 for descending and +1 for ascending dates - in both cases the shift provides the preceding date's value.


The results might seem similar, but that is just because of the Taylor expansion for the logarithm. Since log(1 + x) ~ x, the results can be similar.

However,

I am using the following code to get logarithmic returns, but it gives the exact same values as the pct.change() function.

is not quite correct.

import pandas as pd

df = pd.DataFrame({'p': range(10)})

df['pct_change'] = df.pct_change()
df['log_stuff'] = \
    np.log(df['p'].astype('float64')/df['p'].astype('float64').shift(1))
df[['pct_change', 'log_stuff']].plot();

enter image description here


Log returns are simply the natural log of 1 plus the arithmetic return. So how about this?

df['pct_change'] = df.price.pct_change()
df['log_return'] = np.log(1 + df.pct_change)

Even more concise, utilizing Ximix's suggestion:

df['log_return'] = np.log1p(df.price.pct_change())

Single line, and only calculating logs once. First convert to log-space, then take the 1-period diff.

    np.diff(np.log(df.price))

In earlier versions of numpy:

    np.log(df.price)).diff()