Assume I have a DataFrame sales of timestamp values:
timestamp            sales_office
2014-01-01 09:01:00  Cincinnati
2014-01-01 09:11:00  San Francisco
2014-01-01 15:22:00  Chicago
2014-01-01 19:01:00  Chicago
I would like to create a new column, time_hour. I can create it by writing a short function like so and using apply() to call it on each element:
def hr_func(ts):
    return ts.hour
sales['time_hour'] = sales['timestamp'].apply(hr_func)
I would then see this result:
timestamp            sales_office   time_hour
2014-01-01 09:01:00  Cincinnati     9
2014-01-01 09:11:00  San Francisco  9
2014-01-01 15:22:00  Chicago        15
2014-01-01 19:01:00  Chicago        19
What I'd like to achieve is some shorter transformation like this (which I know is erroneous but gets at the spirit):
sales['time_hour'] = sales['timestamp'].hour
Obviously the column is of type Series and as such doesn't have those attributes, but it seems there should be a simpler way that makes use of matrix operations.
Is there a more-direct approach?
Since the quickest, shortest answer is in a comment (from Jeff) and has a typo, here it is corrected and in full:
sales['time_hour'] = pd.DatetimeIndex(sales['timestamp']).hour
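To make the one-liner above concrete, here is a minimal sketch that rebuilds the question's table and applies it. The sample DataFrame is reconstructed from the rows shown in the question; it assumes the timestamp column already holds datetime values.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
sales = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2014-01-01 09:01:00",
        "2014-01-01 09:11:00",
        "2014-01-01 15:22:00",
        "2014-01-01 19:01:00",
    ]),
    "sales_office": ["Cincinnati", "San Francisco", "Chicago", "Chicago"],
})

# Wrapping the column in a DatetimeIndex exposes .hour for all rows at once,
# so no per-row apply() call is needed.
sales["time_hour"] = pd.DatetimeIndex(sales["timestamp"]).hour
print(sales)
```

The hours are assigned back by position, so the new column lines up with the original rows.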
Now we can use:
sales['time_hour'] = sales['timestamp'].apply(lambda x: x.hour)
For posterity: as of pandas 0.15.0, there is a handy .dt accessor you can use to pull such values from a datetime/period Series (in the above case, just sales.timestamp.dt.hour)!
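A short sketch of the .dt accessor on the question's data, assuming the column already has a datetime dtype. The time_minute column is only an illustrative extra and is not part of the original question.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
sales = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2014-01-01 09:01:00",
        "2014-01-01 09:11:00",
        "2014-01-01 15:22:00",
        "2014-01-01 19:01:00",
    ]),
    "sales_office": ["Cincinnati", "San Francisco", "Chicago", "Chicago"],
})

# .dt exposes datetime components element-wise on a Series.
sales["time_hour"] = sales["timestamp"].dt.hour

# Other components work the same way (illustrative extra column).
sales["time_minute"] = sales["timestamp"].dt.minute
print(sales)
```

This is close to the sales['timestamp'].hour spelling the question was hoping for, with .dt inserted before the attribute.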
You can use a lambda expression, e.g.:
sales['time_hour'] = sales.timestamp.apply(lambda x: x.hour)
You can try this:
sales['time_hour'] = pd.to_datetime(sales['timestamp']).dt.hour
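The pd.to_datetime call matters mostly when the column holds strings rather than datetimes, in which case .dt cannot be used on it directly. Here is a sketch under that assumption (for example, timestamps read from a CSV file without date parsing); the variable name parsed is just for illustration.

```python
import pandas as pd

# Assumption: the timestamps are plain strings, not datetime values.
sales = pd.DataFrame({
    "timestamp": [
        "2014-01-01 09:01:00",
        "2014-01-01 09:11:00",
        "2014-01-01 15:22:00",
        "2014-01-01 19:01:00",
    ],
    "sales_office": ["Cincinnati", "San Francisco", "Chicago", "Chicago"],
})

# Parse the strings into datetimes first; the result supports .dt.
parsed = pd.to_datetime(sales["timestamp"])
sales["time_hour"] = parsed.dt.hour
print(sales["time_hour"].tolist())
```

If the column is already a datetime column, the conversion is harmless, so the same line works in both cases.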
Here is a simple solution, where df is your DataFrame (sales in the question):
import pandas as pd
# convert the timestamp column to datetime
df['timestamp'] = pd.to_datetime(df['timestamp'])
# extract the hour from the timestamp column to create a time_hour column
df['time_hour'] = df['timestamp'].dt.hour
Source: Stackoverflow.com