I'm somewhat new to pandas. I have a pandas data frame that is 1 row by 23 columns.
I want to convert this into a series? I'm wondering what the most pythonic way to do this is?
I've tried pd.Series(myResults)
but it complains ValueError: cannot copy sequence with size 23 to array axis with dimension 1
. It's not smart enough to realize it's still a "vector" in math terms.
Thanks!
data = pd.DataFrame({"a":[1,2,3,34],"b":[5,6,7,8]})
new_data = pd.melt(data)
new_data.set_index("variable", inplace=True)
This gives a dataframe with index as column name of data and all data are present in "values" column
You can retrieve the series through slicing your dataframe using one of these two methods:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html
import pandas as pd
import numpy as np
df = pd.DataFrame(data=np.random.randn(1,8))
series1=df.iloc[0,:]
type(series1)
pandas.core.series.Series
You can also use stack()
df= DataFrame([list(range(5))], columns = [“a{}”.format(I) for I in range(5)])
After u run df, then run:
df.stack()
You obtain your dataframe in series
Another way -
Suppose myResult is the dataFrame that contains your data in the form of 1 col and 23 rows
# label your columns by passing a list of names
myResult.columns = ['firstCol']
# fetch the column in this way, which will return you a series
myResult = myResult['firstCol']
print(type(myResult))
In similar fashion, you can get series from Dataframe with multiple columns.
You can transpose the single-row dataframe (which still results in a dataframe) and then squeeze the results into a series (the inverse of to_frame
).
df = pd.DataFrame([list(range(5))], columns=["a{}".format(i) for i in range(5)])
>>> df.T.squeeze() # Or more simply, df.squeeze() for a single row dataframe.
a0 0
a1 1
a2 2
a3 3
a4 4
Name: 0, dtype: int64
Note: To accommodate the point raised by @IanS (even though it is not in the OP's question), test for the dataframe's size. I am assuming that df
is a dataframe, but the edge cases are an empty dataframe, a dataframe of shape (1, 1), and a dataframe with more than one row in which case the use should implement their desired functionality.
if df.empty:
# Empty dataframe, so convert to empty Series.
result = pd.Series()
elif df.shape == (1, 1)
# DataFrame with one value, so convert to series with appropriate index.
result = pd.Series(df.iat[0, 0], index=df.columns)
elif len(df) == 1:
# Convert to series per OP's question.
result = df.T.squeeze()
else:
# Dataframe with multiple rows. Implement desired behavior.
pass
This can also be simplified along the lines of the answer provided by @themachinist.
if len(df) > 1:
# Dataframe with multiple rows. Implement desired behavior.
pass
else:
result = pd.Series() if df.empty else df.iloc[0, :]
Source: Stackoverflow.com