Pandas concat: ValueError: Shape of passed values is blah, indices imply blah2

Question

I'm trying to merge a (Pandas 14.1) dataframe and a series. The series should form a new column, with some NAs (since the index values of the series are a subset of the index values of the dataframe).

This works for a toy example, but not with my data (detailed below).

Example:

    import pandas as pd
    import numpy as np

    df1 = pd.DataFrame(np.random.randn(6, 4),
                       columns=['A', 'B', 'C', 'D'],
                       index=pd.date_range('1/1/2011', periods=6, freq='D'))

    df1

                       A         B         C         D
    2011-01-01 -0.487926  0.439190  0.194810  0.333896
    2011-01-02  1.708024  0.237587 -0.958100  1.418285
    2011-01-03 -1.228805  1.266068 -1.755050 -1.476395
    2011-01-04 -0.554705  1.342504  0.245934  0.955521
    2011-01-05 -0.351260 -0.798270  0.820535 -0.597322
    2011-01-06  0.132924  0.501027 -1.139487  1.107873

    s1 = pd.Series(np.random.randn(3), name='foo',
                   index=pd.date_range('1/1/2011', periods=3, freq='2D'))

    s1

    2011-01-01   -1.660578
    2011-01-03   -0.209688
    2011-01-05    0.546146
    Freq: 2D, Name: foo, dtype: float64

    pd.concat([df1, s1], axis=1)

                       A         B         C         D       foo
    2011-01-01 -0.487926  0.439190  0.194810  0.333896 -1.660578
    2011-01-02  1.708024  0.237587 -0.958100  1.418285       NaN
    2011-01-03 -1.228805  1.266068 -1.755050 -1.476395 -0.209688
    2011-01-04 -0.554705  1.342504  0.245934  0.955521       NaN
    2011-01-05 -0.351260 -0.798270  0.820535 -0.597322  0.546146
    2011-01-06  0.132924  0.501027 -1.139487  1.107873       NaN

The situation with the data (see below) seems basically identical: concatting a series with a DatetimeIndex whose values are a subset of the dataframe's. But it gives the ValueError in the title (blah1 = (5, 286), blah2 = (5, 276)).

Why doesn't it work?

    In [187]: df.head()
    Out[188]:
                             high       low     loc_h     loc_l
    time
    2014-01-01 17:00:00  1.376235  1.375945  1.376235  1.375945
    2014-01-01 17:01:00  1.376005  1.375775       NaN       NaN
    2014-01-01 17:02:00  1.375795  1.375445       NaN  1.375445
    2014-01-01 17:03:00  1.375625  1.375515       NaN       NaN
    2014-01-01 17:04:00  1.375585  1.375585       NaN       NaN

    In [186]: df.index
    Out[186]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2014-01-01 17:00:00, ..., 2014-01-01 21:30:00]
    Length: 271, Freq: None, Timezone: None

    In [189]: hl.head()
    Out[189]:
    2014-01-01 17:00:00    1.376090
    2014-01-01 17:02:00    1.375445
    2014-01-01 17:05:00    1.376195
    2014-01-01 17:10:00    1.375385
    2014-01-01 17:12:00    1.376115
    dtype: float64

    In [187]: hl.index
    Out[187]:
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2014-01-01 17:00:00, ..., 2014-01-01 21:30:00]
    Length: 89, Freq: None, Timezone: None

    In: pd.concat([df, hl], axis=1)
    Out: [stack trace] ValueError: Shape of passed values is (5, 286), indices imply (5, 276)

User · Answer

    To drop duplicate indices, use df = df.loc[df.index.drop_duplicates()]. C.f. pandas.pydata.org/pandas-docs/stable/generated/…
    (BallpointBen, Apr 18 at 15:25)

This is wrong, but I can't reply directly to BallpointBen's comment due to low reputation.

The reason it's wrong is that df.index.drop_duplicates() returns the unique index labels, but when you index back into the dataframe using those unique labels it still returns all records. I think this is because indexing with one of the duplicated labels returns every row that carries that label.

Instead, use df.index.duplicated(), which returns a boolean array (add the ~ to get the not-duplicated records):

    df = df.loc[~df.index.duplicated()]
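To make the difference concrete, here is a minimal sketch on a made-up three-row frame (the data and the variable names `by_labels` and `by_mask` are illustrative, not from the thread). It assumes a reasonably recent pandas in which `.loc` accepts a list-like of labels on a non-unique index.

```python
import pandas as pd

# Toy frame with the label "a" appearing twice (illustrative data only).
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "a", "b"])

# Selecting by the unique labels still returns every row carrying each label.
by_labels = df.loc[df.index.drop_duplicates()]

# The boolean mask keeps only the first row seen for each label.
by_mask = df.loc[~df.index.duplicated()]

print(len(by_labels), len(by_mask))
```

`Index.duplicated` also takes `keep="last"` or `keep=False` if the first occurrence is not the row you want to keep.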

User · Answer

Try sorting the index after concatenating them:

    result = pd.concat([df1, df2]).sort_index()

User · Answer

Your indexes probably contain duplicated values.

    import pandas as pd

    T1_INDEX = [
        0,
        1,  # <= if I write e.g. "0" here then it fails
        0.2,
    ]
    T1_COLUMNS = ['A', 'B', 'C', 'D']
    T1 = [
        [1.0, 1.1, 1.2, 1.3],
        [2.0, 2.1, 2.2, 2.3],
        [3.0, 3.1, 3.2, 3.3],
    ]

    T2_INDEX = [
        1.2,
        2.11,
    ]
    T2_COLUMNS = ['D', 'E', 'F']
    T2 = [
        [54.0, 5324.1, 3234.2],
        [55.0, 14.5324, 2324.2],
        # [3.0, 3.1, 3.2],
    ]

    df1 = pd.DataFrame(T1, columns=T1_COLUMNS, index=T1_INDEX)
    df2 = pd.DataFrame(T2, columns=T2_COLUMNS, index=T2_INDEX)

    print(pd.concat([pd.DataFrame({}), df2, df1], axis=1))
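If you suspect duplicates but do not know where they are, a small check along these lines can list the offending labels before you call concat. This is only a sketch: `duplicated_labels` is a hypothetical helper name, and the frame and series are toy data modelled on the example above.

```python
import pandas as pd


def duplicated_labels(obj):
    """Return the index labels that occur more than once in obj (hypothetical helper)."""
    idx = obj.index
    # keep=False marks every occurrence of a repeated label, not just the later ones.
    return idx[idx.duplicated(keep=False)].unique()


# Toy data: the label 0 is repeated in df, while s has a unique index.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0]}, index=[0, 0, 0.2])
s = pd.Series([10.0, 20.0], index=[1.2, 2.11], name="foo")

for name, obj in [("df", df), ("s", s)]:
    print(name, "unique index:", obj.index.is_unique,
          "duplicated labels:", list(duplicated_labels(obj)))
```

Which exception concat raises on such input seems to depend on the pandas version, so checking `index.is_unique` up front is more reliable than matching on the error message.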

User · Answer

My problem was different indices. The following code solved my problem:

    df1.reset_index(drop=True, inplace=True)
    df2.reset_index(drop=True, inplace=True)
    df = pd.concat([df1, df2], axis=1)
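One caveat worth illustrating: resetting the index throws the labels away, so concat then pairs rows by position rather than by label. That is fine when both objects have the same rows in the same order, but not for the subset case in the question. A minimal sketch with made-up data (the names `aligned` and `positional` are illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3]}, index=[10, 20, 30])
s = pd.Series([100, 300], index=[10, 30], name="foo")

# Label-based alignment: the row labelled 20 has no match and gets NaN.
aligned = pd.concat([df1, s], axis=1)

# Positional alignment after discarding the labels:
# the value 300 (originally at label 30) now sits next to A == 2.
positional = pd.concat(
    [df1.reset_index(drop=True), s.reset_index(drop=True)], axis=1
)

print(aligned)
print(positional)
```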

User · Answer

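I had a similar problem (join worked, but concat failed).

Check for duplicate index values in df1 and s1 (e.g. df1.index.is_unique).

Removing duplicate index values, e.g. with df.drop_duplicates(inplace=True) or one of the methods here https://stackoverflow.com/a/34297689/7163376, should resolve it. Note that drop_duplicates compares row values rather than index labels, so it only helps when the rows sharing an index label are also identical in every column.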
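The distinction between duplicate rows and duplicate index labels is easy to miss, so here is a small sketch on made-up data (the names `by_values` and `by_index` are illustrative). It assumes nothing beyond the standard `DataFrame.drop_duplicates` and `Index.duplicated` methods.

```python
import pandas as pd

# The label "a" is repeated, and the value 2 is repeated under different labels.
df = pd.DataFrame({"v": [1, 2, 2]}, index=["a", "a", "b"])

# drop_duplicates looks only at column values: it drops the second v == 2 row
# and leaves both rows labelled "a" in place.
by_values = df.drop_duplicates()

# Masking on the index removes the repeated label instead.
by_index = df[~df.index.duplicated()]

print(by_values.index.is_unique, by_index.index.is_unique)
```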

User · Answer

Aus_lacy's post gave me the idea of trying related methods, of which join does work:

    In [196]: hl.name = 'hl'
    Out[196]: 'hl'

    In [199]: df.join(hl).head(4)
    Out[199]:
                             high       low     loc_h     loc_l        hl
    2014-01-01 17:00:00  1.376235  1.375945  1.376235  1.375945  1.376090
    2014-01-01 17:01:00  1.376005  1.375775       NaN       NaN       NaN
    2014-01-01 17:02:00  1.375795  1.375445       NaN  1.375445  1.375445
    2014-01-01 17:03:00  1.375625  1.375515       NaN       NaN       NaN

Some insight into why concat works on the example but not this data would be nice though!
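A possible explanation, assuming (as other answers in this thread suggest) that df has repeated timestamps in its index: join does a left merge on the index, and a merge simply matches each left row against the right-hand side, so repeated labels are not a problem for it. The sketch below uses made-up data shaped like the question's; it is not the asker's actual frame.

```python
import pandas as pd

# Toy frame in which the 17:00 timestamp appears twice (an assumption about the real data).
df = pd.DataFrame(
    {"high": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(
        ["2014-01-01 17:00", "2014-01-01 17:00", "2014-01-01 17:01"]
    ),
)
hl = pd.Series([9.0], index=pd.to_datetime(["2014-01-01 17:00"]), name="hl")

# Each row of df is looked up in hl; both 17:00 rows receive 9.0, and 17:01 gets NaN.
joined = df.join(hl)
print(joined)
```

concat with axis=1, by contrast, has to build one combined index and reindex every input against it, which is the step that breaks down when a label is ambiguous.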

User · Answer

Maybe it is simple. If you have a DataFrame, make sure that both matrices or vectors you are trying to combine have the same row names (index).

I had the same issue, and I fixed it by changing the row indices so that they matched each other. Here is an example where a matrix (principal components) and a vector (target) have the same row indices; in the picture they are circled in blue on the left-hand side.

Before, when it was not working, the matrix had the normal row indices (0, 1, 2, 3) while the vector had the row indices (ID0, ID1, ID2, ID3). I then changed the vector's row indices to (0, 1, 2, 3) and it worked for me.
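Since the picture is not available, the relabelling described above might look roughly like this. The names `components` and `target` and all the values are made up for illustration, and the approach is only safe if both objects already hold the same rows in the same order.

```python
import pandas as pd

# Matrix with the default 0..3 index, vector with string labels (illustrative data).
components = pd.DataFrame(
    {"pc1": [0.1, 0.2, 0.3, 0.4], "pc2": [1.0, 2.0, 3.0, 4.0]}
)
target = pd.Series([0, 1, 0, 1], index=["ID0", "ID1", "ID2", "ID3"], name="target")

# Relabel a copy of the vector so its index matches the matrix row for row.
# This assumes the rows are in the same order; check the lengths at least.
assert len(target) == len(components)
target_relabelled = target.copy()
target_relabelled.index = components.index

combined = pd.concat([components, target_relabelled], axis=1)
print(combined)
```

Without the relabelling, concat would take the union of both indexes and return eight rows, half of them NaN in each column group.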
