Get first letter of a string from column

Question

I'm fighting with pandas and for now I'm losing. I have a source table similar to this:

    import pandas as pd

    a = pd.Series([123, 22, 32, 453, 45, 453, 56])
    b = pd.Series([234, 4353, 355, 453, 345, 453, 56])
    df = pd.concat([a, b], axis=1)
    df.columns = ['First', 'Second']

I would like to add a new column to this data frame with the first digit of the values in column 'First':

a) convert the numbers in column 'First' to strings
b) extract the first character from each newly created string
c) save the results from b) as a new column in the data frame

I don't know how to apply this to the pandas data frame object. I would be grateful for help with that.

User · Answer

.str.get

This is the simplest way to apply string methods.

    # Setup
    import pandas as pd

    df = pd.DataFrame({'A': ['xyz', 'abc', 'foobar'], 'B': [123, 456, 789]})
    df

            A    B
    0     xyz  123
    1     abc  456
    2  foobar  789

    df.dtypes

    A    object
    B     int64
    dtype: object

For string (read: object) type columns, use

    df['C'] = df['A'].str[0]
    # Similar to
    df['C'] = df['A'].str.get(0)

.str handles NaNs by returning NaN as the output.

For non-string columns (numeric ones, for example), an .astype conversion is required beforehand, as shown in Ed Chum's answer.

    # Note that this won't work well if the data has NaNs.
    # It'll return lowercase "n" (the first character of "nan").
    df['D'] = df['B'].astype(str).str[0]

    df

            A    B  C  D
    0     xyz  123  x  1
    1     abc  456  a  4
    2  foobar  789  f  7

List Comprehension and Indexing

There is enough evidence to suggest a simple list comprehension will work well here and probably be faster.

    # For string columns
    df['C'] = [x[0] for x in df['A']]

    # For numeric columns
    df['D'] = [str(x)[0] for x in df['B']]

    df

            A    B  C  D
    0     xyz  123  x  1
    1     abc  456  a  4
    2  foobar  789  f  7

If your data has NaNs, then you will need to handle this appropriately with an if/else in the list comprehension.

    import numpy as np

    df2 = pd.DataFrame({'A': ['xyz', np.nan, 'foobar'], 'B': [123, 456, np.nan]})
    df2

            A      B
    0     xyz  123.0
    1     NaN  456.0
    2  foobar    NaN

    # For string columns
    df2['C'] = [x[0] if isinstance(x, str) else np.nan for x in df2['A']]

    # For numeric columns
    df2['D'] = [str(x)[0] if pd.notna(x) else np.nan for x in df2['B']]

            A      B    C    D
    0     xyz  123.0    x    1
    1     NaN  456.0  NaN    4
    2  foobar    NaN    f  NaN

Let's do some timeit tests on some larger data.

    df_ = df.copy()
    df = pd.concat([df_] * 5000, ignore_index=True)

    %timeit df.assign(C=df['A'].str[0])
    %timeit df.assign(D=df['B'].astype(str).str[0])

    %timeit df.assign(C=[x[0] for x in df['A']])
    %timeit df.assign(D=[str(x)[0] for x in df['B']])

    12 ms ± 253 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    27.1 ms ± 1.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    3.77 ms ± 110 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    7.84 ms ± 145 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In this test, list comprehensions are roughly 3x faster than the .str equivalents.
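The NaN caveat for the astype(str) route can also be handled without a list comprehension. The following is only a minimal sketch: `first_char` is a hypothetical helper name, not part of pandas, and it assumes that masking the result with `Series.where` on the original column's non-missing rows is acceptable for your data.

```python
import numpy as np
import pandas as pd


def first_char(series: pd.Series) -> pd.Series:
    """Return the first character of each value's string form, keeping NaN as NaN.

    Hypothetical helper for illustration only; not a pandas API.
    """
    # astype(str) may turn NaN into the string "nan", so blank out
    # every row that was missing in the original column.
    return series.astype(str).str[0].where(series.notna())


df2 = pd.DataFrame({'A': ['xyz', np.nan, 'foobar'], 'B': [123, 456, np.nan]})
df2['C'] = first_char(df2['A'])  # string column
df2['D'] = first_char(df2['B'])  # numeric column (float because of the NaN)
```

This keeps the vectorised .str accessor, so it is likely to be slower than the list comprehension timed above, but missing values stay missing instead of becoming "n". Note that a negative number would yield "-" as its first character.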

User · Answer

Cast the dtype of the column to str and you can perform vectorised slicing by calling .str:

    In [29]: df['new_col'] = df['First'].astype(str).str[0]
             df

    Out[29]:
       First  Second new_col
    0    123     234       1
    1     22    4353       2
    2     32     355       3
    3    453     453       4
    4     45     345       4
    5    453     453       4
    6     56      56       5

If you need to, you can cast the dtype back again by calling astype(int) on the column.
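As a rough end-to-end sketch of this answer applied to the question's data, including the cast back to an integer: the column name `new_col_int` is made up for illustration, and the example assumes every value in 'First' is a non-negative integer with no missing entries.

```python
import pandas as pd

# Rebuild the data frame from the question.
a = pd.Series([123, 22, 32, 453, 45, 453, 56])
b = pd.Series([234, 4353, 355, 453, 345, 453, 56])
df = pd.concat([a, b], axis=1)
df.columns = ['First', 'Second']

# First digit as a string, via vectorised slicing on the .str accessor.
df['new_col'] = df['First'].astype(str).str[0]

# Same digit as an integer (hypothetical extra column, for illustration).
df['new_col_int'] = df['new_col'].astype(int)
```

Keeping both columns makes it easy to compare the string and integer forms; in practice you would probably keep only the one you need.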
