How to normalize an array in NumPy to a unit vector

Question

I would like to convert a NumPy array to a unit vector. More specifically, I am looking for an equivalent version of this function:

    def normalize(v):
        norm = np.linalg.norm(v)
        if norm == 0:
            return v
        return v / norm

Is there something like that in sklearn or numpy? This function also works when v is the zero vector.

User · Answer

There is also the function unit_vector() to normalize vectors in the popular transformations module by Christoph Gohlke:

    import transformations as trafo
    import numpy as np

    data = np.array([[1.0, 1.0, 0.0],
                     [1.0, 1.0, 1.0],
                     [1.0, 2.0, 3.0]])

    print(trafo.unit_vector(data, axis=1))

User · Answer

Without sklearn and using just numpy, just define a function. This assumes that the rows are the variables and the columns are the samples (axis=1):

    import numpy as np

    # Example array
    X = np.array([[1, 2, 3], [4, 5, 6]])

    def stdmtx(X):
        means = X.mean(axis=1)
        stds = X.std(axis=1, ddof=1)
        X = X - means[:, np.newaxis]
        X = X / stds[:, np.newaxis]
        return np.nan_to_num(X)

Output:

    X
    array([[1, 2, 3],
           [4, 5, 6]])

    stdmtx(X)
    array([[-1.,  0.,  1.],
           [-1.,  0.,  1.]])
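Standardizing (zero mean, unit standard deviation) is not the same operation as scaling to a unit vector, which is what the question asks for. The following is only a small sketch to make the difference visible; the variable names `Z` and `U` are chosen here for illustration.

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Standardization as in the answer above: zero mean, unit sample std per row.
Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)

# Unit-vector scaling as in the question: each row gets Euclidean length 1.
U = X / np.linalg.norm(X, axis=1, keepdims=True)

print(Z)
print(np.linalg.norm(Z, axis=1))  # row lengths of the standardized data
print(np.linalg.norm(U, axis=1))  # row lengths of the unit-scaled data
```

The standardized rows generally do not have length 1, so pick whichever operation matches what "normalize" means in your use case.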

User · Answer

If you're working with 3D vectors, you can do this concisely using the toolbelt vg. It's a light layer on top of numpy and it supports single values and stacked vectors.

    import numpy as np
    import vg

    x = np.random.rand(1000) * 10
    norm1 = x / np.linalg.norm(x)
    norm2 = vg.normalize(x)
    print(np.all(norm1 == norm2))
    # True

I created the library at my last startup, where it was motivated by uses like this: simple ideas which are way too verbose in NumPy.

User · Answer

If you're using scikit-learn, you can use sklearn.preprocessing.normalize:

    import numpy as np
    from sklearn.preprocessing import normalize

    x = np.random.rand(1000) * 10
    norm1 = x / np.linalg.norm(x)
    norm2 = normalize(x[:, np.newaxis], axis=0).ravel()
    print(np.all(norm1 == norm2))
    # True

User · Answer

You mentioned scikit-learn, so I want to share another solution.

scikit-learn MinMaxScaler

In scikit-learn there is an API called MinMaxScaler, which rescales each feature to a value range of your choosing ([0, 1] by default).

It also deals with NaN issues for us:

    NaNs are treated as missing values: disregarded in fit, and maintained in transform. ... see reference [1]

Code sample

The code is simple, just type:

    # Let's say X_train is your input dataframe
    import pandas as pd
    from sklearn.preprocessing import MinMaxScaler

    # create a MinMaxScaler object
    min_max_scaler = MinMaxScaler()
    # feed in a numpy array
    X_train_norm = min_max_scaler.fit_transform(X_train.values)
    # wrap it up if you need a dataframe
    df = pd.DataFrame(X_train_norm)

Reference

[1] sklearn.preprocessing.MinMaxScaler

User · Answer

If you don't need utmost precision, your function can be reduced to:

    v_norm = v / (np.linalg.norm(v) + 1e-16)
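To see what the added constant does in practice, here is a minimal sketch. The helper name `normalize_eps` and its `eps` parameter are made up for this example and are not part of NumPy.

```python
import numpy as np

def normalize_eps(v, eps=1e-16):
    # Hypothetical helper: the epsilon in the denominator avoids a 0/0 division.
    v = np.asarray(v, dtype=float)
    return v / (np.linalg.norm(v) + eps)

zero = normalize_eps([0.0, 0.0, 0.0])
unit = normalize_eps([3.0, 4.0])

print(zero)                  # the zero vector stays all zeros
print(np.linalg.norm(unit))  # very close to 1, though not guaranteed to be exactly 1
```

Keep in mind that for vectors whose norm is itself on the order of `eps`, the result comes out noticeably shorter than unit length, which is the precision trade-off mentioned above.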

User · Answer

You can specify ord to get the L1 norm. To avoid zero division I use eps, but that's maybe not great.

    def normalize(v):
        norm = np.linalg.norm(v, ord=1)
        if norm == 0:
            norm = np.finfo(v.dtype).eps
        return v / norm
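For readers unsure how the choice of `ord` changes the result, the following small sketch compares L1 and L2 scaling on one hand-picked vector. The variable names are illustrative only.

```python
import numpy as np

v = np.array([3.0, -4.0])

l1_scaled = v / np.linalg.norm(v, ord=1)  # divides by |3| + |-4| = 7
l2_scaled = v / np.linalg.norm(v)         # divides by sqrt(9 + 16) = 5

print(np.abs(l1_scaled).sum())    # absolute values sum to 1 (up to rounding)
print(np.linalg.norm(l2_scaled))  # Euclidean length is 1 (up to rounding)
print(np.linalg.norm(l1_scaled))  # the L1-scaled vector is shorter than unit length
```

So an L1-normalized vector is a unit vector only in the L1 sense; if you need Euclidean length 1, leave `ord` at its default.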

User · Answer

I agree that it would be nice if such a function were part of the included batteries. But it isn't, as far as I know. Here is a version for arbitrary axes that gives optimal performance.

    import numpy as np

    def normalized(a, axis=-1, order=2):
        l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
        l2[l2 == 0] = 1
        return a / np.expand_dims(l2, axis)

    A = np.random.randn(3, 3, 3)
    print(normalized(A, 0))
    print(normalized(A, 1))
    print(normalized(A, 2))

    print(normalized(np.arange(3)[:, None]))
    print(normalized(np.arange(3)))
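A possible variation on the function above, shown only as a sketch: `np.linalg.norm` accepts `keepdims=True`, which keeps the reduced axis with size 1 so that the norms broadcast directly. The name `normalized_keepdims` is invented for this example.

```python
import numpy as np

def normalized_keepdims(a, axis=-1, order=2):
    # Hypothetical variant: keepdims=True keeps the reduced axis with size 1,
    # so the norms broadcast against `a` without expand_dims.
    a = np.asarray(a, dtype=float)
    norms = np.linalg.norm(a, ord=order, axis=axis, keepdims=True)
    norms[norms == 0] = 1  # leave zero vectors unchanged
    return a / norms

A = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
rows = normalized_keepdims(A, axis=1)
print(rows)
print(normalized_keepdims(np.arange(3)))
```

Whether this is any faster than the `expand_dims` version is not something this sketch measures; it is mainly a readability choice.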

User · Answer

If you want to normalize n-dimensional feature vectors stored in a 3D tensor, you could also use PyTorch:

    import numpy as np
    from torch import FloatTensor
    from torch.nn.functional import normalize

    vecs = np.random.rand(3, 16, 16, 16)
    norm_vecs = normalize(FloatTensor(vecs), dim=0, eps=1e-16).numpy()

User · Answer

If you work with multidimensional arrays, the following fast solution is possible.

Say we have a 2D array that we want to normalize along the last axis, while some rows have zero norm.

    import numpy as np
    arr = np.array([
        [1, 2, 3],
        [0, 0, 0],
        [5, 6, 7]
    ], dtype=float)  # np.float was removed from recent NumPy releases

    lengths = np.linalg.norm(arr, axis=-1)
    print(lengths)  # [ 3.74165739  0.         10.48808848]
    arr[lengths > 0] = arr[lengths > 0] / lengths[lengths > 0][:, np.newaxis]
    print(arr)
    # [[0.26726124 0.53452248 0.80178373]
    #  [0.         0.         0.        ]
    #  [0.47673129 0.57207755 0.66742381]]
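The same masking idea can also be written with the `where` and `out` arguments of `np.divide`. This is an alternative sketch rather than the approach used in the answer above; the name `unit_rows` is illustrative.

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0],
                [0.0, 0.0, 0.0],
                [5.0, 6.0, 7.0]])

lengths = np.linalg.norm(arr, axis=-1, keepdims=True)  # shape (3, 1)

# Divide only where the length is non-zero; other rows keep the zeros from `out`.
unit_rows = np.divide(arr, lengths, out=np.zeros_like(arr), where=lengths > 0)
print(unit_rows)
```

Unlike the boolean-index assignment, this leaves `arr` untouched and writes the result to a new array.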

User · Answer

If you have multidimensional data and want each axis normalized to its max or its sum:

    import numpy as np

    def normalize(_d, to_sum=True, copy=True):
        # d is a (n x dimension) np array
        d = _d if not copy else np.copy(_d)
        d -= np.min(d, axis=0)
        d /= (np.sum(d, axis=0) if to_sum else np.ptp(d, axis=0))
        return d

This uses NumPy's peak-to-peak function, np.ptp.

    a = np.random.random((5, 3))

    b = normalize(a, copy=False)
    b.sum(axis=0)  # array([1., 1., 1.]), each column sums to 1

    c = normalize(a, to_sum=False, copy=False)
    c.max(axis=0)  # array([1., 1., 1.]), the max of each column is 1

User · Answer

This might also work for you:

    import numpy as np
    normalized_v = v / np.sqrt(np.sum(v**2))

but it fails when v has length 0. In that case, introducing a small constant to prevent the zero division solves this.
