As @Kanmani hinted, an easier-to-interpret implementation may use numpy.flip, as in the following:
import numpy as np
avgDists = np.array([1, 8, 6, 9, 4])
ids = np.flip(np.argsort(avgDists))
print(ids)
By calling np.flip and np.argsort as functions rather than chaining methods and slices, it is easier to read the order of operations.
An elegant way could be as follows -
ids = np.flip(np.argsort(avgDists))
This will give you indices of elements sorted in descending order. Now you can use regular slicing...
top_n = ids[:n]
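Putting the two lines above together, a minimal runnable sketch might look like this. The value n = 3 is an assumption chosen only for illustration; the data is the example array from the question.

```python
import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])
n = 3  # assumed value, for illustration only

# argsort gives the indices for ascending order; flip reverses them.
ids = np.flip(np.argsort(avgDists))

# The first n entries are the indices of the n largest values.
top_n = ids[:n]
print(top_n)            # [3 1 2]
print(avgDists[top_n])  # [9 8 6]
```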
Instead of using np.argsort you could use np.argpartition if you only need the indices of the lowest/highest n elements.
That doesn't require sorting the whole array, only the part that you need. Note, however, that the order inside the partition is undefined: the indices are correct, but they might not be in sorted order:
>>> avgDists = [1, 8, 6, 9, 4]
>>> np.array(avgDists).argpartition(2)[:2] # indices of lowest 2 items
array([0, 4], dtype=int64)
>>> np.array(avgDists).argpartition(-2)[-2:] # indices of highest 2 items
array([1, 3], dtype=int64)
With your example:
avgDists = np.array([1, 8, 6, 9, 4])
Obtain indexes of n maximal values:
ids = np.argpartition(avgDists, -n)[-n:]
Sort them in descending order:
ids = ids[np.argsort(avgDists[ids])[::-1]]
Obtain results (for n=4):
>>> avgDists[ids]
array([9, 8, 6, 4])
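The partition-then-sort steps above could be wrapped in a small helper, sketched here. The function name top_n_indices is hypothetical, and the sketch assumes a 1-D input with 1 <= n <= len(values).

```python
import numpy as np

def top_n_indices(values, n):
    """Indices of the n largest entries, largest first (hypothetical helper)."""
    values = np.asarray(values)
    # Partition so the n largest values land in the last n slots (unordered).
    candidates = np.argpartition(values, -n)[-n:]
    # Sort only those n candidates, descending by value.
    order = np.argsort(values[candidates])[::-1]
    return candidates[order]

avgDists = np.array([1, 8, 6, 9, 4])
ids = top_n_indices(avgDists, 4)
print(ids)            # [3 1 2 4]
print(avgDists[ids])  # [9 8 6 4]
```

Only the n selected elements get fully sorted, which is where the saving over a full argsort would come from on large arrays.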
Another way is to use only a '-' in the argument for argsort, as in "df[np.argsort(-df[:, 0])]", provided df is a 2-D NumPy array (the df[:, 0] indexing is NumPy syntax) and you want to sort its rows by the first column (represented by the column index '0'). Change the column index as appropriate. Of course, the column has to be a numeric one.
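As a rough illustration of that expression, here is a sketch with a small made-up 2-D array standing in for df. The array contents are hypothetical.

```python
import numpy as np

# Hypothetical 2-D array standing in for `df`; each row is one record.
df = np.array([[3, 10],
               [9, 20],
               [1, 30],
               [6, 40]])

# Negating column 0 makes argsort return the row order for descending values.
row_order = np.argsort(-df[:, 0])
sorted_rows = df[row_order]
print(row_order)    # [1 3 0 2]
print(sorted_rows)
```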
You can use the flip commands numpy.flipud() (or numpy.fliplr() for the columns of a 2-D array) to get the indexes in descending order after sorting with the argsort command. That's what I usually do.
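A short sketch of the flip approach on the 1-D example array follows. It also shows that numpy.fliplr() rejects 1-D input, since it reverses the second axis; the fliplr_ok flag is just an illustrative variable.

```python
import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])

# argsort returns a 1-D index array; flipud reverses it along axis 0.
ids = np.flipud(np.argsort(avgDists))
print(ids)  # [3 1 2 4 0]

# fliplr reverses axis 1, so it raises ValueError for 1-D input.
try:
    np.fliplr(np.argsort(avgDists))
    fliplr_ok = True
except ValueError:
    fliplr_ok = False
print(fliplr_ok)  # False
```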
Just as with Python lists, [::-1] reverses the array returned by argsort(), and [:n] then takes the first n elements of the reversed array, i.e. the indices of the n largest values:
>>> avgDists=np.array([1, 8, 6, 9, 4])
>>> n=3
>>> ids = avgDists.argsort()[::-1][:n]
>>> ids
array([3, 1, 2])
The advantage of this method is that ids is a view of the index array returned by argsort(), so reversing and slicing do not copy any data:
>>> ids.flags
C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
(The 'OWNDATA' being False indicates this is a view, not a copy)
Another way to do this is something like:
(-avgDists).argsort()[:n]
The problem is that this works by negating each element in the array:
>>> (-avgDists)
array([-1, -8, -6, -9, -4])
And it creates a copy to do so:
>>> (-avgDists).flags['OWNDATA']
True
So if you time each, with this very small data set:
>>> import timeit
>>> timeit.timeit('(-avgDists).argsort()[:3]', setup="from __main__ import avgDists")
4.2879798610229045
>>> timeit.timeit('avgDists.argsort()[::-1][:3]', setup="from __main__ import avgDists")
2.8372560259886086
The view method is substantially faster (and uses 1/2 the memory...)
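One way to check the view-versus-copy claim yourself is with np.shares_memory, as in the sketch below. The intermediate name sorted_ids is introduced here only for illustration.

```python
import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])

sorted_ids = avgDists.argsort()   # new index array produced by argsort
ids = sorted_ids[::-1][:3]        # reversed, then sliced: no data copied

# ids shares its buffer with the argsort result, not with avgDists.
print(np.shares_memory(ids, sorted_ids))  # True
print(np.shares_memory(ids, avgDists))    # False
print(ids.flags['OWNDATA'])               # False

# Negation allocates a fresh array that owns its data.
print((-avgDists).flags['OWNDATA'])       # True
```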
You could create a copy of the array and then multiply each element by -1.
As a result, the previously largest elements become the smallest.
The indices of the n smallest elements in the copy are the indices of the n greatest elements in the original.
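A minimal sketch of this idea, assuming a signed numeric dtype and using n = 3 purely for illustration:

```python
import numpy as np

avgDists = np.array([1, 8, 6, 9, 4])
n = 3  # assumed value, for illustration only

negated = -1 * avgDists         # new array; avgDists itself is unchanged
ids = np.argsort(negated)[:n]   # indices of the n smallest negated values
print(ids)            # [3 1 2]
print(avgDists[ids])  # [9 8 6]
```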
Source: Stackoverflow.com