I found things to dislike in all of the above solutions, so I came up with my own. This version makes sure the parts are joined with a single slash, and it leaves a leading slash on the first part and trailing slashes on the last part alone. No pip install, no urllib.parse.urljoin weirdness.
In [1]: from functools import reduce
In [2]: def join_slash(a, b):
   ...:     return a.rstrip('/') + '/' + b.lstrip('/')
   ...:
In [3]: def urljoin(*args):
   ...:     return reduce(join_slash, args) if args else ''
   ...:
In [4]: parts = ['https://foo-bar.quux.net', '/foo', 'bar', '/bat/', '/quux/']
In [5]: urljoin(*parts)
Out[5]: 'https://foo-bar.quux.net/foo/bar/bat/quux/'
In [6]: urljoin('https://quux.com/', '/path', 'to/file///', '//here/')
Out[6]: 'https://quux.com/path/to/file/here/'
In [7]: urljoin()
Out[7]: ''
In [8]: urljoin('//','beware', 'of/this///')
Out[8]: '/beware/of/this///'
In [9]: urljoin('/leading', 'and/', '/trailing/', 'slash/')
Out[9]: '/leading/and/trailing/slash/'
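For contrast, here is a small sketch of the urllib.parse.urljoin behavior I mean by "weirdness". The standard library function resolves its second argument the way a browser resolves a link, so it can drop path segments. The base URL and the std_urljoin alias are made up for illustration; the two helper functions are the same as above.

```python
from functools import reduce
from urllib.parse import urljoin as std_urljoin  # alias chosen for this example


def join_slash(a, b):
    # Join two parts with exactly one slash between them.
    return a.rstrip('/') + '/' + b.lstrip('/')


def urljoin(*args):
    # Fold join_slash over all parts; no parts gives an empty string.
    return reduce(join_slash, args) if args else ''


base = 'https://quux.com/api/v1'  # hypothetical base URL

# Standard library: relative reference resolution.
# Without a trailing slash on the base, the last segment ('v1') is replaced.
stdlib_relative = std_urljoin(base, 'users')
# A leading slash on the second argument resets the path to the root.
stdlib_rooted = std_urljoin(base + '/', '/users')

# This answer's version: plain concatenation with a single slash.
simple_relative = urljoin(base, 'users')
simple_rooted = urljoin(base + '/', '/users')

print(stdlib_relative)  # https://quux.com/api/users
print(stdlib_rooted)    # https://quux.com/users
print(simple_relative)  # https://quux.com/api/v1/users
print(simple_rooted)    # https://quux.com/api/v1/users
```

The standard library behavior is correct for resolving links, it just isn't what you want when you are only gluing path pieces together.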