urlopen

urlopen(url, filename=None, save=None, headers=None, params=None, data=None, prefix='http', convert=True, die=False, response='text', verbose=False)

Download a single URL.

Alias to urllib.request.urlopen(url).read(). To download multiple URLs, see sc.download(). Note: sc.urlopen() and sc.wget() are aliases of each other.
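For comparison, the standard-library call named above can be sketched as follows. The helper name read_url is hypothetical, and a data: URL stands in for a real site so the snippet runs without network access; it is an illustration of the underlying call, not the Sciris implementation.

```python
import urllib.request

def read_url(url):
    """Hypothetical helper: return the raw bytes found at url."""
    # urlopen() returns a response object; read() yields its body as bytes
    with urllib.request.urlopen(url) as resp:
        return resp.read()

# A data: URL is handled by urllib itself, so no network connection is needed
payload = read_url('data:text/plain,hello')
text = payload.decode()  # Bytes to str, roughly what convert=True refers to
```

The result of read() is bytes, which is why a separate conversion step to a string is needed.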

Parameters:

url (str) – the URL to open, either as GET or POST
filename (str) – if supplied, save to file instead of returning output
save (bool) – if supplied instead of filename, then use the default filename
headers (dict) – a dictionary of headers to pass
params (dict) – a dictionary of parameters to pass to the GET request
data (dict) – a dictionary of parameters to pass to the POST request
prefix (str) – the string to ensure the URL starts with (else, add it)
convert (bool) – whether to convert from bytes to string
die (bool) – whether to raise an exception if converting to text failed
response (str) – what to return: ‘text’ (default), ‘json’ (dictionary version of the data), ‘status’ (the HTTP status), or ‘full’ (the full response object)
verbose (bool) – whether to print progress
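A minimal sketch of how prefix, params, data, and headers could be combined into a request, using only the standard library. The helper build_request and its exact logic (a startswith check, a "?" followed by URL-encoded parameters, form-encoded POST data) are assumptions made for illustration and are not taken from the Sciris source. The request is only assembled, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_request(url, params=None, data=None, headers=None, prefix='http'):
    """Hypothetical helper: assemble (but do not send) a request."""
    # Assumed prefix rule: prepend "<prefix>://" if the URL does not start with the prefix
    if prefix and not url.startswith(prefix):
        url = f'{prefix}://{url}'
    # GET parameters are appended to the URL as a query string
    if params:
        url = f'{url}?{urlencode(params)}'
    # Supplying a body makes urllib treat the request as a POST
    body = urlencode(data).encode() if data else None
    return Request(url, data=body, headers=headers or {})

get_req = build_request('wikipedia.org', params={'search': 'python'})
post_req = build_request('https://example.org/form', data={'name': 'value'})
```

In this sketch, get_req.full_url and get_req.get_method() show the resulting URL and HTTP method; passing either object to urllib.request.urlopen() would perform the actual request.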

Examples:

html = sc.urlopen('wikipedia.org') # Retrieve into variable html
sc.urlopen('http://wikipedia.org', filename='wikipedia.html') # Save to file wikipedia.html
sc.urlopen('https://wikipedia.org', save=True, headers={'User-Agent':'Custom agent'}) # Save to the default filename (here, wikipedia.org), with headers
sc.urlopen('wikipedia.org', response='status') # Only return the HTTP status of the site
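The response, convert, and die options can be pictured as a post-processing step on the downloaded bytes. The sketch below is an assumption about that behavior rather than Sciris code: the function format_payload and its status argument are hypothetical, and the 'full' option is omitted because it would simply return the response object itself.

```python
import json

def format_payload(raw, status=200, response='text', convert=True, die=False):
    """Hypothetical post-processing of a downloaded payload (raw is bytes)."""
    if response == 'status':
        return status            # Only the HTTP status code
    if response == 'json':
        return json.loads(raw)   # Parse the body into Python objects
    if convert:
        try:
            return raw.decode()  # Bytes to str
        except UnicodeDecodeError:
            if die:
                raise            # die=True: surface the conversion error
    return raw                   # No conversion, or conversion failed quietly

as_text = format_payload(b'<html></html>')
as_json = format_payload(b'{"ok": true}', response='json')
as_status = format_payload(b'', status=404, response='status')
as_bytes = format_payload(b'\xff\xfe')  # Not valid UTF-8, so bytes are returned
```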

New in version 2.0.0: renamed from wget to urlopen; new arguments
New in version 2.0.1: creates folders by default if they do not exist
New in version 2.0.4: “prefix” argument, e.g. prepend “http://” if not present
New in version 3.1.4: renamed “return_response” to “response”; additional options