sciris.sc_profiling

Profiling and CPU/memory management functions.

Highlights:

Functions

checkmem

Checks how much memory the variable or variables in question use by dumping them to file.

checkram

Measure actual memory usage, typically at different points throughout execution.

cpu_count

Alias to mp.cpu_count()

cpuload

Takes a snapshot of current CPU usage via psutil

loadbalancer

Delay execution while CPU or memory load is too high -- a very simple load balancer.

memload

Takes a snapshot of current fraction of memory usage via psutil

mprofile

Profile the line-by-line memory required by a function.

profile

Profile the line-by-line time required by a function.

Classes

resourcemonitor

Asynchronously monitor resource (e.g. memory) usage and terminate the process if the specified threshold is exceeded.

Exceptions

LimitExceeded

Custom exception for use with the sc.resourcemonitor() monitor.

cpu_count()[source]

Alias to mp.cpu_count()

cpuload(interval=0.1)[source]

Takes a snapshot of current CPU usage via psutil

Parameters

interval (float) – number of seconds over which to estimate CPU load

Returns

A float between 0 and 1 representing the fraction of CPU currently in use, as reported by psutil.cpu_percent().

memload()[source]

Takes a snapshot of current fraction of memory usage via psutil

Note on the different functions:

  • sc.memload() checks current total system memory consumption

  • sc.checkram() checks RAM (virtual memory) used by the current Python process

  • sc.checkmem() checks memory consumption by a given object

Returns

A float between 0 and 1 representing the fraction of system memory currently in use, as reported by psutil.virtual_memory().
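
As a rough illustration of the return values described for cpuload() and memload(), the sketch below converts psutil's percentage readings into 0-1 fractions. The helper names cpu_fraction and mem_fraction are hypothetical, and dividing by 100 is an assumption about how such a fraction could be derived rather than a copy of the sciris source. psutil is a third-party package, so both helpers return None when it is not installed.

```python
try:
    import psutil  # third-party package that the functions above are described as using
except ImportError:
    psutil = None


def cpu_fraction(interval=0.1):
    """Hypothetical helper: system-wide CPU usage as a 0-1 fraction (None without psutil)."""
    if psutil is None:
        return None
    # cpu_percent() blocks for `interval` seconds and returns a value from 0 to 100
    return psutil.cpu_percent(interval=interval) / 100


def mem_fraction():
    """Hypothetical helper: system-wide memory usage as a 0-1 fraction (None without psutil)."""
    if psutil is None:
        return None
    # virtual_memory().percent is also reported on a 0-100 scale
    return psutil.virtual_memory().percent / 100


print(cpu_fraction(), mem_fraction())
```

A very short interval gives a noisier CPU estimate, which is why a nonzero sampling window is used here.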

loadbalancer(maxcpu=0.8, maxmem=0.8, index=None, interval=0.5, cpu_interval=0.1, maxtime=36000, label=None, verbose=True, **kwargs)[source]

Delay execution while CPU or memory load is too high – a very simple load balancer.

Parameters
  • maxcpu (float) – the maximum CPU load to allow for the task to still start

  • maxmem (float) – the maximum memory usage to allow for the task to still start

  • index (int) – the index of the task – used to start processes asynchronously (default None)

  • interval (float) – the time delay to poll to see if CPU load is OK (default 0.5 seconds)

  • cpu_interval (float) – number of seconds over which to estimate CPU load (default 0.1; too small a value gives inaccurate readings)

  • maxtime (float) – maximum amount of time to wait to start the task (default 36000 seconds (10 hours))

  • label (str) – the label to print out when outputting information about task delay or start (default None)

  • verbose (bool) – whether or not to print information about task delay or start (default True)

Examples:

# Simplest usage -- delay if CPU or memory load is >80%
sc.loadbalancer()

# Use a maximum CPU load of 50%, maximum memory of 90%, and stagger the start by process number
for nproc in processlist:
    sc.loadbalancer(maxcpu=0.5, maxmem=0.9, index=nproc)

New in version 2.0.0: maxmem argument; maxload renamed maxcpu.
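
The polling behaviour described by the maxcpu, maxmem, interval, and maxtime parameters can be sketched with the standard library alone. This is a minimal illustration, not the sciris implementation: wait_for_capacity, get_cpu, get_mem, and sleep are hypothetical names, and the load readings are simulated so that the example is deterministic.

```python
import time


def wait_for_capacity(get_cpu, get_mem, maxcpu=0.8, maxmem=0.8,
                      interval=0.5, maxtime=36000, sleep=time.sleep):
    """Hypothetical helper: block until both loads are at or below their limits.

    Returns the number of polls that found the system too busy.
    """
    start = time.monotonic()
    delays = 0
    while time.monotonic() - start < maxtime:
        if get_cpu() <= maxcpu and get_mem() <= maxmem:
            break  # enough capacity, so the task may start
        delays += 1
        sleep(interval)  # wait before polling again
    return delays


# Simulated CPU readings: too busy twice, then acceptable
readings = iter([0.95, 0.90, 0.40])
n_delays = wait_for_capacity(
    get_cpu=lambda: next(readings),
    get_mem=lambda: 0.5,
    sleep=lambda seconds: None,  # skip real sleeping in this demonstration
)
print(n_delays)  # 2
```

Passing the load functions in as arguments keeps the loop easy to test; in real use they would be replaced by calls such as sc.cpuload() and sc.memload().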

profile(run, follow=None, print_stats=True, *args, **kwargs)[source]

Profile the line-by-line time required by a function.

Parameters
  • run (function) – The function to be run

  • follow (function) – The function or list of functions to be followed in the profiler; if None, defaults to the run function

  • print_stats (bool) – whether to print the statistics of the profile to stdout

  • args – Passed to the function to be run

  • kwargs – Passed to the function to be run

Returns

LineProfiler (by default, the profile output is also printed to stdout)

Example:

def slow_fn():
    n = 10000
    int_list = []
    int_dict = {}
    for i in range(n):
        int_list.append(i)
        int_dict[i] = i
    return

class Foo:
    def __init__(self):
        self.a = 0
        return

    def outer(self):
        for i in range(100):
            self.inner()
        return

    def inner(self):
        for i in range(1000):
            self.a += 1
        return

foo = Foo()
sc.profile(run=foo.outer, follow=[foo.outer, foo.inner])
sc.profile(slow_fn)

# Profile the constructor for Foo
f = lambda: Foo()
sc.profile(run=f, follow=[foo.__init__])
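
Because follow and print_stats precede *args in the signature, positional arguments intended for the run function are bound to those two parameters first. The sketch below shows this with a hypothetical stand-in, profile_like, that has the same parameter order as profile() but performs no profiling; it only forwards the remaining arguments.

```python
def profile_like(run, follow=None, print_stats=True, *args, **kwargs):
    """Hypothetical stand-in that mirrors the parameter order of profile().

    It does no profiling; it only forwards *args and **kwargs to run.
    """
    return run(*args, **kwargs)


def add(x, y):
    return x + y


# Keyword arguments are forwarded without touching follow or print_stats
by_keyword = profile_like(add, x=2, y=3)

# Positional arguments must come after explicit follow and print_stats values
by_position = profile_like(add, None, False, 2, 3)

# Here 2 and 3 are bound to follow and print_stats, so add() receives nothing
try:
    profile_like(add, 2, 3)
    misbound = False
except TypeError:
    misbound = True

print(by_keyword, by_position, misbound)  # 5 5 True
```

Passing the run function's arguments by keyword is usually the less error-prone choice.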

mprofile(run, follow=None, show_results=True, *args, **kwargs)[source]

Profile the line-by-line memory required by a function. See profile() for a usage example.

Parameters
  • run (function) – The function to be run

  • follow (function) – The function or list of functions to be followed in the profiler; if None, defaults to the run function

  • show_results (bool) – whether to print the statistics of the profile to stdout

  • args – Passed to the function to be run

  • kwargs – Passed to the function to be run

Returns

LineProfiler (by default, the profile output is also printed to stdout)

checkmem(var, descend=True, alphabetical=False, compresslevel=0, plot=False, verbose=False, **kwargs)[source]

Checks how much memory the variable or variables in question use by dumping them to file.

Note on the different functions:

  • sc.memload() checks current total system memory consumption

  • sc.checkram() checks RAM (virtual memory) used by the current Python process

  • sc.checkmem() checks memory consumption by a given object

Parameters
  • var (any) – the variable being checked

  • descend (bool) – whether or not to descend one level into the object

  • alphabetical (bool) – if descending into a dict or object, whether to list items by name rather than size

  • compresslevel (int) – level of compression to use when saving to file (typically 0)

  • plot (bool) – if descending, show the results as a pie chart

  • verbose (bool or int) – level of detail to print; if >1, print the repr of objects along the way

  • **kwargs (dict) – passed to load()

Example:

import numpy as np
import sciris as sc
sc.checkmem(['spiffy', np.random.rand(2483,589)])
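
The idea of estimating an object's size by serializing it can be approximated with the standard library. The sketch below is an assumption-laden stand-in: it pickles objects in memory instead of writing them to a file as checkmem() is described as doing, and pickled_size is a hypothetical helper. It descends one level into a dict and lists the items by size, loosely mirroring the descend and alphabetical options.

```python
import pickle


def pickled_size(obj):
    """Hypothetical helper: number of bytes needed to serialize obj with pickle."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


data = {
    'small': list(range(10)),
    'large': list(range(100_000)),
}

# Descend one level and measure each item separately
sizes = {key: pickled_size(value) for key, value in data.items()}

# List the largest items first
for key, nbytes in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
    print(f'{key}: {nbytes} bytes')
```

Serialized size is only a proxy for in-memory size, so the numbers are best used for comparing objects with one another.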

checkram(unit='mb', fmt='0.2f', start=0, to_string=True)[source]

Measure actual memory usage, typically at different points throughout execution.

Note on the different functions:

  • sc.memload() checks current total system memory consumption

  • sc.checkram() checks RAM (virtual memory) used by the current Python process

  • sc.checkmem() checks memory consumption by a given object

Example:

import sciris as sc
import numpy as np
start = sc.checkram(to_string=False)
a = np.random.random((1_000, 10_000))
print(sc.checkram(start=start))

New in version 1.0.0.

exception LimitExceeded[source]

Custom exception for use with the sc.resourcemonitor() monitor.

It inherits from MemoryError since this is the most similar built-in Python exception, and it inherits from KeyboardInterrupt since this is the means by which the monitor interrupts the main Python thread.
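
The practical effect of this dual inheritance is that either kind of handler catches the exception. The sketch below uses a hypothetical stand-in class, DemoLimitExceeded, defined with the same two base classes; it is an illustration of the inheritance pattern, not the sciris class itself.

```python
class DemoLimitExceeded(MemoryError, KeyboardInterrupt):
    """Hypothetical stand-in with the two base classes described above."""


def run_job():
    # Simulate a job that is interrupted because a limit was exceeded
    raise DemoLimitExceeded('memory limit exceeded')


# A handler written for memory errors catches it
try:
    run_job()
except MemoryError as err:
    caught_as_memory = type(err).__name__

# A handler written for keyboard interrupts catches it too
try:
    run_job()
except KeyboardInterrupt as err:
    caught_as_interrupt = type(err).__name__

print(caught_as_memory, caught_as_interrupt)
```

Since MemoryError derives from Exception, a broad `except Exception` handler would also catch it, unlike a plain KeyboardInterrupt.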

class resourcemonitor(mem=0.9, cpu=None, time=None, interval=1.0, label=None, start=True, die=True, kill_children=True, kill_parent=False, callback=None, verbose=None)[source]

Asynchronously monitor resource (e.g. memory) usage and terminate the process if the specified threshold is exceeded.

Parameters
  • mem (float) – maximum virtual memory allowed (as a fraction of total RAM)

  • cpu (float) – maximum CPU usage (NB: included for completeness only; typically one would not terminate a process just due to high CPU usage)

  • time (float) – maximum time limit in seconds

  • interval (float) – how frequently to check memory/CPU usage (in seconds)

  • label (str) – an optional label to use while printing out progress

  • start (bool) – whether to start the resource monitor on initialization (else call start())

  • die (bool) – whether to raise an exception if the resource limit is exceeded

  • kill_children (bool) – whether to kill child processes (if False, will not work with multiprocessing)

  • kill_parent (bool) – whether to also kill the parent process (will usually exit Python interpreter in the process)

  • callback (func) – optional callback if the resource limit is exceeded

  • verbose (bool) – detail to print out (default: if exceeded; True: every step; False: no output)

Examples:

# Using with-as:
with sc.resourcemonitor(mem=0.8) as resmon:
    memory_heavy_job()

# As a standalone (don't forget to call stop!)
resmon = sc.resourcemonitor(mem=0.95, cpu=0.9, time=3600, label='Load checker', die=False, callback=post_to_slack)
long_cpu_heavy_job()
resmon.stop()
print(resmon.to_df())
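
To make the asynchronous polling idea concrete, the sketch below implements a much simpler monitor with the standard library. Everything here is hypothetical (MiniMonitor, get_mem, the simulated readings): it polls in a background thread, keeps a log, and invokes a callback when the threshold is exceeded. Unlike the class documented above, it does not raise an exception in the main thread or kill any processes.

```python
import threading
import time


class MiniMonitor:
    """Hypothetical, simplified monitor that polls a reading in a background thread."""

    def __init__(self, get_mem, mem=0.9, interval=0.01, callback=None):
        self.get_mem = get_mem      # function returning memory usage as a 0-1 fraction
        self.mem = mem              # threshold above which the limit counts as exceeded
        self.interval = interval    # seconds between checks
        self.callback = callback    # optional function called with the offending reading
        self.log = []               # every reading taken so far
        self.exceeded = False
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._monitor, daemon=True)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self.stop()
        return False  # never suppress exceptions from the with-block

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _monitor(self):
        while not self._stop.is_set():
            usage = self.get_mem()
            self.log.append(usage)
            if usage > self.mem:
                self.exceeded = True
                if self.callback is not None:
                    self.callback(usage)
                break
            self._stop.wait(self.interval)  # sleep, but wake early if stopped


# Simulated readings that cross the 0.9 threshold on the third check
readings = iter([0.5, 0.7, 0.95])
alerts = []
with MiniMonitor(lambda: next(readings), mem=0.9, callback=alerts.append) as mon:
    while not mon.exceeded:
        time.sleep(0.01)  # stand-in for the real workload

print(mon.log, alerts)
```

Using the monitor as a context manager guarantees that the background thread is stopped and joined, which is the same reason the standalone example above calls stop() explicitly.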
start(label=None)[source]

Start the monitor running

Parameters

label (str) – optional label for printing progress

stop()[source]

Stop the monitor from running

monitor(label=None, *args, **kwargs)[source]

Actually run the resource monitor

check()[source]

Check if any limits have been exceeded

kill()[source]

Kill all processes

to_df()[source]

Convert the log into a pandas dataframe