Specialized Utilities

Domain-specific utilities for web, math, concurrency, and more.

webapp

Web utilities for Flask and web.py applications: CORS support, XSRF protection, authentication decorators, URL building, breadcrumb generation, HTML escaping, JSON encoders with ISO date support.

authd(checker_fn, fallback_fn)[source]

Decorator that checks if user meets an auth criterion.

Parameters:

checker_fn – Callable that returns True if authorized.
fallback_fn – Callable to invoke if not authorized.

Returns:

Decorator function.
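The decorator pattern described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name authd_sketch is hypothetical, and it assumes both checker_fn and fallback_fn are called with no arguments, which the documentation does not state.

```python
import functools


def authd_sketch(checker_fn, fallback_fn):
    """Hypothetical stand-in for authd: run the view only if checker_fn() is truthy."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if checker_fn():
                return func(*args, **kwargs)
            # Not authorized: hand control to the fallback instead of the view.
            return fallback_fn()
        return wrapper
    return decorator


# Hypothetical usage: a plain dict stands in for a real session check.
session = {"logged_in": False}


@authd_sketch(lambda: session["logged_in"], lambda: "redirect:/login")
def dashboard():
    return "dashboard"
```

Calling dashboard() while the flag is False returns the fallback's value; once the flag is True the wrapped function runs normally.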

make_url(path, **params)[source]

Generate URL with query parameters.

Inspired by werkzeug.urls.Href. Assumes traditional multi-valued params: a repeated key is appended, not overwritten. Use __replace__ to overwrite params. Use __ignore__ to filter out certain params.

Parameters:

path (str) – Base URL path.
params – Query parameters.

Returns:

Complete URL with query string.

Return type:

str

Example:

>>> ignore_fn = lambda x: x.startswith('_')
>>> kw = dict(fuz=1, biz="boo")
>>> make_url('/foo/', _format='excel', __ignore__=ignore_fn, **kw)
'/foo/?fuz=1&biz=boo'
>>> make_url('/foo/?bar=1', _format='excel', **kw)
'/foo/?_format=excel&fuz=1&biz=boo&bar=1'
>>> make_url('/foo/', bar=1, baz=2)
'/foo/?bar=1&baz=2'
>>> make_url('/foo/', **{'bar':1, 'fuz':(1,2,), 'biz':"boo"})
'/foo/?bar=1&fuz=1&fuz=2&biz=boo'
>>> make_url('/foo/?a=1&a=2')
'/foo/?a=1&a=2'

>>> kwargs = dict(fuz=1, biz="boo", __ignore__=ignore_fn)
>>> xx = make_url('www.foobar.com/foo/', **kwargs)
>>> 'www' in xx and 'foobar' in xx and '/foo/' in xx and 'fuz=1' in xx and 'biz=boo' in xx
True
>>> xx = make_url('/foo/', _format='excel', **kwargs)
>>> '_format=excel' in xx
False
>>> 'fuz=1' in xx
True
>>> 'biz=boo' in xx
True
>>> yy = make_url('/foo/?bar=1', _format='excel', **kwargs)
>>> 'bar=1' in yy
True
>>> '_format=excel' in yy
False
>>> zz = make_url('/foo/', **{'bar':1, 'fuz':(1,2,), 'biz':"boo"})
>>> 'fuz=1' in zz
True
>>> 'fuz=2' in zz
True
>>> qq = make_url('/foo/?a=1&a=2')
>>> 'a=1' in qq
True
>>> 'a=2' in qq
True

appmenu(urls, fmt_name=None)[source]

Build HTML menu from URL/name pairs.

Parameters:

urls (tuple) – Tuple of alternating (url, name) pairs.
fmt_name (Optional[Callable]) – Formatter for link text (default: splitcap).

Return type:

str

Returns:

HTML unordered list string.

render_field(field)[source]

Render form field with error styling.

Parameters:: field – Form field object.
Return type:: str
Returns:: HTML string for rendered field.

scale(color, pct)[source]

Scale a hex color by a percentage.

Parameters:

color (str) – Hex color string (e.g., ‘#FFF’ or ‘#FFFFFF’).
pct (float) – Scale factor (1.0 = no change).

Returns:

Scaled hex color string.

Return type:

str
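One plausible reading of "scale a hex color" is to multiply each RGB channel by pct and clamp to the valid range. The sketch below follows that reading; the name scale_sketch, the clamping, and the uppercase six-digit output format are assumptions, not documented behavior.

```python
def scale_sketch(color, pct):
    """Hypothetical sketch: scale each RGB channel of a hex color by pct."""
    digits = color.lstrip("#")
    if len(digits) == 3:
        # Expand shorthand such as 'FFF' to 'FFFFFF'.
        digits = "".join(ch * 2 for ch in digits)
    channels = [int(digits[i:i + 2], 16) for i in (0, 2, 4)]
    # Assumption: results are rounded and clamped to the 0..255 range.
    scaled = [max(0, min(255, round(c * pct))) for c in channels]
    return "#" + "".join(f"{c:02X}" for c in scaled)
```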

safe_join(directory, *pathnames)[source]

Safely join untrusted path components to a base directory.

Prevents escaping the base directory via path traversal.

Parameters:

directory (str) – The trusted base directory.
pathnames (str) – The untrusted path components relative to base.

Returns:

A safe path, or None if path would escape base.

Return type:

str | None

Note

Via github.com/mitsuhiko/werkzeug security.py
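The traversal check can be illustrated with a simplified sketch in the spirit of the werkzeug helper. The name safe_join_sketch is hypothetical, and rejecting backslashes outright is an assumption made here for brevity.

```python
import posixpath


def safe_join_sketch(directory, *pathnames):
    """Hypothetical sketch: join untrusted parts, or return None if they escape."""
    parts = [directory]
    for name in pathnames:
        if name:
            # Collapse 'a/../b' style segments before inspecting the result.
            name = posixpath.normpath(name)
        if (
            posixpath.isabs(name)
            or name == ".."
            or name.startswith("../")
            or "\\" in name  # assumption: treat Windows separators as unsafe
        ):
            return None
        parts.append(name)
    return posixpath.join(*parts)
```

A component such as 'a/../../secret' normalizes to '../secret' and is therefore rejected, while 'a/../b' normalizes to 'b' and is accepted.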

local_or_static_join(static, somepath)[source]

Find template in working directory or static folder.

Parameters:

static (str) – Static folder path.
somepath (str) – Relative path to template.

Returns:

Full path to existing template.

Return type:

Path

Raises:

OSError – If template not found in either location.

inject_file(x)[source]

Read file contents for injection into HTML email templates.

Parameters:: x (str) – Path to file (CSS, JS, etc.).
Returns:: File contents.
Return type:: str

inject_image(x)[source]

Generate base64 data URI for image embedding in HTML.

Parameters:: x (str) – Path to image file.
Returns:: Data URI string for use in img src attribute.
Return type:: str
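A data URI has the form data:<mime>;base64,<payload>. The sketch below shows one way to build it with the standard library; the name inject_image_sketch and the MIME-type guessing via mimetypes are assumptions rather than the library's documented approach.

```python
import base64
import mimetypes
import os
import tempfile


def inject_image_sketch(path):
    """Hypothetical sketch: return a base64 data URI for the file at path."""
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "application/octet-stream"
    with open(path, "rb") as fh:
        encoded = base64.b64encode(fh.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Usage with placeholder bytes (not a real image) in a temporary .png file.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(b"hello")
uri = inject_image_sketch(tmp.name)
os.remove(tmp.name)
print(uri)
```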

htmlquote(text)[source]

Encode text for safe use in HTML.

Parameters:: text (str) – Text to encode.
Returns:: HTML-encoded string.
Return type:: str

Example:

>>> htmlquote(u"<'&\">")
'&lt;&#39;&amp;&quot;&gt;'

websafe(val)[source]

Convert value to safe Unicode HTML string.

Parameters:: val – Value to convert (string, bytes, or None).
Returns:: HTML-safe string.
Return type:: str

Example:

>>> websafe("<'&\">")
'&lt;&#39;&amp;&quot;&gt;'
>>> websafe(None)
''
>>> websafe(u'\u203d') == u'\u203d'
True

rsleep(always=0, rand_extra=8)[source]

Sleep for a random amount of time.

Parameters:

always (float) – Minimum seconds to sleep.
rand_extra (float) – Maximum additional random seconds.

rand_retry(x_times=10, exception=<class 'Exception'>)[source]

Decorator that retries function with random delays.

Useful for avoiding automated thresholding on web requests.

Parameters:

x_times (int) – Maximum number of retries.
exception – Exception type(s) to catch and retry on.

Returns:

Decorator function.
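A retry decorator with random delays might look like the sketch below. The name rand_retry_sketch and the max_delay parameter are hypothetical, and re-raising the last exception once the retries are exhausted is an assumption; the documentation does not say what happens in that case.

```python
import functools
import random
import time


def rand_retry_sketch(x_times=10, exception=Exception, max_delay=0.01):
    """Hypothetical sketch: retry func up to x_times with random pauses."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(x_times):
                try:
                    return func(*args, **kwargs)
                except exception:
                    if attempt == x_times - 1:
                        raise  # assumption: give up by re-raising
                    # Random pause so retries do not arrive at a fixed rhythm.
                    time.sleep(random.uniform(0, max_delay))
        return wrapper
    return decorator


calls = []


@rand_retry_sketch(x_times=3, exception=ValueError, max_delay=0.001)
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("try again")
    return "ok"
```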

logerror(olderror, logger)[source]

Wrap internalerror function to log tracebacks.

Parameters:

olderror – Original error handler function.
logger – Logger instance for error output.

Returns:

Wrapped error handler.

class JSONEncoderISODate(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: JSONEncoder

JSON encoder that serializes dates in ISO format.

Example:

>>> JSONEncoderISODate().encode({'dt': datetime.date(2014, 10, 2)})
'{"dt": "2014-10-02"}'

default(obj)[source]
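The default hook is the standard extension point of json.JSONEncoder: it is called for objects the encoder cannot serialize natively. A minimal sketch of how an ISO-date encoder can use it follows; the class name ISODateEncoderSketch is hypothetical and this is not necessarily how JSONEncoderISODate is written.

```python
import datetime
import json


class ISODateEncoderSketch(json.JSONEncoder):
    """Hypothetical sketch of an encoder that emits dates as ISO strings."""

    def default(self, obj):
        # datetime.datetime is a subclass of datetime.date, so one check covers both.
        if isinstance(obj, datetime.date):
            return obj.isoformat()
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


encoded = ISODateEncoderSketch().encode({"dt": datetime.date(2014, 10, 2)})
```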

class JSONDecoderISODate(**kw)[source]

Bases: JSONDecoder

JSON decoder that parses date strings into datetime objects.

Example:

>>> JSONDecoderISODate().decode('{"dt": "2014-10-02"}')
{'dt': datetime.datetime(2014, 10, 2, 0, 0)}

class Jinja2Render(template_dir, globals=None, autoescape=True)[source]

Bases: object

Jinja2 render class.

Usage:

render = Jinja2Render('templates/')
render.add_globals({'format': fmt, 'today': datetime.date.today})
html = render('generic.html', title='Page', content=[html1, html2])

__init__(template_dir, globals=None, autoescape=True)[source]

Initialize Jinja2 environment with template directory.

Parameters:

template_dir (str) – Path to template directory.
globals (dict | None) – Dict of global variables/functions for templates.
autoescape (bool) – Enable HTML autoescaping (default True).

__call__(template_name, **context)[source]

Render template with context - same signature as Flask’s render_template.

Parameters:

template_name (str) – Name of template file (e.g., ‘generic.html’).
context – Keyword arguments passed to template.

Return type:

str

Returns:

Rendered HTML string.

add_globals(globals_dict)[source]

Add globals to Jinja2 environment.

Parameters:: globals_dict (dict) – Dict of globals to add.
Return type:: None

add_filter(name, func)[source]

Add custom filter to Jinja2 environment.

Parameters:

name (str) – Filter name to use in templates.
func (callable) – Filter function.

Return type:

None

class ProfileMiddleware(func, log=None, sort='time', count=20)[source]

Bases: object

WSGI middleware for profiling requests.

Parameters:

func – WSGI application callable.
log – Logger instance for output.
sort (str) – Profile sort key (default ‘time’).
count (int) – Number of top functions to show (default 20).

Warning

Should always be the last middleware loaded:

1. You want to profile everything else.
2. For speed, we return the result, NOT the wrapped func.

get_request_dict(**defaults)[source]

Get request parameters with defaults, supporting callables.

Example:

get_request_dict(fund='All', date=lambda: Date.today())

Parameters:: defaults – Default values for parameters. Callables are invoked.
Return type:: dict
Returns:: Dict of request parameters with defaults applied.
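The "callables are invoked" rule can be illustrated without a web framework. In this sketch the helper apply_defaults is hypothetical, a plain dict stands in for the request parameters, and it is assumed that values present in the request take precedence over defaults.

```python
def apply_defaults(params, **defaults):
    """Hypothetical sketch: fill missing keys, calling any callable default."""
    result = dict(params)
    for key, default in defaults.items():
        if key not in result:
            # Callables are evaluated lazily, only when the key is missing.
            result[key] = default() if callable(default) else default
    return result


merged = apply_defaults({"fund": "ABC"}, fund="All", date=lambda: "2024-01-02")
```

Lazy evaluation is what makes a default such as lambda: Date.today() useful: the date is computed per request rather than once at import time.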

is_safe_redirect_url(url)[source]

Check if redirect URL is safe (relative path only, no protocol injection).

Parameters:: url (str) – URL to validate.
Return type:: bool
Returns:: True if URL is safe for redirect.

Example:

>>> is_safe_redirect_url('/login/')
True
>>> is_safe_redirect_url('//evil.com')
False
>>> is_safe_redirect_url('https://evil.com')
False
>>> is_safe_redirect_url('')
False

external_url_for(base_url, endpoint, **values)[source]

Generate full URL with domain for external use (emails, etc).

Parameters:

base_url (str) – Base URL including scheme and domain (e.g. ‘https://app.example.com’).
endpoint (str) – Flask endpoint name.
values – URL parameters passed to url_for.

Return type:

str

Returns:

Complete URL.

stats

Mathematical and statistical functions: average, variance, standard deviation, covariance, beta, combinatorics (choose), number parsing, threshold operations.

npfunc(nargs=1)[source]

Decorator to convert args to numpy format and results back to Python.

Parameters:: nargs (int) – Number of arguments to convert to numpy arrays.
Returns:: Decorator function.

avg(x)[source]

Compute average of array, ignoring None/NaN values.

Parameters:: x (Iterable) – Array of values.
Returns:: Average or None if all values are NaN.

Example:

>>> avg((-1.5, 2,))
0.25
>>> avg((None, 2,))
2.0
>>> avg((None, None,)) is None
True

pct_change(x)[source]

Compute percent change between consecutive elements.

Parameters:: x (Iterable) – Array of values.
Returns:: Array of percent changes (first element is None).

Example:

>>> a = [1, 1, 1.5, 1, 2, 1.11, -1]
>>> [f"{_:.2f}" if _ else _ for _ in pct_change(a)]
[None, 0.0, '0.50', '-0.33', '1.00', '-0.44', '-1.90']

diff(x)[source]

Compute one-period difference between consecutive elements.

Parameters:: x (Iterable) – Array of values.
Returns:: Array of differences (first element is None).

Example:

>>> [_ for _ in diff((0, 1, 3, 2, 1, 5, 4))]
[None, 1.0, 2.0, -1.0, -1.0, 4.0, -1.0]

thresh(x, thresh=0.0)[source]

Round to nearest integer if within threshold distance.

Parameters:

x (float) – Number to potentially round.
thresh (float) – Distance threshold for rounding.

Returns:

Rounded integer or original value.

Positive Numbers:

>>> thresh(74.9888, 0.05)
75
>>> thresh(75.01, 0.05)
75

Negative Numbers:

>>> thresh(-74.988, 0.05)
-75
>>> thresh(-75.01, 0.05)
-75

Return Original:

>>> thresh(74.90, 0.05)
74.9
>>> thresh(75.06, 0.05)
75.06

isnumeric(x)[source]

Check if value is a numeric type.

Parameters:: x – Value to check.
Returns:: True if numeric (int, float, or numpy numeric).
Return type:: bool

digits(n)[source]

Count number of integer digits in a number.

Parameters:: n – Number to count digits of.
Returns:: Number of integer digits.
Return type:: int

Example:

>>> digits(6e6)
7
>>> digits(100.01)
3
>>> digits(-6e5)==digits(-600000)==6
True
>>> digits(-100.)==digits(100)==3
True

numify(val, to=<class 'float'>)[source]

Convert value to numeric type, handling common formatting.

Handles None values, already numeric values, and string formatting including whitespace, commas, parentheses (negative), and percentages.

Parameters:

val – Value to convert.
to (type) – Target type (default: float).

Returns:

Converted value or None if conversion fails.

Example:

>>> numify('1,234.56')
1234.56
>>> numify('(100)', to=int)
-100
>>> numify('50%')
50.0
>>> numify(None)
>>> numify('')

parse(s)[source]

Extract number from string.

Parameters:: s – String to parse.
Returns:: Parsed int or float, or None if parsing fails.

Example:

>>> parse('1,200m')
1200
>>> parse('100.0')
100.0
>>> parse('100')
100
>>> parse('0.002k')
0.002
>>> parse('-1')==parse('(1)')==-1
True
>>> parse('-100.0')==parse('(100.)')==-100.0
True
>>> parse('')

nearest(num, decimals)[source]

Round number to the nearest tick value.

Useful for eliminating float errors after arithmetic operations.

Parameters:

num (float) – Number to round.
decimals (float) – Tick size to round to.

Returns:

Rounded number.

Return type:

float

Example:

>>> nearest(401.4601, 0.01)
401.46
>>> nearest(401.46001, 0.0000000001)
401.46001

covarp(x, y)[source]

Compute population covariance between x and y.

Parameters:

x – First array.
y – Second array (same length as x).

Returns:

Population covariance.

Return type:

float

Example:

>>> x = [3, 2, 4, 5, 6]
>>> y = [9, 7, 12, 15, 17]
>>> "{:.5}".format(covarp(x, y))
'5.2'

covars(x, y)[source]

Compute sample covariance between x and y.

Parameters:

x – First array.
y – Second array (same length as x).

Returns:

Sample covariance.

Return type:

float

Example:

>>> x = [3, 2, 4, 5, 6]
>>> y = [9, 7, 12, 15, 17]
>>> "{:.5}".format(covars(x, y))
'6.5'

varp(x)[source]

Compute population variance of x.

Parameters:: x – Array of values.
Returns:: Population variance.
Return type:: float

Example:

>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
>>> "{:.5}".format(varp(x))
'678.84'

vars(x)[source]

Compute sample variance of x.

Parameters:: x – Array of values.
Returns:: Sample variance.
Return type:: float

Example:

>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
>>> "{:.5}".format(vars(x))
'754.27'

stddevp(x)[source]

Compute population standard deviation.

Parameters:: x – Array of values.
Returns:: Population standard deviation.
Return type:: float

Example:

>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
>>> "{:.5}".format(stddevp(x))
'26.055'

stddevs(x)[source]

Compute sample standard deviation.

Parameters:: x – Array of values.
Returns:: Sample standard deviation.
Return type:: float

Example:

>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
>>> "{:.5}".format(stddevs(x))
'27.464'

beta(x, index)[source]

Compute beta of x with respect to index (typically over returns).

Parameters:

x – Asset returns.
index – Index returns.

Returns:

Beta coefficient.

Return type:

float

Example:

>>> x = [0.10, 0.18, -0.15, 0.18]
>>> y = [0.10, 0.17, -0.17, 0.17]
>>> '{:.2}'.format(beta(x, y))
'0.97'

correl(x, y)[source]

Compute correlation between x and y.

Parameters:

x – First array.
y – Second array.

Returns:

Correlation coefficient.

Return type:

float

Example:

>>> x = [3, 2, 4, 5, 6]
>>> y = [9, 7, 12, 15, 17]
>>> "{:.3}".format(correl(x, y))
'0.997'

rsq(x, y)[source]

Compute R-squared (coefficient of determination) between x and y.

Parameters:

x – First array.
y – Second array.

Returns:

R-squared value.

Return type:

float

Example:

>>> x = [ 6, 5, 11, 7, 5, 4, 4]
>>> y = [ 2, 3,  9, 1, 8, 7, 5]
>>> "{:.5}".format(rsq(x, y))
'0.05795'

rtns(x)[source]

Compute simple returns between consecutive values.

Parameters:: x – Array of prices.
Returns:: Array of returns (one fewer element than input).

Example:

>>> pp = rtns([1., 1.1, 1.3, 1.1, 1.3])
>>> [f'{x:0.2f}' for x in pp]
['0.10', '0.18', '-0.15', '0.18']

logrtns(x)[source]

Compute log returns between consecutive values.

Parameters:: x – Array of prices.
Returns:: Array of log returns (one fewer element than input).

Example:

>>> pp = logrtns([1., 1.1, 1.3, 1.1, 1.3])
>>> [f'{x:0.2f}' for x in pp]
['0.10', '0.17', '-0.17', '0.17']

weighted_average(rows, field, predicate, weight_field)[source]

Compute a weighted average of field in a DataSet using weight_field as the weight. Limit to rows matching the predicate. Uses sum of abs in denominator because we are really looking for the value-weighted contribution of the position.

This handles long/short cases correctly, although they can give surprising results.

Consider two “trades” BL 5000 at a delta of 50% and SS -4000 at a delta of 30%. If you didn’t use abs() you’d get:

(5000 * 50 - 4000 * 30) / (5000 - 4000) = 130

Using abs() you get:

(5000 * 50 - 4000 * 30) / (5000 + 4000) = 14.4

This is really equivalent to saying you bought another 4000 at a delta of -30 (because the short position has a negative delta effect) which then makes more sense: combining two positions, one with positive delta and one with negative should give a value that weights the net effect of them, which the second case does. If the short position were larger or had a larger delta, you could end up with a negative weighted average, which although a bit confusing, is mathematically correct.
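The arithmetic above can be reproduced with a short sketch. The function name weighted_average_sketch is hypothetical, rows are modeled as plain dicts rather than a DataSet, and returning None for a zero denominator is an assumption.

```python
def weighted_average_sketch(rows, field, predicate, weight_field):
    """Hypothetical sketch: weighted average with sum of abs weights below."""
    selected = [r for r in rows if predicate(r)]
    numerator = sum(r[field] * r[weight_field] for r in selected)
    denominator = sum(abs(r[weight_field]) for r in selected)
    return numerator / denominator if denominator else None


trades = [
    {"side": "BL", "qty": 5000, "delta": 50},
    {"side": "SS", "qty": -4000, "delta": 30},
]
result = weighted_average_sketch(trades, "delta", lambda r: True, "qty")
print(round(result, 1))  # 130000 / 9000, i.e. 14.4 after rounding
```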

linear_regression(x, y)[source]

Compute the least-squares linear regression line for the set of points. Returns the slope and y-intercept.
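For reference, the textbook closed-form least-squares fit is slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). The sketch below applies those formulas in pure Python; the name linear_regression_sketch is hypothetical and the library may compute the fit differently (for example with numpy).

```python
def linear_regression_sketch(x, y):
    """Hypothetical sketch: ordinary least-squares slope and intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sxy and Sxx are sums of centered cross-products and squares.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points lying exactly on y = 2x + 1.
slope, intercept = linear_regression_sketch([0, 1, 2, 3], [1, 3, 5, 7])
```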

distance_from_line(m, b, x, y)[source]

Compute the distance from each point to the line defined by m and b.

linterp(x0, x1, x, y0, y1, *, inf_value=None)[source]

Linearly interpolate y between y0 and y1 based on x’s position.

Parameters:

x0 – Start of x range.
x1 – End of x range.
x – Value to interpolate at.
y0 – Y value at x0.
y1 – Y value at x1.
inf_value – Value to return when x1 is infinity (default: y0).

Returns:

Interpolated y value.

Return type:

float

Example:

>>> linterp(1, 3, 2, 2, 4)
3.0
>>> linterp(1, float('inf'), 2, 2, 4)
2.0
>>> linterp(1, float('inf'), 2, 2, 4, inf_value=4)
4.0

np_divide(a, b)[source]

Safely divide numpy arrays, returning 0 where divisor is 0.

Parameters:

a – Numerator array.
b – Denominator array.

Returns:

Result array with 0 where b was 0.

safe_add(*args)[source]

Safely add numbers, returning None if any argument is None.

Parameters:: args – Numbers to add.
Returns:: Sum or None if any arg is None.

safe_diff(*args)[source]

Safely subtract numbers, returning None if any argument is None.

Parameters:: args – Numbers to subtract sequentially.
Returns:: Difference or None if any arg is None.
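The None-propagating behavior described for safe_add and safe_diff can be sketched as below. The _sketch names are hypothetical, and reading "subtract sequentially" as left-to-right subtraction from the first argument is an assumption.

```python
import functools
import operator


def safe_add_sketch(*args):
    """Hypothetical sketch: sum the arguments, or None if any is None."""
    if any(a is None for a in args):
        return None
    return sum(args)


def safe_diff_sketch(*args):
    """Hypothetical sketch: ((a - b) - c) - ..., or None if any is None."""
    if any(a is None for a in args):
        return None
    return functools.reduce(operator.sub, args)
```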

safe_divide(*args, **kwargs)[source]

Safely divide numbers, returning None if any arg is None.

Parameters:

args – Numbers to divide sequentially.
kwargs – Optional ‘infinity’ for division by zero result.

Returns:

Result or None if any arg is None, inf on division by zero.

Example:

>>> '{:.2f}'.format(safe_divide(10, 5))
'2.00'
>>> '{:.2f}'.format(safe_divide(10, 1.5, 1))
'6.67'
>>> safe_divide(1, 0)
inf
>>> safe_divide(10, 1, None)

safe_mult(*args)[source]

Multiply a long list of values, some of which may be None.

safe_round(arg, places=2)[source]

Safely round a number, returning None if argument is None.

Parameters:

arg – Number to round.
places (int) – Decimal places (default 2).

Returns:

Rounded number or None.

safe_cmp(op, a, b)[source]

Compare two values using a comparison operator.

Parameters:

op – Operator string (‘>’, ‘>=’, ‘<’, ‘<=’, ‘==’, ‘!=’) or operator function.
a – First value.
b – Second value.

Returns:

Boolean result of comparison.

safe_min(*args, **kwargs)[source]

The built-in min returns None if None is in the list; this version returns the min value.

safe_max(*args, **kwargs)[source]

The built-in max returns None if None is in the list; this version returns the max value.
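One way to read "returns the min value" is that None entries are skipped before comparing. The sketch below follows that reading; the _sketch names are hypothetical, the real functions also accept keyword arguments that are not modeled here, and returning None when nothing remains is an assumption.

```python
def safe_min_sketch(*args):
    """Hypothetical sketch: minimum of the non-None arguments."""
    values = [a for a in args if a is not None]
    return min(values) if values else None


def safe_max_sketch(*args):
    """Hypothetical sketch: maximum of the non-None arguments."""
    values = [a for a in args if a is not None]
    return max(values) if values else None
```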

convert_mixed_numeral_to_fraction(num)[source]

Convert mixed numeral string to decimal fraction.

Parameters:: num (str) – Mixed numeral (e.g., ‘1 7/8’).
Returns:: Decimal equivalent.
Return type:: float
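A mixed numeral such as '1 7/8' is the sum of a whole part and a fraction, so the conversion can be sketched with fractions.Fraction. The name mixed_to_float_sketch is hypothetical, and this sketch only handles space-separated input with an optional leading sign, which may be narrower than what the library accepts.

```python
from fractions import Fraction


def mixed_to_float_sketch(num):
    """Hypothetical sketch: '1 7/8' -> 1.875, '-7/8' -> -0.875."""
    text = num.strip()
    sign = -1 if text.startswith("-") else 1
    text = text.lstrip("+-")
    # Each whitespace-separated piece ('1', '7/8') parses as an exact Fraction.
    total = sum(Fraction(part) for part in text.split())
    return sign * float(total)
```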

convert_to_mixed_numeral(num, force_sign=False)[source]

Convert decimal or fraction to mixed numeral string.

Parameters:

num – Number or string to convert.
force_sign (bool) – Force ‘+’ prefix on positive numbers.

Returns:

Mixed numeral string (e.g., ‘1 7/8’) or None on error.

Return type:

str

Example:

>>> convert_to_mixed_numeral(1.875, True)
'+1 7/8'
>>> convert_to_mixed_numeral(-1.875)
'-1 7/8'
>>> convert_to_mixed_numeral(-.875)
'-7/8'
>>> convert_to_mixed_numeral('-1.875')
'-1 7/8'
>>> convert_to_mixed_numeral('1 7/8', False)
'1 7/8'
>>> convert_to_mixed_numeral('1-7/8', True)
'+1 7/8'
>>> convert_to_mixed_numeral('-1.5')
'-1 1/2'
>>> convert_to_mixed_numeral('6/7', True)
'+6/7'
>>> convert_to_mixed_numeral('1 6/7', False)
'1 6/7'
>>> convert_to_mixed_numeral(0)
'0'
>>> convert_to_mixed_numeral('0')
'0'

round_to_nearest(value, base)[source]

Round value to nearest multiple of base.

Parameters:

value (float) – Value to round.
base – Base multiple to round to (must be >= 1).

Returns:

Rounded value.

Return type:

float

Example:

>>> round_to_nearest(12, 25)
0
>>> round_to_nearest(26, 25)
25

numpy_smooth(x, window_len=11, window='hanning')[source]

Smooth the data using a window with requested size.

https://scipy-cookbook.readthedocs.io/items/SignalSmooth.html

This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends so that transient parts are minimized in the beginning and end part of the output signal.

Parameters:

x – the input signal
window_len – the dimension of the smoothing window; should be an odd integer
window – the type of window from ‘flat’, ‘hanning’, ‘hamming’, ‘bartlett’, ‘blackman’. Flat window will produce a moving average smoothing.

Returns:

the smoothed signal

Example:

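import numpy as np
t = np.arange(-2, 2, 0.1)  # 0.1 is a step size, so arange rather than linspace
x = np.sin(t) + np.random.randn(len(t)) * 0.1
y = numpy_smooth(x)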

See also

numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve, scipy.signal.lfilter

Note

The window parameter could be the window itself if given as an array instead of a string. Note that length(output) != length(input); to correct this, return y[(window_len//2-1):-(window_len//2)] instead of just y.

choose(n, k)[source]

Compute binomial coefficient (n choose k).

Parameters:

n (int) – Total items.
k (int) – Items to choose.

Returns:

Number of combinations.

Return type:

int

Example:

>>> choose(10, 3)
120

sync

Synchronization primitives: timeout decorator and context manager for thread/async operations with configurable timeout handling.

syncd(lock)[source]

Decorator to synchronize functions with a shared lock.

Parameters:: lock – Threading lock to acquire during function execution.
Returns:: Decorator function.

Example:

>>> import threading
>>> lock = threading.Lock()
>>> @syncd(lock)
... def safe_increment(counter):
...     return counter + 1
>>> safe_increment(0)
1

class NonBlockingDelay[source]

Bases: object

Non-blocking delay for checking time elapsed.

timeout()[source]

Check if the delay time has elapsed.

Returns:: True if time is up.
Return type:: bool

delay(delay)[source]

Start a non-blocking delay.

Parameters:: delay (float) – Delay duration in seconds.
Return type:: None
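The start-then-poll pattern behind this class can be sketched with time.monotonic. The class name NonBlockingDelaySketch is hypothetical, and reporting "time is up" before any delay has been started is an assumption.

```python
import time


class NonBlockingDelaySketch:
    """Hypothetical sketch: record a deadline, then poll it without sleeping."""

    def __init__(self):
        self._deadline = 0.0  # assumption: no delay started means already elapsed

    def delay(self, seconds):
        # Store when the delay ends; the caller keeps running in the meantime.
        self._deadline = time.monotonic() + seconds

    def timeout(self):
        return time.monotonic() >= self._deadline
```

A caller would typically start the delay, do other work in a loop, and check timeout() on each pass.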

delay(seconds)[source]

Delay non-blocking for N seconds (busy-wait).

Return type:: None

Deprecated: Use time.sleep() for efficient blocking delays. This function is kept for backward compatibility.

debounce(wait)[source]

Decorator to debounce function calls.

Waits wait seconds before calling function, cancels if called again.

Parameters:: wait (float) – Seconds to wait before executing.
Returns:: Decorator function.
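Debouncing of this kind is commonly built on threading.Timer: each call cancels the pending timer and starts a new one. The sketch below shows that approach; the name debounce_sketch is hypothetical and the library's implementation may differ.

```python
import functools
import threading
import time


def debounce_sketch(wait):
    """Hypothetical sketch: run func only after `wait` quiet seconds."""
    def decorator(func):
        timer = None
        lock = threading.Lock()

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()  # a newer call supersedes the pending one
                timer = threading.Timer(wait, func, args, kwargs)
                timer.start()
        return wrapper
    return decorator


hits = []


@debounce_sketch(0.05)
def record(value):
    hits.append(value)


# Three rapid calls: only the last one should fire.
record(1)
record(2)
record(3)
time.sleep(0.3)
```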

wait_until(hour, minute=0, second=0, tz=datetime.timezone.utc, time_unit='milliseconds')[source]

Calculate time to wait until specified hour/minute/second.

Parameters:

hour (int) – Hour (0-23).
minute (int) – Minute (0-59).
second (int) – Second (0-59).
tz (tzinfo | None) – Timezone (default: UTC).
time_unit (str) – Return unit (‘seconds’ or ‘milliseconds’).

Returns:

Time to wait in specified unit.

Return type:

int

Raises:

ValueError – If hour/minute/second out of range.
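The calculation can be sketched as "find the next occurrence of the target time and subtract now". In this sketch the name wait_until_sketch and the now parameter (added to make the result reproducible) are hypothetical, and rolling over to the next day when the target time has already passed is an assumption.

```python
import datetime


def wait_until_sketch(hour, minute=0, second=0, tz=datetime.timezone.utc,
                      time_unit="milliseconds", now=None):
    """Hypothetical sketch: time remaining until the next hour:minute:second."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("hour/minute/second out of range")
    if now is None:
        now = datetime.datetime.now(tz)
    target = now.replace(hour=hour, minute=minute, second=second, microsecond=0)
    if target <= now:
        # Assumption: a time already past today means the same time tomorrow.
        target += datetime.timedelta(days=1)
    seconds = (target - now).total_seconds()
    return int(seconds * 1000) if time_unit == "milliseconds" else int(seconds)


fixed_now = datetime.datetime(2024, 1, 1, 10, 0, 0, tzinfo=datetime.timezone.utc)
two_hours = wait_until_sketch(12, now=fixed_now, time_unit="seconds")
```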

class timeout(seconds=100, error_message='Timeout!!')[source]

Bases: object

Context manager for timing out potentially hanging code.

Parameters:

seconds (int) – Timeout in seconds (default: 100).
error_message (str) – Message for timeout error.

Warning

Uses SIGALRM and only works on Unix/Linux systems.

handle_timeout(signum, frame)[source]

crypto

Cryptography and encoding utilities: base64 file encoding, hashing.

base64file(fil)[source]

Encode file contents as base64.

Parameters:: fil – Path to file to encode.
Returns:: Base64 encoded bytes.
Return type:: bytes

Example:

>>> import tempfile
>>> with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
...     _ = f.write('hello world')
>>> base64file(f.name)
b'aGVsbG8gd29ybGQ=\n'

Note

This function reads the entire file into memory. Use with caution on large files.

kryptophy(blah)[source]

Converts a string to an integer by concatenating hex values of characters.

Parameters:: blah (str) – String to convert.
Returns:: Integer representation of the string.
Return type:: int

Example:

>>> kryptophy('AB')
16706
>>> kryptophy('hello')
448378203247

geo

Geographic utilities: Mercator projection coordinate transformations (merc_x, merc_y) for mapping applications.

merc_x(lon, r_major=6378137.0)[source]

Project longitude into mercator / radians from major axis.

Parameters:

lon (float) – Longitude in degrees.
r_major (float) – Major axis radius in meters (default: Earth WGS84).

Returns:

Mercator x coordinate.

Return type:

float

Example:

>>> "{:0.3f}".format(merc_x(40.7484))
'4536091.139'
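The doctest value above is consistent with the usual Mercator relation x = r_major * radians(lon). A minimal sketch under that assumption follows; the name merc_x_sketch is hypothetical.

```python
import math


def merc_x_sketch(lon, r_major=6378137.0):
    """Hypothetical sketch: Mercator x as major radius times longitude in radians."""
    return r_major * math.radians(lon)


x = merc_x_sketch(40.7484)
```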

merc_y(lat, r_major=6378137.0, r_minor=6356752.3142)[source]

Project latitude into mercator / radians from major/minor axes.

Parameters:

lat (float) – Latitude in degrees.
r_major (float) – Major axis radius in meters (default: Earth WGS84).
r_minor (float) – Minor axis radius in meters (default: Earth WGS84).

Returns:

Mercator y coordinate.

Return type:

float

Example:

>>> "{:0.3f}".format(merc_y(73.9857))
'12468646.871'

dir

Directory operations: recursive creation, temporary directories, file searching, safe moving, directory structure inspection, file downloading.

Directory and file system utilities.

Note

os.walk and scandir were slow over network connections in Python 2.

mkdir_p(path)[source]

Create directory and any missing parent directories.

Parameters:: path – Directory path to create.

Example:

>>> import tempfile, os
>>> tmpdir = tempfile.mkdtemp()
>>> newdir = os.path.join(tmpdir, 'a', 'b', 'c')
>>> mkdir_p(newdir)
>>> os.path.isdir(newdir)
True

make_tmpdir(prefix=None)[source]

Context manager to wrap a temporary directory with auto-cleanup.

Parameters:: prefix (str) – Optional prefix directory for the temp dir.
Yields:: Path to the temporary directory.
Return type:: Path

Example:

>>> import os.path
>>> fpath = ""
>>> with make_tmpdir() as basedir:
...     fpath = os.path.join(basedir, 'temp.txt')
...     with open(fpath, "w") as file:
...         file.write("We expect the file to be deleted when context closes")
52
>>> try:
...     file = open(fpath, "w")
... except IOError as io:
...     raise Exception('File does not exist')
Traceback (most recent call last):
...
Exception: File does not exist

expandabspath(p)[source]

Expand path to absolute path with environment variables and user expansion.

Parameters:: p (str) – Path string to expand.
Returns:: Absolute path with all expansions applied.
Return type:: Path

Example:

>>> import os
>>> os.environ['SPAM'] = 'eggs'
>>> assert expandabspath('~/$SPAM') == Path(os.path.expanduser('~/eggs'))
>>> assert expandabspath('/foo') == Path('/foo')

get_directory_structure(rootdir)[source]

Create a nested dictionary representing the folder structure.

Parameters:: rootdir (str) – Root directory to traverse.
Returns:: Nested dictionary of the directory structure.
Return type:: dict

Example:

>>> import tempfile, os
>>> tmpdir = tempfile.mkdtemp()
>>> os.makedirs(os.path.join(tmpdir, 'sub'))
>>> Path(os.path.join(tmpdir, 'file.txt')).touch()
>>> result = get_directory_structure(tmpdir)
>>> 'file.txt' in result[os.path.basename(tmpdir)]
True

search(rootdir, name=None, extension=None)[source]

Search for files by name, extension, or both in directory.

Parameters:

rootdir (str) – Root directory to search.
name (str) – Optional file name pattern to match.
extension (str) – Optional file extension to match.

Yields:

Path – Full path to each matching file.

See also

See tests/test_dir.py for usage examples.

safe_move(source, target, hard_remove=False)[source]

Move a file to a new location, optionally deleting anything in the way.

Parameters:

source (str) – Source file path.
target (str) – Target file path.
hard_remove (bool) – If True, delete existing file at target first.

Returns:

Final target path (may differ if conflict occurred).

Return type:

Path

See also

See tests/test_dir.py for usage examples.

save_file_tmpdir(fname, content, thedate=None, **kw)[source]

Save a document to the specified temp directory, optionally with date.

Parameters:

fname (str) – Filename pattern.
content (str) – Content to write.
thedate – Optional date to append to filename.
kw – Keyword arguments (tmpdir to specify custom temp directory).

Example:

>>> import datetime
>>> content = "<html>...</html>"
>>> save_file_tmpdir("Foobar.txt", content, thedate=datetime.date.today())

get_dir_match(dir_pattern, thedate=None)[source]

Get paths of existing files matching each glob pattern.

Filters zero-size files and returns warnings for missing patterns.

Parameters:

dir_pattern – List of (directory, pattern) tuples.
thedate – Optional date to append to patterns.

Returns:

Tuple of (results list of Path objects, warnings list of strings).

Return type:

tuple[list[Path], list[str]]

See also

See tests/test_dir.py for usage examples.

load_files(directory, pattern='*', thedate=None)[source]

Load file contents from directory matching pattern.

Parameters:

directory (str) – Directory to search.
pattern (str) – Glob pattern to match files.
thedate – Optional date to append to pattern.

Yields:

File contents as strings.

See also

See tests/test_dir.py for usage examples.

load_files_tmpdir(patterns='*', thedate=None)[source]

Load files from temp directory matching patterns.

Parameters:

patterns – Glob pattern(s) to match files.
thedate – Optional date to append to patterns.

Returns:

Iterator of file contents.

Example:

>>> import datetime
>>> patterns = ("nonexistent_pattern_*.txt",)
>>> results = load_files_tmpdir(patterns, datetime.date.today())
>>> next(results, None) is None
True

dir_to_dict(path)[source]

Convert directory structure to a dictionary.

Parameters:: path (str) – Directory path to convert.
Returns:: Dictionary with subdirectories as nested dicts and .files key for files.
Return type:: dict

Example:

>>> import tempfile, os
>>> tmpdir = tempfile.mkdtemp()
>>> Path(os.path.join(tmpdir, 'test.txt')).touch()
>>> result = dir_to_dict(tmpdir)
>>> 'test.txt' in result['.files']
True

download_file(url, save_path=None)[source]

Download file from URL with progress bar and retry logic.

Parameters:

url (str) – URL to download from.
save_path (str or Path) – Optional path to save file (defaults to temp directory).

Returns:

Path to downloaded file.

Return type:

Path

See also

See tests/test_dir.py for usage examples.

splitall(path)[source]

Split path into all its components.

Works with both Unix and Windows paths.

Parameters:: path (str) – Path string to split.
Returns:: List of path components.
Return type:: list
Raises:: TypeError – If path is not a string.

Example:

>>> splitall('a/b/c')
['a', 'b', 'c']
>>> splitall('/a/b/c/')
['/', 'a', 'b', 'c', '']
>>> splitall('/')
['/']
>>> splitall('C:')
['C:']
>>> splitall('C:\\')
['C:\\']
>>> splitall('C:\\a')
['C:\\', 'a']
>>> splitall('C:\\a\\')
['C:\\', 'a', '']
>>> splitall('C:\\a\\b')
['C:\\', 'a', 'b']
>>> splitall('a\\\\b')
['a', 'b']

resplit(path, *args)[source]

Split path by multiple separators.

Parameters:

path (str) – Path to split.
args – Separator characters to split on.

Returns:

List of path components.

Return type:

list

Warning

Tests pass on Windows but not on *nix. Not safe to use!

exception

Exception handling: print exceptions with traceback control, try_else wrapper for default values on failure.

print_exception(e, short=True)[source]

Print exception traceback with optional verbosity.

Parameters:

e (Exception) – The exception to print.
short (bool) – If True, prints only traceback above current frame.

Example:

>>> try:
...     raise ValueError("example error")
... except Exception as e:
...     print_exception(e)  

try_else(func, default=None)[source]

Wrap function to return default value if it fails.

Parameters:

func – Function to wrap.
default – Default value or callable to return on failure.

Returns:

Wrapped function.

Example:

>>> import json
>>> d = try_else(json.loads, 2)('{"a": 1, "b": "foo"}')
>>> d
{'a': 1, 'b': 'foo'}
>>> repl = lambda x: 'foobar'
>>> d = try_else(json.loads, repl)('{"a": 1, b: "foo"}')
>>> d
'foobar'
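The wrapping pattern shown above can be sketched in a few lines. This is an assumption-laden illustration, not the library's code: try_else_sketch is a hypothetical name, and passing the original arguments to a callable default is inferred from the example rather than documented.

```python
import functools
import json


def try_else_sketch(func, default=None):
    """Return a wrapper that falls back to `default` when `func` raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Assumed behaviour: a callable default is invoked with the
            # original arguments; any other default is returned as-is.
            if callable(default):
                return default(*args, **kwargs)
            return default
    return wrapper


safe_loads = try_else_sketch(json.loads, default={})
print(safe_loads('{"a": 1}'))  # parsed normally
print(safe_loads("not json"))  # falls back to the default
```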

future

Future pattern implementation for running functions asynchronously in separate threads with result retrieval.

class Future(func, *param)[source]

Bases: object

Easy threading via a Future pattern.

Runs a time-consuming function in a separate thread while allowing the main thread to continue uninterrupted.

Parameters:

func – Function to run in background thread.
param – Arguments to pass to the function.

Note

Algorithm from http://code.activestate.com/recipes/84317/

Example:

>>> import time, math
>>> def wait_and_add(x):
...     time.sleep(2)
...     return x+1

Won’t wait 2 seconds here:

>>> start = time.time()
>>> z = Future(wait_and_add, 2)
>>> 1+2
3
>>> int(math.ceil(time.time()-start))
1

At this point we need to wait the 2 seconds:

>>> z()
3
>>> int(math.ceil(time.time()-start)) >= 2
True
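For readers who want to see the pattern itself, the following is a minimal sketch of a callable future built on threading. FutureSketch is a hypothetical name; the real class follows the ActiveState recipe cited above and may differ in detail, including how it handles exceptions.

```python
import threading
import time


class FutureSketch:
    """Start `func(*param)` in a thread; calling the object waits for the result."""

    def __init__(self, func, *param):
        self._done = threading.Event()
        self._result = None
        self._error = None
        worker = threading.Thread(target=self._run, args=(func, param), daemon=True)
        worker.start()

    def _run(self, func, param):
        try:
            self._result = func(*param)
        except Exception as exc:
            # Remember the error so it can be re-raised in the caller's thread.
            self._error = exc
        finally:
            self._done.set()

    def __call__(self):
        self._done.wait()  # blocks only if the work is still running
        if self._error is not None:
            raise self._error
        return self._result


def slow_increment(x):
    time.sleep(0.1)
    return x + 1


z = FutureSketch(slow_increment, 2)  # returns immediately
print(z())                           # waits for the thread, then prints the result
```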

Wrapper(func, param)[source]

pandasutils

Pandas utilities: interval-based operations, multi-column table display, downcast (memory optimization), fuzzymerge (fuzzy string matching for joins). Re-exports all pandas functionality.

Pandas wrappers and utilities

This module provides utility functions for pandas DataFrames and Series, including null checking, type downcasting, fuzzy merging, and timezone data.

is_null(x)[source]

Check if value is null/None (pandas required).

For array-like inputs (list, numpy array), returns True only if ALL elements are null. This avoids the “ambiguous truth value” error that occurs when using pandas.isnull() on arrays in boolean contexts.

Parameters:

x – Value to check.

Returns:

True if value is null/None/NaN, or if array-like and all elements are null.

Return type:

bool

Example:

>>> import datetime
>>> import numpy as np
>>> assert is_null(None)
>>> assert not is_null(0)
>>> assert is_null(np.nan)
>>> assert not is_null(datetime.date(2000, 1, 1))
>>> assert is_null([])
>>> assert is_null([None, None])
>>> assert not is_null([1, 2, 3])
>>> assert not is_null([None, 1])

download_tzdata()[source]

Download timezone data for pyarrow date wrangling.

Downloads to the “Downloads” folder.

downcast(df, rtol=1e-05, atol=1e-08, numpy_dtypes_only=False)[source]

Downcast DataFrame to minimum viable type for each column.

Ensures resulting values are within tolerance of original values.

Parameters:

df (DataFrame) – DataFrame to downcast.
rtol (float) – Relative tolerance for numeric comparison.
atol (float) – Absolute tolerance for numeric comparison.
numpy_dtypes_only (bool) – Use only numpy dtypes.

Returns:

Downcasted DataFrame.

Return type:

DataFrame

Note

See numpy.allclose for tolerance parameters.

Example:

>>> from numpy import linspace, random
>>> from pandas import DataFrame
>>> data = {
... "integers": linspace(1, 100, 100),
... "floats": linspace(1, 1000, 100).round(2),
... "booleans": random.choice([1, 0], 100),
... "categories": random.choice(["foo", "bar", "baz"], 100)}
>>> df = DataFrame(data)
>>> downcast(df, rtol=1e-10, atol=1e-10).info()
<class 'pandas.core.frame.DataFrame'>
...
dtypes: bool(1), category(1), float64(1), uint8(1)
memory usage: 1.3 KB
>>> downcast(df, rtol=1e-05, atol=1e-08).info()
<class 'pandas.core.frame.DataFrame'>
...
dtypes: bool(1), category(1), float32(1), uint8(1)
memory usage: 964.0 bytes
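The role of rtol and atol can be illustrated with a small sketch that picks the narrowest float dtype whose values still pass numpy.allclose against the original. The function name smallest_float_dtype is hypothetical, and the sketch covers floats only; how downcast itself chooses among candidate types is not documented here.

```python
import numpy as np


def smallest_float_dtype(values, rtol=1e-05, atol=1e-08):
    """Return the narrowest float dtype that reproduces `values` within tolerance."""
    values = np.asarray(values, dtype=np.float64)
    for dtype in (np.float16, np.float32):
        with np.errstate(over="ignore"):
            candidate = values.astype(dtype)
        # Accept the narrower type only if every element stays within tolerance.
        if np.allclose(candidate, values, rtol=rtol, atol=atol):
            return np.dtype(dtype)
    return np.dtype(np.float64)


floats = np.linspace(1, 1000, 100).round(2)
print(smallest_float_dtype(floats))                          # default tolerance
print(smallest_float_dtype(floats, rtol=1e-10, atol=1e-10))  # strict tolerance
```

With the default tolerances the rounding error of 32-bit floats is acceptable; tightening both tolerances to 1e-10 rules out every narrower type, which mirrors the float32 versus float64 difference in the example above.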

fuzzymerge(df1, df2, right_on, left_on, usedtype='uint8', scorer='WRatio', concat_value=True, **kwargs)[source]

Merge two DataFrames using fuzzy matching on specified columns.

Performs fuzzy matching between DataFrames based on specified columns, useful for matching data with small variations like typos or abbreviations.

Parameters:

df1 (DataFrame) – First DataFrame to merge.
df2 (DataFrame) – Second DataFrame to merge.
right_on (str) – Column name in df2 for matching.
left_on (str) – Column name in df1 for matching.
usedtype – Data type for distance matrix (default: uint8).
scorer – Scoring function for fuzzy matching (default: WRatio).
concat_value (bool) – Add similarity scores column (default: True).
kwargs – Additional arguments for pandas.merge.

Returns:

Merged DataFrame with fuzzy-matched rows.

Return type:

DataFrame

Example:

>>> from numpy import random, uint8
>>> from pandas import concat, read_csv
>>> from rapidfuzz.fuzz import partial_ratio  # assumed source of the partial_ratio scorer
>>> df1 = read_csv(
...     "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
... )
>>> df2 = df1.copy()
>>> df2 = concat([df2 for x in range(3)], ignore_index=True)
>>> df2.Name = (df2.Name + random.uniform(1, 2000, len(df2)).astype("U"))
>>> df1 = concat([df1 for x in range(3)], ignore_index=True)
>>> df1.Name = (df1.Name + random.uniform(1, 2000, len(df1)).astype("U"))
>>> df3 = fuzzymerge(df1, df2, right_on='Name', left_on='Name', usedtype=uint8, scorer=partial_ratio,
...                         concat_value=True)

rand

Random number generation with enhanced seeding, sampling, and distribution functions.

random_choice(choices)[source]

Random choice from a list, seeded with OS entropy.

Parameters:

choices (list) – List of items to choose from.

Returns:

Randomly selected item.

Example:

>>> result = random_choice(['a', 'b', 'c'])
>>> result in ['a', 'b', 'c']
True

random_int(a, b)[source]

Random integer between a and b inclusive, seeded with OS entropy.

Parameters:

a (int) – Lower bound.
b (int) – Upper bound.

Returns:

Random integer in [a, b].

Return type:

int

Example:

>>> result = random_int(1, 10)
>>> 1 <= result <= 10
True
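One way to obtain an inclusive random integer from a generator seeded with OS entropy is sketched below using only the standard library. The helper os_seeded_randint is hypothetical; how random_int seeds its generator internally is not stated here, so treat the seeding step as an assumption.

```python
import os
import random


def os_seeded_randint(a, b):
    """Return an integer in [a, b] from a generator seeded with OS entropy."""
    # os.urandom draws bytes from the operating system's entropy source.
    rng = random.Random(os.urandom(16))
    return rng.randint(a, b)  # randint includes both endpoints


value = os_seeded_randint(1, 10)
print(1 <= value <= 10)
```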

random_sample(arr, size=1)[source]

Random sample of N elements from numpy array.

Parameters:

arr (np.array) – Array to sample from.
size (int) – Number of elements to sample.

Returns:

Array of sampled elements.

Return type:

np.array

random_random()[source]

Random float in [0, 1), seeded with OS entropy.

Returns:

Random float.

Return type:

float

Example:

>>> result = random_random()
>>> 0 <= result < 1
True

thread

Threading utilities: asyncd decorator for async-style syntax, threaded decorator for background execution in thread pools.

asyncd(func)[source]

Decorator to run synchronous function asynchronously.

Parameters:

func – Synchronous function to wrap.

Returns:

Async wrapper function.

Note

Based on https://stackoverflow.com/a/50450553
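A minimal sketch of this kind of decorator, assuming it delegates the blocking call to the event loop's default executor, could look as follows. The name asyncd_sketch is hypothetical and the real decorator may accept extra options.

```python
import asyncio
import functools
import time


def asyncd_sketch(func):
    """Wrap a blocking function so it can be awaited without blocking the loop."""
    @functools.wraps(func)
    async def run(*args, **kwargs):
        loop = asyncio.get_running_loop()
        call = functools.partial(func, *args, **kwargs)
        # None selects the loop's default ThreadPoolExecutor.
        return await loop.run_in_executor(None, call)
    return run


@asyncd_sketch
def slow_square(x):
    time.sleep(0.1)  # stands in for blocking I/O
    return x * x


async def main():
    # Both calls run in worker threads at the same time.
    return await asyncio.gather(slow_square(2), slow_square(3))


results = asyncio.run(main())
print(results)
```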

call_with_future(fn, future, args, kwargs)[source]

Call function and set result on future.

Parameters:

fn – Function to call.
future – Future to set result on.
args – Positional arguments for fn.
kwargs – Keyword arguments for fn.
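The helper's contract can be sketched with the standard library, assuming the future is a concurrent.futures.Future and that exceptions are stored on the future rather than propagated. The name call_with_future_sketch is hypothetical.

```python
from concurrent.futures import Future
from threading import Thread


def call_with_future_sketch(fn, future, args, kwargs):
    """Run fn(*args, **kwargs) and record the outcome on `future`."""
    try:
        future.set_result(fn(*args, **kwargs))
    except Exception as exc:
        # Assumed behaviour: the error surfaces when the caller asks for the result.
        future.set_exception(exc)


def add(a, b):
    return a + b


future = Future()
Thread(target=call_with_future_sketch, args=(add, future, (1, 2), {})).start()
print(future.result())  # blocks until the worker thread has finished
```

A decorator such as threaded can be built on this helper by creating the Future, starting the thread, and returning the Future to the caller.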

class RateLimitedExecutor(max_workers=None, max_per_second=inf, show_progress=False)[source]

Bases: object

Thread pool executor with rate limiting and Request/Response API.

Provides a clean request/response API in which every response includes the original request, which simplifies result tracking and exception handling.

Basic Usage:

with RateLimitedExecutor(max_workers=10, max_per_second=5, show_progress=True) as executor:
    responses = executor.execute_items(process_fn, items, desc='Processing')
    for response in responses:
        if response.success:
            print(f"Item {response.request.id}: {response.result}")
        else:
            print(f"Item {response.request.id} failed: {response.exception}")

Advanced Usage with Custom IDs:

requests = [TaskRequest(item=x, id=f'custom_{i}') for i, x in enumerate(items)]
responses = executor.execute(process_fn, requests, desc='Processing')
result_map = {r.request.id: r.result for r in responses if r.success}

__init__(max_workers=None, max_per_second=inf, show_progress=False)[source]

Initialize rate-limited executor.

Parameters:

max_workers (int) – Maximum concurrent threads.
max_per_second (float) – Maximum calls per second.
show_progress (bool) – Display progress bar during execution.
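To make the effect of max_per_second concrete, here is one possible throttling scheme: each call reserves the next free time slot and sleeps until it arrives. MinIntervalLimiter is a hypothetical class; the strategy RateLimitedExecutor actually uses is not described here.

```python
import threading
import time


class MinIntervalLimiter:
    """Allow at most `max_per_second` acquisitions per second across threads."""

    def __init__(self, max_per_second=float("inf")):
        self._interval = 1.0 / max_per_second  # 0.0 when unlimited
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def acquire(self):
        with self._lock:
            now = time.monotonic()
            wait = self._next_slot - now
            # Reserve the slot after this one for the next caller.
            self._next_slot = max(now, self._next_slot) + self._interval
        if wait > 0:
            time.sleep(wait)  # sleep outside the lock so others can queue up


limiter = MinIntervalLimiter(max_per_second=20)
start = time.monotonic()
for _ in range(3):
    limiter.acquire()
elapsed = time.monotonic() - start
print(f"3 calls took {elapsed:.2f}s")
```

The first call passes immediately and each later call waits for its slot, so three calls at 20 per second take roughly a tenth of a second.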

execute(fn, requests, desc='Processing', unit='item')[source]

Execute function on all requests and return responses in order.

Parameters:

fn (Callable) – Function that takes request.item and returns a result.
requests (list) – List of TaskRequest objects to process.
desc (str) – Description for progress bar.
unit (str) – Unit name for progress bar.

Returns:

List of TaskResponse objects in same order as requests.

Return type:

list[TaskResponse]

execute_items(fn, items, desc='Processing', unit='item')[source]

Execute function on items with auto-generated request IDs.

Parameters:

fn (Callable) – Function that takes an item and returns a result.
items (list) – List of items to process.
desc (str) – Description for progress bar.
unit (str) – Unit name for progress bar.

Returns:

List of TaskResponse objects with request.id = index.

Return type:

list[TaskResponse]

submit(fn, *args, **kwargs)[source]

Submit a callable to be executed with rate limiting.

Parameters:

fn – Callable to execute.
args – Positional arguments.
kwargs – Keyword arguments.

Returns:

Future representing the result.

Return type:

Future

shutdown(wait=True, cancel_futures=False)[source]

Shutdown the executor.

Parameters:

wait (bool) – Wait for pending futures to complete.
cancel_futures (bool) – Cancel pending futures.

Return type:

None

class TaskRequest(item, id=None)[source]

Bases: object

Request to process an item with optional ID for tracking.

Parameters:

item (Any) – The item to process.
id (Any) – Optional identifier for tracking.

item: Any

id: Any = None

class TaskResponse(result, request, exception=None)[source]

Bases: object

Response from processing a request.

Parameters:

result (Any) – The result from processing.
request (TaskRequest) – The original TaskRequest.
exception (Exception) – Any exception that occurred (None if successful).

result: Any

request: TaskRequest

exception: Exception = None

property success: bool

Check if task completed successfully.

Returns:

True if no exception occurred.

Return type:

bool
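The field listings above suggest plain dataclasses. The sketch below mirrors the documented fields under that assumption; the class names carry a Sketch suffix and run_one is a hypothetical helper, so none of this should be read as the library's source.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TaskRequestSketch:
    item: Any
    id: Any = None


@dataclass
class TaskResponseSketch:
    result: Any
    request: TaskRequestSketch
    exception: Optional[Exception] = None

    @property
    def success(self) -> bool:
        # A response is successful when no exception was captured.
        return self.exception is None


def run_one(fn, request):
    """Call fn on the request's item and wrap the outcome in a response."""
    try:
        return TaskResponseSketch(result=fn(request.item), request=request)
    except Exception as exc:
        return TaskResponseSketch(result=None, request=request, exception=exc)


requests = [TaskRequestSketch(item=x, id=f"n{x}") for x in (5, 0)]
responses = [run_one(lambda x: 10 // x, r) for r in requests]
for response in responses:
    print(response.request.id, response.success, response.result)
```

Keeping the request on the response means a failed item can be identified and retried without any separate bookkeeping.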

threaded(fn)[source]

Decorator to run function in a separate thread.

Returns a Future that can be used to get the result.

Note

Based on https://stackoverflow.com/a/19846691

Example:

>>> class MyClass:
...     @threaded
...     def get_my_value(self):
...         return 1
>>> my_obj = MyClass()
>>> fut = my_obj.get_my_value()  # this will run in a separate thread
>>> fut.result()  # will block until result is computed
1

chart

Plotting utilities for creating charts, particularly time series visualizations using matplotlib.

numpy_timeseries_plot(title, dates, series=None, labels=None, formats=None)[source]

Create a matplotlib timeseries plot with automatic subplot layout.

The layout adapts based on the number of series:

1 series: single plot
2 series: same plot with dual y-axes (overlapping)
3+ series: stacked vertically as subplots

Parameters:

title (str) – Plot title.
dates – Array of dates for x-axis.
series (list) – List of y-value arrays.
labels (list) – Labels for each series.
formats (list) – Formatter functions for each y-axis.

Returns:

BytesIO buffer containing the PNG image.

Return type:

BytesIO

win

Windows-specific utilities: command execution, psexec sessions, file share mounting, WMIC output parsing.

Windows Utilities

run_command(cmd, workingdir=None, raise_on_error=True, hidearg=None)[source]

Execute a shell command and return output.

Parameters:

cmd – Command as string or list of arguments.
workingdir (str) – Directory to execute in (optional).
raise_on_error (bool) – Raise exception on non-zero return code.
hidearg (str) – Argument value to mask in logs (for passwords).

Returns:

Combined stdout and stderr output.

Return type:

bytes

Raises:

Exception – If command fails and raise_on_error is True.

See also

See tests/test_win.py for usage examples.
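The documented behaviour (combined output, optional raising, masked arguments) can be approximated with subprocess as in the sketch below. run_command_sketch is a hypothetical stand-in: the exception type, the mask text, and the use of shell=True for string commands are all assumptions.

```python
import subprocess
import sys


def run_command_sketch(cmd, workingdir=None, raise_on_error=True, hidearg=None):
    """Run `cmd` and return stdout and stderr combined as bytes."""
    shown = cmd if isinstance(cmd, str) else " ".join(cmd)
    if hidearg:
        # Mask the sensitive value before it reaches any message or log.
        shown = shown.replace(hidearg, "****")
    proc = subprocess.run(
        cmd,
        cwd=workingdir,
        shell=isinstance(cmd, str),  # strings go through the shell, lists do not
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,    # merge stderr into stdout
    )
    if raise_on_error and proc.returncode != 0:
        raise RuntimeError(f"Command failed ({proc.returncode}): {shown}")
    return proc.stdout


out = run_command_sketch([sys.executable, "-c", "print('hello')"])
print(out)
```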

class psexec_session(host, password)[source]

Bases: object

Context manager for running psexec commands.

Mounts admin share before commands and unmounts on exit.

Parameters:

host (str) – Remote host name or IP.
password (str) – Password for authentication.

Example:

with shell.psexec_session(host, password):
    for cmd in commands:
        out = shell.run_command(cmd)

class file_share_session(host, password, drive, share)[source]

Bases: object

Context manager for temporarily mounting a file share.

Mounts share before commands and unmounts on exit.

Parameters:

host (str) – Remote host name or IP.
password (str) – Password for authentication.
drive (str) – Local drive letter to mount to.
share (str) – Remote share name.

Example:

with shell.file_share_session(host, password, 'Z:', 'data'):
    for cmd in commands:
        out = shell.run_command(cmd)
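Both session classes follow the same mount-then-unmount shape, which can be sketched with contextlib. Everything here is illustrative: fake_mount is a hypothetical stand-in that only records calls, and share_session_sketch is not the library's class.

```python
from contextlib import contextmanager

calls = []  # records what a real implementation would send to the OS


def fake_mount(host, drive, share, unmount=False):
    """Hypothetical stand-in for mounting or unmounting a share."""
    calls.append(("unmount" if unmount else "mount", host, drive, share))


@contextmanager
def share_session_sketch(host, drive, share):
    fake_mount(host, drive, share)
    try:
        yield
    finally:
        # Runs even if the body raises, so the share is never left mounted.
        fake_mount(host, drive, share, unmount=True)


with share_session_sketch("server01", "Z:", "data"):
    calls.append(("work",))

print([c[0] for c in calls])
```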

mount_admin_share(host, password, unmount=False)[source]

Mount or unmount the admin$ share required for psexec commands.

Resolves host to IP address to avoid Windows multiple-connection errors.

Parameters:

host (str) – Remote host name.
password (str) – Password for authentication.
unmount (bool) – If True, unmount instead of mount.

Note

Connects by IP address to work around Windows complaining about multiple connections to a share by the same user.

mount_file_share(host, password, drive, share, unmount=False)[source]

Mount or unmount a Windows file share.

Parameters:

host (str) – Remote host name.
password (str) – Password for authentication.
drive (str) – Local drive letter to mount to.
share (str) – Remote share name.
unmount (bool) – If True, unmount instead of mount.

parse_wmic_output(output)[source]

Parse output from WMIC query into list of dicts.

Parameters:

output (str) – Raw WMIC output string.

Returns:

List of dictionaries with column headers as keys.

Return type:

list[dict]

Example:

>>> wmic_output = os.popen('wmic product where name="Python 2.7.11" get Caption, Description, Vendor').read()
>>> result = parse_wmic_output(wmic_output)
>>> result[0]['Caption']
>>> result[0]['Vendor']
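WMIC prints tables as fixed-width columns under a header row, so one way to parse them is to take column boundaries from the header. The sketch below does that on a made-up sample; parse_fixed_width is a hypothetical name, and it assumes header names contain no spaces.

```python
import re


def parse_fixed_width(output):
    """Turn a fixed-width text table into a list of dicts keyed by header name."""
    lines = [line.rstrip() for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0]
    # Each column starts where a header word starts and runs to the next one.
    starts = [m.start() for m in re.finditer(r"\S+", header)]
    bounds = list(zip(starts, starts[1:] + [None]))
    keys = [header[a:b].strip() for a, b in bounds]
    return [
        {key: line[a:b].strip() for key, (a, b) in zip(keys, bounds)}
        for line in lines[1:]
    ]


# Made-up sample in the shape WMIC produces (columns padded to equal width).
sample = "\n".join([
    "Caption".ljust(15) + "Vendor",
    "",
    "Python 2.7.11".ljust(15) + "Python Software Foundation",
])
rows = parse_fixed_width(sample)
print(rows[0]["Caption"])
print(rows[0]["Vendor"])
```

Slicing by header position rather than splitting on whitespace keeps values that contain spaces, such as the vendor name, intact.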

exit_cmd()[source]

Kill all running cmd.exe processes via WMI.

Requires pywin32 to be installed on Windows.