Specialized Utilities
Domain-specific utilities for web, math, concurrency, and more.
webapp
Web utilities for Flask and web.py applications: CORS support, XSRF protection, authentication decorators, URL building, breadcrumb generation, HTML escaping, JSON encoders with ISO date support.
- get_or_create(session, model, **kw)[source]
Get existing model instance or create new one (Django-style).
- Parameters:
session – SQLAlchemy session.
model – SQLAlchemy model class.
kw – Keyword arguments for filtering/creating.
- Returns:
Existing or newly created model instance.
- paged(order_by_df, per_page_df)[source]
Decorator to pass in default order/page/per page for pagination.
Steps performed:
Acquire the thread-local request object
Calculate pagination order by/offset/limit from request object
Patch the info into a database connection
- Parameters:
- Returns:
Decorator function.
Warning
Be careful not to patch multiple queries within a single controller.
- rsleep(always=0, rand_extra=8)[source]
Sleep for a random amount of time.
- rand_retry(x_times=10, exception=<class 'Exception'>)[source]
Decorator that retries function with random delays.
Useful for avoiding automated thresholding on web requests.
- Parameters:
x_times (int) – Maximum number of retries.
exception – Exception type(s) to catch and retry on.
- Returns:
Decorator function.
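The retry-with-random-delay idea can be sketched as follows. This is a minimal illustration, not the library's implementation: the name `rand_retry_sketch` and the `max_delay` parameter are hypothetical, and treating `x_times` as the total number of attempts is an assumption.

```python
import functools
import random
import time


def rand_retry_sketch(x_times=10, exception=Exception, max_delay=0.01):
    """Retry the wrapped function, sleeping a random delay between attempts.

    Assumption: x_times counts total attempts; max_delay is a hypothetical
    knob that keeps this demo fast.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(x_times):
                try:
                    return fn(*args, **kwargs)
                except exception:
                    if attempt == x_times - 1:
                        raise  # out of attempts: surface the last error
                    time.sleep(random.uniform(0, max_delay))
        return wrapper
    return decorator


calls = {"n": 0}


@rand_retry_sketch(x_times=5, exception=ValueError)
def flaky():
    """Fail twice, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("transient failure")
    return "ok"


result = flaky()
```

Exceptions that do not match `exception` propagate immediately, so only the failures you expect to be transient are retried.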
- cors_webpy(app, **kw)[source]
Wrap a web.py controller with CORS headers.
Especially useful for views using resources from many websites.
- Parameters:
app – web.py application instance.
kw – CORS options (origin, credentials, methods, headers, max_age, attach_to_all, automatic_options).
- Returns:
Decorator function.
- cors_flask(app, **kw)[source]
Wrap a Flask controller with CORS headers.
Especially useful for views using resources from many websites.
- Parameters:
app – Flask application instance.
kw – CORS options (origin, credentials, methods, headers, max_age, attach_to_all, automatic_options).
- Returns:
Decorator function.
- authd(checker_fn, fallback_fn)[source]
Decorator that checks if user meets an auth criterion.
Works with both web.py and Flask frameworks.
- Parameters:
checker_fn – Callable that returns True if authorized.
fallback_fn – Callable to invoke if not authorized.
- Returns:
Decorator function.
- xsrf_token()[source]
Generate cross-site request forgery protection token.
- Returns:
XSRF token string.
- Return type:
Note
TODO: Add the xsrf tokens to forms.
- xsrf_protected(fn)[source]
Decorator protecting PUT/POST requests from session riding.
- Parameters:
fn – Function to protect.
- Returns:
Wrapped function.
Note
TODO: Decorate controllers for xsrf protected forms.
- valid_api_key(key)[source]
Check if key has valid format.
Validates format only (alphanumeric, underscore, hyphen, 1-255 chars). For user validation, integrate with your user model’s key validation.
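A format check matching the description above (alphanumeric, underscore, hyphen, 1-255 characters) might look like the sketch below. The name `valid_api_key_sketch` and the exact regular expression are assumptions, not the library's code.

```python
import re

# Assumption: ASCII letters, digits, underscore and hyphen, 1 to 255 chars.
_API_KEY_RE = re.compile(r"[A-Za-z0-9_-]{1,255}")


def valid_api_key_sketch(key):
    """Return True if key has an acceptable format (no user lookup)."""
    return isinstance(key, str) and _API_KEY_RE.fullmatch(key) is not None
```

Because slashes and dots are rejected, a key such as `../etc/passwd` fails the check before it can reach any file-system or database lookup.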
- requires_api_key(fn)[source]
Decorator requiring valid API key for controller access.
Protects against directory traversal attacks and permission issues.
- Parameters:
fn – Controller function to protect.
- Returns:
Wrapped function.
- make_url(path, **params)[source]
Generate URL with query parameters.
Inspired by werkzeug.urls.Href. Assumes traditional multiple params (does not overwrite). Use __replace__ to overwrite params. Use __ignore__ to filter out certain params.
- Parameters:
path (str) – Base URL path.
params – Query parameters.
- Returns:
Complete URL with query string.
- Return type:
Example:
>>> ignore_fn = lambda x: x.startswith('_')
>>> kw = dict(fuz=1, biz="boo")
>>> make_url('/foo/', _format='excel', __ignore__=ignore_fn, **kw)
'/foo/?fuz=1&biz=boo'
>>> make_url('/foo/?bar=1', _format='excel', **kw)
'/foo/?_format=excel&fuz=1&biz=boo&bar=1'
>>> make_url('/foo/', bar=1, baz=2)
'/foo/?bar=1&baz=2'
>>> make_url('/foo/', **{'bar':1, 'fuz':(1,2,), 'biz':"boo"})
'/foo/?bar=1&fuz=1&fuz=2&biz=boo'
>>> make_url('/foo/?a=1&a=2')
'/foo/?a=1&a=2'
>>> kwargs = dict(fuz=1, biz="boo", __ignore__=ignore_fn)
>>> xx = make_url('www.foobar.com/foo/', **kwargs)
>>> 'www' in xx and 'foobar' in xx and '/foo/' in xx and 'fuz=1' in xx and 'biz=boo' in xx
True
>>> xx = make_url('/foo/', _format='excel', **kwargs)
>>> '_format=excel' in xx
False
>>> 'fuz=1' in xx
True
>>> 'biz=boo' in xx
True
>>> yy = make_url('/foo/?bar=1', _format='excel', **kwargs)
>>> 'bar=1' in yy
True
>>> '_format=excel' in yy
False
>>> zz = make_url('/foo/', **{'bar':1, 'fuz':(1,2,), 'biz':"boo"})
>>> 'fuz=1' in zz
True
>>> 'fuz=2' in zz
True
>>> qq = make_url('/foo/?a=1&a=2')
>>> 'a=1' in qq
True
>>> 'a=2' in qq
True
- prefix_urls(pathpfx, classpfx, urls)[source]
Add prefixes to web.py URL mappings.
- url_path_join(*parts)[source]
Normalize URL parts and join them with a slash.
- Parameters:
parts – URL parts to join.
- Returns:
Joined URL string.
- Return type:
- first_of_each(*sequences)[source]
Return first non-empty element from each sequence.
- Parameters:
sequences – Variable number of sequences.
- Returns:
Generator yielding first non-empty element from each.
- safe_join(directory, *pathnames)[source]
Safely join untrusted path components to a base directory.
Prevents escaping the base directory via path traversal.
- Parameters:
- Returns:
A safe path, or None if path would escape base.
- Return type:
str | None
Note
Via github.com/mitsuhiko/werkzeug security.py
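The traversal check can be sketched roughly as below, loosely following the approach of werkzeug's helper. The name `safe_join_sketch` is hypothetical, and rejecting backslashes outright is a simplifying assumption of this sketch.

```python
import posixpath


def safe_join_sketch(directory, *pathnames):
    """Join untrusted components onto directory, or return None if unsafe."""
    parts = [directory]
    for name in pathnames:
        if name != "":
            # Collapse "a/../b" style segments before inspecting the result.
            name = posixpath.normpath(name)
        if (
            name.startswith("/")      # absolute path would replace the base
            or name == ".."           # climbs out of the base directory
            or name.startswith("../")
            or "\\" in name           # assumption: reject Windows separators
        ):
            return None
        parts.append(name)
    return posixpath.join(*parts)
```

Normalizing first matters: `a/../../secret` looks harmless at its start but collapses to `../secret`, which the check then rejects.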
- local_or_static_join(static, somepath)[source]
Find template in working directory or static folder.
- inject_file(x)[source]
Read file contents for injection into HTML email templates.
- inject_image(x)[source]
Generate base64 data URI for image embedding in HTML.
- build_breadcrumb(ctx)[source]
Build breadcrumb HTML from web.py app_stack.
- Parameters:
ctx – web.py context object.
- Returns:
HTML string with breadcrumb links.
- Return type:
- breadcrumbify(url_app_tuple)[source]
Patch URL mapping into web.py subapps for breadcrumbs.
- appmenu(urls, home='', fmt=<function _format_link>)[source]
Build HTML menu from web.py URL mapping.
- scale(color, pct)[source]
Scale a hex color by a percentage.
- render_field(field)[source]
Render form field with error styling.
Works with both web.py and Django forms.
- Parameters:
field – Form field object.
- Returns:
HTML string for rendered field.
- Return type:
- login_protected(priv_level=3, login_level=1)[source]
Decorator protecting routes by session authentication.
- userid_or_admin(fn)[source]
Decorator limiting access to own user ID unless admin.
- Parameters:
fn – Function to protect.
- Returns:
Wrapped function.
- manager_or_admin(fn)[source]
Decorator limiting access to managed resources unless admin.
- Parameters:
fn – Function to protect.
- Returns:
Wrapped function.
- logerror(olderror, logger)[source]
Wrap internalerror function to log tracebacks.
- Parameters:
olderror – Original error handler function.
logger – Logger instance for error output.
- Returns:
Wrapped error handler.
- validip6addr(address)[source]
Check if address is a valid IPv6 address.
- Parameters:
address (str) – Address string to validate.
- Returns:
True if valid IPv6 address.
- Return type:
Example:
>>> validip6addr('::') True >>> validip6addr('aaaa:bbbb:cccc:dddd::1') True >>> validip6addr('1:2:3:4:5:6:7:8:9:10') False >>> validip6addr('12:10') False
- validipaddr(address)[source]
Check if address is a valid IPv4 address.
- Parameters:
address (str) – Address string to validate.
- Returns:
True if valid IPv4 address.
- Return type:
Example:
>>> validipaddr('192.168.1.1') True >>> validipaddr('192.168. 1.1') False >>> validipaddr('192.168.1.800') False >>> validipaddr('192.168.1') False
- validipport(port)[source]
Check if port is a valid port number.
- Parameters:
port (str) – Port string to validate.
- Returns:
True if valid port (0-65535).
- Return type:
Example:
>>> validipport('9000') True >>> validipport('foo') False >>> validipport('1000000') False
- validip(ip, defaultaddr='0.0.0.0', defaultport=8080)[source]
Parse IP address and port from string.
- Parameters:
- Returns:
Tuple of (ip_address, port).
- Return type:
- Raises:
ValueError – If invalid IP address/port format.
Example:
>>> validip('1.2.3.4') ('1.2.3.4', 8080) >>> validip('80') ('0.0.0.0', 80) >>> validip('192.168.0.1:85') ('192.168.0.1', 85) >>> validip('::') ('::', 8080) >>> validip('[::]:88') ('::', 88) >>> validip('[::1]:80') ('::1', 80)
- validaddr(string_)[source]
Parse address as IP:port tuple or Unix socket path.
- Parameters:
string_ (str) – Address string to parse.
- Returns:
(ip_address, port) tuple or socket path string.
- Raises:
ValueError – If invalid format.
Example:
>>> validaddr('/path/to/socket') '/path/to/socket' >>> validaddr('8000') ('0.0.0.0', 8000) >>> validaddr('127.0.0.1') ('127.0.0.1', 8080) >>> validaddr('127.0.0.1:8000') ('127.0.0.1', 8000) >>> validip('[::1]:80') ('::1', 80) >>> validaddr('fff') Traceback (most recent call last): ... ValueError: fff is not a valid IP address/port
- urlquote(val)[source]
Quote string for safe use in a URL.
- Parameters:
val – String to quote (or None).
- Returns:
URL-encoded string.
- Return type:
Example:
>>> urlquote('://?f=1&j=1') '%3A//%3Ff%3D1%26j%3D1' >>> urlquote(None) '' >>> urlquote(u'‽') '%E2%80%BD'
- httpdate(date_obj)[source]
Format datetime object for HTTP headers.
- Parameters:
date_obj – datetime object to format.
- Returns:
HTTP date string in RFC 1123 format.
- Return type:
Example:
>>> import datetime >>> httpdate(datetime.datetime(1970, 1, 1, 1, 1, 1)) 'Thu, 01 Jan 1970 01:01:01 GMT'
- parsehttpdate(string_)[source]
Parse HTTP date string into datetime object.
- Parameters:
string_ (str) – HTTP date string in RFC 1123 format.
- Returns:
Parsed datetime object, or None if invalid.
- Return type:
datetime.datetime | None
Example:
>>> parsehttpdate('Thu, 01 Jan 1970 01:01:01 GMT') datetime.datetime(1970, 1, 1, 1, 1, 1)
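The RFC 1123 round trip described for httpdate and parsehttpdate can be sketched with the standard library alone. The `_sketch` names and the shared format constant are hypothetical. The sketch assumes the default C/English locale (so that `%a` and `%b` yield English names) and that the datetime is already in GMT.

```python
import datetime

# Assumption: naive datetimes are treated as GMT, as in the examples above.
HTTP_DATE_FORMAT = "%a, %d %b %Y %H:%M:%S GMT"


def httpdate_sketch(date_obj):
    """Format a datetime as an HTTP date string."""
    return date_obj.strftime(HTTP_DATE_FORMAT)


def parsehttpdate_sketch(string_):
    """Parse an HTTP date string, returning None when it does not match."""
    try:
        return datetime.datetime.strptime(string_, HTTP_DATE_FORMAT)
    except ValueError:
        return None


stamp = httpdate_sketch(datetime.datetime(1970, 1, 1, 1, 1, 1))
parsed = parsehttpdate_sketch(stamp)
```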
- htmlquote(text)[source]
Encode text for safe use in HTML.
Example:
>>> htmlquote(u"<'&\">") '&lt;&#39;&amp;&quot;&gt;'
- htmlunquote(text)[source]
Decode HTML-encoded text.
Example:
>>> htmlunquote(u'&lt;&#39;&amp;&quot;&gt;') '<\'&">'
- websafe(val)[source]
Convert value to safe Unicode HTML string.
- Parameters:
val – Value to convert (string, bytes, or None).
- Returns:
HTML-safe string.
- Return type:
Example:
>>> websafe("<'&\">") '&lt;&#39;&amp;&quot;&gt;' >>> websafe(None) '' >>> websafe(u'\u203d') == u'\u203d' True
- class JSONEncoderISODate(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]
Bases: JSONEncoder
JSON encoder that serializes dates in ISO format.
Example:
>>> JSONEncoderISODate().encode({'dt': datetime.date(2014, 10, 2)}) '{"dt": "2014-10-02"}'
- default(obj)[source]
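Overriding default is the standard way to teach json.JSONEncoder about new types. The following is a minimal sketch of that pattern. The class name `ISODateEncoderSketch` is hypothetical and the library's own encoder may handle more types.

```python
import datetime
import json


class ISODateEncoderSketch(json.JSONEncoder):
    """Serialize date and datetime objects as ISO 8601 strings."""

    def default(self, obj):
        # default() is only called for objects json cannot serialize itself.
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        return super().default(obj)  # raises TypeError for unknown types


encoded = ISODateEncoderSketch().encode({"dt": datetime.date(2014, 10, 2)})
```

The same class can be passed to `json.dumps(..., cls=ISODateEncoderSketch)`.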
- class JSONDecoderISODate(**kw)[source]
Bases: JSONDecoder
JSON decoder that parses date strings into datetime objects.
Example:
>>> JSONDecoderISODate().decode('{"dt": "2014-10-02"}') {'dt': datetime.datetime(2014, 10, 2, 0, 0)}
stats
Mathematical and statistical functions: average, variance, standard deviation, covariance, beta, combinatorics (choose), number parsing, threshold operations.
- npfunc(nargs=1)[source]
Decorator to convert args to numpy format and results back to Python.
- Parameters:
nargs (int) – Number of arguments to convert to numpy arrays.
- Returns:
Decorator function.
- avg(x)[source]
Compute average of array, ignoring None/NaN values.
- Parameters:
x (Iterable) – Array of values.
- Returns:
Average or None if all values are NaN.
Example:
>>> avg((-1.5, 2,)) 0.25 >>> avg((None, 2,)) 2.0 >>> avg((None, None,)) is None True
- pct_change(x)[source]
Compute percent change between consecutive elements.
- Parameters:
x (Iterable) – Array of values.
- Returns:
Array of percent changes (first element is None).
Example:
>>> a = [1, 1, 1.5, 1, 2, 1.11, -1] >>> [f"{_:.2f}" if _ else _ for _ in pct_change(a)] [None, 0.0, '0.50', '-0.33', '1.00', '-0.44', '-1.90']
- diff(x)[source]
Compute one-period difference between consecutive elements.
- Parameters:
x (Iterable) – Array of values.
- Returns:
Array of differences (first element is None).
Example:
>>> [_ for _ in diff((0, 1, 3, 2, 1, 5, 4))] [None, 1.0, 2.0, -1.0, -1.0, 4.0, -1.0]
- thresh(x, thresh=0.0)[source]
Round to nearest integer if within threshold distance.
- Parameters:
- Returns:
Rounded integer or original value.
Positive Numbers:
>>> thresh(74.9888, 0.05) 75 >>> thresh(75.01, 0.05) 75
Negative Numbers:
>>> thresh(-74.988, 0.05) -75 >>> thresh(-75.01, 0.05) -75
Return Original:
>>> thresh(74.90, 0.05) 74.9 >>> thresh(75.06, 0.05) 75.06
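One way to get the behavior shown in these examples is sketched below. The name `thresh_sketch` is hypothetical, and treating a distance exactly equal to the threshold as "within" is an assumption.

```python
def thresh_sketch(x, thresh=0.0):
    """Snap x to the nearest integer when it lies within thresh of it."""
    nearest_int = round(x)
    # Assumption: the boundary case (distance == thresh) also snaps.
    if abs(x - nearest_int) <= thresh:
        return int(nearest_int)
    return x
```

Because the distance is measured with abs(), negative inputs behave symmetrically with positive ones.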
- isnumeric(x)[source]
Check if value is a numeric type.
- Parameters:
x – Value to check.
- Returns:
True if numeric (int, float, or numpy numeric).
- Return type:
- digits(n)[source]
Count number of integer digits in a number.
- Parameters:
n – Number to count digits of.
- Returns:
Number of integer digits.
- Return type:
Example:
>>> digits(6e6) 7 >>> digits(100.01) 3 >>> digits(-6e5)==digits(-600000)==6 True >>> digits(-100.)==digits(100)==3 True
- numify(val, to=<class 'float'>)[source]
Convert value to numeric type, handling common formatting.
Handles None values, already numeric values, and string formatting including whitespace, commas, parentheses (negative), and percentages.
- Parameters:
val – Value to convert.
to (type) – Target type (default: float).
- Returns:
Converted value or None if conversion fails.
Example:
>>> numify('1,234.56') 1234.56 >>> numify('(100)', to=int) -100 >>> numify('50%') 50.0 >>> numify(None) >>> numify('')
- parse(s)[source]
Extract number from string.
- Parameters:
s – String to parse.
- Returns:
Parsed int or float, or None if parsing fails.
Example:
>>> parse('1,200m') 1200 >>> parse('100.0') 100.0 >>> parse('100') 100 >>> parse('0.002k') 0.002 >>> parse('-1')==parse('(1)')==-1 True >>> parse('-100.0')==parse('(100.)')==-100.0 True >>> parse('')
- nearest(num, decimals)[source]
Round number to the nearest tick value.
Useful for eliminating float errors after arithmetic operations.
- Parameters:
- Returns:
Rounded number.
- Return type:
Example:
>>> nearest(401.4601, 0.01) 401.46 >>> nearest(401.46001, 0.0000000001) 401.46001
- covarp(x, y)[source]
Compute population covariance between x and y.
- Parameters:
x – First array.
y – Second array (same length as x).
- Returns:
Population covariance.
- Return type:
Example:
>>> x = [3, 2, 4, 5, 6] >>> y = [9, 7, 12, 15, 17] >>> "{:.5}".format(covarp(x, y)) '5.2'
- covars(x, y)[source]
Compute sample covariance between x and y.
- Parameters:
x – First array.
y – Second array (same length as x).
- Returns:
Sample covariance.
- Return type:
Example:
>>> x = [3, 2, 4, 5, 6] >>> y = [9, 7, 12, 15, 17] >>> "{:.5}".format(covars(x, y)) '6.5'
- varp(x)[source]
Compute population variance of x.
- Parameters:
x – Array of values.
- Returns:
Population variance.
- Return type:
Example:
>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299] >>> "{:.5}".format(varp(x)) '678.84'
- vars(x)[source]
Compute sample variance of x.
- Parameters:
x – Array of values.
- Returns:
Sample variance.
- Return type:
Example:
>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299] >>> "{:.5}".format(vars(x)) '754.27'
- stddevp(x)[source]
Compute population standard deviation.
- Parameters:
x – Array of values.
- Returns:
Population standard deviation.
- Return type:
Example:
>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299] >>> "{:.5}".format(stddevp(x)) '26.055'
- stddevs(x)[source]
Compute sample standard deviation.
- Parameters:
x – Array of values.
- Returns:
Sample standard deviation.
- Return type:
Example:
>>> x = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299] >>> "{:.5}".format(stddevs(x)) '27.464'
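The population and sample variants differ only in the divisor (n versus n - 1). A plain-Python sketch of the four functions, with hypothetical `_sketch` names and without the None/NaN handling the library may apply, reproduces the figures in the examples above:

```python
import math


def varp_sketch(x):
    """Population variance: squared deviations divided by n."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def vars_sketch(x):
    """Sample variance: squared deviations divided by n - 1."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / (len(x) - 1)


def stddevp_sketch(x):
    """Population standard deviation."""
    return math.sqrt(varp_sketch(x))


def stddevs_sketch(x):
    """Sample standard deviation."""
    return math.sqrt(vars_sketch(x))


data = [1345, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
summary = [
    "{:.5}".format(f(data))
    for f in (varp_sketch, vars_sketch, stddevp_sketch, stddevs_sketch)
]
```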
- beta(x, index)[source]
Compute beta of x with respect to index (typically over returns).
- Parameters:
x – Asset returns.
index – Index returns.
- Returns:
Beta coefficient.
- Return type:
Example:
>>> x = [0.10, 0.18, -0.15, 0.18] >>> y = [0.10, 0.17, -0.17, 0.17] >>> '{:.2}'.format(beta(x, y)) '0.97'
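Beta is conventionally the covariance of the asset with the index divided by the variance of the index. The sketch below assumes that definition (the name `beta_sketch` is hypothetical). The n or n - 1 divisors cancel, so they are omitted.

```python
def beta_sketch(x, index):
    """Beta of x against index: cov(x, index) / var(index)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_i = sum(index) / n
    cov = sum((a - mean_x) * (b - mean_i) for a, b in zip(x, index))
    var = sum((b - mean_i) ** 2 for b in index)
    return cov / var


asset = [0.10, 0.18, -0.15, 0.18]
bench = [0.10, 0.17, -0.17, 0.17]
value = beta_sketch(asset, bench)
```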
- correl(x, y)[source]
Compute correlation between x and y.
- Parameters:
x – First array.
y – Second array.
- Returns:
Correlation coefficient.
- Return type:
Example:
>>> x = [3, 2, 4, 5, 6] >>> y = [9, 7, 12, 15, 17] >>> "{:.3}".format(correl(x, y)) '0.997'
- rsq(x, y)[source]
Compute R-squared (coefficient of determination) between x and y.
- Parameters:
x – First array.
y – Second array.
- Returns:
R-squared value.
- Return type:
Example:
>>> x = [ 6, 5, 11, 7, 5, 4, 4] >>> y = [ 2, 3, 9, 1, 8, 7, 5] >>> "{:.5}".format(rsq(x, y)) '0.05795'
- rtns(x)[source]
Compute simple returns between consecutive values.
- Parameters:
x – Array of prices.
- Returns:
Array of returns (one fewer element than input).
Example:
>>> pp = rtns([1., 1.1, 1.3, 1.1, 1.3]) >>> [f'{x:0.2f}' for x in pp] ['0.10', '0.18', '-0.15', '0.18']
- logrtns(x)[source]
Compute log returns between consecutive values.
- Parameters:
x – Array of prices.
- Returns:
Array of log returns (one fewer element than input).
Example:
>>> pp = logrtns([1., 1.1, 1.3, 1.1, 1.3]) >>> [f'{x:0.2f}' for x in pp] ['0.10', '0.17', '-0.17', '0.17']
- weighted_average(rows, field, predicate, weight_field)[source]
Compute a weighted average of field in a DataSet using weight_field as the weight. Limit to rows matching the predicate. Uses sum of abs in denominator because we are really looking for the value-weighted contribution of the position.
This handles long/short cases correctly, although they can give surprising results.
Consider two “trades” BL 5000 at a delta of 50% and SS -4000 at a delta of 30%. If you didn’t use abs() you’d get:
(5000 * 50 - 4000 * 30) / (5000 - 4000) = 130
Using abs() you get:
(5000 * 50 - 4000 * 30) / (5000 + 4000) = 14.4
This is really equivalent to saying you bought another 4000 at a delta of -30 (because the short position has a negative delta effect) which then makes more sense: combining two positions, one with positive delta and one with negative should give a value that weights the net effect of them, which the second case does. If the short position were larger or had a larger delta, you could end up with a negative weighted average, which although a bit confusing, is mathematically correct.
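The arithmetic of the two-trade example can be checked directly. This sketch uses bare tuples rather than a DataSet, and the variable names are illustrative only.

```python
# (position size, delta in percent) for the long and the short trade
positions = [(5000, 50.0), (-4000, 30.0)]

numerator = sum(size * delta for size, delta in positions)

# Net denominator: the short position shrinks the divisor.
net_weighted = numerator / sum(size for size, _ in positions)

# Gross (abs) denominator: each position counts by its absolute size.
gross_weighted = numerator / sum(abs(size) for size, _ in positions)
```

Here `net_weighted` is 130.0, far outside the range of either input delta, while `gross_weighted` is about 14.4.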
- linear_regression(x, y)[source]
Compute the least-squares linear regression line for the set of points. Returns the slope and y-intercept.
- distance_from_line(m, b, x, y)[source]
Compute the distance from each point to the line defined by m and b.
- linterp(x0, x1, x, y0, y1, *, inf_value=None)[source]
Linearly interpolate y between y0 and y1 based on x’s position.
- Parameters:
x0 – Start of x range.
x1 – End of x range.
x – Value to interpolate at.
y0 – Y value at x0.
y1 – Y value at x1.
inf_value – Value to return when x1 is infinity (default: y0).
- Returns:
Interpolated y value.
- Return type:
Example:
>>> linterp(1, 3, 2, 2, 4) 3.0 >>> linterp(1, float('inf'), 2, 2, 4) 2.0 >>> linterp(1, float('inf'), 2, 2, 4, inf_value=4) 4.0
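The interpolation formula and the infinity special case might be implemented as in this sketch. The name `linterp_sketch` is hypothetical, and the case x0 == x1 is left unhandled.

```python
import math


def linterp_sketch(x0, x1, x, y0, y1, *, inf_value=None):
    """Interpolate y at x on the line through (x0, y0) and (x1, y1)."""
    if math.isinf(x1):
        # An infinite range has no meaningful slope; fall back to y0
        # unless the caller supplies an explicit inf_value.
        return float(y0 if inf_value is None else inf_value)
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```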
- np_divide(a, b)[source]
Safely divide numpy arrays, returning 0 where divisor is 0.
- Parameters:
a – Numerator array.
b – Denominator array.
- Returns:
Result array with 0 where b was 0.
- safe_add(*args)[source]
Safely add numbers, returning None if any argument is None.
- Parameters:
args – Numbers to add.
- Returns:
Sum or None if any arg is None.
- safe_diff(*args)[source]
Safely subtract numbers, returning None if any argument is None.
- Parameters:
args – Numbers to subtract sequentially.
- Returns:
Difference or None if any arg is None.
- safe_divide(*args, **kwargs)[source]
Safely divide numbers, returning None if any arg is None.
- Parameters:
args – Numbers to divide sequentially.
kwargs – Optional ‘infinity’ for division by zero result.
- Returns:
Result or None if any arg is None, inf on division by zero.
Example:
>>> '{:.2f}'.format(safe_divide(10, 5)) '2.00' >>> '{:.2f}'.format(safe_divide(10, 1.5, 1)) '6.67' >>> safe_divide(1, 0) inf >>> safe_divide(10, 1, None)
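A sketch consistent with the examples above follows. The name `safe_divide_sketch` is hypothetical, and exposing `infinity` as a keyword-only argument (and returning None for an empty call) are assumptions about details the description leaves open.

```python
def safe_divide_sketch(*args, infinity=float("inf")):
    """Divide args left to right; None in, None out; infinity on zero."""
    if not args or any(a is None for a in args):
        return None
    result = args[0]
    for divisor in args[1:]:
        if divisor == 0:
            return infinity  # caller-supplied stand-in for x / 0
        result = result / divisor
    return result
```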
- safe_mult(*args)[source]
Safely multiply a long list of numbers when some of them may be None.
- safe_round(arg, places=2)[source]
Safely round a number, returning None if argument is None.
- Parameters:
arg – Number to round.
places (int) – Decimal places (default 2).
- Returns:
Rounded number or None.
- safe_cmp(op, a, b)[source]
Compare two values using a comparison operator.
- Parameters:
op – Operator string (‘>’, ‘>=’, ‘<’, ‘<=’, ‘==’, ‘!=’) or operator function.
a – First value.
b – Second value.
- Returns:
Boolean result of comparison.
- safe_min(*args, **kwargs)[source]
Like the built-in min, but returns the minimum value even when None is in the list.
- safe_max(*args, **kwargs)[source]
Like the built-in max, but returns the maximum value even when None is in the list.
- convert_mixed_numeral_to_fraction(num)[source]
Convert mixed numeral string to decimal fraction.
- convert_to_mixed_numeral(num, force_sign=False)[source]
Convert decimal or fraction to mixed numeral string.
- Parameters:
num – Number or string to convert.
force_sign (bool) – Force ‘+’ prefix on positive numbers.
- Returns:
Mixed numeral string (e.g., ‘1 7/8’) or None on error.
- Return type:
Example:
>>> convert_to_mixed_numeral(1.875, True) '+1 7/8' >>> convert_to_mixed_numeral(-1.875) '-1 7/8' >>> convert_to_mixed_numeral(-.875) '-7/8' >>> convert_to_mixed_numeral('-1.875') '-1 7/8' >>> convert_to_mixed_numeral('1 7/8', False) '1 7/8' >>> convert_to_mixed_numeral('1-7/8', True) '+1 7/8' >>> convert_to_mixed_numeral('-1.5') '-1 1/2' >>> convert_to_mixed_numeral('6/7', True) '+6/7' >>> convert_to_mixed_numeral('1 6/7', False) '1 6/7' >>> convert_to_mixed_numeral(0) '0' >>> convert_to_mixed_numeral('0') '0'
- round_to_nearest(value, base)[source]
Round value to nearest multiple of base.
- Parameters:
value (float) – Value to round.
base – Base multiple to round to (must be >= 1).
- Returns:
Rounded value.
- Return type:
Example:
>>> round_to_nearest(12, 25) 0 >>> round_to_nearest(26, 25) 25
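Rounding to a multiple is usually done by scaling, rounding, and scaling back. The sketch below assumes that approach (the name `round_to_nearest_sketch` is hypothetical).

```python
def round_to_nearest_sketch(value, base):
    """Round value to the nearest multiple of base."""
    # Python's round() sends exact ties to the even neighbor, so a value
    # exactly halfway between two multiples may round down.
    return base * round(value / base)
```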
- numpy_smooth(x, window_len=11, window='hanning')[source]
Smooth the data using a window with requested size.
https://scipy-cookbook.readthedocs.io/items/SignalSmooth.html
This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends so that transient parts are minimized at the beginning and end of the output signal.
- Parameters:
x – the input signal
window_len – the dimension of the smoothing window; should be an odd integer
window – the type of window from ‘flat’, ‘hanning’, ‘hamming’, ‘bartlett’, ‘blackman’. Flat window will produce a moving average smoothing.
- Returns:
the smoothed signal
Example:
from numpy import arange, sin
from numpy.random import randn
t = arange(-2, 2, 0.1)
x = sin(t) + randn(len(t)) * 0.1
y = numpy_smooth(x)
See also
numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve, scipy.signal.lfilter
Note
The window parameter could be the window itself (an array) instead of a string. len(output) != len(input); to correct this, return y[(window_len//2 - 1):-(window_len//2)] instead of just y (integer division is required for slicing in Python 3).
sync
Synchronization primitives: timeout decorator and context manager for thread/async operations with configurable timeout handling.
- syncd(lock)[source]
Decorator to synchronize functions with a shared lock.
- Parameters:
lock – Threading lock to acquire during function execution.
- Returns:
Decorator function.
Example:
>>> import threading >>> lock = threading.Lock() >>> @syncd(lock) ... def safe_increment(counter): ... return counter + 1 >>> safe_increment(0) 1
- class NonBlockingDelay[source]
Bases: object
Non-blocking delay for checking time elapsed.
- timeout()[source]
Check if the delay time has elapsed.
- Returns:
True if time is up.
- Return type:
- delay(seconds)[source]
Delay non-blocking for N seconds (busy-wait).
- Return type:
None
Deprecated: Use time.sleep() for efficient blocking delays. This function is kept for backward compatibility.
- debounce(wait)[source]
Decorator to debounce function calls.
Waits wait seconds before calling function, cancels if called again.
- Parameters:
wait (float) – Seconds to wait before executing.
- Returns:
Decorator function.
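Debouncing is commonly built on threading.Timer: each call cancels the pending timer and starts a new one. The sketch below assumes that design. The name `debounce_sketch` is hypothetical and the library's implementation may differ.

```python
import functools
import threading
import time


def debounce_sketch(wait):
    """Delay calls by wait seconds; a newer call cancels the pending one."""
    def decorator(fn):
        timer = None
        lock = threading.Lock()

        @functools.wraps(fn)
        def debounced(*args, **kwargs):
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()  # drop the call that was still waiting
                timer = threading.Timer(wait, fn, args, kwargs)
                timer.start()
        return debounced
    return decorator


hits = []


@debounce_sketch(0.05)
def record(value):
    hits.append(value)


record(1)
record(2)
record(3)        # only this last call should survive
time.sleep(0.3)  # give the timer thread time to fire
```

Note that the wrapped function runs on the timer thread and its return value is discarded.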
- wait_until(hour, minute=0, second=0, tz=datetime.timezone.utc, time_unit='milliseconds')[source]
Calculate time to wait until specified hour/minute/second.
- Parameters:
- Returns:
Time to wait in specified unit.
- Return type:
- Raises:
ValueError – If hour/minute/second out of range.
crypto
Cryptography and encoding utilities: base64 file encoding, hashing.
- base64file(fil)[source]
Encode file contents as base64.
- Parameters:
fil – Path to file to encode.
- Returns:
Base64 encoded bytes.
- Return type:
Example:
>>> import tempfile >>> with tempfile.NamedTemporaryFile(mode='w', delete=False) as f: ... _ = f.write('hello world') >>> base64file(f.name) b'aGVsbG8gd29ybGQ=\n'
Note
This function reads the entire file into memory. Use with caution on large files.
geo
Geographic utilities: Mercator projection coordinate transformations (merc_x, merc_y) for mapping applications.
Geographic utilities for coordinate transformations
- merc_x(lon, r_major=6378137.0)[source]
Project longitude into mercator / radians from major axis.
- Parameters:
- Returns:
Mercator x coordinate.
- Return type:
Example:
>>> "{:0.3f}".format(merc_x(40.7484)) '4536091.139'
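For longitude, the Mercator x coordinate is the longitude in radians scaled by the major-axis radius. The sketch below assumes that formula (the name `merc_x_sketch` is hypothetical) and agrees with the example above.

```python
import math


def merc_x_sketch(lon, r_major=6378137.0):
    """Mercator x: longitude in radians times the major-axis radius."""
    return r_major * math.radians(lon)


x_coord = merc_x_sketch(40.7484)
```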
- merc_y(lat, r_major=6378137.0, r_minor=6356752.3142)[source]
Project latitude into mercator / radians from major/minor axes.
- Parameters:
- Returns:
Mercator y coordinate.
- Return type:
Example:
>>> "{:0.3f}".format(merc_y(73.9857)) '12468646.871'
dir
Directory operations: recursive creation, temporary directories, file searching, safe moving, directory structure inspection, file downloading.
Directory and file system utilities.
Note
os.walk and scandir were slow over network connections in Python 2.
- mkdir_p(path)[source]
Create directory and any missing parent directories.
- Parameters:
path – Directory path to create.
Example:
>>> import tempfile, os >>> tmpdir = tempfile.mkdtemp() >>> newdir = os.path.join(tmpdir, 'a', 'b', 'c') >>> mkdir_p(newdir) >>> os.path.isdir(newdir) True
- make_tmpdir(prefix=None)[source]
Context manager to wrap a temporary directory with auto-cleanup.
- Parameters:
prefix (str) – Optional prefix directory for the temp dir.
- Yields:
Path to the temporary directory.
- Return type:
Example:
>>> import os.path
>>> fpath = ""
>>> with make_tmpdir() as basedir:
...     fpath = os.path.join(basedir, 'temp.txt')
...     with open(fpath, "w") as file:
...         file.write("We expect the file to be deleted when context closes")
52
>>> try:
...     file = open(fpath, "w")
... except IOError as io:
...     raise Exception('File does not exist')
Traceback (most recent call last):
...
Exception: File does not exist
- expandabspath(p)[source]
Expand path to absolute path with environment variables and user expansion.
- Parameters:
p (str) – Path string to expand.
- Returns:
Absolute path with all expansions applied.
- Return type:
Path
Example:
>>> import os >>> os.environ['SPAM'] = 'eggs' >>> assert expandabspath('~/$SPAM') == Path(os.path.expanduser('~/eggs')) >>> assert expandabspath('/foo') == Path('/foo')
- get_directory_structure(rootdir)[source]
Create a nested dictionary representing the folder structure.
- Parameters:
rootdir (str) – Root directory to traverse.
- Returns:
Nested dictionary of the directory structure.
- Return type:
Example:
>>> import tempfile, os >>> tmpdir = tempfile.mkdtemp() >>> os.makedirs(os.path.join(tmpdir, 'sub')) >>> Path(os.path.join(tmpdir, 'file.txt')).touch() >>> result = get_directory_structure(tmpdir) >>> 'file.txt' in result[os.path.basename(tmpdir)] True
- search(rootdir, name=None, extension=None)[source]
Search for files by name, extension, or both in directory.
- Parameters:
- Yields Path:
Full path to each matching file.
See also
See tests/test_dir.py for usage examples.
- safe_move(source, target, hard_remove=False)[source]
Move a file to a new location, optionally deleting anything in the way.
- Parameters:
- Returns:
Final target path (may differ if conflict occurred).
- Return type:
Path
See also
See tests/test_dir.py for usage examples.
- save_file_tmpdir(fname, content, thedate=None, **kw)[source]
Save a document to the specified temp directory, optionally with date.
- Parameters:
Example:
>>> import datetime >>> content = "</html>...</html>" >>> save_file_tmpdir("Foobar.txt", content, thedate=datetime.date.today())
- get_dir_match(dir_pattern, thedate=None)[source]
Get paths of existing files matching each glob pattern.
Filters zero-size files and returns warnings for missing patterns.
- Parameters:
dir_pattern – List of (directory, pattern) tuples.
thedate – Optional date to append to patterns.
- Returns:
Tuple of (results list of Path objects, warnings list of strings).
- Return type:
See also
See tests/test_dir.py for usage examples.
- load_files(directory, pattern='*', thedate=None)[source]
Load file contents from directory matching pattern.
- Parameters:
- Yields:
File contents as strings.
See also
See tests/test_dir.py for usage examples.
- load_files_tmpdir(patterns='*', thedate=None)[source]
Load files from temp directory matching patterns.
- Parameters:
patterns – Glob pattern(s) to match files.
thedate – Optional date to append to patterns.
- Returns:
Iterator of file contents.
Example:
>>> import datetime >>> patterns = ("nonexistent_pattern_*.txt",) >>> results = load_files_tmpdir(patterns, datetime.date.today()) >>> next(results, None) is None True
- dir_to_dict(path)[source]
Convert directory structure to a dictionary.
- Parameters:
path (str) – Directory path to convert.
- Returns:
Dictionary with subdirectories as nested dicts and .files key for files.
- Return type:
Example:
>>> import tempfile, os >>> tmpdir = tempfile.mkdtemp() >>> Path(os.path.join(tmpdir, 'test.txt')).touch() >>> result = dir_to_dict(tmpdir) >>> 'test.txt' in result['.files'] True
- download_file(url, save_path=None)[source]
Download file from URL with progress bar and retry logic.
- Parameters:
- Returns:
Path to downloaded file.
- Return type:
Path
See also
See tests/test_dir.py for usage examples.
- splitall(path)[source]
Split path into all its components.
Works with both Unix and Windows paths.
- Parameters:
path (str) – Path string to split.
- Returns:
List of path components.
- Return type:
- Raises:
TypeError – If path is not a string.
Example:
>>> splitall('a/b/c') ['a', 'b', 'c'] >>> splitall('/a/b/c/') ['/', 'a', 'b', 'c', ''] >>> splitall('/') ['/'] >>> splitall('C:') ['C:'] >>> splitall('C:\\') ['C:\\'] >>> splitall('C:\\a') ['C:\\', 'a'] >>> splitall('C:\\a\\') ['C:\\', 'a', ''] >>> splitall('C:\\a\\b') ['C:\\', 'a', 'b'] >>> splitall('a\\\\b') ['a', 'b']
exception
Exception handling: print exceptions with traceback control, try_else wrapper for default values on failure.
- print_exception(e, short=True)[source]
Print exception traceback with optional verbosity.
- Parameters:
Example:
>>> try: ... raise ValueError("example error") ... except Exception as e: ... print_exception(e)
- try_else(func, default=None)[source]
Wrap function to return default value if it fails.
- Parameters:
func – Function to wrap.
default – Default value or callable to return on failure.
- Returns:
Wrapped function.
Example:
>>> import json >>> d = try_else(json.loads, 2)('{"a": 1, "b": "foo"}') >>> d {'a': 1, 'b': 'foo'} >>> repl = lambda x: 'foobar' >>> d = try_else(json.loads, repl)('{"a": 1, b: "foo"}') >>> d 'foobar'
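The wrapper pattern behind try_else can be sketched as below. The name `try_else_sketch` is hypothetical. Calling a callable default with the original arguments is an assumption, inferred from the `repl` example above.

```python
import functools
import json


def try_else_sketch(func, default=None):
    """Wrap func so that any Exception yields default instead."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Assumption: a callable default receives the same arguments.
            if callable(default):
                return default(*args, **kwargs)
            return default
    return wrapper


good = try_else_sketch(json.loads, 2)('{"a": 1, "b": "foo"}')
bad = try_else_sketch(json.loads, 2)('{"a": 1, b: "foo"}')
replaced = try_else_sketch(json.loads, lambda x: "foobar")('{"a": 1, b: "foo"}')
```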
future
Future pattern implementation for running functions asynchronously in separate threads with result retrieval.
- class Future(func, *param)[source]
Bases: object
Easy threading via a Future pattern.
Runs a time-consuming function in a separate thread while allowing the main thread to continue uninterrupted.
- Parameters:
func – Function to run in background thread.
param – Arguments to pass to the function.
Note
Algorithm from http://code.activestate.com/recipes/84317/
Example:
>>> import time, math >>> def wait_and_add(x): ... time.sleep(2) ... return x+1
Won’t wait 2 seconds here:
>>> start = time.time()
>>> z = Future(wait_and_add, 2)
>>> 1+2
3
>>> int(math.ceil(time.time()-start))
1
At this point we need to wait the 2 seconds:
>>> z()
3
>>> int(math.ceil(time.time()-start)) >= 2
True
- Wrapper(func, param)[source]
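The Future pattern above can be sketched with the standard library alone. This is an illustrative version under stated assumptions, not the class documented here: FutureSketch is a hypothetical name, and it uses threading.Event for synchronisation, which may differ from the referenced recipe.

```python
import threading
import time


class FutureSketch:
    """Start `func(*param)` in a background thread; calling the object waits for the result."""

    def __init__(self, func, *param):
        self._done = threading.Event()
        self._result = None
        self._error = None
        thread = threading.Thread(target=self._wrapper, args=(func, param), daemon=True)
        thread.start()

    def _wrapper(self, func, param):
        # Runs in the worker thread: store the outcome, then signal completion.
        try:
            self._result = func(*param)
        except Exception as exc:
            self._error = exc
        finally:
            self._done.set()

    def __call__(self):
        # Blocks only if the worker has not finished yet.
        self._done.wait()
        if self._error is not None:
            raise self._error
        return self._result


def wait_and_add(x):
    time.sleep(0.2)
    return x + 1


z = FutureSketch(wait_and_add, 2)  # returns immediately
other_work = 1 + 2                 # main thread keeps going
value = z()                        # waits for the worker here
```

Storing the exception and re-raising it in __call__ keeps failures visible to the caller instead of losing them in the worker thread.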
pandasutils
Pandas utilities: interval-based operations, multi-column table display, downcast (memory optimization), fuzzymerge (fuzzy string matching for joins). Re-exports all pandas functionality.
Pandas wrappers and utilities
This module provides utility functions for pandas DataFrames and Series, including null checking, type downcasting, fuzzy merging, and timezone data.
- is_null(x)[source]
Check if value is null/None (pandas required).
For array-like inputs (list, numpy array), returns True only if ALL elements are null. This avoids the “ambiguous truth value” error that occurs when using pandas.isnull() on arrays in boolean contexts.
- Parameters:
x – Value to check.
- Returns:
True if value is null/None/NaN, or if array-like and all elements are null.
- Return type:
Example:
>>> import datetime
>>> import numpy as np
>>> assert is_null(None)
>>> assert not is_null(0)
>>> assert is_null(np.nan)
>>> assert not is_null(datetime.date(2000, 1, 1))
>>> assert is_null([])
>>> assert is_null([None, None])
>>> assert not is_null([1, 2, 3])
>>> assert not is_null([None, 1])
- download_tzdata()[source]
Download timezone data for pyarrow date wrangling.
Downloads to the “Downloads” folder.
- downcast(df, rtol=1e-05, atol=1e-08, numpy_dtypes_only=False)[source]
Downcast DataFrame to minimum viable type for each column.
Ensures resulting values are within tolerance of original values.
- Parameters:
- Returns:
Downcasted DataFrame.
- Return type:
DataFrame
Note
See numpy.allclose for tolerance parameters.
Example:
>>> from numpy import linspace, random
>>> from pandas import DataFrame
>>> data = {
...     "integers": linspace(1, 100, 100),
...     "floats": linspace(1, 1000, 100).round(2),
...     "booleans": random.choice([1, 0], 100),
...     "categories": random.choice(["foo", "bar", "baz"], 100)}
>>> df = DataFrame(data)
>>> downcast(df, rtol=1e-10, atol=1e-10).info()
<class 'pandas.core.frame.DataFrame'>
...
dtypes: bool(1), category(1), float64(1), uint8(1)
memory usage: 1.3 KB
>>> downcast(df, rtol=1e-05, atol=1e-08).info()
<class 'pandas.core.frame.DataFrame'>
...
dtypes: bool(1), category(1), float32(1), uint8(1)
memory usage: 964.0 bytes
- fuzzymerge(df1, df2, right_on, left_on, usedtype='uint8', scorer='WRatio', concat_value=True, **kwargs)[source]
Merge two DataFrames using fuzzy matching on specified columns.
Performs fuzzy matching between DataFrames based on specified columns, useful for matching data with small variations like typos or abbreviations.
- Parameters:
df1 (DataFrame) – First DataFrame to merge.
df2 (DataFrame) – Second DataFrame to merge.
right_on (str) – Column name in df2 for matching.
left_on (str) – Column name in df1 for matching.
usedtype – Data type for distance matrix (default: uint8).
scorer – Scoring function for fuzzy matching (default: WRatio).
concat_value (bool) – Add similarity scores column (default: True).
kwargs – Additional arguments for pandas.merge.
- Returns:
Merged DataFrame with fuzzy-matched rows.
- Return type:
DataFrame
Example:
>>> df1 = read_csv(
...     "https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv"
... )
>>> df2 = df1.copy()
>>> df2 = concat([df2 for x in range(3)], ignore_index=True)
>>> df2.Name = (df2.Name + random.uniform(1, 2000, len(df2)).astype("U"))
>>> df1 = concat([df1 for x in range(3)], ignore_index=True)
>>> df1.Name = (df1.Name + random.uniform(1, 2000, len(df1)).astype("U"))
>>> df3 = fuzzymerge(df1, df2, right_on='Name', left_on='Name', usedtype=uint8, scorer=partial_ratio,
...                  concat_value=True)
rand
Random number generation with enhanced seeding, sampling, and distribution functions.
- random_choice(choices)[source]
Random choice from a list, seeded with OS entropy.
- Parameters:
choices (list) – List of items to choose from.
- Returns:
Randomly selected item.
Example:
>>> result = random_choice(['a', 'b', 'c'])
>>> result in ['a', 'b', 'c']
True
- random_int(a, b)[source]
Random integer between a and b inclusive, seeded with OS entropy.
- Parameters:
- Returns:
Random integer in [a, b].
- Return type:
Example:
>>> result = random_int(1, 10)
>>> 1 <= result <= 10
True
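One way to get OS-backed randomness with the standard library is random.SystemRandom, which draws from os.urandom. The sketch below assumes that approach; the module may instead seed the default generator from OS entropy, and the function names here are hypothetical.

```python
import random

# SystemRandom reads from the operating system's entropy source (os.urandom),
# so it needs no explicit seeding and its sequence cannot be reproduced.
_system_random = random.SystemRandom()


def random_choice_sketch(choices):
    """Pick one item from a non-empty sequence."""
    return _system_random.choice(choices)


def random_int_sketch(a, b):
    """Return an integer N with a <= N <= b (both ends inclusive)."""
    return _system_random.randint(a, b)


picked = random_choice_sketch(["a", "b", "c"])
rolls = [random_int_sketch(1, 10) for _ in range(100)]
```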
thread
Threading utilities: asyncd decorator for async-style syntax,
threaded decorator for background execution in thread pools.
- asyncd(func)[source]
Decorator to run synchronous function asynchronously.
- Parameters:
func – Synchronous function to wrap.
- Returns:
Async wrapper function.
Note
Based on https://stackoverflow.com/a/50450553
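A common way to build such a decorator is to hand the blocking call to the event loop's default thread pool. The following is a minimal sketch of that technique with a hypothetical name, asyncd_sketch; the actual signature and options of asyncd may differ.

```python
import asyncio
import functools
import time


def asyncd_sketch(func):
    """Wrap a blocking function so it can be awaited without blocking the event loop."""

    @functools.wraps(func)
    async def run(*args, **kwargs):
        loop = asyncio.get_running_loop()
        call = functools.partial(func, *args, **kwargs)
        # Passing None selects the loop's default ThreadPoolExecutor.
        return await loop.run_in_executor(None, call)

    return run


@asyncd_sketch
def slow_square(x):
    time.sleep(0.1)  # stands in for blocking I/O
    return x * x


async def main():
    # Both calls run in worker threads concurrently.
    return await asyncio.gather(slow_square(2), slow_square(3))


results = asyncio.run(main())
```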
- call_with_future(fn, future, args, kwargs)[source]
Call function and set result on future.
- Parameters:
fn – Function to call.
future – Future to set result on.
args – Positional arguments for fn.
kwargs – Keyword arguments for fn.
- class RateLimitedExecutor(max_workers=None, max_per_second=inf, show_progress=False)[source]
Bases: object
Thread pool executor with rate limiting and Request/Response API.
Provides clean request/response API where every response includes the original request for easy result tracking and exception handling.
Basic Usage:
with RateLimitedExecutor(max_workers=10, max_per_second=5, show_progress=True) as executor:
    responses = executor.execute_items(process_fn, items, desc='Processing')
    for response in responses:
        if response.success:
            print(f"Item {response.request.id}: {response.result}")
        else:
            print(f"Item {response.request.id} failed: {response.exception}")
Advanced Usage with Custom IDs:
requests = [TaskRequest(item=x, id=f'custom_{i}') for i, x in enumerate(items)]
responses = executor.execute(process_fn, requests, desc='Processing')
result_map = {r.request.id: r.result for r in responses if r.success}
- __init__(max_workers=None, max_per_second=inf, show_progress=False)[source]
Initialize rate-limited executor.
- execute(fn, requests, desc='Processing', unit='item')[source]
Execute function on all requests and return responses in order.
- Parameters:
- Returns:
List of TaskResponse objects in same order as requests.
- Return type:
- execute_items(fn, items, desc='Processing', unit='item')[source]
Execute function on items with auto-generated request IDs.
- submit(fn, *args, **kwargs)[source]
Submit a callable to be executed with rate limiting.
- Parameters:
fn – Callable to execute.
args – Positional arguments.
kwargs – Keyword arguments.
- Returns:
Future representing the result.
- Return type:
Future
- class TaskRequest(item, id=None)[source]
Bases: object
Request to process an item with optional ID for tracking.
- item: Any
- id: Any = None
- class TaskResponse(result, request, exception=None)[source]
Bases: object
Response from processing a request.
- Parameters:
result (Any) – The result from processing.
request (TaskRequest) – The original TaskRequest.
exception (Exception) – Any exception that occurred (None if successful).
- result: Any
- request: TaskRequest
- exception: Exception = None
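The request/response idea behind RateLimitedExecutor, TaskRequest, and TaskResponse can be illustrated with a small standalone sketch. Everything below is hypothetical (the names end in Sketch) and simplified: it throttles inside the worker threads with a shared time-slot limiter, which may not be how the real class enforces max_per_second, and it omits progress display.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RequestSketch:
    item: Any
    id: Any = None


@dataclass
class ResponseSketch:
    result: Any
    request: RequestSketch
    exception: Optional[Exception] = None

    @property
    def success(self):
        return self.exception is None


class LimiterSketch:
    """Hand out start times spaced at least 1/max_per_second apart."""

    def __init__(self, max_per_second):
        self._interval = 1.0 / max_per_second
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self):
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        time.sleep(max(0.0, slot - now))  # sleep outside the lock


def execute_sketch(fn, requests, max_workers=4, max_per_second=50.0):
    limiter = LimiterSketch(max_per_second)

    def run(request):
        limiter.wait()
        try:
            return ResponseSketch(result=fn(request.item), request=request)
        except Exception as exc:
            # Failures are captured, not raised, so one bad item cannot abort the batch.
            return ResponseSketch(result=None, request=request, exception=exc)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, requests))  # map() preserves input order


def reciprocal(x):
    return 1 / x


reqs = [RequestSketch(item=x, id=f"custom_{i}") for i, x in enumerate([1, 2, 0, 4])]
responses = execute_sketch(reciprocal, reqs)
result_map = {r.request.id: r.result for r in responses if r.success}
failed_ids = [r.request.id for r in responses if not r.success]
```

Because each response carries its originating request, results and failures can be matched back to inputs without relying on positional bookkeeping.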
- threaded(fn)[source]
Decorator to run function in a separate thread.
Returns a Future that can be used to get the result.
Note
Based on https://stackoverflow.com/a/19846691
Example:
>>> class MyClass:
...     @threaded
...     def get_my_value(self):
...         return 1
>>> my_obj = MyClass()
>>> fut = my_obj.get_my_value()  # this will run in a separate thread
>>> fut.result()  # will block until result is computed
1
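How threaded and call_with_future plausibly fit together can be sketched with concurrent.futures.Future and a plain Thread. This is an assumption-laden illustration with hypothetical names; the module's version may use a thread pool and differ in detail.

```python
from concurrent.futures import Future
from functools import wraps
from threading import Thread


def call_with_future_sketch(fn, future, args, kwargs):
    """Run fn and record either its result or its exception on the future."""
    try:
        future.set_result(fn(*args, **kwargs))
    except Exception as exc:
        future.set_exception(exc)


def threaded_sketch(fn):
    """Decorator: each call starts a new thread and returns a Future immediately."""

    @wraps(fn)
    def wrapper(*args, **kwargs):
        future = Future()
        Thread(target=call_with_future_sketch, args=(fn, future, args, kwargs)).start()
        return future

    return wrapper


@threaded_sketch
def add(a, b):
    return a + b


@threaded_sketch
def fail():
    raise ValueError("boom")


total = add(1, 2).result()  # blocks until the worker thread finishes
error = fail().exception()  # the exception is captured, not lost in the thread
```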
chart
Plotting utilities for creating charts, particularly time series visualizations using matplotlib.
- numpy_timeseries_plot(title, dates, series=None, labels=None, formats=None)[source]
Create a matplotlib timeseries plot with automatic subplot layout.
The layout adapts based on the number of series:
1 series: single plot
2 series: same plot with dual y-axes (overlapping)
3+ series: stacked vertically as subplots
win
Windows-specific utilities: command execution, psexec sessions, file share mounting, WMIC output parsing.
Windows Utilities
- run_command(cmd, workingdir=None, raise_on_error=True, hidearg=None)[source]
Execute a shell command and return output.
- Parameters:
- Returns:
Combined stdout and stderr output.
- Return type:
- Raises:
Exception – If command fails and raise_on_error is True.
See also
See tests/test_win.py for usage examples.
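The documented behaviour (combined stdout and stderr, optional error on failure) maps naturally onto subprocess.run. The sketch below is an approximation under assumptions: run_command_sketch is a hypothetical name, the hidearg parameter is omitted because its semantics are not described here, and RuntimeError is used as the raised exception type.

```python
import subprocess


def run_command_sketch(cmd, workingdir=None, raise_on_error=True):
    """Run a shell command and return its combined stdout and stderr as text."""
    proc = subprocess.run(
        cmd,
        shell=True,
        cwd=workingdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the stdout stream
        text=True,
    )
    if raise_on_error and proc.returncode != 0:
        raise RuntimeError(f"command failed with exit code {proc.returncode}: {proc.stdout}")
    return proc.stdout


output = run_command_sketch("echo hello")
quiet_failure = run_command_sketch("exit 3", raise_on_error=False)
```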
- class psexec_session(host, password)[source]
Bases: object
Context manager for running psexec commands.
Mounts admin share before commands and unmounts on exit.
Example:
with shell.psexec_session(host, password):
    for cmd in commands:
        out = shell.run_command(cmd)
- class file_share_session(host, password, drive, share)[source]
Bases: object
Context manager for temporarily mounting a file share.
Mounts share before commands and unmounts on exit.
- Parameters:
Example:
with shell.file_share_session(host, password, 'Z:', 'data'):
    for cmd in commands:
        out = shell.run_command(cmd)
- mount_admin_share(host, password, unmount=False)[source]
Mount or unmount the admin$ share required for psexec commands.
Resolves host to IP address to avoid Windows multiple-connection errors.
- Parameters:
Note
Connects by IP address to work around Windows complaining about multiple connections to a share by the same user.
- mount_file_share(host, password, drive, share, unmount=False)[source]
Mount or unmount a Windows file share.
- parse_wmic_output(output)[source]
Parse output from WMIC query into list of dicts.
- Parameters:
output (str) – Raw WMIC output string.
- Returns:
List of dictionaries with column headers as keys.
- Return type:
Example:
>> wmic_output = os.popen('wmic product where name="Python 2.7.11" get Caption, Description, Vendor').read()
>> result = parse_wmic_output(wmic_output)
>> result[0]['Caption']
>> result[0]['Vendor']
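WMIC's default table output is laid out in fixed-width columns, so one plausible parsing approach is to take column boundaries from the header line and slice each row accordingly. The sketch below assumes single-word column headers and uses made-up sample text; parse_fixed_width_sketch is a hypothetical name and the real parser may work differently.

```python
import re


def parse_fixed_width_sketch(output):
    """Parse a fixed-width text table into a list of dicts keyed by header name."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0]
    # Each column starts at its header word and runs up to the next header word.
    spans = [(m.start(), m.end()) for m in re.finditer(r"\S+\s*", header)]
    names = [header[start:end].strip() for start, end in spans]
    rows = []
    for line in lines[1:]:
        row = {}
        for index, (name, (start, end)) in enumerate(zip(names, spans)):
            # The last column extends to the end of the line.
            stop = end if index < len(spans) - 1 else None
            row[name] = line[start:stop].strip()
        rows.append(row)
    return rows


# Illustrative sample only; it is not real WMIC output.
table = [
    ("Caption", "Vendor", "Version"),
    ("Example App", "Example Co", "1.0"),
    ("Other Tool", "Other Corp", "2.5.1"),
]
sample = "\n".join("".join(cell.ljust(13) for cell in row) for row in table)
parsed = parse_fixed_width_sketch(sample)
```

Slicing by header position, rather than splitting on whitespace, keeps values that contain spaces (such as "Example App") intact.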
- exit_cmd()[source]
Kill all running cmd.exe processes via WMI.
Requires pywin32 to be installed on Windows.