fractopo.analysis.length_distributions module

Utilities for analyzing and plotting length distributions for line data.

class fractopo.analysis.length_distributions.Dist(*values)

Bases: Enum

Enumeration of powerlaw model types.

EXPONENTIAL = 'exponential'
LOGNORMAL = 'lognormal'
POWERLAW = 'power_law'
TRUNCATED_POWERLAW = 'truncated_power_law'
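
Each member's value is the plain string name of a distribution model, so a member can be looked up from that string. The sketch below uses a stand-in enum, DistSketch (a hypothetical name), that mirrors the members listed above so that it runs without fractopo installed; the real Dist class would presumably behave the same way.

```python
from enum import Enum, unique


@unique
class DistSketch(Enum):
    """Stand-in mirroring the members documented for Dist (hypothetical name)."""

    EXPONENTIAL = "exponential"
    LOGNORMAL = "lognormal"
    POWERLAW = "power_law"
    TRUNCATED_POWERLAW = "truncated_power_law"


# Look a member up by its string value.
member = DistSketch("power_law")

# Collect every value, e.g. to iterate over model names.
all_values = [dist.value for dist in DistSketch]
```
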
class fractopo.analysis.length_distributions.LengthDistribution(lengths: ndarray, area_value: float, using_branches: bool, name: str = '', _automatic_fit: Fit | None = None)

Bases: object

Dataclass for length distributions.

area_value: float
property automatic_fit: Fit | None

Get automatic powerlaw Fit.

generate_distributions(cut_off: float = 1e-18)

Generate ccdf and truncated length data with cut_off.

lengths: ndarray
manual_fit(cut_off: float)

Get manual powerlaw Fit.

name: str = ''
using_branches: bool
class fractopo.analysis.length_distributions.MultiLengthDistribution(distributions: list[LengthDistribution], using_branches: bool, fitter: Callable[[ndarray, ndarray], tuple[float, float]] = <function numpy_polyfit>, cut_offs: list[float] | None = None, _fit_to_multi_scale_lengths: tuple[ndarray, float, float] | None = None, _normalized_distributions: tuple[list[ndarray], list[ndarray]] | None = None, _optimized: bool = False)

Bases: object

Multi-scale length distribution composed of several LengthDistribution instances.

cut_offs: list[float] | None = None
distributions: list[LengthDistribution]
fitter(log_lengths: ndarray, log_ccm: ndarray) → tuple[float, float]

Fit numpy polyfit to data.

property names: list[str]

Get length distribution names.

normalized_distributions(automatic_cut_offs: bool) → tuple[list[ndarray], list[ndarray], list[ndarray], list[ndarray], list[float]]

Create normalized and truncated lengths and ccms.

optimize_cut_offs(shgo_kwargs: dict[str, Any] | None = None, scorer: Callable[[ndarray, ndarray], float] = <function mean_squared_log_error>) → tuple[MultiScaleOptimizationResult, MultiLengthDistribution]

Get cut-off optimized MultiLengthDistribution.

optimized_multi_scale_fit(scorer: Callable[[ndarray, ndarray], float], shgo_kwargs: dict[str, Any]) → MultiScaleOptimizationResult

Use scipy.optimize.shgo to optimize fit.

plot_multi_length_distributions(automatic_cut_offs: bool, plot_truncated_data: bool, scorer: Callable[[ndarray, ndarray], float] = <function mean_squared_log_error>) → tuple[Polyfit, Figure, Axes]

Plot multi-scale length distribution.

using_branches: bool
class fractopo.analysis.length_distributions.MultiScaleOptimizationResult(polyfit: Polyfit, cut_offs: ndarray, optimize_result: OptimizeResult, x0: ndarray, bounds: ndarray, proportions_of_data: list[float])

Bases: NamedTuple

Results of scipy.optimize.shgo on length data.

bounds: ndarray

Alias for field number 4

cut_offs: ndarray

Alias for field number 1

optimize_result: OptimizeResult

Alias for field number 2

polyfit: Polyfit

Alias for field number 0

proportions_of_data: list[float]

Alias for field number 5

x0: ndarray

Alias for field number 3

class fractopo.analysis.length_distributions.Polyfit(y_fit: ndarray, m_value: float, constant: float, score: float, scorer: Callable[[ndarray, ndarray], float])

Bases: NamedTuple

Results of a polyfit to length data.

constant: float

Alias for field number 2

m_value: float

Alias for field number 1

score: float

Alias for field number 3

scorer: Callable[[ndarray, ndarray], float]

Alias for field number 4

y_fit: ndarray

Alias for field number 0
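
The "Alias for field number N" notes reflect that Polyfit is a NamedTuple, so each field can be read by name, by position, or by unpacking. A minimal sketch with a stand-in class, PolyfitSketch (hypothetical; the field names and order are copied from the signature above, while the values and dummy_scorer are made up purely for illustration):

```python
from typing import Callable, NamedTuple

import numpy as np


class PolyfitSketch(NamedTuple):
    """Stand-in with the same field order as the documented Polyfit."""

    y_fit: np.ndarray
    m_value: float
    constant: float
    score: float
    scorer: Callable[[np.ndarray, np.ndarray], float]


def dummy_scorer(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Illustrative scorer (plain mean squared error, an assumption)."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


# Illustrative values only, not the result of a real fit.
result = PolyfitSketch(
    y_fit=np.array([1.0, 0.5]),
    m_value=-1.0,
    constant=0.0,
    score=0.0,
    scorer=dummy_scorer,
)

by_name = result.m_value  # access by field name
by_index = result[1]  # "Alias for field number 1"
y_fit, m_value, constant, score, scorer = result  # positional unpacking
```
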

class fractopo.analysis.length_distributions.SilentFit(data: ndarray, discrete=False, xmin=None, xmax=None, verbose=True, fit_method='Likelihood', estimate_discrete=True, discrete_approximation='round', sigma_threshold=None, parameter_range=None, fit_optimizer=None, xmin_distance='D', xmin_distribution='power_law', **kwargs)

Bases: Fit

Wrap powerlaw.Fit for the singular purpose of silencing output.

Silences output both to stdout and stderr.
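
One common way to silence both streams is to redirect them with contextlib while the wrapped call runs. The sketch below shows that pattern on an arbitrary callable; call_silently and noisy are hypothetical names, and whether SilentFit uses this exact mechanism is an assumption.

```python
import contextlib
import io
import sys


def call_silently(func, *args, **kwargs):
    """Run func while discarding anything it writes to stdout or stderr."""
    sink = io.StringIO()
    with contextlib.redirect_stdout(sink), contextlib.redirect_stderr(sink):
        return func(*args, **kwargs)


def noisy(value):
    """Hypothetical function that writes to both streams."""
    print("progress message")
    print("warning message", file=sys.stderr)
    return value * 2


result = call_silently(noisy, 21)
```
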

fractopo.analysis.length_distributions.all_fit_attributes_dict(fit: Fit) → dict[str, float]

Collect ‘all’ fit attributes into a dict.

fractopo.analysis.length_distributions.apply_cut_off(lengths: ndarray, ccm: ndarray, cut_off: float = 1e-18) → tuple[ndarray, ndarray]

Apply cut-off to length data and associated ccm.

>>> lengths = np.array([2, 4, 8, 16, 32])
>>> ccm = np.array([1. , 0.8, 0.6, 0.4, 0.2])
>>> cut_off = 4.5
>>> apply_cut_off(lengths, ccm, cut_off)
(array([ 8, 16, 32]), array([0.6, 0.4, 0.2]))

fractopo.analysis.length_distributions.calculate_critical_distance_value(data_length: int, data_length_minimum: int = 51)

Calculate the approximate critical Kolmogorov-Smirnov distance value for large sample counts (n > 50).

Assumes a significance level of 0.05. If the Kolmogorov-Smirnov distance value is smaller than the critical value, the null hypothesis, i.e., that the distributions are the same, cannot be rejected; this does not prove that it is true.
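
A widely used large-sample approximation for the one-sample Kolmogorov-Smirnov critical value at a significance level of 0.05 is 1.36 / sqrt(n). The sketch below implements that approximation. The function name approx_critical_distance and the choice to raise an error for too-small samples are assumptions and not necessarily what fractopo does.

```python
import math


def approx_critical_distance(data_length: int, data_length_minimum: int = 51) -> float:
    """Approximate KS critical value at significance 0.05 for large samples."""
    if data_length < data_length_minimum:
        # Assumption: the approximation is refused for small samples.
        raise ValueError(
            f"Expected at least {data_length_minimum} samples, got {data_length}."
        )
    return 1.36 / math.sqrt(data_length)


critical = approx_critical_distance(100)

# Hypothetical KS distance from some fit, for illustration only.
ks_distance = 0.1
# The null hypothesis is rejected only if the distance reaches the critical value.
reject_null = ks_distance >= critical
```
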

fractopo.analysis.length_distributions.calculate_exponent(alpha: float)

Calculate exponent from powerlaw.alpha.
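
powerlaw reports alpha for a probability density of the form p(x) ∝ x^(-alpha); integrating that density gives a complementary cumulative distribution proportional to x^(-(alpha - 1)). Assuming the exponent meant here is that of the complementary cumulative form (an assumption, since the docstring does not say), the conversion can be sketched as follows, with ccdf_exponent_from_alpha being a hypothetical name:

```python
def ccdf_exponent_from_alpha(alpha: float) -> float:
    """Convert a density exponent alpha to the exponent of the CCDF.

    Assumption: the density is p(x) ~ x**(-alpha), so the CCDF
    scales as x**(-(alpha - 1)).
    """
    return -(alpha - 1.0)


exponent = ccdf_exponent_from_alpha(2.5)
```
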

fractopo.analysis.length_distributions.calculate_fitted_values(log_lengths: ndarray, m_value: float, constant: float) → ndarray

Calculate fitted values of y.
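
For a straight-line fit in log-log space, log(y) = m_value * log(x) + constant, the fitted values in linear space follow by exponentiating. A minimal sketch, assuming natural logarithms and that the values are returned in linear rather than log space (both assumptions); fitted_values_sketch is a hypothetical name:

```python
import numpy as np


def fitted_values_sketch(
    log_lengths: np.ndarray, m_value: float, constant: float
) -> np.ndarray:
    """Evaluate the log-log line and map it back to linear space."""
    return np.exp(m_value * log_lengths + constant)


lengths = np.array([1.0, 2.0, 4.0])
# A line with slope -2 and intercept log(3) corresponds to y = 3 * x**-2.
y_fit = fitted_values_sketch(np.log(lengths), m_value=-2.0, constant=np.log(3.0))
```
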

fractopo.analysis.length_distributions.cut_off_proportion_of_data(fit: Fit, length_array: ndarray) → float

Get the proportion of data removed by the powerlaw cut-off.

If no fit is passed, the cut-off is the one used in automatic_fit.
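
The proportion can be understood as the share of lengths that fall below the cut-off. The sketch below takes the cut-off as a plain float so that it runs without powerlaw; with a powerlaw Fit the cut-off would presumably be read from fit.xmin. The name proportion_cut_off_sketch and the strict "less than" comparison are assumptions.

```python
import numpy as np


def proportion_cut_off_sketch(length_array: np.ndarray, cut_off: float) -> float:
    """Return the fraction of lengths strictly below cut_off."""
    if len(length_array) == 0:
        # Assumption: an empty array has nothing cut off.
        return 0.0
    return float(np.sum(length_array < cut_off) / len(length_array))


lengths = np.array([2, 4, 8, 16, 32])
proportion = proportion_cut_off_sketch(lengths, cut_off=4.5)
```
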

fractopo.analysis.length_distributions.describe_powerlaw_fit(fit: Fit, length_array: ndarray, label: str | None = None) → dict[str, float]

Compose dict of fit powerlaw attributes and comparisons between fits.

fractopo.analysis.length_distributions.distribution_compare_dict(fit: Fit) → dict[str, float]

Compose a dict of length distribution fit comparisons.

fractopo.analysis.length_distributions.fit_to_multi_scale_lengths(ccm: ndarray, lengths: ndarray, fitter: Callable[[ndarray, ndarray], tuple[float, float]] = <function numpy_polyfit>, scorer: Callable[[ndarray, ndarray], float] = <function mean_squared_log_error>) → Polyfit

Fit np.polyfit to multiscale length distributions.

Returns the fitted values, exponent and constant of fit within a Polyfit instance.

fractopo.analysis.length_distributions.numpy_polyfit(log_lengths: ndarray, log_ccm: ndarray) → tuple[float, float]

Fit numpy polyfit to data.
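
A first-degree np.polyfit on log-transformed data returns the slope and intercept of the log-log line, which matches the (float, float) return type. A minimal sketch under that assumption (polyfit_sketch is a hypothetical name, and the input is synthetic data that follows an exact power law):

```python
import numpy as np


def polyfit_sketch(log_lengths: np.ndarray, log_ccm: np.ndarray) -> tuple[float, float]:
    """Fit a straight line in log-log space and return (slope, intercept)."""
    m_value, constant = np.polyfit(log_lengths, log_ccm, 1)
    return float(m_value), float(constant)


# Synthetic data following ccm = 3 * length**-2 exactly.
lengths = np.array([1.0, 2.0, 4.0, 8.0])
ccm = 3.0 * lengths ** -2.0
m_value, constant = polyfit_sketch(np.log(lengths), np.log(ccm))
```
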

fractopo.analysis.length_distributions.optimize_cut_offs(cut_offs: ndarray, distributions: list[LengthDistribution], fitter: Callable[[ndarray, ndarray], tuple[float, float]], scorer: Callable[[ndarray, ndarray], float], *_) → float

Optimize multiscale fit.

The objective function passed to the optimizer must take a single 1-D array as its first argument and return a single float. It can also take static arguments (distributions, fitter, scorer).
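
The calling convention described above is the one scipy.optimize.shgo expects: the objective receives a 1-D array of candidate values first, followed by the static arguments supplied through args. The sketch below demonstrates the convention on a toy quadratic rather than on length data; objective, target and the bounds are hypothetical.

```python
import numpy as np
from scipy.optimize import shgo


def objective(candidate: np.ndarray, target: np.ndarray) -> float:
    """Toy objective: squared distance between candidate and a static target."""
    return float(np.sum((candidate - target) ** 2))


# Static argument passed unchanged to every objective call.
target = np.array([1.0, 2.0])
# One (lower, upper) pair per optimized value.
bounds = [(0.0, 5.0), (0.0, 5.0)]

result = shgo(objective, bounds=bounds, args=(target,))
best_values = result.x
```

In the multi-scale case the optimized array would hold one cut-off per distribution, and the returned float would be the score of the resulting fit.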

fractopo.analysis.length_distributions.plot_distribution_fits(length_array: ndarray, label: str, using_branches: bool, use_probability_density_function: bool, cut_off: float | None = None, fit: Fit | None = None, fig: Figure | None = None, ax: Axes | None = None, fits_to_plot: tuple[Dist, ...] = (Dist.POWERLAW, Dist.LOGNORMAL, Dist.EXPONENTIAL), plain: bool = False) → tuple[Fit | None, Figure, Axes]

Plot length distribution and powerlaw fits.

If a powerlaw.Fit is not given it will be automatically determined (using the optionally given cut_off).

fractopo.analysis.length_distributions.plot_fit_on_ax(ax: Axes, fit: Fit, fit_distribution: Dist, use_probability_density_function: bool) → None

Plot powerlaw model to ax.

fractopo.analysis.length_distributions.plot_multi_distributions_and_fit(truncated_length_array_all: list[ndarray], ccm_array_normed_all: list[ndarray], full_length_array_all: list[ndarray], full_ccm_array_normed_all: list[ndarray], cut_offs: list[float], names: list[str], polyfit: Polyfit, using_branches: bool, plot_truncated_data: bool) → tuple[Figure, Axes]

Plot multi-scale length distribution.

fractopo.analysis.length_distributions.scikit_linear_regression(log_lengths: ndarray, log_ccm: ndarray) → tuple[float, float]

Fit using scikit LinearRegression.

fractopo.analysis.length_distributions.setup_ax_for_ld(ax: Axes, using_branches: bool, indiv_fit: bool, use_probability_density_function: bool, plain: bool = False)

Configure ax for length distribution plots.

Parameters:
  • ax – Axes to set up.

  • using_branches – Whether the lines in the axis are branches (True) or traces (False).

  • indiv_fit – Whether the plot is single-scale (True) or multi-scale (False).

  • use_probability_density_function – Whether to use the probability density function instead of the complementary cumulative distribution function.

  • plain – Whether to keep the styling to a minimum.

fractopo.analysis.length_distributions.setup_length_dist_legend(ax: Axes)

Set up legend for length distribution plots.

Used for both single and multi distribution plots.

fractopo.analysis.length_distributions.sort_and_log_lengths_and_ccm(lengths: ndarray, ccm: ndarray) → tuple[ndarray, ndarray]

Preprocess lengths and ccm.

Sorts them by length and calculates their natural logarithms.
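
The preprocessing can be sketched as follows, assuming that both arrays are reordered by ascending length so that each ccm value stays paired with its length (an assumption about the sort key); sort_and_log_sketch is a hypothetical name.

```python
import numpy as np


def sort_and_log_sketch(
    lengths: np.ndarray, ccm: np.ndarray
) -> tuple[np.ndarray, np.ndarray]:
    """Sort both arrays by ascending length, then take natural logarithms."""
    order = lengths.argsort()
    return np.log(lengths[order]), np.log(ccm[order])


# Unsorted input where each ccm value belongs to the length at the same index.
lengths = np.array([8.0, 2.0, 4.0])
ccm = np.array([0.2, 1.0, 0.6])
log_lengths, log_ccm = sort_and_log_sketch(lengths, ccm)
```
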

fractopo.analysis.length_distributions.sorted_lengths_and_ccm(lengths: ndarray, area_value: float | None) → tuple[ndarray, ndarray]

Get (normalized) complementary cumulative number array.

Pass area_value as None to skip normalization.

>>> lengths = np.array([2, 4, 8, 16, 32])
>>> area_value = 10.0
>>> sorted_lengths_and_ccm(lengths, area_value)
(array([ 2,  4,  8, 16, 32]), array([0.1 , 0.08, 0.06, 0.04, 0.02]))
>>> lengths = np.array([2, 4, 8, 16, 32])
>>> area_value = None
>>> sorted_lengths_and_ccm(lengths, area_value)
(array([ 2,  4,  8, 16, 32]), array([1. , 0.8, 0.6, 0.4, 0.2]))