Metric definitions¶

This page gives the metric definitions used by the project and maps them to the implementation in fbool. The bibliography at the end lists the sources behind the standard Boolean-function metrics and the project-specific descriptors.

Notation¶

Let \(f : \{0,1\}^n \to \{0,1\}\) be a Boolean function. fbool stores \(f\) as a truth table with \(2^n\) entries.

For a set of variables \(S\), write \(\bar S\) for the complement. Fixing the variables in \(S\) to a concrete assignment \(x_S\) produces a restricted subfunction:

\[ g_{x_S}(x_{\bar S}) = f(x). \]

The set or multiset of these restricted functions is the main object behind the information, entanglement, entropy-based entanglement, and fragmentation metrics.

Entropy¶

Let \(X\) be uniformly distributed over all inputs. Define:

\[ p_0 = P(f(X)=0), \qquad p_1 = P(f(X)=1). \]

The output entropy is:

\[ H(f) = -p_0 \log_2 p_0 - p_1 \log_2 p_1. \]

Intuition:

constant functions have entropy 0;
balanced functions have maximal output entropy;
entropy sees only the number of zeros and ones, not where they occur.

Background: this is the usual Shannon entropy applied to the output random variable of a Boolean function. General Boolean-function notation and analysis context follow standard references such as O'Donnell [ODonnell2014].

API naming

This definition is the plain output entropy \(H(f)\). It is not what f.entropy() currently computes.

In the current API, entropy() is reserved for the entropy-based entanglement score over balanced variable partitions. This naming will be changed in a future version so the API distinguishes plain output entropy from entropy-based entanglement more explicitly.

Information of a variable set¶

For a variable set \(S\) with \(|S| = k\), fix \(S\) in all \(2^k\) possible ways and collect the restricted subfunctions. The information value is the number of distinct restricted functions:

\[ i(S) = |\{g_{x_S} : x_S \in \{0,1\}^k\}|. \]

Intuition:

low \(i(S)\): many assignments to \(S\) behave the same once the other variables are left free;
high \(i(S)\): \(S\) creates many genuinely different views of the function.

Usage:

Rust Python

let info = f.information(&[0, 1]);

info = f.information([0, 1])

Entanglement¶

The entanglement metric follows the variable-partition view introduced for Boolean functions in the equanimity/entanglement work [Carro2023]. Its communication-style interpretation is related to best-partitioning ideas used in multilevel logic synthesis [Hwang1990].

It is defined over half-sized partitions:

\[ ent(f) = \min_{\substack{S \subset \{1,\dots,n\}\\ |S|=\lfloor n/2 \rfloor}} \left(i(S) + i(\bar S)\right). \]

The half-size restriction avoids uninformative partitions where one side is too small or too large.

Intuition:

if some balanced split makes both sides simple, entanglement is low;
if every balanced split leaves both sides with many possible subfunctions, the variables are strongly interdependent.

Usage:

Rust Python

let value = f.entanglement();
let witnesses = f.entanglement_sets();

value = f.entanglement()
witnesses = f.entanglement_sets()

Min-max entanglement¶

fbool also exposes a conservative variant:

\[ ent_{\max}(f) = \min_S \max(i(S), i(\bar S)). \]

It asks for the best balanced split after judging the split by its harder side, not by the sum of both sides.

Use it when a single difficult side should dominate the score.

Entropy-based entanglement¶

Instead of counting distinct restricted functions, the entropy-based variant uses their empirical distribution.

Let \(F(S)\) be the multiset of restricted functions obtained by fixing \(S\). If a restricted function \(g\) appears with multiplicity \(\mathrm{mul}(g)\), define:

\[ \sigma_g = \frac{\mathrm{mul}(g)}{2^{|S|}}. \]

The partition entropy is:

\[ E(S) = - \sum_{g \in F(S)} \sigma_g \log_2 \sigma_g. \]

The entropy-based entanglement score is:

\[ T(f) = \min_{\substack{S \subset \{1,\dots,n\}\\ |S|=\lfloor n/2 \rfloor}} \left(E(S) + E(\bar S)\right). \]

In fbool, entropy() and entropy_sets() use this partition-entropy view over balanced bipartitions. The fragmentation API generalizes it across all subset sizes.

This is a project-level extension of the same restriction/partition framework used by entanglement [Carro2023].

Fragmentation¶

Fragmentation asks the same entropy-of-restrictions question as \(E(S)\), but organizes it by subset size.

For each \(k\):

\[ S_k(f) = \operatorname{avg}_{|S|=k} E(S). \]

The spectrum is:

\[ (S_0(f), S_1(f), \dots, S_n(f)). \]

Intuition:

it shows at which subset sizes the function starts producing diverse restrictions;
the peak identifies the subset size with maximal average fragmentation;
first and second differences describe how sharply the profile changes.

Usage:

Rust Python

let spectrum = f.fragmentation_spectrum();
let peak = f.fragmentation_peak();

spectrum = f.fragmentation_spectrum()
k_star, s_max = f.fragmentation_peak()

Fragmentation is a project-level profile built from the entropy-based restriction distribution above.

Spectral entropy¶

Let \(\widehat W_\omega(f)\) be the Walsh-Hadamard spectrum. Normalize squared coefficients into an energy distribution:

\[ p_\omega = \frac{\widehat W_\omega(f)^2} {\sum_{\nu \in \{0,1\}^n} \widehat W_\nu(f)^2}. \]

Spectral entropy is:

\[ H_{spectral}(f) = - \sum_{\omega \in \{0,1\}^n} p_\omega \log_2 p_\omega. \]

Intuition:

low spectral entropy: the function is concentrated on a few linear components;
high spectral entropy: spectral energy is spread across many frequencies.

Background: the Walsh-Hadamard/Fourier view of Boolean functions is standard in cryptography and Boolean-function analysis [Carlet2010, ODonnell2014].

Counting¶

Counting is a structural descriptor over the truth table. Let \(Z\) be the number of zeros in the truth table and \(U\) the number of ones. Let \(Z_u\) and \(U_u\) be the number of 1 bits in the binary representations of \(Z\) and \(U\). Then:

\[ counting(f) = Z_u + U_u. \]

Intuition: it is a compact handcrafted signal for how simply the output cardinalities can be expressed as sums of powers of two.

Counting is a project-level descriptor. It is included because it captures a small amount of truth-table regularity that plain output entropy discards.

Rust:

use fbool::StructuralMetrics;

let value = f.counting();

Repetitiveness¶

For 5-variable functions, the truth table has 32 bits. The repetitiveness descriptor splits it into eight consecutive blocks of four bits:

\[ s_f = S_1 S_2 \dots S_8. \]

Repetitiveness is the total length covered by blocks that appear at least twice:

\[ rep(f) = \sum_{j : \exists k \ne j, S_j = S_k} |S_j|. \]

Intuition: it measures repeated local patterns in the truth-table string.

The repetitiveness descriptor follows the earlier experimental work on hard Boolean functions [Villarubia2021].

Rust:

use fbool::StructuralMetrics;

let value = f.repetitiveness();

Non-linearity¶

Non-linearity is the Hamming distance from \(f\) to the closest affine function:

\[ NL(f) = \min_{g \in Affine} d_H(f,g). \]

An affine Boolean function has the form:

\[ g(x_1,\dots,x_n) = (a_1x_1 \oplus \dots \oplus a_nx_n) \oplus b. \]

Intuition: high non-linearity means the function is far from XOR-like approximations.

Background: non-linearity and affine approximation are standard tools in the analysis of Boolean functions, especially in cryptographic settings [Carlet2010].

Degree¶

Every Boolean function has a unique multilinear polynomial over the reals:

\[ P(x) = \sum_{S \subseteq \{1,\dots,n\}} a_S \prod_{i \in S} x_i. \]

The degree is:

\[ deg(f) = \max\{|S| : a_S \ne 0\}. \]

Intuition: degree measures the highest-order interaction needed to express the function.

Background: degree is a standard algebraic measure in Boolean-function analysis [ODonnell2014].

Influence and total influence¶

Let \(x^{\oplus i}\) be the input obtained by flipping bit \(i\). Define:

\[ E_i(f) = \{x : f(x) \ne f(x^{\oplus i})\}. \]

The influence of variable \(i\) is:

\[ Inf_i(f) = \frac{|E_i(f)|}{2^n}. \]

Total influence is:

\[ I(f) = \sum_{i=1}^n Inf_i(f). \]

Intuition: influence is the average chance that flipping a variable flips the output. Total influence is the normalized number of changing hypercube edges.

Background: influence and total influence are central measures in the analysis of Boolean functions [ODonnell2014].

Sensitivity¶

Local sensitivity counts how many single-bit flips change the output at one input:

\[ s(f,x) = |\{i : f(x) \ne f(x^{\oplus i})\}|. \]

Sensitivity is the worst case:

\[ s(f) = \max_x s(f,x). \]

fbool exposes both a maximum and a mean sensitivity:

let max_s = f.max_sensitivity();
let mean_s = f.mean_sensitivity();

Background: sensitivity is a standard local instability measure for Boolean functions [ODonnell2014].

Certificate complexity¶

A certificate for input \(x\) is a subset of input positions \(S\) such that every input \(y\) agreeing with \(x\) on \(S\) has the same output:

\[ (\forall i \in S: y_i = x_i) \Rightarrow f(y) = f(x). \]

The point certificate complexity \(C(f,x)\) is the smallest such subset size. The certificate complexity of the function is the worst case:

\[ C(f) = \max_x C(f,x). \]

Intuition: it measures how many input bits you may need to reveal before the output is forced.

Background: certificate complexity is a standard decision/circuit-complexity measure [AroraBarak2009, ODonnell2014].

Minimal gates¶

The default optimal5 feature maps a 5-variable function to its NPN representative and retrieves exact minimal gate data from the integrated optimal5 engine.

This is not an asymptotic complexity measure. It is exact reference data for the finite 5-input domain used in the experiments.

The data comes from Adam P. Goucher's optimal5 database [Goucher2020], which is based on NPN equivalence classes originally computed by Knuth's BOOLCHAINS work [Knuth2011].

References¶

[AroraBarak2009] S. Arora and B. Barak. Computational Complexity: A Modern Approach. Cambridge University Press, 2009.

[Carlet2010] C. Carlet, Y. Crama, and P. L. Hammer. "Boolean functions for cryptography and error-correcting codes." 2010.

[Carro2023] E. Carro Garrido and J. R. Cobian Fernandez. "Analysis of Boolean functions using equanimity and entanglement." Bachelor's thesis, Universidad Complutense de Madrid, Facultad de Informatica, 2023. https://hdl.handle.net/20.500.14352/88009

[Goucher2020] A. P. Goucher. "optimal5: Optimal circuits for all 5-input 1-output Boolean functions." 2020. https://gitlab.com/apgoucher/optimal5

[Hwang1990] T.-T. Hwang, R. M. Owens, and M. J. Irwin. "Exploiting communication complexity for multilevel logic synthesis." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 9(10), 1017-1027, 1990.

[Knuth2011] D. E. Knuth. The Art of Computer Programming, Volume 4A. Addison-Wesley, 2011.

[ODonnell2014] R. O'Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014.

[Villarubia2021] J. Villarubia Elvira. "Identificacion experimental de las funciones booleanas que requieren circuitos extensos y aplicacion al estudio de P vs NP." Bachelor's thesis, Universidad Complutense de Madrid, Facultad de Informatica, 2021.