Functions to interpret and manipulate a SOP/DNF expression
These functions interpret an expression written in sum of products (SOP) or in canonical
disjunctive normal form (DNF), for both crisp and multivalue notations. The function compute()
calculates set membership scores based on a SOP expression applied to a calibrated data set
(see function calibrate() from package QCA), while the function translate() translates a
SOP expression into a matrix form.
A function similar to compute() was initially written by Lewandowski (2015),
but the actual code in these functions has been completely re-written and expanded with
more extensive functionality (see details and examples below).
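As a rough illustration of what calculating membership scores for a SOP expression involves, the sketch below uses base R only and the standard fuzzy-set operators (minimum for intersection, maximum for union, 1 - x for negation). The data frame dat and its values are invented for illustration, and the code is not taken from compute(); it only shows the arithmetic that such a calculation is assumed to correspond to.

```r
# Hypothetical calibrated data: scores in [0, 1] for three sets
dat <- data.frame(
    A = c(0.9, 0.2, 0.6),
    B = c(0.7, 0.8, 0.1),
    C = c(0.3, 0.4, 0.9)
)

# Membership in A*B + B*~C using the usual fuzzy operators:
# intersection = minimum, union = maximum, negation = 1 - x
path1 <- pmin(dat$A, dat$B)        # A*B
path2 <- pmin(dat$B, 1 - dat$C)    # B*~C
scores <- pmax(path1, path2)       # A*B + B*~C

print(scores)
```

Keeping path1 and path2 as separate vectors mirrors the idea of inspecting individual paths before combining them.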
The function simplify() transforms a SOP expression into a simpler equivalent,
through a process of Boolean minimization. The package uses the function minimize()
from package QCA, so users are highly encouraged to install and load that package,
even though it is not listed in the Imports field (due to circular dependency issues).
Function expand() performs a Quine expansion to the complete DNF, or a
partial expansion to a SOP expression with equally complex terms.
compute(expression = "", data = NULL, separate = FALSE)
simplify(expression = "", snames = "", noflevels = NULL, ...)
translate(expression = "", snames = "", noflevels = NULL, data = NULL, ...)
expand(expression = "", snames = "", noflevels = NULL, partial = FALSE,
       implicants = FALSE, ...)
expression | String: a SOP - sum of products expression. |
data | A dataset with binary cs, mv and fs data. |
separate | Logical, perform computations on individual, separate paths. |
snames | A string containing the sets' names, separated by commas. |
noflevels | Numerical vector containing the number of levels for each set. |
partial | Logical, perform a partial Quine expansion. |
implicants | Logical, return an expanded matrix in the implicants space. |
... | Other arguments, mainly for backwards compatibility. |
An expression written in sum of products (SOP) is a "union of intersections", for example
A*B + B*~C. The disjunctive normal form (DNF) is also a sum of products, with the
restriction that each product has to contain all literals. The equivalent DNF expression is:
A*B*~C + A*B*C + ~A*B*~C
The same expression can be written in multivalue notation: A[1]*B[1] + B[1]*C[0].
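The equivalence between the SOP and DNF forms above can be checked by hand over the full truth table. This is a minimal base R sketch that does not use any function from this package; the object names tt, sop and dnf are chosen here for illustration.

```r
# All combinations of three crisp conditions
tt <- expand.grid(A = c(FALSE, TRUE), B = c(FALSE, TRUE), C = c(FALSE, TRUE))

# SOP form: A*B + B*~C
sop <- with(tt, (A & B) | (B & !C))

# DNF form: A*B*~C + A*B*C + ~A*B*~C
dnf <- with(tt, (A & B & !C) | (A & B & C) | (!A & B & !C))

# TRUE if both forms agree on every row of the truth table
identical(sop, dnf)

# The rows where the expression holds correspond to the three DNF terms
tt[sop, ]
```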
Expressions can contain multiple values for the same condition, separated by a comma. If B was
a multivalue causal condition, an expression could be: A[1] + B[1,2]*C[0].
Whether crisp or multivalue, expressions are treated as Boolean. In this last example, all values in B equal to either 1 or 2 will be converted to 1, and the rest of the (multi)values will be converted to 0.
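The conversion described here can be sketched in base R as follows. The data frame dat is hypothetical, and the code only mimics the described recoding; it is not the package's own implementation.

```r
# Hypothetical raw data: A and C are binary, B has three levels (0, 1, 2)
dat <- data.frame(
    A = c(1, 0, 0, 0),
    B = c(0, 1, 2, 2),
    C = c(1, 0, 0, 1)
)

# B[1,2] becomes TRUE wherever B is 1 or 2, FALSE otherwise
B12 <- dat$B %in% c(1, 2)

# A[1] + B[1,2]*C[0], evaluated as a Boolean expression
result <- (dat$A == 1) | (B12 & dat$C == 0)

as.integer(result)
```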
Negating a multivalue condition requires a known number of levels (see examples below). Improvements
from version 2.5 allow for intersections between multiple levels of the same condition. For a causal
condition with 3 levels (0, 1 and 2), the expression ~A[0,2]*A[1,2] is equivalent to A[1],
while A[0]*A[1] results in the empty set.
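One way to reason about these two results is to treat each term as a set of levels: negation is the complement within the known levels, and a product of terms on the same condition is their intersection. The base R sketch below follows that reading; the object names are illustrative only.

```r
# Levels of a hypothetical condition A with three values
levels_A <- 0:2

# ~A[0,2]: every level of A except 0 and 2
not_A02 <- setdiff(levels_A, c(0, 2))

# ~A[0,2]*A[1,2]: levels shared by both terms, which leaves level 1 only
both <- intersect(not_A02, c(1, 2))
print(both)

# A[0]*A[1]: no level in common, hence the empty set
none <- intersect(0, 1)
print(length(none))
```

The complement can only be taken because levels_A is known, which is why negation needs the number of levels.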
The number of levels, as well as the set names, can be automatically detected from a dataset via
the argument data. When specified, arguments snames and noflevels take precedence over data.
The product operator * should always be used, but it can be omitted when the expression
is written in multivalue notation (where the square brackets already delimit the conditions
in a product term), and/or when the set names are single letters (for example AD + B~C),
and/or when the set names are provided via the argument snames.
When expressions are simplified, their simplest equivalent can result in the empty set, if the conditions cancel each other out.
For the function compute(), a vector of set membership values.
For function simplify(), a character expression.
For the function translate(), a matrix containing the implicants on the
rows and the set names on the columns, with the following codes:
0 | absence of a causal condition |
1 | presence of a causal condition |
-1 | causal condition was eliminated |
The matrix is also assigned the class "translate", to avoid printing the -1 codes when signaling a minimized condition. The mode of this matrix is character, to allow printing multiple levels in the same cell, such as "1,2".
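To make the coding scheme concrete, the following base R sketch builds such a matrix by hand for the expression A + ~B*C. It is not output produced by translate(), and the blanking of the -1 codes is only an assumption about what the default printed form looks like.

```r
# Hand-built illustration of the coding scheme for A + ~B*C
# (one row per product term, one column per set)
m <- matrix(
    c( 1, -1, -1,    # A    : A present, B and C not part of the term
      -1,  0,  1),   # ~B*C : A not part of the term, B absent, C present
    nrow = 2, byrow = TRUE,
    dimnames = list(c("A", "~B*C"), c("A", "B", "C"))
)

# Blank out the -1 codes, similar in spirit to hiding them when printing
shown <- ifelse(m == -1, "", as.character(m))
print(shown, quote = FALSE)
```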
For function expand(), a character expression or a matrix of implicants.
For function generate(), a data frame.
Adrian Dusa
Ragin, C.C. (1987) The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.
Lewandowski, J. (2015) QCAtools: Helper functions for QCA in R. R package version 0.1
# -----
# for compute()

## Not run:
# make sure the package QCA is loaded
library(QCA)
compute(DEV*~IND + URB*STB, data = LF)

# calculating individual paths
compute(DEV*~IND + URB*STB, data = LF, separate = TRUE)

## End(Not run)

# -----
# for simplify(), also make sure the package QCA is loaded
simplify("(A + B)(A + ~B)") # result is "A"

# works even without the quotes
simplify((A + B)(A + ~B)) # result is "A"

# but to avoid confusion, POS expressions are clearer when quoted

# to force a certain order of the set names
simplify("(URB + LIT*~DEV)(~LIT + ~DEV)", snames = c(DEV, URB, LIT))

# multilevel conditions can also be specified (and negated)
simplify("(A[1] + ~B[0])(B[1] + C[0])", snames = c(A, B, C), noflevels = c(2, 3, 2))

# Ragin's (1987) book presents the equation E = SG + LW as the result
# of the Boolean minimization for the ethnic political mobilization.

# intersecting the reactive ethnicity perspective (R = ~L~W)
# with the equation E (page 144)
simplify("~L~W(SG + LW)", snames = c(S, L, W, G))
# [1] "S~L~WG"

# resources for size and wealth (C = SW) with E (page 145)
simplify("SW(SG + LW)", snames = c(S, L, W, G))
# [1] "SWG + SLW"

# and factorized
factorize(simplify("SW(SG + LW)", snames = c(S, L, W, G)))
# F1: SW(G + L)

# developmental perspective (D = L~G) and E (page 146)
simplify("L~G(SG + LW)", snames = c(S, L, W, G))
# [1] "LW~G"

# subnations that exhibit ethnic political mobilization (E) but were
# not hypothesized by any of the three theories (page 147)
# ~H = ~(~L~W + SW + L~G) = GL~S + GL~W + G~SW + ~L~SW
simplify("(GL~S + GL~W + G~SW + ~L~SW)(SG + LW)", snames = c(S, L, W, G))

# -----
# for translate()
translate(A + B*C)

# same thing in multivalue notation
translate(A[1] + B[1]*C[1])

# tilde as a standard negation (note the condition "b"!)
translate(~A + b*C)

# and even for multivalue variables
# in multivalue notation, the product sign * is redundant
translate(C[1] + T[2] + T[1]*V[0] + C[0])

# negation of multivalue sets requires the number of levels
translate(~A[1] + ~B[0]*C[1], snames = c(A, B, C), noflevels = c(2, 2, 2))

# multiple values can be specified
translate(C[1] + T[1,2] + T[1]*V[0] + C[0])

# or even negated
translate(C[1] + ~T[1,2] + T[1]*V[0] + C[0], snames = c(C, T, V), noflevels = c(2,3,2))

# if the expression does not contain the product sign *
# snames are required to complete the translation
translate(AaBb + ~CcDd, snames = c(Aa, Bb, Cc, Dd))

# to print _all_ codes from the standard output matrix
(obj <- translate(A + ~B*C))
print(obj, original = TRUE) # also prints the -1 code

# -----
# for expand()
expand(~AB + B~C)
# S1: ~AB~C + ~ABC + AB~C

expand(~AB + B~C, snames = c(A, B, C, D))
# S1: ~AB~C~D + ~AB~CD + ~ABC~D + ~ABCD + AB~C~D + AB~CD

# In implicants form:
expand(~AB + B~C, snames = c(A, B, C, D), implicants = TRUE)
#      A B C D
# [1,] 1 2 1 1    ~AB~C~D
# [2,] 1 2 1 2    ~AB~CD
# [3,] 1 2 2 1    ~ABC~D
# [4,] 1 2 2 2    ~ABCD
# [5,] 2 2 1 1    AB~C~D
# [6,] 2 2 1 2    AB~CD