admisc: SOPexpression – R documentation

SOPexpression

Functions to interpret and manipulate a SOP/DNF expression

Description

These functions interpret an expression written in sum of products (SOP) or in canonical disjunctive normal form (DNF), for both crisp and multivalue notations. The function compute() calculates set membership scores based on a SOP expression applied to a calibrated data set (see function calibrate() from package QCA), while the function translate() translates a SOP expression into a matrix form.
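As a rough illustration of what a membership score computation of this kind could look like, the base R sketch below evaluates DEV*~IND + URB*STB on a small made-up data frame. It assumes the usual fuzzy-set conventions (1 - x for negation, minimum for a product, maximum for a sum); it is not the code behind compute(), and the data values are hypothetical.

```r
# Hypothetical calibrated fuzzy scores for three cases (made-up values)
dat <- data.frame(
  DEV = c(0.9, 0.2, 0.6),
  IND = c(0.3, 0.1, 0.8),
  URB = c(0.7, 0.4, 0.5),
  STB = c(0.8, 0.9, 0.2)
)

# Assumed conventions: negation = 1 - x, product = pmin, sum = pmax
path1 <- pmin(dat$DEV, 1 - dat$IND)   # DEV*~IND
path2 <- pmin(dat$URB, dat$STB)       # URB*STB
scores <- pmax(path1, path2)          # DEV*~IND + URB*STB

# One membership score per case
print(scores)

# Keeping the paths side by side, in the spirit of separate = TRUE
print(data.frame(path1, path2))
```

Under these assumptions each case gets one score for the whole expression, while the two-column data frame keeps the individual paths apart.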

A function similar to compute() was initially written by Lewandowski (2015) but the actual code in these functions has been completely re-written and expanded with more extensive functionality (see details and examples below).

The function simplify() transforms a SOP expression into a simpler equivalent, through a process of Boolean minimization. It uses the function minimize() from package QCA, so users are strongly encouraged to install and load that package, even though it is not listed in the Imports field (due to circular dependency issues).

Function expand() performs a Quine expansion to the complete DNF, or a partial expansion to a SOP expression with equally complex terms.

Usage

compute(expression = "", data = NULL, separate = FALSE)

simplify(expression = "", snames = "", noflevels = NULL, ...)

translate(expression = "", snames = "", noflevels = NULL, data = NULL, ...)

expand(expression = "", snames = "", noflevels = NULL, partial = FALSE,
      implicants = FALSE, ...)

Arguments

`expression`	String: a sum of products (SOP) expression.
`data`	A dataset with binary crisp-set (cs), multivalue (mv) and fuzzy-set (fs) data.
`separate`	Logical, perform computations on individual, separate paths.
`snames`	A string containing the sets' names, separated by commas.
`noflevels`	Numerical vector containing the number of levels for each set.
`partial`	Logical, perform a partial Quine expansion.
`implicants`	Logical, return an expanded matrix in the implicants space.
`...`	Other arguments, mainly for backwards compatibility.

Details

An expression written in sum of products (SOP) form is a "union of intersections", for example A*B + B*~C. The disjunctive normal form (DNF) is also a sum of products, with the restriction that each product has to contain all literals. The equivalent DNF expression is: A*B*~C + A*B*C + ~A*B*~C.
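The equivalence between the SOP and DNF forms can be checked by brute force over the truth table. The following base R sketch does only that; it does not use or reproduce any function from the package.

```r
# All 8 combinations of three crisp conditions
tt <- expand.grid(A = c(FALSE, TRUE), B = c(FALSE, TRUE), C = c(FALSE, TRUE))

# SOP form: A*B + B*~C
sop <- with(tt, (A & B) | (B & !C))

# DNF form: A*B*~C + A*B*C + ~A*B*~C
dnf <- with(tt, (A & B & !C) | (A & B & C) | (!A & B & !C))

# Both forms are true on exactly the same rows
same <- identical(sop, dnf)
print(same)

# The rows where the expression holds are the DNF terms
print(tt[sop, ])
```

The rows printed at the end correspond one to one with the three products of the DNF expression.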

The same expression can be written in multivalue notation: A[1]*B[1] + B[1]*C[0].

Expressions can contain multiple values for the same condition, separated by a comma. If B was a multivalue causal condition, an expression could be: A[1] + B[1,2]*C[0].

Whether crisp or multivalue, expressions are treated as Boolean. In this last example, all values in B equal to either 1 or 2 will be converted to 1, and the rest of the (multi)values will be converted to 0.
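The recoding described above can be sketched in base R as follows. The data frame is hypothetical, and the min/max rule for products and sums is an assumption made for the illustration, not a description of the package internals.

```r
# Hypothetical multivalue data: B has three levels (0, 1, 2)
dat <- data.frame(A = c(1, 0, 0, 1), B = c(0, 1, 2, 2), C = c(0, 0, 1, 0))

# B[1,2] becomes 1 wherever B is 1 or 2, and 0 otherwise
B12 <- as.numeric(dat$B %in% c(1, 2))
A1  <- as.numeric(dat$A == 1)   # A[1]
C0  <- as.numeric(dat$C == 0)   # C[0]

# A[1] + B[1,2]*C[0], with min for the product and max for the sum
result <- pmax(A1, pmin(B12, C0))
print(result)
```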

Negating a multivalue condition requires a known number of levels (see examples below). Improvements from version 2.5 allow for intersections between multiple levels of the same condition. For a causal condition with 3 levels (0, 1 and 2), the expression ~A[0,2]*A[1,2] is equivalent to A[1], while A[0]*A[1] results in the empty set.
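One way to reason about these two results is with plain set operations on the levels of the condition. The sketch below uses only base R and the variable names are illustrative.

```r
# A condition with 3 levels
levels_A <- 0:2

# ~A[0,2]: every level except 0 and 2
neg_02 <- setdiff(levels_A, c(0, 2))

# ~A[0,2]*A[1,2]: levels shared by both parts, which leaves level 1
both <- intersect(neg_02, c(1, 2))
print(both)

# A[0]*A[1]: no level is both 0 and 1, so the result is empty
empty <- intersect(0, 1)
print(length(empty))
```

This also shows why negation needs the number of levels: without levels_A there is nothing to take the complement against.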

The number of levels, as well as the set names, can be automatically detected from a dataset via the argument data. When specified, the arguments snames and noflevels take precedence over data.

The product operator * should always be used, but it can be omitted when the data is multivalue (where the conditions are delimited by square brackets), and/or when the set names are single letters (for example AD + B~C), and/or when the set names are provided via the argument snames.

When expressions are simplified, their simplest equivalent can result in the empty set, if the conditions cancel each other out.
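To see how conditions can cancel each other out, a quick base R check (not the package's minimization routine) evaluates the product A*B*~A over every crisp combination of A and B:

```r
# All 4 combinations of two crisp conditions
tt <- expand.grid(A = c(FALSE, TRUE), B = c(FALSE, TRUE))

# A*B*~A requires A to be both present and absent
contradiction <- with(tt, A & B & !A)

# No row satisfies the product, which is the empty set
print(any(contradiction))
```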

Value

For the function compute(), a vector of set membership values.

For function simplify(), a character expression.

For the function translate(), a matrix containing the implicants on the rows and the set names on the columns, with the following codes:

0	absence of a causal condition
1	presence of a causal condition
-1	causal condition was eliminated

The matrix is also assigned the class "translate", to avoid printing the -1 codes when signaling a minimized condition. The mode of this matrix is character, to allow printing multiple levels in the same cell, such as "1,2".
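The coding scheme can be pictured with a hand-built matrix for the expression A + ~B*C. This is only an illustration of the codes listed above: the row names and the construction are assumptions, and the object is not the actual output of translate() (it does not carry the "translate" class).

```r
# Hand-built illustration of the codes for "A + ~B*C"
m <- matrix(
  c( 1, -1, -1,    # term A:    A present, B and C not part of the term
    -1,  0,  1),   # term ~B*C: B absent, C present, A not part of the term
  nrow = 2, byrow = TRUE,
  dimnames = list(c("A", "~B*C"), c("A", "B", "C"))
)

# Character storage, so that a cell could also hold something like "1,2"
storage.mode(m) <- "character"
print(m)
```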

For function expand(), a character expression or a matrix of implicants.

For function generate(), a data frame.

Author(s)

Adrian Dusa

References

Ragin, C.C. (1987) The Comparative Method: Moving beyond Qualitative and Quantitative Strategies. Berkeley: University of California Press.

Lewandowski, J. (2015) QCAtools: Helper functions for QCA in R. R package version 0.1.

Examples

# -----
# for compute()
## Not run: 
# make sure the package QCA is loaded
library(QCA)
compute(DEV*~IND + URB*STB, data = LF)

# calculating individual paths
compute(DEV*~IND + URB*STB, data = LF, separate = TRUE)

## End(Not run)


# -----
# for simplify(), also make sure the package QCA is loaded
simplify("(A + B)(A + ~B)") # result is "A"

# works even without the quotes
simplify((A + B)(A + ~B)) # result is "A"

# but to avoid confusion, POS expressions are clearer when quoted;
# use snames to force a certain order of the set names
simplify("(URB + LIT*~DEV)(~LIT + ~DEV)", snames = c(DEV, URB, LIT))

# multilevel conditions can also be specified (and negated)
simplify("(A[1] + ~B[0])(B[1] + C[0])", snames = c(A, B, C), noflevels = c(2, 3, 2))


# Ragin's (1987) book presents the equation E = SG + LW as the result
# of the Boolean minimization for the ethnic political mobilization.

# intersecting the reactive ethnicity perspective (R = ~L~W)
# with the equation E (page 144)

simplify("~L~W(SG + LW)", snames = c(S, L, W, G))

# [1] "S~L~WG"


# resources for size and wealth (C = SW) with E (page 145)
simplify("SW(SG + LW)", snames = c(S, L, W, G))

# [1] "SWG + SLW"


# and factorized
factorize(simplify("SW(SG + LW)", snames = c(S, L, W, G)))

# F1: SW(G + L)


# developmental perspective (D = L~G) and E (page 146)
simplify("L~G(SG + LW)", snames = c(S, L, W, G))

# [1] "LW~G"

# subnations that exhibit ethnic political mobilization (E) but were
# not hypothesized by any of the three theories (page 147)
# ~H = ~(~L~W + SW + L~G) = GL~S + GL~W + G~SW + ~L~SW

simplify("(GL~S + GL~W + G~SW + ~L~SW)(SG + LW)", snames = c(S, L, W, G))


# -----
# for translate()
translate(A + B*C)

# same thing in multivalue notation
translate(A[1] + B[1]*C[1])

# tilde as a standard negation (note the condition "b"!)
translate(~A + b*C)

# and even for multivalue variables
# in multivalue notation, the product sign * is redundant
translate(C[1] + T[2] + T[1]*V[0] + C[0])

# negation of multivalue sets requires the number of levels
translate(~A[1] + ~B[0]*C[1], snames = c(A, B, C), noflevels = c(2, 2, 2))

# multiple values can be specified
translate(C[1] + T[1,2] + T[1]*V[0] + C[0])

# or even negated
translate(C[1] + ~T[1,2] + T[1]*V[0] + C[0], snames = c(C, T, V), noflevels = c(2,3,2))

# if the expression does not contain the product sign *
# snames are required to complete the translation 
translate(AaBb + ~CcDd, snames = c(Aa, Bb, Cc, Dd))

# to print _all_ codes from the standard output matrix
(obj <- translate(A + ~B*C))
print(obj, original = TRUE) # also prints the -1 code


# -----
# for expand()
expand(~AB + B~C)

# S1: ~AB~C + ~ABC + AB~C 

expand(~AB + B~C, snames = c(A, B, C, D))

# S1: ~AB~C~D + ~AB~CD + ~ABC~D + ~ABCD + AB~C~D + AB~CD 

# In implicants form:
expand(~AB + B~C, snames = c(A, B, C, D), implicants = TRUE)

#      A B C D
# [1,] 1 2 1 1    ~AB~C~D
# [2,] 1 2 1 2    ~AB~CD
# [3,] 1 2 2 1    ~ABC~D
# [4,] 1 2 2 2    ~ABCD
# [5,] 2 2 1 1    AB~C~D
# [6,] 2 2 1 2    AB~CD

admisc

Adrian Dusa's Miscellaneous

v0.12

GPL (>= 3)

Authors

Adrian Dusa [aut, cre, cph] (<https://orcid.org/0000-0002-3525-9253>)

Initial release

2021-03-16