
bibConvert

Convert between bibliography formats


Description

Read a bibliography file in one of the supported formats, convert it to another format, and write it to a file.

Usage

bibConvert(infile, outfile, informat, outformat, ..., tex, encoding, 
           options)

Arguments

infile

input file, a character string.

outfile

output file, a character string.

informat

input format, a character string, see Details.

outformat

output format, a character string, see Details.

...

not used.

tex

TeX specific options, see Details, a character vector.

encoding

character(2), a vector of length two specifying the input and output encodings. The default for both is "utf8", see Details.

options

mainly for debugging: additional options for the converters, see Details.

Details

Arguments informat and outformat can usually be omitted, since bibConvert infers them from the extensions of the names of the input and output files, see section "File extensions" below. However, the extension "bib" is ambiguous, since it is used for both BibTeX and BibLaTeX entries. For this extension, the default for both informat and outformat is "bibtex".
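As a minimal sketch of the "bib" ambiguity described above, the code below copies the BibLaTeX example file shipped with rbibutils to a temporary file with extension "bib" and then states the input format explicitly. The names ambiguous and xml_out are illustrative only.

```r
library(rbibutils)

## BibLaTeX example file shipped with rbibutils (also used in Examples).
fn_biblatex <- system.file("bib", "ex0.biblatex", package = "rbibutils")

## Copy it to a file with the ambiguous extension "bib".
ambiguous <- tempfile(fileext = ".bib")
file.copy(fn_biblatex, ambiguous)

## The output format is inferred from ".xml". The input format is given
## explicitly, because ".bib" alone would be taken to mean "bibtex".
xml_out <- tempfile(fileext = ".xml")
res <- bibConvert(ambiguous, xml_out, informat = "biblatex")
```

Omitting informat here would make bibConvert treat the file as BibTeX, which may or may not give the intended result for BibLaTeX-specific fields.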

Package rbibutils supports format "bibentry", in addition to the formats supported by the bibutils library. A bibentry object contains one or more references. Two file formats are supported for "bibentry", for both input and output: a bibentry object previously saved to a file using saveRDS (default extension "rds"), or an R source file containing one or more bibentry commands. The "rds" file is just read in and should contain a bibentry object.

When bibConvert outputs to an R source file, two variants are supported: "R" and "Rstyle". With outformat = "R", there is one bibentry call for each reference, just as each reference is a single entry in a BibTeX file. outformat = "Rstyle" uses the format of print(be, style = "R"), i.e., the bibentry calls are output as a comma-separated sequence wrapped in c(). For input, it is not necessary to specify which variant is used.
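The two R source variants can be compared side by side. This is a sketch that assumes the example file ex0.biblatex shipped with rbibutils. The names r_plain and r_style are illustrative only.

```r
library(rbibutils)

fn_biblatex <- system.file("bib", "ex0.biblatex", package = "rbibutils")

r_plain <- tempfile(fileext = ".R")
r_style <- tempfile(fileext = ".R")

## Variant "R": one bibentry() call per reference.
bibConvert(fn_biblatex, r_plain, informat = "biblatex", outformat = "R")

## Variant "Rstyle": the layout of print(be, style = "R").
bibConvert(fn_biblatex, r_style, informat = "biblatex", outformat = "Rstyle")

## Look at the first few lines of each file to see the difference.
head(readLines(r_plain))
head(readLines(r_style))
```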

Note that when the input and output formats are identical, the conversion is not necessarily a null operation (except for xml, and even that may change). For example, depending on the arguments, the character encoding may change. Also, input BibTeX files may contain additional instructions, such as journal abbreviations, which are expanded and incorporated in the references but not exported. It should also be remembered that there may be loss of information when converting from one format to another.

For a complete list of supported bibliography formats, see section "Supported input and output formats" in rbibutils and the documentation of the original bibutils library.

Argument encoding is a character vector containing 2 elements, specifying the encoding of the input and output files. If the encodings are the same, a length one vector can be supplied. The default encodings are UTF-8 for input and output. A large number of familiar encodings are supported, e.g. "latin1" and "cp1251" (Windows Cyrillic). Some encodings have two or more aliases and they are also accepted. If an unknown encoding is requested, a list of all supported encodings will be printed.
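The length-one shortcut can be illustrated as follows. This is a sketch that assumes the UTF-8 example file cyr_utf8.bib shipped with rbibutils. The names out_two and out_one are illustrative only.

```r
library(rbibutils)

fn_cyr_utf8 <- system.file("bib", "cyr_utf8.bib", package = "rbibutils")

out_two <- tempfile(fileext = ".bib")
out_one <- tempfile(fileext = ".bib")

## Input and output encodings given separately.
bibConvert(fn_cyr_utf8, out_two, encoding = c("utf8", "utf8"))

## The same request with a length-one vector, used for both files.
bibConvert(fn_cyr_utf8, out_one, encoding = "utf8")

## The two output files can then be compared.
identical(readLines(out_two), readLines(out_one))
```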

Argument tex is an unnamed character vector containing switches for bibtex input and output (mostly output). Currently, the following are available:

uppercase

write bibtex tags/types in upper case.

no_latex

do not convert latex-style character combinations to letters.

brackets

use brackets, not quotation marks surrounding data.

dash

use one dash "-", not two "--", in page ranges.

fc

add final comma to bibtex output.

By default, LaTeX encodings for accented characters are converted to letters. This may be a problem if the output encoding is not UTF-8, since some characters created by this process may be invalid in that encoding. For example, a BibTeX file which otherwise contains only Cyrillic and Latin characters may have a few entries with authors containing accented Latin characters represented using the TeX convention. If those characters are not converted to Unicode letters, they can be exported to "cp1251" (Windows Cyrillic), for example. Specifying the option no_latex should solve the problem in such cases.
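Several switches can be combined in one character vector. The sketch below asks for upper-case tags and single dashes in page ranges. It assumes the example file ex0.biblatex shipped with rbibutils, and the name bib_out is illustrative only.

```r
library(rbibutils)

fn_biblatex <- system.file("bib", "ex0.biblatex", package = "rbibutils")
bib_out <- tempfile(fileext = ".bib")

## Request upper-case tags/types and a single dash in page ranges.
bibConvert(fn_biblatex, bib_out, informat = "biblatex", outformat = "bibtex",
           tex = c("uppercase", "dash"))

## Inspect the generated BibTeX file.
cat(readLines(bib_out), sep = "\n")
```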

Argument options is mostly for debugging and mimics the command line options of the bibutils binaries. The argument is a named character vector and is supplied as c(tag1 = val1, tag2 = val2, ...), where each tag is the name of an option and each value is the corresponding value. The value for options that do not require one is ignored and can be set to "". Some of the available options are:

h

help, show all available options.

nb

do not write Byte Order Mark in UTF8 output.

verbose

print intermediate output.

debug

print even more intermediate output.
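A sketch of passing a value-less option, following the c(tag = "") convention described above. It uses the nb option from the list and assumes the example file ex0.biblatex shipped with rbibutils. The name xml_nobom is illustrative only.

```r
library(rbibutils)

fn_biblatex <- system.file("bib", "ex0.biblatex", package = "rbibutils")
xml_nobom <- tempfile(fileext = ".xml")

## "nb" takes no value, so the value is set to "".
bibConvert(fn_biblatex, xml_nobom, informat = "biblatex",
           options = c(nb = ""))

## The first bytes of the file show whether a Byte Order Mark is present.
readBin(xml_nobom, what = "raw", n = 3)
```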

Value

The function is used for the side effect of creating a file in the requested format. It returns a list, currently containing the following components:

infile

name of the input file,

outfile

name of the output file,

nref_in

number of references read from the input file,

nref_out

number of references written to the output file.
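The returned list can be used for a quick sanity check after a conversion. This is a sketch that assumes the example file ex0.biblatex shipped with rbibutils. The names xml_out and res are illustrative only.

```r
library(rbibutils)

fn_biblatex <- system.file("bib", "ex0.biblatex", package = "rbibutils")
xml_out <- tempfile(fileext = ".xml")

res <- bibConvert(fn_biblatex, xml_out, informat = "biblatex")

## Components documented above.
res$infile
res$outfile
res$nref_in
res$nref_out

## Report if the number of references read and written differ.
if (!isTRUE(all.equal(res$nref_in, res$nref_out)))
    message("number of references read and written differ")
```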

File extensions

If an input or output format is not specified by arguments, it is inferred, if possible, from the file extension. The table below contains the file extension in the first column and the corresponding default format in the second.

Extension   Default format
ads         ADS reference format
bib         BibTeX
bibtex      BibTeX
biblatex    BibLaTeX
copac       COPAC format references
end         EndNote (Refer format)
endx        EndNote XML
isi         ISI web of science
med         Pubmed XML references
nbib        Pubmed/National Library of Medicine nbib format
ris         RIS format
R           R source file containing bibentry commands
r           R source file containing bibentry commands
Rstyle      R source file containing bibentry commands
rds         bibentry object in a binary file created by saveRDS()
xml         MODS XML intermediate
wordbib     Word 2007 bibliography format
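The lookup described in this section can be pictured with a small stand-alone sketch. The vector ext_table and the helper guess_format are hypothetical and cover only a few rows of the table. They are not the code used inside rbibutils.

```r
## Hypothetical lookup built from a few rows of the table above.
ext_table <- c(
    bib      = "BibTeX",
    biblatex = "BibLaTeX",
    ris      = "RIS format",
    rds      = "bibentry object in a binary file created by saveRDS()",
    xml      = "MODS XML intermediate"
)

## Hypothetical helper: returns the table entry for a file name, or NA
## when the extension is not in the lookup.
guess_format <- function(file) {
    unname(ext_table[tools::file_ext(file)])
}

guess_format("refs.bib")
guess_format("refs.ris")
guess_format("refs.txt")
```

An NA result corresponds to the case where the format cannot be inferred and has to be given through informat or outformat.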

Author(s)

Georgi N. Boshnakov

References

Chris Putnam, Library bibutils, https://sourceforge.net/projects/bibutils/.

Examples

fn_biblatex <- system.file("bib", "ex0.biblatex",  package = "rbibutils")
fn_biblatex
## file.show(fn_biblatex)

## convert a biblatex file to xml
modl <- tempfile(fileext = ".xml")
bibConvert(infile = fn_biblatex, outfile = modl, informat = "biblatex", outformat = "xml")
## file.show(modl)

## convert a biblatex file to bibtex
bib <- tempfile(fileext = ".bib")
bib2 <- tempfile(fileext = ".bib")
bibConvert(infile = fn_biblatex, outfile = bib, informat = "biblatex", outformat = "bib")
## file.show(bib)

## convert a biblatex file to bibentry
rds <- tempfile(fileext = ".rds")
fn_biblatex
rds
be <- bibConvert(fn_biblatex, rds, "biblatex", "bibentry")
bea <- bibConvert(fn_biblatex, rds, "biblatex") # same
readRDS(rds)

## convert to R source file
r <- tempfile(fileext = ".R")
bibConvert(fn_biblatex, r, "biblatex")
## file.show(r)
cat(readLines(r), sep = "\n")

fn_cyr_utf8 <- system.file("bib", "cyr_utf8.bib",  package = "rbibutils")

## Can't have files with different encodings in the package, so below
## first convert a UTF-8 file to something else.
##
## input here contains cyrillic (UTF-8) output to Windows Cyrillic,
## notice the "no_latex" option
a <- bibConvert(fn_cyr_utf8, bib, encoding = c("utf8", "cp1251"), tex = "no_latex")

## now take the bib file and convert it to UTF-8
bibConvert(bib, bib2, encoding = c("cp1251", "utf8"))

## Latin-1 example: Author and Title fields contain Latin-1 accented
##   characters, not real names. As above, the file is in UTF-8
fn_latin1_utf8  <- system.file("bib", "latin1accents_utf8.bib", package = "rbibutils")
## convert to Latin-1, by default the accents are converted to TeX combinations:
b <- bibConvert(fn_latin1_utf8, bib , encoding = c("utf8", "latin1"))
cat(readLines(bib), sep = "\n")
## use "no_latex" option to keep them Latin1:
c <- bibConvert(fn_latin1_utf8, bib , encoding = c("utf8", "latin1"), tex = "no_latex")
## this will show properly in Latin-1 locale (or suitable text editor):
##cat(readLines(bib), sep = "\n")


unlink(c(modl, bib, bib2, r, rds))

rbibutils

Convert Between Bibliography Formats

v2.1.1
GPL-2
Authors
Georgi N. Boshnakov [aut, cre] (R port, R code, modifications to bibutils' C code, conversion to Bibentry (R and C code)), Chris Putnam [aut] (src/*, author of the bibutils libraries, https://sourceforge.net/projects/bibutils/), Richard Mathar [ctb] (src/addsout.c), Johannes Wilm [ctb] (src/biblatexin.c, src/bltypes.c)
Initial release
2021-04-27
