Helpers to download and unpack a ZIP file
Details on the internal and helper functions that power use_course() and use_zip(). Only create_download_url() is exported.
create_download_url(url)
url | A GitHub, DropBox, or Google Drive URL, as copied from a web browser. |
destdir | Path to existing local directory where the ZIP file will be stored. Defaults to current working directory, but note that |
zipfile | Path to local ZIP file. |
## function signature
tidy_download(url, destdir = getwd())

# as called inside use_course()
tidy_download(
  url, ## after post-processing with normalize_url()
  # conspicuous_place() = `getOption('usethis.destdir')` or desktop or home
  # directory or working directory
  destdir = destdir %||% conspicuous_place()
)
Special-purpose function to download a ZIP file and automatically determine
the file name, which often determines the folder name after unpacking.
Developed with DropBox and GitHub as primary targets, possibly via
shortlinks. Both platforms offer a way to download an entire folder or repo
as a ZIP file, with information about the original folder or repo transmitted
in the Content-Disposition header. In the absence of this header, a
filename is generated from the input URL. In either case, the filename is
sanitized. Returns the path to the downloaded ZIP file, invisibly.
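The filename logic described above can be sketched roughly as follows. This is an illustration only, not the code inside tidy_download(): the helper names (filename_from_header(), filename_from_url(), sanitize_filename(), choose_filename()) are hypothetical, and the sanitizing rule (replace anything outside letters, digits, dot, underscore, and hyphen) is an assumption.

```r
# Hypothetical helper: pull a filename out of a Content-Disposition value.
# Returns NULL when the header is absent or carries no filename parameter.
filename_from_header <- function(cd) {
  if (is.null(cd) || !nzchar(cd)) {
    return(NULL)
  }
  # Matches filename="x.zip" as well as filename=x.zip (quotes optional)
  m <- regmatches(cd, regexec('filename="?([^";]+)"?', cd))[[1]]
  if (length(m) < 2) NULL else m[2]
}

# Hypothetical fallback: derive a name from the last segment of the URL path.
filename_from_url <- function(url) {
  path <- sub("[?#].*$", "", url) # drop query string and fragment
  basename(path)
}

# Hypothetical sanitizer (assumed rule): keep letters, digits, ".", "_", "-".
sanitize_filename <- function(x) {
  gsub("[^A-Za-z0-9._-]", "_", x)
}

# Prefer the header; fall back to the URL; sanitize in either case.
choose_filename <- function(url, cd = NULL) {
  name <- filename_from_header(cd)
  if (is.null(name)) {
    name <- filename_from_url(url)
  }
  sanitize_filename(name)
}

choose_filename(
  "https://github.com/r-lib/rematch2/archive/master.zip",
  cd = "attachment; filename=rematch2-master.zip"
)
```

With the header present, the name comes from the header; without it, the same URL would yield a name based on the final path segment.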
tidy_download() is set up to retry after a download failure. In an
interactive session, it asks for the user's consent. All retries use a longer
connect timeout.
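A retry loop of this kind might look like the sketch below. It is not the implementation used by tidy_download(): download_with_retry() and its arguments are hypothetical, the number of attempts is arbitrary, and doubling the timeout on each retry is an assumption (the text above only says that retries use a longer connect timeout).

```r
# Hypothetical wrapper: retry a download, lengthening the timeout each time.
# `download_fun` is any function(url, destfile, timeout) that signals an
# error when the download fails.
download_with_retry <- function(url, destfile, download_fun,
                                n_tries = 3, timeout = 10) {
  for (i in seq_len(n_tries)) {
    ok <- tryCatch(
      {
        download_fun(url, destfile, timeout = timeout)
        TRUE
      },
      error = function(e) {
        message("Attempt ", i, " failed: ", conditionMessage(e))
        FALSE
      }
    )
    if (ok) {
      return(invisible(destfile))
    }
    # In an interactive session, ask before trying again
    if (i < n_tries && interactive()) {
      answer <- readline("Download failed. Try again? (y/n) ")
      if (!tolower(answer) %in% c("y", "yes")) break
    }
    timeout <- timeout * 2 # assumed policy: double the timeout per retry
  }
  stop("Download failed after ", i, " attempt(s).", call. = FALSE)
}

# One possible `download_fun`, built on utils::download.file(), which
# honors the "timeout" option.
base_download <- function(url, destfile, timeout) {
  old <- options(timeout = timeout)
  on.exit(options(old))
  utils::download.file(url, destfile, mode = "wb", quiet = TRUE)
}
```

A call such as download_with_retry(url, "repo.zip", base_download) would then return the destination path invisibly on success.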
To make a DropBox folder available for ZIP download, create a shared link for it.
A shared link will have this form:
https://www.dropbox.com/sh/12345abcde/6789wxyz?dl=0
Replace the dl=0 at the end with dl=1 to create a download link:
https://www.dropbox.com/sh/12345abcde/6789wxyz?dl=1
You can use create_download_url() to do this conversion.
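For illustration, the dl=0 to dl=1 substitution can be written in one line of base R. The helper name dropbox_download_url() is hypothetical, and this is a sketch of the idea rather than what create_download_url() does internally.

```r
# Hypothetical helper: switch a DropBox shared link to its download form
# by replacing a trailing dl=0 with dl=1.
dropbox_download_url <- function(url) {
  sub("dl=0$", "dl=1", url)
}

dropbox_download_url("https://www.dropbox.com/sh/12345abcde/6789wxyz?dl=0")
```

A URL that already ends in dl=1 is returned unchanged.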
This download link (or a shortlink that points to it) is suitable as input
for tidy_download(). After one or more redirections, this link will
eventually lead to a download URL. For more details, see
https://help.dropbox.com/files-folders/share/force-download and
https://help.dropbox.com/installs-integrations/sync-uploads/download-entire-folders.
On GitHub, click on the repo's "Clone or download" button to reveal a "Download ZIP" button. Capture this URL, which will have this form:
https://github.com/r-lib/usethis/archive/master.zip
This download link (or a shortlink that points to it) is suitable as input
for tidy_download(). After one or more redirections, this link will
eventually lead to a download URL. Here are other links that also lead to
ZIP download, albeit with a different filenaming scheme (REF could be a
branch name, a tag, or a SHA):
https://github.com/r-lib/usethis/zipball/HEAD
https://api.github.com/repos/r-lib/rematch2/zipball/REF
https://api.github.com/repos/r-lib/rematch2/zipball/HEAD
https://api.github.com/repos/r-lib/usethis/zipball/REF
You can use create_download_url() to create the "Download ZIP" URL from
a typical GitHub browser URL.
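As a rough sketch of how the two URL shapes shown above relate to a browser URL, the following base R helper extracts the owner and repo and fills them into each pattern. The function github_zip_urls() is hypothetical, the default ref of "master" is an assumption (many repos use a different default branch), and the actual output of create_download_url() may differ.

```r
# Hypothetical helper: build ZIP URLs for a GitHub repo from a browser URL.
github_zip_urls <- function(url, ref = "master") {
  # Capture "owner" and "repo" from https://github.com/owner/repo[/...]
  pattern <- "^https://github\\.com/([^/]+)/([^/#?]+)"
  m <- regmatches(url, regexec(pattern, url))[[1]]
  if (length(m) < 3) {
    stop("Not a recognized GitHub repo URL: ", url, call. = FALSE)
  }
  owner <- m[2]
  repo <- sub("\\.git$", "", m[3]) # tolerate a trailing ".git"
  c(
    archive = sprintf("https://github.com/%s/%s/archive/%s.zip", owner, repo, ref),
    zipball = sprintf("https://api.github.com/repos/%s/%s/zipball/%s", owner, repo, ref)
  )
}

github_zip_urls("https://github.com/r-lib/usethis/issues")
```

The extra path component ("/issues") is ignored, so any page within the repo can serve as the starting point.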
To our knowledge, it is not possible to download a Google Drive folder as a ZIP archive. It is however possible to share a ZIP file stored on Google Drive. To get its URL, click on "Get the shareable link" (within the "Share" menu). This URL doesn't allow for direct download, as it's designed to be processed in a web browser first. Such a sharing link looks like:
https://drive.google.com/open?id=123456789xxyyyzzz
To get a URL suitable for direct download, extract the "id" element from the sharing URL and include it in this URL format:
https://drive.google.com/uc?export=download&id=123456789xxyyyzzz
Use create_download_url() to perform this transformation automatically.
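The "id" extraction can be sketched in base R as below. The helper drive_download_url() is hypothetical and only handles sharing links that carry the id as a query parameter, as in the example above; it is not the code behind create_download_url().

```r
# Hypothetical helper: turn a Google Drive sharing link into a direct
# download link by extracting the "id" query parameter.
drive_download_url <- function(url) {
  # The id ends at the next "&" or "/" (e.g. a trailing "/view")
  m <- regmatches(url, regexec("[?&]id=([^&/]+)", url))[[1]]
  if (length(m) < 2) {
    stop("No 'id' parameter found in: ", url, call. = FALSE)
  }
  paste0("https://drive.google.com/uc?export=download&id=", m[2])
}

drive_download_url("https://drive.google.com/open?id=123456789xxyyyzzz")
drive_download_url("https://drive.google.com/open?id=123456789xxyyyzzz/view")
```

Both calls produce the same download URL, since the trailing "/view" is not part of the id.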
tidy_unzip() is a special-purpose function to unpack a ZIP file and (attempt to) create the directory structure most people want. When unpacking an archive, it is easy to get one more or one fewer level of nesting than you expected.
It's especially important to finesse the directory structure here: we want the same local result when unzipping the same content from either GitHub or DropBox ZIP files, which pack things differently. Here is the intent:
If the ZIP archive foo.zip does not contain a single top-level directory,
i.e. it is packed as "loose parts", unzip into a directory named foo.
Typical of DropBox ZIP files.

If the ZIP archive foo.zip has a single top-level directory (which, by
the way, is not necessarily called "foo"), unpack into said directory.
Typical of GitHub ZIP files.
Returns path to the directory holding the unpacked files, invisibly.
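The two cases above can be sketched as a small decision function over the archive's file listing. This is an illustration under stated assumptions, not the code inside tidy_unzip(): top_directory() and unzip_target() are hypothetical names, and the listing is assumed to be a character vector such as the Name column returned by utils::unzip(zipfile, list = TRUE).

```r
# Hypothetical helper: return the single top-level directory of an archive
# listing, or NULL if the archive is packed as "loose parts".
top_directory <- function(filenames) {
  filenames <- filenames[filenames != "/"] # drop a spurious "/" entry
  if (length(filenames) == 0 || !all(grepl("/", filenames))) {
    return(NULL) # at least one file sits at the top level
  }
  top <- unique(sub("/.*$", "", filenames)) # first path component of each entry
  if (length(top) == 1) top else NULL
}

# Hypothetical helper: decide which directory will hold the unpacked files.
unzip_target <- function(zipfile, filenames) {
  top <- top_directory(filenames)
  if (is.null(top)) {
    # loose parts: use a directory named after the ZIP file
    file.path(dirname(zipfile), tools::file_path_sans_ext(basename(zipfile)))
  } else {
    # single top-level directory: use it as-is
    file.path(dirname(zipfile), top)
  }
}

# DropBox-style listing (loose parts)
unzip_target("foo.zip", c("/", "analysis.R", "data/raw.csv"))

# GitHub-style listing (single top-level directory)
unzip_target("foo.zip", c("rematch2-master/", "rematch2-master/R/rematch2.R"))
```

In the first case the target is a directory named after the archive; in the second it is the archive's own top-level directory.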
DropBox:
The ZIP files produced by DropBox are special. The file list tends to contain
a spurious directory "/", which we ignore during unzip. Also, if the
directory is a Git repo and/or RStudio Project, we unzip-ignore various
hidden files, such as .RData, .Rhistory, and those below .git/ and
.Rproj.user.
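A filter along these lines could be sketched as follows. The helper keep_for_unzip() is hypothetical and its patterns are assumptions based on the files named above; unlike the behavior described, this sketch applies the filter unconditionally rather than only when the directory is a Git repo or RStudio Project.

```r
# Hypothetical helper: drop entries that should not be unpacked.
keep_for_unzip <- function(filenames) {
  ignore <- filenames == "/" |                      # spurious root entry
    grepl("(^|/)\\.RData$", filenames) |            # saved workspace
    grepl("(^|/)\\.Rhistory$", filenames) |         # command history
    grepl("(^|/)\\.git/", filenames) |              # anything below .git/
    grepl("(^|/)\\.Rproj\\.user(/|$)", filenames)   # RStudio project state
  filenames[!ignore]
}

keep_for_unzip(c(
  "/", "analysis.R", ".RData", ".Rhistory",
  ".git/config", ".Rproj.user/shared/x", "foo.Rproj", ".gitignore"
))
```

The result could then be passed as the files argument of utils::unzip() so that only the kept entries are extracted.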
## Not run:
tidy_download("https://github.com/r-lib/rematch2/archive/master.zip")
## End(Not run)

## Not run:
tidy_download("https://github.com/r-lib/rematch2/archive/master.zip")
tidy_unzip("rematch2-master.zip")
## End(Not run)

# GitHub
create_download_url("https://github.com/r-lib/usethis")
create_download_url("https://github.com/r-lib/usethis/issues")

# DropBox
create_download_url("https://www.dropbox.com/sh/12345abcde/6789wxyz?dl=0")

# Google Drive
create_download_url("https://drive.google.com/open?id=123456789xxyyyzzz")
create_download_url("https://drive.google.com/open?id=123456789xxyyyzzz/view")