deb_control_files:
- control
- md5sums
deb_fields:
  Architecture: arm64
  Depends: r-api-4.0, r-cran-stringi (>= 1.0.1), r-cran-rcpp (>= 0.12.3), r-cran-snowballc
    (>= 0.5.1), libc6 (>= 2.17), libgcc-s1 (>= 3.0), libstdc++6 (>= 14)
  Description: |-
    GNU R fast, consistent tokenization of natural language text
    Convert natural language text into tokens. Includes tokenizers for
    shingled n-grams, skip n-grams, words, word stems, sentences,
    paragraphs, characters, shingled characters, lines, tweets, Penn
    Treebank, regular expressions, as well as functions for counting
    characters, words, and sentences, and a function for splitting longer
    texts into separate documents, each with the same number of words.
    The tokenizers have a consistent interface, and the package is built
    on the 'stringi' and 'Rcpp' packages for fast yet correct
    tokenization in 'UTF-8'.
  Homepage: https://cran.r-project.org/package=tokenizers
  Installed-Size: '861'
  Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
  Package: r-cran-tokenizers
  Priority: optional
  Recommends: r-cran-testthat
  Section: gnu-r
  Source: r-cran-tokenizers (0.3.0-1)
  Suggests: r-cran-covr, r-cran-knitr, r-cran-rmarkdown
  Version: 0.3.0-1+b1
srcpkg_name: r-cran-tokenizers
srcpkg_version: 0.3.0-1