
CWB/Perl & other APIs

The CWB/Perl API and tools

The CWB/Perl API is a complete, officially supported API for the IMS Open Corpus Workbench and is available as a separate download.

All CWB users are encouraged to install the CWB/Perl packages, because they include several useful command-line scripts (cwb-make, cwb-regedit) and are required by various Web GUIs (including CQPweb, BNCweb and the Europarl GUI).

C API for low-level access

An undocumented C API, the corpus library (CL), gives direct low-level access to CWB-indexed corpora. It is not possible to execute CQP queries from a C-level API, as this would require fundamental changes to the CWB source code architecture.

Python APIs

Jørg Asmussen and Yannick Versley have developed a Python API for CQP and the low-level CL library, based on the corresponding CWB/Perl modules. The bundled code can be downloaded from the cwb-python repository as an installable module.
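As a rough illustration of low-level access through the Python CL bindings, the sketch below reads the tokens of one sentence. It assumes that cwb-python exposes `CWB.CL.Corpus` with an `attribute(name, type)` method, that a structural attribute `s` and a positional attribute `word` exist, and that indexing a structural attribute yields a tuple starting with the region's start and end positions. The corpus name `MYCORPUS` and both function names are hypothetical; check the cwb-python documentation before relying on any of this.

```python
def sentence_tokens(words, sentences, n):
    """Return the tokens of the n-th sentence region.

    `words` is any sequence sliceable by corpus position; `sentences`
    maps a region number to a tuple starting with (start, end), where
    `end` is inclusive.
    """
    bounds = sentences[n]
    start, end = bounds[0], bounds[1]
    return list(words[start:end + 1])


def demo(corpus_name="MYCORPUS", n=0):
    """Hypothetical usage against an indexed corpus (needs cwb-python)."""
    # Assumption: module path and method names as recalled from cwb-python.
    from CWB.CL import Corpus

    corpus = Corpus(corpus_name)
    words = corpus.attribute("word", "p")    # positional attribute
    sentences = corpus.attribute("s", "s")   # structural attribute
    return sentence_tokens(words, sentences, n)
```

Keeping the region lookup in a plain function means it can be tried out on ordinary Python lists before a corpus is available.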

R APIs

The R package rcqp offers direct corpus access and CQP queries from within R. This package has been developed by Bernard Desgraupes and Sylvain Loiseau and is available from CRAN. The package can be compiled from source on Linux and Mac OS X, provided that the external dependencies of CWB 3.5β have been installed. In particular, the Glib2 and PCRE libraries are required (see this document for details).

Other programming languages & APIs

The corpus query interface (CQi) is a cross-language remote client-server interface that provides low-level corpus access as well as CQP functionality. While still in draft stage, it has been used to develop APIs for Java and other programming languages that cannot easily be linked with C libraries or run an interactive CQP backend.
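To give an idea of what a CQi client has to do at the wire level, here is a minimal sketch that builds a connect request and decodes a status reply. It assumes that CQi commands are 16-bit words in network byte order, that strings are sent as a 16-bit length followed by the bytes, and that the constants below match the CQi draft specification; these values are quoted from memory and should be verified against the specification. The helper names are hypothetical.

```python
import struct

# Assumed values from the CQi draft specification (verify before use).
CQI_CTRL_CONNECT = 0x1101
CQI_STATUS_CONNECT_OK = 0x0102


def cqi_word(value):
    """Encode a 16-bit unsigned word in network byte order."""
    return struct.pack("!H", value)


def cqi_string(text):
    """Encode a string as a 16-bit length followed by its bytes."""
    # Assumption: UTF-8; a real server may expect another encoding.
    data = text.encode("utf-8")
    return cqi_word(len(data)) + data


def connect_message(user, password):
    """Build the byte sequence for a connect request."""
    return cqi_word(CQI_CTRL_CONNECT) + cqi_string(user) + cqi_string(password)


def parse_status(reply):
    """Decode the 16-bit response code at the start of a server reply."""
    (code,) = struct.unpack("!H", reply[:2])
    return code
```

A client would write the result of `connect_message` to a TCP socket connected to the CQi server and compare the parsed reply with `CQI_STATUS_CONNECT_OK`.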