Scholarly Communication Technology Catalogue

Grobid

Last updated: 2021-02-03 09:29 UTC
Description: GROBID (or Grobid) stands for GeneRation Of BIbliographic Data. It is a machine-learning library that extracts and parses journal articles in PDF format and restructures them as TEI-encoded XML documents, which can then be transformed to JATS XML.
Homepage: https://grobid.readthedocs.io/en/latest/Introduction/
Codebase: https://github.com/kermitt2/grobid/
Roadmap:
Hosting: self-hosted
Licensing: http://www.apache.org/licenses/LICENSE-2.0.html
Pricing: free to use
Adoption level: Not Classified
Readiness level: TR9
Governance: Community (ad-hoc)
Business Form: Volunteer Community
Status: Actively Maintained
Categories: Software Component
Functions: Format Conversion
Collections:
General Tags:
Base technologies:
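
Example: the description above says Grobid turns PDF articles into TEI-encoded documents. As a rough illustration of what a downstream consumer might do with such a document, the sketch below reads the title and author names from a TEI header using only the Python standard library. The sample XML is hand-written for this example and is not actual Grobid output. The element paths and the function name extract_header are assumptions based on the general TEI header layout, so real output may need different paths.

```python
import xml.etree.ElementTree as ET

# TEI documents use this XML namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written sample for illustration only; NOT real Grobid output.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title level="a" type="main">An Example Article</title>
      </titleStmt>
      <sourceDesc>
        <biblStruct>
          <analytic>
            <author>
              <persName><forename>Jane</forename><surname>Doe</surname></persName>
            </author>
            <author>
              <persName><forename>John</forename><surname>Smith</surname></persName>
            </author>
          </analytic>
        </biblStruct>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>"""


def extract_header(tei_xml: str) -> dict:
    """Return the title and author names found in a TEI header.

    Hypothetical helper: the paths below are assumptions about where
    the title and authors sit in the header.
    """
    root = ET.fromstring(tei_xml)

    # First title under titleStmt, or an empty string if there is none.
    title = root.findtext(
        ".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS
    )

    authors = []
    for pers in root.iterfind(".//tei:analytic/tei:author/tei:persName", TEI_NS):
        forename = pers.findtext("tei:forename", default="", namespaces=TEI_NS)
        surname = pers.findtext("tei:surname", default="", namespaces=TEI_NS)
        # Join whichever name parts are present.
        authors.append(" ".join(part for part in (forename, surname) if part))

    return {"title": title.strip(), "authors": authors}


if __name__ == "__main__":
    print(extract_header(SAMPLE_TEI))
```

The same header fields are the kind of metadata a later TEI-to-JATS transformation would carry over, which is why the header is a natural first place to check an extraction result.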