Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Updated Jun 13, 2024 (HTML)
PDF Verse is a web-based PDF editor with tools for editing, converting, and manipulating PDFs. Merge, compress, add or remove pages, or extract text using OCR. Convert PDF to DOC, Excel, PPT, JPG, PNG, text, and many other formats, and vice versa. PDF Verse also offers a user-friendly interface.
OCR library to extract text & tables from PDF files and images. Convert any image or PDF to CSV / TXT / JSON / Searchable PDF.
Build a RAG preprocessing pipeline
ByteScout PDF Extractor SDK source code samples
NodeJS library to convert JSON to PDF or vice versa
This project converts books from PDF into well-formed JSON objects by separating title and content. You can then insert the output JSON file into a database easily.
Graphlit Platform
Python client library for Graphlit Platform
🛠️ ipuresult-cli is a tool for creating JSON files from GGSIPU result PDF files 📚
TypeScript client for Graphlit Platform