# DocumentIQA: Scientific Document Insight Question/Answer

## Introduction

Question answering on scientific documents.
Our implementation uses [Grobid](https://github.com/kermitt2/grobid) for text extraction instead of a raw PDF-to-text converter.
Thanks to Grobid, we can precisely extract the abstract and the full text.
This is just the beginning, and publishing the project early may help us gather more feedback.
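
To illustrate why structured extraction helps, here is a minimal sketch of separating the abstract from the body text in a TEI XML document of the kind Grobid produces. It is not the code used by this project: the function name `extract_abstract_and_body` and the hand-written `SAMPLE_TEI` string are hypothetical, and the element layout is a simplified assumption about Grobid's TEI output.

```python
import xml.etree.ElementTree as ET

# TEI namespace (assumed to match the one used in Grobid's TEI output).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_abstract_and_body(tei_xml: str) -> dict:
    """Return the abstract and body paragraphs found in a TEI XML string.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(tei_xml)

    def paragraphs(path: str) -> list:
        # Join nested text (e.g. inline <ref> elements) and normalize whitespace.
        return [
            " ".join("".join(p.itertext()).split())
            for p in root.findall(path, TEI_NS)
        ]

    return {
        "abstract": paragraphs(".//tei:profileDesc/tei:abstract//tei:p"),
        "body": paragraphs(".//tei:text/tei:body//tei:p"),
    }


# Hand-written, simplified sample; real Grobid output carries much more metadata.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <abstract><div><p>We study superconductors.</p></div></abstract>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <div><head>Introduction</head><p>Superconductivity was discovered in 1911.</p></div>
      <div><head>Methods</head><p>Samples were measured at <ref>4 K</ref>.</p></div>
    </body>
  </text>
</TEI>"""

if __name__ == "__main__":
    sections = extract_abstract_and_body(SAMPLE_TEI)
    print(sections["abstract"])
    print(sections["body"])
```

Keeping the abstract and body as separate lists means a question-answering step can choose which part of the paper to search, something a flat text dump of the PDF does not allow.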

**NOTE**: This project focuses on scientific articles. Uploading books or other large documents might not work as expected.

**Work in progress**

https://document-insights.streamlit.app/

**An OpenAI or HuggingFace API key is required**


### Acknowledgement

This project is developed at the [National Institute for Materials Science](https://www.nims.go.jp) (NIMS) in Japan.