The API module that contains the main function `parse_pdf`.
parse_pdf(server_address: str, file_path: Path, port: str = '', timeout: int = 60)
On success, this function returns the JSON output of the
science parse server as a dictionary. If a timeout or any other
exception occurs, it returns None and logs the exception as an
error.
- server_address: Address of the server, e.g. http://127.0.0.1
- file_path: Path to the PDF file to be processed.
- port: The port of the server, e.g. 8080
- timeout: The amount of time to allow the request to take.
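Because failures are reported as a None return value rather than a raised exception, callers may want to check the result explicitly. The following is a minimal sketch, not part of the module: `parse_or_raise` is a hypothetical helper that accepts any function with the signature above, and the import path in the trailing comment is an assumption about how the module is installed.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Optional


def parse_or_raise(parse: Callable[..., Optional[Dict[str, Any]]],
                   server_address: str,
                   file_path: Path,
                   port: str = '',
                   timeout: int = 60) -> Dict[str, Any]:
    """Call a parse_pdf-style function and turn a None result into an error.

    `parse` is expected to behave like parse_pdf: it returns a dictionary
    on success and None when the request failed.
    """
    result = parse(server_address, file_path, port=port, timeout=timeout)
    if result is None:
        # The underlying function has already logged the cause.
        raise RuntimeError(f'Parsing {file_path} failed; check the error log.')
    return result


# Typical use (the import path is an assumption):
# from science_parse_api.api import parse_pdf
# output = parse_or_raise(parse_pdf, 'http://127.0.0.1',
#                         Path('paper.pdf'), port='8080')
```

Passing the parsing function in as an argument keeps the helper easy to exercise without a running server.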
Returns: a dictionary with the following keys:
['abstractText', 'authors', 'id', 'references', 'sections', 'title', 'year']
Note: not all of these keys will always exist. If science parse
cannot detect the relevant information, the corresponding key is omitted,
e.g. if it cannot find any references then there will be no 'references' key.
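Since any key may be absent, and the whole result may be None, reading the output with `dict.get` avoids a KeyError. The sketch below is illustrative only: `summarise` is a hypothetical helper, and it assumes that 'references' and 'sections' hold lists when present, which this page does not state.

```python
from typing import Any, Dict, Optional


def summarise(output: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a small summary that tolerates a None result and missing keys."""
    if output is None:
        # parse_pdf returned None: the request failed and was logged.
        output = {}
    return {
        'title': output.get('title'),   # None when no title was detected
        'year': output.get('year'),     # None when no year was detected
        # Assumption: these values are lists when the keys exist.
        'num_references': len(output.get('references', [])),
        'num_sections': len(output.get('sections', [])),
    }


# Hypothetical output with no 'references' or 'year' key.
example = {'title': 'An Example Paper', 'sections': [{}, {}]}
summary = summarise(example)
```

Here `summary['num_references']` is 0 and `summary['year']` is None, because the hypothetical output omits both keys.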
Note: see the main page of the documentation for a
detailed example of this function.