PyPLN Application Programming Interface

PyPLN is a distributed pipeline for natural language processing, written in Python. Learn more at the PyPLN website.

pypln.api is a package that talks to the PyPLN HTTP API so you can do everything programmatically, in a Pythonic way. With it you can add and list corpora, add and list documents, and retrieve each document's properties (the results the backend pipeline produces when it processes the document).

Installation

pypln.api is available on the Python Package Index. To install it, run:

pip install pypln.api
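
After installing, it can be useful to confirm that the package is importable from the interpreter you intend to use. The sketch below uses only the standard library; `is_importable` is a hypothetical helper written for this illustration, not something pypln.api provides.

```python
import importlib


def is_importable(module_name):
    """Return True if module_name can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Also covers ModuleNotFoundError, which subclasses ImportError.
        return False
    return True


print('pypln.api importable: {}'.format(is_importable('pypln.api')))
```

If this prints `False`, the package was most likely installed into a different Python environment than the one running the check.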

Example - Usage

The docstrings in pypln.api.PyPLN describe each method, but typical usage looks like this:

from pypln.api import PyPLN

# Start an authenticated session with the PyPLN demo server
pypln = PyPLN('http://fgv.pypln.org/', ('username', 'password'))

# You could also use your authentication token:
#pypln = PyPLN('http://fgv.pypln.org/', 'my-auth-token')

# Add a new corpus to your account
new_corpus = pypln.add_corpus(name='test', description='my new corpus')

# Add a document to this new corpus (a PDF is binary, so open it in 'rb' mode)
with open('my-file.pdf', 'rb') as fp:
    new_doc = new_corpus.add_document(fp)
print('Document added: {}'.format(new_doc))

# Retrieve all available (processed) properties for your brand new document
print('Processed properties:')
for document_property in new_doc.properties:
    print(' - {}'.format(document_property))

# Retrieve one document property:
print('Extracted text from our PDF:')
print(new_doc.get_property('text'))

# Retrieve a document using its URL:
from pypln.api import Document
# Make sure you replace this URL with the URL of a document you have access to!
my_doc = Document.from_url('http://fgv.pypln.org/documents/1/',
    ('username', 'password'))
print(my_doc.get_property('text'))

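# Retrieve the wordcloud image built from the document.
# The property is base64-encoded, so decode it and write the bytes in binary mode.
import base64
doc_id = 1  # assumption: not defined above; 1 matches the example document URL
with open("wordcloud_{}.png".format(doc_id), 'wb') as fd:
    fd.write(base64.b64decode(my_doc.get_property("wordcloud")))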
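
The wordcloud step decodes a base64 string and writes the resulting bytes to disk. A minimal, self-contained sketch of that step follows, using a stand-in payload instead of a server response. `save_base64_file` and the sample payload are hypothetical names introduced for illustration; they are not part of pypln.api.

```python
import base64
import os
import tempfile


def save_base64_file(encoded, path):
    """Decode a base64 string (or bytes) and write the raw bytes to path."""
    data = base64.b64decode(encoded)
    # Binary mode matters: the decoded payload is bytes, not text.
    with open(path, 'wb') as fd:
        fd.write(data)
    return len(data)


# Stand-in payload: the 8-byte PNG signature, base64-encoded the way the
# wordcloud property is assumed to arrive.
png_signature = b'\x89PNG\r\n\x1a\n'
encoded = base64.b64encode(png_signature).decode('ascii')

target = os.path.join(tempfile.mkdtemp(), 'wordcloud_demo.png')
written = save_base64_file(encoded, target)
print('{} bytes written to {}'.format(written, target))
```

With a real document, the `encoded` value would presumably come from `get_property("wordcloud")` rather than from the stand-in payload.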

ProTip™: use IPython to discover all the methods available on the PyPLN, Corpus and Document classes. They are simple and straightforward to use.
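
Outside an interactive shell, the standard `inspect` module gives a similar overview. The sketch below lists the public callables of a class; `public_methods` and the `Demo` class are hypothetical stand-ins so the snippet runs without a PyPLN server, and the same helper could be pointed at `PyPLN`, `Corpus` or `Document` once pypln.api is installed.

```python
import inspect


def public_methods(cls):
    """Return the sorted names of public callables available on cls."""
    return sorted(
        name
        for name, member in inspect.getmembers(cls, callable)
        if not name.startswith('_')
    )


class Demo(object):
    """Hypothetical stand-in class; replace with PyPLN, Corpus or Document."""

    def add_item(self, name):
        pass

    def list_items(self):
        pass

    def _internal(self):
        pass


print(public_methods(Demo))
```

Names starting with an underscore are skipped, so only the methods meant for callers are shown.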

License

pypln.api is free software, released under the GPLv3.
