Commit 5cc37beb authored by Timo Petmanson

Updated tutorials

parent f3740a89
......@@ -6,6 +6,8 @@ Estnltk --- Open source tools for Estonian natural language processing
Estnltk is a Python 2.7/Python 3.4 library for performing common language processing tasks in Estonian, funded by `Eesti Keeletehnoloogia Riiklik Programm`_ under the project `EKT57`_.
Estnltk is licensed under `GNU GPL version 2`_.
To get started right now, see :ref:`installation_tutorial`.
.. _Eesti Keeletehnoloogia Riiklik Programm:
.. _EKT57:
.. _GNU GPL version 2:
.. _developer_guide:
Estnltk developer guide
This document is for everyone who works on the Estnltk project (or wishes to) but does not know how to get started.
Compiling estnltk
python build
python3 build
Version control and branches
......@@ -37,16 +50,44 @@ First, modify your ``.git/setup`` configuration to look like following::
remote = origin
merge = refs/heads/devel
Second, use commands
Second, use commands::
git push origin master
git pull origin master
git push origin master
git pull origin master
to perform pulls and pushes to both repositories without any extra hassle.
Third, you're done! ;)
Checking out devel branch
Try this::
git branch -a
git checkout -t -b devel origin/devel
git pull
git checkout devel
git branch -a
You should see output similar to this::
* devel
The important thing is that you see ``* devel``.
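The check above can also be scripted. The following is a minimal sketch, not part of Estnltk: the helper name ``current_branch`` and the sample text are hypothetical, and it only assumes that ``git branch`` marks the checked-out branch with a leading ``*``.

```python
def current_branch(branch_output):
    """Return the branch marked with '*' in `git branch` output, or None."""
    for line in branch_output.splitlines():
        line = line.strip()
        if line.startswith("* "):
            return line[2:].strip()
    return None


# Hypothetical text, shaped like what `git branch -a` prints.
sample = """\
* devel
  master
  remotes/origin/devel
  remotes/origin/master
"""

print(current_branch(sample))
```

To use it on a real checkout, pass it the captured output of ``git branch -a``, for example via ``subprocess.check_output``.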
Writing documentation
......@@ -91,4 +132,14 @@ Then, create a subfolder with the appropriate estnltk version and copy the new d
Creating releases
\ No newline at end of file
Uploading source tarball::
python build
python sdist
python upload
Uploading Windows wheels::
.. _installation_tutorial:
......@@ -9,7 +11,7 @@ Installation on Linux Mint 17.2
On Linux, install the dependencies, install estnltk and test the installation::
sudo apt-get install g++ python3-dev python3-pip python3-numpy swig
sudo apt-get install g++ python3-dev python3-pip python3-wheel python3-numpy swig
sudo pip3 install estnltk
As a first test, try to run this line of code in your terminal::
......@@ -34,15 +36,14 @@ Although this is Linux Mint 17.2 specific, it should also work in Ubuntu.
(Optional) You might want to use Oracle JDK instead of OpenJDK, because Estnltk uses Java for some tasks.
These tutorials will help you out:: ,
These tutorials will help you install it: , .
Installation on Windows
Installation on Windows is a little bit more difficult than on Linux.
Full list of dependencies
......@@ -129,24 +130,18 @@ First thing after installing the dependencies is to get the source.
One option is cloning the repository to get the latest code::
git clone estnltk
or from mirror repository::
git clone estnltk
or download it as a compressed zip::
Then, extract the sources and issue the following commands in the downloaded/cloned folder to build and install::
Then, issue the following commands in the cloned folder to build and install::
python build
sudo python install
python3 build
sudo python3 install
Note that ``python`` usually refers to default Python version installed with the system.
Usually, you can also use more specific versions by replacing ``python`` with ``python2.7`` or ``python3.4``.
Note that the same commands work when building on Windows, but you need to execute them in the Visual Studio SDK command prompt.
If you want to set up estnltk for development, see :ref:`developer_guide`.
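Because ``python`` may resolve to different interpreters on different systems, it can help to check the version before building. The sketch below is an illustration only: the helper ``is_documented_version`` is hypothetical, and it treats just the two versions named in the introduction (2.7 and 3.4) as documented. Other versions may or may not work.

```python
import sys

# Versions named in the Estnltk introduction; anything else is untested
# as far as this sketch is concerned.
DOCUMENTED_VERSIONS = {(2, 7), (3, 4)}


def is_documented_version(version_info):
    """Return True if (major, minor) is one of the documented versions."""
    return (version_info[0], version_info[1]) in DOCUMENTED_VERSIONS


print("This interpreter is Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
print("Documented for Estnltk: %s" % is_documented_version(sys.version_info))
```

Run it with each candidate command (``python``, ``python2.7``, ``python3.4``) to see which interpreter each one starts.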
Windows installers
......@@ -169,12 +164,13 @@ Note that you still need to install the dependencies separately.
Post-installation steps
Downloading NLTK tokenizers for Estonian. These are necessary for tokenization::
Downloading NLTK tokenizers for Estonian. These are necessary for tokenization.
This should happen automatically, but if it does not, use this command to download them::
python -m nltk.downloader punkt
python3 -m nltk.downloader punkt
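The same download can be triggered from Python only when the tokenizer data is actually missing. This is a rough sketch rather than Estnltk's own mechanism: ``ensure_resource`` and ``ensure_punkt`` are hypothetical helper names, while ``nltk.data.find`` (which raises ``LookupError`` for missing resources) and ``nltk.download`` are the standard NLTK calls.

```python
def ensure_resource(find, download, path, package):
    """Download `package` only if `find(path)` raises LookupError.

    Returns True if a download was triggered, False otherwise.
    """
    try:
        find(path)
        return False
    except LookupError:
        download(package)
        return True


def ensure_punkt():
    """Make sure the NLTK punkt tokenizer data is available."""
    # Imported lazily so the helper above can be used without NLTK installed.
    import nltk
    return ensure_resource(nltk.data.find, nltk.download,
                           "tokenizers/punkt", "punkt")
```

Calling ``ensure_punkt()`` once after installation has the same effect as the command above, but skips the download when the data is already present.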
Estnltk comes with pre-built named entity taggers, but you can optionally rebuild them if you have lost them for some reason.
The command to build the default named entity tagger for Estonian::
python -m
python3 -m
......@@ -48,8 +48,7 @@ setup(
'estnltk.estner': ['gazetteer/*', 'models/py2_default/*', 'models/py3_default/*'],
'estnltk.wordnet': ['*.cnf', 'data/*.txt', 'data/*.soi', 'data/*.cnf', 'data/scripts/*.py'],
'estnltk.mw_verbs': ['res/*'],
'estnltk.converters': ['*.mrf'],
#'estnltk.grammarextractor': ['grammars/*']
'estnltk.converters': ['*.mrf']
author = "University of Tartu",