Commit b0c2dd23 authored by rabauti

DTD for validating the corpus

parent e4d14ff0
#merge the corpus into a single file
python joinXml.py > korpus_public/oppijakeel.xml
#validate the formal correctness of the corpus file
#validate the formal correctness of the corpus file using a DTD
xmllint --noout korpus_public/oppijakeel.xml
@@ -39,4 +39,6 @@ cat korpus_public/oppijakeel.xml | grep -o 'tyyp="[^"]*"' | sort | uniq -c | sor
209 tyyp="tõlge"
42 tyyp="referaat"
25 tyyp="referering"
15 tyyp=""
\ No newline at end of file
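
The visible part of the pipeline in the hunk header counts how often each `tyyp="..."` attribute occurs in the merged corpus. A minimal Python sketch of the same count follows, in case a script is preferred over the shell pipeline. The names `count_tyyp` and `print_counts` are hypothetical, and printing the most frequent value first is an assumption, since the final stage of the original pipeline is truncated in the diff.

```python
import re
from collections import Counter

# Same pattern as the grep -o expression in the README.
TYYP_PATTERN = re.compile(r'tyyp="[^"]*"')


def count_tyyp(xml_text):
    """Count each distinct tyyp="..." match, like grep -o | sort | uniq -c."""
    return Counter(TYYP_PATTERN.findall(xml_text))


def print_counts(path):
    """Print 'count match' lines for the file at path (hypothetical helper).

    Assumption: most frequent value first.
    """
    with open(path, encoding="utf-8") as f:
        counts = count_tyyp(f.read())
    for match, n in counts.most_common():
        print(n, match)
```

Calling `print_counts("korpus_public/oppijakeel.xml")` would print one line per distinct attribute value, similar to the listing above.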
<!ELEMENT korpus ( header, tekst ) >
<!ELEMENT header ( #PCDATA | parandajad)* >
<!ELEMENT parandajad (#PCDATA)* >
<!ELEMENT tekst ( eksimus+ ) >
<!ELEMENT eksimus ( algne, parandus+, kommentaar )+ >
<!ATTLIST eksimus emakeel CDATA #REQUIRED >
<!ATTLIST eksimus id CDATA #REQUIRED >
<!ATTLIST eksimus tase CDATA #REQUIRED >
<!ATTLIST eksimus tyyp CDATA #REQUIRED >
<!ELEMENT algne ( #PCDATA ) >
<!ATTLIST algne id CDATA #REQUIRED >
<!ELEMENT parandus ( #PCDATA ) >
<!ATTLIST parandus id CDATA #REQUIRED >
<!ELEMENT kommentaar ( #PCDATA )* >
<!ATTLIST kommentaar id CDATA #REQUIRED >
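
The DTD above marks every declared attribute as `#REQUIRED`. As a rough illustration of what that means for a corpus file, the sketch below checks only those required attributes using the Python standard library. It is not a substitute for real DTD validation: it ignores the content models entirely. The function name `missing_required_attrs` and the sample values (`vene`, `B1`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Required attributes, copied from the ATTLIST declarations in the DTD.
REQUIRED_ATTRS = {
    "eksimus": ("emakeel", "id", "tase", "tyyp"),
    "algne": ("id",),
    "parandus": ("id",),
    "kommentaar": ("id",),
}


def missing_required_attrs(xml_text):
    """Return (element, attribute) pairs for each missing required attribute."""
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter():  # walks the tree in document order
        for attr in REQUIRED_ATTRS.get(elem.tag, ()):
            if attr not in elem.attrib:
                problems.append((elem.tag, attr))
    return problems


# Hypothetical sample: the kommentaar element deliberately lacks its id.
SAMPLE = """<korpus><header>h<parandajad>p</parandajad></header><tekst>
<eksimus emakeel="vene" id="1" tase="B1" tyyp="tõlge">
<algne id="1">a</algne><parandus id="1">b</parandus><kommentaar>c</kommentaar>
</eksimus></tekst></korpus>"""

problems = missing_required_attrs(SAMPLE)
```

For the sample above, `problems` holds a single entry for the `kommentaar` element, which is missing its `id`.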