Apertium Streamparser

Usage: streamparser.py [FILE]

Consumes input from a file (first argument) or stdin, parsing and pretty printing the readings of lexical units found.

class streamparser.Knownness

Level of knowledge associated with a LexicalUnit.

Values: known, unknown, biunknown, genunknown

class streamparser.known

class streamparser.unknown

Denoted by *, analysis not available.

class streamparser.biunknown

Denoted by @, translation not available.

class streamparser.genunknown

Denoted by #, generated form not available.
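The marker characters listed above can be illustrated with a small standalone sketch. It does not use the streamparser classes: `MarkerKind` and `classify_marker` are hypothetical names introduced here, and the assumption that the marker is the first character of the analysis is ours, not something this page states.

```python
from enum import Enum


class MarkerKind(Enum):
    """Hypothetical stand-in for the four documented knownness levels."""
    KNOWN = ""
    UNKNOWN = "*"      # analysis not available
    BIUNKNOWN = "@"    # translation not available
    GENUNKNOWN = "#"   # generated form not available


def classify_marker(analysis: str) -> MarkerKind:
    """Return the level implied by the first character of an analysis.

    Assumption: the marker, if any, is the very first character.
    """
    for kind in (MarkerKind.UNKNOWN, MarkerKind.BIUNKNOWN, MarkerKind.GENUNKNOWN):
        if analysis.startswith(kind.value):
            return kind
    return MarkerKind.KNOWN


print(classify_marker("*kirjoittaa").name)  # UNKNOWN
print(classify_marker("cat<n><sg>").name)   # KNOWN
```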

class streamparser.LexicalUnit(lexical_unit)

A lexical unit consisting of a lemma and its readings.

lexical_unit

The lexical unit in Apertium stream format.

Type:	str

wordform

The word form (surface form) of the lexical unit.

Type:	str

wordbound_blank

The wordbound blank of the lexical unit.

Type:	str

readings

The analyses of the lexical unit with sublists containing all subreadings.

Type:	List[List[`SReading`]]

knownness

The level of knowledge of the lexical unit.

Type:	`Knownness`
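To make the relationship between the word form, the readings and their subreadings concrete, here is a simplified standalone sketch. It is not the library's parser: `split_unit` is a hypothetical helper, and it assumes the common stream-format layout (`wordform/analysis/analysis`, subreadings joined by `+`, tags in angle brackets) with no escaped characters and no wordbound blanks.

```python
import re

TAG_RE = re.compile(r"<([^>]+)>")


def split_unit(lexical_unit: str):
    """Split a lexical unit into its word form and nested readings.

    Simplifications (assumptions of this sketch): no escaped '/', '+',
    '<' or '>' characters, and no wordbound blank.
    """
    body = lexical_unit.strip("^$")          # tolerate optional delimiters
    wordform, *analyses = body.split("/")
    readings = []
    for analysis in analyses:
        subreadings = []
        for part in analysis.split("+"):     # one entry per subreading
            baseform = part.split("<", 1)[0]
            tags = TAG_RE.findall(part)
            subreadings.append((baseform, tags))
        readings.append(subreadings)
    return wordform, readings


wordform, readings = split_unit("^don't/do<vbdo><pres>+not<adv>$")
print(wordform)   # don't
print(readings)   # [[('do', ['vbdo', 'pres']), ('not', ['adv'])]]
```

The nested list mirrors the documented `readings` shape: the outer list holds one entry per analysis, and each inner list holds that analysis's subreadings.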

class streamparser.SReading

A single subreading of an analysis of a token.

baseform

The base form (lemma, lexical form, citation form) of the reading.

Type:	str

tags

The morphological tags associated with the reading.

Type:	List[str]

baseform: Alias for field number 0

tags: Alias for field number 1

streamparser.mainpos(reading, ltr=False)

Return the first part-of-speech tag of a reading. If there are several subreadings, the default is to give the first tag of the last subreading. If ltr=True, give the first tag of the first subreading instead. See http://beta.visl.sdu.dk/cg3/single/#sub-stream-apertium for more information.
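The selection rule described for mainpos can be sketched in a few lines. This is an illustration of the documented behaviour, not the library's source: `Sub` is a local namedtuple standing in for `SReading`, and `first_pos` is a hypothetical name.

```python
from collections import namedtuple

# Local stand-in mirroring the documented SReading fields (baseform, tags).
Sub = namedtuple("Sub", ["baseform", "tags"])


def first_pos(reading, ltr=False):
    """Return the first tag of the last subreading, or of the first if ltr."""
    sub = reading[0] if ltr else reading[-1]
    return sub.tags[0]


reading = [Sub("do", ["vbdo", "pres"]), Sub("not", ["adv"])]
print(first_pos(reading))            # adv
print(first_pos(reading, ltr=True))  # vbdo
```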

streamparser.parse(stream, with_text=False)

Generates lexical units from a character stream.

Parameters:

stream (Iterator[str]) – A character stream containing lexical units, superblanks and other text.
with_text (Optional[bool]) – A boolean defining whether to output preceding text with each lexical unit.

Yields:

LexicalUnit – The next lexical unit found in the character stream (if with_text is False).

(str, LexicalUnit) – A tuple of the text that separated the unit from the prior unit and the next lexical unit found in the character stream (if with_text is True).
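The two yield shapes can be illustrated with a simplified standalone generator. It is only a sketch of the interface described above: `iter_units` is a hypothetical name, it takes a whole string rather than a character iterator, it yields the raw unit text instead of `LexicalUnit` objects, and it ignores escaped characters and superblanks.

```python
import re

# Assumption: units are delimited by '^' and '$' with no escaped delimiters.
UNIT_RE = re.compile(r"\^(.*?)\$", re.DOTALL)


def iter_units(text, with_text=False):
    """Yield each unit, optionally paired with the text that preceded it."""
    pos = 0
    for match in UNIT_RE.finditer(text):
        preceding = text[pos:match.start()]  # text since the previous unit
        pos = match.end()
        unit = match.group(1)
        yield (preceding, unit) if with_text else unit


stream = "^a/a<det>$ ^dog/dog<n><sg>$"
print(list(iter_units(stream)))
# ['a/a<det>', 'dog/dog<n><sg>']
print(list(iter_units(stream, with_text=True)))
# [('', 'a/a<det>'), (' ', 'dog/dog<n><sg>')]
```

Pairing each unit with its preceding text is what allows the original spacing to be reconstructed after the units have been processed.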

streamparser.parse_file(f, **kwargs)

Generates lexical units from a file.

Parameters:	f (file) – A file containing lexical units, superblanks and other text.
Yields:	`LexicalUnit` – The next lexical unit found in the file.
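Since parse is documented as taking a character stream while parse_file takes a file, one way to bridge the two is to flatten the file's lines into single characters. The sketch below shows that idea only; `chars_from_file` is a hypothetical helper, and nothing here claims this is how parse_file is implemented.

```python
import io
from itertools import chain


def chars_from_file(f):
    """Flatten a text file (an iterable of lines) into single characters."""
    return chain.from_iterable(f)


# io.StringIO stands in for an opened text file.
fake_file = io.StringIO("^a/a<det>$\n^dog/dog<n><sg>$")
print("".join(chars_from_file(fake_file)) == "^a/a<det>$\n^dog/dog<n><sg>$")  # True
```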
