markupit.readers package

Submodules

markupit.readers.base_parser module

class markupit.readers.base_parser.BaseParser

Bases: object

Base class for all parsers. It defines the basic interface for all parsers.

GRAMMAR_RULES = {}
RULES_NAMES = []
compile_regex(rules: list[str] | None = None) -> Pattern[str]
get_parse_method(m: Match[str], state: BlockState) -> int
static insert_rule(rules: list[str], name: str, before: str | None = None) -> None
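
The attributes and signatures above suggest that a subclass maps rule names to patterns in GRAMMAR_RULES, lists the active rules in RULES_NAMES, and joins them into one pattern with compile_regex. The following is a minimal sketch of that idea, not the library's implementation: the function bodies, the `_sketch` names, the wrapping of each rule in a named group, and the `re.M` flag are all assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical two-rule grammar, patterned on the documented attributes.
GRAMMAR_RULES = {
    "blank_line": r"(^[ \t\v\f]*\n)+",
    "block_quote": r"^ {0,3}>(?P<quote_text>.*?)$",
}
RULES_NAMES = ["blank_line", "block_quote"]


def compile_regex_sketch(rules: Optional[list] = None) -> "re.Pattern[str]":
    """Join the selected rules into one alternation of named groups (assumed behavior)."""
    names = RULES_NAMES if rules is None else rules
    parts = [f"(?P<{name}>{GRAMMAR_RULES[name]})" for name in names]
    return re.compile("|".join(parts), re.M)


def insert_rule_sketch(rules: list, name: str, before: Optional[str] = None) -> None:
    """Insert `name` in place, ahead of `before` when given, otherwise at the end."""
    if before is not None and before in rules:
        rules.insert(rules.index(before), name)
    else:
        rules.append(name)


pattern = compile_regex_sketch()
match = pattern.search("> quoted text\n")
# The outer named group closes last, so lastgroup names the rule that matched.
matched_rule = match.lastgroup
```

With a combined pattern like this, a dispatcher could look up a method such as `visit_<matched_rule>`; whether get_parse_method works that way is not stated in this reference.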

markupit.readers.markdown_block_parser module

class markupit.readers.markdown_block_parser.BlockParser

Bases: BaseParser

GRAMMAR_RULES = {
    'atx_heading': '^ {0,3}(?P<level>#{1,6})(?!#+)(?P<atx_text>[ \\t]*|[ \\t]+.*?)$',
    'blank_line': '(^[ \\t\\v\\f]*\\n)+',
    'block_quote': '^ {0,3}>(?P<quote_text>.*?)$',
    'code_fenced': '^(?P<fnc_spaces> {0,3})(?P<fnc_marker>`{3,}|~{3,})[ \\t]*(?P<fnc_lang>.*?)$',
    'code_indent': '^(?: {4}| *\\t)[^\\n]+(?:\\n+|$)((?:(?: {4}| *\\t)[^\\n]+(?:\\n+|$))|\\s)*',
    'horizontal_rule': '^ {0,3}((?:-[ \\t]*){3,}|(?:_[ \\t]*){3,}|(?:\\*[ \\t]*){3,})$',
    'list': '^(?P<list_spaces> {0,3})(?P<list_marker>[\\*\\+-]|\\d{1,9}[.)])(?P<list_text>[ \\t]*|[ \\t].+)$',
    'setext_heading': '^ {0,3}(?P<sep>=|-){1,}[ \\t]*$',
}
RULES_NAMES = ['code_fenced', 'code_indent', 'atx_heading', 'setext_heading', 'horizontal_rule', 'blank_line', 'block_quote', 'list']
parse(state: BlockState, rules: list[str] | None = None) -> None

Parse source Markdown text into blocks. Each block is stored in the state as a dict with the following structure:

    {'type': str, 'content': str, 'attrs': dict}
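
As a rough illustration of the block structure described above, the shape can be written down as a typed dict. The class name `Block`, the example `type` string, and the `level` key in `attrs` are assumptions; this reference does not list the values the parser actually emits.

```python
from typing import Any, Dict, TypedDict


class Block(TypedDict):
    """Shape of one parsed block (hypothetical class name)."""

    type: str
    content: str
    attrs: Dict[str, Any]


# Illustrative value only: the real 'type' strings and 'attrs' keys may differ.
heading: Block = {
    "type": "atx_heading",
    "content": "Header example",
    "attrs": {"level": 1},
}
```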

visit_atx_heading(m: Match[str], state: BlockState) -> int

Visit method for ATX headings.

# Header example
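
The `atx_heading` pattern from GRAMMAR_RULES can be tried on its own to see which groups a visit method would receive. This is a small sketch; compiling with `re.M` and deriving the level from the length of the `level` group are assumptions about how the parser uses the match.

```python
import re

# Pattern copied from GRAMMAR_RULES['atx_heading']; the re.M flag is an assumption.
ATX_HEADING = re.compile(
    r"^ {0,3}(?P<level>#{1,6})(?!#+)(?P<atx_text>[ \t]*|[ \t]+.*?)$", re.M
)

m = ATX_HEADING.match("# Header example")
level = len(m.group("level"))          # number of '#' characters
text = m.group("atx_text").strip()     # heading text without surrounding spaces

# Seven '#' characters exceed the {1,6} limit, so the line is not a heading.
too_deep = ATX_HEADING.match("####### seven hashes")
```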

visit_blank_line(m: Match[str], state: BlockState) -> int

Visit method for blank lines.

Blank lines are not present in the AST, but the rule is still needed so that empty lines are not read as paragraphs.

visit_block_quote(m: Match[str], state: BlockState) -> int

Visit method for block quotes.

    > This is a block quote.
    > ```
    > It can contain other blocks.
    > ```

visit_code_fenced(m: Match[str], state: BlockState) -> int

Visit method for fenced code blocks.

Example: an opening line ```python, the code line print("Hello, World!"), and a closing line ```.
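
The `code_fenced` pattern only describes the opening fence line, so a visit method has to locate the closing fence itself. The sketch below applies the documented pattern and then searches for a closing line. The closing-fence search, which requires the exact same marker, is a simplifying assumption and not necessarily what visit_code_fenced does.

```python
import re

FENCE = "`" * 3  # built programmatically so the sample text is easy to embed

# Pattern copied from GRAMMAR_RULES['code_fenced']; the re.M flag is an assumption.
CODE_FENCED = re.compile(
    r"^(?P<fnc_spaces> {0,3})(?P<fnc_marker>`{3,}|~{3,})[ \t]*(?P<fnc_lang>.*?)$",
    re.M,
)

source = f'{FENCE} python\nprint("Hello, World!")\n{FENCE}\n'

opening = CODE_FENCED.match(source)
marker = opening.group("fnc_marker")
lang = opening.group("fnc_lang")

# Hypothetical closing-fence search: a later line holding only the same marker.
closing_re = re.compile(rf"^ {{0,3}}{re.escape(marker)}[ \t]*$", re.M)
closing = closing_re.search(source, opening.end())

# Skip the newline after the opening line; stop at the start of the closing line.
body = source[opening.end() + 1 : closing.start()]
```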

visit_code_indent(m: Match[str], state: BlockState) -> int

Visit method for indented code blocks (four spaces before the code).

    print("Hello")
    print("World!")

visit_horizontal_rule(m: Match[str], state: BlockState) -> int

Visit method for horizontal rules.
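
To see which lines the `horizontal_rule` pattern from GRAMMAR_RULES accepts, it can be run against a few candidates. This is only a sketch of the pattern in isolation; the `re.M` flag and the sample lines are assumptions.

```python
import re

# Pattern copied from GRAMMAR_RULES['horizontal_rule']; the re.M flag is an assumption.
HORIZONTAL_RULE = re.compile(
    r"^ {0,3}((?:-[ \t]*){3,}|(?:_[ \t]*){3,}|(?:\*[ \t]*){3,})$", re.M
)

candidates = ["---", "* * *", "___", "--", "-*-"]

# A rule needs at least three of the same character; mixing characters fails.
results = {line: bool(HORIZONTAL_RULE.match(line)) for line in candidates}
```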

visit_list(m: Match[str], state: BlockState) -> int

Visit method for lists.

    - item 1
    - item 2

    1. item 1
    2. item 2
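
The `list` pattern from GRAMMAR_RULES captures the marker and the item text separately, which is enough to tell bullet items from ordered ones. A minimal sketch follows; the `classify` helper is hypothetical and only illustrates the capture groups.

```python
import re

# Pattern copied from GRAMMAR_RULES['list']; the re.M flag is an assumption.
LIST = re.compile(
    r"^(?P<list_spaces> {0,3})(?P<list_marker>[\*\+-]|\d{1,9}[.)])"
    r"(?P<list_text>[ \t]*|[ \t].+)$",
    re.M,
)


def classify(line):
    """Return (kind, text) for a list item line, or None (hypothetical helper)."""
    m = LIST.match(line)
    if m is None:
        return None
    marker = m.group("list_marker")
    kind = "ordered" if marker[0].isdigit() else "bullet"
    return kind, m.group("list_text").strip()


bullet = classify("- item 1")
ordered = classify("2. item 2")
# Without whitespace after the marker the line is not a list item.
not_a_list = classify("-item")
```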

visit_setext_heading(m: Match[str], state: BlockState) -> int

Visit method for Setext headings.

    Header example
    ==============

markupit.readers.markdown_block_reader module

class markupit.readers.markdown_block_reader.InlineVisitor

Bases: NodeVisitor

generic_visit(node, visited_children)

Default visitor method

Parameters:
  • node – The node we’re visiting

  • visited_children – The results of visiting the children of that node, in a list

I’m not sure there’s an implementation of this that makes sense across all (or even most) use cases, so we leave it to subclasses to implement for now.

visit_content(node, visited_children)
visit_emph(node, visited_children)
visit_space(_1, _2)
visit_strong(node, visited_children)

class markupit.readers.markdown_block_reader.MarkdownBlockReader

Bases: object

Reads the text in the BlockState and parses it into blocks.

parse(text: str) -> Document

markupit.readers.markdown_block_reader.flatten(nested_list)
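
The flatten helper is undocumented here. Going by its name and parameter, a plausible reading is that it collapses arbitrarily nested lists, such as the nested results a visitor returns for child nodes, into one flat list. The sketch below implements that reading under the hypothetical name `flatten_sketch`; the real function may behave differently, for example for tuples or strings.

```python
from typing import Any, List


def flatten_sketch(nested_list: List[Any]) -> List[Any]:
    """Recursively flatten nested lists into one flat list (assumed behavior)."""
    flat: List[Any] = []
    for item in nested_list:
        if isinstance(item, list):
            # Descend into sublists; other values, including strings, are kept whole.
            flat.extend(flatten_sketch(item))
        else:
            flat.append(item)
    return flat


flat = flatten_sketch(["a", ["b", ["c", "de"]], []])
```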

markupit.readers.markdown_reader module

class markupit.readers.markdown_reader.MarkdownReader

Bases: Reader

A class representing a Markdown reader.

read(content: str) -> Document

Read the content and return a Document.

Parameters:

content (str) – The content to read.

Returns:

The Document object.

Return type:

Document

markupit.readers.reader module

class markupit.readers.reader.Reader

Bases: ABC

An abstract class representing a reader.

abstract read(content: str) -> Document

Read the content and return a Document.

Parameters:

content (str) – The content to read.

Returns:

The Document object.

Return type:

Document

read_file(path: str) -> Document

Read the content of a file and return a Document.

Parameters:

path (str) – The path to the file.

Returns:

The Document object.

Return type:

Document
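
The split between an abstract read and a concrete read_file suggests a template-method design: subclasses implement read, and read_file loads the file and delegates to it. The sketch below reproduces that pattern with stand-in classes. `Document`, `ReaderSketch`, and `UpperReader` are hypothetical, and both the delegation and the UTF-8 encoding are assumptions about the real Reader.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class Document:
    """Stand-in for the library's Document type (hypothetical)."""

    def __init__(self, text: str) -> None:
        self.text = text


class ReaderSketch(ABC):
    @abstractmethod
    def read(self, content: str) -> Document:
        """Turn a string into a Document; subclasses must implement this."""

    def read_file(self, path: str) -> Document:
        # Assumed behavior: load the file as UTF-8 text and delegate to read().
        return self.read(Path(path).read_text(encoding="utf-8"))


class UpperReader(ReaderSketch):
    """Toy reader used only to exercise the base class."""

    def read(self, content: str) -> Document:
        return Document(content.upper())


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.md"
    sample.write_text("# title\n", encoding="utf-8")
    doc = UpperReader().read_file(str(sample))
```

Under this design a new input format only needs a read implementation to gain file support.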

markupit.readers.state module

class markupit.readers.state.BlockState(parent: Any | None = None)

Bases: object

State used to save blocks and current cursor position in file.

ENDLINE = re.compile('\\n|$')
add_para(text: str) -> None
append(block: dict[str, Any]) -> None
append_para() -> int | None
find_endline() -> int
get_text_before(end_pos: int) -> str
init_child_state(source: str) -> BlockState
init_parse_text(source: str) -> None
insert_second_to_last(block: dict[str, Any]) -> None
property last_block: Any
property nesting_lvl: int
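
The ENDLINE pattern together with find_endline and get_text_before hints at a cursor that walks the source line by line. The sketch below shows how such a cursor could work. The class `CursorSketch`, its `src` and `cursor` attributes, and the method bodies are assumptions; only the ENDLINE pattern is taken from the reference above.

```python
import re

# Same pattern as BlockState.ENDLINE: a newline or the end of the text.
ENDLINE = re.compile(r"\n|$")


class CursorSketch:
    """Hypothetical stand-in for the cursor handling in BlockState."""

    def __init__(self, src: str) -> None:
        self.src = src
        self.cursor = 0

    def find_endline(self) -> int:
        # Position just past the next newline, or the end of the text.
        return ENDLINE.search(self.src, self.cursor).end()

    def get_text_before(self, end_pos: int) -> str:
        # Text between the cursor and end_pos.
        return self.src[self.cursor:end_pos]


state = CursorSketch("first line\nsecond line")
first_end = state.find_endline()
first_line = state.get_text_before(first_end)

state.cursor = first_end  # advance past the consumed line
second_end = state.find_endline()
second_line = state.get_text_before(second_end)
```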

markupit.readers.utils module

markupit.readers.utils.convert_all_tabs_to_spaces(text: str, tab_width: int = 4) -> str
markupit.readers.utils.convert_leading_tabs_to_spaces(text: str, tab_width: int = 4) -> str
markupit.readers.utils.unescape_char(text: str) -> str
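
The two tab helpers are undocumented. Judging by their names, one replaces every tab and the other only tabs in a line's leading whitespace. The sketch below implements that reading under hypothetical `_sketch` names. It replaces each tab with `tab_width` spaces and does not align to tab stops, which may differ from the real functions.

```python
import re


def convert_all_tabs_sketch(text: str, tab_width: int = 4) -> str:
    """Replace every tab with tab_width spaces (assumed behavior)."""
    return text.replace("\t", " " * tab_width)


def convert_leading_tabs_sketch(text: str, tab_width: int = 4) -> str:
    """Replace tabs only inside each line's leading whitespace (assumed behavior)."""

    def expand(match: "re.Match[str]") -> str:
        return match.group(0).replace("\t", " " * tab_width)

    # With re.M, '^' anchors at the start of every line.
    return re.sub(r"^[ \t]+", expand, text, flags=re.M)


sample = "\tkey\tvalue\n  \tx"
all_converted = convert_all_tabs_sketch(sample)
leading_converted = convert_leading_tabs_sketch(sample)
```

Keeping interior tabs intact matters for content such as code blocks, where only the indentation decides block structure.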

Module contents

class markupit.readers.MarkdownReader

Bases: Reader

A class representing a Markdown reader.

read(content: str) -> Document

Read the content and return a Document.

Parameters:

content (str) – The content to read.

Returns:

The Document object.

Return type:

Document