k3down2

Convert markdown segments into images and other transferable media.

k3down2 is a component of the pykit3 project, a Python 3 toolkit.

Dependencies

External tools required:

pandoc - render markdown snippets to HTML (tables, etc.)
graphviz - render graphviz diagrams to images
mmdc - convert mermaid charts to SVG (mermaid-js)

Installation

pip install k3down2

API Reference

`k3down2`

k3down2 is a utility that converts markdown segments into easy-to-transfer media such as images. It depends on:

pandoc to render markdown snippets, such as tables, to HTML.
graphviz to render graphviz source to images.
playwright (chromium) to render SVG/HTML to PNG.
mmdc to convert mermaid charts to SVG. See: https://mermaid-js.github.io/mermaid/#

`download(url)`

Download content from url and return the response body.

Parameters:

Name	Type	Description	Default
`url`	`str`	the url from which to download.	required

Returns:

Type	Description
`bytes`	bytes of downloaded data.

Source code in k3down2/down2.py

def download(url: str) -> bytes:
    """
    Download content from ``url`` and return the response body.

    Args:
        url(str): the url from which to download.

    Returns:
        bytes of downloaded data.
    """

    resp = urllib.request.urlopen(url, timeout=30)
    return resp.read()
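
The behaviour of `download` can be checked without network access, because `urllib.request.urlopen` also accepts `file://` URLs. The sketch below is a local stand-in that mirrors the listing above (it adds a `with` block so the response is closed), so it runs without k3down2 installed. The file name `sample.txt` is made up for illustration.

```python
import pathlib
import tempfile
import urllib.request


def download(url: str) -> bytes:
    # Local stand-in mirroring the function above; not imported from k3down2.
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


with tempfile.TemporaryDirectory() as tdir:
    # Hypothetical sample file used only for this illustration.
    sample = pathlib.Path(tdir) / "sample.txt"
    sample.write_bytes(b"hello k3down2")

    # A file:// URI is fetched the same way an http(s) URL would be.
    data = download(sample.as_uri())

print(data)
```

The return value is always `bytes`; decode it explicitly if text is needed.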

`graphviz_to_img(gv, typ)`

Render graphviz source to image.

Requires

brew install graphviz

Source code in k3down2/down2.py

def graphviz_to_img(gv: str | bytes, typ: str) -> bytes:
    """
    Render graphviz source to image.

    Requires:
        brew install graphviz
    """

    _, out, _ = k3proc.command_ex(
        "dot",
        "-T" + typ,
        input=to_bytes(gv),
        text=False,
    )
    return out
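
The call above pipes the graphviz source to `dot -T<typ>` and returns its standard output. A minimal sketch of the same idea with the standard library's `subprocess` in place of `k3proc` is shown below. The helper name `dot_to_svg` is hypothetical, and the sketch assumes the `dot` executable is on `PATH`.

```python
import shutil
import subprocess


def dot_to_svg(gv: str) -> bytes:
    # Hypothetical helper: same idea as graphviz_to_img(gv, "svg"),
    # but using subprocess instead of k3proc.
    if shutil.which("dot") is None:
        raise RuntimeError("graphviz 'dot' executable not found on PATH")
    proc = subprocess.run(
        ["dot", "-Tsvg"],
        input=gv.encode("utf-8"),  # dot reads the graph source from stdin
        capture_output=True,
        check=True,  # raise CalledProcessError if dot reports an error
    )
    return proc.stdout


# Only render when graphviz is actually installed.
svg = dot_to_svg("digraph { a -> b }") if shutil.which("dot") else None
```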

`md_to_html(md)`

Build markdown source into html.

Parameters:

Name	Type	Description	Default
`md`	`str`	markdown source.	required

Returns:

Type	Description
`str`	str of html

Source code in k3down2/down2.py

def md_to_html(md: str) -> str:
    """
    Build markdown source into html.

    Args:
        md(str): markdown source.

    Returns:
        str of html
    """

    _, html, _ = k3proc.command_ex(
        "pandoc",
        "-f",
        "markdown",
        "-t",
        "html",
        input=md,
    )

    return html_style + html

`mdtable_to_barehtml(md)`

Build markdown table into html without style.

Parameters:

Name	Type	Description	Default
`md`	`str`	markdown source.	required

Returns:

Type	Description
`str`	str of html

Source code in k3down2/down2.py

def mdtable_to_barehtml(md: str) -> str:
    """
    Build markdown table into html without style.

    Args:
        md(str): markdown source.

    Returns:
        str of html
    """

    # A table with a wide column causes pandoc to emit a ``colgroup`` tag, which zhihu does not recognize.
    # Reported in:
    #      https://github.com/drmingdrmer/md2zhihu/issues/22
    #
    # Thus we set a very large rendering width to disable this behavior:
    #      https://github.com/jgm/pandoc/issues/2574

    _, html, _ = k3proc.command_ex(
        "pandoc",
        "-f",
        "markdown",
        "-t",
        "html",
        "--column",
        "100000",
        input=md,
    )
    lines = html.strip().split("\n")
    lines = [x for x in lines if x not in ("<thead>", "</thead>", "<tbody>", "</tbody>")]

    return "\n".join(lines)
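
The post-processing step above can be tried in isolation. The sketch below applies the same line filter to a hand-written HTML string that only approximates what pandoc emits for a one-column table (it is not captured pandoc output). The helper name `strip_table_sections` is hypothetical.

```python
def strip_table_sections(html: str) -> str:
    # Drop lines that consist solely of a <thead>/<tbody> open or close tag,
    # the same filter mdtable_to_barehtml applies to pandoc's output.
    lines = html.strip().split("\n")
    kept = [x for x in lines if x not in ("<thead>", "</thead>", "<tbody>", "</tbody>")]
    return "\n".join(kept)


# Hand-written approximation of pandoc's table HTML, for illustration only.
pandoc_like = """<table>
<thead>
<tr>
<th>a</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
</tr>
</tbody>
</table>
"""

bare = strip_table_sections(pandoc_like)
print(bare)
```

Because the filter compares whole lines, a tag that shares a line with other markup or carries attributes would be left in place.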

`mermaid_to_svg(mmd)`

Render mermaid to svg. See: https://mermaid-js.github.io/mermaid/#

Requires

npm install @mermaid-js/mermaid-cli

Source code in k3down2/down2.py

def mermaid_to_svg(mmd: str) -> str:
    """
    Render mermaid to svg.
    See: https://mermaid-js.github.io/mermaid/#

    Requires:
        npm install @mermaid-js/mermaid-cli
    """

    with tempfile.TemporaryDirectory() as tdir:
        output_path = os.path.join(tdir, "mmd.svg")

        puppeteer_config = {
            "args": [
                "--no-sandbox",
                "--disable-setuid-sandbox",
                "--disable-dev-shm-usage",
                "--disable-gpu",
            ]
        }

        config_file_path = os.path.join(tdir, "config.json")
        with open(config_file_path, "w") as f:
            f.write(json.dumps(puppeteer_config))

        k3proc.command_ex(
            "npm",
            "exec",
            "--",
            "mmdc",
            "-o",
            output_path,
            "--puppeteerConfigFile",
            config_file_path,
            input=mmd,
        )
        return fread(output_path)

`render_to_img(mime, content, typ, width=1000, height=2000, asset_base=None)`

Render content that a browser can display, such as HTML or SVG, to an image. Uses Playwright (Chromium) for rendering and Pillow for image processing.

Parameters:

Name	Type	Description	Default
`mime`	`str`	a full mime type such as `image/jpeg` or a shortcut `jpg`.	required
`content`	`str \| bytes`	content to render, such as jpeg data or svg source.	required
`typ`	`str`	specifies the output image type, such as "png" or "jpg".	required
`width`	`int`	specifies the window width to render a page. Default 1000.	`1000`
`height`	`int`	specifies the window height to render a page. Default 2000.	`2000`
`asset_base`	`str`	specifies the path to assets dir. E.g. the image base path in a html page.	`None`

Returns:

Type	Description
`bytes`	bytes of the image data

Source code in k3down2/down2.py

def render_to_img(
    mime: str, content: str | bytes, typ: str, width: int = 1000, height: int = 2000, asset_base: str | None = None
) -> bytes:
    """
    Render content that a browser can display, such as HTML or SVG,
    to an image.
    Uses Playwright (Chromium) for rendering and Pillow for image processing.

    Args:
        mime(str): a full mime type such as ``image/jpeg`` or a shortcut ``jpg``.

        content(str): content to render, such as jpeg data or svg source.

        typ(string): specifies output image type such as "png", "jpg"

        width(int): specifies the window width to render a page. Default 1000.

        height(int): specifies the window height to render a page. Default 2000.

        asset_base(str): specifies the path to assets dir. E.g. the image base path in a html page.

    Returns:
        bytes of the image data
    """

    if "html" in mime:
        content = r'<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>' + content

        if asset_base is not None:
            base_uri = pathlib.Path(asset_base).as_uri()
            content = '<base href="{}/">'.format(base_uri) + content

    m = mimetypes.get(mime) or mime
    suffix = mime_to_suffix.get(m, mime)

    with tempfile.TemporaryDirectory() as tdir:
        fn = os.path.join(tdir, "xxx." + suffix)
        flags = "w"
        if isinstance(content, bytes):
            flags = "wb"
        with open(fn, flags) as f:
            f.write(content)

        browser = _get_browser()
        page = browser.new_page(
            viewport={"width": width, "height": height},
            device_scale_factor=2,
        )
        page.goto(pathlib.Path(fn).as_uri())

        content_height = page.evaluate("document.documentElement.scrollHeight")
        if content_height > height:
            page.set_viewport_size({"width": width, "height": content_height})

        png_data = page.screenshot(omit_background=True)
        page.close()

    return _trim_and_convert(png_data, typ)
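
For HTML input, the function first prepends a charset declaration and, when `asset_base` is given, a `<base>` tag so that relative asset paths resolve against that directory. The sketch below isolates just that step in a hypothetical helper named `prepare_html`. It uses the system temp directory as a stand-in asset directory and does not start a browser.

```python
import pathlib
import tempfile
from typing import Optional


def prepare_html(content: str, asset_base: Optional[str] = None) -> str:
    # Hypothetical helper mirroring the "html" branch of render_to_img.
    content = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>' + content
    if asset_base is not None:
        # as_uri() requires an absolute path and yields a file:// URI.
        base_uri = pathlib.Path(asset_base).as_uri()
        content = '<base href="{}/">'.format(base_uri) + content
    return content


# Stand-in asset directory; any absolute directory path would do.
assets = tempfile.gettempdir()
page = prepare_html("<img src='a.png'>", asset_base=assets)
print(page)
```

With the `<base>` tag in place, `a.png` is looked up inside the asset directory rather than next to the temporary file that the browser loads.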

`tex_to_img(tex, block, typ)`

Convert tex source to an image.

Parameters:

Name	Type	Description	Default
`tex`	`str`	tex source	required
`block`	`bool`	whether to render a block(center-aligned) equation or inline equation.	required
`typ`	`str`	output image type such as "png" or "jpg"	required

Returns:

Type	Description
`str \| bytes`	image data in the format given by `typ`.

Source code in k3down2/down2.py

def tex_to_img(tex: str, block: bool, typ: str) -> str | bytes:
    """
    Convert tex source to an image.

    Args:
        tex(str): tex source

        block(bool): whether to render a block(center-aligned) equation or
            inline equation.

        typ(str): output image type such as "png" or "jpg"

    Returns:
        image data in the format given by ``typ``.
    """

    input_type = "tex_block"
    if not block:
        input_type = "tex_inline"
    return convert(input_type, tex, typ)

`tex_to_plain(tex)`

Make a best-effort conversion of tex to unicode plain text.

Source code in k3down2/down2.py

def tex_to_plain(tex: str) -> str:
    """
    Try hard converting tex to unicode plain text.
    """

    for reg, cate in (
        (r"_\{([^}]*?)\}", subscripts),
        (r"[\^]\{([^}]*?)\}", superscripts),
        (r"_(.)", subscripts),
        (r"[\^](.)", superscripts),
    ):
        pieces = []
        while True:
            match = re.search(reg, tex, flags=re.DOTALL | re.UNICODE)
            if match:
                chars = match.groups()[0]
                if all_in(chars, cate):
                    chars = [cate[x] for x in chars]
                else:
                    chars = tex[match.start() : match.end()]
                pieces.append(tex[: match.start()])
                pieces.append("".join(chars))
                tex = tex[match.end() :]
            else:
                pieces.append(tex)
                break

        tex = "".join(pieces)

    return LatexNodes2Text().latex_to_text(tex)
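
The loop above rewrites subscripts and superscripts to Unicode characters before handing the rest to `LatexNodes2Text`. The sketch below illustrates only that first stage. It uses `re.sub` with a callback instead of the manual search loop, and the two lookup tables are small made-up subsets, not the library's `subscripts` and `superscripts` tables. The helper name `to_unicode_scripts` is hypothetical.

```python
import re

# Illustrative subsets only; the real tables in k3down2 are larger.
SUB = {"0": "₀", "1": "₁", "2": "₂", "i": "ᵢ", "n": "ₙ"}
SUP = {"0": "⁰", "1": "¹", "2": "²", "n": "ⁿ"}


def to_unicode_scripts(tex: str) -> str:
    # Same pattern order as tex_to_plain: braced forms first, then single chars.
    for pattern, table in (
        (r"_\{([^}]*?)\}", SUB),
        (r"\^\{([^}]*?)\}", SUP),
        (r"_(.)", SUB),
        (r"\^(.)", SUP),
    ):
        def repl(match, table=table):
            chars = match.group(1)
            # Convert only if every character has a Unicode counterpart;
            # otherwise keep the original tex text untouched.
            if all(c in table for c in chars):
                return "".join(table[c] for c in chars)
            return match.group(0)

        tex = re.sub(pattern, repl, tex, flags=re.DOTALL)
    return tex


print(to_unicode_scripts("x_{12} + y^2"))  # x₁₂ + y²
print(to_unicode_scripts("a_{xyz}"))       # a_{xyz} (left as-is)
```

Leaving unconvertible groups untouched means a later LaTeX-to-text pass still sees valid tex for them.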

`tex_to_zhihu(tex, block)`

Convert tex source to an `<img>` tag that links to an svg on zhihu. www.zhihu.com/equation is a public API that renders tex into svg.

Parameters:

Name	Type	Description	Default
`tex`	`str`	tex source	required
`block`	`bool`	whether to render a block(center-aligned) equation or inline equation.	required

Returns:

Type	Description
`str`	string of an `<img>` tag.

Source code in k3down2/down2.py

def tex_to_zhihu(tex: str, block: bool) -> str:
    """
    Convert tex source to an ``<img>`` tag that links to an svg on zhihu.
    www.zhihu.com/equation is a public API that renders tex into svg.

    Args:
        tex(str): tex source

        block(bool): whether to render a block(center-aligned) equation or
            inline equation.

    Returns:
        string of a ``<img>`` tag.
    """

    p = _zhihu_tex_params(tex, block)
    return zhihu_equation_fmt.format(**p)

`tex_to_zhihu_compatible(tex)`

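Convert tex to a zhihu-compatible format. A `>` in the img alt text breaks the next escaped brace (`\{ q > 1 \}` becomes `\{ q > 1 }`), so every unescaped `>` is replaced with `\gt` and newlines are removed. Returns the converted tex together with its URL-quoted form.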

Source code in k3down2/down2.py

def tex_to_zhihu_compatible(tex: str) -> tuple[str, str]:
    r"""
    Convert tex to a zhihu-compatible format.
    - ``>`` in img alt messes up the next escaped brace: ``\{ q > 1 \} --> \{ q > 1 }``.
    """

    tex = re.sub(r"\n", "", tex)
    tex = re.sub(r"(?<!\\)>", r"\\gt", tex)
    texurl = urllib.parse.quote(tex)
    return tex, texurl
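
Since this function uses only the standard library, its effect is easy to check. The sketch below repeats the body shown above as a local copy, so it runs without k3down2 installed, and applies it to the example from the docstring.

```python
import re
import urllib.parse


def tex_to_zhihu_compatible(tex: str) -> tuple[str, str]:
    # Local copy of the function above, for a standalone demonstration.
    tex = re.sub(r"\n", "", tex)
    # Replace ">" with "\gt" unless it is already escaped as "\>".
    tex = re.sub(r"(?<!\\)>", r"\\gt", tex)
    texurl = urllib.parse.quote(tex)
    return tex, texurl


tex, texurl = tex_to_zhihu_compatible(r"\{ q > 1 \}")
print(tex)     # \{ q \gt 1 \}
print(texurl)  # %5C%7B%20q%20%5Cgt%201%20%5C%7D
```

The first element is suited to the `alt` text, and the second is the percent-encoded form for use in a URL.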

`tex_to_zhihu_url(tex, block)`

Convert tex source to a URL linking to an svg on zhihu. www.zhihu.com/equation is a public API that renders tex into svg.

Parameters:

Name	Type	Description	Default
`tex`	`str`	tex source	required
`block`	`bool`	whether to render a block(center-aligned) equation or inline equation.	required

Returns:

Type	Description
`str`	URL of the rendered svg.

Source code in k3down2/down2.py

def tex_to_zhihu_url(tex: str, block: bool) -> str:
    """
    Convert tex source to a url linking to a svg on zhihu.
    www.zhihu.com/equation is a public api to render tex into svg.

    Args:
        tex(str): tex source

        block(bool): whether to render a block(center-aligned) equation or
            inline equation.

    Returns:
        URL of the rendered svg.
    """

    p = _zhihu_tex_params(tex, block)
    return zhihu_equation_url_fmt.format(texurl=p["texurl"], align=p["align"])

`web_to_img(pagefn, typ)`

Render a local web page, such as an html or svg file, to an image.

Parameters:

Name	Type	Description	Default
`pagefn`	`str`	path to a local file that can be rendered in a browser.	required
`typ`	`str`	specifies the output image type, such as "png" or "jpg".	required

Returns:

Type	Description
`bytes`	bytes of the image data

Source code in k3down2/down2.py

def web_to_img(pagefn: str, typ: str) -> bytes:
    """
    Render a local web page, such as an html or svg file, to an image.

    Args:
        pagefn(string): path to a local file that can be rendered in a browser.

        typ(string): specify output image type such as "png", "jpg"

    Returns:
        bytes of the image data
    """

    intyp = pagefn.rsplit(".")[-1]
    page = fread(pagefn)
    return render_to_img(intyp, page, typ)
