
k3down2


Convert markdown segments into images and other transferable media.

k3down2 is a component of the pykit3 project, a Python 3 toolkit set.

Dependencies

External tools required:

  • pandoc - render markdown snippets to HTML (tables, etc.)
  • graphviz - render graphviz diagrams to images
  • mmdc - convert mermaid charts to SVG (mermaid-js)

Installation

pip install k3down2

API Reference

k3down2

k3down2 is a utility that converts markdown segments into easy-to-transfer media such as images. It depends on:

  • pandoc to render markdown snippets, such as tables, to html.
  • graphviz to render graphviz source to images.
  • playwright (chromium) to render svg/html to png.
  • mmdc to convert mermaid charts to svg. See: https://mermaid-js.github.io/mermaid/#

download(url)

Download content from url and return the response data.

Parameters:

Name  Type  Description                      Default
url   str   the url from which to download.  required

Returns:

Type   Description
bytes  bytes of downloaded data.

Source code in k3down2/down2.py
def download(url: str) -> bytes:
    """
    Download content from ``url`` and return the responded data.

    Args:
        url(str): the url from which to download.

    Returns:
        bytes of downloaded data.
    """

    resp = urllib.request.urlopen(url, timeout=30)
    return resp.read()
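
As a quick illustration of the same urllib call, the sketch below fetches a data: URL, so it runs without network access. The helper name fetch is hypothetical and only mirrors the listing above; unlike the listing, it closes the response with a context manager.

```python
import urllib.request


def fetch(url: str, timeout: float = 30) -> bytes:
    # Hypothetical stand-in for download(): open the URL and read the whole body.
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


# urllib handles data: URLs itself, so no server is needed for this demo.
payload = fetch("data:text/plain;base64,aGVsbG8=")
print(payload)
```

With the real package installed, the equivalent call would be k3down2.download(url) with an http(s) URL.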

graphviz_to_img(gv, typ)

Render graphviz source to image.

Requires

brew install graphviz

Source code in k3down2/down2.py
def graphviz_to_img(gv: str | bytes, typ: str) -> bytes:
    """
    Render graphviz source to image.

    Requires:
        brew install graphviz
    """

    _, out, _ = k3proc.command_ex(
        "dot",
        "-T" + typ,
        input=to_bytes(gv),
        text=False,
    )
    return out
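
The listing pipes the graphviz source to the dot command. A minimal sketch of the same idea using only the standard library follows; it assumes dot is on PATH and uses subprocess instead of k3proc, and the name dot_to_img is hypothetical.

```python
import shutil
import subprocess


def dot_to_img(gv: str, typ: str) -> bytes:
    # Pipe the graphviz source to `dot -T<typ>` and return the raw output bytes.
    proc = subprocess.run(
        ["dot", "-T" + typ],
        input=gv.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout


if shutil.which("dot") is not None:
    svg = dot_to_img("digraph { a -> b }", "svg")
    print(svg[:40])
else:
    # Without graphviz installed there is nothing to render.
    svg = None
    print("dot not found; install graphviz first")
```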

md_to_html(md)

Build markdown source into html.

Parameters:

Name  Type  Description       Default
md    str   markdown source.  required

Returns:

Type  Description
str   str of html

Source code in k3down2/down2.py
def md_to_html(md: str) -> str:
    """
    Build markdown source into html.

    Args:
        md(str): markdown source.

    Returns:
        str of html
    """

    _, html, _ = k3proc.command_ex(
        "pandoc",
        "-f",
        "markdown",
        "-t",
        "html",
        input=md,
    )

    return html_style + html

mdtable_to_barehtml(md)

Build a markdown table into html without style.

Parameters:

Name  Type  Description       Default
md    str   markdown source.  required

Returns:

Type  Description
str   str of html

Source code in k3down2/down2.py
def mdtable_to_barehtml(md: str) -> str:
    """
    Build markdown table into html without style.

    Args:
        md(str): markdown source.

    Returns:
        str of html
    """

    # A table with wide column will cause pandoc to produce ``colgroup`` tag, which is not recognized by zhihu.
    # Reported in:
    #      https://github.com/drmingdrmer/md2zhihu/issues/22
    #
    # Thus we have to set a very big rendering window to disable this behavior
    #      https://github.com/jgm/pandoc/issues/2574

    _, html, _ = k3proc.command_ex(
        "pandoc",
        "-f",
        "markdown",
        "-t",
        "html",
        "--column",
        "100000",
        input=md,
    )
    lines = html.strip().split("\n")
    lines = [x for x in lines if x not in ("<thead>", "</thead>", "<tbody>", "</tbody>")]

    return "\n".join(lines)
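
The post-processing step at the end of the listing can be tried without pandoc. The sketch below applies the same line filter to a hand-written HTML table; the sample input and the name strip_table_sections are assumptions for illustration, and real pandoc output may differ (for example, it may carry attributes on some tags).

```python
def strip_table_sections(html: str) -> str:
    # Drop lines that consist solely of a thead/tbody wrapper tag.
    unwanted = ("<thead>", "</thead>", "<tbody>", "</tbody>")
    lines = html.strip().split("\n")
    return "\n".join(x for x in lines if x not in unwanted)


# Hand-written sample input, one tag per line.
sample = "\n".join([
    "<table>", "<thead>", "<tr>", "<th>a</th>", "</tr>", "</thead>",
    "<tbody>", "<tr>", "<td>1</td>", "</tr>", "</tbody>", "</table>",
])
bare = strip_table_sections(sample)
print(bare)
```

Because the filter compares whole lines, a wrapper tag that shares a line with other markup is left untouched.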

mermaid_to_svg(mmd)

Render mermaid to svg. See: https://mermaid-js.github.io/mermaid/#

Requires

npm install @mermaid-js/mermaid-cli

Source code in k3down2/down2.py
def mermaid_to_svg(mmd: str) -> str:
    """
    Render mermaid to svg.
    See: https://mermaid-js.github.io/mermaid/#

    Requires:
        npm install @mermaid-js/mermaid-cli
    """

    with tempfile.TemporaryDirectory() as tdir:
        output_path = os.path.join(tdir, "mmd.svg")

        puppeteer_config = {
            "args": [
                "--no-sandbox",
                "--disable-setuid-sandbox",
                "--disable-dev-shm-usage",
                "--disable-gpu",
            ]
        }

        config_file_path = os.path.join(tdir, "config.json")
        with open(config_file_path, "w") as f:
            f.write(json.dumps(puppeteer_config))

        k3proc.command_ex(
            "npm",
            "exec",
            "--",
            "mmdc",
            "-o",
            output_path,
            "--puppeteerConfigFile",
            config_file_path,
            input=mmd,
        )
        return fread(output_path)
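
The listing hands the Chromium flags to mmdc through a temporary JSON file. The sketch below shows only that handoff (writing the config into a temporary directory and reading it back); it does not invoke mmdc, and it uses json.dump where the listing writes json.dumps output.

```python
import json
import os
import tempfile

# Same Chromium flags as in the listing above.
puppeteer_config = {
    "args": [
        "--no-sandbox",
        "--disable-setuid-sandbox",
        "--disable-dev-shm-usage",
        "--disable-gpu",
    ]
}

with tempfile.TemporaryDirectory() as tdir:
    config_file_path = os.path.join(tdir, "config.json")
    with open(config_file_path, "w") as f:
        json.dump(puppeteer_config, f)

    # Read the file back, as an external tool would.
    with open(config_file_path) as f:
        loaded = json.load(f)
    print(loaded["args"])

# The directory and the config file are removed when the with-block exits.
```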

render_to_img(mime, content, typ, width=1000, height=2000, asset_base=None)

Render content that a browser can display, such as html or svg, into an image. Uses Playwright (Chromium) for rendering and Pillow for image processing.

Parameters:

Name        Type         Description                                                                 Default
mime        str          a full mime type such as image/jpeg or a shortcut jpg.                      required
content     str | bytes  content to render, such as jpeg data or svg source.                         required
typ         str          specifies output image type such as "png", "jpg"                            required
width       int          specifies the window width to render a page.                                1000
height      int          specifies the window height to render a page.                               2000
asset_base  str          specifies the path to assets dir. E.g. the image base path in a html page.  None

Returns:

Type   Description
bytes  bytes of the image data

Source code in k3down2/down2.py
def render_to_img(
    mime: str, content: str | bytes, typ: str, width: int = 1000, height: int = 2000, asset_base: str | None = None
) -> bytes:
    """
    Render content that is renderable in a browser to image.
    Such as html, svg etc into image.
    Uses Playwright (Chromium) for rendering and Pillow for image processing.

    Args:
        mime(str): a full mime type such as ``image/jpeg`` or a shortcut ``jpg``.

        content(str): content to render, such as jpeg data or svg source.

        typ(string): specifies output image type such as "png", "jpg"

        width(int): specifies the window width to render a page. Default 1000.

        height(int): specifies the window height to render a page. Default 2000.

        asset_base(str): specifies the path to assets dir. E.g. the image base path in a html page.

    Returns:
        bytes of the image data
    """

    if "html" in mime:
        content = r'<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>' + content

        if asset_base is not None:
            base_uri = pathlib.Path(asset_base).as_uri()
            content = '<base href="{}/">'.format(base_uri) + content

    m = mimetypes.get(mime) or mime
    suffix = mime_to_suffix.get(m, mime)

    with tempfile.TemporaryDirectory() as tdir:
        fn = os.path.join(tdir, "xxx." + suffix)
        flags = "w"
        if isinstance(content, bytes):
            flags = "wb"
        with open(fn, flags) as f:
            f.write(content)

        browser = _get_browser()
        page = browser.new_page(
            viewport={"width": width, "height": height},
            device_scale_factor=2,
        )
        page.goto(pathlib.Path(fn).as_uri())

        content_height = page.evaluate("document.documentElement.scrollHeight")
        if content_height > height:
            page.set_viewport_size({"width": width, "height": content_height})

        png_data = page.screenshot(omit_background=True)
        page.close()

    return _trim_and_convert(png_data, typ)
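
Before rendering, the listing prepends a charset meta tag and, when asset_base is given, a base tag so that relative asset paths resolve. A standalone sketch of that preparation step follows. The name prepare_html is hypothetical, and the sketch calls os.path.abspath first (an addition not in the listing), because pathlib's as_uri() rejects relative paths.

```python
import os
import pathlib

META = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'


def prepare_html(content, asset_base=None):
    # Declare the charset first, as the listing does for html input.
    content = META + content
    if asset_base is not None:
        # as_uri() needs an absolute path; abspath is an assumption added here.
        base_uri = pathlib.Path(os.path.abspath(asset_base)).as_uri()
        content = '<base href="{}/">'.format(base_uri) + content
    return content


page = prepare_html("<p>hi</p>", asset_base="assets")
print(page)
```

With the real package, the corresponding call would look like k3down2.render_to_img("html", html_source, "png", asset_base="/path/to/assets"), which requires Playwright with Chromium installed.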

tex_to_img(tex, block, typ)

Convert tex source to an image.

Parameters:

Name   Type  Description                                                              Default
tex    str   tex source                                                               required
block  bool  whether to render a block (center-aligned) equation or inline equation.  required
typ    str   output image type such as "png" or "jpg"                                 required

Returns:

Type         Description
str | bytes  bytes of png data.

Source code in k3down2/down2.py
def tex_to_img(tex: str, block: bool, typ: str) -> str | bytes:
    """
    Convert tex source to an image.

    Args:
        tex(str): tex source

        block(bool): whether to render a block(center-aligned) equation or
            inline equation.

        typ(str): output image type such as "png" or "jpg"

    Returns:
        bytes of png data.
    """

    input_type = "tex_block"
    if not block:
        input_type = "tex_inline"
    return convert(input_type, tex, typ)

tex_to_plain(tex)

Make a best effort to convert tex to unicode plain text.

Source code in k3down2/down2.py
def tex_to_plain(tex: str) -> str:
    """
    Try hard converting tex to unicode plain text.
    """

    for reg, cate in (
        (r"_\{([^}]*?)\}", subscripts),
        (r"[\^]\{([^}]*?)\}", superscripts),
        (r"_(.)", subscripts),
        (r"[\^](.)", superscripts),
    ):
        pieces = []
        while True:
            match = re.search(reg, tex, flags=re.DOTALL | re.UNICODE)
            if match:
                chars = match.groups()[0]
                if all_in(chars, cate):
                    chars = [cate[x] for x in chars]
                else:
                    chars = tex[match.start() : match.end()]
                pieces.append(tex[: match.start()])
                pieces.append("".join(chars))
                tex = tex[match.end() :]
            else:
                pieces.append(tex)
                break

        tex = "".join(pieces)

    return LatexNodes2Text().latex_to_text(tex)
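
The first half of the listing maps subscripts and superscripts to Unicode characters, leaving a group untouched when any of its characters has no Unicode counterpart. The sketch below is a simplified version of that step only: the two small lookup tables are assumptions (the library's subscripts and superscripts tables are not shown here), re.sub with a callback replaces the explicit search loop, and the final LatexNodes2Text pass is omitted.

```python
import re

# Tiny illustrative tables; the library's own tables are assumed to be larger.
SUBSCRIPTS = {"0": "₀", "1": "₁", "2": "₂", "i": "ᵢ", "n": "ₙ"}
SUPERSCRIPTS = {"0": "⁰", "1": "¹", "2": "²", "n": "ⁿ"}


def _replace(tex, pattern, table):
    def repl(match):
        chars = match.group(1)
        # Convert only if every character has a Unicode counterpart.
        if all(c in table for c in chars):
            return "".join(table[c] for c in chars)
        return match.group(0)

    return re.sub(pattern, repl, tex, flags=re.DOTALL)


def scripts_to_unicode(tex):
    # Braced groups are handled first, then single-character scripts.
    for pattern, table in (
        (r"_\{([^}]*?)\}", SUBSCRIPTS),
        (r"\^\{([^}]*?)\}", SUPERSCRIPTS),
        (r"_(.)", SUBSCRIPTS),
        (r"\^(.)", SUPERSCRIPTS),
    ):
        tex = _replace(tex, pattern, table)
    return tex


print(scripts_to_unicode("x_{12} + y^2"))
```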

tex_to_zhihu(tex, block)

Convert tex source to an <img> tag that links to an svg on zhihu. www.zhihu.com/equation is a public API that renders tex into svg.

Parameters:

Name   Type  Description                                                              Default
tex    str   tex source                                                               required
block  bool  whether to render a block (center-aligned) equation or inline equation.  required

Returns:

Type  Description
str   string of a <img> tag.

Source code in k3down2/down2.py
def tex_to_zhihu(tex: str, block: bool) -> str:
    """
    Convert tex source to a img tag link to a svg on zhihu.
    www.zhihu.com/equation is a public api to render tex into svg.

    Args:
        tex(str): tex source

        block(bool): whether to render a block(center-aligned) equation or
            inline equation.

    Returns:
        string of a ``<img>`` tag.
    """

    p = _zhihu_tex_params(tex, block)
    return zhihu_equation_fmt.format(**p)

tex_to_zhihu_compatible(tex)

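Convert tex to a zhihu-compatible format and return a tuple of the converted tex and its URL-quoted form.

  • A > in the img alt text messes up the next escaped brace: \{ q > 1 \} --> \{ q > 1 }. Every unescaped > is therefore replaced with \gt.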

Source code in k3down2/down2.py
def tex_to_zhihu_compatible(tex: str) -> tuple[str, str]:
    r"""
    Convert tex to zhihu compatible format.
    - ``>`` in img alt mess up the next escaped brace: ``\{ q > 1 \} --> \{ q > 1 }``.
    """

    tex = re.sub(r"\n", "", tex)
    tex = re.sub(r"(?<!\\)>", r"\\gt", tex)
    texurl = urllib.parse.quote(tex)
    return tex, texurl
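
To see the effect on a concrete input, the sketch below repeats the two substitutions from the listing under a hypothetical name, zhihu_compatible, and runs them on the brace example from the description.

```python
import re
import urllib.parse


def zhihu_compatible(tex):
    # Remove newlines, then replace every unescaped ">" with "\gt".
    tex = re.sub(r"\n", "", tex)
    tex = re.sub(r"(?<!\\)>", r"\\gt", tex)
    # Also return the percent-encoded form for use in a URL.
    return tex, urllib.parse.quote(tex)


tex, texurl = zhihu_compatible("\\{ q > 1 \\}")
print(tex)     # \{ q \gt 1 \}
print(texurl)  # %5C%7B%20q%20%5Cgt%201%20%5C%7D
```

A > that is already escaped with a backslash is left as it is, because of the negative lookbehind in the pattern.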

tex_to_zhihu_url(tex, block)

Convert tex source to a url that links to an svg on zhihu. www.zhihu.com/equation is a public API that renders tex into svg.

Parameters:

Name   Type  Description                                                              Default
tex    str   tex source                                                               required
block  bool  whether to render a block (center-aligned) equation or inline equation.  required

Returns:

Type  Description
str   string of a url linking to the rendered svg.

Source code in k3down2/down2.py
def tex_to_zhihu_url(tex: str, block: bool) -> str:
    """
    Convert tex source to a url linking to a svg on zhihu.
    www.zhihu.com/equation is a public api to render tex into svg.

    Args:
        tex(str): tex source

        block(bool): whether to render a block(center-aligned) equation or
            inline equation.

    Returns:
        string of a url linking to the rendered svg.
    """

    p = _zhihu_tex_params(tex, block)
    return zhihu_equation_url_fmt.format(texurl=p["texurl"], align=p["align"])

web_to_img(pagefn, typ)

Render a local web page file, such as html or svg, into an image.

Parameters:

Name    Type  Description                                              Default
pagefn  str   path to a local file that can be rendered in a browser.  required
typ     str   specifies output image type such as "png", "jpg"         required

Returns:

Type   Description
bytes  bytes of the image data

Source code in k3down2/down2.py
def web_to_img(pagefn: str, typ: str) -> bytes:
    """
    Render a web page, which could be html, svg etc into image.

    Args:
        pagefn(string): path to a local file that can be rendered in a browser.

        typ(string): specify output image type such as "png", "jpg"

    Returns:
        bytes of the image data
    """

    intyp = pagefn.rsplit(".")[-1]
    page = fread(pagefn)
    return render_to_img(intyp, page, typ)

License

The MIT License (MIT) - Copyright (c) 2015 Zhang Yanpo (张炎泼)