A Homemade Markup Programming Language - Lineup v2.2: New Features && How It Works

Albert

0x00 Overview

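Lineup project page (stars and downloads welcome): https://gitee.com/albert_zhong/lineup
Lineup source on CDN (minified build): http://lib.albertz.top/lineup/lineup-min.js
Intro video for the first-generation version of Lineup (markup): https://www.bilibili.com/video/BV1sW4y1L7a3
The snippet below demonstrates most of Lineup's features: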

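#(2) About Lineup
		p() A simple, elegant, lightweight, and open markup language originally developed by AlbertZ. It can be used for front-end development, rich-text writing, and similar scenarios; it can be embedded directly in HTML and replaces much of what HTML does.
#(2) Why choose Lineup?
ul(class: list) \
		li() Simple: a small, uniform syntax with few special symbols and natural nesting; quick to pick up with or without a programming background
		li() Elegant: design principles similar to YAML and Python, with enforced indentation that removes much of HTML's repetition
		li() Lightweight: the compiler core is only 8 kB, written in JavaScript, works on both the front end and the back end, and can be embedded directly in a web page
		li() Open: an open-source project that welcomes derivative work; the syntax is extensible, and editing the constants file alone is enough to customize your own compiler
#(2) Download and usage
p() The source code for the latest version of Lineup can be downloaded from { @("https://gitee.com/albert_zhong/lineup", target:new) https://gitee.com/albert_zhong/lineup }.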

To use it in your HTML code:
First, write the Lineup code inside a lineup tag in the DOM:

<lineup style="display:none;">
	p() test
</lineup>

Then compile the document:

<script src="http://lib.albertz.top/lineup/lineup-min.js"></script>
<script>Lineup.compileDocument(); </script>

The basic Lineup syntax is element-name(arguments) content:

#(1) TEST HEADING
p(class:content) THE CONTENT

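0x01 New Features

Omitting the element name

The element name is now optional. When it is omitted, the element defaults to span.

() This is a piece of code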
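
To make the "element-name(arguments) content" shape concrete, here is a minimal sketch of how one line could be split into its parts. It is an illustration only, not the project's parseLine: the function name splitLine, the regular expression, and the field names are all assumptions. The only behaviour taken from the text above is that an omitted element name falls back to span.

```javascript
// Hypothetical sketch: split one Lineup line into indent, tag name, arguments, and content.
// NOT the project's parseLine; the regex and the returned field names are assumptions.
function splitLine(line) {
    // leading whitespace, optional element name, "(args)", optional space, then content
    const match = /^(\s*)([^\s()]*)\(([^)]*)\)\s?(.*)$/.exec(line);
    if (!match) return null; // the line does not follow the basic syntax
    const [, indent, name, args, content] = match;
    return {
        indent: indent.length,      // number of leading whitespace characters
        tagname: name || 'span',    // an omitted name defaults to span
        arglist: args.split(',').map(s => s.trim()).filter(Boolean),
        content: content,
    };
}

console.log(splitLine('#(1) TEST HEADING'));
console.log(splitLine('p(class:content) THE CONTENT'));
console.log(splitLine('() This is a piece of code'));
```

The sketch treats everything between the first pair of parentheses as the argument list, so arguments that themselves contain a closing parenthesis are outside its scope.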

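Inline elements

This effectively addresses the problem of bloated Lineup code: multiple elements can be defined within a single line using curly braces.

p() Inline syntax example { (color:red) This is a piece of code }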
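
As a rough illustration of what expanding an inline element involves, the sketch below rewrites each "{ name(args) content }" group into an HTML element. It is not Lineup's implementation. The function name expandInline is hypothetical, and it assumes that every key:value argument becomes an HTML attribute, which the project may well handle differently. Nested braces and escaped braces (\{ and \}) are not handled.

```javascript
// Hypothetical sketch of inline expansion; not the project's actual code.
// Assumption: each "key:value" argument becomes the HTML attribute key="value".
function expandInline(text) {
    // "{", optional element name, "(args)", content without braces, "}"
    const inlinePattern = /\{\s*([^\s(){}]*)\(([^)]*)\)\s?([^{}]*?)\s*\}/g;
    return text.replace(inlinePattern, (whole, name, args, content) => {
        const tag = name || 'span'; // an omitted name defaults to span
        const attrs = args.split(',')
            .map(arg => arg.trim())
            .filter(Boolean)
            .map(arg => {
                const i = arg.indexOf(':');
                // arguments without a colon are ignored in this sketch
                return i < 0 ? '' : ` ${arg.slice(0, i).trim()}="${arg.slice(i + 1).trim()}"`;
            })
            .join('');
        return `<${tag}${attrs}>${content}</${tag}>`;
    });
}

console.log(expandInline('Inline syntax example { (color:red) This is a piece of code }'));
```

Doing the expansion with a single regular expression keeps the sketch short; a real implementation would need a small scanner to support nesting and escapes.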

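Python glue plugin and better import support

Development previously used JS modules as the standard. However, module scripts are fetched under CORS rules and cannot, for example, be loaded from a page opened via file://. So I wrote a bundling plugin in Python (link.py) that merges the modules, and the result can be imported directly as lineup-min.js.

Using Lineup directly as an authoring language

The earlier functions already cover the needs of a development language. The compileHTML and compileXML functions added in this release convert pure Lineup code directly into a complete HTML file.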

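0x02 How It Works

To follow this section, you should first be familiar with algorithms for building and traversing trees.

Roughly speaking, any compiler works like this: source code -> syntax tree -> compiled code.

So we need a parse function that builds the syntax tree, and a compose function that traverses the tree and rewrites it in another language.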

function parse(code) {
    // An unusual home-grown take on tree construction: convert all the code into a syntax tree based on indentation
    let lines = code.split('\n');
    let indentIndex = 0; // current indent level, initially 0
    let rootNode = {'content':[], 'indentIndex': -1};
    let openNodeList = [rootNode]; // references to all nodes that are still open
    for (let i = 0; i < lines.length; i++) {
        let line = lines[i];
        if (!line.trim()) continue; // skip blank lines
        let obj = parseLine(line);
        // Check whether this is a leaf node (a leaf's content is a string)
        if (typeof obj.content != 'string') {
            // Non-leaf node: opens the next level; it is attached to the innermost open node, whatever its own indentIndex
            indentIndex += 1;
            openNodeList[openNodeList.length-1].content.push(obj);
            openNodeList.push(obj);
        } else {
            if (obj.indentIndex == indentIndex) {
                // Sibling node at the current level
                openNodeList[openNodeList.length-1].content.push(obj);
            } else {
                // Go back up one or more levels
                openNodeList = openNodeList.slice(0, obj.indentIndex+1);
                openNodeList[openNodeList.length-1].content.push(obj);
                indentIndex = obj.indentIndex;
            }
        }
    }
    openNodeList = []; // release the references; some nodes may still be unclosed
    return rootNode;
}

// Translate one line into a parameterized program representation
function parseLine(line) {...}

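Put simply, this algorithm walks through the code line by line, growing the syntax tree as it goes and applying regular expressions to each line.
The resulting syntax tree looks roughly like this: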

Object { content: (4) […], indentIndex: -1 }
		content: Array(4) [ {…}, {…}, {…}, … ]
		0: Object { tagname: "span", content: "test", indentIndex: 0, … }
		1: Object { tagname: "p", content: "() test", indentIndex: 0, … }
			2: Object { tagname: "span", content: "test", indentIndex: 1, … }
		3: Object { tagname: "p", content: "test", indentIndex: 0, … }
		indentIndex: -1
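
For readers new to the open-node idea, here is a generic sketch of building a tree from indented lines with a stack of open nodes. It is deliberately not Lineup's parser: nesting is inferred purely from indentation rather than from parseLine's result, and the function name buildTree, the node fields, and the four-space indent width are all assumptions made for the illustration.

```javascript
// Generic sketch (not Lineup's parser): build a tree from indented lines.
// Assumption: one indent level is exactly `indentWidth` spaces.
function buildTree(code, indentWidth = 4) {
    const root = { text: null, depth: -1, children: [] };
    const open = [root]; // stack of nodes that can still receive children
    for (const line of code.split('\n')) {
        if (!line.trim()) continue; // skip blank lines
        const leading = line.length - line.trimStart().length;
        const depth = Math.floor(leading / indentWidth);
        const node = { text: line.trim(), depth: depth, children: [] };
        // close every open node that is not an ancestor of this line
        while (open[open.length - 1].depth >= depth) open.pop();
        open[open.length - 1].children.push(node);
        open.push(node);
    }
    return root;
}

const tree = buildTree('ul\n    li a\n    li b\np c');
console.log(JSON.stringify(tree, null, 2));
```

Because the root has depth -1, it is never popped, so every line always finds a parent on the stack.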

The compose algorithm, by contrast, is more like translation:

// Compose the program representation into HTML code
function compose(node, inline=false){
    if (inline) {
        ...
        }
        let htmlarg = joinArgs(item.tagname, item.arglist);
        let code = `<${htmltag}${htmlarg}>${content}</${htmltag}>\n`;
        return code;
    }
    for (let item of node) {
        let htmltag, content;
        if (Const.ordinaryElements[item.tagname]) { htmltag = Const.ordinaryElements[item.tagname];}
        else { htmltag = item.tagname;}
        content = item.content;
        item.htmltag = htmltag;
        // Special handling for the h tag
        if (htmltag == 'h') {
            ...
        }
        let htmlarg = joinArgs(item.tagname, item.arglist);
        if (item.indentIndex < indentIndex) {
            while (item.indentIndex < indentIndex) {
               ....
            }
        }
        if (typeof content == 'string') {
            content = content.replaceAll('\\{', '{').replaceAll('\\}', '}');
            htmlcode += `${repeatString(Const.indentSymbol, item.indentIndex)}<${htmltag}${htmlarg}>${content}</${htmltag}>\n`;
            indentIndex = item.indentIndex;
        } else if (typeof content != 'string') {
            htmlcode += `${repeatString(Const.indentSymbol, item.indentIndex)}<${htmltag}${htmlarg}>\n`;
            openNodeList.push(item);
            indentIndex = item.indentIndex;
        }
        if (item.content && typeof item.content != 'string' && item.content.length) 
            ...
    }
    return htmlcode;
}
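
The translation step can also be pictured with a much smaller, recursive sketch. This is not the compose function above, which tracks open nodes and indent levels by hand. It assumes a simplified node shape of { tagname, content }, where content is either a string or an array of child nodes, and it ignores arguments, the h tag, and inline elements. The name composeSketch is hypothetical.

```javascript
// Generic sketch (not Lineup's compose): emit HTML from a small node tree.
// Assumption: a node is { tagname, content }, where content is either a
// string (leaf) or an array of child nodes.
function composeSketch(nodes, depth = 0, indent = '    ') {
    let html = '';
    for (const node of nodes) {
        const pad = indent.repeat(depth);
        if (typeof node.content === 'string') {
            // leaf: opening tag, text, and closing tag on one line
            html += `${pad}<${node.tagname}>${node.content}</${node.tagname}>\n`;
        } else {
            // non-leaf: recurse into the children one level deeper
            html += `${pad}<${node.tagname}>\n`;
            html += composeSketch(node.content, depth + 1, indent);
            html += `${pad}</${node.tagname}>\n`;
        }
    }
    return html;
}

const html = composeSketch([
    { tagname: 'ul', content: [
        { tagname: 'li', content: 'simple' },
        { tagname: 'li', content: 'elegant' },
    ] },
    { tagname: 'p', content: 'done' },
]);
console.log(html);
```

Recursion closes each tag automatically when the call returns, which is the job the explicit open-node bookkeeping does in an iterative version.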

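Finally, the Python glue code is worth a mention. In only about 30 lines it handles most cases of converting modules into plain JS code: it uses regular-expression matching to remove import statements and rewrite export statements, and uses an object to achieve a namespace-like effect.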

# Python glue plugin code
import re
...
# Convert a module into plain code
def exportToObject(name, code):
    # Remove every import statement
    code = re.sub(r'import .*;', '', code)
    # Turn the export statement into an object
    res = re.findall(r'(export\s*{)(.*)}', code, re.S)[0]
    head = res[0]
    lines = res[1]
    code = code.replace(head, 'const '+name+' = {')
    for line in lines.split('\n'):
        # Each exported name is expected on its own line, followed by a comma
        export = re.findall(r'\s*(.*),', line)
        if not export: continue
        export = export[0]
        lineObject = f'    {export}: {export},'
        code = code.replace(line, lineObject)
    return code + '\n'

...
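
For comparison, the same module-to-object idea can be sketched in JavaScript. This is an illustration rather than a port of link.py: it assumes the module contains a single "export { a, b };" statement with plain names (no "as" renaming and no default export), and the regular expressions are my own.

```javascript
// Sketch of the module-to-object idea in JavaScript; an illustration, not link.py itself.
// Assumption: the module has one "export { a, b };" statement with plain names.
function exportToObject(name, code) {
    // drop every line that is an import statement
    code = code.replace(/^[ \t]*import .*;[ \t]*\n?/gm, '');
    // turn "export { a, b };" into "const name = { a: a, b: b };"
    return code.replace(/export\s*\{([^}]*)\}\s*;?/, (whole, body) => {
        const names = body.split(',').map(s => s.trim()).filter(Boolean);
        const props = names.map(n => `    ${n}: ${n},`).join('\n');
        return `const ${name} = {\n${props}\n};`;
    });
}

const moduleCode = [
    "import { Const } from './const.js';",
    'function parse() {}',
    'function compose() {}',
    'export { parse, compose };',
    '',
].join('\n');
console.log(exportToObject('Lineup', moduleCode));
```

The resulting object plays the role of a namespace: after bundling, callers reach the functions as Lineup.parse and Lineup.compose instead of importing them.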

Codemao community post: https://shequ.codemao.cn/community/530349