Converting Markdown to HTML with JavaScript - restricting supported syntax
I am currently using marked.js to convert Markdown to HTML, so that the users of my web app can create structured content. I am wondering if there is a way to restrict the supported syntax to just a subset, like
- headers
- italic text
- bold text
- lists with only 1 level of indentation
- quotes
I would like to prohibit conversion of lists with multiple levels of indentation, code blocks, headers in lists ...
The reason is that my web app should get users to create content in a specific way, and if it is possible to create some crazily structured content (lists of headers, code in headers, lists of images ...), someone will surely do it.
javascript html syntax markdown
It might be easier to parse it to HTML, then use some DOM queries to see if there are unwanted elements or structures using selectors, e.g. doc.querySelector('li ul') will find a nested ul, 'li ol' a nested ol, etc.
– RobG
yesterday
About 10 years ago I had a similar issue. At the time it was easier to implement our own very simple parser that supported only the 3-4 tags we needed than to try to restrict a library package. There may be better ways now.
– SazooCat
yesterday
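RobG's suggestion could be sketched roughly as follows. The helper name `findForbiddenStructures` and the selector list are made up for illustration, and `doc` is assumed to be any object with a standard DOM-style `querySelector` method.

```javascript
// Hypothetical helper: returns the selectors from `forbidden` that match
// something inside `doc`. `doc` only needs a DOM-style querySelector().
function findForbiddenStructures(doc, forbidden) {
  return forbidden.filter((selector) => doc.querySelector(selector) !== null);
}

// Illustrative selector list (an assumption, adjust it to your own rules).
const FORBIDDEN_SELECTORS = [
  'li ul',                                    // nested unordered list
  'li ol',                                    // nested ordered list
  'pre',                                      // code block
  'li h1, li h2, li h3, li h4, li h5, li h6', // headers inside lists
  'li img',                                   // images inside lists
];
```

In a browser, `doc` could come from `new DOMParser().parseFromString(html, 'text/html')`, where `html` is the converter's output. An empty result array means none of the listed structures were found.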
asked yesterday
karlitos
1 Answer
You have a few different options:
Marked.js uses a multi-step method to parse Markdown. It uses a lexer, which breaks the document up into tokens, a parser to convert those tokens to an abstract syntax tree (AST), and a renderer to convert the AST to HTML. You can override any of those pieces to alter the handling of various parts of the syntax.
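As a rough sketch, the first stages can be run separately so that the token list can be inspected before anything is rendered. This assumes the loaded library object exposes `lexer()` and `parser()`; `renderChecked` and `isAllowed` are hypothetical names, and the library is passed in as an argument rather than imported.

```javascript
// Hypothetical wrapper around the lex -> parse/render pipeline.
// `marked`   : the loaded library (assumed to provide lexer() and parser()).
// `isAllowed`: your own predicate that receives the token list.
function renderChecked(marked, source, isAllowed) {
  const tokens = marked.lexer(source); // Markdown text -> tokens
  if (!isAllowed(tokens)) {
    throw new Error('Markdown uses syntax that is not permitted here');
  }
  return marked.parser(tokens);        // tokens -> HTML string
}
```

Throwing is only one possible policy; the predicate could just as well be used to show a warning to the user before saving.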
For example, if you simply wanted to ignore lists and leave them out of the rendered HTML, replace the list function from the renderer with one which returns an empty string.
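A minimal sketch of that renderer override, assuming the library object exposes a `Renderer` class whose `list` method produces the HTML for a list. `makeListFreeRenderer` is a hypothetical name.

```javascript
// Hypothetical helper: builds a renderer whose list() output is always empty,
// so lists vanish from the generated HTML.
// `marked` is the loaded library, assumed to expose the Renderer class.
function makeListFreeRenderer(marked) {
  const renderer = new marked.Renderer();
  renderer.list = () => ''; // ignore whatever arguments the library passes in
  return renderer;
}
```

The result would then be passed as the `renderer` option when converting, for example `marked(markdown, { renderer: makeListFreeRenderer(marked) })` or `marked.parse(...)`, depending on the version in use.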
Or, if you want the parser to act as if lists are not even a supported feature of Markdown, you could remove the list and listitem methods from the parser. In that case, the list would remain in the output, but would be treated as a paragraph instead.
Or, if you want to support one level of lists, but not nested lists, then you could replace the list and/or listitem methods in the parser with your own implementation that parses lists as you desire.
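One way to approximate this without rewriting list parsing is to reject input whose token tree contains a list inside another list. Note that this is a different technique (inspecting the lexer output) rather than a replacement of the parser's methods. It assumes the nested token shape of recent marked releases, where a `list` token has `items` and block tokens carry child `tokens`; releases that emit a flat token stream would need a different walk. `hasNestedList` is a hypothetical name.

```javascript
// Returns true if any list token appears inside another list token.
// Assumed token shape: { type, tokens?: [...], items?: [...] }.
function hasNestedList(tokens, insideList = false) {
  for (const token of tokens) {
    const isList = token.type === 'list';
    if (isList && insideList) return true;
    // Children live in `items` (list tokens) or `tokens` (most other tokens).
    const children = [...(token.items || []), ...(token.tokens || [])];
    if (hasNestedList(children, insideList || isList)) return true;
  }
  return false;
}
```

The same walk could be extended to flag other unwanted combinations, such as a `heading` or `code` token found while `insideList` is true.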
Note that there are also a number of advanced options, which use the above methods to alter the parser and/or renderer in various ways. For the most part, those options would not provide the features you are asking for, but browsing through the source code might give you some ideas of how to implement your own modifications.
However, there is the sanitize option, which will accept a sanitizer function. You could provide your own sanitizer which removes any unwanted elements from the HTML output. The end result would be similar to overriding the renderer, but it would be implemented differently. Depending on what you want to accomplish, one or the other may be more effective.
answered yesterday
Waylan