Conversion of markdown files to notebooks about geopyter HOT 7 CLOSED

pysal commented on September 22, 2024

Conversion of markdown files to notebooks

from geopyter.

Comments (7)

ljwolf commented on September 22, 2024

huh, wild. Neat package.

IIRC, @jreades had suggested:

@include{resource = https://imamarkdownnotebook.com/txt.md, 
start = # Inserting data into lists,
stop = #
**options
}
@include{resource = ../me_too.ipynb, 
start = ## Including raw notebooks by converting them to markdown first
stop = ###
}

where
resource supported a URL or a local file address.
start was a specific target tag in the markdown.
stop was a more general stop-parse criterion: either a specific section number or a relative/computed criterion, like the next header at equal level, the next h1 tag, etc.

Thinking about this, I like that syntax a lot. The semantics of stop and start, like how valid options are encoded, are the tricky part of it.

What seems simplest is case- and leading-space-insensitive string matching for start. This would be the default, and would match on everything after the equals sign up to the newline. This'd make it easy to do the generic "Grab this section" action, and avoid repetitive typing of quotes.

Harder targets, like substring/subsection matches, multiline targets, or raw regexps, could be handled by a special starting delimiter, like =sub, =multi, =re, maybe?

If you use the same semantics for stop, then, you get automatic "to next X" behavior. Like, in the statements above, the first include would go from the level-1 header "inserting data into lists" to the next level-1 header. The second include would go from the level-2 header "including raw notebooks by converting them to markdown first", to the next level-3 header.
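A rough sketch of those start/stop semantics, in case it helps the discussion. Everything here is an assumption rather than geopyter code: the names `extract` and `_matches` are hypothetical, headers are taken to be lines that begin with one or more `#`, and a bare run of hashes in stop is read as "the next header at exactly that level". It does not try to skip `#` lines inside code blocks.

```python
import re

# A header line: one or more '#', optional space, then the title text.
HEADER = re.compile(r"^(#+)\s*(.*)$")


def _matches(line, target):
    """True if `line` is a header matching `target`.

    Matching ignores case and surrounding whitespace. A target made only
    of hashes (e.g. "#") matches any header at that level.
    """
    found = HEADER.match(line.strip())
    wanted = HEADER.match(target.strip())
    if not found or not wanted:
        return False
    if found.group(1) != wanted.group(1):  # header levels must agree
        return False
    if wanted.group(2).strip() == "":      # bare hashes: any title will do
        return True
    return found.group(2).strip().lower() == wanted.group(2).strip().lower()


def extract(markdown, start, stop):
    """Return the text from the `start` header up to, but not including,
    the next header that matches `stop` (or the end of the document)."""
    lines = markdown.splitlines()
    begin = next((i for i, line in enumerate(lines) if _matches(line, start)), None)
    if begin is None:
        raise ValueError(f"start target not found: {start!r}")
    end = len(lines)
    for j in range(begin + 1, len(lines)):
        if _matches(lines[j], stop):
            end = j
            break
    return "\n".join(lines[begin:end])
```

With this reading, `start = # Inserting data into lists` and `stop = #` would pull the whole level-1 section, subsections included, because only another level-1 header ends it.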

The stuff in options might make more sense if compared to supported preprocessing options in notedown or something like knitr/rmarkdown.

jreades commented on September 22, 2024

Having given it some more thought, I am definitely leaning towards including (probably a better term than importing!) on the basis of structure, not start/stop… A little bit like CSS selectors:

@include {
‘resource’ = ‘…’,
‘select’ = ‘h1.Lists’
}

@include {
‘resource’ = ‘…’,
‘select’ = "h1.Lists h3.Let’s Code"
}

So in the first case you would just get everything sitting under the ‘#Lists’ header, whereas in the second you would get whatever comes under the “###Let’s Code” header that is itself under the ‘#Lists’ section. That allows you to disambiguate subsections with the same name (e.g. #Dictionaries … ###Let’s Code) and also means that you don’t need to think about start/stop semantics, just “Grab everything at this level or ‘below’ structurally”. And if stuff gets moved around inside the main files your imports don’t fail either!

In the long run this could be extended with a suppression syntax like:
‘deselect’ = “h3.Let’s Code”

Such that the ###Let’s Code would be dropped (or ‘suppress’ed) from the #Lists import.

And, still sticking with the CSS ‘metaphor’:

@include {
‘resource’ = ‘…’,
‘select’ = "h1.Lists h3.Let’s Code, h1.Dictionaries h3.Let’s Code"
}

That would select multiple subsections from the resource.
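To make the CSS-like idea concrete, here is a minimal sketch of how such a select string might be resolved against markdown text. It is only an illustration under stated assumptions: `parse_sections`, `parse_select` and `select_sections` are hypothetical names, a step is written `hN.Title`, a space between steps means "somewhere underneath", a comma separates independent statements, and titles are compared case-insensitively with curly and straight apostrophes treated alike. Titles that themselves contain a comma are not handled.

```python
import re
from typing import List, NamedTuple, Tuple

HEADER = re.compile(r"^(#{1,6})\s*(.+?)\s*$")
# One selector step such as "h3.Let's Code"; the title runs until the next step.
STEP = re.compile(r"h([1-6])\.(.*?)(?=\s+h[1-6]\.|$)")


class Section(NamedTuple):
    level: int
    title: str
    start: int  # index of the header line
    end: int    # index one past the last line of the section


def _norm(title: str) -> str:
    # Ignore case and treat the curly apostrophe like a straight one.
    return title.replace("\u2019", "'").strip().lower()


def parse_sections(lines: List[str]) -> List[Section]:
    heads = [(len(m.group(1)), m.group(2), i)
             for i, line in enumerate(lines)
             if (m := HEADER.match(line))]
    sections = []
    for k, (level, title, start) in enumerate(heads):
        # A section runs until the next header at the same or a higher level.
        end = next((s for lvl, _, s in heads[k + 1:] if lvl <= level), len(lines))
        sections.append(Section(level, title, start, end))
    return sections


def parse_select(select: str) -> List[List[Tuple[int, str]]]:
    chains = []
    for statement in select.split(","):
        steps = [(int(lvl), _norm(title))
                 for lvl, title in STEP.findall(statement.strip())]
        if steps:
            chains.append(steps)
    return chains


def _matches(chain, section, sections) -> bool:
    if (section.level, _norm(section.title)) != chain[-1]:
        return False
    if len(chain) == 1:
        return True
    # Enclosing sections are those whose line range contains this header.
    ancestors = [s for s in sections if s.start < section.start < s.end]
    return any(_matches(chain[:-1], a, sections) for a in ancestors)


def select_sections(markdown: str, select: str) -> List[str]:
    """Return the text of every section matched by any statement in `select`."""
    lines = markdown.splitlines()
    sections = parse_sections(lines)
    chains = parse_select(select)
    return ["\n".join(lines[s.start:s.end])
            for s in sections
            if any(_matches(c, s, sections) for c in chains)]
```

Because matching is by structure rather than by line position, moving a section around inside the source file would not change what a statement like `h1.Lists h3.Let’s Code` picks up.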

jon

(In reply to Levi John Wolf's message of 20 Sep 2016, 19:42.)

jreades commented on September 22, 2024

Also, there’s this interesting section on customising the cell metadata associated with a notebook (which could definitely be useful for selection and/or formatting):

https://nbconvert.readthedocs.io/en/latest/customizing.html#Templates-that-use-cell-metadata

sjsrey commented on September 22, 2024

Let's say the source atom has something like the following structure

h1 Lists
h3 Let's Code
h3 Other

h1 Dictionaries
h3 Let's Code
h3 Other

then a suppression syntax such as

@include {
‘resource’ = ‘…’,
‘select’ = "h1.Lists -h3.Let’s Code, h1.Dictionaries h3.Let’s Code"
}

would result in the selection:

h1 Lists
h3 Other

h1 Dictionaries
h3 Let's Code

whereas

@include {
‘resource’ = ‘…’,
‘select’ = "h1.Lists, h1.Dictionaries, h3 Let's Code"
}

gets:

h1 Lists
h3 Let's Code
h3 Other

h1 Dictionaries
h3 Let's Code

and

@include {
‘resource’ = ‘…’,
‘select’ = "-h1.Lists, h3 Other, h1.Dictionaries, h3 Let's Code"
}

gets:

h3 Let's Code

h1 Dictionaries
h3 Let's Code

In other words the rules would be

If you specify only the parent, you get the parent and all children.
If you specify a parent and a child, you get only the parent and the specified child (other children are suppressed).
If you specify a parent and negate a child, you get the parent and any other children but not the negated child.
If you negate a parent you must specify one or more children to include. If you want to suppress a given level (h1), you simply do not include it in select, and then all its children are omitted as well.
If you had a parent with, say, 5 h3s and you wanted 4 of the h3s but not the parent, it would be something like: select=-h1 -h3.Not wanted
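One way to read these rules, sketched for a single parent and its direct children. This is only an interpretation: the function `resolve` and its parameters are hypothetical, and the select string is assumed to have been parsed already into an include list, an exclude list and a "negate parent" flag.

```python
def resolve(parent, children, include=(), exclude=(), negate_parent=False):
    """Return the headers kept for one parent, in document order.

    - no include/exclude: the parent and all children
    - include given:      only the listed children survive
    - exclude given:      every child except the negated ones survives
    - negate_parent:      the parent header itself is dropped
    """
    if include:
        kept = [child for child in children if child in include]
    else:
        kept = [child for child in children if child not in exclude]
    header = [] if negate_parent else [parent]
    return header + kept
```

For the first example above, `resolve("Lists", ["Let's Code", "Other"], exclude=["Let's Code"])` keeps Lists and Other, while `resolve("Dictionaries", ["Let's Code", "Other"], include=["Let's Code"])` keeps Dictionaries and Let's Code.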

jreades commented on September 22, 2024

(In reply to Sergio Rey's message of 21 Sep 2016, 13:42, on the proposed rules.)

Yes, that sounds good to me.

One thing that I think you've actually got right in your mental model and just typed out differently (and just to be particular) has to do with the placement of commas. Let's say you have your document:

#Lists
###Let's code

#Dictionaries
###Let's code

#Lists-of-Lists
###Let's code

If the bit you want is only the Let's Code in Lists then your selection statement should be "h1.Lists h3.Let's Code" with no intervening comma. The comma distinguishes between 'statements', so if you had "h1.Lists, h3.Let's Code" then I would expect that to include everything under #Lists and all three ###Let's Code sections regardless of where they are in the notebook. That style gives maximum flexibility and specificity. I guess it also means we need to look out for potentially duplicate included sections...
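On the duplicate-section worry, one possible approach is to treat each selected section as a half-open range of line indices and merge overlapping ranges before copying any text. The sketch below assumes that representation; `merge_ranges` is a hypothetical helper, not existing geopyter code.

```python
def merge_ranges(ranges):
    """Merge overlapping or touching (start, end) line ranges.

    Ranges are half-open: `start` is the header line index and `end` is
    one past the last line of the section.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps (or directly follows) the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For the three-section document above, "h1.Lists, h3.Let's Code" would select #Lists as lines 0 to 2 plus each ###Let's code on its own, and the copy nested inside #Lists would be absorbed into the #Lists range instead of being emitted twice.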

Jon

sjsrey commented on September 22, 2024

Good catch, I was overlooking that kind of flexibility that using the comma to delimit statements brings.

sjsrey commented on September 22, 2024

After some exploration, it seems parsing the notebooks is pretty straightforward (see #8 and here).

Because of this, I think having a template approach where the template notebook is the skeleton that has cells with the @include syntax makes a lot of sense.

Fleshing this out, the question of what cell type we should use for the @includes comes to mind.

If we use raw and specify the include as a dict, then using the json module inside a parser would handle this. But maybe we should split this issue up, as this thread might be getting too horizontal?
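A minimal sketch of what the raw-cell idea could look like, using only the standard json module on the notebook file's JSON. The layout is an assumption made for illustration: each include is taken to live in a raw cell whose whole body is a JSON object with a single "@include" key, and `find_includes` is a hypothetical name. The "cells", "cell_type" and "source" fields follow the version 4 notebook format, where "source" may be a string or a list of strings.

```python
import json


def find_includes(notebook_json):
    """Return the include specs found in the raw cells of a notebook.

    `notebook_json` is the text of an .ipynb file. Raw cells that are not
    valid JSON, or that lack the assumed "@include" key, are skipped.
    """
    notebook = json.loads(notebook_json)
    includes = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "raw":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored line by line
            source = "".join(source)
        try:
            spec = json.loads(source)
        except json.JSONDecodeError:
            continue  # an ordinary raw cell, not an include
        if isinstance(spec, dict) and isinstance(spec.get("@include"), dict):
            includes.append(spec["@include"])
    return includes
```

One consequence of writing the include as JSON is that keys and values would need straight double quotes, unlike the looser quoting in the examples earlier in this thread.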

Going to begin to split this off into separate issues

#11 Specification of @include (basically what we were discussing here before this split)
#10 Reading, writing, sub-setting notebooks
#9 Structure of the atom notebooks

Let's keep this open, and add any separate issues into the previous list. Once the granularity is clear we can close this one.
