Pandoc User’s Guide

Include contents of FILE, verbatim, at the end of the header. This option can be used repeatedly to include multiple files in the header. They will be included in the order specified.

Include contents of FILE, verbatim, at the beginning of the document body (e.g. after the <body> tag in HTML, or the \begin{document} command in LaTeX). This can be used to include navigation bars or banners in HTML documents. This option can be used repeatedly to include multiple files.

List of paths to search for images and other resources. The paths should be separated by : on Linux, UNIX, and macOS systems, and by ; on Windows. If --resource-path is not specified, the default resource path is the working directory.

Note that, if --resource-path is specified, the working directory must be explicitly listed or it will not be searched.

Produce a standalone HTML file with no external dependencies, using data: URIs to incorporate the contents of linked scripts, stylesheets, images, and videos.

Scripts, images, and stylesheets at absolute URLs will be downloaded; those at relative URLs will be sought relative to the working directory (if the first source file is local) or relative to the base URL (if the first source file is remote).

Use reference-style links, rather than inline links, in writing Markdown or reStructuredText. By default inline links are used. The placement of link references is affected by the --reference-location option.

Specify whether footnotes (and references, if reference-links is set) are placed at the end of the current (top-level) block, the current section, or the document.

The default is document. Currently only affects the markdown writer.

The default is to use setext-style headers for levels 1-2, and then ATX headers.

The hierarchy order is part, chapter, then section; all headers are shifted such that the top-level header becomes the specified type. The default behavior is to determine the best division type via heuristics: unless other conditions apply, section is chosen. When the LaTeX document class is set to report, book, or memoir (unless the article option is specified), chapter is implied as the setting for this option.

By default, sections are not numbered. Sections with class unnumbered will never be numbered, even if --number-sections is specified.

Offset for section headings in HTML output (ignored in other output formats). The first number is added to the section number for top-level headers, the second for second-level headers, and so on. Offsets are 0 by default.

Use the listings package for LaTeX code blocks.

The package does not support multi-byte encoding for source code. To handle UTF-8 you would need to use a custom template. This issue is fully documented here: Encoding issue with the listings package.

Make list items in slide shows display incrementally (one by one). The default is for lists to be displayed all at once.

Specifies that headers with the specified level create slides (for beamer, s5, slidy, slideous, dzslides).

Headers above this level in the hierarchy are used to divide the slide show into sections; headers below this level create subheads within a slide. Note that content that is not contained under slide-level headers will not appear in the slide show.

The default is to set the slide level based on the contents of the document; see Structuring the slide show.

See Header identifiers, below.

Specify a method for obfuscating mailto: links in HTML documents. The default is none.

This is useful for preventing duplicate identifiers when generating fragments to be included in other pages.

Link to a CSS style sheet.

A stylesheet is required for generating EPUB. If none is provided using this option (or the stylesheet metadata field), pandoc will look for a file epub.

If it is not found there, sensible defaults will be used.

For best results, the reference docx should be a modified version of a docx file produced using pandoc. The contents of the reference docx are ignored, but its stylesheets and document properties (including margins, page size, header, and footer) are used in the new docx. If no reference docx is specified on the command line, pandoc will look for a file reference. If this is not found either, sensible defaults will be used. To produce a custom reference.

For best results, do not make changes to this file other than modifying the styles used by pandoc: If no reference ODT is specified on the command line, pandoc will look for a file reference. Any template included with a recent install of Microsoft PowerPoint either with. The specific requirement is that the template should contain the following four layouts as its first four layouts:. All templates included with a recent version of MS PowerPoint will fit these criteria.

You can click on Layout under the Home menu to check. You can also modify the default reference. Use the specified image as the EPUB cover. It is recommended that the image be less than px in width and height. The file should contain a series of Dublin Core elements. By default, pandoc will include the following metadata elements: Any of these may be overridden by elements in the metadata file. Embed the specified font in the EPUB. This option can be repeated to embed multiple fonts.

Wildcards can also be used. However, if you use wildcards on the command line, be sure to escape them or put the whole filename in single quotes, to prevent them from being interpreted by the shell. To use the embedded fonts, you will need to add declarations like the following to your CSS (see --css):

The default is to split into chapters at level 1 headers. This option only affects the internal composition of the EPUB, not the way chapters and sections are displayed to users. Some readers may be slow if the chapter files are too large, so for large documents with few level 1 headers, one might want to use a chapter level of 2 or 3. The default is EPUB. To put the EPUB contents in the top level, use an empty string.

Use the specified engine when producing PDF output. The default is pdflatex. If the engine is not in your PATH, the full path of the engine may be specified here. Use the given string as a command-line argument to the pdf-engine. If used multiple times, the arguments are provided with spaces between them. Note that no check for duplicate options is done. If you supply this argument multiple times, each FILE will be added to bibliography.

This option is only relevant with pandoc-citeproc. Use natbib for citations in LaTeX output. This option is not for use with the pandoc-citeproc filter or with PDF output. It is intended for use in producing a LaTeX file that can be processed with bibtex. Use biblatex for citations in LaTeX output.

It is intended for use in producing a LaTeX file that can be processed with bibtex or biber. The default is to render TeX math as far as possible using Unicode characters. However, this gives acceptable results only for basic math, usually you will want to use --mathjax or another of the following options.

Then the MathJax JavaScript will render it. This is the default in odt output. For SVG images you can for example use --webtex https: That directory should contain a katex. So, the procedure is:. Print information about command-line arguments to stdout , then exit. This option is intended primarily for use in wrapper scripts. The first line of output contains the name of the output file specified with the -o option, or - for stdout if no output file was specified.

The remaining lines contain the command-line arguments, one per line, in the order they appear. These do not include regular pandoc options and their arguments, but do include any options appearing after a -- separator at the end of the line.

Ignore command-line arguments (for use in wrapper scripts). Regular pandoc options are not ignored. To see the default template that is used, just type. A custom template can be specified using the --template option. Templates contain variables, which allow for the inclusion of arbitrary information at any point in the file. Some variables are set automatically by pandoc. These vary somewhat depending on the output format, but include the following:

You can use the following snippet in your template to distinguish them: These can be set through a pandoc title block, which allows for multiple authors, or through a YAML metadata block:

For some output formats, pandoc will convert it to an appropriate format stored in the additional variables babel-lang, polyglossia-lang (LaTeX) and context-lang (ConTeXt). Native pandoc Spans and Divs with the lang attribute (value in BCP 47) can be used to switch the language in that range.

In LaTeX output, babel-otherlangs and polyglossia-otherlangs variables will be generated automatically based on the lang attributes of Spans and Divs in the document. For bidirectional documents, native pandoc spans and divs with the dir attribute (value rtl or ltr) can be used to override the base direction in some output formats.

This may not always be necessary if the final renderer e. Variables are available for producing slide shows with pandoc , including all reveal. This will include X in the template if variable has a truthy value; otherwise it will include Y. Here a truthy value is any of the following:. X and Y are placeholders for any valid template text, and may include interpolated variables or other conditionals.

A dot can be used to select a field of a variable that takes an object as its value. If you use custom templates, you may need to revise them as pandoc changes.

We recommend tracking the changes in the default templates, and modifying your custom templates accordingly. An easy way to do this is to fork the pandoc-templates repository and merge in changes after each pandoc release. Templates may contain comments: The behavior of some of the readers and writers can be adjusted by enabling or disabling various extensions. The markdown reader and writer make by far the most use of extensions.

In the following, extensions that also work for other formats are covered.

Interpret straight quotes as curly quotes, --- as em-dashes, -- as en-dashes, and ... as ellipses.

If you are writing Markdown, then the smart extension has the reverse effect: what would have been curly quotes comes out straight. If smart is disabled, then in reading LaTeX pandoc will parse these characters literally.

In writing LaTeX, enabling smart tells pandoc to use the ligatures when possible; if smart is disabled pandoc will use unicode quotation mark and dash characters. A header without an explicitly specified identifier will be automatically assigned a unique identifier based on the header text. These rules should, in most cases, allow one to determine the identifier from the header text.

The exception is when several headers have the same text; in this case, the first will get an identifier as described above; the second will get the same identifier with -1 appended; the third with -2; and so on. These identifiers are used to provide link targets in the table of contents generated by the --toc (--table-of-contents) option.
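The numbering of duplicate identifiers can be sketched in a few lines of Python. This is only an illustration, not pandoc's implementation: the function names are hypothetical, and the rules in base_identifier (drop punctuation other than underscores, hyphens, and periods; turn whitespace into hyphens; lowercase; drop everything before the first letter; fall back to "section") are an assumption about how the base identifier is derived, since the rule list itself is not reproduced here.

```python
import re

def base_identifier(header_text):
    """Derive a base identifier from plain header text (assumed rules)."""
    # Keep letters, digits, underscores, hyphens, periods, and whitespace.
    kept = "".join(
        ch for ch in header_text
        if ch.isalnum() or ch in "_-." or ch.isspace()
    )
    # Replace each run of whitespace with one hyphen, then lowercase.
    ident = re.sub(r"\s+", "-", kept.strip()).lower()
    # Drop everything before the first letter.
    first_letter = re.search(r"[^\W\d_]", ident)
    ident = ident[first_letter.start():] if first_letter else ""
    # Assumed fallback when nothing is left.
    return ident or "section"

def assign_identifiers(headers):
    """Give repeated headers the suffixes -1, -2, ... in document order."""
    seen = {}
    result = []
    for text in headers:
        base = base_identifier(text)
        count = seen.get(base, 0)
        result.append(base if count == 0 else f"{base}-{count}")
        seen[base] = count + 1
    return result
```

The sketch does not handle the case where a suffixed identifier collides with a header whose own text already produces that identifier.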

They also make it easy to provide links from one section of a document to another. A link to this section, for example, might look like this:. Accents are stripped off of accented Latin letters, and non-Latin letters are omitted.

However, they can also be used with HTML input. This is handy for reading web pages formatted using MathJax, for example. By default, this is disabled for HTML input. This extension is enabled by default for HTML input. This means that divs are parsed to pandoc native elements. In Markdown output, code blocks with classes haskell and literate will be rendered using bird tracks, and block quotations will be indented one space, so they will not be treated as Haskell code.

In reStructuredText output, code blocks with class haskell will be rendered using bird tracks. In LaTeX output, code blocks with class haskell will be rendered inside code environments. In HTML output, code blocks with class haskell will be rendered with class literatehaskell and bird tracks. Note that GHC expects the bird tracks in the first column, so indented literate code blocks e.

Read all docx styles as divs for paragraph styles and spans for character styles regardless of whether pandoc understands the meaning of these styles. This can be used with docx custom styles. Natural tables allow more fine-grained global customization but come at a performance penalty compared to extreme tables. This document explains the syntax, noting differences from standard Markdown.

Extensions can be enabled or disabled to specify the behavior more granularly. They are described in the following. See also Extensions above, for extensions that work also on other formats. Whereas Markdown was originally designed with HTML generation in mind, pandoc is designed for multiple output formats. Thus, while pandoc allows the embedding of raw HTML, it discourages it, and provides other, non-HTMLish ways of representing important document elements like definition lists, tables, mathematics, and footnotes.

A paragraph is one or more lines of text followed by one or more blank lines. Newlines are treated as spaces, so you can reflow your paragraphs as you like. If you need a hard line break, put two or more spaces at the end of a line. A backslash followed by a newline is also a hard line break.

The header text can contain inline formatting, such as emphasis (see Inline formatting, below).

An ATX-style header consists of one to six # signs and a line of text, optionally followed by any number of # signs. The number of # signs at the beginning of the line is the header level.

Standard Markdown syntax does not require a blank line before a header. Pandoc does require this (except, of course, at the beginning of the document).

The reason for the requirement is that it is all too easy for a # to end up at the beginning of a line by accident (perhaps through line wrapping).

Many Markdown implementations do not require a space between the opening #s of an ATX header and the header text, so that #5 bolt and #hashtag count as headers.

With this extension, pandoc does require the space. Headers can be assigned attributes using this syntax at the end of the line containing the header text:. Headers with the class unnumbered will not be numbered, even if --number-sections is specified.
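As a rough illustration of the ATX rules (one to six # signs give the level, a space is required before the header text, and trailing # signs are optional), here is a minimal Python sketch. The pattern and the function name parse_atx are hypothetical and ignore header attributes and the blank-line requirement; this is not how pandoc itself parses headers.

```python
import re

# One to six leading '#', mandatory whitespace, the header text,
# then an optional closing run of '#' signs.
ATX_PATTERN = re.compile(r"^(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")

def parse_atx(line):
    """Return (level, text) for an ATX header line, or None otherwise."""
    match = ATX_PATTERN.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)
```

With the space requirement in place, a line such as "#hashtag" is not treated as a header, while "## A level-two header" yields level 2.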

A single hyphen (-) in an attribute context is equivalent to .unnumbered.

If there are multiple headers with identical text, the corresponding reference will link to the first one only, and you will need to use explicit links to link to the others, as described above. Explicit link reference definitions always take priority over implicit header references.

So, in the following example, the link will point to bar, not to foo:

Markdown uses email conventions for quoting blocks of text. Among the block elements that can be contained in a block quote are other block quotes.

That is, block quotes can be nested.

Standard Markdown syntax does not require a blank line before a block quote.

A block of text indented four spaces (or one tab) is treated as verbatim text. The initial (four space or one tab) indentation is not considered part of the verbatim text, and is removed in the output.

In addition to standard indented code blocks, pandoc supports fenced code blocks. These begin and end with a row of tildes or backticks. Everything between these lines is treated as code. No indentation is necessary.

Like regular code blocks, fenced code blocks must be separated from surrounding text by blank lines.

If the code itself contains a row of tildes or backticks, just use a longer row of tildes or backticks at the start and end.

Here mycode is an identifier, haskell and numberLines are classes, and startFrom is an attribute with a value. Some output formats can use this information to do syntax highlighting. If highlighting is supported for your output format and language, then the code block above will appear highlighted, with numbered lines. To see which languages are supported, type pandoc --list-highlight-languages.

Otherwise, the code block above will appear as follows:. The numberLines or number-lines class will cause the lines of the code block to be numbered, starting with 1 or the value of the startFrom attribute. To prevent all highlighting, use the --no-highlight flag. To set the highlighting style, use --highlight-style.
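The advice about using a longer row of tildes or backticks when the code itself contains such a row can be automated. The following Python sketch picks a fence one character longer than the longest run found in the code; the function names are hypothetical, and the minimum fence length of three is an assumption based on common fenced-code-block conventions.

```python
import re

def fence_for(code, fence_char="~"):
    """Return a fence longer than any run of fence_char inside code."""
    runs = re.findall(re.escape(fence_char) + "+", code)
    longest = max((len(run) for run in runs), default=0)
    # Assumed minimum of three characters for a fence.
    return fence_char * max(3, longest + 1)

def wrap_in_fence(code, fence_char="~"):
    """Surround code with matching opening and closing fences."""
    fence = fence_for(code, fence_char)
    return f"{fence}\n{code}\n{fence}"
```

The sketch counts runs anywhere in the code, not only at the start of a line, so it may choose a longer fence than strictly necessary; that errs on the safe side.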

For more information on highlighting, see Syntax highlighting, below.

A line block is a sequence of lines beginning with a vertical bar (|) followed by a space. The division into lines will be preserved in the output, as will any leading spaces; otherwise, the lines will be formatted as Markdown.

This is useful for verse and addresses. This syntax is borrowed from reStructuredText.

A bullet list is a list of bulleted list items. Here is a simple example:

The bullets need not be flush with the left margin; they may be indented one, two, or three spaces. The bullet must be followed by whitespace.

A list item may contain multiple paragraphs and other block-level content. However, subsequent paragraphs must be preceded by a blank line and indented to line up with the first non-space content after the list marker.

List items may include other lists. In this case the preceding blank line is optional. The nested list must be indented to line up with the first non-space character after the list marker of the containing list item.

However, if there are multiple paragraphs or other blocks in a list item, the first line of each must be indented. Ordered lists work just like bulleted lists, except that the items begin with enumerators rather than bullets.

In standard Markdown, enumerators are decimal numbers followed by a period and a space. The numbers themselves are ignored, so there is no difference between this list:.

Unlike standard Markdown, pandoc allows ordered list items to be marked with uppercase and lowercase letters and roman numerals, in addition to Arabic numerals.

List markers may be enclosed in parentheses or followed by a single right parenthesis or period. They must be separated from the text that follows by at least one space, and, if the list marker is a capital letter with a period, by at least two spaces.

Pandoc also pays attention to the type of list marker used, and to the starting number, and both of these are preserved where possible in the output format. Thus, the following yields a list with numbers followed by a single parenthesis, starting with 9, and a sublist with lowercase roman numerals:

Pandoc will start a new list each time a different type of list marker is used. So, the following will create three lists:. Each term must fit on one line, which may optionally be followed by a blank line, and must be followed by one or more definitions. A definition begins with a colon or tilde, which may be indented one or two spaces.

A term may have multiple definitions, and each definition may consist of one or more block elements (paragraph, code block, list, etc.). The body of the definition (including the first line, aside from the colon or tilde) should be indented four spaces.

If you leave space before the definition (as in the example above), the text of the definition will be treated as a paragraph. For a more compact definition list, omit the space before the definition:

Note that space between items in a definition list is required.

The special list marker @ can be used for sequentially numbered examples. The numbered examples need not occur in a single list; each new list using @ will take up where the last stopped. This is because example labels tend to be long, and indenting content to the first non-space character after the label would be awkward.

Pandoc behaves differently from Markdown.

Pandoc follows a simple rule: The fact that the list is followed by a blank line is irrelevant. This behavior is consistent with the official Markdown syntax description, even though it is different from that of Markdown. Four kinds of tables may be used. The first three kinds presuppose the use of a fixed-width font, such as Courier.

The fourth kind can be used with proportionally spaced fonts, as it does not require lining up columns.

A caption may optionally be provided with all 4 kinds of tables (as illustrated in the examples below). A caption is a paragraph beginning with the string Table: (or just :), which will be stripped off. It may appear either before or after the table.

The headers and table rows must each fit on one line. Column alignments are determined by the position of the header text relative to the dashed line below it. When headers are omitted, column alignments are determined on the basis of the first line of the table body.

So, in the tables above, the columns would be right, left, center, and right aligned, respectively. Multiline tables allow headers and table rows to span multiple lines of text but cells that span multiple columns or rows of the table are not supported.
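To make the alignment rule for simple tables concrete, the Python sketch below classifies each column from the header line and the dashed line beneath it. The specific rules used are an assumption about how "position relative to the dashed line" is interpreted: header text flush with the dashes only on the right means right-aligned, flush only on the left means left-aligned, flush on neither side means centered, and flush on both sides means the default alignment. The function name and the sample strings are hypothetical.

```python
import re

def column_alignments(header_line, dash_line):
    """Classify each column by where the header text sits over the dashes."""
    alignments = []
    for run in re.finditer(r"-+", dash_line):
        start, end = run.span()
        # The slice of the header directly above this run of dashes.
        cell = header_line[start:end].ljust(end - start)
        if not cell.strip():
            alignments.append("default")
            continue
        flush_left = not cell[0].isspace()
        flush_right = not cell[-1].isspace()
        if flush_left and flush_right:
            alignments.append("default")
        elif flush_right:
            alignments.append("right")
        elif flush_left:
            alignments.append("left")
        else:
            alignments.append("center")
    return alignments

header = "  Right     Left     Center     Default"
dashes = "-------     ------ ----------   -------"
print(column_alignments(header, dashes))
```

The sketch assumes the header text of each column lies within the span of its dashes; header text that spills past the dashed line is not handled.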

Here is an example:

In multiline tables, the table parser pays attention to the widths of the columns, and the writers try to reproduce these relative widths in the output. So, if you find that one of the columns is too narrow in the output, try widening it in the Markdown source. It is possible for a multiline table to have just one row, but the row should be followed by a blank line (and then the row of dashes that ends the table), or the table may be interpreted as a simple table.

The cells of grid tables may contain arbitrary block elements (multiple paragraphs, code blocks, lists, etc.). Cells that span multiple columns or rows are not supported. Grid tables can be created easily using Emacs table mode. Alignments can be specified as with pipe tables, by putting colons at the boundaries of the separator line after the header:

Pandoc does not support grid tables with row spans or column spans. This means that neither variable numbers of columns across rows nor variable numbers of rows across columns are supported by Pandoc.

All grid tables must have the same number of columns in each row, and the same number of rows in each column. For example, the Docutils sample grid tables will not render as expected with Pandoc. The beginning and ending pipe characters are optional, but pipes are required between all columns. The colons indicate column alignment as shown. The header cannot be omitted. To simulate a headerless table, include a header with blank cells. Since the pipes indicate column boundaries, columns need not be vertically aligned, as they are in the above example.

So, this is a perfectly legal (though ugly) pipe table:

The cells of pipe tables cannot contain block elements like paragraphs and lists, and cannot span multiple lines. If a pipe table contains a row whose printable content is wider than the column width (see --columns), then the table will take up the full text width and the cell contents will wrap, with the relative cell widths determined by the number of dashes in the line separating the table header from the table body.

On the other hand, if no lines are wider than column width, then cell contents will not be wrapped, and the cells will be sized to their contents.
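The way relative cell widths follow from the number of dashes in the separator line can be sketched as below. This is an illustration under assumptions, not pandoc's code: the function name is hypothetical, and alignment colons are simply ignored when counting, which may differ from what pandoc actually does.

```python
def relative_widths(separator_line):
    """Return each column's share of the width, from its count of dashes."""
    # Drop the optional outer pipes, then split on the inner ones.
    cells = separator_line.strip().strip("|").split("|")
    dashes = [cell.count("-") for cell in cells]
    total = sum(dashes)
    if total == 0:
        raise ValueError("separator line contains no dashes")
    return [count / total for count in dashes]
```

For a separator such as |---|---------|, the second column receives three times the width of the first.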

Other orgtbl features are not supported. The block may contain just a title, a title and an author, or all three elements.

If you want to include an author but no title, or a title and a date but no author, you need a blank line:

If a document has multiple authors, the authors may be put on separate lines with leading space, or separated by semicolons, or both. So, all of the following are equivalent:

All three metadata fields may contain standard inline formatting (italics, links, footnotes, etc.).

Title blocks will always be parsed, but they will affect the output only when the --standalone (-s) option is chosen. In HTML output, titles will appear twice: once in the document head and once at the beginning of the document body. The title in the document head can have an optional prefix attached (--title-prefix or -T option). If a title prefix is specified with -T and no title block appears in the document, the title prefix will be used by itself as the HTML title.

The man page writer extracts a title, man page section number, and other header and footer information from the title line. The title is assumed to be the first word on the title line, which may optionally end with a single-digit section number in parentheses. There should be no space between the title and the parentheses.

Anything after this is assumed to be additional footer and header text. A single pipe character should be used to separate the footer text from the header text. A YAML metadata block may occur anywhere in the document, but if it is not at the beginning, it must be preceded by a blank line.
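The title-line convention used by the man page writer can be illustrated with a small Python sketch. The function name and the returned dictionary keys are hypothetical; the sketch follows only the rules stated above (first word is the title, an optional single-digit section number in parentheses with no space before it, and a pipe separating footer text from header text).

```python
import re

def parse_man_title(title_line):
    """Split a man page title line into title, section, footer, and header."""
    first_word, _, rest = title_line.strip().partition(" ")
    # Optional single-digit section number directly after the title.
    match = re.fullmatch(r"(.*)\((\d)\)", first_word)
    if match:
        title, section = match.group(1), match.group(2)
    else:
        title, section = first_word, None
    # A single pipe separates the footer text from the header text.
    footer, _, header = rest.partition("|")
    return {
        "title": title,
        "section": section,
        "footer": footer.strip(),
        "header": header.strip(),
    }
```

For an illustrative line such as "PANDOC(1) Pandoc User Manuals | Version 4.0", the title is PANDOC, the section is 1, the footer text is "Pandoc User Manuals", and the header text is "Version 4.0".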

Note that, because of the way pandoc concatenates input files when several are provided, you may also keep the metadata in a separate YAML file and pass it to pandoc as an argument, along with your Markdown files:

Just be sure that the YAML file begins with --- and ends with --- or .... Alternatively, you can use the --metadata-file option.

Using that approach, however, you cannot reference content (like footnotes) from the main markdown input document.

Metadata will be taken from the fields of the YAML object and added to any existing document metadata. Metadata can contain lists and objects (nested arbitrarily), but all string scalars will be interpreted as Markdown. Fields with names ending in an underscore will be ignored by pandoc. (They may be given a role by external processors.) Field names must not be interpretable as YAML numbers or boolean values (so, for example, yes, True, and 15 cannot be used as field names).

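A document may contain multiple metadata blocks. The metadata fields will be combined through a left-biased union: if two metadata blocks attempt to set the same field, the value from the first block will be taken. All of the metadata will appear in a single block at the beginning of the document.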
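A left-biased union keeps the first (leftmost) value seen for each field. The following Python sketch shows the idea on plain dictionaries standing in for metadata blocks; the function name and the sample fields are hypothetical, and the sketch merges only top-level fields.

```python
def left_biased_union(*blocks):
    """Merge metadata blocks; the earliest block wins for each field."""
    merged = {}
    for block in blocks:
        for key, value in block.items():
            # setdefault only stores the value if the key is not yet present.
            merged.setdefault(key, value)
    return merged

first = {"title": "A", "author": "X"}
second = {"title": "B", "date": "2020"}
print(left_biased_union(first, second))
```

Here the title from the first block is kept, while the date from the second block is added because the first block does not set it.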

Note that YAML escaping rules must be followed. The value of the media-type attribute is not always sufficient to identify the type of linked resource e.

To aid Reading Systems in the identification of such generic resources, the properties attribute can be attached with a semantic identifier. The following example shows the properties attribute used to identify a remote XMP record.

The list of reserved relationships and properties recognized by this specification is defined in [ Link Vocab ]. Authors may add relationships and properties from other vocabularies via the metadata extensibility mechanism defined in this specification.

Authors also may create new values by defining their own prefixes. The following example shows the link element used to associate an informational web page. Note that as foaf is not a predefined prefix , it has to be declared in the prefix attribute. In the case of a linked metadata record , Reading Systems must not skip processing the metadata expressed in the Package Document and only use the information expressed in the record. Linked records are intended to enhance the information available to Reading Systems, and the package metadata typically contains important rendering information.

Reading Systems may compile metadata from multiple linked records; they do not have to select only one record. When it comes to resolving discrepancies and conflicts between metadata expressed in the Package Document and in linked metadata records, Reading Systems must use the document order of link elements in the Package Document to establish precedence (i.e., metadata in the first linked record encountered has the highest precedence and metadata in the Package Document the lowest). The following example shows a remote record that has higher precedence than a local record, which in turn has higher precedence than the metadata found in the metadata element.

Reading Systems must ignore any instructions contained in linked resources related to the layout and rendering of the EPUB Publication. Due to the variety of metadata record formats and serializations that can be linked to an EPUB Publication, and the complexity of comparing metadata properties between them, this specification does not require Reading Systems to process linked records.

The manifest element provides an exhaustive list of the Publication Resources that constitute the given Rendition , each represented by an item element. Required second child of package , following metadata. This specification supports internationalized resource naming, so elements and attributes that reference Publication Resources accept IRIs as their value. The item element represents a Publication Resource.

As a child of manifest. The IRI may be absolute or relative. The resulting absolute IRI must be unique within the manifest scope. All Publication Resources must be referenced from the manifest, regardless of whether they are Local or Remote Resources. Note that the manifest is not self-referencing: it must not include an item element that refers to the Package Document itself. The Publication Resource identified by an item element must conform to the applicable specification(s) as inferred from the MIME media type provided in the media-type attribute. Fallbacks may be provided for Core Media Type Resources e.

Fallback requirements for Foreign Resources are defined in Manifest Fallbacks. This specification reserves the [ Manifest Vocab ] for use with the properties attribute. Terms from other vocabularies may be used provided they have a prefix (refer to Reserved Prefixes for a list of prefixes that do not have to be declared). Authors must declare all applicable descriptive metadata properties for each Publication Resource in this attribute, as Reading Systems may optimize the rendering depending on the properties that have been set e.

Reading Systems must ignore all descriptive metadata properties that they do not recognize. Refer to Packaging [ Media Overlays 3. The duration attribute takes a [ SMIL ] clock value that provides the total duration of the audio media referenced from a Media Overlay Document or, in the case of timed media, the total duration of the referenced media file. Refer to Package Metadata [ Media Overlays 3. The order of item elements in the manifest is not significant.

The presentation sequence of content documents is provided in the spine. The following example shows a manifest that contains only Core Media Type Resources. The following example shows a manifest that references two Foreign Resources , and therefore uses the fallback chain mechanism to supply content alternatives.

The fallback chain terminates with a Core Media Type. The following example shows a reference to a remote audio file that has to be referenced from the manifest (the audio is rendered inline in the XHTML Content Document so it is a Publication Resource). The following example shows a link to the same audio file, but in this case it is not listed in the manifest (hyperlinked Remote Resources are not Publication Resources). The audio file would only be listed in the manifest if the Author has also referenced it from an [ HTML ] embedded content element, as above i.

The following example shows a link to a local version of the audio file. Foreign Resources may be referenced in contexts in which an intrinsic fallback cannot be provided e. Manifest fallbacks must be provided in such cases. Manifest fallbacks are provided using the fallback attribute on the manifest item element that represents the Publication Resource.

This fallback item may itself specify another fallback item , and so on. The ordered list of all the ID references that can be reached starting from a given item's fallback attribute represents the fallback chain for that item. The order of the resources in the fallback chain represents the Author's preferred fallback order.

A Reading System that does not support the Media Type of a given Publication Resource must traverse the fallback chain until it has identified at least one supported Publication Resource to be used in place of the unsupported resource. If the Reading System supports multiple Publication Resources in the fallback chain, it may select the resource to use based on specific properties of that resource, otherwise it should honor the Author's preferred fallback order.
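The traversal of a fallback chain can be sketched as follows. This is a simplified illustration, not a conforming Reading System: the manifest is modeled as a hypothetical Python dictionary mapping item IDs to their media-type and fallback attributes, the function name is made up, and the sketch always honors the Author's preferred order by returning the first supported resource. It also raises an error on circular chains, which the specification forbids.

```python
def resolve_fallback(item_id, manifest, supported_types):
    """Walk the fallback chain and return the first supported item ID."""
    visited = set()
    current = item_id
    while current is not None:
        if current in visited:
            # Circular or self-referencing chains are invalid.
            raise ValueError(f"circular fallback chain at {current!r}")
        visited.add(current)
        item = manifest[current]
        if item["media-type"] in supported_types:
            return current
        # Follow the fallback attribute, if the item has one.
        current = item.get("fallback")
    return None

manifest = {
    "chart": {"media-type": "application/x-demo", "fallback": "chart-svg"},
    "chart-svg": {"media-type": "image/svg+xml", "fallback": "chart-png"},
    "chart-png": {"media-type": "image/png"},
}
print(resolve_fallback("chart", manifest, {"image/png"}))
```

The media type application/x-demo and the item IDs are invented for the example; a return value of None means no resource in the chain is supported.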

Fallback chains must conform to one of the following requirements, as appropriate: For Foreign Resources for which an intrinsic fallback cannot be provided, the chain must contain at least one Core Media Type Resource.

Fallback chains must not contain any circular- or self-references to item elements in the chain. An example of when this feature can be utilized is when providing fallbacks for scripted content [ Content Docs 3. The spine element defines an ordered list of manifest item references that represent the default reading order of the given Rendition.

Required third child of package , following manifest. Reading Systems must provide a means of rendering the Rendition in the order defined in the spine , which includes: All Publication Resources that are hyperlinked to from Publication Resources in the spine must themselves be listed in the spine , where hyperlinking is defined to be any linking mechanism that requires the user to navigate away from the current resource. Common hyperlinking mechanisms include the href attribute of the [ HTML ] a and area elements and scripted links e.

The requirement to list hyperlinked resources applies recursively i. As Remote Resources referenced from hyperlinks are not Publication Resources, they are not subject to the requirement to include in the spine e. Embedded Publication Resources e.

The page-progression-direction attribute sets the global direction in which the content flows. Allowed values are ltr (left-to-right), rtl (right-to-left) and default. When the default value is specified, the Author is expressing no preference and the Reading System can choose the rendering direction. The default value must be assumed when the attribute is not specified. Although the page-progression-direction attribute sets the global flow direction, individual Content Documents and parts of Content Documents may override this setting e.
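As a rough illustration of the defaulting rule, the following sketch reads the attribute from a spine element parsed with Python's `xml.etree.ElementTree`. The function name and the choice to raise an error on an unlisted value are assumptions of this sketch, not requirements taken from the text.

```python
import xml.etree.ElementTree as ET

ALLOWED_DIRECTIONS = {"ltr", "rtl", "default"}


def page_progression_direction(spine):
    """Return the global flow direction, assuming 'default' when absent."""
    value = spine.get("page-progression-direction", "default")
    if value not in ALLOWED_DIRECTIONS:
        # Illustrative choice: reject values outside the allowed set.
        raise ValueError(f"unexpected direction {value!r}")
    return value


explicit = ET.fromstring(
    '<spine xmlns="http://www.idpf.org/2007/opf" '
    'page-progression-direction="rtl"/>'
)
implicit = ET.fromstring('<spine xmlns="http://www.idpf.org/2007/opf"/>')

print(page_progression_direction(explicit))  # rtl
print(page_progression_direction(implicit))  # default
```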

Reading Systems may also provide mechanisms to override the default direction e. The page-progression-direction attribute defines the flow direction from one fixed-layout page to the next. The order of the itemref elements defines the default reading order of the given Rendition.

As a child of spine. The linear attribute indicates whether the referenced item contains content that contributes to the primary reading order and has to be read sequentially ("yes") or auxiliary content that enhances or augments the primary content and can be accessed out of sequence ("no"). Examples of auxiliary content include: The linear attribute allows Reading Systems to distinguish content that a user needs to access as part of the default reading order from supplementary content which might, for example, be presented in a popup window or omitted from an aural rendering.

When rendering an EPUB Publication, a Reading System may either suppress non-linear content so that it does not appear in the default reading order, or ignore the linear attribute in order to provide users access to the entire content of the EPUB Publication.

This specification does not mandate which model Reading Systems have to use. A Reading System may also provide the option for users to toggle between the two models. Each Rendition must include at least one itemref whose linear attribute value is either explicitly or implicitly set to "yes". An itemref that omits the linear attribute is assumed to have the value "yes". Authors must provide a means of accessing all non-linear content e.
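The two rendering models and the implicit "yes" default can be sketched as below. This is a hedged example: the spine markup and idref values are invented for illustration, and `reading_order` is a hypothetical helper rather than an API of any Reading System.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Hypothetical spine used only for this illustration.
SPINE_XML = """
<spine xmlns="http://www.idpf.org/2007/opf">
  <itemref idref="cover" linear="no"/>
  <itemref idref="chapter1"/>
  <itemref idref="notes" linear="no"/>
  <itemref idref="chapter2" linear="yes"/>
</spine>
""".strip()


def reading_order(spine, include_non_linear=False):
    """Return idrefs in spine order, optionally suppressing non-linear items."""
    refs = spine.findall(f"{{{OPF_NS}}}itemref")
    # An itemref without a linear attribute is treated as linear="yes".
    linear = [r.get("idref") for r in refs if r.get("linear", "yes") == "yes"]
    if not linear:
        raise ValueError("spine needs at least one linear itemref")
    if include_non_linear:
        return [r.get("idref") for r in refs]
    return linear


spine = ET.fromstring(SPINE_XML)
print(reading_order(spine))
# ['chapter1', 'chapter2']
print(reading_order(spine, include_non_linear=True))
# ['cover', 'chapter1', 'notes', 'chapter2']
```

The `include_non_linear` flag stands in for the user-facing toggle between the two models.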

This specification reserves the [ Spine Vocab ] for use with the properties attribute. All applicable descriptive metadata properties defined in [ Spine Vocab ] should be declared. Reading Systems must ignore all metadata properties expressed in the properties attribute that they do not recognize. The following example shows a spine element corresponding to the manifest example above.

Optional sixth element of package. The collection element allows resources to be assembled into logical groups for a variety of potential uses: The collection element, as defined in this section, represents a generic framework from which specializations are intended to be derived e. Such specializations must define the purpose of the collection element within a Rendition, as well as all requirements for its valid production and use specifically any requirements that differ from the general framework presented below.

Each specialization must define a role value that uniquely identifies all conformant collection elements. No roles are defined in this section. Third parties may define custom roles for the collection element, but such roles must be identified using absolute IRIs. Custom roles must not incorporate the string " idpf. To facilitate interoperability of custom roles across Reading Systems, implementers are strongly encouraged to document their use of the collection element in [ Role Extensions ].

The optional metadata element child of collection is an adaptation of the package metadata element, with the following differences in syntax and semantics: Package-level restrictions on the use of metadata elements may be overridden. A collection may define sub-collections through the inclusion of one or more child collection elements. The link element child of collection is an adaptation of the metadata link element, with the following differences in syntax and semantics:

The properties attribute also accepts manifest item properties [ Manifest Vocab ] without a prefix e. Each link element must reference a resource that is a member of the group. The order of link elements is not significant. Specializations of the collection element may tailor the requirements defined above to better reflect their needs e. However, the resulting content model must represent a valid subset of the one defined in this section e. Specializations must not define collections in a way that overrides the requirements of the manifest and spine.

In the context of this specification, support for collections in Reading Systems is optional. Reading Systems must ignore collection elements that define unrecognized roles. The rendering of a Rendition must not be dependent on the recognition of collection elements.

The content must remain consumable by a user without any information loss or other significant deterioration. This Unique Identifier , whether chosen or assigned, must be stored in the dc: New identifiers should not be issued when updating metadata, fixing errata or making other minor changes to the EPUB Publication. Determining whether two EPUB Publications with the same Unique Identifier represent different versions of the same publication see Release Identifier , or different publications, might require inspecting other metadata, such as the titles or authors.

The Unique Identifier of an EPUB Publication typically should not change with each minor revision to the package or its contents, as Unique Identifiers are intended to have maximal persistence both for referencing and distribution purposes. Each release of an EPUB Publication normally requires that the new version be uniquely identifiable, however, which results in the contradictory need for reliable Unique Identifiers that are changeable.

To redress this problem of identifying minor modifications and releases without changing the Unique Identifier, this specification defines the semantics for a Release Identifier, or means of distinguishing and sequentially ordering EPUB Publications with the same Unique Identifier.

The Release Identifier is not an actual property in the package metadata section, but is a value that can be obtained from two other mandatory pieces of metadata: the Unique Identifier and the last modified date. When taken together, the combined value represents a unique identity that can be used to distinguish any particular version of an EPUB Publication from another. Although not a part of the package metadata, for referencing and other purposes all string representations of the identifier must be constructed using the at sign (@) as the separator i.

Whitespace must not be included when concatenating the strings. The following example shows how a Unique Identifier and modification date are combined to form the Release Identifier. Note that it is possible that the separator character may occur in the Unique Identifier, as these identifiers may be any string value. The Release Identifier consequently must be split on the last instance of the at sign when decomposing it into its component parts.
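A small sketch of composing and decomposing a Release Identifier under the rules above. The identifier and timestamp values are made up, the function names are hypothetical, and trimming surrounding whitespace from each part is an assumption of this sketch about how the "no whitespace" rule would be applied in practice.

```python
def make_release_identifier(unique_id, modified):
    """Join the Unique Identifier and modification date with an at sign."""
    # Assumption: surrounding whitespace is trimmed so none is introduced
    # around the separator.
    return f"{unique_id.strip()}@{modified.strip()}"


def split_release_identifier(release_id):
    """Split on the LAST at sign, since the identifier itself may contain one."""
    unique_id, sep, modified = release_id.rpartition("@")
    if not sep:
        raise ValueError("no at sign separator found")
    return unique_id, modified


# Hypothetical Unique Identifier that itself contains an at sign.
rid = make_release_identifier("user@example.org:book-1", "2020-01-01T00:00:00Z")
print(rid)
# user@example.org:book-1@2020-01-01T00:00:00Z
print(split_release_identifier(rid))
# ('user@example.org:book-1', '2020-01-01T00:00:00Z')
```

Using `str.rpartition` rather than `str.split("@")` is what keeps an at sign inside the Unique Identifier from being mistaken for the separator.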

The Release Identifier does not supersede the Unique Identifier, but represents the means by which different versions of the same EPUB Publication can be distinguished and identified in distribution channels and by Reading Systems.

The sequential, chronological order inherent in the format of the timestamp also places EPUB Publications in order without requiring knowledge of the exact identifier that came before.

When an EPUB Container includes more than one Rendition of an EPUB Publication, updating the last modified date of the default rendition for each release — even if it has not been updated — will help ensure that the EPUB Publication does not appear to be the same version as an earlier release, as Reading Systems only have to process the default rendition.

The property, properties, rel and scheme attributes use the property data type to represent terms from metadata vocabularies. A property value is an expression that consists of a prefix and a reference, where the prefix, whether literal or implied, is a shorthand mapping of an IRI that typically resolves to a term vocabulary. To assist Reading Systems in processing property values, this specification defines three mechanisms to establish the IRI a prefix maps to:

A default vocabulary is a vocabulary that does not require a prefix to be declared in order to use its terms, and whose terms must always be unprefixed. As the Package Document has multiple unrelated uses for metadata terms, a single default vocabulary is not defined for all attributes.

Instead, different default vocabularies are defined for use in attributes that accept a property data type as follows: The IRIs associated with these vocabularies must not be assigned a prefix using the prefix attribute.

This specification reserves a set of prefixes that Authors may use in package metadata without having to declare. These prefixes are defined in [ Reserved Prefixes ]. The prefixes defined in this document are maintained and updated separately of this specification and are subject to change at any time.

Reading Systems must resolve all reserved prefixes used in Package Documents using their predefined URIs unless a local prefix is declared. Reserved prefixes should not be overridden in the prefix attribute, but Reading Systems must use such local overrides when encountered.

As changes to the reserved prefixes and updates to Reading Systems are not always going to happen in synchrony, Reading Systems must not fail when encountering unrecognized prefixes i.

The prefix attribute defines additional prefix mappings not reserved by this specification. The value of the prefix attribute is a white space-separated list of one or more prefix-to-IRI mappings of the form: The following example shows prefixes for the Friend of a Friend (foaf) and DBPedia (dbp) vocabularies being declared using the prefix attribute.

To avoid conflicts, the prefix attribute must not be used to declare a prefix that maps to the default vocabulary. If the prefix attribute includes a declaration for a predefined prefix, Reading Systems must use the URI mapping defined in the prefix attribute, regardless of whether it maps to the same URI as the predefined prefix.
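The following sketch parses a prefix attribute value and merges it with reserved prefixes so that local declarations win. It assumes each mapping is written as a prefix followed by a colon, white space, and then the IRI. That syntax, the example IRIs, and the function names are assumptions of this sketch, as the exact form is not reproduced in the text above.

```python
def parse_prefix_attribute(value):
    """Parse 'name: IRI name: IRI ...' into a dict (assumed syntax)."""
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating 'prefix:' and IRI tokens")
    mappings = {}
    for name, iri in zip(tokens[0::2], tokens[1::2]):
        if len(name) < 2 or not name.endswith(":"):
            raise ValueError(f"malformed prefix token {name!r}")
        mappings[name[:-1]] = iri
    return mappings


def effective_prefixes(reserved, declared):
    """Local declarations override reserved prefixes of the same name."""
    merged = dict(reserved)
    merged.update(declared)
    return merged


# Illustrative values only; the IRIs are not taken from this document.
reserved = {"dcterms": "http://purl.org/dc/terms/"}
declared = parse_prefix_attribute(
    "foaf: http://xmlns.com/foaf/spec/ dbp: http://dbpedia.org/ontology/"
)
print(declared)
# {'foaf': 'http://xmlns.com/foaf/spec/', 'dbp': 'http://dbpedia.org/ontology/'}
print(sorted(effective_prefixes(reserved, declared)))
# ['dbp', 'dcterms', 'foaf']
```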

The property data type is a compact means of expressing an IRI [ RFC ] and consists of an optional prefix separated from a reference by a colon. The following example shows a property value composed of the prefix dcterms and the reference modified. After processing, this property would expand to the following IRI: When a prefix is omitted from a property value, the expressed reference represents a term from the default vocabulary for that attribute.
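Expansion of a property value into an IRI can be sketched as a simple concatenation. In this hedged example the `DEFAULT_STEM` value is a placeholder rather than a real vocabulary IRI, `expand_property` is a hypothetical helper, and returning `None` for an empty value or an undeclared prefix is this sketch's way of marking the property as invalid.

```python
# Placeholder stem for an attribute's default vocabulary (hypothetical).
DEFAULT_STEM = "http://example.org/vocab/#"

PREFIXES = {"dcterms": "http://purl.org/dc/terms/"}


def expand_property(value, prefixes=PREFIXES, default_stem=DEFAULT_STEM):
    """Expand 'prefix:reference' or a bare reference into an IRI string."""
    if not value:
        return None  # an empty string is not a valid property
    prefix, sep, reference = value.partition(":")
    if not sep:
        # No prefix: the reference belongs to the default vocabulary.
        return default_stem + value
    stem = prefixes.get(prefix)
    if stem is None:
        return None  # undeclared prefix: treat the property as invalid
    return stem + reference


print(expand_property("dcterms:modified"))
# http://purl.org/dc/terms/modified
print(expand_property("mathml"))
# http://example.org/vocab/#mathml
print(expand_property("unknown:term"))
# None
```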

The following example shows the [Manifest Vocab] mathml property on a manifest item element: An empty string does not represent a valid property value, even though it is valid to the definition above. If the property consists only of a reference, the IRI is obtained by concatenating the IRI stem associated with the default vocabulary to the reference.

If the property consists of a prefix and reference, the IRI is obtained by concatenating the IRI stem associated with the prefix to the reference. If no matching prefix has been defined, the property is invalid and must be ignored. Reading Systems do not have to resolve this IRI, however. Not all rendering information can be expressed through the underlying technologies that EPUB is built upon. For example, although HTML with CSS provides powerful layout capabilities, those capabilities are limited to the scope of the document being rendered.

This section defines general-purpose properties that allow Authors to express package-level rendering intentions i. If a Reading System supports the desired rendering, these properties enable the user to be presented the content as the Author optimally designed it.

Authors may indicate a preference for dynamic pagination or scrolling. For scrolled content, it is also possible to specify whether consecutive EPUB Content Documents are to be rendered as a continuous scrolling view or whether each is to be rendered separately i.

If a Reading System supports the specified rendering, it should use that method to handle overflow content, but may provide the option for users to override the requested rendering. The default value auto must be assumed by Reading Systems as the global value if no meta element carrying this property occurs in the metadata section.

Reading Systems may support only this default value. If a Reading System supports the rendition: In addition to using the rendition: The following values are defined for use with the rendition: The Reading System should dynamically paginate all overflow content.

The Reading System should render all Content Documents such that overflow content is scrollable, and the EPUB Publication represented by the given Rendition should be presented as one continuous scroll from spine item to spine item except where locally overridden.

It is expected that a future version of this specification will provide more information about Reading System behaviors for scrolled-continuous. The Reading System should render all Content Documents such that overflow content is scrollable, and each spine item should be presented as a separate scrollable document. The Author does not have a preference for overflow handling. The Reading System may render overflow content using its default method or a user preference, whichever is applicable.

The scroll direction is vertical if the block flow direction is downward top-to-bottom. It is horizontal if the block flow direction of the root element is rightward left-to-right or leftward right-to-left. The following example demonstrates an Author's intent to have a paginated Rendition with a scrollable table of contents.

This property does not affect the rendering of the spine item, only the placement of the resulting content box. For reflowable content, Reading Systems that support this property must center each virtual page. This version of this specification does not define a default rendering behavior when this property is not supported or specified. Reading Systems may render spine items by their own design.

As support for paged media evolves in CSS, however, this property is expected to be deprecated. Authors are encouraged to use CSS solutions when effective. The content flows, or reflows, to fit the screen and to fit the needs of the user. Sometimes content and design are so intertwined they cannot be separated. Any change in appearance risks changing the meaning, or losing all meaning.

This section defines a set of metadata properties to allow declarative expression of intended rendering behaviors of Fixed-Layout Documents in the context of EPUB 3. When fixed-layout content is necessary, the Author's choice of mechanism will depend on many factors including desired degree of precision, file size, accessibility, etc.

This section does not attempt to dictate the Author's choice of mechanism. The default value reflowable must be assumed by EPUB Reading Systems as the global value if no meta element carrying this property occurs in the metadata section.

When the property is set to pre-paginated for a spine item, its content dimensions must be set as defined in Fixed Layouts [ Content Docs 3. The given Rendition is not pre-paginated. Reading Systems may apply dynamic pagination when rendering.

The given Rendition is pre-paginated. Reading Systems must produce exactly one page per spine itemref when rendering. Reading Systems typically restrict or deny the application of user or user agent style sheets to pre-paginated documents, since, as a result of intrinsic properties of such documents, dynamic style changes are highly likely to have unintended consequences. Authors need to take into account the negative impact on usability and accessibility that these restrictions have when choosing to use pre-paginated instead of reflowable content.

Refer to Guideline 1. The following example demonstrates fully fixed-layout content, using [ Media Queries ] to apply different style sheets for three different device categories.

Note that the Media Queries only affect the style sheet applied to the document; the size of the content area set in the viewport meta tag is static. Reading Systems that support multiple orientations should convey the intended orientation to the user, unless the given value is auto.

The means by which the intent is conveyed is implementation-specific. The following example demonstrates fully fixed-layout content intended to be rendered without synthetic spreads, and locked to landscape orientation. Reading Systems must not incorporate spine items in a Synthetic Spread. Reading Systems should render a Synthetic Spread for spine items only when the device is in landscape orientation. Reading Systems should render a Synthetic Spread regardless of device orientation.

No explicit Synthetic Spread behavior is defined. Reading Systems may use Synthetic Spreads in specific or all device orientations as part of a Content Display Area utilization optimization process. Refer to spine for information about declaration of global flow directionality using the page-progression-direction attribute and that of local page-progression-direction within content documents. The following example demonstrates fully fixed-layout content intended to be rendered using synthetic spreads in landscape orientation, and with no spreads in portrait orientation.

The following example demonstrates reflowable content with a single fixed-layout title page, where the fixed-layout page is intended for right-hand spread slot if the device renders Synthetic Spreads. When a Reading System renders a Synthetic Spread , the default behavior is to populate the spread by rendering the next EPUB Content Document in the next available unpopulated viewport, where the next available viewport is determined by the given page progression direction or by local declarations within Content Documents.

By providing one of the rendition: The presence of rendition: In particular, it does not indicate that a viewport with the size of the whole spread has to be created. This is important so that the scale factor stays consistent between regular and center-spread pages. When a reflowable spine item follows a pre-paginated one, the reflowable one should start on the next page as defined by the page-progression-direction when it lacks a rendition: If the reflowable spine item has a rendition: Similarly, when a pre-paginated spine item follows a reflowable one, the pre-paginated one should start on the next page as defined by the page-progression-direction when it lacks a rendition: If the pre-paginated spine item has a rendition: Although Authors often indicate to use a spread in certain device orientations, the content itself does not represent true spreads i.

To indicate that two consecutive pages represent a true spread, Authors should use the rendition: When a Reading System encounters two spine items that represent a true spread, it should create the spread with no space between the adjacent pages. They allow the use of a single vocabulary for all fixed-layout properties. Authors can use either property set, but older Reading Systems might only recognize the unprefixed versions.

The [ Spine Vocab ] is no longer being extended for package rendering metadata, so an unprefixed page-spread-center is not available. The following example demonstrates reflowable content with a two-page fixed-layout center plate that is intended to be rendered using synthetic spreads in any device orientation.

Note that the author has left spread behavior for the other reflowable parts of the Rendition undefined, since the global value of rendition: The following example demonstrates fixed-layout content, where synthetic spreads, when used, have to be disabled for a center plate.

Note that the rendition: It provides the Author with a mechanism to include a human- and machine-readable global navigation layer, thereby ensuring increased usability and accessibility for the user.

The navigation features of this adaptation are expressed through specializations of the [ HTML ] nav element. Each nav element in an EPUB Navigation Document represents a data island — an embedded source of specialized information within the general markup — from which Reading Systems can retrieve navigational information. Formulating the document as an XHTML Content Document enables its reuse in the linear reading order, avoiding the creation of additional tables of contents i. The visual display of components defined in the EPUB Navigation Document can be controlled using the hidden attribute, which has no effect outside of spine rendering i.

When designing an EPUB Navigation Document for such dual use, however, be aware that machine extraction of the content can result in loss of formatting control.

Scripting, styling and other HTML formatting can be stripped by a Reading System as it generates a custom control, such as the table of contents, from the markup. If such formatting and functionality is used, then the EPUB Navigation Document also needs to be included in the linear reading order. Another design consideration is to use progressive enhancement [ Content Docs 3.

When requested by a user, it must provide access to the links and link labels in the nav elements of the EPUB Navigation Document in a fashion that allows the user to activate the links.

When a link is activated, it must relocate the application's current reading position to the destination identified by that link. When a nav element carries the epub: HTML Heading content [0 or 1]. HTML Phrasing content [1 or more].
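As an illustration of retrieving links and link labels from nav elements, the sketch below walks a small XHTML document with Python's `xml.etree.ElementTree`. The markup, file names, and the `nav_links` helper are hypothetical, and the sketch assumes nav elements are not nested inside one another.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical navigation markup used only for this example.
NAV_XML = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <nav>
      <h1>Contents</h1>
      <ol>
        <li><a href="chapter1.xhtml">Chapter <em>One</em></a></li>
        <li><a href="chapter2.xhtml#s1">Chapter Two</a></li>
      </ol>
    </nav>
  </body>
</html>
""".strip()


def nav_links(root):
    """Collect (label, href) pairs from every nav element, in document order."""
    links = []
    for nav in root.iter(f"{{{XHTML_NS}}}nav"):
        for anchor in nav.iter(f"{{{XHTML_NS}}}a"):
            href = anchor.get("href")
            if href is None:
                continue  # an anchor without href is not an activatable link
            # Flatten nested phrasing content and normalize white space.
            label = " ".join("".join(anchor.itertext()).split())
            links.append((label, href))
    return links


root = ET.fromstring(NAV_XML)
print(nav_links(root))
# [('Chapter One', 'chapter1.xhtml'), ('Chapter Two', 'chapter2.xhtml#s1')]
```

Flattening the label to plain text mirrors the caution that formatting inside the navigation markup can be lost when a Reading System builds its own control from it.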



The term metadata literally means ‘data about data’. Metadata provide additional information about a certain file, such as its author, creation date, possible copyright restrictions or the application used to create the file. To download metadata using Svcutil.exe, locate the tool at the following location: C:\Program Files\Microsoft SDKs\Windows\v\bin. IOrganizationService is the primary web service that accesses data and metadata for your organization. This web service contains the methods that you use to write code that uses all the data and metadata in Dynamics