Many researchers make use of BibTeX for maintaining a comprehensive bibliography which they can then draw on at will when writing papers.
Bib2ML is a handy utility that converts BibTeX files into HTML pages (or XML or SQL). You can use it to easily maintain an up-to-date online bibliography. In addition, you can specify, for each bibfile entry, a set of additional information that will appear inside the generated pages. The output depends on the theme used, but the page hierarchy has a structure similar to JavaDoc's (see the two screenshots generated with the themes 'Simple' and 'Dyna'). It includes an overview page, an index, and a list of the scientific domains to which the BibTeX entries belong.
An example of HTML pages generated by Bib2ML is available on the demonstration page. These demonstration pages were generated with the default options of Bib2ML.
To run Bib2ML you must install a Perl interpreter. Bib2ML was tested with Perl v5.8.3 under Linux. You must also install the following Perl packages (most are included in the default Perl distributions):

- File::Basename
- File::Path
- File::Spec
- Getopt::Long
- Pod::Usage

You can download the latest sources from the Bib2ML GitHub. The sources are usually distributed as an archive called bib2ml-x.x.tar.gz, where x.x is the Bib2ML version number.

1. Extract the archive. This creates the directory ./bib2ml-x.x, which contains all the sources:

       gzip -d -c bib2ml-x.x.tar.gz | tar -x

2. Create the directory /usr/local/lib/bib2ml and copy all the content of the subdirectory ./src into it:

       cd bib2ml-x.x
       mkdir /usr/local/lib/bib2ml
       cp -R -f ./src/* /usr/local/lib/bib2ml/

3. Make the main script executable:

       chmod ugo+x /usr/local/lib/bib2ml/bib2html.pl

4. Copy the remaining distribution files:

       cp -f ./COPYING /usr/local/lib/bib2ml/
       cp -f ./Changelog /usr/local/lib/bib2ml/
       cp -f ./AUTHORS /usr/local/lib/bib2ml/
       cp -f ./VERSION /usr/local/lib/bib2ml/
From now on you can launch Bib2ML by typing one of the following commands:

- if your Perl interpreter is /usr/bin/perl: `/usr/local/lib/bib2ml/bib2html.pl`
- if your Perl interpreter is not /usr/bin/perl: `path_to_perl/perl /usr/local/lib/bib2ml/bib2html.pl`

In the section that explains how to run Bib2ML, I assume that the launching command is bib2html. If you do not apply the commands given below, you must replace bib2html with one of the above commands.
To finalize the installation, you can create a symbolic link to the Bib2ML script from one of the directories in your PATH (I assume that /usr/local/bin is in your PATH):

    cd /usr/local/bin
    ln -s -f /usr/local/lib/bib2ml/bib2html.pl bib2html

This symbolic link lets all users run Bib2ML very simply.
Warning: this recommendation works only if the Perl interpreter is /usr/bin/perl.
This section explains how to install Bib2ML on a Windows operating system without CygWin installed. Bib2ML was successfully installed on WinXP with TeXLive 2007 and ActivePerl 5.8.8. The installation steps are the following (thanks to Dan Luecking for his report):

1. Create a directory named scripts\ in one of the texmf trees (if one doesn't exist). Make a subdirectory named bib2ml\ in scripts\ and a subdirectory named man\ in scripts\bib2ml\. The resulting directory tree should be:

       C:\path_to_texmf\texmf\
       |
       \- scripts\
          |
          \-- bib2ml\
              |
              \-- man\

2. Copy the contents of src\ from the Bib2ML archive and all subdirectories to scripts\bib2ml\, preserving the subdirectory structure.
3. Copy the contents of man\ from the Bib2ML archive to scripts\bib2ml\man\.
4. Copy the contents of doc\ to the documentation area of the texmf tree.
5. Create the links with irun:

       irun bib2html.pl bib2html.exe
       irun bib2sql.pl bib2sql.exe
       irun bib2xml.pl bib2xml.exe

6. Put the resulting *.exe files in C:\TeXLive\bin\win32\.
7. Run texhash.

The links created by irun (part of TeXLive) use the kpsewhich libraries and texmf.cnf to find the Perl scripts. The default setup should work since the search path for scripts is %TEXMF%/scripts/.
This section explains how to install Bib2ML on a Windows operating system with CygWin installed. Bib2ML was successfully installed on WinXP with CygWin 1.5.24. Bib2ML should be installed on Windows Systems with CygWin in the same way as for Unix operating systems. Please see the section 'Install on Unix Systems' for the details.
Bib2ML takes a list of arguments: the names of the bibfiles you wish to process, e.g.
bib2html firstfile.bib secondfile.bib
The output is written by default to the directory ./bib2html.
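If you regenerate the online bibliography from a script, the invocation above can be wrapped in a few lines of Python. This is only a sketch: it assumes that `bib2html` is on your PATH (see the installation section), and the helper names `build_bib2html_command` and `regenerate` are hypothetical. It uses only the `-o` and `-f` options documented in the option list.

```python
import subprocess


def build_bib2html_command(bibfiles, output_dir=None, force=False):
    """Build the argument list for one bib2html run (hypothetical helper)."""
    command = ["bib2html"]  # assumes the symbolic link from the installation step
    if output_dir is not None:
        command += ["-o", str(output_dir)]  # -o directory: where pages are generated
    if force:
        command.append("-f")  # -f: force overwriting of the output directory
    command += [str(name) for name in bibfiles]
    return command


def regenerate(bibfiles, output_dir=None, force=False):
    """Run bib2html and raise CalledProcessError if it exits with an error."""
    return subprocess.run(
        build_bib2html_command(bibfiles, output_dir, force), check=True
    )
```

Calling `regenerate(["firstfile.bib", "secondfile.bib"])` is then equivalent to typing the command shown above, with the pages written to ./bib2html.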
bib2html [options] file [file ...]
- -[no]b or --[no]bibtex: Generates (or not) a verbatim copy of the BibTeX entry.
- --cvs: If specified, disables the deletion of the subfiles .cvs, CVS and CVSROOT in the output directory.
- --doctitle text: Sets the title that appears in the main page.
- -f or --force: Forces overwriting of the output directory.
- -? or -h: Shows the list of supported options.
- --help or --man or --manual: Shows the manual page.
- -o directory or --output directory: Sets the directory in which the pages will be generated.
- --protect shell_wildcard: If specified, disables the deletion of the subfiles that match the specified shell wildcard in the output directory.
- --svn: If specified, disables the deletion of the subfiles .svn and svn in the output directory.
- --version: Shows the version of Bib2ML.
- --windowtitle text: Sets the text that appears as the window's title.
- --d name[=value] or --generatorparam name[=value]: Sets a generator parameter. It must be a key=value pair or simply a name. Example: "target=thisdirectory" defines the parameter target with the value "thisdirectory". Parameters that are not supported by the generator are ignored.
- --g class or --generator class: Specifies the generator to use. class must be a predefined generator identifier or a valid Perl class name. See --genlist to obtain the list of the predefined generators. The default generator is HTML. See the list of supported generators for more details.
- --generatorparams: Shows the list of supported parameters, and their semantics, for the selected generator.
- --genlist: Shows the list of supported generators.
- --jabref: The generator will translate JabRef's groups into Bib2ML domains.
- --theme name: Specifies the theme used by the generator. See the option --themelist to obtain the complete list of supported themes. See the list of the supported themes for more details about them.
- --themelist: Shows the list of supported themes. See the list of the supported themes for more details about them.
- --lang name: Sets the language used by the generator. See --langlist to obtain the list of supported languages.
- --langlist: Shows the list of supported languages.
- -p file or --preamble file: Sets the name of a file from which TeX preambles are read. You can use this option to dynamically define some unsupported LaTeX commands (see 'how to define and use a preamble').
- --texcmd: Shows the list of supported LaTeX commands. The supported TeX commands produce a specific HTML output according to their TeX semantics.
- -q: Don't be verbose: only error messages are displayed.
- --[no]sortw: Shows (or not) a list of warnings sorted by line of appearance. For example, this can be used to obtain output that is easier for a parsing program to handle.
- -v: Be more verbose. Each time this option is specified, the verbosity level is increased.
- --[no]warning: If false, warnings are converted to errors. An error stops the program when it occurs; a warning does not.

Bib2ML uses input files that follow the BibTeX file format as closely as possible. It adds more restrictive constraints than the official format, and includes some additional fields.
To be recognized by Bib2ML, each entry must begin with an @, immediately followed by the type of entry it is (see the 'list of recognized entry types'), immediately followed by a {. It will then process the fields you've specified for that entry until it hits the closing } (see the 'list of recognized fields'). The format then looks something like this:

    @entrytype{entry_key,
        fieldname1 = "Contents",
        fieldname2 = {Contents},
        fieldname3 = contents,
        ...
    }

The first piece of information required by the BibTeX file format is the identifier of the entry. This entry_key must be unique and, in most cases, is composed of the author's name, the publication year, and so on. In LaTeX, this key is used to reference the bibliographical entry.
Three types of field contents are valid, as shown here. In fieldname1, the contents are enclosed in quotes; in fieldname2 they are enclosed in curly braces, and in fieldname3 there are no surrounding characters. The third type is often used to specify pre-defined string values, and any value specified in this way will be compared to the list of @strings you've defined for a possible match (if there is a match, it will be expanded out to the full value of the @string).
Any amount of whitespace can come between the fieldname and the =, or between the = and the contents. In addition, Bib2ML can handle nested {}s in the contents of a field.
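To make the entry format concrete, here is a minimal Python sketch of how such an entry could be split into its type, key and fields. It is an illustration only, not Bib2ML's actual (Perl) parser: the function names are hypothetical, and it assumes a single entry per string and no double quote inside a quoted value. Each value is returned with its kind (quoted, braced or bare), so that bare values can later be matched against @string definitions.

```python
def _read_value(text, pos):
    """Read one field value at text[pos]; return ((kind, value), next_pos)."""
    if text[pos] == '"':  # fieldname1 style: contents enclosed in quotes
        end = text.index('"', pos + 1)
        return ("quoted", text[pos + 1:end]), end + 1
    if text[pos] == "{":  # fieldname2 style: curly braces, possibly nested
        depth = 0
        for i in range(pos, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    return ("braced", text[pos + 1:i]), i + 1
        raise ValueError("unbalanced braces in field value")
    end = text.find(",", pos)  # fieldname3 style: no surrounding characters
    if end == -1:
        end = len(text)
    return ("bare", text[pos:end].strip()), end


def parse_entry(text):
    """Parse one '@type{key, name = value, ...}' entry."""
    text = text.strip()
    if not text.startswith("@"):
        raise ValueError("an entry must begin with '@'")
    open_brace = text.index("{")
    entry_type = text[1:open_brace].strip().lower()
    body = text[open_brace + 1:text.rindex("}")]
    key, _, rest = body.partition(",")
    fields = {}
    pos = 0
    while pos < len(rest):
        eq = rest.find("=", pos)
        if eq == -1:
            break
        name = rest[pos:eq].strip().lower()
        pos = eq + 1
        while pos < len(rest) and rest[pos].isspace():
            pos += 1  # any amount of whitespace may follow the '='
        if pos >= len(rest):
            break
        fields[name], pos = _read_value(rest, pos)
        comma = rest.find(",", pos)  # move past the separator, if any
        pos = len(rest) if comma == -1 else comma + 1
    return entry_type, key.strip(), fields
```

For example, `parse_entry('@article{Galland.esm00, title = "A {Nested} Title", month = jan }')` yields the type `article`, the key `Galland.esm00`, a quoted `title` and a bare `month`.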
Bib2ML recognizes the following bibliography entry types (with the HTML generator):

- @article: an article in a national or international journal, e.g. International Journal of Production Economics.
- @book: a book, e.g. Les Systèmes Multi-Agents by Professor Jacques Ferber.
- @booklet: a standalone part of a book, i.e. a part with its own author, title...
- @inbook: a chapter or a part of a book.
- @incollection: an article in a collection of national or international journals, e.g. Lecture Notes on Artificial Intelligence.
- @inproceedings: a paper in the proceedings of a national or international conference, e.g. European Simulation Multiconferences.
- @manual: a technical manual published (or not) by a university. Not to be confused with a technical report, which is a report, not a manual.
- @mastersthesis: a student thesis written under the authority of a university or a school, e.g. an engineering report.
- @misc: see the note below.
- @phdthesis: a research thesis written under the authority of a laboratory, an institution or a university, e.g. PhD thesis, Doctorat thesis...
- @proceedings: a book that contains all the papers of a conference, e.g. Proceedings of the International Conference on Multi-Agent Systems. Do not choose this type for a single paper in a conference (see @inproceedings instead).
- @techreport: a technical report published by a university. In general a technical report has an internal number which is specific to the institution. Not to be confused with a technical manual, which is a manual, not a report.
- @unpublished: a document which was never published.

Any other entry type will be processed as @misc.
Note about the type @misc: this entry type is considered the default. It requires the following fields: author and year. This constraint does not come from the standard BibTeX file format; it was introduced for Bib2ML's page generation.
I welcome requests to support other entry types. Generators can also support their own entry types. See the section about supported generators for more details.
Bib2ML recognizes the following bibliography field types (with the HTML generator). The actual support of a field depends on the entry type in which it appears. The following table shows where each field is required (R) and where it is optional (O).
| field | article | book | booklet | inbook | inproceedings / incollection | manual | mastersthesis | misc | phdthesis | proceedings | techreport | unpublished |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| address | O | O | O | O | O | O | O | O | O | ||||
| annote | O | O | O | O | O | O | O | O | O | O | O | O | |
| author | R | RO | R | RO | R | O | R | RO | R | RO | R | R | |
| booktitle | R | ||||||||||||
| chapter | R | ||||||||||||
| edition | O | O | O | ||||||||||
| editor | RO | RO | O | RO | RO | ||||||||
| howpublished | O | O | |||||||||||
| institution | R | ||||||||||||
| journal | R | ||||||||||||
| month | O | O | O | O | O | O | O | O | O | O | O | ||
| note | O | O | O | O | O | O | O | O | O | O | O | R | |
| number | O | O | O | O | O | O | |||||||
| organization | O | O | O | ||||||||||
| pages | O | O | O | ||||||||||
| publisher | R | R | O | O | |||||||||
| school | R | R | |||||||||||
| series | O | O | O | O | |||||||||
| title | R | R | R | R | R | R | R | R | R | R | R | R | |
| type | O | O | O | O | |||||||||
| volume | O | O | O | O | O | ||||||||
| year | R | R | R | R | R | O | R | R | R | R | R | R |
I welcome requests to support other fields. Generators can also support their own fields. See the section about supported generators for more details.
Like BibTeX, Bib2ML also handles arbitrary @string definitions, which can be used in any entry field, e.g.

    @string{acl = "Association for Computational Linguistics"}
    ...
    @proceedings{PROC,
        publisher = acl,
        ...
    }

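The expansion of @string values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Bib2ML's implementation: the helper names are hypothetical, only the quoted form of @string shown above is recognized, and names are matched exactly (whether Bib2ML ignores case here is not stated in this document).

```python
import re

# Matches definitions of the form: @string{name = "value"}
STRING_DEF = re.compile(
    r'@string\s*\{\s*([^\s=]+)\s*=\s*"([^"]*)"\s*\}', re.IGNORECASE
)


def collect_strings(bibtex_text):
    """Return a dict mapping each @string name to its full value."""
    return {name: value for name, value in STRING_DEF.findall(bibtex_text)}


def expand_field(kind, value, strings):
    """Expand a field value; only bare (unquoted, unbraced) values are looked up."""
    if kind == "bare":
        return strings.get(value, value)  # no match: keep the value unchanged
    return value
```

With the definition above, `expand_field("bare", "acl", strings)` returns the full association name, while a quoted or braced `acl` is left untouched.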
Bib2ML also supports the definition of TeX preambles. TeX preambles are TeX commands which are evaluated and run before any processing of the BibTeX entries. A preamble is defined with @preamble, e.g.
@preamble{\def\th{\ensuremath{^{th}}}} ...
The TeX commands which can be put inside a @preamble are limited to the commands supported by Bib2ML (see the command-line option --texcmd to obtain a list).
In some fields (e.g. author and editor) you must specify a list of names. This list is composed of names separated by the keyword AND. Each name must respect one of the following syntaxes:

- [von] Last, jr, First
- First [von] Last, jr
- [von] Last, First [jr]
- First [von] Last [jr]

If present, the jr part must be one of junior, jr., jr, senior, sen., sen, esq., esq, phd. and phd.
Good Example: DUPONT, Henri and Pierre, Alain Michel and Jim WASHINGTON jr.
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|
| DUPONT | Henri | and | Pierre | Alain Michel | and | Jim | WASHINGTON | jr. |
| last | first | | last | first | | first | last | jr |
Bad Example: Henri DUPONT, Alain Michel Pierre and Jim WASHINGTON jr.
| 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| Henri DUPONT | Alain Michel Pierre | and | Jim | WASHINGTON | jr. |
| last | first | | first | last | jr |
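The two examples above can be reproduced with a short Python sketch of the four name syntaxes. It is not Bib2ML's parser: the function names are hypothetical, and treating lowercase-initial words as the von part follows the usual BibTeX convention, which is an assumption here since this document does not say how Bib2ML detects it.

```python
import re

JR_PARTS = {"junior", "jr.", "jr", "senior", "sen.", "sen", "esq.", "esq", "phd.", "phd"}


def _split_von(tokens):
    """Split leading lowercase words (assumed to be 'von') from the last name."""
    i = 0
    while i < len(tokens) - 1 and tokens[i][:1].islower():
        i += 1
    return " ".join(tokens[:i]), " ".join(tokens[i:])


def _first_von_last(tokens):
    """Handle 'First [von] Last': first names end at the first lowercase word."""
    i = 0
    while i < len(tokens) - 1 and not tokens[i][:1].islower():
        i += 1
    von, last = _split_von(tokens[i:])
    return " ".join(tokens[:i]), von, last


def parse_name(name):
    parts = [part.strip() for part in name.split(",")]
    jr = ""
    if len(parts) == 3:  # [von] Last, jr, First
        von, last = _split_von(parts[0].split())
        jr, first = parts[1], parts[2]
    elif len(parts) == 2 and parts[1].lower() in JR_PARTS:  # First [von] Last, jr
        first, von, last = _first_von_last(parts[0].split())
        jr = parts[1]
    elif len(parts) == 2:  # [von] Last, First [jr]
        von, last = _split_von(parts[0].split())
        tokens = parts[1].split()
        if tokens and tokens[-1].lower() in JR_PARTS:
            jr = tokens.pop()
        first = " ".join(tokens)
    elif len(parts) == 1:  # First [von] Last [jr]
        tokens = parts[0].split()
        if len(tokens) > 1 and tokens[-1].lower() in JR_PARTS:
            jr = tokens.pop()
        first, von, last = _first_von_last(tokens)
    else:
        raise ValueError("unsupported name syntax: " + name)
    return {"first": first, "von": von, "last": last, "jr": jr}


def parse_name_list(value):
    """Split a field value on the keyword 'and' and parse each name."""
    names = re.split(r"\s+and\s+", value.strip(), flags=re.IGNORECASE)
    return [parse_name(name) for name in names]
```

Applied to the bad example, the sketch returns only two names, the first with last name `Henri DUPONT` and first name `Alain Michel Pierre`, which is exactly the misreading shown in the second table.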
The generator is one of the major modules of Bib2ML (along with the BibTeX parser). Its role is to create the HTML files from the internal data structure produced by the parser. The generator defines the layout of the generated pages (use of 3 frames, links to the overview and the index from the header of each entry's page...).
In addition to the usable generators listed below, Bib2ML includes an abstract generator which is the basis of all the others.
The generator called HTML is the default HTML generator of Bib2ML. Its purpose is to generate basic content which is quite similar to that of many BibTeX to HTML tools (such as the LaTeX distribution's bib2html).

- ... annote).
- The annote (or comments) field is displayed inside a section just below the field table.
- If --bibtex was specified (default), a verbatim copy of the BibTeX entry is generated in its own section.

The generator accepts the following parameters:

- author-regexp=expr: A Perl regular expression (case-insensitive) against which the last name of an author is matched. If the author matches, (s)he is included in the author list of the overview window.
- hideindex: If present, hides the index link and does not generate the index files.
- html-generator=encoding: A string that corresponds to the character encoding of the generated HTML pages. The default encoding is ISO-8859-1.
- max-names-overview=integer: The maximal number of authors in the overview page.
- max-names-list=integer: The maximal number of authors in the listing in the lower-left frame.
- newtype=expr: A comma-separated list of new publication types, with singular and plural labels. The value must follow the format type:Singular:Plural[,type:Singular:Plural...], where type is the identifier of the new type, Singular is the label used when this type has zero or one entry, and Plural is the label used when this type has two or more entries. Each new type will appear in the overview pages. This feature does not describe how to generate the content of the corresponding entry pages, so they are generated as for @misc entries (unless you define your own generator).
- stdout: If present, forces Bib2ML to write its output to the standard output instead of files.
- type-matching=expr: A comma-separated list of items which initializes an associative array of entry type mappings. Each item must follow one of the following syntaxes:
  - type => type (since version 1.3)
  - type -> type (since version 1.3)
  - type > type (since version 1.3)
  - type , type (original syntax)

  For example, incollection,article,inproceedings,article means that all the BibTeX @incollection entries will be displayed as @article entries, and likewise for the @inproceedings entries. With the original syntax, the value must therefore be a list of pairs. An alternative syntax is type=>type[,type=>type...]; for the same example as above, the value would be incollection=>article,inproceedings=>article.
- xml-verbatim: If this parameter is given, Bib2ML generates a verbatim text that corresponds to the XML specification of the entries. This text is placed just below the BibTeX verbatim text.

The generator Extended is an extension of HTML. Its purpose is to provide some additional features.

- abstract: the abstract associated with the entry (in most cases, it is written at the beginning of the paper).
- adsurl: a URL from the Astrophysics Citation Reference System which corresponds to the entry. This field supports the URL protocols ftp:, file:, https:, gopher:, mailto: and http: (the last is the default).
- doi: the Document Object Identifier (DOI), which is assumed to be a URL linked to a document on the Internet. This field supports the URL protocols ftp:, file:, https:, gopher:, mailto: and http: (the last is the default).
- isbn: the ISBN number of the entry.
- issn: the ISSN number of the entry.
- keywords: the list of keywords associated with the entry (in most cases, they are mentioned at the beginning of the paper).
- localfile: the path (on your local host) to an electronic version of the document described by the entry (I recommend putting only a PDF or a PostScript file here). If this field is present and the corresponding file is found, Bib2ML generates a link to it in the entry's page. See the parameters of this generator to change the default location of the electronic files.
- pdf: a URL associated with the entry which corresponds to a PDF file. This field supports the URL protocols ftp:, file:, https:, gopher:, mailto: and http:.
- readers: a list of people who read this entry. The value of this field must follow the BibTeX syntax for names.
- url: a URL associated with the entry. This field supports the URL protocols ftp:, file:, https:, gopher:, mailto: and http: (the last is the default).

- ... abstract and keywords).
- abstract and keywords are put inside a specific section.

The generator accepts the following parameters:

- absolute-source=path: the absolute path of the directory where the downloadable documents can be found (see the field localfile for details about the downloadable documents). The parameters absolute-source, relative-source and target-url are mutually exclusive.
- backslash: if present, indicates that backslashes will be removed from the link fields (url, ftp...).
- doc-repository=path: if present, indicates the directory where the electronic documents are stored. This option assumes that electronic documents have a name similar to the BibTeX key. For example, the entry with the key Galland.esm00 could have an associated electronic document named Galland.esm00.pdf, Galland.esm00.PDF, Galland.esm00.ps or Galland.esm00.PS.
- nodownload: if present, indicates that no link to the electronic documents will be generated. By extension, if present, no copy of these files will be made.
- relative-source=path: the relative path of the directory where the downloadable documents can be found (see the field localfile for details about the downloadable documents). This path is relative to the directory where the BibTeX file is located. The parameters absolute-source, relative-source and target-url are mutually exclusive.
- target-url=url: a URL where the downloadable documents can be found. If this URL is specified, Bib2ML assumes that all the files can be downloaded from it, and makes no copy of them. The parameters absolute-source, relative-source and target-url are mutually exclusive.

The generator Domain is an extension of Extended. Its purpose is to provide additional features related to the scientific domains of the entries. This generator introduces the concept of "domain", which corresponds to the name of a scientific context/domain. An entry can belong to one or more domains.

- domain: the first domain to which this entry belongs. This field does not overwrite the previous domain settings (except for domain).
- domains: a list of domains to which this entry belongs. The domain separator is the character :. This field does not overwrite the previous domain settings (except for domains).
- nddomain: the second domain to which this entry belongs. This field does not overwrite the previous domain settings (except for nddomain).
- rddomain: the third domain to which this entry belongs. This field does not overwrite the previous domain settings (except for rddomain).

The generator called XML is the default XML generator of Bib2ML. Its purpose is to generate basic content which conforms to the XML DTD defined by BibteXML.

- stdout: If present, forces Bib2ML to write its output to the standard output instead of files.
- xml-encoding=encoding: The character encoding which will be put into the header of the generated XML file. All character encodings supported by the XML specification are allowed (ISO-8859-1, UTF8...). The default value is ISO-8859-1.

The generator called SQL is the default SQL generator of Bib2ML. Its purpose is to generate basic content which conforms to the SQL schema illustrated by the following figure.


- sql-encoding=name: Defines the character encoding used to generate the SQL script.
- sql-engine=name: Defines the SQL engine for which the SQL script should be generated. The supported engines are "mysql" and "pgsql".
- stdout: If present, forces Bib2ML to write its output to the standard output instead of files.

Bib2ML lets you select a theme which influences the look of the generated pages. You can select a theme with the command-line option --theme and list all the supported themes with --themelist.
The theme Simple is the default theme. It is quite similar to the default output of JavaDoc.

The theme Dyna is an experimental theme. It uses its own look policy and includes some dynamic features such as collapsing lists.

Bib2ML supports French, English, Spanish (thanks to Sebastian), Portuguese (thanks to João) and Italian (thanks to Cristian).
I would like to thank the following people for generously taking the time to point out bugs, suggest improvements, or send me Bib2ML patches. Many thanks to:

- ... uniq to eliminate redundant values from a sorted list.
- ... \# and \L.
- ... __texcommand_map_to when an accented TeX command was encountered but Bib2ML does not know any corresponding HTML character (e.g. \'b).
- ... __texcommand_map_to which does not return the right variable's value.
- ... save_generator_parameter which prevents the parameters' values from being set properly.
- ... addentry which allows adding a BibTeX field. All the field names are lower-cased to avoid setting problems.
- ... max-titlelength-overview and show-journalparams-overview.

I've come across mentions of other bib2html programs (see below for a non-exhaustive list). This program is in no way related to any of them. For the curious, it was implemented using Perl. Other BibTeX to HTML tools are (non-exhaustive list):