SYNOPSIS
src2srcml [-hVnizcgv] [-l language] [-d directory] [-f filename]
[-s version] [-x encoding] [-t encoding]
[input-source-code-file]... [-o output-srcML-file]
DESCRIPTION
The program src2srcml translates source-code files into srcML, an XML
representation of source code. The srcML format allows the use of XML
for addressing, querying, and transformation of source code. All text
from the original source-code file is preserved, including white space,
comments, and preprocessor statements. No preprocessing of the source
code is done. The tool can be applied to individual source-code files
or to code fragments.
The translation is fast and uses a stream-parsing approach with
top-down parsing, issuing elements as soon as they are detected.
The program src2srcml is typically used with srcml2src which converts
from the srcML format back to source code. Conversion of a source-code
file through src2srcml and then through srcml2src produces the original
source-code file. The program srcml2src also provides a set of
utilities for working with srcML files, including efficient querying
and transformation of source code.
Using the character - in place of an input source-code filename reads
from standard input, and in place of an output srcML filename writes to
standard output.
OPTIONS
-h, --help
Output the help and exit.
-V, --version
Output the version of src2srcml then exit.
-e, --expression
Translates a single, standalone expression.
-n, --archive
Stores all input source files in one srcML archive. Default
with more than one input file, a directory, or the --files-from
option.
--files-from
Treats the input file as a list of source files. Each file is
separately translated and collectively stored into a single
srcML archive. The list has a single filename on each line,
starting at the beginning of the line. Blank lines and lines
that begin with the character '#' are ignored. As with input
and output files, using the character - in place of a file name
takes the input list from standard input.
--no-namespace-decl
No output of namespace declarations. Useful when the output is
to be placed inside another XML document.
-z, --compress
Output is in compressed gzip format. This format can be read
directly and automatically by srcml2src.
-c, --interactive
Issues output as soon as it is parsed, for use by interactive
applications. The default is buffered output, which is faster.
For input from a terminal, interactive is the default.
-g, --debug
When translation errors occur, src2srcml preserves all text but
may issue incorrect markup. In debug mode the text with the
translation error is marked with a special set of tags with the
prefix err from the namespace http://www.sdml.info/srcML/srcerr.
Debug mode can also be indicated by defining a prefix for this
namespace URL, e.g.,
--xmlns:err="http://www.sdml.info/srcML/srcerr".
-v, --verbose
Outputs conversion and status information to stderr, including
the encodings used. Especially useful for monitoring the
progress of the --files-from option. The signal SIGUSR1 can be
used to toggle this option.
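The list format accepted by --files-from (one filename per line, with
blank lines and lines beginning with '#' ignored) can be illustrated
with a minimal Python sketch. The function name read_file_list is
hypothetical and is not part of src2srcml. Treating whitespace-only
lines as blank is an assumption; the description above does not say how
they are handled.

```python
def read_file_list(lines):
    """Return the filenames found in a --files-from style list.

    Illustrative only: mirrors the documented rules, not the tool's code.
    """
    names = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Blank lines are ignored (whitespace-only lines: an assumption).
        if not line.strip():
            continue
        # Lines that begin with '#' are ignored.
        if line.startswith("#"):
            continue
        names.append(line)
    return names


if __name__ == "__main__":
    sample = ["# project sources\n", "main.c\n", "\n", "src/util.cpp\n"]
    for name in read_file_list(sample):
        print(name)
```

A list prepared this way could then be passed to src2srcml as the input
file together with the --files-from option.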
METADATA OPTIONS
This set of options allows control over various metadata stored in the
srcML document.
-l, --language=language
The programming language of the source-code file. Allowable
values are C, C++, Java, or AspectJ. The language affects
parsing, the allowed markup, and what is considered a keyword.
The value is also stored as an attribute of the root element unit.
If not specified, the programming language is based on the file
extension. If the file extension is not available or not in the
standard list, the default is C++.
--register-ext=extension=language
Associates a file extension with a given language. Note: the
extension does not contain the '.'.
ent filename for standard input or where the filename is not
contained in the input path.
-s, --src-version=version
Sets the value of the attribute version to version. This is a
purely-descriptive attribute, where the value has no interpreta-
tion by src2srcml.
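The language selection rule described for --language and --register-ext
can be sketched in Python as follows. This is an illustration, not the
tool's implementation: the function name detect_language and the
contents of DEFAULT_EXTENSIONS are assumptions, since the standard
extension list is not given here. Only the precedence (explicit
language, then extension, then the C++ default) comes from the
description above.

```python
import os

# Hypothetical extension table; the real standard list is not documented here.
DEFAULT_EXTENSIONS = {
    "c": "C",
    "cpp": "C++",
    "java": "Java",
    "aj": "AspectJ",
}


def detect_language(filename, explicit=None, registered=None):
    """Pick a language the way the --language description outlines.

    explicit   -- value of --language, if given
    registered -- extra extension-to-language pairs, as added by
                  --register-ext (extensions written without the '.')
    """
    if explicit:
        return explicit
    table = dict(DEFAULT_EXTENSIONS)
    table.update(registered or {})
    # splitext returns the extension with its leading '.', so strip it.
    ext = os.path.splitext(filename)[1].lstrip(".")
    # No extension, or one not in the table: the documented default is C++.
    return table.get(ext, "C++")


if __name__ == "__main__":
    print(detect_language("main.java"))
    print(detect_language("-"))
    print(detect_language("defs.inc", registered={"inc": "C"}))
```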
MARKUP EXTENSIONS
Each extension to the srcML markup has its own namespace. An extension
is indicated in the srcML document by the declaration of its specific
namespace. The following flags make these namespaces easier to declare.
--literal
Additional markup of literal values using the element literal
with the prefix "lit" in the namespace
"http://www.sdml.info/srcML/literal".
Can also be specified by declaring a prefix for the literal
namespace using the --xmlns option, e.g.,
--xmlns:lit="http://www.sdml.info/srcML/literal"
--operator
Additional markup of operators using the element operator
with the prefix "op" in the namespace
"http://www.sdml.info/srcML/operator".
Can also be specified by declaring a prefix for the operator
namespace using the --xmlns option, e.g.,
--xmlns:op="http://www.sdml.info/srcML/operator"
--modifier
Additional markup of type modifiers using the element modifier
with the prefix "type" in the namespace
"http://www.sdml.info/srcML/modifier".
Can also be specified by declaring a prefix for the modifier
namespace using the --xmlns option, e.g.,
--xmlns:type="http://www.sdml.info/srcML/modifier"
LINE/COLUMN POSITION
Optional line and column attributes are used to indicate the position
of an element in the original source code document. Both the line and
column start at 1. The column position is based on the tab settings
with a default tab size of 8. Other tab sizes can be set using the
--tabs option.
--position
Insert line and column attributes into each start element.
These attributes have a default prefix of "pos" in the namespace
"http://www.sdml.info/srcML/position".
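As an illustration of how a tab setting affects column numbers, the
following Python sketch computes 1-based line and column positions with
a configurable tab size. It assumes the conventional tab-stop rule (a
tab advances to the next column of the form k * tab_size + 1); the exact
rule used by src2srcml is not stated above, and the function name
positions is hypothetical.

```python
def positions(text, tab_size=8):
    """Yield (character, line, column) with line and column starting at 1."""
    line, col = 1, 1
    for ch in text:
        yield ch, line, col
        if ch == "\n":
            # A newline starts the next line at column 1.
            line, col = line + 1, 1
        elif ch == "\t":
            # Assumed rule: jump to the next tab stop (1, 1 + tab_size, ...).
            col = ((col - 1) // tab_size + 1) * tab_size + 1
        else:
            col += 1


if __name__ == "__main__":
    for ch, line, col in positions("if\t(x)\n\ty;"):
        print(repr(ch), line, col)
```

Under this assumption, a character that follows a single leading tab is
at column 9 with the default tab size of 8, and at column 5 with a tab
size of 4.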
err     http://www.sdml.info/srcML/srcerr
lit     http://www.sdml.info/srcML/literal
op      http://www.sdml.info/srcML/operator
type    http://www.sdml.info/srcML/modifier
pos     http://www.sdml.info/srcML/position
The following options can be used to change the prefixes.
--xmlns=URI
Sets the URI for the default namespace.
--xmlns:PREFIX=URI
Sets the namespace prefix PREFIX for the namespace URI.
These options are an alternative way to turn on a markup option, by
declaring the namespace URI for that option. See MARKUP EXTENSIONS for
examples.
CPP MARKUP OPTIONS
This set of options allows control over how preprocessing regions are
handled, i.e., whether parsing and markup occur. In all cases the text
is preserved.
--cpp Turns on parsing and markup of preprocessor statements in non-
C/C++ languages such as Java. Can also be enabled by defining a
prefix for this cpp namespace URL, e.g.,
--xmlns:cpp="http://www.sdml.info/srcML/cpp".
--cpp-markup-else
Place markup in #else and #elif regions. Default.
--cpp-text-else
Only place text in #else and #elif regions leaving out markup.
--cpp-markup-if0
Place markup in #if 0 regions.
--cpp-text-if0
Only place text in #if 0 regions leaving out markup. Default.
SIGNAL PROCESSING
The following signals may be used to control src2srcml:
SIGUSR1
Toggles verbose option. Useful with multiple input files as in
the --files-from option.
SIGINT Completes current file translation (and output) with multiple
To translate a C source-code file main.c into the srcML file
main.c.xml:
src2srcml --language=C main.c -o main.c.xml
To translate a Java source-code file main.java into the srcML file
main.java.xml:
src2srcml --language=Java main.java -o main.java.xml
To specify the directory, filename, and version for an input file from
standard input:
src2srcml --directory=src --filename=main.cpp --src-version=1 - -o main.cpp.xml
To translate a source-code file in ISO-8859-1 encoding into a srcML
file with UTF-8 encoding:
src2srcml --src-encoding=ISO-8859-1 --encoding=UTF-8 main.cpp -o main.cpp.xml
RETURN STATUS
0: Normal
1: Error
2: Problem with input file
3: Unknown option
4: Unknown encoding
6: Invalid language
7: Language option specified, but value missing
8: Filename option specified, but value missing
9: Directory option specified, but value missing
10: Version option specified, but value missing
11: Text encoding option specified, but value missing
12: XML encoding option specified, but value missing
15: Invalid combination of options
16: Incomplete output due to termination
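A calling script can turn these status values into readable messages.
The following Python sketch copies the table above into a dictionary;
the helper names describe_status and translate are hypothetical, and
translate assumes that src2srcml is installed and on the PATH.

```python
import subprocess

# Status values exactly as listed under RETURN STATUS.
RETURN_STATUS = {
    0: "Normal",
    1: "Error",
    2: "Problem with input file",
    3: "Unknown option",
    4: "Unknown encoding",
    6: "Invalid language",
    7: "Language option specified, but value missing",
    8: "Filename option specified, but value missing",
    9: "Directory option specified, but value missing",
    10: "Version option specified, but value missing",
    11: "Text encoding option specified, but value missing",
    12: "XML encoding option specified, but value missing",
    15: "Invalid combination of options",
    16: "Incomplete output due to termination",
}


def describe_status(code):
    """Return the documented meaning of a src2srcml return status."""
    return RETURN_STATUS.get(code, "Undocumented status %d" % code)


def translate(args):
    """Run src2srcml with the given argument list.

    Returns (status, description). Requires src2srcml on the PATH.
    """
    code = subprocess.run(["src2srcml"] + list(args)).returncode
    return code, describe_status(code)
```

For example, translate(["--language=C", "main.c", "-o", "main.c.xml"])
runs the first translation shown in the examples and reports its status.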
CAVEATS
Translation is performed based on local information with no symbol ta-
SEE ALSO
srcml2src(1)
AUTHOR
Written by Michael L. Collard and Huzefa Kagdi
src2srcml 1.0 Sun Jul 21 23:22:56 EDT 2013 src2srcml(1)