This document is written for consumption by anyone who has written a BBEdit language module, either codeless or compiled. It documents the changes in the language module API as well as information that is essential for developing language modules that make the most of the improvements in BBEdit 11.0 and later.
This document supplements the information provided in the Codeless Language Module Reference as well as the information in the "Writing Language Modules" document included as part of the BBEdit SDK.
Use this one weird trick to debug your compiled language module in BBEdit with minimal effort:
In your language module project's "Run" scheme (in Xcode), go to the "Info" tab, and from the Executable popup, choose "Ask on Launch".
Then, go to the Arguments tab, and add an argument as follows:
Now choose "Run". Xcode will ask you to choose the application
to run. Choose BBEdit. (Note that it must not be running.) Xcode
will then launch BBEdit with the argument provided above, which
tells it to load your language module from its build location.
You can then debug it in place.
Language modules support two new property list keys:
BBLMSpellableRunKinds and BBLMNonSpellableRunKinds. Between
them, these keys can eliminate the need for modules to implement
the kBBLMCanSpellCheckRunMessage message, and the appropriate
use of these keys adds considerable flexibility.
Each of these keys is an array, listing the run kinds that can
(or cannot) be inspected by the spell checker. The test is
exclusionary: BBLMNonSpellableRunKinds is checked first, and
if a run is found there, it is not spell checked. If a run is not
found there, then BBLMSpellableRunKinds is checked. Either or
both of these arrays may contain wildcards.
A common case is to have
BBLMNonSpellableRunKinds contain an
appropriate list of runs, and
BBLMSpellableRunKinds contain a
single entry: "*". Thus, specific run kinds listed in
BBLMNonSpellableRunKinds are not spell checked, and all other
run kinds are. This is a useful and typical construction for
text-oriented language modules, such as TeX and Markdown:

    <key>BBLMNonSpellableRunKinds</key>
    <array>
        <string>com.barebones.bblm.TeX.verbatim</string>
        <string>com.barebones.bblm.TeX.inline-verbatim</string>
        <string>com.barebones.bblm.TeX.command</string>
        <string>com.barebones.bblm.TeX.math-string</string>
        <string>com.barebones.bblm.TeX.delimiter-start</string>
        <string>com.barebones.bblm.TeX.delimiter-stop</string>
        <string>com.barebones.bblm.TeX.param-command</string>
        <string>com.barebones.bblm.TeX.param-math-string</string>
        <string>com.barebones.bblm.TeX.param-string-command</string>
    </array>
    <key>BBLMSpellableRunKinds</key>
    <array>
        <string>*</string>
    </array>

Typically, a programming or scripting language will want to allow spell checking in comments, but not elsewhere. For example, in the Python module:

    <key>BBLMSpellableRunKinds</key>
    <array>
        <string>com.barebones.bblm.line-comment</string>
        <string>com.barebones.bblm.block-comment</string>
    </array>
    <key>BBLMNonSpellableRunKinds</key>
    <array>
        <string>com.barebones.bblm.code</string>
        <string>com.barebones.bblm.double-string</string>
    </array>

Either of these arrays may be absent or empty. Note that if a match
is not found (and this includes the case in which
BBLMNonSpellableRunKinds is absent
or empty), then BBEdit will still call the language module. If the
module does not implement
kBBLMCanSpellCheckRunMessage, then the
run is not checked.
Made a change to the language module support for
BBLMSpellableRunKinds and BBLMNonSpellableRunKinds:
if at least one of these is present, the application will not
call the language module with
kBBLMCanSpellCheckRunMessage,
so the keys should be complete as needed. If either key is absent
or fails to match the run kind, the behavior is unspecified (but
the application will always try to behave predictably).
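
The order of the test described above can be sketched in plain C++. This is only an illustration of the logic, not BBEdit's implementation: the function and type names are hypothetical, and the wildcard rule (a bare `*`, or a trailing `*` acting as a prefix match) is an assumption, since the exact wildcard syntax is not spelled out here.

```cpp
#include <string>
#include <vector>

// Outcome of consulting the two plist arrays for one run kind.
enum class SpellDecision { Skip, Check, NoMatch };

// Hypothetical wildcard rule (an assumption, not documented syntax):
// a trailing '*' matches any run kind that starts with the text before it,
// so a bare "*" matches everything. Anything else must match exactly.
bool matchesRunKind(const std::string &pattern, const std::string &runKind) {
    if (!pattern.empty() && pattern.back() == '*') {
        const std::string prefix = pattern.substr(0, pattern.size() - 1);
        return runKind.compare(0, prefix.size(), prefix) == 0;
    }
    return pattern == runKind;
}

bool listMatches(const std::vector<std::string> &patterns,
                 const std::string &runKind) {
    for (const auto &pattern : patterns)
        if (matchesRunKind(pattern, runKind)) return true;
    return false;
}

// Exclusionary test: the non-spellable list is consulted first.
SpellDecision decide(const std::vector<std::string> &nonSpellable,
                     const std::vector<std::string> &spellable,
                     const std::string &runKind) {
    if (listMatches(nonSpellable, runKind)) return SpellDecision::Skip;
    if (listMatches(spellable, runKind)) return SpellDecision::Check;
    return SpellDecision::NoMatch;  // neither list matched
}
```

The `NoMatch` result is kept separate on purpose: what happens when neither array matches is up to the application, so a module should list its run kinds completely rather than rely on that case.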
Language modules support two new property list keys:
BBLMCompletableRunKinds and BBLMNonCompletableRunKinds. Between
them, these keys eliminate the need for modules to implement the
kBBLMFilterRunForTextCompletion message, and the appropriate
use of these keys adds considerable flexibility.
Each of these keys is an array, listing the run kinds that can (or
cannot) be tokenized for autocompletion. The test is exclusionary:
BBLMNonCompletableRunKinds is checked first, and if a run is
found there, it is not tokenized. If a run is not found there,
then BBLMCompletableRunKinds is
checked. Either or both of these arrays may contain wildcards.
Either of these arrays may be absent or empty. Note that if at
least one of these is present, the application will not call the
language module with
kBBLMFilterRunForTextCompletion, so the
keys should be complete as needed. If either key is absent or fails
to match the run kind, the behavior is unspecified (but the
application will always try to behave predictably).
kBBLMMatchKeywordMessage message is no longer sent to
compiled language modules; only
kBBLMMatchKeywordWithCFStringMessage is used, with a
Language modules can now specify arbitrary sets of
keywords, each grouped by the run kind that should be used to
color them. The
BBLMKeywords key is an array of dictionaries.
In each dictionary, there is a
RunKind key that specifies the
run kind to be used (one of the factory-supplied run kinds, or
one defined in your language module's
BBLMRunColors array), and either a
Keywords key whose value is an array of keywords to be
colored using that run kind, or a
KeywordFileName key which
refers to a file in the language module's bundle (for compiled
So, for example, the
BBLMKeywords list looks like this for
the built-in PHP language module:

    <key>BBLMKeywords</key>
    <array>
        <dict>
            <key>RunKind</key>
            <string>com.barebones.bblm.keyword</string>
            <key>KeywordFileName</key>
            <string>PHP Keywords.txt</string>
        </dict>
        <dict>
            <key>RunKind</key>
            <string>com.barebones.bblm.predefined-symbol</string>
            <key>KeywordFileName</key>
            <string>PHP Predefined Names.txt</string>
        </dict>
    </array>

Alternatively, you could write something like this:

    <key>BBLMKeywords</key>
    <array>
        <dict>
            <key>RunKind</key>
            <string>com.barebones.bblm.keyword</string>
            <key>Keywords</key>
            <array>
                <string>abstract</string>
                <string>and</string>
                <string>array</string>
                <string>as</string>
                <string>break</string>
                <string>case</string>
                <string>catch</string>
                <string>cfunction</string>
                <string>class</string>
                <string>clone</string>
                <!-- and so on ... -->
            </array>
        </dict>
        <dict>
            <key>RunKind</key>
            <string>com.barebones.bblm.predefined-symbol</string>
            <key>KeywordFileName</key>
            <string>PHP Predefined Names.txt</string>
        </dict>
    </array>

The run kinds you can use are not limited to the built-in ones; you
can define your own run kinds and color mappings using a
BBLMRunColors key, as previously described. You must also add a
BBLMRunNames key which maps those run kinds to human-readable
names, so that users can adjust the color settings.
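
To make the grouping concrete, here is a minimal C++ sketch of how a set of keyword groups could be flattened into a single word-to-run-kind lookup. It is not how BBEdit stores keywords internally; `KeywordGroup`, `buildKeywordTable`, and `runKindForWord` are hypothetical names, and the handling of a word that appears in two groups (first group wins) and of letter case (exact match) are assumptions.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// One entry of the BBLMKeywords array, after its Keywords array (or its
// keyword file) has been read into memory. The name is illustrative only.
struct KeywordGroup {
    std::string runKind;
    std::vector<std::string> keywords;
};

using KeywordTable = std::unordered_map<std::string, std::string>;

// Flatten the groups into a single word -> run kind table.
// Assumption: if a word appears in more than one group, the first group wins.
KeywordTable buildKeywordTable(const std::vector<KeywordGroup> &groups) {
    KeywordTable table;
    for (const auto &group : groups)
        for (const auto &word : group.keywords)
            table.emplace(word, group.runKind);  // emplace keeps an existing entry
    return table;
}

// Returns the run kind for a word, or an empty string if it is not a keyword.
std::string runKindForWord(const KeywordTable &table, const std::string &word) {
    const auto it = table.find(word);
    return it == table.end() ? std::string() : it->second;
}
```

A static table like this costs one hash lookup per word, which is the reason a static listing in the plist is attractive whenever the keyword set is known in advance.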
BBLMKeywords supersedes the four old keys, which are
still supported but should no longer be used:
kBBLMMatchPredefinedNameMessage are no longer sent to language
BBLMSupportsPredefinedNameLookups are no longer used in module
plists. Instead, there's a new key,
which triggers the sending of a new message:
kBBLMRunKindForWordMessage. This allows arbitrary mapping at
runtime of words to run kinds, which in turn provides additional
flexibility for coloring.
Static listing of keyword-to-run-kind mapping in the module plist is
still desirable (because it's faster), but for situations where the
test must be done at runtime based on certain string
is an appropriate solution.
The parameters to this message are (input) the potential keyword,
and (output) the run kind that should be used to color the word. (If
the word is not known, return
Language modules may now use an (optional) key:
BBLMKeywordPatterns. This key contains an array of
dictionaries, each with two key/value pairs. The first key,
RunKind, contains the name of the run (in the module's name
space, or one of the factory-defined run kinds). The second key,
Pattern, contains a Grep pattern which is used to match the
keyword. For example:

    <key>BBLMKeywordPatterns</key>
    <array>
        <dict>
            <key>RunKind</key>
            <string>com.example.bblm.fo</string>
            <key>Pattern</key>
            <string>fo.*</string>
        </dict>
        <dict>
            <key>RunKind</key>
            <string>com.example.bblm.fa</string>
            <key>Pattern</key>
            <string>fa.*</string>
        </dict>
        <dict>
            <key>RunKind</key>
            <string>com.example.bblm.fl</string>
            <key>Pattern</key>
            <string>fl.*</string>
        </dict>
    </array>

If the module has no static
BBLMKeywords entry, or if the word
being examined fails to match an entry in the
BBLMKeywords list,
then BBEdit will attempt to match the keyword against one of the
patterns. If a match is found, the appropriate run kind is generated.
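
The fallback order (static keywords first, then patterns) can be sketched as follows. This is an illustration only: the names are hypothetical, `std::regex` uses ECMAScript syntax rather than BBEdit's Grep engine, and two details are assumptions, namely that a pattern must match the whole word and that patterns are tried in array order.

```cpp
#include <regex>
#include <string>
#include <unordered_map>
#include <vector>

// One entry of the BBLMKeywordPatterns array. The name is illustrative only.
struct KeywordPattern {
    std::string runKind;
    std::regex pattern;
};

// Returns the run kind for a word, or an empty string if nothing matches.
// staticKeywords maps each statically listed keyword to its run kind.
std::string classifyWord(
        const std::unordered_map<std::string, std::string> &staticKeywords,
        const std::vector<KeywordPattern> &patterns,
        const std::string &word) {
    // 1. The static keyword list takes precedence.
    const auto it = staticKeywords.find(word);
    if (it != staticKeywords.end()) return it->second;

    // 2. Otherwise try each pattern in order (assumed: whole-word match).
    for (const auto &entry : patterns)
        if (std::regex_match(word, entry.pattern)) return entry.runKind;

    return std::string();
}
```

With the three example patterns above, `foo` would map to `com.example.bblm.fo` and `flag` to `com.example.bblm.fl`, while a word such as `bar` would match nothing.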
Codeless language modules now support a
Number Pattern
key in the
Language Features property. The
Number Pattern key
may be omitted; if so, BBEdit will apply a default pattern which
matches integers, floating point numbers, and hexadecimal numbers.
Here is the default pattern, in a representation suitable for
inclusion in codeless language modules. For readability it's
formatted as an SGML
CDATA section and uses the
(?x) modifier for extended syntax, which allows comments and whitespace.

    <![CDATA[(?x: (?# this just turns on extended syntax, which allows whitespace and comments)
        (?<![\d\w.])            (?# must not be preceded by a digit or word char or period)
        (?:                     (?# non-capturing group for alternation)
                                (?# version 1: hex notation like 0x0123456789abcdef)
            (?:0x[[:xdigit:]]+) (?# the number written in hex form)
        |
            (?:                 (?# version 2: all other numbers, including whole numbers, decimals and exponentials)
                [-+]?           (?# optional plus or minus sign is included as part of the number)
                (?:             (?# non-capturing group for alternation)
                    \d+\.\d+    (?# version 2a: digits followed by a decimal followed by digits)
                |
                    \d+         (?# version 2b: just digits)
                )
                (?:             (?# optional exponent notation)
                    [eE][-+]?   (?# with optional pos/neg)
                    \d+         (?# numeric portion of the exponent)
                )?
            )
        )
        (?=\b)                  (?# required word boundary after number. Here a decimal is fine.)
    )]]>

Codeless language modules support a new key in the
Language Features dictionary:
Keyword Pattern. This can be
used to specify runs of text that are to be colored using the
Keywords color, based on a Grep pattern. The intention is to
support languages with multi-word "keywords" which contain
word-break characters or white space; so the pattern you use
should be written accordingly. A pattern that matches across a
line boundary will probably produce unexpected results, so we
recommend using the non-greedy quantifiers when possible, or
character classes which don't include line breaks.
If a language module supplies a
BBLMRunNameUIOrdering
key, the run kinds in that array are used in the specified order
to map names for the preferences UI. If no
BBLMRunNameUIOrdering key is supplied, the keys in the
BBLMRunNames array are sorted alphabetically for presentation
in the UI.
Beginning with BBEdit 12.5, plug-in language modules may specify custom badge information for use in the function menu.
Here's how it works:
The BBLMFunctionKinds enumeration in
BBLMInterface.h provides a
pre-defined list of function kinds, and the range between
available for use by language modules.
When you call
fKind field of the function information to a value from the
range of built-in function kinds (
kBBLMLastUsedFunctionKind - 1),
or use a value in the range of
Note that the range of user function kinds corresponds roughly to the printable ASCII range. This is intentional, because the next thing you'll do is add a section to your language module's language property list. Here is an example for Java:

    <key>BBLMFunctionItemKinds</key>
    <dict>
        <key>P</key>
        <dict>
            <key>typeString</key>
            <string>com.barebones.bblm.Java.package-decl</string>
            <key>displayName</key>
            <string>package declaration</string>
            <key>labelBadgeShape</key>
            <string>circle</string>
            <key>labelCharacter</key>
            <string>p</string>
            <key>labelColorName</key>
            <string>CodeSenseLightRed</string>
        </dict>
    </dict>

Each top-level key in the
BBLMFunctionItemKinds dictionary
corresponds to the character value that you used for the function's
fKind field. Thus, making it a printable ASCII
character is useful for various reasons. The key is required to be a
For each function kind, the values are as follows:
typeString: (required) a reverse-domain description of the function type.
The form is similar to that used for custom run kinds
that you generate: should begin with your plug-in's bundle
displayName: (required) a brief human-readable description of the function
labelBadgeShape: (optional) describes the shape of the badge that
appears in the function menu. Allowed values are
roundRect. If this key is
default is used.
labelCharacter: (optional) tells BBEdit what character to use in
the badge. If this is absent, BBEdit will use the character value of
the item kind's key (in the example above, this would be P).
labelColorName: (optional) tells BBEdit what background color to use
for the badge. You may use any CSS3 color name; the following built-in
colors are also provided:
`CodeSenseLightBlue`, `CodeSenseLightRed`, `CodeSenseLightGreen`, `CodeSenseLightPurple`, `MarkerBadgeColor`, `CodeSenseOrange`, `BBEditDarkPurple`
If this key is absent, BBEdit will use
Note: Use the built-in function kinds whenever possible. For
example, if your language has the notion of an object class, use
kBBLMFunctionClassImplementation as appropriate, rather than
creating your own badge.
Also, do not attempt to override the built-in mappings.
Beginning with BBEdit 13.0, compiled language modules now have the ability to generate and use their own document-specific data. (Unless you're writing a compiled language module, you can skip this note.)
This can be for any suitable purpose; for example, if a hypothetical
C-family language module wanted to generate an abstract syntax tree
for the document using
clang, it could do so.
BBEdit does not inspect or use any data created by the language
module, nor does it make any assumptions about what's
in it. The only rule is that it will be treated as an
and passed through the API boundary as such, but the language module
can instantiate it as any
NSObject subclass (including one defined
by the module itself) and assume that it will be of that type.
The BBLMParamBlock structure gains the following top-level members:
fDocumentParseData: the module-generated data object for this document
fOutDocumentParseDataIsNew: if the module creates a new data object for this document, it should set
fDocumentParseData to the new object value, and set fOutDocumentParseDataIsNew to true.
fDocumentIdentifier: a unique identifier for the document. The language module can use this to keep track of data for different documents, for the lifetime of the application
fDocumentLocation: if not
nil, provides the location of the document's backing file on disk. Note: you cannot assume that the document data on disk is consistent with what's in memory. You should always rely on the data provided by fText/fTextLength as authoritative.
There are four new messages relating to the management and lifetime of parse data:
kBBLMInitParseDataMessage: When this is called, the language module may allocate any data specific to this document. Note that doing so is not required; you could certainly wait until you receive a
kBBLMRecalculateParseDataMessage to do so.
kBBLMDisposeParseDataMessage: When this is called, the language module should deallocate any data contained in
fDocumentParseData, in the case that it is not intrinsically reference-counted. (Read below for more on this.)
kBBLMRecalculateParseDataMessage: When this is called, the language module may calculate from scratch and return any appropriate parse data for the document.
fDocumentParseData will be the result of a previous
kBBLMInitParseDataMessage. If you opted not to do anything previously, then
fDocumentParseData will be
nil on entry; you should create it as needed, return it in
fDocumentParseData, and set fOutDocumentParseDataIsNew to true.
kBBLMUpdateParseDataMessage: When this is called, the parameter block's
fUpdateParseDataParams member contains information about the location and nature of the change. You can use this information to incrementally recalculate your parse data; or you can recalculate it all from scratch as though you had received a
kBBLMRecalculateParseDataMessage. If you decide to recalculate from scratch and create a new parse data object, put it in
fDocumentParseData and set fOutDocumentParseDataIsNew to true.
Important Notes About Object Lifetimes
Under no circumstances should you attempt to assume ownership of
any NSObject subclass that you return in
fDocumentParseData,
even if you are changing its value and setting
fOutDocumentParseDataIsNew. If you return a new parse data object,
BBEdit will release the old one for you.
Considerations for non-refcounted data
In some cases, your parse data might be a C++ class instance, or
even an allocated C structure. In order to pass it back and forth
across the API boundary, you must wrap it in an
NSValue as a
pointer value. In that case, you must also take some care to
manage the object lifetime yourself, since BBEdit can't otherwise
know what needs to be done with it. Thus, given some hypothetical
ParseTree C++ class, you would write something like:

    ParseTree *myParseTree = new ParseTree;

    /* ...do some parsing... */

    // Wrap the raw pointer so that it can cross the API boundary.
    params.fDocumentParseData = [NSValue valueWithPointer: myParseTree];
    params.fOutDocumentParseDataIsNew = true;

You would use this pattern in response to
but also if you calculated a new parse tree in response to
One additional wrinkle, though: when recalculating or updating,
if you make a new C++ object, you need to dispose of the old one,
but not release the
NSValue instance itself. This is because
BBEdit doesn't know what's wrapped up in the
NSValue, or how it
should be managed.
So in the case where you're changing the object during update or recalculate, you'd have code like this:

    ParseTree *oldParseTree = NULL;
    ParseTree *newParseTree = NULL;

    // Unwrap the old tree from the NSValue and dispose of it.
    oldParseTree = static_cast<ParseTree*>(params.fDocumentParseData.pointerValue);
    delete oldParseTree; // clean up the old data

    newParseTree = new ParseTree;

    /* ...do some parsing... */

    // Hand the new tree back, wrapped in a fresh NSValue.
    params.fDocumentParseData = [NSValue valueWithPointer: newParseTree];
    params.fOutDocumentParseDataIsNew = true;

When receiving a
kBBLMDisposeParseDataMessage, you'll have to do the same:

    ParseTree *oldParseTree = NULL;

    oldParseTree = static_cast<ParseTree*>(params.fDocumentParseData.pointerValue);
    delete oldParseTree; // clean up the old data

Note that you do not ever release the NSValue itself;
BBEdit will manage it for you once you've created it. (If you do
release it, you'll rapidly find out what a bad idea that was.)
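
Putting the pieces together, the lifetime of non-refcounted parse data across the four messages might look like the sketch below. It is written as plain C++ with stand-in types so that it can be compiled on its own: `ParseMessage` and `MockParams` are hypothetical and are not SDK definitions, and a raw pointer takes the place of the `NSValue` wrapper (which, in a real module, BBEdit owns and you never release).

```cpp
#include <cstddef>

// Stand-in for the module's own parse data (the hypothetical ParseTree above).
struct ParseTree {
    std::size_t textLength = 0;
};

// Stand-ins for the four parse data messages. These are NOT the SDK constants.
enum class ParseMessage { Init, Recalculate, Update, Dispose };

// Stand-in for the relevant parameter block members.
struct MockParams {
    ParseTree *parseData = nullptr;  // plays the role of fDocumentParseData
    bool parseDataIsNew = false;     // plays the role of fOutDocumentParseDataIsNew
    std::size_t textLength = 0;      // plays the role of fTextLength
};

void handleParseMessage(ParseMessage message, MockParams &params) {
    switch (message) {
    case ParseMessage::Init:
        // Allocation is optional here; this sketch defers it to Recalculate.
        break;
    case ParseMessage::Recalculate:
    case ParseMessage::Update: {
        // Rebuild from scratch: dispose of the old tree, then publish a new one.
        delete params.parseData;  // deleting a null pointer is a no-op
        ParseTree *tree = new ParseTree;
        tree->textLength = params.textLength;  // placeholder for real parsing
        params.parseData = tree;
        params.parseDataIsNew = true;
        break;
    }
    case ParseMessage::Dispose:
        // The module owns the wrapped data, so it must free it here.
        delete params.parseData;
        params.parseData = nullptr;
        break;
    }
}
```

Treating an update as a full recalculation is the simplest correct choice; an incremental update would instead modify the existing tree in place and leave the "is new" flag unset.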