# endgen
Endgen is a converter tool that reads a DSL file and generates output files from templates.

Endgen is Open Source Software using the [Apache Software License v2.0](http://www.apache.org/licenses/LICENSE-2.0)

## Motivation
The motivation behind this tool is that I wanted to generate boilerplate code for handling HTTP endpoints (hence the
name endgen).

I had one project written in Scala using the [Tapir](https://tapir.softwaremill.com/) library, and another project
written in Java using [Spring](https://docs.spring.io/spring-framework/docs/3.2.x/spring-framework-reference/html/mvc.html) Boot.
Both projects had very simple endpoints that only supported POST, took some data type as payload and used
the same response type.

That's a lot of boilerplate, especially in the Scala case where the payload data type had to be written in several
separate files (for endpoint definitions, circe serializer support etc).

So I wrote a [DSL](https://en.wikipedia.org/wiki/Domain-specific_language) parsed by an [ANTLR4](https://www.antlr.org)
parser and a code generator using [freemarker](https://freemarker.apache.org).

```
                 |                    Endgen                     |
                 -------------------------------------------------
-----------      ||--------|     |-----------|     |------------||
| endpoint |     || Parser |     | In-Memory |     | Freemarker ||      ------------------
| file     | --> ||        | --> | AST       | --> | engine     || -->  | Output file    |
\__________\     ||--------|     |-----------|     |------------||      | mytemplate.xxx |
                 -------------------------------------------------      \________________\
                                                          ^
                                                          |
                                               ----------------------
                                               | mytemplate.xxx.ftl |
                                               \____________________\
```

Endgen currently contains two separate parsers:
* endpoint - A DSL for expressing HTTP endpoints.
* state - A DSL for expressing state and transitions.

Only one parser is used when reading a file. It is determined by the file name ending,
'.endpoints' or '.states', or by a command line argument.

The endpoint-DSL and the state-DSL share the grammar for expressing configuration and data types,
see below for details.

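The parser selection described above could be sketched roughly like this in Java. This is not Endgen's actual code: the `ParserSelection` class, the `ParserKind` enum and the `select` method are hypothetical names, and throwing on an unknown file ending is an assumption about the error handling.

```java
import java.util.Optional;

final class ParserSelection {
    // Hypothetical enum; the constants mirror the valid values of the --parser option.
    enum ParserKind { Endpoints, States }

    // A parser forced on the command line wins, otherwise the file name ending decides.
    static ParserKind select(String fileName, Optional<ParserKind> forced) {
        if (forced.isPresent()) {
            return forced.get();
        }
        if (fileName.endsWith(".endpoints")) {
            return ParserKind.Endpoints;
        }
        if (fileName.endsWith(".states")) {
            return ParserKind.States;
        }
        // Assumption: an unrecognized ending without a forced parser is an error.
        throw new IllegalArgumentException("Cannot determine parser for: " + fileName);
    }
}
```
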
## How to Run
You need a Java 21 (or later) runtime and `java` on the path. A very convenient way to install a Java runtime is [SdkMan](https://sdkman.io).

Unpack the archive and run the provided shell script.

### Usage
```
Usage: run.sh [-hvV] [-o=<outputDir>] [-p=<parser>] [-t=<templateDir>] <file>
Generate source code from an endpoints specification file.
      <file>                 The source endpoints DSL file.
  -h, --help                 Show this help message and exit.
  -o, --output=<outputDir>   The directory to write the generated code to.
                               Default is endpoints-output
  -p, --parser=<parser>      Force use of a specific parser instead of
                               determining from filename. Valid values:
                               Endpoints, States.
  -t, --template=<templateDir>
                             The template directory. Default is
                               endpoints-template
  -v, --verbose              Print verbose debug messages.
  -V, --version              Print version information and exit.
```

## Endpoint DSL example
In the simplest form the DSL looks like this:
```
/some/endpoint <- SomeType(foo:String)
```

This gets parsed into an [AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) which in this case holds a list of
path segments and a data structure representing the input/body type.

## Code generation example
When the parser is done reading the DSL it looks in a directory for [freemarker](https://freemarker.apache.org)
templates. It passes the AST to each template it finds and writes the resulting file (one per template) to an
output directory.

The templates must have the file ending `.ftl`. This ending is stripped when generating the output file, so a template
called `types.java.ftl` will generate a file called `types.java`.

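As a minimal sketch of the naming rule above (not the tool's actual implementation), the output file for a template could be derived like this. The `TemplateNames` class and `outputFile` method are hypothetical names, and rejecting files without the `.ftl` ending is an assumption.

```java
import java.nio.file.Path;

final class TemplateNames {
    private static final String SUFFIX = ".ftl";

    // Maps a template such as types.java.ftl to <outputDir>/types.java.
    static Path outputFile(Path outputDir, Path template) {
        String name = template.getFileName().toString();
        if (!name.endsWith(SUFFIX)) {
            // Assumption: files without the .ftl ending are not treated as templates.
            throw new IllegalArgumentException("Not a template: " + name);
        }
        String stripped = name.substring(0, name.length() - SUFFIX.length());
        return outputDir.resolve(stripped);
    }
}
```
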
The idea is that you take these files and adapt them as needed before checking them into your project. Endgen
does not aim to be a roundtrip tool (i.e. reading the generated sources, or being smart in updating them). It is also
a very limited DSL: you cannot, for example, express which HTTP verb to use or declare response codes. There are
no plans to extend the DSL to do that either.

## DSL
This is the ANTLR grammar for the root of the Endpoint-DSL:

```antlrv4
document : generatorconfig? (namedTypeDeclaration|endpoint)* ;
```

The corresponding grammar for the root of the State-DSL:

```antlrv4
document : generatorconfig? transition (',' transition)* ;
```

### Configuration block
Both types of DSL files have an optional `generatorconfig` block at the top.

Here is an example:
```
{
    package: se.rutdev.senash,
    mykey: myvalue
}
```

This is a config block with 2 items, the 'package' and the 'mykey' definitions. They are available in the
freemarker template as a Map of String keys to String values.

### Endpoint DSL
After the optional configuration block you can write either a type definition or an endpoint declaration, and repeat
as many times as you like.

Here is an example:
```
/some/endpoint <- SomeType(foo:String)

Embedded(foo:Bar)
/some/other/endpoint <- (bar:Seq[Embedded])
```

### Endpoint definition

`/some/endpoint <- SomeType(foo:String)` is an endpoint definition. It declares one endpoint that has a request body
data type called `SomeType` with a field called `foo` of the type `String`.

### Data types
Both DSL grammars use the Scala convention of writing the data type after the field name, separated by a colon. Of course
the parsers do not know anything about Java or Scala types. As far as the parser is concerned these are 2 strings:
the first one is named field-name and the other one is named field-type.

`Embedded(foo:Bar)` is a `namedTypeDeclaration` which is parsed the same way as the request type above, but it isn't tied
to a specific endpoint.

### Automatically named endpoint data types
`/some/other/endpoint <- (bar:Seq[Embedded])` is another endpoint declaration. However, this time the request body is
not named in the DSL. All data types must have a name, so the parser simply names it after the last path segment and
tacks on the string 'Request' at the end. So the AST will contain a data type named `endpointRequest` with a field named
`bar` and a type-field with the value `Seq[Embedded]`.

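The naming rule for anonymous request types can be illustrated with a small Java sketch. The `AnonymousTypeNames` class and `nameFor` method are hypothetical and only mirror the behaviour described above, they are not taken from the Endgen sources.

```java
import java.util.List;

final class AnonymousTypeNames {
    // Builds a type name from the last path segment plus a suffix such as "Request".
    static String nameFor(List<String> pathSegments, String suffix) {
        if (pathSegments.isEmpty()) {
            // Assumption: an endpoint always has at least one path segment.
            throw new IllegalArgumentException("No path segments");
        }
        String last = pathSegments.get(pathSegments.size() - 1);
        return last + suffix;
    }
}
```

For the path `/some/other/endpoint`, calling `nameFor(List.of("some", "other", "endpoint"), "Request")` yields `endpointRequest`.
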
Again, the parser does not care about, or know anything about, what code is generated. It has no semantic knowledge
of whether these are actual data types in the code it generates, or whether they make sense in java/scala/lua/rust or
whatever you decide to generate in the templates.

The only 'semantic' validation the parser performs is to check that no two types have the same name.

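A uniqueness check of this kind could look like the following Java sketch. How Endgen actually reports duplicates is not described here, so the `UniqueTypeNames` class and its `duplicates` method are purely illustrative.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class UniqueTypeNames {
    // Returns the type names that occur more than once, in sorted order.
    static Set<String> duplicates(List<String> typeNames) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicated = new TreeSet<>();
        for (String name : typeNames) {
            // Set.add returns false when the name was already present.
            if (!seen.add(name)) {
                duplicated.add(name);
            }
        }
        return duplicated;
    }
}
```
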
### Endpoint response data type
It is possible to declare an optional response data type like so:

`/some/other/endpoint <- (bar:Seq[Embedded]) -> ResponseType(foo: Bar)`

The right pointing arrow `->` denotes a response type. It can be an anonymous data type, in which case the parser will
name it from the last path segment and add 'Response' to the end of the data type name.

### State DSL
This is an example of a state file:
```
start -> middle: message,
middle -> middle: selfmessage,
middle -> end: endmessage
```
The file declares 3 transitions. The first line states: Transition from the 'start' state to the 'middle' state with
the message 'message'.

From this we can see that the file contains 3 state definitions `start`, `middle` and `end`.
A state definition will be parsed as a data type with the name of the state as the type name. It also contains 3
message definitions `message`, `selfmessage` and `endmessage`. Message definitions will also be parsed as data types.

Since the parser will extract data types it is possible to define the fields of the data types. This is a slightly more
complicated example:

```
start -> middle: message(a: String),
middle(bar:Bar) -> middle: selfmessage,
middle -> end: endmessage
```
The data type for `middle` will have a field declaration with the name `bar` and the type `Bar`.

Fields for the same state data type, or message data type, will be merged. Here is a complex example:

```
start(s:S) -> middle(foo:foo): message(foo:foo),
middle -> middle(bar:bar): selfmessage(bar:bar),
middle -> end: message(bar:baz)
```

Note that we can declare fields on both the `from` and `to` state declarations. The `middle` data type will have field
definitions for `foo` and `bar`.

The data type for `message` will have fields for `foo` and `bar`.

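The merging of fields can be sketched in Java as follows. This is an illustration under assumptions, not the parser's real code: the `FieldMerge` class is hypothetical, the records are local copies shaped like the AST records shown later in this document, and keeping the first declaration when a field name is repeated is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FieldMerge {
    record FieldNode(String name, String type) {}
    record TypeNode(String name, List<FieldNode> fields) {}

    // Merges declarations with the same type name into one TypeNode per name.
    static List<TypeNode> merge(List<TypeNode> declarations) {
        Map<String, Map<String, FieldNode>> byType = new LinkedHashMap<>();
        for (TypeNode declaration : declarations) {
            Map<String, FieldNode> fields =
                    byType.computeIfAbsent(declaration.name(), k -> new LinkedHashMap<>());
            for (FieldNode field : declaration.fields()) {
                // Assumption: the first declaration of a field name wins.
                fields.putIfAbsent(field.name(), field);
            }
        }
        List<TypeNode> merged = new ArrayList<>();
        byType.forEach((name, fields) -> merged.add(new TypeNode(name, List.copyOf(fields.values()))));
        return merged;
    }
}
```

Feeding it the three `middle` declarations from the example (one without fields, one with `foo:foo`, one with `bar:bar`) gives a single `middle` type with the fields `foo` and `bar`.
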
One restriction is that a state and a message may not share the same name, i.e. they cannot be parsed as the same data type.

## Generating
If the parser is successful it will hold the following data in the AST:

```java
public record DocumentNode(
        Map<String, String> config,
        Set<TypeNode> typeDefinitions,
        List<EndpointNode> endpoints,
        Set<StateNode> states) {
}
```

Depending on the parser used, either `endpoints` or `states` will be null, but `config` and `typeDefinitions` are
populated the same way by both parsers.

This will be passed to the freemarker engine as the 'root' data object, meaning you have access to the parts in your
freemarker template like this:

```injectedfreemarker
<#list typeDefinitions as type>
    This is the data type name: ${type.name?cap_first} with the first letter capitalized.
</#list>
```

That is, you can directly reference `typeDefinitions`, `endpoints`, `states` or `config`.

### Config
The config object is simply a String-map with the keys and values unfiltered from the input file. Here is an example
that writes the value for a config key called 'package'.

`package ${config.package}`

### Data types
These are all the data types the parser has collected, either from explicit declarations, request payloads, response
bodies, states or messages.

```java
public record TypeNode(String name, List<FieldNode> fields) { }
public record FieldNode(String name, String type) { }
```

Here is an example template that writes the data types as Scala case classes:
```injectedfreemarker
object Protocol:
<#list typeDefinitions?sort as type>
    case class ${type.name?cap_first}(
<#list type.fields as field>
        ${field.name} : ${field.type},
</#list>
    )
</#list>
```

### Endpoints
The parser will collect the following data for endpoint declarations:

```java
public record EndpointNode(
        PathsNode paths,
        String inputType,
        Optional<String> outputType) {}

public record PathsNode(List<String> paths) {}
```

This is an example that will write out the endpoints with the path first, then the Input data type, then the optional
Output data type.

```injectedfreemarker
<#list endpoints as endpoint>
<#list endpoint.paths.paths>
<#items as segment>/${segment}</#items>
Input:
${endpoint.inputType?cap_first}
Output:
<#if endpoint.outputType.isPresent()>
${endpoint.outputType.get()?cap_first}
<#else>
Not specified
</#if>
</#list>

</#list>
```

### States

The set of states will hold items of this shape:

```java
public record StateNode(String name, String data, Set<TransitionNode> transitions) {
}
```
* `name` is the name of the state.
* `data` is the name of the data type for the state.
* `transitions` are the outgoing arrows from the named state.

Transitions have this structure:

```java
public record TransitionNode(String message, String toState) {
}
```
* `message` is the message name.
* `toState` is the name of the target state.
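
To make the shape concrete, here is a rough Java sketch that groups the three transitions from the simple state example into `StateNode` values. It is not the parser's actual code. The `StateGrouping` class and the flat `Transition` record are hypothetical, and two details are assumptions: that `data` is the same string as the state name, and that a state without outgoing arrows (such as `end`) still appears with an empty transition set.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class StateGrouping {
    record TransitionNode(String message, String toState) {}
    record StateNode(String name, String data, Set<TransitionNode> transitions) {}

    // Hypothetical flat form of one DSL line: from -> to: message
    record Transition(String from, String to, String message) {}

    // Groups the outgoing transitions by their source state.
    static Map<String, StateNode> group(List<Transition> transitions) {
        Map<String, Set<TransitionNode>> outgoing = new LinkedHashMap<>();
        for (Transition t : transitions) {
            outgoing.computeIfAbsent(t.from(), k -> new LinkedHashSet<>())
                    .add(new TransitionNode(t.message(), t.to()));
            // Assumption: target states are registered even without outgoing arrows.
            outgoing.computeIfAbsent(t.to(), k -> new LinkedHashSet<>());
        }
        Map<String, StateNode> states = new LinkedHashMap<>();
        // Assumption: the data type name equals the state name.
        outgoing.forEach((name, set) -> states.put(name, new StateNode(name, name, set)));
        return states;
    }
}
```

With the transitions `start -> middle: message`, `middle -> middle: selfmessage` and `middle -> end: endmessage`, the `middle` state ends up with two outgoing transitions, one back to `middle` and one to `end`.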