# endgen
This is a converter tool that reads a DSL and generates output files. 

Endgen is open source software licensed under the [Apache Software License v2.0](http://www.apache.org/licenses/LICENSE-2.0).

## Motivation
I wrote this tool because I wanted to generate boilerplate code for handling HTTP endpoints (hence the name
endgen).

I had one project written in Scala using the [Tapir](https://tapir.softwaremill.com/) library and another written in
Java using [Spring](https://docs.spring.io/spring-framework/docs/3.2.x/spring-framework-reference/html/mvc.html) Boot.
Both projects had very simple endpoints that only supported POST, took some datatype as payload and used
the same response type.

That's a lot of boilerplate, especially in the Scala case where the payload datatype had to be written in several
separate files (for endpoint definitions, circe serializer support, etc.).

So I wrote a [DSL](https://en.wikipedia.org/wiki/Domain-specific_language) parsed by an [ANTLR4](https://www.antlr.org) 
parser and a code generator using [freemarker](https://freemarker.apache.org).

```
                                 | Endgen |
                  -------------------------------------------------
-----------       ||--------|      |-----------|     |------------||
| endpoint |      || Parser |      | In-Memory |     | Freemarker ||     ------------------
| file     | -->  ||        | -->  | AST       | --> | engine     || --> | Output file    |
\__________\      ||--------|      |-----------|     |------------||     | mytemplate.xxx |
                  --------------------------------------------------     \________________\
                                                           ^
                                                           |
                                                    ----------------------
                                                    | mytemplate.xxx.ftl |
                                                    \____________________\
``` 
## How to Run
You need a Java 24 runtime with `java` on the path. A very convenient way to install a Java runtime is [SdkMan](https://sdkman.io).

Unpack the archive and run the provided shell script (`run.sh`).

### Usage
```
Usage: run.sh [-hvV] [-o=<outputDir>] [-t=<templateDir>] <file>
Generate source code from an endpoints specification file.
      <file>                 The source endpoints DSL file.
  -h, --help                 Show this help message and exit.
  -o, --output=<outputDir>   The directory to write the generated code to.
                               Default is ~/endpoints-output
  -t, --template=<templateDir>
                             The template directory. Default is
                               ~/endpoints-templates
  -v, --verbose              Print verbose debug messages.
  -V, --version              Print version information and exit.
```

## DSL example
In its simplest form the DSL looks like this:
```
/some/endpoint <- SomeType(foo:String)
```

This gets parsed into an [AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) which in this case holds a list of
path segments and a data structure representing the input/body type.

## Code generation example
When the parser is done reading the DSL, endgen looks in the template directory for [freemarker](https://freemarker.apache.org)
templates. Each template it finds is processed with the AST as input. The resulting file (one per template) is written
to the output directory.

The templates must have the file ending `.ftl` - this ending is stripped when generating the output file. So a template 
called `types.java.ftl` will generate a file called `types.java`.
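
The naming rule can be sketched in a few lines of Java. This is only an illustration of the rule described above, not
the code endgen actually uses: the class `OutputNames` and the method `outputNameFor` are made-up names, and rejecting
files without the `.ftl` ending is an assumption.

```java
import java.nio.file.Path;

// Hypothetical helper, not taken from the endgen source.
final class OutputNames {
    private static final String TEMPLATE_SUFFIX = ".ftl";

    /** Returns the output file name for a template by stripping the trailing ".ftl". */
    static String outputNameFor(Path template) {
        String name = template.getFileName().toString();
        // Assumption: anything that is not "<something>.ftl" is rejected.
        if (!name.endsWith(TEMPLATE_SUFFIX) || name.length() == TEMPLATE_SUFFIX.length()) {
            throw new IllegalArgumentException("Not a template file: " + name);
        }
        return name.substring(0, name.length() - TEMPLATE_SUFFIX.length());
    }

    public static void main(String[] args) {
        // Prints "types.java"
        System.out.println(outputNameFor(Path.of("templates", "types.java.ftl")));
    }
}
```

Only the last `.ftl` is removed, so the rest of the template name (including `.java`) decides what the output file is called.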

The idea is that you take these files and probably adapt them before checking them into your project. Endgen
does not aim to be a roundtrip tool (i.e. it does not read the generated source or try to be smart about updating it).
The DSL is also very limited: you cannot, for example, express which HTTP verb to use or declare response codes. There
are no plans to extend the DSL to do that either.

## DSL 
This is the ANTLR grammar for the root of the DSL

```antlrv4
document                : generatorconfig? (namedTypeDeclaration|endpoint)* ;
```
This means that the DSL file has an optional `generatorconfig` block at the top, followed by any number of type
definitions and endpoint declarations, in any order.

Here is an example:
```
{
    package: se.rutdev.senash,
    mykey: myvalue
}

/some/endpoint <- SomeType(foo:String)

Embedded(foo:Bar)
/some/other/endpoint <- (bar:Seq[Embedded])
```

The example starts with a config block with two items, `package` and `mykey`. These are available
in the freemarker template as a Map of String keys to String values.

`/some/endpoint <- SomeType(foo:String)` is an endpoint declaration. It declares one endpoint that has a request body
data type called `SomeType` with a field called `foo` of the type `String`.

### Data types
The DSL uses the Scala convention of writing the data type after the field name, separated by a colon. The DSL parser
does not know anything about Java or Scala types, though: as far as it is concerned these are two strings, the first
named field-name and the second named field-type.
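
To illustrate that both parts are opaque strings, here is a small Java sketch. It is not the ANTLR parser endgen uses;
it simply splits at the first colon, which is a stand-in technique. `FieldSplit` and `splitField` are made-up names,
and trimming whitespace around the colon is an assumption.

```java
// Hypothetical illustration, not the ANTLR-based parser from endgen.
final class FieldSplit {
    // Same shape as the FieldNode record in the AST: two plain strings.
    record FieldNode(String name, String type) { }

    /** Splits "name:type" at the first colon; the type is kept verbatim and never interpreted. */
    static FieldNode splitField(String text) {
        int colon = text.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Expected name:type but got: " + text);
        }
        // Assumption: whitespace around the colon is not significant.
        return new FieldNode(text.substring(0, colon).trim(), text.substring(colon + 1).trim());
    }

    public static void main(String[] args) {
        // Prints "FieldNode[name=bar, type=Seq[Embedded]]"
        System.out.println(splitField("bar:Seq[Embedded]"));
    }
}
```

Nothing checks that `Seq[Embedded]` is a real type anywhere; it is passed to the templates as-is.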

`Embedded(foo:Bar)` is a `namedTypeDeclaration` which is parsed the same way as the request type above, but it isn't
tied to a specific endpoint.

### Automatically named data types
`/some/other/endpoint <- (bar:Seq[Embedded])` is another endpoint declaration. This time the request body is
not named in the DSL. Since all datatypes must have a name, the parser names it after the last path segment and
appends the string 'Request'. So the AST will contain a datatype named `endpointRequest` with a field named
`bar` whose type is the string `Seq[Embedded]`.
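
A minimal Java sketch of this naming rule follows. The class `AutoNaming` and the method `autoName` are made-up names
for illustration and are not taken from the endgen source. The suffix is a parameter so the same rule can be applied
with other suffixes.

```java
import java.util.List;

// Hypothetical helper, not taken from the endgen source.
final class AutoNaming {
    /** Names an anonymous type after the last path segment plus a suffix such as "Request". */
    static String autoName(List<String> pathSegments, String suffix) {
        if (pathSegments.isEmpty()) {
            throw new IllegalArgumentException("An endpoint needs at least one path segment");
        }
        return pathSegments.get(pathSegments.size() - 1) + suffix;
    }

    public static void main(String[] args) {
        // Prints "endpointRequest"
        System.out.println(autoName(List.of("some", "other", "endpoint"), "Request"));
    }
}
```

Note that the name keeps the casing of the path segment, which is why the template examples use `?cap_first` when
writing type names.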

Again, the parser neither knows nor cares what code is generated, so it has no semantic knowledge of whether these
are actual datatypes in the generated code, or whether they make sense in java/scala/lua/rust or whatever you
decide to generate in the templates.

The only 'semantic' validation the parser performs is to check that no two types have the same name.
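
Such a check could look roughly like the Java sketch below. It is an assumption about how the validation might be
done, not the actual endgen code: `DuplicateTypeCheck` and `duplicateNames` are made-up names, the nested records only
mirror the shape of the AST records shown under Generating, and the case-sensitive comparison is a guess.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not taken from the endgen source.
final class DuplicateTypeCheck {
    record FieldNode(String name, String type) { }
    record TypeNode(String name, List<FieldNode> fields) { }

    /** Returns every type name that is declared more than once (case-sensitive, by assumption). */
    static List<String> duplicateNames(List<TypeNode> types) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (TypeNode type : types) {
            // Set.add returns false when the name was already present.
            if (!seen.add(type.name())) {
                duplicates.add(type.name());
            }
        }
        return List.copyOf(duplicates);
    }

    public static void main(String[] args) {
        List<TypeNode> types = List.of(
                new TypeNode("Embedded", List.of(new FieldNode("foo", "Bar"))),
                new TypeNode("SomeType", List.of(new FieldNode("foo", "String"))),
                new TypeNode("Embedded", List.of(new FieldNode("baz", "Int"))));
        // Prints "[Embedded]"
        System.out.println(duplicateNames(types));
    }
}
```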

### Response data types
It is possible to have an optional response data type declared like so:

`/some/other/endpoint <- (bar:Seq[Embedded]) -> ResponseType(foo: Bar)`

The right-pointing arrow `->` denotes a response type. It can be an anonymous data type, in which case the parser will
name it after the last path segment and append 'Response' to the data type name.

### DSL config
The only key in the config block that the generator looks at is called `ending`. It is used as the file ending for
the file that results from applying the freemarker template.

## Generating
If the parser is successful it will hold the following data in the AST

```java
public record DocumentNode(
        Map<String, String> config,
        List<TypeNode> typeDefinitions,
        List<EndpointNode> endpoints) {
}
```

This will be passed to the freemarker engine as the 'root' data object, meaning you have access to the parts in your freemarker template like this:

```injectedfreemarker
<#list typeDefinitions as type>
    This is the data type name: ${type.name?cap_first} with the first letter capitalized.
</#list>
```

That is, you can directly reference `typeDefinitions`, `endpoints` or `config`.

### Config
The config object is simply a String-map with the keys and values unfiltered from the input file. Here is an example
that writes the value for a config key called 'package'.

`package ${config.package}`

### Data types
These are all the data types the parser has collected, whether from explicit declarations, request payloads or response
bodies.

```java
public record TypeNode(String name, List<FieldNode> fields) { }
public record FieldNode(String name, String type) { }
```

Here is an example template that writes the data types as Scala case classes
```injectedfreemarker
object Protocol:
<#list typeDefinitions?sort as type>
    case class ${type.name?cap_first}(
    <#list type.fields as field>
        ${field.name} : ${field.type},
    </#list>
    )
</#list>
```

### Endpoints
The parser will collect the following data for endpoint declarations

```java
public record EndpointNode(
	PathsNode paths,
	String inputType,
	Optional<String> outputType) {}

public record PathsNode(List<String> paths) {}
```

This is an example that will write out the endpoints with the path first, then the Input data type, then the optional 
Output data type.

```injectedfreemarker
<#list endpoints as endpoint>
    <#list endpoint.paths.paths>
    <#items as segment>/${segment}</#items>
    Input:
        ${endpoint.inputType?cap_first}
    Output:
    <#if endpoint.outputType.isPresent()>
        ${endpoint.outputType.get()?cap_first}
    <#else>
        Not specified
    </#if>
    </#list>

</#list>
```
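
To make the data model concrete, here is a Java sketch of what the AST for the example document in the DSL section
might look like when built by hand. The record definitions repeat the ones shown above. The wrapper class `ExampleAst`
and the method `exampleDocument` are made up for illustration, and the order of the entries in `typeDefinitions` is an
assumption (the parser may collect them in a different order).

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; only the record shapes come from this README.
final class ExampleAst {
    record FieldNode(String name, String type) { }
    record TypeNode(String name, List<FieldNode> fields) { }
    record PathsNode(List<String> paths) { }
    record EndpointNode(PathsNode paths, String inputType, Optional<String> outputType) { }
    record DocumentNode(Map<String, String> config,
                        List<TypeNode> typeDefinitions,
                        List<EndpointNode> endpoints) { }

    /** Builds the AST for the example with a config block, two endpoints and one named type. */
    static DocumentNode exampleDocument() {
        return new DocumentNode(
                Map.of("package", "se.rutdev.senash", "mykey", "myvalue"),
                List.of(
                        new TypeNode("SomeType", List.of(new FieldNode("foo", "String"))),
                        new TypeNode("Embedded", List.of(new FieldNode("foo", "Bar"))),
                        // Anonymous request body, named after the last path segment + "Request".
                        new TypeNode("endpointRequest", List.of(new FieldNode("bar", "Seq[Embedded]")))),
                List.of(
                        new EndpointNode(new PathsNode(List.of("some", "endpoint")),
                                "SomeType", Optional.empty()),
                        new EndpointNode(new PathsNode(List.of("some", "other", "endpoint")),
                                "endpointRequest", Optional.empty())));
    }

    public static void main(String[] args) {
        for (EndpointNode endpoint : exampleDocument().endpoints()) {
            System.out.println("/" + String.join("/", endpoint.paths().paths())
                    + " <- " + endpoint.inputType());
        }
    }
}
```

Neither endpoint in that example declares a response type, so `outputType` is empty for both and the template above
would take the `<#else>` branch.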