tech

[English] [Japanese]

A. Recommendations for transcompiler development

The final article in the LuneScript Primer Advent Calendar takes a different approach and deals with transcompiler development itself.

Reasons for recommending transcompiler development

I am developing a transcompiler called LuneScript.

Through this development, I feel that I was able to have a good experience as an engineer.

The reason for developing LuneScript is written in the following article.

../reason

I decided to write this article because I want as many people as possible to experience transcompiler development for the following reasons.

  • Resolve dissatisfaction with the current situation constructively
  • You can check trending technology
  • Coding rules can be corrected

Each of the above items will be explained below.

Resolve dissatisfaction with the current situation constructively

Are you dissatisfied with current software development?

  • If your boss doesn't like it,
  • The performance of the development PC is poor,
  • low wages or
  • a lot of overtime or

The "dissatisfaction" I am talking about is not the "dissatisfaction with the real world" surrounding software development mentioned above,

To realize the idea in my head when doing software development

  • It is easy to bug due to language specifications,
  • You can only write in a troublesome way that takes a lot of time and effort,
  • You have to create the necessary utility from scratch, and it takes too much time to get to what you want to create.
  • In the first place, it can not be realized due to the language specification,

That's my dissatisfaction with programming languages.

For those who have such complaints, and who have concrete ideas to solve them, I encourage them to develop a transcompiler.

This is the main reason why I wanted to develop a transcompiler myself.

"If you don't have it, you can make it yourself"

Aren't you a "software engineer" because you have the ability to realize your ideas with software?

Why it's a "transcompiler" and not a "compiler"

“Why “transcompiler” instead of “compiler”? ] You may be wondering.

This is because the technical hurdles of creating a "transcompiler" are lower than creating a "compiler." With the advent of LLVM, I think the hurdles for compiler development are lower than before, but the hurdles for transcompilers are still lower.

Especially, when there is a bug in the code after compiling, the transcompiler is overwhelmingly lower in "cause-seeking cost".

If creating a compiler is the goal in itself, that's fine, but creating a compiler is a means, and the ultimate goal is to solve "dissatisfaction with programming languages."

It would be better if it can be realized with as little cost as possible.

Besides, most of the "frustration with programming languages" can be solved with a transcompiler.

Of course, for complaints such as "I want to shorten the compile time", I think that I have to develop a compiler by myself, but most other complaints can be solved with a transcompiler.

Even if the ultimate goal is to build a compiler,

  • First create a transcompiler,
  • Demonstrate that dissatisfaction can be resolved and that it is easy to use as a language,
  • It can be confirmed that using the transcompiler is beneficial,
  • Then after you need to compile directly to native

Ultimately, developing a compiler can be said to be the most efficient method.

Why it's a "transcompiler" instead of an "interpreter"

The reason why it's not a "compiler" is explained above.

Here's why it's not an "interpreter".

Even if I say that, the reason is the same as the time of "compiler", because "hurdles are lower" than "interpreter".

Writing an interpreter is a lower hurdle than a compiler. I myself have experience making simple interpreters several times.

But when it comes to practical interpreters, it's a different story. When it comes to a practical interpreter, there are development man-hour problems and execution performance problems. As I wrote at the time of the compiler, the purpose is to solve "dissatisfaction with the programming language", so let's not think about anything other than that.

There are many practical interpreted languages (scripting languages) out there today. You won't have to bother making your own.

Easy-to-understand output results

One of the advantages of transcompilers over compilers is that the output code of transcompilers is easy for many people to understand.

When it comes to which is easier to understand, the "native code" output by the compiler and the "code in a certain language" output by the transcompiler, everyone agrees that the "code in a certain language" is easier to understand. .

If you ask, "Why is it better to be easy to understand?", it means that the introduction risk is that low.

When introducing new technology into a project, we need to decide whether it is "safe".

The term “safety” can mean many different things.

  • "Information security" to ensure that virus-like things do not enter
  • Whether it can be exported overseas and whether it is subject to the "Foreign Exchange Law"
  • Are there any licensing issues? Even if there is no problem with the license of the part created by the author of the compiler, is there a license problem with the code that the author unintentionally uses?

In order to introduce other new things, we need to clear some safety.

When clearing these safety requirements, it is very important to make the target technology easy to understand.

In the case of compilers, it is impractical to inspect the generated native code, so the compiler's code is inspected for safety.

A transcompiler, on the other hand, only needs to look at the license and the converted code.

You can check trending technology

If you go to great lengths to develop a new transcompiler, it's the nature of engineers to try to make it as user-friendly as possible.

In that case, it is common practice to investigate the characteristics of various languages and incorporate the good ones.

When developing LuneScript, I also investigated as many languages as possible and tried to incorporate various functions.

What is the point of knowing the characteristics of different languages? For example, is it meaningful for engineers who are usually only involved in C language projects to know the features of Go and swift? I'm sure some people wonder about such things.

If you are someone who regularly collects new information on sites such as Qiita, I don't think you will have any doubts about such things, but if you don't, there will be more than a few people who do. .

I think that even if you are an engineer who is usually only involved in C language projects, you should be aware of the features of the modern language.

This is because even if the features of modern language itself cannot be used in C language, the way of thinking and the essence can be introduced in C language as well.

For example, the concept of a functional language can be implemented in C without using Haskell.

Of course, there are things that are difficult to write with the C language syntax, and things that cannot be realized due to the C language specifications.

However, there is a difference between knowing the concept of a functional language and intentionally writing in a C-like way, and writing in C as usual because you only know the C language.

Also, incorporating the features of the modern language as a transcompiler function will enable a deeper understanding than simply using the modern language. Without a deep understanding, it cannot be taken in.

In this way, it can be said that the development of a transcompiler is a good experience in order to deepen the understanding of the characteristics of modern languages.

Coding rules can be corrected

This is a little different from what I've mentioned so far, but I think it's pretty important.

I think everyone has an ideal way of writing when coding, such as "I want to write like this" and "I should write like this".

Developing a transcompiler also means being able to correct (enforce) coding rules at the language level.

I think that there are many people who have doubts about this explanation, "What do you mean?", but I think that it will be understood by saying "block by indentation" in python.

Python represents blocks by indentation, not by keywords or delimiters.

In other words, instead of binding a coding rule that "blocks should be indented", Python corrects (enforces) at the language level that "if the indentation is not complete, it cannot be treated as a block".

Coding rules are often controversial. Developing your own transcompiler prevents such useless disputes at the language level.

I wasn't very conscious of it until I developed my own transcompiler, but among the coding rules I'm usually conscious of, there are some rules that are unnecessary in other languages because they can't be enforced at the language level. I understand.

When developing a new language, I think you should not only incorporate new functions, but also review your usual coding rules and consider whether it can be handled at the language level.

lastly

So far, I have made simple interpreters and source code tag system as a hobby.

Through LuneScript development this time, I was able to realize the fun of software development once again.

I think there are many people who think that there is no point in creating a new language now. At least I thought so.

However, by developing LuneScript this time, I think I was able to grow as a software engineer.

No matter what people think, if you can feel the growth yourself, that's the best.

If you are dissatisfied with software development, please try to develop a new programming language.