tech

[English] [Japanese]

A. Reason for developing Lua transcompiler LuneScript

Lua is a lightweight language with high execution performance. Lua's name recognition is low compared to the same scripting languages such as Ruby, Python, and JavaScript, but it can be said that it is one of the most major languages that can be used for system expansion and is easy to embed. In fact, there are many systems that embed Lua.

In terms of execution performance, it is also one of the fastest scripting languages. If you don't do DSP-like processing, it's unlikely to become a system performance bottleneck.

I myself have experience in developing some software (for both hobby and business) using Lua, and it is one of my favorite languages.

Why we developed Lua transcompiler LuneScript

Lua is one of my favorite languages. However, for the following reasons, I stopped writing Lua code directly and started thinking about developing using the transcompiler LuneScript.

  • Have fun and write safely
  • Lua has problems unique to dynamically typed languages

    • No static error checking
    • It is difficult to understand the contents of other people's code
    • High risk of maintenance, feature addition, and refactoring
    • Completion when coding is not cool
    • Field access control in table is not possible
  • Dissatisfied with Lua functionality

    • not nil safe
    • no macros
  • There are few alternatives for the ease of embedding and high execution performance that are characteristics of Lua
  • Many systems already use Lua

Each of these will be described below.

Have fun and write safely

Yukihiro Matsumoto of Ruby wants Ruby to be fun.

I'm not looking for fun in LuneScript. No, I think that I want to be "easy" rather than seeking "fun".

Of course, whatever you do should be fun. I personally enjoy software development, so I do software development as a hobby (free of charge) in my private time.

Nowadays, even pure software stores that can't do business can use services like Crowdworks to get guaranteed jobs. In such an era, the motivation to kill your private time and develop free software is nothing but "fun".

However, software development itself is fun, but bug-fixing and test code creation are not fun. The reason why such unpleasing work is necessary is that software development is prone to bugs, and unless the bugs are removed, the software will not work properly.

Again, this work is not fun. It might be fun for some people, but at least for me it's a torture. Well, there is a sense of accomplishment that "I've done it", but I don't want to go out of my way to waste my private time.

I want to enjoy safe software development without doing such unpleasant work as much as possible.

Perl's creator Larry Wall lists "lazyness, impatience and arrogance" as the three great virtues of a programmer. I think "fun and safe software development" is pretty similar to this.

Lua does not provide a mechanism to make software development fun and safe. If it's not provided, just make your own.

I spare no effort to make it easy.

This is the main reason I develop LuneScript.

Issues specific to dynamically typed languages

I'm not against dynamically typed languages.

I myself often write processes in a dynamically typed language, and I don't want to use a statically typed language when writing a simple process that doesn't even reach 100 lines.

The problem with dynamic typed language here is that it is not a script that an individual creates and maintains alone, but a script that may be developed by an unspecified number of people. , points out that it is likely to be a problem.

No static error checking

Humans make mistakes.

If you're a full-time software engineer coding, you've probably passed in the wrong type of data in an argument countless times. A common mistake is to pass the result of parsing a numeric string input to a function, and pass the parsed string data as is, even though the function expects a numeric value. When I think about it, many other things come to mind.

When developing with multiple members, the probability of mistakes occurring due to miscommunication etc. increases further.

If it is a statically typed language, the mistake will be noticed as a type mismatch error at compilation time, or at coding time.

However, in a dynamically typed language, you don't know until you actually run it. Also, in some cases, it does not occur simply by moving it, and it may even occur only at a specific path or at a specific timing.

A simple mistake can cause serious problems later on, and it often costs a lot of money to get to the cause.

I think there is an idea that "tests can cover it", but writing tests is not free. As I said at the beginning, I don't enjoy writing tests. If the compiler guarantees it without writing tests, I'd go with that.

It is possible to perform static checks to some extent on code written in a dynamically typed language. However, it is much more expensive and less precise than statically typed languages.

With a statically typed language, at least type-related mistakes can be reliably analyzed statically.

Of course, it is not possible to analyze if it is a type that can be anything like void * in c or Object in java, or if forced type conversion is used.

I believe that deep learning and other technologies will advance static analysis techniques in the future, making software development even more enjoyable. And I think it's statically typed languages, not dynamically typed languages, that support such development.

Well, maybe it's a different paradigm.

It is difficult to understand the contents of other people's code

Other people's code is harder to understand than the code you have written yourself. This is a matter of course.

That's not what I mean here.

Also, it's not a low-level thing like the indentation isn't aligned or the coding conventions aren't followed.

No matter how well-known engineers write it, if it's written in a dynamically typed language, it's harder to grasp than the code written in a statically typed language.

This is because the data type information, which is an important factor of the program, is hardly written. If there are engineers who say that type information etc. are not so important, it is better to take the unit of "algorithms and data structures" again.

Note that the type can be inferred from the symbol name. Also, symbols should be named as such.

However, this is just a guess, not a fact. When I develop software, I don't want to rely on guessing games.

Also, some people may say that type information is described in comments or documentation, so you should check it. However, the comments and documentation often differ from the implementation. Better.

All I can say is that I want to have a good time.

High risk of maintenance, feature addition, and refactoring

It's rare that any piece of code is written once and left untouched.

There are various reasons such as the OS being run has changed, the need to add functions, or the discovery of latent bugs, but there are many opportunities to modify existing code.

Dynamically typed languages are more risky than statically typed languages when modifying such existing code.

Again, some might say, "If you write the tests well, there's no problem." However, it is half right and half wrong.

"Adding a hand" is synonymous with "behavior changes." Even if there is a difference in the degree of change, there is no difference in change. And if the behavior changes, even if there is a test, it can not be said that it is safe.

This is because the test is for confirming that the behavior is correct, and since the behavior changes, the test cannot be used as it is. Of course, it doesn't mean that everything can't be used.

Now, let's get back to the topic of dynamically typed and statically typed languages.

Why is there a greater risk of modifying existing code in a dynamically typed language than in a statically typed language? That's because it's difficult to correct without omission the affected parts by adding hands.

If it is a statically typed language, it can be said that the correction is almost complete as long as it is compiled. On the other hand, in dynamically typed languages, even if you try to run the test after correcting everything, it is often said that it will not work properly due to an error due to omission of correction. Eliminate the errors one by one and finally complete.

If you think about which takes more time to deal with compile errors or test errors, it's overwhelmingly test errors. If it's a compile error, you can fix the compile error line, but if it's a test error, you'll have to work extra to identify the cause of the error. Furthermore, if there is an omission in the existing test itself, there is a possibility that the omission of correction itself cannot be found.

Also, it would be fine if the person who modified the module was the person who created the module, but it is not uncommon for a completely different person to handle it. In that case, the synergistic effect of "difficult to grasp the contents of other people's code" mentioned earlier will further increase the risk.

During the development of LuneScript, design changes were made many times, but it is chilling to imagine if this was done in a dynamically typed language.

Completion when coding is not cool

A decent completion function is essential to make coding easier.

Recently, even in dynamically typed languages, the coding completion function works quite hard. But have you ever been disappointed by the suggestions that the completion feature lists? Or maybe things that should be listed in the first place aren't listed at all?

Completing a dynamically typed language is rather difficult. This is because the completion function recognizes completion candidates based on type information, but it is difficult to recognize them statically in dynamically typed languages.

With a statically typed language, type information can be determined statically, so type-related completion can be implemented accurately.

Of course, LuneScript also offers completion.

See the next article for details.

../completion

Field access control in table is not possible

Access control is important.

This is because it is possible to clearly state which data/functions can be accessed.

As a major premise at the time of design, it is common sense to disclose functions and data that can be used from the outside, and to keep functions and data that cannot be guaranteed to work when used from the outside.

However, Lua does not allow this for table fields.

Perhaps, if you make full use of metatable, dynamic control may be possible, but at least static control is not possible.

As I said many times, the ability to detect errors dynamically is just better than the ability to detect errors, and it is overwhelmingly inconvenient compared to being able to detect errors statically.

Even in languages with access restrictions, it is possible to access functions and data that have been kept private by using the reflection function, but I don't think there is a particular problem with this.

This is because access control clearly indicates the intention of the module designer, and if another person accesses the module without understanding the intention when using the module, the access deviates from the intention of the designer. This is because I believe that the purpose is to inform the public.

Especially when writing test code, it is sometimes required to be able to access private functions and data, so having access to private functions and data is not a problem per se.

The problem is that there is no such control and everything is accessible.

Dissatisfied with Lua features

Lua is a compact and powerful language, but there are many features not supported in plain Lua.

One of the purposes of developing the transcompiler is to support functions not supported by plain Lua without modifying Lua.

not nil safe

Lua's nil is a useful value, but it also causes dynamic errors. Many engineers are plagued by this nil -related error.

The solution to that problem is nil-safety.

Many features that are indispensable in modern programming have already been realized since the days of Lisp. For example, GC, lambda expressions, closures, etc. have been around for decades.

In other words, it can be said that it has hardly evolved since that time.

"Hardly evolved" also means "somewhat evolved," and nil-safety can be included in that evolution. That's how important it is.

However, Lua does not support nil-safety, and as a modern language, it can be said that this is a considerable deduction target.

On a side note, Rust addresses the danger of nil (null) with the concept of lifetime and ownership. When I saw this approach for the first time, I was very interested in thinking, "Is there such a pattern?"

In addition, Rust solves various problems such as memory management and data access races, as well as nil-safety through lifetimes and ownership.

If you've never touched Rust before, be sure to check out lifetimes and ownership.

Quiet topic.

In LuneScript, nilable types that can take nil and non-nilable types that cannot take nil are managed as separate types to prevent nil errors from occurring at unintended timing.

In addition, by supporting unwrap processing for conversion from nilable to non-nilable types, and nil conditional operators for easy access to multilevel nilable data, nil errors can be handled easily and safely. .

no macros

Speaking of macros, Lisp has very powerful macros and can be said to be the representative of languages with macros. It's no exaggeration to say that macros underpin Lisp's fascination.

However, I feel that many relatively new languages do not support macros.

Even the C language has "Nanchatte macros", so why is that?

Well, even if the language itself doesn't have macros, if an engineer creates a separate script that automatically generates code from some data, it might be possible to say that macros are unnecessary.

However, doing so would result in a flood of "some data" and "automatically generated scripts".

I think that macros are necessary to prevent such things from happening.

However, macros as sophisticated as Lisp are difficult to implement, and require a certain amount of learning on the part of those who use them.

LuneScript provides macros that are easy to implement, easy to learn, and effective for anyone to use.

I use macros in self-hosting of LuneScript, but I feel once again that macros are indispensable for programming languages.

There are few alternatives for the ease of embedding and high execution performance that are characteristics of Lua

As mentioned earlier, Lua is one of the easiest languages to integrate into your system.

In particular, its compactness and the fact that it can be compiled using only standard C functions are extremely useful for embedding.

There are several other embedded languages, but I don't know any language that surpasses Lua in terms of embedding.

Many systems already use Lua

There are many systems that embed Lua.

Once incorporated into a system, Lua will continue to live as long as the system is alive.

Just because you don't like it doesn't mean you can change it.

lastly

LuneScript is developed to compensate for the shortcomings of Lua.

This is not because Lua is a language that cannot be used, but because it is a shame to leave Lua's shortcomings and flirt with other languages.

If you have an opportunity to consider embedded languages in the future, please consider the fact that Lua has LuneScript.

As I said many times, Lua is a lightweight language with high execution performance. And keep in mind that Lua also has a LuneScript option.