Corpus Analysis For Range

The folks working on the Range proposal asked for some help in determining whether or not Range should be an Iterator or an Iterable.

I used Python to investigate this, as the language has had this feature for a long time.

In Python, although range is reusable, it is generally not reused. Typically a new throw-away range is created as seen in this example. We use quite a bit of Python at Mozilla, and our code search is stronger than GitHub's. Mozilla’s Python code is not a representative sample, but it reuses a number of Python libraries and is easily searchable, so I think it gives us a decent idea of what we are working with.
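A minimal Python sketch of the distinction, using only built-ins: a range object can be iterated more than once, whereas an iterator obtained from it is consumed after one pass. The variable names are illustrative and are not taken from the searched code.

```python
# A range object is a reusable iterable: each pass starts from the beginning.
r = range(3)
first_pass = list(r)
second_pass = list(r)

# An iterator made from a range is single-use: a second pass finds nothing left.
it = iter(range(3))
only_pass = list(it)
leftover = list(it)

# The common throw-away style: the range is created inline and never named.
total = 0
for i in range(3):
    total += i

print(first_pass, second_pass)  # [0, 1, 2] [0, 1, 2]
print(only_pass, leftover)      # [0, 1, 2] []
print(total)                    # 3
```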

In a review of Searchfox, I found that the use of range as a throw-away value significantly exceeded the use of bare range as a reusable value. There were only 112 bare reuse cases, compared with over 100,000 throw-away uses. In this long, repetitive argument, we are modifying the behaviour for the benefit of roughly 0.1% of cases at most in this example.

I noted a few different types of range re-use, including wrapped range cases.

I did find true reuse in this form, but it was pretty rare. The closest thing to a refactoring hazard as described above that I found was a range used across a language boundary, but because it crosses a language boundary into C, it is cast into a different type, namely a list. Another interesting case is one where, because range is reusable, it appears to be used in a surprising way: the column list might be replaced by a range and eventually cast back into a list.
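The column-list case might look roughly like the following sketch. The helper `select_columns` and its data are hypothetical, invented here for illustration and not taken from the searched code; the point is only that a range can stand in for a list of indices and be converted back to a list at the boundary.

```python
from typing import List, Sequence


def select_columns(row: Sequence[str], columns: Sequence[int]) -> List[str]:
    """Hypothetical helper: pick the given column indices out of a row."""
    # Normalise to a list at the boundary, so callers may pass a list or a range.
    column_list = list(columns)
    return [row[i] for i in column_list]


row = ["id", "name", "email", "created"]

# A range stands in for an explicit list of indices ...
from_range = select_columns(row, range(1, 3))
# ... and gives the same result as the list it replaces.
from_list = select_columns(row, [1, 2])

print(from_range)  # ['name', 'email']
```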

One of the findings was that range would sometimes be reused in the following form:

list(range(10))
enumerate(range(10))
iter(range(10))
etc.

This form also outnumbered bare range assignments. That is, the range is transformed and used as something semantically meaningful, such as a list, because a range in and of itself is not meaningful. This is very interesting, and perhaps my most important finding.
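To illustrate the wrapped forms in Python: each wrapper turns the range into a value with its own meaning. The names below are made up for the example.

```python
# list(): materialise the numbers so they can be stored, mutated, and reused.
indices = list(range(5))
indices.append(99)  # a range object has no append method

# enumerate(): pair each number with its position.
pairs = list(enumerate(range(10, 13)))

# iter(): an explicit single-use iterator that can be advanced by hand.
it = iter(range(3))
head = next(it)
rest = list(it)

print(indices)     # [0, 1, 2, 3, 4, 99]
print(pairs)       # [(0, 10), (1, 11), (2, 12)]
print(head, rest)  # 0 [1, 2]
```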

I would argue that, given that reuse is rare and range is usually used as a throw-away value, designing for that use case at the expense of language consistency is a poor language design choice. We would be prioritizing a very rare use over the learnability of the language.

However, we can make the uncommon case meaningful. In fact, the JavaScript language already has idioms that support this. The proposed form of

let x = Iterator.from(Number.range(0, 10));
let y = Array.from(Number.range(0, 10));

works very nicely for reuse and addresses most of the use cases. It reflects the reuse of range in a meaningful way: we had a “range”, and it can become an iterator or an array, whichever suits the developer's use case, for example if they need an array specifically or are simply more comfortable with one. Number.range would function equally well in both cases, while remaining effectively a throw-away value (the common case). In fact, this form would mirror what Python programmers tend to do more closely than what is being proposed here, which is to shoehorn range into being an iterator with an irregular syntax. There is much to be lost in trying to force JavaScript to be something it is not.
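The same shape can be modelled in Python terms. This is only a sketch: `number_range` is a hypothetical stand-in written for this note, not the proposal's API, and it assumes a positive step. It behaves as a single-use iterator, with reuse made explicit by converting to a list first.

```python
from typing import Iterator


def number_range(start: int, end: int, step: int = 1) -> Iterator[int]:
    """Hypothetical single-use range; assumes step is positive."""
    current = start
    while current < end:
        yield current
        current += step


# Common case: a throw-away value consumed once.
squares = [n * n for n in number_range(0, 4)]

# Uncommon case: reuse is made explicit by converting to a list first.
reusable = list(number_range(0, 4))
first_sum = sum(reusable)
second_sum = sum(reusable)

# Without the conversion, a second pass over the same iterator sees nothing.
single_use = number_range(0, 4)
consumed = sum(single_use)
empty = sum(single_use)

print(squares)                # [0, 1, 4, 9]
print(first_sum, second_sum)  # 6 6
print(consumed, empty)        # 6 0
```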

Terse code isn’t better code. It is often more confusing and error-prone, and it does not teach the programmer what it is doing. By using Iterator.from, we can communicate what is being built while maintaining the existing categorizations of behaviour in the language; the only difference is that it is longer. Forms such as Array.from already exist. There is no need to invent something new for this proposal. Doing so undermines an otherwise good proposal.

My observation here is that we are stuck not on usability, but on the aesthetics of the uncommon case. The use case was never proven (that is, one can always argue that JS will be different), and we are talking in circles about a hypothetical situation that is rare even in languages that had this built in from the start. Remember, this was roughly 0.1% at most of what I found.

There may be reason to continue discussing usability, but it is not this refactoring hazard. I would like to pivot the discussion towards readability and communication, and away from this semantics debate. I believe we should think more creatively about how to communicate to programmers rather than privileging the rare case over the common one. It is an exercise we will likely need to do anyway: think about how we are naming this. range is a noun here. We could change it to a verb such as count or iterate. By using a verb, the consumption of the value will become more self-evident. This would be a more productive direction than going in circles.
