The Intellectual Wilderness There is nothing more useless than doing efficiently that which should not be done at all.

2019.11.21 17:24

Testing Textually Composed Numbers for Primality

Filed under: Computing, Science & Tech | zxq9 @ 17:24

Last night on Twitter one of my favorite accounts, @fermatslibrary, put up an interesting post:

Start at 82 and write it down, then 81, write that down, etc. until you reach one, and you’ve written a (huge) prime number. Wow!

This seemed so strange to me that of course I got curious.

After some doodling around I wrote a script that checks whether numbers constructed by the method above are prime numbers when the number construction is performed in various numeric bases (from 2 to 36, limited to 36 because for now I’m cheating using Erlang’s integer_to_list/2). It prints the results to the screen as the process handling each base finishes and writes a text file “texty_primes-result-[timestamp].eterms” at the end for convenience so you can use file:consult/1 on it later to mess around.
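
For the curious, the construction itself fits in a few lines. This is only a minimal sketch of the idea and not the texty_primes script itself; the module name countback and the functions digits/2 and number/2 are made up for illustration.

```erlang
-module(countback).
-export([digits/2, number/2]).

% Write X, X-1, ..., 1 in the given base and glue the digit strings together.
% integer_to_list/2 only supports bases 2..36, hence the limit mentioned above.
digits(X, Base) when X >= 1, Base >= 2, Base =< 36 ->
    lists:append([integer_to_list(N, Base) || N <- lists:seq(X, 1, -1)]).

% Read the glued digit string back as a single integer in the same base.
number(X, Base) ->
    list_to_integer(digits(X, Base), Base).
```

For example, countback:digits(4, 10) gives "4321", and countback:number(3, 2) reads the binary string "11101" back as 29.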

There are a few strange aspects to large primes, one of them being that checking whether or not they are prime can be a computationally intense task (and no cheap shortcut is known). To this end I wrote the Miller-Rabin primality test into the script and let the caller decide how many rounds of Miller-Rabin to run against each number. So far the numbers that have come out have matched what is expected, but Miller-Rabin is a probabilistic test: a number that passes is only very likely prime, and each extra round shrinks the chance that a composite slipped through. Once the numbers get extremely large (and they get pretty big in a hurry!) there is only some degree of confidence that they are really prime, so don’t take the output as gospel.
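
Here is roughly what a Miller-Rabin check looks like in Erlang. Treat it as a sketch under my own assumptions rather than the code in the script: the module name mr_sketch and all function names are invented here, and the random bases come from the standard rand module.

```erlang
-module(mr_sketch).
-export([is_probable_prime/2]).

% Returns false if N is definitely composite, true if N survived
% the requested number of Miller-Rabin rounds with random bases.
is_probable_prime(N, _) when N < 2 -> false;
is_probable_prime(N, _) when N < 4 -> true;            % 2 and 3
is_probable_prime(N, _) when N rem 2 =:= 0 -> false;
is_probable_prime(N, Rounds) ->
    {S, D} = split(N - 1, 0),
    rounds(N, S, D, Rounds).

% Write N - 1 as 2^S * D with D odd.
split(D, S) when D rem 2 =:= 0 -> split(D div 2, S + 1);
split(D, S) -> {S, D}.

rounds(_, _, _, 0) -> true;
rounds(N, S, D, K) ->
    A = 1 + rand:uniform(N - 3),                       % random base in 2..N-2
    case pow_mod(A, D, N) of
        1 -> rounds(N, S, D, K - 1);
        X when X =:= N - 1 -> rounds(N, S, D, K - 1);
        X ->
            case squarings(X, S - 1, N) of
                true  -> rounds(N, S, D, K - 1);
                false -> false
            end
    end.

% Square up to R more times, looking for N - 1.
squarings(_, 0, _) -> false;
squarings(X, R, N) ->
    case (X * X) rem N of
        Y when Y =:= N - 1 -> true;
        1 -> false;
        Y -> squarings(Y, R - 1, N)
    end.

% Modular exponentiation by repeated squaring.
pow_mod(_, 0, _) -> 1;
pow_mod(B, E, M) when E rem 2 =:= 0 ->
    pow_mod((B * B) rem M, E div 2, M);
pow_mod(B, E, M) ->
    (B * pow_mod((B * B) rem M, E div 2, M)) rem M.
```

A real prime always passes. A composite can only pass if every randomly chosen base happens to be a "liar", and each round cuts that chance by at least a factor of four, which is why more rounds buy more confidence but never certainty.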

I wrote the program in Erlang as an escript, so if you want to run it yourself just download the script and execute it.
The script can be found here: texty_primes

A results file containing the (very likely) prime constructions in bases 2 through 36 using “count-back from X” where X is 1 to 500 can be found here: texty_primes-result-20191121171755.eterms
Analyzing from 1 to 500 in bases 2 through 36 took about 25 minutes on a mid-grade 8-core system (Ryzen 5). There are some loooooooooong numbers in that file… It would be interesting to test the largest of them for primality in more depth.

(Note that while the script runs you will receive unordered “Base X Result” messages printed to stdout. This is because every base is handed off to a separate process for analysis and they finish at very different times somewhat unpredictably. When all processing is complete the text file will contain a sorted list of {Base, ListOfPrimes} that is easier to browse.)
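
The fan-out pattern described in that note can be sketched like this. It is an illustration of the general approach and not lifted from the script: the module name fanout, the function run/2 and the exact wording of the printed message are all my own stand-ins, and Work is whatever fun(Base) -> Result the caller supplies.

```erlang
-module(fanout).
-export([run/2]).

% Spawn one process per base, print each result as it arrives,
% then return all results sorted by base.
run(Bases, Work) ->
    Parent = self(),
    Ref = make_ref(),
    _ = [spawn_link(fun() -> Parent ! {Ref, Base, Work(Base)} end) || Base <- Bases],
    lists:sort(collect(Ref, length(Bases), [])).

% Results arrive in whatever order the workers finish.
collect(_, 0, Acc) -> Acc;
collect(Ref, Pending, Acc) ->
    receive
        {Ref, Base, Result} ->
            ok = io:format("Base ~w Result: ~tp~n", [Base, Result]),
            collect(Ref, Pending - 1, [{Base, Result} | Acc])
    end.
```

Calling fanout:run([10, 2, 3], fun(B) -> B * B end) prints three lines in no particular order and then returns the list sorted by base.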

An interesting phenomenon I observed while doing this is that some numeric bases seem simply unsuited to producing primes when numbers are generated in this manner, bases that themselves are prime numbers in particular. Other bases seem to be rather fruitful places to search for this phenomenon.

Another interesting phenomenon is the large number of numeric bases in which the numbers “21”, “321”, “4321” and “54321” turn out to be prime. “21” and “4321” in particular turn up quite a lot.

Perhaps strangest of all is that base 10 is not a very good place to look for these kinds of primes! In fact, the “count back from 82” prime is the only one I’ve found that can be constructed starting between 1 and 500. It is remarkable that anyone discovered that at all, and also remarkable that it doesn’t happen to start at 14,562 instead of 82. I’m sure nobody would have noticed this were the magic starting point for constructing a prime this way much higher than 82.

This was fun! If you have any insights, questions, challenges or improvements, please let me know in the comments.

2019.08.03 05:20

Building Erlang 22.0 on Debian/Ubuntu

Filed under: Computing, Science & Tech | zxq9 @ 05:20

Every time I switch to a new system and have to build a new release of Erlang with kerl I sit and scratch my head to remember which dependencies are required. Once you’re set up or have a prep script it is just too easy to forget which thing is needed for what over the next few years.

Here is my list of pre-build package installs on Ubuntu 18.04 — note that they are in three groups instead of just being a single long apt install command (why apt couldn’t manage to install these all at once is beyond me…):

Group 1:

  • gcc
  • curl

Group 2:

  • g++
  • dpkg-dev

Group 3:

  • build-essential
  • automake
  • autoconf
  • libncurses5-dev
  • libssl-dev
  • flex
  • xsltproc
  • libwxgtk3.0-dev

2019.07.23 08:56

Erlang: R22.0 doc Mirror Updated

Filed under: Computing | zxq9 @ 08:56

The Erlang doc mirror linked here has been updated to include the R22.0 docs.

Note that some of the internal links and labels say “ERTS-10.4” and “Version 10.4” instead of “ERTS-11.0” and “Version 11.0”. This is an error. The docs refer to ERTS 11.0 but that detail seems to not have been updated when these docs were generated (whoops!). I was looking at fixing that throughout the docs and links, but it turns out to be a lot more complicated than I’m willing to deal with because of the number of references that include the string “10.4” (and some of them are in PDFs and other things more annoying to update than HTML pages). When the R22.1 docs come out that will probably be fixed and I’ll update to avoid confusion in the distant future.

2019.05.11 15:46

Erlang doc mirror

Filed under: Computing | zxq9 @ 15:46

erlang.org is undergoing scheduled maintenance this weekend.
In the meantime here is the doc mirror link:
http://zxq9.com/erlang/docs/

2018.08.14 15:37

Silly: Hextexting via the command line…

Filed under: Computing, Society | zxq9 @ 15:37

A silly thread on Twitter came to my attention today that stirred some late 1980s/1990s phreak/hax0r nostalgia in me. So, of course, I did what any geek would do: I wrote a one-off utility script for it. Have fun confusing your parents, kids.

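#! /usr/bin/env escript

-mode(compile).

% Usage: hextext t TEXT...   (text to hex)
%        hextext h HEX...    (hex to text)
main([Command | Input]) ->
    ok = io:setopts([{encoding, unicode}]),
    Output = convert(Command, Input),
    io:format("~ts~n", [Output]);
main([]) ->
    % No arguments at all: print the usage text instead of crashing.
    io:format("~ts~n", [convert("", [])]).

% "t": join the arguments with spaces, then print each character's codepoint in hex.
convert("t", Input) ->
    String = string:join(Input, " "),
    string:join(lists:map(fun(C) -> integer_to_list(C, 16) end, String), " ");
% "h": read each argument as a hex codepoint and rebuild the text.
convert("h", Input) ->
    lists:map(fun(C) -> list_to_integer(C, 16) end, Input);
convert(_, _) ->
    "hextext usage: `hextext t|h [text]`".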

(Also, look up rot13 — ’twas all the rage 30 years ago, and still makes an appearance as a facilitator of hidden easter eggs in some games. A lot of “garbled alien/monster/otherling speech” text is rot13.)
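
If you would rather play with rot13 than look it up, here is a small sketch in the same spirit. The module name rot13 and the function encode/1 are just names I picked for the example, and it only rotates plain ASCII letters.

```erlang
-module(rot13).
-export([encode/1]).

% Rotate ASCII letters by 13 places and leave everything else untouched.
% Applying it twice returns the original string.
encode(String) ->
    [rotate(C) || C <- String].

rotate(C) when C >= $a, C =< $z -> $a + (C - $a + 13) rem 26;
rotate(C) when C >= $A, C =< $Z -> $A + (C - $A + 13) rem 26;
rotate(C) -> C.
```

Since the alphabet has 26 letters, encoding and decoding are the same operation: rot13:encode(rot13:encode(Text)) gives Text back.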

2018.07.11 17:38

Erlang: Getting Started Without Melting

There are two things that might be meant when someone references “Erlang”: the language, and the environment (the EVM/BEAM and OTP). The first one, the language part, is actually super simple and quick to learn. The much larger, deeper part is learning what the BEAM does and how OTP makes your programs better.

It is clear that without an understanding of Erlang we’re not going to get very far in terms of understanding OTP and won’t be skilled enough to reliably interact with the runtime through a shell. So let’s forget about the runtime and OTP for a bit and just aim at the lowest, most common beginners’ task in coding: writing a script that tells me “Hello, World!” and shows whatever arguments I pass to it from the command line:

#! /usr/bin/env escript

% Example of an escript
-mode(compile).

main(Args) ->
    ok = io:setopts([{encoding, unicode}]),
    ok = io:format("Hello, world!~n"),
    io:format("I received the args: ~tp~n", [Args]).

Let’s save that in a file called e_script, run the command chmod +x e_script to make it executable, and take a look at how this works:

ceverett@takoyaki:~$ ./e_script foo bar
Hello, world!
I received the args: ["foo","bar"]
ceverett@takoyaki:~$

Cool! So it actually works. I can see a few things already:

  1. I need to know how to call some things from the standard library to make stuff work, like io:format/2
  2. io:setopts([{encoding, unicode}]) seems to make it OK to print UTF-8 characters to the terminal in a script
  3. An escript starts execution with a traditional main/1 function call

Some questions I might have include how or why we use the = for both assignment and assertion in Erlang, what the mantra “crash fast” really means, what keywords are reserved, and other issues which are covered in the Reference Manual (which is surprisingly small and quick to read and reference).
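
The = question is worth a tiny demonstration, since the script above already leans on it with ok = io:setopts(...). The following is a minimal sketch; the module name match_demo and its functions are invented for the example.

```erlang
-module(match_demo).
-export([demo/0, check/1]).

demo() ->
    X = 1,                     % X is unbound here, so the match binds it
    1 = X,                     % X is bound now, so this asserts that X is 1
    {ok, Y} = {ok, "value"},   % binds Y and asserts the 'ok' tag at once
    {X, Y}.

% A failed match raises a badmatch error. Normally you would just let the
% process crash; the try/catch is only here to make the failure visible.
check(Reply) ->
    try
        {ok, Value} = Reply,
        {fine, Value}
    catch
        error:{badmatch, Bad} -> {crashed_on, Bad}
    end.
```

So = is really a pattern match: it binds whatever is still unbound and crashes on anything that does not fit. Writing ok = some_call() is the everyday form of "crash fast", because a surprise return value stops the program right where the surprise happened.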

An issue some newcomers encounter is that navigating an unfamiliar set of documentation can be hard. Here are the most important links you will need to know to get familiar and do useful things with the sequential language:

This is a short list, but these are the links you’ll most often want to know how to find. It is also easy to pull up any given module by doing a search for “erlang [module name]” on any search engine. (Really, any of them.)

In the rare case that erlang.org is having a hard time I maintain a mirror of the docs for various Erlang release versions here as well: http://zxq9.com/erlang/

Start messing with sequential Erlang. Don’t worry about being fancy and massively concurrent or maximizing parallelization or whatever — just mess around at first and get a feel for the language using escript. It is a lot of fun and makes getting into the more fully encompassing instructional material much more comfortable.

2018.06.27 00:09

Erlang: R21 doc mirror

Filed under: Computing | zxq9 @ 00:09

Erlang doc mirror for R21 is now up.
(For those times when erlang.org takes a nap…)

http://zxq9.com/erlang/docs/reg/21.0/

Erlang: ZJ docs

Filed under: Computing | zxq9 @ 00:06

Docs for the ZJ Erlang JSON encoder/decoder are now available here:

http://zxq9.com/projects/zj/docs/

The binary_encode/1 function will probably be live tomorrow, along with a proper v1.0 release.

2018.06.26 14:52

Your tests don’t tell you what you think they do

Filed under: Computing, Science & Tech | zxq9 @ 14:52

Yesterday I wrote a tiny JSON encoder/decoder in Erlang. While the Erlang community wasn’t in dire need of yet another JSON parser, the ones I saw around do things just a tiny bit differently than I want them to and writing a module against RFC-8259 isn’t particularly hard or time consuming.

Someone commented on (gasp!) the lack of tests in that module. They were right. I just needed the module to do two things, the code is boring, and I didn’t write tests. I’m such a rebel! Or a villain! Or… perhaps I’m just someone who values my time.

Maybe you’re thinking I’m one of those coding cowboys who goes hog wild on unsafe code! No. I’m not. Nothing could be further from the truth. What I have learned over the last 30 years of fiddling about with software is that hand-written tests are mostly a waste of time.

Here’s what happens:

  1. You write a new thingy.
  2. You throw all the common cases at it in the shell. It seems to work. Great!
  3. Being a prudent coder you basically translate the things you thought to throw at it in the shell into tests.
  4. You hook it up to an actual project you’re using somewhere — and it breaks!
  5. You fix the broken bits, and maybe add a test for whatever you fixed.
  6. Then other people start using it in their projects and stuff breaks quite a lot more ZOMG AHHH!

Where in here did your hand-written tests help out? If you write tests to define the bounds of the problem before you actually write your functions then tests might help out quite a lot because they deepen your understanding of the problem before you really tackle it head-on. Writing tests before code isn’t particularly helpful if you already thoroughly understand the problem and just need something to work, though.

When I wrote ZJ yesterday I needed it to work in the cases that I care about — and it did, right away. So I was happy. This morning, however, someone else decided to drop ZJ into their project and give it a go — and immediately ran into a problem! ZJ v0.1.0 returns an error if it finds trailing commas in JSON arrays or objects! Oh noes!

Wait… trailing commas aren’t legal in JSON. So what’s the deal? Would tests have discovered this problem? Of course not, because hand-written tests would have been bounded by the limits of my imagination and my imagination was hijacked by an RFC all day yesterday. But the real world isn’t an RFC, and if you’ve ever dealt with JSON in the wild that you’re not generating you’ll know that all sorts of heinous and malformed crap is clogging the intertubes, and most of it sports trailing commas.

My point here isn’t that testing is bad or always a waste of time. My point is that hand-written tests are prone to the exact same problems as the code being tested: you wrote them, so they carry flaws of implementation, design and scope, just like the rest of your project.

“So when is testing good?” you might ask. As mentioned earlier, a great time to write tests is when you are trying to model the problem in your mind for the first time, before you’ve written any handling code, for no other reason than that they help you understand the problem. But that’s about as far as I go with hand-writing tests.

The three types of testing I like are:

  • type checks
  • machine generated (property testing)
  • real-world (user testing)

A good type checker like Dialyzer (or especially ghc’s type system, but that’s Haskell) can tell you a lot about your code in very short order. It isn’t unusual at all to have sections of code that are written to do things that are literally impossible. You wouldn’t find out about them until much later because, simply for lack of imagination, hand-written tests quite often never execute that code, or not in a way that would reveal the structural error.
Typespecs: USE THEM
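
To make "code written to do impossible things" concrete, here is a made-up module (spec_demo and its functions are hypothetical) with a case clause that can never run. It compiles and works fine, and a hand-written test would likely never notice the dead branch, but I would expect Dialyzer to complain that the pattern unknown can never match what sign/1 returns.

```erlang
-module(spec_demo).
-export([describe/1]).

-spec sign(integer()) -> negative | zero | positive.
sign(N) when N < 0 -> negative;
sign(0) -> zero;
sign(_) -> positive.

-spec describe(integer()) -> string().
describe(N) ->
    case sign(N) of
        negative -> "below zero";
        zero     -> "zero";
        positive -> "above zero";
        unknown  -> "cannot happen"   % sign/1 never returns 'unknown'
    end.
```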

Good property testing systems like PropEr and QuickCheck generate and run as many tests as you give them time to (really, it is just constrained by time and computing resources), and once they discover breakages can actually shrink the failing input down to pinpoint the exact failing cases and very often indicate the root cause pretty quickly. It is amazing. If you ever experience this you’ll never want to hand-write tests again.
Property Testing: USE IT
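
If you have never seen the idea, here is a hand-rolled toy version of it. To be clear, this is not PropEr or QuickCheck and it does no shrinking; it only shows the shape of "state a property, then throw random inputs at it" using the standard rand module. The module name prop_sketch and everything in it are invented for the example.

```erlang
-module(prop_sketch).
-export([check/1, to_hex/1, from_hex/1]).

% The code under test: codepoints to hex tokens and back.
to_hex(String)   -> [integer_to_list(C, 16) || C <- String].
from_hex(Tokens) -> [list_to_integer(T, 16) || T <- Tokens].

% Property: decoding an encoded string gives back the original string.
prop_roundtrip(String) ->
    from_hex(to_hex(String)) =:= String.

% Run the property against Runs random strings.
% Returns ok, or {failed, Input} for the first counterexample found.
check(0) -> ok;
check(Runs) ->
    Input = random_string(),
    case prop_roundtrip(Input) of
        true  -> check(Runs - 1);
        false -> {failed, Input}
    end.

% Between 0 and 20 random codepoints below 16#D800.
random_string() ->
    Length = rand:uniform(21) - 1,
    [rand:uniform(16#D800) - 1 || _ <- lists:seq(1, Length)].
```

Running prop_sketch:check(1000) tries a thousand random strings, including the empty one, which is already more variety than most of us would type by hand. The real tools add smarter generators and shrink a failing input down to a minimal one.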

What about user testing? It is simply necessary. You’ll never dream up the insane stuff to try that users will, and neither will a property-based test generation system. Your test and development environment will often bear little resemblance to your users’ environments (a few weirdos out there still use Windows!), the things you might think to store in your system will rarely look anything like the sort of stuff they will wind up storing in it (you were thinking text, they were thinking video), and the frequency of operation that you assumed might look realistic will almost never be anywhere close to the mark (your one-off utility program that you assumed would run in isolation initiated by a user command in ~/bin/ may become the core part of a massively parallelized service script executed every minute by a cron job running as root).
Your Users: COMMUNICATE WITH THEM

Ultimately, hand-written tests tend to reveal a lot more about the author of the tests than the status of the software being tested.

2018.06.25 21:03

Tiny strings-as-strings JSON in portable Erlang

There are several JSON libs for Erlang at this point, and as there is no correct mapping between JSON types and Erlang types, all make different tradeoffs that either work or don’t for your project. Beyond that, various interface and implementation differences exist due to the tradeoffs inherent in manipulating elements of the Black Tongue known as lolscript:

  • Accept values to encode as magic tagged tuples so you can specify exactly what you want VS being ambiguous
  • Never allow “naked” values (everything must be in a list/array or a map or a [whatever]) VS “hanging” values
  • Treat all strings ever as binaries because “strings are big” VS treating all strings (and binaries) as strings because strings are easy to manipulate (io_lists…)
  • Decode JSON “objects” as proplists VS decode JSON objects to dicts or maps VS add an “options” argument to the decode function
  • Encode and decode values various ways based on optional switches VS “sane defaults” (aka “works for me”)
  • Achieve lolspeed via NIFs and only work on *nix VS maintain portability via pure Erlang
  • etc.

No combination is correct for every situation, hence the proliferation of libraries. In addition to proliferation, something as simple as what is described by RFC-8259 shouldn’t require a 20k LoC dependency to manage, at least not in Erlang of all languages.
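
The root of the "no correct mapping" problem is easy to show. In the sketch below (the module name ambiguity and the guessing heuristic are hypothetical and not taken from any of the libraries mentioned), an encoder is handed a list and has to decide whether the caller meant a JSON string or a JSON array.

```erlang
-module(ambiguity).
-export([same/0, guess/1]).

% An Erlang string is just a list of codepoints, so these are one value.
same() ->
    "abc" =:= [97, 98, 99].

% An encoder that is handed a list has to guess what the caller meant.
% This heuristic is only an illustration of why no guess is always right.
guess([]) -> ambiguous;                 % "" or []?
guess(List) when is_list(List) ->
    case io_lib:printable_unicode_list(List) of
        true  -> probably_string;       % but the caller may have meant an array
        false -> array
    end;
guess(Bin) when is_binary(Bin) -> string.
```

Every option in the list above is some way of settling that guess: tagged tuples make the caller say what they mean, strings-as-binaries sidesteps the question, and strings-as-strings accepts the ambiguity in exchange for easier manipulation.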

The general strings-as-strings + portability tradeoffs were made by mochiweb years ago, with mochijson2 being the go-to JSON parser for lots of projects. Now that “tuple calls” have finally been retired after years of obsolescence and deprecation, mochijson2 is finally giving up the ghost as well (as it was based on tuple calls). As a replacement that makes mostly the same tradeoffs but is arguably simpler, I wrote a single-module JSON encoder/decoder lib. It treats all strings as strings, is in pure Erlang, and is utterly boring in how simple the code is. Nothing magical to see. At all. So don’t get excited.

If you need to read things in and read things out, in JSON, and don’t really care about lolspeed but want to understand what is happening, then ZJ is for you: ZJ project @ gitlab

Note that if you have roughly the same requirements but you want to make the strings-as-binaries tradeoff then JSX is the lib for you.

 
