Saturday, November 17, 2012

Avoiding Perl Locale Errors with Remote Git

I've long been troubled by annoying locale warnings that appeared with every Git command.

$ git pull
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LANG = "fi_FI.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Current branch master is up to date.

Generating the locales with "locale-gen fi_FI fi_FI.UTF-8", which was suggested by some, didn't help at all.

It seems that the problem occurs with remote SSH repositories, when Git makes an SSH call to the server. Apparently, Git runs a shell on the remote host, and the language environment variables are forwarded to that host. When the remote host doesn't have the locale installed, the warning appears.
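
To check whether a given host really has the locale, its "locale -a" listing can be compared with the value of LANG. The following is only a sketch: the helper names normalize_locale and has_locale are made up for this example, and it assumes the listing may spell the encoding differently from LANG (for example fi_FI.utf8 instead of fi_FI.UTF-8, as glibc systems tend to do), so both sides are normalized before comparing.

```shell
# normalize_locale NAME: lowercase the name and drop hyphens, so that
# "fi_FI.UTF-8" and "fi_FI.utf8" compare equal (hypothetical helper).
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# has_locale NAME: succeed if NAME is in the locale list read from stdin
# (hypothetical helper; feed it the output of "locale -a").
has_locale() {
    local wanted line
    wanted=$(normalize_locale "$1")
    while IFS= read -r line; do
        [ "$(normalize_locale "$line")" = "$wanted" ] && return 0
    done
    return 1
}

# Usage, run by hand on the remote server:
#   locale -a | has_locale "fi_FI.UTF-8" && echo installed || echo missing
```

Whether the variables are forwarded at all typically depends on the SendEnv (client side) and AcceptEnv (server side) options of OpenSSH, so the result may differ between hosts.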

So, the solutions are:

  1. generate the locales on the remote repository server, or if that is not possible,
  2. change the locale to LANG=C on the local host.
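
Option 2 does not have to change the locale of the whole session. As a sketch, the env command can set LANG=C for a single command only. The function name run_in_c_locale is hypothetical, and the sketch assumes that LC_ALL is unset (as in the warning above), since LC_ALL would override LANG.

```shell
# run_in_c_locale CMD [ARGS...]: run one command with LANG=C.
# Only the child process sees the changed value; the calling shell
# keeps its own LANG (hypothetical helper name).
run_in_c_locale() {
    env LANG=C "$@"
}

# Usage:
#   run_in_c_locale git pull
```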

I prefer to keep fi_FI as my system locale, so I wrapped the git calls in the following Bash script:

#!/bin/bash

# This is needed when git makes an SSH call to a remote
# server which doesn't have all the locales installed
export LANG=C

# Note: the "$@" is important notation to avoid
# splitting arguments at spaces
/usr/bin/git "$@"
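
The effect of the quotes around $@ can be seen with a small experiment. This is just an illustrative sketch; the function names count_args, forward_quoted and forward_unquoted are invented for the example and are not part of Git or the wrapper.

```shell
# count_args prints how many arguments it received.
count_args() {
    echo "$#"
}

# forward_quoted passes its arguments on unchanged, like the wrapper does.
forward_quoted() {
    count_args "$@"
}

# forward_unquoted lets the shell re-split every argument at whitespace.
forward_unquoted() {
    count_args $@
}

forward_quoted commit -m "fix locale warning"     # prints 3
forward_unquoted commit -m "fix locale warning"   # prints 5
```

Without the quotes, a commit message with spaces would reach git as several separate arguments.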

Monday, November 12, 2012

Scala Streams

The Scala course at Coursera is in its last week. The course has been quite an interesting introduction to Scala from the purely functional point of view, which is probably the most relevant one from a Computer Science perspective. I'm still doing the last exercises, which should be finished by next Sunday. The exercises have been quite interesting; I haven't gotten full points, but almost.

Anyhow, streams are quite an interesting data structure in Scala. Yesterday, I wrote the following small encoder, which encodes any text input in the pattern of the SETI (Search for Extra-Terrestrial Intelligence) message sent from the Arecibo radio telescope in 1974. It has a fairly limited purpose, but streams worked quite nicely in solving this problem.

So, here's my solution:

  def setiencode(input: String): String = {
    val seticode =
      "00000010101010000000000\n" +
        "00101000001010000000100\n" +
        "10001000100010010110010\n" +
        "10101010101010100100100\n" +
        "00000000000000000000000\n" +
        "00000000000011000000000\n" +
        "00000000001101000000000\n" +
        "00000000001101000000000\n" +
        "00000000010101000000000\n" +
        "00000000011111000000000\n" +
        "00000000000000000000000\n" +
        "11000011100011000011000\n" +
        "10000000000000110010000\n" +
        "11010001100011000011010\n" +
        "11111011111011111011111\n" +
        "00000000000000000000000\n" +
        "00010000000000000000010\n" +
        "00000000000000000000000\n" +
        "00001000000000000000001\n" +
        "11111000000000000011111\n" +
        "00000000000000000000000\n" +
        "11000011000011100011000\n" +
        "10000000100000000010000\n" +
        "11010000110001110011010\n" +
        "11111011111011111011111\n" +
        "00000000000000000000000\n" +
        "00010000001100000000010\n" +
        "00000000001100000000000\n" +
        "00001000001100000000001\n" +
        "11111000001100000011111\n" +
        "00000000001100000000000\n" +
        "00100000000100000000100\n" +
        "00010000001100000001000\n" +
        "00001100001100000010000\n" +
        "00000011000100001100000\n" +
        "00000000001100110000000\n" +
        "00000011000100001100000\n" +
        "00001100001100000010000\n" +
        "00010000001000000001000\n" +
        "00100000001100000000100\n" +
        "01000000001100000000100\n" +
        "01000000000100000001000\n" +
        "00100000001000000010000\n" +
        "00010000000000001100000\n" +
        "00001100000000110000000\n" +
        "00100011101011000000000\n" +
        "00100000001000000000000\n" +
        "00100000111110000000000\n" +
        "00100001011101001011011\n" +
        "00000010011100100111111\n" +
        "10111000011100000110111\n" +
        "00000000010100000111011\n" +
        "00100000010100000111111\n" +
        "00100000010100000110000\n" +
        "00100000110110000000000\n" +
        "00000000000000000000000\n" +
        "00111000001000000000000\n" +
        "00111010100010101010101\n" +
        "00111000000000101010100\n" +
        "00000000000000101000000\n" +
        "00000000111110000000000\n" +
        "00000011111111100000000\n" +
        "00001110000000111000000\n" +
        "00011000000000001100000\n" +
        "00110100000000010110000\n" +
        "01100110000000110011000\n" +
        "01000101000001010001000\n" +
        "01000100100010010001000\n" +
        "00000100010100010000000\n" +
        "00000100001000010000000\n" +
        "00000100000000010000000\n" +
        "00000001001010000000000\n" +
        "01111001111101001111000\n"

    val parts: List[String] = seticode.split("1").map(x => x.replace('0', ' ')).toList
    val parts0: List[String] = parts.dropRight(1)
    val partsn: List[String] = parts0.updated(0, parts.last + parts.head)
    val partstream: Stream[String] = parts0.toStream ++ (Stream.continually(partsn).flatten)

    // mkString, unlike reduce, also works when the input is empty
    (input.toList zip partstream).map(x => x._2 + x._1).mkString
  }
Especially interesting were the possibility to repeat a list endlessly as an infinite stream with Stream.continually() and the joining of a string and a stream with the zip method. I suppose an even better solution would have used a stream split operation, but I didn't find one built in, so the above seemed to be the easiest.