Articles Labelled with “Article”

Google Android

First of all, Google decided to bind its brand new and "revolutionary" operating system to an old pile of crap like Java.

Then the Big G decided to enlighten us with App Inventor, nothing more than a toy for kids.

As a consequence of this policy, clearly aimed at enticing smart people, someone decided to write PFA, PHP for Android: doubtless an excellent idea.

My question now is: what's next? Maybe COBOL?

PyCon3

I have had a really intense week, which is why I am only now writing down my impressions of PyCon3, which ended last weekend.

First of all, why go to a Python event, given that my favorite programming language is Objective Caml? There are several answers.

To begin with, Python is the cleanest and most orthogonal dynamically typed language around. It has many features that bring it close to functional programming, which for me is nothing but a virtue, whatever Guido van Rossum, the father of Python, may say. In any case, I am not the only one who thinks that Python’s success is due mainly to these factors. In a very famous 2002 post, Paul Graham explained the language’s success precisely by the steady addition of functional features to it.

So the language certainly deserves a lot of attention, and no professional programmer should pass up the chance to learn it, if they do not know it already.

Another excellent reason to attend PyCon is the outstanding level of the invited guests. This year van Rossum himself, the creator of the language, clearly stood out above everyone else: he gave two extremely interesting talks, one on the genesis and development of Python 3.0 and one on the latest developments in Google App Engine. Alex Martelli’s talk, “Zen and the Art of Abstraction Maintenance”, was extremely interesting too: he explained how, in his view, an abstraction should be designed and what goals it should set itself in order to remain maintainable in the long run. Martelli also explained a very effective pattern, Dependency Injection. I like this pattern very much, and the only slightly critical (or rather ironic, I would say) remark I would make is that… it looks a lot like the everyday use of functors in OCaml; in fact, it is precisely functors implemented in Python. Again, with all due respect to Guido van Rossum :-)
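As a rough illustration of that resemblance, here is a minimal sketch of dependency injection done with a functor. It is not taken from Martelli's talk, and the names MAILER, Notifier and Test_mailer are hypothetical, chosen only for this example.

```ocaml
(* The dependency: anything that can "send" a message (hypothetical interface). *)
module type MAILER = sig
  val send : string -> unit
end

(* The component does not choose its mailer: it receives one as a functor argument. *)
module Notifier (M : MAILER) = struct
  let notify user = M.send ("Hello, " ^ user)
end

(* A fake dependency for tests: it records messages instead of sending them. *)
module Test_mailer = struct
  let sent : string list ref = ref []
  let send msg = sent := msg :: !sent
end

(* Injection happens here, at functor application time. *)
module Test_notifier = Notifier (Test_mailer)

let () = Test_notifier.notify "Guido"
```

Swapping Test_mailer for a module that really talks to a mail server would not require touching Notifier at all, which is the point of the pattern.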

Other talks by less “big” speakers were nonetheless very interesting: overall, the quality of the talks certainly ranged from high to very high.

Another good reason to attend PyCon is the community of hackers and professionals that revolves around Python. The usual Paul Graham wrote in 2004 about the Python paradox (Italian translation here), namely: if you program in Python you are a hacker, since it is (was) not used in industry; if you are a hacker, chances are you enjoy programming and are good at it; therefore, if I have to look for a good programmer, I had better look among those who know Python rather than among those who put Java on their CV. The logic is flawless. And if Graham had come to PyCon3, instead of lazing around at home, he would have had yet another confirmation of his paradox. Lots of friendly and competent people, the chance to talk with everyone about your experiences as a programmer, but also about silly things related to the computing world in general, such as: “so you’ve got a Dell Mini? How does Linux run on that thing?”. That sort of thing, in short.

Praise for the organization is due, but I do not want to spend many words on it. PyCon3, like PyCon2 last year, was organized perfectly from every point of view. Full stop.

A final thought: now that the “big names” of computing, such as Google, are pushing Python hard (Guido van Rossum is paid by Google to spend half of his time on Python) and adopting it as their main development platform, can we say that the “Python paradox” is over? Will we start to see mediocre programmers putting Python on their CV because it is fashionable, just as Java was fashionable in 2004? We have probably not reached that point yet, but we are not far off. I hope the Python community stays as lively as it is now, and I have high hopes: the Java community was never this lively, not even in its best moments (which ones, anyway?).

Oh, I almost forgot: the prize for the most absurd question of PyCon3 goes to the person who asked Guido van Rossum whether Google intends to expose the Google App Engine API to other languages besides Python and Java, such as, for example… Ruby or PHP! Open laughter from the audience (mine included, I must admit). Guido waits for the translation in his headphones and, when it arrives, his eyes widen; he understands the reason for the general laughter but, perfectly composed, replies that for now no, it is not on the table, and that the future is very unpredictable. A true gentleman!

See you next year, hopefully in Florence again: I shudder at the thought that they might move PyCon to Cinisello Balsamo!

Information technology disgrace

The world of information technology is made by people and, as in any other activity in which human beings are involved, mistakes happen. Sometimes huge mistakes. And huge mistakes turn into disgraces. One of these disgraces has a name: PHP.

Here is the story: a few days ago wordpress.org released the latest version and I decided to upgrade. No db changes, everything seems ok. But, wait a moment… the sidebar is broken! To be honest, the whole page seems to be broken. I did nothing strange. Ok, let’s approach this rationally: ssh to the server, cd to the WordPress directory and issue:

$ php index.php
Segmentation fault

What the frack?!? I had no time to investigate further, so I restored the most recent working backup (I use Git to track everything, including db backups) and forgot about it for a few days. Tonight I decided to solve the problem. I installed everything on my PC and, by hacking the DB, I was able to remove all the sidebar widgets, among which I suspected the culprit was hiding. Then, from the WordPress admin, I added exactly the same widgets, in the very same order and with the same configuration.

Result? No more segfault.

Now, how can such a popular application be so fragile? Are the WordPress guys stupid, or what? No, this time the problem is with the technology. One word suffices: PHP.

Yes, I know, I know perfectly well that I shouldn’t fixate on this or that technology, I know Facebook is made with PHP, I know everything, but… as an engineer I simply can’t ignore how poor this language is!

Hey, if I considered the choice of language unimportant I would be working in marketing :-)

End of this rant: I have started my own blog engine and I’ll write it in a real programming language, Objective Caml. I’ll never be rich, but at least I’ll never ever again spend 3 hours of my life debugging a buggy PHP blog.

Draw something on the screen... and interact with it!

Summary of the previous episodes: 10 days ago Richard Jones complained about how hard it is to accomplish simple tasks (such as drawing the graph of a function on the screen) on modern computers with modern programming languages; the day after, Erik de Castro Lopo replied with a post in which he used GTK and Cairo (or rather, their OCaml bindings) to draw a simple function on the screen. Yesterday Matias Giovannini spiced up the discussion by using SDL to draw the Newton fractal.

So, what can be added to all this? With a perfect graphics toy you can of course draw in a window with simple commands, but you also want to interact with the objects you drew. So I extended Erik's example to add some keyboard and mouse interaction with the graphics on the screen.

Downloading and compiling

First of all, download the source code or, if you want the latest version, clone my Git repository:

$ git clone https://www.ex-nunc.org/projects/pdonadeo/cairo_toy.git cairo_toy.git

To compile the program you need:

  • OCaml (I have version 3.10.2, but 3.10.0 or 3.10.1 are probably fine);
  • Lablgtk2, the OCaml binding to GTK2;
  • the OCaml binding to libcairo.

All these packages are available in any recent Linux distribution; on Debian/Ubuntu:

$ aptitude install ocaml liblablgtk2-ocaml-dev libcairo-ocaml-dev

To compile, issue this command inside the program directory:

$ ocamlbuild demo_toy.native

The code

The program is very simple and is essentially derived from Erik's code: the core is the functor Toy_maker.Make, which accepts as input a module matching the following INTERACTOR signature:

module type INTERACTOR =
sig
  type state
  val init_state : state
  val win_title : string
  val init_width : int
  val init_height : int
  val cmd_line_handler : state -> string array -> state
  val keyboard_callback : state -> GdkEvent.Key.t -> state * bool
  val pointer_buttons_callback : state -> GdkEvent.Button.t -> state * bool
  val pointer_motion_callback : state -> GdkEvent.Motion.t -> state * bool
  val pointer_scroll_callback : state -> GdkEvent.Scroll.t -> state * bool
  val repaint : state -> Cairo.t -> int -> int -> state * bool
end

In this module the user must provide a type state, which holds the application state, some initialization values, a command line handler (in case you need one) and 4 handlers for the following events:

  • keyboard;
  • mouse motion;
  • mouse buttons;
  • mouse wheel event (scroll events in GTK).

The user also provides a repaint function, which takes care of repainting the Cairo context.
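To show the shape of this pattern without pulling in GTK, here is a minimal, self-contained sketch. It is not the actual Toy_maker.Make: the MINI_INTERACTOR signature, the Make functor and the Counter module below are hypothetical stand-ins, key presses are plain characters instead of GdkEvent values, and the boolean in the result is assumed to mean "a repaint is needed" (the signature above does not say what it stands for).

```ocaml
(* A stripped-down, GTK-free analogue of INTERACTOR (hypothetical names). *)
module type MINI_INTERACTOR = sig
  type state
  val init_state : state
  (* Returns the new state and, by assumption, whether a repaint is needed. *)
  val keyboard_callback : state -> char -> state * bool
end

(* The functor owns the event loop; here the "loop" is just a fold over a
   list of key presses that also counts how many repaints were requested. *)
module Make (I : MINI_INTERACTOR) = struct
  let run keys =
    List.fold_left
      (fun (st, repaints) key ->
        let st', dirty = I.keyboard_callback st key in
        (st', if dirty then repaints + 1 else repaints))
      (I.init_state, 0) keys
end

(* A toy interactor: '+' and '-' change a counter, other keys are ignored. *)
module Counter = struct
  type state = int
  let init_state = 0
  let keyboard_callback st = function
    | '+' -> (st + 1, true)
    | '-' -> (st - 1, true)
    | _ -> (st, false)
end

module Counter_toy = Make (Counter)

let final_state, repaints = Counter_toy.run [ '+'; '+'; 'x'; '-' ]
```

The callbacks never mutate anything: they take the old state and return the new one, so the functor is the only place that has to deal with the event loop.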

As a demo I wrote a small My_interactor module implementing the following simple features:

  • left click on the gray background creates a new circle;
  • left click inside an existing circle moves it around;
  • right click inside a circle deletes it;
  • the mouse wheel zooms (in and out);
  • middle click is used to pan.
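The create and delete rules in the list above could be modelled roughly as follows. This is only a sketch and not the code of My_interactor: the circle record, the button type, the fixed radius of 10 and the click function are assumptions made for the example, and dragging, zooming and panning are left out.

```ocaml
(* Hypothetical model of the demo's state: a list of circles. *)
type circle = { cx : float; cy : float; r : float }

type button = Left | Right

(* True when the point (x, y) lies inside circle c. *)
let inside c x y =
  let dx = x -. c.cx and dy = y -. c.cy in
  (dx *. dx) +. (dy *. dy) <= c.r *. c.r

(* Left click on the background adds a circle of radius 10;
   right click inside one or more circles deletes them;
   any other combination leaves the state unchanged. *)
let click circles button x y =
  match button, List.exists (fun c -> inside c x y) circles with
  | Left, false -> { cx = x; cy = y; r = 10.0 } :: circles
  | Right, true -> List.filter (fun c -> not (inside c x y)) circles
  | _ -> circles
```

Note that with overlapping circles this version deletes every circle under the pointer, which may or may not be what the real demo does.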

Here is the result.

Yes, it's somewhat dull, but it does its job. Have fun!

PyCon Two

The second Italian convention dedicated to the Python language, PyCon2, has just ended; it was held in Florence last weekend. Although I am not a true Pythonista, I am nonetheless interested in Python because at the moment much of my consulting work requires knowledge of the language and, I must say, because I was strongly attracted by some of the keynotes, the most prominent of which was certainly Alex Martelli’s on Google App Engine.

Another interesting aspect for me was seeing how many talks focused on web technologies; since I am writing Ex-nunc, a web framework in Objective Caml, I wanted to find out about the state of the art of similar projects written in Python.

The organization of the event was excellent, to say the least. Develer, the company that promoted and organized PyCon2, overlooked no detail: from the event’s website, highly detailed and full of information, to the choice of a very central hotel within walking distance of the railway station, right down to the rich buffet.

The logistics of the talks were also exceptional: excellent acoustics and simultaneous translation available for the English-speaking guests.

Finally, though perhaps less importantly, the setting of the city of Florence is splendid and, although I imagine it would be possible to organize a PyCon3 in Sesto San Giovanni, I really hope that next year they once again choose a city where you see views like this one at every corner.

Perfect organization, interesting topics and international guests made PyCon2 an event of the highest order, one that need not fear comparison with any similar event in the world. For the Italian scene it is, moreover, something quite exceptional from a cultural point of view as well, in a country that talks less and less about technology and does not talk about free software at all.

Sending emails via Gmail with Objective Caml

Motivation

Last week I was writing a Python script to make an automatic backup, and I decided to send myself an email in case of scp failure. I decided to use Python to send the email, possibly via Gmail, and I found this interesting blog post: Sending emails via Gmail with Python. I like Python, it’s a good programming language, but my heart (as a developer!) beats for the Objective Caml programming language.

So I decided to port the script presented in the post to OCaml. The result is this sendmail.ml.

Compiling the script

To compile the script you need four software components:

  1. the Objective Caml environment. You can download it from the INRIA site;
  2. Findlib, to make compiling very simple;
  3. Ocamlnet: here is the home page of the project;
  4. OCaml binding to the SSL library.

You can of course compile all this stuff yourself, but every decent Linux distribution has it all packaged. On Debian you have to run the following command:

# aptitude install ocaml libocamlnet-ocaml-dev \
  libssl-ocaml-dev ocaml-findlib

Now, to compile the script, issue the command:

$ ocamlfind ocamlopt -linkpkg -package \
  netstring,smtp,ssl,str sendmail.ml -o sendmail

Before using it, remember to customize your name, email address, Gmail user and password.

Code comparison

The first difference that jumps out at anyone comparing the two scripts is the number of lines: 41 lines for Python against 163 for my OCaml version. The difference is explained by the fact that the Python standard library comes with an almost full-featured SMTP client, with ESMTP and TLS capability. Objective Caml, on the other hand, has a very concise standard library, which includes essential modules and data structures, but no “batteries” are provided out of the box. This is a deliberate design decision by INRIA and, in some ways, I agree with them. Luckily the OCaml community is a source of excellent libraries and bindings, like Ocamlnet by Gerd Stolpmann and the SSL library binding written by Samuel Mimram. The first one in particular is the Swiss Army knife for network-oriented battles.

Since the SMTP client provided by Ocamlnet doesn’t include TLS capability, I decided to steal its source code and adapt it to my needs, to get a more comfortable, high-level interface resembling the one offered by the Python standard library.

So the different length is easily explained: 109 lines of code are devoted to the smtp_client class, and the actual script is 54 lines long.

The forward pipe operator

All Turing-complete computer languages are equivalent, but everyone knows this is only the theory, and everyone has a programming language of choice. Here are two examples of what you can do in OCaml.

The first is the pipe operator:

let (|>) x f = f x

Here we define an infix operator (very common in FP) which simply swaps the order of its operands. What the frack is this for? Very simple: we use it to write a function after its last argument, so if we want to compute the 3rd Fibonacci number we can write:

let fib3 = fibonacci 3

but also:

let fib3 = 3 |> fibonacci

This is not merely a style issue: with this simple infix operator that feeds a value to a function we can connect several functions together, as in a shell script with the Unix pipe operator, transforming an ugly and hard-to-read call:

let result = func1 (func2 (func3 x))

into:

let result = x |> func3 |> func2 |> func1
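For readers who want to try this at the toplevel, here is a small self-contained sketch. The post does not show the definition of fibonacci, so the naive version below (with fibonacci 0 = 0 and fibonacci 1 = 1) is an assumption, as are the functions used in the final pipeline.

```ocaml
(* Same definition as in the post. Since OCaml 4.01 the standard library
   already provides this operator, so redefining it is harmless but optional. *)
let ( |> ) x f = f x

(* A naive Fibonacci, assumed here only to make the example runnable. *)
let rec fibonacci n = if n < 2 then n else fibonacci (n - 1) + fibonacci (n - 2)

let fib3 = 3 |> fibonacci

(* A left-to-right pipeline: double, add one, then render as a string. *)
let result = 20 |> (fun x -> x * 2) |> succ |> string_of_int
```

The operator is left-associative, so the pipeline reads in the order in which the functions are applied.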

In the sendmail.ml script, line 127, we read:

email_string |>
  Str.global_replace new_line_regexp "\r\n" |>
    Str.split crlf_regexp |>
      List.iter (fun s ->
        self#output_string (if String.length s > 0 && s.[0] = '.' then
                              ("." ^ s ^ "\r\n")
                            else s^"\r\n"));

Here we take the string containing the email, replace all new lines with the sequence “\r\n”, split the string into lines and finally send each line to the SMTP server, taking care to escape each line starting with a period by prepending another period. In 6 lines of code.
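The same transformation can be tried in isolation. The sketch below is a variation, not the code of sendmail.ml: it avoids the Str library by using String.split_on_char from the standard library (available in reasonably recent OCaml versions), and it returns the lines instead of writing them to a socket. The function name smtp_lines is made up for the example.

```ocaml
(* Prepare a message body for the SMTP DATA phase: one CRLF-terminated
   string per line, with a leading period doubled. *)
let smtp_lines email_string =
  email_string
  |> String.split_on_char '\n'
  |> List.map (fun s ->
         (* Drop a trailing carriage return so both newline styles behave alike. *)
         let n = String.length s in
         if n > 0 && s.[n - 1] = '\r' then String.sub s 0 (n - 1) else s)
  |> List.map (fun s ->
         if String.length s > 0 && s.[0] = '.' then "." ^ s ^ "\r\n"
         else s ^ "\r\n")
```

Unlike the original pipeline, this version keeps empty lines, and an input ending with a newline produces a final empty line; whether that matters depends on how the result is sent.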

Algebraic data types

Algebraic data types are a very interesting aspect of functional programming. We can easily wrap two heterogeneous data types into a single one with a couple of lines of code:

type socket =
  | Unix_socket of Unix.file_descr
  | SSL_socket of Ssl.socket

The smtp_client class contains a reference to the connection handle used for communicating with the server, which is either a plain file descriptor or an SSL socket, depending on the state of the communication. I do not want to create a virtual class or an interface plus two implementing classes, as I would have to do in horrible languages like Java, spending half an hour deciding which methods to put in the public interface, and so on; after all, it’s only a file descriptor!

Now I have a new type which is a disjoint union of the two original types and I can write code like this (line 54):

let input = match channel with
  | Unix_socket s -> Unix.read s
  | SSL_socket s -> Ssl.read s in

Here we say: if channel is actually a Unix file descriptor, let’s define a new function “input” as the standard “read” function from the Unix module; otherwise, if channel is an SSL socket, let’s define “input” as the Ssl.read function, which works only on encrypted sockets. From now on I’ll use input instead of either of the two original functions.
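The trick of choosing a function once by pattern matching works for any variant type. Here is a minimal sketch that needs neither Unix nor Ssl: the sink type and the make_output function are hypothetical and only mirror the structure of the code above.

```ocaml
(* Two unrelated kinds of destination wrapped in one type (hypothetical). *)
type sink =
  | To_buffer of Buffer.t
  | To_list of string list ref

(* Choose the writing function once, by partial application,
   in the same spirit as the definition of input above. *)
let make_output sink =
  match sink with
  | To_buffer b -> Buffer.add_string b
  | To_list l -> fun s -> l := s :: !l

let buf = Buffer.create 16
let output = make_output (To_buffer buf)
let () = output "EHLO "; output "example"
```

After this point the caller only sees a function of type string -> unit and no longer needs to know which kind of sink is behind it.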

Ok, it’s time to stop the waffle. Enjoy the script if you need it: it’s completely free, as in free beer, free speech and even free sex! :-)

Tagging is not the right way

Yesterday I was listening to some music with the Last.fm client, when a song I particularly like started. As always, I decided to mark the song as “loved” and to tag it with something useful. If you use the Last.fm client, you know that it suggests the most common tags for the tune you want to tag. Ok, usually the list of tags includes a lot of stupid words, but this time I was surprised to see the word “gnocca” in the list.

For people not speaking Italian, “gnocca” is a coarse term referring to the vagina and, by extension, to a sexy girl: it can be translated with “pussy” in the first case and… I don’t know a suitable translation for the second meaning.

Actually, this is the worst tag I’ve ever found, but it’s not the first time I have been disappointed by other users’ tag choices.

As a matter of fact, many internet sites that use tags to classify user content are showing their limits. The whole paradigm of user-defined tags, known as folksonomy, is based on three ideas:

  1. it’s nearly impossible to classify content inside a tree of categories;
  2. associating words (tags) with content is effective, because the user will remember the word and then, by searching for the tag, will easily be able to recover the piece of information he or she is looking for;
  3. as a nice side effect, if many users tag the same object, the most appropriate tags will emerge, and a large number of users will automatically filter out unused or irrelevant tags, so that other people will easily retrieve information.

While I agree with the first statement, the last two are questionable at best. Sure, it’s very difficult, if not impossible, to arrange a large and heterogeneous set of objects into a tree-shaped data structure, particularly if the set grows with time and you don’t know in what “direction” the tree is growing. Anyone who owns a personal computer and has tried to sort out his or her “documents” folder will understand what I mean: there is no hierarchy that fits all needs, because many documents can correctly be filed in different places at the same time.

The proposed solution is to tag documents with a chaotic cloud of words freely chosen by people, where the only valid criterion seems to be common sense or a more hazy association of ideas.

My experience is that tagging without a criterion is just another way to lose information. Using a hierarchical tree of directories (or categories) leads the user to lose documents, because people tend to forget which aspect of the document they chose when cataloguing it. The situation gets even worse with tags: I usually choose as many tags as I think appropriate, in the secret hope that the large number will help me when I search for information later on. The net effect is a proliferation of synonyms, singular and plural tags (e.g. tool and tools) and completely useless words, either too generic (e.g. hardware, programming or software) or too specialized (e.g. xgl or xen), so that about 48% of my tags actually label only one document.

These statistics are based on my personal experience with del.icio.us, one of the services where I pay great attention to my choice of tags, because Internet bookmarks are very important for my job. You can download here the file containing my del.icio.us tags, ordered by frequency. More than 48% of the tags are used only once, and only 20% are used twice, so I guess that most of my tags are completely useless. Not very good, actually.
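For anyone who wants to run the same check on their own tags, here is a minimal sketch of the computation. The input format (a list of tag and document-count pairs) and the sample data are made up for the example; they are not my real del.icio.us export.

```ocaml
(* Share, in percent, of the tags that label exactly n documents.
   The input is a list of (tag, number of documents) pairs. *)
let share_used n tags =
  let total = List.length tags in
  let hits = List.length (List.filter (fun (_, count) -> count = n) tags) in
  if total = 0 then 0.0
  else 100.0 *. float_of_int hits /. float_of_int total

(* Made-up sample data, for illustration only. *)
let sample = [ ("ocaml", 12); ("tool", 1); ("tools", 1); ("xgl", 1); ("web", 2) ]

let once = share_used 1 sample
let twice = share_used 2 sample
```

On this toy sample three tags out of five are used once and one out of five is used twice.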

The lesson here is that cataloguing a large quantity of information does not come for free: a simple and easy way to keep tons of documents well ordered and always at your fingertips is a utopian dream.

I think it’s now time to drop buzzwords like “Web 2.0” (the mother of all buzzwords) and to move on to some more serious and structured ideas about information architecture. Since I don’t like to reinvent the wheel again and again, and since I need something to index my documents, I decided to investigate how librarians organize knowledge in a big library, starting from these two ideas:

  1. librarianship is a very old profession, and librarians can boast a thousand years of experience;
  2. I have many documents on my hard disk, and these documents differ widely in topic, media type and relevance, but they are nothing compared to the Library of Congress or other similar libraries in the world.

A quick search led me to some reading, and I discovered a whole universe of studies about information indexing. The most appealing theory I found is faceted classification, in which multiple trees of “facets” are used to reach information. What are facets, actually?

A faceted classification system is composed of a number of categories (facets) that represent different aspects of the items we are going to classify. Each facet (aspect) is developed into a tree of terms (known as “foci” or, individually, “a focus”). To classify an item, therefore, you apply one or more terms from one or more facets to the item. In this way you have a multidimensional approach to the items you are indexing.

There are two main criteria developed by librarians to compose faceted classifications:

  1. the list of facets should represent several aspects of the items to be classified, and should be “orthogonal”, as much as possible;
  2. the tree of terms belonging to each facet should present at each node a unique criterion of division, i.e. the set of children of a node must be a partition of the whole parent node, so that the hierarchy has no overlapping terms.

Following these two principles, the result will be a set of trees in which items are classified, and consequently several access points from which to start the search. Instead of a single tree, the final data structure resulting from this kind of classification is a DAG (a directed acyclic graph), which provides a flexible way to organize knowledge without being chaotic like a “tag cloud”.
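To make the idea concrete, here is a minimal sketch of how faceted items could be represented and queried. It is only one possible encoding, under the assumption that a focus is stored as the path from the root of its facet's tree; the item type, the facet names and the sample documents are made up for the example.

```ocaml
(* A focus is a path from the root of a facet's tree, for example
   programming > python. An item carries one path per facet it belongs to. *)
type item = { name : string; foci : (string * string list) list }

(* True when prefix is a prefix of path: selecting a node of a tree
   selects everything below it. *)
let rec is_prefix prefix path =
  match prefix, path with
  | [], _ -> true
  | p :: ps, q :: qs -> p = q && is_prefix ps qs
  | _ :: _, [] -> false

(* An item matches when every (facet, focus) constraint is satisfied. *)
let matches constraints item =
  List.for_all
    (fun (facet, focus) ->
      List.exists (fun (f, path) -> f = facet && is_prefix focus path) item.foci)
    constraints

(* Names of the items that satisfy all the constraints. *)
let select constraints items =
  List.filter (matches constraints) items |> List.map (fun i -> i.name)

(* Made-up sample data. *)
let docs =
  [ { name = "pycon-notes";
      foci = [ ("topic", [ "programming"; "python" ]); ("media", [ "text" ]) ] };
    { name = "ocaml-talk";
      foci = [ ("topic", [ "programming"; "ocaml" ]); ("media", [ "video" ]) ] } ]
```

Each constraint narrows the search along one facet, so a query such as select [ ("topic", [ "programming" ]); ("media", [ "video" ]) ] docs combines two independent access points, which is exactly what a single tree of folders cannot do.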

An excellent example of faceted classification is the Nobel Prize winners page, a demo of the Flamenco Project at the University of California, Berkeley: you can navigate through the various criteria by which Nobel laureates are classified in an intuitive way, with a simple and effective interface.

Another example of the use of facets is the Amazon jewelry page. You can reach the page by going to Amazon.com and looking for “Jewelry and Watches” in the Product Categories menu.

Since I find the idea of facets very interesting, I decided to start a little experiment in this blog: WordPress handles trees of categories, so I decided to adapt them and use the whole category system as a facet system. It will not be perfect, because there are no facilities to navigate the DAG as in the Flamenco Project, and the reader cannot choose more than one focus at a time, but it’s a starting point. If the result, in a few months, is better than my experience with del.icio.us tags, I’ll probably start a software project to handle the files on my computer using facets.

Copyright © 2004–2019 by .
Creative Commons License Content on this site is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 Italy License.

This blog is written in Objective Caml. Design based on the work of Rodrigo Galindez.