MadBlog
Saturday 28 February 2009

How to not make a function safe

This code comes from the actual sources of the latest PHP release (ext/sockets/sockets.c). It has probably been there, like, forever.

First, yes, this really is a global variable in the C module; those guys don't know about either the const or the static keyword:

   /* inet_ntop should be used instead of inet_ntoa */                                                       
   int inet_ntoa_lock = 0;
   /* ... */

Then they wrap getpeername:

   PHP_FUNCTION(socket_getpeername)
   {
       /* ... */
       if (getpeername(php_sock->bsd_socket, sa, &salen) < 0) {
           PHP_SOCKET_ERROR(php_sock, "unable to retrieve peer name", errno);
           RETURN_FALSE;
       }

So far so good. Or maybe not so good.

       switch (sa->sa_family) {
           case AF_INET:
               sin = (struct sockaddr_in *) sa;
               while (inet_ntoa_lock == 1);
               inet_ntoa_lock = 1;

Riiiiiight, locks 101, you fail.

               addr_string = inet_ntoa(sin->sin_addr);
               inet_ntoa_lock = 0;

QUICK ! let's release the lock before we actually use addr_string !

               zval_dtor(arg2);
               ZVAL_STRING(arg2, addr_string, 1);
               /*... */

And you know the best of it ? Of course that code is totally useless since[1]:

   $ grep -wc inet_ntoa **/*.[hc]|grep -v :0
   ext/sockets/sockets.c:4
   ext/standard/basic_functions.c:1
   ext/standard/dns.c:2
   main/network.c:1
   sapi/cgi/fastcgi.c:1
   sapi/tux/php_tux.c:1

Those guys are priceless. Especially when you consider the number of 50-liners to implement inet_ntop that lie around.
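
For the record, here is a minimal sketch of the lock-free alternative, assuming a platform that provides inet_ntop. The helper name format_ipv4 is made up for illustration and is not in the PHP sources. The point is that the result lands in a buffer owned by the caller, so there is no shared static storage to protect:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the PHP sources): formats an IPv4 address
 * into a caller-owned buffer, so no static storage and no lock are needed.
 * Returns 0 on success, -1 on failure (inet_ntop sets errno). */
int format_ipv4(const struct in_addr *addr, char *buf, size_t len)
{
    return inet_ntop(AF_INET, addr, buf, (socklen_t)len) ? 0 : -1;
}
```

A caller would declare char buf[INET_ADDRSTRLEN] on its own stack and pass &sin->sin_addr, buf and sizeof(buf).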

Notes

[1] update: this grep shows that inet_ntoa is called directly all over the code base, without going through this "lock" at all, so this "locking" code, in addition to being pitiful, is just worthless

Thursday 26 June 2008

firefox3 and SSL

Yes Stefano, I too believe that the SSL certificate thingy in Iceweasel 3 is just a bad joke.

I mean WHAT THE FUCK WERE THEY THINKING ? At least it improved in the RCs, and the logo isn't the same as the 404 one anymore: the first time I saw it, it took me 10 refreshes to understand that it was not a 404 but an untrusted self-signed certificate.

And the whole "exception" thing is absurd. NO I DON'T WANT TO ADD AN EXCEPTION you morons. Here is what a good UI would be, one that I have never seen implemented in any browser but that would make sense:

  • I want to browse the damn site: remember the current certificate for me, and let me browse without pretending it's safe (this way users do not believe they are protected when they are not).
  • Next time I go to that website again, if the untrusted certificate changed, then and only then I want to be bugged about the fact that this website changed its certificate.
  • I don't want any popup, anything, no "yellow" background in the URL (as the certificate is not trusted) no nothing, just browse it as if it was plain HTTP wrt the UI.

It's enough for 99.9% of the users.

Then, some of us really care about importing some untrusted certificates (for our own webmails e.g.). For that, a small button that allows advanced users to check, import, and mark an untrusted certificate as trusted is just what one needs. One can even hide that in some menu, it would be fine. I, for example, have only checked the SSL certificate of my webmail and of something like 3 or 4 websites. I do NOT want to be bugged for the others.

But no, instead we have this completely delirious interface that no-one can sanely call a User Interface. They're on crack.

Anyways, here is a way to ease your pain: either in about:config or in your ${HOME}/.mozilla/firefox/${profile-with-name-on-crack-id}/user.js, set the following values[1]:

 user_pref("browser.ssl_override_behavior", 2);
 user_pref("browser.xul.error_pages.expert_bad_cert", true);

It makes validating a certificate two clicks away: one of the settings shows the 'add exception' buttons instead of showing a link to make them visible, the other setting makes firefox download the certificate for you so that you don't have to ASK IT to do so[2].

And you know the worst thing ? I'm sure the guy(s) who wrote this fucking dialog is(are) very proud of it.

Notes

[1] Thanks to glandium for the hints

[2] WTF people, why do you need us to make you download the certificate ? This is nonsensical: the user already needed 3 clicks to get to the add exception dialog, and you need one more to download the certificate ? WTF are you thinking!!!

Wednesday 7 May 2008

git prompt

Following Martin's post on the subject: since I created my prompt, I've updated it quite a lot.

It only showed the branch before, now it does so even when I'm on a detached head or something, and also shows when I'm in the middle of a rebase or a merge. You can see the zsh-fu for this. For example, when I'm in the middle of a rebase on my paid work repository, it looks like:

┌─(10:34)──<~/dev/mmsx master <rebase -i>>──
└[artemis]                                                 (~/dev/mmsx/Build/)

With nice colors that I cannot really show without a screenshot, which I'm too lazy to take :) This is a recent addition that I shamelessly took from the contrib bash prompt in the git-core package. And to be frank it's really needed: the tests are cheap (basically looking for magic file names), and it can tell you if you forgot to finish a rebase or a merge, which can happen if you have been disturbed in the middle of it by a colleague for example.

I liked the '*' idea from Martin to show if the tree is dirty. Sadly it's not an option. Martin, to do that, you can do:

 unclean=
 git diff-files --quiet && git diff-index --cached --quiet HEAD -- || unclean='*'

But this is a very expensive operation. On the glibc git repository, it takes seconds with a cold cache (which is not very surprising, because it basically has to stat(2) a lot of stuff). And not having a shell for seconds is a bit extreme.

PS: I know my prompt only supports git, but:

  1. I barely care about other VCSes as I only use git and sometimes svn for packaging ;
  2. when I have to use svn it's for cheap stuff where I don't really need the prompt help.

Monday 25 February 2008

Dear John…

Wrt your issue: yes, it's true that for a short series of patches, you are often asked to rebase it on a clean state if it doesn't merge cleanly. Though if you're doing complicated work, meaning dozens of patches e.g., two things are usually true:

  • you're a regular contributor ;
  • you spend quite some time working on it.

When this happens, upstreams are usually okay with merging from a public repository that you would set up. Though it's usually a tiny bit more work for upstream to deal with a new remote repository, so those are only used in these cases.

Also note that upstream could really fake the same work by branching off from the point you branched off, and using git-am from that point instead of git am -3 on top of the current devel branch (which is similar to a rebase, hence creates new sha's).

IOW, it's not really a git deficiency (even if git-format-patch could maybe annotate _more_ where it comes from, and git-am grok that to re-create a topic branch from that) as git has the features, just that it probably doesn't make it easy enough, and that people usually don't care enough for very short series.

Tuesday 5 February 2008

mixed utf-8 and 8bit charset foo

To deinterlace utf-8 from an 8bit encoding (say it's latin1, but it works the same with all 8bit encodings), in data that may also have gone through multiple utf-8 re-encodings, there is a very simple tool to write.

Let's assume that you have somewhere a function void charset_to_utf8(FILE *, int c) that takes a FILE * and writes to it the utf-8 encoding of the character c of your charset. Then write something that roughly looks like this:

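   /* Returns the byte length of the valid utf-8 sequence starting at s,
    * or 0 if s does not start one (or it does not fit in maxlen bytes). */
   static int utf8_wclen(const unsigned char *s, int maxlen)
   {
       /* sequence length, indexed by the top 5 bits of the lead byte */
       static char const utf8_len[32] = {
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 3, 3, 4, 0,
       };
   
       int trail = utf8_len[(*s++) / 8];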
       if (trail > maxlen)
           return 0;
       switch (trail) {
         case 4: if ((*s++ & 0xc0) != 0x80) return 0;
         case 3: if ((*s++ & 0xc0) != 0x80) return 0;
         case 2: if ((*s++ & 0xc0) != 0x80) return 0;
         case 1: return trail;
         default: return 0;
       }
   }
   
   int charset_utf8_deinterlace(FILE *f, void *data, int dlen)
   {
       const unsigned char *s = data;
       int pos = 0, res = 0;
   
       while (pos < dlen) {
           int wclen = utf8_wclen(s + pos, dlen - pos);
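           if (wclen) {
               fwrite(s + pos, wclen, 1, f);
               pos += wclen;
               /* plain ASCII is valid in both encodings: only multi-byte
                * sequences count as "looked like utf-8" */
               if (wclen > 1)
                   res = 1;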
           } else {
               /* assume its $charset */
               charset_to_utf8(f, s[pos++]);
           }
       }
       return res;
   }
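
For latin1, the charset_to_utf8 function assumed above is trivial, because a latin1 byte value is its own Unicode code point. Here is a minimal sketch for latin1 only; any other 8bit charset would need a 256-entry lookup table to code points instead. Like the rest of the post, it ignores IO errors:

```c
#include <stdio.h>

/* Sketch of the assumed charset_to_utf8() for latin1 only: a latin1 byte
 * value is its own Unicode code point, so at most 2 bytes are written. */
void charset_to_utf8(FILE *f, int c)
{
    c &= 0xff;
    if (c < 0x80) {
        putc(c, f);                   /* ASCII is unchanged */
    } else {
        putc(0xc0 | (c >> 6), f);     /* lead byte: 110000xx */
        putc(0x80 | (c & 0x3f), f);   /* continuation byte: 10xxxxxx */
    }
}
```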

You can easily build a tool on top of this that mmaps a file passed as an argument, prints a utf-8 clean file to stdout, and exits with a specific code when it met something that looked like valid utf-8[1]. Let's call that tool recode_to_utf8; then, if you fear your data went through multiple re-encodings, you need to do this:

 #! /bin/sh
 wrap_recode() {
   recode_to_utf8 "$1" > "$2"
   case $? in
     0) true;;
     $still_utf8) false;;
     *) echo "WOOPS IO Error" 1>&2; exit 1;;
   esac
 }
 cp your_source dirty
 while ! wrap_recode dirty utf8_clean; do
   iconv -f utf8 -t $charset utf8_clean -o dirty
 done
 rm -f dirty
 # result in utf8_clean

This is all very sketchy, but I've never found a tool that does the job properly, and it's quite simple to derive tools from the methods above. Note that it assumes that it's highly unlikely that a valid sequence of your original charset can form a valid utf-8 codepoint, which for text is usually true (at least in latin1).

Notes

[1] you want the code to be specific to catch IO errors that I ignored for the sake of shortness, the post being quite long already

Wednesday 12 December 2007

Yay \o/

The French ISP Free, of which I'm a customer, now gives ipv6 connectivity (through a curious 6to4 clone though). The nice news is that we now have really near 6to4 nodes in France all over the country. It even has quite a nice ping.

All the stuff is done through the modem, and gives autoconfiguration to the network. I just had to "modprobe ipv6" on my boxen and voilà !

I've not looked at reverse DNS yet: as it's not regular 6to4 (it has a 2a01 prefix), http://6to4.nro.net/ cannot do a thing about it yet. We'll see how it goes :)

Monday 26 November 2007

git-commit(1) vim ftplugin update…

Dear vim-ers, yes I know my git.vim ftplugin was kind of a dirty hack, and that it didn't work that well.

For a few days now, the git.git next branch has shown relative paths in git-status(1), and that was the final blow to my already not that stable hack. So I raised the question on the git mailing list of how to do that plug-in properly, and it turns out there is a perfect solution, one that has been there from day 1…

So here is a new release of the plugin that uses the special index[1] git-commit(1) builds for the occasion, hence can use git-diff(1) directly instead of clumsily parsing the commit buffer.

Direct benefits:

  • it does renaming detection well;
  • it works correctly when you are --amend'ing or committing something that was pushed to the index while the corresponding files have unstaged local changes;
  • it's damn fast (even when you commit more than 10 files);
  • you can even hide whitespace changes, just add let git_diff_opts = "-C -C -w -b" in your .vimrc and you're done !

Notes

[1] or commit staging area ;D

Saturday 10 November 2007

M(adM)utt pr0n

Here is what happens when a (mad)tty meets a (mad)mutt

Okay, for real it has zillions of usability issues right now:

  • you shouldn't resize the mutt without leaving your $EDITOR first ;
  • you cannot interact with the rest of mutt either ;

But all those things need a proper event loop (and anyone who ever looked at the mutt code knows that well…).

And yes, it works for emacs too.

Sunday 9 September 2007

git awesome-ness [git rebase --interactive]

With the last git release, git-rebase gained a new option: --interactive.

If you ever had the feeling that in a patch series of yours you should have ordered patches differently, or merged some, then this command is what you dreamed of. Here is how it works…

Let's pretend you want to rework your last 10 patches, you'll run:

   $ git rebase -i HEAD~10

It will launch your $EDITOR and you'll see something like:

   # Rebasing 16d3800..14f3d11 onto 16d3800 
   # 
   # Commands: 
   #  pick = use commit 
   #  edit = use commit, but stop for amending 
   #  squash = use commit, but meld into previous commit 
   # 
   # If you remove a line here THAT COMMIT WILL BE LOST. 
   # 
   pick 6270640 Simplify write_tree using strbuf's. 
   pick 27c528a Further strbuf re-engineering. 
   pick fd82c9a Eradicate yet-another-buffer implementation in buitin-rerere.c 
   pick eee488f More strbuf uses in cache-tree.c. 
   pick 16878b5 Add strbuf_rtrim and strbuf_insert. 
   pick e9081af Change semantics of interpolate to work like snprintf. 
   pick 99c3ef5 Rework pretty_print_commit to use strbufs instead of custom buffers. 
   pick 203db5d Use strbuf_read in builtin-fetch-tool.c. 
   pick a20d939 Use strbufs to in read_message (imap-send.c), custom buffer--. 
   pick 14f3d11 Replace all read_fd use with strbuf_read, and get rid of it.
   ~
   ~
   ~
   ~
   ~
   ~[1]

Then you can rewrite "pick" into "edit" if you want to change something in a commit, or "squash" if you want to merge it with the one from the line before.

What the small help doesn't say is that you can actually reorder your commits, and it will do what you expect it to do. I used it 10 minutes ago: I have this string buffer module that I extend on a regular basis, and I squashed every API extension of that module into one commit using that.

Each time a change needs you to edit something, either because you asked for it or because one of the changes you asked for generated a conflict, the rebase will stop as usual. You will be prompted to make the change, or fix the conflict, or merge the commit messages (in case of a squash), and when all is in order, you just need to:

   $ git rebase --continue

This is just awesomely simple and intuitive.

Notes

[1] if you don't have those, your $EDITOR sucks btw

Tuesday 27 March 2007

I now feel I've achieved something ...

The so-called code is ugly (at least I would have beaten my students for writing such crap[1]). And to paraphrase Linus: I'm a disgusting pig, and proud of it !!!

   $ ./madmutt -f test.mbox
   MCore.pwd()    = /home/madcoder/dev/madmutt
   MCore.shell    = /bin/zsh
   -> setting MCore.shell to /madmutt/is/on/lua/crack gives:
   MCore.shell    = /madmutt/is/on/lua/crack
   MCore.version  = devel
   MTransport.sendmail   = /usr/sbin/sendmail -eom -oi
   -> exiting

edit: Some have wondered: I'm just ecstatic because I'm slowly replacing the good old muttrc with lua, and my script that generates the lua bindings for me just works as expected. (Yes I'll obviously write some kind of legacy-thing importer at some point, but I'm really not anywhere near that point yet. Even if I use madmutt daily, it's not even alpha quality: it basically works for me.)

Notes

[1] yeah in another life I taught OCaml...

Sunday 25 February 2007

git rebase is not harmful, it's just _not_ always the best solution, that's all.

I must say I disagree with John a lot about git rebase.

git-rebase is the nicest feature I've ever seen float around in an SCM, and is part of the things I love in git. I mean, I do not disagree with the fact that it cannot be used in every single case. It is indeed meant to be used in private topic branches: a workflow where you have a topic branch (meaning some work that is neither releasable nor pushable yet) and you still want to keep up with others' work.

With other SCM's, you have to update your working repository, which in many cases generates nasty conflicts that are hard to deal with. Especially in svn. With git, you commit your work, pull the remote branch, and then rebase on top of it. It makes a lot of sense, and when your topic branch is indeed ready, you can push it upstream, and the next rebase will automatically merge the bits that have been accepted. I just can't think of a simpler way.

Btw, we use svn at work, and I do use git-svn instead of svn to be able to develop my own patches without fearing conflicts the way I did before. I know I will benefit from the powerful git merging capabilities and help at any stage, even if I did not pull from the svn for a long time. That makes development much more sane, as I only try to push coherent changes, hence on a less regular basis. You could not do that without git-rebase. In that sense, git-rebase is anything but evil.

Wednesday 21 February 2007

git tricks ...

When you have an svn-like use of git (I mean with a central repository), it's a good thing to repack the central repository from time to time. If your repository lives (totally random example) in /git/pkg-xorg/lib/mesa.git, you can do:

   cd /git/pkg-xorg/lib/mesa.git && GIT_DIR=. git repack -a -d

I'm told that some people who had more than 1.5GB of git repositories have seen their main repositories shrink well under 200MB of disk usage. On my own git central server, I run this from cron:

 #! /bin/bash
 GIT_BASE=/git
 for repo in $GIT_BASE/*.git; do
     pushd "$repo" &>/dev/null
     GIT_DIR=. git repack -a -d  &>/dev/null
     popd &>/dev/null
 done

But on a big and loaded server one could do better and repack only when it seems to be needed, with something like (untested):

 #! /bin/bash
 GIT_BASE=/git
 GIT_THRESHOLD=5000
 for repo in $GIT_BASE/*.git; do
     pushd "$repo" &>/dev/null
     if test $(find objects | cut -d/ -f3 | wc -w) -gt $GIT_THRESHOLD; then
         GIT_DIR=. git repack -a -d  &>/dev/null
     fi
     popd &>/dev/null
 done