Wednesday, 23 December 2009

Excel 2000/2003 bug



Today I found a bug in the French version of Excel 2003. Nothing extraordinary, you might say... I'm publishing it anyway because, to my mind, it is revealing of the philosophy behind Microsoft products (which I find harder and harder to put up with).

Description of the bug:
- in a cell, type the 3 letters "dec"
- copy this cell (CTRL-C) and paste it (CTRL-V) into the cell below.
- Select the 2 cells, then point the mouse at the fill handle (bottom-right corner of the selection), left-click, and drag down (auto-fill).
Since the two selected cells are identical, there is of course no incrementing; Excel does, however, change the contents of the cells to "déc".

One might think this is the automatic formatting of the cell according to what is typed into it. For example, I type "1/1/09" in a cell, Excel treats it as a date and applies the matching format. As a result, if I then type "1" in the same cell, it displays "01/01/1900". That alone is borderline, but you get used to it (yes, really...)
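
The "01/01/1900" above comes from the way Excel stores dates: as a day count (a serial number) that is merely displayed as a date. Here is a rough Python sketch of the 1900 date system. The function name is made up for this illustration, and serial 60 is the non-existent 29 February 1900 that Excel keeps for compatibility:

```python
from datetime import date, timedelta

def excel_serial_to_date(serial: int) -> date:
    """Convert a whole-number serial from Excel's 1900 date system to a date."""
    if serial < 1:
        raise ValueError("serial must be >= 1")
    if serial == 60:
        raise ValueError("serial 60 is Excel's fictitious 29 February 1900")
    # Day 1 is 1 January 1900; Excel wrongly treats 1900 as a leap year,
    # so every serial above 60 is shifted by one day.
    offset = serial if serial < 60 else serial - 1
    return date(1899, 12, 31) + timedelta(days=offset)

print(excel_serial_to_date(1).strftime("%d/%m/%Y"))      # 01/01/1900
print(excel_serial_to_date(39814).strftime("%d/%m/%Y"))  # 01/01/2009
```

So a cell that remembers a date format shows the number 1 as the first day of 1900.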

But in this case, no, that's not even it! If I type "1" in a copied cell containing "déc", I do get "1" displayed!

This is exactly the kind of thing that drives me up the wall. Excel tells itself: "OK, this moron is talking about the month of December (décembre), but he doesn't know there's an accent, so I'll add the accent for him, behind his back, without telling him."

I can't stand a program doing things behind my back. Let it make suggestions, proposals, whatever you like, but when I type 3 letters and then copy them, I don't want it to change them. I am the one who knows the semantics behind my keystrokes; I'm not asking it to make assumptions about them.

After testing, the bug also exists in version 2000. What about 2007?

Wednesday, 9 December 2009

Transforming column datafiles into line datafiles on Windows




In the field of computer science, you frequently need to visualize data: a chart always makes things clearer, and the reader gets the picture just by looking at it, without having to read the painful 15-line paragraph below.

So you have data. Sometimes lots of data. Sometimes badly organised text data files: for instance, files where successive values are not on successive lines, but in successive columns. This might not be clear to everybody, so here is an example. Say you have a file containing temperatures measured every hour, for some period of time (a month, for instance). The usual layout is one measurement per line:

1/1/2009;0;22
1/1/2009;1;23
1/1/2009;2;22.5
2/1/2009;0;24
...
and so on. And sometimes you run across datafiles where the layout is:

1/1/2009;22;23;22.5
2/1/2009;24; ...
...
While most people won't find anything wrong with this (it does indeed save some disk space!), this type of layout is actually illogical: columns are supposed to hold the different fields of the data, not successive values. And it makes things more complicated when it comes to plotting...

I ran across this issue when trying to illustrate my previous post on gnuplot with a nice figure. I wanted to have a fancy "real-world data" illustration, so I downloaded daily electricity consumption data files from RTE (you can get those here). They provide one Excel file per year, with 365 lines, one per day, each holding the power consumption for every half-hour (48 values per day). And guess what, the layout is just as described here...

So, first, before writing an adequate gnuplot script file, you need to transform columns into lines. And this is what this post is about, for Windows users. It can also be considered as a demo of what you can do with the Windows command-line interpreter.

A quick search shows some interesting material, mostly based on Linux tools (see here for example). And yes, Windows users, you'll need to get some new software, because Windows lacks some basic tools. At present we only need the 'cut' tool; a binary can be downloaded as part of the gnuwin32 coreutils package.

What we need to do here is process each line of the file, cut it into fields, and write an output data file with one datum per line. So, let's start with the line-by-line processing, using the 'for' command (you have of course already converted the file to .csv format):

for /F "delims=" %%a in (%file%) do call :sp1 "%%a"
Of course, you will have previously put the file name into the 'file' variable with set file=myfile.csv. The empty "delims=" option disables field splitting, so %%a receives the whole line of data.

Then, we need to produce one output line for every column that contains a data value. This is done with the "numeric" version of the 'for' command:

--------------------------------------------
:sp1
set line=%~1
echo %line% >line.txt
for /L %%b in (1,1,48) do call :sp2 %%b
goto :eof
--------------------------------------------
Finally, cut the same line 48 times, and append each of the columns to the output file (column 1 is the date, column 2 is some non-significant data, the first value is in column 3):

--------------------------------------------
:sp2
set /A col=2+%1
"%app%" -d; -f1,%col% line.txt >> all.dat
goto :eof
--------------------------------------------
with the variable 'app' containing "c:\program files\gnuwin32\bin\cut.exe"

And that's about it: the output file all.dat will now contain one line per data point. The only thing left is to insert a blank line after each day, so gnuplot can figure out where the day stops. So sp1 actually looks more like:

...
for /L %%b in (1,1,48) do call :sp2 %%b
:: gnuplot needs 1 blank line to separate records
echo. >> all.dat
goto :eof
--------------------------------------------
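
For readers who would rather not install 'cut', the same column-to-line transformation can be sketched in a few lines of Python. This is only an illustration of the idea, not the batch script above. The function name and the sample data are made up, and it assumes the same layout (date in column 1, an ignored column 2, values from column 3 on):

```python
import csv
import io

def wide_to_long(text, first_value_col=2, sep=";"):
    """Turn 'date;ignored;v1;v2;...' rows into one 'date;value' line per value."""
    out = io.StringIO()
    for row in csv.reader(io.StringIO(text), delimiter=sep):
        if not row:
            continue  # skip empty input lines
        day = row[0]
        for value in row[first_value_col:]:
            out.write(f"{day}{sep}{value}\n")
        out.write("\n")  # blank line: end of one day's record for gnuplot
    return out.getvalue()

# Hypothetical sample: two days, three values each, 'x' is the ignored column
sample = "1/1/2009;x;22;23;22.5\n2/1/2009;x;24;25;26\n"
print(wide_to_long(sample), end="")
```

The returned text can be written to all.dat and plotted with the gnuplot script below.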
Finally, we can plot the thing with some classical gnuplot scripting:

------------------------------------------------------
set title "France power consumption on mondays, 2008\n(data source : RTE)"
set xrange [0:47]
set xlabel "day period"
set ylabel "week"
set yrange [0:51]
set zlabel "MW" offset 0,3
set ztics 30000,10000
set datafile separator ";"
set style data lines
set grid
set pm3d
set surface
set hidden3d
fn="all.dat"
set view 42.0,56.0
unset colorbox
splot fn using 2 every 1:7 notitle
pause -1
set terminal png size 640,480
set output "RTE_2008_monday.png"
replot
------------------------------------------------------
Feel free to comment (English or French) if you'd like more details.


Wednesday, 25 November 2009

A warning on behavior of gnuplot with numerical values


My favorite plotting software is gnuplot. Not that I am an expert in plotting software, it was just the first I really invested time into, after getting tired of ugly Excel plots... But I got used to it, and I like the idea of script-based plotting.

However, I must say it can be quite tricky, and I often stumble upon things that are difficult to achieve or to get working. Yesterday, after struggling for several hours (!) with a function plot that did not work as expected, I finally found out why, and discovered a strange "feature" of gnuplot (4.3): it does not treat integer values as floating-point values!

While this seems quite obvious in a programming language (C), I don't understand why it is so in gnuplot. As far as I can see, a plotting app should treat all numbers as "real" numbers (understand "floating-point"). But this seems to be an opinion I don't have in common with the designers of gnuplot.

To make it clear, what I mean is that the value 10*k/3 is NOT the same as k/3*10 if you happen to set k=2
(just plot these two expressions, you'll see what I mean).
To get it right, you have to type k=2.0
If not, then 10*k/3 will be equal to 6, and k/3*10 will be equal to 0, while you expected both to be 6.66... Yep, you got it, integer division strikes again...
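
The effect is easy to reproduce outside gnuplot. The small Python sketch below imitates the C-style rule (two integer operands give a truncated integer quotient, anything else gives a floating-point one). The helper gp_div is a made-up name for this illustration, not a gnuplot function:

```python
def gp_div(a, b):
    """Divide the way C (and gnuplot) does: integer operands give an integer."""
    if isinstance(a, int) and isinstance(b, int):
        q = abs(a) // abs(b)
        return q if (a >= 0) == (b >= 0) else -q  # truncate toward zero
    return a / b

for k in (2, 2.0):
    first = gp_div(10 * k, 3)    # 10*k/3
    second = gp_div(k, 3) * 10   # k/3*10
    print(k, first, second)
# With k = 2 the two results are 6 and 0;
# with k = 2.0 both are approximately 6.667.
```

The order of operations only matters in the integer case, because k/3 is truncated to 0 before the multiplication.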

Maybe this feature is useful in some situations, I have no clue. After searching the manual, I found that this is indeed explained in section 13 ("Expressions"), page 15 of the manual for version 4.2. Of course, I discovered this after trying for several hours to make the damn thing work...

It must be pointed out that in this case, I knew what the plot was supposed to look like, so I was able to track down where the problem was. In most situations, such an error is likely to stay undetected until one day, with one particular set of data, you get a plot full of nonsense...

Wednesday, 30 September 2009

Integrating a LaTeX/Beamer build system under Windows shell



I've been a LaTeX user for a couple of years now, but I only switched to Beamer presentations this summer. At conferences and other events, I was always quite impressed by the presentations made with Beamer. At that time, I was happily using MS PowerPoint, but over time, I felt more and more dissatisfied with my own slides.
The "Beamer guys" always had cool-looking presentations: the slides were clean, clear and classy. Whoa! I had to switch...

So I recently gave it a try, and I must say that, at present, I don't think I will ever go back to MS.

I intend to prepare some kind of tutorial on Beamer, as there's not that much out there specifically on it, and I already know some useful tricks. But today I will focus on another point, not directly LaTeX-related: a usability trick for building several pdf files that must all follow an identical presentation standard, with each document needed in different modes. This happened to me when preparing courses. I do a lot of teaching, and I always give a paper handout to the students, so they can focus on what I'm saying rather than on handwriting notes.

With LaTeX, when you are building several documents that must share the same presentation, keeping a consistent header can quickly become tedious. You often need to tweak it, add a package, change a package option, ... If you are working on several documents at once, it's easy (understand: unavoidable) to end up with different headers in your different documents. And you get slight (or heavy...) changes in the appearance of the final documents. Moreover, you are likely to forget which header is the correct one.

One way to handle this is to concentrate on your part of the document, that is, what starts after \begin{document} and ends before \end{document}. All the rest is formatting information, and should be common to all documents.
Just let an automated script do the painful job (adding the header, and calling the compiler). And as I like fooling around with the Windows shell, I present here an example of what can be done in such a context. For a quick idea: when I'm done editing my file, I just open the right-click menu on it, and off it goes...

Before:



After:




And I get my ready-to-use pdf file, with the right formatting, and compiled in the right mode:



For now, three different building modes are available:

  • the standard beamer version, with all overlays,

  • a 4 on 1 handout printable version,

  • and a 1 on 1 version, same as the beamer version, but with no overlays, useful for quickly checking page rendering.


And of course, if I need fine-tuning of the LaTeX stuff, I still have the (generated) .tex file in my folder.

If you only want to get the thing working, you can skip the rest and go directly to the bottom to download. Otherwise, I'll explain how the trick works. Please be advised that this requires some knowledge of how an OS and a computer work. In my case, it is MS Windows (XP for me, but it should be fine on others) and its standard "cmd" script language, but Linux users should be able to translate this into their own shell (at least, experienced Linux users...)

First, your file. As I said, it starts with
\begin{document}
and ends with
\end{document}
You need a title page, on which will be written your name, date, and... the title. So the first lines of the document will look like:
\begin{document}
\title[small title in footer]{Main Title On First Page}
\begin{frame}
\titlepage
\end{frame}
The date is usually set up automatically, and the author's name isn't something that changes every day, so both can live in the header template. These settings can of course be overridden:
\begin{document}
\title[small title in footer]{Main Title On First Page}
\date{2050}
\author{Gill Bates}
\begin{frame}
\titlepage
\end{frame}
OK, now, how about compiling? This is done by a simple batch file that basically concatenates three files: the "mode" header, the regular LaTeX header, and your document. On Windows, it goes like this (Linux users will tweak this easily):
copy /A "%mp%\head_beamer.tex"+"%mp%\common_header.tex"+"%fn%.texb" "%fn%.tex" > nul
pdfLaTeX.exe --interaction=batchmode "%fn%.tex" 1> LaTeX_log_stdout.txt 2> LaTeX_log_stderr.txt
'mp' is the path where the script and template headers live, and 'fn' is the file name. 'head_beamer.tex' is the short header that defines the mode 'beamer', while 'common_header.tex' is where the "real" header stuff goes.
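
For those not on Windows, the same two steps (concatenate, then compile) can be sketched in Python. This is a rough equivalent, not the batch file shipped with this post. The function names are made up for the illustration, and it assumes pdflatex is available on the path:

```python
from pathlib import Path
import subprocess

def assemble_tex(mode_header, common_header, body, output):
    """Concatenate mode header, common header and document body, in that order."""
    output = Path(output)
    # Bytes are copied as-is, so no assumption is made about the text encoding
    parts = [Path(p).read_bytes() for p in (mode_header, common_header, body)]
    output.write_bytes(b"".join(parts))
    return output

def compile_pdf(tex_file):
    """Run pdflatex in batch mode and return the completed process."""
    return subprocess.run(
        ["pdflatex", "--interaction=batchmode", str(tex_file)],
        capture_output=True,
        text=True,
        check=False,  # inspect returncode and stdout instead of raising
    )
```

A call such as compile_pdf(assemble_tex("head_beamer.tex", "common_header.tex", "talk.texb", "talk.tex")) would then mirror the two batch lines above ('talk' being a placeholder file name).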

OK, and how do you choose the right header? Well, this is really Windows-specific, as it is about its registry, and I don't know how you define this on Linux.

To make it short, with Windows, each file extension defines a "type" of file, and each type gets associated with some things you can do with it. This information is stored in the so-called "registry". This can be (badly) handled through the 'assoc' cmd command, or using the 'regedit' GUI. But the best way is to write a .reg file that will automate this process.

First, we define a new file extension, ".texb" (.tex + B for beamer), in order to avoid changing the default .tex file behaviour you have on your system:
[HKEY_CLASSES_ROOT\.texb]
@="LaTeX.BeamerBody"
And associate this file type (LaTeX.BeamerBody) with the corresponding commands:
[HKEY_CLASSES_ROOT\LaTeX.BeamerBody\shell\build_B]
@="Build (Beamer version)"
[HKEY_CLASSES_ROOT\LaTeX.BeamerBody\shell\build_B\command]
@="\"C:\\program files\\sk_scripts\\BuildWithBeamer\\BuildWithBeamer.bat\" \"%1\" B"
The 'B' letter at the end of the command is an argument that is passed to the script, so it gets the right header. For example, the 'beamer' header will look like this:
\documentclass{beamer}
\usetheme{Madrid}
the 'handout' version like this:
\documentclass[handout]{beamer}
\usetheme{Madrid}% change this to whatever beamer theme you want
and the 'handout 4 on 1' like this:
\documentclass[handout]{beamer}
\usetheme{default}
\selectcolormodel{gray}
\usepackage{pgfpages}
\pgfpagesuselayout{4 on 1}[a4paper,landscape,border shrink=1mm]
\pgfpageslogicalpageoptions{1}{border code=\pgfsetlinewidth{1.5bp}\pgfusepath{stroke}}
\pgfpageslogicalpageoptions{2}{border code=\pgfsetlinewidth{1.5bp}\pgfusepath{stroke}}
\pgfpageslogicalpageoptions{3}{border code=\pgfsetlinewidth{1.5bp}\pgfusepath{stroke}}
\pgfpageslogicalpageoptions{4}{border code=\pgfsetlinewidth{1.5bp}\pgfusepath{stroke}}

% because the 'default' does not add page numbers
\addtobeamertemplate{footline}{\insertframenumber/\inserttotalframenumber}
All this 'pgf' stuff is there to define a solid border around each slide, as the 'default' beamer theme is quite sober (thanks to all the guys on this excellent French-language LaTeX mailing list for this trick.)
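
Coming back to the registry entries for a moment: the escaping in the command value (doubled backslashes, \" around the paths) is easy to get wrong when typing by hand. Below is a small Python sketch that generates the two keys for one menu item. The function name is invented for this illustration, and a complete .reg file additionally starts with a version header line, which is omitted here as in the excerpt above:

```python
def reg_command_entry(prog_id, verb, label, script_path, mode):
    """Return .reg text for one context-menu item running: script "%1" mode."""
    def esc(s):
        # Inside a .reg string value, backslashes and double quotes are escaped
        return s.replace("\\", "\\\\").replace('"', '\\"')

    command = f'"{script_path}" "%1" {mode}'
    key = f"HKEY_CLASSES_ROOT\\{prog_id}\\shell\\{verb}"
    return (
        f"[{key}]\n"
        f'@="{esc(label)}"\n'
        f"[{key}\\command]\n"
        f'@="{esc(command)}"\n'
    )

print(reg_command_entry(
    "LaTeX.BeamerBody", "build_B", "Build (Beamer version)",
    r"C:\program files\sk_scripts\BuildWithBeamer\BuildWithBeamer.bat", "B",
))
```

Called with the values used in this post, it prints the same four lines as the 'build_B' entry shown earlier. Other menu items only differ by the verb, the label and the mode letter.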

So what if you want to have this shell menu available on your system? Well, all this configuration is provided in this zip file: download, unzip, and launch Install.bat. This will copy everything to convenient places and import the settings into the registry. You're ready to go, assuming, of course, that you have a working LaTeX installation available in the path (I use MiKTeX).

And of course, an 'uninstall' is provided, to remove the settings from the registry. Once you have this installed, you can check it by going into the 'demo' folder and double-clicking the file 'demo.texb': it should produce a nice example pdf!

Q & A

Q: what if I don't like the theme you chose?
Q: what if I want to change colors?
Q: what if I want to add a package?
Q: what if I want to remove your stupid badly designed logo I get on every page?
A: All these questions have the same answer: just edit the file c:\program files\sk_scripts\BuildWithBeamer\common_header.tex as you like. This can be conveniently done with the fourth contextual menu item. If you are a native English speaker, you will probably have to, as at present this header is set up for documents written in French. Of course, this requires some LaTeX knowledge.

This command calls notepad++; if you use another editor, just change the corresponding lines in the file 'install.reg', and re-import it into the registry. And don't forget: with LaTeX, to get the table of contents right, you need to compile twice.

If you like this trick, you can tell me about it: add a comment, or just drop me a line (firstname DOT lastname AT univ-rouen DOT fr).

Saturday, 19 September 2009

PovRay experience

Well, before this gets lost, here are some thoughts about PovRay and what I did with it.

A couple of years ago, I spent some time investigating synthetic image rendering.
I considered several software packages, and quickly found out that they all needed a lot of learning time to really get into.

I ended up selecting PovRay, and started learning its syntax. I must say it is very well documented, and the tutorial takes you step by step through the process, so that's not the hard part.

What can be frustrating is the rendering time: as soon as you get into something rather complex, it can become very, very long, and may take up to several hours per image if you want full quality (anti-aliasing) and full-size images. This depends, of course, on the processing power of your computer. What you're supposed to do is pre-render at a small size with no AA, but often what causes the trouble is precisely all the high-level details you can only see with full-scale images...

As an example of what can be done with PovRay, you can find below an animated gif. As you can see, there are two light sources (two shadows), and a hole inside the blob that has a changing transmit value. This was done in 2007, I think, with PovRay 3.5. Of course, by now, I suppose this software has been updated, and that things can be done more easily, but I no longer have time to follow that. If you want to head in this direction, make sure you check out PovRay, although I know there are lots of other apps (Blender, for example).


cubatrous


Animating is done by generating several images with an automatically varying 'clock' variable, declared in PovRay. The gif file was then produced with the smallest Windows GUI program I know: Unfreez. Only 19.5 kbytes! Be sure to check this very simple but efficient tool.
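
To give an idea of what the varying 'clock' amounts to, here is a small Python sketch that computes one clock value per frame. It assumes the clock runs linearly from 0 to 1 over the animation (other ranges can be passed in). The function name is made up for this illustration and is not part of PovRay:

```python
def clock_values(n_frames, initial=0.0, final=1.0):
    """Return one linearly interpolated clock value per animation frame."""
    if n_frames < 1:
        raise ValueError("need at least one frame")
    if n_frames == 1:
        return [initial]
    step = (final - initial) / (n_frames - 1)
    return [initial + i * step for i in range(n_frames)]

# Five frames: the scene is rendered once per value
print(clock_values(5))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In the scene file, any quantity (here, the transmit value of the hole) can then be written as an expression of the clock, so that it changes from one frame to the next.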

Here is one of the frames with a higher resolution:






You can also notice that this image is far from perfect: although it was rendered with anti-aliasing on, you can see some aliasing at the border between the solid and the empty part. I didn't get to the "expert" level, so I wasn't able to handle this. But if you want to work on this, the corresponding PovRay script is available here. Included files are standard libraries, so you should be able to make it run right away.


I did all this in a research context: I was working on embedded stereovision, and I had trouble getting a useful dataset. So the idea was to produce synthetic stereo pairs this way, that would simulate what could have been acquired by an on-board camera set. An example of my work can be found below (I used Oyonale's well-known Mini-cooper model). This image took several hours to render, and is still not perfect (see aliasing at the border between land and sky).






However, I found out that practical image processing did not produce the same results as on a "real" image: for example, a corner detector will only detect garbage in such an image. This can probably be explained with some maths, but I did not want to lose more time on this approach, so I moved on. This is what research is all about...