Posted by Diego "Flameeyes" Pettenò
Wed, 02 Jul 2008 23:36:00 GMT
Today I was feeling somewhat blue, mostly because I’m demotivated to do most stuff, and I wanted to see what it was like to work in Gentoo two years ago.
One thing I read is that a little shy of exactly two years ago, ICQ broke Kopete, just like they did yesterday. Interestingly enough, even though a workaround has been found for Kopete 0.12 (the one shipped with KDE 3.5), I see no version bump in the tree this time. Most likely a sign that the KDE support in Gentoo has changed.
There is also the whole thing with ALSA problems, which span so many posts that it’s not worth listing them all. The current ALSA maintainer simply gave up on providing something that, at least for some users, ended up being quite important.
And all the work on Gentoo/FreeBSD! Although Javier is doing a huge amount of work now to support FreeBSD 7.0, he’s not prone to blog about it, and you can see that Gentoo/FreeBSD is easily ending up in the “historical” memory, rather than being discussed and tried out by users daily.
What didn’t change at all is my insomnia, it’s almost 2AM and I’m still up. And this time I don’t even have Antiques Roadshow to watch. I’m currently working on xine, just like two years ago.
In general, I think a lot of areas in Gentoo did go downhill from two years ago, rather than improving. Portage has certainly improved, thanks to Zac, Genone and the rest of the team, and we can see that in the new extended repoman checks, which also help QA. But the general user support seems, to me, lacking.
This is a direct consequence, in my opinion, of leaving open doors for people who are just driving Gentoo’s energy away, by taking over projects to make them stall, by discussing details over and over and over, by repeating the same request even when people reject it as it stands, and so on.
I hope things will improve in the next months, thanks to a new council that can finally grow some balls, straightening up the situation, but if this does not happen, I’m already preparing for my plan B…
Posted in Gentoo, Personal, English | Tags ALSA, FreeBSD, GentooFreeBSD, ICQ, KDE, Kopete, Past | 6 comments
Posted by Diego "Flameeyes" Pettenò
Wed, 02 Jul 2008 13:43:00 GMT
As an experiment, I’ve been working on rewriting xclip to use XCB rather than Xlib. This is mostly because I have always been interested in XCB but never had the time to learn much about its internals.
To make my task easier I ended up using some functions that are not available in the currently-released version of xcb-util, the side-package of XCB that contains some higher-level functions that make it easier to replace Xlib.
Besides the fact that xcb-util still hasn’t bumped its version, which makes it impossible to check for the right version with pkg-config, there is one interesting point in using the latest available version through the x11 overlay.
Leaving aside some problems with actually fetching and installing the packages I need (Donnie, I’ll send you the patches later if I can polish them a bit), a few patches from Jamey Sharp (an XCB developer), dating from March 2008, are applied on top of the actual GIT tree. These remove one library (libxcb-xlib) and change the locking method used to make Xlib use the same socket as XCB. These changes not only break ABI (without changing the soname, alas!) but also make it impossible to build the old libX11 against the new libxcb. Using the live version of libX11 (which is also patched to use the new hand-out mechanism) fixes this problem, but the result is a much bigger problem.
First of all, this is a perfect example of what I said about preserve-libs. If you are not using --as-needed, and you had libX11 built with the xcb USE flag enabled, you’ll have libxcb.so.1 linked into almost all X-using binaries on your system; after building the new libxcb (which would install libxcb.so.2, in theory) and the new libX11 (which would link to it), all those binaries will have both the old and the new libxcb in their process space. With different ABIs. And that’s a huge problem in itself.
Then there is the other problem, related to the .la files I discussed a few months back. As a huge number of KDE modules (and not only those) link to Xlib, they also have libxcb-xlib listed in the dependencies of their .la files. This makes every link through libtool fail, as it looks for the missing libxcb-xlib.la file.
I suppose it’s time to write a script to fix this situation, but I admit I’m not much motivated at the moment. Especially since my system is pretty slow when it comes to rebuilding stuff for testing, and my employer is not going to pay me anytime soon so that I could get a newer box.
Once the script is available, it should be much, much easier to get rid of .la files in ebuilds, as we could just tell users to run the fixing script and be done with that.
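To give an idea of what such a fixing script would have to do, here is a minimal sketch in Python. It is not the actual script (which does not exist yet), and the function name and the sample .la content are made up for illustration; it only assumes the usual libtool layout, where a .la file carries a single-quoted dependency_libs line.

```python
import re

def drop_la_dependency(la_text, libname):
    """Remove one library from the dependency_libs line of a .la file.

    libname is given without the "lib" prefix, e.g. "xcb-xlib".
    Both the /path/libNAME.la form and the -lNAME form are dropped.
    """
    la_suffix = "/lib" + libname + ".la"
    link_flag = "-l" + libname

    def fix(match):
        kept = [token for token in match.group(1).split()
                if not token.endswith(la_suffix) and token != link_flag]
        return "dependency_libs=' " + " ".join(kept) + "'"

    return re.sub(r"^dependency_libs='([^']*)'", fix, la_text,
                  flags=re.MULTILINE)

# Hypothetical .la content, written by hand for this example.
sample = (
    "dlname='libfoo.so.1'\n"
    "dependency_libs=' -L/usr/lib /usr/lib/libX11.la"
    " /usr/lib/libxcb-xlib.la /usr/lib/libxcb.la -ldl'\n"
)

fixed = drop_la_dependency(sample, "xcb-xlib")
print(fixed)
```

A real script would also have to walk the library directories, keep backups and cope with whatever oddities hand-edited .la files contain; this only shows the rewrite of a single file.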
But I admit I was planning on doing some different things in the next days, I had little time for myself lately to begin with, and I’m following way too many things at once. Sigh.
Posted in Gentoo, Technical, English | Tags lafiles, libtool, Lives, X11, XCB, xcbutil, Xlib, Xorg | 3 comments
Posted by Diego "Flameeyes" Pettenò
Mon, 30 Jun 2008 16:48:00 GMT
Fellow developer Zhang Le wrote about the new preserve-libs feature from Portage 2.2 that removes the need for revdep-rebuild.
As I wrote on the gentoo-dev mailing list when Marius asked for comments, there are a few problems with its implementation as it is, in my not-so-humble opinion. (Not-so-humble because I know exactly what I’m talking about, and I know it’s a problem).
Let’s take a common scenario: a system where --as-needed was never used, which is updating a common library from ABI 0 to ABI 1 (so with a change of soname). This library might be, for instance, libexpat.
I don’t want to discuss here what an ABI is and what an ABI bump consists of. Let’s just say that when you make an ABI bump you either remove functions, or you change the meaning of some functions (like the parameters, the behaviour or other things like those).
In the first case the bump is annoying but not much of a problem: executables stop being loaded because symbols are undefined; with lazy binding, they might die in the middle of a run, at the moment the undefined symbol is called, but that’s not our concern here.
The problem comes when a function with the exact same name changes meaning, parameters or return type. In this case, the executable might pass too much or too little data to the function, the pointers might refer to something completely different, or might be truncated. In general, when you change the interface or the meaning of a function, executables built against the previous version and run with the new version will either crash or silently misbehave. These are two subtle issues that we should be watching out for, as they are hard to debug unless you know about them.
So let’s return to our library changing ABI. Let’s say we have libfooA.so.0 and libfooA.so.1 installed: the first is preserved by preserve-libs, the second is the new one. libfooB.so.$anything links to libfooA as it uses it directly, so it will be in the set of packages to rebuild.
Now introduce libfooC.so.$anything, which links to libfooB.so.$anything but, as destiny wishes, also uses libfooA directly.
At this point before the ABI bump we have libfooB depending on libfooA.0, and libfooC depending on libfooB and libfooA.0; after the bump, we decide to rebuild only libfooB, which means that libfooB now depends on libfooA.1 while libfooC is still depending on libfooA.0.
What this means is that, minus symbol versioning, the same symbol would have two (probably different) definitions, which will collide one with the other, leading to subtle crashes, misbehaviour and other fun-to-debug problems.
The problem is that the two ABIs of the library are both being loaded into the same process, which is a very bad thing, unless the symbols are versioned. On the other hand, symbol versioning is a bit of a mess, it’s not implemented by all operating systems, and I find it quite convoluted.
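To make the collision a bit more concrete, here is a toy model in Python of the scenario above. It is a sketch under simplifying assumptions, not how any real loader is implemented: objects are loaded breadth-first from their dependency lists, and without symbol versioning the first definition found in load order wins. The sonames, the app object and the foo_parse symbol are made up for the example.

```python
from collections import deque

# Dependency lists after a partial rebuild (hypothetical sonames).
NEEDED = {
    "app": ["libfooC.so.0"],
    "libfooC.so.0": ["libfooB.so.0", "libfooA.so.0"],  # not rebuilt: old ABI
    "libfooB.so.0": ["libfooA.so.1"],                  # rebuilt: new ABI
    "libfooA.so.0": [],
    "libfooA.so.1": [],
}

# Both versions of libfooA export the same unversioned symbol.
DEFINES = {
    "libfooA.so.0": {"foo_parse": "old ABI"},
    "libfooA.so.1": {"foo_parse": "new ABI"},
}

def load_order(root):
    """Breadth-first walk of the dependency lists, each object loaded once."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for dep in NEEDED[obj]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order

def resolve(symbol, order):
    """Return the first object in load order that defines the symbol."""
    for obj in order:
        if symbol in DEFINES.get(obj, {}):
            return obj
    return None

order = load_order("app")
winner = resolve("foo_parse", order)
print(order)
print(winner, "->", DEFINES[winner]["foo_parse"])
```

In this model both copies of libfooA end up loaded, and libfooB, although rebuilt against the new ABI, has its calls to foo_parse bound to the old definition.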
At the moment I don’t see anything in portage that stops you from shooting yourself in the foot by doing a partial rebuild. I hope I’m mistaken, but if I’m not, please remember to always do a full rebuild, rather than a partial one. Otherwise, instead of having programs not starting, you might have programs corrupting your data.
Posted in Gentoo, Technical, English | Tags ABI, ELF, Linking, portage, Versioning | 6 comments
Posted by Diego "Flameeyes" Pettenò
Mon, 23 Jun 2008 20:29:00 GMT
One nice thing about rbot is certainly, to me, the fact that installing it is mostly just an emerge away. For quite a while there has been a live ebuild of rbot that fetched first from Subversion and now from GIT.
The only problem was that the ebuild fetched the sources, then made a gem, installed it, and then removed some files. Not exactly the shortest way around.
Thanks to the guys in #rbot, this has been solved. Now rbot installs itself without using gems, through a simple setup.rb. This means not only that you won’t need rake and zip to install it, but also that the installed files have decent paths, rather than being buried inside the rubygems directories.
An additional advantage comes from the size of the installed files; when you install a gem, a copy of the sources is left in the rubygems tree; this can be quite a waste. The size of the rbot binpkg has halved:
-rw-r--r-- 1 root root 959426 May 24 12:08 /usr/portage-packages/All/rbot-9999-r8.tbz2
-rw-r--r-- 1 root root 425034 Jun 23 20:18 /usr/portage-packages/All/rbot-9999-r9.tbz2
Add to this that the hard ruby-gettext dependency is gone too: you now get an nls USE flag that allows you to choose whether you want localised bots or not, and if you don’t want them, it won’t ask you for ruby-gettext at all. Nice, uh?
But it’s not finished here. I was hacking at the ebuild today for a different reason. rbot has a figlet plugin that executes the figlet command and copies the output to IRC. The ebuild, though, didn’t have an option to pull in figlet, and I wanted to fix that.
So together with the removal of rubygems dependency, the new ebuild also has USE flags to enable the figlet, cal, host and fortune plugins. So you either get your dependencies pulled in, or the bot has the plugins disabled out of the ebuild. Cool!
And talking about the fortune plugin, listing the categories didn’t work on Gentoo, as we install the database in a different directory than the one rbot expected. This is fixed in the rbot upstream repository by yours truly.
Always a pleasure to fix rbot on Gentoo ;) Kudos welcome as well as bribes.
Posted in Gentoo, Technical, English | Tags Ebuilds, Figlet, rbot, Ruby, RubyBot | no comments
Posted by Diego "Flameeyes" Pettenò
Sat, 21 Jun 2008 20:22:00 GMT
Doing support work in #gentoo-it is probably one of the most user-facing tasks I’ve been doing lately, it’s nice because you can often gather common misconceptions about problems and tools.
One of these is related to ccache. Some users seem to think that ccache will improve their compile speed in any situation. This couldn’t be more wrong.
First of all, when the source file is always different, ccache will make the build take longer than a non-cached build. The reason is pretty simple once you think of it. The cache is indexed by a hash, the md5 of the preprocessed source file; and the content of the cache is the resulting object file. When you build a given source file, ccache has to take an md5 hash of the preprocessor’s result. Then it has to look for it in the cache tree, and if it’s not found, it has to compile the file and write the output twice (once as the output of the build and once in the cache). It might not be a huge overhead but it’s an overhead nonetheless.
So there has to be a benefit to using ccache for it to be useful. The benefit is that, when you build the same sources twice, the hash, lookup and copy usually take less time than the build. But when do you actually build the same sources twice?
The first myth to debunk is that it’s helpful for packages using libtool as they build sources twice (one PIC and one non-PIC). While it’s true they build the same sources twice, they are not compiled in the same way, so the cache is not saving anything. If they were built the same way, there would be no reason to actually build it twice, no?
The second is that ccache helps when you change your CFLAGS. The idea of ccache is that it gives you the exact output the compiler would give you. And this means that changing CFLAGS will change the resulting output too. If it was ignoring the change in CFLAGS and returning data from cache it would be breaking your setup by disallowing you to change CFLAGS. Again, ccache is not helping you.
The third myth is that ccache makes changing USE flags a matter of seconds rather than hours. While it’s true that this is a case where you commonly do have an advantage in using ccache, it’s not that simple. Changing USE flags usually means changing the compiled code; there are rare cases (like xorg-server PDEPENDs) that allow you to keep the exact same sources when changing USE flags.
Even then, if you change versions of the libraries used by the software, then the preprocessed sources will change, and we’re back to square one.
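The reasoning above can be played with in a toy model. This is a sketch in Python and not ccache’s code: the class, the fake compiler and the key layout are all made up, and sha256 merely stands in for the hash. The only point is that the key depends on both the preprocessed source and the compiler flags, so any change in either is a miss.

```python
import hashlib

class ToyCache:
    """Toy compiler cache keyed on compiler flags plus preprocessed source."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def compile(self, preprocessed_source, cflags, compiler):
        key_material = " ".join(cflags) + "\0" + preprocessed_source
        key = hashlib.sha256(key_material.encode()).hexdigest()
        if key in self.store:
            self.hits += 1
            return self.store[key]
        # On a miss we pay for the hash, the lookup, the real compile
        # and the extra copy into the cache.
        self.misses += 1
        result = compiler(preprocessed_source, cflags)
        self.store[key] = result
        return result

def fake_compiler(source, cflags):
    """Stand-in for the real compiler: returns a description, not an object."""
    return "object(%d bytes, flags=%s)" % (len(source), " ".join(cflags))

cache = ToyCache()
src = "int main(void) { return 0; }"

cache.compile(src, ["-O2"], fake_compiler)           # first build: miss
cache.compile(src, ["-O2"], fake_compiler)           # identical rebuild: hit
cache.compile(src, ["-O2", "-fPIC"], fake_compiler)  # libtool PIC pass: miss
cache.compile(src, ["-O3"], fake_compiler)           # CFLAGS changed: miss
cache.compile(src + "\n/* header changed */", ["-O2"], fake_compiler)  # miss

print(cache.hits, cache.misses)
```

Only the identical rebuild is served from the cache; the PIC pass, the changed CFLAGS and the changed preprocessed source all go through the full compile.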
All in all, ccache is not bad, it’s helpful in various situations, it’s quite useful for developers. But it’s not a panacea for Gentoo users.
Posted in Gentoo, Technical, English | Tags ccache, myths, panacea | 6 comments
Posted by Diego "Flameeyes" Pettenò
Sat, 14 Jun 2008 13:00:00 GMT
With capital B and I, of course.
I wrote yesterday about maintainer mode use by ebuilds and why it’s a problem. Today I want to focus on a slightly related issue: conditional patching and autotools rebuild.
One very common use of maintainer mode is due to conditional patching: you can’t inherit autotools.eclass conditionally, so you leave it up to the autotools to decide whether to rebuild themselves or not.
This is bad not only because of the problems I outlined for maintainer mode itself, but also because conditional patching, as well as conditional autotools rebuild, is something you should not do in an ebuild unless there is an extremely good reason to.
The first problem is quite obviously that you cannot send conditional patches upstream (see also my old article Best practices for portable patches). In Gentoo we should really prefer patches taken from upstream and/or that can be sent and accepted by upstream.
But it’s not limited to that: conditional patches get applied only under particular conditions. This means that if you are bumping the package and those conditions don’t apply to you, then you won’t see whether the patch applies or not. And if it’s an important fix for a particular architecture, and the patch cannot apply, then that architecture will have a broken package. If it’s a build fix you can live with it for a bit (although it will upset, with reason, the users of that arch), but if it’s a runtime fix then it’s a problem, because it might go months without being noticed.
And it’s not just during bump, but also when you add a new patch, it might not stack correctly against one of the conditional patches, and you wouldn’t be seeing the failure. So for instance to fix a build error on one setup you might break build on a different one.
In general, you want the patches to be unconditional, and you want to rebuild autotools at the slightest change. This holds even if the rebuild would be needed only by a reduced set of users. The reason is that it’s better to waste a little more time for all the users rather than having a (big or small) subset of them being stuck entirely because of a conditional patch or rebuild.
Also, the sooner you can send a patch upstream, the sooner you won’t be needing it in the ebuild. If you prepare a conditional, raw and quick patch to fix a setup, you cannot send it as it is to upstream, so you’ll have to either spend more time to prepare also a refined good patch, or you’ll have to deal with that patch for a long time. Neither solution is a good one for your workflow.
So please, don’t make anything conditional, unless it’s really really needed.
Posted in Gentoo, Technical, English | Tags autotools, BestPractices, Conditionals, patches | 6 comments
Posted by Diego "Flameeyes" Pettenò
Fri, 13 Jun 2008 14:12:00 GMT
Developers happen to edit Makefile.am and configure.ac a lot during the development of any software. For this reason, a long time ago, autotools gained a feature called maintainer mode.
When maintainer mode is active, whenever those files (or other autotools files) change, the proper tool is called (autoconf, automake, etc) to rebuild the whole series of files. This is useful because a simple make call does almost everything for the developer.
Recently, though, maintainer mode is no longer optional by default; it is instead force-fed on every project, unless AM_MAINTAINER_MODE is explicitly called in the configure.ac file.
This is all well and good, so why am I talking about this? Well because maintainer mode is bad to use in ebuilds. If you patch a Makefile.am or configure file, and you don’t explicitly rebuild autotools, maintainer mode will do it for you, but it causes a few problems this way.
The most annoying problem for users is that if the maintainer mode is triggered by an edit to configure.ac, the build will run ./configure twice. It’s boring enough to run once.
But more problems arise, for instance, when a new version of autoconf or automake is released: in that case the version that generated the original files and the one used for the rebuild will not match, aborting the whole build (see my autofailure guide, which covers this case for instance).
Also, if you leave the task up to maintainer mode, it won’t even take into consideration the WANT_AUTO* variables that we use to force a particular version of autoconf and automake, so you’ll have automagic dependencies. And most likely you’ll forget to add autotools to the dependencies of the ebuild.
Okay so we know why maintainer mode is bad, how do we make sure that ebuilds don’t trigger maintainer mode? Beside the fact that developers should know what they do, by rebuilding autotools explicitly when patching the files, it’s worth looking “postmortem” if an ebuild triggered the maintainer mode rebuild. It’s actually useful because sometimes upstream is crazy and fixes the tarball without going through make dist and maintainer mode is triggered for unpatched vanilla tarballs too (in that case you should rebuild autotools explicitly anyway).
To check if a build triggered maintainer mode, you can just grep the build log for the string “missing --run” (without quotes of course), which is the sign that maintainer mode was triggered. If that’s the case… it’s time to file a bug and attach it to this tracker.
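If you’d rather script the check than grep by hand, something like the following Python sketch would do. The function name is made up, and the sample log is placeholder text I wrote for the example, not real build output; the only thing taken from the above is the “missing --run” marker string.

```python
MARKER = "missing --run"

def maintainer_mode_lines(build_log_text):
    """Return (line_number, line) pairs for log lines containing the marker."""
    return [(number, line)
            for number, line in enumerate(build_log_text.splitlines(), start=1)
            if MARKER in line]

# Made-up placeholder text, NOT real build output.
sample_log = "\n".join([
    "placeholder: configure output",
    "placeholder: something something missing --run automake",
    "placeholder: compiler output",
])

for number, line in maintainer_mode_lines(sample_log):
    print("%d: %s" % (number, line))
```

An empty result means the marker never showed up in the log; any hit deserves a look at the ebuild.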
I suppose I’ll have to fix bugs on a few more packages I don’t use/care about myself directly in the next weeks. Sigh, I need the new box.
Posted in Gentoo, Technical, English | Tags automake, autotools, MaintainerMode | 3 comments
Posted by Diego "Flameeyes" Pettenò
Wed, 11 Jun 2008 16:45:00 GMT
I’m in the middle of an emerge -e world that I started so that as many glibc 2.8 failures as possible already have a bug on Gentoo’s Bugzilla, and users hitting them don’t have to report them anew.
One thing I noticed during this rebuild is the amount of time spent with no good reason to build tests and examples.
I have the test feature disabled, I only enable it for the software I maintain, and I usually don’t have much time to look at the tests of all software. Still some packages, like dbus, hal, gtk, and so on, build their tests anyway. They don’t run them, but they build tests and examples during the default make all call.
I talked with Lennart about that some time ago, as libdaemon does that too. He pointed out to me that these examples are often a way for the developers to test that the code still links correctly, for instance, and thus should not be disabled by default during development. I agree it can be useful that way.
For this reason, I prepared a patch for libdaemon, which you can now find here (I should have committed it and forgot about it; I’ll see to that soonish :P). It should make it possible to have the best of both options: developers still get their examples (and/or tests), while distributions can opt out of them.
For ebuilds, this means that the src_test function should do a make -C tests or make -C examples to build tests and examples, after such a patch is applied, to explicitly build the code that most users won’t need during standard run.
I know it’s just seconds, or minutes, of time to build the examples, but if you add all those up, you’re wasting lots of time. And we should always strive to optimise time usage, no?
Posted in Gentoo, Technical, English | Tags autotools, Builds, Ebuilds, Examples, libdaemon, patches, Tests, Upstream | 2 comments
Posted by Diego "Flameeyes" Pettenò
Mon, 09 Jun 2008 13:06:00 GMT
Mike already wrote to gentoo-dev about the new build failures with glibc 2.8. Thanks GNU, really, for starting the long overdue cleanup.
I say long overdue because other C libraries like FreeBSD’s and others already had headers that included the very minimum needed. So some of the issues that we’re going to fix now were just as valid for FreeBSD and others.
So this is going to help portability to FreeBSD and vice versa: stuff already ported to FreeBSD should build just fine with the new glibc version.
As I’m a borderline masochist, I started looking at patches to see if I can easily tell whether they are good and commit them myself as needed (yeah I know I step on lots of toes; with most people I got agreements beforehand though, so sometimes it’s just fine; besides, as long as the patch works and I don’t break stuff, nobody has complained recently ;) ).
One thing I noticed was a misfiled bug: it reported a failure with glibc 2.8 when it was instead a failure with gcc 4.3. The issue is more or less the same, missing includes most of the time, because both projects cleaned up their header files, but it’s important to know which package caused it, to make sure that the fixed ebuilds are all stable before gcc and glibc can go stable.
For the list of bugs caused by glibc 2.8 I refer you to the tracker; as you can see there, the usual problems are:
- PATH_MAX undeclared;
- ARG_MAX undeclared;
- field ‘reg’ has incomplete type;
- storage size of ‘peercred’ isn’t known.
The solution for some of these issues is to add the right include (like limits.h); for others it is to properly define _GNU_SOURCE to get the GNU extensions enabled. Usually projects using autotools already have a check for that, but recent autoconf versions use config.h to define it, and not all the sources include it.
On ebuilds you can work that around by using append-flags -D_GNU_SOURCE but upstream should fix their source files by including config.h right at the start of every source file.
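For a quick triage of a source tree, the two fixes above can be turned into rough heuristics. The following Python sketch is only that: both function names are made up, the checks are plain text matching with no real C parsing, and a header pulled in indirectly by another include will fool the first one.

```python
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

def uses_path_max_without_limits_h(c_source):
    """True if PATH_MAX appears but limits.h is not included directly."""
    uses_it = re.search(r"\bPATH_MAX\b", c_source) is not None
    return uses_it and "limits.h" not in INCLUDE_RE.findall(c_source)

def first_include(c_source):
    """Return the first header included by the source, or None."""
    match = INCLUDE_RE.search(c_source)
    return match.group(1) if match else None

# Hand-written sample sources for the example.
broken = '#include <stdio.h>\nchar buf[PATH_MAX];\n'
fixed = '#include "config.h"\n#include <limits.h>\nchar buf[PATH_MAX];\n'

print(uses_path_max_without_limits_h(broken), first_include(broken))
print(uses_path_max_without_limits_h(fixed), first_include(fixed))
```

The second helper is meant for the config.h case: if the first include of a file is not config.h, a _GNU_SOURCE definition living there arrives too late for the system headers included before it.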
For failures related to C++ code, like undeclared ostream or cerr, the problem is not with glibc 2.8 but with gcc 4.3 instead.
Enjoy!
Posted in Gentoo, English | Tags BuildFailures, Cleanup, GCC, GLIBC | 4 comments
Posted by Diego "Flameeyes" Pettenò
Sat, 07 Jun 2008 20:09:00 GMT
While Ciaran continues spreading the word that --as-needed is broken because it disallows some techniques that, in my opinion, are crazy enough by themselves, and wouldn’t work with Sun’s compiler or with some executable formats like Microsoft’s own PE, there is some more information about --as-needed that I’d like people to know about.
In general, with the exception of some rare, and as I said crazy, to me, designs, --as-needed is almost always fixable. The speed of the fix is usually inversely proportional to the decency of the build system.
On a standard autotools-based build system, with some exceptions for cyclic dependencies between modules, the problem is usually solvable in a few minutes. Usually, it’s due to libraries passed through _LDFLAGS variables instead of _LDADD or _LIBADD (depending on whether the target is an executable or a library). Sometimes it’s caused by mistakes in the configure script though.
In particular, if you’re testing for the presence of a library, you want to pass it through the LIBS variable, rather than LDFLAGS (and remember I consider AC_CHECK_LIB harmful). For those who disagree because some software passes flags like -L through their config scripts, well, those are flags that fit perfectly in LIBS as they are library lookup flags!
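Why the variable matters comes down to ordering: automake puts LDFLAGS before the object files on the link line, and _LDADD and LIBS after them. Here is a toy model of --as-needed in Python, a rough sketch and not the linker’s real algorithm, with made-up object, library and symbol names: a library is kept only if it satisfies a symbol left undefined by something that came before it on the command line.

```python
def link_as_needed(command_line, objects, libraries):
    """Toy --as-needed pass over an ordered link command line.

    objects maps an object file to the set of symbols it leaves undefined;
    libraries maps a library flag to the set of symbols it defines.
    Returns (libraries kept, symbols still undefined at the end).
    """
    undefined, kept = set(), []
    for item in command_line:
        if item in objects:
            undefined |= objects[item]
        elif undefined & libraries[item]:
            kept.append(item)
            undefined -= libraries[item]
        # else: nothing seen so far needs this library, so it is dropped
    return kept, undefined

objects = {"main.o": {"foo_init"}}     # hypothetical object file
libraries = {"-lfoo": {"foo_init"}}    # hypothetical library

# Library passed through LDFLAGS: it comes before the objects.
print(link_as_needed(["-lfoo", "main.o"], objects, libraries))

# Library passed through LDADD or LIBS: it comes after the objects.
print(link_as_needed(["main.o", "-lfoo"], objects, libraries))
```

In the first order the library is discarded before main.o is even seen, leaving foo_init undefined; in the second it is kept and nothing is left unresolved.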
Unfortunately there are very broken build systems, like the one of xmlwrapp, that creates shared objects in ELF with no dependency, without linking any other library together and thus expecting all its users to know the list of libraries it links against (it reports a huge list of libraries through pkg-config).
In these cases, fixing the issue might require quite a lot of work. Doesn’t mean it’s impossible though, of course, just annoying, long, and boring.
To avoid introducing more issues like these, in agreement with Mike, I’ve introduced a new warning in append-ldflags, if you try to pass the -l option to link against a library, it will tell you not to do that. That’s a mistake that will cause --as-needed to fail.
Whatever I’m saying here, anyway, makes no difference on whether the designs that clash with --as-needed are sane or not. Passing libraries through LDFLAGS is a mistake, --as-needed or not; it only happens to break stuff when --as-needed is used. So let’s not bury our heads in the sand and filter --as-needed directly, let’s try to fix the issue, okay?
And of course, to finish: if you are unable to fix an --as-needed issue, whether you don’t have the time or the will, or simply think you lack the skill to fix it, let me know and I’ll do everything I can to fix the issue myself; just please don’t filter the flag out without leaving a bug open, okay? If the package is important and you want users to be able to build anyway, well, okay, filter out the flag, but leave a bug open so I know it has to be fixed.
Developers seem to feel like having bugs open is a failure, instead it’s just a way to track things down. If an issue is just partly worked around, or is fixed in the wrong way, leave the bug open, maybe put it to assigned, so that you can filter it out, use the status whiteboard, whatever you want, but don’t write the bug off as FIXED, if it is not.
Posted in Gentoo, English | Tags AsNeeded, autotools, Bugs | 3 comments