Meeting C++ 2016

This is my first time at Meeting C++ in Berlin. I came here with my boss Andi. To get more out of the conference, we split up during the talks and shared what we learned afterwards.
I will complete this post later, and add links to the presentations and videos as they become available.

I attended the following talks:

Opening Keynote by Bjarne Stroustrup

He talked about the evolution and future direction of C++, explaining the guiding principles and philosophy of the language. He also explained how the standards committee works, and that even he himself is sometimes outvoted. He could say that, and even name the people who held other opinions, without any bitterness. Very professional and focused!
The main point that stuck out was: “zero overhead abstractions”.

C++ Core Guidelines: Migrating your Code Base by Peter Sommerlad

Unfortunately Peter Sommerlad was sick and couldn’t come. So Bjarne Stroustrup agreed, ten minutes before his own keynote, to jump in and give the talk without any preparation. He claimed never to have given a talk on this topic before. He had some slides bearing the name of his employer and jumped around in them. Apart from that barely noticeable detail, you couldn’t tell that the talk was unprepared. He talked about how to use the [GSL](https://github.com/Microsoft/GSL) in new code. But the main focus was on how to gradually improve old legacy code by introducing the types the GSL provides. In the future there should even be tools to perform the task automatically.

Reduce: From functional programming to C++17 fold expressions by Nikos Athanasiou

He started out by showing how a fold can be performed at runtime with std::accumulate(). Then he gave some theory and showed the syntax of other languages such as Haskell, Python and Scala. C++17 fold expressions don’t just add syntactic sugar, they open up a load of new possibilities. With constexpr functions, folds can be evaluated at compile time. As a consequence they can operate not only on values, but even on types. The speaker shared with us how he broke his personal error message record: during his experiments he got an error a quarter of a million lines long!
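For illustration (my own minimal example, not one from the talk), a C++17 fold expression collapses a parameter pack over an operator, and a constexpr function lets the whole fold happen at compile time:

#include <iostream>

// Sum an arbitrary number of arguments with a C++17 unary left fold.
// Being constexpr, the result can be computed entirely at compile time.
template<typename... Ts>
constexpr auto sum(Ts... args)
{
    return (... + args);
}

int main()
{
    static_assert(sum(1, 2, 3, 4) == 10, "folded at compile time");
    std::cout << sum(1.5, 2.5, 3.0) << '\n'; // also works at runtime
    return 0;
}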

Implementing a web game in C++14 by Kris Jusiak

In this talk we witnessed how a relatively simple game can be implemented with the help of three libraries: ranges, dependency injection and a state machine. The code was all pure C++14 and was compiled to asm.js and/or WebAssembly using Emscripten. The result was a static website that runs the game very efficiently in the browser. We were walked through the different parts of the implementation. In contrast to a naive imperative approach, and after the initial learning curve, this style can be maintained and extended much more easily.

Learn Robotics with C++ in 1 hour by Jackie Kay

We didn’t actually learn how to program robots. First she walked us through some history of robotics. By highlighting some of the major challenges, she explained different solutions and how they evolved over time. Because robots run in a real-time environment and have lots of data to process, performance is crucial. In the past the problems were solved more analytically, while nowadays the focus is on deep learning with neural networks. She put a strong emphasis on libraries that are used in robotics. To my surprise, I knew and had used most of them, even the ones she introduced as lesser known, such as dlib.

Nerd Party

In the evening there was free beer in the big underground hall. There was no music, so that people could talk. Not really how you would usually imagine a party. We had a look at the different sponsor booths and watched some product demos. After a while we went up to the sky lounge on the 14th floor, with a marvelous view over the city.

SYCL building blocks for C++ libraries by Gordon Brown

Even though I experimented with heterogeneous parallel computing a few years ago, I was not really aware of what is in the works with SYCL. My earlier experiments were with OpenCL and CUDA. They were cool, but left a lot to be desired. I never looked into OpenAMP despite the improved syntax. In contrast, SYCL seems to do it right on all fronts. I hope this brings GPGPU within reach, so that I can use it in my day-to-day work some time. In the talk he showed the general architecture and how the pipelines work. Rather than defining execution barriers and scheduling the work yourself, you define work groups and their dependencies. SYCL then figures out how to best arrange and schedule the different tasks onto the different cores. Finally he talked about higher level libraries into which SYCL is being integrated: the parallel STL algorithms, TensorFlow and computer vision.
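To give an idea of what that looks like, here is a minimal vector addition I sketched from the SYCL 1.2 examples (not code from the talk); the runtime derives the task graph from the accessors you request, so there are no explicit barriers:

#include <CL/sycl.hpp>
#include <vector>
#include <iostream>

int main()
{
    const std::size_t n = 1024;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    {   // end of scope: buffers synchronize back to the host vectors
        cl::sycl::queue queue; // default device selector
        cl::sycl::buffer<float, 1> bufA(a.data(), cl::sycl::range<1>(n));
        cl::sycl::buffer<float, 1> bufB(b.data(), cl::sycl::range<1>(n));
        cl::sycl::buffer<float, 1> bufC(c.data(), cl::sycl::range<1>(n));

        queue.submit([&](cl::sycl::handler& cgh) {
            // the requested access modes tell the runtime the dependencies
            auto A = bufA.get_access<cl::sycl::access::mode::read>(cgh);
            auto B = bufB.get_access<cl::sycl::access::mode::read>(cgh);
            auto C = bufC.get_access<cl::sycl::access::mode::write>(cgh);
            cgh.parallel_for<class vector_add>(cl::sycl::range<1>(n),
                [=](cl::sycl::id<1> i) { C[i] = A[i] + B[i]; });
        });
    }

    std::cout << c[0] << std::endl; // prints 3
    return 0;
}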

Clang Static Analysis by Gabor Horvath

During this talk we learned how static analyzers find potential problems in the code to warn developers about, starting with simple semantic searches and moving on to path tracing with and without branch merging. The bottom line was that there is no one tool to beat them all; the more tools you use, the better, because they all work differently and each one can find different problems.
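A contrived example of the kind of bug path tracing catches (my own illustration, not code from the talk): the dereference below is only reachable on the branch where the pointer was set to null, which a simple line-by-line check tends to miss.

#include <cstdio>

// The pointer is only set to null on the 'flag == true' path, and only that
// same path dereferences it. A path-sensitive analyzer follows each path
// separately and reports the null dereference; merging the branches would
// hide it.
int describe(int* value, bool flag)
{
    int* p = value;
    if (flag)
        p = nullptr;

    int result = 0;
    if (flag)
        result = *p;        // null dereference, reached only when flag is true
    else
        result = *p + 1;
    return result;
}

int main()
{
    int x = 42;
    std::printf("%d\n", describe(&x, false));
    return 0;
}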

Computer Architecture, C++, and High Performance by Matt P. Dziubinski

This talk made me realize how long ago it was that I learned about hardware architectures in school. Back in the day we mainly covered the simple theoretical model of how an ALU works. The talk made clear how you can boost performance by distributing work to the parallel ALUs that exist within every CPU core. In his example he doubled the performance simply by manually, partially unrolling a summation loop. Another important point to take home is the performance gap between the CPU and memory access; even for caches, it is widening with every new hardware generation. Traditional algorithm analysis considers floating point operations the expensive part, but meanwhile you can execute hundreds of them in the time it takes to resolve a single cache miss. On the one hand he showed techniques to better utilize the available hardware. On the other hand he demonstrated tools to measure different aspects, such as the usage of the parallel components within a core, or cache misses. With hardware this diverse it is really difficult to predict, so measuring is key.
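The unrolling example went roughly like this (my reconstruction from memory, so treat it as a sketch): two independent accumulators break the single dependency chain, so the additions can run in parallel on the execution units within one core.

#include <vector>
#include <cstddef>

// Naive summation: each addition depends on the previous one, so the loop
// is limited by the latency of a single dependency chain.
double sum_naive(const std::vector<double>& v)
{
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)
        s += v[i];
    return s;
}

// Partially unrolled with two independent accumulators: the two chains can
// execute in parallel on separate ALUs/execution ports of the same core.
double sum_unrolled(const std::vector<double>& v)
{
    double s0 = 0.0, s1 = 0.0;
    std::size_t i = 0;
    for (; i + 1 < v.size(); i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < v.size())
        s0 += v[i];
    return s0 + s1;
}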

Lightning talks

The short talks were of varying quality, but mostly funny. As with a good portion of the talks, there were technical difficulties connecting the notebooks to the projectors.

Closing keynote by Louis Dionne

C++ metaprogramming: evolution and future directions
We both didn’t know what to expect from this talk, but it proved to be one of the best of the conference. He started out by showing some template metaprogramming with boost::mpl, transitioned to boost::fusion, and landed at his hana library. The syntax for C++ TMP is generally considered insane, but with his hana library types are treated like values. This makes the compile-time code really readable and only distinguishable from runtime code at a second glance. True to the main C++ paradigm of zero overhead abstraction, he showcased an implementation of an event dispatcher that looks like runtime code using a map, but actually resolves at compile time to direct function calls. Really cool stuff: leveraging knowledge that is available at compile time, and using it at compile time. He even claimed that, in contrast to some other TMP techniques, compile times should not suffer as much with hana.
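To give a flavour of the style (a tiny example along the lines of the hana tutorial, not code from the keynote), types are wrapped as values and manipulated with ordinary-looking code:

#include <boost/hana.hpp>
namespace hana = boost::hana;

int main()
{
    // Types as values in a plain tuple.
    auto types = hana::make_tuple(hana::type_c<int>, hana::type_c<char*>, hana::type_c<double>);

    // Remove the pointer types; the whole computation happens at compile time.
    auto no_pointers = hana::remove_if(types, [](auto t) {
        return hana::traits::is_pointer(t);
    });

    static_assert(no_pointers == hana::make_tuple(hana::type_c<int>, hana::type_c<double>), "");
    return 0;
}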

Conclusions

C++ is fancy again!
I have been programming professionally for about 17 years. In all this time C++ has been my primary language. Not only that, it has also always been my preferred language. But there were times when it seemed to be stagnating. Other languages had fancy new features and claimed to catch up with C++ performance. But experience showed that none ever managed to run as fast as C++ or to produce such a small footprint. The fancy features proved either not as useful as they first appeared, or they are now being added to C++ as well. In retrospect it seems to have been the right choice to resist the urge to add a garbage collector. It’s better to produce no garbage in the first place. RAII turns out to be the better idiom, as it can be applied to all sorts of resources, not only memory. The pace with which the language improves is only accelerating.
Yes, there is old ugly code that uses dangerous features. That is how the language evolved, and we can’t get rid of it. But with tools like the GSL and static analyzers we can still improve the safety of legacy code bases.
Exciting times!

Code coverage for C++

Ever since I started writing automated tests, I have wondered how complete the coverage was. Of course you have a feeling for which parts are better covered than others. For some legacy code you might prefer not to know at all. I used to think test coverage was something easy to do for a language running on a VM, such as Java, but hard for C++. Some things are not as hard as you think, once you give them a try.

The thing that triggered my interest was the coveralls badge on the readme page of vexcl. By following it through, I learned that coveralls just presents the results that are generated by gcov. Some more research showed which compiler and linker flags I need to use. In addition I found out that lcov’s genhtml can generate nice human-readable HTML reports, while gcovr writes machine-readable XML reports. So the following is really all that needs to be added to your CMakeLists.txt:

OPTION(CODE_COVERAGE       "Generate code coverage reports using gcov" OFF)

IF(CODE_COVERAGE)
    SET(CMAKE_C_FLAGS          "${CMAKE_C_FLAGS} -fprofile-arcs -ftest-coverage")
    SET(CMAKE_CXX_FLAGS        "${CMAKE_CXX_FLAGS} -fprofile-arcs -ftest-coverage")
    SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fprofile-arcs -ftest-coverage")

    FILE(WRITE  ${PROJECT_BINARY_DIR}/coverage.sh "#! /bin/sh\n")
    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh
        "lcov --zerocounters --directory . --base-directory ${MyApp_MAIN_DIR}\n")
    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh
        "lcov --capture --initial --directory . --base-directory ${MyApp_MAIN_DIR} --no-external --output-file MyAppCoverage\n")
    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh "make test\n")
    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh
        "lcov --no-checksum --directory . --base-directory ${MyApp_MAIN_DIR} --no-external --capture --output-file MyAppCoverage.info\n")
    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh
        "lcov --remove MyAppCoverage.info '*/UnitTests/*' '*/modassert/*' -o MyAppCoverage_filtered.info\n")
    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh
        "genhtml MyAppCoverage_filtered.info\n")

    FILE(APPEND ${PROJECT_BINARY_DIR}/coverage.sh
        "gcovr -o coverage_summary.xml -r ${MyApp_MAIN_DIR} -e '/usr.*' -e '.*/UnitTests/.*' -e '.*/modassert/.*' -x --xml-pretty\n")

    ADD_CUSTOM_TARGET(CODE_COVERAGE bash ${PROJECT_BINARY_DIR}/coverage.sh
                        WORKING_DIRECTORY ${PROJECT_BINARY_DIR}
                        COMMENT "run the unit tests with code coverage and produce an index.html report"
                        SOURCES  ${PROJECT_BINARY_DIR}/coverage.sh)
    SET_TARGET_PROPERTIES(CODE_COVERAGE PROPERTIES
        FOLDER "Testing"
    )

ENDIF(CODE_COVERAGE)

The resulting HTML page is very detailed and shows the untested lines in your source files in red.
From the produced XML file it’s easy to extract the overall percentage, for example. You could use this figure to fail your nightly builds when it decreases.

revisiting enable_if

It was roughly 2008 when I wanted to make a template function for serialization available only to container types. Template stuff can get complicated at times, and from reading the documentation, boost::enable_if seemed to be just what I needed. I didn’t get it to work, and I blamed Microsoft Visual Studio 2005 for not being standards-compliant enough. Somehow I remembered enable_if as being difficult and hard to get working, despite being highly desirable if it did work. I ended up providing explicit template overloads for all the supported container types.

Fast forward five years: enable_if made it into the C++11 standard, and I didn’t even notice until reading “The C++ Programming Language” by Bjarne Stroustrup. In the book the facility is presented as a concise template that is easy to use and even to implement. To understand its value, let’s start with an example. Suppose I want to implement a template function to stream the contents of containers to stdout.

#include <iostream>
#include <vector>
#include <list>

template<class ContainerT, class StreamT>
StreamT& operator<<(StreamT& strm, const ContainerT& cont)
{
    strm << '{';
    for (const auto& element : cont)
        strm << element << " ";
    strm << "} ";
    return strm;
}

int main()
{
    std::vector<int> ints{8, 45, 87, 90, 99999};
    std::list<float> floats{3.14159f, 2.71828f, 0.57721f, 1.618033f};
    std::cout << ints << floats;

    return 0;
}

So far so good, this does the trick, and the output is just what we expected: {8 45 87 90 99999 } {3.14159 2.71828 0.57721 1.61803 } But now we also want to write an output stream operator for some user defined interface type. Continue reading “revisiting enable_if”
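To sketch the direction this goes in (my own illustration, not necessarily the solution from the full post): std::enable_if can remove the greedy operator from the overload set for anything that doesn’t look like a container, here simply anything without a nested const_iterator.

#include <type_traits>

// If ContainerT has no nested const_iterator, substitution of the default
// template argument fails and this overload silently disappears, so it no
// longer collides with operators for other user defined types.
template<class ContainerT, class StreamT,
         class = typename std::enable_if<
             !std::is_void<typename ContainerT::const_iterator>::value>::type>
StreamT& operator<<(StreamT& strm, const ContainerT& cont)
{
    strm << '{';
    for (const auto& element : cont)
        strm << element << " ";
    strm << "} ";
    return strm;
}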

Adding a display to rfid time tracking

More than a year ago, I blogged here about using RFID to track presence times in the BORM ERP system. I have used the system a lot since then. But the BlinkM was really limited as the only immediate feedback channel; to use the system with multiple users, a display was needed. The usual Arduino-compatible displays seemed a bit overpriced, and the Nokia phone that I disassembled didn’t have the same display as the one I used for the spectrum analyzer. But these displays are available for a bargain from China. The only problem was that the Bifferboard didn’t have enough GPIO pins available to drive the “SPI plus extras” interface. But i2c was already configured for the BlinkM.

So the most obvious solution was to use an ATmega8 as an intermediary. I defined a simple protocol and implemented it over i2c and uart on the AVR. I also wrote a small Python class to interface with it from the client side. As I buffer only one complete command, I had to add some delays in the Python script to make sure the AVR can complete each command before the next one arrives. Apart from that, it all worked well when testing on an Alix or a RaspberryPi. But i2c communication refused to work entirely when testing with the Bifferboard; not even i2cdetect could locate the device. That was bad, since I wanted to use it with the Bifferboard, and the other two were only for testing during development.

I checked with the oscilloscope and found out that the i2c clock on the Bifferboard runs at only 33kHz, while the other two run at the standard 100kHz. So I tried to adjust the i2c clock settings on the AVR, as well as different options with the external oscillators and clock settings, but I was still out of luck. Then I replaced the ATmega8 with an ATmega168 and it immediately worked. Next I tried another ATmega8, and this one also worked with the Bifferboard. I switched back and forth and re-flashed them with the exact same settings. Still, one of them worked with all tested Linux devices, while the other one refused to work with the Bifferboard. So I concluded that one of these cheap AVRs from China must be flaky, and I just used the other one. Seems like that’s what you get for one sixth of the price you pay for these chips in Switzerland.

Apart from the display, I also added an RGB LED that behaves like the BlinkM did before, and on top of that a small piezo buzzer. But since I could hardly hear its sound when driven with 3.3V, I didn’t bother re-soldering it when it fell off.

Now, my co-workers also started logging their times with RFID.

The code is still on GitHub.

accelerated ray tracer

In all the great online classes I attended over the last year, there was one topic missing. Finally I found an offering for a computer graphics class. After all, that’s the field I’ve been working in for the last five and a half years. The class is offered at edx.org and comes from Berkeley. It’s the first class I’m taking on edX, and the style is comparable to Coursera and Udacity.

The first part of the class was concerned with OpenGL, and we implemented an interactive scene viewer. Although I hadn’t worked directly with regular OpenGL before, only with WebGL which is based on OpenGL ES, it was mostly repetition. Nonetheless it was good training for working with homogeneous coordinates and matrices with different orderings. For grading, we had to produce 12 screenshots of the same scene with different transformations. Once the viewer was implemented, I only had to change the order of some transformations to get all the images right.

The second part was concerned with ray tracing. Even though I was familiar with the basic concept, working with it was new to me, and in the class we had to build a ray tracer from scratch. The theory sounded straightforward, but somehow I was not so lucky in implementing it: in every new part I made some silly mistake. I didn’t develop it in an exemplary test-driven fashion, but I wrote unit tests for every key part that I wanted to verify. With those in place I could usually find and correct the problems in time. For grading, we had to produce seven images. Continue reading “accelerated ray tracer”

RaspberryPi reading analog input using an AtTiny through i2c

The RaspberryPi has some GPIO (General Purpose Input Output) pins. That’s great for experimenting with electronics, for example sensors and actuators. It’s totally different from an Arduino in many respects, but that’s something they have in common. Some of the pins have special functions, for example SPI, I2C and UART.

There is a breadboard adapter for all the GPIO pins with a ribbon cable that you can order from the US. That’s cool, but ordering stuff from abroad can be expensive. And the pins look somewhat like good old IDE. So I soldered an adapter myself and bought an IDE cable. Well, some pins worked, and some didn’t… Enough for the first round of experimenting, but it took a while to find out what was going on: I had just assumed that all the wires of the IDE cable were connected, which for some reason was not the case.

But something is missing that the Arduino offers: analog inputs. Before I really needed analog sensing capabilities, I found an article describing a hack to read analog input by measuring the time it takes to discharge a capacitor through the resistance you want to measure. Immediately I tried it myself with a photoresistor. The author warned that the timings with the Python script are not really accurate, and that the correct values for the components would have to be calculated. The values I got fluctuated wildly, and I couldn’t really see a difference with the brightness in all that noise.
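The RC timing loop looks roughly like this (my own C++ rendering using the wiringPi library; the original article used a Python script, and the pin number here is made up):

#include <wiringPi.h>
#include <iostream>

const int kSensePin = 0; // wiringPi pin number, an assumption for illustration

// Drain the RC network, then count how long the pin takes to read high again
// once it charges through the resistance being measured. Counting in a
// userspace loop is exactly why the readings are so noisy.
long readRcTime(int pin)
{
    pinMode(pin, OUTPUT);
    digitalWrite(pin, LOW);   // reset the capacitor
    delay(50);

    long count = 0;
    pinMode(pin, INPUT);      // let it charge through the resistor
    while (digitalRead(pin) == LOW)
        ++count;
    return count;
}

int main()
{
    wiringPiSetup();
    while (true) {
        std::cout << readRcTime(kSensePin) << std::endl;
        delay(500);
    }
    return 0;
}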

So I looked for something more accurate. I still have some ATtinys, and they have analog inputs. But SPI is the only means of communication they support in hardware. Last week I implemented UART receiving capabilities in software, but this time I was looking for i2c. Continue reading “RaspberryPi reading analog input using an AtTiny through i2c”

cmake with MSVC

I have used CMake for a couple of years in my hobby projects, and I love it. It is a cross-platform meta build system. As with Qt, people tend to think at first that “cross platform” is the main feature. But as with Qt, it’s actually one great feature amongst many others. It brings so many advantages that I can’t even list them all here. Since last week, we also use it for PointLine at work. While the process is straightforward on Linux, there are some things worth mentioning when using it on Windows.

Finding external libraries

CMake has lots of finder scripts for commonly used libraries, and they work great in most cases. But we want to have multiple versions of the same libraries side by side, and depending on the version of PointLine we develop for, use the appropriate versions of the libraries. To be precise, not just the libraries, but also the headers and debug symbols need to be present in different versions. And we want to be able to debug different versions of our product, using different versions of the libraries, simultaneously on the same machine. Continue reading “cmake with MSVC”

Optimizing compile time of a large C++ project

The codebase of our PointLine CAD is certainly quite large. sloccount counted roughly 770’000 lines of C++ code. I know this is not a very good metric to describe a project, but it gives an idea. Over time the compile time steadily increased. Of course we also added a lot of new stuff to the product, and we used advanced techniques that reduce the risk of bugs but have to be paid for with compile time. But still, the increase was disproportionate. We mitigated it by using IncrediBuild. Just like distcc, it distributes the compilation load across different machines on the LAN. If I’m lucky, I get about 20 cores compiling for me.

About once a year, one of us does some compile time optimization and tunes the precompiled headers. I did so about three years ago, and this week it was my turn again. Reading what I could find about precompiled headers on the internet and applying it, I only got a small speedup of roughly 10%. So I cleaned up the physical structure of the codebase. Here are some of the things I did: Continue reading “Optimizing compile time of a large C++ project”

OpenCL First Steps

There is increasing buzz about GPGPU computing and how much faster it is than CPUs, even parallel ones. If you haven’t heard about all that: GPGPU is about using the computer’s graphics card(s) to do general purpose computations. The key to the performance lies in the parallel architecture of these devices. From what I read, an average graphics card has 64 parallel units, though they are not as versatile as the cores of a CPU, of which a typical PC these days has 4. That means that if the task is well suited, the GPU can boost performance significantly, but if not, it’s nothing more than a lot of wasted work.

So I wanted to see for myself. To get started I read the book “OpenCL Programming Guide”. It gave a good overview. But now it was time to give it a try.

Continue reading “OpenCL First Steps”

packaging libboost compiled with llvm clang

I have read many articles and posts over the last year or so about how great LLVM Clang is. On one side it is said to have a static checker that makes lint redundant, and on the other side the optimizer has an -O4 setting, where -O3 is already supposed to be comparable to other optimizers. On top of that, compilation speed is said to be really fast. And the part that makes it interesting for folks like Apple (who use and contribute to it) is that it’s licensed under a BSD-style license. What more could you want?

Continue reading “packaging libboost compiled with llvm clang”