Cork – A High Performance Library for Geometric Boolean/CSG Operations

Gilbert Bernstein is currently a Ph.D. student at Stanford and has published some remarkable papers on computational geometry.  I was first drawn to his work by his 2009 paper on Fast, Exact, Linear Booleans as my interest in 3D printing led me to create some tooling of my own.  The various libraries I found online for performing Constructive Solid Geometry (CSG) operations were certainly good but, overall, very slow.  CGAL is one library I had worked with and I found that the time required for operations on even moderately complex meshes was quite long.  CGAL’s numeric precision and stability are impeccable and its 3D CSG operations are based on 3D Nef Polyhedra, but I found myself waiting quite a while for results.

I exchanged a couple of emails with Gilbert and he pointed me to a new library he had published, Cork.  One challenge with the models he used in his Fast, Exact paper is that the internal representation of 3D meshes was not all that compatible with other toolsets.  Though the boolean operations were fast, using those algorithms imposed a conversion overhead and eliminated the ability to take other algorithms developed on standard 3D mesh representations and use them directly on the internal data structures.  Cork is fast but uses a ‘standard’ internal representation of 3D triangulated meshes, a win-win proposition.

I’ve always been one to tinker with code, so I forked Gilbert’s code to play with it.  I spent a fair amount of time working with the code and I don’t believe I found any defects, but I did find a few ways to tune it and bump up the performance.  I also took a swag at parallelizing sections of the code to further reduce the wall clock time required for an operation, though with limited success.  I believe the main problem I ran into is related to cache invalidation within the x86 CPU.  I managed to split several of the most computationally intensive sections into multiple threads of execution – but the performance almost always dropped as a result.  I am not completely finished working on threading the library; I may write a post later on what I believe I have seen and how to approach parallelizing algorithms like Cork on current generation CPUs.

Building the Library

My fork of Cork can be found here: https://github.com/stephanfr/Cork.  At present, it only builds on MS Windows with MS Visual Studio; I use VS 2013 Community Edition.  There are dependencies on the Boost C++ libraries, the MPIR library, and Intel’s Threading Building Blocks library.  There are multiple build targets, both for Win32 and x64 as well as for Float, Double and SSE arithmetic based builds.  In general, the Win32 Float-SSE builds will be the quickest but will occasionally fail due to numeric overflow or underflow.  The Double x64 builds are 10 to 20% slower but seem solid as a rock numerically.  An ‘Environment.props’ file exists at the top level of the project and contains a set of macros pointing to dependencies.

The library builds as a DLL.  The external interface is straightforward: the only header to include is ‘cork.h’, which in turn includes a few other files.  In a later post I will discuss using the library in a bit more detail but a good example of how to use it may be found in the ‘RegressionTest.cpp’ file in the ‘RegressionTest’ project.  At present, the library can only read meshes in the ‘OFF’ file format.
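
For readers who have not met OFF before, the format is simple enough to parse by hand.  The sketch below is not Cork’s own loader; it is a minimal standalone reader that assumes a plain ASCII file with the ‘OFF’ header, a counts line, then vertex coordinates and triangular faces, with no comment lines or per-face colors.  The names ‘OffMesh’, ‘readOff’ and ‘readOffFromString’ are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for a triangulated mesh (not a Cork type).
struct OffMesh
{
    std::vector<std::array<double, 3>>      vertices;
    std::vector<std::array<std::size_t, 3>> triangles;
};

// Reads a plain ASCII OFF stream.  Assumes no comment lines and triangles only.
OffMesh readOff( std::istream& in )
{
    std::string header;
    if( !( in >> header ) || header != "OFF" )
        throw std::runtime_error( "missing OFF header" );

    // The counts line holds the number of vertices, faces and edges.
    std::size_t numVertices = 0, numFaces = 0, numEdges = 0;
    if( !( in >> numVertices >> numFaces >> numEdges ) )
        throw std::runtime_error( "bad counts line" );

    OffMesh mesh;
    mesh.vertices.resize( numVertices );
    for( auto& v : mesh.vertices )
        if( !( in >> v[0] >> v[1] >> v[2] ) )
            throw std::runtime_error( "bad vertex record" );

    mesh.triangles.resize( numFaces );
    for( auto& t : mesh.triangles )
    {
        // Each face record starts with its corner count, followed by vertex indices.
        std::size_t corners = 0;
        if( !( in >> corners >> t[0] >> t[1] >> t[2] ) || corners != 3 )
            throw std::runtime_error( "only triangular faces are supported" );
        for( std::size_t index : t )
            if( index >= numVertices )
                throw std::runtime_error( "vertex index out of range" );
    }
    return mesh;
}

// Convenience wrapper for parsing a mesh held in memory.
OffMesh readOffFromString( const std::string& text )
{
    std::istringstream in( text );
    return readOff( in );
}
```

Passing the text of a four-vertex, four-face tetrahedron to ‘readOffFromString’ yields a mesh with four entries in both ‘vertices’ and ‘triangles’; anything that is not a triangle is rejected with an exception rather than silently skipped.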

There is no reason why the library should not build on Linux platforms; the dependencies are all cross platform and the code itself is pretty vanilla C++.  There may be some issues compiling the SSE code in gcc, but the non-SSE builds should be syntactically fine.

Sample Meshes

I have a collection of sample OFF file meshes in the Github repository:  https://github.com/stephanfr/SolidModelRepository.  The regression test program loads a directory and performs all four boolean operations between each pair of meshes in the directory – writing the result mesh to a separate directory.

These sample meshes range from small and simple to large and very complex.  For the 32-bit builds, the library is limited to one million points and some of the samples, when meshed together, will exceed that limit.  Also, for Float precision builds, there will be numeric overflows or underflows whereas for x64 Double precision builds, all operations across all meshes should complete successfully.
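
As a generic illustration of why single precision is more fragile (this is plain IEEE 754 behavior, not code taken from Cork), the product of two small magnitudes can underflow to exactly zero in float while remaining representable in double.  The helper name ‘productVanishes’ is hypothetical.

```cpp
// Returns true when the product of two non-zero values rounds to exactly zero,
// i.e. the result has underflowed in the arithmetic type T.
template <typename T>
bool productVanishes( T a, T b )
{
    // volatile forces the product to be stored (and rounded) as a T.
    volatile T product = a * b;

    return ( a != T( 0 ) ) && ( b != T( 0 ) ) && ( product == T( 0 ) );
}
```

With these definitions, ‘productVanishes<float>( 1e-30f, 1e-30f )’ is true because 1e-60 is below the smallest float, while the same call in double is false.  Geometric predicates multiply coordinate differences together, so this kind of loss is one plausible way a Float build can fail where a Double build succeeds.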

When Errors (Inevitably) Occur

I have tried to catch error conditions and return those in result objects with some descriptive text; the library should not crash.  The code is very sensitive to non-manifold meshes.  The algorithms assume both meshes are 2-manifold.  Given the way the optimizations work, a mesh may be self-intersecting but if the self-intersection is in a part of the model that does not intersect with the second model, the operation may run to completion perfectly.  A different mesh may intersect spatially with the self-intersection and trigger an error.

Meshes randomly chosen from the internet are (in my experience) typically not 2-manifold.  I spent a fair amount of time cleaning up the meshes in the sample repository.  If you have a couple of meshes and they do not want to union – use a separate program like MeshLab to look over the meshes and double check that they are both in fact 2-manifold.
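
A quick first check can also be scripted.  The sketch below is an assumption-laden illustration rather than what Cork or MeshLab actually do: it tests one necessary condition for a closed 2-manifold triangle mesh, namely that every undirected edge is shared by exactly two triangles.  It does not detect self-intersections or vertices where two surfaces pinch together, and the names ‘Triangle’ and ‘everyEdgeHasTwoFaces’ are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// A triangle as three indices into some vertex array.
using Triangle = std::array<std::size_t, 3>;

// Necessary (not sufficient) test for a closed 2-manifold triangle mesh:
// every undirected edge must be used by exactly two triangles.
bool everyEdgeHasTwoFaces( const std::vector<Triangle>& triangles )
{
    std::map<std::pair<std::size_t, std::size_t>, int> edgeUse;

    for( const Triangle& t : triangles )
    {
        for( int i = 0; i < 3; ++i )
        {
            const std::size_t a = t[i];
            const std::size_t b = t[( i + 1 ) % 3];

            if( a == b )
                return false;   // degenerate triangle

            // Order the endpoints so (a,b) and (b,a) count as the same edge.
            const std::pair<std::size_t, std::size_t> edge( std::min( a, b ), std::max( a, b ) );
            ++edgeUse[edge];
        }
    }

    for( const auto& entry : edgeUse )
        if( entry.second != 2 )
            return false;       // boundary edge (1) or non-manifold edge (3 or more)

    return true;
}
```

A tetrahedron passes this test; remove any one of its faces and it fails, because three edges are then used only once.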

Conclusion

If you are interested in CSG and need a fast boolean library, give Cork a shot.  If you run into crashes or errors, let me know – the code looks stable for my datasets but they are far from exhaustive.


Metal Parts from 3D Prints

Introduction

Although 3D printing technology is advancing rapidly and home 3D printing is becoming both increasingly accessible and reliable, it will likely be a while before metal printing catches up with plastic Fused Filament Fabrication technology.  That said, there is a way to fabricate metal parts from some sets of 3D FFF designs.  In this blog post, I will describe a technique I have been using with some success to produce high quality metal parts from 3D prints.

Metal Clays and ‘Lost HIPS’ Molding

Metal clay is essentially a very low-tech approach to powder metallurgy.  Metal clays are a combination of atomized metal powder and organic, water soluble binders.  When soft, metal clay can be worked like a regular ceramic clay, dried to a hard yet brittle state and finally sintered in a conventional kiln to produce a solid metal piece.  The first metal clays were silver compounds but today metals such as bronze, brass, copper and steel are also readily available.

High Impact Polystyrene (HIPS) is a standard FFF filament, though definitely less popular than PLA or ABS.  My experience with HIPS as a general printing filament is quite good; it is easy to print with and can be printed very successfully on an Elmer’s glue coated, heated borosilicate glass plate.  An advantage of HIPS is that it is soluble in limonene, a solvent derived from citrus fruit rinds.  As organic solvents go, limonene is about as safe as you can find; it is even used medicinally for heartburn and GERD.  There is anecdotal evidence of it making a good margarita mixer…

Putting 3D printing with HIPS, metal clay, limonene as a solvent and finally kiln sintering together, we come up with the ‘Lost HIPS’ technique for creating metal parts from 3D prints.

Step 1 : Create a 3D Mold

The beauty and power of CAD/CAM is the ability to define, manipulate, visualize and refine 3D parts numerically prior to actually creating the physical part.  For our purposes, it is straightforward to take a 3D object definition in an STL file and create a mold for that part by performing a boolean difference operation between a rectangular cuboid (i.e. a block) and the 3D part.  For this post, I used the ‘Sun Medallion’ design I found on Thingiverse.  I then used OpenSCAD to create the cuboid and perform the boolean difference to create a mold of the medallion.  I tweaked the mold along the way to strengthen the connections of the arms to the central solar disk but the design is still quite obviously that of Hank Dietz.

When printing a mold, try to find a good middle ground between a mold that is physically strong enough to work with but contains a minimum of HIPS material.  In ‘Lost HIPS’, all the mold material has to be dissolved by the limonene – so less is definitely more.

Sun Medallion mold printed in HIPS on a MakerGear M2 Printer.

HIPS is soluble in acetone as well, which means it can also be vapor polished in the same way as ABS.  I find vapor polishing to be helpful in smoothing the surface of the mold and sealing up any small holes or creases that may be left in the mold after printing – particularly when printing with thin layers.

Step 2: Fill the Mold with Metal Clay

At present, I am using FastFire Bronze Clay as it is relatively cheap (~$200/kg) and easy to work with, though I have found it very sensitive to sintering temperatures.  I have also worked with PMC+, which is easier to work with and very forgiving with respect to sintering temperatures, but it is expensive (~$1500/kg).

When filling the mold, I have had the best luck painting the mold with water containing just a tiny amount of dish soap.  The water will cause the clay to form a thinner slurry next to the mold (much like ‘slip’) and the detergent acts as a surfactant to help ensure the slurry covers the entire base of the mold.  NB – do not use much water/detergent solution in the mold, as making the clay runny has led to poor results for me.  I just paint the surface of the mold with a brush and that is it.  I usually put down a first layer of clay with an emphasis on ensuring all the corners, nooks and crannies are filled and then fill the rest of the mold.  I use an old credit card to scrape off excess clay.

Once the mold is filled, I let it stand for a day to dry and sand the whole thing with a 200 grit sanding sponge to remove any excess clay.  It doesn’t take much sanding to get to a point where the finer features of the mold are visible again.  Finally, I wanted to make the sun medallion into necklaces for my daughters, so I added a loop to the back of the piece.  To make the loop, I used three pieces of HIPS together and placed a bit more clay over the HIPS and onto the back of the medallion.

Sun Medallion mold after drying, sanding to remove excess clay and the addition of the necklace loop.

Step 3: Lose the HIPS

Once the clay is dry, place the mold into a container of limonene and let the solvent do its work.  I use a glass container with a fluorinated plastic lid that I found at Bed, Bath and Beyond (don’t forget your coupons).  Limonene is a solvent and will attack non-fluorinated plastics, though plastic gas cans and many consumer plastics are fluorinated these days.  The more HIPS in the mold, the longer it will take to remove the material, so expect anywhere from overnight to a couple of days to get all the HIPS removed.  Fortunately, the metal clay does not appear to be nearly as sensitive to limonene as the HIPS is, so a couple of days in a limonene bath does not appear to affect the clay.

Once the bulk of the HIPS is gone, I soak the piece in fresh limonene for a couple of hours to get rid of the rest of the mold material and then dunk it in acetone for a minute or two.  The acetone serves two purposes.  First, it removes any gooey HIPS/limonene emulsion from the surface of the piece and second, it is a drying agent, so after just a few minutes in air the piece is dry and can be worked a bit before sintering.

The bronze metal clay Sun Medallion after HIPS removal.  Note a bit of stringy HIPS material on the mesh holder and some HIPS left around the crease between the central disk and the sun arms.  This extra HIPS on the piece will burn off in the kiln.

Step 4: Make Repairs to the Clay Piece before Sintering

In its current state, the dried clay can be worked just as green clay can be worked.  I will typically sand off visible printing artifacts (i.e. steps between layers), fill any voids with fresh clay and file off any excess material from the piece.  At this point it is much easier to add or remove clay material than it is after sintering.  I also find it helpful to use the water/detergent solution again to paint the surface a couple of times to get a smoother finish.  The metal clay will absolutely reproduce every detail in the printed mold, so if you want a smoother aesthetic look – now is the time to take off the rough edges.

Step 5: Burn out HIPS and Binder then Sinter

Once you are happy with the appearance of the piece, it is time to sinter.  I have a Paragon Caldera kiln which I love.  I did not get the digitally controlled version, though I would strongly suggest it for anyone looking to purchase a kiln.  I find the difference between a beautiful finished piece and an under-fired or over-fired piece to be just tens of degrees F.  Thus I end up having to watch my kiln closely as it finishes its ramp to ensure it gets into the right temperature range and holds that range long enough to fully sinter the piece.

Pretty much any material other than silver needs to be fired in an anoxic (i.e. oxygen-free) environment.  For firing metal clays, someone far more clever than I figured out that one could easily create a locally oxygen-free environment by burying the piece in carbon granules during firing.  This process works spectacularly well.  I will not go into the details here; there are plenty of references online.

I use a ceramic container for firing.  Firing in a stainless steel vessel leaves lots of black oxide in my kiln whereas the ceramic fiber pot leaves no residue whatsoever.  Having sintered with both, I also expect the pot to outlast a stainless vessel.

I typically rest the piece on a piece of fiber kiln paper and then put the piece on the paper into the container filled with an inch or two of acid washed carbon granules.  I do not cover the piece but ramp my kiln to 400F and leave it there for an hour to burn off any remaining HIPS and the binder in the metal clay.

The cleaned up Sun Medallion on a piece of kiln paper.

The bronze clay Sun Medallion and kiln paper on a bed of carbon in the firing vessel.

After burning off any organic compounds left on the piece, I put another piece of kiln paper over the top of the piece and fill the container with carbon granules to within an inch of the top.  I put the lid on the container and ramp my kiln to 1450F and leave it there for an hour.  I then turn the kiln off and crack the lid to cool the piece quickly.
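
For readers whose kiln controllers read in Celsius, the two hold temperatures in this step convert to roughly 204 C and 788 C using C = (F - 32) × 5/9.  The small sketch below simply records the schedule described in this step and applies that conversion; the ‘FiringStage’ struct and the variable names are hypothetical, and the right temperatures for any other clay should come from its manufacturer.

```cpp
// Standard Fahrenheit-to-Celsius conversion: C = (F - 32) * 5 / 9.
double fahrenheitToCelsius( double degreesF )
{
    return ( degreesF - 32.0 ) * 5.0 / 9.0;
}

// One hold in the firing schedule (hypothetical struct, for illustration only).
struct FiringStage
{
    const char* purpose;
    double      holdTemperatureF;
    double      holdHours;
};

// The two stages described in this step, in the order they are run.
const FiringStage firingSchedule[] = {
    { "burn out remaining HIPS and binder (piece uncovered)", 400.0, 1.0 },
    { "sinter (piece buried in carbon, lid on)", 1450.0, 1.0 },
};
```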

Step 6: Cleaning the Piece

It can take several hours for everything to cool to a point where it can be touched.  In particular, the vessel and carbon will hold heat well.

Once everything has cooled, I remove the piece from the carbon granules and clean it.  I use a Dremel tool with a wire brush to take the black scale off the surface of the piece and then use a bath of Picklean to remove the rest of the oxidation.  It may take a couple Picklean baths to really get the piece cleaned up but it is the only way I have found to get all the little details in the piece bright and shiny.

The finished product

Other Examples

I have created a number of other designs as well; below is a Tudor Rose extracted from a Thingiverse design.  What is interesting about this design is that there are regions in which upper layers overlap lower layers.  If your printer does a decent job of bridging and you can force the metal clay into the mold, you can get a fairly intricate design which would be hard to fabricate using other means – like straight up stamping.

A Tudor Rose in Silver and Bronze.  In this example, the bronze piece has been slightly overfired and lost some of the detail of the original.  In contrast, the silver has retained much of the original detail to the point where the individual printing layers are clearly visible.

Conclusion

Though there are a number of steps involved with this process, for folks with more engineering talent than artistic talent this provides a way to create some gorgeous pieces simply by ‘turning the crank’.  After a few practice runs, I have found the ‘Lost HIPS’ process to be fairly straightforward.

Next I will probably try to fabricate some structural pieces using steel clay.  I have tinkered with steel clay once early on and I expect to have similar success with that material as well.

Writing a CGAL Mesh to an STL File

Introduction

The CGAL library for computational geometry is truly a work of art.  It focuses on precision and accuracy above all else and yet manages to stay very flexible through perhaps the most comprehensive use of C++ metaprogramming that I have ever encountered.  CGAL deals efficiently and elegantly with rounding errors in IEEE floating point operations by escalating from IEEE floating point to exact numeric computation when rounding errors may occur.  The library is the product of 15+ years of development by some of the best computational geometry developers on the planet.

Using CGAL

CGAL is not the most accessible library to just pick up and use.  The template based generic programming paradigm can be difficult to wrap your head around at first but there are just enough examples to jump-start HelloWorld style apps.  Beyond that, the data structures are optimized for computational geometry not for obviousness.  Traversing the data structures requires a bit of study and thought to accomplish a task.

One of the more powerful aspects of the CGAL library is the Delaunay 3D Mesh Generator.  This generator can take a variety of geometric elements, such as polyhedrons, and generate a 3D triangulated mesh.  This operation is key to 3D printing as it is that triangulated mesh which is then sliced to create the layer-by-layer extrusion paths.  The OpenSCAD 3D CAD Modeller (http://www.openscad.org/) uses CGAL for binary polyhedral operations and mesh generation.

Writing a CGAL Mesh in STL file Format

There are not a lot of persistence formats supported in the CGAL library itself.  For 3D printing, the primary file format is arguably the STL (Standard Tessellation Language) format.  An STL file contains a list of triangular faces defined by 3 vertices and a normal to the facet.
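
To make the facet record concrete without any CGAL types, here is a minimal sketch that writes one triangle in the ASCII STL layout.  The normal is computed with the right-hand rule from the vertex order, so reversing two vertices flips it.  The names ‘Vec3’, ‘unitNormal’ and ‘writeFacet’ are illustrative assumptions and are not part of CGAL.

```cpp
#include <array>
#include <cmath>
#include <ostream>
#include <sstream>
#include <string>

using Vec3 = std::array<double, 3>;

// Unit normal of triangle (a, b, c) by the right-hand rule: normalize( (b - a) x (c - a) ).
Vec3 unitNormal( const Vec3& a, const Vec3& b, const Vec3& c )
{
    const Vec3 u = { { b[0] - a[0], b[1] - a[1], b[2] - a[2] } };
    const Vec3 v = { { c[0] - a[0], c[1] - a[1], c[2] - a[2] } };

    Vec3 n = { { u[1] * v[2] - u[2] * v[1],
                 u[2] * v[0] - u[0] * v[2],
                 u[0] * v[1] - u[1] * v[0] } };

    const double length = std::sqrt( n[0] * n[0] + n[1] * n[1] + n[2] * n[2] );

    if( length > 0.0 )      // leave a degenerate triangle's normal as (0, 0, 0)
        for( double& component : n )
            component /= length;

    return n;
}

// Writes a single "vertex x y z" line.
void writeVertex( std::ostream& out, const Vec3& p )
{
    out << "    vertex " << p[0] << ' ' << p[1] << ' ' << p[2] << '\n';
}

// Writes one facet record: the normal followed by the three vertices in order.
void writeFacet( std::ostream& out, const Vec3& a, const Vec3& b, const Vec3& c )
{
    const Vec3 n = unitNormal( a, b, c );

    out << "facet normal " << n[0] << ' ' << n[1] << ' ' << n[2] << '\n';
    out << "  outer loop\n";
    writeVertex( out, a );
    writeVertex( out, b );
    writeVertex( out, c );
    out << "  endloop\n";
    out << "endfacet\n";
}

// Convenience wrapper returning the facet record as a string.
std::string facetToString( const Vec3& a, const Vec3& b, const Vec3& c )
{
    std::ostringstream out;
    writeFacet( out, a, b, c );
    return out.str();
}
```

A complete file wraps any number of these records between a ‘solid name’ line and an ‘endsolid name’ line.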

Though the STL format is straightforward, it took a bit of poking around and experimentation to figure out how to traverse the mesh and order the vertices to ensure that the mesh is manifold.  Getting the various template arguments right was also an occasional issue.

The code below takes a CGAL Mesh_complex_3_in_triangulation_3 instance, a list of subdomains within the mesh complex and an open stream.  The template function writes the listed subdomains to the stream.  Each subdomain is written as a distinct solid in the STL file.  It compiles under VS2010 and newer g++ releases.


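#ifndef __MESH_TO_STL_H__
#define __MESH_TO_STL_H__

#include <CGAL/bounding_box.h>
#include <CGAL/number_utils.h>

// Standard headers for the std:: facilities used below
#include <array>
#include <iterator>
#include <list>
#include <ostream>
#include <string>
#include <utility>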

// A subdomain to write, paired with the label used for its STL solid

template <typename SubdomainIndex>
struct SubdomainRecord
{
    SubdomainRecord( const SubdomainIndex index,
                     const std::string label )
        : m_index( index ),
          m_label( label )
    {}

    SubdomainIndex m_index;
    std::string m_label;
};

template <typename SubdomainIndex>
class SubdomainList : public std::list<SubdomainRecord<SubdomainIndex> >
{};

//
// The TriangulationPointIterator and TriangulationPointList template classes
// ease the task of iterating over the points associated with vertices in the mesh.
//

template <typename Triangulation>
class TriangulationPointIterator : public std::iterator<std::forward_iterator_tag, typename Triangulation::Point>
{
    typename Triangulation::Finite_vertices_iterator m_currentLoc;

public:

    TriangulationPointIterator()
    {}

    TriangulationPointIterator( const typename Triangulation::Finite_vertices_iterator& vertIterator )
        : m_currentLoc( vertIterator )
    {}

    TriangulationPointIterator( const TriangulationPointIterator& mit )
        : m_currentLoc( mit.m_currentLoc )
    {}

    TriangulationPointIterator& operator++() { ++m_currentLoc; return *this; }
    TriangulationPointIterator operator++(int) { TriangulationPointIterator tmp(*this); operator++(); return tmp; }
    bool operator==( const TriangulationPointIterator& rhs ) const { return( m_currentLoc == rhs.m_currentLoc ); }
    bool operator!=( const TriangulationPointIterator& rhs ) const { return( m_currentLoc != rhs.m_currentLoc ); }
    typename Triangulation::Point& operator*() { return( m_currentLoc->point() ); }
};

template <typename Triangulation>
class TriangulationPointList
{
    // Note: this member holds a copy of the triangulation passed to the constructor
    const Triangulation m_triangulation;

public:

    TriangulationPointList( const Triangulation& triangulation )
        : m_triangulation( triangulation )
    {}

    TriangulationPointIterator<Triangulation> begin()
    {
        typename Triangulation::Finite_vertices_iterator beginningVertex = m_triangulation.finite_vertices_begin();

        return( TriangulationPointIterator<Triangulation>( beginningVertex ));
    }

    TriangulationPointIterator<Triangulation> end()
    {
        typename Triangulation::Finite_vertices_iterator endingVertex = m_triangulation.finite_vertices_end();

        return( TriangulationPointIterator<Triangulation>( endingVertex ));
    }

};

//
// This function writes the ASCII version of an STL file. Writing the binary version should be
// a straightforward modification of this code.
//

template <typename C3T3>
std::ostream&
output_boundary_of_c3t3_to_stl( const C3T3& c3t3,
                                const SubdomainList<typename C3T3::Subdomain_index>& subdomainsToWrite,
                                std::ostream& outputStream )
{
    typedef typename C3T3::Triangulation Triangulation;
    typedef typename Triangulation::Vertex_handle VertexHandle;
    typedef SubdomainList<typename C3T3::Subdomain_index> Subdomains;

    // This is an ugly path to the Kernel type but this works and is all compile time anyway

    typedef typename Triangulation::Geom_traits::Compute_squared_radius_3::To_exact::Source_kernel Kernel;

    // Get the bounding box for the mesh so we can offset it into the all positive octant

    TriangulationPointList<Triangulation> pointList( c3t3.triangulation() );

    typename Kernel::Iso_cuboid_3 boundingBox = CGAL::bounding_box( pointList.begin(), pointList.end() );

    typename Kernel::Vector_3 offset( 1 - boundingBox.xmin(), 1 - boundingBox.ymin(), 1 - boundingBox.zmin() );

    // Iterate over the facets in the mesh

    std::array<VertexHandle,3> vertices;

    for( typename Subdomains::const_iterator itrSubdomain = subdomainsToWrite.begin(); itrSubdomain != subdomainsToWrite.end(); itrSubdomain++ )
    {
        // Write the solid prologue to the stream

        outputStream << "solid " << itrSubdomain->m_label << std::endl;
        outputStream << std::scientific;

        for( typename C3T3::Facets_in_complex_iterator itrFacet = c3t3.facets_in_complex_begin(), end = c3t3.facets_in_complex_end(); itrFacet != end; ++itrFacet )
        {
            // Get the subdomain index for the cell and the opposite cell

            typename C3T3::Subdomain_index cell_sd = c3t3.subdomain_index( itrFacet->first );
            typename C3T3::Subdomain_index opp_sd = c3t3.subdomain_index( itrFacet->first->neighbor( itrFacet->second ));

            // Skip the facet unless at least one of its two cells is in the subdomain we are writing

            if(( cell_sd != itrSubdomain->m_index ) && ( opp_sd != itrSubdomain->m_index ))
            {
                continue;
            }

            // Get the vertices of the facet: the three cell vertices other than the one opposite the facet

            for( int j = 0, i = 0; i < 4; ++i )
            {
                if( i != itrFacet->second )
                {
                    vertices[j++] = (*itrFacet).first->vertex(i);
                }
            }

            // If the facet is not oriented properly, swap the first two vertices to flip it

            if(( cell_sd == itrSubdomain->m_index ) != ( itrFacet->second%2 == 1 ))
            {
                std::swap( vertices[0], vertices[1] );
            }

            // Get the unit normal so we can write it

            const typename Kernel::Vector_3 unit_normal = CGAL::unit_normal( vertices[0]->point(), vertices[1]->point(), vertices[2]->point() );

            // Write the facet record to the file

            outputStream << "facet normal " << unit_normal << std::endl;
            outputStream << "outer loop" << std::endl;
            outputStream << "vertex " << vertices[0]->point() + offset << std::endl;
            outputStream << "vertex " << vertices[1]->point() + offset << std::endl;
            outputStream << "vertex " << vertices[2]->point() + offset << std::endl;
            outputStream << "endloop" << std::endl;
            outputStream << "endfacet" << std::endl << std::endl;
        }

        // Write the epilog for the solid

        outputStream << "endsolid " << itrSubdomain->m_label << std::endl;
    }

    // Return the stream and we are done

    return( outputStream );
}

#endif // __MESH_TO_STL_H__

The code above includes a pair of helper template classes to ease iterating over mesh points for determining the bounding box for the mesh.  The STL format requires that all vertex coordinates be positive, which is why every vertex is shifted by an offset derived from the bounding box, but it doesn’t care about units.
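
The offset logic is easier to see without the CGAL types.  The sketch below mirrors the ‘1 - boundingBox.xmin()’ expressions in the function above using plain arrays: it finds the minimum corner of the bounding box and returns the translation that moves that corner to (1, 1, 1).  The names ‘Point3’ and ‘positiveOctantOffset’ are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Point3 = std::array<double, 3>;

// Returns the translation that moves the minimum corner of the points'
// bounding box to (1, 1, 1), so every translated coordinate is positive.
Point3 positiveOctantOffset( const std::vector<Point3>& points )
{
    assert( !points.empty() );

    // Track the smallest coordinate seen on each axis.
    Point3 minimum = points.front();

    for( const Point3& p : points )
        for( int axis = 0; axis < 3; ++axis )
            minimum[axis] = std::min( minimum[axis], p[axis] );

    const Point3 offset = { { 1.0 - minimum[0], 1.0 - minimum[1], 1.0 - minimum[2] } };

    return offset;
}
```

For the two points (-2, 0, 5) and (3, -4, 7) the minimum corner is (-2, -4, 5), so the offset is (3, 5, -4) and the translated points become (1, 5, 1) and (6, 1, 3).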

The following code snippet demonstrates how to call the template function. The template parameter is inferred from the function arguments.

 SubdomainList<C3t3::Subdomain_index> subdomainsToWrite;

 subdomainsToWrite.push_back( SubdomainRecord<C3t3::Subdomain_index>( 0, std::string( "elephant" ) ));


 std::ofstream outputStream( "elephant.stl" );

 output_boundary_of_c3t3_to_stl( c3t3, subdomainsToWrite, outputStream );

 outputStream.close();

Conclusion

In later posts, I will follow up with CGAL examples of using Nef Polyhedra and performing the kinds of binary operations needed for CSG (Constructive Solid Geometry) applications.  Having the facility to mesh the polyhedra and persist the mesh in a file format that can then be consumed by a slicer and printed is a valuable stepping stone.

Configure a Lubuntu 14.04 Development Image

I tend to follow a model of disposo-images for new development projects.  I create an image in ESXi for a project, put just the tools I want into the image, use it only for a month or two and then blow it away when I have either finished the project or simply want to return to a clean environment.  In the past I have used Ubuntu but most recently I have been using a lighter Lubuntu install and using X2Go for remoting the desktop.

The process below works for the 14.04 LTS release of Ubuntu.

Installing Lubuntu

The easiest way to install a minimal Lubuntu instance is to use the alternative installation process with the Ubuntu mini-iso distribution.  A link to the Lubuntu minimal install documentation appears below:

https://help.ubuntu.com/community/Lubuntu/Documentation/MinimalInstall

The mini-iso is small and the majority of the distribution is downloaded during the install process.  The installer is the standard Ubuntu text-graphical tool, not the more polished desktop installer.  The first screen appears below:

First screen of the Ubuntu mini-iso installer.

Choose the ‘Install’ option and follow the dialogs.  I typically just choose to use the entire disk when prompted for storage options.  I also typically leave automatic updating off and handle that manually if I feel it makes sense to update.  Eventually, it will get to the Tasksel dialog.

Tasksel dialog. Choose the OpenSSH server and the Lubuntu Desktop options.

On this dialog I usually just choose OpenSSH and the Lubuntu Desktop.  The Lubuntu desktop doesn’t have nearly the bloat of the Ubuntu desktop, so while it is not truly the ‘minimal installation’, it is very light nonetheless and you would have to install the majority of the pre-installed apps anyway.

If you are using ESXi you can install VMware Tools at this point, but you do not have to as most of the virtual drivers are already in the Ubuntu distribution and we will not be needing the nice X integration into the vSphere client.

Installing X2Go

X2Go is an implementation of the NX protocol and is based on the NX 3.x libraries, as NoMachine went closed-source starting with the V4.0 libraries.  My experience with X2Go has been very good; the quality of the server and client is such that the projects have been adopted by the Fedora community.  Installing the X2Go server on Lubuntu is a breeze:

sudo apt-add-repository ppa:x2go/stable
sudo apt-get update
sudo apt-get install x2goserver x2goserver-xsession x2golxdebindings

Once installed, reboot the VM and you should be able to connect with an X2Go client.  As Lubuntu uses the LXDE desktop environment, it is necessary to configure the client session type as a ‘custom desktop’ with the following command:

lxsession -s Lubuntu -e LXDE

I typically map the desktop display to an entire monitor.  Sometimes on the first connection after a reboot, the desktop does not resize to fill the screen but I’ve found that suspending the session and then restarting it forces the display to resize.  Maybe there is a more elegant fix out there but the suspend/reconnect works anyway.

Installing Eclipse

I use Eclipse CDT as my C++ IDE.  Installation into a minimal Lubuntu environment is straightforward.  First, install a JRE or JDK and then just download the ‘Eclipse for C/C++ Developers’ zip file from the Eclipse website.  I use the OpenJDK 7 JDK:

sudo apt-get install openjdk-7-jdk

After you have downloaded the Eclipse CDT zip file and extracted everything, it is usually nice to create a desktop shortcut.  You can do this in Lubuntu using the lxshortcut tool.  Open a terminal window and move to your ~/Desktop directory, then do the following.

cd ~/Desktop
lxshortcut -o eclipse

That will pop a small dialog, just click ‘OK’.  You should see a shortcut labelled ‘eclipse’ on your desktop.  Right-click on the icon and select ‘Shortcut Editor’ from the drop-down menu.  A dialog for the shortcut editor will open and will allow you to choose the executable for the shortcut (i.e. eclipse) and an icon file (i.e. icon.xpm from the eclipse install directory).

Installing the GCC C/C++ Compilers

The minimal install of Lubuntu does not include the GCC C/C++ compilers.  Adding them is also very straightforward.

sudo apt-get install build-essential

Conclusion

At this point, you should have an image with compilers, JDK and Eclipse CDT installed and configured.  I usually snapshot the image at this point so I can branch new, clean images for other projects without having to go through the install process again.

Building GCC Plugins – Part 3 C++ Libraries

As discussed in the prior post, I have started a set of C++ libraries to reduce the complexity of writing GCC Plugins and interpreting the GCC Abstract Syntax Tree.  In this post I will provide a high level description of the libraries and walk through the dependencies and directory structures.  The libraries are available on Github: ‘stephanfr/GCCPlugin’.

NB – At the time of writing, I am going through successive revisions and refactoring passes on the library, so expect anything you build now to break with my next commit to GitHub.  The interfaces will settle down in time and I will ‘chill’ them at some point in hopefully the not too distant future.

Licensing and Dependencies

All of the libraries with the exception of the unit test library link directly with the GCC source code, therefore they are all licensed with GPL V3.0.  The libraries are built with the C++11 language features and have dependencies on the Standard Library shipped with GCC and Boost libraries.  The unit testing framework depends on the Google Test libraries.

Programming Style

For what it is worth, I’ve been writing C++ code for a long, long time and am somewhat opinionated regarding some development practices.  First, I use anything in the standard C++ library – in particular I do not write containers.  Second, I use the std::string class in preference to char* strings almost exclusively.  For external interfaces I may expose a char* type but under the interface any char* will almost always map straight back to a std::string instance.  Third, I use anything from the Boost library that suits my needs.  The Boost libraries are excellent, so don’t waste your time re-inventing a component in that library; in all likelihood your component will not be as good anyway.  Fourth, there are some naked pointers in these libraries but in general I try to use a std::unique_ptr or std::shared_ptr in any code written today (I will fix any naked pointers in this library as I refactor).  The standard library smart pointers are a bit more difficult to use than naked pointers, but that difficulty is a result of them enforcing the semantics necessary to know when to delete a pointer they wrap.  Finally, I really like C++11 – I’d strongly suggest cutting over to it.
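
A small sketch may make the second and fourth points concrete.  The class below is hypothetical and is not taken from the GCCPlugin libraries; it simply shows a char* accessor backed by a std::string and a factory whose return type states who owns the object.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical class, for illustration only; not part of the GCCPlugin libraries.
class NamedElement
{
public:
    explicit NamedElement( std::string name )
        : m_name( std::move( name ) )
    {}

    // The external interface exposes a C string...
    const char* name() const { return m_name.c_str(); }

private:
    // ...but the storage underneath is a std::string.
    std::string m_name;
};

// Ownership is explicit in the signature: the caller receives the only owner,
// and the element is deleted automatically when that owner goes out of scope.
std::unique_ptr<NamedElement> makeElement( const char* name )
{
    // Plain C++11; std::make_unique only arrives with C++14.
    return std::unique_ptr<NamedElement>( new NamedElement( name ) );
}
```

Handing the result to a std::shared_ptr requires an explicit std::move, which leaves the original unique_ptr empty.  That is the kind of enforced semantics that makes smart pointers slightly more awkward than naked pointers, and also what makes them safe.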

With regard to my coding format, it is idiosyncratic.  Indentation and spacing don’t quite adhere to any standard, but at least I no longer use Hungarian notation – though that was a hard habit to kick.

Project and Directory Structure

The project is currently composed of seven directories, each with a single Eclipse CDT C++ project:

  1. CPPLanguageModel – a compiler-neutral class library of C++ language elements
  2. GCCInternalsTools – a set of classes and functions tailored specifically to the GCC g++ compiler to build a CPPLanguageModel representation of the code being compiled and to enable insertion of new code into the AST
  3. GCCInternalsUTFixture – a test fixture providing an abstraction of the GCCInternalsTools designed to permit the creation of unit tests for the library without any dependency on the GCC specific libraries themselves
  4. GCCInternalsUnitTest – a set of unit tests for key features of the GCC Plugins libraries
  5. TestExtensions – a collection of test framework ‘plugins’ that rely on GCCInternalsTools and the GCC headers; a separate project is used to prevent dependencies on GCC internals from leaking into the main Unit Test framework
  6. GCCPlugin – a ‘HelloWorld’ style plugin for GCC built using this framework
  7. Utility – various utility classes to simplify coding and implement design patterns I like

The most up-to-date examples of using the libraries will be in the unit test projects.  Similarly, if you go wandering through the code you will frequently see blocks of code commented out.  I tend to leave code I have refactored in place for a revision or two just in case a bug crawls out.  I find it is a bit easier than going back through prior revisions in source code control but it can make the code a little messy at points.  When I get to a version I am happy with, I go through a couple cleaning passes and knock out dead or legacy code.

Design Philosophy

The innards of GCC are absolutely not for the faint of heart.  A primary design goal of this framework is to insulate someone wanting to produce a GCC Plugin from the complexity of the compiler and its design paradigms.  At present, only two GCC header files are required to build a plugin with this framework and all functionality exposed through the framework’s API is abstracted from GCC itself.  The framework is built for manipulating the Abstract Syntax Tree for C++ language programs but could be modified to match other languages.

To use the framework, you ought to only include header files from the CPPLanguageModel project.  In practice, the ASTDictionary.h and PluginManager.h header files will pull in most of the declarations needed to build your plugin.  Two header files from the GCC distribution are also needed: config.h and gcc-plugin.h.

The object model exposed by the framework is that of a Dictionary of all the types and declarations in the code being compiled by g++ with the plugin loaded.  The dictionary is indexed by namespace, entry fully qualified name, entry source code location, entry UID and an identity field.  All of the indices are exposed by the ASTDictionary class and can be used for searching the dictionary for a specific entry.  The identity, UID and fully qualified name indices are unique whereas the namespace and source location indices are non-unique and may return a range of results.
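To make the difference between the unique and non-unique indices concrete, here is a minimal sketch built only on standard containers.  It is an illustration of the idea and not the actual ASTDictionary implementation: the Entry and MiniDictionary names and every method on them are hypothetical, and only three of the five indices are modelled.

```cpp
// Hypothetical sketch: none of these names come from the GCCPlugin sources.
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Entry
{
    unsigned long   uid;          // unique per entry
    std::string     nameSpace;    // e.g. "TestNamespace::"
    std::string     name;         // unqualified name
};

class MiniDictionary
{
public :

    // Returns false if the UID or fully qualified name is already present.
    bool Insert( const Entry& entry )
    {
        const std::string fqName = entry.nameSpace + entry.name;

        if( m_byUID.count( entry.uid ) || m_byFQName.count( fqName ) )
        {
            return( false );
        }

        m_entries.push_back( entry );
        const std::size_t index = m_entries.size() - 1;

        m_byUID.emplace( entry.uid, index );
        m_byFQName.emplace( fqName, index );
        m_byNamespace.emplace( entry.nameSpace, index );

        return( true );
    }

    // Unique index: at most one hit, nullptr when absent.
    // Returned pointers are only valid until the next Insert().
    const Entry* FindByUID( unsigned long uid ) const
    {
        auto itr = m_byUID.find( uid );
        return( itr == m_byUID.end() ? nullptr : &m_entries[ itr->second ] );
    }

    // Unique index on the fully qualified name.
    const Entry* FindByFQName( const std::string& fqName ) const
    {
        auto itr = m_byFQName.find( fqName );
        return( itr == m_byFQName.end() ? nullptr : &m_entries[ itr->second ] );
    }

    // Non-unique index: returns the whole range of entries in a namespace.
    std::vector<const Entry*> FindByNamespace( const std::string& nameSpace ) const
    {
        std::vector<const Entry*> results;
        auto range = m_byNamespace.equal_range( nameSpace );

        for( auto itr = range.first; itr != range.second; ++itr )
        {
            results.push_back( &m_entries[ itr->second ] );
        }

        return( results );
    }

private :

    std::vector<Entry>                          m_entries;
    std::map<unsigned long, std::size_t>        m_byUID;
    std::map<std::string, std::size_t>          m_byFQName;
    std::multimap<std::string, std::size_t>     m_byNamespace;
};
```

A lookup through a unique index yields zero or one entry, while a namespace lookup yields a range that the caller has to iterate over, which is the same distinction a plugin has to keep in mind when searching the real dictionary.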

The dictionary contains entries for different types and declarations.  Entries will be one of the following ‘kinds’: CLASS, UNION, FUNCTION, GLOBAL_VAR, TEMPLATE or UNRECOGNIZED.  The UNRECOGNIZED kind is simply a catch-all for any AST tree elements that have not yet been added to the tree parser.  Dictionary entries are effectively stubs from which the actual definition of the entry may be extracted.  Definitions contain the detailed, ‘kind’ specific information about the entry.  For example, the ClassDefinition object contains the base classes, fields, methods, template methods and friends for the class type.  Source location, namespace, UID, static and extern flags and a list of attributes are available for all dictionary entries and those values are copied into the more detailed definitions as well.

I’ve tried to ensure that the AST tree parser will pass through the tree adding dictionary entries for elements it recognizes and ignoring everything else.  My intent is that it should not crash on encountering some language element it does not recognize in the AST, but I have not run the parser over a whole lot of code so I will stick to ‘intent’ for now.  At present, the parser recognizes unions but does not yet provide a detailed definition of union types.  I figured it was more valuable to get some code injection functionality in place before sweating through the details of union representations in the GCC AST.

Current Supported Versions of GCC

The internals of GCC are constantly in flux and functionally there are no ‘frozen’ APIs or data structures that one can depend upon remaining static release over release.  The changes are unlikely to be significant release over release but there is a high probability of breaking changes associated with any release.

The code currently compiles and runs with GCC 4.8.0.  I can make no guarantees that it will compile and run with later releases, though hopefully nothing should break between double dot releases.

Example Plugin

An example ‘HelloWorld’ plugin appears below.  The four header files appear at the top.  The plugin_is_GPL_compatible symbol is needed for licensing compliance with the GCC suite.

The Callbacks class implements the CPPModel::CallbackIfx interface, which the framework uses to call back into the plugin at specific times in the compilation process.  There are entry points for when the AST is ready, for a point at which namespaces may be declared and for a point at which code may be injected.  For the sample plugin, all that happens is that the contents of the TestNamespace inside the code being compiled are dumped to cerr.  The plugin_init function is part of the GCC plugin framework and is rather straightforward when using these abstraction libraries.


/*-------------------------------------------------------------------------------
Copyright (c) 2013 Stephan Friedl.

All rights reserved. This program and the accompanying materials
are made available under the terms of the GNU Public License v3.0
which accompanies this distribution, and is available at
http://www.gnu.org/licenses/gpl.html

Contributors:
 Stephan Friedl
-------------------------------------------------------------------------------*/

#include "config.h"

#include "ASTDictionary.h"
#include "PluginManager.h"

#include "gcc-plugin.h"

int plugin_is_GPL_compatible;

class Callbacks : public CPPModel::CallbackIfx
{
public :

	Callbacks()
	{}

	virtual ~Callbacks()
	{}

	void ASTReady()
	{
		std::list<std::string> namespacesToDump( { "TestNamespace::" } );

		CPPModel::GetPluginManager().GetASTDictionary().DumpASTXMLByNamespaces( std::cerr, namespacesToDump );
	}

	void CreateNamespaces()
	{
	}

	void InjectCode()
	{
	}

};

Callbacks g_pluginCallbacks;

int plugin_init( plugin_name_args* info, plugin_gcc_version* ver )
{
	std::cerr << "Starting Plugin: " << info->base_name << std::endl;

	CPPModel::GetPluginManager().Initialize( "HelloWorld Plugin", &g_pluginCallbacks );

	return( 0 );
}

 

Example Output

A sample program to be compiled appears below.  This code has the TestNamespace declared and it is the contents of that namespace that will be dumped by the plugin above.

#include <iostream>

namespace TestNamespace
{
	class TestClass
	{
	public :

		int			publicInt;

		int			getPublicInt() const
		{
			return( publicInt );
		}

	protected :

		double		getPrivateDouble() const
		{
			return( privateDouble );
		}

	private :

		double		privateDouble;
	};

	char*		globalString = "This is a global string";

	TestClass	globalTestClassInstance;
}

int main()
{
	std::cout << "!!!Hello World!!!" << std::endl; // prints !!!Hello World!!!

	return 0;
}

The command line required to invoke g++ with the plugin and compile the above file follows:

/usr/gcc-4.8.0/bin/gcc-4.8.0 -c -std=c++11 -fplugin=libGCCPlugin.so HelloWorld.cpp

When g++ initializes, it loads the sample plugin and when the AST is ready, the plugin dumps the following to standard error.  It isn’t perfect XML but ought to be good enough to analyze the program being compiled.

8: 2014-09-23 21:03:42   [LoggingInitialization] [NORMAL]  Logging Initiated
Starting Plugin: libGCCPlugin
HelloWorld.cpp:34:24: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
  char*  globalString = "This is a global string";
                        ^
<ast>
    <dictionary>
        <namespace name="TestNamespace::">
            <dictionary_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>TestClass</name>
                <uid>20720</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>9</line>
                    <char-count>1</char-count>
                    <location>6451683</location>
                </source-info>
            </dictionary_entry>
            <dictionary_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalString</name>
                <uid>28506</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>34</line>
                    <char-count>1</char-count>
                    <location>6454884</location>
                </source-info>
                <static>true</static>
            </dictionary_entry>
            <dictionary_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalTestClassInstance</name>
                <uid>28507</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>36</line>
                    <char-count>1</char-count>
                    <location>6455143</location>
                </source-info>
                <static>true</static>
            </dictionary_entry>
        </namespace>
    </dictionary>
    <elements>
        <namespace name="TestNamespace::">
            <class type="class">
                <name>TestClass</name>
                <uid>20720</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>9</line>
                    <char-count>1</char-count>
                    <location>6451683</location>
                </source-info>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <compiler_specific>
                    </artificial>
                </compiler_specific>
                <base-classes>
                </base-classes>
                <friends>
                </friends>
                <fields>
                    <field>
                        <name>publicInt</name>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>13</line>
                            <char-count>1</char-count>
                            <location>6452196</location>
                        </source-info>
                        <type>
                            <kind>fundamental</kind>
                            <declaration>int</declaration>
                        </type>
                        <access>PUBLIC</access>
                        <static>false</static>
                        <offset_info>
                            <size>4</size>
                            <alignment>4</alignment>
                            <offset>0</offset>
                            <bit_offset_alignment>128</bit_offset_alignment>
                            <bit_offset>0</bit_offset>
                        </offset_info>
                    </field>
                    <field>
                        <name>privateDouble</name>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>30</line>
                            <char-count>1</char-count>
                            <location>6454374</location>
                        </source-info>
                        <type>
                            <kind>fundamental</kind>
                            <declaration>double</declaration>
                        </type>
                        <access>PRIVATE</access>
                        <static>false</static>
                        <offset_info>
                            <size>8</size>
                            <alignment>8</alignment>
                            <offset>0</offset>
                            <bit_offset_alignment>128</bit_offset_alignment>
                            <bit_offset>64</bit_offset>
                        </offset_info>
                    </field>
                </fields>
                <methods>
                    <method>
                        <name>getPublicInt</name>
                        <uid>28497</uid>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>15</line>
                            <char-count>1</char-count>
                            <location>6452452</location>
                        </source-info>
                        <access>PUBLIC</access>
                        <static>false</static>
                        <result>
                            <type>
                                <kind>fundamental</kind>
                                <declaration>int</declaration>
                            </type>
                        </result>
                        <parameters>
                            <parameter>
                                <name>this</name>
                                <type>
                                    <kind>derived</kind>
                                    <declaration>
                                        <operator>pointer</operator>
                                        <type>
                                            <kind>class-or-struct</kind>
                                            <declaration>TestNamespace::TestClass</declaration>
                                            <namespace>
                                                <name>TestNamespace::</name>
                                            </namespace>
                                        </type>
                                    </declaration>
                                </type>
                                <compiler_specific>
                                    </artificial>
                                </compiler_specific>
                            </parameter>
                        </parameters>
                    </method>
                    <method>
                        <name>getPrivateDouble</name>
                        <uid>28499</uid>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>22</line>
                            <char-count>1</char-count>
                            <location>6453350</location>
                        </source-info>
                        <access>PROTECTED</access>
                        <static>false</static>
                        <result>
                            <type>
                                <kind>fundamental</kind>
                                <declaration>double</declaration>
                            </type>
                        </result>
                        <parameters>
                            <parameter>
                                <name>this</name>
                                <type>
                                    <kind>derived</kind>
                                    <declaration>
                                        <operator>pointer</operator>
                                        <type>
                                            <kind>class-or-struct</kind>
                                            <declaration>TestNamespace::TestClass</declaration>
                                            <namespace>
                                                <name>TestNamespace::</name>
                                            </namespace>
                                        </type>
                                    </declaration>
                                </type>
                                <compiler_specific>
                                    </artificial>
                                </compiler_specific>
                            </parameter>
                        </parameters>
                    </method>
                </methods>
                <template_methods>
                </template_methods>
            </class>
            <global_var_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalString</name>
                <uid>28506</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>34</line>
                    <char-count>1</char-count>
                    <location>6454884</location>
                </source-info>
                <static>true</static>
                <type>
                    <kind>derived</kind>
                    <declaration>
                        <operator>pointer</operator>
                        <type>
                            <kind>fundamental</kind>
                            <declaration>char</declaration>
                        </type>
                    </declaration>
                </type>
            </global_var_entry>
            <global_var_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalTestClassInstance</name>
                <uid>28507</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>36</line>
                    <char-count>1</char-count>
                    <location>6455143</location>
                </source-info>
                <static>true</static>
                <type>
                    <kind>class-or-struct</kind>
                    <declaration>TestNamespace::TestClass</declaration>
                    <namespace>
                        <name>TestNamespace::</name>
                    </namespace>
                </type>
            </global_var_entry>
        </namespace>
    </elements>
</ast>

Conclusion

It has taken a while to get this far but I will dive into the internals of the framework and provide examples of code injection in future posts.

 

Printing with Taulman Bridge Nylon

I have been dabbling in 3D printing for the last six months with my MakerGear M2 printer (http://www.makergear.com, a fantastic precision machine tool) and have done a lot of printing in PLA.  PLA is great for a lot of objects, particularly the various elephants and other things I print for my kids, but it is too brittle for some types of applications.  I have a couple projects I am contemplating that require a tougher material, so I gave Taulman Bridge Nylon a try.  Like all things in 3D printing, there is a learning curve but once you have a process for printing with this nylon, it is a fantastic material.

Taulman Bridge

The ‘Bridge’ nylon is intended to combine the toughness of nylon with the printing ease of PLA.  I have not tried printing with regular nylon so I cannot comment on how much easier it is to print with Bridge, but printing with Bridge is not quite like printing with PLA.  Bridge still absorbs water, it requires a higher print temperature and, at least for me, it needs thicker layers and a slower print speed than PLA.  That said, with the right adjustments in place my success rate printing with Bridge is pretty darn close to my success rate with PLA – probably 90%+ of the prints I start complete acceptably.

Using Bridge

First off, this is the process I use.  I live in Colorado at about 5000ft elevation and relatively low humidity.  Your mileage may vary…

Step 1: Dry the Material

Bridge comes on a small diameter spool sealed in a bag with silica gel drying packets inside the bag.  Despite the Bridge formulation to reduce wetness, the packaging and the relative dryness of Colorado, I had little success printing with Bridge right out of the bag.  Right out of the bag, you will see a lot of steam coming out of the nozzle and I got intermittent sputtering as well.  I tried baking the spool in the oven for 6 hours at 175F and this seemed to work reasonably well though the spool deformed a bit.  Also, I got the sense that the inner layers of the spool may not have dried as well as the outer layers.

I adjusted my drying technique a bit by getting a small toaster oven from Walmart and then using it to bake just enough loose filament pulled off the spool for a given print:

Baking Taulman Bridge Nylon

Preparing to dry a length of Taulman Bridge nylon in a small toaster oven in my garage. I typically pull enough material from the main spool for a specific print, clip it and then bake it loose in the oven at 175F for 6 to 8 hours.

After baking, I let the material cool in the oven for a bit and then I transfer it immediately into a zip-lock bag with a couple packs of silica gel for further cooling and drying overnight.

Step 2:  Printer Settings

I use the Simplify3D (http://www.simplify3d.com) software package to slice my models and control the M2.  I’ve used a number of the more popular open source packages as well but I really like having all the key functions in one place with an easy to navigate GUI.  I took the suggested settings for the M2 and nylon and through trial and error made some adjustments from that starting point.  The primary problem I ran into was ‘popcorn’ from the print instead of a smooth stream of nylon.  After that I also had some adhesion and warping issues.  The four main changes I made were to bump up the extruder temperature to 245C, start the build plate at 70C then ramp to 90C from the second layer on, cut the printing speed in half and finally stick to a 0.3mm layer height.

Below are the config screens from Simplify3D with the settings I use for Bridge:

Taulman-Extruder

Extruder settings for printing with Taulman Bridge on my MakerGear M2. Note the Ooze Control settings.

Taulman-Layer

Layer settings

Taulman-PrimaryExtruder

Extrusion temperature set at 245C

Taulman-HeatedBed

Heated bed at 70C for the first layer and then 90C for the rest.

Taulman-Other

I reduced the default printing speed by 50% from the speed used for PLA or suggested for nylon.

Taulman-Advanced

Retraction control and ooze rate.

Step 3: Preparing the Build Plate

A couple of test prints with clean glass had adhesion and/or warping problems.  I gave the Elmer’s glue coating I use for PLA a shot and it worked extraordinarily well.  So well in fact that I have to use a very thin layer of glue, as with thicker layers the nylon is very, very hard to detach.  Below are a couple of pictures of the wet glue on the plate and what it looked like dry just before printing.

Build Plate with Fresh Elmer's Glue

Build plate with a thin coating of Elmer’s white glue to improve nylon adhesion.

Build Plate with Dried Elmer's Glue

Plate with the dried glue at 70C.

Finished Product:

For an example of what is possible with Bridge on the M2, I printed Emmett’s Gear Bearing from Thingiverse (http://www.thingiverse.com/thing:53451).  This is an absolutely ingenious design of a bearing that can only be produced with 3D printing.  If you wanted to use this in a real project, then nylon would be a far better material than PLA or ABS.  If you look at the design, it is pretty clear that if your printer or material aren’t dialed in well your odds of getting a working bearing are slim.  There are lots of opposing surfaces which could fuse and render the bearing a hockey puck.  Emmett’s designs on Thingiverse are exceptional; if you have not looked them over, do yourself a favor and do so.

If the piece will not come loose easily from the build plate, I usually put the plate in the freezer for 5 or 10 minutes after which the piece generally pops right off.  Check the start of the video for that demo.

Completed Bearing in Nylon

Completed Gear Bearing by Emmett printed on a MakerGear M2 with Taulman Bridge nylon.

Demo Video:

 

Final Thoughts :

My experience with the Bridge material has been great, once I got the process right.  Dry material, higher temperature, slow printing, thick layers and glue for adhesion all seem to matter but the results as demonstrated by the pictures and video are pretty self-evident.

Building GCC Plugins – Part 2: Introduction to GCC Internals

Once the basic scaffolding is in place for a GCC Plugin, the next step is to analyze and perhaps modify the Abstract Syntax Tree (AST) created by GCC as a result of parsing the source code.  GCC is truly a marvel of software engineering; it is the de-facto compiler for *nix environments and supports a variety of front ends for different languages (even Ada…).  That said, the GCC AST is complex to navigate for a number of reasons.  First, parsing and representing a variety of languages in a common syntax tree is a complex problem so the solution is going to be complex.  Second, history – looking at the GCC internals is a bit like walking down memory lane; this is the way we wrote high-performance software when systems had limited memory (think 64k) and CPUs had low throughput (think 16 MHz clock speeds).  Prior to GCC 4.8.0, GCC was compiled with the C compiler, so don’t bother looking for C++ constructs in the source code.

The AST Tree

The primary element in the GCC AST is the ‘tree’ structure.  An introduction to the tree structure appears in the GCC Internals Documentation.  Figure 1 is extracted from the tree.h header file and provides a good starting place for a discussion of the GCC tree and how to approach programming with it.


union GTY ((ptr_alias (union lang_tree_node),
 desc ("tree_node_structure (&%h)"), variable_size)) tree_node {
 struct tree_base GTY ((tag ("TS_BASE"))) base;
 struct tree_typed GTY ((tag ("TS_TYPED"))) typed;
 struct tree_common GTY ((tag ("TS_COMMON"))) common;
 struct tree_int_cst GTY ((tag ("TS_INT_CST"))) int_cst;
 struct tree_real_cst GTY ((tag ("TS_REAL_CST"))) real_cst;
 struct tree_fixed_cst GTY ((tag ("TS_FIXED_CST"))) fixed_cst;
 struct tree_vector GTY ((tag ("TS_VECTOR"))) vector;
 struct tree_string GTY ((tag ("TS_STRING"))) string;
 struct tree_complex GTY ((tag ("TS_COMPLEX"))) complex;
 struct tree_identifier GTY ((tag ("TS_IDENTIFIER"))) identifier;
 struct tree_decl_minimal GTY((tag ("TS_DECL_MINIMAL"))) decl_minimal;
 struct tree_decl_common GTY ((tag ("TS_DECL_COMMON"))) decl_common;
 struct tree_decl_with_rtl GTY ((tag ("TS_DECL_WRTL"))) decl_with_rtl;
 struct tree_decl_non_common GTY ((tag ("TS_DECL_NON_COMMON"))) decl_non_common;
 struct tree_parm_decl GTY ((tag ("TS_PARM_DECL"))) parm_decl;
 struct tree_decl_with_vis GTY ((tag ("TS_DECL_WITH_VIS"))) decl_with_vis;
 struct tree_var_decl GTY ((tag ("TS_VAR_DECL"))) var_decl;
 struct tree_field_decl GTY ((tag ("TS_FIELD_DECL"))) field_decl;
 struct tree_label_decl GTY ((tag ("TS_LABEL_DECL"))) label_decl;
 struct tree_result_decl GTY ((tag ("TS_RESULT_DECL"))) result_decl;
 struct tree_const_decl GTY ((tag ("TS_CONST_DECL"))) const_decl;
 struct tree_type_decl GTY ((tag ("TS_TYPE_DECL"))) type_decl;
 struct tree_function_decl GTY ((tag ("TS_FUNCTION_DECL"))) function_decl;
 struct tree_translation_unit_decl GTY ((tag ("TS_TRANSLATION_UNIT_DECL")))
 translation_unit_decl;
 struct tree_type_common GTY ((tag ("TS_TYPE_COMMON"))) type_common;
 struct tree_type_with_lang_specific GTY ((tag ("TS_TYPE_WITH_LANG_SPECIFIC")))
 type_with_lang_specific;
 struct tree_type_non_common GTY ((tag ("TS_TYPE_NON_COMMON")))
 type_non_common;
 struct tree_list GTY ((tag ("TS_LIST"))) list;
 struct tree_vec GTY ((tag ("TS_VEC"))) vec;
 struct tree_exp GTY ((tag ("TS_EXP"))) exp;
 struct tree_ssa_name GTY ((tag ("TS_SSA_NAME"))) ssa_name;
 struct tree_block GTY ((tag ("TS_BLOCK"))) block;
 struct tree_binfo GTY ((tag ("TS_BINFO"))) binfo;
 struct tree_statement_list GTY ((tag ("TS_STATEMENT_LIST"))) stmt_list;
 struct tree_constructor GTY ((tag ("TS_CONSTRUCTOR"))) constructor;
 struct tree_omp_clause GTY ((tag ("TS_OMP_CLAUSE"))) omp_clause;
 struct tree_optimization_option GTY ((tag ("TS_OPTIMIZATION"))) optimization;
 struct tree_target_option GTY ((tag ("TS_TARGET_OPTION"))) target_option;
};

Figure 1: The tree_node structure extracted from the GCC code base.

Fundamentally, a tree_node is a big union of structs.  The union contains a handful of common or descriptive members, but the majority of union members are specific types of tree nodes.  The first union member, tree_base, is common to all tree nodes and provides the basic descriptive information needed to determine the precise kind of node being examined or manipulated.  There is a bit of an inheritance model here: tree_base is the foundation, and tree_typed and tree_common add another layer of customization for specific categories of tree nodes to inherit.  From there on out, the remaining union members are specific types of tree nodes.  For example, tree_int_cst is an integer constant node whereas tree_field_decl is a field declaration.

Tree nodes are typed but not in the C language sense of ‘typed’.  One way to think about it is that the tree_node structure is a memory-efficient way to model a class in C prior to C++.  Instead of member functions or methods, there is a large library of macros which act on tree nodes.  In general, macros will fall into two categories: predicate macros, which will usually have a ‘_P’ suffix and return a value which can be compared to zero to indicate a false result, and transformation macros, which take a tree node and usually return another tree node.  Despite the temptation to dip directly into the public tree_node structure and access or modify the data members directly – don’t do it.  Treat tree nodes like C++ classes in which all the data members are private and rely on the tree macros to query or manipulate tree nodes.

Relying on the macros to work with the tree_node structure is the correct approach per GCC documentation but will also simply make your life easier.  GCC tree_node structures are ‘strongly typed’ in the sense that they are distinct in the GCC tree type-system and many of the macros expect a specific tree_node type.  For example the INT_CST_LT(A, B) macro expects to have two tree_int_cst nodes passed as arguments – even though the C++ compiler cannot enforce the typing at compile time.  If you pass in the wrong tree_node type, you will typically get a segmentation violation.  An alternative approach is to build GCC with the --enable-checking configure flag set, which will enforce runtime checking of node types.
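The pattern of a tagged union, predicate macros and checked accessors can be sketched in a few lines.  This is a toy model and not GCC code: the mini_ and MINI_ names are all hypothetical, the structs are flattened so each one begins with the same code field, and the check throws an exception whereas a checking build of GCC reports an internal compiler error instead.

```cpp
// Toy model of a tagged union with macro accessors; all names are hypothetical.
#include <cassert>
#include <stdexcept>

enum mini_code { MINI_INT_CST, MINI_FIELD_DECL };

// Every struct starts with the same 'code' member, so the code can be read
// through 'base' no matter which member of the union was written.
struct mini_base        { mini_code code; };
struct mini_int_cst     { mini_code code; long value; };
struct mini_field_decl  { mini_code code; unsigned offset; };

union mini_node
{
    mini_base           base;
    mini_int_cst        int_cst;
    mini_field_decl     field_decl;
};

// Safe on any node: read the code, then test it with a '_P' predicate.
#define MINI_CODE(NODE)         ((NODE)->base.code)
#define MINI_INT_CST_P(NODE)    (MINI_CODE(NODE) == MINI_INT_CST)

// Runtime check in the spirit of a checking build: verify the node code
// before allowing access to the kind-specific member.
inline mini_node* mini_check( mini_node* node, mini_code expected )
{
    if( node->base.code != expected )
    {
        throw std::logic_error( "node check failed: unexpected node code" );
    }

    return( node );
}

// Checked accessor: only valid on MINI_INT_CST nodes.
#define MINI_INT_CST_VALUE(NODE)    (mini_check( (NODE), MINI_INT_CST )->int_cst.value)

inline mini_node make_int_cst( long value )
{
    mini_node node;
    node.int_cst.code = MINI_INT_CST;
    node.int_cst.value = value;
    return( node );
}

inline mini_node make_field_decl( unsigned offset )
{
    mini_node node;
    node.field_decl.code = MINI_FIELD_DECL;
    node.field_decl.offset = offset;
    return( node );
}
```

Without the check, MINI_INT_CST_VALUE on a field declaration would quietly reinterpret the wrong bytes, which is the toy equivalent of the segmentation violations described above.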

In terms of history, this type of modelling was common back in the day when machines were limited in memory and compute cycles.  This approach is very efficient in terms of memory as the union overlays all the types and there are no virtual tables or other C++ class overhead that consumes memory or requires compute overhead.  The price paid though is that it is 100% incumbent on the developer to keep the type-system front-of-mind and ensure that they are invoking the right macros with the right arguments.  The strategy of relying on the compiler to advise one about type mismatches does not work in this kind of code.

Basics of AST Programming

There are five key macros that can be invoked safely on any tree structure: TREE_CODE, TREE_TYPE, TREE_CHAIN, TYPE_P and DECL_P.  In general, after obtaining a ‘generic’ tree node, the first step is to use the TREE_CODE macro to determine the ‘type’ (in the GCC type-system) of the node.  The TREE_TYPE macro returns the source code ‘type’ associated with the node.  For example, the result node of a method declaration returning an integer value will have a TREE_TYPE with a TREE_CODE equal to INTEGER_TYPE.  The code for that test would look like:


TREE_CODE( TREE_TYPE( DECL_RESULT( methodNode ))) == INTEGER_TYPE

Within the AST structure, lists are generally represented as singly-linked lists with the link to the next list member returned by the TREE_CHAIN macro.  For example, the DECL_ARGUMENTS macro will return a pointer to the first parameter for a function or method.  If this value is NULL_TREE, then there are no parameters, otherwise the tree node for the first parameter is returned.  Using TREE_CHAIN on that node will return NULL_TREE if it is the only parameter or will return a tree instance for the next parameter.  There also exists a vector data structure within GCC and it is accessed using a different set of macros.
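The traversal idiom is easier to see in code.  The sketch below uses a made-up toy_node with a chain pointer (the toy_ and TOY_ names are hypothetical stand-ins); in a real plugin the loop would start from the node returned by DECL_ARGUMENTS and advance with TREE_CHAIN until it reaches NULL_TREE.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical stand-ins for a tree node and its chain macro.
struct toy_node
{
    const char* name;    // parameter name
    toy_node*   chain;   // next parameter, or nullptr at the end of the list
};

typedef toy_node* toy_tree;

#define TOY_NULL_TREE   (static_cast<toy_tree>(nullptr))
#define TOY_CHAIN(NODE) ((NODE)->chain)

// Walks the singly-linked list starting at 'first' and collects the names.
// An empty parameter list is simply first == TOY_NULL_TREE.
inline std::vector<std::string> parameter_names(toy_tree first)
{
    std::vector<std::string> names;
    for (toy_tree param = first; param != TOY_NULL_TREE; param = TOY_CHAIN(param))
        names.push_back(param->name);
    return names;
}
```

The loop handles the zero, one and many parameter cases uniformly, which is why this shape shows up over and over when walking the AST.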

The TYPE_P and DECL_P macros are predicates which will return non-zero values if the tree passed as an argument is a type specification or a code declaration.  Knowing this distinction is important as it quickly partitions the macros which can be used with the node.  Many macros will have a prefix of ‘TYPE_’ for type nodes and ‘DECL_’ for declaration nodes.  Frequently there will be two sets of identical macros, for instance TYPE_UID will return the GCC generated, internal numeric unique identifier for a type node whereas DECL_UID is needed for a declaration node.  In general, I have found that calling a TYPE_ macro on a declaration or a DECL_ macro on a type specification will result in a segmentation violation.
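One way to stay out of trouble is to test the predicate first and only then pick the matching family of macros.  The following is a toy model of that dispatch, with hypothetical toy_ and TOY_ names throughout; the payload layouts differ on purpose so that using the wrong accessor would read the wrong bytes, much as it does in the real thing.

```cpp
#include <cassert>

// Toy model only: all names are hypothetical, not GCC's.
enum toy_code  { TOY_INTEGER_TYPE, TOY_RECORD_TYPE, TOY_FUNCTION_DECL, TOY_VAR_DECL, TOY_INT_CST };
enum toy_class { TOY_CLASS_TYPE, TOY_CLASS_DECL, TOY_CLASS_OTHER };

// One class per code, listed in the same order as toy_code.
static const toy_class toy_code_class[] = {
    TOY_CLASS_TYPE, TOY_CLASS_TYPE, TOY_CLASS_DECL, TOY_CLASS_DECL, TOY_CLASS_OTHER
};

struct toy_type_payload { unsigned uid; };
struct toy_decl_payload { const char* name; unsigned uid; };

struct toy_node
{
    toy_code code;
    union { toy_type_payload type; toy_decl_payload decl; } u;
};

#define TOY_TYPE_P(NODE)   (toy_code_class[(NODE)->code] == TOY_CLASS_TYPE)
#define TOY_DECL_P(NODE)   (toy_code_class[(NODE)->code] == TOY_CLASS_DECL)
#define TOY_TYPE_UID(NODE) ((NODE)->u.type.uid)
#define TOY_DECL_UID(NODE) ((NODE)->u.decl.uid)

// Returns true and fills 'uid' for type and declaration nodes, false otherwise.
inline bool toy_uid_of(const toy_node* node, unsigned& uid)
{
    if (TOY_TYPE_P(node)) { uid = TOY_TYPE_UID(node); return true; }
    if (TOY_DECL_P(node)) { uid = TOY_DECL_UID(node); return true; }
    return false;   // neither a type nor a declaration: no UID accessor applies
}
```

Funnelling every UID lookup through one helper like this means the predicate check lives in a single place rather than being repeated (or forgotten) at each call site.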

Other frequently used macros include DECL_NAME and TYPE_NAME, which return a tree node that contains the source code name for a given element.  IDENTIFIER_POINTER can then be used on that tree to return a const char* pointing to the name.  DECL_SOURCE_FILE, DECL_SOURCE_LINE and DECL_SOURCE_LOCATION are available to map an AST declaration back to the source code location.  As mentioned above, DECL_UID and TYPE_UID return numeric unique identifiers for elements in the source code.

In addition to the above, for C++ source code fed to g++, the compiler will inject methods and fields not explicitly declared in the C++ source code.  These elements can be identified with the DECL_IS_BUILTIN and DECL_ARTIFICIAL macros.  If, as you traverse the AST, you trip across oddly named elements, check the node with those macros to determine if the nodes have been created by the compiler.

Beyond this simple introduction, sifting through the AST will require a lot of time reviewing tree.h and other header files to look for macros that you will find useful for your application.  Fortunately, the naming is very consistent and quite good, which eases the hunt for the right macro.  Once you think you have the right macro for a given task, try it in your plugin and see if you get the desired result.  Be prepared for a lot of trial-and-error investigation in the debugger.  Also, though there are some GDB scripts to pretty-print AST tree instances, looking at these structures in the debugger will also require some experience, as again the debugger isn’t able to infer much about GCC’s internal type system.

Making the AST Easier to Navigate and Manipulate

I have started a handful of C++ libraries which bridge the gap between the implicit type system in the GCC tree_node structure and explicit C++ classes modelling distinct tree_node types.  For example, a snippet from my TypeTree class appears below in Figure 2.


class TypeTree : public DeclOrTypeBaseTree
{
public :

    TypeTree( const tree& typeTree )
        : DeclOrTypeBaseTree( typeTree )
    {
        assert( TYPE_P( typeTree ) );
    }

    TypeTree& operator= ( const tree& typeTree )
    {
        assert( TYPE_P( typeTree ) );

        (tree&)m_tree = typeTree;

        return( *this );
    }

    const CPPModel::UID UID() const
    {
        return( CPPModel::UID( TYPE_UID( TYPE_MAIN_VARIANT( m_tree ) ), CPPModel::UID::UIDType::TYPE ) );
    }

    const std::string Namespace() const;

    std::unique_ptr<const CPPModel::Type> type( const CPPModel::ASTDictionary& dictionary ) const;

    CPPModel::TypeInfo::Specifier typeSpecifier() const;

    CPPModel::ConstListPtr<CPPModel::Attribute> attributes();
};

Figure 2: TypeTree wrapper class for GCC tree_node.
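Stripped of the GCC dependencies, the wrapper idea boils down to the following sketch.  The toy_ names and the ToyTypeTree class are hypothetical and exist only for illustration.  One deliberate difference from Figure 2: the constructor throws instead of calling assert, so the check still fires in builds where NDEBUG is defined.

```cpp
#include <cassert>
#include <stdexcept>

// Toy model only: hypothetical stand-ins for tree nodes and the TYPE_P predicate.
enum toy_code { TOY_INTEGER_TYPE, TOY_FUNCTION_DECL };

struct toy_node { toy_code code; unsigned uid; };
typedef const toy_node* toy_tree;

#define TOY_TYPE_P(NODE) ((NODE)->code == TOY_INTEGER_TYPE)

// Wraps a node that has been verified to be a type node. Once constructed,
// the C++ type system carries the guarantee, so methods need no further checks.
class ToyTypeTree
{
public:
    explicit ToyTypeTree(toy_tree node) : m_tree(checked(node)) {}

    unsigned uid() const { return m_tree->uid; }

private:
    static toy_tree checked(toy_tree node)
    {
        if (node == nullptr || !TOY_TYPE_P(node))
            throw std::invalid_argument("ToyTypeTree requires a type node");
        return node;
    }

    toy_tree m_tree;
};

// Any function taking a ToyTypeTree can no longer be handed a declaration node.
inline unsigned uid_of_type(const ToyTypeTree& type) { return type.uid(); }
```

The payoff is that the node kind is verified once, at the boundary, and from then on a mismatch is a compile error rather than a crash inside a macro.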

Within this library I make extensive use of the STL, Boost libraries and a number of C++11 features.  For example, ConstListPtr<> is a template alias for a std::unique_ptr to a const boost::ptr_list class.


template <class T> using ListPtr = std::unique_ptr<boost::ptr_list<T>>;
template <class T> using ConstListPtr = std::unique_ptr<const boost::ptr_list<T>>;

template <class T> using ListRef = const boost::ptr_list<T>&;

template <class T> ConstListPtr<T> MakeConst( ListPtr<T>& nonConstList ) { return( ConstListPtr<T>( std::move( nonConstList ) ) ); }

Figure 3: Template aliases for lists.
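For readers without Boost at hand, the same build-then-freeze pattern can be tried with only the standard library.  In this sketch I have swapped boost::ptr_list for std::list, which stores its elements by value rather than owning heap-allocated objects, so it is an approximation of the aliases in Figure 3 and not a drop-in replacement.  The build_attribute_names function is a hypothetical example of mine.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Standard-library approximation of the aliases: std::list stands in for boost::ptr_list.
template <class T> using ListPtr      = std::unique_ptr<std::list<T>>;
template <class T> using ConstListPtr = std::unique_ptr<const std::list<T>>;

// Transfers ownership into a pointer-to-const, leaving the source pointer empty.
template <class T>
ConstListPtr<T> MakeConst(ListPtr<T>& nonConstList)
{
    return ConstListPtr<T>(std::move(nonConstList));
}

// Hypothetical producer: fill a mutable list, then hand it out as read-only.
inline ConstListPtr<std::string> build_attribute_names()
{
    ListPtr<std::string> names(new std::list<std::string>());
    names->push_back("deprecated");
    names->push_back("noreturn");
    return MakeConst(names);
}
```

The caller receives sole ownership of a list it cannot modify, and the builder cannot keep mutating it afterwards because its own pointer has been emptied by the move.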

At present the library is capable of walking through the GCC AST and creating a dictionary of all the types in the code being compiled.  Within this dictionary, the library is also able to provide detailed information on classes, structs, unions, functions and global variables.  It will scrape out C++11 generalized attributes on many source code elements (not all of them yet, though) and return proper declarations with parameters and return types for functions and methods.  The ASTDictionary and the specific language model classes have no dependency on GCC Internals themselves.

The approach I followed for developing the library thus far was to get enough simple code running using the GCC macros that I could then start to refactor into C++ classes.  Along the way, I used Boost strong typedefs to start making sense of the GCC type system at compile time.  Once the puzzle pieces started falling into place and the programming patterns took shape, developing a plugin on top of the libraries became fairly straightforward.  That said, there is a long and painful learning curve associated with GCC internals and the AST itself.
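To give a flavour of what a strong typedef buys you, here is a small standard-library-only sketch.  It is not the Boost facility and not the code from my library: StrongUID, TypeUID, DeclUID and describe_type are hypothetical names chosen for the example.  The point is that two identifiers with the same underlying representation become distinct, non-interchangeable types.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical strong typedef: the Tag parameter makes each instantiation a distinct type.
template <class Tag>
class StrongUID
{
public:
    explicit StrongUID(unsigned value) : m_value(value) {}

    unsigned value() const { return m_value; }

    bool operator==(const StrongUID& other) const { return m_value == other.m_value; }

private:
    unsigned m_value;
};

struct TypeUIDTag {};
struct DeclUIDTag {};

typedef StrongUID<TypeUIDTag> TypeUID;
typedef StrongUID<DeclUIDTag> DeclUID;

// Accepts only a TypeUID; a DeclUID or a bare unsigned will not compile here.
inline unsigned describe_type(TypeUID id) { return id.value(); }

// The compiler now rejects the mix-ups that the raw macros would let through.
static_assert(!std::is_convertible<DeclUID, TypeUID>::value, "decl ids must not pass as type ids");
static_assert(!std::is_convertible<unsigned, TypeUID>::value, "raw integers must be wrapped explicitly");
```

Calling describe_type( DeclUID( 3 ) ) is a compile-time error, which is exactly the kind of feedback the plain macros cannot give.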

Getting the Code and Disclaimers

The library code is available on GitHub: ‘stephanfr/GCCPlugin’.  All of the code is under GPL V3.0 which is absolutely required as it runs within GCC itself.  I do not claim that the library is complete, stable, usable or rational, but hopefully some will find it useful if for nothing more than providing some insight into the GCC AST.  For the record, this is not my job nor is it my job to enrich or bug fix the library so you can get your compiler theory class project done in time.  That said, if you pick up the code and either enrich it or fix some bugs, please return the code to me and I will merge what makes sense.

The code should ‘just run’ if you have a GCC Plugin build environment configured per my prior posts.  One detail is that the ‘GCCPlugin Debug.launch’ file will need to be moved to the ‘.launches’ directory of Eclipse’s ‘org.eclipse.debug.core’ plugin directory.  If the ‘.launches’ directory does not exist, then create it.