Useful links
Click on one of the categories below to go quickly to the corresponding link section.
Categories: Benchmarking and data sets,
 Algorithms for Graph Matching and Graph Edit Distance ,
 Graph Drawing,
 Combinatorial maps,
 Graph kernels,
 Other TCs,
 Various links
Benchmarking and data sets
 The Graph Database < http://mivia.unisa.it/datasets/graphdatabase>
 IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning
 Chemistry datasets provided either by the GREYC laboratory or by the chemoinformatics web site.
 The LEMS database of shapes
 ILPIso datasets
 The Graph Data Repository For Graph Edit Distance (GDR4GED)
 Results of ICPR 2016 contest on Graph edit distance
 Challenge 1: computation of the exact or an approximate graph edit distance
 Challenge 2: computation of a dissimilarity measure for graph classification
 ICPR 2016 Competition on Subgraph Spotting in Graph Representations of Comic Book Images (SSGCI)
The /ARG Database/is a huge collection of labeled and unlabeled graphs realized by the MIVIA Labof the University of Salerno (Italy). The aim of this collection is to provide the graph research community with a standard test ground for the benchmarking of graph matching algorithms.The database is organized in two section: labeled and unlabeled graphs.
Both labeled and unlabeled graphs have been randomly generated according to different generation models (2D, 3D and 4D regular meshes, 2D, 3D and 4D nearlyregular meshes, bounded valence, nearlybounded valence, randomly generated) regular, each involving different possible parameter settings.
As a result, 168 diverse kinds of graphs are contained in the database. Each type of unlabeled graph is represented by thousands of pairs of graphs for which an isomorphism or a graphsubgraph isomorphism relation holds, for a total of 143,600 graphs. Furthermore, each type of labeled graph is represented by thousands of pairs of graphs holding a not trivial common subgraph, for a total of 166,000 graphs. Currently graphs have size ranging from a few up to thousands of nodes.
The above website aims at introducing a repository of graph data sets
and corresponding benchmarks, covering a wide spectrum of different
applications. We expect that the graph repository provides a major
contribution towards promoting the use of graph based representations
and making graph based pattern recognition and machine learning
algorithms better comparable against each other. In future work we
want to further expand the graph set repository. Towards this end, we
highly encourage the community not only to use the available sets for
developing and testing their algorithms, but also to integrate their
own graph sets into our repository.
This repository is well documented in the following paper:
Riesen, K., Bunke, H.: IAM graph database repository for graph
based pattern recognition and machine learning, in N. da Vitoria Lobo et
al. (Eds.): Structural, Syntactic, and Statistical Pattern Recognition,
Springer, LNCS 5342, 2008, 287  297
Several databases of graphs corresponding to
classification or regression problems in
chemistry.
The Vision Group at LEMS proposes several shape databases composed of 99, 216 and 1032 shapes. This database is essential for people dealing with shape recongition. Interested readers may also have a look at this small database of 33 holed shapes. This database may (and should) be used in conjunction with one of the LEMS database.
Four synthetic datasets are provided. Each of them is composed of several triplets (pattern graph, target graph and groundtruth). The graphs have been randomly generated thanks to the ErdosRenyi model with or without the constraint of producing connected graphs. For each option, one version of the dataset has an exact mapping (equal labels between matched vertices/edges) whereas an other version includes noise on label values. The groundtruth information gives the onetoone vertex mapping involving the minimal cost assignment.
An additionnal dataset is composed of graphs generated from a real application. Target graphs are region adjacency graphs representing architectural plans. Pattern graphs are structural representations of graphical symbols. The groundtruth information gives the instances of patterns in the target, which correspond to instances of symbols on the plans.
To precisely evaluate the accuracy of Graph edit distance (GED) methods, GroundTruth with information about the quality of the matching between pairs of graphs (low level information) is required in addition to higher level information (label of the object represented by the graph, classification rates) usually provided and used as GT to evaluate the methods. Most of the publicly available Graph repositories with associated groundtruths are dedicated to evaluating graph classification or exact graph matching methods and so the matching correspondences as well as the distance between each pair of graphs are not directly evaluated.
This Dataset called GDR4GED consists of a graph database repository annotated with low level information like graph edit distances values (optimal and suboptimal) and also their corresponding node to node matching (Edit Paths). A set of performance evaluation metrics that can be computed using this dataset to assess the performance of GED methods is also proposed in the conference paper (GbR2015) describing this repository (available with the repository).
Measuring the dissimilarity between graphs is a key step in data analysis based on graph representations. This contest focusses on the definition and the computation of general dissimilarity measures through two challenges:
SSGCI is focused on the challenge of searching a query attributed graph in a database of attributed graphs and to provide the node correspondences between query and result graphs.
The graph datasets for the competition have been constructed by representing the contents of scanned comic book images by attributed region adjacency graphs (RAGs) containing attributes on nodes. The graphs are provided in GraphML format.
Algorithms for Graph Matching and Graph Edit Distance
 LAD Software <a href="http://liris.cnrs.fr/csolnon/LAD.html">
 VF Library <http://mivia.unisa.it/datasets/graphdatabase/vflib/>
 GUB Library <http://amalfi.dis.unina.it/graph/misc/gub_tool.tar>
 Graphm package <http://cbio.ensmp.fr/graphm/>
 Graph Edit distance Computation
 A software toolkit for graph edit distance computation
 A* algorithm with bipartite Heuristic (exact graph edit distance)
 Beam search
 Bipartite Graph edit distance using Assignment Algorithms
 Quadratic (QAP) and linear (LSAP) based approaches to solve the graph edit distance
 ILPIso
LAD is a software (written in C) for solving the subgraph isomorphism problem, the goal of which is to decide if there exists a copy of a pattern graph in a target graph. It is distributed under the CeCILLB FREE SOFTWARE LICENSE.
This is a graph matching library,developed by the Mivia Lab of the University of Salerno (Italy) which provides several algorithms for graph isomorphism, graphsubgraph isomorphism and graph monomorphism. The library is an efficient implementation of the VF2 graph matching algorithm, currently among the fastest ones, able to process very huge graphs due to its linear memory complexity.
The library is written in C++, and can be easily adapted to use different formats for the graphs, and different types of attributes. The web page contains the library documentation, including some usage examples, and the source files of the library.
This is a library for graph matching incorporating several algorithms. The library has been developed by Bruno Messmer at the University of Bern.
GraphM is a tool for approximate solving of large scale graph matching problems. The package was developed in the Centres for computational biology and for mathematical morphology at Mines ParisTech.
Graph edit distance is one of the most flexible mechanisms for error tolerant graph matching. We present here some links to several software computing this distance.
The present software provides both a library and a graphical interface to compute the edit distance between sets of graphs using various heuristics. Namely:
This software provides a set of tools allowing to compute fast sub optimal estimation of the edit distance using LSAP formulations and more precise ones using an exact formulation of the GED problem based on quadratic minimization procedures (QAP).
The QAP component of this software is based on the following paper while the LSAP component is based on this one.
ILIPIso is a software that implements the approach decribed in [Le Bodic et al. 2012] for substitutiontolerant subgraph isomorphism. It aims at finding assignments between a pattern graph and a target graph. Every vertex/edge of the pattern graph is matched to a different vertex/node in the target graph. Differences between attributes are allowed but penalized by a cost function. The matching is done so that the global assignment cost is minimal. It allows to process graphs whose vertices and/or edges can be labeled with numerical vector of any dimension.
The software can be configured in order to search for several instances of the matching, and return them by nondecreasing cost order. Several settings are available to discard solutions previously found, when searching for a new one.
Currently, only an online version and a compiled version are available (other architectures on demand), but source code should be provided soon.
Graph Drawing
 Graph Drawing <http://www.graphdrawing.org>
A site about graph drawing, mantained by some of the organizers of the International Symposium on Graph Drawing. The site contains links to several graph file formats, most of them based on XML.
Combinatorial Maps
 Descartes
 Girl Library
 An implementation of Combinatorial Pyramids
Descartes is a segmentation software based on a combinatorial map encoding of the partitions. It can import/export segmentation and is designed for an interactive use: The user select the regions to split or to merge.
A library which allows to structurate a partition thanks to combinatorial maps. Multiple adjacency and inclusions are encoded. The data structure may be efficiently updated after both split and merge operations.
A library which allows to design hierarchical segmentation algorithms using pyramids of combinatorial maps (combinatorial pyramids). Multiple adjacency are explicitely encoded and inside/outside (i.e. inclusion) relationships may be effciently computed. This implementation is based on an implicit encoding which allows to retreive any partition of the pyramid at a low memory cost.
Graph kernels

CHEMCPP
ChemCpp is a C++ toolbox for chemoinformatics focusing on the computation of kernel functions between chemical compounds.
Other TCs
 STRUCTURAL & SYNTACTICAL PATTERN RECOGNITION (TC2)
 Graphics Recognition (TC10)
 Reading systems (TC11)
IAPR's Technical Committee 2 on SSPR promotes interaction among researchers working on Structural and Syntactical Pattern Recognition (SSPR).
Graphics Recognition is an exciting field of pattern recognition. Along with optical character recognition (OCR) and document layout analysis, it forms the broader area of document image analysis and recognition. IAPR's Technical Committee 10 on Graphics Recognition promotes interaction among researchers working in document image analysis in general, and graphics recognition in particular.
Technical Committee 11 is concerned with theory and application of Reading Systems. The goal is to understand and develop systems which are able to analyze media containing character symbols and will return an encoded representation of the text content and structure.
This area covers research in the offline recognition of printed texts and handwritten material, research on the automatic online recognition of handwriting as in pencomputing applications, as well as research in the analysis of electronic documents such as web pages. Applications include digital libraries, penbased computing, check and mail reading, signature verification, web mining and content repurposing, web security using human interactive proofs, textual content analysis in videos, and historical document preservation and archiving.
The research topics cover all levels of processing, starting with the image or planar coordinates delivered by sensory technology, the necessary image and signalprocessing stages, the subsequent segmentation and normalization stages, to feature extraction and selection, classification methods, statistical and syntactical pattern recognition and machine learning techniques.
Various Links
 Simbad FP7 European Projet
 GbR'2011 report
 Machine learning open source software
This project aims at undertaking a thorough study of several aspects of purely similaritybased pattern analysis and recognition methods, from the theoretical, computational, and applicative perspective. It aims at covering a wide range of problems and perspectives. It will consider both supervised and unsupervised learning paradigms, generative and discriminative models, and our interest will range from purely theoretical problems to realworld practical applications.
This report documents the discussion about challenges in graphbased representations held at the 8th IAPR  TC15 workshop on graphbased representations in pattern recognition. The workshop took place in Münster (Germany) from the 18th to the 20th of May 2011. The discussion was initiated by four presenters (Milan Sonka, Luc Brun, Edwin Hancock, Pasquale Foggia) and moderated by Horst Bunke.
Open source tools have recently reached a level of maturity which makes them suitable for building largescale realworld systems. At the same time, the field of machine learning has developed a large body of powerful learning algorithms for a wide range of applications. Inspired by similar efforts in bioinformatics (BOSC) or statistics (useR), the aim of mloss community is to build a forum for open source software in machine learning.