DOT files layout


I have written some basic code to create a C project visualiser. Its main goal is to make it quick to understand the structure of a project: the dependencies between the files and how those files are arranged in directories. It does not handle #ifdef blocks yet, only comments (this is a quick and dirty prototype).

Here is the code

import os
import re
import random
import graphviz
# Directory containing the C files
directory = "VAMYTune-main"
output_file_path = 'output_graph'
# Regular expression to match #include directives
include_regex = re.compile(r'(?<!//|/\*)#include\s+["<](.+?)[">]')

def random_hex_color():
    return '#' + ''.join([random.choice('0123456789ABCDEF') for _ in range(6)])

def construct_graph(top_level_directory):
    # Create a new Digraph
    dot = graphviz.Digraph(comment='File Dependencies')

    adjacency_list = {}
    
    # Initialise the adjacency list, because some included files have no
    # dependencies of their own (standard library headers, for example)
    def init_adj_lst(parent_dir):
        for entry in os.listdir(parent_dir):
            entry_path = os.path.join(parent_dir, entry)
            if entry.endswith((".c", ".h", ".C", ".H")):
                with open(entry_path, 'r') as f:
                    content = f.read()
                    includes = include_regex.findall(content)
                    for included_file in includes:
                        adjacency_list[included_file.lower()] = []
            elif os.path.isdir(entry_path):
                init_adj_lst(entry_path)
    
    # Function to add edges between nodes in the adjacency list
    # Not added to the graphviz graph here (doing so would place the files incorrectly)
    def add_edges(file_path, node):
        with open(file_path, 'r') as f:
            for included_file in include_regex.findall(f.read()):
                # Check if the included file is already included by a parent file
                if included_file.lower() not in adjacency_list.get(node, []):
                    adjacency_list.setdefault(node, []).append(included_file.lower())

    # Function to recursively add subgraphs and nodes
    def add_subgraph(parent_dir, parent_graph : graphviz.Digraph):
        # List all files and directories in the current directory
        for entry in os.listdir(parent_dir):
            entry_path : str = os.path.join(parent_dir, entry)
            # If a subdirectory
            if os.path.isdir(entry_path):
                # Add subgraph for each subdirectory
                with parent_graph.subgraph(name='cluster_' + entry_path.replace(os.sep, '_')) as c:
                    # Set subgraph attributes
                    c.attr(label=entry, style='filled', color=random_hex_color())
                    c.node_attr.update(style='filled', color='white')
                    # Recursively add nodes and subgraphs
                    add_subgraph(entry_path, c)
            elif entry.endswith((".c", ".h", ".C", ".H")):
                # Add only c and h file nodes directly under the current parent graph
                parent_graph.node(entry.lower())
                # Add the edges to this node
                add_edges(entry_path, entry.lower())

    # initialize the adjlist
    init_adj_lst(top_level_directory)
    # Start the graph creation from the top-level directory
    add_subgraph(top_level_directory, dot)

    
    def get_all_deps(tabs,file,deps):
        res = set()
        for dep in deps[file]:
            res.add(dep)
            if deps[dep] == []:
                continue
            else:
                res |= (get_all_deps(tabs+1,dep,deps))
        return res

    def update_deps(file,deps):
        curr_deps = deps[file]
        new_deps = curr_deps
        for dep in curr_deps:
            all_dep = get_all_deps(1,dep,deps)
            new_deps = [x for x in new_deps if x not in set(new_deps).intersection(all_dep)]
        return new_deps
        

    def update_dependecies(deps):
        new_deps = {}
        for file, _ in adjacency_list.items():
            new_deps[file] = update_deps(file,deps)
        return new_deps
    
    updated_edges = update_dependecies(adjacency_list)
    
    # Update the actual edges in the Digraph
    for node, edges in updated_edges.items():
        for edge in edges:
            # Build the directed graph from the non-redundant edges. add_edges()
            # already de-duplicated each list, so no membership check is needed
            # (dot.edges is a method, not a collection that can be searched).
            dot.edge(edge.lower(), node.lower())
    
    # change direction
    dot.attr(rankdir='RL')
    dot.render(output_file_path, format='png')

    return dot

construct_graph(directory)
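
To make the edge-pruning step (get_all_deps / update_deps) easier to follow, here is a minimal standalone sketch of the same idea on a toy graph: a direct include is dropped when another direct include already reaches it. The names `reachable`, `reduce_edges` and the file names in `toy` are hypothetical and only for illustration. The sketch assumes the include graph has no cycles; the `seen` set only stops the recursion from looping forever, and edges that are part of a cycle may still be dropped.

```python
def reachable(start, deps, seen=None):
    """Return every file reachable from `start` by following includes."""
    seen = set() if seen is None else seen
    for dep in deps.get(start, []):
        if dep not in seen:  # already visited: avoids infinite recursion
            seen.add(dep)
            reachable(dep, deps, seen)
    return seen


def reduce_edges(deps):
    """Drop a direct include when another direct include already reaches it."""
    reduced = {}
    for file, direct in deps.items():
        indirect = set()
        for dep in direct:
            indirect |= reachable(dep, deps)
        reduced[file] = [d for d in direct if d not in indirect]
    return reduced


# Hypothetical toy project: main.c includes types.h both directly and indirectly
toy = {
    "main.c": ["synth.h", "midi.h", "types.h"],
    "synth.h": ["types.h"],
    "midi.h": ["types.h"],
    "types.h": [],
}

print(reduce_edges(toy)["main.c"])  # ['synth.h', 'midi.h']
```

The direct edge from main.c to types.h disappears because synth.h and midi.h already lead there, which is what keeps the rendered graph from filling up with redundant arrows.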

I have tested it on my old synthesizer project and the output was quite nice!

[Image: dot output for the synthesizer project]
Each node is a .c or a .h file and each subgraph is a directory where those files are contained.

However, when testing on larger projects, with a far more complex hierarchy than a simple sound/MIDI app, the visuals become hard to analyse. The edges get bundled together in groups and the individual links are hard to follow.

So I wondered whether it is possible to create a different type of edge, drawn between components this time. Basically, edges between the directories, to show what really depends on what “globally”. Another idea I had is to order the groups hierarchically: if one group clearly has more edges going from it to the others, it must be important. Lastly, it would be nice to order the nodes inside each group hierarchically as well.
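
To make the first two ideas concrete, here is a rough sketch of how file-level includes could be collapsed into weighted directory-level edges, and how the directories could then be ordered by how often the others include them. It is only an illustration under some assumptions: `file_dirs` (file name to containing directory) is a hypothetical mapping that the prototype does not build yet, the function names and toy data are made up, and the DOT text is produced by hand so the sketch runs without the graphviz package. Edges are drawn from the included directory to the including one, matching the direction used in the prototype.

```python
from collections import Counter


def directory_edges(file_dirs, file_deps):
    """Collapse file-level includes into weighted directory-level edges."""
    weights = Counter()
    for file, includes in file_deps.items():
        src = file_dirs.get(file)
        for inc in includes:
            dst = file_dirs.get(inc)
            # Skip headers outside the project (e.g. stdio.h) and
            # includes that stay inside the same directory
            if src is None or dst is None or src == dst:
                continue
            weights[(src, dst)] += 1
    return weights


def rank_directories(weights):
    """Order directories by how many includes point into them, most first."""
    incoming = Counter()
    for (src, dst), count in weights.items():
        incoming[dst] += count
        incoming[src] += 0  # make sure pure "includer" directories are listed
    return sorted(incoming, key=lambda d: (-incoming[d], d))


def to_dot(weights):
    """Render the edges as DOT text: included directory -> including directory."""
    lines = ["digraph components {", "    rankdir=RL;"]
    for (src, dst), count in sorted(weights.items()):
        lines.append(f'    "{dst}" -> "{src}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical toy project
file_dirs = {"main.c": "app", "synth.c": "audio", "synth.h": "audio",
             "midi.c": "midi", "midi.h": "midi", "types.h": "common"}
file_deps = {"main.c": ["synth.h", "midi.h", "stdio.h"],
             "synth.c": ["synth.h", "types.h"],
             "midi.c": ["midi.h", "types.h"],
             "midi.h": ["types.h"]}

weights = directory_edges(file_dirs, file_deps)
print(rank_directories(weights))  # ['common', 'audio', 'midi', 'app']
print(to_dot(weights))
```

The edge label carries the number of file-level includes behind each directory-level edge, so heavily coupled directories stand out. If the cluster layout should be kept instead of a separate graph, Graphviz's `compound=true` graph attribute together with the `lhead` / `ltail` edge attributes can clip an edge at a cluster border, which may be another way to get cluster-to-cluster edges.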

If you have a better suggestion for the visualisation, please do not hesitate to share it. Also note that the code is not optimised at all; again, it is just a prototype to test an idea.
