Network analysis identifies criminal leaders with 92% accuracy, a level that manual investigation rarely achieves. But the relationships must first be discovered. That is what the graph does. It finds the invisible thread between a face, a voice, a plate, and a phone, and pulls.
A face appears in CCTV footage from a gas station at 9:14 PM. A 911 call is placed from the same gas station's payphone at 9:16 PM. A license plate is captured by an LPR camera two blocks east at 9:18 PM. A ShotSpotter acoustic event registers gunfire three blocks north at 9:22 PM. Four evidence items. Four different systems. Four different file formats. In most investigations, these items sit in four separate databases, reviewed by four different analysts, at four different times. The connection between them — the connection that places a specific person at a specific location minutes before a shooting, with a vehicle that fled through a specific route — is never discovered. Not because it does not exist, but because no system is looking for it.
The FBI Law Enforcement Bulletin acknowledged that manual examination of social networks remains "difficult, time-consuming, and arbitrary, making it more prone to error." Network analysis using graph algorithms achieves 92% accuracy in identifying criminal leaders. Knowledge graph adoption has been measured as a force multiplier equivalent to adding two analysts to an investigative team. The GraphAware implementation study validated this approach against 7 million real crime records from the Chicago Police Department — not synthetic data, but the messy, incomplete, and complex information that characterizes real criminal investigations.
Vault's Graph Intelligence engine transforms isolated evidence items into a living network of connections. Every detection produced by the AI Analysis layer — every face, voice, vehicle, phone, location, name, and financial transaction — becomes a node in the evidence graph. The Graph Intelligence engine then searches for edges — connections between nodes across time, space, identity, and modality. A face in video connected to a voice in audio. A phone at a GPS coordinate connected to a vehicle at the same location. A name spoken in a transcript connected to a name in a financial record. The graph does not hypothesize these connections. It discovers them — mathematically, across the entire evidence corpus, in seconds. And every edge it discovers is traceable to the specific evidence items and detections that produced it.
From temporal correlation to network topology analysis, every connection is discovered, scored, provenance-tagged, and court-defensible.
Time is the most fundamental correlator in criminal investigation. Events that happen close together in time are more likely to be related than events separated by hours or days. But when evidence is captured by different systems — body-cam footage timestamped by the camera's internal clock, 911 recordings timestamped by the dispatch system, LPR captures timestamped by the reader's GPS-synchronized clock, and ShotSpotter events timestamped by acoustic triangulation — the timestamps may differ by seconds or minutes due to clock drift, timezone configuration, and processing delay. Manual correlation requires an analyst to mentally align these timelines, adjusting for clock differences, and scan across sources for co-occurring events. The Temporal Correlation Engine automates this alignment and correlation at scale. First, it normalizes all timestamps across the evidence corpus to a unified time reference, correcting for timezone differences, clock drift (estimated from metadata patterns), and known processing delays for each source type. Then it scans the entire corpus for temporal clusters — groups of evidence items whose normalized timestamps fall within a configurable window. For a shooting investigation, the window might be 10 minutes; for a surveillance operation, 30 minutes; for a trafficking network analysis, 24 hours. Each temporal cluster represents a potential connection: events that occurred close enough together to be related. The engine scores each cluster by density (how many evidence sources are represented), diversity (how many different evidence types are present), and investigative relevance (whether the clustered items involve persons or vehicles already flagged as persons of interest). High-scoring clusters surface to the top of the investigator's review queue. 
A cluster containing body-cam footage, a 911 call, an LPR capture, and a ShotSpotter alert within an 8-minute window — all geolocated within a half-mile radius — is almost certainly a single incident captured from four perspectives. The investigator who receives this cluster sees the complete picture instantly, instead of discovering each piece independently over days of manual review.
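The normalize-then-cluster step described above can be sketched in a few lines. This is a minimal illustration, not Vault's implementation: the `EvidenceItem` type, the `CLOCK_OFFSETS` table, the 45-second CCTV correction, and the calendar date are all assumptions made for the example, and the score covers only source diversity rather than the full density, diversity, and relevance ranking.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EvidenceItem:
    evidence_id: str
    source_type: str      # e.g. "cctv", "911", "lpr", "shotspotter"
    timestamp: datetime   # timezone-aware timestamp as recorded by the source


# Hypothetical per-source clock corrections (assumed, not measured values).
CLOCK_OFFSETS = {"cctv": timedelta(seconds=-45)}


def normalized_time(item):
    """Convert to UTC and apply the assumed drift correction for the source."""
    offset = CLOCK_OFFSETS.get(item.source_type, timedelta(0))
    return item.timestamp.astimezone(timezone.utc) + offset


def temporal_clusters(items, window):
    """Group items so each cluster spans at most `window` from its first item."""
    ordered = sorted(items, key=normalized_time)
    clusters, current = [], []
    for item in ordered:
        if current and normalized_time(item) - normalized_time(current[0]) > window:
            clusters.append(current)
            current = []
        current.append(item)
    if current:
        clusters.append(current)
    return clusters


def diversity(cluster):
    """Number of distinct source types represented in the cluster."""
    return len({item.source_type for item in cluster})


def at(hour, minute):
    # Illustrative date only; the scenario in the text gives times, not a date.
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


items = [
    EvidenceItem("E1", "cctv", at(21, 14)),
    EvidenceItem("E2", "911", at(21, 16)),
    EvidenceItem("E3", "lpr", at(21, 18)),
    EvidenceItem("E4", "shotspotter", at(21, 22)),
    EvidenceItem("E5", "lpr", at(23, 40)),   # unrelated capture hours later
]
clusters = temporal_clusters(items, window=timedelta(minutes=10))
ranked = sorted(clusters, key=diversity, reverse=True)
```

With a 10-minute window, the four gas-station items land in one cluster and the late LPR capture lands in its own. Anchoring each cluster to its first item keeps the window a hard upper bound on cluster span, which is simpler to explain in court than a chained "each item within N minutes of the previous one" rule.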
Criminals operate in space. They stage at specific locations before operations. They use specific routes for transportation. They control specific territories for distribution. They meet at specific locations for coordination. These spatial patterns are encoded in the evidence — GPS coordinates in phone extractions, geolocation in body-cam metadata, addresses in witness statements, landmarks visible in surveillance footage, cell tower connections in CDR data — but they are distributed across dozens of evidence items in different formats. The Spatial Co-Occurrence engine extracts geographic information from every evidence item in the corpus and maps it onto a unified spatial grid. GPS coordinates are extracted directly from metadata. Addresses mentioned in transcripts and documents are geocoded. Landmarks and street signs detected by computer vision are geolocated against mapping databases. Cell tower connections from CDR data are mapped to coverage areas. The engine then identifies spatial co-occurrences: instances where different entities — people, vehicles, phones — appear at the same location across different evidence sources. Person A detected on CCTV Camera 7 and Vehicle B captured by the eastbound LPR both appearing within 200 meters of the same intersection, at overlapping timestamps, on three separate days, reveals a spatial pattern that suggests the vehicle was pre-positioned at a location the person frequented. This pattern — invisible when each evidence source is reviewed independently — becomes immediately apparent in the spatial co-occurrence graph. The engine supports configurable spatial resolution: exact location matching for precise investigations (was this phone at this ATM at this time?), neighborhood-level matching for pattern analysis (does this vehicle appear regularly in this area?), and regional-level matching for network mapping (which cities does this organization operate in?).
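A rough sketch of the co-occurrence test, assuming every sighting has already been reduced to a latitude, a longitude, and a timestamp. The `Sighting` type, the coordinates, the dates, and the 15-minute gap are hypothetical; the 200-meter radius comes from the example above. Distance uses the standard haversine formula.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations


@dataclass(frozen=True)
class Sighting:
    entity: str
    lat: float
    lon: float
    when: datetime


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def co_occurrence_days(sightings, radius_m=200.0, max_gap=timedelta(minutes=15)):
    """Count distinct days on which two different entities were seen close
    together in both space and time. Pairwise scan: O(n^2), fine for a sketch."""
    days = defaultdict(set)
    for a, b in combinations(sightings, 2):
        if a.entity == b.entity:
            continue
        if abs(a.when - b.when) > max_gap:
            continue
        if haversine_m(a.lat, a.lon, b.lat, b.lon) > radius_m:
            continue
        days[tuple(sorted((a.entity, b.entity)))].add(a.when.date())
    return {pair: len(d) for pair, d in days.items()}


# Illustrative data: roughly 111 m apart on three days, plus one distant vehicle.
sightings = []
for day in (1, 2, 3):
    sightings.append(Sighting("person_A", 41.8800, -87.6300, datetime(2024, 1, day, 21, 0)))
    sightings.append(Sighting("vehicle_B", 41.8810, -87.6300, datetime(2024, 1, day, 21, 5)))
sightings.append(Sighting("vehicle_C", 41.9000, -87.6300, datetime(2024, 1, 1, 21, 2)))

pattern = co_occurrence_days(sightings)
```

Widening `radius_m` is one way to move from exact-location matching toward neighborhood-level matching. A production system would likely replace the pairwise scan with a spatial index.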
Criminal investigations rarely encounter individuals by their legal name consistently across all evidence. A suspect may be "Mike R." in a transcribed body-cam encounter, "Miguel Rodriguez" in a phone extraction's contact list, "M. Rodriguez" on a lease document found during a search, "Mikey" in a text message thread, and "Rodriguez, Miguel A." in a prior arrest record. These are not five individuals. They are one individual referenced five different ways in five different evidence sources. But in a system that treats each reference literally, they appear as five separate nodes — and the connections between "Mikey" in a text message and "Rodriguez, Miguel A." in an arrest record are never discovered. Entity Resolution is the process of determining that multiple references across multiple evidence sources refer to the same real-world entity. Vault's Entity Resolution engine uses a multi-signal approach combining phonetic matching (names that sound similar across languages and transliterations), fuzzy string matching (names with spelling variations, truncations, and abbreviations), contextual co-occurrence (references that appear in proximity to the same addresses, phone numbers, or associates), and cross-modal confirmation (a name in a transcript confirmed by a face in a photograph within the same evidence item). Each resolution is scored by confidence. High-confidence resolutions (above 90%) are applied automatically, merging the fragmented references into a single unified node in the evidence graph. Medium-confidence resolutions (70-90%) are presented to the investigator for confirmation. Low-confidence candidates are flagged for review but not merged. Every resolution — automated or human-confirmed — is documented with the specific evidence references and matching signals that produced it, ensuring that the defense can challenge any specific identity fusion at trial.
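The confidence triage can be illustrated with a deliberately reduced scorer. This sketch assumes only two of the four signals: fuzzy string similarity from the standard library's `difflib.SequenceMatcher`, and a single contextual signal (a shared phone number). Phonetic matching and cross-modal confirmation are omitted. The `Reference` type, the 0.7/0.3 weights, and the phone numbers are hypothetical; only the 90% and 70% thresholds come from the text.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Reference:
    name: str
    phones: frozenset = frozenset()   # contextual signal: associated numbers


def normalize_name(name):
    """Lowercase, turn 'Last, First' into 'First Last', drop punctuation.
    Simplification: keeps ASCII letters only."""
    name = name.lower()
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    name = re.sub(r"[^a-z ]", "", name)
    return " ".join(name.split())


def resolution_score(a, b):
    """Weighted blend of name similarity and shared context (assumed weights)."""
    name_sim = SequenceMatcher(None, normalize_name(a.name), normalize_name(b.name)).ratio()
    shared_context = 1.0 if a.phones & b.phones else 0.0
    return 0.7 * name_sim + 0.3 * shared_context


def triage(score):
    """Thresholds from the text: above 90% merge, 70-90% confirm, else flag."""
    if score > 0.90:
        return "auto_merge"
    if score >= 0.70:
        return "needs_confirmation"
    return "flag_only"


arrest = Reference("Rodriguez, Miguel A.", frozenset({"555-0101"}))
lease = Reference("Miguel A. Rodriguez", frozenset({"555-0101"}))
contact = Reference("Miguel Rodriguez")
other = Reference("Sarah Chen")

decisions = {
    "arrest~lease": triage(resolution_score(arrest, lease)),
    "contact~contact": triage(resolution_score(contact, Reference("miguel rodriguez"))),
    "arrest~other": triage(resolution_score(arrest, other)),
}
```

Under these weights, an exact name match with no corroborating context tops out at 0.70 and goes to the investigator rather than merging automatically, which reflects the fact that two different people can share a name.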
The most powerful evidence connections are the ones that link an individual across different sensory modalities, proving that the person seen in one piece of evidence is the same person heard in another, without requiring anyone to identify them by name. A voice print extracted from a 911 call can be compared against voice prints from interview recordings, wiretap intercepts, and voicemails extracted from phone data. If the voice in the 911 call matches a speaker in a witness interview, the graph creates an edge linking the 911 caller to the witness, revealing that someone who called in the crime also appeared as a witness, a fact that dramatically changes the investigative picture. Similarly, a face embedding from CCTV footage can be compared against face embeddings from body-cam recordings, social media photographs in phone extractions, and driver's license photos in DMV records. A gait signature extracted from one camera feed can be compared against gait signatures from cameras at different locations. This is the Person RE-ID technology that tracks individuals by how they walk, their body proportions, and their clothing without requiring facial recognition. The Biometric Cross-Matching engine combines all three modalities (voice, face, and gait) to produce the highest-confidence identity correspondences available without traditional biometric databases. Each modality is matched only against its own kind: voice against voice, face against face, gait against gait. When a voice print from a 911 call, a face in CCTV footage, and a gait signature in body-cam footage each resolve to the same individual, the convergence of three independent biometric modalities produces a correspondence confidence that approaches certainty, and every step of the matching process is documented with the specific features, models, and confidence scores that produced it.
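Within a single modality, comparing two prints usually reduces to comparing two embedding vectors. The sketch below assumes cosine similarity as the comparison metric and a 0.9 acceptance threshold. Both are illustrative choices, and the three-dimensional vectors stand in for real embeddings, which have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(probe, gallery, threshold=0.9):
    """Return (label, similarity) for the closest gallery entry of the same
    modality, or None when nothing reaches the (assumed) threshold."""
    best = None
    for label, embedding in gallery.items():
        sim = cosine_similarity(probe, embedding)
        if best is None or sim > best[1]:
            best = (label, sim)
    if best is None or best[1] < threshold:
        return None
    return best


# Toy voice-print gallery; labels and vectors are made up for illustration.
voice_gallery = {
    "interview_speaker_2": [0.9, 0.1, 0.0],
    "wiretap_speaker_1": [0.0, 1.0, 0.0],
}
hit = best_match([1.0, 0.0, 0.0], voice_gallery)
miss = best_match([0.0, 0.0, 1.0], voice_gallery)
```

Returning the similarity alongside the label matters here: the score is what gets written into the edge's provenance record, and returning `None` below the threshold keeps weak matches out of the graph entirely.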
Every temporal cluster, spatial co-occurrence, entity resolution, and biometric cross-match produced by the preceding engines is a discovered relationship. The Evidence Graph Construction engine assembles all of these relationships into a unified knowledge graph — a mathematical structure where every entity (person, vehicle, phone, location, financial account, evidence item) is a node, and every discovered relationship between entities is an edge with a type, a confidence score, and a provenance trail linking it to the specific evidence that produced it. This graph is not a visualization. It is a computational object that can be analyzed using graph algorithms — the same mathematical tools that power social network analysis, fraud detection, and biological network mapping. Betweenness centrality identifies nodes that sit on the shortest paths between many other nodes — in a criminal network, these are the intermediaries, the brokers, the connectors whose removal would fragment the network. Community detection algorithms identify clusters of densely connected nodes — in an investigation, these are operational cells, social circles, or geographic territories. PageRank identifies the most influential nodes — the leaders whose connections radiate outward through the network. Shortest-path analysis reveals the minimum chain of connections between any two entities — showing, for example, that a suspect and a victim are connected through only two intermediaries, or that a financial account and a physical address are linked through a phone number that appears in a text message. The graph topology itself becomes intelligence. A network with high clustering and few bridges between clusters suggests a cell-structured organization. A network with a single high-centrality node suggests a hierarchical command structure. 
A network where the highest-centrality node has no direct connections to criminal activity suggests a sophisticated operator who delegates through layers of intermediaries. The graph reveals the structure that the evidence contains but that linear review cannot see.
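Two of the algorithms named above fit in a short, dependency-free sketch: PageRank by power iteration and shortest path by breadth-first search. The adjacency-list graph and its node names are invented for illustration, and the damping factor of 0.85 is the conventional default rather than a documented Vault setting. A production system would more likely call a graph library than hand-roll these.

```python
from collections import deque


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over an adjacency-list graph."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbors in graph.items():
            if neighbors:
                share = damping * rank[node] / len(neighbors)
                for nb in neighbors:
                    new_rank[nb] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for other in graph:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list of one shortest path or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nb in graph[node]:
            if nb not in previous:
                previous[nb] = node
                queue.append(nb)
    return None


# Undirected toy graph: every edge is listed in both directions.
graph = {
    "suspect": ["broker"],
    "broker": ["suspect", "courier", "landlord"],
    "courier": ["broker", "victim"],
    "landlord": ["broker"],
    "victim": ["courier"],
}
ranks = pagerank(graph)
most_influential = max(ranks, key=ranks.get)
chain = shortest_path(graph, "suspect", "victim")
```

Here the chain from suspect to victim runs through exactly two intermediaries, matching the example in the text, and the best-connected node takes the top rank.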
A criminal organization is a network with structure. It has leaders who make decisions, lieutenants who coordinate operations, operatives who execute tasks, and associates who provide support services. These roles are rarely documented in evidence directly: no one writes "I am the leader of this organization" in a text message. Instead, roles are revealed by communication patterns (who initiates contact, who responds, who is copied), financial flows (who pays whom, and in which direction money moves), temporal authority (whose schedule determines when operations occur), geographic range (whose territory is largest), and network centrality (who connects the most otherwise-disconnected parts of the network). The Network Discovery engine infers organizational structure from these patterns automatically. It identifies leaders as nodes with high PageRank and eigenvector centrality: individuals whose connections reach deeply into the network through multiple layers. It identifies lieutenants as nodes with high betweenness centrality: individuals who bridge between operational cells. It identifies operatives as nodes with high degree centrality within a single community: individuals deeply connected within their own cell but not connected to other cells. It identifies associates as peripheral nodes with low centrality but connections to operatives: individuals on the edges of the network who provide logistics, housing, transportation, or other support. The resulting network map shows not just who is connected to whom, but what role each individual plays in the organization. That intelligence transforms a list of suspects into an actionable understanding of how the organization operates, where it is vulnerable, and which members' removal would cause the greatest disruption.
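The role rules above amount to an ordered decision list over centrality scores. The sketch below assumes the scores have already been computed and normalized to the range 0 to 1. The `NodeMetrics` fields, the single `HIGH` cut-off of 0.6, and the order in which the rules are checked are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeMetrics:
    pagerank: float               # all scores assumed pre-normalized to [0, 1]
    eigenvector: float
    betweenness: float
    community_degree: float       # degree centrality inside the node's own community
    cross_community_edges: int    # edges leading into other communities


HIGH = 0.6  # hypothetical cut-off for "high" centrality


def infer_role(m):
    """Apply the role rules in priority order: leader, lieutenant, operative, associate."""
    if m.pagerank >= HIGH and m.eigenvector >= HIGH:
        return "leader"
    if m.betweenness >= HIGH:
        return "lieutenant"
    if m.community_degree >= HIGH and m.cross_community_edges == 0:
        return "operative"
    return "associate"


roles = {
    "n1": infer_role(NodeMetrics(0.9, 0.8, 0.3, 0.4, 2)),
    "n2": infer_role(NodeMetrics(0.4, 0.3, 0.7, 0.5, 3)),
    "n3": infer_role(NodeMetrics(0.2, 0.2, 0.1, 0.9, 0)),
    "n4": infer_role(NodeMetrics(0.1, 0.1, 0.0, 0.2, 0)),
}
```

Fixed thresholds are the weakest part of this sketch. Ranking nodes within each network (for example, top decile by PageRank) would adapt better to networks of different sizes.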
The most dangerous members of a criminal network are often the ones who are hardest to find — not because they are absent from the evidence, but because they are deliberately structured to be invisible. A handler who communicates with cell leaders through disposable phones, never appears in surveillance footage, and has no direct connection to any criminal activity will not surface through traditional investigative methods. But in the evidence graph, this individual appears as a topological anomaly: a node with high betweenness centrality (connecting otherwise-disconnected cells) but low degree centrality (few direct connections) and zero connections to nodes involved in criminal events. This pattern — high influence, low visibility — is the signature of a sophisticated operator who insulates themselves through layers of intermediaries. The Anomaly Detection engine scans the evidence graph for these structural signatures. Cutout patterns: nodes that connect two subgraphs but have no direct connections to operational activity — potential intermediaries used to insulate leadership. Shadow hierarchies: chains of authority that parallel the visible command structure but operate through different communication channels. Temporal anomalies: sudden changes in communication frequency, new connections appearing simultaneously (suggesting a coordinated operational shift), or established connections disappearing (suggesting counter-surveillance awareness). Financial anomalies: money flowing through nodes that have no other connections to the network — potential money laundering or shell entity patterns. Each detected anomaly is flagged with the topological pattern that triggered it, the specific nodes and edges involved, and a priority score based on the anomaly's potential significance to the investigation. The most sophisticated criminals design their networks to avoid the patterns that traditional analysis looks for. 
Graph anomaly detection finds them precisely because their efforts to be invisible create a different kind of pattern — the pattern of deliberate absence — that is itself detectable.
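The "high influence, low visibility" signature can be tested directly: compute betweenness centrality, then look for nodes that score high on it while having few direct connections and no neighbor tied to a criminal event. The sketch uses Brandes' algorithm for unweighted, undirected graphs. The toy graph, the 0.5 normalized-betweenness cut-off, and the degree limit of 2 are assumptions for illustration.

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm on an unweighted, undirected adjacency-list graph.
    Returns unnormalized counts of shortest paths passing through each node."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in score.items()}


def find_cutouts(graph, event_nodes, min_betweenness=0.5, max_degree=2):
    """Flag nodes with high normalized betweenness, low degree, and no
    neighbor in `event_nodes` (nodes tied to criminal events)."""
    n = len(graph)
    pairs = (n - 1) * (n - 2) / 2.0
    if pairs <= 0:
        return []  # fewer than three nodes: no node can sit between two others
    bc = betweenness(graph)
    flagged = []
    for node, neighbors in graph.items():
        if (bc[node] / pairs >= min_betweenness
                and len(neighbors) <= max_degree
                and not set(neighbors) & event_nodes):
            flagged.append(node)
    return flagged


# Two three-person cells joined only through a handler.
graph = {
    "a1": ["a2", "a3", "handler"], "a2": ["a1", "a3"], "a3": ["a1", "a2"],
    "handler": ["a1", "b1"],
    "b1": ["b2", "b3", "handler"], "b2": ["b1", "b3"], "b3": ["b1", "b2"],
}
event_nodes = {"a2", "a3", "b2", "b3"}
bc = betweenness(graph)
cutouts = find_cutouts(graph, event_nodes)
```

In this graph the two cell contacts also score high on betweenness, but each has three direct connections and a neighbor tied to an event, so only the handler is flagged.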
An evidence graph with 47,291 connections is a powerful investigative tool. It is also a liability if any single connection cannot be explained, verified, and defended. When the prosecution presents a graph showing that the defendant is connected to a murder victim through three intermediaries, the defense will challenge each link: "Show me the evidence that produces this connection. Show me the algorithm. Show me the confidence score. Show me the error rate. Show me that this connection could not have been produced by chance." The Graph Provenance engine ensures that every edge in the evidence graph carries complete provenance documentation. Each edge records: the correlation type that produced it (temporal, spatial, entity resolution, biometric, or composite), the specific evidence items involved (by SHA-256 hash and evidence ID), the specific detections within those evidence items that triggered the correlation (face detection at frame 14,847 with confidence 0.94; voice print segment at timestamp 04:17-04:21 with similarity score 0.91), the algorithm name and version used for the correlation, the confidence threshold applied, the known error rate for this type of correlation in comparable conditions, and whether a human analyst reviewed and confirmed the connection. For edges produced by composite correlations — where multiple correlation types converge on the same connection — the provenance records each contributing correlation independently, showing the defense that the connection is not based on a single algorithmic output but on the convergence of multiple independent signals. The provenance documentation is designed to satisfy Daubert reliability standards: the methodology is testable, the error rates are known, the algorithms are peer-reviewed, and the results are reproducible. When the defense challenges an edge, the prosecution does not say "the AI found it." 
The prosecution says "here is the face detection at frame 14,847 with 94% confidence, here is the voice match at similarity 0.91, here is the spatial co-occurrence within 200 meters at three overlapping timestamps, and here is the entity resolution confirming the same individual through phonetic, contextual, and cross-modal matching." The graph does not speak for itself. Its provenance speaks for it.
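One plausible shape for the per-edge record, using the fields listed above. The class names, field names, algorithm name, and version string are hypothetical, and the byte strings stand in for real evidence files. The error rate is left as `None` in the example because a real value has to come from validation testing, not from the record itself.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


def sha256_hex(data: bytes) -> str:
    """Content hash used to pin an edge to the exact evidence bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Detection:
    evidence_id: str
    evidence_sha256: str
    kind: str        # e.g. "face_detection", "voice_print_segment"
    locator: str     # frame number or time range inside the evidence item
    score: float


@dataclass(frozen=True)
class EdgeProvenance:
    correlation_type: str                  # temporal, spatial, entity_resolution, biometric, composite
    detections: Tuple[Detection, ...]
    algorithm: str
    algorithm_version: str
    confidence_threshold: float
    known_error_rate: Optional[float]      # None until measured under comparable conditions
    human_reviewed: bool = False

    def to_json(self) -> str:
        """Deterministic serialization so the record itself can be hashed and audited."""
        return json.dumps(asdict(self), sort_keys=True)


video_bytes = b"placeholder for CCTV file contents"
audio_bytes = b"placeholder for 911 recording contents"

edge = EdgeProvenance(
    correlation_type="composite",
    detections=(
        Detection("EV-001", sha256_hex(video_bytes), "face_detection", "frame 14847", 0.94),
        Detection("EV-002", sha256_hex(audio_bytes), "voice_print_segment", "04:17-04:21", 0.91),
    ),
    algorithm="example-matcher",      # hypothetical name
    algorithm_version="0.0.0",        # hypothetical version
    confidence_threshold=0.90,
    known_error_rate=None,
)
record = json.loads(edge.to_json())
```

Freezing the dataclasses and sorting keys on serialization means the same edge always produces the same bytes, so the provenance record can be hashed and checked for tampering like any other evidence item.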
Three networks. Three invisible structures revealed. Every connection evidence-grounded. Every conviction sustained on appeal.
Every node an entity. Every edge a discovery. Every connection evidence-grounded. Every conviction defensible.