by ordo-one
Fuzzy string matches at full speed
# Add to your Claude Code skills
git clone https://github.com/ordo-one/FuzzyMatch
A high-performance fuzzy string matching library for Swift.
FuzzyMatch was developed for searching financial instrument databases — stock tickers, fund names, ISINs — where typo tolerance, prefix-aware ranking, and sub-millisecond latency matter. The same qualities make it well suited to any domain with a large, heterogeneous candidate set: code identifiers, file names, product catalogs, contact lists, or anything else a user might search with imprecise input.
Full API documentation is available on the Swift Package Index.
- Sendable compliance for concurrent usage
- [Range<String.Index>] ranges for matched characters in scored results, with full support for typos, transpositions, and Unicode normalization

Add FuzzyMatch to your Package.swift:
dependencies: [
    .package(url: "https://github.com/ordo-one/FuzzyMatch.git", from: "1.0.0")
]
Then add it to your target dependencies:
.target(
    name: "YourTarget",
    dependencies: ["FuzzyMatch"]
)
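The two fragments above slot into a manifest roughly as follows. This is a minimal sketch: the package name `YourPackage` and the tools version `6.0` are assumptions (the text only mentions Swift 6.0 in passing), so adjust both to your project.

```swift
// swift-tools-version:6.0
// Package.swift sketch; "YourPackage" and "YourTarget" are placeholders.
import PackageDescription

let package = Package(
    name: "YourPackage",
    dependencies: [
        // Package-level dependency on the FuzzyMatch repository
        .package(url: "https://github.com/ordo-one/FuzzyMatch.git", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "YourTarget",
            // Target-level dependency on the FuzzyMatch product
            dependencies: ["FuzzyMatch"]
        )
    ]
)
```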
import FuzzyMatch
let matcher = FuzzyMatcher()
// One-shot scoring: the simplest API
if let match = matcher.score("getUserById", against: "getUser") {
    print("score=\(match.score), kind=\(match.kind)")
}
// Top-N matching: returns sorted results
let query = matcher.prepare("config")
let top3 = matcher.topMatches(
    ["appConfig", "configManager", "database", "userConfig"],
    against: query,
    limit: 3
)
for result in top3 {
    print("\(result.candidate): \(result.match.score)")
}
The Examples/FuzzySearch/ directory contains a macOS app for exploring how FuzzyMatch works interactively. It loads a 271K financial instrument corpus and live-searches as you type, showing the top results with highlighted matched characters. Switch between Edit Distance and Smith-Waterman algorithms to see how they rank differently, tweak all algorithm parameters in the inspector panel and see results update live, or use File > Open (Cmd+O) to load your own newline-delimited data.
Open the Xcode project and hit Run:
open Examples/FuzzySearch/FuzzySearch.xcodeproj
For quick exploration, prototyping, or when scoring a small number of candidates:
let matcher = FuzzyMatcher()
// One-shot: prepare + score in a single call
if let match = matcher.score("getUserById", against: "usr") {
    print("Score: \(match.score)")
}
// Candidate list used by the calls below
let candidates = ["appConfig", "configManager", "database", "userConfig"]
// Top-N: returns the best matches sorted by score
let query = matcher.prepare("config")
let top5 = matcher.topMatches(candidates, against: query, limit: 5)
// All matches: returns every match sorted by score
let all = matcher.matches(candidates, against: query)
Note: Convenience methods allocate a new buffer per call. For high-throughput or latency-sensitive use, see High-Performance API below.
For scoring many candidates against the same query — the recommended path for production use, interactive search, and batch processing:
let matcher = FuzzyMatcher()
// 1. Prepare the query once (precomputes bitmask, trigrams, etc.)
let query = matcher.prepare("getUser")
// 2. Create a reusable buffer (eliminates allocations in the scoring loop)
var buffer = matcher.makeBuffer()
// 3. Score candidates with zero heap allocations per call
let candidates = ["getUserById", "getUsername", "setUser", "fetchData"]
for candidate in candidates {
    if let match = matcher.score(candidate, against: query, buffer: &buffer) {
        print("\(candidate): score=\(match.score), kind=\(match.kind)")
    }
}
Output:
getUserById: score=0.9988, kind=prefix
getUsername: score=0.9988, kind=prefix
setUser: score=0.9047619047619048, kind=prefix
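The comment on step 1 mentions that `prepare` precomputes a bitmask, but the text does not say how FuzzyMatch uses it. The following is a generic sketch of a character-bitmask prefilter, not the library's confirmed implementation. `charMask` and `mayMatch` are hypothetical names, and the 36-bit layout (case-folded ASCII letters plus digits) is an assumption made for illustration.

```swift
/// Builds a mask with one bit per distinct ASCII letter (case-folded) or digit.
/// All other bytes are ignored. Hypothetical helper, not part of FuzzyMatch.
func charMask(_ text: String) -> UInt64 {
    var mask: UInt64 = 0
    for byte in text.utf8 {
        switch byte {
        case UInt8(ascii: "a")...UInt8(ascii: "z"):
            mask |= UInt64(1) << UInt64(byte - UInt8(ascii: "a"))        // bits 0...25
        case UInt8(ascii: "A")...UInt8(ascii: "Z"):
            mask |= UInt64(1) << UInt64(byte - UInt8(ascii: "A"))        // folded onto bits 0...25
        case UInt8(ascii: "0")...UInt8(ascii: "9"):
            mask |= UInt64(1) << (UInt64(byte - UInt8(ascii: "0")) + 26) // bits 26...35
        default:
            break
        }
    }
    return mask
}

/// Returns false only when the candidate lacks more distinct query characters
/// than `maxMissing` allows, so full scoring can be skipped for it.
func mayMatch(queryMask: UInt64, candidateMask: UInt64, maxMissing: Int) -> Bool {
    let missing = queryMask & ~candidateMask
    return missing.nonzeroBitCount <= maxMissing
}

let queryMask = charMask("getUser")
print(mayMatch(queryMask: queryMask, candidateMask: charMask("getUserById"), maxMissing: 0)) // true
print(mayMatch(queryMask: queryMask, candidateMask: charMask("fetchData"), maxMissing: 2))   // false
```

The appeal of a filter like this is that it costs a couple of integer operations per candidate, so obviously unrelated strings can be rejected before any per-character work.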
For the highest possible throughput, use score(utf8:against:buffer:) with pre-extracted UTF-8 bytes. This @inlinable method enables cross-module inlining that the String overload cannot achieve on Swift 6.0 (where String.withUTF8 is non-inlinable), delivering 50-100% higher throughput depending on the algorithm:
let matcher = FuzzyMatcher()
let query = matcher.prepare("getUser")
var buffer = matcher.makeBuffer()
// `for var` is required because String.withUTF8 is a mutating method
for var candidate in candidates {
    candidate.withUTF8 { utf8 in
        if let match = matcher.score(utf8: utf8, against: query, buffer: &buffer) {
            print("score=\(match.score)")
        }
    }
}
Note: This performance gap is a Swift 6.0 limitation. When the library adopts Swift 6.2+ Span, the String API will recover full throughput and this method may be deprecated.
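One way to read "pre-extracted UTF-8 bytes" is to encode each candidate once, outside the hot loop, and hand out buffer pointers afterwards. The sketch below uses only the standard library. `countUppercase(utf8:)` is a hypothetical stand-in for a byte-level scorer, chosen so the example runs without the package. In real code the closure body would presumably call `score(utf8:against:buffer:)`, which the loop above passes an `UnsafeBufferPointer<UInt8>`.

```swift
// Encode every candidate once, up front. This trades memory for a simpler hot loop.
let candidates = ["getUserById", "getUsername", "setUser", "fetchData"]
let encoded: [ContiguousArray<UInt8>] = candidates.map { ContiguousArray($0.utf8) }

/// Hypothetical stand-in for a byte-level scorer: counts ASCII uppercase bytes.
func countUppercase(utf8: UnsafeBufferPointer<UInt8>) -> Int {
    var count = 0
    for byte in utf8 where byte >= 65 && byte <= 90 {
        count += 1
    }
    return count
}

var counts: [Int] = []
for bytes in encoded {
    // withUnsafeBufferPointer is non-mutating, so no `for var` is needed here.
    bytes.withUnsafeBufferPointer { utf8 in
        counts.append(countUppercase(utf8: utf8))
    }
}
print(counts) // [3, 1, 1, 1]
```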
// Edit distance mode with custom tuning
let config = MatchConfig(
    minScore: 0.5,
    algorithm: .editDistance(EditDistanceConfig(
        maxEditDistance: 3,       // Allow up to 3 edits (default: 2)
        prefixWeight: 2.0,        // Boost prefix matches (default: 1.5)
        substringWeight: 0.8,     // Weight for substring matches (default: 1.0)
        wordBoundaryBonus: 0.12,  // Bonus for word boundary matches (default: 0.1)
        consecutiveBonus: 0.06,   // Bonus for consecutive matches (default: 0.05)
        gapPenalty: .affine(open: 0.04, extend: 0.01)  // Gap penalty model
    ))
)
let matcher = FuzzyMatcher(config: config)
// Smith-Waterman mode with custom tuning
let swConfig = MatchConfig(
    algorithm: .smithWaterman(SmithWatermanConfig(
        penaltyGapStart: 5,
        bonusBoundary: 10,
        bonusCamelCase: 7
    ))
)
let swMatcher = FuzzyMatcher(config: swConfig)
FuzzyMatcher uses scoring bonuses to improve ranking quality:
- Matches at word boundaries, such as camelCase transitions (getUserById), snake_case boundaries (get_user), and after digits, receive a bonus
- Gap penalty models:
  - .affine(open:extend:) (default) - Starting a gap costs more than continuing one
  - .linear(perCharacter:) - Each gap character costs the same

This means queries like "gubi" will rank "getUserById" higher than "debugging" because the query characters match at word boundaries.
// Disable bonuses for pure edit-distance scoring
let noBonusConfig = MatchConfig(
    algorithm: .editDistance(EditDistanceConfig(
        wordBoundaryBonus: 0.0,
        consecutiveBonus: 0.0,
        gapPenalty: .none,
        firstMatchBonus: 0.0
    ))
)
// Use linear gap penalty instead of affine
let linearConfig = MatchConfig(
    algorithm: .editDistance(EditDistanceConfig(
        gapPenalty: .linear(perCharacter: 0.01)
    ))
)
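To make the word-boundary and gap-penalty ideas concrete, here is a standalone sketch that does not use the library. `GapModel` and `wordBoundaryIndices(in:)` are hypothetical names, and the formulas are assumptions: the affine model is taken to charge `open` for the first gap character and `extend` for each further one, and index 0 is treated as a boundary. FuzzyMatch's actual scoring may differ.

```swift
/// Hypothetical mirror of the gap models named above; not FuzzyMatch's own type.
enum GapModel {
    case none
    case linear(perCharacter: Double)
    case affine(open: Double, extend: Double)

    /// Penalty for a run of `length` unmatched characters between two matched ones.
    func penalty(forGapOfLength length: Int) -> Double {
        guard length > 0 else { return 0 }
        switch self {
        case .none:
            return 0
        case .linear(let perCharacter):
            return perCharacter * Double(length)
        case .affine(let open, let extend):
            // Assumption: first gap character pays `open`, each later one pays `extend`.
            return open + extend * Double(length - 1)
        }
    }
}

/// Returns the character offsets that start a "word": offset 0, an uppercase letter
/// after a lowercase one, a character after `_`, and a non-digit after a digit.
func wordBoundaryIndices(in text: String) -> [Int] {
    let chars = Array(text)
    var result: [Int] = []
    for i in chars.indices {
        if i == 0 {
            result.append(i)
            continue
        }
        let prev = chars[i - 1]
        let cur = chars[i]
        let camel = prev.isLowercase && cur.isUppercase
        let afterUnderscore = prev == "_"
        let afterDigit = prev.isNumber && !cur.isNumber
        if camel || afterUnderscore || afterDigit {
            result.append(i)
        }
    }
    return result
}

print(wordBoundaryIndices(in: "getUserById")) // [0, 3, 7, 9]
print(wordBoundaryIndices(in: "get_user"))    // [0, 4]

let affine = GapModel.affine(open: 0.04, extend: 0.01)
let linear = GapModel.linear(perCharacter: 0.01)
// A single long gap is cheaper under this affine model than several short ones
// of the same total length, because `open` is paid once per gap.
print(affine.penalty(forGapOfLength: 6) < 3 * affine.penalty(forGapOfLength: 2)) // true
print(linear.penalty(forGapOfLength: 3))
```

Under these assumptions, the offsets 0, 3, 7, and 9 in "getUserById" are exactly where the characters of "gubi" land, which is the situation the paragraph above describes.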
FuzzyMatcher is fully thread-safe. Each task should use its own buffer:
let matcher = FuzzyMatcher()
let query = matcher.prepare("getData")
let candidates = loadLargeCandidateList()
// Process concurrently using Swift TaskGroup
let workerCount = 8
// Ceiling division; max(1, ...) keeps the stride non-zero when candidates is empty
let chunkSize = max(1, (candidates.count + workerCount - 1) / workerCount)
await withTaskGroup(of: [ScoredMatch].self) { group in
    for start in stride(from: 0, to: candidates.count, by: chunkSize) {
        let end = min(start + chunkSize, candidates.count)
        let chunk = candidates[start..<end]
        group.addTask {
            var buffer = matcher.makeBuffer() // Each task gets its own buffer
            return chunk.compactMap { candidate in
                matcher.score(candidate, against: query, buffer: &buffer)
            }
        }
    }
    // Collect results from all tasks
    for await taskMatches in group {
        // Handle matches...
    }
}
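The chunking arithmetic in the example above can be checked on its own. The sketch below isolates it in a hypothetical helper, `chunkRanges(count:workers:)`, which is not part of FuzzyMatch. It returns the index ranges each task would receive and handles the empty case explicitly, since `stride(from:to:by:)` traps on a zero step.

```swift
/// Splits `count` items into at most `workers` contiguous ranges using ceiling division.
/// Hypothetical helper for illustration only.
func chunkRanges(count: Int, workers: Int) -> [Range<Int>] {
    precondition(workers > 0, "workers must be positive")
    guard count > 0 else { return [] }                 // avoids a zero stride
    let chunkSize = (count + workers - 1) / workers    // ceil(count / workers)
    return stride(from: 0, to: count, by: chunkSize).map { start in
        start..<min(start + chunkSize, count)
    }
}

let ranges = chunkRanges(count: 10, workers: 3)
print(ranges.map { $0.count }) // [4, 4, 2]

// With more workers than the chunk size divides evenly, fewer ranges are produced.
print(chunkRanges(count: 10, workers: 8).count) // 5
```

Note that ceiling division yields at most `workers` chunks, not exactly `workers`: 10 items across 8 workers gives a chunk size of 2 and therefore 5 tasks.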
After scoring, use attributedHighlight() to get a styled AttributedString for UI display. Call it only for visible results (typically ~10-20), not the full