`crep` is a command-line tool for searching code with
[Tree-sitter](https://tree-sitter.github.io/tree-sitter/) queries. Unlike
traditional `grep`, which operates on lines of text, `crep` understands the
structure of your code, which allows more precise, structure-aware searches.

## Features

- **Semantic Search**: Uses Tree-sitter to parse code into Concrete Syntax Trees
  (CSTs) and execute queries against them.
- **Broad Language Support**: Supports C, C++, Go, Python, PHP, Rust,
  JavaScript, and Lua.
- **Multi-threaded**: Uses a custom thread pool to scan large codebases
  efficiently.
- **Structural Matching**: Reports file path, line number, return type, function
  name, and parameters for each match.
- **Debug Mode**: Supports detailed logging via the `DEBUG` environment
  variable.

## Prerequisites

To build `crep`, you need:
- A C compiler (GCC or Clang)
- `make`
- `xxd` (used for embedding Tree-sitter queries)
- `pthread` library

## Installation

```bash
git clone https://github.com/mitjafelicijan/crep.git
cd crep
make all
```

## Usage

```bash
./crep [-c|--case-sensitive] [-l|--levenshtein <dist>] [-d|--depth <level>] <search_term> [path]
```

- `-c, --case-sensitive`: Enable case-sensitive matching (default is case-insensitive).
- `-l, --levenshtein <dist>`: Enable fuzzy matching with a maximum Levenshtein distance of `<dist>`.
- `-d, --depth <level>`: Set the maximum recursion depth for directory traversal.
- `<search_term>`: The string to search for within function/method names.
- `[path]`: Optional. The directory or file to search (defaults to current directory).
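
The `-d` option bounds how deep the directory walk goes. The following is a minimal Python sketch of a depth-limited walk, not `crep`'s actual C implementation. It assumes that depth `0` means "only the starting directory"; how `crep` counts levels is not stated here, so treat that as an assumption. The helper name `walk_limited` is hypothetical.

```python
import os

def walk_limited(root, max_depth):
    """Yield file paths under root, descending at most max_depth levels.

    Assumption: max_depth == 0 means only files directly inside root.
    """
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            # Clearing dirnames in place stops os.walk from descending further.
            dirnames[:] = []
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place avoids visiting subtrees that would be discarded anyway, which matters on large codebases.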

### Environment Variables

- `DEBUG=1` or `DEBUG=true`: Enable verbose debug logging.

### Examples

Search for all functions whose names contain "init" in the current directory:

```bash
./crep init .
```

Search for functions whose names contain "parse" in a specific file:

```bash
./crep parse main.c
```

Run with debug logging enabled:

```bash
DEBUG=1 ./crep init .
```

Find `main` even when the search term is misspelled as "mian", by allowing a
Levenshtein distance of up to 2:

```bash
./crep -l 2 "mian" main.c
```
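
To see why a distance of 2 is needed for "mian", here is a short Python sketch of the classic Levenshtein distance (insertions, deletions, and substitutions each cost 1). It is an illustration only: `crep` is built with a C compiler, and how it applies the distance (to the whole name or to a substring) is not described here.

```python
def levenshtein(a, b):
    """Edit distance between a and b using a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]

print(levenshtein("mian", "main"))  # 2
```

Swapping two adjacent letters counts as two substitutions under plain Levenshtein distance, so `-l 1` would not be enough for this typo.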

## How It Works

`crep` works by:

1. Identifying supported files based on extension.
2. Parsing the files using language-specific Tree-sitter grammars.
3. Executing a structural query to find function/method definitions, including
   return types and parameters.
4. Matching the identifier against the provided search term.
5. Displaying the results with added semantic context.
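
Steps 2 to 5 can be sketched for a single language. The example below uses Python's built-in `ast` module as a stand-in for Tree-sitter, so it only handles Python source and is not how `crep` itself is implemented. The function name `find_functions` and the output format are hypothetical. The default case-insensitive substring match follows the option descriptions above.

```python
import ast

def find_functions(source, term, case_sensitive=False):
    """Return (line, name, params) for each function whose name contains term."""
    needle = term if case_sensitive else term.lower()
    matches = []
    # Parse the source into a syntax tree and visit every node.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            name = node.name if case_sensitive else node.name.lower()
            if needle in name:
                params = [arg.arg for arg in node.args.args]
                matches.append((node.lineno, node.name, params))
    return sorted(matches)

sample = """
def init_parser(path, debug):
    pass

def run():
    pass

class App:
    def reinit(self):
        pass
"""

for line, name, params in find_functions(sample, "init"):
    print(f"{line}: {name}({', '.join(params)})")
```

Because the match runs against function definitions in the syntax tree rather than raw text, a comment or string that mentions "init" would not be reported.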

## Supported Languages

| Language   | Extensions     |
| ---------- | -------------- |
| C          | `.c`, `.h`     |
| C++        | `.cpp`, `.hpp` |
| Go         | `.go`          |
| Python     | `.py`          |
| PHP        | `.php`         |
| Rust       | `.rs`          |
| JavaScript | `.js`          |
| Lua        | `.lua`         |
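
The table above amounts to a lookup from file extension to language. A minimal Python sketch of that lookup follows. The names `LANGUAGE_BY_EXTENSION` and `language_for` are hypothetical, and the sketch assumes extensions are compared exactly as written in the table.

```python
import os

# Mirrors the "Supported Languages" table.
LANGUAGE_BY_EXTENSION = {
    ".c": "C",
    ".h": "C",
    ".cpp": "C++",
    ".hpp": "C++",
    ".go": "Go",
    ".py": "Python",
    ".php": "PHP",
    ".rs": "Rust",
    ".js": "JavaScript",
    ".lua": "Lua",
}

def language_for(path):
    """Return the language name for path, or None if the extension is unsupported."""
    extension = os.path.splitext(path)[1]
    return LANGUAGE_BY_EXTENSION.get(extension)
```

Files that map to `None` would simply be skipped during a scan.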

## Additional resources

- https://en.wikipedia.org/wiki/Ctags
- https://www.emacswiki.org/emacs/TagsFile
- https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm
- https://en.wikipedia.org/wiki/Levenshtein_distance
- https://github.com/tree-sitter/tree-sitter
- https://tree-sitter.github.io/tree-sitter/playground
- https://dreampuf.github.io/GraphvizOnline/
- https://github.com/tree-sitter-grammars