gitea/modules
Bruno Sofiato 900ac62251
Allow code search by filename (#32210)
This is a large and complex PR, so let me explain its changes in detail.

First, I had to create new index mappings for Bleve and Elasticsearch, as
the current ones do not support search by filename. This requires Gitea
to recreate the code search indexes (I do not know if this counts as a
breaking change, but I feel it deserves a heads-up).

I've used [this
approach](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-pathhierarchy-tokenizer.html)
to model the filename index. It allows us to efficiently search for both
the full path and the name of a file. Bleve, however, does not support
this out of the box, so I had to write a new [token
filter](https://blevesearch.com/docs/Token-Filters/) to generate the
search terms.
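
To illustrate the idea, here is a minimal sketch of the kind of terms a path-hierarchy style analysis can emit for a file path. It is not the token filter from this PR: the function name `pathSuffixTerms` is hypothetical, and it assumes the reversed variant of the hierarchy, where every suffix starting at a `/` boundary becomes a term, so that both the full path and the bare filename are searchable.

```go
package main

import (
	"fmt"
	"strings"
)

// pathSuffixTerms is a hypothetical helper (not part of Gitea or Bleve).
// It returns every suffix of path that begins at a segment boundary,
// from the full path down to the bare filename.
func pathSuffixTerms(path string) []string {
	trimmed := strings.Trim(path, "/")
	if trimmed == "" {
		return nil
	}
	segments := strings.Split(trimmed, "/")
	terms := make([]string, 0, len(segments))
	for i := range segments {
		// Join the segments from position i to the end.
		terms = append(terms, strings.Join(segments[i:], "/"))
	}
	return terms
}

func main() {
	for _, term := range pathSuffixTerms("modules/indexer/code/search.go") {
		fmt.Println(term)
	}
}
```

With terms like these in the index, a query for either `search.go` or the full path can be answered by an exact term match.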

I also overhauled the `indexer_test.go` file. It now asserts the
order of the expected results (this is important since matches based on
the name of a file are more relevant than those based on its content).
I've added new test scenarios that deal with searching by filename. They
use a new repo included in the Gitea fixture.

The screenshot below depicts how Gitea shows the search results. It
shows results based on content in the same way as the current version
does. In matches based on the filename, the first seven lines of the
file contents are shown (BTW, this is how GitHub does it).
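
As a rough sketch of the preview rule described above (not the code from this PR), the snippet below keeps only the leading lines of a file's content. The helper name `previewLines` and passing the limit of seven as a parameter are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// previewLines is a hypothetical helper (not part of Gitea).
// It returns at most n leading lines of content, without line terminators.
func previewLines(content string, n int) []string {
	if content == "" || n <= 0 {
		return nil
	}
	// Drop one trailing newline so it does not produce an empty last line.
	lines := strings.Split(strings.TrimSuffix(content, "\n"), "\n")
	if len(lines) > n {
		lines = lines[:n]
	}
	return lines
}

func main() {
	content := "l1\nl2\nl3\nl4\nl5\nl6\nl7\nl8\nl9\n"
	for _, line := range previewLines(content, 7) {
		fmt.Println(line)
	}
}
```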


![image](https://github.com/user-attachments/assets/9d938d86-1a8d-4f89-8644-1921a473e858)

Resolves #32096

---------

Signed-off-by: Bruno Sofiato <bruno.sofiato@gmail.com>
2024-10-11 23:35:04 +00:00
..
actions Fix wrong status of `Set up Job` when first step is skipped (#32120) 2024-09-24 18:34:08 +00:00
activitypub Remove SHA1 for support for ssh rsa signing (#31857) 2024-09-07 18:05:18 -04:00
analyze Rename code_langauge.go to code_language.go (#26377) 2023-08-07 15:00:53 -04:00
assetfs Use `Set[Type]` instead of `map[Type]bool/struct{}`. (#26804) 2023-08-30 06:55:25 +00:00
auth Add Passkey login support (#31504) 2024-06-29 22:50:03 +00:00
avatar Use `crypto/sha256` (#29386) 2024-02-25 13:32:13 +00:00
badge Implement actions badge svgs (#28102) 2024-02-27 18:56:18 +01:00
base fix OIDC introspection authentication (#31632) 2024-07-23 12:43:03 +00:00
cache bump to go 1.23 (#31855) 2024-09-10 02:23:07 +00:00
charset Render embedded code preview by permlink in markdown (#30234) 2024-04-02 17:48:27 +00:00
container Allow disabling authentication related user features (#31535) 2024-07-09 17:36:31 +00:00
csv Render embedded code preview by permlink in markdown (#30234) 2024-04-02 17:48:27 +00:00
dump Refactor "dump" sub-command (#30240) 2024-04-03 02:16:46 +00:00
emoji Update emoji set to Unicode 15 (#25595) 2023-06-29 16:29:48 +00:00
eventsource Final round of `db.DefaultContext` refactor (#27587) 2023-10-14 08:37:24 +00:00
generate Refactor JWT secret generating & decoding code (#29172) 2024-02-16 15:18:30 +00:00
git update git book link to v2 (#32221) 2024-10-09 13:04:34 +08:00
gitgraph More `db.DefaultContext` refactor (#27265) 2023-09-29 12:12:54 +00:00
gitrepo Use repo as of renderctx's member rather than a repoPath on metas (#29222) 2024-05-30 07:04:01 +00:00
globallock Use global lock instead of NewExclusivePool to allow distributed lock between multiple Gitea instances (#31813) 2024-09-06 10:12:41 +00:00
graceful Remove unused error in graceful manager (#29871) 2024-03-18 21:14:51 +00:00
hcaptcha Consume hcaptcha and pwn deps (#22610) 2023-01-29 09:49:51 -06:00
highlight Add option to disable ambiguous unicode characters detection (#28454) 2023-12-17 14:38:54 +00:00
hostmatcher Support allowed hosts for migrations to work with proxy (#32025) 2024-09-11 05:47:00 +00:00
html Refactor backend SVG package and add tests (#26335) 2023-08-05 04:34:59 +00:00
httpcache Fix wrong last modify time (#32102) 2024-09-21 21:56:25 +00:00
httplib Fix wrong last modify time (#32102) 2024-09-21 21:56:25 +00:00
indexer Allow code search by filename (#32210) 2024-10-11 23:35:04 +00:00
issue/template bump to go 1.23 (#31855) 2024-09-10 02:23:07 +00:00
json Replace `interface{}` with `any` (#25686) 2023-07-04 18:36:08 +00:00
label Make label templates have consistent behavior and priority (#23749) 2023-04-10 16:44:02 +08:00
lfs Distinguish LFS object errors to ignore missing objects during migration (#31702) 2024-07-31 10:29:48 +00:00
lfstransfer Add pure SSH LFS support (#31516) 2024-09-27 10:27:37 -04:00
log Add some tests to clarify the "must-change-password" behavior (#30693) 2024-04-27 12:23:37 +00:00
markup Use camo.Always instead of camo.Allways (#32097) 2024-09-21 12:50:54 +03:00
mcaptcha Implement FSFE REUSE for golang files (#21840) 2022-11-27 18:20:29 +00:00
metrics Rename project board -> column to make the UI less confusing (#30170) 2024-05-27 08:59:54 +00:00
migration Support migration from AWS CodeCommit (#31981) 2024-09-11 07:49:42 +08:00
nosql Update tool dependencies, lock govulncheck and actionlint (#25655) 2023-07-09 11:58:06 +00:00
optional Resolve lint for unused parameter and unnecessary type arguments (#30750) 2024-04-29 08:47:56 +00:00
options Use a general approach to access custom/static/builtin assets (#24022) 2023-04-12 18:16:45 +08:00
packages Add bin to Composer Metadata (#32099) 2024-09-21 22:42:17 +00:00
paginator Use more specific test methods (#24265) 2023-04-22 17:56:27 -04:00
pprof Implement FSFE REUSE for golang files (#21840) 2022-11-27 18:20:29 +00:00
private Move database operations of merging a pull request to post receive hook and add a transaction (#30805) 2024-05-07 07:36:48 +00:00
process Update misspell to 0.5.1 and add `misspellings.csv` (#30573) 2024-04-27 08:03:49 +00:00
proxy Use proxy for pull mirror (#22771) 2023-02-11 08:39:50 +08:00
proxyprotocol Implement FSFE REUSE for golang files (#21840) 2022-11-27 18:20:29 +00:00
public Refactor CORS handler (#28587) 2023-12-25 20:13:18 +08:00
queue bump to go 1.23 (#31855) 2024-09-10 02:23:07 +00:00
recaptcha Implement FSFE REUSE for golang files (#21840) 2022-11-27 18:20:29 +00:00
references Refactor to use UnsafeStringToBytes (#31358) 2024-06-14 01:26:33 +00:00
regexplru Upgrade go dependencies (#25819) 2023-07-14 11:00:31 +08:00
repository Support repo license (#24872) 2024-10-01 15:25:08 -04:00
secret Use `crypto/sha256` (#29386) 2024-02-25 13:32:13 +00:00
session Improve oauth2 client "preferred username field" logic and the error handling (#30622) 2024-04-25 11:22:32 +00:00
setting Enhance USER_DISABLED_FEATURES to allow disabling change username or full name (#31959) 2024-10-05 20:41:38 +00:00
sitemap Fix sitemap (#22272) 2022-12-30 23:31:00 +08:00
ssh Remove SSH workaround (#27893) 2023-11-03 15:21:05 +00:00
storage bump to go 1.23 (#31855) 2024-09-10 02:23:07 +00:00
structs Support repo license (#24872) 2024-10-01 15:25:08 -04:00
svg Refactor markdown attention render (#29984) 2024-03-22 12:16:23 +00:00
sync Use global lock instead of NewExclusivePool to allow distributed lock between multiple Gitea instances (#31813) 2024-09-06 10:12:41 +00:00
system Refactor to use UnsafeStringToBytes (#31358) 2024-06-14 01:26:33 +00:00
templates Lazy load avatar images (#32051) 2024-09-17 19:02:48 +00:00
test Remove sub-path from container registry realm (#31293) 2024-06-09 16:29:29 +08:00
testlogger Replace `interface{}` with `any` (#25686) 2023-07-04 18:36:08 +00:00
timeutil Refactor "dump" sub-command (#30240) 2024-04-03 02:16:46 +00:00
translation Render embedded code preview by permlink in markdown (#30234) 2024-04-02 17:48:27 +00:00
turnstile Add new captcha: cloudflare turnstile (#22369) 2023-02-05 15:29:03 +08:00
typesniffer Detect ogg mime-type as audio or video (#26494) 2023-08-15 10:31:25 +08:00
updatechecker Replace more db.DefaultContext (#27628) 2023-10-15 17:46:06 +02:00
uri Implement FSFE REUSE for golang files (#21840) 2022-11-27 18:20:29 +00:00
user Implement FSFE REUSE for golang files (#21840) 2022-11-27 18:20:29 +00:00
util Refactor to use UnsafeStringToBytes (#31358) 2024-06-14 01:26:33 +00:00
validation Check blocklist for emails when adding them to account (#26812) 2023-08-30 10:46:49 -05:00
web Refactor names (#31405) 2024-06-19 06:32:45 +08:00
webhook Fix schedule tasks bugs (#28691) 2024-01-12 21:50:38 +00:00
zstd Support compression for Actions logs (#31761) 2024-08-09 10:10:30 +08:00