This is a large and complex PR, so let me explain in detail its changes.
First, I had to create new index mappings for Bleve and ElasticSerach as
the current ones do not support search by filename. This requires Gitea
to recreate the code search indexes (I do not know if this is a breaking
change, but I feel it deserves a heads-up).
I've used [this
approach](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-pathhierarchy-tokenizer.html)
to model the filename index. It allows us to efficiently search for both
the full path and the name of a file. Bleve, however, does not support
this out-of-box, so I had to code a brand new [token
filter](https://blevesearch.com/docs/Token-Filters/) to generate the
search terms.
I also did an overhaul in the `indexer_test.go` file. It now asserts the
order of the expected results (this is important since matches based on
the name of a file are more relevant than those based on its content).
I've added new test scenarios that deal with searching by filename. They
use a new repo included in the Gitea fixture.
The screenshot below depicts how Gitea shows the search results. It
shows results based on content in the same way as the current version
does. In matches based on the filename, the first seven lines of the
file contents are shown (BTW, this is how GitHub does it).
![image](https://github.com/user-attachments/assets/9d938d86-1a8d-4f89-8644-1921a473e858)
Resolves#32096
---------
Signed-off-by: Bruno Sofiato <bruno.sofiato@gmail.com>
This PR addresses the missing `bin` field in Composer metadata, which
currently causes vendor-provided binaries to not be symlinked to
`vendor/bin` during installation.
In the current implementation, running `composer install` does not
publish the binaries, leading to issues where expected binaries are not
available.
By properly declaring the `bin` field, this PR ensures that binaries are
correctly symlinked upon installation, as described in the [Composer
documentation](https://getcomposer.org/doc/articles/vendor-binaries.md).
X-Forwarded-Host has many problems: non-standard, not well-defined
(X-Forwarded-Port or not), conflicts with Host header, it already caused
problems like #31907. So do not use X-Forwarded-Host, just use Host
header directly.
Official document also only uses `Host` header and never mentioned
others.
Close#31801. Follow #31761.
Since there are so many benefits of compression and there are no reports
of related issues after weeks, it should be fine to enable compression
by default.
This will allow instance admins to view signup pattern patterns for
public instances. It is modelled after discourse, mastodon, and
MediaWiki's approaches.
Note: This has privacy implications, but as the above-stated open-source
projects take this approach, especially MediaWiki, which I have no doubt
looked into this thoroughly, it is likely okay for us, too. However, I
would be appreciative of any feedback on how this could be improved.
---------
Co-authored-by: Giteabot <teabot@gitea.io>
Replace #32001.
To prevent the context cache from being misused for long-term work
(which would result in using invalid cache without awareness), the
context cache is designed to exist for a maximum of 10 seconds. This
leads to many false reports, especially in the case of slow SQL.
This PR increases it to 5 minutes to reduce false reports.
5 minutes is not a very safe value, as a lot of changes may have
occurred within that time frame. However, as far as I know, there has
not been a case of misuse of context cache discovered so far, so I think
5 minutes should be OK.
Please note that after this PR, if warning logs are found again, it
should get attention, at that time it can be almost 100% certain that it
is a misuse.
https://github.com/go-fed/httpsig seems to be unmaintained.
Switch to github.com/42wim/httpsig which has removed deprecated crypto
and default sha256 signing for ssh rsa.
No impact for those that use ed25519 ssh certificates.
This is a breaking change for:
- gitea.com/gitea/tea (go-sdk) - I'll be sending a PR there too
- activitypub using deprecated crypto (is this actually used?)
Follow #31908. The main refactor is that it has removed the returned
context of `Lock`.
The returned context of `Lock` in old code is to provide a way to let
callers know that they have lost the lock. But in most cases, callers
shouldn't cancel what they are doing even it has lost the lock. And the
design would confuse developers and make them use it incorrectly.
See the discussion history:
https://github.com/go-gitea/gitea/pull/31813#discussion_r1732041513 and
https://github.com/go-gitea/gitea/pull/31813#discussion_r1734078998
It's a breaking change, but since the new module hasn't been used yet, I
think it's OK to not add the `pr/breaking` label.
## Design principles
It's almost copied from #31908, but with some changes.
### Use spinlock even in memory implementation (unchanged)
In actual use cases, users may cancel requests. `sync.Mutex` will block
the goroutine until the lock is acquired even if the request is
canceled. And the spinlock is more suitable for this scenario since it's
possible to give up the lock acquisition.
Although the spinlock consumes more CPU resources, I think it's
acceptable in most cases.
### Do not expose the mutex to callers (unchanged)
If we expose the mutex to callers, it's possible for callers to reuse
the mutex, which causes more complexity.
For example:
```go
lock := GetLocker(key)
lock.Lock()
// ...
// even if the lock is unlocked, we cannot GC the lock,
// since the caller may still use it again.
lock.Unlock()
lock.Lock()
// ...
lock.Unlock()
// callers have to GC the lock manually.
RemoveLocker(key)
```
That's why
https://github.com/go-gitea/gitea/pull/31813#discussion_r1721200549
In this PR, we only expose `ReleaseFunc` to callers. So callers just
need to call `ReleaseFunc` to release the lock, and do not need to care
about the lock's lifecycle.
```go
release, err := locker.Lock(ctx, key)
if err != nil {
return err
}
// ...
release()
// if callers want to lock again, they have to re-acquire the lock.
release, err := locker.Lock(ctx, key)
// ...
```
In this way, it's also much easier for redis implementation to extend
the mutex automatically, so that callers do not need to care about the
lock's lifecycle. See also
https://github.com/go-gitea/gitea/pull/31813#discussion_r1722659743
### Use "release" instead of "unlock" (unchanged)
For "unlock", it has the meaning of "unlock an acquired lock". So it's
not acceptable to call "unlock" when failed to acquire the lock, or call
"unlock" multiple times. It causes more complexity for callers to decide
whether to call "unlock" or not.
So we use "release" instead of "unlock" to make it clear. Whether the
lock is acquired or not, callers can always call "release", and it's
also safe to call "release" multiple times.
But the code DO NOT expect callers to not call "release" after acquiring
the lock. If callers forget to call "release", it will cause resource
leak. That's why it's always safe to call "release" without extra
checks: to avoid callers to forget to call it.
### Acquired locks could be lost, but the callers shouldn't stop
Unlike `sync.Mutex` which will be locked forever once acquired until
calling `Unlock`, for distributed lock, the acquired lock could be lost.
For example, the caller has acquired the lock, and it holds the lock for
a long time since auto-extending is working for redis. However, it lost
the connection to the redis server, and it's impossible to extend the
lock anymore.
In #31908, it will cancel the context to make the operation stop, but
it's not safe. Many operations are not revert-able. If they have been
interrupted, then the instance goes corrupted. So `Lock` won't return
`ctx` anymore in this PR.
### Multiple ways to use the lock
1. Regular way
```go
release, err := Lock(ctx, key)
if err != nil {
return err
}
defer release()
// ...
```
2. Early release
```go
release, err := Lock(ctx, key)
if err != nil {
return err
}
defer release()
// ...
// release the lock earlier
release()
// continue to do something else
// ...
```
3. Functional way
```go
if err := LockAndDo(ctx, key, func(ctx context.Context) error {
// ...
return nil
}); err != nil {
return err
}
```
To help #31813, but do not replace it, since this PR just introduces the
new module but misses some work:
- New option in settings. `#31813` has done it.
- Use the locks in business logic. `#31813` has done it.
So I think the most efficient way is to merge this PR first (if it's
acceptable) and then finish #31813.
## Design principles
### Use spinlock even in memory implementation
In actual use cases, users may cancel requests. `sync.Mutex` will block
the goroutine until the lock is acquired even if the request is
canceled. And the spinlock is more suitable for this scenario since it's
possible to give up the lock acquisition.
Although the spinlock consumes more CPU resources, I think it's
acceptable in most cases.
### Do not expose the mutex to callers
If we expose the mutex to callers, it's possible for callers to reuse
the mutex, which causes more complexity.
For example:
```go
lock := GetLocker(key)
lock.Lock()
// ...
// even if the lock is unlocked, we cannot GC the lock,
// since the caller may still use it again.
lock.Unlock()
lock.Lock()
// ...
lock.Unlock()
// callers have to GC the lock manually.
RemoveLocker(key)
```
That's why
https://github.com/go-gitea/gitea/pull/31813#discussion_r1721200549
In this PR, we only expose `ReleaseFunc` to callers. So callers just
need to call `ReleaseFunc` to release the lock, and do not need to care
about the lock's lifecycle.
```go
_, release, err := locker.Lock(ctx, key)
if err != nil {
return err
}
// ...
release()
// if callers want to lock again, they have to re-acquire the lock.
_, release, err := locker.Lock(ctx, key)
// ...
```
In this way, it's also much easier for redis implementation to extend
the mutex automatically, so that callers do not need to care about the
lock's lifecycle. See also
https://github.com/go-gitea/gitea/pull/31813#discussion_r1722659743
### Use "release" instead of "unlock"
For "unlock", it has the meaning of "unlock an acquired lock". So it's
not acceptable to call "unlock" when failed to acquire the lock, or call
"unlock" multiple times. It causes more complexity for callers to decide
whether to call "unlock" or not.
So we use "release" instead of "unlock" to make it clear. Whether the
lock is acquired or not, callers can always call "release", and it's
also safe to call "release" multiple times.
But the code DO NOT expect callers to not call "release" after acquiring
the lock. If callers forget to call "release", it will cause resource
leak. That's why it's always safe to call "release" without extra
checks: to avoid callers to forget to call it.
### Acquired locks could be lost
Unlike `sync.Mutex` which will be locked forever once acquired until
calling `Unlock`, in the new module, the acquired lock could be lost.
For example, the caller has acquired the lock, and it holds the lock for
a long time since auto-extending is working for redis. However, it lost
the connection to the redis server, and it's impossible to extend the
lock anymore.
If the caller don't stop what it's doing, another instance which can
connect to the redis server could acquire the lock, and do the same
thing, which could cause data inconsistency.
So the caller should know what happened, the solution is to return a new
context which will be canceled if the lock is lost or released:
```go
ctx, release, err := locker.Lock(ctx, key)
if err != nil {
return err
}
defer release()
// ...
DoSomething(ctx)
// the lock is lost now, then ctx has been canceled.
// Failed, since ctx has been canceled.
DoSomethingElse(ctx)
```
### Multiple ways to use the lock
1. Regular way
```go
ctx, release, err := Lock(ctx, key)
if err != nil {
return err
}
defer release()
// ...
```
2. Early release
```go
ctx, release, err := Lock(ctx, key)
if err != nil {
return err
}
defer release()
// ...
// release the lock earlier and reset the context back
ctx = release()
// continue to do something else
// ...
```
3. Functional way
```go
if err := LockAndDo(ctx, key, func(ctx context.Context) error {
// ...
return nil
}); err != nil {
return err
}
```
When opening a repository, it will call `ensureValidRepository` and also
`CatFileBatch`. But sometimes these will not be used until repository
closed. So it's a waste of CPU to invoke 3 times git command for every
open repository.
This PR removed all of these from `OpenRepository` but only kept
checking whether the folder exists. When a batch is necessary, the
necessary functions will be invoked.
fix#23668
My plan:
* In the `actions.list` method, if workflow is selected and IsAdmin,
check whether the on event contains `workflow_dispatch`. If so, display
a `Run workflow` button to allow the user to manually trigger the run.
* Providing a form that allows users to select target brach or tag, and
these parameters can be configured in yaml
* Simple form validation, `required` input cannot be empty
* Add a route `/actions/run`, and an `actions.Run` method to handle
* Add `WorkflowDispatchPayload` struct to pass the Webhook event payload
to the runner when triggered, this payload carries the `inputs` values
and other fields, doc: [workflow_dispatch
payload](https://docs.github.com/en/webhooks/webhook-events-and-payloads#workflow_dispatch)
Other PRs
* the `Workflow.WorkflowDispatchConfig()` method still return non-nil
when workflow_dispatch is not defined. I submitted a PR
https://gitea.com/gitea/act/pulls/85 to fix it. Still waiting for them
to process.
Behavior should be same with github, but may cause confusion. Here's a
quick reminder.
*
[Doc](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch)
Said: This event will `only` trigger a workflow run if the workflow file
is `on the default branch`.
* If the workflow yaml file only exists in a non-default branch, it
cannot be triggered. (It will not even show up in the workflow list)
* If the same workflow yaml file exists in each branch at the same time,
the version of the default branch is used. Even if `Use workflow from`
selects another branch
![image](https://github.com/go-gitea/gitea/assets/3114995/4bf596f3-426b-48e8-9b8f-0f6d18defd79)
```yaml
name: Docker Image CI
on:
workflow_dispatch:
inputs:
logLevel:
description: 'Log level'
required: true
default: 'warning'
type: choice
options:
- info
- warning
- debug
tags:
description: 'Test scenario tags'
required: false
type: boolean
boolean_default_true:
description: 'Test scenario tags'
required: true
type: boolean
default: true
boolean_default_false:
description: 'Test scenario tags'
required: false
type: boolean
default: false
environment:
description: 'Environment to run tests against'
type: environment
required: true
default: 'environment values'
number_required_1:
description: 'number '
type: number
required: true
default: '100'
number_required_2:
description: 'number'
type: number
required: true
default: '100'
number_required_3:
description: 'number'
type: number
required: true
default: '100'
number_1:
description: 'number'
type: number
required: false
number_2:
description: 'number'
type: number
required: false
number_3:
description: 'number'
type: number
required: false
env:
inputs_logLevel: ${{ inputs.logLevel }}
inputs_tags: ${{ inputs.tags }}
inputs_boolean_default_true: ${{ inputs.boolean_default_true }}
inputs_boolean_default_false: ${{ inputs.boolean_default_false }}
inputs_environment: ${{ inputs.environment }}
inputs_number_1: ${{ inputs.number_1 }}
inputs_number_2: ${{ inputs.number_2 }}
inputs_number_3: ${{ inputs.number_3 }}
inputs_number_required_1: ${{ inputs.number_required_1 }}
inputs_number_required_2: ${{ inputs.number_required_2 }}
inputs_number_required_3: ${{ inputs.number_required_3 }}
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- run: ls -la
- run: env | grep inputs
- run: echo ${{ inputs.logLevel }}
- run: echo ${{ inputs.boolean_default_false }}
```
![image](https://github.com/go-gitea/gitea/assets/3114995/a58a842d-a0ff-4618-bc6d-83a9596d07c8)
![image](https://github.com/go-gitea/gitea/assets/3114995/44a7cca5-7bd4-42a9-8723-91751a501c88)
---------
Co-authored-by: TKaxv_7S <954067342@qq.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Denys Konovalov <kontakt@denyskon.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Fix#31395
This regression is introduced by #30273. To find out how GitHub handles
this case, I did [some
tests](https://github.com/go-gitea/gitea/issues/31395#issuecomment-2278929115).
I use redirect in this PR instead of checking if the corresponding `.md`
file exists when rendering the link because GitHub also uses redirect.
With this PR, there is no need to resolve the raw wiki link when
rendering a wiki page. If a wiki link points to a raw file, access will
be redirected to the raw link.
Fix#31271.
When gogit is enabled, `IsObjectExist` calls
`repo.gogitRepo.ResolveRevision`, which is not correct. It's for
checking references not objects, it could work with commit hash since
it's both a valid reference and a commit object, but it doesn't work
with blob objects.
So it causes #31271 because it reports that all blob objects do not
exist.
Support compression for Actions logs to save storage space and
bandwidth. Inspired by
https://github.com/go-gitea/gitea/issues/24256#issuecomment-1521153015
The biggest challenge is that the compression format should support
[seekable](https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md).
So when users are viewing a part of the log lines, Gitea doesn't need to
download the whole compressed file and decompress it.
That means gzip cannot help here. And I did research, there aren't too
many choices, like bgzip and xz, but I think zstd is the most popular
one. It has an implementation in Golang with
[zstd](https://github.com/klauspost/compress/tree/master/zstd) and
[zstd-seekable-format-go](https://github.com/SaveTheRbtz/zstd-seekable-format-go),
and what is better is that it has good compatibility: a seekable format
zstd file can be read by a regular zstd reader.
This PR introduces a new package `zstd` to combine and wrap the two
packages, to provide a unified and easy-to-use API.
And a new setting `LOG_COMPRESSION` is added to the config, although I
don't see any reason why not to use compression, I think's it's a good
idea to keep the default with `none` to be consistent with old versions.
`LOG_COMPRESSION` takes effect for only new log files, it adds `.zst` as
an extension to the file name, so Gitea can determine if it needs
decompression according to the file name when reading. Old files will
keep the format since it's not worth converting them, as they will be
cleared after #31735.
<img width="541" alt="image"
src="https://github.com/user-attachments/assets/e9598764-a4e0-4b68-8c2b-f769265183c9">
Found at
https://github.com/go-gitea/gitea/pull/31790#issuecomment-2272898915
`unit-tests-gogit` never work since the workflow set `TAGS` with
`gogit`, but the Makefile use `TEST_TAGS`.
This PR adds the values of `TAGS` to `TEST_TAGS`, ensuring that setting
`TAGS` is always acceptable and avoiding confusion about which one
should be set.
Fix#31738
When pushing a new branch, the old commit is zero. Most git commands
cannot recognize the zero commit id. To get the changed files in the
push, we need to get the first diverge commit of this branch. In most
situations, we could check commits one by one until one commit is
contained by another branch. Then we will think that commit is the
diverge point.
And in a pre-receive hook, this will be more difficult because all
commits haven't been merged and they actually stored in a temporary
place by git. So we need to bring some envs to let git know the commit
exist.
close #27031
If the rpm package does not contain a matching gpg signature, the
installation will fail. See (#27031) , now auto-signing rpm uploads.
This option is turned off by default for compatibility.
If the assign the pull request review to a team, it did not show the
members of the team in the "requested_reviewers" field, so the field was
null. As a solution, I added the team members to the array.
fix#31764
Fix#31137.
Replace #31623#31697.
When migrating LFS objects, if there's any object that failed (like some
objects are losted, which is not really critical), Gitea will stop
migrating LFS immediately but treat the migration as successful.
This PR checks the error according to the [LFS api
doc](https://github.com/git-lfs/git-lfs/blob/main/docs/api/batch.md#successful-responses).
> LFS object error codes should match HTTP status codes where possible:
>
> - 404 - The object does not exist on the server.
> - 409 - The specified hash algorithm disagrees with the server's
acceptable options.
> - 410 - The object was removed by the owner.
> - 422 - Validation error.
If the error is `404`, it's safe to ignore it and continue migration.
Otherwise, stop the migration and mark it as failed to ensure data
integrity of LFS objects.
And maybe we should also ignore others errors (maybe `410`? I'm not sure
what's the difference between "does not exist" and "removed by the
owner".), we can add it later when some users report that they have
failed to migrate LFS because of an error which should be ignored.
See discussion on #31561 for some background.
The introspect endpoint was using the OIDC token itself for
authentication. This fixes it to use basic authentication with the
client ID and secret instead:
* Applications with a valid client ID and secret should be able to
successfully introspect an invalid token, receiving a 200 response
with JSON data that indicates the token is invalid
* Requests with an invalid client ID and secret should not be able
to introspect, even if the token itself is valid
Unlike #31561 (which just future-proofed the current behavior against
future changes to `DISABLE_QUERY_AUTH_TOKEN`), this is a potential
compatibility break (some introspection requests without valid client
IDs that would previously succeed will now fail). Affected deployments
must begin sending a valid HTTP basic authentication header with their
introspection requests, with the username set to a valid client ID and
the password set to the corresponding client secret.
When you are entering a number in the issue search, you likely want the
issue with the given ID (code internal concept: issue index).
As such, when a number is detected, the issue with the corresponding ID
will now be added to the results.
Fixes#4479
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Make it posible to let mails show e.g.:
`Max Musternam (via gitea.kithara.com) <gitea@kithara.com>`
Docs: https://gitea.com/gitea/docs/pulls/23
---
*Sponsored by Kithara Software GmbH*
Issue template dropdown can have many entries, and it could be better to
have them rendered as list later on if multi-select is enabled.
so this adds an option to the issue template engine to do so.
DOCS: https://gitea.com/gitea/docs/pulls/19
---
## demo:
```yaml
name: Name
title: Title
about: About
labels: ["label1", "label2"]
ref: Ref
body:
- type: dropdown
id: id6
attributes:
label: Label of dropdown (list)
description: Description of dropdown
multiple: true
list: true
options:
- Option 1 of dropdown
- Option 2 of dropdown
- Option 3 of dropdown
- Option 4 of dropdown
- Option 5 of dropdown
- Option 6 of dropdown
- Option 7 of dropdown
- Option 8 of dropdown
- Option 9 of dropdown
```
![image](https://github.com/user-attachments/assets/102ed0f4-89da-420b-ab2a-1788b59676f9)
![image](https://github.com/user-attachments/assets/a2bdb14e-43ff-4cc6-9bbe-20244830453c)
---
*Sponsored by Kithara Software GmbH*
We have some instances that only allow using an external authentication
source for authentication. In this case, users changing their email,
password, or linked OpenID connections will not have any effect, and
we'd like to prevent showing that to them to prevent confusion.
Included in this are several changes to support this:
* A new setting to disable user managed authentication credentials
(email, password & OpenID connections)
* A new setting to disable user managed MFA (2FA codes & WebAuthn)
* Fix an issue where some templates had separate logic for determining
if a feature was disabled since it didn't check the globally disabled
features
* Hide more user setting pages in the navbar when their settings aren't
enabled
---------
Co-authored-by: Kyle D <kdumontnu@gmail.com>
Fixes#22722
### Problem
Currently, it is not possible to force push to a branch with branch
protection rules in place. There are often times where this is necessary
(CI workflows/administrative tasks etc).
The current workaround is to rename/remove the branch protection,
perform the force push, and then reinstate the protections.
### Solution
Provide an additional section in the branch protection rules to allow
users to specify which users with push access can also force push to the
branch. The default value of the rule will be set to `Disabled`, and the
UI is intuitive and very similar to the `Push` section.
It is worth noting in this implementation that allowing force push does
not override regular push access, and both will need to be enabled for a
user to force push.
This applies to manual force push to a remote, and also in Gitea UI
updating a PR by rebase (which requires force push)
This modifies the `BranchProtection` API structs to add:
- `enable_force_push bool`
- `enable_force_push_whitelist bool`
- `force_push_whitelist_usernames string[]`
- `force_push_whitelist_teams string[]`
- `force_push_whitelist_deploy_keys bool`
### Updated Branch Protection UI:
<img width="943" alt="image"
src="https://github.com/go-gitea/gitea/assets/79623665/7491899c-d816-45d5-be84-8512abd156bf">
### Pull Request `Update branch by Rebase` option enabled with source
branch `test` being a protected branch:
![image](https://github.com/go-gitea/gitea/assets/79623665/e018e6e9-b7b2-4bd3-808e-4947d7da35cc)
<img width="1038" alt="image"
src="https://github.com/go-gitea/gitea/assets/79623665/57ead13e-9006-459f-b83c-7079e6f4c654">
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Running git update-index for every individual file is slow, so add and
remove everything with a single git command.
When such a big commit lands in the default branch, it could cause PR
creation and patch checking for all open PRs to be slow, or time out
entirely. For example, a commit that removes 1383 files was measured to
take more than 60 seconds and timed out. With this change checking took
about a second.
This is related to #27967, though this will not help with commits that
change many lines in few files.