Commit Graph

54 Commits

Author SHA1 Message Date
zeripath c88547ce71
Add Goroutine stack inspector to admin/monitor (#19207)
Continues on from #19202.

Following the addition of pprof labels we can now more easily understand the relationship between goroutines and the requests that spawn them.

This PR takes advantage of the labels and adds a few others, then provides a mechanism for the monitoring page to query the pprof goroutine profile.

The resulting binary profile is immediately piped into the Google library that parses it, and stack traces are then formed for the goroutines.

If the goroutine is within a context or has been created from a goroutine within a process context it will acquire the process description labels for that process. 

The goroutines are mapped to their associated pids, and any that do not have an associated pid are placed in a group at the bottom as unbound.

In this way we should be able to more easily examine goroutines that have been stuck.

A manager command `gitea manager processes` is also provided that can export the processes (with or without stacktraces) to the command line.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-03-31 19:01:43 +02:00
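The label mechanism that #19207 builds on can be sketched with the standard library alone. This is a minimal illustration, not Gitea's code: the label key `process-description`, the helper names and the description string are hypothetical, and the profile is captured in its text form (`debug=1`) rather than as the binary profile that the PR parses.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"runtime/pprof"
	"strings"
)

// labelledProfile starts one goroutine that carries a pprof label, captures
// the goroutine profile in text form and returns it.
// The label key is a hypothetical stand-in, not necessarily the one Gitea uses.
func labelledProfile(description string) string {
	ready := make(chan struct{})
	done := make(chan struct{})

	labels := pprof.Labels("process-description", description)
	go pprof.Do(context.Background(), labels, func(ctx context.Context) {
		close(ready) // the labels are already attached at this point
		<-done       // stay alive so the goroutine appears in the profile
	})
	<-ready
	defer close(done)

	var buf bytes.Buffer
	// debug=1 produces a text profile that lists the labels of each stack.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return ""
	}
	return buf.String()
}

// hasLabel reports whether the captured profile mentions the label value.
func hasLabel(profile, description string) bool {
	return strings.Contains(profile, description)
}

func main() {
	profile := labelledProfile("demo-request")
	fmt.Println("label visible in profile:", hasLabel(profile, "demo-request"))
}
```

Goroutines started from inside the function passed to `pprof.Do` inherit the same labels, which is what allows a goroutine to be tied back to the process that spawned it.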
zeripath 4e57bd1d30
Add number in queue status to monitor page (#18712)
Add number in queue status to the monitor page so that administrators can
assess how much work is left to be done in the queues.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-02-12 13:31:26 +08:00
zeripath f8b21ac04a
Simplify Boost/Pause logic (#18673)
* Simplify Boost/Pause logic

#18658 has added a check to see if we need to boost because there is still work to do
however the check is slightly complex and not ideal. There's no point boosting if
the queue is paused or can't scale. Therefore merge the two selects into one and add
a check to p.paused.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* And on resume add a zeroboost if necessary

Signed-off-by: Andrew Thornton <art27@cantab.net>

* simplify

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lauris BH <lauris@nix.lv>
2022-02-08 13:53:34 -05:00
zeripath df44017328
Restart zero worker if there is still work to do (#18658)
* Restart zero worker if there is still work to do

It is possible for the zero worker to timeout before all the work is finished.
This may mean that work may take a long time to complete because a worker will only
be induced on repushing.

Also ensure that requested count is reset after pulls and push mirror sync requests and add some more trace logging to the queue push.

Fix #18607

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-02-08 14:02:32 +00:00
zeripath 7ba1b7112f
Only attempt to flush queue if the underlying worker pool is not finished (#18593)
* Only attempt to flush queue if the underlying worker pool is not finished

There is a possible race whereby a worker pool could be cancelled while the
underlying queue is not yet empty. This will lead to flush-all cycling because it
cannot empty the pool.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Apply suggestions from code review

Co-authored-by: Gusted <williamzijl7@hotmail.com>
2022-02-05 20:51:25 +00:00
6543 6f6b8491da
add gitea-fmt back (#18526) 2022-02-01 12:43:09 -05:00
zeripath be77ede954
Change some logging levels (#18421)
* Change some logging levels

* PlainTextWithBytes - 4xx/5xx this should just be TRACE
* notFoundInternal - the "error" here is too noisy and should be DEBUG
* WorkerPool - Worker pool scaling messages are normal and should be DEBUG

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-01-29 20:52:37 +00:00
zeripath 92b715e0f2
Attempt to prevent the deadlock in the QueueDiskChannel Test again (#18415)
* Attempt to prevent the deadlock in the QueueDiskChannel Test again

This time we're going to adjust the pause tests to only test the right
flag.

* Only switch off pushback once we know that we are not pushing anything else
* Ensure full redirection occurs
* More nicely handle a closed datachan
* And handle similar problems in queue_channel_test

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-29 11:37:08 +00:00
zeripath 713985b1a4
Prevent deadlocks in persistable channel pause test (#18410)
* Prevent deadlocks in persistable channel pause test

Because of reuse of the old paused/resumed channels in this test there
was a potential for deadlock. This PR ensures that the channels are always
reobtained.

It further adds some control code to detect hangs in future - and it
ensures that the pausing warning is not shown on shutdown.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* do not warn but do pause

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-26 01:09:57 +02:00
zeripath ab7f701671
Make WrappedQueues and PersistableChannelUniqueQueues Pausable (#18393)
Implements the Pausable interface on WrappedQueues and PersistableChannelUniqueQueues

Reference #15928

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-24 22:54:35 +00:00
zeripath a82fd98d53
Pause queues (#15928)
* Start adding mechanism to return unhandled data

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Create pushback interface

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add Pausable interface to WorkerPool and Manager

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Implement Pausable and PushBack for the bytefifos

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Implement Pausable and Pushback for ChannelQueues and ChannelUniqueQueues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Wire in UI for pausing

Signed-off-by: Andrew Thornton <art27@cantab.net>

* add testcases and fix a few issues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix build

Signed-off-by: Andrew Thornton <art27@cantab.net>

* prevent "race" in the test

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix jsoniter mismerge

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix conflicts

Signed-off-by: Andrew Thornton <art27@cantab.net>

* fix format

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Add warnings for no worker configurations and prevent data-loss with redis/levelqueue

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Use StopTimer

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-01-22 21:22:14 +00:00
6543 54e9ee37a7
format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
zeripath a85e75b2b1
Prevent deadlock in TestPersistableChannelQueue (#17717)
* Prevent deadlock in TestPersistableChannelQueue

There is a potential deadlock in TestPersistableChannelQueue due to attempting to
shutdown the test queue before it is ready.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* prevent npe

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-11-19 01:13:25 +00:00
wxiaoguang 750a8465f5
A better go code formatter, and now `make fmt` can run in Windows (#17684)
* go build / format tools
* re-format imports
2021-11-17 20:34:35 +08:00
zeripath 7117c7774a
Make the Mirror Queue a queue (#17326)
Convert the old mirror syncing queue to the more modern queue format.

Fix a bug from the repo-archive queue PR: the assumption was made that uniqueness could be enforced by checking equality in a map in channel unique queues. However, this only works for primitive types, which was the initial intention but is imperfect. This is fixed by marshalling the data and placing the marshalled data in the unique map instead.

The documentation is also updated to add information about the deprecated configuration values.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-10-17 12:43:25 +01:00
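The uniqueness fix described in #17326 can be illustrated as follows. This is a sketch under assumptions: `task` and `uniqueSet` are hypothetical names, and `encoding/json` stands in for whatever marshaller the queue uses. Two pointers to equal structs are different map keys because pointers compare by address, so the marshalled form is used as the key instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// task is a hypothetical queue payload. It contains a slice, so it is not a
// comparable type and cannot be used directly as a map key.
type task struct {
	RepoID int64
	Refs   []string
}

// uniqueSet tracks which payloads are already queued, keyed by their
// marshalled form rather than by the Go value itself.
type uniqueSet struct {
	seen map[string]struct{}
}

func newUniqueSet() *uniqueSet {
	return &uniqueSet{seen: make(map[string]struct{})}
}

// Add reports whether data was newly added (true) or already present (false).
func (u *uniqueSet) Add(data interface{}) (bool, error) {
	bs, err := json.Marshal(data)
	if err != nil {
		return false, err
	}
	key := string(bs)
	if _, ok := u.seen[key]; ok {
		return false, nil
	}
	u.seen[key] = struct{}{}
	return true, nil
}

// addTwice adds two distinct pointers with identical content.
func addTwice() (first, second bool) {
	set := newUniqueSet()
	first, _ = set.Add(&task{RepoID: 1, Refs: []string{"refs/heads/main"}})
	second, _ = set.Add(&task{RepoID: 1, Refs: []string{"refs/heads/main"}})
	return first, second
}

func main() {
	first, second := addTwice()
	fmt.Println(first, second) // true false
}
```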
Eng Zer Jun f2e7d5477f
refactor: move from io/ioutil to io and os package (#17109)
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2021-09-22 13:38:34 +08:00
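For reference, the kind of replacement made in #17109 looks like the following. The helper `roundTrip` is a hypothetical example written for this note; the comments name the `io/ioutil` function that each call supersedes as of Go 1.16.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back using the
// Go 1.16+ replacements for the io/ioutil helpers.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-demo") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "demo.txt")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}

	bs, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}

	all, err := io.ReadAll(strings.NewReader(string(bs))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```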
zeripath 6c125e9797
Use immediate queues in integration tests and ensure that immediate (#16927)
queue type is also used for unique queues.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-09-03 11:20:57 +01:00
zeripath 06b9d553bc
Timeout on flush in testing (#16864)
* Timeout on flush in testing

At the end of each test the queues are flushed. At present there is no limit on the
length of time a flush can take which can lead to long flushes.

However, if the CI task is cancelled we lose the log information as to where the long
flush was taking place.

This PR simply adds a default time limit of 2 minutes - at which point an error will
be produced. This should allow us to more easily find the culprit.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* return better error

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: 6543 <6543@obermui.de>
2021-08-30 00:27:51 -04:00
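A time-limited flush of the kind #16864 describes could be sketched like this. The function names and the shape of the `flush` callback are assumptions made for illustration, not the actual test-logger code; the point is only that a deadline turns an endless flush into a reportable error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// flushWithTimeout runs flush and gives up once timeout has elapsed.
// A caller following the commit message would pass 2*time.Minute.
func flushWithTimeout(timeout time.Duration, flush func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { done <- flush(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return fmt.Errorf("flush did not finish within %s: %w", timeout, ctx.Err())
	}
}

// demoFlush exercises a flush that returns at once and one that never
// finishes on its own.
func demoFlush() (quickOK, slowTimedOut bool) {
	quick := func(ctx context.Context) error { return nil }
	slow := func(ctx context.Context) error {
		<-ctx.Done() // simulate a flush that only stops when cancelled
		return ctx.Err()
	}
	quickOK = flushWithTimeout(time.Second, quick) == nil
	slowTimedOut = errors.Is(flushWithTimeout(20*time.Millisecond, slow), context.DeadlineExceeded)
	return quickOK, slowTimedOut
}

func main() {
	fmt.Println(demoFlush())
}
```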
Lunny Xiao 9f31f3aa8a
Add an abstract json layout to make it easier to change json library (#16528)
* Add an abstract json layout to make it easier to change json library

* Fix import

* Fix import sequence

* Fix blank lines

* Fix blank lines
2021-07-24 18:03:58 +02:00
zeripath 49bd9a1111
Fix race in log (#16490)
A race has been detected in #1441 relating to getting log levels.

This PR protects the GetLevel and GetStacktraceLevel calls with a RW mutex.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-07-20 20:09:29 +01:00
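The protection added in #16490 follows a common pattern, sketched below. The type `levelHolder` and the `Level` constants are hypothetical stand-ins for the logger's internals; only the use of a read/write mutex around the level accessors is taken from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// Level is a hypothetical log level type used only for this sketch.
type Level int

const (
	Debug Level = iota
	Info
	Error
)

// levelHolder guards its level with a read/write mutex so that many readers
// can call GetLevel concurrently while SetLevel takes exclusive access.
type levelHolder struct {
	mu    sync.RWMutex
	level Level
}

// GetLevel returns the current level under a read lock.
func (h *levelHolder) GetLevel() Level {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.level
}

// SetLevel changes the level under the write lock.
func (h *levelHolder) SetLevel(l Level) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.level = l
}

// concurrentUse reads the level from several goroutines while it is changed.
func concurrentUse() Level {
	h := &levelHolder{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = h.GetLevel()
		}()
	}
	h.SetLevel(Error)
	wg.Wait()
	return h.GetLevel()
}

func main() {
	fmt.Println(concurrentUse() == Error) // true
}
```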
zeripath e83abfc289
Prevent race in TestPersistableChannelQueue (#16468)
* Prevent race in TestPersistableChannelQueue

A slight race has become apparent in the TestPersistableChannelQueue.

This PR simply adds locking to prevent the race.

* make print value of "$(GOTESTFLAGS)" on test-backend and unit-test-coverage


Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
2021-07-17 19:09:56 +02:00
KN4CK3R 3607f79d78
Fixed assert statements. (#16089) 2021-06-07 07:27:09 +02:00
zeripath fe18a85f54
Fix panic (#16072)
There is an incorrect casting in the wrapped queue.

Fix #16071

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-06-05 15:23:22 +01:00
zeripath ba526ceffe
Multiple Queue improvements: LevelDB Wait on empty, shutdown empty shadow level queue, reduce goroutines etc (#15693)
* move shutdownfns, terminatefns and hammerfns out of separate goroutines

Coalesce the shutdownfns etc into a list of functions that get run at shutdown
rather than have them run as goroutines blocked on selects.

This may help reduce the background select/poll load in certain
configurations.

* The LevelDB queues can actually wait on empty instead of polling

Slight refactor to cause leveldb queues to wait on empty instead of polling.

* Shutdown the shadow level queue once it is empty

* Remove bytefifo additional goroutine for readToChan as it can just be run in run

* Remove additional removeWorkers goroutine for workers

* Simplify the AtShutdown and AtTerminate functions and add Channel Flusher

* Add shutdown flusher to CUQ

* move persistable channel shutdown stuff to Shutdown Fn

* Ensure that UPCQ has the correct config

* handle shutdown during the flushing

* reduce risk of race between zeroBoost and addWorkers

* prevent double shutdown

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-15 16:22:26 +02:00
zeripath aa65a607e4
Queue manager FlushAll can loop rapidly - add delay (#15733)
* Queue manager FlushAll can loop rapidly - add delay

Add delay within FlushAll to prevent rapid loop when workers are busy

Signed-off-by: Andrew Thornton <art27@cantab.net>

* as per lunny

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: 6543 <6543@obermui.de>
2021-05-12 00:22:08 +01:00
zeripath e22ee468cf
Exponential Backoff for ByteFIFO (#15724)
This PR is another in the vein of queue improvements. It suggests an
exponential backoff for bytefifo queues to reduce the load from queue
polling. This will mostly be useful for redis queues.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: Lauris BH <lauris@nix.lv>
2021-05-08 17:29:47 +01:00
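The idea behind #15724 can be sketched as a delay that doubles after each empty poll up to a ceiling. The function names and the 100ms and 1s bounds are illustrative assumptions, not the values or code used by the bytefifo queues.

```go
package main

import (
	"fmt"
	"time"
)

// nextDelay returns the wait to use after another empty poll: it starts at
// min and doubles on each call until it reaches max.
func nextDelay(current, min, max time.Duration) time.Duration {
	if current < min {
		return min
	}
	next := current * 2
	if next > max {
		return max
	}
	return next
}

// delays lists the waits for n consecutive empty polls.
func delays(n int, min, max time.Duration) []time.Duration {
	out := make([]time.Duration, 0, n)
	d := time.Duration(0)
	for i := 0; i < n; i++ {
		d = nextDelay(d, min, max)
		out = append(out, d)
	}
	return out
}

// demoDelays uses illustrative bounds of 100ms and 1s.
func demoDelays() []time.Duration {
	return delays(6, 100*time.Millisecond, time.Second)
}

func main() {
	fmt.Println(demoDelays()) // [100ms 200ms 400ms 800ms 1s 1s]
}
```

A poller would reset the delay to zero as soon as a read returns data, so that a busy queue is still drained promptly.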
zeripath d11b9fbcce
Prevent race in TestChannelQueue_Batch (#15703)
There is a potential race in TestChannelQueue_Batch due to boost workers starting up

This PR simply removes the boosts from this test.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-03 17:16:59 +01:00
zeripath 0590176a23
Only use boost workers for leveldb shadow queues (#15696)
* The leveldb shadow queue of a persistable channel queue should always start with 0
workers and just use boost to add additional workers if necessary.

* create a zero boost so that if there are no workers in a pool - boost to start the workers

* actually set timeout appropriately on boosted workers

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-02 08:22:30 +01:00
zeripath 84f5a0bc62
Always set the merge base used to merge the commit (#15352)
The issue is that the TestPatch will reset the PR MergeBase - and it is possible for TestPatch to update the MergeBase whilst a merge is ongoing. The ensuing merge will then complete but it doesn't re-set the MergeBase it used to merge the PR.

Fixes the intermittent error in git test.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-04-10 09:27:29 +01:00
6543 9c4601bdf8
Code Formats, Nits & Unused Func/Var deletions (#15286)
* _ to unused func options

* rm useless brackets

* rm trivial unused models functions

* rm dead code

* rm dead global vars

* fix routers/api/v1/repo/issue.go

* don't overload import module
2021-04-09 09:40:34 +02:00
zeripath f0e15250b9
Migrate to use jsoniter instead of encoding/json (#14841)
* Migrate to use jsoniter

* fix tests

* update gitea.com/go-chi/binding

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
2021-03-01 22:08:10 +01:00
zeripath b3c2e23cbb
Prevent race in PersistableChannelUniqueQueue.Has (#14651)
There is potentially a race with a slow starting internal
queue causing an NPE if Has is checked before the internal
queue has been set up.

This PR adds a lock on the Has() fn.

Fix #14311

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-02-13 20:02:09 +01:00
6543 ac97ea573c
[Vendor] Update go-redis to v8.5.0 (#13749)
* Update go-redis to v8.4.0

* github.com/go-redis/redis/v8  v8.4.0 -> v8.5.0

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

* TODO

* Use the Queue termination channel as the default context for pushes

Signed-off-by: Andrew Thornton <art27@cantab.net>

* missed one

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: zeripath <art27@cantab.net>
2021-02-10 21:28:32 +00:00
zeripath c8f7a6b774
Slightly simplify the queue settings code to help reduce the risk of problems (#12976)
Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2020-10-15 17:40:03 -04:00
zeripath 5cfc1f573f
Fix the issue reported on #12385 (#12969)
Missed setting ConnectionString on queuesettings

Signed-off-by: Andrew Thornton <art27@cantab.net>
2020-09-28 19:00:54 -04:00
zeripath 7f8e3192cd
Allow common redis and leveldb connections (#12385)
* Allow common redis and leveldb connections

Prevents multiple reopening of redis and leveldb connections to the same
place by sharing connections.

Further allows for more configurable redis connection type using the
redisURI and a leveldbURI scheme.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* add unit-test

Signed-off-by: Andrew Thornton <art27@cantab.net>

* as per @lunny

Signed-off-by: Andrew Thornton <art27@cantab.net>

* add test

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Update modules/cache/cache_redis.go

* Update modules/queue/queue_disk.go

* Update modules/cache/cache_redis.go

* Update modules/cache/cache_redis.go

* Update modules/queue/unique_queue_disk.go

* Update modules/queue/queue_disk.go

* Update modules/queue/unique_queue_disk.go

* Update modules/session/redis.go

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lauris BH <lauris@nix.lv>
2020-09-28 00:09:46 +03:00
Lunny Xiao 91e7ad569a
Add queue for code indexer (#10332)
* Add queue for code indexer

* Fix lint

* Fix test

* Fix lint

* Fix bug

* Fix bug

* Fix lint

* Add noqueue

* Fix tests

* Rename noqueue to immediate
2020-09-07 23:05:08 +08:00
zeripath 69b8d7ba19
use assignment in tests (#12734)
Signed-off-by: Andrew Thornton <art27@cantab.net>
2020-09-06 01:50:57 +03:00
zeripath 74bd9691c6
Re-attempt to delete temporary upload if the file is locked by another process (#12447)
Replace all calls to os.Remove/os.RemoveAll by retrying util.Remove/util.RemoveAll and remove circular dependencies from util.

Fix #12339

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: silverwind <me@silverwind.io>
2020-08-11 21:05:34 +01:00
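A retrying remove in the spirit of #12447 might look like the sketch below. It is not the implementation of `util.Remove`: the attempt count, the wait between attempts and the choice to retry on every error other than "does not exist" are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// removeWithRetry tries to delete name up to attempts times, sleeping for
// wait between tries. A file that is already gone counts as success.
func removeWithRetry(name string, attempts int, wait time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		err = os.Remove(name)
		if err == nil || errors.Is(err, os.ErrNotExist) {
			return nil
		}
		time.Sleep(wait) // give the other process time to release the file
	}
	return err
}

// demoRemove creates a temporary file, removes it and checks that it is gone.
func demoRemove() (bool, error) {
	dir, err := os.MkdirTemp("", "remove-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "upload.tmp")
	if err := os.WriteFile(path, []byte("data"), 0o600); err != nil {
		return false, err
	}
	if err := removeWithRetry(path, 5, 100*time.Millisecond); err != nil {
		return false, err
	}
	_, statErr := os.Stat(path)
	return errors.Is(statErr, os.ErrNotExist), nil
}

func main() {
	fmt.Println(demoRemove())
}
```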
zeripath 217647f331
Multiple small admin dashboard fixes (#12153)
* Remove spurious spacing between Maintenance Operations and its table on dashboard
* Prevent (EXTRA string) comments in Task headers
* Redirect tasks started from monitor page back to monitor
* Fix #12107 - redirects from process cancel should use AppSubUrl
* When wrapping queues set the name correctly

Signed-off-by: Andrew Thornton <art27@cantab.net>
2020-07-05 22:38:03 +03:00
zeripath c58bc4bf80
Prevent timer leaks in Workerpool and others (#11333)
There is a potential memory leak in `Workerpool` due to the intricacies of
`time.Timer` stopping.

Whenever a `time.Timer` is `Stop`ped its channel must be cleared using a
`select` if the result of the `Stop()` is `false`.

Unfortunately in `Workerpool` these were checked the wrong way round.

However, there were a few other places that were not being checked.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2020-05-08 16:46:05 +01:00
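The rule stated in #11333 can be written as a small helper. This is a minimal sketch rather than the code from the commit; `stopTimer` and `demoStop` are names chosen here. The drain uses a `select` with a `default` branch so that it never blocks when the value has already been received elsewhere.

```go
package main

import (
	"fmt"
	"time"
)

// stopTimer stops t and, when Stop reports false (the timer had already
// fired or been stopped), clears any value still waiting in the channel.
func stopTimer(t *time.Timer) {
	if !t.Stop() {
		select {
		case <-t.C:
		default:
		}
	}
}

// demoStop lets a timer fire, stops it and checks that the channel is empty,
// which is the state needed before the timer can safely be reused.
func demoStop() bool {
	t := time.NewTimer(time.Millisecond)
	time.Sleep(10 * time.Millisecond) // let the timer fire unobserved
	stopTimer(t)

	select {
	case <-t.C:
		return false // a stale value was left behind
	default:
		return true
	}
}

func main() {
	fmt.Println("channel empty after stop:", demoStop())
}
```

Checking the result the wrong way round, as the commit message describes, would attempt the drain only when `Stop` returned `true`, which is exactly the case where nothing is pending.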
Lunny Xiao fcc8cdd446
Improve config logging when WrappedQueue times out (#11174)
Before
```sh
Unable to set the internal queue for -wrapper Error: Timedout creating queue redis with cfg []byte{0x7b, 0x22, 0x41, 0x64, 0x64, 0x72, 0x65, 0x73, 0x73, 0x65, 0x73, 0x22, 0x3a, 0x22, 0x31, 0x32, 0x37, 0x2e, 0x30, 0x2e, 0x30, 0x2e, 0x31, 0x3a, 0x36, 0x33, 0x37, 0x39, 0x22, 0x2c, 0x22, 0x42, 0x61, 0x74, 0x63, 0x68, 0x4c, 0x65, 0x6e, 0x67, 0x74, 0x68, 0x22, 0x3a, 0x32, 0x30, 0x2c, 0x22, 0x42, 0x6c, 0x6f, 0x63, 0x6b, 0x54, 0x69, 0x6d, 0x65, 0x6f, 0x75, 0x74, 0x22, 0x3a, 0x31, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x30, 0x2c, 0x22, 0x42, 0x6f, 0x6f, 0x73, 0x74, 0x54, 0x69, 0x6d, 0x65, 0x6f, 0x75, 0x74, 0x22, 0x3a, 0x33, 0x30, 0x30, 0x30, 0x30, 0x30
......
```

After
```sh
Unable to set the internal queue for -wrapper Error: Timedout creating queue redis with cfg "{\"Addresses\":\"127.0.0.1:6379\",\"BatchLength\":20,\"BlockTimeout\":1000000000,\"BoostTimeout\":300000000000,\"BoostWorkers\":5,\"DBIndex\":0,\"DataDir\":\".../data/queues/mail\",\"MaxWorkers\":10,\"Name\":\"mail\",\"Network\":\"\",\"Password\":\"\",\"QueueLength\":20,\"QueueName\":\"mail_queue\",\"SetName\":\"\",\"Workers\":1}" in
```
2020-04-22 13:38:40 +01:00
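The difference between the two log outputs in #11174 can be reproduced with `fmt` alone. Which format verbs the queue code actually used is an assumption here; `%#v` and `%q` are chosen because they produce output of the same shape as the "Before" and "After" samples.

```go
package main

import "fmt"

// asBytes formats a config the way a raw []byte appears with %#v.
func asBytes(cfg []byte) string {
	return fmt.Sprintf("%#v", cfg)
}

// asString converts to string first, giving a readable quoted value.
func asString(cfg []byte) string {
	return fmt.Sprintf("%q", string(cfg))
}

func main() {
	cfg := []byte(`{"Name":"mail"}`)
	fmt.Println(asBytes(cfg))  // a []byte{0x7b, 0x22, ...} literal
	fmt.Println(asString(cfg)) // "{\"Name\":\"mail\"}"
}
```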
zeripath e83daf77ad
Avoid logging []byte in queue failures - convert to string first (#10865)
Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
2020-03-29 15:12:15 +08:00
Lunny Xiao cf7ece6245
Fix queue log param (#10733) 2020-03-16 16:59:21 +08:00
zeripath 88986746d5
Fix Workerpool deadlock (#10283)
* Prevent deadlock on boost

* Force a boost in testchannelqueue
2020-02-15 18:44:58 +00:00
Lunny Xiao 3d69bbd58f
Fix queue pop error and stat empty repository error (#10248)
* Fix queue pop error and stat empty repository error

* Fix error
2020-02-12 18:12:27 +08:00
zeripath 2c903383b5
Add Unique Queue infrastructure and move TestPullRequests to this (#9856)
* Upgrade levelqueue to version 0.2.0

This adds functionality for Unique Queues

* Add UniqueQueue interface and functions to create them

* Add UniqueQueue implementations

* Move TestPullRequests over to use UniqueQueue

* Reduce code duplication

* Add bytefifos

* Ensure invalid types are logged

* Fix close race in PersistableChannelQueue Shutdown
2020-02-02 23:19:58 +00:00
zeripath 9b9dd19d7d
Fix broken FlushAll (#10101)
* go function contexting is not what you expect

* Apply suggestions from code review

Co-Authored-By: Lauris BH <lauris@nix.lv>

Co-authored-by: Lauris BH <lauris@nix.lv>
2020-02-01 23:43:50 +00:00
Lunny Xiao eac5142ac7
Fix leveldb test race (#10054)
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2020-01-30 11:09:39 -05:00
zeripath c01221e70f
Queue: Make WorkerPools and Queues flushable (#10001)
* Make WorkerPools and Queues flushable

Adds Flush methods to Queues and the WorkerPool
Further abstracts the WorkerPool
Adds a final step to Flush the queues in the defer from PrintCurrentTest
Fixes an issue with Settings inheritance in queues

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Change to for loop

* Add IsEmpty and begin just making the queues composed WorkerPools

* subsume workerpool into the queues and create a flushable interface

* Add manager command

* Move flushall to queue.Manager and add to testlogger

* As per @guillep2k

* as per @guillep2k

* Just make queues all implement flushable and clean up the wrapped queue flushes

* cope with no timeout

Co-authored-by: Lauris BH <lauris@nix.lv>
2020-01-28 20:01:06 -05:00